Extending Search: the lesser of two evils

Discussion in 'XenForo Development Discussions' started by Lawrence, Jun 3, 2016.

  1. Lawrence

    Lawrence Well-Known Member

    So, I added a new field to the thread table and want it to be searchable, and it is. Extending the post-save datawriters makes it pretty easy to add the new content to the message field of the search index table.

    The issue is, when rebuilding the search index from the adminCP the message field is emptied because of this line in the protected function _insertIntoIndex (the '' where the message field is after $title):

    PHP:
            $indexer->insertIntoIndex(
                'thread', $data['thread_id'],
                $title, '',
                $data['post_date'], $data['user_id'], $data['thread_id'], $metadata
            );
    To get around this I overwrote that method, and now it works. However, if another add-on overrides that method (which I doubt), either my new searchable data will be removed from the search index, or the other add-on's data will be. (Using parent:: returns null.)
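
For illustration, here is a minimal, self-contained sketch of that kind of override. It does not use the real XenForo classes: DemoIndexer, DemoThreadHandler, DemoExtendedThreadHandler, rebuildDemo() and the my_custom_field column are all hypothetical stand-ins, and only the argument order of insertIntoIndex() follows the snippet above. It shows why skipping parent:: keeps the custom text, and also why two add-ons doing this would clobber each other.

```php
<?php
// Stand-in for the search indexer: records what would be written to the index.
class DemoIndexer
{
    public $rows = array();

    public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
    {
        $this->rows[] = array(
            'content_type' => $contentType,
            'content_id'   => $contentId,
            'title'        => $title,
            'message'      => $message,
        );
    }
}

// Stand-in for the stock thread handler: it always passes '' as the message.
class DemoThreadHandler
{
    // Hypothetical helper that mimics a rebuild looping over threads.
    public function rebuildDemo(DemoIndexer $indexer, array $threads)
    {
        foreach ($threads as $thread) {
            $this->_insertIntoIndex($indexer, $thread);
        }
    }

    protected function _insertIntoIndex(DemoIndexer $indexer, array $data, $parentData = null)
    {
        $indexer->insertIntoIndex(
            'thread', $data['thread_id'],
            $data['title'], '',
            $data['post_date'], $data['user_id'], $data['thread_id'], array()
        );
    }
}

// The override: same call, but with the custom field as the message.
class DemoExtendedThreadHandler extends DemoThreadHandler
{
    protected function _insertIntoIndex(DemoIndexer $indexer, array $data, $parentData = null)
    {
        // my_custom_field is a hypothetical column name.
        $message = isset($data['my_custom_field']) ? $data['my_custom_field'] : '';

        // parent::_insertIntoIndex() is deliberately not called, because it
        // would index the thread with an empty message.
        $indexer->insertIntoIndex(
            'thread', $data['thread_id'],
            $data['title'], $message,
            $data['post_date'], $data['user_id'], $data['thread_id'], array()
        );
    }
}

$demoIndexer = new DemoIndexer();
$demoHandler = new DemoExtendedThreadHandler();
$demoHandler->rebuildDemo($demoIndexer, array(
    array('thread_id' => 1, 'title' => 'Hello', 'post_date' => 0, 'user_id' => 1, 'my_custom_field' => 'extra searchable text'),
));
echo $demoIndexer->rows[0]['message'], "\n";
```

Because the override never reaches the parent, whichever add-on's class ends up last in the chain decides what gets indexed, which is the conflict described above.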

    The other solution is to create a new content type for my add-on. That's straightforward to do, but I'm on the fence about it: the new content type is thread-related, so rebuilding the search index would involve looping through all the threads twice (once to rebuild the standard thread info, and once again to rebuild my new data field). IMO, that's just a waste of resources.

    So, I'm not too sure which is the lesser evil of the two methods. I'm leaning towards the first method, as it doesn't take any more resources to rebuild the search index than threads would use anyway. Plus, it keeps the search index smaller, as a new content type isn't inserted.

    Any feedback/opinions welcomed.
     
    Last edited: Jun 3, 2016
  2. Nobita.Kun

    Nobita.Kun Well-Known Member

    XenForo provides an event for extending the source handler ('search_source_create'). So I think you could use that event and extend the insertIntoIndex function to do your stuff.
     
  3. Lawrence

    Lawrence Well-Known Member

    Thanks for your reply. I tried that yesterday and it still emptied the message field for threads when rebuilding the search index. insertIntoIndex works when called by the datawriter, but not from the rebuild tools, because that field is set to empty in the thread handler. That is why I had to overwrite the method.
     
  4. Nobita.Kun

    Nobita.Kun Well-Known Member

    If you are running it from Tools->Rebuild, I think you should extend this function: XenForo_Search_DataHandler_Thread::rebuildIndex
     
  5. Lawrence

    Lawrence Well-Known Member

    That's what I did originally, :). It sends the thread data to the above-mentioned method, which sets the message field to an empty value. That is why I overwrote it, for now at least, hoping that there was something I was missing...
     
  6. Xon

    Xon Well-Known Member

    @Lawrence there is actually a reasonably extendable way to do this.

    Here is how I add the word count to posts:

    First create an indexer proxy:
    https://github.com/Xon/XenForo-Sear...rchImprovements/Search/IndexerProxy.php#L3-25
    PHP:
    class SV_SearchImprovements_Search_IndexerProxy extends XenForo_Search_Indexer
    {
        protected $_proxiedIndexer = null;
        protected $_metadata = array();

        public function __construct(XenForo_Search_Indexer $otherIndexer, array $metadata)
        {
            $this->_sourceHandler = $otherIndexer->_sourceHandler;
            $this->_proxiedIndexer = $otherIndexer;
            $this->_metadata = $metadata;
        }

        public function setProxyMetaData(array $metadata)
        {
            $this->_metadata = $metadata;
        }

        public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
        {
            $metadata = XenForo_Application::mapMerge($metadata, $this->_metadata);
            $this->_proxiedIndexer->insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId, $metadata);
        }
    }
    This purely exists to capture the insertIntoIndex() function, and push more data into the $metadata container.

    Then define the override on the datahandler for some content type:
    https://github.com/Xon/XenForo-Word...rch/XenForo/Search/DataHandler/Post.php#L7-30
    PHP:
    class SV_WordCountSearch_XenForo_Search_DataHandler_Post extends XFCP_SV_WordCountSearch_XenForo_Search_DataHandler_Post
    {
        protected function _insertIntoIndex(XenForo_Search_Indexer $indexer, array $data, array $parentData = null)
        {
            if (!isset($data[SV_WordCountSearch_Globals::WordCountField]))
            {
                $wordcount = $this->_getSearchModel()->getTextWordCount($data['message']);
                $db = XenForo_Application::getDb();
                $db->query("
                    insert ignore into xf_post_words (post_id, word_count) values (?,?)
                ", array($data['post_id'], $wordcount));
                $data[SV_WordCountSearch_Globals::WordCountField] = $wordcount;
            }

            $metadata = array();
            $metadata[SV_WordCountSearch_Globals::WordCountField] = $data[SV_WordCountSearch_Globals::WordCountField];

            if ($indexer instanceof SV_SearchImprovements_Search_IndexerProxy)
            {
                $indexer->setProxyMetaData($metadata);
            }
            else
            {
                $indexer = new SV_SearchImprovements_Search_IndexerProxy($indexer, $metadata);
            }

            parent::_insertIntoIndex($indexer, $data, $parentData);
        }

        public function quickIndex(XenForo_Search_Indexer $indexer, array $contentIds)
        {
            $indexer = new SV_SearchImprovements_Search_IndexerProxy($indexer, array());
            return parent::quickIndex($indexer, $contentIds);
        }

        protected $_searchModel = null;

        protected function _getSearchModel()
        {
            if (!$this->_searchModel)
            {
                $this->_searchModel = XenForo_Model::create('XenForo_Model_Search');
            }

            return $this->_searchModel;
        }
    }
    Make sure XenES_Model_ElasticSearch is overridden to ensure the new field is correctly indexed by Elasticsearch.

    Example:
    https://github.com/Xon/XenForo-Word...untSearch/XenES/Model/Elasticsearch.php#L5-12
    https://github.com/Xon/XenForo-Word...ibrary/SV/WordCountSearch/Installer.php#L5-11

    Note: all this code is MIT licenced, and my installer code is available under that licence too.
     
  7. Lawrence

    Lawrence Well-Known Member

    Thanks for the reply Xon, you are always very helpful, :)

    I tried many different ways, but because the default behaviour of _insertIntoIndex for threads when rebuilding the indexes is to set the 'message' field to '', it always wiped out my data. That is why I needed to overwrite that method and not call the parent:: action, which I'm not fond of.

    I thought about this over the weekend, and decided to make my thread data its own content type. I think it will give more flexibility for upgrading the add-on with new features. The number of threads affected should be far fewer than the number of actual threads, so rebuilding the search index should not be impacted that much.
     
  8. Xon

    Xon Well-Known Member

    The useful aspect of the indexer proxy class is that you can just rewrite what is sent to insertIntoIndex :D
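
As a rough sketch of that idea applied to the thread case discussed earlier: a proxy can swap the empty message for custom text while the handler still calls parent::. This uses stand-in classes rather than the real XenForo ones. RecordingIndexer, MessageRewritingProxy, StockThreadHandler, ExtendedThreadHandler, setThreadMessage(), indexOne() and my_custom_field are all hypothetical names, and only the insertIntoIndex() argument order follows the code posted above.

```php
<?php
// Stand-in for the search indexer: records what would be written to the index.
class RecordingIndexer
{
    public $rows = array();

    public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
    {
        $this->rows[] = array('content_type' => $contentType, 'content_id' => $contentId, 'message' => $message);
    }
}

// Proxy that rewrites the message before forwarding to the real indexer.
class MessageRewritingProxy extends RecordingIndexer
{
    protected $_proxiedIndexer;
    protected $_threadMessages = array();

    public function __construct(RecordingIndexer $otherIndexer)
    {
        $this->_proxiedIndexer = $otherIndexer;
    }

    public function setThreadMessage($threadId, $message)
    {
        $this->_threadMessages[$threadId] = $message;
    }

    public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
    {
        // Only fill in the message when the caller left it empty for a thread.
        if ($contentType === 'thread' && $message === '' && isset($this->_threadMessages[$contentId])) {
            $message = $this->_threadMessages[$contentId];
        }
        $this->_proxiedIndexer->insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId, $metadata);
    }
}

// Stand-in for the stock thread handler, which always passes '' as the message.
class StockThreadHandler
{
    // Hypothetical public entry point so the sketch can be run directly.
    public function indexOne(RecordingIndexer $indexer, array $data)
    {
        $this->_insertIntoIndex($indexer, $data);
    }

    protected function _insertIntoIndex(RecordingIndexer $indexer, array $data, $parentData = null)
    {
        $indexer->insertIntoIndex('thread', $data['thread_id'], $data['title'], '', $data['post_date'], $data['user_id'], $data['thread_id'], array());
    }
}

// Extension that wraps the indexer and still defers to the parent.
class ExtendedThreadHandler extends StockThreadHandler
{
    protected function _insertIntoIndex(RecordingIndexer $indexer, array $data, $parentData = null)
    {
        if (!($indexer instanceof MessageRewritingProxy)) {
            $indexer = new MessageRewritingProxy($indexer);
        }
        if (isset($data['my_custom_field'])) {
            $indexer->setThreadMessage($data['thread_id'], $data['my_custom_field']);
        }
        parent::_insertIntoIndex($indexer, $data, $parentData);
    }
}

$innerIndexer = new RecordingIndexer();
$threadHandler = new ExtendedThreadHandler();
$threadHandler->indexOne($innerIndexer, array(
    'thread_id' => 1, 'title' => 'Hello', 'post_date' => 0, 'user_id' => 1,
    'my_custom_field' => 'extra searchable text',
));
echo $innerIndexer->rows[0]['message'], "\n";
```

Since the parent is still called, other add-ons extending the same handler keep working. The proxy only changes the one argument it cares about.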
     
