Extending Search: the lesser of two evils

Lawrence

So, I added a new field to the thread table and want it to be searchable. And it is: extending the post-save datawriters makes it pretty easy to add the new content to the message field of the search index table.

The issue is that when rebuilding the search index from the Admin CP, the message field is emptied because of this line in the protected function _insertIntoIndex (the '' after $title is the message argument):

PHP:
        $indexer->insertIntoIndex(
            'thread', $data['thread_id'],
            $title, '',
            $data['post_date'], $data['user_id'], $data['thread_id'], $metadata
        );

To get around this I overwrote that method, and now it works. However, if another add-on overrides that method (which I doubt), either my new searchable data will be removed from the search index or the other add-on's data will be. (Using parent:: returns null.)

The other solution is to create a new content type for my add-on. That's straightforward to do, but I'm on the fence about it: the new content type is thread-related, so rebuilding the search index would involve looping through all the threads twice (once to rebuild the standard thread info, and once more to rebuild my new data field). IMO, that's just a waste of resources.

So, I'm not sure which of the two methods is the lesser evil. I'm leaning towards the first, as rebuilding the search index takes no more resources than threads would use anyway. It also keeps the search index smaller, since no new content type is inserted.

Any feedback/opinions welcomed.
 
XenForo provides an event for extending the search source handler ('search_source_create'). I think you could use that event and extend insertIntoIndex to do what you need.
 
Thanks for your reply. I tried that yesterday and it still emptied the message field for threads when rebuilding the search index. insertIntoIndex works when called by the datawriter; from the rebuild tools it doesn't, because the thread handler sets that field to empty. That's why I had to overwrite the method.
 
If you are running it from Tools->Rebuild, I think you should extend this function: XenForo_Search_DataHandler_Thread::rebuildIndex
 
That's what I did originally. :) It sends the thread data to the method mentioned above, which sets the message field to an empty value. That is why I overwrote it, for now at least, hoping that there was something I was missing...
 
@Lawrence there is actually a reasonably extendable way to do this.

Here is how I add the word count to posts:

First create an indexer proxy:
https://github.com/Xon/XenForo-Sear...rchImprovements/Search/IndexerProxy.php#L3-25
PHP:
class SV_SearchImprovements_Search_IndexerProxy extends XenForo_Search_Indexer
{
    protected $_proxiedIndexer = null;
    protected $_metadata = array();

    public function __construct(XenForo_Search_Indexer $otherIndexer, array $metadata)
    {
        $this->_sourceHandler = $otherIndexer->_sourceHandler;
        $this->_proxiedIndexer = $otherIndexer;
        $this->_metadata = $metadata;
    }

    public function setProxyMetaData(array $metadata)
    {
        $this->_metadata = $metadata;
    }

    public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
    {
        $metadata = XenForo_Application::mapMerge($metadata, $this->_metadata);
        $this->_proxiedIndexer->insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId, $metadata);
    }
}

This exists purely to intercept the insertIntoIndex() call and push more data into the $metadata container.
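As a rough illustration of that interception, here is a minimal sketch that runs without XenForo. Demo_Indexer, Demo_IndexerProxy and demoStockInsert are hypothetical stand-ins, and array_replace_recursive() is only an assumed approximation of XenForo_Application::mapMerge(); the insertIntoIndex() signature is the one from the proxy above.

```php
<?php
// Hypothetical stand-in for XenForo_Search_Indexer: records metadata instead of indexing.
class Demo_Indexer
{
    public $calls = array();

    public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
    {
        $this->calls[] = $metadata;
    }
}

// Same shape as the proxy above, reduced to the metadata merge.
class Demo_IndexerProxy extends Demo_Indexer
{
    protected $_proxiedIndexer = null;
    protected $_metadata = array();

    public function __construct(Demo_Indexer $otherIndexer, array $metadata)
    {
        $this->_proxiedIndexer = $otherIndexer;
        $this->_metadata = $metadata;
    }

    public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
    {
        // Assumption: array_replace_recursive() approximates mapMerge() for this demo.
        $metadata = array_replace_recursive($metadata, $this->_metadata);
        $this->_proxiedIndexer->insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId, $metadata);
    }
}

// Hypothetical stand-in for a stock data handler that builds its own metadata.
function demoStockInsert(Demo_Indexer $indexer)
{
    $indexer->insertIntoIndex('post', 7, 'Title', 'Body text', 1400000000, 1, 3, array('thread' => 3));
}

$real = new Demo_Indexer();
demoStockInsert(new Demo_IndexerProxy($real, array('word_count' => 2)));
print_r($real->calls[0]); // contains both 'thread' and 'word_count'
```

The stock code never needs to know about the extra field: it calls insertIntoIndex() as usual, and the proxy adds the metadata on the way through.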

Then define the override on the datahandler for some content type:
https://github.com/Xon/XenForo-Word...rch/XenForo/Search/DataHandler/Post.php#L7-30
PHP:
class SV_WordCountSearch_XenForo_Search_DataHandler_Post extends XFCP_SV_WordCountSearch_XenForo_Search_DataHandler_Post
{
    protected function _insertIntoIndex(XenForo_Search_Indexer $indexer, array $data, array $parentData = null)
    {
        if (!isset($data[SV_WordCountSearch_Globals::WordCountField]))
        {
            $wordcount = $this->_getSearchModel()->getTextWordCount($data['message']);
            $db = XenForo_Application::getDb();
            $db->query("
                insert ignore into xf_post_words (post_id, word_count) values (?,?)
            ", array($data['post_id'], $wordcount));
            $data[SV_WordCountSearch_Globals::WordCountField] = $wordcount;
        }

        $metadata = array();
        $metadata[SV_WordCountSearch_Globals::WordCountField] = $data[SV_WordCountSearch_Globals::WordCountField];

        if ($indexer instanceof SV_SearchImprovements_Search_IndexerProxy)
        {
            $indexer->setProxyMetaData($metadata);
        }
        else
        {
            $indexer = new SV_SearchImprovements_Search_IndexerProxy($indexer, $metadata);
        }


        parent::_insertIntoIndex($indexer, $data, $parentData);
    }

    public function quickIndex(XenForo_Search_Indexer $indexer, array $contentIds)
    {
        $indexer = new SV_SearchImprovements_Search_IndexerProxy($indexer, array());
        return parent::quickIndex($indexer, $contentIds);
    }

    protected $_searchModel = null;
    protected function _getSearchModel()
    {
        if (!$this->_searchModel)
        {
            $this->_searchModel = XenForo_Model::create('XenForo_Model_Search');
        }

        return $this->_searchModel;
    }
}

Make sure XenES_Model_ElasticSearch is overridden to ensure the new field is correctly indexed by Elasticsearch.

Example:
https://github.com/Xon/XenForo-Word...untSearch/XenES/Model/Elasticsearch.php#L5-12
https://github.com/Xon/XenForo-Word...ibrary/SV/WordCountSearch/Installer.php#L5-11

Note: all this code is MIT licensed, and my installer code is available under that licence too.
 
Thanks for the reply Xon, you are always very helpful, :)

I tried many different ways, but the default behaviour of the thread handler's _insertIntoIndex when rebuilding the indexes is to set the 'message' field to '', so it always wiped out my data. That is why I needed to overwrite that method and not call the parent:: action, which I'm not fond of.

I thought about this over the weekend and decided to make my thread data its own content type. I think it will give more flexibility for upgrading the add-on with new features. The number of threads affected should be far smaller than the total number of threads, so rebuilding the search index should not be impacted that much.
 
The useful aspect of the indexer proxy class is that you can just rewrite what is sent to insertIntoIndex :D
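For the thread case discussed earlier, that could look something like the sketch below. It is only an illustration under stated assumptions: Sketch_Indexer stands in for XenForo_Search_Indexer so the code runs on its own, and Sketch_MessageProxy and setProxyMessage() are hypothetical names, not part of XenForo or of the add-ons linked above.

```php
<?php
// Hypothetical stand-in for XenForo_Search_Indexer: records rows instead of indexing.
class Sketch_Indexer
{
    public $rows = array();

    public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
    {
        $this->rows[] = array('contentType' => $contentType, 'contentId' => $contentId, 'message' => $message);
    }
}

// Hypothetical proxy: swaps an empty $message for text supplied by the data handler.
class Sketch_MessageProxy extends Sketch_Indexer
{
    protected $_proxiedIndexer = null;
    protected $_message = '';

    public function __construct(Sketch_Indexer $otherIndexer)
    {
        $this->_proxiedIndexer = $otherIndexer;
    }

    public function setProxyMessage($message)
    {
        $this->_message = (string) $message;
    }

    public function insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId = 0, array $metadata = array())
    {
        // Only fill in the message when the caller passed nothing.
        if ($message === '' && $this->_message !== '')
        {
            $message = $this->_message;
        }
        $this->_proxiedIndexer->insertIntoIndex($contentType, $contentId, $title, $message, $itemDate, $userId, $discussionId, $metadata);
    }
}

$real = new Sketch_Indexer();
$proxy = new Sketch_MessageProxy($real);
$proxy->setProxyMessage('custom thread field text');

// Mimics the stock thread handler call, which passes '' as the message.
$proxy->insertIntoIndex('thread', 42, 'Thread title', '', 1400000000, 1, 42, array());
echo $real->rows[0]['message'], "\n";
```

In an extended thread data handler, the idea would be to wrap $indexer in such a proxy inside _insertIntoIndex and then call parent::_insertIntoIndex, so the parent method is kept and other add-ons extending it still run.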
 