[Not a bug] Search engine: "Search title only" appears inconsistent when searching non-threads

DragonByte Tech

Well-known member
Affected version
2.0.4
I'm not sure where exactly the bug lies, but if you tick the "Search title only" box when searching content types other than threads, it doesn't work as expected:

[Screenshot: qbuJ9Po.png]


Link Essentials shouldn't match, as "favicon" does not exist in the title of this resource.

[Screenshot: YS0Tkh5.png]


"IMG-20170130-WA0004" should not match because "test" is not featured in the title.

[Screenshot: TzYAUvK.png]


Etc.

If we need to do something specific in our /Search/Data/$contentType.php file to support title-only searches, it seems like both the official add-ons and I are failing to do so :D

At DBTech we are not running ElasticSearch, but I assume you are on this site, so it's unlikely this bug is related to either ES or the lack thereof.


Fillip
 
I can't speak for your add-on, but certainly for the first two cases, this is because those particular examples both have relevant tags. To add more relevancy to searches, we index the tags as part of the title.

So in the case of Link Essentials, the actual title that is indexed is:

"[WMTech] Link Essentials cache cdn favicon interstitial link title link title conversion screenshot skimlinks title detection wmtech"

So I can see how this could be confusing (it took me a bit of time to even remember we did that), but in the grand scheme of things, it makes title searches better. Generally speaking, I think it's better that Link Essentials comes up in this case than that it doesn't.
 
"I can't speak for your add-on, but certainly for the first two cases, this is because those particular examples both have relevant tags. To add more relevancy to searches, we index the tags as part of the title."
I see, that's certainly unexpected behaviour.

That being said, one of the products that come up in the search has no tags so that can't be the reason. Could you please give me some pointers as to how I can begin debugging this?

Here's the code that creates the index record:
PHP:
$index = IndexRecord::create('dbtech_ecommerce_product', $entity->product_id, [
    'title' => $entity->title,
    'message' => $entity->description_full,
    'date' => $entity->creation_date,
    'user_id' => $entity->user_id,
    'discussion_id' => $entity->product_id,
    'metadata' => $this->getMetaData($entity)
]);

And the metadata:
PHP:
protected function getMetaData(\DBTech\eCommerce\Entity\Product $entity)
{
    $metadata = [
        'prodcat' => $entity->product_category_id,
        'product' => $entity->product_id
    ];
    if ($entity->prefix_id)
    {
        $metadata['prodprefix'] = $entity->prefix_id;
    }
    
    return $metadata;
}

The resulting row in the search index:
SQL:
INSERT INTO `xf_search_index` (`content_type`, `content_id`, `title`, `message`, `metadata`, `user_id`, `item_date`, `discussion_id`)
VALUES
    ('dbtech_ecommerce_product', 370, 'DragonByte Classifieds [Closed Beta]', 'Allowing your users to sell items has never been easier with a feature rich Classifieds system that allows for Auctions as well as stock controlled Buy-it-Now listings.\n\nIntegrated into vBulletin, DragonByte Classifieds also allows users to see current listings, create their own postage and address options for use with their own listings and upload pictures to be displayed along with the listings.\n\nWith a permissions system, automated bidding on auctions, customizable feedback system, fees and fully customizable categories and specifications, and featured and digital items, DragonByte Classifieds caters for a wide range of requirements.', '_md_user_1925 _md_content_dbtech_ecommerce_product _md_prodcat_10 _md_product_370 _md_prodprefix_1', 1925, 1523126227, 370);

I can't find a single reason why a title-only search would match this product. Any advice would be greatly appreciated :)


Fillip
 
This is the resulting query string that gets executed (I removed the 200-item limit and added the title and metadata columns for readability):
SQL:
SELECT search_index.content_type, search_index.content_id, search_index.title, search_index.metadata
FROM xf_search_index AS search_index
WHERE MATCH(search_index.title, search_index.metadata) AGAINST ('+Thanks -_md_prodcat_16 -_md_prodcat_9 +_md_content_dbtech_ecommerce_product' IN BOOLEAN MODE)
ORDER BY search_index.item_date DESC

This returns literally every single product record. Something is clearly not working correctly somewhere.

Furthermore, the same issue happens when searching for threads: if I search for "Thanks" with title only, https://www.dragonbyte-tech.com/threads/ecommerce-addon.22322/ turns up, even though the word does not appear in its title or tags.

This is the query string that gets executed when I search for threads:
SQL:
SELECT search_index.content_type, search_index.content_id, search_index.title, search_index.metadata
FROM xf_search_index AS search_index
WHERE MATCH(search_index.title, search_index.metadata) AGAINST ('+Thanks +(_md_content_post _md_content_thread)' IN BOOLEAN MODE)
ORDER BY search_index.item_date DESC

I have rebuilt the search index (with the --truncate option) multiple times now, to no avail.

The DB system is MariaDB 10.2.14.


Fillip
 
Interestingly, I am only allowed to search for the word "Thanks" if I click Advanced Search. If I use the quick search, even with title only, it tells me "The search could not be completed because the search keywords were too short, too long, or too common."

I don't know if that's related to this issue, but even if it isn't, that doesn't change the fact that product content type searches return 300 unrelated entries (no tags or title that should match) and thread/post searches return approx. 100,000 unrelated entries.
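
One way to narrow this down might be to run each part of the boolean expression on its own and compare the counts. This is only a sketch: it assumes the FULLTEXT index covers (title, metadata) exactly as in the queries above. The idea that the server may be discarding "Thanks" as a stopword (which the "too common" message could be hinting at) is a guess I have not verified for this setup. If the keyword-only query returns nothing while the plain LIKE query does, or if the combined count equals the metadata-only count, that would point at the keyword being ignored rather than at the index data.
SQL:
-- 1) Keyword alone: how many rows does the full-text engine match for it?
SELECT COUNT(*) AS keyword_only
FROM xf_search_index AS search_index
WHERE MATCH(search_index.title, search_index.metadata) AGAINST ('+Thanks' IN BOOLEAN MODE);
-- 2) Metadata constraint alone: this should be every indexed product
SELECT COUNT(*) AS metadata_only
FROM xf_search_index AS search_index
WHERE MATCH(search_index.title, search_index.metadata) AGAINST ('+_md_content_dbtech_ecommerce_product' IN BOOLEAN MODE);
-- 3) Plain substring check that bypasses the full-text index entirely
SELECT COUNT(*) AS title_like
FROM xf_search_index
WHERE content_type = 'dbtech_ecommerce_product' AND title LIKE '%Thanks%';
-- 4) Storage engine of the table, since stopword handling differs between engines
SELECT ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'xf_search_index';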

I am at a loss here. Should I raise a support ticket in order to get this resolved?


Fillip
 