Fixed Possible Search Bug?

Josephur

Member
Forum site: WindowsForum.com

When searching for "Mexican Star Trek - Mad TV"

It appears that having the minus symbol in there excludes all search results, as if it's interpreted as "-*".

If you exclude the minus symbol everything works fine, or if you put quotations around the entire thread name and then search it works. Is this the intended way for search to work? I believe the minus should be treated as part of the search if there is no A-Z 0-9 character after it, not treated as some kind of expression.

Input would be greatly appreciated.

-Joseph
 
Actually, thinking about how Google and other search engines handle minus symbols without immediately trailing content, I think a standalone minus symbol should probably just be ignored altogether.
 
This appears to be specific to XF Enhanced Search, though I'm not positive what code is triggering the bug so I won't move this yet.
 
Also note the URL you're redirected to with a minus symbol in the search terms. https://windowsforum.com/search/search

The odd thing is that, from testing this bug, a minus in the search terms without trailing content doesn't always exclude search results (what the heck?).

If I search for "windows 10 - end" and other similar terms, I get results. Is it only a search that exactly matches a thread name containing a minus symbol that triggers it? I'm not certain now; it's a bit odd.

We are using the Enhanced Search plugin from XenForo with Elasticsearch, btw.
 
Doing further testing seems to half confirm this.

For instance there is a thread named "Microsoft Confirms - Windows 10 Free for Insider Program Members"

If you search for it verbatim and include the minus symbol just as the thread is named, you'll get the "bug". If you search for "Microsoft Confirms - Windows 10" you get results, but not the thread itself, just other results. I still can't pin down the logic exactly.
 
I stand corrected. This isn't Enhanced Search specific. The original example didn't apply because the "Mad" was under the default 4 character MySQL minimum, so it was being skipped.
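To make the minimum-length point concrete, here is a rough Python sketch. It assumes a minimum word length of 4, as mentioned above; the `indexable_words` helper and its whitespace/punctuation tokenizing are hypothetical and are not how MySQL or XenForo actually split a query.

```python
import re

MIN_WORD_LENGTH = 4  # assumption: the default minimum mentioned above


def indexable_words(query, min_length=MIN_WORD_LENGTH):
    """Return the words in the query that are long enough to be searched."""
    words = re.findall(r"\w+", query)  # simplistic tokenizer, for illustration only
    return [word for word in words if len(word) >= min_length]


print(indexable_words("Mexican Star Trek - Mad TV"))
# ['Mexican', 'Star', 'Trek']
```

Under that assumption "Mad" and "TV" never reach the search, so the term after the minus in the original example was dropped rather than negated.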

The issue is that "- word" is being considered as "-word" to require a word to not be present. With our AND (+) and OR (|) operators, this is reasonable (AND is basically ignored since it's the default). I've adjusted this specific case so only "-word" is considered as a negation. This appears to be how Google works.

This requires a change for MySQL full text search (the default) and Enhanced Search. The fix is the same for each though. The file is either library/XenForo/Search/SourceHandler/MySqlFt.php or library/XenES/Search/SourceHandler/ElasticSearch.php and this:
Code:
(?P<modifier>\-|\+|\||)
Should be changed to:
Code:
(?P<modifier>\-(?!\s)|\+|\||)
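For anyone who wants to see the difference between the two modifier groups, here is a minimal Python sketch (Python's `re` accepts the same `(?P<modifier>...)` syntax). Only the two modifier groups come from this post; the surrounding `TERM` pattern and the `parse` helper are hypothetical simplifications, not the actual XenForo parser.

```python
import re

# The two modifier groups quoted in this post.
OLD_MODIFIER = r"(?P<modifier>\-|\+|\||)"
NEW_MODIFIER = r"(?P<modifier>\-(?!\s)|\+|\||)"

# Assumption: a simplified stand-in for the rest of the term pattern,
# i.e. optional whitespace followed by a run of non-operator characters.
TERM = r"\s*(?P<term>[^\s\-+|]+)"


def parse(query, modifier):
    """Return (modifier, term) pairs found in the query."""
    pattern = re.compile(modifier + TERM)
    return [(m.group("modifier"), m.group("term")) for m in pattern.finditer(query)]


print(parse("Confirms - Windows", OLD_MODIFIER))
# [('', 'Confirms'), ('-', 'Windows')]  -> "Windows" is negated
print(parse("Confirms - Windows", NEW_MODIFIER))
# [('', 'Confirms'), ('', 'Windows')]   -> the lone minus is ignored
print(parse("windows -vista", NEW_MODIFIER))
# [('', 'windows'), ('-', 'vista')]     -> "-word" still negates
```

The negative lookahead `(?!\s)` is what makes the difference: a minus followed by whitespace no longer matches as a modifier, so the following word is treated as a normal term.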
 