Fixed Search not working as expected

Affected version
2.0

Floyd R Turbo

Well-known member
Not suuuuuure this is a bug, but I just thought it was very odd.

When googling "bigboardadmin" I got a hit for a thread here:
https://xenforo.com/community/threads/tapatalk-and-security-convos-etc.61114/post-654395

But when I go to search the entire site for "bigboardadmin" or "bigboardadmin.com" I get no results.

If I search for "big board", "board admin", or "big board admin" I get some results, but none of them include the post/thread linked above.
 

Mike

XenForo developer
Staff member
I've finally managed to track this down. I thought it was a parsing quirk with Elasticsearch, but it was actually a regression in our parsing in XFES2 that led to "domain.com" actually being searched as "domain com". I've just rolled out a fix here and it will be included in XFES 2.0.1.
 

Chromaniac

Well-known member
I am noticing a similar issue, but for domain names in links. I managed to replicate it here in the test forum. Enhanced Search seems to ignore keywords in the domain, while keywords in the rest of the URL seem to work fine.

no result: https://xenforo.com/community/search/search/?keywords=livemint

result: https://xenforo.com/community/search/search/?keywords=11606043356871

So the behavior now is even more unusual. The first link shows this thread but not my test thread. The second link shows my thread in the test forum but not this one!
 

Mike

XenForo developer
Staff member
This is really just down to how Elasticsearch is doing the tokenizing in these cases. It's actually quite a complex operation in some cases. (It ultimately uses a Unicode text segmenting standard: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-standard-tokenizer.html)

www.domain.com is one word. You can confirm this by searching for the full domain name and it will return your test message. The number case behaves differently because the Unicode algorithm specifies some slightly different behaviors for numbers versus letters in terms of determining word boundaries.
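To make the letters-versus-numbers difference concrete, here is a rough Python sketch of the two relevant word-boundary ideas: a period between two letters does not split a word, and a period or comma between two digits does not split a number. This is a simplified approximation written for illustration only, not Elasticsearch's actual tokenizer or the full Unicode algorithm; the function name `simple_word_tokens` and the sample URL path are made up for the example.

```python
def simple_word_tokens(text: str) -> list[str]:
    """Very rough approximation of Unicode-style word segmentation.

    Assumption: only two joining rules are modeled here.
      - "." between two letters keeps the word together
      - "." or "," between two digits keeps the number together
    Any other non-alphanumeric character ends the current token.
    """
    tokens: list[str] = []
    current: list[str] = []
    n = len(text)
    for i, ch in enumerate(text):
        if ch.isalnum():
            current.append(ch)
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < n else ""
        joins_letters = ch == "." and prev_ch.isalpha() and next_ch.isalpha()
        joins_digits = ch in ".," and prev_ch.isdigit() and next_ch.isdigit()
        if current and (joins_letters or joins_digits):
            # The punctuation sits inside a word or number, so keep going.
            current.append(ch)
        elif current:
            # Anything else is treated as a word boundary.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


# The whole domain comes out as a single token, so "domain" alone is not a term.
print(simple_word_tokens("www.domain.com"))
# ['www.domain.com']

# Hypothetical URL: the number is followed by ".html" (digit, period, letter),
# which neither rule joins, so the number becomes its own searchable token.
print(simple_word_tokens("https://www.domain.com/news/11606043356871.html"))
# ['https', 'www.domain.com', 'news', '11606043356871', 'html']
```

Under these assumptions, a keyword that only appears inside the domain never exists as a standalone token, while the number in the path does, which matches the two search results above.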
 