
Clarification about "built in" search

Weppa333

Active member
#1
Sorry if this is a stupid question, but I can't find an answer

On my install, the built-in search seems rather dumb. For example, if you search for "car" it doesn't match the word "cars". Also, in French we obviously use many accents, and the search does not match "tête" if you look for "tete" (or the other way around).

Is there a way to
- index only unaccented characters in the search DB, and "unaccent" what's entered in the search box
- find partial matches by default (at least plurals should match!)
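The first item above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the XenForo search works internally: `fold_accents` and `normalize_term` are hypothetical helper names, and the tiny in-memory `index` stands in for the real search DB. The point is that the same normalization has to run both when indexing and when querying.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_term(text: str) -> str:
    """Hypothetical normalization applied at index time AND at query time."""
    return fold_accents(text).casefold()


# Toy stand-in for the search index: store only normalized words.
index = {normalize_term(word) for word in ["tête", "Élève", "cars"]}

# Both spellings of the query hit the same normalized entry.
print(normalize_term("tete") in index)
print(normalize_term("TÊTE") in index)
```

Note that this only folds characters that decompose into a base letter plus a mark; ligatures such as "œ" are left as they are, and plurals ("car" vs "cars") still need stemming or prefix matching on top.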

Finally, I would really like to un-UTF the URLs (because they look AWFUL when copy-pasted). Is there a way to do that?

Thanks to all!
 

CTXMedia

Formerly CyclingTribe
#2
On my install, the built-in search seems rather dumb. For example, if you search for "car" it doesn't match the word "cars". Also, in French we obviously use many accents, and the search does not match "tête" if you look for "tete" (or the other way around).
This is known as stemming and isn't available in the default search.

To get stemming you need to use the XFES (XenForo Enhanced Search) add-on - although I'm unsure if it supports French, so it may be worth asking in the ES support forum.

Cheers,
Shaun :D
 

Weppa333

Active member
#3
I'd be happy to buy an extra product if a team member could confirm it would solve these issues. :)

In the meantime, I'll simply disconnect the built-in search altogether and replace it with a Google CSE.
 

Brogan

XenForo moderator
Staff member
#4

Weppa333

Active member
#5
Without getting into too much complexity, it would be nice (maybe for 1.2?) to have an option to simply:
- unaccent everything sent to the search index, whether it comes from the search box or from new posts (looks feasible to me?)
- unaccent (un-UTF8, transliterate) URLs for people allergic to UTF8 in URLs, like I am.

It would solve most problems. The search for "plurals" is another story.
I understand French is a bit "in between" plain ASCII and UTF8: many people simply don't use accents when they should, so the benefits of UTF8 (which is obviously a NEED for some languages) are small. In fact, many French webmasters struggle to get software vendors (IPB, XF, etc.) to remove UTF8 from URLs, because it currently does more harm than good (for French).
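To make the URL part concrete, here is a rough Python sketch of what "un-UTF8" could mean for a thread title. `ascii_slug` is a made-up name and this is not XenForo's own URL routine, just the general transliterate-then-slugify idea:

```python
import re
import unicodedata


def ascii_slug(title: str) -> str:
    """Turn a title into a lowercase ASCII-only URL slug (hypothetical helper)."""
    # Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", title)
    # Dropping everything non-ASCII removes the combining marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse any run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


print(ascii_slug("Problème de tête à l'école"))  # probleme-de-tete-a-l-ecole
```

Characters with no ASCII base letter are simply dropped by this approach, which is acceptable for French but would not be for languages that really need UTF8.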

Just my opinion though.
 

HWS

Well-known member
#6
This conclusion is not correct. ;-)

The default stemming language in XFES is English. If you change it to "French" and rebuild your index, it should do exactly what you are looking for. And it is possible to change the stemmer: we have a forum where we successfully changed it to "German", which has the same "problems" as French.

However, I cannot remember how we did it. I only remember finding a solution on Elasticsearch's web site and following that description exactly. It can't be done in the GUI. Maybe Slavik, the XFES expert, can help with that?

Maybe this helps to find the way:
https://www.google.com/#output=sear...lasticsearch&oq=stemming+french+elasticsearch
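For orientation only, a French analyzer in Elasticsearch index settings might look roughly like the sketch below. The filter types (`stemmer`, `elision`, `asciifolding`, `lowercase`) and the `light_french` stemmer come from the Elasticsearch analysis documentation; the names `forum_french`, `french_elision` and `french_stemmer` are invented here, and how XFES expects such settings to be applied to its index is an assumption not covered in this thread. It is not necessarily what was done on the German forum.

```python
import json


def french_analysis_settings() -> dict:
    """Build a hypothetical 'analysis' settings body with French stemming."""
    return {
        "analysis": {
            "filter": {
                # Strip elided articles such as l' and d' before stemming.
                "french_elision": {
                    "type": "elision",
                    "articles": ["l", "d", "qu", "j", "n", "s", "m", "t", "c"],
                },
                # Built-in Elasticsearch stemmer, French variant.
                "french_stemmer": {"type": "stemmer", "language": "light_french"},
            },
            "analyzer": {
                # 'forum_french' is an invented analyzer name.
                "forum_french": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "french_elision",
                        "lowercase",
                        "french_stemmer",
                        "asciifolding",  # folds accented letters to ASCII
                    ],
                }
            },
        }
    }


# Print the JSON body that would go into the index settings.
print(json.dumps(french_analysis_settings(), indent=2, ensure_ascii=False))
```

The index would have to be rebuilt after any analyzer change, as mentioned above.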
 

Weppa333

Active member
#7
Just wanted to thank everyone for their time. It's really great to be helped and supported here. I'll also have a look at the ES docs; I'm more of a former Sphinx fan, so this is all new to me.