Unmaintained [WMTech] How to configure Elasticsearch for non-English languages

Optimize Elasticsearch for German, French and other languages with special letters

  1. wmtech

    wmtech Well-Known Member

    wmtech submitted a new resource:

    How to configure Elasticsearch for non-English languages - Optimize Elasticsearch for German, French and other languages with special letters

     
  2. TBDragon

    TBDragon Active Member

    thanks for this

    but does it work with Arabic?!
     
    thedude likes this.
  3. wmtech

    wmtech Well-Known Member

    The snowball filter does not support Arabic, only the languages mentioned in the resource description.
    Sorry, you will need to find another solution.
     
  4. Marcus

    Marcus Well-Known Member

    What about the XenForo ACP settings for the Elasticsearch index? Are these settings overwritten by these lines?

    /elasticsearch/config/elasticsearch.yml


    index.analysis.analyzer.default.type: custom
    index.analysis.analyzer.default.tokenizer: standard
    index.analysis.analyzer.default.filter: ["standard", "lowercase", "stop", "snow", "length" ]
    index.analysis.filter.snow.type: snowball
    index.analysis.filter.snow.language: German2
    index.analysis.filter.length.type: length
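
For readers who would rather apply the same analysis chain to a single index than to the whole node, here is a minimal sketch that expresses the lines above as the nested settings body Elasticsearch accepts when an index is created. The function name `build_analysis_settings` is made up for illustration; the keys simply mirror the flat `index.analysis.*` lines. Whether a given Elasticsearch version still accepts the "standard" token filter, or index settings in elasticsearch.yml at all, is an assumption to check against that version's documentation.

```python
import json


def build_analysis_settings(language: str = "German2") -> dict:
    """Return the analysis settings from the post as a nested dict.

    Hypothetical helper: it only builds the request body and does not
    talk to Elasticsearch.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Same chain as index.analysis.analyzer.default.*
                    "default": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["standard", "lowercase", "stop", "snow", "length"],
                    }
                },
                "filter": {
                    # index.analysis.filter.snow.*
                    "snow": {"type": "snowball", "language": language},
                    # index.analysis.filter.length.type
                    "length": {"type": "length"},
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body so it can be inspected or sent with any HTTP client.
    print(json.dumps(build_analysis_settings(), indent=2))
```

Swapping the `language` argument for another snowball language is the only change needed for a different language.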
     
  5. wmtech

    wmtech Well-Known Member

    If you change the settings for the XenForo search index from the XenForo ACP (for example, by switching "stemming" on), they will override the defaults you are setting with those lines and thus disable this modification.

    You are safe as long as you DO NOT change the Elasticsearch settings in the XenForo ACP after you have added those lines to elasticsearch.yml.
     
    Dennis B and Marcus like this.
  6. Dennis B

    Dennis B Member

    It looks like the new ES release now supports these different languages directly from the ES Options in the AdminCP, correct?
     
  7. duderuud

    duderuud Active Member

    Indeed, looks like it...
     
  8. wmtech

    wmtech Well-Known Member

    With the most recent XFES add-on you can choose the language setting from the ACP.
    However, it does no harm to set this in the Elasticsearch config as well.
     
    Dennis B likes this.
  9. sinucello

    sinucello Well-Known Member

    Hi,
    thanks, this is really useful. But the German2 umlaut stemming is not working very well. It's a good thing that a search for "gruen" also finds "grün" (green), but a search for Kuchen (cake) will also return Küchen (kitchens), which is not relevant.

    So we have two kinds of umlaut words:
    • 2 variants of a word with exactly the same meaning
    • 2 words with totally different meanings
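
Why both kinds end up in the same result set can be seen with a small sketch. It is a deliberately simplified stand-in for the umlaut handling of the German2 snowball stemmer (ß becomes ss, ae/oe/ue become ä/ö/ü unless the ue follows a q, and the umlaut accents are dropped again at the end). It does no suffix stripping, and `fold_umlauts` is a made-up name, not an Elasticsearch or Snowball API.

```python
import re


def fold_umlauts(word: str) -> str:
    """Simplified model of German2 umlaut handling (not the real stemmer)."""
    w = word.lower().replace("ß", "ss")
    # Digraph spellings are first mapped to real umlauts ...
    w = w.replace("ae", "ä").replace("oe", "ö")
    w = re.sub(r"(?<!q)ue", "ü", w)  # "ue" after "q" is left alone
    # ... and the accents are removed again afterwards.
    return w.translate(str.maketrans("äöü", "aou"))


if __name__ == "__main__":
    for a, b in [("gruen", "grün"), ("Kuchen", "Küchen")]:
        # Both pairs collapse to one token, whether or not the meaning matches.
        print(a, b, fold_umlauts(a) == fold_umlauts(b))
```

The folding has no notion of meaning, so the wanted match (gruen/grün) and the unwanted one (Kuchen/Küchen) are indistinguishable at index time.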
    Google can handle this very well. I don't know if this can be solved by scripting. It would already help if the word the user typed in were weighted higher than the variant that is found only because of the umlaut stemming.

    I would like to hear your experiences or solutions for this problem.

    all the best,
    Sacha
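
One way the weighting idea might be approached, sketched here under assumptions: index the text twice, once with the snowball chain and once in a sub-field without stemming, then boost matches on the unstemmed sub-field. The field names `message` and `message.exact`, the helper name, and the boost value are all hypothetical, and whether the XenForo add-on allows a custom mapping or query like this is not something the thread confirms. The function only builds a query body in Elasticsearch's `bool`/`match` query DSL.

```python
def build_weighted_query(user_input: str, exact_boost: float = 3.0) -> dict:
    """Build a query that prefers the literal spelling the user typed.

    Assumes a hypothetical mapping in which "message" uses the stemming
    analyzer and "message.exact" is analyzed without the snowball filter.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Stemmed field: finds "grün" for "gruen" (and "Küchen" for "Kuchen").
                    {"match": {"message": {"query": user_input}}},
                    # Unstemmed field: only the typed spelling, scored higher.
                    {"match": {"message.exact": {"query": user_input, "boost": exact_boost}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    import json
    print(json.dumps(build_weighted_query("Kuchen"), indent=2, ensure_ascii=False))
```

With this shape, documents containing the umlaut variant still match through the stemmed field, but documents containing the typed spelling pick up the extra boosted clause and should rank above them.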
     
