Adding a new language (Polish), plus a few other questions.

Discussion in 'Enhanced Search Support' started by janslu, Sep 20, 2016.

  1. janslu

    janslu Member

    1. I am using Enhanced Search on a Polish-language forum with 13,000,000 posts, and it is working great. I have started looking into Polish stemming, hoping for even better search results and accuracy. There is a plugin called Stempel; it seems to integrate the Polish language rules from Lucene and is easy to install in Elasticsearch. But the list of stemming languages in XenForo ES seems to be hardcoded. What should I do to add a new option there? Has anyone done this before? If I understand correctly, I would also need to reindex the site afterwards?

    2. Elasticsearch seems to use a lot of memory: top shows 18 GB of virtual memory usage. I know most of it sits unused (part of it goes to swap), but that still seems like a lot for a 6.3 GB index. Are there any options I should look into?

    3. What does the Optimize mappings button do? It doesn't seem to do anything on my forum...
  2. Mike

    Mike XenForo Developer Staff Member

    This isn't just a matter of changing the language listed in XF. You would need to use a totally different analyzer from the one set within XF itself. XF only knows about the Snowball analyzer and the languages it supports: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-snowball-analyzer.html

    You'd need to configure this within ES via the command line.
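
As a rough illustration of what "configuring this within ES" could look like, the sketch below builds index settings that define a custom analyzer using the `polish_stem` token filter provided by the Stempel plugin, and prints a matching curl command. The index name `xf_example`, the analyzer name `polish_custom`, and the host URL are hypothetical placeholders. How XFES's own mapping would be pointed at such an analyzer is not covered in this thread, so treat this only as a starting point.

```python
import json

# Hypothetical names: adjust to the index your forum actually uses.
INDEX_NAME = "xf_example"
ES_URL = "http://localhost:9200"


def polish_index_settings(analyzer_name="polish_custom"):
    """Return index settings with a custom analyzer that applies Stempel's
    `polish_stem` token filter (requires the Stempel plugin to be installed)."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "polish_stem"],
                    }
                }
            }
        }
    }


def curl_command(index_name=INDEX_NAME, es_url=ES_URL):
    """Build a curl command that would create the index with these settings."""
    body = json.dumps(polish_index_settings())
    return (
        f"curl -XPUT '{es_url}/{index_name}' "
        f"-H 'Content-Type: application/json' -d '{body}'"
    )


if __name__ == "__main__":
    print(curl_command())
```

Analyzer settings like these are normally supplied when an index is created, which is consistent with the expectation that the site would need to be reindexed afterwards.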

    Well, optimizing the mappings will help reduce memory usage. We've generally seen an index size of around 300 MB per million posts, and a smaller index should help memory usage. The option generally should not appear if it's not needed. Make sure you're running the latest XFES.
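
To put the 300 MB per million posts figure next to the numbers from the first post (13,000,000 posts, 6.3 GB index), here is a small back-of-the-envelope calculation. It assumes decimal units (1 GB = 1000 MB) and treats the 300 MB figure as a rough rule of thumb, not a guarantee.

```python
def expected_index_gb(posts_millions, mb_per_million=300):
    """Estimated index size in GB from the rule-of-thumb MB per million posts."""
    return posts_millions * mb_per_million / 1000


posts_millions = 13        # forum size from the first post
reported_index_gb = 6.3    # index size reported in the first post

expected = expected_index_gb(posts_millions)   # about 3.9 GB
ratio = reported_index_gb / expected           # how far above the rule of thumb

print(f"expected ~{expected:.1f} GB, reported {reported_index_gb} GB, "
      f"ratio {ratio:.2f}x")
```

By this estimate the reported index is noticeably larger than the typical figure, which fits the suggestion that optimized mappings could shrink it.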

    Determining true memory usage can be difficult because of things like mmap. Is the memory usage causing you problems elsewhere? Of course, if you have a fast disk (SSD), you could consider reducing the amount of memory given to ES/java and just rely on fast SSD access. Generally though you do want to fit your data in memory if possible (both in ES and MySQL).
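
For the idea of reducing the memory given to ES/Java, a minimal sketch of picking a heap size on a server shared with MySQL follows. The RAM figures are hypothetical, since the thread does not state the server's physical memory. The "half of what is left, capped below roughly 32 GB" rule is common Elasticsearch guidance, not something specified here. `-Xms` and `-Xmx` are the standard JVM heap flags. Where they are set (for example an environment variable or a JVM options file) depends on the Elasticsearch version.

```python
def heap_flags(total_ram_gb, reserved_for_other_gb, cap_gb=31):
    """Suggest JVM heap flags for Elasticsearch on a shared server.

    total_ram_gb:          physical RAM on the machine (hypothetical input)
    reserved_for_other_gb: RAM set aside for MySQL, PHP, etc. (hypothetical)
    cap_gb:                upper bound that keeps the heap below ~32 GB
    """
    available = total_ram_gb - reserved_for_other_gb
    if available <= 0:
        raise ValueError("no memory left for Elasticsearch")
    # Give the heap half of what remains; the rest is left to the OS
    # file cache, which Lucene relies on for mmap'd index files.
    heap = max(1, min(available // 2, cap_gb))
    return f"-Xms{heap}g -Xmx{heap}g"


if __name__ == "__main__":
    # Example: a 32 GB server with 16 GB reserved for MySQL and the rest.
    print(heap_flags(32, 16))
```

Setting the minimum and maximum heap to the same value avoids heap resizing at runtime. The virtual memory figure shown by top can still be much larger than the heap because of mmap.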
  3. janslu

    janslu Member

    Oh my... This is much more complicated than I thought. I was happy to find a Polish stemmer, but it seems it was created before Snowball and is considered "the stemmer" for Polish. I have it installed in Elasticsearch and will look into actually using it for search.

    I am running the latest XFES, but I'm also using the digitalpoint add-on. All in all, optimizing the mappings doesn't seem to do anything. I will try to play with it later on.
    As for the memory: I am using a single server for MySQL and Elasticsearch, and I am getting closer and closer to the physical memory limit. I was hoping to find a setting that would free some of the ES memory. I will dig into optimizing the mappings and see where it leads me.

    Thanks for the support.