Help me configure Elasticsearch better!

zackw

Member
We recently changed VPSes due to CentOS being discontinued. In this case we ended up with AlmaLinux 8.

I re-set up everything as best I could to get the sites working again, but I know more can be done to make indexing better.
Elasticsearch would crash all the time with OOM, so I set up Monit to auto-restart it, but lately it's been nuts, restarting dozens of times a day.

The server is 4vCPU and 8GB RAM. It has no swap file because apparently that's not allowed at this vendor.

I just now looked up versions of things and it's running ES 7.17.25 and openjdk 22.0.1. Default settings.

I know Elasticsearch's default is to set the heap to half the total RAM; in my case it picked 4GB, however it determined that. I have already dropped the heap to 3GB in a custom jvm.options file just to get the system working again, but in total it's still using around 7GB, the cache is using the rest, and RAM is pretty much still maxed out.
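For reference, on a package-based install this kind of heap override normally lives in a drop-in file under /etc/elasticsearch/jvm.options.d/ rather than in the main jvm.options file, so upgrades don't clobber it. A minimal sketch (the filename is arbitrary):

```ini
# /etc/elasticsearch/jvm.options.d/heap.options
# Keep -Xms and -Xmx identical so the heap is allocated up front
# and never resized at runtime.
-Xms3g
-Xmx3g
```

A restart of the elasticsearch service is needed for the change to take effect.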

That said, all services run on this box, it is a full WHM/cPanel server, and has 4 websites. One of them is XenForo and is the only one with traffic really, the other sites don't get meaningful traffic.

XF is the only one using ES, with one index. It reports 660,000+ documents at just over 400MB. Searches average 19 milliseconds. The allocated memory has been reporting about 68KB all day while I've been working on this. I'm guessing that with about 500 active users we're getting 300+ searches an hour; not really sure on that one, 1,500 in the last couple of hours.

-----------

Now here are my questions trying to sort this out:

1) All the documentation says ES needs 4GB, or half the RAM, out of the box, period. I've not seen any official statement that it can run with less than 4GB as a minimum. Is this true? I mean, if there is no index at all and Elasticsearch is simply installed, does it still need that much RAM just to exist? What if I'm indexing 10 documents totaling 5MB? This doesn't make sense to me.

2) Is it wrong to think that the heap size should be based more on the size of the actual indexes? My total index is less than 500MB, so why would I need 4GB RAM? What is the rest of the overhead even doing? I found exactly one person on the internet suggesting I could set the heap to about 20% of the index size. Applying the Pareto principle, about 80% of searches hit 20% of the index, so I guess I shouldn't need much more.
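The "20% of index size" rule of thumb from above can be turned into a rough back-of-envelope estimate. This is just a sketch: the 20% ratio comes from that one post, and the fixed JVM baseline is an assumption I've added, not anything from Elastic's documentation.

```python
# Rough heap-sizing sketch: fixed JVM/ES baseline plus a fraction of
# the on-disk index size. Both constants are assumptions from the
# discussion above, not official Elasticsearch guidance.

INDEX_SIZE_MB = 402      # on-disk index size reported by XenForo
HEAP_RATIO = 0.20        # suggested heap ~= 20% of index size
JVM_BASELINE_MB = 512    # assumed fixed JVM overhead (a guess)

def suggested_heap_mb(index_mb: float,
                      ratio: float = HEAP_RATIO,
                      baseline_mb: float = JVM_BASELINE_MB) -> float:
    """Estimate heap as a baseline plus a fraction of index size."""
    return baseline_mb + index_mb * ratio

print(f"{suggested_heap_mb(INDEX_SIZE_MB):.0f} MB")
```

By this (very informal) arithmetic, even a 1GB heap would leave plenty of headroom for a 402MB index, which matches the observation later in the thread that the heap only reached ~1.1GB under a 3GB cap.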

3) Would it be wise to update ES and openjdk to the latest versions? I know I have to completely uninstall ES to do this. Are there any downsides, like requiring even more RAM for the later versions?

4) What other optimizations can I do to reduce RAM needs, given I can't use a swap file? I've read about changing malloc as one option, or tweaking other Java and ES variables.
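On the malloc front, one commonly suggested glibc tweak for long-running JVMs is capping the number of malloc arenas, which can reduce resident memory on multi-core boxes. A sketch as a systemd drop-in; the override path is standard systemd practice, but the value of 2 is an assumption to experiment with, not a recommendation from Elastic:

```ini
# /etc/systemd/system/elasticsearch.service.d/override.conf
[Service]
Environment=MALLOC_ARENA_MAX=2
```

After creating the file, run `systemctl daemon-reload` and restart elasticsearch, then compare RSS over a few days against the previous baseline.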

5) Am I screwed and need to upgrade the server to get even more RAM? It seems crazy that I should need more than 8GB for a forum with 655k posts and 60k members and a 400MB index.

6) What other tools could I use to manage this? It gets old and ugly trying to do everything on a command line.
 
We've got a couple of ES7 instances (on test/dev environments) running with lower JVM memory limits (e.g. -Xms2g -Xmx2g) without any issues, but they are not heavily used. For most of our active clients we tend to run 4GB+; their indexes tend to be up to about 1-2GB on disk, and since we have the memory we've never tried them with less (sorry, not much use when you are memory constrained!). I remember with earlier ES instances (mainly ES5) we used to see some random crashing and used monit to keep them running, but not very often these days, and you're on a current release. (We never did quite track down those crashes, but we run FreeBSD rather than Linux, so there isn't any formal ES support, and the crashes were not frequent.)

The one ES instance we have tied to Xenforo (if it's of any use to you as a reference) currently sits at (/_cat/indices?format=json):
JSON:
[
  {
    "health": "green",
    "status": "open",
    "index": "*****",
    "uuid": "*****",
    "pri": "1",
    "rep": "0",
    "docs.count": "9647560",
    "docs.deleted": "2051648",
    "store.size": "2.2gb",
    "pri.store.size": "2.2gb",
    "dataset.size": "2.2gb"
  }
]
That board has 170,317 threads and 9,527,895 messages. That ES instance has 4GB for -Xms and -Xmx and, beyond the occasional error relating to a network timeout in XF, seems happy enough.

Are you actually seeing an error in the ES logs? You can tweak up the verbosity if you want in the log4j config which might reveal something.
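If it helps, logger verbosity can also be raised at runtime through the cluster settings API rather than editing the log4j2 config and restarting. A sketch, assuming ES is listening on localhost:9200 with no authentication:

```
curl -X PUT 'localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{"persistent": {"logger.org.elasticsearch": "DEBUG"}}'
```

Setting the logger value to null afterwards reverts it to the default level; DEBUG is noisy, so it's best left on only while reproducing the crash.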
 
I have VPSs with 8GB RAM and I have uncommented -Xmx4g and -Xms4g in the jvm.options file, and it works fine.
If it's using 7GB then it sounds like it's not abiding by (or reading) the limits for some reason, so that's got to be the first place to start?
 

That doesn't say anything about how to calculate heap size though.


The one ES instance we have tied to Xenforo (if it's of any use to you as a reference) currently sits at (/_cat/indices?format=json):

So you have 9.6 million documents and 2.2GB index on 4GB heap, that's helpful, since I only have 660k docs and 402MB index.

I changed my heap from 4GB to 3GB, but with everything else the server does, RAM use is still pretty high.
 
I'd have thought that should be fine with a lower heap size. I mean if I look at our aforementioned server with /_cat/nodes?format=json&h=heap* I see:
JSON:
[
  {
    "heap.current": "536.2mb",
    "heap.percent": "13",
    "heap.max": "4gb"
  }
]
Which implies (although I am no ES expert) that there is rather a lot of headroom left. I've seen a few posts over time suggesting tuning the garbage collection on ES to better manage the heap size, although I'd have thought that now we're up to versions 7 and 8 the built-in processes for this are probably pretty optimised. It's not something I've played with.

I guess you could work on a process of elimination - temporarily grab another machine and use that just for ES and see if it still dies lots - that might suggest some OS woe maybe rather than memory exhaustion. I'd not expect ES to actually die - just get slow. This blog article whilst not immediately helpful does indicate that you can vary the heap quite a bit: https://bigdataboutique.com/blog/tuning-elasticsearch-the-ideal-java-heap-size-2toq2j
 
I'd have thought that should be fine with a lower heap size. I mean if I look at our aforementioned server with /_cat/nodes?format=json&h=heap* I see: ...

Thanks for that command, I'm seeing the heap using 1.1GB with a max of 3GB which is what I changed it to. So I feel pretty safe setting it down another notch to 2GB.

The reason ES was dying is just the normal OOM killer: RAM maxed out, this host doesn't allow a swap file to be created, so the OOM killer got rid of ES.

It has not been killed again in the few days since I changed from 4GB to 3GB, but free RAM is still slim, with buffer/cache only having 1.3GB. I think I'll change ES to 2GB and keep an eye on it.
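For keeping an eye on it, the kernel logs every OOM kill, so the kill events (and how much memory each process held at the time) can be confirmed from the kernel log rather than inferred from Monit restarts:

```
# Show recent OOM-killer activity from the kernel ring buffer
journalctl -k | grep -i 'out of memory'

# Snapshot of current usage, including buffer/cache
free -m
```

If the journalctl output names the java process repeatedly, that confirms it's memory exhaustion rather than an ES-internal crash.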
 