I'm running 1.4.2. Looking back, the crash happened around the time a 5GB tar file was being made; my theory is that may have consumed a bit too much memory and caused ES to fail. This is not, however, the first time I have made such a large tar file, and ES has been running for months without issue.
Are there any log files I should check? The ones in /var/log/elasticsearch don't show anything useful.
I thought search index operations were queued in the event an entry could not be indexed? @Mike, can you comment on this? It's not unusual for Elasticsearch to fall over now and then, and re-indexing every time it does seems pretty extreme, especially on a 10-million-post forum. If indexing failures aren't queued, then adding failed items to a queue (for retrying) triggered via cron might overcome this in future updates to enhanced search.
They are queued and retried with increasing delays of 1, 2, 4, 8 and 16 hours between attempts. Unless you see errors in the server error log indicating that indexing failed more than 5 times and the item was skipped, the data should eventually appear, though it could be 8 to 16 hours later. A reindex would bring it in immediately.
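A minimal sketch of how a retry schedule like that behaves. The delay values come from the description above; the function name and structure are hypothetical illustrations, not XenForo's actual implementation:

```python
# Hypothetical sketch of the retry schedule described above:
# 1, 2, 4, 8 and 16 hours between attempts, with the item skipped
# (and an error logged) once all retries are exhausted.

RETRY_DELAYS_HOURS = [1, 2, 4, 8, 16]
MAX_FAILURES = len(RETRY_DELAYS_HOURS)

def next_delay_hours(failure_count: int):
    """Hours to wait before the next attempt, or None once the item is skipped."""
    if failure_count >= MAX_FAILURES:
        return None  # too many failures: give up and log the skip
    return RETRY_DELAYS_HOURS[failure_count]

# The full retry window: Elasticsearch would have to stay down for the
# whole schedule before an item is permanently missed.
total_window_hours = sum(RETRY_DELAYS_HOURS)  # 1 + 2 + 4 + 8 + 16 = 31
```

The sum of the delays is where the 31-hour figure in the next post comes from.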
I thought that was the case... so, to fail 5 times, let's do the math: the Elasticsearch instance would need to be offline for 31 hours straight (1 + 2 + 4 + 8 + 16) for indexing to be missed. That's a pretty generous window, @DeltaHF.
To begin with, do not use a very large Java heap if you can help it: set it only as large as necessary (ideally no more than half of the machine's RAM) to hold the overall maximum working set size for your usage of Elasticsearch.
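On the 1.x series being discussed here, the heap is usually set via the ES_HEAP_SIZE environment variable. A sketch, assuming an 8 GB machine and a Debian/Ubuntu-style install (the file path varies by distribution):

```shell
# /etc/default/elasticsearch (Debian/Ubuntu)
# or /etc/sysconfig/elasticsearch (RHEL/CentOS)
# Half of an 8 GB machine's RAM, leaving the rest for the OS filesystem cache:
ES_HEAP_SIZE=4g
```

Setting min and max to the same value this way avoids heap resizing pauses; restart the elasticsearch service after changing it.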