Always rebuild search index after ElasticSearch crash?

DeltaHF · Apr 27, 2015

ElasticSearch crashed on my server, and it was down for about 12 hours before I was able to restart the service (it fixed the issue, though I still don't know the reason for the crash).

Of course, content added to the site while ES was down does not appear to be in the index. Do I need to rebuild the entire search index to get this missing content back into it?

rdn · Apr 27, 2015

DeltaHF said:
though I still don't know the reason for the crash

Probably because of this: https://bugzilla.openvz.org/show_bug.cgi?id=3187

MattW · Apr 27, 2015

RoldanLT said:
Probably because of this: https://bugzilla.openvz.org/show_bug.cgi?id=3187

If it were due to that, ES wouldn't stay up at all; it would crash constantly.

AndyB · Apr 27, 2015

DeltaHF said:
Do I need to rebuild the entire search index to get this missing content back into it?

Yes.

DeltaHF · Apr 27, 2015

Thanks, Andy.

RoldanLT said:
Probably because of this: https://bugzilla.openvz.org/show_bug.cgi?id=3187

I'm running 1.4.2. Looking back, the crash happened around the time a 5GB tar file was being made; my theory is that it may have consumed a bit too much memory and caused ES to fail. This is not, however, the first time I have made such a large tar file, and ES has been running for months without issue.

Are there any log files I should check? The ones in /var/log/elasticsearch don't show anything useful.

AndyB · Apr 27, 2015

DeltaHF said:
my theory is that it may have consumed a bit too much memory

How much memory does your server have?

DeltaHF · Apr 27, 2015

16GB (Linode SSD). I've only given ES 1GB, though it's never complained about memory.

It's a 10.3 million post forum, all running on one box.

Rob · Apr 27, 2015

I thought search index operations were queued in the event it was not possible to index an entry? @Mike, can you comment on this? It's not unusual for Elasticsearch to fall over now and then, and re-indexing every time it does so seems pretty extreme, especially on a 10 million post forum. If they aren't queued, then adding failed items to a queue (for retrying), triggered via cron, might overcome this in future updates to Enhanced Search.

Mike · Apr 27, 2015

They are queued, with increasing delays before repeating the action of 1, 2, 4, 8 and 16 hours (that's the time between each try). Unless you see errors in the server error log indicating that indexing failed more than 5 times and it was skipped, the data should eventually appear, but it could be 8 - 16 hours later. A reindex would bring it in immediately.

Rob · Apr 27, 2015

I thought that was the case... so, to fail 5 times, let's do the math: the Elasticsearch instance would need to be offline for 31 hours straight (1 + 2 + 4 + 8 + 16) for indexing to be missed. That's a pretty nice window, @DeltaHF.
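To make the arithmetic concrete, here is a minimal sketch of the retry schedule as Mike describes it. It is not XenForo's actual code: the function names are hypothetical, and it assumes the delays are exactly 1, 2, 4, 8 and 16 hours, counted from the first failed attempt, with the item skipped once all five retries have failed.

```python
from itertools import accumulate

# Delays between tries, in hours, as described by Mike above (assumed exact).
RETRY_DELAYS_HOURS = [1, 2, 4, 8, 16]


def retry_times(delays=RETRY_DELAYS_HOURS):
    """Hours after the first failed attempt at which each retry fires."""
    return list(accumulate(delays))


def first_successful_retry(outage_hours_remaining, delays=RETRY_DELAYS_HOURS):
    """Hour of the first retry that lands after ES is back up.

    outage_hours_remaining: how long ES stays down after the first
    failed indexing attempt. Returns None if every retry fires while
    ES is still down, i.e. the item would be skipped.
    """
    for t in retry_times(delays):
        if t > outage_hours_remaining:
            return t
    return None
```

Under these assumptions, a post made right at the start of a 12-hour outage would be retried at hours 1, 3 and 7 (all failing) and indexed at hour 15, about 3 hours after ES came back. Only an outage lasting the full 31 hours would exhaust all five retries.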

AndyB · Apr 27, 2015

DeltaHF said:
It's a 10.3 million post forum,

In that case your ES_HEAP_SIZE should be 10GB if I understand correctly.

Xon · Apr 28, 2015

AndyB said:
In that case your ES_HEAP_SIZE should be 10GB if I understand correctly.

Nah, I've got ~16 million posts in ~1.5GB of RAM for a 3-node Elasticsearch cluster on some Linode VPSs, and it works absolutely fine.

SSDs offer a massive performance benefit for Elasticsearch, as it will just trade IOPS for memory usage. And the modern SSDs that Linode and such use have IOPS to spare.

https://sbdevel.wordpress.com/2013/06/06/memory-is-overrated/

Economically, copious amounts of RAM does not make sense. Yes, you guessed it, this is about Solid State Drives.

Their price is 1/10 of RAM (or 1/5 if you want RAID 1)

They suffer a lot less from the cleared disk cache problem

They can be easily RAIDed for TB-scale

They even draw less power than the same amount of RAM

...
Conclusion
Using SSDs as storage for search delivers near maximum performance at a fraction of the cost of an equivalent RAM solution.

Throwing more memory at Elasticsearch isn't always desirable: https://www.elastic.co/blog/performance-considerations-elasticsearch-indexing

To begin with, do not use a very large java heap if you can help it: set it only as large as is necessary (ideally no more than half of the machine's RAM) to hold the overall maximum working set size for your usage of Elasticsearch
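Rather than sizing the heap from post count, one option is to look at how full the heap actually is. The sketch below is an illustration under assumptions, not something from this thread: it assumes ES is reachable at http://localhost:9200 and that the node stats response from `/_nodes/stats/jvm` carries `heap_used_percent` and `heap_max_in_bytes` under each node's `jvm.mem`. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def heap_usage(stats):
    """Map node name -> (heap used %, heap max in bytes).

    `stats` is the parsed JSON body of GET /_nodes/stats/jvm
    (field names assumed as described above).
    """
    usage = {}
    for node_id, node in stats.get("nodes", {}).items():
        mem = node["jvm"]["mem"]
        name = node.get("name", node_id)
        usage[name] = (mem["heap_used_percent"], mem["heap_max_in_bytes"])
    return usage


def fetch_heap_usage(base_url="http://localhost:9200"):
    """Query a running node; base_url is an assumption about your setup."""
    with urlopen(base_url + "/_nodes/stats/jvm") as resp:
        return heap_usage(json.load(resp))
```

Calling `fetch_heap_usage()` periodically (from cron, say) and watching for a heap that sits near 100% would be one way to tell whether a 1GB heap is really too small before raising it.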

jeffwidman · Dec 7, 2015

@Xon curious if you're running even lower RAM for ES now that v2 is out?

I saw that one of the improvements mentioned in the ES changelog was more efficient memory use.

Xon · Dec 8, 2015

I haven't actually changed over to ES v2 (or XF 1.5.3 & XFES 1.1.3) yet due to time constraints.

jeffwidman · Dec 8, 2015

Xon said:
I haven't actually changed over to ES v2 (or XF 1.5.3 & XFES 1.1.3) yet due to time constraints.

I understand, I'm in the same situation, planning to switch later this week. Whenever you do eventually switch, if you experiment with heap sizes I'd be curious to hear the results.
