Implemented: Cache posts for later insertion if Elasticsearch service is offline.

Slavik

XenForo moderator
Staff member
#1
As per title, really.

It would be good if, when Elasticsearch stops responding, the post that failed to index were added to a backup MySQL table on error, so that once the admin has fixed the issue, he has the option to hand those posts back to Elasticsearch for indexing without requiring a full re-index.

Don't think it would be too hard either? Just have a script parse the stack trace that the logs throw up on error and extract the required information.
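
A minimal sketch of the backup-table idea, under stated assumptions: `sqlite3` stands in for MySQL so the example is self-contained, and `search_backlog`, `SearchUnavailable`, `index_or_backlog`, `replay_backlog` and the `index_post` callable are all hypothetical names, not part of XenForo or Elasticsearch. Note that it records the post ID at the moment indexing fails, rather than parsing stack traces out of the logs afterwards.

```python
import sqlite3


class SearchUnavailable(Exception):
    """Raised by the (hypothetical) indexer when the search service is down."""


def make_backlog(conn):
    # Backup table holding the IDs of posts that could not be indexed.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS search_backlog (post_id INTEGER PRIMARY KEY)"
    )


def index_or_backlog(conn, post_id, index_post):
    """Try to index a post; if the search service fails, remember its ID."""
    try:
        index_post(post_id)
        return True
    except SearchUnavailable:
        conn.execute(
            "INSERT OR IGNORE INTO search_backlog (post_id) VALUES (?)", (post_id,)
        )
        conn.commit()
        return False


def replay_backlog(conn, index_post):
    """Admin-triggered: hand backlogged posts back to the indexer, lowest ID first."""
    replayed = 0
    rows = conn.execute(
        "SELECT post_id FROM search_backlog ORDER BY post_id"
    ).fetchall()
    for (post_id,) in rows:
        try:
            index_post(post_id)
        except SearchUnavailable:
            break  # still down; keep the remaining rows for the next attempt
        conn.execute("DELETE FROM search_backlog WHERE post_id = ?", (post_id,))
        replayed += 1
    conn.commit()
    return replayed
```

Once the service is back, calling `replay_backlog` would re-submit only the missed posts, which is the point of avoiding a full re-index.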
 

Deebs

Well-known member
#2
Very good suggestion. Thinking out loud:
  1. Insert post into main post table
  2. Insert postid into an xf_tobe_indexed table
  3. Cron job runs every x minutes scanning the xf_tobe_indexed table; if there are entries, index them and add them to ES
  4. blah blah
Obviously with all the error trapping gubbins wrapped around etc.
 

lazy llama

Well-known member
#3
Sounds like a good idea, and is pretty much how the SphinxSearch add-on works out which posts to add to the delta.
Could even borrow the code ;)
 

digitalpoint

Well-known member
#6
I just saw this after making a semi-related suggestion...

http://xenforo.com/community/threads/fallback-hostnames.30962/

It would be much easier to implement a secondary hostname for reads/writes if the primary ElasticSearch node is down. ElasticSearch is made to shard and replicate across multiple servers... and sites big enough to use it are probably going to be running it on multiple servers (at least they should be).
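
As a rough illustration of the fallback-hostname idea, the sketch below tries each configured host in order and uses the first one that responds. `send_with_fallback` and the `send` callable are hypothetical; `send` stands for whatever actually performs the read or write against a given host.

```python
def send_with_fallback(hosts, send):
    """Try each host in order; return (host, result) from the first that answers."""
    last_error = None
    for host in hosts:
        try:
            return host, send(host)
        except OSError as exc:  # connection refused, timeout, DNS failure, ...
            last_error = exc  # remember the failure and try the next host
    raise ConnectionError(f"no search host reachable: {hosts}") from last_error
```

With a primary and a secondary hostname in `hosts`, a request would only fail outright when every listed node is unreachable.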
 

Slavik

XenForo moderator
Staff member
#7
Consider the smaller large boards not using a multi-server setup: how would they handle such failover?
 

Rudy

Well-known member
#14
What about this? Rather than cache the posts, create a way for XF to alert the admins that ES has stopped working. That would give the admins the opportunity to restart ES (if they have permission) and rebuild the indexes.

Additional option? Have a cron job set up in XF to automatically reindex the forum after ES is restarted, with some sort of "flag" where XF could indicate whether its search index is out of date or not. I realize that some busy servers cannot handle a reindex during peak hours, but I have found that the reindexing of ES (or even Sphinx for that matter) never put a terribly crushing load on our servers...and we are beyond 8 million posts now.

Not quite the caching idea, but at least some sort of safeguard to let admins know ES has stopped and the indexes are stale.
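
A minimal sketch of the alert-plus-flag approach, assuming a periodic check (for example from a cron job). `SearchMonitor` and its `ping`, `alert` and `reindex` callables are hypothetical placeholders for whatever would test the service, notify the admins, and rebuild the index. For simplicity it marks the index stale on any outage, whether or not posts were made in the meantime.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SearchMonitor:
    ping: Callable[[], bool]       # returns True when the search service answers
    alert: Callable[[str], None]   # notifies the admins
    reindex: Callable[[], None]    # rebuilds the search index
    index_stale: bool = False      # the "flag": index may be missing posts
    was_up: bool = True

    def check(self):
        """Run one periodic health check."""
        up = bool(self.ping())
        if not up:
            self.index_stale = True
            if self.was_up:  # alert once per outage, not on every check
                self.alert("Search service is not responding; index may be stale")
        elif self.index_stale:
            self.reindex()  # service is back: rebuild, then clear the flag
            self.index_stale = False
        self.was_up = up
        return up
```

On a board that cannot afford a rebuild at peak hours, `reindex` could instead just schedule the job for a quieter time and leave the flag set until it completes.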
 

Rudy

Well-known member
#16
XFES 1.1 is a future release, then? Only asking since we had a couple of server hiccups and I want to know if I should go and regenerate the search index for the time being.
 

Rudy

Well-known member
#18
Interesting--I may give it a try on our private "testing" forum. I like the relevance improvement--slick! If the ETA is a month or two out, I'll refrain from using it on our production forum (although I've never had problems with XF beta software in the past).

Thanks!
 

Rudy

Well-known member
#20
It definitely will help with the bias on newer vs. older threads. We have threads from twelve years ago that sometimes get dragged out of obscurity. (I personally would rather close all threads more than a year old, as we used to, but I have some hesitation from other staffers in doing so.) The problem on our forum is that over the course of three or four years, products change, new products come out, our opinions might change, etc. And many members find themselves replying to a thread a second time, several years later, often with the same thought they had several years ago while not realizing the thread is stale. Pushing those down in the search results certainly will help pull up more current threads.