Cache Rebuild Error - ES Stopped Working, Again

Discussion in 'Enhanced Search Support' started by Anthony Parsons, Apr 19, 2012.

  1. Anthony Parsons

    Anthony Parsons Well-Known Member

    ES has been having issues on and off for the past few days, even though it has been running on the server every time I checked. Now ES has completely stopped working.

    The server software has not been updated and XF has not been updated, yet all sites suddenly have issues with ES. For now I've had to revert search back to MySQL on my main sites, and I'm using a small forum to test ES. So far nothing has fixed this issue, and any help is certainly welcome.

    I have read through many threads and posts here about this same issue and tried many of the suggestions (adding limits, changing configurations, etc.), still to no avail.

    Every time I go to rebuild the cache it gives the error: No response returned from Elasticsearch. Is it running?

    A server error on a smaller site gives:

    PHP:
    Error Info
     
    XenForo_Exception
    Elasticsearch server returned no response. Is it running? Elasticsearch indexing failed for post - library/XenES/Search/SourceHandler/ElasticSearch.php:721
    Generated By
    Anthony, 17 minutes ago
     
    Stack Trace
     
    #0 /home/mycombat/public_html/library/XenES/Search/SourceHandler/ElasticSearch.php(748): XenES_Search_SourceHandler_ElasticSearch->_logSearchResponseError(false, true, 'Elasticsearch i...')
    #1 /home/mycombat/public_html/library/XenES/Search/SourceHandler/ElasticSearch.php(67): XenES_Search_SourceHandler_ElasticSearch->_assertIndexSuccessful(false, 'post')
    #2 /home/mycombat/public_html/library/XenForo/Search/Indexer.php(125): XenES_Search_SourceHandler_ElasticSearch->finalizeRebuildSet()
    #3 /home/mycombat/public_html/library/XenForo/CacheRebuilder/SearchIndex.php(93): XenForo_Search_Indexer->finalizeRebuildSet()
    #4 /home/mycombat/public_html/library/XenForo/ControllerHelper/CacheRebuild.php(26): XenForo_CacheRebuilder_SearchIndex->rebuild(0, Array, NULL)
    #5 /home/mycombat/public_html/library/XenForo/ControllerAdmin/Tools.php(78): XenForo_ControllerHelper_CacheRebuild->rebuildCache(Array, 'http://www.myco...', 'admin.php?tools...', true)
    #6 /home/mycombat/public_html/library/XenForo/FrontController.php(310): XenForo_ControllerAdmin_Tools->actionCacheRebuild()
    #7 /home/mycombat/public_html/library/XenForo/FrontController.php(132): XenForo_FrontController->dispatch(Object(XenForo_RouteMatch))
    #8 /home/mycombat/public_html/admin.php(13): XenForo_FrontController->run()
    #9 {main}
     
    Request State
     
    array(3) {
      ["url"] => string(57) "http://www.mycombatptsd.com/admin.php?tools/cache-rebuild"
      ["_GET"] => array(1) {
        ["tools/cache-rebuild"] => string(0) ""
      }
      ["_POST"] => array(5) {
        ["process"] => string(1) "1"
        ["caches"] => string(63) "[["SearchIndex",{"content_type":"","batch":"500","delay":"5"}]]"
        ["position"] => string(1) "0"
        ["redirect"] => string(51) "http://www.mycombatptsd.com/admin.php?tools/rebuild"
        ["_xfToken"] => string(53) "1,1334789763,1581f0874c143b78b236b8d0ea26fdb852d380e7"
      }
    }
    I have no idea what is wrong with this thing or why it has suddenly started now.

    Any ideas would be greatly appreciated.

    I have restarted it over and over, and it is running on the server.

    [Attached screenshot: Screen Shot 2012-04-19 at 9.09.49 AM.png]
     
  2. Slavik

    Slavik XenForo Moderator Staff Member

    Have your Elasticsearch logs picked anything up?
     
  3. Anthony Parsons

    Anthony Parsons Well-Known Member

    What would I run to see the logs?
     
  4. Slavik

    Slavik XenForo Moderator Staff Member

    I can't remember the exact paths, but they would be in one of the folders in /elasticsearch.
     
  5. Anthony Parsons

    Anthony Parsons Well-Known Member

    It seems to have been logging a lot of this recently:

    [2012-04-17 08:49:15,519][WARN ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to accept a connection.
    java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:163)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:244)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
    [2012-04-17 08:49:16,031][WARN ][index.shard.service ] [Shiver Man] [mycombat_***][4] Failed to perform scheduled engine refresh
    org.elasticsearch.index.engine.RefreshFailedEngineException: [mycombat_***][4] Refresh failed
    at org.elasticsearch.index.engine.robin.RobinEngine.refresh(RobinEngine.java:789)
    at org.elasticsearch.index.shard.service.InternalIndexShard.refresh(InternalIndexShard.java:412)
    at org.elasticsearch.index.shard.service.InternalIndexShard$EngineRefresher$1.run(InternalIndexShard.java:699)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
    Caused by: java.io.FileNotFoundException: /var/elasticsearch/elasticsearch/nodes/0/indices/mycombat_***/4/index/_rb.prx (Too many open files)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
    at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:441)
    at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:306)
    at org.elasticsearch.index.store.Store$StoreDirectory.createOutput(Store.java:416)
    at org.elasticsearch.index.store.Store$StoreDirectory.createOutput(Store.java:388)
    at org.apache.lucene.index.FormatPostingsPositionsWriter.<init>(FormatPostingsPositionsWriter.java:43)
    at org.apache.lucene.index.FormatPostingsDocsWriter.<init>(FormatPostingsDocsWriter.java:57)
    at org.apache.lucene.index.FormatPostingsTermsWriter.<init>(FormatPostingsTermsWriter.java:33)
    at org.apache.lucene.index.FormatPostingsFieldsWriter.<init>(FormatPostingsFieldsWriter.java:51)
    at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
    at org.apache.lucene.index.TermsHash.flush(TermsHash.java:113)
    at org.apache.lucene.index.DocInverter.flush(DocInverter.java:70)
    at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:60)
    at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:581)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3623)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3588)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:452)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:401)
    at org.apache.lucene.index.DirectoryReader.doOpenFromWriter(DirectoryReader.java:428)
    at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:448)
    at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:396)
    at org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:520)
    at org.elasticsearch.index.engine.robin.RobinEngine.refresh(RobinEngine.java:764)
    ... 5 more
     
  6. CyclingTribe

    CyclingTribe Well-Known Member

    These stand out:

    Code:
    [2012-04-17 08:49:15,519][WARN ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to accept a connection.
    java.io.IOException: Too many open files
    Caused by: java.io.FileNotFoundException: /var/elasticsearch/elasticsearch/nodes/0/indices/mycombat_***/4/index/_rb.prx (Too many open files)
    
    Unless they are normal for ES?
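
To see how often these entries occur and which loggers report them, the log can be scanned for the "Too many open files" marker. The following is a minimal sketch, assuming the log uses the `[timestamp][LEVEL][logger]` prefix seen in the excerpt above; the function name `count_fd_exhaustion` is hypothetical, and the log file location depends on the install.

```python
import re
from collections import Counter

# Matches the prefix seen in the excerpt above, e.g.
# "[2012-04-17 08:49:15,519][WARN ][index.shard.service ] ..."
ENTRY_START = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2}) [\d:,]+\]\[(\w+)\s*\]\[([^\]]+)\]"
)
MARKER = "Too many open files"


def count_fd_exhaustion(log_text):
    """Count log entries per logger whose text mentions MARKER.

    An entry is the prefixed line plus the stack-trace lines that
    follow it; each entry is counted at most once.
    """
    counts = Counter()
    current_logger = None
    already_counted = False
    for line in log_text.splitlines():
        match = ENTRY_START.match(line.strip())
        if match:
            # A new entry begins; remember which logger wrote it.
            current_logger = match.group(3).strip()
            already_counted = False
        if MARKER in line and current_logger and not already_counted:
            counts[current_logger] += 1
            already_counted = True
    return counts
```

Passing the contents of the log file to `count_fd_exhaustion` and printing `counts.most_common()` gives a quick picture of whether the warnings are isolated or constant.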
     
  7. Slavik

    Slavik XenForo Moderator Staff Member

  8. Anthony Parsons

    Anthony Parsons Well-Known Member

    Yer, tried that... no such luck.
     
  9. Anthony Parsons

    Anthony Parsons Well-Known Member

    For example, I just rebooted the server and then tried to connect. ES is running, but I get the same error on the XenForo sites when trying a cache rebuild.
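
A process that shows as running can still fail to answer requests, as the "Failed to accept a connection" entries in the log excerpt earlier in the thread suggest. The sketch below checks for an HTTP response rather than a running process. It assumes Elasticsearch is listening on its default HTTP port, 9200, on localhost; the actual host and port for this server are not given in the thread, and the function name `es_responds` is hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def es_responds(base_url="http://localhost:9200", timeout=3.0):
    """Return True if an HTTP GET on the given URL answers with status 200.

    base_url is an assumption (Elasticsearch's default HTTP port);
    adjust it to match the real configuration.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error or a malformed reply:
        # all are treated as "no response".
        return False
```

If this returns False while the process is listed as running, the problem is more likely inside Elasticsearch (for example, exhausted file descriptors) than in XenForo.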
     
  10. Slavik

    Slavik XenForo Moderator Staff Member

    What does

    Code:
    lsof | wc -l
    return?
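
Note that `lsof | wc -l` counts entries across the whole system, whereas the "Too many open files" error is raised when a single process hits its own descriptor limit. A per-process check can be sketched as follows, assuming a Linux server with `/proc` available and permission to read the target process's entries; the function names are hypothetical, and the PID of the Elasticsearch Java process has to be supplied by the caller.

```python
import os


def parse_open_file_limits(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    None stands for 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            soft, hard = line.split()[3:5]
            return tuple(
                None if value == "unlimited" else int(value)
                for value in (soft, hard)
            )
    raise ValueError("no 'Max open files' row found")


def open_fd_count(pid):
    """Count the descriptors a process currently holds (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_usage(pid):
    """Return (open descriptors, soft limit, hard limit) for one process."""
    with open(f"/proc/{pid}/limits", encoding="ascii") as handle:
        soft, hard = parse_open_file_limits(handle.read())
    return open_fd_count(pid), soft, hard
```

An open-descriptor count close to the soft limit would be consistent with the errors in the log; it also shows whether a raised limit has actually been applied to the running process.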
     
  11. Anthony Parsons

    Anthony Parsons Well-Known Member

  12. Slavik

    Slavik XenForo Moderator Staff Member

    I have an idea what it may be. Unfortunately it's late and I'm about to go to bed, but it may be that the transport client is hanging on a corrupt ES node.

    For now I would suggest (if you haven't already) completely killing the ES and Java processes.

    Ensure the raised file limits are correctly set for your Java/ES users.

    Then update both Java and ES, clear your current indexes and re-index everything.
     
  13. Wuebit

    Wuebit Well-Known Member

    Try raising the limit in the config.

    Post about it here
     
  14. Slavik

    Slavik XenForo Moderator Staff Member

    Issue resolved now (hopefully).

    It appears one of the shards/nodes had been corrupted, so clearing them down resolved the issue.
     
  15. Anthony Parsons

    Anthony Parsons Well-Known Member

    Yep, Slavik fixed this up for me, and I am extremely grateful for his very fast assistance. Wiredtree are good, but their techs have pretty much no real experience with or knowledge of this specific software, so they couldn't provide much assistance. So again, thank you very much, Slavik, for your help in rectifying this on my sites.
     
  16. CyclingTribe

    CyclingTribe Well-Known Member

    Good call Slavik - nice one! (y)
     
  17. graham_w

    graham_w Active Member

    What did you do to resolve this?

    I had a site throwing similar errors yesterday. Luckily it was a test site, so no big deal. But at the same time it was saying ES wasn't running, I was able to rebuild the cache for a 1-million-post board on the same server.

    I ended up removing the ES folder, installing a new copy of the latest ES, and reindexing. Issues gone.
     
  18. Slavik

    Slavik XenForo Moderator Staff Member

    Pretty much what you've done: clear down the old node and re-create it.
     
