
How to: Basic Elasticsearch installation. (RHEL/SUSE)

Discussion in 'Enhanced Search Support' started by Slavik, Jan 18, 2012.

  1. digitalpoint

    digitalpoint Well-Known Member

    I was using SUSE Linux Enterprise until last September. At that point I switched everything over to openSUSE (same underpinnings, but free vs. ~$940 per 3 years, per server, and also not a generation behind).
  2. Marcus

    Marcus Well-Known Member

    It works now on my CentOS installation:

  3. Marcus

    Marcus Well-Known Member

  4. Rob

    Rob Well-Known Member

    I can start the service manually no problem, but it will not start automatically on server reboot.

    Any ideas?

    Thanks
  5. Slavik

    Slavik XenForo Moderator Staff Member

    Create an init.d script

    Use chkconfig command.
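    A rough sketch of what Slavik is suggesting, for a tarball install -- the ES_HOME path and the pkill pattern are assumptions, so adjust them to your layout. The script is written to /tmp here; on a real box it would live at /etc/init.d/elasticsearch:

    ```shell
    # Minimal init.d wrapper sketch (paths are assumptions, not the official script)
    cat > /tmp/elasticsearch.init <<'EOF'
    #!/bin/sh
    # chkconfig: 2345 95 05
    # description: Elasticsearch standalone daemon
    ES_HOME=/usr/local/elasticsearch
    case "$1" in
      start) "$ES_HOME/bin/elasticsearch" ;;            # forks into the background
      stop)  pkill -f org.elasticsearch.bootstrap ;;    # kill the JVM process
      *)     echo "Usage: $0 {start|stop}"; exit 1 ;;
    esac
    EOF
    chmod +x /tmp/elasticsearch.init
    # Then, as root:
    #   cp /tmp/elasticsearch.init /etc/init.d/elasticsearch
    #   chkconfig --add elasticsearch && chkconfig elasticsearch on
    ```

    The "# chkconfig: 2345 95 05" header line is what chkconfig reads to pick the runlevels and start/stop priorities.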
  6. Rob

    Rob Well-Known Member

    Well, I followed the RHEL/SUSE install instructions in this thread. Do I really need to do that? If so, how do I make a script?
  7. Slavik

    Slavik XenForo Moderator Staff Member

    If you ran the "elasticsearch/bin/service/elasticsearch install" command, it should start automatically on boot; if it doesn't, that would suggest a server configuration or installation error.
  8. Slavik

    Slavik XenForo Moderator Staff Member

  9. p4guru

    p4guru Well-Known Member

    Curious: are Elasticsearch's memory requirements still as large as what Floren/Shawn etc. discussed in the older posts in this thread? Max 1GB per 1 million posts?
  10. Slavik

    Slavik XenForo Moderator Staff Member

    Custom mapping seems to be able to reduce it by between 1/4 and 1/2, depending on your board.
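    For anyone following along, this is a sketch of the kind of mapping change being discussed -- the index name "xf" and type "post" are assumptions, and the mapping must be applied before the content is indexed. It disables the stored copy of the original document and marks the message field as not stored:

    ```shell
    # Hypothetical mapping file for the "reduce index size" approach in this thread
    cat > /tmp/post-mapping.json <<'EOF'
    {
      "post" : {
        "_source" : { "enabled" : false },
        "properties" : {
          "message" : { "type" : "string", "store" : "no" }
        }
      }
    }
    EOF
    # Then, with the server running:
    #   curl -XPUT 'http://localhost:9200/xf/post/_mapping' -d @/tmp/post-mapping.json
    ```

    See the custom mapping how-to linked later in the thread for the exact mapping XenForo uses.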
  11. p4guru

    p4guru Well-Known Member

  12. Floren

    Floren Well-Known Member

    Isn't that custom mapping designed to produce slow results? I'm trying to understand what the "store" part does:
    "message" : {"type" : "string", "store" : "no"}
    The idea of storing the strings in memory is to allow a quick search through them based on the original search query and avoid a warm-up. You search for specific keywords in the "message" string (for example), they are processed through advanced search, and a set of IDs is returned.

    I was wondering if anyone can post some comparative results between the full memory index usage and a custom mapping one. Testing the IGN search shows their results pulled on average in between 2 and 3 seconds.
  13. Slavik

    Slavik XenForo Moderator Staff Member

    http://xenforo.com/community/threads/how-to-apply-custom-mapping.31103/#post-355331
  14. Floren

    Floren Well-Known Member

    I'm sorry, but that post does not answer my question. It does not say anything about memory usage or how the data is stored. From my experience, you can store the data either on disk or in memory. Does the "store" setting mean you store it on disk or in memory? The data has to be stored somewhere, or else how can you search it?

    The moment you store the data on disk, Elastic will need a warm-up and produce slow results. This has been confirmed by many, including the Elastic developers.
  15. EQnoble

    EQnoble Equilux Nobiliterà

    http://www.elasticsearch.org/guide/reference/index-modules/store.html
    dunno if that helps
  16. Slavik

    Slavik XenForo Moderator Staff Member

    When a document is indexed under default settings, an extra field is created in the index which stores the original document alongside the indexed content, in case it is required (in most cases, this would be used to return and show the actual search result).

    XenForo does not operate the search like this. Instead, the search queries the content as normal, ES returns the content ID, and the result is then shown from the database rather than from the document stored in ES. As this document is never used, it is effectively dead data; by removing it you reduce the index size on disk (and in turn, when the files are opened and held in memory, the memory requirement).

    Nothing actually changes in "how" ES works for XenForo; the ES instance still works as before, we are just removing unused information.

    I threw this together, maybe it will help you visualise it in a simple way.

    flow.jpg

    As you can see, the XenForo search doesn't require that original document to be stored. All the mapping does is chop off that extra field.
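    The ID-only flow described above can be sketched with a search that asks for no stored fields -- the index/type names ("xf"/"post") and the query text are assumptions, using the 0.19-era "text" query syntax seen elsewhere in this thread:

    ```shell
    # Hypothetical ID-only search: empty "fields" means hits carry only _id
    cat > /tmp/id-only-query.json <<'EOF'
    {
      "fields" : [],
      "query" : { "text" : { "message" : "elasticsearch" } }
    }
    EOF
    # With the server running:
    #   curl -XGET 'http://localhost:9200/xf/post/_search?pretty=true' -d @/tmp/id-only-query.json
    # Each hit then carries only _id; XenForo looks the matching posts up in MySQL.
    ```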
  17. Floren

    Floren Well-Known Member

    Thanks, now I understand. So by default Elastic stores the same document data twice: one copy for searching and one for returning the actual document in case it is needed.
    Now, how do we measure the performance and hardware needs? Going by other users' experiences, the number of documents implies twice that amount of memory: for example, 500k posts will require 1GB of RAM, 700k posts 1.4GB, etc. But that is not accurate at all; documents are not equal in size. The custom mapping will eliminate a portion of the memory usage, so what do we end up with in real life? You cannot guess in that area; there has to be a real factor calculation that lets you determine the memory usage precisely.

    For example, with Sphinx it is very easy. All you have to do is sum the total size of all indices stored in memory and you know the actual memory usage for sure:

    memory.png

    In my example, for 46 million posts, 3.3GB of RAM is used to produce index results extracted in 0.057 seconds.
    There has to be a method that allows us to be very precise in determining the memory used. A large forum cannot "guess"; it has to take all factors into consideration, or it might end up with outrageous new hardware requirements.

    I talk about memory storage a lot because it seems to be the ONLY viable direction for Elastic to produce proper search results in a timely fashion. Storing the data on disk will simply produce undesirably slow results. And since memory data is volatile, what happens if I reboot the server? How long does it take for the memory indices to be rebuilt? Is the index data created locally as a file and then its contents stored into memory?
  18. Dinh Thanh

    Dinh Thanh Well-Known Member

    Digital Point has an add-on that generates very nice stats for Elasticsearch.
  19. Floren

    Floren Well-Known Member

    So can you post the memory usage of the indices?
  20. Slavik

    Slavik XenForo Moderator Staff Member

    This is the part a few of us are trying to work out: the factors affecting index sizes and speed, and attempting to work out a requirement. Having said that, 1GB per million posts seems like a reasonable enough starting point, and any custom mapping reducing that requirement is a bonus... but for the time being a "suck it and see" approach seems to be the only way to get those answers.
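    As a back-of-envelope illustration of that rule of thumb -- and it is only a rule of thumb from this thread, not a measured figure -- combining 1GB per million posts with the 1/4-to-1/2 mapping savings mentioned earlier gives a range like this:

    ```shell
    # Rough sizing sketch using the thread's assumptions (not measurements)
    posts=46000000                         # e.g. a 46-million-post board
    base_mb=$((posts / 1000000 * 1024))    # baseline: 1GB per million posts
    low_mb=$((base_mb / 2))                # best case: mapping halves it
    high_mb=$((base_mb * 3 / 4))           # worst case: mapping saves only 1/4
    echo "estimated index size: ${low_mb}-${high_mb} MB (baseline ${base_mb} MB)"
    # -> estimated index size: 23552-35328 MB (baseline 47104 MB)
    ```

    Real numbers will vary with average post length, which is exactly why the stats output below is worth checking on your own node.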

    ES indexes are written to disk as files within the /es/data directory. When a file is requested it is loaded into memory, where it remains open for a period of time before being closed -- unless you specifically set the storage type to memory, in which case the indexes are indexed and stored exclusively in your server's memory, with the associated risks of doing so. That is why some of us are using a little bash script that randomly runs search terms, to keep the files in active memory and speed up searches.
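    The two storage modes contrasted above are set in elasticsearch.yml; this is a sketch for the 0.19-era config, and the setting values are assumptions to check against the store module docs:

    ```yaml
    # elasticsearch.yml sketch -- pick one index.store.type
    # On-disk store (default behaviour; index files live under the data path):
    index.store.type: niofs
    # In-RAM store: fast, but volatile -- the index is lost on restart
    # and must be rebuilt, which is the risk mentioned above:
    # index.store.type: memory
    ```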


    DP is releasing an add-on, and there's a nice pretty UI you can find on GitHub (https://github.com/mobz/elasticsearch-head), or more simply run this and work out the numbers yourself. Also remember that memory usage is linked to user load. :)

    Code:
    curl -XGET 'http://localhost:9200/_cluster/nodes/stats?pretty=true'
    {
      "cluster_name" : "testbedES",
      "nodes" : {
        "myP4CtBhRLOH3BB1NtDYUw" : {
          "name" : "Stardust",
          "indices" : {
            "store" : {
              "size" : "948mb",
              "size_in_bytes" : 994055264
            },
            "docs" : {
              "count" : 1317585,
              "deleted" : 2472
            },
            "indexing" : {
              "index_total" : 8749,
              "index_time" : "1.4m",
              "index_time_in_millis" : 85390,
              "index_current" : 0,
              "delete_total" : 84,
              "delete_time" : "855ms",
              "delete_time_in_millis" : 855,
              "delete_current" : 0
            },
            "get" : {
              "total" : 0,
              "time" : "0s",
              "time_in_millis" : 0,
              "exists_total" : 0,
              "exists_time" : "0s",
              "exists_time_in_millis" : 0,
              "missing_total" : 0,
              "missing_time" : "0s",
              "missing_time_in_millis" : 0,
              "current" : 0
            },
            "search" : {
              "query_total" : 139160,
              "query_time" : "1.5h",
              "query_time_in_millis" : 5475259,
              "query_current" : 0,
              "fetch_total" : 103345,
              "fetch_time" : "4.8h",
              "fetch_time_in_millis" : 17426649,
              "fetch_current" : 0
            },
            "cache" : {
              "field_evictions" : 0,
              "field_size" : "15mb",
              "field_size_in_bytes" : 15823076,
              "filter_count" : 6,
              "filter_evictions" : 0,
              "filter_size" : "826kb",
              "filter_size_in_bytes" : 845824
            },
            "merges" : {
              "current" : 0,
              "current_docs" : 0,
              "current_size" : "0b",
              "current_size_in_bytes" : 0,
              "total" : 1,
              "total_time" : "86ms",
              "total_time_in_millis" : 86,
              "total_docs" : 19,
              "total_size" : "38.9kb",
              "total_size_in_bytes" : 39885
            },
            "refresh" : {
              "total" : 8632,
              "total_time" : "1.2m",
              "total_time_in_millis" : 74320
            },
            "flush" : {
              "total" : 5155,
              "total_time" : "9.3m",
              "total_time_in_millis" : 562688
            }
          },
          "os" : {
            "timestamp" : 1336296997532,
            "uptime" : "515 hours, 52 minutes and 41 seconds",
            "uptime_in_millis" : 1857161000,
            "load_average" : [ 0.46, 0.35, 0.33 ],
            "cpu" : {
              "sys" : 2,
              "user" : 3,
              "idle" : 93
            },
            "mem" : {
              "free" : "5.3gb",
              "free_in_bytes" : 5727977472,
              "used" : "2.4gb",
              "used_in_bytes" : 2622029824,
              "free_percent" : 77,
              "used_percent" : 22,
              "actual_free" : "6gb",
              "actual_free_in_bytes" : 6451220480,
              "actual_used" : "1.7gb",
              "actual_used_in_bytes" : 1898786816
            },
            "swap" : {
              "used" : "11.9mb",
              "used_in_bytes" : 12550144,
              "free" : "3.7gb",
              "free_in_bytes" : 4001366016
            }
          },
          "process" : {
            "timestamp" : 1336296997533,
            "open_file_descriptors" : 585,
            "cpu" : {
              "percent" : 0,
              "sys" : "5 minutes, 38 seconds and 720 milliseconds",
              "sys_in_millis" : 338720,
              "user" : "14 minutes, 35 seconds and 220 milliseconds",
              "user_in_millis" : 875220,
              "total" : "20 minutes, 13 seconds and 940 milliseconds",
              "total_in_millis" : 1213940
            },
            "mem" : {
              "resident" : "596.9mb",
              "resident_in_bytes" : 625917952,
              "share" : "11mb",
              "share_in_bytes" : 11628544,
              "total_virtual" : "4.5gb",
              "total_virtual_in_bytes" : 4902260736
            }
          },
          "jvm" : {
            "timestamp" : 1336296997533,
            "uptime" : "515 hours, 46 minutes, 22 seconds and 165 milliseconds",
            "uptime_in_millis" : 1856782165,
            "mem" : {
              "heap_used" : "238.2mb",
              "heap_used_in_bytes" : 249812744,
              "heap_committed" : "509.9mb",
              "heap_committed_in_bytes" : 534708224,
              "non_heap_used" : "36.9mb",
              "non_heap_used_in_bytes" : 38785000,
              "non_heap_committed" : "57.7mb",
              "non_heap_committed_in_bytes" : 60506112
            },
            "threads" : {
              "count" : 41,
              "peak_count" : 63
            },
            "gc" : {
              "collection_count" : 9209,
              "collection_time" : "34 seconds and 935 milliseconds",
              "collection_time_in_millis" : 34935,
              "collectors" : {
                "ParNew" : {
                  "collection_count" : 9191,
                  "collection_time" : "34 seconds and 750 milliseconds",
                  "collection_time_in_millis" : 34750
                },
                "ConcurrentMarkSweep" : {
                  "collection_count" : 18,
                  "collection_time" : "185 milliseconds",
                  "collection_time_in_millis" : 185
                }
              }
            }
          },
          "network" : {
            "tcp" : {
              "active_opens" : 76254,
              "passive_opens" : 8693590,
              "curr_estab" : 19,
              "in_segs" : 260179338,
              "out_segs" : 121374258,
              "retrans_segs" : 1126081,
              "estab_resets" : 31153,
              "attempt_fails" : 11703,
              "in_errs" : 0,
              "out_rsts" : 38940
            }
          },
          "transport" : {
            "server_open" : 7,
            "rx_count" : 0,
            "rx_size" : "0b",
            "rx_size_in_bytes" : 0,
            "tx_count" : 0,
            "tx_size" : "0b",
            "tx_size_in_bytes" : 0
          },
          "http" : {
            "current_open" : 1,
            "total_opened" : 35904
          }
        }
      }
    }
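    To turn that wall of JSON into the sizing numbers Floren is asking for, the two values that matter are the store size and the document count. A minimal sketch, run here against a trimmed copy of the "indices" section from the output above (no jq, just awk):

    ```shell
    # Trimmed copy of the stats output above; on a live node you would pipe
    # the curl command's output in instead of using this sample.
    stats='
      "store" : {
        "size" : "948mb",
        "size_in_bytes" : 994055264
      },
      "docs" : {
        "count" : 1317585,
        "deleted" : 2472
      }'
    # Strip everything but digits from the last field of each matching line.
    size=$(echo "$stats" | awk '/"size_in_bytes"/ { gsub(/[^0-9]/, "", $NF); print $NF }')
    docs=$(echo "$stats" | awk '/"count"/ { gsub(/[^0-9]/, "", $NF); print $NF }')
    echo "bytes per document: $((size / docs))"
    # -> bytes per document: 754
    ```

    Roughly 754 bytes per document for this node, which is where per-board averages can start to be compared.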
