How to: Basic Elasticsearch installation. (RHEL/SUSE)

Discussion in 'Enhanced Search Support' started by Slavik, Jan 18, 2012.

  1. Slavik

    Slavik XenForo Moderator Staff Member

    Basic Elasticsearch Installation (RHEL / SUSE)

    This guide shows how to do a basic (vanilla, get-up-and-go) install of Elasticsearch (0.90.0 Beta 1), the Elasticsearch Service Wrapper and the required Java Runtime Environment (JRE) (1.7.0_17) on RHEL / SUSE. It does not cover running Elasticsearch under a dedicated user.

    For Debian/Ubuntu users, a guide can be found here.

    This guide assumes you have basic knowledge of SSH and have logged in as root before starting the steps below. It also assumes you do not currently have a JRE installed. You can check whether a JRE is installed by typing:

    Code:
    java -version
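    If you would rather have a yes/no answer than read the version output, something like the following could work. This is only a sketch: the function name java_status is made up for illustration, and it only checks whether a java binary is on the PATH, not which vendor or version it is.

```shell
# Report whether a java binary is on the PATH, without failing if it is absent.
# java_status is a hypothetical helper name, not part of any package.
java_status() {
    if command -v java >/dev/null 2>&1; then
        echo "installed"
    else
        echo "not installed"
    fi
}

java_status
```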
    As of writing, the current file locations for JRE are as follows:

    32 bit
    Code:
    http://download.oracle.com/otn-pub/java/jdk/7u17-b02/jre-7u17-linux-i586.rpm
    64 bit
    Code:
    http://download.oracle.com/otn-pub/java/jdk/7u17-b02/jre-7u17-linux-x64.rpm
    The guide uses the 64 bit install throughout. If you are on a 32 bit system, change the file names as appropriate.
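    If you are not sure which of the two files applies to your machine, a small helper along these lines may be useful. It is a sketch only: jre_rpm_for_arch is a made-up name, and treating everything other than x86_64 as 32 bit is a simplifying assumption on my part.

```shell
# Map a machine architecture (as printed by `uname -m`) to the JRE RPM name
# used in this guide. Anything that is not x86_64 is treated as 32 bit here,
# which is a simplifying assumption.
jre_rpm_for_arch() {
    case "$1" in
        x86_64) echo "jre-7u17-linux-x64.rpm" ;;
        *)      echo "jre-7u17-linux-i586.rpm" ;;
    esac
}

# Print the file name that matches the machine this runs on.
jre_rpm_for_arch "$(uname -m)"
```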

    Please note, whilst this is a simple and easy setup, I take no responsibility for any damages or losses that may occur to your system by following the steps below. If you are unsure at any stage, please ask for assistance or seek the help of a qualified Linux Systems Administrator.

    Installing the JRE
    Type the following commands into your SSH terminal.
    Code:
    cd /tmp
    wget http://download.oracle.com/otn-pub/java/jdk/7u17-b02/jre-7u17-linux-x64.rpm
    rpm -ivh jre-7u17-linux-x64.rpm
    java -version
    
    Assuming everything was done correctly, you should get the following output.

    Code:
    # java -version
    java version "1.7.0_17"
    Java(TM) SE Runtime Environment (build 1.7.0_17)
    Java HotSpot(TM) 64-Bit Server VM (build 22.0-b10, mixed mode)
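    To compare the installed version against the expected one in a script, the version string can be pulled out of that output. A minimal sketch, assuming the first line has the quoted form shown above; parse_java_version is a hypothetical helper name.

```shell
# Pull the quoted version string out of `java -version` style output.
# Reads from stdin and prints the first match only.
parse_java_version() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Real usage (java -version writes to stderr, hence the 2>&1):
#   java -version 2>&1 | parse_java_version
# Demonstration on the first line of the sample output from this guide:
printf 'java version "1.7.0_17"\n' | parse_java_version
```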
    
    Install Elasticsearch

    Code:
    cd /
    curl -L -O -k https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.0.Beta1.zip
    unzip elasticsearch-0.90.0.Beta1.zip
    mv elasticsearch-0.90.0.Beta1 elasticsearch
    
    Install the Elasticsearch Service Wrapper

    Code:
    curl -L -k http://github.com/elasticsearch/elasticsearch-servicewrapper/tarball/master | tar -xz
    mv *servicewrapper*/service elasticsearch/bin/
    elasticsearch/bin/service/elasticsearch install
    ln -s "$(readlink -f elasticsearch/bin/service/elasticsearch)" /usr/local/bin/rcelasticsearch
    rcelasticsearch start
    
    Assuming everything was done correctly, you should see the following output.

    Code:
    rcelasticsearch start
    Starting ElasticSearch...
    Waiting for ElasticSearch......
    running: PID: xxxxx
    
    Basic Configuration

    You should do some basic configuration of Elasticsearch before installing the addon in XenForo.


    1) Open up /elasticsearch/config/elasticsearch.yml and on line 32 edit

    Code:
    # cluster.name: elasticsearch
    To

    Code:
    cluster.name: PUT-SOMETHING-UNIQUE-HERE
    On line 199 edit

    Code:
    # network.host: 192.168.0.1
    to

    Code:
    network.host: 127.0.0.1
    On line 211 edit

    Code:
    # http.port: 9200
    to

    Code:
    http.port: 9200
    Save and Close
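    If you prefer not to hunt for line numbers, the same three edits can be sketched with sed. This is not part of the original steps: apply_basic_config and the cluster name my-forum-search are made up for illustration, and the patterns assume the commented-out lines look exactly like the ones shown above. Try it on a copy of the file first.

```shell
# Apply the three edits from this step to an elasticsearch.yml-style file.
# $1 = config file, $2 = cluster name (plain text, without "|", "&" or "\").
# apply_basic_config is a hypothetical helper name.
apply_basic_config() {
    conf="$1"
    name="$2"
    tmp="$conf.tmp.$$"
    # Uncomment and set each line, write to a temp file, then swap it in.
    sed -e "s|^# *cluster\.name:.*|cluster.name: $name|" \
        -e "s|^# *network\.host:.*|network.host: 127.0.0.1|" \
        -e "s|^# *http\.port:.*|http.port: 9200|" \
        "$conf" > "$tmp" && mv "$tmp" "$conf"
}

# Demonstration on a throwaway file rather than the real config.
demo="$(mktemp)"
printf '%s\n' '# cluster.name: elasticsearch' \
              '# network.host: 192.168.0.1' \
              '# http.port: 9200' > "$demo"
apply_basic_config "$demo" "my-forum-search"
cat "$demo"
rm -f "$demo"
```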


    2) Open up /elasticsearch/bin/service/elasticsearch.conf and on line 2 edit

    Code:
    set.default.ES_HEAP_SIZE=1024
    To a number suitable for the size of your forum.

    I recommend approximately 1 GB for the HEAP_SIZE per 1 million posts on your forum.

    1 Million Posts: 1024
    2 Million Posts: 2048
    3 Million Posts: 3072
    4 Million Posts: 4096
    etc

    This does not mean the service will use all of that memory, but it will have it at its disposal if required.

    So for example a 3 Million Post forum would edit

    Code:
    set.default.ES_HEAP_SIZE=1024
    to

    Code:
    set.default.ES_HEAP_SIZE=3072


    Save and Exit.
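    The rule of thumb above is simple arithmetic, sketched here for forums that fall between the round numbers. The function name heap_mb_for_posts is made up, and rounding up to the next full million (with a floor of 1024) is my assumption; the guide only gives the whole-million values.

```shell
# Rule of thumb from this step: 1024 per 1 million posts.
# Partial millions are rounded up and the minimum is 1024 (both assumptions).
# heap_mb_for_posts is a hypothetical helper name.
heap_mb_for_posts() {
    posts="$1"
    millions=$(( (posts + 999999) / 1000000 ))
    if [ "$millions" -lt 1 ]; then
        millions=1
    fi
    echo $(( millions * 1024 ))
}

# A 3 million post forum, as in the example above.
heap_mb_for_posts 3000000
```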


    3) Optional - Move the Elasticsearch data directory.

    Your Linux install may be configured in such a way that your install partition is only a few GB in size, and placing a large Elasticsearch index there is not ideal.

    In that case you will want to move the index directory to a different, larger location (in this example /var/elasticsearch).

    Code:
    cd /var
    mkdir elasticsearch
    
    Open up /elasticsearch/config/elasticsearch.yml and on line 143 edit

    Code:
    # path.data: /path/to/data
    
    to

    Code:
    path.data: /var/elasticsearch
    
    Save and Exit
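    To decide whether this optional step is worth doing, you can compare the free space on the install partition with the free space at the new location. A minimal sketch using df; avail_kb is a made-up helper name, and it assumes the filesystem name in the df output contains no spaces.

```shell
# Print the available space, in 1024-byte blocks, on the filesystem that
# holds the given path. Uses the POSIX output format of df (-P).
# avail_kb is a hypothetical helper name.
avail_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Compare the install partition with the candidate data location.
echo "/    : $(avail_kb /) KB free"
echo "/var : $(avail_kb /var) KB free"
```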

    4) Restart the Elasticsearch Service

    In SSH type

    Code:
    rcelasticsearch restart
    You should get the following output

    Code:
    rcelasticsearch restart
    Stopping ElasticSearch...
    Stopped ElasticSearch.
    Starting ElasticSearch...
    Waiting for ElasticSearch......
    running: PID: xxxxx
    
    Elasticsearch is now running with your updated config.
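    As an extra check beyond the wrapper output, you can see whether anything answers HTTP on the address and port set in step 1. This is a sketch, not part of the original steps: es_is_up is a made-up name, and it only confirms that an HTTP response came back, not that the node is healthy.

```shell
# Return success if something answers HTTP on the given host:port.
# --noproxy '*' keeps any configured proxy out of a local check.
# es_is_up is a hypothetical helper name.
es_is_up() {
    curl -s -o /dev/null --noproxy '*' --max-time 5 "http://$1/"
}

if es_is_up "127.0.0.1:9200"; then
    echo "Elasticsearch is answering on 127.0.0.1:9200"
else
    echo "No answer on 127.0.0.1:9200"
fi
```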



    Install the XenForo Enhanced Search Addon

    1) Put your board into maintenance mode*

    2) Download the addon from your customer area at http://xenforo.com/customers/

    3) Follow the instructions found at http://xenforo.com/help/enhanced-search/

    4) Wait for your indexes to be rebuilt

    5) Open your board.

    6) Install the index pre-warmer.

    As of 0.90.0 Beta an index pre-warmer is available. This keeps your search index "warm" in active memory, so when a search is done the access latency is greatly reduced.

    Installing this is simple: in SSH, run the following, replacing *INDEX NAME* with the name of your ES index.

    Code:
    
    curl -XPUT localhost:9200/*INDEX NAME*/_warmer/warmer_1 -d '{
        "query" : {
            "match_all" : {}
        }
    }'
    
    
    You should have the following returned

    Code:
    {"ok":true,"acknowledged":true}
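    To avoid typos when substituting the index name, the URL can be built from a variable. This is only a sketch: warmer_url and the index name my_index are made up, and the GET check is based on my understanding that the same path can be read back on this version, so verify it against the warmer documentation for your release. The commands are printed rather than run.

```shell
# Build the warmer URL used above for a given index name.
# warmer_url is a hypothetical helper name; my_index is a placeholder.
warmer_url() {
    echo "localhost:9200/$1/_warmer/warmer_1"
}

# Print the commands instead of running them, so nothing is changed here.
# The PUT is the command from this step; the GET is an assumed way to read
# the warmer back afterwards.
url="$(warmer_url my_index)"
echo "curl -XPUT '$url' -d '{\"query\":{\"match_all\":{}}}'"
echo "curl -XGET '$url'"
```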

    *You may leave your board open during the re-index process.

    Congratulations. Your board should now be running XenForo Enhanced Search.
    Last edited: Sep 4, 2013
  2. Clickfinity

    Clickfinity Well-Known Member

    Many thanks for the guide, very much appreciated.

    I needed to install Java 6 (Debian package "openjdk-6-jre") on mine before elasticsearch would start.

    Just need to buy the add-on now ... (y)
  3. Slavik

    Slavik XenForo Moderator Staff Member

  4. Clickfinity

    Clickfinity Well-Known Member

  5. Slavik

    Slavik XenForo Moderator Staff Member

    In theory they should be doing the exact same thing; the RHEL one just comes a little more pre-configured than the Debian one, and it has the settings for the service wrapper in an external file as opposed to in the wrapper itself.
  6. Clickfinity

    Clickfinity Well-Known Member

    Okay, great, no point in reinventing the wheel - if it's working, don't touch it - right?!! :D

    Thanks again for the guides. (y)
  7. tmb

    tmb Active Member

    Definitely saved me some time. Thanks. Now I just need to get around to updating to 1.1 so I can finish installing the search add-on.
  8. DBA

    DBA Well-Known Member

    Thank you!

    Didn't take long to implement. (y)
  9. graham_w

    graham_w Active Member

    Thanks for this great, easy to use guide! It was very quick to install on a CentOS 5.7 x64 box using this guide, and I am rebuilding the cache as we speak :)
  10. shawn

    shawn Well-Known Member

    Yeah, might want to edit the title and add CentOS to the list... just in case folks don't know it's RHEL-based. Should help with search queries, too.
  11. MrC

    MrC Active Member

    That was quick. Thanks :)
  12. SneakyDave

    SneakyDave Well-Known Member

    If you're running CentOS and don't know it's RHEL based, you probably shouldn't be running it! Just a joke!
  13. Floren

    Floren Well-Known Member

    Just curious, how did you get those numbers? I am just trying to make sure the memory requirements are accurate, because a board with over 30mil posts will require 32GB of RAM just to run the search. Compared to Sphinx, which needs only 512MB for the same number of posts... this is a BIG difference. Does anyone know what the I/O impact on disks is to read/write the index data?

    Looks like it is confirmed, you will need a lot of memory to run Elastic on a large forum.
    A forum with 9mil posts and 1GB of allocated memory returns search results in 6-7 seconds, slower than MySQL. That confirms the calculations posted by Slavik above.
  14. digitalpoint

    digitalpoint Well-Known Member

    I was actually wondering the same thing myself. I purchased the XF enhanced search, but I haven't had time to mess with it yet. But seeing the speed and memory requirements has me a little worried. My current Sphinx setup for vB takes about 2GB of memory for around 25M searchable documents spread across 16 searchable content types (posts, users, PMs, FAQs, articles, blogs, etc.) and results NEVER take more than 0.1 seconds for the most obscure search (usually more like 0.02 seconds).

    But we'll see how it works for me before too long...
  15. Floren

    Floren Well-Known Member

    My thoughts exactly, related to query speed. Honestly, 2GB is a lot for 25mil posts. Maybe because you spread it through a lot of indices. I recently set up Searchlight on the XDA-Developers site with workers as threads, and they use less than 1GB of RAM with 30,000 online users and 20mil posts.

    I want to see what IGN says about the memory consumption on their test board and find out about the I/O impact on disks also, no idea how often the data is read/written into indices.
  16. digitalpoint

    digitalpoint Well-Known Member

    Well, it's not 25M posts... It's ~18M posts, and the non-post stuff tends to be larger on average. For example PMs tend to be larger than posts. Users are also searchable and they end up being daily large bits of data since we make everything about the user searchable (for example every email they ever used). We also make IP searchable via Sphinx on all content types, so that's an extra 25M 32-bit numbers, etc...

    Hopefully the new search doesn't take the amount of memory people are saying or require continuous "warm-up" to be fast. I have quite a bit of work to do still before I can really dig into it.
  17. lazy llama

    lazy llama Well-Known Member

    That is on my test system which is very low spec though, and I've never run MySQL searches on it but suspect they'd be slower. I don't have the disk space on that to do like-for-like comparison with Sphinx either sadly. Sphinx did appear to be slightly quicker and use less resource though. It certainly used less disk space.
    I'm hoping to investigate further over the next few days, and I'm certainly not ruling anything out at this early stage.
  18. digitalpoint

    digitalpoint Well-Known Member

    Maybe it's just because it's using a minimum word length of 1 or something... {shrug}
  19. Slavik

    Slavik XenForo Moderator Staff Member


    Please remember, my recommended settings are for this basic setup guide only.

    If you are clustering a large forum, I would expect you to be using custom mapping which may reduce the requirements by up to half. However information like that is for an advanced guide (underway) for much larger boards, and also includes things such as setting up dedicated elasticsearch users etc.

    What I would like to know from you guys with the 20m+ forums, what OS are you running on?
  20. Floren

    Floren Well-Known Member

    You mean the Star enabled in Sphinx? I use the Star only for thread titles, or else the indices will become huge. Searching for 1 char does not increase the index if you don't enable the Star. I presume the same logic is used for Elastic?

    Even if it is half of the memory usage with a custom mapping, Elastic is using LARGE amounts of memory compared to Sphinx. Doing a quick search on Nabble for other Elastic user experiences revealed that all big users had a common issue: Elastic needs several machines with lots of memory in order to be query efficient. Obviously we are talking about millions of documents, not the average Joe user. The most relevant case for us is this thread, where the developer confirms the high memory usage for 5mil posts only. He also recommends the use of several boxes with small amounts of memory instead of a large one, for increased shard performance? Not sure. Plus, reading the guide tells me Elastic requires constant warm-up if your indices are not completely stored in memory:
    Let me know if I misunderstood.
