How to: Basic Elasticsearch installation. (RHEL/SUSE)

Slavik

Basic Elasticsearch Installation (RHEL / SUSE)

@Floren maintains an excellent repo for RHEL 6 and 7. The Elasticsearch RPM he provides is well set up, and I currently suggest using it.

The old manual setup guide can be found below.

Step 1) Install the Axivo Repo: https://www.axivo.com/resources/repository-setup.1/

Step 2) Install ElasticSearch: https://www.axivo.com/resources/elasticsearch-setup.11/




This guide shows how to do a basic (vanilla, get-up-and-go) install of Elasticsearch (0.90.0 Beta 1), the Elasticsearch Service Wrapper and the required Java Runtime Environment (JRE) (1.7.0_17) on RHEL / SUSE. It does not cover running Elasticsearch under a dedicated user.

For Debian/Ubuntu users, a guide can be found here.

This guide assumes you have basic knowledge of SSH and have logged in as root before starting the steps below. It also assumes you do not currently have any JRE installed. You can check whether a JRE is installed by typing:

Code:
java -version
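
If you would rather script that check, here is a minimal sketch. The helper name have_command is made up for illustration; it only tests whether a java binary is on the PATH, which is an assumption about how the JRE was installed.

```shell
# Hypothetical helper: returns 0 when the named command is on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command java; then
    echo "A Java runtime is already installed:"
    java -version 2>&1
else
    echo "No java binary found on PATH; continue with the JRE install."
fi
```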

As of writing, the current file locations for JRE are as follows:

32 bit
Code:
http://download.oracle.com/otn-pub/java/jdk/7u17-b02/jre-7u17-linux-i586.rpm

64 bit
Code:
http://download.oracle.com/otn-pub/java/jdk/7u17-b02/jre-7u17-linux-x64.rpm

The guide will be shown using the 64 bit install, however if you are using a 32 bit system, change the file names as appropriate.
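
As a rough sketch, the right file name can be picked from the output of uname -m. The function name jre_rpm_for_arch is hypothetical, and the list of 32 bit architecture strings is an assumption about what your system reports.

```shell
# Hypothetical helper: map an architecture string (as printed by `uname -m`)
# to the JRE 7u17 RPM file name used in this guide.
jre_rpm_for_arch() {
    case "$1" in
        x86_64)
            echo "jre-7u17-linux-x64.rpm" ;;
        i386|i486|i586|i686)
            echo "jre-7u17-linux-i586.rpm" ;;
        *)
            echo "unsupported architecture: $1" >&2
            return 1 ;;
    esac
}

# Report the file to use on this machine, if the architecture is recognised.
if rpm_file=$(jre_rpm_for_arch "$(uname -m)"); then
    echo "Use: $rpm_file"
fi
```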

Please note, whilst this is a simple and easy setup, I take no responsibility for any damages or losses that may occur to your system by following the steps below. If you are unsure at any stage, please ask for assistance or seek the help of a qualified Linux Systems Administrator.

Installing the JRE
Type the following commands into your SSH terminal.
Code:
cd /tmp
wget http://download.oracle.com/otn-pub/java/jdk/7u17-b02/jre-7u17-linux-x64.rpm
rpm -ivh jre-7u17-linux-x64.rpm
java -version

Assuming everything was done correctly, you should get the following output.

Code:
# java -version
java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17)
Java HotSpot(TM) 64-Bit Server VM (build 22.0-b10, mixed mode)

Install Elasticsearch

Code:
cd /
curl -L -O -k https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.0.Beta1.zip
unzip elasticsearch-0.90.0.Beta1.zip
mv elasticsearch-0.90.0.Beta1 elasticsearch

Install the Elasticsearch Service Wrapper

Code:
curl -L -k http://github.com/elasticsearch/elasticsearch-servicewrapper/tarball/master | tar -xz
mv *servicewrapper*/service elasticsearch/bin/
elasticsearch/bin/service/elasticsearch install
ln -s `readlink -f elasticsearch/bin/service/elasticsearch` /usr/local/bin/rcelasticsearch
rcelasticsearch start

Assuming everything was done correctly, you should see the following output.

Code:
rcelasticsearch start
Starting ElasticSearch...
Waiting for ElasticSearch......
running: PID: xxxxx

Basic Configuration

You should do some basic configuration of Elasticsearch before installing the addon in XenForo.


1) Open up /elasticsearch/config/elasticsearch.yml and on line 32 edit

Code:
# cluster.name: elasticsearch

To

Code:
cluster.name: PUT-SOMETHING-UNIQUE-HERE

On line 199 edit

Code:
# network.host: 192.168.0.1

to

Code:
network.host: 127.0.0.1

On line 211 edit

Code:
# http.port: 9200

to

Code:
http.port: 9200

Save and Close
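
If you prefer not to edit the file by hand, the three changes above could be scripted roughly as follows. This is only a sketch: set_es_option is a made-up helper, it assumes each option appears once in the form shown above (commented or not), and it assumes the value contains no | or & characters. Back up the file first.

```shell
# Hypothetical helper: turn "# key: old" (or "key: old") into "key: value".
set_es_option() {
    file=$1; key=$2; value=$3
    # Escape dots so "cluster.name" is matched literally by sed.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # Write to a temp file, then copy back so file permissions are kept.
    sed "s|^#* *${key_re}:.*|${key}: ${value}|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example usage (the cluster name is a placeholder):
# set_es_option /elasticsearch/config/elasticsearch.yml cluster.name my-forum-search
# set_es_option /elasticsearch/config/elasticsearch.yml network.host 127.0.0.1
# set_es_option /elasticsearch/config/elasticsearch.yml http.port 9200
```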


2) Open up /elasticsearch/bin/service/elasticsearch.conf and on line 2 edit

Code:
set.default.ES_HEAP_SIZE=1024

To a number suitable for the size of your forum.

I recommend approximately 1 GB for the HEAP_SIZE per 1 million posts on your forum.

1 Million Posts: 1024
2 Million Posts: 2048
3 Million Posts: 3072
4 Million Posts: 4096
etc

This does not mean the service will use all of that memory, but it will have it at its disposal if required.
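
The rule of thumb above can be written as a small calculation. This is a sketch only: heap_mb_for_posts is a made-up name, and rounding up to the next full million (with a 1024 minimum) is my assumption, not part of the recommendation itself.

```shell
# Hypothetical helper: suggested ES_HEAP_SIZE in MB for a given post count,
# using 1024 MB per started million posts and a minimum of 1024.
heap_mb_for_posts() {
    posts=$1
    millions=$(( (posts + 999999) / 1000000 ))
    if [ "$millions" -lt 1 ]; then
        millions=1
    fi
    echo $(( millions * 1024 ))
}

heap_mb_for_posts 3000000   # prints 3072
```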

So for example a 3 Million Post forum would edit

Code:
set.default.ES_HEAP_SIZE=1024

to

Code:
set.default.ES_HEAP_SIZE=3072



Save and Exit.


3) Optional - Move the Elasticsearch data directory.

Your Linux install may be configured in such a way that your install partition is only a few GB in size, and placing a large Elasticsearch index there is not ideal.

In that case, you will want to move the index directory to a different, larger location (in this example, /var/elasticsearch).

Code:
cd /var
mkdir elasticsearch

Open up /elasticsearch/config/elasticsearch.yml and on line 143 edit

Code:
# path.data: /path/to/data

to

Code:
path.data: /var/elasticsearch

Save and Exit

4) Restart the Elasticsearch Service

In SSH type

Code:
rcelasticsearch restart

You should get the following output

Code:
rcelasticsearch restart
Stopping ElasticSearch...
Stopped ElasticSearch.
Starting ElasticSearch...
Waiting for ElasticSearch......
running: PID: xxxxx

Elasticsearch is now running with your updated config.



Install the XenForo Enhanced Search Addon

1) Put your board into maintenance mode*

2) Download the addon from your customer area at http://xenforo.com/customers/

3) Follow the instructions found at http://xenforo.com/help/enhanced-search/

4) Wait for your indexes to be rebuilt

5) Open your board.

6) Install the index pre-warmer.

As of 0.90.0 Beta, an index pre-warmer is available. It keeps your search index "warm" in active memory, so search latency is greatly reduced.

Installing it is simple: in SSH, run the following, replacing *INDEX NAME* with the name of your ES index.

Code:
curl -XPUT localhost:9200/*INDEX NAME*/_warmer/warmer_1 -d '{
    "query" : {
        "match_all" : {}
    }
}'

You should have the following returned

Code:
{"ok":true,"acknowledged":true}
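
If you are scripting this step, a rough way to check the response is to look for the acknowledged flag. The helper name is_acknowledged and the index name xenforo_index below are placeholders, not anything the addon defines.

```shell
# Hypothetical helper: returns 0 if the JSON on stdin reports
# "acknowledged": true.
is_acknowledged() {
    grep -q '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Example usage ("xenforo_index" is a placeholder index name):
# curl -s -XPUT "localhost:9200/xenforo_index/_warmer/warmer_1" \
#      -d '{"query":{"match_all":{}}}' | is_acknowledged && echo "warmer installed"
```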


*You may leave your board open during the re-index process.

Congratulations. Your board should now be running XenForo Enhanced Search.
 
Right yes, found the issue you're having regardless

Assuming you have set up ES as per Floren's guide (most importantly, binding the service to 127.0.0.1), run:

Code:
echo script.disable_dynamic: false >> /etc/elasticsearch/elasticsearch.yml
service elasticsearch restart

@Floren should probably add this to his settings/repo/guide somewhere as the new versions of the Enhanced Search addon require dynamic scripting to use the new time based relevancy settings.
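
Because echo with >> appends a new line every time it is run, a slightly more careful sketch is to append only when the key is not already set. The helper name append_setting_once is made up for illustration. Note that the Elasticsearch documentation quoted later in this thread treats enabling dynamic scripting as a security decision, so only do this if you accept that trade-off.

```shell
# Hypothetical helper: append "key: value" to a file only if no line
# starting with "key:" is already present.
append_setting_once() {
    file=$1; line=$2
    key=${line%%:*}
    if grep -q "^${key}:" "$file" 2>/dev/null; then
        echo "$key is already set in $file; edit it by hand instead." >&2
    else
        echo "$line" >> "$file"
    fi
}

# Example usage:
# append_setting_once /etc/elasticsearch/elasticsearch.yml "script.disable_dynamic: false"
# service elasticsearch restart
```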
 
Thanks for your help, Slavik. I'm not sure I understood your instructions exactly (I ran those commands to add script.disable_dynamic to yml file, but it didn't help), but it did get me thinking about the problem in a different way.

Upon checking the process list, I noticed the ElasticSearch wrapper I had installed from your original guide was still running. Just to make sure everything was as fresh as possible, I uninstalled the Axivo repo ES install with yum and then rebooted the server. When it came back up, the wrapper was gone, and I re-installed ES with yum. Checking out the ES stats confirms it's back on 1.2 now:

Code:
$ curl -XGET 'http://localhost:9200/'
{
  "status" : 200,
  "name" : "Sagittarius",
  "version" : {
    "number" : "1.2.1",
    "build_hash" : "6c95b759f9e7ef0f8e17f77d850da43ce8a4b364",
    "build_timestamp" : "2014-06-03T15:02:52Z",
    "build_snapshot" : false,
    "lucene_version" : "4.8"
  },
  "tagline" : "You Know, for Search"
}

...and better yet, XenForo was finally able to communicate with it. :D
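
For anyone who wants to check the running version from a script rather than by eye, here is a minimal sketch that pulls the version number out of that JSON with sed. The function name es_version_number is hypothetical, and it assumes the response is pretty-printed with "number" on its own line, as in the output above.

```shell
# Hypothetical helper: print the value of the first "number" field
# found in the JSON on stdin.
es_version_number() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example usage against a local node:
# curl -s http://localhost:9200/ | es_version_number
```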

I did have one more problem - upon rebuilding the search index, I upped the "items to process per page" to 25,000, the value I had used on my last install. It completed the first batch fine, but then crashed on the second, and using XenForo's deferred process resume functionality didn't work - it was just completely stuck until I went in and deleted the RebuildSearchIndex task from the xf_deferred table.

I started again with a much smaller number of items per page (5,000) and it seems to be working much better now - I'm 2 million posts in to a 9.7 million post re-index.

Of course, this is probably due to some difference in heap size or memory allocation in Floren's default settings (I never changed anything aside from what's mentioned in the install guide), but it's something to keep in mind if someone experiences this problem in the future.

Now that everything is running fine, @Slavik, should I still add the script.disable_dynamic line to my elasticsearch.yml config?
 
Yes, it's required if you want to use date-weighted relevancy.

You can change the heap size in /etc/sysconfig/elasticsearch. It will be 1 GB at the moment, so raise it to whatever you need.
 
Great, thanks Floren!

Can we just run...
Code:
# yum --enablerepo=axivo update elasticsearch

...to update? Are any other precautions necessary (rebuilding search index, etc.)?
 
Right yes, found the issue you're having regardless

Assuming you have set up ES as per Floren's guide (most importantly, binding the service to 127.0.0.1), run:

Code:
echo script.disable_dynamic: false >> /etc/elasticsearch/elasticsearch.yml
service elasticsearch restart

@Floren should probably add this to his settings/repo/guide somewhere as the new versions of the Enhanced Search addon require dynamic scripting to use the new time based relevancy settings.
I don't see any script related sections in their example configuration, thanks for the tip. :)
If you have time, please link me the documentation on their "script" prefixes.
 
@DeltaHF, yes, it is confusing. Their settings documentation says:
Uncomment if you want to disable JSONP as a valid return transport on the http server. With this enabled, it may pose a security risk, so disabling it unless you need it is recommended.
I'm going to update the configuration file to make it clearer. New setting:
# HTTP JSONP Transport Enabled
# Disables JSONP as valid return transport on the http server. By default,
# is enabled and poses a security risk.
# The default is true.
#http.jsonp.enable: false
I'm uploading the new rpm with the clearer setting.
 
I actually did not uncomment it, because I thought it was set to "false" by default.
I don't know what to think, I looked at Elastic documentation and I cannot find a way to return all default settings currently enabled. If anyone knows how to do it please let me know. But you are right... on their site documentation it says it is disabled by default, so the new configuration I have now is:
# HTTP JSONP Transport Enabled
# Enables JSONP as valid return transport on the http server. If enabled,
# it might pose a security risk.
# The default is false.
#http.jsonp.enable: false
Yet, on their configuration file they state the opposite???
 
@Floren should probably add this to his settings/repo/guide somewhere as the new versions of the Enhanced Search addon require dynamic scripting to use the new time based relevancy settings.
Thanks for the tip, Slavik, but that presents a security risk. In their documentation they clearly advise against enabling dynamic scripting (that is, against setting script.disable_dynamic to false) unless you trust your users:
First, you should not run Elasticsearch as the root user, as this would allow a script to access or do anything on your server, without limitations. Second, you should not expose Elasticsearch directly to users, but instead have a proxy application in between. If you do intend to expose Elasticsearch directly to your users, then you have to decide whether you trust them enough to run scripts on your box or not. If you do, you can enable dynamic scripting by adding the script.disable_dynamic: false setting to the config/elasticsearch.yml file on every node.
I added 4 new options into /etc/elasticsearch/elasticsearch.yml:
# Default Scripting Language
# The default is groovy.
#script.default_lang: groovy

# Dynamic Scripting
# Disables running dynamic scripts. If enabled, it exposes Elasticsearch
# directly to users and might pose a security risk.
# The default is sandbox.
#script.disable_dynamic: true


# Script Reloading Enabled
# Enables new and changed scripts being reloaded, while deleted scripts being
# removed from preloaded scripts cache.
# The default is true.
#script.auto_reload_enabled: true

# Script Reloading Interval
# The default is 60 seconds.
#watcher.interval: 60s
Personally, I will NOT enable dynamic scripting on my servers; right now it is set to script.disable_dynamic: true. Let's work on this and find the proper way to pass scripts to XES. For example, something like the following can be done easily with Nginx. It is secure and will also limit the number of open connections:
Code:
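# Flag any request whose URI touches the _cluster API.
# (map belongs in the http context; "~" makes the value a regex.)
map $uri $filter {
        default          0;
        ~/_cluster       1;
}

# This location block belongs inside a server block.
location / {
        # Reject flagged requests before anything is proxied.
        if ($filter) {
                return 403;
        }
        # Only search requests are passed through to Elasticsearch.
        location ~ ^/.*/_search$ {
                proxy_pass        http://127.0.0.1:9200;
                proxy_set_header  X-Real-IP       $remote_addr;
                proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header  Host            $http_host;
                proxy_redirect    off;
        }
}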
That's off the top of my head and untested; it's just to get you started on the logic. I'm going to bed now and will look into it tomorrow.
 
@Slavik, I was talking about proxying Elasticsearch through Nginx, which is much more efficient.
script.disable_dynamic: false is not needed anymore.
Does XES 1.1.0 provide the .groovy files? .mvel is deprecated and will be removed in 1.4.0.
I added the /etc/elasticsearch/scripts directory into latest rpm that I recreated just now. Simply do:
# yum --enablerepo=axivo clean all
# yum --enablerepo=axivo reinstall elasticsearch
It will add the missing /etc/elasticsearch/scripts directory without overwriting your edited .yml files. Do not create the directory manually: installing it through the package registers it in the yum database, so it is removed if you ever perform an uninstall. I have this option set to true:
# Dynamic Scripting
# Disables running dynamic scripts. If enabled, it exposes Elasticsearch
# directly to users and might pose a security risk.
# The default is sandbox.
script.disable_dynamic: true
 