Monday, September 19, 2016

[ElasticSearch] ElasticSearch Cluster Installation & memory problems

Hi everyone,

Today I am writing my first post about ElasticSearch.

In this post, I will install an ElasticSearch cluster on 3 nodes and talk about the memory problem we hit during the installation, namely:

Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).

ElasticSearch 1.7.1, Linux 6


As you probably know, ElasticSearch is a search engine built on top of Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP interface and schema-free JSON documents.

Here, we have 3 identical nodes with the following properties:

$uname -a
Linux NODE1 2.6.32-573.12.1.el6.x86_64 #1 SMP Mon Nov 23 12:55:32 EST 2015 x86_64 x86_64 x86_64 GNU/Linux

2 CPUs and 16 GB RAM each


Let's move on to the installation.

Create a data directory for the ElasticSearch installation and give ownership to the elastic user:

$ mkdir /elastic
$ chown -R elastic:elastic /elastic
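If the elastic OS user does not exist yet, create it first as root (a plain local user is enough; the name matches the chown command above):

$ useradd -m elastic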

After that, extract the ElasticSearch binaries into that directory:

$ cd /elastic
$ cp /tmp/elasticsearch-1.7.1.tar .
$ tar xvf elasticsearch-1.7.1.tar

..

Now modify the configuration and change the parameters below:

$ cd elasticsearch-1.7.1/config
$ vi elasticsearch.yml


cluster.name: CLUSTERNAME
node.name: "NODE1"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["NODE2:9600", "NODE3:9600"]
discovery.zen.minimum_master_nodes: 2
transport.tcp.port: 9600                   # custom transport port
http.port: 9500                            # custom HTTP port
#Add below lines : 
http.cors.enabled: true
http.cors.allow-origin: "*"
# enable memory lock
bootstrap.mlockall: true

Make the same changes on the remaining nodes, adjusting node.name and the unicast host list for each node. (discovery.zen.minimum_master_nodes is set to 2 because the quorum for 3 master-eligible nodes is (3 / 2) + 1 = 2, which protects against split-brain.)
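For example, on NODE2 the node-specific lines would look like this (the unicast list always points at the other two nodes):

node.name: "NODE2"
discovery.zen.ping.unicast.hosts: ["NODE1:9600", "NODE3:9600"]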

After that, modify the memory parameters on each node:

$ cd elasticsearch-1.7.1/bin
$ vi elasticsearch.in.sh

..

Change ES_MAX_MEM=1g to half of the physical memory; since each node has 16 GB of RAM, we set it to 8g here.

Add JAVA_HOME=PATH_TO_JAVA next to the other Java-related lines.

Save the file.
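A minimal sketch of the edited section, assuming the stock elasticsearch.in.sh shipped with the 1.7.1 tarball (PATH_TO_JAVA is a placeholder):

# elasticsearch-1.7.1/bin/elasticsearch.in.sh (excerpt)

# point the startup script at the Java installation to use
JAVA_HOME=PATH_TO_JAVA

if [ "x$ES_MAX_MEM" = "x" ]; then
    ES_MAX_MEM=8g        # was 1g; half of the 16 GB physical memory
fi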

Now edit the .bash_profile of the elastic user:

$ cd 
$ vi .bash_profile

export JAVA_HOME=PATH_TO_JAVA    
PATH=$JAVA_HOME/bin:$PATH:$HOME/bin
export PATH

Source the file to pick up the changes:

$ . ./.bash_profile
$ which java
PATH_TO_JAVA/bin/java

Do not forget to make these changes on all nodes.

Now it's time to start ElasticSearch:

$ cd /elastic/elasticsearch-1.7.1/bin/
$ ./elasticsearch -d
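Since -d detaches the process, follow the startup through the log file; with the default path.logs it is written under the installation directory and named after the cluster:

$ tail -f /elastic/elasticsearch-1.7.1/logs/CLUSTERNAME.log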

When you check the logs, you see the following:

[2016-09-19 10:52:55,392][WARN ][bootstrap                ] Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out. Increase RLIMIT_MEMLOCK (ulimit).
[2016-09-19 10:52:55,856][INFO ][node                     ] [NODE1] version[1.7.1], pid[11469], build[b88f43f/2015-07-29T09:54:16Z]
[2016-09-19 10:52:55,856][INFO ][node                     ] [NODE1] initializing ...
[2016-09-19 10:52:56,111][INFO ][plugins                  ] [NODE1] loaded [], sites []
[2016-09-19 10:52:56,193][INFO ][env                      ] [NODE1] using [1] data paths, mounts [[/elastic (/dev/mapper/elasticmap)]], net usable_space [85.1gb], net total_space [86.4gb], types [ext4]
[2016-09-19 10:53:00,346][INFO ][node                     ] [NODE1] initialized
[2016-09-19 10:53:00,346][INFO ][node                     ] [NODE1] starting ...
[2016-09-19 10:53:00,477][INFO ][transport                ] [NODE1] bound_address {inet[/0:0:0:0:0:0:0:0:9600]}, publish_address {inet[/IPOFNODE1:9600]}
[2016-09-19 10:53:00,493][INFO ][discovery                ] [NODE1] CLUSTERNAME/Iz75wvhBSjS-Bplg_cL0Yw
[2016-09-19 10:53:30,493][WARN ][discovery                ] [NODE1] waited for 30s and no initial state was set by the discovery
[2016-09-19 10:53:30,506][INFO ][http                     ] [NODE1] bound_address {inet[/0:0:0:0:0:0:0:0:9500]}, publish_address {inet[/IPOFNODE1:9500]}
[2016-09-19 10:53:30,506][INFO ][node                     ] [NODE1] started

The warning in the first line is caused by the user limits. When we look at the ulimit values of the elastic user, we see the following:

$ ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63639
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited


As you can see, the max locked memory value is only 64 KB, but we configured an 8 GB heap with bootstrap.mlockall enabled, so the JVM cannot lock its memory. Make the limit unlimited as the root user; one common way is shown below.
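A minimal sketch, assuming limits are managed through /etc/security/limits.conf (edit as root, then log the elastic user out and back in so the new limit takes effect):

# /etc/security/limits.conf
elastic    soft    memlock    unlimited
elastic    hard    memlock    unlimited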

After that, you should see this for the elastic user:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63639
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited


After this, kill the ElasticSearch process and start it again.
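Since we started it with -d and there is no service script, a simple way to restart it (same paths as above):

$ ps -ef | grep elasticsearch      # find the PID of the running node
$ kill <PID>                       # graceful shutdown
$ cd /elastic/elasticsearch-1.7.1/bin/
$ ./elasticsearch -d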

Now there is no error in the logs:

[2016-09-19 11:20:55,573][INFO ][plugins                  ] [NODE1] loaded [], sites []
[2016-09-19 11:20:55,637][INFO ][env                      ] [NODE1] using [1] data paths, mounts [[/elastic (/dev/mapper/elasticmap)]], net usable_space [85.1gb], net total_space [86.4gb], types [ext4]

[2016-09-19 11:20:59,420][INFO ][node                     ] [NODE1] initialized
[2016-09-19 11:20:59,420][INFO ][node                     ] [NODE1] starting ...
[2016-09-19 11:20:59,641][INFO ][transport                ] [NODE1] bound_address {inet[/0:0:0:0:0:0:0:0:9600]}, publish_address {inet[/IPOFNODE1:9600]}
[2016-09-19 11:20:59,657][INFO ][discovery                ] [NODE1] CLUSTERNAME/aU8RHAEmSaqXLTWw9DfHCg


Check that mlockall is really enabled with the following request:

$ curl http://localhost:9500/_nodes/process?pretty

{
  "cluster_name" : "CLUSTERNAME",
  "nodes" : {
    "mOsNQ2MhQwqQIE1_SN5Srg" : {
      "name" : "NODE1",
      "transport_address" : "inet[/IPOFNODE1:9600]",
      "host" : "NODE1",
      "ip" : "IPOFNODE1",
      "version" : "1.7.1",
      "build" : "b88f43f",
      "http_address" : "inet[/IPOFNODE1:9500]",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 13859,
        "max_file_descriptors" : 65536,
        "mlockall" : true                                      
      }
    }
  }
}

OK, mlockall is true, so it looks correct. Now start the other nodes.

> Start NODE2
> In its logs you should see:

[2016-09-19 11:20:28,807][INFO ][node                     ] [NODE2] initialized
[2016-09-19 11:20:28,808][INFO ][node                     ] [NODE2] starting ...
[2016-09-19 11:20:29,118][INFO ][transport                ] [NODE2] bound_address {inet[/0:0:0:0:0:0:0:0:9600]}, publish_address {inet[/IPOFNODE2:9600]}
[2016-09-19 11:20:29,140][INFO ][discovery                ] [NODE2] CLUSTERNAME/TrF8CHbDRpGsY7t3Hl5GLw
[2016-09-19 11:20:32,265][INFO ][cluster.service          ] [NODE2] new_master [NODE2][TrF8CHbDRpGsY7t3Hl5GLw][NODE2][inet[/IPOFNODE2:9600]], reason: zen-disco-join (elected_as_master)
[2016-09-19 11:20:32,291][INFO ][http                     ] [NODE2] bound_address {inet[/0:0:0:0:0:0:0:0:9500]}, publish_address {inet[/IPOFNODE2:9500]}
[2016-09-19 11:20:32,292][INFO ][node                     ] [NODE2] started
[2016-09-19 11:20:34,171][INFO ][cluster.service          ] [NODE2] added {[NODE1][mOsNQ2MhQwqQIE1_SN5Srg][NODE1][inet[/IPOFNODE1:9600]],}, reason: zen-disco-receive(join from node[[NODE1][mOsNQ2MhQwqQIE1_SN5Srg][NODE1][inet[/IPOFNODE1:9600]]])
[2016-09-19 11:20:34,332][INFO ][gateway                  ] [NODE2] recovered [0] indices into cluster_state


> Start NODE3
> In its logs you should see:


[2016-09-19 11:21:02,831][INFO ][cluster.service          ] [NODE3] detected_master [NODE2][TrF8CHbDRpGsY7t3Hl5GLw][NODE2][inet[/IPOFNODE2:9600]], added {[NODE2][TrF8CHbDRpGsY7t3Hl5GLw][NODE2][inet[/IPOFNODE2:9600]],[NODE1][mOsNQ2MhQwqQIE1_SN5Srg][NODE1][inet[/IPOFNODE1:9600]],}, reason: zen-disco-receive(from master [[NODE2][TrF8CHbDRpGsY7t3Hl5GLw][NODE2][inet[/IPOFNODE2:9600]]])
[2016-09-19 11:21:02,927][INFO ][http                     ] [NODE3] bound_address {inet[/0:0:0:0:0:0:0:0:9500]}, publish_address {inet[/IPOFNODE3:9500]}
[2016-09-19 11:21:02,927][INFO ][node                     ] [NODE3] started


After that, check the node list again; all three nodes should show up with mlockall enabled:

$ curl http://localhost:9500/_nodes/process?pretty
{
  "cluster_name" : "CLUSTERNAME",
  "nodes" : {
    "aU8RHAEmSaqXLTWw9DfHCg" : {
      "name" : "NODE3",
      "transport_address" : "inet[/IPOFNODE3:9600]",
      "host" : "NODE3",
      "ip" : "IPOFNODE3",
      "version" : "1.7.1",
      "build" : "b88f43f",
      "http_address" : "inet[/IPOFNODE3:9500]",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 7624,
        "max_file_descriptors" : 65536,
        "mlockall" : true
      }
    },
    "TrF8CHbDRpGsY7t3Hl5GLw" : {
      "name" : "NODE2",
      "transport_address" : "inet[/IPOFNODE2:9600]",
      "host" : "NODE2",
      "ip" : "IPOFNODE2",
      "version" : "1.7.1",
      "build" : "b88f43f",
      "http_address" : "inet[/IPOFNODE2:9500]",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 7863,
        "max_file_descriptors" : 65536,
        "mlockall" : true
      }
    },
    "mOsNQ2MhQwqQIE1_SN5Srg" : {
      "name" : "NODE1",
      "transport_address" : "inet[/IPOFNODE1:9600]",
      "host" : "NODE1",
      "ip" : "IPOFNODE1",
      "version" : "1.7.1",
      "build" : "b88f43f",
      "http_address" : "inet[/IPOFNODE1:9500]",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 13859,
        "max_file_descriptors" : 65536,
        "mlockall" : true
      }
    }
  }
}
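To confirm at a glance that memory locking is enabled on every node, you can simply grep the same output; with all three nodes joined you should get three matches:

$ curl -s 'http://localhost:9500/_nodes/process?pretty' | grep mlockall
        "mlockall" : true
        "mlockall" : true
        "mlockall" : true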

And check the cluster health like this:

$ curl 'localhost:9500/_cat/health?v'
epoch      timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks 
1474302630 19:30:30 CLUSTERNAME green 

OK, everything is fine now.

NOTE: If you still hit memory limits after the memlock change, the memory is probably sitting in the page cache because of the tar extraction; consider flushing the cache (or rebooting) before starting ElasticSearch.
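On Linux the page cache can be dropped as root without a reboot (data on disk is safe, but I/O will be slower until the cache warms up again):

$ sync
$ echo 3 > /proc/sys/vm/drop_caches     # run as root: drops page cache, dentries and inodes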

Ok, that is all

Thanks for reading.

Enjoy & share.

Source:
https://www.elastic.co/products/elasticsearch
