Elasticsearch: administration

See also ElasticHQ and Elasticsearch Concepts

Colours

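Yellow => all primary shards are allocated but some replicas are not (often because a node is down)

Red => at least one primary shard is unallocated, so data is missing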
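To see which colour the cluster is currently in, you can ask the cat health API. The snippet below is a minimal sketch: it assumes a node answering on 0:9200 (as in the other examples on this page), and the helper name health_meaning is made up for illustration.

```shell
# Hypothetical helper: translate a cluster status colour into its meaning.
health_meaning() {
  case "$1" in
    green)  echo "all primary and replica shards are allocated" ;;
    yellow) echo "all primaries are allocated, some replicas are not" ;;
    red)    echo "at least one primary shard is unallocated" ;;
    *)      echo "unknown status: $1" ;;
  esac
}

# Usage against a live cluster (assumes a node on 0:9200):
#   status=$(curl -s '0:9200/_cat/health?h=status' | tr -d '[:space:]')
#   health_meaning "$status"
```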

Upgrades

Do a rolling restart

  1. stop a Master node
  2. upgrade Master (don’t overwrite config)
  3. restart Master
  4. repeat for Data nodes
    1. note: when a data node is taken offline the shards will reshuffle and the cluster takes a big performance hit
    2. to avoid this, set cluster.routing.allocation.enable to none before stopping the node. This disables shard allocation on the cluster, which is useful if you know you're going to take a node offline for just a few minutes. Use:
    3. curl -XPUT 0:9200/_cluster/settings -H 'Content-Type: application/json' -d '{ "transient" : { "cluster.routing.allocation.enable" : "none" }}'
    4. once the node has rejoined, run the same command with "all" in place of "none" to re-enable allocation
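The disable and re-enable calls differ only in the value sent, so they can be wrapped in one small function. This is a sketch rather than a tested upgrade script: allocation_body, set_allocation and the ES_URL variable are names invented here, and the node is assumed to listen on 0:9200.

```shell
# Hypothetical helper: build the settings body for a given allocation mode
# ("none" before stopping a node, "all" once it has rejoined).
allocation_body() {
  printf '{"transient":{"cluster.routing.allocation.enable":"%s"}}\n' "$1"
}

# Apply the setting; ES_URL is an assumed variable, defaulting to 0:9200.
set_allocation() {
  curl -s -XPUT "${ES_URL:-0:9200}/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(allocation_body "$1")"
}

# Usage around a single node restart:
#   set_allocation none   # stop shards being reassigned
#   ... stop, upgrade and restart the node ...
#   set_allocation all    # let the cluster recover
```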

Restore indices from a snapshot

POST /_snapshot/my_backup/snapshot_1/_restore

Remember to close the index before restoring it and to open it again afterwards.

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
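The close, restore, open sequence could look roughly like this. It is a sketch under assumptions: my_index is a placeholder index name, ES_URL and the function names are invented here, and wait_for_completion=true is added so the open call is not sent while the restore is still running. With no request body, the restore call restores every index in the snapshot.

```shell
# Hypothetical helper: print the three requests, in order, for restoring one index.
# my_backup and snapshot_1 are the repository and snapshot names used above.
restore_requests() {
  echo "POST /$1/_close"
  echo "POST /_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true"
  echo "POST /$1/_open"
}

# Send each request with curl; ES_URL is an assumed variable, defaulting to 0:9200.
run_restore() {
  restore_requests "$1" | while read -r method path; do
    curl -s -X"$method" "${ES_URL:-0:9200}$path"
    echo
  done
}

# Usage (my_index is a placeholder index name):
#   run_restore my_index
```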

Delete indices

Use Curator.

curator_cli --host <host> show_indices

Curator issues

Unable to create client connection to Elasticsearch. Error: Elasticsearch version 1.7.5 incompatible with this version of Curator (5.5.4)

See https://www.elastic.co/guide/en/elasticsearch/client/curator/current/version-compatibility.html

Install Curator 3.5.0

pip install elasticsearch-curator==3.5.0

https://www.elastic.co/guide/en/elasticsearch/client/curator/3.5/installation.html

Annoyingly, the syntax between versions is completely different. E.g.

Curator 3: curator show indices

Curator 5: curator_cli show_indices

And for this error:

ERROR. At least one filter must be supplied.

in Curator 3, you need something like this:

curator --host 10.33.12.203 show indices --regex '.'

Configuration

gateway.recover_after_nodes: 8

E.g. set this to 8 so that recovery only starts once 8 nodes have joined, which avoids thrashing after a full cluster restart.

See also:

gateway.recover_after_time: 5m

gateway.expected_nodes: 2

discovery.zen.ping.multicast.enabled: false
discovery.zen.minimum_master_nodes: 2
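The usual rule for discovery.zen.minimum_master_nodes is a majority of the master-eligible nodes, i.e. (N / 2) + 1 with integer division. A tiny sketch of the arithmetic, where the helper name quorum is made up for illustration:

```shell
# Hypothetical helper: majority of N master-eligible nodes, using integer division.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Usage: with 3 master-eligible nodes the result matches the setting above.
#   quorum 3
```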

Data nodes:

node.master: false
node.data: true

http.enabled: false

Master nodes:

node.master: true
node.data: false

JVM

Don’t change garbage collection or thread pools.

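Do change heap size. Use half of available memory, up to a little under 32GB.

  • 32GB is enough
  • Lucene needs the rest (for the filesystem cache)
  • Heap size > 32GB is inefficient, because the JVM can no longer use compressed object pointers

E.g. ES_HEAP_SIZE=30g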

/usr/bin/java -Xms30g -Xmx30g
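The "half of memory, but not too much" rule can be written down as a small calculation. This is only a sketch: heap_gb is an invented helper, and the 30GB cap is an assumption taken from the -Xms30g/-Xmx30g example above, not a hard limit.

```shell
# Hypothetical helper: half of total RAM in GB, capped at 30 GB.
# The 30 GB cap is an assumption taken from the -Xms30g/-Xmx30g example.
heap_gb() {
  half=$(( $1 / 2 ))
  if [ "$half" -gt 30 ]; then half=30; fi
  echo "$half"
}

# Usage on Linux: read total memory (kB) from /proc/meminfo and convert to GB.
#   total_gb=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo)
#   export ES_HEAP_SIZE="$(heap_gb "$total_gb")g"
```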

Swap

Swapping can harm performance. Disable it by removing the swap entries from /etc/fstab.

Alternatively, set bootstrap.mlockall: true in elasticsearch.yml to lock memory and prevent it being swapped.
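To check that the memory lock actually took effect, the nodes info API reports an mlockall flag per node. The sketch below assumes a node on 0:9200; mlockall_enabled is a made-up helper that simply greps the JSON rather than parsing it properly.

```shell
# Hypothetical helper: reads node info JSON on stdin and succeeds only if
# at least one node reports mlockall as true and no node reports false.
mlockall_enabled() {
  input=$(cat)
  echo "$input" | grep -q '"mlockall" *: *true' || return 1
  if echo "$input" | grep -q '"mlockall" *: *false'; then return 1; fi
  return 0
}

# Usage (assumes a node on 0:9200; swapoff needs root):
#   sudo swapoff -a    # turn swap off now; the /etc/fstab edit makes it permanent
#   curl -s '0:9200/_nodes/process?pretty' | mlockall_enabled && echo "memory locked"
```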

File Descriptors / MMap

File Descriptors: 64,000

MMap: unlimited

See also https://docs.oracle.com/cd/E23389_01/doc.11116/e21036/perf002.htm

and sysctl vm.max_map_count

Maximum number of mmap()’ed ranges and how to set it on Linux?
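The current limits can be inspected from the shell before starting Elasticsearch. This is a sketch: fd_limit_ok is an invented helper, and reading "MMap: unlimited" as the address space limit shown by ulimit -v is an assumption.

```shell
# Hypothetical helper: succeeds if a file-descriptor limit meets the 64,000 target.
fd_limit_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge 64000 ]
}

# Usage on Linux, run as the user that starts Elasticsearch:
#   fd_limit_ok "$(ulimit -n)" || echo "raise the open files limit"
#   ulimit -v                  # address space limit; ideally prints unlimited
#   sysctl vm.max_map_count    # current maximum number of mmap()'ed ranges
```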
