curl: tips – silent, using variables in bash

silent

-s => silent – no progress bar (and no error messages)

-S => show errors even when -s is used

https://stackoverflow.com/questions/7373752/how-do-i-get-curl-to-not-show-the-progress-bar
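
A common combination is -sS – silent, but still showing errors. A trivial example (the URL is just a placeholder):

curl -sS "https://example.com/" -o /dev/null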

bash variables

Using a variable in a curl call within Bash – wrap the argument in double quotes so the variable expands, and escape any quotes the payload itself needs. E.g.

https://stackoverflow.com/questions/40852784/how-can-i-use-a-variable-in-curl-call-within-bash-script
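
A minimal sketch (the host, index and field names here are made up):

NAME="foo"
curl -s -XPOST "localhost:9200/my-index/_doc" -H 'Content-Type: application/json' -d "{ \"name\": \"$NAME\" }"

The -d payload is wrapped in double quotes so that $NAME expands, and the quotes JSON itself needs are escaped with backslashes.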

processing curl output

e.g. I wanted to count the lines from a GET listing an Elasticsearch cluster's yellow indices. This seemed the obvious approach:

curl 0:9200/_cat/indices?v&health=yellow | wc -l

but there are a few things wrong here.

1. it output everything to the terminal, including all the green indices, and then seemed to hang until I hit Enter – not what I was expecting

Solution:

there's an unquoted ampersand in the URL, so the shell runs curl in the background at the & and treats health=yellow | wc -l as a separate command – quote the URL.

2. also there’s progress output from curl

Solution:

silence with -s

curl -s "0:9200/_cat/indices?v&health=yellow" | wc -l

3. you still get the header line from Elasticsearch, i.e.

health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
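
Solution:

skip the header line with tail before counting (or drop the v parameter, which is what adds the header):

curl -s "0:9200/_cat/indices?v&health=yellow" | tail -n +2 | wc -l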


Elasticsearch: ElasticHQ

What I’m hoping for is that it will make tasks simpler:

1. to see Unassigned shards I currently:

ssh into the node then:

curl -X GET "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | grep UNASS

The output isn’t easy to understand. E.g.

index.1 0 p UNASSIGNED NODE_LEFT
index.2 8 p UNASSIGNED INDEX_CREATED
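
One thing that can help decode these reasons (assuming the cluster is on Elasticsearch 5.0 or later, where this API exists) is the allocation explain API, which picks an unassigned shard and explains why it can't be allocated:

curl -s -XGET "localhost:9200/_cluster/allocation/explain?pretty"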

 

ElasticHQ does mention that it will:

  • Monitor Nodes, Indices, Shards, and general cluster metrics.

However, if it's via the Diagnostics pane then it takes ages – an indefinite amount of time. I'm currently around 15 minutes in and still seeing JSON scroll past in my console without anything appearing in the GUI.

 

2. delete an index

3. stop shard allocation


More info:

Elasticsearch: administration

See also ElasticHQ and Elasticsearch Concepts

Colours

Yellow => some replica shards are unassigned (e.g. a node is down)

Red => some primary shards are unassigned – data is missing

Upgrades

Do a rolling restart

  1. stop a Master node
  2. upgrade Master (don’t overwrite config)
  3. restart Master
  4. repeat for Data nodes
    1. note: when a data node is taken offline the shards will reshuffle and the cluster takes a big performance hit
    2. to avoid this, set cluster.routing.allocation.enable to none to fully disable shard allocation on the cluster. Useful if you know you're going to take a node offline for just a few minutes. Use:
    3. curl -XPUT 0:9200/_cluster/settings -H 'Content-Type: application/json' -d '{ "transient" : { "cluster.routing.allocation.enable" : "none" }}'
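
Once the node is back in the cluster, re-enable allocation by setting the same property back to all:

curl -XPUT 0:9200/_cluster/settings -H 'Content-Type: application/json' -d '{ "transient" : { "cluster.routing.allocation.enable" : "all" }}'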

Restore indices from a snapshot

POST /_snapshot/my_backup/snapshot_1/_restore

but remember to close the index before and open it after.

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
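
A minimal sketch of that sequence (the my_index name is illustrative; my_backup and snapshot_1 are as above):

curl -XPOST "localhost:9200/my_index/_close"
curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore" -H 'Content-Type: application/json' -d '{ "indices" : "my_index" }'
curl -XPOST "localhost:9200/my_index/_open"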

Delete indices

Use Curator.

curator_cli --host <host> show_indices
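
For example, to delete indices matching a prefix with the Curator 5 singleton CLI (the logstash- prefix is just an illustration):

curator_cli --host <host> delete_indices --filter_list '{"filtertype":"pattern","kind":"prefix","value":"logstash-"}'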

curator Issues

Unable to create client connection to Elasticsearch. Error: Elasticsearch version 1.7.5 incompatible with this version of Curator (5.5.4)

See https://www.elastic.co/guide/en/elasticsearch/client/curator/current/version-compatibility.html

Install Curator 3.5.0

pip install elasticsearch-curator==3.5.0

https://www.elastic.co/guide/en/elasticsearch/client/curator/3.5/installation.html

 

Annoyingly, the syntax between versions is completely different. E.g.

Curator 3: curator, show indices

Curator 5: curator_cli, show_indices

And for this error:

ERROR. At least one filter must be supplied.

in Curator 3, you need something like this:

curator --host 10.33.12.203 show indices --regex '.'

Configuration

gateway.recover_after_nodes: 8

E.g. 8, so recovery only starts once 8 nodes have joined – avoids thrashing after an initial full-cluster restart.

See also:

gateway.recover_after_time: 5m

gateway.expected_nodes: 2

discovery.zen.ping.multicast.enabled: false
discovery.zen.minimum_master_nodes: 2

Data nodes:

node.master: false
node.data: true

http.enabled: false

Master nodes:

node.master: true
node.data: false

JVM

Don’t change garbage collection or thread pools.

Do change heap size. Use half of available memory. e.g. 32GB.

  • 32GB is enough
  • Lucene needs the rest
  • Heap size > 32GB is inefficient (the JVM loses compressed object pointers above ~32GB)

E.g. ES_HEAP_SIZE=32g

/usr/bin/java -Xms30g -Xmx30g

Swap

Can harm performance. Disable it by removing the swap entries from /etc/fstab.

Alternatively, set bootstrap.mlockall: true in elasticsearch.yml to lock memory and prevent it being swapped.
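
To turn swap off immediately as well (a standard Linux step, nothing Elasticsearch-specific), and – assuming a version of Elasticsearch recent enough to support filter_path – to check the lock took effect:

sudo swapoff -a
curl -s "localhost:9200/_nodes?filter_path=**.mlockall&pretty"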

File Descriptors / MMap

File Descriptors: 64,000

MMap: unlimited

See also https://docs.oracle.com/cd/E23389_01/doc.11116/e21036/perf002.htm

and sysctl vm.max_map_count

Maximum number of mmap()’ed ranges and how to set it on Linux?
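
For example, to check and raise the mmap limit (262144 is the value the Elasticsearch docs commonly recommend – "unlimited" above presumably just means "high enough"):

sysctl vm.max_map_count
sudo sysctl -w vm.max_map_count=262144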


Elasticsearch: another snake's nest

I ran into a problem with UNASSIGNED shards in an Elasticsearch cluster after experiencing a degraded EC2 instance.

I did the usual steps to solve it:

See Elasticsearch: degraded instance

and ran into a heap of problems.

I installed curator_cli using

pip install elasticsearch-curator

but found that, despite reams of pages using the command delete_indices, curator kept saying No such command.

https://stackoverflow.com/questions/52794903/elasticsearch-curator-cli-error-error-no-such-command-delete-indices

 

Someone else had experienced this:

https://discuss.elastic.co/t/not-able-to-use-delete-indices-for-curator-cli/151490

The solution seemed to be to downgrade some library called Click.

https://github.com/elastic/curator/issues/1279

 

So I downgraded Click with:

pip install click==6.7 elasticsearch-curator==5.5.4

Now I’m running into

elasticsearch.exceptions.ElasticsearchException: Unable to create client connection to Elasticsearch. Error: Elasticsearch version 1.7.5 incompatible with this version of Curator (5.5.4)

It seems there’s a compatibility issue.

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/version-compatibility.html

 

Elasticsearch Concepts

See also Elasticsearch: administration

Indexes: store the JSON documents; an index's data is held in one or more shards

Shards: each shard is a complete Lucene index (a self-contained search database)

 

Indices can be stored in multiple shards.

And if we have multiple nodes, shards will migrate across them as needed – aka rebalancing.

 

Replicas: an exact duplicate of a shard (except designated as a Replica). Can configure as many replicas per shard as you like.

E.g. here you have 2 shards plus 2 replicas, which together hold the one index (I01).

[Image: http://www.snowcrash.eu/wp-content/uploads/2018/10/Screen-Shot-2018-10-16-at-11.05.46-AM-768x449.png]

Replicas are read-only and can serve read requests, thereby increasing scale.

https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html
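
For instance, the replica count can be changed per index at any time (the index name my_index is illustrative):

curl -XPUT "localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d '{ "index" : { "number_of_replicas" : 2 } }'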

 

Node Roles

All the roles can run on a single node, but the cluster is more efficient if you separate them out.

Data:

  • the hardest working nodes
  • hold the shards (i.e. the data)
  • don't typically receive client queries directly
  • tend to be the beefiest machines

Client

  • gateway to the cluster
  • adding them gives a big increase in performance
  • handle all query requests and redirect them to the data nodes

Master

  • brains of cluster
  • maintains cluster state
  • all nodes have a copy of the state but only the master can update cluster state

Capacity Planning

Data Nodes

To test, run a load test against one node until the node is completely saturated.

E.g. if 1M documents on 1 node gives a 4.0 second response time, you'll probably need 4 nodes to get down to 1.0 second.

Master Nodes

To avoid a split-brain scenario: set minimum_master_nodes to (number of master-eligible nodes / 2) + 1

Should have at least 3.
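
E.g. with 3 master-eligible nodes: (3 / 2) + 1 = 2 (integer division), which matches discovery.zen.minimum_master_nodes: 2 in the configuration above.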

Client Nodes

Could exist behind a load balancer.

E.g. in summary, a setup could be:

4 data nodes, 3 master nodes, 2 client nodes – i.e. a total of 9 nodes.

Server Requirements

  • CPU: the more cores the better (favour core count over clock speed) – i.e. better to run more processes concurrently than to run them faster
  • RAM: 64GB for Data nodes is ideal (e.g. in AWS an i3.2xlarge – https://aws.amazon.com/ec2/instance-types/i3/ )
  • Disks: fastest disks possible. RAID 0 is fine for extra speed – it isn't fault tolerant, but Elasticsearch's replica shards cover that. Avoid NAS as performance will drop drastically
  • Networking: keep clustering within same data centre as shard rebalancing requires fast networking
  • VMs: don’t use for data nodes in production

 

Running on AWS

Data: i3.2xlarge

Master:

Client:

See also https://www.elastic.co/guide/en/elasticsearch/plugins/master/cloud-aws-best-practices.html

and https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing