Elasticsearch: degraded instance

AWS will warn you when you have an instance running on degraded hardware. You will get a scheduled retirement date and, on this date, your instance will be switched off.

However, your hardware is likely already compromised. If you’re running Elasticsearch on this hardware you need to decommission the node and create another node.

The steps to do this are:

0. get health

curl <host>:9200/_cluster/health?pretty

From outside box:

1. get details of shards / indices

curl <host>:9200/_cluster/health?level=shards | jq . > cluster-shards-health.json

curl <host>:9200/_cat/indices?v > node-indices.txt

2. disable shard allocation

curl -XPUT 0:9200/_cluster/settings -H’Content-Type: application/json’ -d ‘{ “transient” : { “cluster.routing.allocation.enable” : “none” }}’

3. stop service

sudo service elasticsearch stop

4. terminate the instance via console

5. create a new instance – e.g. with Terraform

6. re-enable shard allocation

curl -XPUT 0:9200/_cluster/settings -H'Content-Type: application/json' -d '{ "transient" : { "cluster.routing.allocation.enable" : "all" }}'

7. validate cluster health goes back to green

curl 0:9200/_cluster/health?pretty


Leave a Reply

Your email address will not be published. Required fields are marked *