Kubectl: create pod

Let’s try creating a pod.

  1. create a pod.yml with your favourite image:
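The manifest itself isn't shown here, so this is a minimal sketch of what pod.yml could look like, written out with a shell heredoc. The pod name hello-pod matches the output further down; the container name hello-ctr and the nginx:latest image are placeholders I've assumed, so swap in your own.

```shell
# Write a minimal pod manifest to pod.yml in the current directory.
cat > pod.yml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
spec:
  containers:
    - name: hello-ctr        # placeholder container name
      image: nginx:latest    # placeholder: use your favourite image
EOF
```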

Then:

kubectl create -f pod.yml

returns

pod/hello-pod created

but we need to check what’s going on behind the scenes with:

kubectl get pods
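With more than a handful of pods it can help to filter that listing down to the ones that aren't healthy. A small sketch, assuming the default column layout (NAME READY STATUS RESTARTS AGE); the function name problem_pods is made up for this example.

```shell
# Print NAME and STATUS for every pod whose STATUS is not "Running".
# Reads the tabular output of `kubectl get pods` on stdin and skips
# the header row.
problem_pods() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}
```

Usage: kubectl get pods | problem_pods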

“kubectl get pods” warnings

kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-pod 0/1 ContainerCreating 0 1m

kubectl describe pods

This:

Reason: ContainerCreating

is because it’s still creating the container.

kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-pod 0/1 ImagePullBackOff 0 10m

Again:

kubectl describe pods

Reason: ImagePullBackOff

means Kubernetes couldn’t pull the image and is backing off before it retries.

Check whether the image exists locally with:

docker inspect --type=image <name of image>:latest

I was getting:

Error: No such image:

So, let’s try another image, i.e. in pod.yml let’s use:

image: hello-world:latest

Now create hits:

 

Let’s try deleting this pod with:

https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/

 

Let’s try --all as mentioned:

Solution is:

https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/
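The exact commands aren't reproduced here, so this is a sketch of the three delete variants as I understand them: a normal delete, --all, and the forced delete described on the linked page. The wrapper function names are made up; hello-pod is the pod created earlier.

```shell
# Delete one pod by name. Pass "force" as the second argument to skip
# the graceful-termination wait.
delete_pod() {
  pod_name="$1"
  if [ "${2:-}" = "force" ]; then
    kubectl delete pod "$pod_name" --grace-period=0 --force
  else
    kubectl delete pod "$pod_name"
  fi
}

# Delete every pod in the current namespace.
delete_all_pods() {
  kubectl delete pods --all
}
```

Usage: delete_pod hello-pod, or delete_pod hello-pod force if the pod is stuck terminating.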

 

Let’s try create again:

See “What is a CrashLoopBackOff? How to alert, debug / troubleshoot, and fix Kubernetes CrashLoopBackOff events.”

 

 

Other errors

Unable to connect to the server

Unable to connect to the server: dial tcp 192.168.64.5:8443: i/o timeout

  • check minikube is started (minikube status; if it’s stopped, run minikube start)

kubectl describe pods

 

 

 

curl: silent, using variables in bash

-s => silent, so no progress bar (and no error messages)

-S => show errors even when -s is used

https://stackoverflow.com/questions/7373752/how-do-i-get-curl-to-not-show-the-progress-bar

 

Using a variable in a curl call within Bash: wrap the data in double quotes so the variable expands, and backslash-escape the quotes inside it. E.g.

https://stackoverflow.com/questions/40852784/how-can-i-use-a-variable-in-curl-call-within-bash-script
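A small sketch of the quoting. The variable names, the value and the URL are all hypothetical; the point is only how the JSON body is built.

```shell
# Hypothetical values for illustration.
user_name="alice"
api_url="http://localhost:8080/users"

# Single quotes would send the literal text ${user_name}. Double quotes
# let Bash expand it, so the inner JSON quotes are backslash-escaped.
payload="{\"name\": \"${user_name}\"}"

# -sS: no progress bar, but still print errors.
post_user() {
  curl -sS -X POST -H 'Content-Type: application/json' -d "$payload" "$api_url"
}
```

Here payload ends up as {"name": "alice"}; nothing is sent until post_user is called.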

 

 

Elasticsearch: ElasticHQ

What I’m hoping for is that it will make tasks simpler:

1. to see unassigned shards I currently ssh into the node and run:

curl -X GET "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | grep UNASS

The output isn’t easy to understand. E.g.

index.1 0 p UNASSIGNED NODE_LEFT
index.2 8 p UNASSIGNED INDEX_CREATED
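Until a GUI does this for me, labelling the columns makes the output easier to read. A sketch, assuming the five columns requested in the curl call above, in that order; the function name format_unassigned is made up.

```shell
# Keep only UNASSIGNED rows and label each column.
# Input: rows of "index shard prirep state unassigned.reason" on stdin.
format_unassigned() {
  awk '$4 == "UNASSIGNED" {
    kind = ($3 == "p") ? "primary" : "replica"
    printf "index=%s shard=%s type=%s reason=%s\n", $1, $2, kind, $5
  }'
}
```

Usage: curl -s "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | format_unassigned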

 

ElasticHQ does mention that it will:

  • Monitor Nodes, Indices, Shards, and general cluster metrics.

However, if it’s via the Diagnostics pane then that takes ages – an indefinite amount of time. I’m currently at around 15 minutes and still seeing json scroll past in my console without seeing anything in the GUI.

 

2. delete an index

3. stop shard allocation

 

 

More info:

Elasticsearch: administration

See also ElasticHQ and Elasticsearch Concepts

Colours

Yellow => some replica shards are unassigned (e.g. nodes are down)

Red => at least one primary shard is unassigned, so data is missing

Upgrades

Do a rolling restart

  1. stop a Master node
  2. upgrade Master (don’t overwrite config)
  3. restart Master
  4. repeat for Data nodes
    1. note: when a data node is taken offline the shards will reshuffle and the cluster takes a big performance hit
    2. to avoid this, set cluster.routing.allocation.enable to none to fully disable shard allocation on the cluster. Useful if you know you’re going to take a node offline for just a few minutes. Use:
    3. curl -XPUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{ "transient" : { "cluster.routing.allocation.enable" : "none" }}'
    4. once the node is back, re-enable allocation by sending the same request with "all" instead of "none"

Restore indices from a snapshot

POST /_snapshot/my_backup/snapshot_1/_restore

but remember to close the index before and open it after.

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
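The close / restore / open sequence can be sketched as one function. The repository and snapshot names come from the request above; ES_URL, the function name and the "indices" body are my assumptions. I've added wait_for_completion=true so the restore finishes before the open; the final _open may be redundant on versions that reopen restored indices themselves.

```shell
ES_URL="${ES_URL:-localhost:9200}"

# Close the index, restore it from the snapshot, then open it again.
# $1 = index name.
restore_index() {
  index="$1"
  curl -sS -X POST "$ES_URL/$index/_close" &&
  curl -sS -X POST "$ES_URL/_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "{\"indices\": \"$index\"}" &&
  curl -sS -X POST "$ES_URL/$index/_open"
}
```

Usage: restore_index my_index (my_index is a placeholder).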

Delete indices

Use Curator.

curator_cli --host <host> show_indices

curator Issues

Unable to create client connection to Elasticsearch. Error: Elasticsearch version 1.7.5 incompatible with this version of Curator (5.5.4)

See https://www.elastic.co/guide/en/elasticsearch/client/curator/current/version-compatibility.html

Install Curator 3.5.0

pip install elasticsearch-curator==3.5.0

https://www.elastic.co/guide/en/elasticsearch/client/curator/3.5/installation.html

 

Annoyingly, the syntax between versions is completely different. E.g.

Curator 3: curator show indices

Curator 5: curator_cli show_indices

And for this error:

ERROR. At least one filter must be supplied.

in Curator 3, you need something like this:

curator --host 10.33.12.203 show indices --regex '.'

Configuration

gateway.recover_after_nodes: 8

E.g. 8, to avoid thrashing after initial cluster restart.

See also:

gateway.recover_after_time: 5m

gateway.expected_nodes: 2

discovery.zen.ping.multicast.enabled: false
discovery.zen.minimum_master_nodes: 2

Data nodes:

node.master: false
node.data: true

http.enabled: false

Master nodes:

node.master: true
node.data: false

JVM

Don’t change garbage collection or thread pools.

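Do change heap size. Use half of available memory, but stay below 32GB, e.g. 30GB.

  • around 30GB is enough
  • Lucene needs the rest (for the filesystem cache)
  • Heap size of 32GB or more is inefficient (the JVM can no longer use compressed object pointers)

E.g. ES_HEAP_SIZE=30g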

/usr/bin/java -Xms30g -Xmx30g
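The rule of thumb is simple enough to put in a helper. A sketch only: the function name is made up, and the cap of 30 is taken from the -Xms30g example rather than from any exact JVM threshold.

```shell
# Suggest a heap size in GB: half of total RAM, capped at 30.
# $1 = total RAM in GB (whole number).
suggest_heap_gb() {
  half=$(( $1 / 2 ))
  if [ "$half" -gt 30 ]; then
    half=30
  fi
  echo "$half"
}
```

Usage: export ES_HEAP_SIZE="$(suggest_heap_gb 64)g"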

Swap

Can harm performance. Disable it with sudo swapoff -a and make that permanent by removing the swap entry from /etc/fstab.

Alternatively, set bootstrap.mlockall: true in elasticsearch.yml to lock memory and prevent it being swapped.

File Descriptors / MMap

File Descriptors: 64,000

MMap: unlimited

See also https://docs.oracle.com/cd/E23389_01/doc.11116/e21036/perf002.htm

and sysctl vm.max_map_count (see “Maximum number of mmap()’ed ranges and how to set it on Linux?”)
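A quick way to check the file descriptor target on a node. A sketch: the function name is made up, and 64000 is the figure from the list above.

```shell
# Succeed if the given open-file limit meets the 64,000 target.
# $1 = value printed by `ulimit -n` (a number or "unlimited").
fd_limit_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge 64000 ]
}
```

Usage: fd_limit_ok "$(ulimit -n)" || echo "raise the open-file limit". Run it as the user that runs Elasticsearch, since limits are per user.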

 

 

Elasticsearch: another snakes nest

I ran into a problem with UNASSIGNED shards in an Elasticsearch cluster after experiencing a degraded EC2 instance.

I followed the usual steps (see Elasticsearch: degraded instance) and ran into a heap of problems.

I installed curator_cli using

pip install elasticsearch-curator

but found that, despite reams of pages using the command delete_indices, curator kept saying No such command.

https://stackoverflow.com/questions/52794903/elasticsearch-curator-cli-error-error-no-such-command-delete-indices

 

Someone else had experienced this:

https://discuss.elastic.co/t/not-able-to-use-delete-indices-for-curator-cli/151490

The solution seemed to be to downgrade some library called Click.

https://github.com/elastic/curator/issues/1279

 

So I downgraded Click with:

pip install click==6.7 elasticsearch-curator==5.5.4

Now I’m running into

elasticsearch.exceptions.ElasticsearchException: Unable to create client connection to Elasticsearch. Error: Elasticsearch version 1.7.5 incompatible with this version of Curator (5.5.4)

It seems there’s a compatibility issue.

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/version-compatibility.html

 

Elasticsearch: degraded instance

AWS will warn you when you have an instance running on degraded hardware. You will get a scheduled retirement date and, on this date, your instance will be switched off.

However, your hardware is likely already compromised. If you’re running Elasticsearch on this hardware you need to decommission the node and create another node.

The steps to do this are:

1. get current status

From outside box:

2. disable shard allocation

3. stop service

sudo service elasticsearch stop

4. terminate the instance via console

5. create a new instance – e.g. with Terraform

6. re-enable shard allocation

7. validate cluster health goes back to green
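The Elasticsearch calls behind steps 1, 2, 6 and 7 can be sketched as two helpers. ES_URL and the function names are my assumptions; the setting is the same cluster.routing.allocation.enable used for rolling restarts.

```shell
ES_URL="${ES_URL:-localhost:9200}"

# Steps 1 and 7: print cluster health (status should end up "green").
cluster_health() {
  curl -sS "$ES_URL/_cluster/health?pretty"
}

# Steps 2 and 6: pass "none" to disable shard allocation,
# "all" to re-enable it.
set_allocation() {
  curl -sS -X PUT "$ES_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "{\"transient\": {\"cluster.routing.allocation.enable\": \"$1\"}}"
}
```

Usage: cluster_health, set_allocation none, replace the node (steps 3 to 5), set_allocation all, then cluster_health again until it reports green.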