docker: automatically restart a container

Say you need to restart a VM or restart Docker. How do you restart a container?

E.g. you have:

docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b431e9c92bae d5e863a4b712 "sleep 1d" 30 minutes ago Up 30 minutes frosty_shaw

then restarting Docker (e.g. on the Mac you can click the Docker icon in the toolbar and select Docker > Restart) you get:

docker ps
Error response from daemon: Bad response from Docker engine

whilst Docker is restarting and:

docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

after the restart (i.e. no containers).

 

You can use --restart always

docker container run --restart always -d <image id> sleep 1d

to have the container start again after Docker restarts.

 

This https://docs.docker.com/config/containers/start-containers-automatically/

says

Restart policies are different from the --live-restore flag of the dockerd command

See also https://docs.docker.com/config/containers/live-restore/#enable-live-restore

 

More on Restart Policies

There are 4 restart policies: no, on-failure, unless-stopped, always.

no is the default. i.e. don’t restart if a container stops.

The others are:

always

We’ve seen this before. E.g. let’s say we have a script, crash.sh, that runs for 30 seconds and then fails:

#!/bin/sh
sleep 30
exit 1

Note: exit 1 indicates an error (exit 0 would indicate success).

We use it as follows:

FROM alpine
ADD crash.sh /crash.sh
CMD /bin/sh /crash.sh

 

We can build and run with:

docker build -t testing_restarts .
docker container run -d testing_restarts

The container will exit after 30 seconds and stay stopped.

To have it restarted automatically, use the always restart policy:

docker container run --restart always -d testing_restarts

Now, when it crashes, under docker ps you’ll see:

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
18bac13c1f42 testing_restarts "/bin/sh -c '/bin/sh…" 31 seconds ago Restarting (1) Less than a second ago competent_shtern

 

on-failure

Here we can restart a container if it exits with a non-zero exit code. We can also specify a number of retries. E.g.

--restart on-failure:3

docker container run --restart on-failure:3 -d testing_restarts

Notes

  • the container will not restart if you do a docker stop <container id>
  • the container WILL restart if you restart the Docker daemon. Oddly, so did containers that had already stopped after using up their on-failure retries, although only the first time the daemon was restarted

 

unless-stopped

Behaves the same as always, except that a container which has been stopped is not restarted when the Docker daemon restarts.

Note: if you manually stop a container its restart policy is ignored until the Docker daemon restarts.

https://docs.docker.com/config/containers/start-containers-automatically/#restart-policy-details
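
To check which policy a container ended up with, or to change it without recreating the container, something like the following should work. It's only a sketch: the function names are made up for this note, and frosty_shaw is just the container name from the docker ps output earlier.

```shell
# Hypothetical helper names; the docker subcommands and flags are standard.

# Print the restart policy name (no, always, on-failure, unless-stopped).
show_restart_policy() {
  docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$1"
}

# Print how many times Docker has restarted the container so far.
show_restart_count() {
  docker inspect --format '{{.RestartCount}}' "$1"
}

# Change the policy on an existing container: $1 = policy, $2 = container.
set_restart_policy() {
  docker update --restart "$1" "$2"
}
```

E.g. set_restart_policy unless-stopped frosty_shaw followed by show_restart_policy frosty_shaw.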

 

See also: Ensuring Containers Are Always Running with Docker’s Restart Policy

 

Live Restore

Lets you keep containers alive when the daemon becomes unavailable.

https://docs.docker.com/config/containers/live-restore/#enable-live-restore

However, doing this on my installation of Docker gave:

time="2019-01-02T17:09:32.547843592Z" level=fatal msg="Error starting cluster component: --live-restore daemon configuration is incompatible with swarm mode"

because I was running a swarm service.

More: https://docs.docker.com/config/containers/live-restore/#live-restore-and-swarm-mode

I had to restore to Factory Defaults which means signing in to cloud.docker.com again.
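
With hindsight, checking for swarm mode first would have saved the factory reset. A minimal sketch, assuming the function names (which are invented here) and that docker info reports the swarm state as "inactive" when the node is not in a swarm:

```shell
# Hypothetical helper: succeeds only when this node is not part of a swarm,
# i.e. when enabling live-restore should not hit the error above.
live_restore_allowed() {
  state=$(docker info --format '{{.Swarm.LocalNodeState}}') || return 1
  [ "$state" = "inactive" ]
}

# Print the daemon configuration fragment that turns live restore on.
live_restore_config() {
  printf '{\n  "live-restore": true\n}\n'
}
```

Usage: live_restore_allowed && live_restore_config, then merge the printed fragment into the daemon configuration (/etc/docker/daemon.json by default on Linux, or the daemon settings in Docker Desktop) and restart Docker.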

 

git: error: Your local changes to the following files would be overwritten by merge

You’ve got some local changes in your git repo. What to do?

1. you want to keep your changes

a. and track them

git add <local-changes>; git commit -m "<your message>"

b. but don’t want to track them

Note: if you’re doing a git pull then:

git update-index --assume-unchanged <file>

will still result in error: Your local changes to the following files would be overwritten by merge

git: what to do with untracked files

I tried --skip-worktree which didn’t work so just moved my .gitignore file (which was causing the problem) out of the way.

https://stackoverflow.com/questions/36996875/git-pull-error-for-git-update-index-assume-unchanged-files

More: https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree/23806990#23806990

2. you don’t want your changes

git checkout -- <local-changes>
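
For case 1b (keep the changes but don’t commit them), an approach not tried above is to stash the changes around the pull. A rough sketch, with a made-up function name:

```shell
# Hypothetical helper: pull while keeping uncommitted local changes.
pull_keeping_changes() {
  git stash &&      # shelve local modifications to tracked files
  git pull &&       # the merge can now update those files
  git stash pop     # re-apply the shelved changes on top
}
```

Caveats: if the pulled commits touch the same lines as your changes, git stash pop will report a conflict that you need to resolve by hand. And if there was nothing to stash, the final pop acts on whatever older stash happens to exist, so only use this when you know you have local changes.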

Debugging SSH

Debugging ssh is monotonous, monotonous, monotonous, monotonous shit ‘cos you get reams of messages which don’t tell you why you can’t connect.

E.g.

WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED

ssh snowcrash@1.2.3.4
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:fingerprint.
Please contact your system administrator.
Add correct host key in /Users/snowcrash/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /Users/snowcrash/.ssh/known_hosts:293
ECDSA host key for 1.2.3.4 has changed and you have requested strict checking.
Host key verification failed.

Actual error message should be:

YOU'VE PROBABLY REPLACED YOUR HOST AND YOUR EXISTING KEY IN ~/.ssh/known_hosts DOES NOT MATCH

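Fix: delete the offending key on line 293 of ~/.ssh/known_hosts (the line number is given in the "Offending ECDSA key" message).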
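
Deleting that line can be done without opening an editor. A small sketch (the function name is invented; it keeps a .bak copy in case the wrong line goes):

```shell
# Hypothetical helper: delete line $1 from the known_hosts file $2
# (defaults to ~/.ssh/known_hosts), leaving a backup with a .bak suffix.
delete_known_hosts_line() {
  sed -i.bak "${1}d" "${2:-$HOME/.ssh/known_hosts}"
}
```

E.g. delete_known_hosts_line 293. Alternatively, ssh-keygen -R 1.2.3.4 removes every key stored for that host, so you don’t need the line number at all.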

Permission denied (publickey).

This one is guaranteed to waste at least several months of your life.

Check:

  • Your public key is in the ~/.ssh/authorized_keys file of the user you’re trying to login with on the destination server
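  • Your private key matches that public key (use `ssh-keygen -y -f ~/.ssh/id_rsa` to print the public key for the private key, in the same one-line format used in authorized_keys)
  • Your private key has the correct permissions (readable only by you, e.g. 600; ssh ignores private keys that other users can read)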
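
The last two checks can be scripted. This is a sketch rather than a definitive tool: both function names are invented, and key_matches assumes the second file holds a single public key (e.g. id_rsa.pub), not a whole authorized_keys file.

```shell
# Hypothetical helper: restrict a private key (and its directory) to the owner.
tighten_key_perms() {
  chmod 700 "$(dirname "$1")" &&
  chmod 600 "$1"
}

# Hypothetical helper: succeed if private key $1 produces the public key
# stored in $2. Only the key type and the base64 blob are compared,
# because the trailing comment may differ.
key_matches() {
  [ "$(ssh-keygen -y -f "$1" | cut -d' ' -f1,2)" = "$(cut -d' ' -f1,2 "$2")" ]
}
```

E.g. tighten_key_perms ~/.ssh/id_rsa, then key_matches ~/.ssh/id_rsa ~/.ssh/id_rsa.pub && echo match.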

Use ssh -v to debug. Ignore the 20 odd lines of useless information that get output and focus on:

debug1: Offering public key: RSA SHA256:hash /Users/snowcrash/.ssh/id_rsa
debug1: Authentications that can continue: publickey
debug1: Offering public key: RSA SHA256:hash /Users/snowcrash/.ssh/another_key
debug1: Authentications that can continue: publickey
debug1: No more authentication methods to try.
snowcrash@1.2.3.4: Permission denied (publickey).

Output to look out for:

  1. debug1: identity file /home/snowcrash/.ssh/id_rsa type -1
    The -1 => it doesn’t exist. If it’s a 0 then you’re good.

 

Here’s a few pretty useless StackOverflow articles:

  1. https://superuser.com/questions/1137438/ssh-key-authentication-fails/1145465

Fails at the step "Watch the messages file" (tail -l /var/log/messages):

tail: cannot open ‘/var/log/messages’ for reading: No such file or directory

2.

ECR Console Version 2

ECR (Amazon Elastic Container Registry) now has a dedicated management console.

https://aws.amazon.com/about-aws/whats-new/2018/12/amazon-ecr-console-version-2

Simple guide to creating a repo and pushing a docker image to it:

1. https://eu-west-2.console.aws.amazon.com/ecr/home?region=eu-west-2# and click Create a repository > Get Started

2. Enter a repository name (usually namespace/repo-name). e.g. snowcrash/wordpress

3. You’ll get a panel showing the URI – e.g. 026972849384.dkr.ecr.eu-west-2.amazonaws.com/snowcrash/wordpress

4. You’ll need to push a docker image to this repo. Assuming you’ve got a docker image you’re happy with locally, log in to the registry by running `$(aws ecr get-login --no-include-email --region eu-west-2)`.

You get this aws ecr get-login command from your ECR console by clicking View push commands.

Note: --no-include-email is required with more recent versions of docker, which no longer accept the -e flag that get-login otherwise adds. Without it you get an error like:

== -e none https://026972849384.dkr.ecr.us-east-1.amazonaws.com
unknown shorthand flag: 'e' in -e
See 'docker login --help'.

If it succeeds, you should get:

WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Login Succeeded

5.  tag it with

docker tag <image id> <remote tag>

6. and push with

docker push <remote tag>
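
Steps 4 to 6 can be rolled into one helper. A sketch under a few assumptions: the function names are invented, the account ID, region and repo are the examples from above, and the login uses aws ecr get-login-password (available in newer AWS CLI releases) piped to --password-stdin rather than the get-login command shown in step 4.

```shell
# Build the registry hostname: $1 = AWS account ID, $2 = region.
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Build the full remote tag: $1 = account ID, $2 = region, $3 = repo,
# $4 = tag (optional, defaults to latest).
ecr_remote_tag() {
  printf '%s/%s:%s\n' "$(ecr_registry "$1" "$2")" "$3" "${4:-latest}"
}

# Log in, tag and push.
# $1 = image ID, $2 = account ID, $3 = region, $4 = repo, $5 = tag (optional)
ecr_push() {
  remote=$(ecr_remote_tag "$2" "$3" "$4" "$5")
  aws ecr get-login-password --region "$3" |
    docker login --username AWS --password-stdin "$(ecr_registry "$2" "$3")" &&
  docker tag "$1" "$remote" &&
  docker push "$remote"
}
```

E.g. ecr_push d5e863a4b712 026972849384 eu-west-2 snowcrash/wordpress. Passing the password on stdin also avoids the "Using --password via the CLI is insecure" warning.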

 

Terraform 0.12 HCL and interpolation syntax

HCL2

HCL2 combines HCL (HashiCorp Configuration Language) and HIL (HashiCorp Interpolation Language). So we now have first-class expression syntax, i.e. no more wrapping every expression in "${ ... }".

https://www.hashicorp.com/blog/announcing-terraform-0-12

i.e. v0.11

  ip_cidr_range = "${cidrsubnet(var.base_network_cidr, 4, count.index)}"

v0.12

  ip_cidr_range = cidrsubnet(var.base_network_cidr, 4, count.index)

 

Note: the wording used here by Hashicorp is confusing:

0.11 wrapped string interpolations in ${}. See https://www.terraform.io/docs/configuration-0-11/interpolation.html

However, 0.12 now extends this to loops and conditionals: https://www.hashicorp.com/blog/terraform-0-12-template-syntax

Improved Error messages

Error messages that actually mean something!

Remote Plan and Apply

 

Kubernetes Up & Running: Chapter 2

Page 16

1. make sure you’ve cloned the git repo mentioned earlier in the chapter

2. once in the repo, run:

make build

to build the application.

3. create this Dockerfile (not the one mentioned in the book)

FROM alpine
LABEL maintainer="e@snowcrash.eu"
COPY bin/1/amd64/kuard /kuard
ENTRYPOINT ["/kuard"]

MAINTAINER is deprecated. Use a LABEL instead: https://github.com/kubernetes-up-and-running/kuard/issues/7

However, whilst MAINTAINER can take 1 argument, LABEL takes key/value pairs. E.g.

LABEL <key>=<value>

https://docs.docker.com/engine/reference/builder/#label

 

The COPY path in the book is also incorrect, so use the one above.

Then run

docker build -t kuard-amd64:1 .

to build the Dockerfile.

Here we’ve got a repo name of kuard-amd64 and a tag of 1.

https://docs.docker.com/engine/reference/commandline/build/#tag-an-image--t

4. Check the repo using

docker images

Note: a registry is a collection of repositories, and a repository is a collection of images

https://docs.docker.com/get-started/part2/#recap-and-cheat-sheet-optional

 

Page 17

Files removed in subsequent layers are no longer accessible, but they are still present in the image and still count towards its size.

 

Image sizes:

docker images

Or a specific one:

docker image ls <repository name>

E.g. alpine is 4.41MB.

 

Let’s create a 1MB file and add / remove it:

dd if=/dev/zero of=1mb.file bs=1024 count=1024

https://www.unix.com/shell-programming-and-scripting/26389-creating-file-1mb-using-shell-command.html

then copy it in:

FROM alpine
LABEL MAINTAINER=me
COPY 1mb.file /

Now building it (and creating a repo called alpine_1mb) we can see the image has increased in size by a bit over 1MB (probably due to the overhead of an additional layer).

However, if we now remove this file in a subsequent Dockerfile – e.g. with something like:

FROM alpine_1mb
LABEL MAINTAINER=me
RUN rm /1mb.file

the image is still the same size.

The solution is to ensure you use an rm in the same RUN command as you create/use your big file: https://stackoverflow.com/questions/53998310/docker-remove-intermediate-layer

and https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#run
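
As a sketch of that fix, here the 1MB file is created and removed inside a single RUN step, so it never lands in any layer. The script only writes the Dockerfile into a throwaway directory; the image name used afterwards (alpine_same_run) is made up.

```shell
# Create a throwaway build context and write the Dockerfile into it.
ctx=$(mktemp -d)
cat > "$ctx/Dockerfile" <<'EOF'
FROM alpine
LABEL maintainer="me"
# Create and delete the big file in ONE layer.
RUN dd if=/dev/zero of=/1mb.file bs=1024 count=1024 \
 && rm /1mb.file
EOF

# Show what was written.
cat "$ctx/Dockerfile"
```

Then build it with docker build -t alpine_same_run "$ctx" and compare docker image ls alpine_same_run against plain alpine: the sizes should be about the same. docker history alpine_same_run lists the size of each layer if you want to see where the space goes.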

 

Page 19

TLDR

To run the kuard app, use:

docker run -d --name kuard --publish 8080:8080 gcr.io/kuar-demo/kuard-amd64:1

 

More:

docker tag kuard-amd64:1 gcr.io/kuar-demo/kuard-amd64:1

According to https://docs.docker.com/get-started/part2/#tag-the-image

the tag command is:

docker tag image username/repository:tag

so image is kuard-amd64:1

but what’s the username?

Is it gcr.io ?

Or gcr.io/kuar-demo?

The answer is that Docker’s docs here:

https://docs.docker.com/get-started/part2/#tag-the-image

only describe the Docker Hub case. Locally a tag is just a name, and docker tag accepts any valid one. E.g. see https://docs.docker.com/engine/reference/commandline/tag/

More generally:

docker tag image <any valid name>:tag

BUT when you push, the name decides where the image goes: an optional registry hostname (e.g. gcr.io), then a namespace and a repository name. For Docker Hub that means username/image_name.

https://stackoverflow.com/questions/41984399/denied-requested-access-to-the-resource-is-denied-docker

Shame they didn’t explain that in more detail.

 

The next line needs a caveat too.

docker push gcr.io/kuar-demo/kuard-amd64:1

Because the first part of the name (gcr.io) contains a dot, Docker treats it as a registry hostname, so this really does try to push to the registry hosted at gcr.io. Unless you have write access to kuar-demo there, the push will be refused.

To push to Docker Hub instead, drop the hostname and use your Docker Hub username as the namespace. E.g.

docker tag kuard-amd64:1 snowcrasheu/kuard-amd64:1

My first attempt, with a different namespace and an extra path level (2 slashes rather than 1), failed:

docker push snowcrash/kuar/kuard-amd64:1
The push refers to repository [docker.io/snowcrash/kuar/kuard-amd64]
7b816b232464: Preparing
73046094a9b8: Preparing
denied: requested access to the resource is denied

The reason for

denied: requested access to the resource is denied

is that (from https://stackoverflow.com/questions/41984399/denied-requested-access-to-the-resource-is-denied-docker )

You need to include the namespace for Docker Hub to associate it with your account.
The namespace is the same as your Docker Hub account name.
You need to rename the image to YOUR_DOCKERHUB_NAME/docker-whale.
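
Note the "The push refers to repository [docker.io/...]" line in the failed push: the image name itself decides which registry Docker talks to. A rough sketch of that rule as I understand it (the function name is invented):

```shell
# Hypothetical helper: print the registry an image name would be pushed to.
# Rule: the part before the first "/" counts as a registry hostname only if
# it contains "." or ":" or is exactly "localhost"; otherwise Docker Hub
# (docker.io) is used.
registry_of() {
  case "$1" in
    */*) first=${1%%/*} ;;
    *)   echo docker.io; return ;;
  esac
  case "$first" in
    *.*|*:*|localhost) echo "$first" ;;
    *)                 echo docker.io ;;
  esac
}
```

So registry_of gcr.io/kuar-demo/kuard-amd64:1 prints gcr.io, while registry_of snowcrash/kuar/kuard-amd64:1 prints docker.io, which matches the output above.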


 

To login with docker use:

docker login

or, to pass a specific username and password:

docker login -u <username> -p <password>

However, --password-stdin is better, since it keeps the password out of your shell history and process list.
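
A minimal sketch of the --password-stdin variant. The function name and the DOCKER_PASSWORD variable are just names chosen for this example:

```shell
# Hypothetical helper: log in as user $1, reading the password from the
# DOCKER_PASSWORD environment variable instead of a command-line argument.
docker_login_stdin() {
  printf '%s' "$DOCKER_PASSWORD" | docker login -u "$1" --password-stdin
}
```

E.g. docker_login_stdin snowcrasheu, with DOCKER_PASSWORD set beforehand (for instance via read -rs DOCKER_PASSWORD in bash, so it never appears on screen).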

and push with:

docker push snowcrasheu/kuard-amd64:1

which you should then be able to see on Docker Hub. E.g.

https://hub.docker.com/r/snowcrasheu/kuard-amd64/

 

Limit CPU / Memory

docker run -d --name kuard \
  --publish 8080:8080 \
  --memory 200m \
  --memory-swap 1G \
  --cpu-shares 1024 \
  gcr.io/kuar-demo/kuard-amd64:1
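
To confirm the limits were applied, you can read them back from the container’s config. A small sketch (the function name is made up):

```shell
# Hypothetical helper: print the memory limit and memory+swap limit (both in
# bytes) and the CPU shares configured for container $1.
show_limits() {
  docker inspect --format \
    'memory={{.HostConfig.Memory}} memory_swap={{.HostConfig.MemorySwap}} cpu_shares={{.HostConfig.CpuShares}}' "$1"
}
```

E.g. show_limits kuard. With the flags above, I’d expect memory to be 209715200 (200 × 1024 × 1024), memory_swap 1073741824 and cpu_shares 1024.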

 

Notes:

How to change a repository name: 

https://stackoverflow.com/questions/25211198/docker-how-to-change-repository-name-or-rename-image/25214186#25214186

 

Handy copy-and-paste code: https://github.com/rusrushal13/Kubernetes-Up-and-Running/blob/master/Chapter2.md