AWS EKS: An error occurred (ResourceNotFoundException) when calling the DescribeCluster operation: No cluster found for name

Following along with https://github.com/terraform-providers/terraform-provider-aws/tree/master/examples/eks-getting-started

I deployed and then ran:

aws eks update-kubeconfig --name terraform-eks-demo

to get:

An error occurred (ResourceNotFoundException) when calling the DescribeCluster operation: No cluster found for name: terraform-eks-demo.

I can see the cluster, so why is this happening?

 

Let’s try listing the clusters.

aws eks list-clusters

{
    "clusters": []
}

Docs say:

Lists the Amazon EKS clusters in your AWS account in the specified Region.

so let’s specify the Region:

aws eks list-clusters --region 'us-west-2'
{
    "clusters": [
        "terraform-eks-demo"
    ]
}

so perhaps it's our default region that's the issue. However, our `~/.aws/config` says:

[default]
region = us-west-2

but our `~/.aws/credentials` says:

[default]
aws_access_key_id = <key id>
aws_secret_access_key = <secret access key>
region=eu-west-1

Odd that there was a region in the credentials file; it's usually only seen in the config file.

Deleting it fixed the issue, so settings in the credentials file must take precedence over the config file.
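A quick way to see which region (and credentials) the CLI is actually picking up, and from where:

aws configure list          # shows profile, keys and region, plus where each value came from
aws configure get region    # just the effective region for the default profile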

https://docs.aws.amazon.com/cli/latest/userguide/cli-config-files.html

https://docs.aws.amazon.com/cli/latest/reference/eks/list-clusters.html

Terraform: EKS cluster – aws_eks_cluster.demo: error creating EKS Cluster: InvalidParameterException: Error in role params

The code I used:

https://github.com/terraform-providers/terraform-provider-aws/tree/master/examples/eks-getting-started

The first couple of applies were fine. Then, on the third terraform apply, I got:

aws_security_group_rule.demo-node-ingress-self: Creation complete after 3s (ID: sgrule-3180869992)

Error: Error applying plan:

1 error(s) occurred:

* aws_eks_cluster.demo: 1 error(s) occurred:

* aws_eks_cluster.demo: error creating EKS Cluster (my-cluster): InvalidParameterException: Error in role params
status code: 400, request id: d063ca1b-ecb0-11e8-acff-5347eb3dd87f

No idea what the issue was, but deleting the local .terraform directory (and re-running terraform init) fixed the problem.
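For the record, the fix (after which terraform init re-downloads the providers and modules):

rm -rf .terraform
terraform init
terraform apply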


AWS: creating an EKS cluster

Top Tips

Stuff, perhaps not immediately relevant, but you’ll keep coming back to:

List contexts: `kubectx`

Switch contexts: `kubectx <your context>`

Namespaces: `kubectl get pods -o yaml -n kube-system`

(e.g. if you run `kubectl get pods` and see nothing, it may be because you're in the wrong namespace – i.e. there are no pods in that namespace)
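To see what namespaces exist, and to change your current context's default namespace so you don't have to keep typing -n (a sketch, using kube-system as the example):

kubectl get namespaces
# point the current context at kube-system by default
kubectl config set-context $(kubectl config current-context) --namespace=kube-system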


Notes and Guides:

Note: at the time of writing, EKS is only available in:

  • US West (Oregon) (us-west-2)
  • US East (N. Virginia) (us-east-1)
  • EU (Ireland) (eu-west-1)

Terraform guide: https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html

(The Terraform code provided is here: https://github.com/terraform-providers/terraform-provider-aws/tree/master/examples/eks-getting-started )

and the AWS EKS guide: https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html

 

Terraform notes:

  • The TF code creates two m4.large instances based on the latest EKS Amazon Linux 2 AMI, to serve as operator-managed Kubernetes worker nodes for running service deployments
  • Full code: https://github.com/terraform-providers/terraform-provider-aws/tree/master/examples/eks-getting-started

 

AWS EKS notes

You’ll need:

  • aws-iam-authenticator

Don’t use the instructions given on https://github.com/kubernetes-sigs/aws-iam-authenticator unless you want to waste half an hour of your time figuring out why it doesn’t work. I got this error: https://stackoverflow.com/questions/53344191/running-go-gives-me-go-clang-error-no-input-files

Use the instructions here: https://docs.aws.amazon.com/eks/latest/userguide/configure-kubectl.html

i.e. curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-07-26/bin/darwin/amd64/aws-iam-authenticator
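After the download you'll also need to make the binary executable and put it on your PATH, per the same docs:

chmod +x ./aws-iam-authenticator
mkdir -p $HOME/bin && cp ./aws-iam-authenticator $HOME/bin/ && export PATH=$HOME/bin:$PATH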

  • helm
  • kubectl

 

Name of cluster: check in the AWS console or use:

aws eks list-clusters

 

To use kubectl:

aws eks update-kubeconfig --name <name of cluster>

This will add the config to your ~/.kube/config.

Checking:

1. You can check this is in your config with:

  • kubectl config view

See also Kubernetes: kubectl

 

Note: aws cli versions <= 1.15.53 do not have eks update-kubeconfig. Upgrade the AWS CLI with: `pip install awscli --upgrade --user`

https://docs.aws.amazon.com/cli/latest/userguide/installing.html

Typical problems when upgrading AWS CLI:

aws --version
aws-cli/1.11.10 Python/2.7.10 Darwin/17.7.0 botocore/1.4.67

pip install awscli --upgrade --user
Collecting awscli
  Downloading https://files.pythonhosted.org/packages/a6/da/c99b10bfc509cbbea520886d2e8fe0e738e3ce22e2f528381f3bb2229433/awscli-1.16.57-py2.py3-none-any.whl (1.4MB)
...
Successfully installed awscli-1.16.57 botocore-1.12.47

aws --version
aws-cli/1.11.10 Python/2.7.10 Darwin/17.7.0 botocore/1.4.67

You’ve probably got a PATH problem.

Check you haven't got an older version at /usr/local/bin.
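A quick sanity check (bash):

which -a aws    # every aws binary on your PATH, in order – the first one wins
hash -r         # make bash forget its cached path to aws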

 

2. And that you can see pods in your cluster with:

kubectl get all -n kube-system

E.g. I got this back:

NAME                          READY   STATUS    RESTARTS   AGE
pod/kube-dns-fcd468cb-8fhg2   0/3     Pending   0          41m

NAME               TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
service/kube-dns   ClusterIP   172.20.0.10   <none>        53/UDP,53/TCP   41m

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/aws-node     0         0         0       0            0           <none>          41m
daemonset.apps/kube-proxy   0         0         0       0            0           <none>          41m

NAME                       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kube-dns   1         1         1            0           41m

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/kube-dns-fcd468cb   1         1         0       41m

 

Debugging:

Some more information on debugging Pods

kubectl get events --all-namespaces

shows

kube-system 1m 1h 245 kube-dns-fcd468cb-8fhg2.156899dbda62d287 Pod Warning FailedScheduling default-scheduler no nodes available to schedule pods

and

kubectl get nodes
No resources found.

so ssh into one of the nodes and run journalctl

You’ll need to add your ssh key to the node and get the public IP address. Then:

ssh -i ~/path/to/key ec2-user@public.ip.address
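Once you're on the node, the kubelet logs are the place to start (assuming the standard EKS AMI, where the kubelet runs as a systemd unit):

journalctl -u kubelet -f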

 

StackOverflow post: https://stackoverflow.com/questions/53381739/kube-system-pod-warning-failedscheduling-default-scheduler-no-nodes-available-t

 

The trick to solving this is that the ConfigMap output generated by Terraform needs to be applied to the cluster.

i.e. copy `config_map_aws_auth` which, for me, looked like:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::<owner id>:role/terraform-eks-demo-node
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes

into a file, config_map_aws_auth.tf.output, and apply it as is:

kubectl apply -f config_map_aws_auth.tf.output

Leave {{EC2PrivateDNSName}} as is – it's a template value that gets substituted server-side by the authenticator when a node joins, not something you fill in yourself.
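Once the ConfigMap has been applied, you can watch the worker nodes register with the cluster:

kubectl get nodes --watch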

More on this issue in #office-hours – https://kubernetes.slack.com/archives/C6RFQ3T5H/p1542812144088800

 

Issues

Warning FailedScheduling – default-scheduler no nodes available to schedule pods

error creating EKS Cluster: InvalidParameterException: Error in role params

AWS EKS: An error occurred (ResourceNotFoundException) when calling the DescribeCluster operation: No cluster found for name


Screencast

https://asciinema.org/a/zYFCtGrXSqJaLHybKwq6V9rFF

 

Configure kubectl for Amazon EKS

To use the stock kubectl client for EKS you need to:

  • install the AWS IAM Authenticator for Kubernetes

https://docs.aws.amazon.com/eks/latest/userguide/configure-kubectl.html

  • modify your kubectl configuration file to use it for authentication

 

Other things that may be useful are:

  • helm – if you’re using Helm charts to manage your cluster in EKS
  • kubectl and awscli – goes without saying

E.g. check your aws cli version with:

`aws --version` and upgrade with `pip install awscli --upgrade --user`

  • assume-role – if you’re using IAM roles

https://github.com/remind101/assume-role

  • nice to have is fzf: https://github.com/junegunn/fzf#installation

 

To update your kubeconfig use:

aws eks update-kubeconfig --name CLUSTER_NAME-eks --region REGION
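For example, for the demo cluster above (--alias is optional – it just names the resulting kubectl context):

aws eks update-kubeconfig --name terraform-eks-demo --region us-west-2 --alias demo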

You’ll need an up-to-date version of the awscli. E.g. 1.15.53 won’t cut it.

aws --version
aws-cli/1.15.53 Python/2.7.10 Darwin/17.7.0 botocore/1.10.52
aws eks update-kubeconfig --name <cluster-name> --region us-east-1
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

aws help
aws <command> help
aws <command> <subcommand> help
aws: error: argument operation: Invalid choice, valid choices are:

 

To assume role use:

eval $(assume-role <role-name>)

Issues:

If you get:

error: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors

it's probably because you don't have a matching profile in your ~/.aws/config

Your profile in ~/.aws/config should look like:

[profile iam-manager]
region=us-east-1
output=json

# IAM roles
[profile role-name]
region = us-east-1
role_arn = arn:aws:iam::<account number with role you want to assume>:role/NameOfAssumeRole
source_profile = iam-manager

 

You should be able to run:

assume-role <role-name>

and see the assume role output.
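To double-check which identity you've actually assumed:

aws sts get-caller-identity    # should show the assumed role's ARN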


Testing:

To test you can access your EKS cluster, use:

kubectl get all -n kube-system

Or for non-system (i.e. the default namespace):

kubectl get all


AWS Fargate

Fargate

AWS Fargate is a compute engine for Amazon ECS and EKS that allows you to run containers without having to manage servers or clusters. With AWS Fargate, you no longer have to provision, configure, and scale clusters of virtual machines to run containers.

https://aws.amazon.com/fargate/

Fargate is not currently (August 2018) available in the UK.

How does this differ from ECS (Elastic Container Service) and EKS (Elastic Container Service for Kubernetes) though?

ECS

Amazon Elastic Container Service (Amazon ECS) is a highly scalable, high-performance container orchestration service that supports Docker containers and allows you to easily run and scale containerized applications on AWS. Amazon ECS eliminates the need for you to install and operate your own container orchestration software, manage and scale a cluster of virtual machines, or schedule containers on those virtual machines.

https://aws.amazon.com/ecs/

ECS was first to market as a commercial container service among the big players and is now suffering as it's rather outdated. It's basically Docker as a Service, offering a Docker registry (aka Amazon Elastic Container Registry, or ECR) and support in its CLI for Docker Compose.

EKS

EKS (aka Amazon Elastic Container Service for Kubernetes) is a managed Kubernetes service.

The differences? Use:

  • ECS if you like using Docker
  • EKS if you like Kubernetes
  • Fargate if you don't want to manage either Docker or Kubernetes

See also https://dzone.com/articles/ecs-vs-eks-vs-fargate-the-good-the-bad-the-ugly


Kubernetes Dashboard

Minikube

If you’re running minikube you can open up a Kubernetes dashboard with minikube dashboard.

(ignore all the stuff here: https://github.com/kubernetes/dashboard )

There’s a lot to see here so let’s break it down a bit.

For more info see https://github.com/kubernetes/dashboard/wiki

Note this is from Dashboard Version v1.8.1.

See also https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/

For example:

  • to see System Pods, choose kube-system from Namespaces, then Pods
  • or a summary of everything in a Namespace, using Workloads
  • Dashboard lets you create and deploy a containerized application as a Deployment using a wizard (see the CREATE button in the top right of a page)
  • Logs – via Pods > name of pod > Logs

 

EKS

See AWS: creating an EKS cluster then

[Screenshot: the Kubernetes Dashboard sign-in dialog, offering Kubeconfig, Token and Skip options.]

Selecting my ~/.kube/config file gave:

Not enough data to create auth info structure.

 

The other options are Token and SKIP.

SKIP gives you a dashboard with:
configmaps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list configmaps in the namespace "default"
persistentvolumeclaims is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list persistentvolumeclaims in the namespace "default"
secrets is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list secrets in the namespace "default"
services is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list services in the namespace "default"
ingresses.extensions is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list ingresses.extensions in the namespace "default"
daemonsets.apps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list daemonsets.apps in the namespace "default"
pods is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list pods in the namespace "default"
events is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list events in the namespace "default"
deployments.apps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list deployments.apps in the namespace "default"
replicasets.apps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list replicasets.apps in the namespace "default"
jobs.batch is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list jobs.batch in the namespace "default"
cronjobs.batch is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list cronjobs.batch in the namespace "default"
replicationcontrollers is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list replicationcontrollers in the namespace "default"
statefulsets.apps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list statefulsets.apps in the namespace "default"

and "There is nothing to display here."
The Skip option will make the Dashboard use the privileges of the Service Account used by the Dashboard.

Admin Privileges (less secure)

Alternatively, grant admin privileges to the Dashboard's Service Account by adding this:
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

to dashboard-admin.yaml and deploy with:

kubectl create -f dashboard-admin.yaml

 

Then serve up the dashboard with kubectl proxy and view it at:

http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/

(If you’ve done the `ClusterRoleBinding` then you can click Skip at the Dashboard dialog)
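Alternatively, to use the Token option rather than Skip, you can dig the Dashboard Service Account's token out of its secret (a sketch – the exact secret name will differ):

kubectl -n kube-system get secret | grep kubernetes-dashboard-token
kubectl -n kube-system describe secret <secret name from above>    # paste the token field into the dialog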

 

Note, you can view and delete this ClusterRoleBinding with:

kubectl get clusterrolebindings

kubectl delete clusterrolebinding kubernetes-dashboard

 

and delete the dashboard with:

kubectl -n kube-system delete deployment kubernetes-dashboard

https://stackoverflow.com/questions/46173307/how-do-i-remove-the-kubernetes-dashboard-pod-from-my-deployment-on-google-cloud

 

and