Install and configure Terraform to provision VMs and other infra to Azure

Note: if you don’t want to install Terraform locally, use Azure Cloud Shell instead (click the >_ icon in the Azure portal).

Set up Terraform access to Azure

1. get your subscription ID and tenant ID

az login

az account show --query "{subscriptionId:id, tenantId:tenantId}"

and set via an environment variable:

export SUBSCRIPTION_ID=abcd-abcd-etc
az account set --subscription="${SUBSCRIPTION_ID}"
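To avoid copying and pasting the ID, you can capture it straight from the CLI. This is a minimal sketch, assuming the Azure CLI is installed and you have already run az login. The function name and the AZ override variable are made up for illustration (AZ only exists so the binary can be swapped out):

```shell
# Print the active subscription ID as a bare string.
# AZ is a hypothetical override for the az binary (defaults to az).
get_subscription_id() {
  "${AZ:-az}" account show --query id --output tsv
}

# Usage, once logged in:
#   export SUBSCRIPTION_ID="$(get_subscription_id)"
#   az account set --subscription="${SUBSCRIPTION_ID}"
```

The --output tsv part strips the JSON quoting so the value can go straight into a variable.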

2. create an Azure AD service principal

(an Azure AD service principal is a credential for your application – https://docs.microsoft.com/en-us/azure/azure-stack/azure-stack-create-service-principals )

az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/${SUBSCRIPTION_ID}"

and note the appId, password and tenant values in the output.

Slot them into this shell script (tf_env_vars.sh), which sets the environment variables that Terraform reads:

#!/bin/sh
echo "Setting environment variables for Terraform"
export ARM_SUBSCRIPTION_ID=your_subscription_id
export ARM_CLIENT_ID=your_appId
export ARM_CLIENT_SECRET=your_password
export ARM_TENANT_ID=your_tenant_id

# Not needed for public, required for usgovernment, german, china
export ARM_ENVIRONMENT=public

Note: remember to apply these environment variables to your current shell. i.e. use:

. ./tf_env_vars.sh

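(Notice the leading dot? It sources the script, so the exports apply to your current shell rather than to a child shell that exits straight away.)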
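Before running Terraform it can be worth checking that the variables really did land in your current shell. A small sketch, assuming the four ARM_* names from the script above; the function name check_arm_env is hypothetical:

```shell
# Report any of the four ARM_* variables that is unset or empty.
check_arm_env() {
  missing=""
  for name in ARM_SUBSCRIPTION_ID ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_TENANT_ID; do
    # Look up the variable whose name is held in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing:$missing" >&2
    return 1
  fi
  echo "All ARM_* variables are set"
}

# Usage:
#   . ./tf_env_vars.sh
#   check_arm_env
```

If you ran the script with sh tf_env_vars.sh instead of sourcing it, this reports all four as missing.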

Create a test.tf file with:

provider "azurerm" {
    # Required by version 2.0 and later of the azurerm provider
    features {}
}
resource "azurerm_resource_group" "rg" {
    name = "testResourceGroup"
    location = "westus"
}

and run terraform init, terraform plan and terraform apply

This should create a Resource Group.
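The three commands can be chained so that a failure stops the run. A sketch only: the function name tf_apply, the plan file name tfplan and the TERRAFORM override variable are all made up for illustration.

```shell
# Run the three Terraform steps in order, stopping at the first failure.
# TERRAFORM is a hypothetical override for the terraform binary.
tf_apply() {
  tf="${TERRAFORM:-terraform}"
  "$tf" init || return 1
  # Save the plan to a file so that apply runs exactly what plan showed.
  "$tf" plan -out=tfplan || return 1
  "$tf" apply tfplan
}

# Usage, from the folder containing test.tf:
#   tf_apply
```

Note that applying a saved plan file does not stop to ask for confirmation, so read the plan output first.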

https://docs.microsoft.com/en-us/azure/virtual-machines/linux/terraform-install-configure

Azure VM Tiers

Basic tier: Introductory level

Standard tiers:

A-series: Standard level

D-series: Faster processors, high memory-to-core ratio, SSD temp disk

Dv2-series: 35% faster than D series. Same memory/disk

DS-series: Premium storage (SSDs)

G-series: Biggest VM size. Intel Xeon E5 V3 processors

GS-series: Premium storage (SSDs)

 

Key VM IaaS Questions

  • CPU
  • RAM
  • NIC
  • Temp disk performance
  • Data disk
  • Cache size
  • Max data disk IOPS/bandwidth

 

  • Fault Domains => single point of failure (e.g. all servers in same rack fail ‘cos power fails). Place resources in separate fault domains
  • Update Domains => software updates (e.g. when server OS is updated then VMs are shifted off and then shifted back)

Azure Availability Sets => a fault or update won’t take the whole workload down, because the VMs in a set are spread across fault domains and update domains. Give each workload tier its own availability set (e.g. SQL servers in one availability set, web servers in another).

Note: VMs in the same availability set should all do the same job (e.g. all web servers).
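How an availability set spreads VMs can be sketched with a little arithmetic. This is an illustration only: it assumes VMs are handed out to fault and update domains in simple round-robin order, and uses example counts of 2 fault domains and 5 update domains. The function name is hypothetical and the real placement is decided by Azure.

```shell
# Print the fault domain (FD) and update domain (UD) each VM would land in,
# assuming simple round-robin assignment.
# $1 = number of VMs, $2 = fault domain count, $3 = update domain count
place_vms() {
  i=0
  while [ "$i" -lt "$1" ]; do
    echo "vm$i FD$((i % $2)) UD$((i % $3))"
    i=$((i + 1))
  done
}

place_vms 6 2 5
```

With these numbers a single rack failure hits at most half of the VMs, and an update to one update domain touches at most two of the six.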

 

Azure Active Directory

Azure Active Directory now allows synchronisation between Azure AD and on-prem Directory Services.

Azure AD B2C

  • allows users to use 3rd party identities such as Facebook, Google or Microsoft ID (similar to AWS Federated Identity)

Azure AD Premium: supports password writeback.

 

PagerDuty: Alerts vs Incidents

Incidents

An incident represents a problem or an issue that needs to be addressed and resolved.

Incidents trigger on a service, which prompts notifications to go out to on-call responders per the service’s escalation policy.

https://support.pagerduty.com/docs/incidents

 

Note on the difference between Urgency, Severity and Priority:

  • Urgency tied to Incidents – how you should be notified
  • Severity tied to Alerts – how impacted a service is
  • Priority tied to Incidents – order of incidents

https://community.pagerduty.com/t/whats-the-difference-between-urgency-severity-and-priority/291

Priority: Configuration > Incident Priorities (Enterprise only)

 

 

Alerts

When a service sends an event to PagerDuty, an alert and corresponding incident is triggered in PagerDuty…. Multiple alerts can be aggregated into a single incident for triage.

States of Alert:

Note: notifications are not sent to users based on alerts. If all alerts for an incident are resolved then that incident becomes resolved.

https://support.pagerduty.com/docs/alerts#section-states-of-alert

https://support.pagerduty.com/docs/alerts
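The rule that an incident resolves once all of its alerts are resolved can be written out as a tiny function. This is a toy model of the rule, not PagerDuty’s API: the function name and the "open" label are made up, and real incidents have more states than this.

```shell
# Derive an incident's state from the states of its alerts:
# the incident is resolved only when every alert is resolved.
incident_state() {
  for alert_state in "$@"; do
    if [ "$alert_state" != "resolved" ]; then
      echo "open"
      return 0
    fi
  done
  echo "resolved"
}

# Usage:
#   incident_state resolved triggered resolved
```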

Azure: Infrastructure and Networking

Azure Datacentres Architecture

Azure Datacentres

Terms

AWS  -> Azure

Placement Groups -> Affinity Groups

 

Fabric Controller

Some racks have a fabric controller which:

  • provisions VMs
  • heals failed VMs
  • rehydrates VMs
  • manages health and lifecycle of VMs

Azure Stamp / Cluster

  • shipping container
  • 20 rack group
  • all hardware in stamp uses same processor generation
  • 800 to 1000 individual servers very close together

Regional Availability and High Availability

Each rack functions as a fault domain.

Availability sets keep VMs available during downtime, both unscheduled (e.g. equipment failures) and scheduled (maintenance).

You need two or more VMs in the same availability set to qualify for the Azure 99.95% SLA.

Availability set: https://docs.microsoft.com/en-us/azure/virtual-machines/windows/tutorial-availability-sets#availability-set-overview

The closest I can find in AWS is this: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-increase-availability.html

 

Kubernetes Dashboard

Minikube

If you’re running minikube you can open up a Kubernetes dashboard with minikube dashboard.

(ignore all the stuff here: https://github.com/kubernetes/dashboard )

There’s a lot to see here so let’s break it down a bit.

For more info see https://github.com/kubernetes/dashboard/wiki

Note this is from Dashboard Version v1.8.1.

See also https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/

For example:

  • to see System Pods, choose kube-system from Namespaces, then Pods
  • to see a summary of everything in a Namespace, use Workloads
  • Dashboard lets you create and deploy a containerized application as a Deployment using a wizard (see the CREATE button in the top right of a page)
  • Logs – via Pods > name of pod > Logs

 

EKS

See AWS: creating an EKS cluster, then open the Dashboard. It asks how you want to sign in: Kubeconfig, Token or SKIP.

Selecting my ~/.kube/config file gave:

Not enough data to create auth info structure.

 

The other options are Token which says:

and SKIP which gives you a dashboard with:
configmaps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list configmaps in the namespace "default"
persistentvolumeclaims is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list persistentvolumeclaims in the namespace "default"
secrets is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list secrets in the namespace "default"
services is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list services in the namespace "default"
ingresses.extensions is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list ingresses.extensions in the namespace "default"
daemonsets.apps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list daemonsets.apps in the namespace "default"
pods is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list pods in the namespace "default"
events is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list events in the namespace "default"
deployments.apps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list deployments.apps in the namespace "default"
replicasets.apps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list replicasets.apps in the namespace "default"
jobs.batch is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list jobs.batch in the namespace "default"
cronjobs.batch is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list cronjobs.batch in the namespace "default"
replicationcontrollers is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list replicationcontrollers in the namespace "default"
statefulsets.apps is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list statefulsets.apps in the namespace "default"
and “There is nothing to display here.”

The Skip option makes the Dashboard use the privileges of the Service Account used by the Dashboard.

Admin Privileges (less secure)

Alternatively, grant Admin privileges to the Dashboard’s Service Account with a ClusterRoleBinding, i.e. add this:

# v1 replaces v1beta1, which was removed in Kubernetes 1.22
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

to dashboard-admin.yaml and deploy with:

kubectl create -f dashboard-admin.yaml

 

Then serve up the dashboard with kubectl proxy and view it at:

http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/

(If you’ve done the `ClusterRoleBinding` then you can click Skip at the Dashboard dialog)

 

Note, you can view and delete this ClusterRoleBinding with:

kubectl get clusterrolebindings

kubectl delete clusterrolebinding kubernetes-dashboard

 

and delete the dashboard with:

kubectl -n kube-system delete deployment kubernetes-dashboard

https://stackoverflow.com/questions/46173307/how-do-i-remove-the-kubernetes-dashboard-pod-from-my-deployment-on-google-cloud

 


 

 

Installing Kubernetes: Minikube

There are a bunch of ways to install Kubernetes.

Here we’ll take a look at Minikube but see future posts for Google Kubernetes Engine (GKE), AWS and a Manual Install.

Minikube

Minikube is a tool that is used to run Kubernetes locally. It runs a single-node Kubernetes cluster inside a VM so you can try out Kubernetes or develop with it.

TLDR

Start: minikube start

Upgrade:

brew update
brew cask reinstall minikube

https://stackoverflow.com/questions/45002364/how-to-upgrade-minikube

Important note: the default memory that minikube gave its VM was 4GB for me. On a Mac with 8GB of memory, the machine very quickly comes under memory pressure, as com.docker.hyperkit is already using 1.58GB.

To fix this use:

minikube config set memory 2048

(I’ve found I get memory pressure with less – e.g. 1024)

https://github.com/kubernetes/minikube/issues/567
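The memory setting is only picked up when the VM is created, so an existing VM has to be deleted and started again. A sketch, assuming you are happy to throw away the current cluster; the function name and the MINIKUBE override variable are hypothetical, and --memory is passed on start so the value is explicit:

```shell
# Recreate the Minikube VM with the given memory in MB (default 2048).
# WARNING: this deletes the existing local cluster.
# MINIKUBE is a hypothetical override for the minikube binary.
recreate_with_memory() {
  mk="${MINIKUBE:-minikube}"
  "$mk" delete
  "$mk" start --memory="${1:-2048}"
}

# Usage:
#   recreate_with_memory 2048
```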

 

[RANT: why not just use upgrade rather than reinstall? And why should you have to spend several minutes Googling and then go to a Stack Overflow page to find out how to upgrade a piece of software? Wouldn’t it have made sense to put some info somewhere like here: https://kubernetes.io/docs/tasks/tools/install-minikube/ ]

What is Minikube

Minikube creates a VM which has a Master and Node that can be managed from the Host (via kubectl).

Components are:

VM: minikube

Localkube (binary): runs Master and Node

Container runtime: currently runs Docker

Host: kubectl

Install

brew install kubectl
brew cask install minikube
brew install docker-machine-driver-xhyve
sudo chown root:wheel /usr/local/opt/docker-machine-driver-xhyve/bin/docker-machine-driver-xhyve
sudo chmod u+s /usr/local/opt/docker-machine-driver-xhyve/bin/docker-machine-driver-xhyve
minikube start --vm-driver=xhyve

Note: this driver now seems to be deprecated. i.e.

WARNING: The xhyve driver is now deprecated and support for it will be removed in a future release.
Please consider switching to the hyperkit driver, which is intended to replace the xhyve driver.

Also, kubectl is the same as kubernetes-cli: https://formulae.brew.sh/formula/kubernetes-cli

Checking

Checking which cluster we’re managing:

kubectl config current-context

minikube

Note: we could use kubectl to manage another Kubernetes cluster (e.g. Production). We’d just need to switch contexts.
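Switching between clusters is done through kubectl contexts. A minimal sketch: the two function names and the KUBECTL override variable are made up, while the kubectl config subcommands are the standard ones.

```shell
# KUBECTL is a hypothetical override for the kubectl binary.

# List the contexts kubectl knows about (the current one is marked with *).
list_contexts() {
  "${KUBECTL:-kubectl}" config get-contexts
}

# Switch kubectl to the named context, e.g. use_context minikube
use_context() {
  "${KUBECTL:-kubectl}" config use-context "$1"
}
```

After use_context, every kubectl command talks to that cluster until you switch again, so it’s worth checking current-context before deleting anything.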

Commands

List nodes

kubectl get nodes

If it’s not running you may get something like:

kubectl get nodes
No resources found.
Unable to connect to the server: dial tcp 192.168.64.5:8443: i/o timeout

Annoyingly, it takes around 2 minutes 30 seconds to time out!
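The wait can be cut down with kubectl’s --request-timeout flag. A sketch, with a made-up function name and KUBECTL override variable; the 10s default is just an example value, and since the flag bounds each individual request the total wait may be a little longer:

```shell
# List nodes but give up on each server request after a short timeout.
# $1 = timeout with a unit, e.g. 5s or 1m (default 10s).
# KUBECTL is a hypothetical override for the kubectl binary.
get_nodes_quickly() {
  "${KUBECTL:-kubectl}" get nodes --request-timeout="${1:-10s}"
}

# Usage:
#   get_nodes_quickly 5s
```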

Stop minikube

minikube stop

(keeps config)

I got an error here:

Stopping local Kubernetes cluster…
Error stopping machine: Error stopping host: minikube: unexpected EOF

and hanging occasionally at:

Starting cluster components…

Solution seemed to be:

minikube delete; rm -rf ~/.minikube

https://github.com/kubernetes/minikube/issues/227

and

https://github.com/kubernetes/minikube/issues/867

Note: be patient. On a 2018 top-end MacBook Pro with a 10Mbps network connection, this took just under 14 minutes with the major chunks of time being:

Downloading Minikube ISO and

Starting cluster components...

If you’ve already downloaded and started the components previously then this step will only take around 1 minute.

Delete Kubernetes cluster

minikube delete

Start Minikube with a Kubernetes version

Pass --kubernetes-version to minikube start. Note, however, that downgrading is not supported. E.g. if you’ve already run v1.10.0 then

minikube start --vm-driver=xhyve --kubernetes-version="v1.6.0"

will fail with

Kubernetes version downgrade is not supported. Using version: v1.10.0

minikube commands

minikube by itself shows all the minikube commands.

minikube dashboard opens the dashboard. This shows (amongst much else):

  • Nodes
  • Pods (note: choose kube-system rather than the default namespace, default, from Namespace to see these)

minikube status shows the current status.

References:

https://kubernetes.io/docs/tutorials/hello-minikube/

Create a test application

server.js

var http = require('http');

var handleRequest = function(request, response) {
  console.log('Received request for URL: ' + request.url);
  response.writeHead(200);
  response.end('Hello World!');
};
var www = http.createServer(handleRequest);
www.listen(8080);

Run with

node server.js

And test with

http://localhost:8080/

Package into a docker image

Create a file in the same folder called Dockerfile

FROM node:6.9.2
EXPOSE 8080
COPY server.js .
CMD node server.js

Note from https://kubernetes.io/docs/tutorials/hello-minikube/

Because this tutorial uses Minikube, instead of pushing your Docker image to a registry, you can simply build the image using the same Docker host as the Minikube VM, so that the images are automatically present. To do so, make sure you are using the Minikube Docker daemon:

eval $(minikube docker-env)
Note: Later, when you no longer wish to use the Minikube host, you can undo this change by running eval $(minikube docker-env -u).

Build with

docker build -t hello-node:v1 .

The -t flag tags it as hello-node:v1

Run with `kubectl run hello-node --image=hello-node:v1 --port=8080 --image-pull-policy=Never`

--image-pull-policy=Never is needed ‘cos you’ve only built it locally (i.e. you haven’t pushed it to a Docker registry).
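On newer kubectl releases (1.18 and later) kubectl run creates a bare Pod rather than a Deployment, so the later steps that refer to deployment/hello-node would find nothing. A sketch of an equivalent for those versions; the function names and the KUBECTL override variable are made up:

```shell
# KUBECTL is a hypothetical override for the kubectl binary.

# Create the Deployment explicitly. There is no pull-policy flag here, but
# the default policy for a tag other than latest is IfNotPresent, so the
# image built against the Minikube Docker daemon is used as-is.
create_hello_node() {
  "${KUBECTL:-kubectl}" create deployment hello-node --image=hello-node:v1
}

# Expose it, naming the port because the Deployment above does not declare one.
expose_hello_node() {
  "${KUBECTL:-kubectl}" expose deployment hello-node --type=LoadBalancer --port=8080
}
```

If you go this route, use expose_hello_node in place of the plain expose command when you get to creating the Service.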

View deployments

kubectl get deployments

View the pod

kubectl get pods

Note: you can see this all via a GUI if you run minikube dashboard and select Overview.

E.g. events

kubectl get events

and via Dashboard:

Workloads > Pods > <select pod> > Events

Create a Service

To make the Pod accessible from outside (it’s currently only accessible via its internal IP address), use the expose command, i.e.

kubectl expose deployment hello-node --type=LoadBalancer

and view the Service:

kubectl get services

On Minikube, open the Service in your browser with:

minikube service hello-node

and you should see the logs for this via:

kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
hello-node-57c6b66f9c-csxq5   1/1     Running   0          29m
kubectl logs hello-node-57c6b66f9c-csxq5
Received request for URL: /
Received request for URL: /favicon.ico

Update your app

Update server.js to return a new message

e.g.  `response.end('Hello World version 2!');`

Build a new version (e.g. v2):

docker build -t hello-node:v2 .

Update the image of your deployment (i.e. replacing the previous one):

kubectl set image deployment/hello-node hello-node=hello-node:v2

and run with:

minikube service hello-node
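Before opening the Service it can help to wait for the new image to finish rolling out, and to know how to back out. A sketch using the standard kubectl rollout subcommands; the function names and the KUBECTL override variable are made up:

```shell
# KUBECTL is a hypothetical override for the kubectl binary.

# Block until the hello-node Deployment has finished rolling out.
wait_for_rollout() {
  "${KUBECTL:-kubectl}" rollout status deployment/hello-node
}

# Go back to the previous revision (here, the v1 image).
roll_back() {
  "${KUBECTL:-kubectl}" rollout undo deployment/hello-node
}
```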

Clean up

Clean up the resources created in the cluster:

kubectl delete service hello-node
kubectl delete deployment hello-node

Optionally, force removal of the Docker images created:

docker rmi hello-node:v1 hello-node:v2 -f

Optionally, stop the Minikube VM:

minikube stop
eval $(minikube docker-env -u)

Optionally, delete the Minikube VM:

minikube delete

 

Errors / Problems

“Starting cluster components” hangs

e.g. https://github.com/kubernetes/minikube/issues/2886

Ignore the suggestion to use --show-libmachine-logs – that doesn’t work.

Instead:

minikube delete
rm -rf ~/.minikube

https://github.com/kubernetes/minikube/issues/2765

Other issues