What EC2 instances available in London?

Seems like a simple question, right?

However, it starts getting a little more intricate when you look a the details. Let’s say I want an m3.medium. Here’s my goto info page:

https://www.ec2instances.info/?region=eu-west-2

Cool – these are available.

 

But lets say I need them for Elasticache.

OK, looking at: https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/CacheNodes.SupportedTypes.html

I can see a cache.m3.medium. But then, reading further on – i.e. the Supported Node Types by AWS Region section – as of the time of posting this blog post, m3 is not supported in EU (London). Nor in EU (Paris).

m4is supported in EU (London). But not in EU (Paris)!

Fortunately, we’re only looking for London.

So, the obvious choice would be an m4.medium. However, scanning back up to the top of the Supported Node Types page the smallest available is a cache.m4.large.

It’s all in the details!

 

Elasticsearch Concepts

See also Elasticsearch: administration

Indexes: store all the JSON documents and are stored in a shard

Shards: a complete Lucene database

 

Indices can be stored in multiple shards.

i.e. if we have multiple nodes then shards will migrate across nodes – aka rebalancing.

 

Replicas: an exact duplicate of a shard (except designated as a Replica). Can configure as many replicas per shard as you like.

E.g. here you have 2 shards plus 2 replicas. Each contain the one index (I01).

http://www.snowcrash.eu/wp-content/uploads/2018/10/Screen-Shot-2018-10-16-at-11.05.46-AM-300x176.png 300w, http://www.snowcrash.eu/wp-content/uploads/2018/10/Screen-Shot-2018-10-16-at-11.05.46-AM-768x449.png 768w, http://www.snowcrash.eu/wp-content/uploads/2018/10/Screen-Shot-2018-10-16-at-11.05.46-AM-588x344.png 588w" sizes="(max-width: 815px) 100vw, 815px" />

Replicas are Read-only and can serve data thereby increasing scale.

https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html

 

Node Roles

Can run all on a single node but makes it more efficient if you separate them out.

Data:

  • most hard working
  • contains all shards
  • don’t typically receive service queries
  • tend to be the beefiest

Client

  • gateway to the cluster
  • big increase in performance
  • handle all query requests and redirects them to the data nodes

Master

  • brains of cluster
  • maintains cluster state
  • all nodes have a copy of the state but only the master can update cluster state

Capacity Planning

Data Nodes

To test, set up a load test on one node until node is completely saturated.

E.g. 1M documents on 1 node = 4.0 seconds then probably need 4 nodes to get to 1.0 second response time.

Master Nodes

To avoid split brain scenario:  set minimum_master_nodes to (number of master nodes / 2 ) + 1]

Should have at least 3.

Client Nodes

Could exist behind a load balancer.

E.g. in summary, a setup could be:

4 data nodes, 3 master nodes, 2 client nodes – i.e. a total of 9 nodes.

Server Requirements

  • CPU: more cores the better (favour over clock speed) – i.e. better to run more processes concurrently rather than run them faster
  • RAM: 64GB for Data nodes is ideal (e.g. in AWS an i3.2xlarge – https://aws.amazon.com/ec2/instance-types/i3/ )
  • Disks: fastest disks possible. Safe to use RAID 0 for more speed though not fault tolerant but Elasticsearch has shards. Avoid using NAS as performance will drop drastically
  • Networking: keep clustering within same data centre as shard rebalancing requires fast networking
  • VMs: don’t use for data nodes in production

 

Running on AWS

Data: i3.2xlarge

Master:

Client:

See also https://www.elastic.co/guide/en/elasticsearch/plugins/master/cloud-aws-best-practices.html

and https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

 

t2.nano – what can it do and how much does it cost?

What do you get:

  • 1 vCPU
  • 512MB RAM
  • 72 minutes of CPU time per day (if you run out of this then your CPU time is throttled to 5%)

What is it good for?

  • bastion server (you’ll occasionally use them when you ssh in)
  • low traffic web server (perhaps use a CDN to reduce load)

Note: don’t use it as a web server that needs to be up 24×7 as you’ll just find it slows to a crawl after your 72 minutes.

How much does a t2.nanocost per day?

In eu-west-2 (aka Ireland) it’s $5 per month. Or around £4 per month.