Some core terms
An endpoint that Prometheus can scrape is called an instance; it usually corresponds to a single process.
A collection of instances with the same purpose (e.g. a replicated process such as an API server) is called a job.
A target is an endpoint that Prometheus scrapes, e.g. localhost on port 9090.
https://prometheus.io/docs/introduction/first_steps/
https://prometheus.io/docs/introduction/overview/
https://prometheus.io/docs/concepts/jobs_instances/
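To make the terms concrete, here is a minimal Python sketch that groups scrape targets into jobs and prints the job and instance labels each scraped series would carry. The job names and addresses are made up for illustration.

```python
from collections import defaultdict

# Hypothetical targets: (job name, "host:port" of the scraped endpoint).
targets = [
    ("api-server", "1.2.3.4:5670"),
    ("api-server", "1.2.3.4:5671"),
    ("prometheus", "localhost:9090"),
]


def group_by_job(pairs):
    """Return {job: [instance, ...]}, preserving input order."""
    jobs = defaultdict(list)
    for job, instance in pairs:
        jobs[job].append(instance)
    return dict(jobs)


jobs = group_by_job(targets)
for job, instances in jobs.items():
    for instance in instances:
        # Every scraped series is labelled with its job and instance.
        print(f'up{{job="{job}", instance="{instance}"}}')
```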
Configuration
Prometheus is configured via /etc/prometheus/prometheus.yml
and typically starts with:
global:
https://prometheus.io/docs/prometheus/latest/configuration/configuration/
e.g. let’s dissect this:
groups:
- name: example
  rules:
  - alert: HighErrorRate
    expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
    for: 10m
    labels:
      severity: page
    annotations:
      summary: High request latency
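The interesting part of this rule is the interplay of expr and for: the alert is pending as soon as the expression is true, and only fires once it has stayed true for the full 10 minutes. Below is a simplified Python model of that behaviour, not Prometheus's actual implementation; the function name and the sample data are assumptions for illustration.

```python
def alert_state(samples, threshold=0.5, hold_seconds=600):
    """Return the alert state after the last rule evaluation.

    samples: list of (unix_timestamp, value) pairs, one per evaluation,
    in increasing time order. threshold mirrors "> 0.5" in expr and
    hold_seconds mirrors "for: 10m".
    """
    active_since = None  # when expr first became true in the current run
    state = "inactive"
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts
            # Pending until the condition has held for the whole window.
            held = ts - active_since
            state = "firing" if held >= hold_seconds else "pending"
        else:
            # Any evaluation where expr is false resets the timer.
            active_since = None
            state = "inactive"
    return state


# Latency above 0.5s at three evaluations spaced 5 minutes apart.
print(alert_state([(0, 0.7), (300, 0.8), (600, 0.9)]))
```

A single dip below the threshold resets the timer, which is why for is useful for suppressing short spikes.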
See alerting rules: https://github.com/prometheus/prometheus/blob/master/docs/configuration/alerting_rules.md
and recording rules: https://github.com/prometheus/prometheus/blob/master/docs/configuration/recording_rules.md
and this on notifications
and this on expr:
https://pierrevincent.github.io/2017/12/prometheus-blog-series-part-5-alerting-rules/
Basics of querying:
1. Go to the Prometheus UI: https://prom-server/graph
2. Enter time series selectors
e.g.
http_requests_total
or
node_filesystem_avail
or with a label
node_filesystem_avail{mountpoint="/"}
Notes:
Label matching operators:
- = Select labels that are exactly equal to the provided string
- != Select labels that are not equal to the provided string
- =~ Select labels that regex-match the provided string (the regex is fully anchored, so it must match the whole label value)
- !~ Select labels that do not regex-match the provided string (same anchoring)
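One thing worth testing for yourself: Prometheus regex matchers are fully anchored, so a pattern has to match the entire label value, not just part of it. Here is a rough Python sketch of the four operators. It uses Python's re module as a stand-in for the RE2 syntax Prometheus uses, and the function name is made up.

```python
import re


def label_matches(value, op, pattern):
    """Apply one label matcher to a single label value."""
    if op == "=":
        return value == pattern
    if op == "!=":
        return value != pattern
    if op == "=~":
        # fullmatch mirrors the anchoring: the whole value must match.
        return re.fullmatch(pattern, value) is not None
    if op == "!~":
        return re.fullmatch(pattern, value) is None
    raise ValueError(f"unknown operator: {op}")


# mountpoint=~"/" does not select "/home"; mountpoint=~"/.*" does.
print(label_matches("/home", "=~", "/"))
print(label_matches("/home", "=~", "/.*"))
```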
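Get the Prometheus server's own metrics (not the full list of stored metrics) using:
curl http://localhost:9090/metrics
To list the names of all metrics stored on the server, query the values of the __name__ label:
curl http://localhost:9090/api/v1/label/__name__/values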
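The /metrics endpoint returns plain text in the Prometheus exposition format: comment lines starting with #, then one sample per line. As a small sketch, this Python snippet pulls the distinct metric names out of such text. The sample text is hand-written for illustration, not real server output, and the parser ignores edge cases such as timestamps.

```python
import re

# Hand-written sample in the text exposition format (illustrative only).
SAMPLE_METRICS = """\
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027
http_requests_total{method="post",code="400"} 3
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
"""


def metric_names(text):
    """Return distinct metric names in order of first appearance."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first "{" (labels) or whitespace (value).
        name = re.split(r"[{\s]", line, maxsplit=1)[0]
        if name not in names:
            names.append(name)
    return names


print(metric_names(SAMPLE_METRICS))
```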
And targets:
curl http://localhost:9090/api/v1/targets
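The targets endpoint answers with JSON. Here is a minimal Python sketch that summarises the active targets from such a response. The payload below is a trimmed, hand-written example (the job names and ports are assumptions), keeping only the fields the sketch reads.

```python
import json

# Trimmed, hand-written example of a /api/v1/targets response.
SAMPLE_TARGETS = json.loads("""
{
  "status": "success",
  "data": {
    "activeTargets": [
      {"labels": {"instance": "localhost:9090", "job": "prometheus"},
       "scrapeUrl": "http://localhost:9090/metrics", "health": "up"},
      {"labels": {"instance": "localhost:10000", "job": "node"},
       "scrapeUrl": "http://localhost:10000/metrics", "health": "down"}
    ],
    "droppedTargets": []
  }
}
""")


def summarize_targets(response):
    """Return (job, instance, health) for every active target."""
    if response.get("status") != "success":
        raise ValueError("Prometheus API call failed")
    return [
        (t["labels"]["job"], t["labels"]["instance"], t["health"])
        for t in response["data"]["activeTargets"]
    ]


for job, instance, health in summarize_targets(SAMPLE_TARGETS):
    print(f"{job:12} {instance:18} {health}")
```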
https://prometheus.io/docs/prometheus/latest/querying/basics/
/api/v1 is the HTTP API.
E.g. see https://prometheus.io/docs/prometheus/latest/querying/api/
More later.
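As a taste of the API, the selectors from the querying section can be sent to the instant query endpoint, /api/v1/query, in the query parameter. Here is a small Python sketch that builds such a URL with proper escaping; the helper name is made up, and the base URL assumes a local server on port 9090.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against /api/v1/query."""
    params = urlencode({"query": promql})  # escapes { } " = and /
    return base_url.rstrip("/") + "/api/v1/query?" + params


url = instant_query_url(
    "http://localhost:9090", 'node_filesystem_avail{mountpoint="/"}'
)
print(url)

# Decoding the query string gives back the original expression.
print(parse_qs(urlsplit(url).query)["query"][0])
```

The resulting URL can be pasted into curl (in quotes) or fetched with any HTTP client.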
More useful docs:
https://petargitnik.github.io/blog/2018/01/04/how-to-write-rules-for-prometheus
Note: Prometheus was developed to monitor web services. To monitor a node, you’ll need Node Exporter: https://www.digitalocean.com/community/tutorials/how-to-use-prometheus-to-monitor-your-centos-7-server
HTTP API
The HTTP API is exposed at /api/v1.
https://prometheus.io/docs/prometheus/latest/querying/api/
and label values:
https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values
E.g. curl http://localhost:9090/api/v1/label/job/values
gets all the label values for the job label.
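The response is a JSON object with a status field and the values under data. Here is a minimal Python sketch of building the URL and unpacking the response; the helper names and the sample body (with made-up job names) are assumptions for illustration.

```python
import json
from urllib.parse import quote

# Hand-written example body for /api/v1/label/job/values.
SAMPLE_BODY = '{"status": "success", "data": ["prometheus", "node"]}'


def label_values_url(base_url, label):
    """Build the URL that lists all values of one label."""
    return f"{base_url.rstrip('/')}/api/v1/label/{quote(label)}/values"


def parse_label_values(body):
    """Return the sorted label values from a JSON response body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "unknown error"))
    return sorted(payload["data"])


print(label_values_url("http://localhost:9090", "job"))
print(parse_label_values(SAMPLE_BODY))
```

Passing __name__ as the label lists every metric name the server knows about.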
Exporters
It’s the job of an exporter to expose values from a node over HTTP so that Prometheus can scrape them. E.g. on an Elasticsearch node:
ps -ef | grep export
root 11637 1 0 Mar21 ? 00:44:18 /usr/local/bin/elasticsearch_exporter -web.listen-address=:9000
root 15603 1 0 2017 ? 03:10:45 /usr/local/bin/node_exporter -web.listen-address=:10000 -collector.textfile.directory=/var/local
Here we can see an Elasticsearch exporter and a node exporter (for CPU and other host metrics).
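Each exporter listens on the port given by -web.listen-address, and that is what Prometheus scrapes. As a rough sketch, this Python snippet derives the scrape URLs from the ps output above. It assumes the exporters serve metrics on the default /metrics path and are reachable on localhost; both are assumptions, as is the helper name.

```python
import re

PS_OUTPUT = """\
root 11637 1 0 Mar21 ? 00:44:18 /usr/local/bin/elasticsearch_exporter -web.listen-address=:9000
root 15603 1 0 2017 ? 03:10:45 /usr/local/bin/node_exporter -web.listen-address=:10000 -collector.textfile.directory=/var/local
"""

# Captures the exporter binary name and the port it listens on.
EXPORTER_RE = re.compile(
    r"/(\w+_exporter)\b.*?-web\.listen-address=\S*?:(\d+)"
)


def exporter_endpoints(ps_text, host="localhost"):
    """Map exporter name to the URL Prometheus would scrape."""
    endpoints = {}
    for line in ps_text.splitlines():
        match = EXPORTER_RE.search(line)
        if match:
            name, port = match.groups()
            # Assumption: metrics are served on the default /metrics path.
            endpoints[name] = f"http://{host}:{port}/metrics"
    return endpoints


for name, url in exporter_endpoints(PS_OUTPUT).items():
    print(name, url)
```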
Prometheus pulls data from the Elasticsearch exporter, which is configured as follows:
and we can check the data in Prometheus via:
Notes:
Marvel allows you to monitor Elasticsearch via Kibana. As of 5.0, Marvel is part of X-Pack.
https://www.elastic.co/guide/en/marvel/current/introduction.html