
Monitoring involves collecting information about the cluster and its performance metrics, such as memory, disk, and network utilization. Monitoring data is retrieved from the kubelet service running on each node.

K8s does not have a native monitoring solution. There are many third-party open-source monitoring solutions, such as Metrics Server, Elastic Stack, and Prometheus. There are also proprietary monitoring solutions, such as Datadog and Dynatrace.

Metrics Server

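It is an open-source, in-memory monitoring solution built as a slimmed-down version of Heapster (the monitoring tool used earlier). Because it keeps metrics only in memory, it does not provide historical data. To set up Metrics Server, clone the repo below and run k apply -f . inside it (k being an alias for kubectl).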

git clone https://github.com/kubernetes-incubator/metrics-server.git

We can then run k top node to see the CPU and memory consumption of each node and k top pod to see the same for pods.

<aside> 💡 A better way to monitor the cluster is to use a dedicated monitoring solution.

</aside>

Prometheus and Grafana

Installation

Use the kube-prometheus-stack Helm chart to deploy all the required components on the cluster, including Grafana and Alertmanager. The Prometheus Operator creates several CRDs that provide an abstraction layer, allowing us to configure Prometheus by creating K8s manifests.

The Prometheus UI is available on port 9090 of the Prometheus server (look for the service corresponding to the prometheus-prometheus pod). The Helm chart installs node-exporter as a DaemonSet, so one instance runs on each node and exposes that node's system-level metrics for the Prometheus server to scrape.

ServiceMonitor

The ServiceMonitor CRD (created by the Prometheus Operator) can be used to add a scrape target to Prometheus. The kube-prometheus-stack Helm chart automatically creates some service monitors to scrape the cluster control plane components. We can also add our own service monitors to scrape metrics from applications running inside pods.

For example, a service monitor can be created to scrape the api-service every 30 seconds for metrics on the port named web (3000) at the path /swagger-stats/metrics. The name of the scraping job will be node-api in this case.
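
A minimal sketch of such a ServiceMonitor could look like this. The object name, the app: api-service selector label, and the release: prometheus label are assumptions for illustration (with the chart's default settings, Prometheus only picks up service monitors labelled with the Helm release name, so adjust it to match the release). It also assumes the Service carries a label job: node-api, since jobLabel takes the job name from the value of that Service label.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: api-service-monitor      # hypothetical name
  labels:
    release: prometheus          # assumption: name of the Helm release
spec:
  jobLabel: job                  # job name is read from the Service's "job" label (assumed to be node-api)
  selector:
    matchLabels:
      app: api-service           # assumption: label set on the api-service Service
  endpoints:
    - port: web                  # named port on the Service (3000)
      path: /swagger-stats/metrics
      interval: 30s              # scrape every 30 seconds
```

If the Service lives in a different namespace than the ServiceMonitor, a namespaceSelector would also be needed under spec.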


PrometheusRule

To add new alerting or recording rules to Prometheus, we can create a PrometheusRule object (a CRD created by the Prometheus Operator). The kube-prometheus-stack Helm chart automatically creates some Prometheus rules in the cluster.
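
As a rough illustration, a PrometheusRule with a single alerting rule could look like the following. The object name, group name, alert name, duration, and the release: prometheus label are hypothetical, and the expression assumes a scrape job called node-api exists.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: api-rules                # hypothetical name
  labels:
    release: prometheus          # assumption: name of the Helm release
spec:
  groups:
    - name: api.rules            # hypothetical group name
      rules:
        - alert: ApiServiceDown  # hypothetical alert name
          expr: up{job="node-api"} == 0   # "up" is 0 when a scrape of the target fails
          for: 1m                # condition must hold for 1 minute before the alert fires
          labels:
            severity: critical
          annotations:
            summary: "A node-api target has been unreachable for 1 minute"
```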


AlertManagerRule