Logging and monitoring

This page explains the logging and monitoring options that are available with Anthos private mode.

Prometheus and Grafana

Prometheus and Grafana, together with Alertmanager, are popular open source monitoring products:

  • Prometheus collects application and system metrics.

  • Alertmanager sends alerts through several different notification mechanisms.

  • Grafana is a dashboarding tool.

Prometheus and Grafana are enabled on each admin cluster and user cluster.

How Logging and Monitoring works

Logging and metrics agents are installed in each cluster when you create a new admin or user cluster. The components are:

  • LogMon operator (logmon-operator-): An operator that manages the lifecycle of all the other components and serves the LogMon APIs.
  • Logging agents (anthos-log-forwarder-): A Fluent Bit DaemonSet that forwards logs from each node of each cluster to Logs Storage.
  • Metrics agents (anthos-prometheus-k8s-): A Prometheus agent deployed in each cluster to collect that cluster's metrics.
  • Metrics addons (node-exporter-, kube-state-metrics-): Node Exporter and Kube State Metrics are deployed to provide richer metrics about each node and about cluster-wide Kubernetes state.
  • Metrics Storage (anthos-prometheus-k8s-): A Prometheus instance in the admin cluster is the central store for metrics from both admin and user clusters, backed by a persistent volume.
  • Logs Storage (loki-): Loki in the admin cluster is the central store for logs from both admin and user clusters, backed by a persistent volume.
  • UI (grafana-): A Grafana instance deployed in the admin cluster to visualize and query both logs and metrics.
  • Alerting (alertmanager-): An Alertmanager instance deployed in the admin cluster to configure and push alert notifications.
  • Multi-cluster monitoring (pushprox-server-, pushprox-client-): A pushprox client is deployed in each user cluster, and a pushprox server is deployed in the admin cluster, for metrics federation.
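
The pod-name prefixes in the list above can be used to work out which component a given pod belongs to. The following is a minimal sketch in Python, not part of any Anthos tooling: the prefixes come from the list, while the mapping structure, the function names classify_pod and group_pods, and the sample pod names are assumptions made for illustration.

```python
# Pod-name prefixes copied from the component list above.
# The dictionary layout and function names are illustrative only.
COMPONENT_PREFIXES = {
    "logmon-operator-": "LogMon operator",
    "anthos-log-forwarder-": "Logging agents",
    # The same prefix is listed for both the per-cluster agent and the
    # central storage in the admin cluster.
    "anthos-prometheus-k8s-": "Metrics agents / Metrics Storage",
    "node-exporter-": "Metrics addons",
    "kube-state-metrics-": "Metrics addons",
    "loki-": "Logs Storage",
    "grafana-": "UI",
    "alertmanager-": "Alerting",
    "pushprox-server-": "Multi-cluster monitoring",
    "pushprox-client-": "Multi-cluster monitoring",
}


def classify_pod(pod_name):
    """Return the component a pod belongs to, or None if no prefix matches."""
    # Check longer prefixes first so the most specific prefix wins.
    for prefix in sorted(COMPONENT_PREFIXES, key=len, reverse=True):
        if pod_name.startswith(prefix):
            return COMPONENT_PREFIXES[prefix]
    return None


def group_pods(pod_names):
    """Group pod names by component; unmatched pods go under 'unknown'."""
    groups = {}
    for name in pod_names:
        groups.setdefault(classify_pod(name) or "unknown", []).append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical pod names, for example as returned by a pod listing.
    sample = ["loki-0", "grafana-5d9c7", "pushprox-client-x2", "coredns-abc"]
    for component, pods in group_pods(sample).items():
        print(component, pods)
```

Feeding the function the pod names from each cluster gives a quick check of which LogMon components are present there.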

Architecture

The following diagrams show the architecture of admin and user clusters in Anthos private mode.

Admin cluster

The admin cluster contains Prometheus for metrics storage, Loki for logs storage, Grafana as the UI for exploring metrics and logs, and Alertmanager for alerting.

Admin cluster architecture
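
Because Prometheus and Loki in the admin cluster hold the metrics and logs for all clusters, a query against either store is ultimately an HTTP request. The sketch below builds such requests in Python using the standard Prometheus instant-query path (/api/v1/query) and the Loki range-query path (/loki/api/v1/query_range). The base addresses are hypothetical placeholders, and whether these endpoints can be reached directly in a given Anthos private mode installation is an assumption; Grafana remains the documented way to explore the data.

```python
from urllib.parse import urlencode

# Hypothetical addresses; substitute the real service endpoints.
PROMETHEUS_BASE = "http://prometheus.example.internal:9090"
LOKI_BASE = "http://loki.example.internal:3100"


def prometheus_query_url(promql, base=PROMETHEUS_BASE):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


def loki_query_range_url(logql, start_ns, end_ns, limit=100, base=LOKI_BASE):
    """Build a range-query URL for the Loki HTTP API.

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = {"query": logql, "start": start_ns, "end": end_ns, "limit": limit}
    return f"{base}/loki/api/v1/query_range?{urlencode(params)}"


if __name__ == "__main__":
    # "up" is a built-in Prometheus series; the job label value is made up.
    print(prometheus_query_url("up"))
    print(loki_query_range_url('{job="example"}', 0, 1_000_000_000))
```

The functions only construct URLs and send nothing, so the sketch can be run without access to a cluster.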

User clusters

In user clusters, Prometheus collects metrics and Fluent Bit collects logs. Both are sent to the admin cluster.

User cluster architecture