
How to Scrape cAdvisor Metrics in GKE Using Prometheus

TL;DR

The Prometheus configuration is below. Be sure to grant the Prometheus service account cluster-wide permission to GET the nodes and nodes/proxy API resources.

Jump directly to 3. Prometheus Configurations.


Google Cloud Monitoring only exposes a small subset of cAdvisor metrics. With the setup below, you’ll be able to collect all of the cAdvisor metrics from GKE. The steps below query the Kubernetes API directly to get cAdvisor metrics, then build the matching Prometheus configuration.

1. Create Service Account

To scrape the cAdvisor endpoint you’ll need to create a service account with cluster permissions to GET nodes/proxy and nodes.

Create a manifest called sa-manifests.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: test
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: test
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: test
subjects:
  - kind: ServiceAccount
    name: test
    namespace: default

Run kubectl apply -f sa-manifests.yaml
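
Before moving on, it can help to confirm that the binding took effect. The sketch below wraps `kubectl auth can-i` with impersonation. The helper names `sa_user` and `check_cadvisor_rbac` are made up for this example, and it assumes your own kubectl user is allowed to impersonate service accounts.

```shell
# Hypothetical helpers (not part of kubectl): check the RBAC rules for a service account.

# Build the user name Kubernetes assigns to a service account.
# $1 = namespace, $2 = service account name
sa_user() {
  printf 'system:serviceaccount:%s:%s' "$1" "$2"
}

# Ask the API server whether the service account may GET nodes and nodes/proxy.
# Needs a working kubectl context; each check prints "yes" or "no".
check_cadvisor_rbac() {
  kubectl auth can-i get nodes --as="$(sa_user "$1" "$2")" &&
    kubectl auth can-i get nodes --subresource=proxy --as="$(sa_user "$1" "$2")"
}
```

With the manifest above applied, `check_cadvisor_rbac default test` should answer yes to both checks. A no usually points at a typo in the ClusterRoleBinding subject.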

2. Test API Manually

Create a manifest file called pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: network
  namespace: default
spec:
  containers:
    - name: network
      image: praqma/network-multitool:c3d4e04
  serviceAccountName: test

Run the following commands

kubectl apply -f pod.yaml

kubectl exec -it network -n default -- bash

Now that we are in the pod, let's make an API call to the Kubernetes API to get the cAdvisor metrics. Run these commands one at a time.

# export the KSA bearer token to an env variable
export BEARER_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)

# Find the first K8s node
export NODE_NAME=$(curl https://kubernetes.default.svc.cluster.local:443/api/v1/nodes/ -s -H "Authorization: Bearer $BEARER_TOKEN" -k | jq -r '.items[0].metadata.name')

# Make an api call to kubernetes using curl
curl https://kubernetes.default.svc.cluster.local:443/api/v1/nodes/$NODE_NAME/proxy/metrics/cadvisor -H "Authorization: Bearer $BEARER_TOKEN" -k

After that, you should see metrics for the node:

# HELP machine_nvm_capacity NVM capacity value labeled by NVM mode (memory mode or app direct mode).
# TYPE machine_nvm_capacity gauge
machine_nvm_capacity{boot_id="bf88bcb1-f7dc-425d-87cc-ec4994216eb9",machine_id="b1962a4fef066daf20ce3f9adc1ca5e5",mode="app_direct_mode",system_uuid="b1962a4f-ef06-6daf-20ce-3f9adc1ca5e5"} 0
machine_nvm_capacity{boot_id="bf88bcb1-f7dc-425d-87cc-ec4994216eb9",machine_id="b1962a4fef066daf20ce3f9adc1ca5e5",mode="memory_mode",system_uuid="b1962a4f-ef06-6daf-20ce-3f9adc1ca5e5"} 0

You can find a complete list of cAdvisor metrics in the official cAdvisor GitHub repository.
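
The commands above query only the first node. As a rough sketch, the same call can be repeated for every node and the output reduced to a list of metric names and types. The function names here are made up for illustration, and the snippet assumes `BEARER_TOKEN` is exported as in the commands above.

```shell
# Hypothetical helpers for exploring cAdvisor output from inside the test pod.
API_SERVER="https://kubernetes.default.svc.cluster.local:443"

# Build the API-server proxy path that exposes cAdvisor metrics for one node.
cadvisor_path() {
  printf '/api/v1/nodes/%s/proxy/metrics/cadvisor' "$1"
}

# Fetch cAdvisor metrics for every node name read from stdin (one per line).
scrape_nodes() {
  while IFS= read -r node; do
    echo "## node: $node"
    curl -s -k -H "Authorization: Bearer $BEARER_TOKEN" "$API_SERVER$(cadvisor_path "$node")"
  done
}

# Print "<metric name> <type>" for each "# TYPE" line on stdin.
list_metric_types() {
  awk '$1 == "#" && $2 == "TYPE" { print $3, $4 }'
}
```

One way to use it: `curl -s -k -H "Authorization: Bearer $BEARER_TOKEN" "$API_SERVER/api/v1/nodes/" | jq -r '.items[].metadata.name' | scrape_nodes | list_metric_types | sort -u`.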

3. Prometheus Configurations

Let's put these pieces together and create a Prometheus configuration that scrapes the cAdvisor metrics.

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"

# A scrape configuration for cAdvisor metrics, proxied through the
# Kubernetes API server, with one target per node.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: kubernetes-cadvisor
    honor_timestamps: true
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics/cadvisor
    scheme: https
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      # Skips certificate verification, so ca_file has no effect while this is true.
      insecure_skip_verify: true
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc.cluster.local:443
      - source_labels: [ __meta_kubernetes_node_name ]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
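    metric_relabel_configs:
      # Drop series whose namespace label is empty or missing.
      - source_labels: [ namespace ]
        separator: ;
        regex: ^$
        replacement: $1
        action: drop
      # Drop series whose pod label is empty or missing.
      - source_labels: [ pod ]
        separator: ;
        regex: ^$
        replacement: $1
        action: drop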
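
To see what the two drop rules in metric_relabel_configs do, the sketch below approximates them on raw text-format output. Prometheus treats a missing label as an empty value, so a sample with no namespace or pod label matches ^$ and is dropped. That includes node-level series such as the machine_nvm_capacity lines shown earlier. The function name `keep_pod_samples` is made up for illustration, and the pattern matching is a rough approximation (it ignores escaped quotes in label values), not Prometheus's relabeling engine.

```shell
# Hypothetical helper: approximate the two `drop` rules on text-format samples from stdin.
# Keeps only samples that carry non-empty `namespace` and `pod` labels.
keep_pod_samples() {
  grep -v '^#' |
    grep -E '[{,]namespace="[^"]+"' |
    grep -E '[{,]pod="[^"]+"'
}
```

Piping the curl output from step 2 through `keep_pod_samples` gives a preview of which samples would survive. If you also want node-level metrics in Prometheus, remove or loosen the drop rules.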

Cheers!
