more info about how to scrape k0s internals
Bartosz Fenski committed Jan 10, 2024
1 parent 7fa70d3 commit 809a772
Showing 1 changed file with 45 additions and 1 deletion.
46 changes: 45 additions & 1 deletion docs/system-monitoring.md
@@ -11,6 +11,50 @@ You can read more about metrics for Kubernetes system components [here](https://

```shell
sudo k0s install controller --enable-metrics-scraper
```

Once you enable it, a new set of objects will show up in your cluster:

```shell
~ kubectl get all -n k0s-system
NAME                                   READY   STATUS    RESTARTS   AGE
pod/k0s-pushgateway-6c5d8c54cf-bh8sb   1/1     Running   0          43h

NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/k0s-pushgateway   ClusterIP   10.100.11.116   <none>        9091/TCP   43h

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/k0s-pushgateway   1/1     1            1           43h

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/k0s-pushgateway-6c5d8c54cf   1         1         1       43h
```
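
To confirm that the pushgateway is actually receiving metrics, you can forward its port and read the raw output. The following is a minimal sketch, not part of k0s: the function name `dump_k0s_pushgateway_metrics`, the default local port, and the two-second wait are assumptions made for illustration; the namespace, service name, and port 9091 are taken from the listing above.

```shell
# Sketch: dump the raw metrics held by the k0s pushgateway.
# Assumes kubectl and curl are installed and kubectl points at the cluster.
dump_k0s_pushgateway_metrics() {
  local local_port="${1:-9091}"   # hypothetical local port, defaults to 9091

  # Forward the ClusterIP service port to localhost in the background.
  kubectl port-forward -n k0s-system service/k0s-pushgateway \
    "${local_port}:9091" >/dev/null 2>&1 &
  local pf_pid=$!

  sleep 2   # crude wait for the tunnel to come up

  # The pushgateway serves the pushed metrics on /metrics.
  local rc=0
  curl -fsS "http://127.0.0.1:${local_port}/metrics" || rc=$?

  # Stop the port-forward again, whatever curl returned.
  kill "${pf_pid}" 2>/dev/null || true
  wait "${pf_pid}" 2>/dev/null || true
  return "${rc}"
}
```

Calling `dump_k0s_pushgateway_metrics` prints the metrics in Prometheus text format if the service is reachable.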

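Enabling the scraper alone is not enough for your monitoring system to start scraping these additional metrics.

For a Prometheus-based solution (using the Prometheus Operator), you can create a ServiceMonitor for the pushgateway like this: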

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: k0s
  namespace: k0s-system
spec:
  endpoints:
    - port: http
  selector:
    matchLabels:
      app: k0s-observability
      component: pushgateway
      k0s.k0sproject.io/stack: metrics
```
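
Before relying on the ServiceMonitor, it can help to check that its selector really matches the pushgateway Service. This is a small sketch, assuming the Prometheus Operator CRDs are installed and the ServiceMonitor above was applied under the name `k0s`; the function name `show_k0s_monitor_targets` is made up for illustration.

```shell
show_k0s_monitor_targets() {
  # List Services in k0s-system that carry the labels the ServiceMonitor selects on.
  kubectl get service -n k0s-system --show-labels \
    -l app=k0s-observability,component=pushgateway,k0s.k0sproject.io/stack=metrics

  # Confirm that the ServiceMonitor object itself exists.
  kubectl get servicemonitor -n k0s-system k0s
}
```

If the first command lists no Service, the selector and the Service labels do not match and Prometheus has nothing to scrape.
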

Note that this won't clear alerts such as `KubeControllerManagerDown` or `KubeSchedulerDown`, because they are based on Prometheus's internal `up` metric.
You can get rid of these alerts by modifying their expressions to detect a working component from the scraped metrics instead, for example:

```promql
absent(apiserver_audit_event_total{job="kube-scheduler"})
```
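
One way to ship such a modified alert with the Prometheus Operator is a `PrometheusRule` object. The sketch below only prints a manifest; the function name, the rule and group names, the `for` duration, the severity label, and the annotation text are all illustrative assumptions. Only the alert name and the expression come from the text above. Your Prometheus instance may also require extra labels on the object before it picks the rule up, and the stock `up`-based rule still has to be disabled separately.

```shell
# Print a PrometheusRule that detects a missing kube-scheduler via pushed metrics.
print_k0s_scheduler_rule() {
  cat <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: k0s-scheduler-down        # hypothetical name
  namespace: k0s-system
spec:
  groups:
    - name: k0s-system-components # hypothetical group name
      rules:
        - alert: KubeSchedulerDown
          expr: absent(apiserver_audit_event_total{job="kube-scheduler"})
          for: 15m                # assumed duration
          labels:
            severity: critical    # assumed severity
          annotations:
            summary: No kube-scheduler metrics are arriving through the k0s pushgateway.
EOF
}

# Usage (requires the Prometheus Operator CRDs in the cluster):
# print_k0s_scheduler_rule | kubectl apply -f -
```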

## Jobs

The list of components scraped by k0s:
@@ -26,4 +70,4 @@ The list of components scraped by k0s:

![k0s metrics exposure architecture](img/pushgateway.png)

k0s uses a pushgateway with a TTL to make it possible to detect issues with metrics delivery. The default TTL is 2 minutes.
