Signed-off-by: Clément Nussbaumer <[email protected]>
1 parent
4950dd6
commit bd1ee9f
Showing 4 changed files with 55 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,10 +3,12 @@ | |
![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/postfinance/kubenurse) | ||
|
||
# Kubenurse | ||
|
||
kubenurse is a little service that monitors all network connections in a Kubernetes | ||
cluster. Kubenurse measures request durations, records errors and exports those metrics in Prometheus format. | ||
|
||
## Deployment | ||
|
||
You can get the Docker image from [Docker Hub](https://hub.docker.com/r/postfinance/kubenurse/). | ||
The [examples](https://github.com/postfinance/kubenurse/tree/master/examples) directory | ||
contains manifests which can be used to deploy kubenurse to the kube-system namespace of your cluster. | ||
|
@@ -45,6 +47,7 @@ The following command can be used to install kubenurse with Helm: `helm upgrade | |
| insecure | Sets `KUBENURSE_INSECURE` environment variable | `true` | | ||
| allow_unschedulable | Sets `KUBENURSE_ALLOW_UNSCHEDULABLE` environment variable | `false` | | ||
| neighbour_filter | Sets `KUBENURSE_NEIGHBOUR_FILTER` environment variable | `app.kubernetes.io/name=kubenurse` | | ||
| neighbour_limit | Sets `KUBENURSE_NEIGHBOUR_LIMIT` environment variable | `10` | | ||
| extra_ca | Sets `KUBENURSE_EXTRA_CA` environment variable | | | ||
| check_api_server_direct | Sets `KUBENURSE_CHECK_API_SERVER_DIRECT` environment variable | `true` | | ||
| check_api_server_dns | Sets `KUBENURSE_CHECK_API_SERVER_DNS` environment variable | `true` | | ||
|
@@ -74,7 +77,6 @@ dashboards [as this example](./doc/grafana-kubenurse.json) that show network lat | |
![Grafana ingress view](doc/grafana_ingress.png "Grafana ingress view") | ||
![Grafana path view](doc/grafana_path.png "Grafana path view") | ||
## Configuration | ||
kubenurse is configured with environment variables: | ||
|
@@ -85,12 +87,13 @@ kubenurse is configured with environment variables: | |
- `KUBENURSE_EXTRA_CA`: Additional CA cert path for TLS connections | ||
- `KUBENURSE_NAMESPACE`: Namespace in which to look for the neighbour kubenurses | ||
- `KUBENURSE_NEIGHBOUR_FILTER`: A Kubernetes label selector (e.g. `app=kubenurse`) to filter neighbour kubenurses | ||
- `KUBENURSE_NEIGHBOUR_LIMIT`: The maximum number of neighbours each kubenurse will query | ||
- `KUBENURSE_ALLOW_UNSCHEDULABLE`: If this is `"true"`, path checks to neighbouring kubenurses are made even if they are running on unschedulable nodes. | ||
- `KUBENURSE_CHECK_API_SERVER_DIRECT`: If this is `"true"`, kubenurse will perform the check [API Server Direct](#API Server Direct). default is "true" | ||
- `KUBENURSE_CHECK_API_SERVER_DNS`: If this is `"true"`, kubenurse will perform the check [API Server DNS](#API Server DNS). default is "true" | ||
- `KUBENURSE_CHECK_ME_INGRESS`: If this is `"true"`, kubenurse will perform the check [Me Ingress](#Me Ingress). default is "true" | ||
- `KUBENURSE_CHECK_ME_SERVICE`: If this is `"true"`, kubenurse will perform the check [Me Service](#Me Service). default is "true" | ||
- `KUBENURSE_CHECK_NEIGHBOURHOOD`: If this is `"true"`, kubenurse will perform the check [Neighbourhood](#Neighbourhood). default is "true" | ||
- `KUBENURSE_CHECK_NEIGHBOURHOOD`: If this is `"true"`, kubenurse will perform the check [Neighbourhood](#neighbourhood). default is "true" | ||
- `KUBENURSE_CHECK_INTERVAL`: The frequency at which kubenurse checks are performed. The string should be formatted for [time.ParseDuration](https://pkg.go.dev/time#ParseDuration). Defaults to `5s` | ||
- `KUBENURSE_REUSE_CONNECTIONS`: whether to reuse connections or not for all checks. default is "false" | ||
- `KUBENURSE_HISTOGRAM_BUCKETS`: optional comma-separated list of float64, used in place of the [default prometheus histogram buckets](https://pkg.go.dev/github.com/prometheus/[email protected]/prometheus#DefBuckets) | ||
|
@@ -152,8 +155,8 @@ The `/alive` endpoint returns a JSON like this with status code 200 if everythin | |
} | ||
``` | ||
|
||
|
||
## Health Checks | ||
|
||
Every five seconds and on every access of `/alive`, the checks described below are run. | ||
Check results are cached for 3 seconds in order to prevent excessive network traffic. | ||
|
||
|
@@ -162,19 +165,22 @@ A little illustration of what communication occurs, is here: | |
![Communication](doc/Communication.png "Communication") | ||
|
||
### API Server Direct | ||
|
||
Checks the `/version` endpoint of the Kubernetes API Server through | ||
the direct link (`KUBERNETES_SERVICE_HOST`, `KUBERNETES_SERVICE_PORT`). | ||
|
||
Metric type: `api_server_direct` | ||
|
||
### API Server DNS | ||
|
||
Checks the `/version` endpoint of the Kubernetes API Server through | ||
the Cluster DNS URL `https://kubernetes.default.svc:$KUBERNETES_SERVICE_PORT`. | ||
This also verifies a working `kube-dns` deployment. | ||
|
||
Metric type: `api_server_dns` | ||
|
||
### Me Ingress | ||
|
||
Checks if the kubenurse is reachable at the `/alwayshappy` endpoint behind the ingress. | ||
This address is provided by the environment variable `KUBENURSE_INGRESS_URL` that | ||
could look like `https://kubenurse.example.com`. | ||
|
@@ -183,6 +189,7 @@ This also verifies a correct upstream DNS resolution. | |
Metric type: `me_ingress` | ||
|
||
### Me Service | ||
|
||
Checks if the kubenurse is reachable at the `/alwayshappy` endpoint through the Kubernetes service. | ||
The address is provided by the environment variable `KUBENURSE_SERVICE_URL` that | ||
could look like `http://kubenurse.mynamespace.default.svc:8080`. | ||
|
@@ -191,6 +198,7 @@ This also verifies a working `kube-proxy` setup. | |
Metric type: `me_service` | ||
|
||
### Neighbourhood | ||
|
||
Checks if every neighbour kubenurse is reachable at the `/alwayshappy` endpoint. | ||
Neighbours are discovered by querying the kube-apiserver for every Pod in the | ||
`KUBENURSE_NAMESPACE` with label `KUBENURSE_NEIGHBOUR_FILTER`. | ||
|
@@ -201,7 +209,44 @@ this can be changed by setting `KUBENURSE_ALLOW_UNSCHEDULABLE="true"`. | |
|
||
Metric type: `path_$KUBELET_HOSTNAME` | ||
|
||
#### Neighbourhood filtering | ||
|
||
The number of checks for the neighbourhood used to grow as $O(N^2)$, which | ||
rendered `kubenurse` impractical on large clusters, as documented in issue | ||
[#55](https://github.com/postfinance/kubenurse/issues/55). | ||
To combat this, a node filtering feature was implemented, which works as follows: | ||
|
||
- kubenurse computes the `sha256` checksums for all neighbours' node names | ||
- it sorts those checksums (this is actually implemented with a max-heap) | ||
- it computes its own node name checksum, and queries the next 10 (by default) | ||
nodes in the sorted checksums list | ||
|
||
Thanks to this, each kubenurse always queries the same 10 nodes, unless one | ||
of those nodes disappears, in which case kubenurse will pick the next node in | ||
the sorted checksums list. This comes with several advantages: | ||
|
||
- because the node names are hashed first, the checks are distributed | ||
randomly, independent of the node names. If we only picked the next 10 | ||
nodes in a sorted list of the node names, then we might have biased the | ||
results in environments where node names are sequential | ||
- metrics-wise, a `kubenurse` pod should typically only have entries for ca. 10 | ||
other neighbouring nodes worth of checks, which greatly reduces the load on | ||
your monitoring infrastructure | ||
- because we use a deterministic algorithm to choose which nodes to query, the | ||
metrics churn rate stays minimal. (That is, if we randomly picked 10 nodes | ||
for every check, then in the end there would be one Prometheus bucket for | ||
every node in the cluster, which would put useless load on the monitoring | ||
infrastructure.) | ||
|
||
By default, the neighbourhood filtering is set to 10 nodes, which means that | ||
on clusters with more than 10 nodes, each kubenurse will query 10 nodes, as | ||
described above. | ||
|
||
To bypass the node filtering feature, you simply need to set the | ||
`KUBENURSE_NEIGHBOUR_LIMIT` environment variable to 0. | ||
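
The selection steps described in this section can be sketched in Go as follows. This is a simplified illustration, not the code from this commit: it uses a plain `sort.Slice` instead of the max-heap mentioned above, compares hex-encoded checksums as strings, and assumes that the selection wraps around to the start of the sorted list when a node sits near its end. The names `checksum` and `selectNeighbours` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// checksum returns the hex-encoded SHA-256 digest of a node name.
func checksum(nodeName string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(nodeName)))
}

// selectNeighbours (hypothetical helper) returns up to limit node names that
// follow self in the list of nodes sorted by checksum. A limit of 0 (or less)
// disables filtering and returns every other node.
func selectNeighbours(self string, nodes []string, limit int) []string {
	others := make([]string, 0, len(nodes))
	for _, n := range nodes {
		if n != self {
			others = append(others, n)
		}
	}
	if limit <= 0 || limit >= len(others) {
		return others
	}
	sort.Slice(others, func(i, j int) bool {
		return checksum(others[i]) < checksum(others[j])
	})
	// Index of the first node whose checksum is greater than our own.
	selfSum := checksum(self)
	start := sort.Search(len(others), func(i int) bool {
		return checksum(others[i]) > selfSum
	})
	picked := make([]string, 0, limit)
	for i := 0; i < limit; i++ {
		// Assumption: wrap around to the beginning of the sorted list.
		picked = append(picked, others[(start+i)%len(others)])
	}
	return picked
}

func main() {
	nodes := []string{"node-1", "node-2", "node-3", "node-4", "node-5", "node-6"}
	fmt.Println(selectNeighbours("node-1", nodes, 3))
	fmt.Println(selectNeighbours("node-1", nodes, 0))
}
```

Because the result depends only on the set of node names, repeated calls with the same inputs return the same neighbours, which is what keeps the set of `path_$KUBELET_HOSTNAME` metric types stable over time.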
|
||
## Metrics | ||
|
||
All performed checks expose metrics which can be used to monitor/alert: | ||
|
||
- SDN network latencies and errors | ||
|
@@ -214,5 +259,6 @@ All performed checks expose metrics which can be used to monitor/alert: | |
- External DNS resolution errors (ingress URL resolution) | ||
|
||
At `/metrics` you will find these: | ||
|
||
- `kubenurse_errors_total`: Kubenurse error counter partitioned by error type | ||
- `kubenurse_request_duration`: a histogram for Kubenurse request duration partitioned by error type |