Skip to content

Commit

Permalink
docs: neighbour filtering
Browse files Browse the repository at this point in the history
Signed-off-by: Clément Nussbaumer <[email protected]>
  • Loading branch information
clementnuss committed Mar 12, 2024
1 parent 4950dd6 commit bd1ee9f
Show file tree
Hide file tree
Showing 4 changed files with 55 additions and 3 deletions.
52 changes: 49 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,12 @@
![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/postfinance/kubenurse)

# Kubenurse

kubenurse is a little service that monitors all network connections in a Kubernetes
cluster. Kubenurse measures request durations, records errors and exports those metrics in Prometheus format.

## Deployment

You can get the Docker image from [Docker Hub](https://hub.docker.com/r/postfinance/kubenurse/).
The [examples](https://github.com/postfinance/kubenurse/tree/master/examples) directory
contains manifests which can be used to deploy kubenurse to the kube-system namespace of your cluster.
Expand Down Expand Up @@ -45,6 +47,7 @@ The following command can be used to install kubenurse with Helm: `helm upgrade
| insecure | Set `KUBENURSE_INSECURE` environment variable | `true` |
| allow_unschedulable | Sets `KUBENURSE_ALLOW_UNSCHEDULABLE` environment variable | `false` |
| neighbour_filter | Sets `KUBENURSE_NEIGHBOUR_FILTER` environment variable | `app.kubernetes.io/name=kubenurse` |
| neighbour_limit | Sets `KUBENURSE_NEIGHBOUR_LIMIT` environment variable | `10` |
| extra_ca | Sets `KUBENURSE_EXTRA_CA` environment variable | |
| check_api_server_direct | Sets `KUBENURSE_CHECK_API_SERVER_DIRECT` environment variable | `true` |
| check_api_server_dns | Sets `KUBENURSE_CHECK_API_SERVER_DNS` environment variable | `true` |
Expand Down Expand Up @@ -74,7 +77,6 @@ dashboards [as this example](./doc/grafana-kubenurse.json) that show network lat
![Grafana ingress view](doc/grafana_ingress.png "Grafana ingress view")
![Grafana path view](doc/grafana_path.png "Grafana path view")
## Configuration
kubenurse is configured with environment variables:
Expand All @@ -85,12 +87,13 @@ kubenurse is configured with environment variables:
- `KUBENURSE_EXTRA_CA`: Additional CA cert path for TLS connections
- `KUBENURSE_NAMESPACE`: Namespace in which to look for the neighbour kubenurses
- `KUBENURSE_NEIGHBOUR_FILTER`: A Kubernetes label selector (eg. `app=kubenurse`) to filter neighbour kubenurses
- `KUBENURSE_NEIGHBOUR_LIMIT`: The maximum number of neighbours each kubenurse will query
- `KUBENURSE_ALLOW_UNSCHEDULABLE`: If this is `"true"`, path checks to neighbouring kubenurses are made even if they are running on unschedulable nodes.
- `KUBENURSE_CHECK_API_SERVER_DIRECT`: If this is `"true"` kubenurse will perform the check [API Server Direct](#API Server Direct). default is "true"
- `KUBENURSE_CHECK_API_SERVER_DNS`: If this is `"true"`, kubenurse will perform the check [API Server DNS](#API Server DNS). default is "true"
- `KUBENURSE_CHECK_ME_INGRESS`: If this is `"true"`, kubenurse will perform the check [Me Ingress](#Me Ingress). default is "true"
- `KUBENURSE_CHECK_ME_SERVICE`: If this is `"true"`, kubenurse will perform the check [Me Service](#Me Service). default is "true"
- `KUBENURSE_CHECK_NEIGHBOURHOOD`: If this is `"true"`, kubenurse will perform the check [Neighbourhood](#Neighbourhood). default is "true"
- `KUBENURSE_CHECK_NEIGHBOURHOOD`: If this is `"true"`, kubenurse will perform the check [Neighbourhood](#neighbourhood). default is "true"
- `KUBENURSE_CHECK_INTERVAL`: the frequency to perform kubenurse checks. the string should be formatted for [time.ParseDuration](https://pkg.go.dev/time#ParseDuration). defaults to `5s`
- `KUBENURSE_REUSE_CONNECTIONS`: whether to reuse connections or not for all checks. default is "false"
- `KUBENURSE_HISTOGRAM_BUCKETS`: optional comma-separated list of float64, used in place of the [default prometheus histogram buckets](https://pkg.go.dev/github.com/prometheus/[email protected]/prometheus#DefBuckets)
Expand Down Expand Up @@ -152,8 +155,8 @@ The `/alive` endpoint returns a JSON like this with status code 200 if everythin
}
```


## Health Checks

Every five seconds and on every access of `/alive`, the checks described below are run.
Check results are cached for 3 seconds in order to prevent excessive network traffic.

Expand All @@ -162,19 +165,22 @@ A little illustration of what communication occurs, is here:
![Communication](doc/Communication.png "Communication")

### API Server Direct

Checks the `/version` endpoint of the Kubernetes API Server through
the direct link (`KUBERNETES_SERVICE_HOST`, `KUBERNETES_SERVICE_PORT`).

Metric type: `api_server_direct`

### API Server DNS

Checks the `/version` endpoint of the Kubernetes API Server through
the Cluster DNS URL `https://kubernetes.default.svc:$KUBERNETES_SERVICE_PORT`.
This also verifies a working `kube-dns` deployment.

Metric type: `api_server_dns`

### Me Ingress

Checks if the kubenurse is reachable at the `/alwayshappy` endpoint behind the ingress.
This address is provided by the environment variable `KUBENURSE_INGRESS_URL` that
could look like `https://kubenurse.example.com`.
Expand All @@ -183,6 +189,7 @@ This also verifies a correct upstream DNS resolution.
Metric type: `me_ingress`

### Me Service

Checks if the kubenurse is reachable at the `/alwayshappy` endpoint through the Kubernetes service.
The address is provided by the environment variable `KUBENURSE_SERVICE_URL` that
could look like `http://kubenurse.mynamespace.default.svc:8080`.
Expand All @@ -191,6 +198,7 @@ This also verifies a working `kube-proxy` setup.
Metric type: `me_service`

### Neighbourhood

Checks if every neighbour kubenurse is reachable at the `/alwayshappy` endpoint.
Neighbours are discovered by querying the kube-apiserver for every Pod in the
`KUBENURSE_NAMESPACE` with label `KUBENURSE_NEIGHBOUR_FILTER`.
Expand All @@ -201,7 +209,44 @@ this can be changed by setting `KUBENURSE_ALLOW_UNSCHEDULABLE="true"`.

Metric type: `path_$KUBELET_HOSTNAME`

#### Neighbourhood filtering

The number of checks for the neighbourhood used to grow as $O(N^2)$, which
rendered `kubenurse` impractical on large clusters, as documented in issue
[#55](https://github.com/postfinance/kubenurse/issues/55).
To combat this, a node filtering feature was implemented, which works as follows

- kubenurse computes the `sha256` checksums for all neighbours' node names
- it sorts those checksums (this is actually implemented with a max-heap)
- it computes its own node name checksum, and queries the next 10 (per default)
nodes in the sorted checksums list

Thanks to this, every node is making queries to the same 10 nodes, unless one
of those nodes disappears, in which case kubenurse will pick the next node in
the sorted checksums list. This comes with several advantages:

- because of the way we first hash the node names, the checks distribution is
randomly distributed, independant of the node names. if we only picked the 10
next nodes in a sorted list of the node names, then we might have biased the
results in environments where node names are sequential
- metrics-wise, a `kubenurse` pod should typically only have entries for ca. 10
other neighbouring nodes worth of checks, which greatly reduces the load on
your monitoring infrastructure
- because we use a deterministic algorithm to choose which nodes to query, the
metrics churn rate stays minimal. (that is, if we randomly picked 10 nodes
for every check, then in the end there would be one prometheus bucket for
every node on the cluster, which would put useless load on the monitoring
infrastructure)

Per default, the neighbourhood filtering is set to 10 nodes, which means that
on cluster with more than 10 nodes, each kubenurse will query 10 nodes, as
described above.

To bypass the node filtering feature, you simply need to set the
`KUBENURSE_NEIGHBOUR_LIMIT` environment variable to 0.

## Metrics

All performed checks expose metrics which can be used to monitor/alert:

- SDN network latencies and errors
Expand All @@ -214,5 +259,6 @@ All performed checks expose metrics which can be used to monitor/alert:
- External DNS resolution errors (ingress URL resolution)

At `/metrics` you will find these:

- `kubenurse_errors_total`: Kubenurse error counter partitioned by error type
- `kubenurse_request_duration`: a histogram for Kubenurse request duration partitioned by error type
2 changes: 2 additions & 0 deletions helm/kubenurse/templates/daemonset.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,8 @@ spec:
value: {{ .Release.Namespace }}
- name: KUBENURSE_NEIGHBOUR_FILTER
value: {{ .Values.neighbour_filter }}
- name: KUBENURSE_NEIGHBOUR_LIMIT
value: {{ .Values.neighbour_limit | quote }}
{{- if .Values.extra_ca }}
- name: KUBENURSE_EXTRA_CA
value: {{ .Values.extra_ca }}
Expand Down
2 changes: 2 additions & 0 deletions helm/kubenurse/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,8 @@ service_url: ""
allow_unschedulable: false
# KUBENURSE_NEIGHBOUR_FILTER
neighbour_filter: app.kubernetes.io/name=kubenurse
# KUBENURSE_NEIGHBOUR_LIMIT
neighbour_limit: 10
# KUBENURSE_EXTRA_CA
extra_ca: ""
# KUBENURSE_CHECK_API_SERVER_DIRECT
Expand Down
2 changes: 2 additions & 0 deletions internal/kubenurse/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,8 @@ func New(ctx context.Context, c client.Client) (*Server, error) { //nolint:funle
if err != nil {
return nil, err
}
} else {
chk.NeighbourLimit = 10
}

//nolint:goconst // No need to make "false" a constant in my opinion, readability is better like this.
Expand Down

0 comments on commit bd1ee9f

Please sign in to comment.