forked from influxdata/telegraf
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add kube_inventory input plugin (influxdata#5110)
- Loading branch information
1 parent
6be5b55
commit 94de223
Showing
21 changed files
with
1,928 additions
and
0 deletions.
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,238 @@ | ||
# Kube_Inventory Plugin | ||
This plugin generates metrics derived from the state of the following Kubernetes resources: | ||
- daemonsets | ||
- deployments | ||
- nodes | ||
- persistentvolumes | ||
- persistentvolumeclaims | ||
- pods (containers) | ||
- statefulsets | ||
|
||
#### Series Cardinality Warning | ||
|
||
This plugin may produce a high number of series which, when not controlled | ||
for, will cause high load on your database. Use the following techniques to | ||
avoid cardinality issues: | ||
|
||
- Use [metric filtering][] options to exclude unneeded measurements and tags. | ||
- Write to a database with an appropriate [retention policy][]. | ||
- Limit series cardinality in your database using the | ||
[max-series-per-database][] and [max-values-per-tag][] settings. | ||
- Consider using the [Time Series Index][tsi]. | ||
- Monitor your databases [series cardinality][]. | ||
- Consult the [InfluxDB documentation][influx-docs] for the most up-to-date techniques. | ||
|
||
### Configuration: | ||
|
||
```toml | ||
[[inputs.kube_inventory]] | ||
## URL for the Kubernetes API | ||
url = "https://127.0.0.1" | ||
|
||
## Namespace to use | ||
# namespace = "default" | ||
|
||
## Use bearer token for authorization. ('bearer_token' takes priority) | ||
# bearer_token = "/path/to/bearer/token" | ||
## OR | ||
# bearer_token_string = "abc_123" | ||
|
||
## Set response_timeout (default 5 seconds) | ||
# response_timeout = "5s" | ||
|
||
## Optional Resources to exclude from gathering | ||
## Leave them with blank with try to gather everything available. | ||
## Values can be - "daemonsets", deployments", "nodes", "persistentvolumes", | ||
## "persistentvolumeclaims", "pods", "statefulsets" | ||
# resource_exclude = [ "deployments", "nodes", "statefulsets" ] | ||
|
||
## Optional Resources to include when gathering | ||
## Overrides resource_exclude if both set. | ||
# resource_include = [ "deployments", "nodes", "statefulsets" ] | ||
|
||
## Optional TLS Config | ||
# tls_ca = "/path/to/cafile" | ||
# tls_cert = "/path/to/certfile" | ||
# tls_key = "/path/to/keyfile" | ||
## Use TLS but skip chain & host verification | ||
# insecure_skip_verify = false | ||
``` | ||
|
||
#### Kubernetes Permissions | ||
|
||
If using [RBAC authorization](https://kubernetes.io/docs/reference/access-authn-authz/rbac/), you will need to create a cluster role to list "persistentvolumes" and "nodes". You will then need to make an [aggregated ClusterRole](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#aggregated-clusterroles) that will eventually be bound to a user or group. | ||
```yaml | ||
--- | ||
kind: ClusterRole | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
metadata: | ||
name: influx:cluster:viewer | ||
labels: | ||
rbac.authorization.k8s.io/aggregate-view-telegraf: "true" | ||
rules: | ||
- apiGroups: [""] | ||
resources: ["persistentvolumes","nodes"] | ||
verbs: ["get","list"] | ||
|
||
--- | ||
kind: ClusterRole | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
metadata: | ||
name: influx:telegraf | ||
aggregationRule: | ||
clusterRoleSelectors: | ||
- matchLabels: | ||
rbac.authorization.k8s.io/aggregate-view-telegraf: "true" | ||
rbac.authorization.k8s.io/aggregate-to-view: "true" | ||
rules: [] # Rules are automatically filled in by the controller manager. | ||
``` | ||
Bind the newly created aggregated ClusterRole with the following config file, updating the subjects as needed. | ||
```yaml | ||
--- | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRoleBinding | ||
metadata: | ||
name: influx:telegraf:viewer | ||
roleRef: | ||
apiGroup: rbac.authorization.k8s.io | ||
kind: ClusterRole | ||
name: influx:telegraf | ||
subjects: | ||
- kind: ServiceAccount | ||
name: telegraf | ||
namespace: default | ||
``` | ||
### Metrics: | ||
+ kubernetes_daemonset | ||
- tags: | ||
- daemonset_name | ||
- namespace | ||
- fields: | ||
- generation | ||
- current_number_scheduled | ||
- desired_number_scheduled | ||
- number_available | ||
- number_misscheduled | ||
- number_ready | ||
- number_unavailable | ||
- updated_number_scheduled | ||
- kubernetes_deployment | ||
- tags: | ||
- deployment_name | ||
- namespace | ||
- fields: | ||
- replicas_available | ||
- replicas_unavailable | ||
- created | ||
+ kubernetes_node | ||
- tags: | ||
- node_name | ||
- fields: | ||
- capacity_cpu_cores | ||
- capacity_memory_bytes | ||
- capacity_pods | ||
- allocatable_cpu_cores | ||
- allocatable_memory_bytes | ||
- allocatable_pods | ||
- kubernetes_persistentvolume | ||
- tags: | ||
- pv_name | ||
- phase | ||
- storageclass | ||
- fields: | ||
- phase_type (int, [see below](#pv-phase_type)) | ||
+ kubernetes_persistentvolumeclaim | ||
- tags: | ||
- pvc_name | ||
- namespace | ||
- phase | ||
- storageclass | ||
- fields: | ||
- phase_type (int, [see below](#pvc-phase_type)) | ||
- kubernetes_pod_container | ||
- tags: | ||
- container_name | ||
- namespace | ||
- node_name | ||
- pod_name | ||
- fields: | ||
- restarts_total | ||
- state | ||
- terminated_reason | ||
- resource_requests_cpu_units | ||
- resource_requests_memory_bytes | ||
- resource_limits_cpu_units | ||
- resource_limits_memory_bytes | ||
+ kubernetes_statefulset | ||
- tags: | ||
- statefulset_name | ||
- namespace | ||
- fields: | ||
- created | ||
- generation | ||
- replicas | ||
- replicas_current | ||
- replicas_ready | ||
- replicas_updated | ||
- spec_replicas | ||
- observed_generation | ||
#### pv `phase_type` | ||
|
||
The persistentvolume "phase" is saved in the `phase` tag with a correlated numeric field called `phase_type` corresponding with that tag value. | ||
|
||
|Tag value |Corresponding field value| | ||
-----------|-------------------------| | ||
|bound | 0 | | ||
|failed | 1 | | ||
|pending | 2 | | ||
|released | 3 | | ||
|available | 4 | | ||
|unknown | 5 | | ||
|
||
#### pvc `phase_type` | ||
|
||
The persistentvolumeclaim "phase" is saved in the `phase` tag with a correlated numeric field called `phase_type` corresponding with that tag value. | ||
|
||
|Tag value |Corresponding field value| | ||
-----------|-------------------------| | ||
|bound | 0 | | ||
|lost | 1 | | ||
|pending | 2 | | ||
|unknown | 3 | | ||
|
||
|
||
### Example Output: | ||
|
||
``` | ||
kubernetes_configmap,configmap_name=envoy-config,namespace=default,resource_version=56593031 created=1544103867000000000i 1547597616000000000 | ||
kubernetes_daemonset | ||
kubernetes_deployment,deployment_name=deployd,namespace=default replicas_unavailable=0i,created=1544103082000000000i,replicas_available=1i 1547597616000000000 | ||
kubernetes_node,node_name=ip-172-17-0-2.internal allocatable_pods=110i,capacity_memory_bytes=128837533696,capacity_pods=110i,capacity_cpu_cores=16i,allocatable_cpu_cores=16i,allocatable_memory_bytes=128732676096 1547597616000000000 | ||
kubernetes_persistentvolume,phase=Released,pv_name=pvc-aaaaaaaa-bbbb-cccc-1111-222222222222,storageclass=ebs-1-retain phase_type=3i 1547597616000000000 | ||
kubernetes_persistentvolumeclaim,namespace=default,phase=Bound,pvc_name=data-etcd-0,storageclass=ebs-1-retain phase_type=0i 1547597615000000000 | ||
kubernetes_pod,namespace=default,node_name=ip-172-17-0-2.internal,pod_name=tick1 last_transition_time=1547578322000000000i,ready="false" 1547597616000000000 | ||
kubernetes_pod_container,container_name=telegraf,namespace=default,node_name=ip-172-17-0-2.internal,pod_name=tick1,state=running resource_requests_cpu_units=0.1,resource_limits_memory_bytes=524288000,resource_limits_cpu_units=0.5,restarts_total=0i,state_code=0i,terminated_reason="",resource_requests_memory_bytes=524288000 1547597616000000000 | ||
kubernetes_statefulset,namespace=default,statefulset_name=etcd replicas_updated=3i,spec_replicas=3i,observed_generation=1i,created=1544101669000000000i,generation=1i,replicas=3i,replicas_current=3i,replicas_ready=3i 1547597616000000000 | ||
``` | ||
[metric filtering]: https://github.com/influxdata/telegraf/blob/master/docs/CONFIGURATION.md#metric-filtering | ||
[retention policy]: https://docs.influxdata.com/influxdb/latest/guides/downsampling_and_retention/ | ||
[max-series-per-database]: https://docs.influxdata.com/influxdb/latest/administration/config/#max-series-per-database-1000000 | ||
[max-values-per-tag]: https://docs.influxdata.com/influxdb/latest/administration/config/#max-values-per-tag-100000 | ||
[tsi]: https://docs.influxdata.com/influxdb/latest/concepts/time-series-index/ | ||
[series cardinality]: https://docs.influxdata.com/influxdb/latest/query_language/spec/#show-cardinality | ||
[influx-docs]: https://docs.influxdata.com/influxdb/latest/ | ||
[k8s-telegraf]: https://www.influxdata.com/blog/monitoring-kubernetes-architecture/ | ||
[tick-charts]: https://github.com/influxdata/tick-charts |
Oops, something went wrong.