tags: addons, metrics, metrics-server
注意:
- 如果没有特殊指明,本文档的所有操作均在 zhangjun-k8s01 节点上执行;
- kuberntes 自带插件的 manifests yaml 文件使用 gcr.io 的 docker registry,国内被墙,需要手动替换为其它 registry 地址(本文档未替换);
metrics-server 通过 kube-apiserver 发现所有节点,然后调用 kubelet APIs(通过 https 接口)获得各节点(Node)和 Pod 的 CPU、Memory 等资源使用情况。
从 Kubernetes 1.12 开始,kubernetes 的安装脚本移除了 Heapster,从 1.13 开始完全移除了对 Heapster 的支持,Heapster 不再被维护。
替代方案如下:
- 用于支持自动扩缩容的 CPU/memory HPA metrics:metrics-server;
- 通用的监控方案:使用第三方可以获取 Prometheus 格式监控指标的监控系统,如 Prometheus Operator;
- 事件传输:使用第三方工具来传输、归档 kubernetes events;
Kubernetes Dashboard 还不支持 metrics-server(PR:#3504),如果使用 metrics-server 替代 Heapster,将无法在 dashboard 中以图形展示 Pod 的内存和 CPU 情况,需要通过 Prometheus、Grafana 等监控方案来弥补。
注意:如果没有特殊指明,本文档的所有操作均在 zhangjun-k8s01 节点上执行。
从 github clone 源码:
$ cd /opt/k8s/work/
$ git clone https://github.com/kubernetes-incubator/metrics-server.git
$ cd metrics-server/deploy/1.8+/
$ ls
aggregated-metrics-reader.yaml auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml
修改 metrics-server-deployment.yaml
文件,为 metrics-server 添加三个命令行参数:
$ diff metrics-server-deployment.yaml.orig metrics-server-deployment.yaml
32a33,36
> args:
> - --metric-resolution=30s
> - --requestheader-allowed-names=aggregator
> - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
- --metric-resolution=30s:从 kubelet 采集数据的周期;
- --requestheader-allowed-names=aggregator:允许请求 metrics-server API 的用户名,该名称与 kube-apiserver 的
--proxy-client-cert-file
指定的证书 CN 一致; - --kubelet-preferred-address-types:优先使用 InternalIP 来访问 kubelet,这样可以避免节点名称没有 DNS 解析记录时,通过节点名称调用节点 kubelet API 失败的情况(未配置时默认的情况);
部署 metrics-server:
$ cd /opt/k8s/work/metrics-server/deploy/1.8+/
$ kubectl create -f .
$ kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-7cffff65bc-hkfr7 1/1 Running 0 56s
$ kubectl get svc -n kube-system metrics-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
metrics-server ClusterIP 10.254.37.139 <none> 443/TCP 94s
$ docker run -it --rm k8s.gcr.io/metrics-server-amd64:v0.3.3 --help
Launch metrics-server
Usage:
[flags]
Flags:
--alsologtostderr log to standard error as well as files
--authentication-kubeconfig string kubeconfig file pointing at the 'core' kubernetes server with enough rights to create tokenaccessreviews.authentication.k8s.io.
--authentication-skip-lookup If false, the authentication-kubeconfig will be used to lookup missing authentication configuration from the clust
er.
--authentication-token-webhook-cache-ttl duration The duration to cache responses from the webhook token authenticator. (default 10s)
--authentication-tolerate-lookup-failure If true, failures to look up missing authentication configuration from the cluster are not considered fatal. Note that this can result in authentication that treats all requests as anonymous.
--authorization-always-allow-paths strings A list of HTTP paths to skip during authorization, i.e. these are authorized without contacting the 'core' kubernetes server.
--authorization-kubeconfig string kubeconfig file pointing at the 'core' kubernetes server with enough rights to create subjectaccessreviews.authorization.k8s.io.
--authorization-webhook-cache-authorized-ttl duration The duration to cache 'authorized' responses from the webhook authorizer. (default 10s)
--authorization-webhook-cache-unauthorized-ttl duration The duration to cache 'unauthorized' responses from the webhook authorizer. (default 10s)
--bind-address ip The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the
rest of the cluster, and by CLI/web clients. If blank, all interfaces will be used (0.0.0.0 for all IPv4 interfaces and :: for all IPv6 interfaces). (default 0.0.0.0)
--cert-dir string The directory where the TLS certs are located. If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored. (default "apiserver.local.config/certificates")
--client-ca-file string If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate.
--contention-profiling Enable lock contention profiling, if profiling is enabled
-h, --help help for this command
--http2-max-streams-per-connection int The limit that the server gives to clients for the maximum number of streams in an HTTP/2 connection. Zero means t
o use golang's default.
--kubeconfig string The path to the kubeconfig used to connect to the Kubernetes API server and the Kubelets (defaults to in-cluster c
onfig)
--kubelet-certificate-authority string Path to the CA to use to validate the Kubelet's serving certificates.
--kubelet-insecure-tls Do not verify CA of serving certificates presented by Kubelets. For testing purposes only.
--kubelet-port int The port to use to connect to Kubelets. (default 10250)
--kubelet-preferred-address-types strings The priority of node address types to use when determining which address to use to connect to a particular node (d
efault [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
--log-flush-frequency duration Maximum number of seconds between log flushes (default 5s)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--log_file string If non-empty, use this log file
--logtostderr log to standard error instead of files (default true)
--metric-resolution duration The resolution at which metrics-server will retain metrics. (default 1m0s)
--profiling Enable profiling via web interface host:port/debug/pprof/ (default true)
--requestheader-allowed-names strings List of client certificate common names to allow to provide usernames in headers specified by --requestheader-user
name-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed.
--requestheader-client-ca-file string Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in hea
ders specified by --requestheader-username-headers. WARNING: generally do not depend on authorization being already done for incoming requests.
--requestheader-extra-headers-prefix strings List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-])
--requestheader-group-headers strings List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group])
--requestheader-username-headers strings List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user])
--secure-port int The port on which to serve HTTPS with authentication and authorization.If 0, don't serve HTTPS at all. (default 443)
--skip_headers If true, avoid header prefixes in the log messages
--stderrthreshold severity logs at or above this threshold go to stderr
--tls-cert-file string File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir.
--tls-cipher-suites strings Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be use. Possible values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_RC4_128_SHA,TLS_RSA_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_RC4_128_SHA
--tls-min-version string Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12
--tls-private-key-file string File containing the default x509 private key matching --tls-cert-file.
--tls-sni-cert-key namedCertKey A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.crt,example.key" or "foo.crt,foo.key:*.foo.com,foo.com". (default [])
-v, --v Level number for the log level verbosity
--vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
-
通过 kube-apiserver 或 kubectl proxy 访问:
https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/nodes https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/nodes/ https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/pods https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/namespace//pods/
-
直接使用 kubectl 命令访问:
kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes kubectl get --raw apis/metrics.k8s.io/v1beta1/pods kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes/ kubectl get --raw apis/metrics.k8s.io/v1beta1/namespace//pods/
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1" | jq .
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "metrics.k8s.io/v1beta1",
"resources": [
{
"name": "nodes",
"singularName": "",
"namespaced": false,
"kind": "NodeMetrics",
"verbs": [
"get",
"list"
]
},
{
"name": "pods",
"singularName": "",
"namespaced": true,
"kind": "PodMetrics",
"verbs": [
"get",
"list"
]
}
]
}
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
{
"kind": "NodeMetricsList",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
},
"items": [
{
"metadata": {
"name": "zhangjun-k8s01",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s01",
"creationTimestamp": "2019-05-26T10:55:10Z"
},
"timestamp": "2019-05-26T10:54:52Z",
"window": "30s",
"usage": {
"cpu": "311155148n",
"memory": "2881016Ki"
}
},
{
"metadata": {
"name": "zhangjun-k8s02",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s02",
"creationTimestamp": "2019-05-26T10:55:10Z"
},
"timestamp": "2019-05-26T10:54:54Z",
"window": "30s",
"usage": {
"cpu": "253796835n",
"memory": "1028836Ki"
}
},
{
"metadata": {
"name": "zhangjun-k8s03",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s03",
"creationTimestamp": "2019-05-26T10:55:10Z"
},
"timestamp": "2019-05-26T10:54:54Z",
"window": "30s",
"usage": {
"cpu": "280441339n",
"memory": "1072772Ki"
}
}
]
}
- /apis/metrics.k8s.io/v1beta1/nodes 和 /apis/metrics.k8s.io/v1beta1/pods 返回的 usage 包含 CPU 和 Memory;
kubectl top 命令从 metrics-server 获取集群节点基本的指标信息:
$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
zhangjun-k8s01 296m 3% 2804Mi 17%
zhangjun-k8s02 312m 3% 1161Mi 7%
zhangjun-k8s03 252m 3% 1159Mi 7%
- https://kubernetes.feisky.xyz/zh/addons/metrics.html
- metrics-server RBAC:kubernetes-sigs/metrics-server#40
- metrics-server 参数:kubernetes-sigs/metrics-server#25
- https://kubernetes.io/docs/tasks/debug-application-cluster/core-metrics-pipeline/
- metrics-server 的 APIs 文档。