Skip to content

Latest commit

 

History

History
261 lines (223 loc) · 15.7 KB

09-3.metrics-server插件.md

File metadata and controls

261 lines (223 loc) · 15.7 KB

tags: addons, metrics, metrics-server

09-4.部署 metrics-server 插件

注意:

  1. 如果没有特殊指明,本文档的所有操作均在 zhangjun-k8s01 节点上执行
  2. kuberntes 自带插件的 manifests yaml 文件使用 gcr.io 的 docker registry,国内被墙,需要手动替换为其它 registry 地址(本文档未替换);

metrics-server 通过 kube-apiserver 发现所有节点,然后调用 kubelet APIs(通过 https 接口)获得各节点(Node)和 Pod 的 CPU、Memory 等资源使用情况。

从 Kubernetes 1.12 开始,kubernetes 的安装脚本移除了 Heapster,从 1.13 开始完全移除了对 Heapster 的支持,Heapster 不再被维护。

替代方案如下:

  1. 用于支持自动扩缩容的 CPU/memory HPA metrics:metrics-server;
  2. 通用的监控方案:使用第三方可以获取 Prometheus 格式监控指标的监控系统,如 Prometheus Operator;
  3. 事件传输:使用第三方工具来传输、归档 kubernetes events;

Kubernetes Dashboard 还不支持 metrics-server(PR:#3504),如果使用 metrics-server 替代 Heapster,将无法在 dashboard 中以图形展示 Pod 的内存和 CPU 情况,需要通过 Prometheus、Grafana 等监控方案来弥补。

注意:如果没有特殊指明,本文档的所有操作均在 zhangjun-k8s01 节点上执行

监控架构

monitoring_architecture.png

安装 metrics-server

从 github clone 源码:

$ cd /opt/k8s/work/
$ git clone https://github.com/kubernetes-incubator/metrics-server.git
$ cd metrics-server/deploy/1.8+/
$ ls
aggregated-metrics-reader.yaml  auth-delegator.yaml  auth-reader.yaml  metrics-apiservice.yaml  metrics-server-deployment.yaml  metrics-server-service.yaml  resource-reader.yaml

修改 metrics-server-deployment.yaml 文件,为 metrics-server 添加三个命令行参数:

$ diff metrics-server-deployment.yaml.orig metrics-server-deployment.yaml
32a33,36
>         args:
>         - --metric-resolution=30s
>         - --requestheader-allowed-names=aggregator
>         - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
  • --metric-resolution=30s:从 kubelet 采集数据的周期;
  • --requestheader-allowed-names=aggregator:允许请求 metrics-server API 的用户名,该名称与 kube-apiserver 的 --proxy-client-cert-file 指定的证书 CN 一致;
  • --kubelet-preferred-address-types:优先使用 InternalIP 来访问 kubelet,这样可以避免节点名称没有 DNS 解析记录时,通过节点名称调用节点 kubelet API 失败的情况(未配置时默认的情况);

部署 metrics-server:

$ cd /opt/k8s/work/metrics-server/deploy/1.8+/
$ kubectl create -f .

查看运行情况

$  kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-7cffff65bc-hkfr7   1/1     Running   0          56s

$  kubectl get svc -n kube-system  metrics-server
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
metrics-server   ClusterIP   10.254.37.139   <none>        443/TCP   94s

metrics-server 的命令行参数

$ docker run -it --rm k8s.gcr.io/metrics-server-amd64:v0.3.3 --help
Launch metrics-server

Usage:
   [flags]

Flags:
      --alsologtostderr                                         log to standard error as well as files
      --authentication-kubeconfig string                        kubeconfig file pointing at the 'core' kubernetes server with enough rights to create tokenaccessreviews.authentication.k8s.io.
      --authentication-skip-lookup                              If false, the authentication-kubeconfig will be used to lookup missing authentication configuration from the clust
er.
      --authentication-token-webhook-cache-ttl duration         The duration to cache responses from the webhook token authenticator. (default 10s)
      --authentication-tolerate-lookup-failure                  If true, failures to look up missing authentication configuration from the cluster are not considered fatal. Note that this can result in authentication that treats all requests as anonymous.
      --authorization-always-allow-paths strings                A list of HTTP paths to skip during authorization, i.e. these are authorized without contacting the 'core' kubernetes server.
      --authorization-kubeconfig string                         kubeconfig file pointing at the 'core' kubernetes server with enough rights to create subjectaccessreviews.authorization.k8s.io.
      --authorization-webhook-cache-authorized-ttl duration     The duration to cache 'authorized' responses from the webhook authorizer. (default 10s)
      --authorization-webhook-cache-unauthorized-ttl duration   The duration to cache 'unauthorized' responses from the webhook authorizer. (default 10s)
      --bind-address ip                                         The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the
 rest of the cluster, and by CLI/web clients. If blank, all interfaces will be used (0.0.0.0 for all IPv4 interfaces and :: for all IPv6 interfaces). (default 0.0.0.0)
      --cert-dir string                                         The directory where the TLS certs are located. If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored. (default "apiserver.local.config/certificates")
      --client-ca-file string                                   If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate.
      --contention-profiling                                    Enable lock contention profiling, if profiling is enabled
  -h, --help                                                    help for this command
      --http2-max-streams-per-connection int                    The limit that the server gives to clients for the maximum number of streams in an HTTP/2 connection. Zero means t
o use golang's default.
      --kubeconfig string                                       The path to the kubeconfig used to connect to the Kubernetes API server and the Kubelets (defaults to in-cluster c
onfig)
      --kubelet-certificate-authority string                    Path to the CA to use to validate the Kubelet's serving certificates.
      --kubelet-insecure-tls                                    Do not verify CA of serving certificates presented by Kubelets.  For testing purposes only.
      --kubelet-port int                                        The port to use to connect to Kubelets. (default 10250)
      --kubelet-preferred-address-types strings                 The priority of node address types to use when determining which address to use to connect to a particular node (d
efault [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
      --log-flush-frequency duration                            Maximum number of seconds between log flushes (default 5s)
      --log_backtrace_at traceLocation                          when logging hits line file:N, emit a stack trace (default :0)
      --log_dir string                                          If non-empty, write log files in this directory
      --log_file string                                         If non-empty, use this log file
      --logtostderr                                             log to standard error instead of files (default true)
      --metric-resolution duration                              The resolution at which metrics-server will retain metrics. (default 1m0s)
      --profiling                                               Enable profiling via web interface host:port/debug/pprof/ (default true)
      --requestheader-allowed-names strings                     List of client certificate common names to allow to provide usernames in headers specified by --requestheader-user
name-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed.
      --requestheader-client-ca-file string                     Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in hea
ders specified by --requestheader-username-headers. WARNING: generally do not depend on authorization being already done for incoming requests.
      --requestheader-extra-headers-prefix strings              List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-])
      --requestheader-group-headers strings                     List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group])
      --requestheader-username-headers strings                  List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user])
      --secure-port int                                         The port on which to serve HTTPS with authentication and authorization.If 0, don't serve HTTPS at all. (default 443)
      --skip_headers                                            If true, avoid header prefixes in the log messages
      --stderrthreshold severity                                logs at or above this threshold go to stderr
      --tls-cert-file string                                    File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir.
      --tls-cipher-suites strings                               Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be use.  Possible values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_RC4_128_SHA,TLS_RSA_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_RC4_128_SHA
      --tls-min-version string                                  Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12
      --tls-private-key-file string                             File containing the default x509 private key matching --tls-cert-file.
      --tls-sni-cert-key namedCertKey                           A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.crt,example.key" or "foo.crt,foo.key:*.foo.com,foo.com". (default [])
  -v, --v Level                                                 number for the log level verbosity
      --vmodule moduleSpec                                      comma-separated list of pattern=N settings for file-filtered logging

查看 metrics-server 输出的 metrics

  1. 通过 kube-apiserver 或 kubectl proxy 访问:

    https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/nodes https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/nodes/ https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/pods https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/namespace//pods/

  2. 直接使用 kubectl 命令访问:

    kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes kubectl get --raw apis/metrics.k8s.io/v1beta1/pods kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes/ kubectl get --raw apis/metrics.k8s.io/v1beta1/namespace//pods/

$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1" | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "nodes",
      "singularName": "",
      "namespaced": false,
      "kind": "NodeMetrics",
      "verbs": [
        "get",
        "list"
      ]
    },
    {
      "name": "pods",
      "singularName": "",
      "namespaced": true,
      "kind": "PodMetrics",
      "verbs": [
        "get",
        "list"
      ]
    }
  ]
}

$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "zhangjun-k8s01",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s01",
        "creationTimestamp": "2019-05-26T10:55:10Z"
      },
      "timestamp": "2019-05-26T10:54:52Z",
      "window": "30s",
      "usage": {
        "cpu": "311155148n",
        "memory": "2881016Ki"
      }
    },
    {
      "metadata": {
        "name": "zhangjun-k8s02",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s02",
        "creationTimestamp": "2019-05-26T10:55:10Z"
      },
      "timestamp": "2019-05-26T10:54:54Z",
      "window": "30s",
      "usage": {
        "cpu": "253796835n",
        "memory": "1028836Ki"
      }
    },
    {
      "metadata": {
        "name": "zhangjun-k8s03",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s03",
        "creationTimestamp": "2019-05-26T10:55:10Z"
      },
      "timestamp": "2019-05-26T10:54:54Z",
      "window": "30s",
      "usage": {
        "cpu": "280441339n",
        "memory": "1072772Ki"
      }
    }
  ]
}
  • /apis/metrics.k8s.io/v1beta1/nodes 和 /apis/metrics.k8s.io/v1beta1/pods 返回的 usage 包含 CPU 和 Memory;

使用 kubectl top 命令查看集群节点资源使用情况

kubectl top 命令从 metrics-server 获取集群节点基本的指标信息:

$ kubectl top node
NAME             CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
zhangjun-k8s01   296m         3%     2804Mi          17%
zhangjun-k8s02   312m         3%     1161Mi          7%
zhangjun-k8s03   252m         3%     1159Mi          7%

参考

  1. https://kubernetes.feisky.xyz/zh/addons/metrics.html
  2. metrics-server RBAC:kubernetes-sigs/metrics-server#40
  3. metrics-server 参数:kubernetes-sigs/metrics-server#25
  4. https://kubernetes.io/docs/tasks/debug-application-cluster/core-metrics-pipeline/
  5. metrics-server 的 APIs 文档