Apply 2024.1 backports from 2023.1 #172

Merged (11 commits) on Jul 5, 2024
38 changes: 30 additions & 8 deletions doc/source/user/index.rst
@@ -463,6 +463,12 @@ the table are linked to more details elsewhere in the user guide.
+---------------------------------------+--------------------+---------------+
| `octavia_lb_healthcheck`_ | see below | true |
+---------------------------------------+--------------------+---------------+
| `extra_network`_ | see below | "" |
+---------------------------------------+--------------------+---------------+
| `extra_subnet`_ | see below | "" |
+---------------------------------------+--------------------+---------------+
| `extra_security_group`_ | see below | see below |
+---------------------------------------+--------------------+---------------+

.. _cluster:

@@ -1175,13 +1181,14 @@ _`container_infra_prefix`

Images that might be needed if 'monitoring_enabled' is 'true':

* quay.io/prometheus/alertmanager:v0.20.0
* docker.io/squareup/ghostunnel:v1.5.2
* docker.io/jettech/kube-webhook-certgen:v1.0.0
* quay.io/coreos/prometheus-operator:v0.37.0
* quay.io/coreos/configmap-reload:v0.0.1
* quay.io/coreos/prometheus-config-reloader:v0.37.0
* quay.io/prometheus/prometheus:v2.15.2
* quay.io/prometheus/alertmanager:v0.21.0
* docker.io/jettech/kube-webhook-certgen:v1.5.0
* quay.io/prometheus-operator/prometheus-operator:v0.44.0
* docker.io/jimmidyson/configmap-reload:v0.4.0
* quay.io/prometheus-operator/prometheus-config-reloader:v0.44.0
* quay.io/prometheus/prometheus:v2.22.1
* quay.io/prometheus/node-exporter:v1.0.1
* docker.io/directxman12/k8s-prometheus-adapter:v0.8.2

Images that might be needed if 'cinder_csi_enabled' is 'true':

@@ -1548,6 +1555,22 @@ _`octavia_lb_healthcheck`
If true, enable Octavia load balancer healthcheck
Default: true

_`extra_network`
Optional additional network name or UUID to add to cluster nodes.
When not specified, additional networks are not added. Optionally specify
'extra_subnet' if you wish to use a specific subnet on the network.
Default: ""

_`extra_subnet`
Optional additional subnet name or UUID to add to cluster nodes.
Only used when 'extra_network' is defined.
Default: ""

_`extra_security_group`
Optional additional security group name or UUID to add to the network port.
Only used when 'extra_network' is defined.
Default: cluster node default security group.
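
As an illustration, these labels can be supplied when creating a cluster;
the network, subnet and security group names below are placeholders for
resources that must already exist::

    openstack coe cluster create my-cluster \
        --cluster-template my-template \
        --merge-labels \
        --labels extra_network=appliance-net,extra_subnet=appliance-subnet,extra_security_group=appliance-sg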

Supported versions
------------------

@@ -2297,7 +2320,6 @@ _`calico_tag`
Victoria default: v3.13.1
Wallaby default: v3.13.1


Besides, the Calico network driver needs kube_tag with v1.9.3 or later, because
Calico needs extra mounts for the kubelet container. See `commit
<https://github.com/projectatomic/atomic-system-containers/commit/54ab8abc7fa1bfb6fa674f55cd0c2fa0c812fd36>`_
21 changes: 13 additions & 8 deletions doc/source/user/monitoring.rst
@@ -33,13 +33,15 @@ _`metrics_server_enabled`

_`monitoring_enabled`
Enable installation of cluster monitoring solution provided by the
stable/prometheus-operator helm chart.
prometheus-community/kube-prometheus-stack helm chart.
To use this service, tiller_enabled must be true when using
helm_client_tag<v3.0.0.
Default: false

_`prometheus_adapter_enabled`
Enable installation of cluster custom metrics provided by the
stable/prometheus-adapter helm chart. This service depends on
monitoring_enabled.
prometheus-community/prometheus-adapter helm chart.
This service depends on monitoring_enabled.
Default: true

To control deployed versions, extra labels are available:
@@ -52,14 +54,17 @@ _`metrics_server_chart_tag`

_`prometheus_operator_chart_tag`
Add prometheus_operator_chart_tag to select version of the
stable/prometheus-operator chart to install. When installing the chart,
helm will use the default values of the tag defined and overwrite them based
on the prometheus-operator-config ConfigMap currently defined. You must
certify that the versions are compatible.
prometheus-community/kube-prometheus-stack chart to install.
When installing the chart, helm will use the default values of the tag
defined and overwrite them based on the prometheus-operator-config
ConfigMap currently defined.
You must certify that the versions are compatible.
Wallaby-default: 17.2.0

_`prometheus_adapter_chart_tag`
The stable/prometheus-adapter helm chart version to use.
The prometheus-community/prometheus-adapter helm chart version to use.
Train-default: 1.4.0
Wallaby-default: 2.12.1
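
For example, a cluster template that enables monitoring and pins both chart
versions could be created as follows (the template, image and external
network names here are placeholders)::

    openstack coe cluster template create k8s-monitored \
        --coe kubernetes \
        --image fedora-coreos-latest \
        --external-network public \
        --labels monitoring_enabled=true,prometheus_operator_chart_tag=17.2.0,prometheus_adapter_chart_tag=2.12.1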

Full fledged cluster monitoring
+++++++++++++++++++++++++++++++
@@ -271,7 +271,7 @@ CERT_DIR=/etc/kubernetes/certs

# kube-proxy config
PROXY_KUBECONFIG=/etc/kubernetes/proxy-kubeconfig.yaml
KUBE_PROXY_ARGS="--kubeconfig=${PROXY_KUBECONFIG} --cluster-cidr=${PODS_NETWORK_CIDR} --hostname-override=${INSTANCE_NAME}"
KUBE_PROXY_ARGS="--kubeconfig=${PROXY_KUBECONFIG} --cluster-cidr=${PODS_NETWORK_CIDR} --hostname-override=${INSTANCE_NAME} --metrics-bind-address=0.0.0.0"
cat > /etc/kubernetes/proxy << EOF
KUBE_PROXY_ARGS="${KUBE_PROXY_ARGS} ${KUBEPROXY_OPTIONS}"
EOF
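# With --metrics-bind-address=0.0.0.0 kube-proxy serves its metrics on all
# interfaces (port 10249 by default), so an external Prometheus can scrape
# them; a quick sanity check from the node itself:
#   curl -s http://127.0.0.1:10249/metrics | head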
@@ -406,6 +406,8 @@ KUBE_CONTROLLER_MANAGER_ARGS="--leader-elect=true --kubeconfig=/etc/kubernetes/a
KUBE_CONTROLLER_MANAGER_ARGS="$KUBE_CONTROLLER_MANAGER_ARGS --cluster-name=${CLUSTER_UUID}"
KUBE_CONTROLLER_MANAGER_ARGS="${KUBE_CONTROLLER_MANAGER_ARGS} --allocate-node-cidrs=true"
KUBE_CONTROLLER_MANAGER_ARGS="${KUBE_CONTROLLER_MANAGER_ARGS} --cluster-cidr=${PODS_NETWORK_CIDR}"
KUBE_CONTROLLER_MANAGER_ARGS="${KUBE_CONTROLLER_MANAGER_ARGS} --secure-port=10257"
KUBE_CONTROLLER_MANAGER_ARGS="${KUBE_CONTROLLER_MANAGER_ARGS} --authorization-always-allow-paths=/healthz,/readyz,/livez,/metrics"
KUBE_CONTROLLER_MANAGER_ARGS="$KUBE_CONTROLLER_MANAGER_ARGS $KUBECONTROLLER_OPTIONS"
if [ -n "${ADMISSION_CONTROL_LIST}" ] && [ "${TLS_DISABLED}" == "False" ]; then
KUBE_CONTROLLER_MANAGER_ARGS="$KUBE_CONTROLLER_MANAGER_ARGS --service-account-private-key-file=$CERT_DIR/service_account_private.key --root-ca-file=$CERT_DIR/ca.crt"
@@ -428,7 +430,7 @@ sed -i '
/^KUBE_CONTROLLER_MANAGER_ARGS=/ s#\(KUBE_CONTROLLER_MANAGER_ARGS\).*#\1="'"${KUBE_CONTROLLER_MANAGER_ARGS}"'"#
' /etc/kubernetes/controller-manager

sed -i '/^KUBE_SCHEDULER_ARGS=/ s#=.*#="--leader-elect=true --kubeconfig=/etc/kubernetes/admin.conf"#' /etc/kubernetes/scheduler
sed -i '/^KUBE_SCHEDULER_ARGS=/ s#=.*#="--leader-elect=true --kubeconfig=/etc/kubernetes/admin.conf --authorization-always-allow-paths=/healthz,/readyz,/livez,/metrics "#' /etc/kubernetes/scheduler
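# The --authorization-always-allow-paths flag lets the health and metrics
# endpoints answer without an authorization round-trip, so monitoring can
# reach them anonymously on the secure ports set above, e.g.:
#   curl -ks https://127.0.0.1:10257/healthz   # kube-controller-manager
#   curl -ks https://127.0.0.1:10259/healthz   # kube-scheduler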

$ssh_cmd mkdir -p /etc/kubernetes/manifests
KUBELET_ARGS="--register-node=true --pod-manifest-path=/etc/kubernetes/manifests --hostname-override=${INSTANCE_NAME}"
@@ -497,7 +499,14 @@ KUBELET_ARGS="${KUBELET_ARGS} --client-ca-file=${CERT_DIR}/ca.crt --tls-cert-fil

# specified cgroup driver
KUBELET_ARGS="${KUBELET_ARGS} --cgroup-driver=${CGROUP_DRIVER}"

if [ ${CONTAINER_RUNTIME} = "containerd" ] ; then
# check kubelet version, 1.27.0 dropped docker shim and --container-runtime command line option
KUBELET_VERSION=$($ssh_cmd podman run --rm ${CONTAINER_INFRA_PREFIX:-${HYPERKUBE_PREFIX}}hyperkube:${KUBE_TAG} kubelet --version | awk '{print $2}')
CONTAINER_RUNTIME_REMOTE_DROPPED="v1.27.0"
if [[ "${CONTAINER_RUNTIME_REMOTE_DROPPED}" != $(echo -e "${CONTAINER_RUNTIME_REMOTE_DROPPED}\n${KUBELET_VERSION}" | sort -V | head -n1) && "${KUBELET_VERSION}" != "devel" ]]; then
KUBELET_ARGS="${KUBELET_ARGS} --container-runtime=remote"
fi
KUBELET_ARGS="${KUBELET_ARGS} --runtime-cgroups=/system.slice/containerd.service"
KUBELET_ARGS="${KUBELET_ARGS} --runtime-request-timeout=15m"
KUBELET_ARGS="${KUBELET_ARGS} --container-runtime-endpoint=unix:///run/containerd/containerd.sock"
@@ -278,6 +278,12 @@ KUBELET_ARGS="${KUBELET_ARGS} --client-ca-file=${CERT_DIR}/ca.crt --tls-cert-fil
# specified cgroup driver
KUBELET_ARGS="${KUBELET_ARGS} --cgroup-driver=${CGROUP_DRIVER}"
if [ ${CONTAINER_RUNTIME} = "containerd" ] ; then
# check kubelet version, 1.27.0 dropped docker shim and --container-runtime command line option
KUBELET_VERSION=$($ssh_cmd podman run --rm ${CONTAINER_INFRA_PREFIX:-${HYPERKUBE_PREFIX}}hyperkube:${KUBE_TAG} kubelet --version | awk '{print $2}')
CONTAINER_RUNTIME_REMOTE_DROPPED="v1.27.0"
if [[ "${CONTAINER_RUNTIME_REMOTE_DROPPED}" != $(echo -e "${CONTAINER_RUNTIME_REMOTE_DROPPED}\n${KUBELET_VERSION}" | sort -V | head -n1) && "${KUBELET_VERSION}" != "devel" ]]; then
KUBELET_ARGS="${KUBELET_ARGS} --container-runtime=remote"
fi
KUBELET_ARGS="${KUBELET_ARGS} --runtime-cgroups=/system.slice/containerd.service"
KUBELET_ARGS="${KUBELET_ARGS} --runtime-request-timeout=15m"
KUBELET_ARGS="${KUBELET_ARGS} --container-runtime-endpoint=unix:///run/containerd/containerd.sock"
@@ -74,14 +74,17 @@ data:
Corefile: |
.:53 {
errors
log
health
log stdout
health {
lameduck 5s
}
ready
kubernetes ${DNS_CLUSTER_DOMAIN} ${PORTAL_NETWORK_CIDR} ${PODS_NETWORK_CIDR} {
pods verified
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . /etc/resolv.conf
forward . /run/systemd/resolve/resolv.conf
cache 30
loop
reload
@@ -141,6 +144,9 @@ spec:
readOnly: true
- name: tmp
mountPath: /tmp
- name: resolvconf
mountPath: /run/systemd/resolve/resolv.conf
readOnly: true
ports:
- containerPort: 53
name: dns
@@ -183,6 +189,10 @@ spec:
items:
- key: Corefile
path: Corefile
- name: resolvconf
hostPath:
path: /run/systemd/resolve/resolv.conf
type: File
---
apiVersion: v1
kind: Service
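
Forwarding to /run/systemd/resolve/resolv.conf, which the new hostPath volume
mounts read-only into the pod, sidesteps the resolver loop that CoreDNS's
loop plugin would otherwise detect when the host's /etc/resolv.conf points at
the systemd-resolved stub listener on 127.0.0.53. Upstream resolution can be
spot-checked from a throwaway pod (the image tag is an assumption):

    kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
        nslookup kubernetes.default.svc.cluster.local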
@@ -30,54 +30,50 @@ rules:
resources: ["leases"]
resourceNames: ["cluster-autoscaler"]
verbs: ["get", "update", "patch", "delete"]
# TODO: remove in 1.18; CA uses lease objects for leader election since 1.17
- apiGroups: [""]
resources: ["endpoints"]
resources: ["events", "endpoints"]
verbs: ["create", "patch"]
- apiGroups: [""]
resources: ["pods/eviction"]
verbs: ["create"]
- apiGroups: [""]
resources: ["pods/status"]
verbs: ["update"]
- apiGroups: [""]
resources: ["endpoints"]
resourceNames: ["cluster-autoscaler"]
verbs: ["get", "update", "patch", "delete"]
# accessing & modifying cluster state (nodes & pods)
verbs: ["get", "update"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list", "watch", "update", "patch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "watch"]
verbs: ["watch", "list", "get", "update"]
- apiGroups: [""]
resources: ["pods/eviction"]
verbs: ["create"]
# read-only access to cluster state
- apiGroups: [""]
resources: ["services", "replicationcontrollers", "persistentvolumes", "persistentvolumeclaims"]
verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
resources: ["daemonsets", "replicasets"]
verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
resources: ["statefulsets"]
verbs: ["get", "list", "watch"]
resources:
- "namespaces"
- "pods"
- "services"
- "replicationcontrollers"
- "persistentvolumeclaims"
- "persistentvolumes"
verbs: ["watch", "list", "get"]
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["get", "list", "watch"]
verbs: ["watch", "list", "get"]
- apiGroups: ["policy"]
resources: ["poddisruptionbudgets"]
verbs: ["get", "list", "watch"]
verbs: ["watch", "list"]
- apiGroups: ["apps"]
resources: ["daemonsets", "replicasets", "statefulsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses", "csinodes"]
verbs: ["get", "list", "watch"]
# misc access
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "update", "patch"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
verbs: ["watch", "list", "get"]
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["create"]
verbs: ["create","list","watch"]
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["cluster-autoscaler-status"]
verbs: ["get", "update", "patch", "delete"]
resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
verbs: ["delete", "get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
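
With the role bound, the autoscaler's effective permissions can be
spot-checked with kubectl auth can-i; the service account name and namespace
below are assumptions based on the conventional deployment:

    kubectl auth can-i create pods/eviction \
        --as=system:serviceaccount:kube-system:cluster-autoscaler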
@@ -82,7 +82,7 @@ EOF
curl $VERIFY_CA -X GET \
-H "X-Auth-Token: $USER_TOKEN" \
-H "OpenStack-API-Version: container-infra latest" \
$MAGNUM_URL/certificates/$CLUSTER_UUID | python -c 'import sys, json; print(json.load(sys.stdin)["pem"])' >> $CA_CERT
$MAGNUM_URL/certificates/$CLUSTER_UUID | python -c 'import sys, json; print(json.load(sys.stdin)["pem"])' > $CA_CERT
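# '>' truncates the file rather than appending, so re-running this handler
# cannot accumulate duplicate PEM blocks in $CA_CERT; the fetched CA can be
# inspected with:
#   openssl x509 -in $CA_CERT -noout -subject -enddate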

# Generate client's private key and csr
$ssh_cmd openssl genrsa -out "${_KEY}" 4096
@@ -21,10 +21,11 @@ EOF
cat << EOF >> ${HELM_CHART_DIR}/values.yaml
prometheus-adapter:
image:
repository: ${CONTAINER_INFRA_PREFIX:-docker.io/directxman12/}k8s-prometheus-adapter-${ARCH}
repository: ${CONTAINER_INFRA_PREFIX:-k8s.gcr.io/prometheus-adapter/}prometheus-adapter
priorityClassName: "system-cluster-critical"
prometheus:
url: http://web.tcp.prometheus-prometheus.kube-system.svc.cluster.local
url: http://web.tcp.magnum-kube-prometheus-sta-prometheus.kube-system.svc.cluster.local
path: /prometheus
resources:
requests:
cpu: 150m
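
With the adapter pointed at the kube-prometheus-stack Prometheus service and
its /prometheus path, the custom metrics API it registers should respond once
the pods are up:

    kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | head -c 300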