Allow to configure cgroupsv1 per nodepool #368

Merged: 1 commit, Oct 14, 2024
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

+ ### Changed
+
+ - Allow configuring `cgroups` v1 or v2 compatibility per node pool, instead of the whole cluster. Control plane nodes always use cgroups v2.
+
## [1.0.2] - 2024-09-30

### Added
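The new per-node-pool setting from the changelog entry is configured in the cluster values; a minimal sketch (pool names here are placeholders, mirroring the CI test values added in this PR):

```yaml
global:
  nodePools:
    pool0:
      cgroupsv1: true   # workers in this pool boot with cgroups v1
      replicas: 3
    pool1:
      replicas: 3       # cgroupsv1 defaults to false, i.e. cgroups v2
```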
4 changes: 3 additions & 1 deletion helm/cluster/README.md
@@ -282,7 +282,7 @@ For Giant Swarm internal use only, not stable, or not supported by UIs.
| `internal.advancedConfiguration` | **Advanced configuration** - Advanced configuration of cluster components, to be configured by Giant Swarm staff only.|**Type:** `object`<br/>|
| `internal.advancedConfiguration.appPlatform` | **App Platform** - Advanced configuration of App Platform.|**Type:** `object`<br/>|
| `internal.advancedConfiguration.appPlatform.fluxBackend` | **Flux Backend** - Use Flux as App Platform backend for installing apps.|**Type:** `boolean`<br/>**Default:** `false`|
- | `internal.advancedConfiguration.cgroupsv1` | **CGroups v1** - Force use of CGroups v1 for whole cluster.|**Type:** `boolean`<br/>**Default:** `false`|
+ | `internal.advancedConfiguration.cgroupsv1` | **Cgroups v1** - Force use of cgroups v1 for whole cluster. Deprecated: Use the node pool level setting instead.|**Type:** `boolean`<br/>**Default:** `false`|
| `internal.advancedConfiguration.controlPlane` | **Control plane** - Advanced configuration of control plane components.|**Type:** `object`<br/>|
| `internal.advancedConfiguration.controlPlane.apiServer` | **API server** - Advanced configuration of API server.|**Type:** `object`<br/>|
| `internal.advancedConfiguration.controlPlane.apiServer.additionalAdmissionPlugins` | **Additional admission plugins** - A list of plugins to enable, in addition to the default ones that include DefaultStorageClass, DefaultTolerationSeconds, LimitRanger, MutatingAdmissionWebhook, NamespaceLifecycle, PersistentVolumeClaimResize, Priority, ResourceQuota, ServiceAccount and ValidatingAdmissionWebhook.|**Type:** `array`<br/>|
@@ -405,6 +405,7 @@ Properties within the `.global.nodePools` object
| `global.nodePools.PATTERN` | **Node pool**|**Type:** `object`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
| `global.nodePools.PATTERN.annotations` | **Annotations** - These annotations are added to all Kubernetes resources defining this node pool.|**Type:** `object`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
| `global.nodePools.PATTERN.annotations.PATTERN_2` | **Annotation**|**Type:** `string`<br/>**Key patterns:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>`PATTERN_2`=`^([a-zA-Z0-9/\.-]{1,253}/)?[a-zA-Z0-9/\._-]{1,63}$`<br/>|
+ | `global.nodePools.PATTERN.cgroupsv1` | **Cgroups v1** - Flag that indicates if cgroups v1 should be used.|**Type:** `boolean`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>**Default:** `false`|
| `global.nodePools.PATTERN.customNodeLabels` | **Node labels. Deprecated: use nodeLabels instead.** - Labels that are passed to kubelet argument 'node-labels'.|**Type:** `array`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
| `global.nodePools.PATTERN.customNodeLabels[*]` | **Label**|**Type:** `string`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
| `global.nodePools.PATTERN.customNodeTaints` | **Custom node taints. Deprecated: use nodeTaints instead.**|**Type:** `array`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
@@ -809,6 +810,7 @@ Provider-specific properties that can be set by cluster-$provider chart in order
| `providerIntegration.workers.defaultNodePools.PATTERN` | **Node pool**|**Type:** `object`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
| `providerIntegration.workers.defaultNodePools.PATTERN.annotations` | **Annotations** - These annotations are added to all Kubernetes resources defining this node pool.|**Type:** `object`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
| `providerIntegration.workers.defaultNodePools.PATTERN.annotations.PATTERN_2` | **Annotation**|**Type:** `string`<br/>**Key patterns:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>`PATTERN_2`=`^([a-zA-Z0-9/\.-]{1,253}/)?[a-zA-Z0-9/\._-]{1,63}$`<br/>|
+ | `providerIntegration.workers.defaultNodePools.PATTERN.cgroupsv1` | **Cgroups v1** - Flag that indicates if cgroups v1 should be used.|**Type:** `boolean`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>**Default:** `false`|
| `providerIntegration.workers.defaultNodePools.PATTERN.customNodeLabels` | **Node labels. Deprecated: use nodeLabels instead.** - Labels that are passed to kubelet argument 'node-labels'.|**Type:** `array`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
| `providerIntegration.workers.defaultNodePools.PATTERN.customNodeLabels[*]` | **Label**|**Type:** `string`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
| `providerIntegration.workers.defaultNodePools.PATTERN.customNodeTaints` | **Custom node taints. Deprecated: use nodeTaints instead.**|**Type:** `array`<br/>**Key pattern:**<br/>`PATTERN`=`^[a-z0-9][-a-z0-9]{3,18}[a-z0-9]$`<br/>|
63 changes: 63 additions & 0 deletions helm/cluster/ci/test-cgroupsv1-values.yaml
@@ -0,0 +1,63 @@
global:
  managementCluster: giantmc
  release:
    version: v27.0.0-alpha.1
  metadata:
    name: awesome
    organization: giantswarm
    description: "Awesome Giant Swarm cluster"
  connectivity:
    baseDomain: example.gigantic.io
  nodePools:
    def00:
      cgroupsv1: false
      replicas: 3
    def01:
      cgroupsv1: true
      replicas: 3
    def02:
      cgroupsv1: false
      replicas: 3
providerIntegration:
  provider: aws
  workers:
    defaultNodePools:
      def00:
        customNodeLabels:
          - label=default
        maxSize: 3
        minSize: 3
  components:
    containerd:
      sandboxContainerImage:
        name: giantswarm/pause
        tag: "3.9"
  resourcesApi:
    clusterResourceEnabled: true
    controlPlaneResourceEnabled: true
    machinePoolResourcesEnabled: true
    machineHealthCheckResourceEnabled: true
    bastionResourceEnabled: true
    infrastructureCluster:
      group: infrastructure.cluster.x-k8s.io
      version: v1beta1
      kind: AWSCluster
    nodePoolKind: MachinePool
    infrastructureMachinePool:
      group: infrastructure.cluster.x-k8s.io
      version: v1beta1
      kind: AWSMachinePool
    helmRepositoryResourcesEnabled: true
    bastion:
      infrastructureMachineTemplate:
        group: infrastructure.cluster.x-k8s.io
        version: v1beta1
        kind: AWSMachineTemplate
      infrastructureMachineTemplateSpecTemplateName: "cluster.test.bastion.machineTemplate.spec"
  controlPlane:
    resources:
      infrastructureMachineTemplate:
        group: infrastructure.cluster.x-k8s.io
        kind: GiantMachineTemplate
        version: v1beta1
      infrastructureMachineTemplateSpecTemplateName: cluster.internal.test.controlPlane.machineTemplate.spec
@@ -19,7 +19,7 @@ no_shim = false
# setting runc.options unsets parent settings
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
- SystemdCgroup = {{ if $.Values.internal.advancedConfiguration.cgroupsv1 }}false{{else}}true{{end}}
+ SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "{{ include "cluster.image.registry" $ }}/{{ $.Values.providerIntegration.components.containerd.sandboxContainerImage.name }}:{{ $.Values.providerIntegration.components.containerd.sandboxContainerImage.tag }}"

56 changes: 56 additions & 0 deletions helm/cluster/files/etc/containerd/workers-config.toml
@@ -0,0 +1,56 @@
version = 2

# recommended defaults from https://github.com/containerd/containerd/blob/main/docs/ops.md#base-configuration
# set containerd as a subreaper on linux when it is not running as PID 1
subreaper = true
# set containerd's OOM score
oom_score = -999
disabled_plugins = []
[plugins."io.containerd.runtime.v1.linux"]
# shim binary name/path
shim = "containerd-shim"
# runtime binary name/path
runtime = "runc"
# do not use a shim when starting containers, saves on memory but
# live restore is not supported
no_shim = false

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
# setting runc.options unsets parent settings
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = {{ if or $.nodePool.config.cgroupsv1 $.Values.internal.advancedConfiguration.cgroupsv1 }}false{{else}}true{{end}}
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "{{ include "cluster.image.registry" $ }}/{{ $.Values.providerIntegration.components.containerd.sandboxContainerImage.name }}:{{ $.Values.providerIntegration.components.containerd.sandboxContainerImage.tag }}"

[plugins."io.containerd.grpc.v1.cri".registry]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
{{- range $host, $config := $.Values.global.components.containerd.containerRegistries }}
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."{{$host}}"]
endpoint = [
{{- if and $.Values.global.components.containerd.localRegistryCache.enabled (has $host $.Values.global.components.containerd.localRegistryCache.mirroredRegistries) -}}
"http://127.0.0.1:{{ $.Values.global.components.containerd.localRegistryCache.port }}",
{{- end -}}
{{- if and $.Values.global.components.containerd.managementClusterRegistryCache.enabled (has $host $.Values.global.components.containerd.managementClusterRegistryCache.mirroredRegistries) -}}
"https://zot.{{ $.Values.global.managementCluster }}.{{ $.Values.global.connectivity.baseDomain }}",
{{- end -}}
{{- range $value := $config -}}
"https://{{$value.endpoint}}",
{{- end -}}
]
{{- end }}
[plugins."io.containerd.grpc.v1.cri".registry.configs]
{{- range $host, $config := $.Values.global.components.containerd.containerRegistries -}}
{{ range $value := $config -}}
{{- with $value.credentials }}
[plugins."io.containerd.grpc.v1.cri".registry.configs."{{$value.endpoint}}".auth]
{{ if and .username .password -}}
auth = {{ printf "%s:%s" .username .password | b64enc | quote }}
{{- else if .auth -}}
auth = {{ .auth | quote }}
{{- else if .identitytoken -}}
identitytoken = {{ .identitytoken | quote }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
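Given the `SystemdCgroup` conditional in the worker template above, the rendered runc options differ per node pool roughly as follows (illustrative; the matching kubeadm template switches kubelet's `cgroup-driver` to `cgroupfs` for the same pools):

```toml
# Pool with cgroupsv1: true (or the deprecated cluster-wide flag set):
SystemdCgroup = false

# Pool with cgroupsv1 unset or false:
SystemdCgroup = true
```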
19 changes: 0 additions & 19 deletions helm/cluster/templates/clusterapi/_helpers_files.tpl
@@ -2,9 +2,7 @@
{{- include "cluster.internal.kubeadm.files.sysctl" $ }}
{{- include "cluster.internal.kubeadm.files.selinux" $ }}
{{- include "cluster.internal.kubeadm.files.systemd" $ }}
- {{- include "cluster.internal.kubeadm.files.cgroupv1" $ }}
{{- include "cluster.internal.kubeadm.files.ssh" $ }}
- {{- include "cluster.internal.kubeadm.files.cri" $ }}
{{- include "cluster.internal.kubeadm.files.kubelet" $ }}
{{- include "cluster.internal.kubeadm.files.proxy" $ }}
{{- include "cluster.internal.kubeadm.files.teleport" $ }}
@@ -36,14 +34,6 @@
{{- end }}
{{- end }}

- {{- define "cluster.internal.kubeadm.files.cgroupv1" }}
- {{- if $.Values.internal.advancedConfiguration.cgroupsv1 }}
- - path: /etc/flatcar-cgroupv1
-   filesystem: root
-   permissions: "0444"
- {{- end }}
- {{- end }}

{{- define "cluster.internal.kubeadm.files.ssh" }}
{{- if $.Values.providerIntegration.resourcesApi.bastionResourceEnabled }}
{{- if .Values.global.connectivity.bastion.enabled }}
@@ -59,15 +49,6 @@
{{- end }}
{{- end }}

- {{- define "cluster.internal.kubeadm.files.cri" }}
- - path: /etc/containerd/config.toml
-   permissions: "0644"
-   contentFrom:
-     secret:
-       name: {{ include "cluster.resource.name" $ }}-containerd-{{ include "cluster.data.hash" (dict "data" (tpl ($.Files.Get "files/etc/containerd/config.toml") $) "salt" $.Values.providerIntegration.hashSalt) }}
-       key: config.toml
- {{- end }}

{{- define "cluster.internal.kubeadm.files.kubelet" }}
- path: /etc/kubernetes/patches/kubeletconfiguration.yaml
permissions: "0644"
11 changes: 11 additions & 0 deletions helm/cluster/templates/clusterapi/controlplane/_helpers_files.tpl
@@ -2,6 +2,7 @@
{{- include "cluster.internal.kubeadm.files" $ }}
{{- include "cluster.internal.controlPlane.kubeadm.files.admission" $ }}
{{- include "cluster.internal.controlPlane.kubeadm.files.audit" $ }}
+ {{- include "cluster.internal.controlPlane.kubeadm.files.cri" $ }}
{{- include "cluster.internal.controlPlane.kubeadm.files.encryption" $ }}
{{- include "cluster.internal.controlPlane.kubeadm.files.fairness" $ }}
{{- include "cluster.internal.controlPlane.kubeadm.files.oidc" $ }}
@@ -102,3 +103,13 @@
{{ include "cluster.internal.processFiles" (dict "files" $.Values.internal.advancedConfiguration.controlPlane.files "clusterName" (include "cluster.resource.name" $)) }}
{{- end }}
{{- end }}

+ {{/* containerd configuration for the control plane nodes */}}
+ {{- define "cluster.internal.controlPlane.kubeadm.files.cri" }}
+ - path: /etc/containerd/config.toml
+   permissions: "0644"
+   contentFrom:
+     secret:
+       name: {{ include "cluster.resource.name" $ }}-controlplane-containerd-{{ include "cluster.data.hash" (dict "data" (tpl ($.Files.Get "files/etc/containerd/controlplane-config.toml") $) "salt" $.Values.providerIntegration.hashSalt) }}
+       key: config.toml
+ {{- end }}
@@ -7,9 +7,6 @@ localAPIEndpoint:
bindPort: {{ $.Values.internal.advancedConfiguration.controlPlane.apiServer.bindPort | default 6443 }}
nodeRegistration:
kubeletExtraArgs:
- {{- if $.Values.internal.advancedConfiguration.cgroupsv1 }}
- cgroup-driver: cgroupfs
- {{- end }}
cloud-provider: external
healthz-bind-address: 0.0.0.0
node-ip: {{ printf "${%s}" $.Values.providerIntegration.environmentVariables.ipv4 }}
@@ -5,9 +5,6 @@ controlPlane:
bindPort: {{ $.Values.internal.advancedConfiguration.controlPlane.apiServer.bindPort | default 6443 }}
nodeRegistration:
kubeletExtraArgs:
- {{- if $.Values.internal.advancedConfiguration.cgroupsv1 }}
- cgroup-driver: cgroupfs
- {{- end }}
cloud-provider: external
node-ip: {{ printf "${%s}" $.Values.providerIntegration.environmentVariables.ipv4 }}
node-labels: ip={{ printf "${%s}" $.Values.providerIntegration.environmentVariables.ipv4 }}
23 changes: 23 additions & 0 deletions helm/cluster/templates/clusterapi/workers/_helpers_files.tpl
@@ -1,5 +1,7 @@
{{- define "cluster.internal.workers.kubeadm.files" }}
{{- include "cluster.internal.kubeadm.files" $ }}
+ {{- include "cluster.internal.workers.kubeadm.files.cgroupv1" $ }}
+ {{- include "cluster.internal.workers.kubeadm.files.cri" $ }}
{{- include "cluster.internal.workers.kubeadm.files.provider" $ }}
{{- include "cluster.internal.workers.kubeadm.files.custom" $ }}
{{- include "cluster.internal.workers.kubeadm.files.cloudConfig" $ }}
@@ -33,3 +35,24 @@
owner: root:root
{{- end }}
{{- end }}

+ {{/* containerd configuration for the worker nodes
+ When we don't support cgroups v1 anymore we can get rid of the logic to generate different configuration for
+ different node pools, and use the same cgroups configuration in the containerd config file both for workers and control plane nodes */}}
+ {{- define "cluster.internal.workers.kubeadm.files.cri" }}
+ - path: /etc/containerd/config.toml
+   permissions: "0644"
+   contentFrom:
+     secret:
+       name: {{ include "cluster.resource.name" $ }}-{{ $.nodePool.name }}-containerd-{{ include "cluster.data.hash" (dict "data" (tpl ($.Files.Get "files/etc/containerd/workers-config.toml") $) "salt" $.Values.providerIntegration.hashSalt) }}
+       key: config.toml
+ {{- end }}
+
+ {{/* flatcar configuration to use cgroupsv1 for the worker nodes. When we don't support cgroups v1 anymore we can remove it */}}
+ {{- define "cluster.internal.workers.kubeadm.files.cgroupv1" }}
+ {{- if $.nodePool.config.cgroupsv1 }}
+ - path: /etc/flatcar-cgroupv1
+   filesystem: root
+   permissions: "0444"
+ {{- end }}
+ {{- end }}
@@ -13,7 +13,7 @@ nodeRegistration:
{{- if eq $.Values.providerIntegration.provider "azure" }}
azure-container-registry-config: {{ $.Values.providerIntegration.controlPlane.kubeadmConfig.clusterConfiguration.apiServer.cloudConfig }}
{{- end }}
- {{- if $.Values.internal.advancedConfiguration.cgroupsv1 }}
+ {{- if or $nodePool.config.cgroupsv1 $.Values.internal.advancedConfiguration.cgroupsv1 }}
cgroup-driver: cgroupfs
{{- end }}
cloud-provider: external
18 changes: 15 additions & 3 deletions helm/cluster/templates/containerd.yaml
@@ -1,8 +1,20 @@
- {{- if or $.Values.providerIntegration.resourcesApi.controlPlaneResourceEnabled $.Values.providerIntegration.resourcesApi.machinePoolResourcesEnabled }}
+ {{- if $.Values.providerIntegration.resourcesApi.machinePoolResourcesEnabled }}
+ {{- range $nodePoolName, $nodePoolConfig := $.Values.global.nodePools | default $.Values.providerIntegration.workers.defaultNodePools }}
+ {{- $_ := set $ "nodePool" (dict "name" $nodePoolName "config" $nodePoolConfig) }}
apiVersion: v1
kind: Secret
metadata:
- name: {{ include "cluster.resource.name" $ }}-containerd-{{ include "cluster.data.hash" (dict "data" (tpl ($.Files.Get "files/etc/containerd/config.toml") $) "salt" $.Values.providerIntegration.hashSalt) }}
+ name: {{ include "cluster.resource.name" $ }}-{{ $nodePoolName }}-containerd-{{ include "cluster.data.hash" (dict "data" (tpl ($.Files.Get "files/etc/containerd/workers-config.toml") $) "salt" $.Values.providerIntegration.hashSalt) }}
data:
- config.toml: {{ tpl ($.Files.Get "files/etc/containerd/config.toml") $ | b64enc | quote }}
+ config.toml: {{ tpl ($.Files.Get "files/etc/containerd/workers-config.toml") $ | b64enc | quote }}
+ ---
+ {{- end }}
+ {{- end }}
+ {{- if $.Values.providerIntegration.resourcesApi.controlPlaneResourceEnabled }}
+ apiVersion: v1
+ kind: Secret
+ metadata:
+ name: {{ include "cluster.resource.name" $ }}-controlplane-containerd-{{ include "cluster.data.hash" (dict "data" (tpl ($.Files.Get "files/etc/containerd/controlplane-config.toml") $) "salt" $.Values.providerIntegration.hashSalt) }}
+ data:
+ config.toml: {{ tpl ($.Files.Get "files/etc/containerd/controlplane-config.toml") $ | b64enc | quote }}
{{- end }}
@@ -0,0 +1,32 @@
version = 2

# recommended defaults from https://github.com/containerd/containerd/blob/main/docs/ops.md#base-configuration
# set containerd as a subreaper on linux when it is not running as PID 1
subreaper = true
# set containerd's OOM score
oom_score = -999
disabled_plugins = []
[plugins."io.containerd.runtime.v1.linux"]
# shim binary name/path
shim = "containerd-shim"
# runtime binary name/path
runtime = "runc"
# do not use a shim when starting containers, saves on memory but
# live restore is not supported
no_shim = false

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
# setting runc.options unsets parent settings
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = false
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "gsoci.azurecr.io/giantswarm/pause:3.9"

[plugins."io.containerd.grpc.v1.cri".registry]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io","https://giantswarm.azurecr.io",]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."gsoci.azurecr.io"]
endpoint = ["https://zot.giantmc.example.gigantic.io","https://gsoci.azurecr.io",]
[plugins."io.containerd.grpc.v1.cri".registry.configs]