Add a PodDisruptionBudget for CoreDNS #3585

twz123 · 2023-10-13T16:00:02Z

Description

As CoreDNS is a rather critical component in Kubernetes clusters, k0s should ensure it remains available, even during node maintenance or other disruptions. A PodDisruptionBudget helps to maintain a minimum level of availability.

Some additional tweaks and their considerations

No pod anti-affinity for single replica deployments, so that CoreDNS can be rolled on a single node without downtime, as the single replica can remain operational until the new replica can take over.
No PodDisruptionBudget for single replica deployments, so that CoreDNS may be drained from the node running the single replica. This might leave CoreDNS eligible for eviction due to node pressure, but such clusters aren't HA in the first place and it seems more desirable to not block a drain.
Set maxUnavailable=1 only for deployments with two or three replicas. Use the Kubernetes defaults (maxUnavailable=25%, rounded down to get absolute values) for all other cases. Until now, maxUnavailable=1 had these effects: for single replica deployments, the deployment's available condition would be true even with zero ready replicas; for deployments with 4 to 7 replicas, it was the same as the default; and for deployments with 8 or more replicas, it artificially constrained the rolling update speed.
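The replica-count rules above can be sketched in a few lines of Go. This is only an illustration of the arithmetic, not the code from this PR: the names rolloutPlan, defaultMaxUnavailable and planFor are hypothetical, and the sketch assumes that anti-affinity and the PodDisruptionBudget are applied whenever there is more than one replica.

```go
package main

import "fmt"

// rolloutPlan is a hypothetical summary of the per-replica-count decisions.
// None of these names are taken from the k0s code base.
type rolloutPlan struct {
	MaxUnavailable int  // absolute number of pods that may be unavailable during a rolling update
	AntiAffinity   bool // assumption: enabled for every multi-replica deployment
	PDB            bool // assumption: created for every multi-replica deployment
}

// defaultMaxUnavailable mirrors the Kubernetes default of 25%, rounded down.
func defaultMaxUnavailable(replicas int) int {
	return replicas * 25 / 100 // integer division rounds down for non-negative input
}

// planFor applies the rules from the description to a replica count.
func planFor(replicas int) rolloutPlan {
	p := rolloutPlan{
		MaxUnavailable: defaultMaxUnavailable(replicas),
		AntiAffinity:   replicas > 1,
		PDB:            replicas > 1,
	}
	// 25% of two or three replicas rounds down to zero, so these two
	// cases get an explicit maxUnavailable=1 instead.
	if replicas == 2 || replicas == 3 {
		p.MaxUnavailable = 1
	}
	return p
}

func main() {
	for _, n := range []int{1, 2, 3, 4, 7, 8} {
		fmt.Printf("replicas=%d %+v\n", n, planFor(n))
	}
}
```

With these rules, a single replica ends up with a maxUnavailable of zero, 4 to 7 replicas with one, and 8 replicas with two, which matches the reasoning in the list above.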

Basic integration test

Use strings as identifiers when detecting multiple CoreDNS pods on the same node. This makes log and error messages more readable.
Ignore pods without pod anti-affinity, as they are remnants left over when single node clusters are scaled up.
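As a rough sketch of the test logic described above, the following Go program groups pods by node name and reports nodes that host more than one CoreDNS pod, skipping pods without anti-affinity. The pod struct and the crowdedNodes function are hypothetical stand-ins, not the types or helpers used in the actual integration test, and using pod and node names as the string identifiers is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// pod is a hypothetical, simplified stand-in for a Kubernetes Pod object.
type pod struct {
	Name            string
	NodeName        string
	HasAntiAffinity bool
}

// crowdedNodes returns, keyed by node name, the sorted names of the pods
// that share a node. Pods without anti-affinity are ignored, and nodes
// with fewer than two remaining pods are left out of the result.
func crowdedNodes(pods []pod) map[string][]string {
	byNode := make(map[string][]string)
	for _, p := range pods {
		if !p.HasAntiAffinity {
			continue // remnant of a former single node setup
		}
		byNode[p.NodeName] = append(byNode[p.NodeName], p.Name)
	}
	for node, names := range byNode {
		if len(names) < 2 {
			delete(byNode, node) // deleting during range is safe in Go
			continue
		}
		sort.Strings(names) // stable order keeps messages reproducible
	}
	return byNode
}

func main() {
	pods := []pod{
		{Name: "coredns-a", NodeName: "worker0", HasAntiAffinity: true},
		{Name: "coredns-b", NodeName: "worker0", HasAntiAffinity: true},
		{Name: "coredns-old", NodeName: "worker1", HasAntiAffinity: false},
		{Name: "coredns-c", NodeName: "worker1", HasAntiAffinity: true},
	}
	for node, names := range crowdedNodes(pods) {
		fmt.Printf("multiple CoreDNS pods on node %s: %v\n", node, names)
	}
}
```

Keying the map by plain strings means the node and pod names appear directly in any failure message, which is the readability benefit the description mentions.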

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update

How Has This Been Tested?

Manual test
Auto test added

Checklist:

twz123 · 2023-10-18T16:21:15Z

This more or less revealed CoreDNS failures in inttests: #3602

twz123 · 2023-10-19T08:58:44Z

Also revealed #3609.

As CoreDNS is a rather critical component in Kubernetes clusters, k0s should ensure it remains available, even during node maintenance or other disruptions. A PodDisruptionBudget helps to maintain a minimum level of availability.

Some additional tweaks and their considerations:

* No pod anti-affinity for single replica deployments, so that CoreDNS can be rolled on a single node without downtime, as the single replica can remain operational until the new replica can take over.
* No PodDisruptionBudget for single replica deployments, so that CoreDNS may be drained from the node running the single replica. This might leave CoreDNS eligible for eviction due to node pressure, but such clusters aren't HA in the first place and it seems more desirable to not block a drain.
* Set maxUnavailable=1 only for deployments with two or three replicas. Use the Kubernetes defaults (maxUnavailable=25%, rounded down to get absolute values) for all other cases. For single replica deployments, maxUnavailable=1 meant that the deployment's available condition would be true even with zero ready replicas. For deployments with 4 to 7 replicas, this has been the same as the default, and for deployments with 8 or more replicas, this has been artificially constraining the rolling update speed.

Basic integration test:

* Use strings as identifier when detecting multiple CoreDNS pods on nodes. Makes for more readable log/error messages.
* Ignore pods without pod anti-affinity, as they are remnants when scaling up single node clusters.

Signed-off-by: Tom Wieczorek <[email protected]>

twz123 added the area/controlplane label Oct 13, 2023

twz123 requested a review from a team as a code owner October 13, 2023 16:00

twz123 requested review from ncopa and juanluisvaladas October 13, 2023 16:00

jnummelin previously approved these changes Oct 13, 2023

twz123 marked this pull request as draft October 18, 2023 11:16

twz123 dismissed jnummelin’s stale review via 4a50683 October 19, 2023 14:56

twz123 force-pushed the coredns-pdb branch 2 times, most recently from 4a50683 to 0b1b3e3 Compare October 24, 2023 15:21

twz123 marked this pull request as ready for review October 24, 2023 16:07

juanluisvaladas previously approved these changes Oct 25, 2023

twz123 dismissed juanluisvaladas’s stale review via c802ffd October 26, 2023 10:35

twz123 force-pushed the coredns-pdb branch from 0b1b3e3 to c802ffd Compare October 26, 2023 10:35

juanluisvaladas approved these changes Oct 26, 2023

twz123 merged commit c0b94d8 into k0sproject:main Oct 26, 2023
71 checks passed

twz123 deleted the coredns-pdb branch October 26, 2023 11:32

twz123 mentioned this pull request Jan 5, 2024

Use reflect.DeepEqual for CoreDNS config comparison #3883

Merged

twz123 mentioned this pull request Sep 6, 2024

Reintroduce K0smotronNetworks in tests #4927

Closed
