Explain why to use priorityClassName: system-cluster-critical in production #1444

Merged
1 change: 1 addition & 0 deletions .spelling
@@ -288,6 +288,7 @@ YAMLs
accessors
acme-dns
ad-hoc
add-ons
allowlist
alrs
analyse
67 changes: 67 additions & 0 deletions content/docs/installation/best-practice.md
@@ -327,6 +327,73 @@ cainjector:
> You must increase the `replicaCount` of each Deployment to more than the `minAvailable` value,
> otherwise the PodDisruptionBudget will prevent you from draining cert-manager Pods.

### Priority Class Name

The reason for setting a priority class is summarized as follows in the Kubernetes blog [Protect Your Mission-Critical Pods From Eviction With `PriorityClass`](https://kubernetes.io/blog/2023/01/12/protect-mission-critical-pods-priorityclass/):
> Pod priority and preemption help to make sure that mission-critical pods are up in the event of a resource crunch by deciding order of scheduling and eviction.

If cert-manager is mission-critical to your platform,
then set a `priorityClassName` on the cert-manager Pods
to protect them from [preemption](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#preemption),
in situations where a Kubernetes node becomes starved of resources.
Without a `priorityClassName` the cert-manager Pods may be evicted to free up resources for other Pods,
and this may cause disruption to any applications that rely on cert-manager.

Most Kubernetes clusters come with two built-in priority classes:
`system-cluster-critical` and `system-node-critical`,
which are used for Kubernetes core components.
These [can also be used for critical add-ons](https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/),
such as cert-manager.
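
To check which priority classes your own cluster offers, you can list them with `kubectl`. This is a minimal sketch that assumes `kubectl` is already configured for the cluster; the exact set of classes varies by provider.

```sh
# List the PriorityClass objects known to the cluster.
kubectl get priorityclasses
```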

We recommend using the following Helm chart values to set `priorityClassName: system-cluster-critical` for all cert-manager Pods:

```yaml
global:
  priorityClassName: system-cluster-critical
```
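
As a rough illustration of what this value does, the chart is expected to copy it into the Pod template of each cert-manager Deployment. The excerpt below is a hand-written sketch of the relevant field, not rendered chart output, and the Deployment name is an assumption.

```yaml
# Illustrative excerpt only; not copied from the chart's rendered output.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cert-manager   # assumed name; depends on your Helm release
  namespace: cert-manager
spec:
  template:
    spec:
      # Set from the `global.priorityClassName` Helm value.
      priorityClassName: system-cluster-critical
```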

On some clusters the [`ResourceQuota` admission controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#resourcequota) may be configured to [limit the use of certain priority classes to certain namespaces](https://kubernetes.io/docs/concepts/policy/resource-quotas/#limit-priority-class-consumption-by-default).
For example, by default Google Kubernetes Engine (GKE) only allows `priorityClassName: system-cluster-critical` for Pods in the `kube-system` namespace.

> 📖 Read [Kubernetes PR #93121](https://github.com/kubernetes/kubernetes/pull/93121) to see how and why this was implemented.

In such cases, you will need to create a `ResourceQuota` in the `cert-manager` namespace that matches those priority classes:

```yaml
# cert-manager-resourcequota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cert-manager-critical-pods
  namespace: cert-manager
spec:
  hard:
    pods: 1G
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values:
      - system-node-critical
      - system-cluster-critical
```

```sh
kubectl apply -f cert-manager-resourcequota.yaml
```
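
After applying the manifest, one way to confirm that the quota exists and that the cert-manager Pods were admitted with the expected priority class is sketched below. It assumes cert-manager is installed in the `cert-manager` namespace, as in the manifest above.

```sh
# Show the quota and its current usage.
kubectl describe resourcequota cert-manager-critical-pods --namespace cert-manager

# Print the priority class assigned to each cert-manager Pod.
kubectl get pods --namespace cert-manager \
  --output custom-columns=NAME:.metadata.name,PRIORITY_CLASS:.spec.priorityClassName
```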

> 📖 Read [Protect Your Mission-Critical Pods From Eviction With `PriorityClass`](https://kubernetes.io/blog/2023/01/12/protect-mission-critical-pods-priorityclass/), a Kubernetes blog post about how Pod priority and preemption help to make sure that mission-critical pods are up in the event of a resource crunch by deciding order of scheduling and eviction.
>
> 📖 Read [Guaranteed Scheduling For Critical Add-On Pods](https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/) to learn why `system-cluster-critical` should be used for add-ons that are critical to a fully functional cluster.
>
> 📖 Read [Limit Priority Class consumption by default](https://kubernetes.io/docs/concepts/policy/resource-quotas/#limit-priority-class-consumption-by-default) to learn why platform administrators might restrict usage of certain high priority classes to a limited number of namespaces.
>
> 📖 Some examples of other critical add-ons that use the `system-cluster-critical` priority class name:
> [NVIDIA GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/google-gke.html),
> [OPA Gatekeeper](https://github.com/open-policy-agent/gatekeeper/pull/1282),
> [Cilium](https://github.com/cilium/cilium/pull/13878).

## Scalability

cert-manager has three long-running components: controller, cainjector, and webhook.
@@ -4,6 +4,8 @@
#
# Read the rationale for these values in:
# * https://cert-manager.io/docs/installation/best-practice/
global:
  priorityClassName: system-cluster-critical

replicaCount: 2
podDisruptionBudget: