Add ability to configure a pod topology spread constraint #37

Merged
merged 1 commit into from
Aug 16, 2023
50 changes: 43 additions & 7 deletions README.md
@@ -171,10 +171,46 @@ spec:
...
```

-| Variable name          | Default                           | Description                                        |
-| ---------------------- | --------------------------------- | -------------------------------------------------- |
-| ENVOY_IMAGE            | `envoyproxy/envoy-alpine:v1.16.5` | Name of the Envoy Proxy image to use               |
-| TAINT_TOLERATION_KEY   | Empty, no tolerations applied     | Toleration key to apply to gateway pods            |
-| TAINT_TOLERATION_VALUE | Empty, no tolerations applied     | Toleration value to apply to gateway pods          |
-| NODE_SELECTOR_KEY      | Empty, no node selector added     | Node selector label key to apply to gateway pods   |
-| NODE_SELECTOR_VALUE    | Empty, no node selector added     | Node selector label value to apply to gateway pods |
**You can also configure the [pod topology spread constraint](https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/)**

```yaml
env:
  - name: ENABLE_POD_TOPOLOGY_SPREAD
    value: "true"
  - name: POD_TOPOLOGY_ZONE_MAX_SKEW
    value: "1"
  - name: POD_TOPOLOGY_HOSTNAME_MAX_SKEW
    value: "1"
```

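This injects topology spread constraints into the gateway pods so that the
scheduler tries to spread them evenly across zones and hosts. The constraints
use `whenUnsatisfiable: ScheduleAnyway`, so the spread is a preference and pods
are still scheduled when it cannot be satisfied.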

```yaml
spec:
  topologySpreadConstraints:
    - labelSelector:
        matchLabels:
          app: egress-gateway
          egress.monzo.com/gateway: egress-gateway-name
      maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
    - labelSelector:
        matchLabels:
          app: egress-gateway
          egress.monzo.com/gateway: egress-gateway-name
      maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
```
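
To make `maxSkew` concrete, here is a minimal Go sketch of how the skew of a candidate placement can be reasoned about. It is a simplified model for illustration, not the scheduler's implementation: `skewAfterPlacement` and `exceedsMaxSkew` are hypothetical names, and the pod counts are made up. The skew of a placement is taken as the pod count in the target domain after placement minus the smallest current pod count across all domains.

```go
package main

import "fmt"

// skewAfterPlacement returns the skew a topology domain would have if one more
// matching pod were placed in it: (pods in that domain + 1) minus the smallest
// current pod count across all domains. Hypothetical helper, simplified model.
func skewAfterPlacement(podsPerDomain map[string]int, domain string) int {
	first := true
	minPods := 0
	for _, n := range podsPerDomain {
		if first || n < minPods {
			minPods = n
			first = false
		}
	}
	return podsPerDomain[domain] + 1 - minPods
}

// exceedsMaxSkew reports whether placing a pod in the domain would push the
// skew above maxSkew. Under DoNotSchedule this would block the placement;
// under ScheduleAnyway it only makes the domain less preferred.
func exceedsMaxSkew(podsPerDomain map[string]int, domain string, maxSkew int) bool {
	return skewAfterPlacement(podsPerDomain, domain) > maxSkew
}

func main() {
	// Made-up distribution of gateway pods across three zones.
	pods := map[string]int{"zone-a": 2, "zone-b": 1, "zone-c": 1}
	for _, zone := range []string{"zone-a", "zone-b"} {
		fmt.Printf("%s: skew=%d exceedsMaxSkew=%t\n",
			zone, skewAfterPlacement(pods, zone), exceedsMaxSkew(pods, zone, 1))
	}
}
```

With `maxSkew: 1`, adding a third pod to `zone-a` would give a skew of 2, so the scheduler would prefer `zone-b` or `zone-c`. Because the injected constraints use `ScheduleAnyway`, the pod can still land in `zone-a` if no other zone is available.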

| Variable name | Default | Description |
|------------------------------------|-------------------------------------------|----------------------------------------------------|
| ENVOY_IMAGE | `envoyproxy/envoy-alpine:v1.16.5` | Name of the Envoy Proxy image to use |
| TAINT_TOLERATION_KEY | Empty, no tolerations applied | Toleration key to apply to gateway pods |
| TAINT_TOLERATION_VALUE | Empty, no tolerations applied | Toleration value to apply to gateway pods |
| NODE_SELECTOR_KEY | Empty, no node selector added | Node selector label key to apply to gateway pods |
| NODE_SELECTOR_VALUE | Empty, no node selector added | Node selector label value to apply to gateway pods |
| ENABLE_POD_TOPOLOGY_SPREAD         | Empty, no topology spread constraints     | Set to `true` to enable topology spread constraints |
| POD_TOPOLOGY_ZONE_MAX_SKEW_KEY | `topology.kubernetes.io/zone` | Topology key for the zone constraint |
| POD_TOPOLOGY_ZONE_MAX_SKEW | Empty, won't inject a zone constraint | Value of maxSkew for the zone constraint |
| POD_TOPOLOGY_HOSTNAME_MAX_SKEW_KEY | `kubernetes.io/hostname` | Topology key for the hostname constraint |
| POD_TOPOLOGY_HOSTNAME_MAX_SKEW | Empty, won't inject a hostname constraint | Value of maxSkew for the hostname constraint |

50 changes: 47 additions & 3 deletions controllers/deployment.go
@@ -71,6 +71,8 @@ func deployment(es *egressv1.ExternalService, configHash string) *appsv1.Deploym
img = i
}

labelSelector := metav1.SetAsLabelSelector(labelsToSelect(es))

var tolerations []corev1.Toleration
tk, kok := os.LookupEnv("TAINT_TOLERATION_KEY")
tv, vok := os.LookupEnv("TAINT_TOLERATION_VALUE")
@@ -91,6 +93,47 @@ func deployment(es *egressv1.ExternalService, configHash string) *appsv1.Deploym
}
}

var podTopologySpread []corev1.TopologySpreadConstraint
topologyEnable, _ := os.LookupEnv("ENABLE_POD_TOPOLOGY_SPREAD")
if topologyEnable == "true" {
zoneSkew, zoneEnabled := os.LookupEnv("POD_TOPOLOGY_ZONE_MAX_SKEW")
zoneKey, zoneKeyFound := os.LookupEnv("POD_TOPOLOGY_ZONE_MAX_SKEW_KEY")
if zoneEnabled {
maxSkew, err := strconv.Atoi(zoneSkew)
if err != nil {
maxSkew = 1
}
// Default zone key to the Kubernetes topology one if not specified
if !zoneKeyFound {
zoneKey = "topology.kubernetes.io/zone"
}
Comment on lines +107 to +109

When would we want to specify a different zone key?

Contributor Author

Kubernetes used an older zone key in the past, and a user's cluster might label its zones with a different topology key.

podTopologySpread = append(podTopologySpread, corev1.TopologySpreadConstraint{
TopologyKey: zoneKey,
WhenUnsatisfiable: corev1.ScheduleAnyway,

Is this field here so that we still schedule a pod even if none of the rules can be satisfied?

Contributor Author

Yes, this keeps pods schedulable when an AZ goes down or when there are only a few nodes (usually in a non-production environment).

MaxSkew: int32(maxSkew),
LabelSelector: labelSelector,
})
}
hostnameSkew, hostnameEnabled := os.LookupEnv("POD_TOPOLOGY_HOSTNAME_MAX_SKEW")
hostnameKey, hostnameKeyFound := os.LookupEnv("POD_TOPOLOGY_HOSTNAME_MAX_SKEW_KEY")
if hostnameEnabled {
maxSkew, err := strconv.Atoi(hostnameSkew)
if err != nil {
maxSkew = 1
}
// Default hostname key to the Kubernetes hostname label if not specified
if !hostnameKeyFound {
hostnameKey = "kubernetes.io/hostname"
}
podTopologySpread = append(podTopologySpread, corev1.TopologySpreadConstraint{
TopologyKey: hostnameKey,
WhenUnsatisfiable: corev1.ScheduleAnyway,
MaxSkew: int32(maxSkew),
LabelSelector: labelSelector,
})
}
}
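
The zone and hostname branches above follow the same pattern, so they could be factored into one helper. The following is a minimal sketch under stated assumptions, not the code merged in this PR: `constraintFromEnv`, `lookupFunc`, and `spreadConstraint` are hypothetical names, and `spreadConstraint` is a local stand-in for `corev1.TopologySpreadConstraint` so the example compiles with the standard library alone. Unlike the diff, it also falls back to 1 for a non-positive skew (Kubernetes requires `maxSkew` to be greater than zero) and treats an empty key as unset.

```go
package main

import (
	"fmt"
	"strconv"
)

// spreadConstraint is a local stand-in for corev1.TopologySpreadConstraint
// (assumption: real code would build the corev1 type instead).
type spreadConstraint struct {
	TopologyKey string
	MaxSkew     int32
}

// lookupFunc has the same signature as os.LookupEnv, so os.LookupEnv can be
// passed directly in production and a map-backed function in tests.
type lookupFunc func(key string) (string, bool)

// constraintFromEnv (hypothetical helper) builds one constraint from a
// max-skew variable and an optional topology-key variable. The second return
// value is false when the max-skew variable is not set at all.
func constraintFromEnv(lookup lookupFunc, skewVar, keyVar, defaultKey string) (spreadConstraint, bool) {
	rawSkew, ok := lookup(skewVar)
	if !ok {
		return spreadConstraint{}, false
	}
	// Fall back to 1 when the value is not an integer or is not positive.
	maxSkew, err := strconv.ParseInt(rawSkew, 10, 32)
	if err != nil || maxSkew < 1 {
		maxSkew = 1
	}
	// Fall back to the default topology key when none is configured.
	key, found := lookup(keyVar)
	if !found || key == "" {
		key = defaultKey
	}
	return spreadConstraint{TopologyKey: key, MaxSkew: int32(maxSkew)}, true
}

func main() {
	// Map-backed environment for illustration; "0" is not a valid maxSkew.
	env := map[string]string{"POD_TOPOLOGY_ZONE_MAX_SKEW": "0"}
	lookup := func(k string) (string, bool) { v, ok := env[k]; return v, ok }

	c, ok := constraintFromEnv(lookup,
		"POD_TOPOLOGY_ZONE_MAX_SKEW", "POD_TOPOLOGY_ZONE_MAX_SKEW_KEY",
		"topology.kubernetes.io/zone")
	fmt.Println(c.TopologyKey, c.MaxSkew, ok)
}
```

Taking the lookup function as a parameter keeps the helper testable without mutating the process environment. Whether to silently default an invalid skew to 1 or to surface an error is a design choice this sketch does not settle.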

var resources corev1.ResourceRequirements
if es.Spec.Resources != nil {
resources = *es.Spec.Resources
@@ -124,15 +167,16 @@ func deployment(es *egressv1.ExternalService, configHash string) *appsv1.Deploym
MaxSurge: intstr.ValueOrDefault(nil, intstr.FromString("25%")),
},
},
- Selector: metav1.SetAsLabelSelector(labelsToSelect(es)),
+ Selector: labelSelector,
Template: corev1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: labels(es),
Annotations: a,
},
Spec: corev1.PodSpec{
- Tolerations: tolerations,
- NodeSelector: nodeSelector,
+ Tolerations: tolerations,
+ NodeSelector: nodeSelector,
+ TopologySpreadConstraints: podTopologySpread,
Containers: []corev1.Container{
{
Name: "gateway",