feat: upgrade to kubernetes v1.28 TDE-916 (#227)
#### Motivation

Kubernetes releases a new version roughly every quarter, and upgrading EKS
takes several steps. This process needs to be well documented so that we can
upgrade frequently.

#### Modification

Upgrade EKS to 1.28 and document the steps that it took to upgrade.


#### Checklist

_If not applicable, provide explanation of why._

- [ ] Tests updated
- [ ] Docs updated
- [ ] Issue linked in Title

---------

Co-authored-by: paulfouquet <[email protected]>
blacha and paulfouquet authored Nov 6, 2023
1 parent b5fe78a commit 61e72c7
Showing 5 changed files with 105 additions and 8 deletions.
92 changes: 92 additions & 0 deletions docs/infrastructure/kubernetes.version.md
@@ -0,0 +1,92 @@
# Upgrade Kubernetes Versions

Because Kubernetes releases often and deprecates old versions quickly, we need to keep our Kubernetes cluster up to date.

**You cannot jump multiple versions.** You must do one deployment per individual version bump.
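
The one-version-at-a-time rule can be checked before a deployment is attempted. The following is a minimal sketch, not code from this repository: `parseVersion` and `isSingleMinorBump` are hypothetical helper names, and it assumes versions are written as `major.minor` strings such as `1.28`.

```typescript
// Hypothetical helpers for illustration only; they are not part of this repository.

/** Parse a "major.minor" Kubernetes version string such as "1.28". */
function parseVersion(version: string): { major: number; minor: number } {
  const match = /^(\d+)\.(\d+)$/.exec(version);
  if (match == null) throw new Error(`Invalid Kubernetes version: ${version}`);
  return { major: Number(match[1]), minor: Number(match[2]) };
}

/** True only when `target` is exactly one minor version above `current`. */
function isSingleMinorBump(current: string, target: string): boolean {
  const from = parseVersion(current);
  const to = parseVersion(target);
  return from.major === to.major && to.minor === from.minor + 1;
}

console.log(isSingleMinorBump('1.27', '1.28')); // true
console.log(isSingleMinorBump('1.26', '1.28')); // false, needs two deployments
```

A check like this could run in CI to reject a pull request that skips a version, although how the currently deployed version is obtained is left open here.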

## Upgrade steps

Below is an example of upgrading from v1.27 to v1.28.

1. Install the kubectl lambda-layer package that matches the new version number

```bash
npm install --save-dev @aws-cdk/lambda-layer-kubectl-v28
```

Then remove the old lambda-layer package

```bash
npm rm @aws-cdk/lambda-layer-kubectl-v27
```

2. Set the new Kubernetes version in `LinzEksCluster`

```typescript
version = KubernetesVersion.of('1.28');
```

3. Modify layer version

```typescript
import { KubectlV28Layer } from '@aws-cdk/lambda-layer-kubectl-v28';

// ...

kubectlLayer: new KubectlV28Layer(this, 'KubeCtlLayer'),
```

4. Diff the stack to make sure that only versions are updated

```bash
npx cdk diff Workflows -c ci-role-arn=...
```

The only changes should be Kubernetes version related.

```
Resources
[~] AWS::Lambda::LayerVersion KubeCtlLayer KubeCtlLayer replace
├─ [~] Content (requires replacement)
│ └─ [~] .S3Key:
│ ├─ [-] 8e18eb5caccd2617fb76e648fa6a35dc0ece98c4681942bc6861f41afdff6a1b.zip
│ └─ [+] b4d47e4f1c5e8fc2df2cd474ede548de153300d332ba8d582b7c1193e61cbe1e.zip
├─ [~] Description (requires replacement)
│ ├─ [-] /opt/kubectl/kubectl 1.27; /opt/helm/helm 3.12
│ └─ [+] /opt/kubectl/kubectl 1.28; /opt/helm/helm 3.13
└─ [~] Metadata
└─ [~] .aws:asset:path:
├─ [-] asset.8e18eb5caccd2617fb76e648fa6a35dc0ece98c4681942bc6861f41afdff6a1b.zip
└─ [+] asset.b4d47e4f1c5e8fc2df2cd474ede548de153300d332ba8d582b7c1193e61cbe1e.zip
[~] Custom::AWSCDK-EKS-Cluster EksWorkflows/Resource/Resource EksWorkflows
└─ [~] Config
└─ [~] .version:
├─ [-] 1.27
└─ [+] 1.28
```

5. Create a pull request and wait for CI/CD to deploy the changes.

**Version bump deployments can take 10+ minutes** :coffee:
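
Steps 1 to 3 rely on the kubectl layer package matching the cluster version. The sketch below shows one way to catch a mismatch early. It is illustrative only: `kubectlLayerPackage` and `staleKubectlLayers` are hypothetical names, and it assumes the `@aws-cdk/lambda-layer-kubectl-v<minor>` naming seen for v27 and v28 above also holds for other 1.x versions.

```typescript
// Hypothetical helpers for illustration only; they are not part of this repository.
// Assumption: layer packages are named "@aws-cdk/lambda-layer-kubectl-v<minor>".
const LAYER_PREFIX = '@aws-cdk/lambda-layer-kubectl-v';

/** Name of the kubectl layer package expected for a "1.<minor>" version. */
function kubectlLayerPackage(version: string): string {
  const match = /^1\.(\d+)$/.exec(version);
  if (match == null) throw new Error(`Unsupported Kubernetes version: ${version}`);
  return `${LAYER_PREFIX}${match[1]}`;
}

/** List kubectl layer packages in `dependencies` that do not match `version`. */
function staleKubectlLayers(dependencies: Record<string, string>, version: string): string[] {
  const expected = kubectlLayerPackage(version);
  return Object.keys(dependencies).filter((name) => name.startsWith(LAYER_PREFIX) && name !== expected);
}

// Example: a dependency list that still carries the v27 layer while targeting 1.28.
const stale = staleKubectlLayers(
  { '@aws-cdk/lambda-layer-kubectl-v27': '^2.0.0', 'aws-cdk-lib': '2.93.x' },
  '1.28',
);
console.log(stale); // [ '@aws-cdk/lambda-layer-kubectl-v27' ]
```

An empty result means no leftover layer package was found. It does not prove the expected package is installed, so a fuller check would also look for `kubectlLayerPackage(version)` in the dependency list.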

## Cycle out EC2 Nodes to the new version

1. Find the nodegroup name for the cluster

```bash
aws eks list-nodegroups --cluster-name Workflows
```

2. Describe the nodegroup to validate the versions

Describing the nodegroup shows its current version. Alternatively, run `k get nodes` (where `k` is presumably a shell alias for `kubectl`) to see which version each node is currently running.

```bash
aws eks describe-nodegroup --cluster-name Workflows --nodegroup-name EksWorkflowsNodegroupCluste
```

3. Update the nodegroup version to match the cluster

```bash
aws eks update-nodegroup-version --cluster-name Workflows --nodegroup-name EksWorkflowsNodegroupCluste-OWsXxRuVz2B7
```
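
When a cluster has several nodegroups, the comparison in step 2 can be expressed as a small function. This is a sketch under stated assumptions, not tooling from this repository: `NodegroupSummary` and `nodegroupsNeedingUpgrade` are hypothetical names, and the two fields are assumed to mirror `nodegroupName` and `version` from the `nodegroup` object that `aws eks describe-nodegroup` returns.

```typescript
// Hypothetical types and helper for illustration only; not part of this repository.

/** Assumed subset of the `nodegroup` object returned by `aws eks describe-nodegroup`. */
interface NodegroupSummary {
  nodegroupName: string;
  version: string;
}

/** Names of nodegroups whose Kubernetes version differs from the cluster version. */
function nodegroupsNeedingUpgrade(clusterVersion: string, nodegroups: NodegroupSummary[]): string[] {
  return nodegroups.filter((ng) => ng.version !== clusterVersion).map((ng) => ng.nodegroupName);
}

// Example: the cluster is already on 1.28 but the nodegroup is still on 1.27.
const pending = nodegroupsNeedingUpgrade('1.28', [
  { nodegroupName: 'EksWorkflowsNodegroupCluste-OWsXxRuVz2B7', version: '1.27' },
]);
console.log(pending); // [ 'EksWorkflowsNodegroupCluste-OWsXxRuVz2B7' ]
```

Each name in the result is a nodegroup that still needs step 3 applied.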
5 changes: 5 additions & 0 deletions infra/README.md
@@ -110,3 +110,8 @@ The deployment of the K8s config is managed by GithubActions in [main](../.githu
## Troubleshoot

- [DNS](../docs/dns.configuration.md)

## Upgrading Kubernetes Versions

Kubernetes releases new versions very frequently. To deploy a new version, follow the
[Version Upgrade Guide](../docs/infrastructure/kubernetes.version.md).
6 changes: 3 additions & 3 deletions infra/eks/cluster.ts
@@ -1,4 +1,4 @@
-import { KubectlV27Layer } from '@aws-cdk/lambda-layer-kubectl-v27';
+import { KubectlV28Layer } from '@aws-cdk/lambda-layer-kubectl-v28';
import { Aws, CfnOutput, Duration, RemovalPolicy, Stack, StackProps } from 'aws-cdk-lib';
import { InstanceType, IVpc, SubnetType, Vpc } from 'aws-cdk-lib/aws-ec2';
import { Cluster, ClusterLoggingTypes, IpFamily, KubernetesVersion, NodegroupAmiType } from 'aws-cdk-lib/aws-eks';
@@ -25,7 +25,7 @@ export class LinzEksCluster extends Stack {
/* Cluster ID */
id: string;
/** Version of EKS to use, this must be aligned to the `kubectlLayer` */
-version = KubernetesVersion.V1_27;
+version = KubernetesVersion.of('1.28');
/** Argo needs a temporary bucket to store objects */
tempBucket: Bucket;
/* Bucket where read/write roles config files are stored */
@@ -65,7 +65,7 @@ export class LinzEksCluster extends Stack {
defaultCapacity: 0,
vpcSubnets: [{ subnetType: SubnetType.PRIVATE_WITH_EGRESS }],
/** This must align to Cluster version: {@link version} */
-kubectlLayer: new KubectlV27Layer(this, 'KubeCtlLayer'),
+kubectlLayer: new KubectlV28Layer(this, 'KubeCtlLayer'),
/** To prevent IP exhaustion when running huge workflows run using ipv6 */
ipFamily: IpFamily.IP_V6,
clusterLogging: [ClusterLoggingTypes.API, ClusterLoggingTypes.CONTROLLER_MANAGER, ClusterLoggingTypes.SCHEDULER],
8 changes: 4 additions & 4 deletions package-lock.json

2 changes: 1 addition & 1 deletion package.json
@@ -24,7 +24,7 @@
"@aws-sdk/client-cloudformation": "3.429.0",
"@aws-sdk/client-eks": "3.429.0",
"@aws-sdk/client-ssm": "3.429.0",
-"@aws-cdk/lambda-layer-kubectl-v27": "^2.0.0",
+"@aws-cdk/lambda-layer-kubectl-v28": "^2.0.0",
"@linzjs/style": "^5.0.0",
"aws-cdk": "2.93.x",
"aws-cdk-lib": "2.93.x",