new: Spot instances with Managed Node Groups #592

Merged: 22 commits, Nov 3, 2023
2 changes: 2 additions & 0 deletions cluster/eksctl/cluster.yaml
@@ -38,5 +38,7 @@ managedNodeGroups:
instanceType: m5.large
privateNetworking: true
releaseVersion: 1.27.3-20230816
updateConfig:
maxUnavailablePercentage: 50
labels:
workshop-default: 'yes'
4 changes: 4 additions & 0 deletions cluster/terraform/eks.tf
@@ -35,6 +35,10 @@ module "eks" {
min_size = 3
max_size = 6
desired_size = 3

update_config = {
max_unavailable_percentage = 50
}

labels = {
workshop-default = "yes"
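
Both provisioning paths (the eksctl config above and this Terraform module) now set the same rolling-update behaviour, letting up to half of the nodes in the default group be replaced in parallel during an update. A quick way to confirm the setting on a live cluster, assuming the `$EKS_CLUSTER_NAME` and `$EKS_DEFAULT_MNG_NAME` variables used elsewhere in the workshop:

```bash
$ aws eks describe-nodegroup --cluster-name $EKS_CLUSTER_NAME \
    --nodegroup-name $EKS_DEFAULT_MNG_NAME \
    --query 'nodegroup.updateConfig'
```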
3 changes: 2 additions & 1 deletion lab/bin/delete-nodegroup
@@ -13,5 +13,6 @@ if [ ! -z "$check" ]; then
echo "Deleting node group $nodegroup..."

aws eks delete-nodegroup --region $AWS_REGION --cluster-name $EKS_CLUSTER_NAME --nodegroup-name $nodegroup > /dev/null
aws eks wait nodegroup-deleted --cluster-name $EKS_CLUSTER_NAME --nodegroup-name $nodegroup > /dev/null
# Skip waiting for node group to finish, allowing reset-environment/prepare-environment to finish more quickly
# aws eks wait nodegroup-deleted --cluster-name $EKS_CLUSTER_NAME --nodegroup-name $nodegroup > /dev/null
fi
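
With the waiter commented out the script no longer blocks on node group deletion. If you do need to confirm a node group is actually gone, a minimal sketch using the same variables as the script above:

```bash
# List remaining node groups; the deleted one should eventually disappear
$ aws eks list-nodegroups --region $AWS_REGION --cluster-name $EKS_CLUSTER_NAME

# Or poll its status directly (DELETING until the API returns ResourceNotFoundException)
$ aws eks describe-nodegroup --region $AWS_REGION --cluster-name $EKS_CLUSTER_NAME \
    --nodegroup-name $nodegroup --query 'nodegroup.status' --output text
```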
11 changes: 11 additions & 0 deletions lab/scripts/installer.sh
@@ -23,6 +23,12 @@ flux_checksum='fe6d32da40d5f876434e964c46bc07d00af138c560e063fdcfa8f73e37224087'
argocd_version='2.7.4'
argocd_checksum='1b9a5f7c47b3c1326a622533f073cef46511e391d296d9b075f583b474780356'

terraform_version='1.4.1'
terraform_checksum='9e9f3e6752168dea8ecb3643ea9c18c65d5a52acc06c22453ebc4e3fc2d34421'

ec2_instance_selector_version='2.4.1'
ec2_instance_selector_checksum='dfd6560a39c98b97ab99a34fc261b6209fc4eec87b0bc981d052f3b13705e9ff'

download_and_verify () {
url=$1
checksum=$2
@@ -93,6 +99,11 @@ download_and_verify "https://github.com/argoproj/argo-cd/releases/download/v${ar
chmod +x ./argocd-linux-amd64
mv ./argocd-linux-amd64 /usr/local/bin/argocd

# ec2 instance selector
download_and_verify "https://github.com/aws/amazon-ec2-instance-selector/releases/download/v${ec2_instance_selector_version}/ec2-instance-selector-linux-amd64" "$ec2_instance_selector_checksum" "ec2-instance-selector-linux-amd64"
chmod +x ./ec2-instance-selector-linux-amd64
mv ./ec2-instance-selector-linux-amd64 /usr/local/bin/ec2-instance-selector

REPOSITORY_OWNER=${REPOSITORY_OWNER:-"aws-samples"}
REPOSITORY_NAME=${REPOSITORY_NAME:-"eks-workshop-v2"}

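
The installer now ships `ec2-instance-selector`, a tool commonly used to find instance types with similar specs when diversifying a Spot node group. A typical invocation looks like the following; the vCPU, memory and architecture values here are purely illustrative:

```bash
$ ec2-instance-selector --vcpus 2 --memory 4 \
    --cpu-architecture x86_64 --region $AWS_REGION
```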
11 changes: 0 additions & 11 deletions manifests/modules/fundamentals/mng/.workshop/cleanup.sh

This file was deleted.

26 changes: 0 additions & 26 deletions manifests/modules/fundamentals/mng/.workshop/terraform/addon.tf

This file was deleted.

@@ -0,0 +1,5 @@
#!/bin/bash

set -e

delete-nodegroup taint-mng
@@ -0,0 +1,54 @@
data "aws_vpc" "selected" {
tags = {
created-by = "eks-workshop-v2"
env = local.addon_context.eks_cluster_id
}
}

data "aws_subnets" "private" {
tags = {
created-by = "eks-workshop-v2"
env = local.addon_context.eks_cluster_id
}

filter {
name = "tag:Name"
values = ["*Private*"]
}
}

resource "aws_iam_role" "spot_node" {
name = "${local.addon_context.eks_cluster_id}-spot-node"

assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Sid = ""
Principal = {
Service = "ec2.amazonaws.com"
}
},
]
})

managed_policy_arns = [
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEKS_CNI_Policy",
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEKSWorkerNodePolicy",
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonSSMManagedInstanceCore"
]

tags = local.tags
}

output "environment" {
value = <<EOF
export SPOT_NODE_ROLE="${aws_iam_role.spot_node.arn}"
%{for index, id in data.aws_subnets.private.ids}
export PRIMARY_SUBNET_${index + 1}=${id}
%{endfor}
EOF
}
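
This `.workshop` Terraform looks up the workshop VPC's private subnets, creates an IAM role for worker nodes, and exports both as environment variables for the lab. To see roughly what the subnet data source resolves to, an equivalent lookup with the AWS CLI (assuming the `env` tag matches `$EKS_CLUSTER_NAME`) would be:

```bash
$ aws ec2 describe-subnets \
    --filters "Name=tag:created-by,Values=eks-workshop-v2" \
              "Name=tag:env,Values=$EKS_CLUSTER_NAME" \
              "Name=tag:Name,Values=*Private*" \
    --query 'Subnets[].SubnetId' --output text
```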
@@ -1,6 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../../../base-application/ui
- ../../../../../../base-application/ui
patches:
- path: deployment.yaml
@@ -1,6 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../../../base-application/ui
- ../../../../../../base-application/ui
patches:
- path: deployment.yaml
5 changes: 5 additions & 0 deletions manifests/modules/fundamentals/mng/spot/.workshop/cleanup.sh
@@ -0,0 +1,5 @@
#!/bin/bash

set -e

delete-nodegroup managed-spot
@@ -0,0 +1,54 @@
data "aws_vpc" "selected" {
tags = {
created-by = "eks-workshop-v2"
env = local.addon_context.eks_cluster_id
}
}

data "aws_subnets" "private" {
tags = {
created-by = "eks-workshop-v2"
env = local.addon_context.eks_cluster_id
}

filter {
name = "tag:Name"
values = ["*Private*"]
}
}

resource "aws_iam_role" "spot_node" {
name = "${local.addon_context.eks_cluster_id}-spot-node"

assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Sid = ""
Principal = {
Service = "ec2.amazonaws.com"
}
},
]
})

managed_policy_arns = [
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEKS_CNI_Policy",
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEKSWorkerNodePolicy",
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonSSMManagedInstanceCore"
]

tags = local.tags
}

output "environment" {
value = <<EOF
export SPOT_NODE_ROLE="${aws_iam_role.spot_node.arn}"
%{for index, id in data.aws_subnets.private.ids}
export PRIMARY_SUBNET_${index + 1}=${id}
%{endfor}
EOF
}
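
The `managed-spot` node group that this module's cleanup script deletes is not created by Terraform itself; the exported role ARN and subnet IDs suggest the lab creates it interactively. A hedged sketch of how those variables might be consumed to request Spot capacity (node group name taken from cleanup.sh, instance types are placeholders):

```bash
$ aws eks create-nodegroup \
    --cluster-name $EKS_CLUSTER_NAME \
    --nodegroup-name managed-spot \
    --node-role $SPOT_NODE_ROLE \
    --subnets $PRIMARY_SUBNET_1 $PRIMARY_SUBNET_2 $PRIMARY_SUBNET_3 \
    --capacity-type SPOT \
    --instance-types m5.large m5d.large m5a.large
```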
@@ -0,0 +1,9 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: catalog
spec:
template:
spec:
nodeSelector:
eks.amazonaws.com/capacityType: SPOT
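
This patch pins the `catalog` pods to Spot capacity via the `eks.amazonaws.com/capacityType` label that EKS applies to nodes automatically. After applying it, a quick way to check that matching nodes exist and where the pods landed (assuming the component runs in the `catalog` namespace):

```bash
$ kubectl get nodes -l eks.amazonaws.com/capacityType=SPOT
$ kubectl get pods -n catalog -o wide
```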
@@ -0,0 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../../../../base-application/catalog
patches:
- path: deployment.yaml
3 changes: 3 additions & 0 deletions website/docs/fundamentals/managed-node-groups/_category_.json
@@ -0,0 +1,3 @@
{
"collapsed": false
}
39 changes: 39 additions & 0 deletions website/docs/fundamentals/managed-node-groups/basics/index.md
@@ -0,0 +1,39 @@
---
title: MNG basics
sidebar_position: 30
sidebar_custom_props: {"module": true}
---

:::tip Before you start
Prepare your environment for this section:

```bash timeout=600 wait=30
$ prepare-environment fundamentals/mng/basics
```

:::

In the Getting started lab, we deployed our sample application to EKS and saw the running Pods. But where are these Pods running?

We can inspect the default managed node group that was pre-provisioned for you:

```bash
$ eksctl get nodegroup --cluster $EKS_CLUSTER_NAME --name $EKS_DEFAULT_MNG_NAME
```

There are several attributes of managed node groups that we can see from this output:
* The minimum, maximum and desired node counts for this group
* The instance type for this node group, `m5.large`
* The `AL2_x86_64` EKS AMI type in use

We can also inspect the nodes and their placement across the availability zones.

```bash
$ kubectl get nodes -o wide --label-columns topology.kubernetes.io/zone
```

You should see:
* Nodes are distributed over multiple subnets in various availability zones, providing high availability

Over the course of this module we'll make changes to this node group to demonstrate the basic capabilities of MNGs.
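
The closing sentence above promises changes to the default node group later in the module; scaling is one example of such a change, sketched here with eksctl (the target counts are illustrative):

```bash
$ eksctl scale nodegroup --cluster $EKS_CLUSTER_NAME \
    --name $EKS_DEFAULT_MNG_NAME --nodes 4 --nodes-min 3 --nodes-max 6
```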
@@ -6,15 +6,16 @@ sidebar_position: 10
For the purpose of this exercise, we'll provision a separate managed node group to which we'll apply taints.

```file
manifests/modules/fundamentals/mng/taints/nodegroup.yaml
manifests/modules/fundamentals/mng/basics/taints/nodegroup.yaml
```

Note: This configuration file does not yet configure any taints; it only applies the label `tainted: 'yes'`. We will configure the taints on this node group further below.

The following command creates this node group:

```bash timeout=600 hook=configure-taints
$ cat ~/environment/eks-workshop/modules/fundamentals/mng/taints/nodegroup.yaml | envsubst | eksctl create nodegroup -f -
$ cat ~/environment/eks-workshop/modules/fundamentals/mng/basics/taints/nodegroup.yaml \
| envsubst | eksctl create nodegroup -f -
```

It will take *2-3* minutes for the node to join the EKS cluster, at which point the command will produce the following output:
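
The page is truncated here, but since the nodegroup.yaml above only applies the `tainted: 'yes'` label, the taint itself still has to be added to the node group afterwards. One way to do that is with the AWS CLI; the node group name comes from the module's cleanup script, and the taint effect shown is illustrative:

```bash
$ aws eks update-nodegroup-config --cluster-name $EKS_CLUSTER_NAME \
    --nodegroup-name taint-mng \
    --taints 'addOrUpdateTaints=[{key=frontend,value=true,effect=NO_EXECUTE}]'
```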
Expand Up @@ -37,14 +37,14 @@ As anticipated, the application is running succesfully on a non-tainted node. Th
Let's update our `ui` deployment to bind its pods to the tainted managed node group. That node group was pre-configured with a `tainted=yes` label that we can target with a `nodeSelector`. The following `Kustomize` patch describes the changes needed to our deployment configuration to enable this setup:

```kustomization
modules/fundamentals/mng/taints/nodeselector-wo-toleration/deployment.yaml
modules/fundamentals/mng/basics/taints/nodeselector-wo-toleration/deployment.yaml
Deployment/ui
```

To apply the Kustomize changes run the following command:

```bash
$ kubectl apply -k ~/environment/eks-workshop/modules/fundamentals/mng/taints/nodeselector-wo-toleration/
$ kubectl apply -k ~/environment/eks-workshop/modules/fundamentals/mng/basics/taints/nodeselector-wo-toleration/
namespace/ui unchanged
serviceaccount/ui unchanged
configmap/ui unchanged
@@ -94,12 +94,12 @@ Our changes are reflected in the new configuration of the `Pending` pod. We can
To fix this, we need to add a toleration. Let's ensure our deployment and associated pods are able to tolerate the `frontend: true` taint. We can use the below `kustomize` patch to make the necessary changes:

```kustomization
modules/fundamentals/mng/taints/nodeselector-w-toleration/deployment.yaml
modules/fundamentals/mng/basics/taints/nodeselector-w-toleration/deployment.yaml
Deployment/ui
```

```bash
$ kubectl apply -k ~/environment/eks-workshop/modules/fundamentals/mng/taints/nodeselector-w-toleration/
$ kubectl apply -k ~/environment/eks-workshop/modules/fundamentals/mng/basics/taints/nodeselector-w-toleration/
namespace/ui unchanged
serviceaccount/ui unchanged
configmap/ui unchanged
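
Once the toleration patch is applied, the `ui` pod should be able to schedule onto the tainted node. A couple of quick verification commands (the `tainted=yes` label comes from the node group configuration above):

```bash
$ kubectl get nodes -l tainted=yes
$ kubectl get pods -n ui -o wide
```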
26 changes: 1 addition & 25 deletions website/docs/fundamentals/managed-node-groups/index.md
@@ -1,7 +1,6 @@
---
title: Managed Node Groups
sidebar_position: 30
sidebar_custom_props: {"module": true}
---

:::tip Before you start
@@ -13,8 +12,6 @@ $ prepare-environment fundamentals/mng

:::

In the Getting started lab, we deployed our sample application to EKS and saw the running Pods. But where are these Pods running?

An EKS cluster contains one or more EC2 nodes that Pods are scheduled on. EKS nodes run in your AWS account and connect to the control plane of your cluster through the cluster API server endpoint. You deploy one or more nodes into a node group. A node group is one or more EC2 instances that are deployed in an EC2 Auto Scaling group.

EKS nodes are standard Amazon EC2 instances. You're billed for them based on EC2 prices. For more information, see [Amazon EC2 pricing](https://aws.amazon.com/ec2/pricing/).
@@ -31,25 +28,4 @@ Advantages of running Amazon EKS managed node groups include:
* Node updates and terminations automatically and gracefully drain nodes to ensure that your applications stay available
* No additional costs to use Amazon EKS managed node groups, pay only for the AWS resources provisioned

We can inspect the default managed node group that was pre-provisioned for you:

```bash
$ eksctl get nodegroup --cluster $EKS_CLUSTER_NAME --name $EKS_DEFAULT_MNG_NAME
```

There are several attributes of managed node groups that we can see from this output:
* Configuration of minimum, maximum and desired counts of the number of nodes in this group
* The instance type for this node group is `m5.large`
* Uses the `AL2_x86_64` EKS AMI type


We can also inspect the nodes and the placement in the availability zones.

```bash
$ kubectl get nodes -o wide --label-columns topology.kubernetes.io/zone
```

You should see:
* Nodes are distributed over multiple subnets in various availability zones, providing high availability

Over the course of this module we'll make changes to this node group to demonstrate the capabilities of MNGs.
The labs in this section deal with various ways that EKS managed node groups can be used to provide compute capacity to a cluster.