Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'aws-samples:main' into ci2flux
Showing 62 changed files with 834 additions and 258 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,3 +31,5 @@ env | |
*.zip | ||
|
||
cdk.out | ||
|
||
.envrc |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,49 +1,59 @@ | ||
# Steering Committee and Module Leads | ||
|
||
## Steering Committee Members | ||
|
||
The Steering Committee is a 6-member body that oversees the governance of the EKS Workshop. | ||
|
||
### Terms end in February 2024 | ||
|Name|Profile|Role| | ||
|:----|:-------|:----| | ||
|Sai Vennam|[@svennam92](https://github.com/svennam92)|Principal EKS DA | ||
|Niall Thomson|[@niallthomson](https://github.com/niallthomson)|Specialist Solution Architect, Containers| | ||
|Ray Krueger|[@raykrueger](https://github.com/raykrueger)|Principal Container Specialist| | ||
|Ameet Naik|[@ameetnaik](https://github.com/ameetnaik)|Technical Account Manager| | ||
|Kamran Habib|[@kmhabib](https://github.com/kmhabib)|Solution Architect (TFC at large)| | ||
|Theo Salvo|[@buzzsurfr](https://github.com/buzzsurfr)|Container Specialist (TFC core team member)| | ||
|
||
| Name | Profile | Role | | ||
| :------------ | :----------------------------------------------- | :------------------------------------------ | | ||
| Sai Vennam | [@svennam92](https://github.com/svennam92) | Principal EKS DA | | ||
| Niall Thomson | [@niallthomson](https://github.com/niallthomson) | Specialist Solution Architect, Containers | | ||
| Ray Krueger | [@raykrueger](https://github.com/raykrueger) | Principal Container Specialist | | ||
| Ameet Naik | [@ameetnaik](https://github.com/ameetnaik) | Technical Account Manager | | ||
| Kamran Habib | [@kmhabib](https://github.com/kmhabib) | Solution Architect (TFC at large) | | ||
| Theo Salvo | [@buzzsurfr](https://github.com/buzzsurfr) | Container Specialist (TFC core team member) | | ||
|
||
## Working Groups | ||
|
||
The working groups are led by chairs (6 month terms) and maintainers (6 month terms). | ||
|
||
|Working Group|Chair|Maintainers| | ||
|:----|:-------|:----| | ||
|Infrastructure|[Niall Thomson](https://github.com/niallthomson)|| | ||
|Fundamentals|[Sai Vennam](https://github.com/svennam92)|[Bijith Nair](https://github.com/bijithnair), [Tolu Okuboyejo](https://github.com/oktab1), [Hemanth AVS](https://github.com/hemanth-avs)| | ||
|Autoscaling|[Sanjeev Ganjihal](https://github.com/sanjeevrg89)|| | ||
|Automation|[Carlos Santana](https://github.com/csantanapr)|[Tsahi Duek](https://github.com/tsahiduek), [Christina Andonov](https://github.com/candonov), [Sébastien Allamand](https://github.com/allamand)| | ||
|Machine Learning|[Masatoshi Hayashi](https://github.com/literalice)|| | ||
|Networking|[Sheetal Joshi](https://github.com/sheetaljoshi)|[Umair Ishaq](https://github.com/umairishaq)| | ||
|Observability|[Nirmal Mehta](https://github.com/normalfaults)|[Steven David](https://github.com/StevenDavid)| | ||
|Security|[Rodrigo Bersa](https://github.com/rodrigobersa)| | | ||
|Storage|[Eric Heinrichs](https://github.com/heinrichse)|[Andrew Peng](https://github.com/pengc99)| | ||
| Working Group | Chair | Maintainers | | ||
| :--------------- | :------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| Infrastructure | [Niall Thomson](https://github.com/niallthomson) | | | ||
| Fundamentals | [Sai Vennam](https://github.com/svennam92) | [Bijith Nair](https://github.com/bijithnair), [Tolu Okuboyejo](https://github.com/oktab1), [Hemanth AVS](https://github.com/hemanth-avs) | | ||
| Autoscaling | [Sanjeev Ganjihal](https://github.com/sanjeevrg89) | | | ||
| Automation | [Carlos Santana](https://github.com/csantanapr) | [Tsahi Duek](https://github.com/tsahiduek), [Sébastien Allamand](https://github.com/allamand), [Yuriy Bezsonov](https://github.com/ybezsonov) | | ||
| Machine Learning | [Masatoshi Hayashi](https://github.com/literalice) | [Benjamin Gardiner](https://github.com/bkgardiner) | | ||
| Networking | [Sheetal Joshi](https://github.com/sheetaljoshi) | [Umair Ishaq](https://github.com/umairishaq) | | ||
| Observability | [Nirmal Mehta](https://github.com/normalfaults) | [Steven David](https://github.com/StevenDavid) | | ||
| Security | [Rodrigo Bersa](https://github.com/rodrigobersa) | | | ||
| Storage | [Eric Heinrichs](https://github.com/heinrichse) | [Andrew Peng](https://github.com/pengc99) | | ||
|
||
## Wranglers | ||
|
||
Wranglers will work across all topic areas and serve for at least 6 months. | ||
|Name|Profile|Role| | ||
|:----|:-------|:----| | ||
|Math Bruneau|[@ROunofF](https://github.com/ROunofF)|Specialist Solution Architect, Containers| | ||
|
||
|
||
## Emeritus | ||
|Name|Profile|Role| | ||
|:----|:-------|:----| | ||
|Jeremy Cowan|[@jicowan](https://github.com/jicowan)|EKS DA manager| | ||
|
||
| Name | Profile | Role | | ||
| :----------- | :------------------------------------- | :------------- | | ||
| Jeremy Cowan | [@jicowan](https://github.com/jicowan) | EKS DA manager | | ||
|
||
## Meetings | ||
|
||
### Schedule and Cadence | ||
|
||
The steering committee will host a public meeting every third Thursday of the month at 9AM CT. <!--update with Chime link--> | ||
|
||
### Resources | ||
* <!--add links to meeting notes and recordings--> | ||
|
||
- <!--add links to meeting notes and recordings--> | ||
|
||
## Contact | ||
* Mailing List: <[email protected]> | ||
|
||
- Mailing List: <[email protected]> |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,4 +4,4 @@ idna==3.4 | |
PyYAML==6.0 | ||
requests==2.31.0 | ||
semantic-version==2.10.0 | ||
urllib3==2.0.2 | ||
urllib3==2.0.3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
#!/bin/bash | ||
|
||
set -e | ||
|
||
echo "Deleting AIML resources..." | ||
|
||
kubectl delete namespace aiml > /dev/null | ||
|
||
echo "Deleting Karpenter provisioners..." | ||
|
||
kubectl delete provisioner --all > /dev/null | ||
kubectl delete awsnodetemplate --all > /dev/null | ||
|
||
echo "Waiting for Karpenter nodes to be removed..." | ||
|
||
EXIT_CODE=0 | ||
|
||
timeout --foreground -s TERM 30 bash -c \ | ||
'while [[ $(kubectl get nodes --selector=type=karpenter -o json | jq -r ".items | length") -gt 0 ]];\ | ||
do sleep 5;\ | ||
done' || EXIT_CODE=$? | ||
|
||
if [ $EXIT_CODE -ne 0 ]; then | ||
echo "Warning: Karpenter nodes did not clean up" | ||
fi |
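
The cleanup script above waits for Karpenter nodes by polling a node count under a deadline and downgrading a timeout to a warning. The same pattern can be sketched without a cluster, as below. This is a minimal illustration, not part of the commit: `wait_for_zero`, `fake_node_count`, and `state_file` are hypothetical names, and the fake counter stands in for the `kubectl get nodes ... | jq` pipeline.

```shell
#!/bin/bash

# Hypothetical helper (not part of the commit): poll a command that prints a
# count until it reports 0, giving up once timeout_secs have elapsed.
wait_for_zero() {
  local timeout_secs=$1 interval_secs=$2
  shift 2
  local deadline=$(( SECONDS + timeout_secs ))

  while [[ "$("$@")" -gt 0 ]]; do
    if (( SECONDS >= deadline )); then
      return 124  # mirrors the exit status GNU timeout reports on expiry
    fi
    sleep "$interval_secs"
  done
}

# Stand-in for the "kubectl get nodes ... | jq" count so the sketch runs
# without a cluster: each call prints the stored value, then lowers it by one.
state_file=$(mktemp)
echo 2 > "$state_file"
fake_node_count() {
  local n
  n=$(<"$state_file")
  echo "$n"
  if (( n > 0 )); then
    echo $(( n - 1 )) > "$state_file"
  fi
}

# As in the script above, capture a non-zero status instead of letting it
# abort the run, then report it as a warning.
EXIT_CODE=0
wait_for_zero 30 0 fake_node_count || EXIT_CODE=$?

if [ "$EXIT_CODE" -ne 0 ]; then
  echo "Warning: nodes did not clean up"
else
  echo "All nodes removed"
fi
```

One difference worth noting: the sketch only checks the deadline between polls, whereas wrapping the loop in `timeout`, as the script in the diff does, can also interrupt a poll command that hangs.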
128 changes: 128 additions & 0 deletions
manifests/modules/aiml/inferentia/.workshop/terraform/addon.tf
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,128 @@ | ||
data "aws_subnets" "private" { | ||
tags = { | ||
created-by = "eks-workshop-v2" | ||
env = local.addon_context.eks_cluster_id | ||
} | ||
|
||
filter { | ||
name = "tag:Name" | ||
values = ["*Private*"] | ||
} | ||
} | ||
|
||
module "iam_assumable_role_inference" { | ||
source = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc" | ||
version = "~> v5.5.0" | ||
create_role = true | ||
role_name = "${local.addon_context.eks_cluster_id}-inference" | ||
provider_url = local.addon_context.eks_oidc_issuer_url | ||
role_policy_arns = [aws_iam_policy.inference.arn] | ||
oidc_fully_qualified_subjects = ["system:serviceaccount:aiml:inference"] | ||
|
||
tags = local.tags | ||
} | ||
|
||
|
||
resource "aws_iam_policy" "inference" { | ||
name = "${local.addon_context.eks_cluster_id}-inference" | ||
path = "/" | ||
description = "IAM policy for the inference workload" | ||
|
||
policy = <<EOF | ||
{ | ||
"Version": "2012-10-17", | ||
"Statement": [ | ||
{ | ||
"Effect": "Allow", | ||
"Action": "s3:*", | ||
"Resource": [ | ||
"arn:aws:s3:::${aws_s3_bucket.inference.id}", | ||
"arn:aws:s3:::${aws_s3_bucket.inference.id}/*" | ||
] | ||
} | ||
] | ||
} | ||
EOF | ||
} | ||
|
||
module "karpenter" { | ||
source = "github.com/aws-ia/terraform-aws-eks-blueprints?ref=v4.25.0//modules/kubernetes-addons/karpenter" | ||
addon_context = merge(local.addon_context, { default_repository = local.amazon_container_image_registry_uris[data.aws_region.current.name] }) | ||
|
||
node_iam_instance_profile = aws_iam_instance_profile.karpenter_node.name | ||
|
||
helm_config = { | ||
set = [{ | ||
name = "replicas" | ||
value = "1" | ||
}] | ||
} | ||
} | ||
|
||
resource "aws_iam_instance_profile" "karpenter_node" { | ||
name = "${local.addon_context.eks_cluster_id}-karpenter-node" | ||
role = aws_iam_role.karpenter_node.name | ||
} | ||
|
||
resource "aws_iam_role" "karpenter_node" { | ||
name = "${local.addon_context.eks_cluster_id}-karpenter-node" | ||
|
||
assume_role_policy = jsonencode({ | ||
Version = "2012-10-17" | ||
Statement = [ | ||
{ | ||
Action = "sts:AssumeRole" | ||
Effect = "Allow" | ||
Sid = "" | ||
Principal = { | ||
Service = "ec2.amazonaws.com" | ||
} | ||
}, | ||
] | ||
}) | ||
|
||
managed_policy_arns = [ | ||
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEKS_CNI_Policy", | ||
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEKSWorkerNodePolicy", | ||
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly", | ||
"arn:${local.addon_context.aws_partition_id}:iam::aws:policy/AmazonSSMManagedInstanceCore" | ||
] | ||
|
||
tags = local.tags | ||
} | ||
|
||
data "http" "neuron_device_plugin_rbac_manifest" { | ||
url = "https://raw.githubusercontent.com/aws-neuron/aws-neuron-sdk/v2.6.0/src/k8/k8s-neuron-device-plugin-rbac.yml" | ||
} | ||
|
||
data "http" "neuron_device_plugin_manifest" { | ||
url = "https://raw.githubusercontent.com/aws-neuron/aws-neuron-sdk/v2.6.0/src/k8/k8s-neuron-device-plugin.yml" | ||
} | ||
|
||
data "kubectl_file_documents" "neuron_device_plugin_rbac_doc" { | ||
content = data.http.neuron_device_plugin_rbac_manifest.response_body | ||
} | ||
|
||
data "kubectl_file_documents" "neuron_device_plugin_doc" { | ||
content = data.http.neuron_device_plugin_manifest.response_body | ||
} | ||
|
||
resource "kubectl_manifest" "neuron_device_plugin_rbac" { | ||
for_each = data.kubectl_file_documents.neuron_device_plugin_rbac_doc.manifests | ||
yaml_body = each.value | ||
} | ||
|
||
resource "kubectl_manifest" "neuron_device_plugin" { | ||
for_each = data.kubectl_file_documents.neuron_device_plugin_doc.manifests | ||
yaml_body = each.value | ||
} | ||
|
||
output "environment" { | ||
value = <<EOF | ||
export AIML_NEURON_ROLE_ARN=${module.iam_assumable_role_inference.iam_role_arn} | ||
export AIML_NEURON_BUCKET_NAME=${resource.aws_s3_bucket.inference.id} | ||
export AIML_DL_IMAGE=763104351884.dkr.ecr.${data.aws_region.current.name}.amazonaws.com/pytorch-inference-neuron:1.13.1-neuron-py310-sdk2.12.0-ubuntu20.04 | ||
export AIML_SUBNETS=${data.aws_subnets.private.ids[0]},${data.aws_subnets.private.ids[1]},${data.aws_subnets.private.ids[2]} | ||
export KARPENTER_NODE_ROLE="${aws_iam_role.karpenter_node.arn}" | ||
EOF | ||
} |
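
The `environment` output above renders a block of `export` lines. A minimal sketch of how such a block might be loaded into a shell and its comma-separated subnet list split follows. It assumes the output is evaluated directly, which the diff does not confirm, and every value shown is a made-up placeholder rather than real Terraform output.

```shell
#!/bin/bash

# Stand-in for the text rendered by the "environment" output. In a real
# workspace it could be fetched with `terraform output -raw environment`
# (an assumption about usage, not something this commit shows).
environment_output='export AIML_NEURON_BUCKET_NAME=example-bucket
export AIML_SUBNETS=subnet-aaa,subnet-bbb,subnet-ccc
export KARPENTER_NODE_ROLE="arn:aws:iam::111122223333:role/example-karpenter-node"'

# Evaluate the export lines in the current shell so the variables are set.
eval "$environment_output"

# Split the comma-separated subnet IDs into a bash array.
IFS=',' read -r -a subnets <<< "$AIML_SUBNETS"

echo "bucket: $AIML_NEURON_BUCKET_NAME"
echo "subnet count: ${#subnets[@]}"
echo "node role: $KARPENTER_NODE_ROLE"
```

Because `eval` executes whatever text it is given, this approach is only reasonable when the output comes from a trusted Terraform state.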
6 changes: 6 additions & 0 deletions
manifests/modules/aiml/inferentia/.workshop/terraform/addon_infrastructure.tf
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
resource "aws_s3_bucket" "inference" { | ||
bucket_prefix = "eksworkshop-inference" | ||
force_destroy = true | ||
|
||
tags = local.tags | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
AIML_NEURON_ROLE_ARN |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
apiVersion: kustomize.config.k8s.io/v1beta1 | ||
kind: Kustomization | ||
configMapGenerator: | ||
- name: base-vars | ||
namespace: aiml | ||
env: config.properties | ||
options: | ||
disableNameSuffixHash: true | ||
replacements: | ||
- source: | ||
kind: ConfigMap | ||
name: base-vars | ||
version: v1 | ||
namespace: aiml | ||
fieldPath: data.AIML_NEURON_ROLE_ARN | ||
targets: | ||
- select: | ||
kind: ServiceAccount | ||
name: inference | ||
namespace: aiml | ||
fieldPaths: | ||
- metadata.annotations.[eks.amazonaws.com/role-arn] | ||
resources: | ||
- serviceaccount.yaml | ||
- namespace.yaml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
apiVersion: v1 | ||
kind: Namespace | ||
metadata: | ||
name: aiml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
apiVersion: v1 | ||
kind: ServiceAccount | ||
metadata: | ||
name: inference | ||
namespace: aiml | ||
annotations: | ||
eks.amazonaws.com/role-arn: ${AIML_NEURON_ROLE_ARN} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
apiVersion: v1 | ||
kind: Pod | ||
metadata: | ||
labels: | ||
role: compiler | ||
name: compiler | ||
namespace: aiml | ||
spec: | ||
containers: | ||
- command: | ||
- sh | ||
- -c | ||
- sleep infinity | ||
image: ${AIML_DL_IMAGE} | ||
name: compiler | ||
serviceAccountName: inference |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
AIML_DL_IMAGE |