From 5b4dc300d6ec7b9682c0de6e8ed254c704da66ef Mon Sep 17 00:00:00 2001
From: Blayne Chard
Date: Thu, 9 Nov 2023 10:24:44 +1300
Subject: [PATCH] docs: restructure to allow expansion on the components in use (#232)

#### Motivation

The infrastructure README is starting to get very cluttered with notes and tips, split the notes and components into their own files

#### Modification

Split more infrastructure docs into /docs

#### Checklist

_If not applicable, provide explanation of why._

- [ ] Tests updated
- [ ] Docs updated
- [ ] Issue linked in Title
---
 .../components/argo.workflows.md            |   4 +
 docs/infrastructure/components/fluentbit.md |   0
 docs/infrastructure/components/karpenter.md |   1 +
 docs/infrastructure/helm.md                 |  15 +++
 docs/infrastructure/initial.deployment.md   |  14 +++
 infra/README.md                             | 107 +++++++-----------
 6 files changed, 73 insertions(+), 68 deletions(-)
 create mode 100644 docs/infrastructure/components/argo.workflows.md
 create mode 100644 docs/infrastructure/components/fluentbit.md
 create mode 100644 docs/infrastructure/components/karpenter.md
 create mode 100644 docs/infrastructure/helm.md
 create mode 100644 docs/infrastructure/initial.deployment.md

diff --git a/docs/infrastructure/components/argo.workflows.md b/docs/infrastructure/components/argo.workflows.md
new file mode 100644
index 000000000..fd8410dfd
--- /dev/null
+++ b/docs/infrastructure/components/argo.workflows.md
@@ -0,0 +1,4 @@
+# Argo Workflows
+
+Argo Workflows is used to run the workflows inside K8s.
+It is deployed using its [Helm chart](https://github.com/argoproj/argo-helm/tree/main/charts/argo-workflows).
\ No newline at end of file
diff --git a/docs/infrastructure/components/fluentbit.md b/docs/infrastructure/components/fluentbit.md
new file mode 100644
index 000000000..e69de29bb
diff --git a/docs/infrastructure/components/karpenter.md b/docs/infrastructure/components/karpenter.md
new file mode 100644
index 000000000..f0f6647ce
--- /dev/null
+++ b/docs/infrastructure/components/karpenter.md
@@ -0,0 +1 @@
+# Karpenter
\ No newline at end of file
diff --git a/docs/infrastructure/helm.md b/docs/infrastructure/helm.md
new file mode 100644
index 000000000..11c0dae50
--- /dev/null
+++ b/docs/infrastructure/helm.md
@@ -0,0 +1,15 @@
+# Working with Helm & CDK8s
+
+It is possible to generate a specific Helm construct for a component if its chart includes a `values.schema.json`. This is useful to provide typing hints when specifying its configuration ()
+
+To generate the Helm Construct for a specific Chart, follow the instructions [here](https://github.com/cdk8s-team/cdk8s/blob/master/docs/cli/import.md#values-schema):
+
+Specify the output for the imports:
+
+`--output infra/imports/`
+
+However, some of the component Helm charts do not have a `values.schema.json`. That is the case for most of our components:
+
+- [aws-for-fluent-bit](./components/fluentbit.md) ()
+- [Karpenter](./components/karpenter.md)
+- [Argo workflows](./components/argo.workflows.md)
\ No newline at end of file
diff --git a/docs/infrastructure/initial.deployment.md b/docs/infrastructure/initial.deployment.md
new file mode 100644
index 000000000..554e49d3d
--- /dev/null
+++ b/docs/infrastructure/initial.deployment.md
@@ -0,0 +1,14 @@
+# Initial deployment
+
+The initial deployment is a two-step process where AWS-CDK is used to create an EKS cluster, then CDK8s is used to deploy the applications into the cluster.
+
+
+## Custom Resource Definitions
+
+The first time a cluster is deployed, the Custom Resource Definitions (CRDs) will not exist. When `kubectl` is used to deploy the CRDs for [Karpenter](./components/karpenter.md) and [Argo Workflows](./components/argo.workflows.md), it does not wait for the CRDs to finish deploying before starting the next step.
+
+This means that any resources that require a CRD will fail to deploy with an error similar to
+
+> resource mapping not found for name: "karpenter-template" namespace: "" from "dist/0003-karpenter-provisioner.k8s.yaml": no matches for kind "AWSNodeTemplate" in version "karpenter.k8s.aws/v1alpha1"
+
+To work around this problem the first deployment can be repeated, as the CRDs are deployed early in the deployment process.
\ No newline at end of file
diff --git a/infra/README.md b/infra/README.md
index a56b01dff..789bff7f2 100644
--- a/infra/README.md
+++ b/infra/README.md
@@ -1,95 +1,71 @@
 # Topo-Workflows Infrastructure
 
-The infrastructure running the workflows is mainly based on a Kubernetes (EKS) cluster and Argo Workflows. It is currently run on AWS.
-Generally all Kubernetes resources are defined with cdk8s and anything that needs AWS interactions such as service accounts are defined with CDK.
+The infrastructure running the workflows is mainly based on a Kubernetes (AWS EKS) cluster and Argo Workflows.
 
-## EKS Cluster / AWS CDK
-
-The EKS Cluster base configuration is defined in `./cdk.ts` using [`aws-cdk`](https://aws.amazon.com/cdk/).
-
-### Deployment
-
-To deploy with AWS CDK a few configuration variables need to be set
-
-Due to VPC lookups a AWS account ID needs to be provided
+Generally all Kubernetes resources are defined with [`cdk8s`](https://cdk8s.io/) and anything that needs AWS interactions, such as service accounts, is defined with [`aws-cdk`](https://aws.amazon.com/cdk/).
 
-This can be done with either a `export CDK_DEFAULT_ACCOUNT=1234567890` or passed in at run time with `-c aws-account-id=1234567890`
-
-Then a deployment can be made with `cdk`
-
-```
-npx cdk diff -c aws-account-id=1234567890 -c ci-role-arn=arn::...
-```
+## EKS Cluster / AWS CDK
 
-#### Context
+The EKS Cluster base configuration is defined in [./cdk.ts](./cdk.ts) using [`aws-cdk`](https://aws.amazon.com/cdk/).
 
-- `aws-account-id`: Account ID to deploy into
-- `ci-role-arn`: AWS Role ARN for the CI user
 
 ## Kubernetes resources / CDK8s
 
-The additional components (or Kubernetes resources) running on the EKS cluster are defined in `./cdk8s` using [`cdk8s`](https://cdk8s.io/).
+The additional components (or Kubernetes resources) running on the EKS cluster are defined in [./cdk8s.ts](./cdk8s.ts) using [`cdk8s`](https://cdk8s.io/).
 
 Main entry point: [app](./cdk8s.ts)
 
-- Argo - Argo workflows for use with [linz/topo-workflows](https://github.com/linz/topo-workflows)
-- Karpenter
+#### Components:
 
-### Argo Workflows
+- [ArgoWorkflows](../docs/infrastructure/components/argo.workflows.md) - Main Workflow engine
+- [Karpenter](../docs/infrastructure/components/karpenter.md) - Autoscale EC2 Nodes
+- [FluentBit](../docs/infrastructure/components/fluentbit.md) - Forward logs to AWS CloudWatch
 
-Argo Workflows is used to run the workflows inside K8s.
-It is deployed using its [Helm chart](https://github.com/argoproj/argo-helm/tree/main/charts/argo-workflows).
 
-#### Semaphores
 
-ConfigMap that list the synchronization limits for parallel execution of the workflows.
 
-### Karpenter
+## Deployments
 
-TODO
-
-### Event Exporter
+Ensure all dependencies are installed
 
-[`kubernetes-event-exporter`](https://github.com/resmoio/kubernetes-event-exporter) is used to log the kubernetes events. Some events are useful to be find in the logs so we can create some alerts (`WorkflowRunning`, `WorkflowFailed`, etc.).
+```shell
+npm install
+```
 
-### Generate code
+Login to AWS
+### Deploy CDK
 
-It is possible to generate a specific Helm construct for the component if their chart includes a `value.schema.json`. This is useful to provide typing hints when specifying their configuration ()
+To deploy with AWS CDK a few configuration variables need to be set
 
-To generate the Helm Construct for a specific Chart, follow the instructions [here](https://github.com/cdk8s-team/cdk8s/blob/master/docs/cli/import.md#values-schema):
+Due to VPC lookups an AWS account ID needs to be provided
 
-Specify the output for the imports:
+This can be done either with `export CDK_DEFAULT_ACCOUNT=1234567890` or by passing `-c aws-account-id=1234567890` at run time
 
-`--output infra/imports/`
+Then a deployment can be made with `cdk`
 
-However, some of the component Helm charts do not have a `values.schema.json`. And that the case for most of our components:
+```
+npx cdk diff -c aws-account-id=1234567890 -c ci-role-arn=arn::...
+```
 
-- aws-for-fluent-bit ()
-- Karpenter
-- Argo workflows
+#### CDK Context
 
-## Usage (for test)
+- `aws-account-id`: Account ID to deploy into
+- `ci-role-arn`: AWS Role ARN for the CI user
 
-Ensure all dependencies are installed
+### Deploy CDK8s
+Generate the kubernetes configuration yaml into `dist/`
 
 ```shell
-npm install
+npx cdk8s synth
 ```
 
-Login to AWS
-
-Generate the kubernetes configuration yaml into `dist/`
-
-add Helm repositories ()
+Apply the generated yaml files
 
 ```shell
-helm repo add eks https://aws.github.io/eks-charts
-helm repo add argo https://argoproj.github.io/argo-helm
+kubectl apply -f dist/
 ```
 
-```shell
-npx cdk8s synth
-```
+### Testing
 
 To debug use the following as `cdk8s syth` swallows the errors
 
@@ -97,21 +73,16 @@ To debug use the following as `cdk8s syth` swallows the errors
 npx tsx infra/cdk8s.ts
 ```
 
-Apply the generated yaml files
-
-```shell
-kubectl apply -f dist/
-```
-
-## Deployment
+## CICD Deployment
 
 The deployment of the K8s config is managed by GithubActions in [main](../.github/workflows/main.yml).
 
-## Troubleshoot
+## Notes
+
+- [Initial Deployment](../docs/infrastructure/initial.deployment.md)
+- [Version Upgrade Guide](../docs/infrastructure/kubernetes.version.md)
+- [DNS Troubleshooting](../docs/dns.configuration.md)
+- [Working with Helm](../docs/infrastructure/helm.md)
 
-- [DNS](../docs/dns.configuration.md)
 
-## Upgrading Kubernetes Versions
 
-Kubernetes upgrades very frequently. To deploy a new version follow the
-[Version Upgrade Guide](../docs/infrastructure/kubernetes.version.md)