feat: Addon for nvidia device plugin (#995)
Co-authored-by: Bryant Biggs <[email protected]>
1 parent 36c403b, commit 632a698
Showing 11 changed files with 182 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
# NVIDIA Device Plugin | ||
|
||
The NVIDIA device plugin for Kubernetes is a DaemonSet that allows you to automatically: | ||
|
||
* Expose the number of GPUs on each node of your cluster | ||
* Keep track of the health of your GPUs | ||
* Run GPU-enabled containers in your Kubernetes cluster | ||
|
||
|
||
For complete project documentation, see the [NVIDIA device plugin README](https://github.com/NVIDIA/k8s-device-plugin#readme). | ||
|
||
For more information on how to test the add-on, see this AWS [blog post](https://aws.amazon.com/blogs/compute/running-gpu-accelerated-kubernetes-workloads-on-p3-and-p2-ec2-instances-with-amazon-eks/). | ||
|
||
## Usage | ||
|
||
Deploy the NVIDIA device plugin by enabling the add-on as follows. | ||
|
||
```hcl | ||
enable_nvidia_device_plugin = true | ||
``` | ||
|
||
You can optionally customize the Helm chart via the following configuration. | ||
|
||
```hcl | ||
enable_nvidia_device_plugin = true | ||
# Optional nvidia_device_plugin_helm_config | ||
nvidia_device_plugin_helm_config = { | ||
name = "nvidia-device-plugin" | ||
chart = "nvidia-device-plugin" | ||
repository = "https://nvidia.github.io/k8s-device-plugin" | ||
version = "0.12.3" | ||
namespace = "nvidia-device-plugin" | ||
values = [templatefile("${path.module}/values.yaml", { | ||
... | ||
})] | ||
} | ||
``` | ||
|
||
### GitOps Configuration | ||
The following properties are made available for use when managing the add-on via GitOps. | ||
|
||
Refer to [locals.tf](https://github.com/aws-ia/terraform-aws-eks-blueprints/blob/main/modules/kubernetes-addons/nvidia-device-plugin/locals.tf) for the latest configuration. The GitOps add-on repository for ArgoCD is located [here](https://github.com/aws-samples/eks-blueprints-add-ons/blob/main/chart/values.yaml). | ||
|
||
```hcl | ||
argocd_gitops_config = { | ||
enable = true | ||
} | ||
``` |
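A minimal sketch of where these settings might live, assuming the add-on is consumed through the parent `kubernetes-addons` module of this repository. The module label, the `source` address, and the `module.eks_blueprints.eks_cluster_id` reference are assumptions for illustration and do not appear in this diff:

```hcl
# Hypothetical caller configuration; only enable_nvidia_device_plugin and
# nvidia_device_plugin_helm_config come from the documentation above.
module "eks_blueprints_kubernetes_addons" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons"

  # Assumed reference to a cluster created elsewhere in the same configuration
  eks_cluster_id = module.eks_blueprints.eks_cluster_id

  enable_nvidia_device_plugin = true

  # Optional: set only the keys that should differ from the module defaults
  nvidia_device_plugin_helm_config = {
    namespace = "nvidia-device-plugin"
  }
}
```

Since `locals.tf` in this commit merges the supplied map over `default_helm_config`, a partial map like the one above should be enough; keys that are not set keep their defaults.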
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
# NVIDIA Device Plugin | ||
|
||
The NVIDIA device plugin for Kubernetes is a DaemonSet that allows you to automatically: | ||
|
||
* Expose the number of GPUs on each node of your cluster | ||
* Keep track of the health of your GPUs | ||
* Run GPU-enabled containers in your Kubernetes cluster | ||
|
||
Read the add-on [docs](../../../docs/add-ons/nvidia-device-plugin.md) for more details. | ||
|
||
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK --> | ||
## Requirements | ||
|
||
| Name | Version | | ||
|------|---------| | ||
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.0.0 | | ||
|
||
## Providers | ||
|
||
No providers. | ||
|
||
## Modules | ||
|
||
| Name | Source | Version | | ||
|------|--------|---------| | ||
| <a name="module_helm_addon"></a> [helm\_addon](#module\_helm\_addon) | ../helm-addon | n/a | | ||
|
||
## Resources | ||
|
||
No resources. | ||
|
||
## Inputs | ||
|
||
| Name | Description | Type | Default | Required | | ||
|------|-------------|------|---------|:--------:| | ||
| <a name="input_addon_context"></a> [addon\_context](#input\_addon\_context) | Input configuration for the addon | <pre>object({<br> aws_caller_identity_account_id = string<br> aws_caller_identity_arn = string<br> aws_eks_cluster_endpoint = string<br> aws_partition_id = string<br> aws_region_name = string<br> eks_cluster_id = string<br> eks_oidc_issuer_url = string<br> eks_oidc_provider_arn = string<br> tags = map(string)<br> irsa_iam_role_path = string<br> irsa_iam_permissions_boundary = string<br> })</pre> | n/a | yes | | ||
| <a name="input_helm_config"></a> [helm\_config](#input\_helm\_config) | Helm provider config for the add-on | `any` | `{}` | no | | ||
| <a name="input_manage_via_gitops"></a> [manage\_via\_gitops](#input\_manage\_via\_gitops) | Determines if the add-on should be managed via GitOps. | `bool` | `false` | no | | ||
|
||
## Outputs | ||
|
||
| Name | Description | | ||
|------|-------------| | ||
| <a name="output_argocd_gitops_config"></a> [argocd\_gitops\_config](#output\_argocd\_gitops\_config) | Configuration used for managing the add-on with ArgoCD | | ||
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
locals { | ||
name = "nvidia-device-plugin" | ||
version = "0.12.3" | ||
|
||
default_helm_config = { | ||
name = local.name | ||
chart = local.name | ||
repository = "https://nvidia.github.io/k8s-device-plugin" | ||
version = local.version | ||
namespace = local.name | ||
description = "nvidia-device-plugin Helm Chart deployment configuration" | ||
create_namespace = true | ||
} | ||
|
||
helm_config = merge( | ||
local.default_helm_config, | ||
var.helm_config | ||
) | ||
|
||
argocd_gitops_config = { | ||
enable = true | ||
} | ||
} |
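The `merge()` call above decides which side wins when a key appears in both maps: Terraform gives precedence to the later argument, so values in `var.helm_config` override the defaults. A small standalone sketch of that behavior, where `user_helm_config` stands in for `var.helm_config` and the override version is a hypothetical value chosen only for illustration:

```hcl
locals {
  default_helm_config = {
    name      = "nvidia-device-plugin"
    version   = "0.12.3"
    namespace = "nvidia-device-plugin"
  }

  # Stand-in for var.helm_config; the version below is hypothetical
  user_helm_config = {
    version = "0.13.0"
  }

  # Later arguments take precedence, so version comes from user_helm_config
  # while name and namespace keep their default values
  merged_helm_config = merge(
    local.default_helm_config,
    local.user_helm_config
  )
}

output "merged_helm_config" {
  value = local.merged_helm_config
}
```

Note that `merge()` is shallow: a nested map or list supplied by the caller (for example `values`) replaces the default for that key as a whole instead of being combined with it.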
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
module "helm_addon" { | ||
source = "../helm-addon" | ||
manage_via_gitops = var.manage_via_gitops | ||
helm_config = local.helm_config | ||
addon_context = var.addon_context | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
output "argocd_gitops_config" { | ||
description = "Configuration used for managing the add-on with ArgoCD" | ||
value = var.manage_via_gitops ? local.argocd_gitops_config : null | ||
} |
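Because this output is `null` whenever `manage_via_gitops` is `false`, a consumer that aggregates several add-on outputs may want to drop the null entries before passing them on. A minimal sketch of one way to do that; the map keys and the literal values standing in for module outputs are hypothetical and not taken from this diff:

```hcl
locals {
  # Stand-ins for module outputs such as module.<addon>.argocd_gitops_config;
  # the key names here are illustrative only
  addon_gitops_configs = {
    nvidiaDevicePlugin = { enable = true }
    someOtherAddon     = null
  }

  # Keep only the add-ons that actually returned a GitOps configuration
  enabled_gitops_configs = {
    for name, config in local.addon_gitops_configs : name => config
    if config != null
  }
}

output "enabled_gitops_configs" {
  value = local.enabled_gitops_configs
}
```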
modules/kubernetes-addons/nvidia-device-plugin/variables.tf (28 additions, 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
variable "helm_config" { | ||
description = "Helm provider config for the add-on" | ||
type = any | ||
default = {} | ||
} | ||
|
||
variable "manage_via_gitops" { | ||
description = "Determines if the add-on should be managed via GitOps." | ||
type = bool | ||
default = false | ||
} | ||
|
||
variable "addon_context" { | ||
description = "Input configuration for the addon" | ||
type = object({ | ||
aws_caller_identity_account_id = string | ||
aws_caller_identity_arn = string | ||
aws_eks_cluster_endpoint = string | ||
aws_partition_id = string | ||
aws_region_name = string | ||
eks_cluster_id = string | ||
eks_oidc_issuer_url = string | ||
eks_oidc_provider_arn = string | ||
tags = map(string) | ||
irsa_iam_role_path = string | ||
irsa_iam_permissions_boundary = string | ||
}) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
terraform { | ||
required_version = ">= 1.0.0" | ||
} |