NVIDIA® GPU Operator helps you provision GPUs in Kubernetes clusters. Using the operator pattern to extend Kubernetes, NVIDIA GPU Operator automatically manages the software components needed to provision GPUs, such as the NVIDIA drivers (to enable CUDA), the Kubernetes device plugin for GPUs, the NVIDIA Container Toolkit, automatic node labelling with GPU Feature Discovery (GFD), DCGM-based monitoring, and others.
NVIDIA® GPU Operator automates GPU setup in Kubernetes clusters.
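Once the operator has deployed the driver, container toolkit, and device plugin, workloads request GPUs through the `nvidia.com/gpu` resource. Below is a minimal, non-authoritative sketch of such a pod; the pod name and CUDA image tag are placeholder examples, not part of this application.

```bash
# Sketch: a pod that requests one GPU via the device plugin managed by the
# GPU Operator. The pod name and image tag are examples only.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: cuda-smoke-test
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
# Once the pod completes, `kubectl logs cuda-smoke-test` should show the nvidia-smi output.
```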
- Configure the application:
  - RDMA: select this option to enable GPUDirect RDMA and speed up data exchange between GPUs. Selecting this option is recommended (for the equivalent Helm setting, see the sketch after these steps).
- Click Install.
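The installation form drives the Helm chart referenced in the license note below. As a non-authoritative sketch, a manual installation with the RDMA option enabled would look roughly like the following; the release name and namespace are arbitrary examples, and `driver.rdma.enabled` is assumed to be the chart value that toggles GPUDirect RDMA.

```bash
# Sketch only: manual installation of the NVIDIA GPU Operator chart with
# GPUDirect RDMA enabled. Release name and namespace are examples.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace \
  --set driver.rdma.enabled=true
```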
To check that the NVIDIA GPU Operator is working:
- Install kubectl and configure it to work with the created cluster.
- Check that NVIDIA GPU Operator pods are running:

  kubectl get pods -n <namespace>
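Beyond the pod status, you can confirm that the node advertises GPU resources and carries the labels applied by GPU Feature Discovery. This is a sketch of such checks; `<node-name>` is a placeholder for one of the cluster's GPU nodes.

```bash
# GPU capacity and allocatable resources advertised by the device plugin:
kubectl describe node <node-name> | grep "nvidia.com/gpu"
# Node labels added by GPU Feature Discovery (GFD):
kubectl get node <node-name> --show-labels | tr ',' '\n' | grep "nvidia.com"
```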
- Automating management of GPU software components in Kubernetes clusters.
- Scaling GPU deployments in Kubernetes.
By using the application, you agree to the terms and conditions of the Helm chart and NVIDIA GPU Operator licenses.