This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

FPGA support for your Katib Experiments

Let's spawn F1 instances and accelerate time-consuming Katib experiments on AWS, with zero FPGA knowledge!

If you want to read more about provisioning FPGA resources and deploying accelerated applications (e.g. Kubeflow Pipelines) on any Kubernetes cluster, visit the InAccel documentation.

Simplifying FPGA management in EKS* (Elastic Kubernetes Service)

*For development and testing purposes, you can still deploy Kubeflow Katib using MicroK8s on a single AMI-based instance. In production environments, Amazon's managed Kubernetes service (EKS) is recommended.

The InAccel FPGA Operator allows administrators of Kubernetes clusters to manage FPGA nodes just like CPU nodes in the cluster. Instead of provisioning a special OS image for FPGA nodes, administrators can use a standard OS image for both CPU and FPGA nodes and let the FPGA Operator provision the software components that FPGAs require.

Note that the FPGA Operator is especially useful when the Kubernetes cluster needs to scale quickly, for example when provisioning additional FPGA nodes in the cloud and managing the lifecycle of the underlying software components.

Enabling FPGA-based workloads

The following section describes how to run a workload on an FPGA-based instance with the InAccel FPGA Operator.

After your FPGA worker nodes join your cluster, install the InAccel FPGA Operator for Kubernetes as a Helm chart on your cluster with the following commands:

helm repo add inaccel https://setup.inaccel.com/helm

helm install -n kube-system inaccel inaccel/fpga-operator
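
If you prefer a single repeatable step, the two commands above can be wrapped in a small shell function. This is only a sketch: `install_fpga_operator` is a hypothetical helper name, and the `helm repo update` and `helm status` calls are additions that refresh the local chart index and report the state of the `inaccel` release after installation.

```shell
# Sketch: wrap the install commands in one function.
# install_fpga_operator is a hypothetical name, not part of the InAccel tooling.
install_fpga_operator() {
  # Register the InAccel chart repository (command taken from this README).
  helm repo add inaccel https://setup.inaccel.com/helm &&
  # Refresh the local chart index so the latest chart version is visible.
  helm repo update &&
  # Install the operator into kube-system (command taken from this README).
  helm install -n kube-system inaccel inaccel/fpga-operator &&
  # Print the release status; stops earlier if any previous step failed.
  helm status -n kube-system inaccel
}
```

Defining the function runs nothing; call `install_fpga_operator` once your kubeconfig points at the target cluster.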

You can verify that your nodes have available FPGAs with the following command:

kubectl get nodes -o custom-columns=NAME:metadata.name,FPGAS:.status.capacity.xilinx/aws-vu9p-f1,SHELL:.metadata.labels.xilinx/aws-vu9p-f1
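
Once a node reports FPGA capacity, a workload can ask for it the same way it would ask for any Kubernetes extended resource: through `resources.limits`. The sketch below writes a minimal Pod manifest to disk. It assumes that the resource name from the verification command (`xilinx/aws-vu9p-f1`) is the one containers request; the pod name, the file name, and the `busybox` image are placeholders, and a real workload would use an image that contains an accelerated application.

```shell
# Sketch: generate a minimal Pod manifest that requests one F1 FPGA.
# Assumptions: fpga-smoke-test and busybox are placeholders for illustration only.
cat > fpga-smoke-test.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: fpga-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: main
    image: busybox
    command: ["sh", "-c", "echo scheduled on an FPGA node"]
    resources:
      limits:
        # Extended resources are requested under limits; the name is the one
        # shown in the node capacity column above.
        xilinx/aws-vu9p-f1: 1
EOF

# Against a cluster with the operator installed, you could then run:
#   kubectl apply -f fpga-smoke-test.yaml
#   kubectl get pod fpga-smoke-test -o wide
```

If no node has a free FPGA, the pod should stay in `Pending`, which makes this a quick check that scheduling honors the FPGA resource.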

Experiment

You can submit a new accelerated Experiment and check your Experiment results using the Web UI, as usual.

XGBoost Parameter Tuning [source]

kubectl apply -f xgboost-example.yaml
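
Besides the Web UI, progress can also be followed from the command line. The following is a sketch that assumes Katib's standard `Experiment` and `Trial` custom resources (`experiments.kubeflow.org` and `trials.kubeflow.org`) are installed. The namespace used by `xgboost-example.yaml` is not stated here, so the sketch lists across all namespaces; `show_katib_progress` is a hypothetical helper name.

```shell
# Sketch: list Katib Experiments and their Trials from the CLI.
# show_katib_progress is a hypothetical name; the namespace is unknown,
# so both queries span all namespaces.
show_katib_progress() {
  kubectl get experiments.kubeflow.org --all-namespaces &&
  kubectl get trials.kubeflow.org --all-namespaces
}
```

Call `show_katib_progress` after applying the Experiment to see which trials have been created and their current status.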