diff --git a/solutions/kueue-admission-check/README.md b/solutions/kueue-admission-check/README.md
new file mode 100644
index 000000000..beacb45dd
--- /dev/null
+++ b/solutions/kueue-admission-check/README.md
@@ -0,0 +1,641 @@
+# Set up MultiKueue with the OCM Kueue Admission Check Controller
+
+This guide demonstrates how to use the external OCM [Kueue Admission Check Controller](https://kueue.sigs.k8s.io/docs/concepts/admission_check/), which integrates OCM `Placement` results with [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) for intelligent multi-cluster job scheduling.
+The controller reads OCM `Placement` decisions and generates the corresponding `MultiKueueConfig` and `MultiKueueCluster` resources, streamlining the setup of the [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) environment and enabling users to select clusters based on custom criteria.
+We'll walk through different user stories that showcase the power and flexibility of this integration.
+
+## Background
+
+### Existing Components
+
+1. **OCM Placement and AddonPlacementScore**:
+
+- `Placement` is used to dynamically select a set of `ManagedClusters` in one or multiple `ManagedClusterSets` to achieve multi-cluster scheduling.
+- `AddOnPlacementScore` is an API introduced by `Placement` to support scheduling based on customized scores.
+
+2. **Kueue MultiKueue and AdmissionChecks**:
+
+- [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) is a feature of Kueue for job dispatching across multiple clusters.
+- [AdmissionChecks](https://kueue.sigs.k8s.io/docs/concepts/admission_check/) are a mechanism that allows Kueue to consider additional criteria before admitting a workload. Kueue only proceeds with a workload if all associated AdmissionChecks return a positive signal.
+
+REF: [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/), [Admission Check](https://kueue.sigs.k8s.io/docs/concepts/admission_check/), [Placement](https://open-cluster-management.io/concepts/placement/).
+
+## Motivation
+
+- Setting up a [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) environment for multiple clusters is a complex and manual process, often requiring users to create `MultiKueueCluster` and `MultiKueueConfig` resources for each worker cluster individually.
+
+- Driven by the growing need for optimal compute resource utilization, particularly in AI/ML workloads, multi-cluster users increasingly seek to leverage the OCM framework with [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) for intelligent cluster selection.
+
+REF: [Setup a MultiKueue environment](https://kueue.sigs.k8s.io/docs/tasks/manage/setup_multikueue/#multikueue-specific-kubeconfig)
+
+## Prerequisites
+
+1. A Kubernetes environment with OCM installed on a hub cluster and at least three managed clusters.
+2. [Kueue](https://kueue.sigs.k8s.io/docs/installation/) deployed across all clusters.
+3. [Managed-serviceaccount](https://github.com/open-cluster-management-io/managed-serviceaccount), [cluster-permission](https://github.com/open-cluster-management-io/cluster-permission) and [resource-usage-collect-addon](https://github.com/open-cluster-management-io/addon-contrib/tree/main/resource-usage-collect-addon) installed on the managed clusters.
+
+- You can set up all of the above by running:
+```bash
+./setup-env.sh
+```
+**Notice**: Currently, this functionality relies on `ClusterProfile` support and a manual installation of the Admission Check Controller.
+The `setup-env.sh` script achieves this by replacing some OCM images with custom builds. In the future, we plan to address the items listed in the [TODO section](#todo).
+
+After that, you can verify your setup.
+
+- Check the managed clusters.
+
+```bash
+kubectl get mcl
+NAME       HUB ACCEPTED   MANAGED CLUSTER URLS                  JOINED   AVAILABLE   AGE
+cluster1   true           https://cluster1-control-plane:6443   True     True        116s
+cluster2   true           https://cluster2-control-plane:6443   True     True        94s
+cluster3   true           https://cluster3-control-plane:6443   True     True        73s
+```
+- Verify the installed addons.
+```bash
+kubectl get mca -A
+NAMESPACE   NAME                     AVAILABLE   DEGRADED   PROGRESSING
+cluster1    managed-serviceaccount   True                   False
+cluster1    resource-usage-collect   True                   False
+cluster2    managed-serviceaccount   True                   False
+cluster2    resource-usage-collect   True                   False
+cluster3    managed-serviceaccount   True                   False
+cluster3    resource-usage-collect   True                   False
+```
+- Confirm Kueue is running on the clusters.
+```bash
+kubectl get pods -n kueue-system --context kind-hub # Same for the managed clusters.
+NAME                                       READY   STATUS    RESTARTS   AGE
+kueue-controller-manager-87bd7888b-gqk4g   2/2     Running   0          69s
+```
+
+- On the hub cluster, check the `ClusterProfiles`.
+```bash
+kubectl get clusterprofile -A
+NAMESPACE                 NAME       AGE
+open-cluster-management   cluster1   23s
+open-cluster-management   cluster2   23s
+open-cluster-management   cluster3   23s
+```
+- The `ClusterProfile` status contains credentials that Kueue can use.
+```bash
+kubectl get clusterprofile -A -ojson | jq '.items[] | .metadata.name, .status.credentials[]'
+"cluster1"
+{
+  "accessRef": {
+    "kind": "Secret",
+    "name": "kueue-admin-cluster1-kubeconfig",
+    "namespace": "kueue-system"
+  },
+  "consumer": "kueue-admin"
+}
+"cluster2"
+{
+  "accessRef": {
+    "kind": "Secret",
+    "name": "kueue-admin-cluster2-kubeconfig",
+    "namespace": "kueue-system"
+  },
+  "consumer": "kueue-admin"
+}
+"cluster3"
+{
+  "accessRef": {
+    "kind": "Secret",
+    "name": "kueue-admin-cluster3-kubeconfig",
+    "namespace": "kueue-system"
+  },
+  "consumer": "kueue-admin"
+}
+```
+- On the hub cluster, check that a secret holding the `kubeconfig` for each managed cluster has been created in the `kueue-system` namespace.
+```bash
+kubectl get secret -n kueue-system
+NAME                              TYPE     DATA   AGE
+kueue-admin-cluster1-kubeconfig   Opaque   1      4m4s
+kueue-admin-cluster2-kubeconfig   Opaque   1      4m4s
+kueue-admin-cluster3-kubeconfig   Opaque   1      4m4s
+kueue-webhook-server-cert         Opaque   4      5m27s
+```
+
+## User Stories
+
+#### Story 1
+
+As an admin, I want to automate [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) configuration across multiple clusters, so that I can streamline the setup process without manual intervention.
+
+- With the help of the `ClusterProfile` API, we can easily set up the MultiKueue environment.
+```bash
+kubectl apply -f ./multikueue-setup-demo1.yaml
+```
+- After that, check the status of the `MultiKueueClusters`, `AdmissionChecks` and `ClusterQueues`.
+
+```bash
+kubectl get multikueuecluster -A -ojson | jq '.items[] | .metadata.name, .status.conditions'
+kubectl get admissionchecks -ojson | jq '.items[] | .metadata.name, .status.conditions'
+kubectl get clusterqueues -ojson | jq '.items[] | .metadata.name, .status.conditions'
+```
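+
+The objects being checked here come from `multikueue-setup-demo1.yaml`, which wires each worker cluster into MultiKueue by pointing a `MultiKueueCluster` at the kubeconfig secret provisioned through the `ClusterProfile` flow above; trimmed to one cluster, the relevant piece looks like this:
+```yaml
+apiVersion: kueue.x-k8s.io/v1alpha1
+kind: MultiKueueCluster
+metadata:
+  name: multikueue-demo1-cluster1
+spec:
+  kubeConfig:
+    locationType: Secret
+    location: kueue-admin-cluster1-kubeconfig # provisioned in kueue-system by the ClusterProfile flow
+```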
+Success is indicated when `"status": "True"` and reasons like `Active` or `Ready` are present in the conditions.
+
+```bash
+"multikueue-demo1-cluster1"
+[
+  {
+    "lastTransitionTime": "2024-08-31T20:41:41Z",
+    "message": "Connected",
+    "observedGeneration": 1,
+    "reason": "Active",
+    "status": "True",
+    "type": "Active"
+  }
+]
+"multikueue-demo1-cluster2"
+[
+  {
+    "lastTransitionTime": "2024-08-31T20:41:41Z",
+    "message": "Connected",
+    "observedGeneration": 1,
+    "reason": "Active",
+    "status": "True",
+    "type": "Active"
+  }
+]
+"multikueue-demo1"
+[
+  {
+    "lastTransitionTime": "2024-08-31T20:41:41Z",
+    "message": "The admission check is active",
+    "observedGeneration": 1,
+    "reason": "Active",
+    "status": "True",
+    "type": "Active"
+  },
+  {
+    "lastTransitionTime": "2024-08-31T20:41:41Z",
+    "message": "only one multikueue managed admission check can be used in one ClusterQueue",
+    "observedGeneration": 1,
+    "reason": "MultiKueue",
+    "status": "True",
+    "type": "SingleInstanceInClusterQueue"
+  },
+  {
+    "lastTransitionTime": "2024-08-31T20:41:41Z",
+    "message": "admission check cannot be applied at ResourceFlavor level",
+    "observedGeneration": 1,
+    "reason": "MultiKueue",
+    "status": "True",
+    "type": "FlavorIndependent"
+  }
+]
+"cluster-queue-demo1"
+[
+  {
+    "lastTransitionTime": "2024-08-31T20:41:41Z",
+    "message": "Can admit new workloads",
+    "observedGeneration": 1,
+    "reason": "Ready",
+    "status": "True",
+    "type": "Active"
+  }
+]
+```
+- Deploy a job to the MultiKueue.
+
+```bash
+kubectl create -f ./job-demo1.yaml
+```
+- Check the workload on the managed clusters. When the job's Workload receives a QuotaReservation in the manager cluster, a copy of the Workload is created in all configured worker clusters.
+Once `kind-cluster1` admits the workload, the manager removes the corresponding workloads from the other clusters (here `kind-cluster2`).
+```bash
+kubectl get workload --context kind-cluster1
+NAME                       QUEUE              RESERVED IN           ADMITTED   AGE
+job-demo1-jobnktc6-6c5f3   user-queue-demo1   cluster-queue-demo1   True       5s
+
+kubectl get workload --context kind-cluster2
+No resources found in default namespace. # After cluster1 admits the workload, no workload should show up here.
+```
+#### Story 2
+
+As an admin, I want to use OCM `Placement` results for scheduling, so that clusters with specific attributes, like those with the `nvidia-tesla-t4` GPU accelerator label, are automatically selected and converted into a [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) configuration for targeted workload deployment.
+
+- You can manually label the accelerators on the clusters.
+```bash
+kubectl label managedcluster cluster2 accelerator=nvidia-tesla-t4
+kubectl label managedcluster cluster3 accelerator=nvidia-tesla-t4
+```
+The `placement-demo2-1.yaml` placement selects clusters with the `nvidia-tesla-t4` accelerator label.
+```yaml
+apiVersion: cluster.open-cluster-management.io/v1beta1
+kind: Placement
+metadata:
+  name: placement-demo2
+  namespace: kueue-system
+spec:
+  clusterSets:
+    - spoke
+  tolerations:
+    - key: cluster.open-cluster-management.io/unreachable
+      operator: Exists
+    - key: cluster.open-cluster-management.io/unavailable
+      operator: Exists
+  predicates:
+    - requiredClusterSelector:
+        labelSelector:
+          matchLabels:
+            accelerator: nvidia-tesla-t4
+```
+- Bind the cluster set to the Kueue namespace and verify the bindings.
+
+```bash
+clusteradm clusterset bind spoke --namespace kueue-system
+clusteradm get clustersets
+
+└── <spoke>
+    └── default,kueue-system
+        └── 3 ManagedClusters selected
+            └── [cluster1 cluster2 cluster3]
+```
+
+- Apply the placement policy.
+
+```bash
+kubectl apply -f placement-demo2-1.yaml
+```
+
+- Apply the MultiKueue setup configuration.
+
+```bash
+kubectl apply -f ./multikueue-setup-demo2.yaml
+```
+
+- Check the generated `MultiKueueConfig` and `MultiKueueClusters`.
+
+```bash
+kubectl get multikueueconfig
+NAME              AGE
+placement-demo2   60s
+
+kubectl get multikueuecluster
+NAME                       AGE
+placement-demo2-cluster2   60s
+placement-demo2-cluster3   60s
+```
+- After that, check the status of the `MultiKueueClusters`, `AdmissionChecks` and `ClusterQueues`.
+```bash
+kubectl get multikueuecluster -A -ojson | jq '.items[] | .metadata.name, .status.conditions'
+kubectl get admissionchecks -ojson | jq '.items[] | .metadata.name, .status.conditions'
+kubectl get clusterqueues -ojson | jq '.items[] | .metadata.name, .status.conditions'
+```
+On success, the conditions show `"status": "True"` with reasons like `Active` or `Ready`.
+```bash
+"placement-demo2-cluster2"
+[
+  {
+    "lastTransitionTime": "2024-08-31T22:03:16Z",
+    "message": "Connected",
+    "observedGeneration": 1,
+    "reason": "Active",
+    "status": "True",
+    "type": "Active"
+  }
+]
+"placement-demo2-cluster3"
+[
+  {
+    "lastTransitionTime": "2024-08-31T22:03:16Z",
+    "message": "Connected",
+    "observedGeneration": 1,
+    "reason": "Active",
+    "status": "True",
+    "type": "Active"
+  }
+]
+"multikueue-demo2" # The status of the admission check `multikueue-demo2`
+[
+  {
+    "lastTransitionTime": "2024-08-31T22:03:16Z",
+    "message": "The admission check is active",
+    "observedGeneration": 1,
+    "reason": "Active",
+    "status": "True",
+    "type": "Active"
+  },
+  {
+    "lastTransitionTime": "2024-08-31T22:03:16Z",
+    "message": "only one multikueue managed admission check can be used in one ClusterQueue",
+    "observedGeneration": 1,
+    "reason": "MultiKueue",
+    "status": "True",
+    "type": "SingleInstanceInClusterQueue"
+  },
+  {
+    "lastTransitionTime": "2024-08-31T22:03:16Z",
+    "message": "admission check cannot be applied at ResourceFlavor level",
+    "observedGeneration": 1,
+    "reason": "MultiKueue",
+    "status": "True",
+    "type": "FlavorIndependent"
+  }
+]
+"placement-demo2" # The status of the admission check `placement-demo2`
+[
+  {
+    "lastTransitionTime": "2024-08-31T22:03:16Z",
+    "message": "MultiKueueConfig and MultiKueueCluster generated",
+    "reason": "Active",
+    "status": "True",
+    "type": "Active"
+  }
+]
+"cluster-queue-demo2"
+[
+  {
+    "lastTransitionTime": "2024-08-31T22:03:16Z",
+    "message": "Can admit new workloads",
+    "observedGeneration": 1,
+    "reason": "Ready",
+    "status": "True",
+    "type": "Active"
+  }
+]
+```
+- Create a job requesting GPU resources and submit it to the MultiKueue.
+```bash
+kubectl create -f ./job-demo2.yaml
+```
+- Check the workload on the managed clusters. As explained in Story 1, once one cluster (here `kind-cluster3`) has admitted the workload, the manager removes the corresponding workloads from the other clusters (here `kind-cluster2`).
+```bash
+kubectl get workload --context kind-cluster2
+No resources found in default namespace.
+
+kubectl get workload --context kind-cluster3
+NAME                       QUEUE              RESERVED IN           ADMITTED   AGE
+job-demo2-jobl2t6d-a8cdd   user-queue-demo2   cluster-queue-demo2   True       3s
+```
+#### Story 3
+
+As an admin, I want to leverage OCM's `AddonPlacementScore` for dynamic workload scheduling, so that clusters with higher GPU scores, indicating more available GPU resources, are selected and converted into a [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) configuration, which automatically adjusts by adding or removing clusters as scores change.
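+
+For reference, the resource-usage-collect addon publishes an `AddOnPlacementScore` named `resource-usage-score` in each cluster's namespace on the hub; a trimmed sketch (field values are illustrative) looks like this:
+```yaml
+apiVersion: cluster.open-cluster-management.io/v1alpha1
+kind: AddOnPlacementScore
+metadata:
+  name: resource-usage-score
+  namespace: cluster2 # one per managed cluster namespace on the hub
+status:
+  scores:
+  - name: gpuClusterAvailable # the score consumed by placement-demo2-2.yaml
+    value: -70
+```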
+
+`placement-demo2-2.yaml` selects clusters with the `nvidia-tesla-t4` accelerator label, and picks the one cluster with the highest GPU score, i.e. the cluster with the most GPU resources available.
+
+```yaml
+apiVersion: cluster.open-cluster-management.io/v1beta1
+kind: Placement
+metadata:
+  name: placement-demo2
+  namespace: kueue-system
+spec:
+  clusterSets:
+    - spoke
+  tolerations:
+    - key: cluster.open-cluster-management.io/unreachable
+      operator: Exists
+    - key: cluster.open-cluster-management.io/unavailable
+      operator: Exists
+  predicates:
+    - requiredClusterSelector:
+        labelSelector:
+          matchLabels:
+            accelerator: nvidia-tesla-t4
+  numberOfClusters: 1
+  prioritizerPolicy:
+    mode: Exact
+    configurations:
+      - scoreCoordinate:
+          type: AddOn
+          addOn:
+            resourceName: resource-usage-score
+            scoreName: gpuClusterAvailable
+        weight: 1
+```
+- For testing, you can manually fake GPU resources on the managed clusters; for example, on `kind-cluster2`, set 3 fake GPUs on the control-plane node (the `edit-status` kubectl plugin is used here).
+```bash
+kubectl edit-status node cluster2-control-plane --context kind-cluster2 # Same operation for the other clusters/nodes.
+```
+- Edit the `status` of the node `cluster2-control-plane`:
+```yaml
+  allocatable:
+    cpu: "8"
+    ephemeral-storage: 61202244Ki
+    hugepages-1Gi: "0"
+    hugepages-2Mi: "0"
+    hugepages-32Mi: "0"
+    hugepages-64Ki: "0"
+    memory: 8027168Ki
+    nvidia.com/gpu: "3" # Add 3 fake GPUs in allocatable
+    pods: "110"
+  capacity:
+    cpu: "8"
+    ephemeral-storage: 61202244Ki
+    hugepages-1Gi: "0"
+    hugepages-2Mi: "0"
+    hugepages-32Mi: "0"
+    hugepages-64Ki: "0"
+    memory: 8027168Ki
+    nvidia.com/gpu: "3" # Add 3 fake GPUs in capacity
+    pods: "110"
+```
+
+- In this environment, cluster1 has no GPUs, while cluster2 and cluster3 each have 3 GPUs.
+Check the `AddonPlacementScore`: scores range from -100 to 100, and clusters with more resources available have higher scores.
+cluster1, which has no GPUs, should have a score of -100, and the cluster already running a workload (from Story 2, one workload runs on `kind-cluster3`) has a lower score than the idle cluster2.
+```bash
+kubectl get addonplacementscore -A -ojson | jq '.items[] | .metadata.name, .status.scores[5]'
+"resource-usage-score"
+{
+  "name": "gpuClusterAvailable",
+  "value": -100
+}
+"resource-usage-score" # kind-cluster2 has no workload.
+{
+  "name": "gpuClusterAvailable",
+  "value": -70
+}
+"resource-usage-score" # kind-cluster3 runs the workload from Story 2, so it has fewer GPUs available and thus a lower score.
+{
+  "name": "gpuClusterAvailable",
+  "value": -80
+}
+```
+
+- Apply the changes in the `Placement` to update the MultiKueue dynamically.
+```bash
+kubectl apply -f ./placement-demo2-2.yaml
+```
+
+- Review the update in the `MultiKueueConfig`.
+```bash
+kubectl get multikueueconfig
+NAME              AGE
+placement-demo2   22m
+
+kubectl get multikueueconfig placement-demo2 -oyaml
+apiVersion: kueue.x-k8s.io/v1alpha1
+kind: MultiKueueConfig
+metadata:
+  creationTimestamp: "2024-08-31T22:03:16Z"
+  generation: 5
+  name: placement-demo2
+  resourceVersion: "18109"
+  uid: 3c16af72-94bf-4444-bf79-7e896165aabc
+spec:
+  clusters:
+  - placement-demo2-cluster2 # cluster2 has a higher GPU score, so it got selected by the placement decision.
+```
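+Because the admission check controller keeps tracking the `Placement` decisions, the `MultiKueueConfig` and `MultiKueueCluster` resources are updated automatically whenever the scores change. A simple way to observe this while experimenting:
+```bash
+kubectl get multikueueconfig,multikueuecluster -w
+```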
+- Create a job for the updated MultiKueue and check the workloads. This time the workload is admitted by `kind-cluster2`; on `kind-cluster3` you will only find the old workload from Story 2.
+```bash
+kubectl create -f ./job-demo2.yaml
+kubectl get workload --context kind-cluster2
+NAME                       QUEUE              RESERVED IN           ADMITTED   AGE
+job-demo2-jobxn888-4b91e   user-queue-demo2   cluster-queue-demo2   True       6s
+
+kubectl get workload --context kind-cluster3
+NAME                       QUEUE              RESERVED IN           ADMITTED   AGE
+job-demo2-jobl2t6d-a8cdd   user-queue-demo2   cluster-queue-demo2   True       9m13s
+```
+
+## Design Details
+
+### OCM Admission Check Controller
+
+The OCM Admission Check Controller integrates OCM `Placement` results into MultiKueue by reading `Placement` decisions and generating the necessary `MultiKueueConfig` and `MultiKueueCluster` resources.
+
+- `controllerName`: Identifies the controller that processes the Admission Check; currently set to `open-cluster-management.io/placement`.
+- `parameters`: References a configuration with additional parameters for the check; here it points to an existing OCM `Placement`. The cluster set referenced by the `Placement` must be bound to the `kueue-system` namespace.
+
+Example OCM Admission Check Controller configuration:
+
+```yaml
+# OCM implements an admission check controller to automate the MultiKueue setup process.
+# MultiKueueConfigs and MultiKueueClusters are generated dynamically based on OCM placement decisions.
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: AdmissionCheck
+metadata:
+  name: placement-demo2
+spec:
+  controllerName: open-cluster-management.io/placement
+  parameters:
+    apiGroup: cluster.open-cluster-management.io
+    kind: Placement
+    name: placement-demo2
+# Leverages OCM's placement mechanism to select clusters based on specific criteria.
+# For example, `placement-demo2-1.yaml` selects clusters with the `nvidia-tesla-t4` accelerator label.
+```
+
+### Changes in the Configuration Process with OCM Admission Check Controller
+
+Using the OCM Admission Check Controller significantly simplifies the configuration process for system administrators by automating several manual tasks.
+
+#### Before Using OCM Admission Check Controller
+
+In the traditional setup, administrators must manually configure both `MultiKueueConfig` and `MultiKueueCluster` resources:
+
+- **MultiKueueConfig**: Defines which clusters are part of the [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) environment. Admins need to specify each cluster manually.
+- **MultiKueueCluster**: Each cluster requires a `MultiKueueCluster` resource, which includes a kubeconfig secret that administrators must create manually for secure communication.
+
+```yaml
+apiVersion: kueue.x-k8s.io/v1alpha1
+kind: MultiKueueConfig
+metadata:
+  name: multikueue-config
+spec:
+  clusters:
+  - multikueue-cluster1
+  - multikueue-cluster2
+---
+apiVersion: kueue.x-k8s.io/v1alpha1
+kind: MultiKueueCluster
+metadata:
+  name: multikueue-cluster1
+spec:
+  kubeConfig:
+    locationType: Secret
+    location: kueue-admin-cluster1-kubeconfig
+---
+apiVersion: kueue.x-k8s.io/v1alpha1
+kind: MultiKueueCluster
+metadata:
+  name: multikueue-cluster2
+spec:
+  kubeConfig:
+    locationType: Secret
+    location: kueue-admin-cluster2-kubeconfig
+```
+
+#### After Using OCM Admission Check Controller
+
+With the OCM Admission Check Controller, the need for manual configuration of `MultiKueueConfig` and `MultiKueueCluster` is eliminated.
Instead, the administrator only needs to configure two admission checks in the `ClusterQueue` resource:
+`multikueue-demo2` and `placement-demo2` (see `multikueue-setup-demo2.yaml`), which leverage OCM's placement mechanism to select clusters based on specific criteria and automate the setup of `MultiKueueConfig` and `MultiKueueCluster`.
+
+```yaml
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: ClusterQueue
+metadata:
+  name: "cluster-queue-demo2"
+spec:
+  namespaceSelector: {} # match all.
+  resourceGroups:
+  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
+    flavors:
+    - name: "default-flavor-demo2"
+      resources:
+      - name: "cpu"
+        nominalQuota: 9
+      - name: "memory"
+        nominalQuota: 36Gi
+      - name: "nvidia.com/gpu"
+        nominalQuota: 3
+  admissionChecks:
+  - multikueue-demo2
+  - placement-demo2
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: AdmissionCheck
+metadata:
+  name: multikueue-demo2
+spec:
+  controllerName: kueue.x-k8s.io/multikueue
+  parameters:
+    apiGroup: kueue.x-k8s.io
+    kind: MultiKueueConfig
+    name: placement-demo2
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: AdmissionCheck
+metadata:
+  name: placement-demo2
+spec:
+  controllerName: open-cluster-management.io/placement
+  parameters:
+    apiGroup: cluster.open-cluster-management.io
+    kind: Placement
+    name: placement-demo2
+```
+
+#### OCM Admission Check Controller Workflow
+
+- The OCM Admission Check Controller retrieves the OCM `Placement` associated with an AdmissionCheck in the `kueue-system` namespace.
+- It uses a `PlacementDecisionTracker` to gather the selected clusters and retrieves their `ClusterProfile` for credentials.
+- The controller creates or updates `MultiKueueCluster` resources with the kubeconfig details for each cluster, and then lists these clusters in a `MultiKueueConfig` resource.
+- Finally, it sets the AdmissionCheck condition to true, indicating successful generation of the `MultiKueueConfig` and `MultiKueueCluster` resources and readying the [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) environment for job scheduling.
+
+## TODO
+- In the future, the Admission Check Controller may be added to `featureGates` as a user-enabled feature, or developed into a standalone component running as a pod on the hub.
+- Users may also need to enable the `ClusterProfile` feature in the `featureGates` to use the OCM Admission Check Controller. This can be done by configuring the `ClusterManager` on the hub.
+```yaml
+apiVersion: operator.open-cluster-management.io/v1
+kind: ClusterManager
+metadata:
+  name: cluster-manager
+spec:
+  registrationConfiguration:
+    featureGates:
+    - feature: ClusterProfile
+      mode: Enable
+...
+``` + diff --git a/solutions/kueue-admission-check/env/cp-c1.yaml b/solutions/kueue-admission-check/env/cp-c1.yaml new file mode 100644 index 000000000..6f5eccf48 --- /dev/null +++ b/solutions/kueue-admission-check/env/cp-c1.yaml @@ -0,0 +1,63 @@ +apiVersion: rbac.open-cluster-management.io/v1alpha1 +kind: ClusterPermission +metadata: + name: kueue-admin-cluster1 + namespace: cluster1 +spec: + clusterRole: + rules: + - apiGroups: + - batch + resources: + - jobs + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - batch + resources: + - jobs/status + verbs: + - get + - apiGroups: + - jobset.x-k8s.io + resources: + - jobsets + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - jobset.x-k8s.io + resources: + - jobsets/status + verbs: + - get + - apiGroups: + - kueue.x-k8s.io + resources: + - workloads + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - kueue.x-k8s.io + resources: + - workloads/status + verbs: + - get + - patch + - update + clusterRoleBinding: + subject: + kind: ServiceAccount + name: kueue-admin-cluster1 + namespace: open-cluster-management-agent-addon diff --git a/solutions/kueue-admission-check/env/cp-c2.yaml b/solutions/kueue-admission-check/env/cp-c2.yaml new file mode 100644 index 000000000..6199444b5 --- /dev/null +++ b/solutions/kueue-admission-check/env/cp-c2.yaml @@ -0,0 +1,63 @@ +apiVersion: rbac.open-cluster-management.io/v1alpha1 +kind: ClusterPermission +metadata: + name: kueue-admin-cluster2 + namespace: cluster2 +spec: + clusterRole: + rules: + - apiGroups: + - batch + resources: + - jobs + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - batch + resources: + - jobs/status + verbs: + - get + - apiGroups: + - jobset.x-k8s.io + resources: + - jobsets + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - jobset.x-k8s.io + resources: + - jobsets/status + verbs: + - get + - apiGroups: + - kueue.x-k8s.io + resources: + - workloads + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - kueue.x-k8s.io + resources: + - workloads/status + verbs: + - get + - patch + - update + clusterRoleBinding: + subject: + kind: ServiceAccount + name: kueue-admin-cluster2 + namespace: open-cluster-management-agent-addon diff --git a/solutions/kueue-admission-check/env/cp-c3.yaml b/solutions/kueue-admission-check/env/cp-c3.yaml new file mode 100644 index 000000000..842d9480f --- /dev/null +++ b/solutions/kueue-admission-check/env/cp-c3.yaml @@ -0,0 +1,63 @@ +apiVersion: rbac.open-cluster-management.io/v1alpha1 +kind: ClusterPermission +metadata: + name: kueue-admin-cluster3 + namespace: cluster3 +spec: + clusterRole: + rules: + - apiGroups: + - batch + resources: + - jobs + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - batch + resources: + - jobs/status + verbs: + - get + - apiGroups: + - jobset.x-k8s.io + resources: + - jobsets + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - jobset.x-k8s.io + resources: + - jobsets/status + verbs: + - get + - apiGroups: + - kueue.x-k8s.io + resources: + - workloads + verbs: + - create + - delete + - get + - list + - watch + - apiGroups: + - kueue.x-k8s.io + resources: + - workloads/status + verbs: + - get + - patch + - update + clusterRoleBinding: + subject: + kind: ServiceAccount + name: kueue-admin-cluster3 + namespace: open-cluster-management-agent-addon diff --git a/solutions/kueue-admission-check/env/msa-c1.yaml 
b/solutions/kueue-admission-check/env/msa-c1.yaml new file mode 100644 index 000000000..b7466e992 --- /dev/null +++ b/solutions/kueue-admission-check/env/msa-c1.yaml @@ -0,0 +1,7 @@ +apiVersion: authentication.open-cluster-management.io/v1beta1 +kind: ManagedServiceAccount +metadata: + name: kueue-admin-cluster1 + namespace: cluster1 +spec: + rotation: {} diff --git a/solutions/kueue-admission-check/env/msa-c2.yaml b/solutions/kueue-admission-check/env/msa-c2.yaml new file mode 100644 index 000000000..91971cdfd --- /dev/null +++ b/solutions/kueue-admission-check/env/msa-c2.yaml @@ -0,0 +1,7 @@ +apiVersion: authentication.open-cluster-management.io/v1beta1 +kind: ManagedServiceAccount +metadata: + name: kueue-admin-cluster2 + namespace: cluster2 +spec: + rotation: {} diff --git a/solutions/kueue-admission-check/env/msa-c3.yaml b/solutions/kueue-admission-check/env/msa-c3.yaml new file mode 100644 index 000000000..d9f8046e6 --- /dev/null +++ b/solutions/kueue-admission-check/env/msa-c3.yaml @@ -0,0 +1,7 @@ +apiVersion: authentication.open-cluster-management.io/v1beta1 +kind: ManagedServiceAccount +metadata: + name: kueue-admin-cluster3 + namespace: cluster3 +spec: + rotation: {} diff --git a/solutions/kueue-admission-check/env/multicluster.x-k8s.io_clusterprofiles.yaml b/solutions/kueue-admission-check/env/multicluster.x-k8s.io_clusterprofiles.yaml new file mode 100644 index 000000000..6ddbff0e9 --- /dev/null +++ b/solutions/kueue-admission-check/env/multicluster.x-k8s.io_clusterprofiles.yaml @@ -0,0 +1,219 @@ +--- +apiVersion: apiextensions.k8s.io/v1 +kind: CustomResourceDefinition +metadata: + annotations: + controller-gen.kubebuilder.io/version: v0.14.0 + name: clusterprofiles.multicluster.x-k8s.io +spec: + group: multicluster.x-k8s.io + names: + kind: ClusterProfile + listKind: ClusterProfileList + plural: clusterprofiles + singular: clusterprofile + scope: Namespaced + versions: + - name: v1alpha1 + schema: + openAPIV3Schema: + description: ClusterProfile represents a single cluster in a multi-cluster + deployment. + properties: + apiVersion: + description: |- + APIVersion defines the versioned schema of this representation of an object. + Servers should convert recognized schemas to the latest internal value, and + may reject unrecognized values. + More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources + type: string + kind: + description: |- + Kind is a string value representing the REST resource this object represents. + Servers may infer this from the endpoint the client submits requests to. + Cannot be updated. + In CamelCase. + More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds + type: string + metadata: + type: object + spec: + description: ClusterProfileSpec defines the desired state of ClusterProfile. + properties: + clusterManager: + description: ClusterManager defines which cluster manager owns this + ClusterProfile resource + properties: + name: + description: Name defines the name of the cluster manager + type: string + required: + - name + type: object + x-kubernetes-validations: + - message: ClusterManager is immutable + rule: self == oldSelf + displayName: + description: DisplayName defines a human-readable name of the ClusterProfile + type: string + required: + - clusterManager + type: object + status: + description: ClusterProfileStatus defines the observed state of ClusterProfile. 
+ properties: + conditions: + description: Conditions contains the different condition statuses + for this cluster. + items: + description: "Condition contains details for one aspect of the current + state of this API Resource.\n---\nThis struct is intended for + direct use as an array at the field path .status.conditions. For + example,\n\n\n\ttype FooStatus struct{\n\t // Represents the + observations of a foo's current state.\n\t // Known .status.conditions.type + are: \"Available\", \"Progressing\", and \"Degraded\"\n\t // + +patchMergeKey=type\n\t // +patchStrategy=merge\n\t // +listType=map\n\t + \ // +listMapKey=type\n\t Conditions []metav1.Condition `json:\"conditions,omitempty\" + patchStrategy:\"merge\" patchMergeKey:\"type\" protobuf:\"bytes,1,rep,name=conditions\"`\n\n\n\t + \ // other fields\n\t}" + properties: + lastTransitionTime: + description: |- + lastTransitionTime is the last time the condition transitioned from one status to another. + This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable. + format: date-time + type: string + message: + description: |- + message is a human readable message indicating details about the transition. + This may be an empty string. + maxLength: 32768 + type: string + observedGeneration: + description: |- + observedGeneration represents the .metadata.generation that the condition was set based upon. + For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date + with respect to the current state of the instance. + format: int64 + minimum: 0 + type: integer + reason: + description: |- + reason contains a programmatic identifier indicating the reason for the condition's last transition. + Producers of specific condition types may define expected values and meanings for this field, + and whether the values are considered a guaranteed API. + The value should be a CamelCase string. + This field may not be empty. + maxLength: 1024 + minLength: 1 + pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$ + type: string + status: + description: status of the condition, one of True, False, Unknown. + enum: + - "True" + - "False" + - Unknown + type: string + type: + description: |- + type of condition in CamelCase or in foo.example.com/CamelCase. + --- + Many .condition.type values are consistent across resources like Available, but because arbitrary conditions can be + useful (see .node.status.conditions), the ability to deconflict is important. + The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt) + maxLength: 316 + pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$ + type: string + required: + - lastTransitionTime + - message + - reason + - status + - type + type: object + type: array + credentials: + description: |- + TokenRequests describes a list of token requests on this cluster and its + approval status. + items: + properties: + accessRef: + description: RequestRef points to a specific AuthTokenRequest + object. + properties: + kind: + description: Kind is the kind of the referred token request + object. + type: string + name: + description: Name is the name of the referred token request + object. + type: string + namespace: + description: Namespace is the namespace of the referred + token request object. 
+ type: string + required: + - kind + - name + - namespace + type: object + consumer: + type: string + required: + - accessRef + - consumer + type: object + type: array + properties: + description: |- + Properties defines name/value pairs to represent properties of a cluster. + It could be a collection of ClusterProperty (KEP-2149) resources, + but could also be info based on other implementations. + The names of the properties can be predefined names from ClusterProperty resources + and is allowed to be customized by different cluster managers. + items: + description: |- + Property defines a name/value pair to represent a property of a cluster. + It could be a ClusterProperty (KEP-2149) resource, + but could also be info based on other implementations. + The name of the property can be predefined name from a ClusterProperty resource + and is allowed to be customized by different cluster managers. + This property can store various configurable details and metrics of a cluster, + which may include information such as the number of nodes, total and free CPU, + and total and free memory, among other potential attributes. + properties: + name: + description: |- + Name is the name of a property resource on cluster. It's a well-known + or customized name to identify the property. + maxLength: 253 + minLength: 1 + type: string + value: + description: Value is a property-dependent string + maxLength: 1024 + minLength: 1 + type: string + required: + - name + - value + type: object + type: array + version: + description: Version defines the version information of the cluster. + properties: + kubernetes: + description: Kubernetes is the kubernetes version of the cluster. + type: string + type: object + type: object + required: + - spec + type: object + served: true + storage: true + subresources: + status: {} diff --git a/solutions/kueue-admission-check/env/patch-clusterrole.json b/solutions/kueue-admission-check/env/patch-clusterrole.json new file mode 100644 index 000000000..0d876009d --- /dev/null +++ b/solutions/kueue-admission-check/env/patch-clusterrole.json @@ -0,0 +1,83 @@ +[ + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": ["multicluster.x-k8s.io"], + "resources": ["clusterprofiles"], + "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"] + } + }, + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": ["multicluster.x-k8s.io"], + "resources": ["clusterprofiles/status"], + "verbs": ["update", "patch"] + } + }, + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": ["rbac.open-cluster-management.io"], + "resources": ["clusterpermissions"], + "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"] + } + }, + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": ["authentication.open-cluster-management.io"], + "resources": ["managedserviceaccounts"], + "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"] + } + }, + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": ["kueue.x-k8s.io"], + "resources": ["multikueueconfigs"], + "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"] + } + }, + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": ["kueue.x-k8s.io"], + "resources": ["multikueueclusters"], + "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"] + } + }, + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": ["kueue.x-k8s.io"], + "resources": ["admissionchecks"], + "verbs": ["get", "list", 
"watch", "create", "update", "patch", "delete"] + } + }, + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": ["kueue.x-k8s.io"], + "resources": ["admissionchecks/status"], + "verbs": ["update", "patch"] + } + }, + { + "op": "add", + "path": "/rules/-", + "value": { + "apiGroups": [""], + "resources": ["secrets"], + "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"] + } + } +] diff --git a/solutions/kueue-admission-check/env/patch-mg-sa-cma.json b/solutions/kueue-admission-check/env/patch-mg-sa-cma.json new file mode 100644 index 000000000..09cfb0367 --- /dev/null +++ b/solutions/kueue-admission-check/env/patch-mg-sa-cma.json @@ -0,0 +1,18 @@ +[ + { + "op": "replace", + "path": "/spec/installStrategy", + "value": { + "placements": [ + { + "name": "placement-spoke", + "namespace": "default", + "rolloutStrategy": { + "type": "All" + } + } + ], + "type": "Placements" + } + } +] diff --git a/solutions/kueue-admission-check/env/placement.yaml b/solutions/kueue-admission-check/env/placement.yaml new file mode 100644 index 000000000..d6bfbbea4 --- /dev/null +++ b/solutions/kueue-admission-check/env/placement.yaml @@ -0,0 +1,14 @@ +# clusteradm clusterset bind global --namespace default +apiVersion: cluster.open-cluster-management.io/v1beta1 +kind: Placement +metadata: + name: placement-spoke + namespace: default +spec: + clusterSets: + - spoke + tolerations: + - key: cluster.open-cluster-management.io/unreachable + operator: Exists + - key: cluster.open-cluster-management.io/unavailable + operator: Exists diff --git a/solutions/kueue-admission-check/env/single-clusterqueue-setup-mwrs.yaml b/solutions/kueue-admission-check/env/single-clusterqueue-setup-mwrs.yaml new file mode 100644 index 000000000..de636fca9 --- /dev/null +++ b/solutions/kueue-admission-check/env/single-clusterqueue-setup-mwrs.yaml @@ -0,0 +1,90 @@ +apiVersion: work.open-cluster-management.io/v1alpha1 +kind: ManifestWorkReplicaSet +metadata: + name: single-clusterqueue + namespace: default +spec: + placementRefs: + - name: placement-spoke + manifestWorkTemplate: + workload: + manifests: + - apiVersion: rbac.authorization.k8s.io/v1 + kind: ClusterRoleBinding + metadata: + name: kueue-manager-ocm-rolebinding + roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: kueue-manager-role + subjects: + - kind: ServiceAccount + name: klusterlet-work-sa + namespace: open-cluster-management-agent + - apiVersion: rbac.authorization.k8s.io/v1 + kind: ClusterRoleBinding + metadata: + name: kueue-batch-admin-ocm-rolebinding + roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: kueue-batch-admin-role + subjects: + - kind: ServiceAccount + name: klusterlet-work-sa + namespace: open-cluster-management-agent + - apiVersion: kueue.x-k8s.io/v1beta1 + kind: ResourceFlavor + metadata: + name: "default-flavor-demo1" + - apiVersion: kueue.x-k8s.io/v1beta1 + kind: ClusterQueue + metadata: + name: "cluster-queue-demo1" + spec: + namespaceSelector: {} # match all. 
+ resourceGroups: + - coveredResources: ["cpu", "memory"] + flavors: + - name: "default-flavor-demo1" + resources: + - name: "cpu" + nominalQuota: 9 + - name: "memory" + nominalQuota: 36Gi + - apiVersion: kueue.x-k8s.io/v1beta1 + kind: LocalQueue + metadata: + namespace: "default" + name: "user-queue-demo1" + spec: + clusterQueue: "cluster-queue-demo1" + - apiVersion: kueue.x-k8s.io/v1beta1 + kind: ResourceFlavor + metadata: + name: "default-flavor-demo2" + - apiVersion: kueue.x-k8s.io/v1beta1 + kind: ClusterQueue + metadata: + name: "cluster-queue-demo2" + spec: + namespaceSelector: {} # match all. + resourceGroups: + - coveredResources: ["cpu", "memory","nvidia.com/gpu"] + flavors: + - name: "default-flavor-demo2" + resources: + - name: "cpu" + nominalQuota: 9 + - name: "memory" + nominalQuota: 36Gi + - name: "nvidia.com/gpu" + nominalQuota: 3 + - apiVersion: kueue.x-k8s.io/v1beta1 + kind: LocalQueue + metadata: + namespace: "default" + name: "user-queue-demo2" + spec: + clusterQueue: "cluster-queue-demo2" + diff --git a/solutions/kueue-admission-check/job-demo1.yaml b/solutions/kueue-admission-check/job-demo1.yaml new file mode 100644 index 000000000..e68cda738 --- /dev/null +++ b/solutions/kueue-admission-check/job-demo1.yaml @@ -0,0 +1,25 @@ +apiVersion: batch/v1 +kind: Job +metadata: + generateName: demo1-job + namespace: default + labels: + kueue.x-k8s.io/queue-name: user-queue-demo1 +spec: + parallelism: 1 + completions: 1 + suspend: true + template: + spec: + containers: + - name: dummy-job + image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0 + args: ["30s"] + resources: + requests: + cpu: "1" + memory: "200Mi" + limits: + cpu: "1" + memory: "200Mi" + restartPolicy: Never diff --git a/solutions/kueue-admission-check/job-demo2.yaml b/solutions/kueue-admission-check/job-demo2.yaml new file mode 100644 index 000000000..7b4aa845a --- /dev/null +++ b/solutions/kueue-admission-check/job-demo2.yaml @@ -0,0 +1,27 @@ +apiVersion: batch/v1 +kind: Job +metadata: + generateName: demo2-job + namespace: default + labels: + kueue.x-k8s.io/queue-name: "user-queue-demo2" +spec: + parallelism: 1 + completions: 1 + suspend: true + template: + spec: + containers: + - name: dummy-job + image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0 + args: ["600s"] + resources: + requests: + cpu: "1" + memory: "200Mi" + nvidia.com/gpu: "1" + limits: + cpu: "1" + memory: "200Mi" + nvidia.com/gpu: "1" # This job requires one GPU. + restartPolicy: Never diff --git a/solutions/kueue-admission-check/multikueue-setup-demo1.yaml b/solutions/kueue-admission-check/multikueue-setup-demo1.yaml new file mode 100644 index 000000000..3d4888c03 --- /dev/null +++ b/solutions/kueue-admission-check/multikueue-setup-demo1.yaml @@ -0,0 +1,71 @@ +apiVersion: kueue.x-k8s.io/v1beta1 +kind: ResourceFlavor +metadata: + name: "default-flavor-demo1" +--- +apiVersion: kueue.x-k8s.io/v1beta1 +kind: ClusterQueue +metadata: + name: "cluster-queue-demo1" +spec: + namespaceSelector: {} # match all. 
+  resourceGroups:
+  - coveredResources: ["cpu", "memory"]
+    flavors:
+    - name: "default-flavor-demo1"
+      resources:
+      - name: "cpu"
+        nominalQuota: 9
+      - name: "memory"
+        nominalQuota: 36Gi
+  admissionChecks:
+  - multikueue-demo1
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: LocalQueue
+metadata:
+  namespace: "default"
+  name: "user-queue-demo1"
+spec:
+  clusterQueue: "cluster-queue-demo1"
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: AdmissionCheck
+metadata:
+  name: multikueue-demo1
+spec:
+  controllerName: kueue.x-k8s.io/multikueue
+  parameters:
+    apiGroup: kueue.x-k8s.io
+    kind: MultiKueueConfig
+    name: multikueue-config-demo1
+---
+apiVersion: kueue.x-k8s.io/v1alpha1
+kind: MultiKueueConfig
+metadata:
+  name: multikueue-config-demo1
+spec:
+  clusters:
+  - multikueue-demo1-cluster1
+  - multikueue-demo1-cluster2
+---
+apiVersion: kueue.x-k8s.io/v1alpha1
+kind: MultiKueueCluster
+metadata:
+  name: multikueue-demo1-cluster1
+spec:
+  kubeConfig:
+    locationType: Secret
+    location: kueue-admin-cluster1-kubeconfig
+    # a secret called "kueue-admin-cluster1-kubeconfig" should be created in the namespace the kueue
+    # controller manager runs in, holding the kubeConfig needed to connect to the
+    # worker cluster in the "kubeconfig" key;
+---
+apiVersion: kueue.x-k8s.io/v1alpha1
+kind: MultiKueueCluster
+metadata:
+  name: multikueue-demo1-cluster2
+spec:
+  kubeConfig:
+    locationType: Secret
+    location: kueue-admin-cluster2-kubeconfig
diff --git a/solutions/kueue-admission-check/multikueue-setup-demo2.yaml b/solutions/kueue-admission-check/multikueue-setup-demo2.yaml
new file mode 100644
index 000000000..ae4a2e525
--- /dev/null
+++ b/solutions/kueue-admission-check/multikueue-setup-demo2.yaml
@@ -0,0 +1,57 @@
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: ResourceFlavor
+metadata:
+  name: "default-flavor-demo2"
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: ClusterQueue
+metadata:
+  name: "cluster-queue-demo2"
+spec:
+  namespaceSelector: {} # match all.
+  resourceGroups:
+  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
+    flavors:
+    - name: "default-flavor-demo2"
+      resources:
+      - name: "cpu"
+        nominalQuota: 9
+      - name: "memory"
+        nominalQuota: 36Gi
+      - name: "nvidia.com/gpu"
+        nominalQuota: 3
+  admissionChecks:
+  - multikueue-demo2
+  - placement-demo2
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: LocalQueue
+metadata:
+  namespace: "default"
+  name: "user-queue-demo2"
+spec:
+  clusterQueue: "cluster-queue-demo2"
+---
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: AdmissionCheck
+metadata:
+  name: multikueue-demo2
+spec:
+  controllerName: kueue.x-k8s.io/multikueue
+  parameters:
+    apiGroup: kueue.x-k8s.io
+    kind: MultiKueueConfig
+    name: placement-demo2
+---
+# OCM implements an admission check controller to automate the MultiKueue setup process.
+# MultiKueueConfigs and MultiKueueClusters are generated dynamically based on OCM placement decisions.
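+# Note: the Placement referenced below must exist in the kueue-system namespace
+# (see placement-demo2-1.yaml / placement-demo2-2.yaml), and its cluster set must be
+# bound to that namespace, e.g. `clusteradm clusterset bind spoke --namespace kueue-system`.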
+apiVersion: kueue.x-k8s.io/v1beta1
+kind: AdmissionCheck
+metadata:
+  name: placement-demo2
+spec:
+  controllerName: open-cluster-management.io/placement
+  parameters:
+    apiGroup: cluster.open-cluster-management.io
+    kind: Placement
+    name: placement-demo2
diff --git a/solutions/kueue-admission-check/placement-demo2-1.yaml b/solutions/kueue-admission-check/placement-demo2-1.yaml
new file mode 100644
index 000000000..8d58c64a4
--- /dev/null
+++ b/solutions/kueue-admission-check/placement-demo2-1.yaml
@@ -0,0 +1,18 @@
+apiVersion: cluster.open-cluster-management.io/v1beta1
+kind: Placement
+metadata:
+  name: placement-demo2
+  namespace: kueue-system
+spec:
+  clusterSets:
+    - spoke
+  tolerations:
+    - key: cluster.open-cluster-management.io/unreachable
+      operator: Exists
+    - key: cluster.open-cluster-management.io/unavailable
+      operator: Exists
+  predicates:
+    - requiredClusterSelector:
+        labelSelector:
+          matchLabels:
+            accelerator: nvidia-tesla-t4
diff --git a/solutions/kueue-admission-check/placement-demo2-2.yaml b/solutions/kueue-admission-check/placement-demo2-2.yaml
new file mode 100644
index 000000000..4934613b2
--- /dev/null
+++ b/solutions/kueue-admission-check/placement-demo2-2.yaml
@@ -0,0 +1,28 @@
+apiVersion: cluster.open-cluster-management.io/v1beta1
+kind: Placement
+metadata:
+  name: placement-demo2
+  namespace: kueue-system
+spec:
+  clusterSets:
+    - spoke
+  tolerations:
+    - key: cluster.open-cluster-management.io/unreachable
+      operator: Exists
+    - key: cluster.open-cluster-management.io/unavailable
+      operator: Exists
+  predicates:
+    - requiredClusterSelector:
+        labelSelector:
+          matchLabels:
+            accelerator: nvidia-tesla-t4
+  numberOfClusters: 1
+  prioritizerPolicy:
+    mode: Exact
+    configurations:
+      - scoreCoordinate:
+          type: AddOn
+          addOn:
+            resourceName: resource-usage-score
+            scoreName: gpuClusterAvailable
+        weight: 1
diff --git a/solutions/kueue-admission-check/setup-env.sh b/solutions/kueue-admission-check/setup-env.sh
new file mode 100755
index 000000000..2ef680a86
--- /dev/null
+++ b/solutions/kueue-admission-check/setup-env.sh
@@ -0,0 +1,126 @@
+#!/bin/bash
+
+cd $(dirname ${BASH_SOURCE})
+
+set -e
+
+hub=${HUB:-hub}
+c1=${CLUSTER1:-cluster1}
+c2=${CLUSTER2:-cluster2}
+c3=${CLUSTER3:-cluster3}
+
+hubctx="kind-${hub}"
+c1ctx="kind-${c1}"
+c2ctx="kind-${c2}"
+c3ctx="kind-${c3}"
+
+kind create cluster --name "${hub}" --image kindest/node:v1.29.0@sha256:eaa1450915475849a73a9227b8f201df25e55e268e5d619312131292e324d570
+kind create cluster --name "${c1}" --image kindest/node:v1.29.0@sha256:eaa1450915475849a73a9227b8f201df25e55e268e5d619312131292e324d570
+kind create cluster --name "${c2}" --image kindest/node:v1.29.0@sha256:eaa1450915475849a73a9227b8f201df25e55e268e5d619312131292e324d570
+kind create cluster --name "${c3}" --image kindest/node:v1.29.0@sha256:eaa1450915475849a73a9227b8f201df25e55e268e5d619312131292e324d570
+
+echo "Initialize the OCM hub cluster"
+
+clusteradm init --feature-gates="ManifestWorkReplicaSet=true,ManagedClusterAutoApproval=true" --bundle-version="latest" --wait --context ${hubctx}
+joincmd=$(clusteradm get token --context ${hubctx} | grep clusteradm)
+
+echo "Join cluster1 to hub"
+$(echo ${joincmd} --force-internal-endpoint-lookup --wait --context ${c1ctx} | sed "s/<cluster_name>/$c1/g")
+
+echo "Join cluster2 to hub"
+$(echo ${joincmd} --force-internal-endpoint-lookup --wait --context ${c2ctx} | sed "s/<cluster_name>/$c2/g")
+
+echo "Join cluster3 to hub"
+$(echo ${joincmd} --force-internal-endpoint-lookup --wait --context ${c3ctx} | sed "s/<cluster_name>/$c3/g")
+
+echo "Accept join of cluster1, cluster2 and cluster3"
+clusteradm accept --context ${hubctx} --clusters ${c1},${c2},${c3} --wait
+
+kubectl get managedclusters --all-namespaces --context ${hubctx}
+
+echo "Install Kueue (this can be replaced with OCM ManifestWork in the future)"
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.7.1/manifests.yaml --context ${hubctx}
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.7.1/manifests.yaml --context ${c1ctx}
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.7.1/manifests.yaml --context ${c2ctx}
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.7.1/manifests.yaml --context ${c3ctx}
+
+echo "Install JobSet for MultiKueue (this can be replaced with OCM ManifestWork in the future)"
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/jobset/releases/download/v0.5.2/manifests.yaml --context ${hubctx}
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/jobset/releases/download/v0.5.2/manifests.yaml --context ${c1ctx}
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/jobset/releases/download/v0.5.2/manifests.yaml --context ${c2ctx}
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/jobset/releases/download/v0.5.2/manifests.yaml --context ${c3ctx}
+
+kubectl config use-context ${hubctx}
+
+echo "Patch permission"
+kubectl patch clusterrole cluster-manager --type='json' -p "$(cat env/patch-clusterrole.json)"
+
+echo "Patch image"
+kubectl patch deployment cluster-manager -n open-cluster-management --type=json -p='[
+  {"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "quay.io/haoqing/registration-operator:latest"},
+  {"op": "replace", "path": "/spec/template/spec/containers/0/imagePullPolicy", "value": "Always"}
+]'
+kubectl patch clustermanager cluster-manager --type=json -p='[{"op": "replace", "path": "/spec/registrationImagePullSpec", "value": "quay.io/haoqing/registration:latest"}]'
+kubectl patch clustermanager cluster-manager --type=json -p='[{"op": "replace", "path": "/spec/placementImagePullSpec", "value": "quay.io/haoqing/placement:latest"}]'
+
+echo "Install CRDs"
+kubectl create -f env/multicluster.x-k8s.io_clusterprofiles.yaml
+
+echo "Install managed-serviceaccount"
+git clone git@github.com:open-cluster-management-io/managed-serviceaccount.git || true
+cd managed-serviceaccount
+helm uninstall -n open-cluster-management-addon managed-serviceaccount || true
+helm install \
+   -n open-cluster-management-addon --create-namespace \
+   managed-serviceaccount charts/managed-serviceaccount/ \
+   --set tag=latest \
+   --set featureGates.ephemeralIdentity=true \
+   --set enableAddOnDeploymentConfig=true \
+   --set hubDeployMode=AddOnTemplate
+cd -
+rm -rf managed-serviceaccount
+
+echo "Install managed-serviceaccount mca"
+clusteradm create clusterset spoke
+clusteradm clusterset set spoke --clusters ${c1},${c2},${c3}
+clusteradm clusterset bind spoke --namespace default
+kubectl apply -f env/placement.yaml || true
+kubectl patch clustermanagementaddon managed-serviceaccount --type='json' -p="$(cat env/patch-mg-sa-cma.json)" || true
+
+echo "Install cluster-permission"
+git clone git@github.com:open-cluster-management-io/cluster-permission.git || true
+cd cluster-permission
+kubectl apply -f config/crds
+kubectl apply -f config/rbac
+kubectl apply -f config/deploy
+cd -
+rm -rf cluster-permission
+
+echo "Install resource-usage-collect-addon"
+git clone git@github.com:open-cluster-management-io/addon-contrib.git || true
+cd addon-contrib/resource-usage-collect-addon
+make deploy
+cd -
+rm -rf addon-contrib
+
+echo "Enable MultiKueue on the hub"
+kubectl patch deployment kueue-controller-manager -n kueue-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args", "value": ["--config=/controller_manager_config.yaml", "--zap-log-level=2", "--feature-gates=MultiKueue=true"]}]'
+
+echo "Setup queue on the spoke"
+kubectl apply -f env/single-clusterqueue-setup-mwrs.yaml
+
+echo "Setup credentials for clusterprofile"
+kubectl apply -f env/cp-c1.yaml
+kubectl apply -f env/cp-c2.yaml
+kubectl apply -f env/cp-c3.yaml
+kubectl apply -f env/msa-c1.yaml
+kubectl apply -f env/msa-c2.yaml
+kubectl apply -f env/msa-c3.yaml
+
+echo "Setup faked GPU on the spoke"
+kubectl label managedcluster cluster2 accelerator=nvidia-tesla-t4
+kubectl label managedcluster cluster3 accelerator=nvidia-tesla-t4
+
+echo "IMPORTANT: RUN THE COMMANDS BELOW MANUALLY on cluster2 and cluster3 !!!"
+echo "kubectl edit-status node cluster2-control-plane --context ${c2ctx}   # set nvidia.com/gpu: \"3\" in capacity and allocatable"
+echo "kubectl edit-status node cluster3-control-plane --context ${c3ctx}   # set nvidia.com/gpu: \"3\" in capacity and allocatable"
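+
+# Note (untested sketch): with kubectl >= 1.24 the fake GPU capacity can also be set
+# non-interactively via the node's status subresource instead of `edit-status`, e.g.:
+#   kubectl patch node cluster2-control-plane --context kind-cluster2 --subresource=status \
+#     --type=merge -p '{"status":{"capacity":{"nvidia.com/gpu":"3"},"allocatable":{"nvidia.com/gpu":"3"}}}'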