CORS-3741: Nutanix enhancement: allow multiple NICs
yanhua121 committed Nov 5, 2024
1 parent 4cc0d6f commit c2ed1f4
Showing 1 changed file with 83 additions and 0 deletions.
83 changes: 83 additions & 0 deletions enhancements/machine-api/nutanix-multi-nics.md
@@ -0,0 +1,83 @@
---
title: nutanix-multi-nics
authors:
reviewers:
approvers:
api-approvers:
creation-date: 2024-11-05
last-updated: 2024-11-05
tracking-link:
- https://issues.redhat.com/browse/CORS-3741
---

# Nutanix: Multi-NICs for OCP Cluster Nodes

## Summary

This enhancement makes it possible to install OpenShift on Nutanix with nodes that have multiple NICs (multiple subnets), both at IPI install time and when autoscaling with MachineSets.

## Motivation and User Stories

Requested by customers:
- Everest Digital
- Unacle B.V

As an OpenShift user, I want to deploy clusters whose infrastructure and worker nodes have multiple NICs. This may be needed to support secondary storage networking, such as Nutanix CSI, or to support other applications with segmented network requirements.

## Goals

Currently, users can configure exactly one NIC (subnet) when installing a Nutanix OCP cluster or when configuring a Machine/MachineSet for worker nodes in an existing cluster. Starting with OCP 4.18, the restriction of one NIC (subnet) per FailureDomain or per Machine/MachineSet is lifted to meet the user stories described above. Users can configure multiple NICs (multiple subnets) for nodes both when installing a new cluster and when adding or updating a Machine/MachineSet for worker nodes in an existing cluster.

### API Extensions

The “subnets” fields in both the Machine/MachineSet Nutanix providerSpec and the Nutanix FailureDomain are already arrays. The only API change is to relax the validation rule on the “subnets” fields so that multiple values are allowed, and to ensure that no duplicate values are configured.

We will add a feature gate "NutanixMultiSubnets" (DevPreviewNoUpgrade, TechPreviewNoUpgrade) for this feature. After QE testing is complete, we will add the feature gate to the "Default" feature set.

```go
// NutanixPlatformSpec holds the desired state of the Nutanix infrastructure provider.
// This only includes fields that can be modified in the cluster.
type NutanixPlatformSpec struct {
...

// failureDomains configures failure domains information for the Nutanix platform.
// When set, the failure domains defined here may be used to spread Machines across
// prism element clusters to improve fault tolerance of the cluster.
// +openshift:validation:FeatureGateAwareMaxItems:featureGate=NutanixMultiSubnets,maxItems=32
// +listType=map
// +listMapKey=name
// +optional
FailureDomains []NutanixFailureDomain `json:"failureDomains"`
}

// NutanixFailureDomain configures failure domain information for the Nutanix platform.
type NutanixFailureDomain struct {
...

// subnets holds a list of identifiers (one or more) of the cluster's network subnets
// for the Machine's VM to connect to. The subnet identifiers (uuid or name) can be
// obtained from the Prism Central console or using the prism_central API.
// If the feature gate NutanixMultiSubnets is enabled, up to 32 subnets may be configured.
// +kubebuilder:validation:Required
// +kubebuilder:validation:MinItems=1
// +openshift:validation:FeatureGateAwareMaxItems:featureGate="",maxItems=1
// +openshift:validation:FeatureGateAwareMaxItems:featureGate=NutanixMultiSubnets,maxItems=32
// +openshift:validation:FeatureGateAwareXValidation:featureGate=NutanixMultiSubnets,rule="self.all(x, self.exists_one(y, x == y))",message="each subnet must be unique"
// +listType=atomic
Subnets []NutanixResourceIdentifier `json:"subnets"`
}
```
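The validation markers on `subnets` can be read as three checks: at least one item, at most 1 item without the feature gate or 32 with it, and no repeated entries. The sketch below restates those checks in plain Go for illustration only. `SubnetID` is a hypothetical, simplified stand-in for `NutanixResourceIdentifier`, and `ValidateSubnets` is not part of the API; the real enforcement is done by the generated CRD schema and its CEL rule.

```go
package main

import (
	"errors"
	"fmt"
)

// SubnetID is a hypothetical, simplified stand-in for NutanixResourceIdentifier.
type SubnetID struct {
	Type  string // "uuid" or "name"
	Value string
}

// maxSubnetsWithGate mirrors maxItems=32 under the NutanixMultiSubnets gate.
const maxSubnetsWithGate = 32

// ValidateSubnets is an illustrative restatement of the markers:
// MinItems=1, a feature-gate-aware MaxItems, and uniqueness of entries.
func ValidateSubnets(subnets []SubnetID, multiSubnetsEnabled bool) error {
	limit := 1 // without the gate, exactly one subnet is allowed
	if multiSubnetsEnabled {
		limit = maxSubnetsWithGate
	}
	if len(subnets) < 1 {
		return errors.New("at least one subnet is required")
	}
	if len(subnets) > limit {
		return fmt.Errorf("too many subnets: got %d, at most %d allowed", len(subnets), limit)
	}
	// Equivalent in effect to the CEL rule
	// self.all(x, self.exists_one(y, x == y)): every entry appears exactly once.
	seen := make(map[SubnetID]struct{}, len(subnets))
	for _, s := range subnets {
		if _, dup := seen[s]; dup {
			return fmt.Errorf("each subnet must be unique: %s %q is repeated", s.Type, s.Value)
		}
		seen[s] = struct{}{}
	}
	return nil
}

func main() {
	two := []SubnetID{{"uuid", "subnet-a"}, {"uuid", "subnet-b"}}
	fmt.Println(ValidateSubnets(two, true))  // accepted with the gate enabled
	fmt.Println(ValidateSubnets(two, false)) // rejected without the gate
}
```

Using the whole identifier as the map key means a name and a UUID with the same string value are treated as different entries, which matches a plain equality comparison of the two objects.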

## Implementation Details/Notes/Constraints

The installer should allow more than one subnet to be configured in install-config.yaml and pass that configuration through to the installer-generated Machine/MachineSet manifests when creating an OCP cluster.

The Machine validation webhook should allow more than one item in the Nutanix providerSpec’s “subnets” field and verify that there are no duplicates.
The Nutanix machine controller should allow more than one item in the NutanixMachineProviderConfig’s “subnets” field and use the configured subnets when creating a new VM for the Machine node.
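As a rough illustration of the controller behaviour described above, the following sketch turns a list of configured subnets into one NIC per subnet, preserving the configured order. Everything here is an assumption made for the example: `SubnetID`, `NIC`, `BuildNICs`, and the `lookupByName` resolver are hypothetical names, and the actual controller works with the Prism Central client types instead.

```go
package main

import "fmt"

// SubnetID is a hypothetical, simplified stand-in for NutanixResourceIdentifier.
type SubnetID struct {
	Type  string // "uuid" or "name"
	Value string
}

// NIC is a hypothetical description of one VM network interface.
type NIC struct {
	SubnetUUID string
}

// BuildNICs returns one NIC per configured subnet, in the configured order.
// lookupByName is a hypothetical resolver from subnet name to subnet UUID.
func BuildNICs(subnets []SubnetID, lookupByName func(string) (string, error)) ([]NIC, error) {
	nics := make([]NIC, 0, len(subnets))
	for i, s := range subnets {
		switch s.Type {
		case "uuid":
			nics = append(nics, NIC{SubnetUUID: s.Value})
		case "name":
			uuid, err := lookupByName(s.Value)
			if err != nil {
				return nil, fmt.Errorf("subnets[%d]: resolving name %q: %w", i, s.Value, err)
			}
			nics = append(nics, NIC{SubnetUUID: uuid})
		default:
			return nil, fmt.Errorf("subnets[%d]: unknown identifier type %q", i, s.Type)
		}
	}
	return nics, nil
}

func main() {
	// Hypothetical name-to-UUID table standing in for a Prism Central query.
	names := map[string]string{"storage-net": "uuid-2"}
	lookup := func(name string) (string, error) {
		if id, ok := names[name]; ok {
			return id, nil
		}
		return "", fmt.Errorf("subnet %q not found", name)
	}
	nics, err := BuildNICs([]SubnetID{{"uuid", "uuid-1"}, {"name", "storage-net"}}, lookup)
	fmt.Println(nics, err)
}
```

Failing on the first unresolvable subnet, instead of skipping it, keeps the VM from being created with fewer NICs than the Machine spec requests.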

## Upgrade and Downgrade Considerations

Upgrading an existing Nutanix OCP cluster (prior to 4.18) to 4.18 requires no special handling for this feature. Prior to 4.18, the “subnets” field of the Nutanix providerSpec in the Machine/MachineSet/ControlPlaneMachineSet CRs, and in each Nutanix FailureDomain of the Infrastructure CR, must contain exactly one item, and this configuration remains supported in 4.18.

When downgrading an existing 4.18 Nutanix OCP cluster to a prior version, the downgrade will fail with validation errors if any of the Machine/MachineSet/ControlPlaneMachineSet CRs or any Nutanix FailureDomain of the Infrastructure CR configures more than one item in “subnets”.
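A pre-downgrade check could, in principle, list the resources that would trip this validation. The sketch below is a hypothetical helper, not an existing tool: it assumes the subnets of each resource have already been collected into a map keyed by an illustrative label such as `MachineSet/worker-a`, and it reports every entry with more than one subnet.

```go
package main

import (
	"fmt"
	"sort"
)

// MultiSubnetResources returns, in sorted order, the labels of all resources
// that configure more than one subnet. The labels and the map layout are
// assumptions made for this example.
func MultiSubnetResources(subnetsByResource map[string][]string) []string {
	var offenders []string
	for name, subnets := range subnetsByResource {
		if len(subnets) > 1 {
			offenders = append(offenders, name)
		}
	}
	sort.Strings(offenders) // map iteration order is random, so sort for stable output
	return offenders
}

func main() {
	inventory := map[string][]string{
		"MachineSet/worker-a":      {"subnet-1", "subnet-2"},
		"MachineSet/worker-b":      {"subnet-1"},
		"Infrastructure/fd-east":   {"subnet-1", "subnet-3"},
		"ControlPlaneMachineSet/0": {"subnet-1"},
	}
	for _, name := range MultiSubnetResources(inventory) {
		fmt.Println("reduce to one subnet before downgrading:", name)
	}
}
```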
