update README
Signed-off-by: Dmitry Shmulevich <[email protected]>
dmitsh committed Nov 19, 2024
1 parent 48e89c1 commit 5b233d0
Showing 1 changed file with 23 additions and 9 deletions.
32 changes: 23 additions & 9 deletions keps/sig-network/4962-network-topology-standard/README.md
@@ -58,7 +58,7 @@ If none of those approvers are still appropriate, then changes to that list
should be approved by the remaining approvers and/or the owning SIG (or
SIG Architecture for cross-cutting KEPs).
-->
# KEP-4962: Standardizing Cluster Network Topology Representation
# KEP-4962: Standardizing the Representation of Cluster Switch Network Topology

<!--
This is the title of your KEP. Keep it short, simple, and descriptive. A good
@@ -154,11 +154,31 @@ Items marked with (R) are required *prior to targeting to a milestone / release*

## Summary

This document proposes a standard for declaring cluster network topology in Kubernetes, representing the hierarchy of nodes, switches, and interconnects. In this context, a switch can refer to a physical network device or a collection of such devices with close proximity and functionality.
This document proposes a standard for declaring switch network topology in Kubernetes clusters, representing the hierarchy of nodes, switches, and interconnects. In this context, a switch can refer to a physical network device or a collection of such devices with close proximity and functionality.

## Motivation
With the rise of multi-node Kubernetes workloads, especially AI workloads that demand intensive inter-node communication, scheduling pods in close network proximity becomes essential.

Understanding the cluster network topology is essential for optimizing the placement of workloads that require intensive inter-node communication. Currently, there is no standardized way to represent this information in Kubernetes, making it challenging to develop control plane components and applications that can leverage network topology awareness.
Some major cloud service providers (CSPs) already provide a way to discover node network topology.
AWS provides the [DescribeInstanceTopology API](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstanceTopology.html). GCP exposes rack and cluster IDs through the [Google Cloud SDK](https://cloud.google.com/go/docs/reference/cloud.google.com/go/compute/latest/apiv1), from which the network hierarchy can be reconstructed. OCI provides a [Go SDK](https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/gosdk.htm) that returns topology-related information for its compute nodes.

Beyond CSPs, switch network topology can also be discovered for some on-prem clusters, although this depends on which switch vendors are supported.

The open-source project [topograph](https://github.com/NVIDIA/topograph) implements all of the above approaches and has been deployed in production environments.

What is currently missing is a common and standard way to convey the switch network topology information to the Kubernetes environment.

Currently, there is no standardized way to represent this information in Kubernetes, making it challenging to develop control plane components and applications that can leverage network topology awareness.

AWS has started adding `topology.k8s.aws/network-node-layer-N` node labels to describe its three-tier network, but these labels are cloud-specific.
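
As a rough illustration of how a component might consume such labels, the sketch below reads the `topology.k8s.aws/network-node-layer-N` labels from two nodes and reports the layers at which both nodes sit under the same network node. The function names and the `nn-...` identifiers are hypothetical, and the sketch makes no assumption about which layer index is closest to the node; it is not part of this proposal.

```python
import re
from typing import Dict, List

# Label key pattern quoted in the text; the numeric suffix N is the layer index.
LAYER_LABEL = re.compile(r"^topology\.k8s\.aws/network-node-layer-(\d+)$")


def network_layers(labels: Dict[str, str]) -> Dict[int, str]:
    """Extract {layer index: network node ID} from a node's label map."""
    layers: Dict[int, str] = {}
    for key, value in labels.items():
        match = LAYER_LABEL.match(key)
        if match:
            layers[int(match.group(1))] = value
    return layers


def shared_layers(a: Dict[str, str], b: Dict[str, str]) -> List[int]:
    """Return the sorted layer indices at which both nodes report the same network node."""
    layers_a, layers_b = network_layers(a), network_layers(b)
    return sorted(i for i in layers_a if layers_b.get(i) == layers_a[i])


# Hypothetical label values, for illustration only.
node_a = {
    "topology.k8s.aws/network-node-layer-1": "nn-aaa",
    "topology.k8s.aws/network-node-layer-2": "nn-bbb",
    "topology.k8s.aws/network-node-layer-3": "nn-ccc",
    "kubernetes.io/hostname": "node-a",
}
node_b = {
    "topology.k8s.aws/network-node-layer-1": "nn-aaa",
    "topology.k8s.aws/network-node-layer-2": "nn-bbb",
    "topology.k8s.aws/network-node-layer-3": "nn-ddd",
    "kubernetes.io/hostname": "node-b",
}

print(shared_layers(node_a, node_b))
```

Because the label key is specific to one provider, the same logic would have to be rewritten for every other cloud, which is the portability gap a common representation is meant to close.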

This KEP proposes a standard for representing switch network topology in the cluster.

The cluster network topology can be:
- Provided directly by a CSP, i.e., the CSP applies node labels during node creation
- Extracted from a CSP using specialized tools like [topograph](https://github.com/NVIDIA/topograph)
- Manually set up by cluster administrators
- A combination of the above methods to ensure comprehensive coverage

This information might be useful for various components and features, including:

@@ -214,12 +234,6 @@ bogged down.

### Notes/Constraints/Caveats (Optional)

Cluster network topology information can be derived from various sources:
- Provided directly by a Cloud Service Provider (CSP)
- Extracted from a CSP using specialized tools like [topograph](https://github.com/NVIDIA/topograph)
- Manually set up by cluster administrators
- A combination of the above methods to ensure comprehensive coverage

### Risks and Mitigations

<!--