Multiple active gateways #2891

Open
cangyin opened this issue Jan 30, 2024 · 5 comments

@cangyin

cangyin commented Jan 30, 2024

What would you like to be added:

Multiple active gateways for higher inter-cluster data transfer performance.

Why is this needed:

Currently there is only one gateway per cluster. According to the benchmark results in #2890, there is a significant performance drop (about 56%) for pods running on non-gateway nodes. Suppose the gateway node has a 10 Gbit/s NIC: DBMS servers running on non-gateway nodes then share only 560 MByte/s. This means the whole cluster can theoretically transfer at most 46 TB of data per day, which is unacceptable for production clusters (for reference, a small to medium-sized production ClickHouse cluster can receive more than 40 TB of data per day).
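
The arithmetic behind these figures can be checked with a short sketch. It assumes decimal units for the NIC line rate (10 Gbit/s = 1250 MByte/s); the function names are illustrative, and the note on 1024-based units is an inference rather than something stated in the issue.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: the NIC line rate is expressed in decimal units.
SECONDS_PER_DAY = 24 * 60 * 60


def effective_mbyte_per_s(nic_gbit_per_s: float, drop: float) -> float:
    """Throughput in MByte/s left after a fractional performance drop."""
    line_rate_mbyte_per_s = nic_gbit_per_s * 1000 / 8
    return line_rate_mbyte_per_s * (1 - drop)


def daily_volume_tb(mbyte_per_s: float, mbyte_per_tb: int = 1000 * 1000) -> float:
    """Data volume per day in TB for a sustained rate in MByte/s."""
    return mbyte_per_s * SECONDS_PER_DAY / mbyte_per_tb


if __name__ == "__main__":
    rate = effective_mbyte_per_s(nic_gbit_per_s=10, drop=0.56)
    print(f"effective rate: {rate:.0f} MByte/s")
    print(f"daily volume at 560 MByte/s (decimal TB): {daily_volume_tb(560):.1f}")
    print(f"daily volume at 560 MByte/s (1024-based): {daily_volume_tb(560, 1024 * 1024):.1f}")
```

With a 56% drop, a 10 Gbit/s NIC yields roughly 550 MByte/s, close to the 560 MByte/s quoted. At 560 MByte/s the daily volume is about 48.4 TB in decimal units, or about 46.1 when converting with 1024-based units, which appears to be where the 46 TB figure comes from.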

@cangyin cangyin added the enhancement New feature or request label Jan 30, 2024
@skitt
Member

skitt commented Jan 30, 2024

With the default IPsec tunnels, this happens because the IPsec protocol doesn’t support splitting tunnels, and the encryption and packet ordering must happen on a single core.

We have previously considered enabling multiple parallel tunnels, on one gateway or across multiple gateways (in the latter case, with added HA benefits), but that requires deciding how to split traffic across the available tunnels.
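
To illustrate what "deciding how to split traffic" can involve, here is a minimal sketch of per-flow hashing, the general approach used by ECMP-style load balancing. It is not something Submariner implements or that this thread proposes; the `Flow` type, `pick_tunnel` function, and tunnel names are all hypothetical.

```python
import hashlib
from typing import NamedTuple, Sequence


class Flow(NamedTuple):
    """Hypothetical 5-tuple identifying a network flow."""
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int


def pick_tunnel(flow: Flow, tunnels: Sequence[str]) -> str:
    """Map a flow to one tunnel by hashing its 5-tuple.

    Every packet of a given flow hashes to the same tunnel, so
    per-flow packet ordering is preserved.
    """
    if not tunnels:
        raise ValueError("no tunnels available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(tunnels)
    return tunnels[index]


if __name__ == "__main__":
    tunnels = ["tunnel-a", "tunnel-b", "tunnel-c"]  # hypothetical names
    flow = Flow("10.0.1.5", "10.1.2.7", "tcp", 43210, 9000)
    print(pick_tunnel(flow, tunnels))
```

This kind of scheme has known trade-offs: a single heavy flow is still limited to one tunnel, and with a plain modulo the mapping changes for many flows whenever the number of tunnels changes.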

In parallel, protocol extensions are being discussed to allow IPsec tunnels to be split to avoid these bottlenecks; see the current draft for details. It seems preferable for Submariner to support that, once it becomes available, instead of coming up with its own solution.

For performance-critical scenarios, especially in cases where a dedicated (private) network is available between gateways, Submariner supports VXLAN tunnels instead of IPsec.

@cangyin
Author

cangyin commented Jan 30, 2024

> For performance-critical scenarios, especially in cases where a dedicated (private) network is available between gateways, Submariner supports VXLAN tunnels instead of IPsec.

The 56% performance drop in question is exactly that of the inter-cluster VXLAN tunnel (vxlan-tunnel VTEP), which delivers 44% of the underlying network capacity of a single NIC, while the IPsec tunnel delivers 26%.

Intuitively, multiple active VXLAN gateways seem easier to implement.

@dfarrell07
Member

> protocol extensions are being discussed to allow IPsec tunnels to be split to avoid these bottlenecks

If we go down this path, we should probably raise a more specific issue.

@maayanf24
Contributor

This needs more investigation before prioritizing.
First, we want to implement load-balancer mode.

@maayanf24
Contributor

Decided to push this to a future release.

@maayanf24 maayanf24 added this to Backlog Aug 7, 2024
@github-project-automation github-project-automation bot moved this to Backlog in Backlog Aug 7, 2024