Multiple active gateways #2891
Comments
With the default IPsec tunnels, this happens because the IPsec protocol doesn't support splitting tunnels, and the encryption and packet ordering must happen on a single core. In the past we have considered enabling multiple parallel tunnels, either on one gateway or across multiple gateways (in the latter case, with added HA benefits), but that requires deciding how to split traffic across the available tunnels. In parallel, protocol extensions are being discussed to allow IPsec tunnels to be split in order to avoid these bottlenecks; see the current draft for details. It seems preferable for Submariner to support that once it becomes available, instead of coming up with its own solution. For performance-critical scenarios, especially where a dedicated (private) network is available between gateways, Submariner supports VXLAN tunnels instead of IPsec.
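To make the "how to split traffic" question above more concrete, here is a minimal sketch (not Submariner code; the tunnel names and flow fields are hypothetical) of one common approach: hash the flow's 5-tuple so that every packet of a given flow uses the same tunnel, preserving per-flow packet ordering while spreading flows across the parallel tunnels or gateways:

```python
# Illustrative only: per-flow hashing across several hypothetical tunnels.
# All packets of one flow map to the same tunnel, so ordering within a
# flow is preserved while aggregate traffic is spread across tunnels.
import zlib

TUNNELS = ["gw-0", "gw-1", "gw-2"]  # hypothetical active gateway endpoints


def pick_tunnel(src_ip: str, dst_ip: str, src_port: int, dst_port: int, proto: str) -> str:
    """Hash the 5-tuple and pick a tunnel deterministically."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}/{proto}".encode()
    return TUNNELS[zlib.crc32(key) % len(TUNNELS)]


# Example: all packets of this TCP flow land on the same tunnel.
print(pick_tunnel("10.1.0.5", "10.2.0.9", 43512, 5432, "tcp"))
```

A per-flow scheme like this avoids reordering but cannot speed up a single large flow; that is one of the trade-offs any split across tunnels (or the IPsec protocol extensions mentioned above) has to address.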
The 56% performance drop in question is exactly that of the inter-cluster VXLAN tunnel. Intuitively, it seems multiple active VXLAN gateways would be easier to implement.
If we go down this path, we should likely raise a more specific issue.
Need more investigation before prioritizing. |
Decided to push this to the following releases.
What would you like to be added:
Multiple active gateways for higher inter-cluster data transfer performance.
Why is this needed:
Currently there is only one gateway per cluster. As per the benchmark results in #2890, there is a significant performance drop (about 56%) for Pods running on non-gateway nodes. Suppose the gateway node has a 10 Gbit/s NIC: DBMS servers running on non-gateway nodes then share only about 560 MByte/s of inter-cluster bandwidth. This means the whole cluster can theoretically transfer at most about 46 TB of data per day, which is unacceptable for production clusters (JFYI, a small to medium-sized production ClickHouse cluster can receive more than 40 TB of data per day).
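As a rough back-of-envelope check of those figures (assuming decimal units and ignoring protocol overhead, so the result is slightly different from, but in the same ballpark as, the 560 MByte/s and 46 TB quoted above):

```python
# Back-of-envelope check of the bandwidth figures above (decimal units assumed).
nic_gbit_per_s = 10   # gateway NIC line rate
drop = 0.56           # performance drop observed in #2890

effective_gbit_per_s = nic_gbit_per_s * (1 - drop)        # ~4.4 Gbit/s
effective_mbyte_per_s = effective_gbit_per_s * 1000 / 8   # ~550 MByte/s
tb_per_day = effective_mbyte_per_s * 86_400 / 1_000_000   # ~47.5 TB/day

print(f"{effective_mbyte_per_s:.0f} MByte/s, about {tb_per_day:.1f} TB/day")
```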