Pool distributes load unevenly across connections. #3648
Comments
Imho duplicate of #310.
The problem stems from the use of Keep-Alive connections. Essentially, we prioritize "alive" connections as much as possible, because they are the ones we know are live and less likely to time out or error due to network disconnections. This is suboptimal in the case of low traffic but minimizes errors. In a high-traffic scenario, it is equivalent to round-robin. Have you tried using the BalancedPool?
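For reference, a minimal sketch of BalancedPool usage, assuming each backend is individually addressable (the addresses below are hypothetical): it takes a list of distinct upstream origins and balances requests across them.

```js
import { BalancedPool } from 'undici'

// Hypothetical per-pod addresses: BalancedPool balances across multiple
// distinct upstream origins rather than across connections to one origin.
const pool = new BalancedPool([
  'http://service-a-0.example:8080',
  'http://service-a-1.example:8080'
])

const { statusCode, body } = await pool.request({ path: '/work', method: 'GET' })
console.log(statusCode, await body.text())
```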
Thanks for the quick response. I see the challenge of balancing against keeping connections alive: prioritizing sending requests through as few connections as possible to ensure maximum reuse. I don't understand the bit about being suboptimal for low traffic and equivalent to round-robin for high traffic, though; it seems the opposite to me. As I reason about it, with low traffic, long request processing times, and short keep-alives, it will behave like round-robin, with an edge case in which we occasionally fail to reuse a connection. With high traffic, we end up with a linear trend of more requests processed by earlier connections and fewer by later ones, with the order of connection index use ending up something like: 1, 2, 3, 1, 4, 2, 1, 5, 2, 3, 1. I haven't used BalancedPool; it doesn't look intended for this case, since it balances across multiple upstreams rather than across connections to a single one.
Is my understanding off?
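A throwaway simulation of the first-free-connection selection described above makes the skew visible. The numbers mirror this thread (a request every 50ms, responses taking 100-200ms); all names are illustrative, not undici's internals.

```js
// Five clients, "pick the first one that is free" on every request.
const clients = Array.from({ length: 5 }, () => ({ busyUntil: 0, served: 0 }))

let now = 0
for (let i = 0; i < 1000; i++) {
  now += 50 // one request every 50ms
  const client = clients.find(c => c.busyUntil <= now)
  if (client) {
    client.busyUntil = now + 100 + Math.random() * 100 // 100-200ms to respond
    client.served++
  }
}

// Earlier indices serve far more requests than later ones, matching the
// linear trend described above.
console.log(clients.map(c => c.served))
```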
I think I follow, actually. We'd never let go of connections once created, because all connections would have roughly the same duration since the previous request. Then, once the pool got large enough and request volume dropped enough, it would time out all connections at once. Does that capture the reasoning?
Yes, exactly. If we can improve BalancedPool to handle this case, that would be great.
I'd be happy to, if it makes sense and we can identify a sensible approach. It doesn't seem intended for that case though, as I shared above. How are you thinking it could be used to balance requests across a single upstream service?
I think the algorithm used in balanced pool could be extended to work "in front" of a load balancer. |
We can potentially steal inspiration from the http lib as well. Using Axios, the requests were distributed fairly evenly. We only noticed the change when switching to Undici. Request volume for this service is 500-1000 per second. |
A PR would be highly welcomed. |
Bug Description
Pool distributes load unevenly across connections, prioritizing them by creation time; a simplified sketch of the selection follows below.
Where your connections are load-balanced across instances (pods/containers), this results in overuse of some instances and underuse of others.
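The selection amounts to scanning the client list from index 0 on every dispatch and returning the first client that isn't busy. A simplified paraphrase, with a `busy` flag standing in for undici's internal needs-drain state:

```js
// Simplified paraphrase of the first-available selection (not the actual
// undici source; `busy` stands in for the internal needs-drain check).
function getDispatcher (clients) {
  for (const client of clients) { // always starts from the oldest client
    if (!client.busy) {
      return client
    }
  }
  return null // all clients saturated; the pool would queue or grow instead
}
```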
Reproducible By
- Create ServiceA, with an endpoint that takes 100ms - 200ms to respond.
- Run multiple pods of ServiceA.
- From ServiceB, create a Pool of connections to ServiceA, referencing it by Kubernetes Service address (https://service-name), such that we'll be load-balancing connections through kube-proxy.
- Send requests from ServiceB every 50ms for 1 minute.
- Observe balanced connection distribution across ServiceA pods, but imbalanced request distribution and resource usage across ServiceA pods.

Thinking it through, this may otherwise be observable plainly through internal diagnostics around the number of requests processed by each Client, or something like that.
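A hypothetical driver for the steps above; the /endpoint path, connection count, and scheme are assumptions, not part of the original report.

```js
import { Pool } from 'undici'

// One Pool pointed at the Kubernetes Service address, as in the steps above;
// kube-proxy spreads the underlying TCP connections across ServiceA pods.
const pool = new Pool('http://service-name', { connections: 10 })

// One request every 50ms for one minute.
const timer = setInterval(async () => {
  try {
    const { body } = await pool.request({ path: '/endpoint', method: 'GET' })
    await body.dump() // discard the payload; only the distribution matters
  } catch (err) {
    console.error(err)
  }
}, 50)

setTimeout(async () => {
  clearInterval(timer)
  await pool.close()
}, 60_000)
```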
Expected Behavior
Roughly balanced request distribution and resource usage across Clients and ServiceA pods.
Environment
MacOS, Node v18
Additional context
There are a number of ways we could resolve this intelligently. Most simply/crudely, though, we could track the index of the last-used client and pick the next scan back up from that index, giving us rough round-robin.
Something like the following sketch, using illustrative names rather than undici's internal symbols:
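```js
// Remember where the previous scan stopped and resume from there, so free
// clients are picked in rough round-robin order instead of always favoring
// the earliest-created ones. `busy` stands in for undici's needs-drain check.
class RoundRobinSelector {
  constructor (clients) {
    this.clients = clients
    this.lastIndex = -1 // index of the client chosen on the previous dispatch
  }

  next () {
    const n = this.clients.length
    for (let i = 1; i <= n; i++) {
      const index = (this.lastIndex + i) % n
      if (!this.clients[index].busy) {
        this.lastIndex = index
        return this.clients[index]
      }
    }
    return null // every client is busy; fall back to queueing or growing
  }
}
```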