redistribute RDY in high throughput, idle producer situations #277
When several producer nsqds are registered with an nsqlookupd but only one of them (or at least not all of them) is currently producing messages, the current flat max-in-flight distribution leaves the consumer with effectively fewer messages in flight than we might want.
Consider 4 hosts / nsqds that produce messages on a topic, with only one of them in use at a time (for various reasons), and a single consumer with max-in-flight set to 8. That budget is spread equally, so each nsqd connection gets a RDY count of 2. Since 3 of the 4 nsqds are idle / not producing on the topic at any point in time, we effectively only ever have 2 messages in flight.
One workaround is to increase the max-in-flight drastically (multiply by nsqd count) but then we might have more messages in flight than our consumer wants if suddenly more than one nsqd is producing messages.
Because we constantly deal with this situation (automatically scheduled producer containers that move between hosts), we implemented a second RDY redistribution function that trades RDY count from an unused nsqd connection to a "busy" nsqd connection.
Since this might not be useful / wanted in every use case, the feature is only enabled with a config flag, `RDYTrading`. The code is similar to the normal code in `redistributeRDY` for the `max-in-flight < len(conns)` situation, but here it essentially deals with `max-in-flight > len(producing_conns)`.
Let me know what you think, whether this could be useful for others, and thus whether you think it could be merged upstream.
NSQ2019/12/05 14:26:34 DBG 1 [foo/bar] looking for RDY trade possibilities...
NSQ2019/12/05 14:26:34 DBG 1 [foo/bar] - moving 3 RDY from 10.13.2.51:4150 to 10.13.2.85:4150
NSQ2019/12/05 14:26:34 DBG 1 [foo/bar] - moving 3 RDY from 10.13.2.39:4150 to 10.13.2.85:4150