1. Create a number of 1:1 collections in the source and target Solr.
2. Start indexing (in my tests I simply sent a constant stream of random documents to each collection in round-robin fashion).
3. Simulate a problem in ONE of the target collections. I simply deleted it, but in a real-life scenario it could have been any other kind of breakdown, or the following hypothetical:
   - With high enough traffic there will be a number of messages in flight between source and target. Assume ops decide to remove one of the collections simultaneously at both source and target, without waiting for all messages for that collection to drain. The Consumer will then pick up queued in-flight messages intended for the target collection that no longer exists.
4. The Consumer will attempt to send the requests it picked up to a collection that no longer exists (or is no longer functional), and Solr will respond with errors.
5. This will trigger back-offs, which will eventually halt ALL processing, including for the remaining healthy collections.
Is this behavior the best we can do? I'm not sure. I would expect the Consumer to continue processing requests for healthy collections. At the very least we should offer some protection against the hypothetical scenario I mentioned above.
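To make the expectation concrete, here is a minimal sketch of what "keep processing healthy collections" could look like: back-off state is tracked per collection instead of globally. This is not the actual Consumer code. `PerCollectionBackoff`, `process`, and the `send` callback are hypothetical names used only for illustration, and the exponential delay policy is an assumption.

```python
import time
from collections import defaultdict


class PerCollectionBackoff:
    """Hypothetical helper: back-off state is kept per collection, not globally."""

    def __init__(self, base_delay=1.0, max_delay=60.0, clock=time.monotonic):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.clock = clock                  # injectable so the sketch is testable
        self.failures = defaultdict(int)    # collection -> consecutive failures
        self.retry_at = {}                  # collection -> earliest next attempt

    def is_paused(self, collection):
        # Only collections with a pending retry deadline are paused.
        return collection in self.retry_at and self.clock() < self.retry_at[collection]

    def record_failure(self, collection):
        # Exponential back-off (assumed policy), capped at max_delay.
        self.failures[collection] += 1
        delay = min(self.max_delay,
                    self.base_delay * 2 ** (self.failures[collection] - 1))
        self.retry_at[collection] = self.clock() + delay
        return delay

    def record_success(self, collection):
        self.failures.pop(collection, None)
        self.retry_at.pop(collection, None)


def process(messages, send, backoff):
    """Try to send each (collection, doc) pair; return the pairs that were deferred."""
    deferred = []
    for collection, doc in messages:
        if backoff.is_paused(collection):
            deferred.append((collection, doc))   # skip only the unhealthy collection
            continue
        try:
            send(collection, doc)                # stand-in for the request to Solr
            backoff.record_success(collection)
        except Exception:
            backoff.record_failure(collection)
            deferred.append((collection, doc))
    return deferred
```

With this shape, errors from one deleted collection only delay that collection's messages, while requests for the other collections keep flowing. What to do with the deferred messages (retry, drop, or park them elsewhere) is a separate decision.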
With the dead letter queue, I think this issue shouldn't exist: the failed messages would get sent there and processed in parallel. Temporarily adding a 'no-op' for specific collections would also be a possible solution, as I guess that's what we really want for the updates that are in flight.
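A rough sketch of both ideas, under stated assumptions: failed messages are parked on a dead letter queue instead of triggering a back-off, and collections on a temporary 'no-op' list are skipped outright. The names `consume`, `dead_letters`, and `noop_collections` are made up for this example and do not correspond to any existing Consumer API.

```python
from collections import deque


def consume(messages, send, dead_letters, noop_collections=frozenset()):
    """Deliver messages; park failures on dead_letters, drop no-op collections.

    Returns the number of messages delivered successfully.
    """
    delivered = 0
    for collection, doc in messages:
        if collection in noop_collections:
            continue                      # in-flight update intentionally discarded
        try:
            send(collection, doc)         # stand-in for the request to Solr
            delivered += 1
        except Exception as exc:
            # Park the failure for separate handling; the main loop keeps going.
            dead_letters.append((collection, doc, str(exc)))
    return delivered


# Usage: one collection was removed, the others keep being served.
if __name__ == "__main__":
    def send(collection, doc):
        if collection == "removed":
            raise RuntimeError("collection not found")

    dlq = deque()
    batch = [("healthy", {"id": 1}), ("removed", {"id": 2}), ("healthy", {"id": 3})]
    print(consume(batch, send, dlq), len(dlq))
```

The dead letter queue can then be drained by a separate worker, in parallel with normal processing, and the no-op list covers the case where ops have deliberately removed a collection and the remaining in-flight updates should simply be discarded.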