-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow tablet picker to exclude specified tablets from its candidate list #14224
allow tablet picker to exclude specified tablets from its candidate list #14224
Conversation
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
…rter timeout Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks, @pbibra ! I only had a couple of minor comments that we should address before merging.
I also confirmed that the new tests are not flaky by running them with -count 1 -race
in a loop for some time.
close(done) | ||
}() | ||
|
||
Loop: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this is needed, is it? Since there's only one for loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The break statement breaks out of the switch instead of the loop by default, so we had to label the loop
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
…ndidate list vitessio#14224 Signed-off-by: Vilius Okockis <[email protected]>
…ist (vitessio#14224) Signed-off-by: Priya Bibra <[email protected]>
Description
Allow the
TabletPicker
to take in a list of tablets to exclude from its candidate list.This is mainly required in the
VStreamManager
when there is a retriable failure streaming from a tablet.Example Scenario:
Previously, a VTGate VStream RPC call could fail on a GTID mismatch error. In this case, the tablet is healthy so a client retry may choose the same one and fail again. This can happen if the cell preference for the tablet picker is local and there is only a single tablet in the VTGate's local cell. Instead, we'd like to retry on this error in addition to marking the tablet as unviable for streaming within this request.
Changes:
ignoreTablets
list toTabletPicker
properties.tp.GetMatchingTablets
to omit a potential candidate if it also appears in theignoreTablets
list.shouldRetry()
function in theVStreamManager
which has twobool
return values. The first represents whether the error should be retried, the second determines whether the tablet on which the error occurred should be ignored as a possible candidate on the retry.vsm.streamFromTablet
to checkshouldRetry()
upon error and pass in a list ofignoreTablets
to theTabletPicker
.Retriable Errors:
Code_UNAVAIALBLE
- should retry from before, we will not ignore the tablet on a retry.Code_FAILED_PRECONDITION
- should retry from before, we will not ignore the tablet on a retry.Code_INVALID_ARGUMENT
with messageGTIDSet Mismatch
- should retry, ignore the tablet on a retry.Related Issue(s)
Checklist