allow tablet picker to exclude specified tablets from its candidate list #14224

Merged
merged 11 commits into from
Oct 23, 2023

Conversation

pbibra
Contributor

@pbibra pbibra commented Oct 10, 2023

Description

Allow the TabletPicker to take in a list of tablets to exclude from its candidate list.
This is mainly required in the VStreamManager when there is a retriable failure streaming from a tablet.

Example Scenario:
Previously, a VTGate VStream RPC call could fail on a GTID mismatch error. Because the tablet is still healthy in this case, a client retry may choose the same tablet and fail again. This can happen if the cell preference for the tablet picker is local and there is only a single tablet in the VTGate's local cell. Instead, we'd like to retry on this error and also mark the tablet as unviable for streaming within this request.

Changes:

  • Add an ignoreTablets list to TabletPicker properties.
  • Update tp.GetMatchingTablets to omit a potential candidate if it also appears in the ignoreTablets list.
  • Introduce a shouldRetry() function in the VStreamManager which has two bool return values. The first represents whether the error should be retried, the second determines whether the tablet on which the error occurred should be ignored as a possible candidate on the retry.
  • Update vsm.streamFromTablet to check shouldRetry() upon error and pass in a list of ignoreTablets to the TabletPicker.
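
The filtering step described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `tabletAlias` and `filterCandidates` are hypothetical names standing in for the Vitess tablet alias type and the filtering inside `tp.GetMatchingTablets`.

```go
package main

import "fmt"

// tabletAlias is a hypothetical stand-in for a Vitess tablet alias (cell + uid).
type tabletAlias struct {
	Cell string
	UID  uint32
}

// filterCandidates returns the candidates that do not appear in ignore,
// preserving the original order of the candidates.
func filterCandidates(candidates, ignore []tabletAlias) []tabletAlias {
	// Build a set of ignored aliases for constant-time lookups.
	ignored := make(map[tabletAlias]struct{}, len(ignore))
	for _, a := range ignore {
		ignored[a] = struct{}{}
	}
	kept := make([]tabletAlias, 0, len(candidates))
	for _, c := range candidates {
		if _, skip := ignored[c]; skip {
			continue // this tablet was marked unviable for the current request
		}
		kept = append(kept, c)
	}
	return kept
}

func main() {
	candidates := []tabletAlias{
		{Cell: "zone1", UID: 100},
		{Cell: "zone1", UID: 101},
		{Cell: "zone2", UID: 200},
	}
	ignore := []tabletAlias{{Cell: "zone1", UID: 100}}
	for _, c := range filterCandidates(candidates, ignore) {
		fmt.Printf("%s-%d\n", c.Cell, c.UID)
	}
}
```

If every candidate ends up in the ignore list, the sketch returns an empty slice, so a caller would need to decide whether to wait, fall back to another cell, or fail the request.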

Retriable Errors:

  • Code_UNAVAILABLE - should retry, as before this change; we will not ignore the tablet on a retry.
  • Code_FAILED_PRECONDITION - should retry, as before this change; we will not ignore the tablet on a retry.
  • Code_INVALID_ARGUMENT with message GTIDSet Mismatch - should retry, ignore the tablet on a retry.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on the CI
  • Documentation was added or is not required

@github-actions github-actions bot added this to the v19.0.0 milestone Oct 10, 2023
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
@pbibra pbibra marked this pull request as ready for review October 13, 2023 18:01
go/vt/discovery/tablet_picker.go (outdated, resolved)
go/vt/vtgate/vstream_manager.go (outdated, resolved)
go/vt/discovery/tablet_picker.go (outdated, resolved)
go/vt/vtgate/vstream_manager.go (outdated, resolved)
@mattlord mattlord self-assigned this Oct 14, 2023
@mattlord mattlord added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: VReplication labels Oct 14, 2023
@mattlord mattlord self-requested a review October 17, 2023 06:29
Signed-off-by: Priya Bibra <[email protected]>
Signed-off-by: Priya Bibra <[email protected]>
Contributor

@mattlord mattlord left a comment

LGTM. Thanks, @pbibra ! I only had a couple of minor comments that we should address before merging.

I also confirmed that the new tests are not flaky by running them with -count 1 -race in a loop for some time.

go/vt/vtgate/vstream_manager.go (outdated, resolved)
go/vt/vtgate/vstream_manager_test.go (outdated, resolved)
close(done)
}()

Loop:
Contributor

I don't think that this is needed, is it? Since there's only one for loop.

Contributor Author

@pbibra pbibra Oct 17, 2023

By default, the break statement breaks out of the enclosing switch rather than the loop, so we had to label the loop.
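
To illustrate the point about `break` inside a `switch`, here is a small standalone sketch. It is not the test code from this PR; the function names and the event strings are made up for the example.

```go
package main

import "fmt"

// countWithLabel stops processing at the first "stop" event because the
// labeled break exits the enclosing for loop.
func countWithLabel(events []string) int {
	processed := 0
Loop:
	for _, ev := range events {
		switch ev {
		case "stop":
			break Loop // leaves the for loop, not just the switch
		default:
			processed++
		}
	}
	return processed
}

// countWithoutLabel keeps iterating after "stop" because an unlabeled break
// only terminates the innermost switch statement.
func countWithoutLabel(events []string) int {
	processed := 0
	for _, ev := range events {
		switch ev {
		case "stop":
			break // leaves only the switch; the loop continues
		default:
			processed++
		}
	}
	return processed
}

func main() {
	events := []string{"a", "b", "stop", "c"}
	fmt.Println(countWithLabel(events))    // prints 2
	fmt.Println(countWithoutLabel(events)) // prints 3
}
```

The same rule applies to `select` statements, which is why a labeled loop is a common pattern in tests that loop over channel receives.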

Signed-off-by: Priya Bibra <[email protected]>
@rohit-nayak-ps rohit-nayak-ps merged commit 0f21a0b into vitessio:main Oct 23, 2023
115 checks passed
@pbibra pbibra deleted the pbibra-tablet-picker-ignore-list branch October 23, 2023 19:47
pbibra added a commit to slackhq/vitess that referenced this pull request Oct 26, 2023
Signed-off-by: Priya Bibra <[email protected]>
pbibra added a commit to slackhq/vitess that referenced this pull request Oct 26, 2023
Signed-off-by: Priya Bibra <[email protected]>
DeathBorn added a commit to vinted/vitess that referenced this pull request Apr 23, 2024
timvaillancourt pushed a commit to slackhq/vitess that referenced this pull request May 29, 2024
Labels
Component: VReplication Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Feature Request: Add the ability to skip a list of tablets to pick in the TabletPicker
3 participants