Bug Report: PRS & ERS not preferring hosts taking backups not working as expected #17299

Open
ejortegau opened this issue Nov 29, 2024 · 0 comments

ejortegau commented Nov 29, 2024

Overview of the Issue

#16997 introduced a preference to not promote hosts taking backups during reparents. However, we have noticed some issues with it, namely:

Incorrect passing of the flag indicating whether or not a tablet is taking a backup

The flag indicating whether a tablet is taking a backup is stored in ReplicationStatusResponse and StopReplicationAndGetStatusResponse, even though a BackupRunning field has also been added to replicationdata.Status. The reparenting code, however, cannot read ReplicationStatusResponse.BackupRunning, because the gRPC TabletManagerClient returns only ReplicationStatusResponse.Status. As a result, reparenting can make the wrong decision about which host to promote.
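The mismatch described above can be illustrated with a minimal, self-contained sketch. The types and functions below are hypothetical stand-ins, not the actual Vitess definitions; they only model a wrapper response that carries the flag and a client that returns the inner status alone. The second function shows one possible direction for a fix (copying the flag into the status before returning it), which is an assumption and not necessarily what the project will adopt.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for replicationdata.Status.
type Status struct {
	Position      string
	BackupRunning bool
}

// ReplicationStatusResponse is a hypothetical stand-in for the RPC
// response wrapper, which carries its own BackupRunning flag.
type ReplicationStatusResponse struct {
	Status        *Status
	BackupRunning bool
}

// replicationStatus mimics a client that unwraps the response and
// returns only the inner Status, so the wrapper's flag is lost.
func replicationStatus(resp *ReplicationStatusResponse) *Status {
	return resp.Status
}

// replicationStatusWithBackup copies the wrapper's flag into a copy of
// the Status before returning it (one possible fix, an assumption).
func replicationStatusWithBackup(resp *ReplicationStatusResponse) *Status {
	if resp == nil || resp.Status == nil {
		return nil
	}
	out := *resp.Status
	out.BackupRunning = out.BackupRunning || resp.BackupRunning
	return &out
}

func main() {
	// "pos-1" is a placeholder value for illustration only.
	resp := &ReplicationStatusResponse{
		Status:        &Status{Position: "pos-1"},
		BackupRunning: true,
	}
	fmt.Println(replicationStatus(resp).BackupRunning)           // flag dropped
	fmt.Println(replicationStatusWithBackup(resp).BackupRunning) // flag preserved
}
```

In this sketch the caller of replicationStatus sees BackupRunning as false even though the tablet reported a running backup, which is the kind of information loss that would skew promotion decisions.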

vtctld crashes when certain calls from vtctld to tablets fail during ERS.

During ERS, when running stopReplicationAndBuildStatusMaps, calls to TabletManagerClient.StopReplicationAndGetStatus can fail. The code then attempts to access a method on a nil struct, which segfaults the vtctld process. See here.
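The general shape of this failure, and a defensive guard against it, can be sketched as follows. This is not the Vitess code: stopStatus, stopAndGetStatus, and buildBackupMap are hypothetical names used only to show the pattern of checking the error and the nil result before touching the returned struct.

```go
package main

import (
	"errors"
	"fmt"
)

// stopStatus is a hypothetical stand-in for the per-tablet RPC result.
type stopStatus struct {
	BackupRunning bool
}

// stopAndGetStatus is a hypothetical RPC stub. For tablets listed in
// failing it returns a nil result and an error, as a failed call would.
func stopAndGetStatus(alias string, failing map[string]bool) (*stopStatus, error) {
	if failing[alias] {
		return nil, errors.New("rpc to " + alias + " failed")
	}
	return &stopStatus{BackupRunning: false}, nil
}

// buildBackupMap collects the backup flag per tablet. It checks the
// error and the nil result before dereferencing; reading
// res.BackupRunning without this guard would panic on a failed call.
func buildBackupMap(aliases []string, failing map[string]bool) (map[string]bool, []error) {
	backupByAlias := make(map[string]bool)
	var errs []error
	for _, alias := range aliases {
		res, err := stopAndGetStatus(alias, failing)
		if err != nil || res == nil {
			errs = append(errs, fmt.Errorf("skipping %s: %v", alias, err))
			continue
		}
		backupByAlias[alias] = res.BackupRunning
	}
	return backupByAlias, errs
}

func main() {
	m, errs := buildBackupMap([]string{"tablet-a", "tablet-b"}, map[string]bool{"tablet-b": true})
	fmt.Println(len(m), len(errs))
}
```

Recording the error and continuing lets the caller decide whether enough tablets responded, instead of taking down the whole process on one failed call.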

Reproduction Steps

The issue can be verified by running the local installation as described here, triggering backups, and then calling PlannedReparentShard and EmergencyReparentShard.

Binary Version

This has been seen in the latest dev code for v22.

Operating System and Environment details

PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Linux 6.8.0-49-generic

x86_64

Log Fragments

No response
