Bug Report: Pointing vttablet to different (unmanaged) MySQL makes it get stuck in loop #15212
Comments
@wiebeytec To clarify, you stopped the My guess would be that it's trying to perform an operation that it cannot. Can you please share the full list of flags that you're using, with sensitive info redacted as necessary? I'm guessing it's related to this: #14871
First: I figured out what was wrong, and it was my mistake. So, the issue is one of logging errors. First to answer your questions:
Indeed. I just stopped it and started it again, with a new
I edited the top comment with the information. It also seemed to me it was trying to do something it cannot, but it didn't log that. The problem was that I forgot to pick the correct 'parameter group' for the settings. So it's one of these settings that made it fail:
Thanks, @wiebeytec. So not being able to start makes sense, but not having any logs for it does not. We should be able to repeat it easily with the local examples by editing one of the tablet's my.cnf files to add
@wiebeytec Here's a test case which should demonstrate the lack of a clear error message:
But clearly there's something missing here. What MySQL config option(s) in this parameter group did you think were the cause? Are you using the Vitess Operator here?
(We're not using Vitess Operator.) I confirmed it was
OK, I'm unable to repeat what you saw (I modified the test case above to set sql_mode instead of gtid_mode and I did not see any problems). I'm not sure what part I may be missing. I'm going to close this for now but if you have more info we can re-open it at any time.
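Editorial sketch (not part of the original thread): since the root cause was traced to a wrong RDS parameter group, one way to narrow down the offending setting is to diff the server variables of the old and the new instance. The following is a minimal sketch in Python, assuming the output of `SHOW GLOBAL VARIABLES` from each instance has already been loaded into a dict. The function name `diff_mysql_variables` and all sample values are hypothetical and are not taken from the reporter's environment.

```python
def diff_mysql_variables(old, new, ignore=()):
    """Return {name: (old_value, new_value)} for every variable that differs.

    `old` and `new` map variable names to values, e.g. as collected from
    `SHOW GLOBAL VARIABLES` on each instance (collection is not shown here).
    Names are compared case-insensitively; names listed in `ignore` are skipped.
    A variable missing on one side is reported with None for that side.
    """
    ignored = {name.lower() for name in ignore}
    old_norm = {name.lower(): value for name, value in old.items()}
    new_norm = {name.lower(): value for name, value in new.items()}

    diff = {}
    for name in sorted(set(old_norm) | set(new_norm)):
        if name in ignored:
            continue
        before = old_norm.get(name)
        after = new_norm.get(name)
        if before != after:
            diff[name] = (before, after)
    return diff


# Hypothetical sample data, for illustration only.
old_instance = {
    "gtid_mode": "ON",
    "sql_mode": "STRICT_TRANS_TABLES",
    "hostname": "old-host",
}
new_instance = {
    "gtid_mode": "OFF",
    "sql_mode": "STRICT_TRANS_TABLES",
    "hostname": "new-host",
}

# The hostname is expected to change after a snapshot restore, so skip it.
changed = diff_mysql_variables(old_instance, new_instance, ignore=["hostname"])
for name, (before, after) in changed.items():
    print(f"{name}: {before!r} -> {after!r}")
```

Any variable reported by such a diff is only a candidate; whether it actually prevents vttablet from serving would still need to be confirmed against the tablet itself.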
Overview of the Issue
I have an unmanaged MySQL at Amazon RDS with a vttablet in front of it. I replaced the backing MySQL with one restored from snapshot. Amazon makes you restore that into a new instance, so the hostname changes. But, all users and permissions and stuff are the same.
I changed the `--db_host` argument to `vttablet`, after which vttablet got stuck `NOT_SERVING`:
Every second it prints log lines about discovering its state. I pasted it in the box below.
Pointing `--db_host` back to the original made it work again.
Looking in etcd with `etcdctl get --prefix /` I see references to the hostname. Apparently there is some split brain situation going on. The `--db_host` does do something, because MySQL reports a connection, but it's not matched to the original DB perhaps?
Best expected behavior is that it just works. Giving a proper error message is also at least something. It kind of depends what falls in the Vitess 'way'. I know that caring about one MySQL instance is not normal, but augmenting an RDS instance with Vitess seems valid to me.
Reproduction Steps
Point `vttablet` to an exact copy of the database by changing `--db_host`.
Binary Version
Version `vttablet version Version: 18.0.2 (Git revision d3012c188ea0cfc6837917fc6642ea23be9bb1ff branch 'HEAD') built on Wed Dec 20 14:27:31 UTC 2023 by runner@fv-az975-901 using go1.21.5 linux/amd64`
Operating System and Environment details
This is how `vttablet` is started: