-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ApplySchema: log selected flags #14798
ApplySchema: log selected flags #14798
Conversation
Signed-off-by: Shlomi Noach <[email protected]>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
Made a small change: logging from inside VtctldServer.ApplySchema (server side) instead of in multiple client side endpoints. |
@@ -221,6 +221,8 @@ func (s *VtctldServer) ApplyShardRoutingRules(ctx context.Context, req *vtctldat | |||
|
|||
// ApplySchema is part of the vtctlservicepb.VtctldServer interface. | |||
func (s *VtctldServer) ApplySchema(ctx context.Context, req *vtctldatapb.ApplySchemaRequest) (resp *vtctldatapb.ApplySchemaResponse, err error) { | |||
log.Infof("VtctldServer.ApplySchema: keyspace=%s, migrationContext=%v, ddlStrategy=%v, batchSize=%v", req.Keyspace, req.MigrationContext, req.DdlStrategy, req.BatchSize) | |||
|
|||
span, ctx := trace.NewSpan(ctx, "VtctldServer.ApplySchema") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's better to add these things to the trace span and add stats for specific cases that you want to be able to track.
Our logs are already getting noisy and it's causing practical problems in some cases.
Is a new log message on each invocation really the best option? If so, then of course we can add it. I just wanted to ask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ApplySchema
invocations are sparse; those are normally human generated per schema change. So on most production systems this would amount to a single digit number per day, or per week. It's also possible to use ApplySchema
to control migrations (e.g. ALTER VITESS_MIGRATION ... COMPLETE
etc.). But again those are a handful of potential extra calls. So in terms of overhead I think the price is insignificant.
Let me look into trace span though, I don't have a good grasp of how that works and how that differs from logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trace span would be useful when tracing, a stat would be useful if we want to track X over the course of a process lifespan (e.g. ApplySchema statements with option X). I'm OK with adding the log message, just wondering if it's really the best option in this case. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I don't dig how tracing or stats would be useful here. I understand that our logging is sometimes bloated, but in this particular case I don't find it to be an issue, as schema changes are relatively rare and human generated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we think this log message will be helpful/useful, and to whom? What do we hope to get from it? I don't see that stated clearly, but the subject says: log selected flags
So I'm guessing we want to know how many times given flags are specified per keyspace? That's the kind of thing a multi-label stat can provide. But I also don't know if this is thought to be useful in debugging a known issue and may soon be removed (i.e. it's only useful to Vitess developers) or if we think this kind of information is generally usable (useful to Vitess users/administrators). If it's the former then the log makes sense to me, if it's the latter then I think stats are better.
I know we don't think that ApplySchema is called very often, but we've thought this about various things in the past and been wrong -- at least when you consider aggregated logging with up to 1,000s of tablets involved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I added no context to this change. The purpose is not to do a weekly summary of how many this and that calls were made. The purpose is to be able to debug migrations that go wrong/fail, and to see what the exact invocation command was used, so we can try and reproduce them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to have this as a means to debug unexpected behavior for something that shouldn't be called very often. Thanks!
Description
This simple PR adds more information in
ApplySchema
invocation, both invtctl
and invtctldclient
.Related Issue(s)
None.
Checklist
Deployment Notes