Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Avoid creating MoveTables in case of non-empty target tables #16826

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions go/vt/vtctl/workflow/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -1395,6 +1395,29 @@ func (s *Server) moveTablesCreate(ctx context.Context, req *vtctldatapb.MoveTabl
}
s.Logger().Infof("Found tables to move: %s", strings.Join(tables, ","))

// Check if any table being moved is already non-empty in the target keyspace.
// Skip this check for multi-tenant migrations.
if req.GetWorkflowOptions().GetTenantId() == "" {
targetKeyspaceTables, err := getTablesInKeyspace(ctx, sourceTopo, s.tmc, targetKeyspace)
if err != nil {
return nil, err
}

var alreadyExistingTables []string
for _, t := range targetKeyspaceTables {
if slices.Contains(tables, t) {
alreadyExistingTables = append(alreadyExistingTables, t)
}
}

if len(alreadyExistingTables) > 0 {
err = validateEmptyTables(ctx, sourceTopo, s.tmc, targetKeyspace, alreadyExistingTables)
if err != nil {
return nil, err
}
}
}

if !vschema.Sharded {
// Save the original in case we need to restore it for a late failure
// in the defer().
Expand Down
53 changes: 53 additions & 0 deletions go/vt/vtctl/workflow/utils.go
Original file line number Diff line number Diff line change
Expand Up @@ -1011,3 +1011,56 @@ func applyTargetShards(ts *trafficSwitcher, targetShards []string) error {
}
return nil
}

// validateEmptyTables checks if all specified tables in the keyspace are empty across all shards.
// It queries each shard's primary tablet and if any non-empty table is found, it returns an error
// containing a list of non-empty tables.
func validateEmptyTables(ctx context.Context, ts *topo.Server, tmc tmclient.TabletManagerClient, keyspace string, tables []string) error {
rohit-nayak-ps marked this conversation as resolved.
Show resolved Hide resolved
shards, err := ts.GetServingShards(ctx, keyspace)
if err != nil {
return err
}
if len(shards) == 0 {
return fmt.Errorf("keyspace %s has no shards", keyspace)
}

isFaultyTable := map[string]bool{}
for _, shard := range shards {
rohit-nayak-ps marked this conversation as resolved.
Show resolved Hide resolved
primary := shard.PrimaryAlias
if primary == nil {
return fmt.Errorf("shard does not have a primary: %v", shard.ShardName())
}

ti, err := ts.GetTablet(ctx, primary)
if err != nil {
return err
}

var selectQueries []string
for _, t := range tables {
selectQueries = append(selectQueries, fmt.Sprintf("(select '%s' from %s limit 1)", t, t))
}
query := strings.Join(selectQueries, "union all")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do this for every shard? Can't we build the query once and then use it on each shard?

Keep in mind that the size of this query is unbounded, so we could hit max_allowed_packet. I want to say that there's also a limit on the number of UNIONs you can do in a single statement but I didn't find any docs on that in a quick search. For JOINs e.g.:

The maximum number of tables that can be referenced in a single join is 61. This includes a join handled by merging derived tables and views in the FROM clause into the outer query block (see [Section 10.2.2.4, “Optimizing Derived Tables, View References, and Common Table Expressions with Merging or Materialization”](https://dev.mysql.com/doc/refman/8.4/en/derived-table-optimization.html)).

So that may be what we bump up against here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Matt for finding this. Fixed that. But even I couldn't find out the limit of UNIONs on any docs or web search. What do you think should be the limit? Should we break the query down into multiple queries?


res, err := tmc.ExecuteFetchAsDba(ctx, ti.Tablet, true, &tabletmanagerdatapb.ExecuteFetchAsDbaRequest{
Query: []byte(query),
MaxRows: uint64(len(tables)),
})
if err != nil {
return err
}
for _, row := range res.Rows {
isFaultyTable[string(row.Values)] = true
}
}

var faultyTables []string
for table := range isFaultyTable {
faultyTables = append(faultyTables, table)
}

if len(faultyTables) > 0 {
return fmt.Errorf("target keyspace contains following non-empty table(s): %s", strings.Join(faultyTables, ", "))
}
return nil
}
Loading