Experimental: Multi-tenant import support in Vitess #15503
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15503      +/-   ##
==========================================
+ Coverage   65.78%   68.01%   +2.23%
==========================================
  Files        1561     1562       +1
  Lines      194838   195606     +768
==========================================
+ Hits       128171   133051    +4880
+ Misses      66667    62555    -4112
Looks good! Nice work on this. I see that there's a few FIXMEs left so I'll come back to this and review the new code. At this point I only had minor comments and notes.
Signed-off-by: Rohit Nayak <[email protected]>
…outing rules on complete Signed-off-by: Rohit Nayak <[email protected]>
…which is not reproduced locally Signed-off-by: Rohit Nayak <[email protected]>
…g e2e tests Signed-off-by: Rohit Nayak <[email protected]>
No more FIXMEs remaining. @mattlord please review when you get a chance...
… type for tenant id column Signed-off-by: Rohit Nayak <[email protected]>
Nice work on this, @rohit-nayak-ps ! Very clean. I only had a few very minor comments that I'll leave to your discretion. Thanks!
}

func init() {
	ApplyKeyspaceRoutingRules.Flags().StringVarP(&applyKeyspaceRoutingRulesOptions.Rules, "rules", "r", "", "Keyspace routing rules, specified as a string")
Do we mean a valid JSON document as a string or something else? This is where examples in the command definition can be helpful for the user. For example, see:
vitess/go/cmd/vtctldclient/command/vreplication/materialize/create.go
Lines 44 to 71 in fabd746
Example: `vtctldclient --server localhost:15999 materialize --workflow product_sales --target-keyspace commerce create --source-keyspace commerce --table-settings '[{"target_table": "sales_by_sku", "create_ddl": "create table sales_by_sku (sku varbinary(128) not null primary key, orders bigint, revenue bigint)", "source_expression": "select sku, count(*) as orders, sum(price) as revenue from corder group by sku"}]' --cells zone1 --cells zone2 --tablet-types replica`,
Long: `Materialize is a lower level VReplication command that allows for generalized materialization
of tables. The target tables can be copies, aggregations, or views. The target tables are kept
in sync in near-realtime. The primary flag used to define the materializations (you can have
multiple per workflow) is table-settings which is a JSON array where each value must contain
two key/value pairs. The first required key is 'target_table' and it is the name of the table
in the target-keyspace to store the results in. The second required key is 'source_expression'
and its value is the select query to run against the source table. An optional key/value pair
can also be specified for 'create_ddl' which provides the DDL to create the target table if it
does not exist -- you can alternatively specify a value of 'copy' if the target table schema
should be copied as-is from the source keyspace. Here's an example value for table-settings:
[
  {
    "target_table": "customer_one_email",
    "source_expression": "select email from customer where customer_id = 1"
  },
  {
    "target_table": "states",
    "source_expression": "select * from states",
    "create_ddl": "copy"
  },
  {
    "target_table": "sales_by_sku",
    "source_expression": "select sku, count(*) as orders, sum(price) as revenue from corder group by sku",
    "create_ddl": "create table sales_by_sku (sku varbinary(128) not null primary key, orders bigint, revenue bigint)"
  }
]
`,
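To make the question about the `--rules` flag concrete, here is a minimal sketch of what accepting "a valid JSON document as a string" could look like. The struct shapes and the `from_keyspace` / `to_keyspace` field names are assumptions made for illustration only and are not taken from this PR; `parseRules` is a hypothetical helper, not a function in the Vitess codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// keyspaceRoutingRule and keyspaceRoutingRules are hypothetical shapes used
// only for this illustration; the real flag format is defined by the PR.
type keyspaceRoutingRule struct {
	FromKeyspace string `json:"from_keyspace"`
	ToKeyspace   string `json:"to_keyspace"`
}

type keyspaceRoutingRules struct {
	Rules []keyspaceRoutingRule `json:"rules"`
}

// parseRules decodes the string passed to a --rules style flag. It rejects
// input that is not valid JSON, unknown field names, and rules that leave
// either keyspace name empty.
func parseRules(s string) (*keyspaceRoutingRules, error) {
	dec := json.NewDecoder(strings.NewReader(s))
	dec.DisallowUnknownFields()
	var krr keyspaceRoutingRules
	if err := dec.Decode(&krr); err != nil {
		return nil, fmt.Errorf("rules must be a valid JSON document: %w", err)
	}
	for i, r := range krr.Rules {
		if r.FromKeyspace == "" || r.ToKeyspace == "" {
			return nil, fmt.Errorf("rule %d: from_keyspace and to_keyspace are both required", i)
		}
	}
	return &krr, nil
}

func main() {
	// Example flag value (assumed format, see the note above).
	input := `{"rules": [{"from_keyspace": "tenant1", "to_keyspace": "target"}]}`
	rules, err := parseRules(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range rules.Rules {
		fmt.Printf("%s -> %s\n", r.FromKeyspace, r.ToKeyspace)
	}
}
```

Whatever the final format is, putting one such value in the command's Example text, as the materialize command does, would answer the question for users at the point where they need it.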
Description
Please read the design RFC first at #15403.
This proposal outlines support for importing data from a multi-tenant database cluster into a single Vitess cluster.
Motivation
The user has a multi-tenant database architecture: several separate MySQL databases with a common schema, one per tenant. Each database will be imported using one MoveTables workflow, with an external keyspace pointing to the source and a common target keyspace. All tables, in both the source and the target, will include a tenant_id column to uniquely identify the tenant.
We can run multiple such workflows in parallel until all tenants are imported. As with normal imports, we would also like reverse workflows to be running, to allow for rollback.
Ideally, routing rules should be established. This ensures that queries targeting a specific tenant ID are routed correctly to the target if the tenant has already been imported. Otherwise, they should route to the source if the import workflow is in progress but not yet switched to the target.
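The routing decision described above can be sketched as a small lookup. This is a conceptual illustration only: the `router` type, the `tenantState` values, and `keyspaceFor` are hypothetical names, not part of Vitess, and the sketch assumes tenant IDs are integers.

```go
package main

import "fmt"

// tenantState is a hypothetical view of where a tenant's import stands.
type tenantState int

const (
	notStarted tenantState = iota // no import workflow yet
	importing                     // workflow running, traffic not yet switched
	switched                      // traffic has been switched to the target
)

// router is a hypothetical structure holding per-tenant routing information.
type router struct {
	sourceKeyspace map[int64]string      // tenant ID -> external source keyspace
	targetKeyspace string                // the common target keyspace
	state          map[int64]tenantState // tenant ID -> import state
}

// keyspaceFor returns the keyspace that should serve queries for a tenant:
// the target once traffic is switched, otherwise the tenant's source.
func (r *router) keyspaceFor(tenantID int64) (string, error) {
	src, ok := r.sourceKeyspace[tenantID]
	if !ok {
		return "", fmt.Errorf("unknown tenant %d", tenantID)
	}
	if r.state[tenantID] == switched {
		return r.targetKeyspace, nil
	}
	return src, nil
}

func main() {
	r := &router{
		sourceKeyspace: map[int64]string{1: "tenant1_source", 2: "tenant2_source"},
		targetKeyspace: "target",
		state:          map[int64]tenantState{1: switched, 2: importing},
	}
	for _, id := range []int64{1, 2} {
		ks, err := r.keyspaceFor(id)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("tenant %d -> %s\n", id, ks)
	}
}
```

Keeping the state per tenant is what lets many workflows run in parallel: switching one tenant changes only that tenant's entry.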
Many setups with multi-tenancy will use a multi-schema approach, where each tenant has their own MySQL database on a
single server. In this case the queries will use a database qualifier, which matches their database name. So when they
are running the migrations into Vitess, they will continue to use the same database qualifier.
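As a rough illustration of why the application can keep its existing qualifier, the sketch below resolves a qualified table name through a map from qualifier to serving keyspace. The map contents and the `routeTable` helper are hypothetical, and this is not how Vitess implements the lookup internally.

```go
package main

import (
	"fmt"
	"strings"
)

// routeTable splits "qualifier.table" and looks the qualifier up in rules,
// a hypothetical map from database qualifier to the keyspace serving it.
// Qualifiers without a rule are returned unchanged; unqualified names yield
// an empty keyspace so the caller can apply its default.
func routeTable(rules map[string]string, name string) (keyspace, table string) {
	qualifier, tbl, found := strings.Cut(name, ".")
	if !found {
		return "", name
	}
	if to, ok := rules[qualifier]; ok {
		return to, tbl
	}
	return qualifier, tbl
}

func main() {
	// Assumed rule: the tenant's old database name now maps to the target keyspace.
	rules := map[string]string{"tenant1_db": "target"}
	for _, name := range []string{"tenant1_db.orders", "tenant2_db.orders", "orders"} {
		ks, tbl := routeTable(rules, name)
		fmt.Printf("%q -> keyspace=%q table=%q\n", name, ks, tbl)
	}
}
```

The query text stays the same before and after the migration; only the rule entry changes.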
Assumptions
The tenant_id column should be part of the primary key of each table.
Notes
Related Issue(s)
#15403
Checklist