Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

source-mysql: support for replica-only capture #1859

Open
willdonnelly opened this issue Aug 27, 2024 · 3 comments
Open

source-mysql: support for replica-only capture #1859

willdonnelly opened this issue Aug 27, 2024 · 3 comments
Assignees

Comments

@willdonnelly
Copy link
Member

willdonnelly commented Aug 27, 2024

For some of the SQL databases which we have CDC connectors for, it ought to be possible to support capturing from read-only replicas. The major obstacle to this is that we currently rely on watermark writes as our primary mechanism for establishing a causal relationship between backfill queries and the replication stream so that our incremental backfills will be exactly correct, and for obvious reasons issuing writes into the watermarks table can't be done from a read-only replica.

In MySQL (and likely Postgres and SQL Server as well, but they're outside of the initial scope of this work) it should be possible to outright replace the watermarking logic with an equivalent implementation in which watermark writes are replaced by querying the current endpoint of the binlog, and observing a watermark change is replaced by observing a transaction commit event in the binlog whose position is >= that position. Since the endpoint query will be occurring on the same primary database connection as backfill queries and will be issued after a backfill query completes, it should provide the same causality guarantees that watermark writes currently do.

A short summary of the planned sequence of changes:

  • A large refactoring of the interface between generic SQL CDC and database-specific logic, which introduces a concept of "streaming change events until the next fence" and makes watermark writes and watermark-detection an implementation detail of how specific connectors implement that stream-to-fence operation.
  • Reimplementing the MySQL stream-to-fence operation in terms of binlog positions rather than watermarks.
  • Cleaning up as much of the old watermark-related code from the MySQL connector as possible.
  • A bunch of stress testing to make sure it actually works in all circumstances and that we can actually support read-only replica captures after these changes.
@willdonnelly willdonnelly self-assigned this Aug 27, 2024
@willdonnelly
Copy link
Member Author

The implementation of MySQL watermark-less captures is going well and appears to be working. One question which recently came up was whether our consistent rate of watermark writes was serving any other important purposes which might be broken if we outright remove them.

In some databases this is definitely true. For instance, in Postgres if the database we're capturing from is entirely idle but it's hosted on a server with other databases which are active, our replication slot LSN will never advance (because there's no new changes for us to confirm that we've read up to) but the Postgres WAL is shared across all databases so it will keep growing until the server either runs out of disk or invalidates the replication slot. So if/when we decide to add read-only replica support for Postgres we will have to think about providing users with some mechanism to keep issuing some sort of heartbeat, and/or warning them and providing setup instructions for generating a heartbeat outside of our connector.

But that's not the case in MySQL. While it's always possible that users could have come to rely on our watermark writes for other non-CDC related reasons it's unlikely, so there is only one "useful side effect" that I am aware of, and that's when a user wants to capture changes to MyISAM tables.

The issue with MyISAM tables is that the MyISAM storage engine doesn't support transactions, and what that means in concrete terms is that changes to a MyISAM table write out change event(s) into the binlog but there is never a transaction commit event following those. Currently the MySQL CDC connector relies on transaction commit events to tell when it's okay to emit a Flow checkpoint, and so if a database is solely using MyISAM tables for their data our guidance in the past was to make sure the watermarks table was created using InnoDB so that watermark writes would cause there to be at least one commit event in the binlog every minute.

After going back and forth for a bit internally, we've concluded that the best solution is to implement support for MyISAM tables so that they actually work correctly. Every change to a MyISAM table is, practically speaking, an independent transaction which commits immediately even when issued in the context of an InnoDB transaction. So in theory, all we should need to do is modify our discovery/metadata logic to keep track of whether a table is using a non-transactional storage engine and then if so we should treat each change to such a table as being followed by an implicit commit event.

@willdonnelly
Copy link
Member Author

The main PR (#1878) addressing this issue is now out for review. Once that's released, we will need to:

  • Update various connector documentation to remove all mention of the watermarks table and possibly also call out read-replica captures as a thing we explicitly support.
  • Wait a few days to make sure everything is running happily in production, and then finish removing the deprecated watermarks_table and heartbeat_interval config properties.

@willdonnelly
Copy link
Member Author

PR #1878 has been live for a few hours and nothing has exploded so far, which means that MySQL read replica captures are now something that works in production. I'll get the aforementioned documentation updates out soon and then next week I'll finish the job of removing the deprecated config properties.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant