fix: sync command breakdown update and remove useless commands (#622)
* fix: sync command breakdown update and remove useless commands

* doc: update format

* doc: move breakdown to playbook
vjeeva authored Nov 19, 2024
1 parent c56872f commit 558a28d
Showing 3 changed files with 37 additions and 45 deletions.
35 changes: 35 additions & 0 deletions docs/playbook.md
@@ -55,3 +55,38 @@ Run the following commands:
Note that the first four commands will remove all replication job setup from the databases. `remove-constraints` removes NOT VALID constraints from the target schema so that they don't cause failed inserts when you restart replication (these must not exist during the initial setup). `remove-indexes` removes all indexes from the target schema to help speed up the initial bulk load. Running `remove-indexes` is optional; you may skip it if needed.

After running these commands, you can `TRUNCATE` the tables in the destination database and start the migration from the beginning. **Please take as much precaution as possible when running TRUNCATE, as it will delete all data in the tables. In particular, make sure you are running it on the correct database!**

## My `sync` command has failed or is hanging. What can I do?

The `sync` command from Step 7 of the Quickstart guide does the following:

- Sync sequence values
- Dump and load tables without Primary Keys
- Add NOT VALID constraints to the target schema (they were removed in Step 1 in the target database)
- Create indexes (if indexes were already created in Step 2, this is skipped. If Step 2 was missed, indexes will be built now and this will take longer than expected).
- Validate data (take 100 random rows and 100 last rows of each table, and compare data)
- Run ANALYZE to ensure optimal performance
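
The validation step in the list above can be pictured with a small sketch. This is not pgbelt's implementation: it assumes rows are held in memory as tuples whose first element is an orderable primary key, and `sample_keys` and `mismatched_keys` are hypothetical helper names used only for illustration.

```python
import random


def sample_keys(keys, n=100, seed=None):
    """Pick up to n random keys plus the last n keys (by sort order)."""
    ordered = sorted(keys)
    rng = random.Random(seed)
    # Random sample, capped at the table size for small tables.
    random_part = rng.sample(ordered, min(n, len(ordered)))
    # The "last n rows" are taken as the n highest keys.
    last_part = ordered[-n:] if n > 0 else []
    return sorted(set(random_part) | set(last_part))


def mismatched_keys(src_rows, dst_rows, n=100, seed=None):
    """Return the sampled keys whose rows differ between source and destination."""
    src = {row[0]: row for row in src_rows}
    dst = {row[0]: row for row in dst_rows}
    # A key missing from the destination counts as a mismatch.
    return [k for k in sample_keys(src.keys(), n, seed) if src[k] != dst.get(k)]
```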

If the `sync` command fails, you can try to run the individual commands that make up the `sync` command to see where the failure is. The individual commands are:

1. Syncing Sequences:

- `sync-sequences` - reads sequence values from SRC and sets them on DST at the time of command execution

2. Syncing Tables without Primary Keys:

- `dump-tables` - dumps only tables without Primary Keys (to ensure only tables without Primary Keys are dumped, DO NOT specify the `--tables` flag for this command)
- `load-tables` - loads the tables written to disk by the `dump-tables` command into the DST DB

3. Syncing NOT VALID Constraints:

- `dump-schema` - dumps the schema from your SRC DB onto disk (the files may already be on disk, but run this command anyway to make sure they exist)
- `load-constraints` - loads NOT VALID constraints from disk (obtained by the `dump-schema` command) into your DST DB schema

4. Creating Indexes & Running ANALYZE:

- `create-indexes` - creates indexes on the target database, then runs ANALYZE.

5. Validating Data:

- `validate-data` - checks 100 random rows and the last 100 rows of every table involved in the replication job, and ensures they all match exactly.
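
One way to find the failing step is to run the individual commands in the order listed above and stop at the first one that fails. The sketch below is only an illustration: it assumes every subcommand takes the same `<datacenter> <database>` arguments as `belt sync` in the Quickstart, and the names `SYNC_STEPS`, `build_commands`, and `run_until_failure` are hypothetical, not part of pgbelt.

```python
import subprocess

# Subcommands in the order given in the breakdown above.
SYNC_STEPS = [
    "sync-sequences",
    "dump-tables",  # no --tables flag, so only tables without Primary Keys are dumped
    "load-tables",
    "dump-schema",
    "load-constraints",
    "create-indexes",
    "validate-data",
]


def build_commands(datacenter, database, steps=SYNC_STEPS):
    """Build one argv list per step; assumes the form `belt <step> <dc> <db>`."""
    return [["belt", step, datacenter, database] for step in steps]


def run_until_failure(commands, runner=None):
    """Run commands in order; return the first failing argv, or None if all succeed."""
    if runner is None:
        # Default runner: execute the command and report its exit code.
        runner = lambda argv: subprocess.run(argv).returncode
    for argv in commands:
        if runner(argv) != 0:
            return argv
    return None
```

Calling `run_until_failure(build_commands("testdatacenter1", "database1"))` returns the argv of the first step that exited with a non-zero status, which is the step to investigate. The `runner` parameter exists so the loop can be exercised without a live database.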
12 changes: 2 additions & 10 deletions docs/quickstart.md
@@ -179,23 +179,15 @@ Therefore the next command will do the following:
- Sync sequence values
- Dump and load tables without Primary Keys
- Add NOT VALID constraints to the target schema (they were removed in Step 1 in the target database)
- Create Indexes (as long as this was run in Step 2, this will be glossed over. If step 2 was missed, indexes will build now amnd this will take longer than expected).
- Create Indexes (as long as this was run in Step 2, this will be glossed over. If step 2 was missed, indexes will build now and this will take longer than expected).
- Validate data (take 100 random rows and 100 last rows of each table, and compare data)
- Run ANALYZE to ensure optimal performance

```
$ belt sync testdatacenter1 database1
```

If the above command fails, you can diagnose and run the individual steps with the following commands:

- `sync-sequences` - reads and sets sequences values from SRC to DST at the time of command execution
- `dump-tables` - dumps only tables without Primary Keys
- `load-tables` - load into DST DB the tables from the `dump-tables` command (found on disk)
- `dump-contraints` - dumps NOT VALID constraints from your SRC DB schema onto disk
- `load-constraints` - load NOT VALID constraints from disk to your DST DB schema
- `validate-data` - Check random 100 rows and last 100 rows of every table involved in the replication job, and ensure all match exactly.
- `analyze` - Run ANALYZE on the database
If the above command fails, please see the `playbook.md` document in this repository for more information on how to resolve the issue.

## Step 8: Enable write traffic to the destination host

35 changes: 0 additions & 35 deletions pgbelt/cmd/sync.py
@@ -108,40 +108,6 @@ async def load_tables(
    await load_dumped_tables(conf, tables, logger)


@run_with_configs
async def sync_tables(
    config_future: Awaitable[DbupgradeConfig],
    tables: list[str] = Option([], help="Specific tables to sync"),
):
    """
    Dump and load all tables from the source database to the destination database.
    Equivalent to running dump-tables followed by load-tables. Table data will be
    saved locally in files.
    You may also provide a list of tables to sync with the
    --tables option and only these tables will be synced.
    """
    conf = await config_future
    src_logger = get_logger(conf.db, conf.dc, "sync.src")
    dst_logger = get_logger(conf.db, conf.dc, "sync.dst")

    if tables:
        dump_tables = tables.split(",")
    else:
        async with create_pool(conf.src.pglogical_uri, min_size=1) as src_pool:
            _, dump_tables, _ = await analyze_table_pkeys(
                src_pool, conf.schema_name, src_logger
            )

        if conf.tables:
            dump_tables = [t for t in dump_tables if t in conf.tables]

    await dump_source_tables(conf, dump_tables)
    await load_dumped_tables(
        conf, [] if not tables and not conf.tables else dump_tables, dst_logger
    )

@run_with_configs(skip_src=True)
async def analyze(config_future: Awaitable[DbupgradeConfig]) -> None:
    """
@@ -276,7 +242,6 @@ async def sync(
    sync_sequences,
    dump_tables,
    load_tables,
    sync_tables,
    analyze,
    validate_data,
    sync,