diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 90e1507..b158d45 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -62,6 +62,27 @@ Install the egg locally:
     # install pgbelt and dev tools with make **setup**
     make setup
 
+### Better understanding of `pgbelt` and how it works
+
+To gain a better understanding of how the tool works, which helps with development, please read the [extended knowledge document](docs/extended_knowledge.md)!
+
+### How to spin up a local replication task for development
+
+This feature is very useful when you are making code changes to `pgbelt` and want to test against live databases to ensure correct behavior.
+
+To do this, this local development feature uses `docker` and `docker-compose` to spin up the following:
+
+1. Two Postgres containers with networking configured between each other.
+2. One container loaded with your local copy of `pgbelt`, built and installed for CLI usage.
+
+Simply run the following to spin the above up and drop yourself into your container with your local `pgbelt`:
+
+    make local-dev
+
+Once you are done, you can exit out of the above container. Then, for cleanliness, please run the following to clean up `docker` and `docker-compose`:
+
+    make clean-docker
+
 ### How to test the project
 
 You will want to run the full test suite (including integration tests) to ensure your contribution causes no issues.
diff --git a/README.md b/README.md
index 35ac8af..5009ee3 100644
--- a/README.md
+++ b/README.md
@@ -46,6 +46,10 @@ Install pgbelt locally:
 
 See [this doc](docs/quickstart.md)!
 
+## Playbook
+
+This playbook is actively updated. If you have any issues, you may find solutions in [this playbook](docs/playbook.md).
+
 ## Contributing
 
 We welcome contributions! See [this doc](CONTRIBUTING.md) on how to do so, including setting up your local development environment.
diff --git a/docs/extended_knowledge.md b/docs/extended_knowledge.md
new file mode 100644
index 0000000..1ec9b60
--- /dev/null
+++ b/docs/extended_knowledge.md
@@ -0,0 +1,27 @@
+# Extended Knowledge with `pgbelt`
+
+## How `pglogical` replication works
+
+### How a replication task works logically
+
+- We have a replication task that runs in **two phases**:
+  1. Full Load / Bulk Sync. Moving the majority of data takes a lot of time, so it is all dumped and loaded **at a specific timestamp**. While this occurs, any ongoing changes to the dataset from that timestamp onwards are stored in a **replication slot**.
+  2. Once the above step is finished, ongoing changes are consumed from the source database's replication slot and replayed on the destination database. This is an ongoing process.
+
+### Pglogical Components for a Replication task
+
+- Node - A way of telling pglogical about the existence of an external database, along with the credentials to connect with.
+- Subscription - A replication task initiated from the side of the subscribing node, or destination database.
+- Replication Set - A set of tables to replicate, along with settings for which action/statement types to replicate.
+  - We replicate **all** actions, but the list of tables to replicate may vary. We replicate all tables in a database major version upgrade, but only subsets for "exodus-style" migrations.
+
+### What `pgbelt` does with the above components:
+
+- Configure the pglogical nodes for the external database in both the source and destination databases.
+- For forward replication (source to destination)
+  - Create a new replication set in the source DB, and add all required tables to it.
+  - Start a new subscription from the destination DB, referencing the above replication set.
+- For reverse replication (destination to source)
+  - Create a new replication set in the destination DB, and add all required tables to it.
+  - Start a new subscription from the source DB, referencing the above replication set, **and with synchronize_structure off**.
+    - The last flag ensures no full load sync occurs from the destination DB (incomplete/empty) to the source database. It will only replicate transactions other than the incoming forward replication statements.
diff --git a/docs/playbook.md b/docs/playbook.md
new file mode 100644
index 0000000..43c85aa
--- /dev/null
+++ b/docs/playbook.md
@@ -0,0 +1,15 @@
+# Playbook
+
+## I see an incorrect credential error with the `pglogical` user when setting up replication. What do I do?
+
+It is very possible you have multiple people using the `pgbelt` tool to set up replication. The `pglogical` password may be different in each person's config, and the `setup` stage uses the password from the config to create the `pglogical` role in your databases.
+
+Therefore, the first person to run `setup` has set the `pglogical` user's password in the databases. The error likely comes from `pglogical` mentioning a `node` configuration, where the password is set.
+
+For information on `nodes` in the `pglogical` plugin, please see the `Extended Knowledge` document in this repository.
+
+To remedy this issue, you can perform the following:
+
+1. If you see the error with the entire DSN (including password and IP address or hostname), identify whether the host is the **source** or **destination** database.
+2. Once identified, run the following to PSQL into that host: `psql "$(belt -dsn )"`
+3. In that PSQL terminal, run the following to set the password according to the `node` configuration: `ALTER ROLE pglogical PASSWORD '';`