From 9a1bc78fee429316783a55629509633cfabcce3a Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Tue, 20 Jun 2023 11:24:09 -0400 Subject: [PATCH 01/10] edits to using TPA topic --- product_docs/docs/pgd/5/tpa.mdx | 109 ++++++++++++++++---------------- 1 file changed, 56 insertions(+), 53 deletions(-) diff --git a/product_docs/docs/pgd/5/tpa.mdx b/product_docs/docs/pgd/5/tpa.mdx index d2554ccacd2..f60cced92eb 100644 --- a/product_docs/docs/pgd/5/tpa.mdx +++ b/product_docs/docs/pgd/5/tpa.mdx @@ -11,23 +11,21 @@ redirects: - ../deployments/using_tpa/ --- -The standard way of deploying EDB Postgres Distributed in a self managed -setting, including physical and virtual machines, both self-hosted and in the -cloud (EC2), is to use EDB's deployment tool: [Trusted Postgres -Architect](/tpa/latest/) (TPA). +The standard way of deploying EDB Postgres Distributed in a self-managed +setting is to use EDB's deployment tool: [Trusted Postgres +Architect](/tpa/latest/) (TPA). This applies to physical and virtual machines, both self-hosted and in the +cloud (EC2). -!!! Note **Get started with PGD quickly** +!!! Note Get started with PGD quickly - If you want to experiment with a local deployment as quickly as possible, you can use your free trial account and [Deploying an EDB Postgres Distributed example cluster on Docker](/pgd/latest/quickstart/quick_start_docker) to configure, provision and deploy a PGD 5 Always-On cluster on Docker. + If you want to experiment with a local deployment as quickly as possible, you can use your free-trial account and [Deploying an EDB Postgres Distributed example cluster on Docker](/pgd/latest/quickstart/quick_start_docker) to configure, provision, and deploy a PGD 5 Always-On cluster on Docker. 
- If deploying to the cloud is your aim, use that same free trial account + If deploying to the cloud is your aim, use that same free-trial account and [Deploying and EDB Postgres Distributed example cluster on AWS](/pgd/latest/quickstart/quick_start_aws) to get a PGD 5 cluster on your own Amazon account. - And finally, don't forget that you can also use your free trial account and [Trusted Postgres Architect](/tpa/latest/) (TPA) to [deploy directly to your own bare metal servers](/tpa/latest/platform-bare). - - + You can also use your free-trial account and [Trusted Postgres Architect](/tpa/latest/) (TPA) to [deploy directly to your own bare metal servers](/tpa/latest/platform-bare). ## Prerequisite: Install TPA @@ -35,7 +33,7 @@ Architect](/tpa/latest/) (TPA). Before you can use TPA to deploy PGD, you must install TPA. Follow the [installation instructions in the Trusted Postgres Architect documentation](/tpa/latest/INSTALL/) before continuing. ## Configure -The `tpaexec configure` command generates a simple YAML configuration file to describe a cluster, based on the options you select. The configuration is ready for immediate use and you can modify it to better suit your needs. Editing the configuration file is the usual way to make any configuration changes to your cluster both before and after it's created. +The `tpaexec configure` command generates a simple YAML configuration file to describe a cluster, based on the options you select. The configuration is ready for immediate use, and you can modify it to better suit your needs. Editing the configuration file is the usual way to make any configuration changes to your cluster both before and after it's created. The syntax is: @@ -50,13 +48,13 @@ The available configuration options include: | `--architecture` | Required. Set to `PGD-Always-ON` for EDB Postgres Distributed deployments. | | `–-postgresql `
or
`--edb-postgres-advanced `
or
`--edb-postgres-extended ` | Required. Specifies the distribution and version of Postgres to use. For more details, see [Cluster configuration: Postgres flavour and version](/tpa/latest/tpaexec-configure/#postgres-flavour-and-version). | | `--redwood` or `--no-redwood` | Required when `--edb-postgres-advanced` flag is present. Specifies whether Oracle database compatibility features are desired. | -| `--location-names l1 l2 l3` | Required. Specifies the number and name of the locations PGD will be deployed to. | -| `--data-nodes-per-location N` | Specifies number of data nodes per location. Default 3. | -| `--add-witness-node-per-location` | For even number of data nodes per location, this will add witness node to allow for local consensus. This is enabled by default for 2 data node locations. | -| `--add-proxy-nodes-per-location` | Whether to separate PGD-Proxies from data nodes, and how many to configure. By default one proxy is configured and cohosted for each data node. | -| `--pgd-proxy-routing global\|local` | Should PGD-proxy routing be handled on a global or local (per location) basis. | -| `--add-witness-only-location loc` | This designates one of the cluster location as witness only (no data nodes will be present in that location). | -| `--enable-camo` | Sets up CAMO pair in each location. This only works with 2 data node per location. | +| `--location-names l1 l2 l3` | Required. Specifies the number and name of the locations to deploy PGD to. | +| `--data-nodes-per-location N` | Specifies the number of data nodes per location. Default is 3. | +| `--add-witness-node-per-location` | For an even number of data nodes per location, adds witness nodes to allow for local consensus. Enabled by default for 2 data node locations. | +| `--add-proxy-nodes-per-location` | Whether to separate PGD proxies from data nodes and how many to configure. By default one proxy is configured and cohosted for each data node. 
| +| `--pgd-proxy-routing global\|local` | Whether PGD proxy routing is handled on a global or local (per location) basis. | +| `--add-witness-only-location loc` | Designates one of the cluster locations as witness-only (no data nodes are present in that location). | +| `--enable-camo` | Sets up a CAMO pair in each location. Works only with 2 data nodes per location. | More configuration options are listed in the TPA documentation for [PGD-Always-ON](/tpa/latest/architecture-PGD-Always-ON/). @@ -73,64 +71,72 @@ For example: --pgd-proxy-routing global ``` -The first argument must be the cluster directory, for example, `speedy` or `~/clusters/speedy` (the cluster is named `speedy` in both cases). We recommend that you keep all your clusters in a common directory, for example, `~/clusters`. The next argument must be `--architecture` to select an architecture, followed by options. +The first argument must be the cluster directory, for example, `speedy` or `~/clusters/speedy`. (The cluster is named `speedy` in both cases.) We recommend that you keep all your clusters in a common directory, for example, `~/clusters`. The next argument must be `--architecture` to select an architecture, followed by options. + +The command creates a directory named `~/clusters/speedy` and generates a configuration file named `config.yml` that follows the layout of the PGD-Always-ON architecture. You can use the `tpaexec configure --architecture PGD-Always-ON --help` command to see the values that are supported for the configuration options in this architecture. + +In the example, the options select: -The command creates a directory named ~/clusters/speedy and generates a configuration file named `config.yml` that follows the layout of the PGD-Always-ON architecture. You can use the `tpaexec configure --architecture PGD-Always-ON --help` command to see what values are supported for the configuration options in this architecture. 
In the example , the options select an AWS deployment (`--platform aws`) with EDB Postgres Advanced, version 15 and Oracle compatibility (`--edb-postgres-advanced 15` and `--redwood`) with three locations (`--location-names eu-west-1 eu-north-1 eu-central-1`) and three data nodes at each location (`--data-nodes-per-location 3`). The last option sets the proxy routing policy to global (`--pgd-proxy-routing global`). +- An AWS deployment (`--platform aws`) +- EDB Postgres Advanced Server, version 15 and Oracle compatibility (`--edb-postgres-advanced 15` and `--redwood`) +- Three locations (`--location-names eu-west-1 eu-north-1 eu-central-1`) +- Three data nodes at each location (`--data-nodes-per-location 3`) +- Proxy routing policy of global (`--pgd-proxy-routing global`) ### Common configuration options -Other configuration options include: +Other configuration options include the following. #### Owner Every cluster must be directly traceable to a person responsible for the provisioned resources. -By default, a cluster is tagged as being owned by the login name of the user running `tpaexec provision`. If this name does not identify a person (for example, `postgres`, `ec2-user`), you must specify `--owner SomeId` to set an identifiable owner. +By default, a cluster is tagged as being owned by the login name of the user running `tpaexec provision`. If this name doesn't identify a person (for example, `postgres`, `ec2-user`), you must specify `--owner SomeId` to set an identifiable owner. -You may use your initials, or "Firstname Lastname", or anything else that identifies you uniquely. +You can use your initials, "Firstname Lastname", or any text that identifies you uniquely. #### Platform options -The default value for `--platform` is `aws`. It is the platform supported by the PGD-Always-ON architecture. +The default value for `--platform` is `aws`, which is the platform supported by the PGD-Always-ON architecture. 
-Specify `--region` to specify any existing AWS region that you have access to (and that permits the required number of instances to be created). The default region is eu-west-1. +Specify `--region` to specify any existing AWS region that you have access to and that allows you to create the required number of instances. The default region is eu-west-1. Specify `--instance-type` with any valid instance type for AWS. The default is t3.micro. ### Subnet selection -By default, each cluster is assigned a random /28 subnet under 10.33/16, but depending on the architecture, there may be one or more subnets, and each subnet may be anywhere between a /24 and a /29. +By default, each cluster is assigned a random /28 subnet under 10.33/16. However, depending on the architecture, there can be one or more subnets, and each subnet can be anywhere between a /24 and a /29. -Specify `--subnet` to use a particular subnet. For example, `--subnet 192.0.2.128/27`. +Specify `--subnet` to use a particular subnet, for example, `--subnet 192.0.2.128/27`. ### Disk space -Specify `--root-volume-size` to set the size of the root volume in GB. For example, `--root-volume-size 64`. The default is 16GB. (Depending on the image used to create instances, there may be a minimum size for the root volume.) +Specify `--root-volume-size` to set the size of the root volume in GB, for example, `--root-volume-size 64`. The default is 16GB. Depending on the image used to create instances, there might be a minimum size for the root volume. -For architectures that support separate postgres and barman volumes: +For architectures that support separate Postgres and Barman volumes: -Specify `--postgres-volume-size` to set the size of the Postgres volume in GB. The default is 16GB. +- Specify `--postgres-volume-size` to set the size of the Postgres volume in GB. The default is 16GB. -Specify `--barman-volume-size` to set the size of the Barman volume in GB. The default is 32GB. 
+- Specify `--barman-volume-size` to set the size of the Barman volume in GB. The default is 32GB. ### Distribution -Specify `--os` or `--distribution` to specify the OS to be used on the cluster's instances. The value is case-sensitive. +Specify `--os` or `--distribution` to specify the OS to use on the cluster's instances. The value is case sensitive. -The selected platform determines which distributions are available and which one is used by default. For more details, see `tpaexec info platforms/`. +The selected platform determines the distributions that are available and the one that's used by default. For more details, see `tpaexec info platforms/`. -In general, you can use "Debian", "RedHat", and "Ubuntu" to select TPA images that have Postgres and other software preinstalled (to reduce deployment times). To use stock distribution images instead, append "-minimal" to the value, for example, `--distribution Debian-minimal`. +In general, you can use `Debian`, `RedHat`, and `Ubuntu` to select TPA images that have Postgres and other software preinstalled (to reduce deployment times). To use stock distribution images instead, append `-minimal` to the value, for example, `--distribution Debian-minimal`. ### Repositories -When using TPA to deploy PDG 5 and later, TPA selects repositories from EDB Repos 2.0 and all software will be sourced from these repositories. +When using TPA to deploy PGD 5 and later, TPA selects repositories from EDB Repos 2.0. All software is sourced from these repositories. -To use [EDB Repos 2.0](https://www.enterprisedb.com/repos/) you must +To use [EDB Repos 2.0](https://www.enterprisedb.com/repos/), you must use `export EDB_SUBSCRIPTION_TOKEN=xxx` before you run tpaexec. You can get your subscription token from [the web interface](https://www.enterprisedb.com/repos-downloads). -Optionally, use `--edb-repositories repository …` to specify EDB repositories to install on each instance, in addition to the default repository. 
+Optionally, use `--edb-repositories repository …` to specify EDB repositories in addition to the default repository to install on each instance. ### Software versions -By default TPA uses the latest major version of Postgres. Specify `--postgres-version` to install an earlier supported major version, or specify both version and distribution via one of the flags described under [Configure](#configure), above. +By default, TPA uses the latest major version of Postgres. Specify `--postgres-version` to install an earlier supported major version, or specify both version and distribution using one of the flags described under [Configure](#configure). -By default, TPA always installs the latest version of every package. This is usually the desired behavior, but in some testing scenarios, it may be necessary to select specific package versions. For example, +By default, TPA installs the latest version of every package, which is usually the desired behavior. However, in some testing scenarios, you might need to select specific package versions. For example: ``` --postgres-package-version 10.4-2.pgdg90+1 @@ -141,16 +147,16 @@ By default, TPA always installs the latest version of every package. This is usu --pgbouncer-package-version '1.8*' ``` -Specify `--extra-packages` or `--extra-postgres-packages` to install additional packages. The former lists packages to install along with system packages, while the latter lists packages to install later along with postgres packages. (If you mention packages that depend on Postgres in the former list, the installation fails because Postgres is not yet installed.) The arguments are passed on to the package manager for installation without any modifications. +Specify `--extra-packages` or `--extra-postgres-packages` to install more packages. The former lists packages to install along with system packages. The latter lists packages to install later along with Postgres packages. 
(If you mention packages that depend on Postgres in the former list, the installation fails because Postgres isn't yet installed.) The arguments are passed on to the package manager for installation without any modifications. -The `--extra-optional-packages` option behaves like `--extra-packages`, but it is not an error if the named packages cannot be installed. +The `--extra-optional-packages` option behaves like `--extra-packages`, but it's not an error if the named packages can't be installed. ### Hostnames -By default, `tpaexec configure` randomly selects as many hostnames as it needs from a pre-approved list of several dozen names. This should be enough for most clusters. +By default, `tpaexec configure` randomly selects as many hostnames as it needs from a preapproved list of several dozen names, which is enough for most clusters. -Specify `--hostnames-from` to select names from a different list (for example, if you need more names than are available in the canned list). The file must contain one hostname per line. +Specify `--hostnames-from` to select names from a different list, for example, if you need more names than are available in the supplied list. The file must contain one hostname per line. -Specify `--hostnames-pattern` to restrict hostnames to those matching the egrep-syntax pattern. If you choose to do this, you must ensure that the pattern matches only valid hostnames ([a-zA-Z0-9-]) and finds a sufficient number thereof. +Specify `--hostnames-pattern` to restrict hostnames to those matching the egrep-syntax pattern. If you choose to do this, you must ensure that the pattern matches only valid hostnames ([a-zA-Z0-9-]) and finds enough of them. ### Locations By default, `tpaexec configure` uses the names first, second, and so on for any locations used by the selected architecture. 
@@ -163,23 +169,20 @@ The `tpaexec provision` command creates instances and other resources required b For example, given AWS access with the necessary privileges, TPA provisions EC2 instances, VPCs, subnets, routing tables, internet gateways, security groups, EBS volumes, elastic IPs, and so on. -You can also "provision" existing servers by selecting the "bare" platform and providing connection details. Whether these are bare metal servers or those provisioned separately on a cloud platform, they can be used just as if they had been created by TPA. +You can also provision existing servers by selecting the `bare` platform and providing connection details. Whether these are bare metal servers or those provisioned separately on a cloud platform, they can be used as if they had been created by TPA. -You are not restricted to a single platform—you can spread your cluster out across some AWS instances (in multiple regions) and some on-premise servers, or servers in other data centres, as needed. +You aren't restricted to a single platform. You can spread your cluster out across some AWS instances in multiple regions and some on-premise servers or servers in other data centres, as needed. -At the end of the provisioning stage, you will have the required number of instances with the basic operating system installed, which TPA can access via SSH (with sudo to root). +At the end of the provisioning stage, you will have the required number of instances with the basic operating system installed, which TPA can access using SSH (with sudo to root). ## Deploy -The `tpaexec deploy` command installs and configures Postgres and other software on the provisioned servers (which may or may not have been created by TPA; but it doesn't matter who created them so long as SSH and sudo access is available). This includes setting up replication, backups, and so on. +The `tpaexec deploy` command installs and configures Postgres and other software on the provisioned servers. 
TPA can create the servers, but it doesn't matter who created them so long as SSH and sudo access are available. This includes setting up replication, backups, and so on. At the end of the deployment stage, EDB Postgres Distributed is up and running. ## Test -The `tpaexec test` command executes various architecture and platform-specific tests against the deployed cluster to ensure that it is working as expected. +The `tpaexec test` command executes various architecture and platform-specific tests against the deployed cluster to ensure that it's working as expected. -At the end of the testing stage, you will have a fully-functioning cluster. +At the end of the testing stage, you have a fully functioning cluster. For more information, see [Trusted Postgres Architect](/tpa/latest/). - - - From 31e46286c851f439c9134c4de70d6274b7933c7a Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Tue, 20 Jun 2023 14:20:41 -0400 Subject: [PATCH 02/10] Basic cleanup of release text Other cleanup tasks First edit of Backup topic --- product_docs/docs/pgd/5/appusage.mdx | 5 +- product_docs/docs/pgd/5/architectures.mdx | 4 +- product_docs/docs/pgd/5/backup.mdx | 205 +++++++++--------- .../docs/pgd/5/cli/installing_cli.mdx | 4 +- .../5/quickstart/connecting_applications.mdx | 2 +- .../5/quickstart/further_explore_failover.mdx | 4 +- product_docs/docs/pgd/5/quickstart/index.mdx | 6 +- .../docs/pgd/5/quickstart/quick_start_aws.mdx | 4 +- .../pgd/5/quickstart/quick_start_docker.mdx | 4 +- .../pgd/5/quickstart/quick_start_linux.mdx | 4 +- .../docs/pgd/5/routing/installing_proxy.mdx | 6 +- product_docs/docs/pgd/5/routing/proxy.mdx | 6 +- .../raft/02_raft_subgroups_and_pgd_cli.mdx | 4 +- .../docs/pgd/5/routing/raft/index.mdx | 28 +-- .../docs/pgd/5/upgrades/bdr_pg_upgrade.mdx | 48 ++-- product_docs/docs/pgd/5/upgrades/index.mdx | 25 +-- 16 files changed, 177 insertions(+), 182 deletions(-) diff --git a/product_docs/docs/pgd/5/appusage.mdx 
b/product_docs/docs/pgd/5/appusage.mdx index 1a21d330915..9dff5a36d5e 100644 --- a/product_docs/docs/pgd/5/appusage.mdx +++ b/product_docs/docs/pgd/5/appusage.mdx @@ -182,7 +182,7 @@ See [Release notes](/pgd/latest/rel_notes/) for any known incompatibilities. ## Replicating between nodes with differences -By default, DDL is automatically sent to all nodes. You can control this manually, as described in [DDL Replication](ddl), and you can use it to create differences between database schemas across nodes. +By default, DDL is automatically sent to all nodes. You can control this manually, as described in [DDL replication](ddl), and you can use it to create differences between database schemas across nodes. PGD is designed to allow replication to continue even with minor differences between nodes. These features are designed to allow application schema migration without downtime or to allow logical @@ -326,7 +326,7 @@ its different modes. ## Application testing You can test PGD applications using the following programs, -in addition to other techniques. +in addition to other techniques: - [Trusted Postgres Architect](#trusted-postgres-architect) - [pgbench with CAMO/Failover options](#pgbench-with-camofailover-options) @@ -391,7 +391,6 @@ this scenario there's no way to find the status of in-flight transactions. ### isolationtester with multi-node access - `isolationtester` was extended to allow users to run tests on multiple sessions and on multiple nodes. This tool is used for internal PGD testing, although it's also available for use with user application testing. diff --git a/product_docs/docs/pgd/5/architectures.mdx b/product_docs/docs/pgd/5/architectures.mdx index cc7bb7af4c8..fbba16434f9 100644 --- a/product_docs/docs/pgd/5/architectures.mdx +++ b/product_docs/docs/pgd/5/architectures.mdx @@ -137,7 +137,7 @@ Use these criteria to help you to select the appropriate Always On architecture. 
| Global consensus in case of location failure | N/A | No | Yes | Yes | | Data restore required after location failure | Yes | No | No | No | | Immediate failover in case of location failure | No - requires data restore from backup | Yes - alternate Location | Yes - alternate Location | Yes - alternate Location | -| Cross Location Network Traffic | Only if backup is offsite | Full replication traffic | Full replication traffic | Full replication traffic | -| License Cost | 2 or 3 PGD data nodes | 4 or 6  PGD data nodes | 4 or 6 PGD data nodes | 6+ PGD data nodes | +| Cross-location network traffic | Only if backup is offsite | Full replication traffic | Full replication traffic | Full replication traffic | +| License cost | 2 or 3 PGD data nodes | 4 or 6  PGD data nodes | 4 or 6 PGD data nodes | 6+ PGD data nodes | diff --git a/product_docs/docs/pgd/5/backup.mdx b/product_docs/docs/pgd/5/backup.mdx index b03f33a6388..9f0f18f5e02 100644 --- a/product_docs/docs/pgd/5/backup.mdx +++ b/product_docs/docs/pgd/5/backup.mdx @@ -8,26 +8,26 @@ PGD is designed to be a distributed, highly available system. If one or more nodes of a cluster are lost, the best way to replace them is to clone new nodes directly from the remaining nodes. -The role of backup and recovery in PGD is to provide for Disaster -Recovery (DR), such as in the following situations: +The role of backup and recovery in PGD is to provide for disaster +recovery (DR), such as in the following situations: - Loss of all nodes in the cluster - Significant, uncorrectable data corruption across multiple nodes - as a result of data corruption, application error or + as a result of data corruption, application error, or security breach ## Backup -### `pg_dump` +### pg_dump -`pg_dump`, sometimes referred to as "logical backup", can be used +You can use pg_dump, sometimes referred to as *logical backup*, normally with PGD. -Note that `pg_dump` dumps both local and global sequences as if -they were local sequences. 
This is intentional, to allow a PGD +pg_dump dumps both local and global sequences as if +they were local sequences. This behavior is intentional, to allow a PGD schema to be dumped and ported to other PostgreSQL databases. -This means that sequence kind metadata is lost at the time of dump, -so a restore would effectively reset all sequence kinds to +This means that sequence-kind metadata is lost at the time of dump, +so a restore effectively resets all sequence kinds to the value of `bdr.default_sequence_kind` at time of restore. To create a post-restore script to reset the precise sequence kind @@ -40,85 +40,85 @@ FROM bdr.sequences WHERE seqkind != 'local'; ``` -Note that if `pg_dump` is run using `bdr.crdt_raw_value = on` then the -dump can only be reloaded with `bdr.crdt_raw_value = on`. +If pg_dump is run using `bdr.crdt_raw_value = on`, then you can reload the +dump only with `bdr.crdt_raw_value = on`. Technical Support recommends the use of physical backup techniques for backup and recovery of PGD. -### Physical Backup +### Physical backup -Physical backups of a node in a EDB Postgres Distributed cluster can be taken using +You can take physical backups of a node in an EDB Postgres Distributed cluster using standard PostgreSQL software, such as [Barman](https://www.enterprisedb.com/docs/supported-open-source/barman/). -A physical backup of a PGD node can be performed with the same -procedure that applies to any PostgreSQL node: a PGD node is just a +You can perform a physical backup of a PGD node using the same +procedure that applies to any PostgreSQL node. A PGD node is just a PostgreSQL node running the BDR extension. 
-There are some specific points that must be considered when applying +Consider these specific points when applying PostgreSQL backup techniques to PGD: - PGD operates at the level of a single database, while a physical - backup includes all the databases in the instance; you should plan - your databases to allow them to be easily backed-up and restored. + backup includes all the databases in the instance. Plan + your databases to allow them to be easily backed up and restored. -- Backups will make a copy of just one node. In the simplest case, - every node has a copy of all data, so you would need to backup only - one node to capture all data. However, the goal of PGD will not be +- Backups make a copy of just one node. In the simplest case, + every node has a copy of all data, so you need to back up only + one node to capture all data. However, the goal of PGD isn't met if the site containing that single copy goes down, so the - minimum should be at least one node backup per site (obviously with - many copies etc.). + minimum is at least one node backup per site (with + many copies, and so on). -- However, each node may have un-replicated local data, and/or the - definition of replication sets may be complex so that all nodes do - not subscribe to all replication sets. In these cases, backup - planning must also include plans for how to backup any unreplicated +- However, each node might have unreplicated local data, or the + definition of replication sets might be complex so that all nodes don't + subscribe to all replication sets. In these cases, backup + planning must also include plans for how to back up any unreplicated local data and a backup of at least one node that subscribes to each replication set. 
-### Eventual Consistency +### Eventual consistency The nodes in a EDB Postgres Distributed cluster are *eventually consistent*, but not -entirely *consistent*; a physical backup of a given node will -provide Point-In-Time Recovery capabilities limited to the states -actually assumed by that node (see the [Example] below). +*entirely consistent*. A physical backup of a given node provides +point-in-time recovery capabilities limited to the states +actually assumed by that node. The following example shows how two nodes in the same EDB Postgres Distributed cluster might not -(and usually do not) go through the same sequence of states. +(and usually don't) go through the same sequence of states. -Consider a cluster with two nodes `N1` and `N2`, which is initially in +Consider a cluster with two nodes `N1` and `N2` that's initially in state `S`. If transaction `W1` is applied to node `N1`, and at the same time a non-conflicting transaction `W2` is applied to node `N2`, then -node `N1` will go through the following states: +node `N1` goes through the following states: ``` (N1) S --> S + W1 --> S + W1 + W2 ``` -...while node `N2` will go through the following states: +Node `N2` goes through the following states: ``` (N2) S --> S + W2 --> S + W1 + W2 ``` -That is: node `N1` will *never* assume state `S + W2`, and node `N2` -likewise will never assume state `S + W1`, but both nodes will end up +That is, node `N1` *never* assumes state `S + W2`, and node `N2` +likewise never assumes state `S + W1`. However, both nodes end up in the same state `S + W1 + W2`. Considering this situation might affect how -you decide upon your backup strategy. +you decide on your backup strategy. -### Point-In-Time Recovery (PITR) +### Point-in-time recovery (PITR) -In the example above, the changes are also inconsistent in time, since -`W1` and `W2` both occur at time `T1`, but the change `W1` is not +The previous example showed that the changes are also inconsistent in time. 
+`W1` and `W2` both occur at time `T1`, but the change `W1` isn't applied to `N2` until `T2`. PostgreSQL PITR is designed around the assumption of changes arriving -from a single master in COMMIT order. Thus, PITR is possible by simply -scanning through changes until one particular point-in-time (PIT) is reached. -With this scheme, you can restore one node to a single point-in-time -from its viewpoint, e.g. `T1`, but that state would not include other -data from other nodes that had committed near that time but had not yet +from a single master in COMMIT order. Thus, PITR is possible by +scanning through changes until one particular point in time (PIT) is reached. +With this scheme, you can restore one node to a single point in time +from its viewpoint, for example, `T1`. However, that state doesn't include other +data from other nodes that committed near that time but had not yet arrived on the node. As a result, the recovery might be considered to be partially inconsistent, or at least consistent for only one replication origin. @@ -129,154 +129,151 @@ To request this, use the standard syntax: recovery_target_time = T1 ``` -PGD allows for changes from multiple masters, all recorded within the +PGD allows for changes from multiple masters, all recorded in the WAL log for one node, separately identified using replication origin identifiers. PGD allows PITR of all or some replication origins to a specific point in time, providing a fully consistent viewpoint across all subsets of nodes. -Thus for multi-origins, we view the WAL stream as containing multiple -streams all mixed up into one larger stream. There is still just one PIT, -but that will be reached as different points for each origin separately. +Thus for multi-origins, you can view the WAL stream as containing multiple +streams all mixed up into one larger stream. There's still just one PIT, +but that's reached as different points for each origin separately. 
-We read the WAL stream until requested origins have found their PIT. We -apply all changes up until that point, except that we do not mark as committed -any transaction records for an origin after the PIT on that origin has been +The WAL stream is read until requested origins have found their PIT. +All changes are applied up until that point, except that +any transaction records are not marked as committed for an origin after the PIT on that origin is reached. -We end up with one LSN "stopping point" in WAL, but we also have one single -timestamp applied consistently, just as we do with "single origin PITR". +You end up with one LSN "stopping point" in WAL, but you also have one single +timestamp applied consistently, just as you do with single-origin PITR. -Once we have reached the defined PIT, a later one may also be set to allow +Once you reach the defined PIT, a later one might also be set to allow the recovery to continue, as needed. -After the desired stopping point has been reached, if the recovered server -will be promoted, shut it down first and move the LSN forwards using -`pg_resetwal` to an LSN value higher than used on any timeline on this server. -This ensures that there will be no duplicate LSNs produced by logical +After the desired stopping point is reached, if the recovered server +will be promoted, shut it down first. Move the LSN forward to an LSN value higher +than used on any timeline on this server using `pg_resetwal`. +This approach ensures that there are no duplicate LSNs produced by logical decoding. -In the specific example above, `N1` would be restored to `T1`, but -would also include changes from other nodes that have been committed -by `T1`, even though they were not applied on `N1` until later. +In the specific example shown, `N1` is restored to `T1`. It +also includes changes from other nodes that were committed +by `T1`, even though they weren't applied on `N1` until later. 
To request multi-origin PITR, use the standard syntax in -the recovery.conf file: +the `recovery.conf` file: ``` recovery_target_time = T1 ``` -The list of replication origins which would be restored to `T1` need either -to be specified in a separate multi_recovery.conf file via the use of +You need to specify the list of replication origins that are restored to `T1` in one of two ways. +You can use a separate `multi_recovery.conf` file by way of a new parameter `recovery_target_origins`: ``` recovery_target_origins = '*' ``` -...or one can specify the origin subset as a list in `recovery_target_origins`. +Or you can specify the origin subset as a list in `recovery_target_origins`: ``` recovery_target_origins = '1,3' ``` -Note that the local WAL activity recovery to the specified +The local WAL activity recovery to the specified `recovery_target_time` is always performed implicitly. For origins -that are not specified in `recovery_target_origins`, recovery may +that aren't specified in `recovery_target_origins`, recovery can stop at any point, depending on when the target for the list mentioned in `recovery_target_origins` is achieved. In the absence of the `multi_recovery.conf` file, the recovery defaults -to the original PostgreSQL PITR behavior that is designed around the assumption +to the original PostgreSQL PITR behavior that's designed around the assumption of changes arriving from a single master in COMMIT order. !!! Note - This feature is only available with EDB Postgres Extended. - Barman does not create a `multi_recovery.conf` file. + This feature is available only with EDB Postgres Extended. + Barman doesn't create a `multi_recovery.conf` file. ## Restore While you can take a physical backup with the same procedure as a -standard PostgreSQL node, what is slightly more complex is -**restoring** the physical backup of a PGD node. +standard PostgreSQL node, it's slightly more complex to +restore the physical backup of a PGD node. 
-### EDB Postgres Distributed Cluster Failure or Seeding a New Cluster from a Backup +### EDB Postgres Distributed cluster failure or seeding a new cluster from a backup The most common use case for restoring a physical backup involves the failure or replacement of all the PGD nodes in a cluster, for instance in the event of -a datacentre failure. +a data center failure. -You may also want to perform this procedure to clone the current contents of a +You might also want to perform this procedure to clone the current contents of a EDB Postgres Distributed cluster to seed a QA or development instance. -In that case, PGD capabilities can be restored based on a physical backup +In that case, you can restore PGD capabilities based on a physical backup of a single PGD node, optionally plus WAL archives: - If you still have some PGD nodes live and running, fence off the host you - restored the PGD node to, so it cannot connect to any surviving PGD nodes. - This ensures that the new node does not confuse the existing cluster. + restored the PGD node to, so it can't connect to any surviving PGD nodes. + This practice ensures that the new node doesn't confuse the existing cluster. - Restore a single PostgreSQL node from a physical backup of one of the PGD nodes. - If you have WAL archives associated with the backup, create a suitable `recovery.conf` and start PostgreSQL in recovery to replay up to the latest - state. You can specify a alternative `recovery_target` here if needed. + state. You can specify an alternative `recovery_target` here if needed. - Start the restored node, or promote it to read/write if it was in standby recovery. Keep it fenced from any surviving nodes! -- Clean up any leftover PGD metadata that was included in the physical backup, - as described below. +- Clean up any leftover PGD metadata that was included in the physical backup. - Fully stop and restart the PostgreSQL instance. 
- Add further PGD nodes with the standard procedure based on the `bdr.join_node_group()` function call. -#### Cleanup PGD Metadata +#### Cleanup of PGD metadata -The cleaning of leftover PGD metadata is achieved as follows: +To clean up leftover PGD metadata: -1. Drop the PGD node using `bdr.drop_node` -2. Fully stop and re-start PostgreSQL (important!). +1. Drop the PGD node using `bdr.drop_node`. +2. Fully stop and restart PostgreSQL (important!). -#### Cleanup of Replication Origins +#### Cleanup of replication origins -Replication origins must be explicitly removed with a separate step -because they are recorded persistently in a system catalog, and +You must explicitly remove replication origins with a separate step +because they're recorded persistently in a system catalog. They're therefore included in the backup and in the restored instance. They -are not removed automatically when dropping the BDR extension, because -they are not explicitly recorded as its dependencies. +aren't removed automatically when dropping the BDR extension, because +they aren't explicitly recorded as its dependencies. -PGD creates one replication origin for each remote master node, to -track progress of incoming replication in a crash-safe way. Therefore -we need to run: +To track progress of incoming replication in a crash-safe way, +PGD creates one replication origin for each remote master node. Therefore, +for each node in the previous cluster run this once: ``` SELECT pg_replication_origin_drop('bdr_dbname_grpname_nodename'); ``` -...once for each node in the (previous) cluster. Replication origins can -be listed as follows: +You can list replication origins as follows: ``` SELECT * FROM pg_replication_origin; ``` -...and those created by PGD are easily recognized by their name, as in -the example shown above. +Those created by PGD are easily recognized by their name. 
-#### Cleanup of Replication Slots +#### Cleanup of replication slots If a physical backup was created with `pg_basebackup`, replication slots -will be omitted from the backup. +are omitted from the backup. -Some other backup methods may preserve replications slots, likely in -outdated or invalid states. Once you restore the backup, just: +Some other backup methods might preserve replication slots, likely in +outdated or invalid states. Once you restore the backup, use this to drop all replication slots: ``` SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots; ``` -...to drop *all* replication slots. If you have a reason to preserve some, +If you have a reason to preserve some slots, you can add a `WHERE slot_name LIKE 'bdr%'` clause, but this is rarely useful. diff --git a/product_docs/docs/pgd/5/cli/installing_cli.mdx b/product_docs/docs/pgd/5/cli/installing_cli.mdx index 7f414ec4bf8..028070dbc62 100644 --- a/product_docs/docs/pgd/5/cli/installing_cli.mdx +++ b/product_docs/docs/pgd/5/cli/installing_cli.mdx @@ -44,6 +44,6 @@ By default, `pgd-cli-config.yml` is located in the `/etc/edb/pgd-cli` directory. If you rename the file or move it to another location, specify the new name and location using the optional `-f` or `--config-file` flag when entering a command. See the [sample use case](/pgd/latest/cli/#passing-a-database-connection-string). -!!! Note Avoiding Stale data -The PGD CLI can return stale data on the state of the cluster if it is still connecting to nodes that have previously been parted from the cluster. Edit the `pgd-cli-config.yml` file or change your `--dsn` settings to ensure only active nodes in the cluster are listed for connection. +!!! Note Avoiding stale data +The PGD CLI can return stale data on the state of the cluster if it's still connecting to nodes that were previously parted from the cluster. 
Edit the `pgd-cli-config.yml` file, or change your `--dsn` settings to ensure only active nodes in the cluster are listed for connection. !!! diff --git a/product_docs/docs/pgd/5/quickstart/connecting_applications.mdx b/product_docs/docs/pgd/5/quickstart/connecting_applications.mdx index 92d58788192..e1c71bb63e4 100644 --- a/product_docs/docs/pgd/5/quickstart/connecting_applications.mdx +++ b/product_docs/docs/pgd/5/quickstart/connecting_applications.mdx @@ -5,7 +5,7 @@ description: > Connect to your quick started PGD cluster with psql and client applications --- -Connecting your application or remotely connecting to your new Postgres Distributed cluster involves: +Connecting your application or remotely connecting to your new EDB Postgres Distributed cluster involves: * Getting credentials and optionally creating a `.pgpass` file * Establishing the IP address of any PGD Proxy hosts you want to connect to diff --git a/product_docs/docs/pgd/5/quickstart/further_explore_failover.mdx b/product_docs/docs/pgd/5/quickstart/further_explore_failover.mdx index 11d2bf95b27..f77da471b3e 100644 --- a/product_docs/docs/pgd/5/quickstart/further_explore_failover.mdx +++ b/product_docs/docs/pgd/5/quickstart/further_explore_failover.mdx @@ -399,7 +399,7 @@ sudo systemctl start pgd-proxy.service ``` !!!Tip Exiting Tmux -You can quickly exit Tmux and all the associated sessions. First terminate any running processes, as they will otherwise continue running after the session is killed. Press **Control-b** and then enter `:kill-session`. This approach is simpler than quitting each pane's session one at a time using **Control-D** or `exit`. +You can quickly exit Tmux and all the associated sessions. First terminate any running processes, as they otherwise continue running after the session is killed. Press **Control-B** and then enter `:kill-session`. This approach is simpler than quitting each pane's session one at a time using **Control-D** or `exit`. !!! 
## Other scenarios @@ -408,5 +408,5 @@ This example uses the quick-start configuration of three data nodes and one back ## Further reading -* Read more about the management capabilities of [PGD CLI](../cli/). +* Read more about the management capabilities of the [PGD CLI](../cli/). * Learn more about [monitoring replication using SQL](../monitoring/sql/#monitoring-replication-peers). diff --git a/product_docs/docs/pgd/5/quickstart/index.mdx b/product_docs/docs/pgd/5/quickstart/index.mdx index 698800cd215..e7d45925288 100644 --- a/product_docs/docs/pgd/5/quickstart/index.mdx +++ b/product_docs/docs/pgd/5/quickstart/index.mdx @@ -19,8 +19,8 @@ navigation: EDB Postgres Distributed (PGD) is a multi-master replicating implementation of Postgres designed for high performance and availability. You can create database clusters made up of many bidirectionally synchronizing database nodes. The clusters can have a number of proxy servers that direct your query traffic to the most available nodes, adding further resilience to your cluster configuration. -!!! Note Fully Managed BigAnimal - If you would prefer to have a fully managed EDB Postgres Distributed experience, PGD is now available as the Extreme High Availability option on BigAnimal, EDB's cloud platform for Postgres. Read more about [BigAnimal Extreme High Availability](/biganimal/latest/overview/02_high_availability/#extreme-high-availability-preview). +!!! Note Fully managed BigAnimal + If you prefer to have a fully managed EDB Postgres Distributed experience, PGD is now available as the Extreme High Availability option on BigAnimal, EDB's cloud platform for Postgres. Read more about [BigAnimal Extreme High Availability](/biganimal/latest/overview/02_high_availability/#extreme-high-availability-preview). PGD is very configurable. To quickly evaluate and deploy PGD, use this quick start. It'll get you up and running with a fully configured PGD cluster using the same tools that you'll use to deploy to production. 
This quick start includes: @@ -78,6 +78,6 @@ The AWS quick start is more extensive and deploys the PGD cluster onto EC2 nodes * [Connect applications to your PGD cluster](connecting_applications/) * [Find out how a PGD cluster stands up to downtime of data nodes or proxies](further_explore_failover/) -* [Learn about how Postgres Distributed manages conflicting updates](further_explore_conflicts/) +* [Learn about how EDB Postgres Distributed manages conflicting updates](further_explore_conflicts/) * [Moving beyond the quick starts](next_steps/) diff --git a/product_docs/docs/pgd/5/quickstart/quick_start_aws.mdx b/product_docs/docs/pgd/5/quickstart/quick_start_aws.mdx index d30bf2b1d8b..02655ab6d5d 100644 --- a/product_docs/docs/pgd/5/quickstart/quick_start_aws.mdx +++ b/product_docs/docs/pgd/5/quickstart/quick_start_aws.mdx @@ -63,7 +63,7 @@ You can add this to your `.bashrc` script or similar shell profile to ensure it' ### Configure the repository -All the software needed for this example is available from the Postgres Distributed package repository. The following command downloads and runs a script to configure the Postgres Distributed repository. This repository also contains the TPA packages. +All the software needed for this example is available from the EDB Postgres Distributed package repository. The following command downloads and runs a script to configure the EDB Postgres Distributed repository. This repository also contains the TPA packages. ```shell curl -1sLf "https://downloads.enterprisedb.com/$EDB_SUBSCRIPTION_TOKEN/postgres_distributed/setup.deb.sh" | sudo -E bash @@ -250,7 +250,7 @@ To leave the SQL client, enter `exit`. ### Using PGD CLI -The `pgd` utility, also known as the PGD CLI, lets you control and manage your Postgres Distributed cluster. It's already installed on the node. +The pgd utility, also known as the PGD CLI, lets you control and manage your EDB Postgres Distributed cluster. It's already installed on the node. 
You can use it to check the cluster's health by running `pgd check-health`: diff --git a/product_docs/docs/pgd/5/quickstart/quick_start_docker.mdx b/product_docs/docs/pgd/5/quickstart/quick_start_docker.mdx index 2e3c9debdb9..372d663d218 100644 --- a/product_docs/docs/pgd/5/quickstart/quick_start_docker.mdx +++ b/product_docs/docs/pgd/5/quickstart/quick_start_docker.mdx @@ -87,7 +87,7 @@ You can add this to your `.bashrc` script or similar shell profile to ensure it' ### Configure the repository -All the software needed for this example is available from the Postgres Distributed package repository. The following command downloads and runs a script to configure the Postgres Distributed repository. This repository also contains the TPA packages. +All the software needed for this example is available from the EDB Postgres Distributed package repository. The following command downloads and runs a script to configure the EDB Postgres Distributed repository. This repository also contains the TPA packages. ```shell curl -1sLf "https://downloads.enterprisedb.com/$EDB_SUBSCRIPTION_TOKEN/postgres_distributed/setup.deb.sh" | sudo -E bash @@ -287,7 +287,7 @@ To leave the SQL client, enter `exit`. ### Using PGD CLI -The `pgd` utility, also known as the PGD CLI, lets you control and manage your Postgres Distributed cluster. It's already installed on the node. +The pgd utility, also known as the PGD CLI, lets you control and manage your EDB Postgres Distributed cluster. It's already installed on the node. 
You can use it to check the cluster's health by running `pgd check-health`: diff --git a/product_docs/docs/pgd/5/quickstart/quick_start_linux.mdx b/product_docs/docs/pgd/5/quickstart/quick_start_linux.mdx index 957a7e90397..591267d4703 100644 --- a/product_docs/docs/pgd/5/quickstart/quick_start_linux.mdx +++ b/product_docs/docs/pgd/5/quickstart/quick_start_linux.mdx @@ -75,7 +75,7 @@ You can add this to your `.bashrc` script or similar shell profile to ensure it' ### Configure the repository -All the software needed for this example is available from the Postgres Distributed package repository. Download and run a script to configure the Postgres Distributed repository. This repository also contains the TPA packages. +All the software needed for this example is available from the EDB Postgres Distributed package repository. Download and run a script to configure the EDB Postgres Distributed repository. This repository also contains the TPA packages. ``` curl -1sLf "https://downloads.enterprisedb.com/$EDB_SUBSCRIPTION_TOKEN/postgres_distributed/setup.deb.sh" | sudo -E bash @@ -301,7 +301,7 @@ To leave the SQL client, enter `exit`. ### Using PGD CLI -The `pgd` utility, also known as the PGD CLI, lets you control and manage your Postgres Distributed cluster. It's already installed on the node. +The pgd utility, also known as the PGD CLI, lets you control and manage your EDB Postgres Distributed cluster. It's already installed on the node. You can use it to check the cluster's health by running `pgd check-health`: diff --git a/product_docs/docs/pgd/5/routing/installing_proxy.mdx b/product_docs/docs/pgd/5/routing/installing_proxy.mdx index d9e5f28b442..acef0e5e0a6 100644 --- a/product_docs/docs/pgd/5/routing/installing_proxy.mdx +++ b/product_docs/docs/pgd/5/routing/installing_proxy.mdx @@ -5,11 +5,11 @@ navTitle: "Installing PGD Proxy" ## Installing PGD Proxy -You can use two methods to install and configure PGD Proxy to manage a Postgres Distributed cluster. 
The recommended way to install and configure PGD Proxy is to use the EDB Trusted Postgres Architect (TPA) utility for cluster deployment and management. +You can use two methods to install and configure PGD Proxy to manage an EDB Postgres Distributed cluster. The recommended way to install and configure PGD Proxy is to use the EDB Trusted Postgres Architect (TPA) utility for cluster deployment and management. ### Installing through TPA -If the PGD cluster is being deployed through TPA, then TPA installs and configures PGD Proxy automatically as per the recommended architecture. If you want to install PGD Proxy on any other node in a PGD cluster, then you need to attach the `pgd-proxy` role to that instance in the TPA configuration file. Also set the `bdr_child_group` parameter before deploying, as this example shows. See [Trusted Postgres Architect](../tpa) for more information. +If the PGD cluster is being deployed through TPA, then TPA installs and configures PGD Proxy automatically as per the recommended architecture. If you want to install PGD Proxy on any other node in a PGD cluster, then you need to attach the pgd-proxy role to that instance in the TPA configuration file. Also set the `bdr_child_group` parameter before deploying, as this example shows. See [Trusted Postgres Architect](../tpa) for more information. ```yaml - Name: proxy-a1 @@ -47,7 +47,7 @@ By default, in the cluster created through TPA, `pgd-proxy-config.yml` is locate If you rename the file or move it to another location, specify the new name and location using the optional `-f` or `--config-file` flag when starting a service. See the [sample service file](#pgd-proxy-service). -You can set the log level for the PGD Proxy service using the top-level config parameter `log-level`, as shown in the sample config. The valid values for `log-level` are `debug`, `info`, `warn` and `error`. 
+You can set the log level for the PGD Proxy service using the top-level config parameter `log-level`, as shown in the sample config. The valid values for `log-level` are `debug`, `info`, `warn`, and `error`. `cluster.endpoints` and `cluster.proxy.name` are mandatory fields in the config file. PGD Proxy always tries to connect to the first endpoint in the list. If it fails, it tries the next endpoint, and so on. diff --git a/product_docs/docs/pgd/5/routing/proxy.mdx b/product_docs/docs/pgd/5/routing/proxy.mdx index 04d23b8aa98..81ff17c3fd5 100644 --- a/product_docs/docs/pgd/5/routing/proxy.mdx +++ b/product_docs/docs/pgd/5/routing/proxy.mdx @@ -30,7 +30,7 @@ PGD manages write leader election. PGD Proxy interacts with PGD to get write lea PGD Proxy responds to write leader change events that can be categorized into two modes of operation: *failover* and *switchover*. -Automatic transfer of write leadership from the current write leader node to a new node in the event of Postgres or operating system crash is called failover. PGD elects a new write leader when the current write leader goes down or becomes unresponsive. Once the new write leader is elected by PGD, proxy closes existing client connections to the old write leader and redirects new client connections to the newly elected write leader. +Automatic transfer of write leadership from the current write leader node to a new node in the event of Postgres or operating system crash is called failover. PGD elects a new write leader when the current write leader goes down or becomes unresponsive. Once the new write leader is elected by PGD, PGD Proxy closes existing client connections to the old write leader and redirects new client connections to the newly elected write leader. User-controlled, manual transfer of write leadership from the current write leader to a new target leader is called switchover. Switchover is triggered through the [PGD CLI switchover](../cli/command_ref/pgd_switchover) command. 
The command is submitted to PGD, which attempts to elect the given target node as the new write leader. Similar to failover, PGD Proxy closes existing client connections and redirects new client connections to the newly elected write leader. This is useful during server maintenance, for example, if the current write leader node needs to be stopped for maintenance like a server update or OS patch update. @@ -45,7 +45,7 @@ The main purpose of this option is to allow users to configure the write behavio Consider a 3-data node group with a proxy on each data node. In this case, if the current write leader gets network partitioned or isolated, then the data nodes present in the majority partition elects a new write leader. If `consensus_grace_period` is set to a non-zero value, say `10s`, then the proxy present on the previous write leader continues to route writes for this duration. -Note that in this case, if the grace period is kept too high, then writes continue to happen on the two write leaders. This condition increases the chances of write conflicts. +In this case, if the grace period is kept too high, then writes continue to happen on the two write leaders. This condition increases the chances of write conflicts. Having said that, most of the time, upon loss of the current Raft leader, the new Raft leader gets elected by BDR within a few seconds if more than half of the nodes (quorum) are still up. Hence, if the Raft leader is down but the write leader is still up, then proxy can be configured to allow routing by keeping `consensus_grace_period` to a non-zero, positive value. The proxy waits for the Raft leader to get elected during this period before stopping routing. This might be helpful in some cases where availability is more important. 
@@ -53,7 +53,7 @@ Having said that, most of the time, upon loss of the current Raft leader, the ne The PostgreSQL C client library (libpq) allows you to specify multiple host names in a single connection string for simple failover. This is also supported by client libraries (drivers) in some other programming languages. It works well for failing over across PGD Proxy instances that are down or inaccessible. -However, if the PGD Proxy instance is accessible but does not have access to the write leader or the write leader for given instance does not exist (i.e. because there is no write leader for the given PGD group), the connection will simply fail and no other hosts in the multi-host connection string will be tried. This is consistent with behavior of PostgreSQL client library with other proxies like HAProxy or pgbouncer. +However, if the PGD Proxy instance is accessible but doesn't have access to the write leader, or the write leader for a given instance doesn't exist (that is, because there's no write leader for the given PGD group), the connection simply fails. No other hosts in the multi-host connection string are tried. This behavior is consistent with the behavior of PostgreSQL client libraries with other proxies like HAProxy or pgbouncer. ## Managing PGD Proxy diff --git a/product_docs/docs/pgd/5/routing/raft/02_raft_subgroups_and_pgd_cli.mdx b/product_docs/docs/pgd/5/routing/raft/02_raft_subgroups_and_pgd_cli.mdx index 2ad069b127e..95be30aee3f 100644 --- a/product_docs/docs/pgd/5/routing/raft/02_raft_subgroups_and_pgd_cli.mdx +++ b/product_docs/docs/pgd/5/routing/raft/02_raft_subgroups_and_pgd_cli.mdx @@ -3,7 +3,7 @@ title: "Working with Raft subgroups and PGD CLI" --- -You can view the status of your nodes and subgroups with the [pgd](../../cli/) cli command. The examples here assume a cluster as configured by in [Creating Raft subgroups with TPA](01_raft_subgroups_and_tpa). 
+You can view the status of your nodes and subgroups with the [pgd](../../cli/) CLI command. The examples here assume a cluster as configured in [Creating Raft subgroups with TPA](01_raft_subgroups_and_tpa). ## Viewing nodes with pgd @@ -24,7 +24,7 @@ west3 4162758468 us_west data ACTIVE ACTIVE Up 6 ## Viewing groups (and subgroups) with pgd -To show the groups in a PGD deployment, along with their names and attributes, use the PGD cli command `show-groups.` +To show the groups in a PGD deployment, along with their names and attributes, use the PGD CLI command `show-groups`. ``` $pgd show-groups diff --git a/product_docs/docs/pgd/5/routing/raft/index.mdx b/product_docs/docs/pgd/5/routing/raft/index.mdx index 50b9ab212db..c8a0c6ec1ff 100644 --- a/product_docs/docs/pgd/5/routing/raft/index.mdx +++ b/product_docs/docs/pgd/5/routing/raft/index.mdx @@ -1,45 +1,45 @@ --- -title: Proxies, Raft and Raft subgroups +title: Proxies, Raft, and Raft subgroups --- PGD manages its metadata using a Raft model where a top-level group spans all the data nodes in the PGD installation. A Raft leader is elected by the top-level group and propagates the state of the top-level group to all the other nodes in the group. !!! Hint What is Raft? -Raft is an industry accepted algorithm for making decisions though achieving “consensus” from a group of separate nodes within a distributed system. +Raft is an industry-accepted algorithm for making decisions through achieving *consensus* from a group of separate nodes in a distributed system. !!! -It is essential for certain operations – such as adding and removing nodes and allocating ranges for [galloc](../../sequences/#pgd-global-sequences) sequences – in the top-level group that a Raft leader is both established and connected. +For certain operations in the top-level group, a Raft leader must be both established and connected. 
Examples of these operations include adding and removing nodes and allocating ranges for [galloc](../../sequences/#pgd-global-sequences) sequences. It also means that an absolute majority of nodes in the top-level group (one half of them plus one) must be able to reach each other. So, in a top-level group with five nodes, at least three of the nodes must be reachable by each other to establish a Raft leader. ## Proxy routing -One function that also uses Raft is proxy routing. Proxy routing requires that the proxies are able to coordinate writing to a data node within their group of nodes. This data node is the write leader. If the write leader goes offline, the proxies need to be able to switch to a new write leader, selected by the data nodes, to maintain continuity for connected applications. +One function that also uses Raft is proxy routing. Proxy routing requires that the proxies can coordinate writing to a data node in their group of nodes. This data node is the write leader. If the write leader goes offline, the proxies need to be able to switch to a new write leader, selected by the data nodes, to maintain continuity for connected applications. -Proxy routing can be configured on a per node group basis in PGD 5, but the recommended configurations are "global" and "local" routing. +You can configure proxy routing on a per-node group basis in PGD 5, but the recommended configurations are *global* and *local* routing. ## Global routing -Global routing uses the top-level group to manage the proxy routing. All writable data nodes (not witness or subscribe only nodes) in the group are eligible to become write leader for all proxies. Connections to proxies within the top-level group will be routed to data nodes within the top-level group. +Global routing uses the top-level group to manage the proxy routing. All writable data nodes (not witness or subscribe-only nodes) in the group are eligible to become write leader for all proxies. 
Connections to proxies in the top-level group are routed to data nodes in the top-level group. -With global routing there is only one write leader for the entire top-level group. +With global routing, there's only one write leader for the entire top-level group. ## Local routing -Local routing uses subgroups, often mapped to locations, to manage the proxy routing within the subgroup. Local routing is often used for geographical separation of writes and it is important for them to continue routing even when the top-level consensus is lost. +Local routing uses subgroups, often mapped to locations, to manage the proxy routing in the subgroup. Local routing is often used for geographical separation of writes. It's important for them to continue routing even when the top-level consensus is lost. -That's because PGD allows queries and asynchronous data manipulation (DMLs) to work even when the top-level consensus is lost. But using the top-level consensus, as is the case with global routing, would mean that new write leaders could not be elected when that consensus was lost. Local groups cannot rely on the top level consensus (not without adding an independent consensus mechanism and its added complexity). +That's because PGD allows queries and asynchronous data manipulation (DMLs) to work even when the top-level consensus is lost. But using the top-level consensus, as is the case with global routing, means that new write leaders can't be elected when that consensus is lost. Local groups can't rely on the top-level consensus without adding an independent consensus mechanism and its added complexity. -To elegantly address this, PGD 5 introduced subgroup Raft support. This allows the subgroups within a PGD top-level group to elect the leaders they need independently by forming devolved Raft groups which are able to elect write leaders independent of other subgroups or the top-level Raft consensus. 
Connections to proxies within the subgroup will then route to data nodes within the subgroup. +To elegantly address this, PGD 5 introduced subgroup Raft support. Supgroup Raft support allows the subgroups in a PGD top-level group to elect the leaders they need independently. They do this by forming devolved Raft groups that can elect write leaders independent of other subgroups or the top-level Raft consensus. Connections to proxies in the subgroup then route to data nodes in the subgroup. -With local routing, there is a write leader for each subgroup. +With local routing, there's a write leader for each subgroup. -## More +## More information * [Raft subgroups and TPA](01_raft_subgroups_and_tpa) shows how Raft subgroups can be enabled in PGD when deploying with Trusted Postgres Architect. -* [Raft subgroups and PGD CLI](02_raft_subgroups_and_pgd_cli) shows how PGD's Cli reports on the presence and status of Raft subgroups. +* [Raft subgroups and PGD CLI](02_raft_subgroups_and_pgd_cli) shows how the PGD CLI reports on the presence and status of Raft subgroups. * [Migrating to Raft subgroups](03_migrating_to_raft_subgroups) is a guide to migrating existing installations and enabling Raft subgroups without TPA. -* [Raft elections in depth](04_raft_elections_in_depth) looks at how the write leader is elected using Raft in detail. \ No newline at end of file +* [Raft elections in depth](04_raft_elections_in_depth) looks in detail at how the write leader is elected using Raft. \ No newline at end of file diff --git a/product_docs/docs/pgd/5/upgrades/bdr_pg_upgrade.mdx b/product_docs/docs/pgd/5/upgrades/bdr_pg_upgrade.mdx index 0ab742fb8ce..7f87b9d5c04 100644 --- a/product_docs/docs/pgd/5/upgrades/bdr_pg_upgrade.mdx +++ b/product_docs/docs/pgd/5/upgrades/bdr_pg_upgrade.mdx @@ -2,10 +2,10 @@ title: In-place Postgres major version upgrades --- -You can upgrade a PGD Node to a newer major version of Postgres using the -command-line utility `bdr_pg_upgrade`. 
+You can upgrade a PGD node to a newer major version of Postgres using the +command-line utility bdr_pg_upgrade. -`bdr_pg_upgrade` internally uses the standard [`pg_upgrade`](https://www.postgresql.org/docs/current/pgupgrade.html) +bdr_pg_upgrade internally uses the standard [pg_upgrade](https://www.postgresql.org/docs/current/pgupgrade.html) with PGD-specific logic to ensure a smooth upgrade. ## Terminology @@ -19,17 +19,17 @@ This terminology is used when describing the upgrade process and components invo ## Precautions Standard Postgres major version upgrade precautions apply, including the fact both clusters must meet -all the requirements for [`pg_upgrade`](https://www.postgresql.org/docs/current/pgupgrade.html#id-1.9.5.12.7.). +all the requirements for [pg_upgrade](https://www.postgresql.org/docs/current/pgupgrade.html#id-1.9.5.12.7.). -Additionally, don't use `bdr_pg_upgrade` if other tools are using -replication slots and replication origins, only PGD slots and origins are +Additionally, don't use bdr_pg_upgrade if other tools are using +replication slots and replication origins. Only PGD slots and origins are restored after the upgrade. -You must meet several prerequisites for `bdr_pg_upgrade`: +You must meet several prerequisites for bdr_pg_upgrade: - Disconnect applications using the old cluster. You can, for example, redirect them to another node in the cluster. -- Configure peer authentication for both clusters. `bdr_pg_upgrade` +- Configure peer authentication for both clusters. bdr_pg_upgrade requires peer authentication. - PGD versions on both clusters must be exactly the same and must be version 4.1.0 or later. @@ -39,7 +39,7 @@ You must meet several prerequisites for `bdr_pg_upgrade`: match the old cluster configuration. - Databases, tables, and other objects must not exist in the new cluster. -We also recommend having the old cluster up prior to running `bdr_pg_upgrade`. 
+We also recommend having the old cluster up prior to running bdr_pg_upgrade. The CLI starts the old cluster if it's shut down. ## Usage @@ -48,8 +48,8 @@ To upgrade to a newer major version of Postgres, you must first install the new ### bdr_pg_upgrade command-line -`bdr_pg_upgrade` passes all parameters to `pg_upgrade`. Therefore, you can -specify any parameters supported by [`pg_upgrade`](https://www.postgresql.org/docs/current/pgupgrade.html#id-1.9.5.12.6). +bdr_pg_upgrade passes all parameters to pg_upgrade. Therefore, you can +specify any parameters supported by [pg_upgrade](https://www.postgresql.org/docs/current/pgupgrade.html#id-1.9.5.12.6). #### Synopsis @@ -59,12 +59,12 @@ bdr_pg_upgrade [OPTION] ... #### Options -In addition to the options for `pg_upgrade`, you can pass the following parameters -to `bdr_pg_upgrade`: +In addition to the options for pg_upgrade, you can pass the following parameters +to bdr_pg_upgrade: **Required parameters** -These parameters must be specified either in the command line or, for all but the `--database` parameter, in their equivalent environment variable. They are used by `bdr_pg_upgrade`. +Specify these parameters either in the command line or, for all but the `--database` parameter, in their equivalent environment variable. They're used by bdr_pg_upgrade. - `-b, --old-bindir` — Old cluster bin directory. - `-B, --new-bindir`— New cluster bin directory. @@ -74,7 +74,7 @@ These parameters must be specified either in the command line or, for all but th **Optional parameters** -These parameters are optional and are used by `bdr_pg_upgrade`. +These parameters are optional and are used by bdr_pg_upgrade. - `-p, --old-port` — Old cluster port number. - `-s, --socketdir` — Directory to use for postmaster sockets during upgrade. @@ -82,16 +82,16 @@ These parameters are optional and are used by `bdr_pg_upgrade`. **Other parameters** -Any other parameter which is not one of the above, is passed to pg_upgrade. 
pg_upgrade accepts the following parameters: +Any other parameter that's not one of the above is passed to pg_upgrade. pg_upgrade accepts the following parameters: - `-j, --jobs` — Number of simultaneous processes or threads to use. - `-k, --link` — Use hard links instead of copying files to the new cluster. -- `-o, --old-options` — Option to be passed to old postgres command. Multiple invocations will be appended. -- `-O, --new-options` — Option to be passed to new postgres command. Multiple invocations will be appended. -- `-N, --no-sync` — Do not wait for all files in the upgraded cluster to be written to disk. +- `-o, --old-options` — Option to pass to old postgres command. Multiple invocations are appended. +- `-O, --new-options` — Option to pass to new postgres command. Multiple invocations are appended. +- `-N, --no-sync` — Don't wait for all files in the upgraded cluster to be written to disk. - `-P, --new-port` — New cluster port number. - `-r, --retain` — Retain SQL and log files even after successful completion. -- `-U, --username` — Cluster's install user name +- `-U, --username` — Cluster's install user name. - `--clone` — Use efficient file cloning. #### Environment variables @@ -130,10 +130,10 @@ bdr_pg_upgrade \ ### Steps performed -Steps performed when running `bdr_pg_upgrade`. +These steps are performed when running bdr_pg_upgrade. !!! Note - When `--check` is supplied as an argument to `bdr_pg_upgrade`, the CLI + When `--check` is supplied as an argument to bdr_pg_upgrade, the CLI skips steps that modify the database. #### PGD Postgres checks @@ -160,9 +160,9 @@ Steps performed when running `bdr_pg_upgrade`. | Disconnecting from old cluster | `skip` | | Stopping old cluster | `skip` | -#### `pg_upgrade` steps +#### pg_upgrade steps -Standard `pg_upgrade` steps are performed. +Standard pg_upgrade steps are performed. !!! Note If supplied, `--check` is passed to pg_upgrade. 
diff --git a/product_docs/docs/pgd/5/upgrades/index.mdx b/product_docs/docs/pgd/5/upgrades/index.mdx index 20b4710bc0f..9bb0a235620 100644 --- a/product_docs/docs/pgd/5/upgrades/index.mdx +++ b/product_docs/docs/pgd/5/upgrades/index.mdx @@ -13,7 +13,7 @@ You can also stop all nodes, perform the upgrade on all nodes, and only then restart the entire cluster. This approach is the same as with a standard PostgreSQL setup. This strategy of upgrading all nodes at the same time avoids running with mixed versions of software and therefore is the simplest. However, it incurs -some downtime and we don't recommend it unless you can't perform the rolling upgrade +some downtime and we don't recommend it unless you can't perform the rolling upgrade for some reason. To upgrade an EDB Postgres Distributed cluster: @@ -101,9 +101,8 @@ new DDL syntax that was added to a newer release of Postgres. `bdr_init_physical` makes a byte-by-byte copy of the source node so you can't use it while upgrading from one major Postgres version to another. In fact, currently `bdr_init_physical` requires that even the - PGD version of the source and the joining node be exactly the same. - You can't be use it for rolling upgrades by way of joining a new node method. Instead, use a logical join. + You can't use it for rolling upgrades by way of joining a new node method. Instead, use a logical join. ### Upgrading a CAMO-enabled cluster @@ -111,26 +110,26 @@ Upgrading a CAMO-enabled cluster requires upgrading CAMO groups one by one while disabling the CAMO protection for the group being upgraded and reconfiguring it using the new [commit scope](../durability/commit-scopes)-based settings. 
-The following approach is recommended for upgrade two BDR nodes, that -constitute a CAMO pair, to PGD 5.0: +The following approach is recommended for upgrading two BDR nodes that +constitute a CAMO pair to PGD 5.0: - Ensure `bdr.enable_camo` remains `off` for transactions on any of - the two nodes or redirect clients away from the two nodes. Removing - the CAMO pairing while attempting to use CAMO will lead to errors - and prevent further transactions. + the two nodes, or redirect clients away from the two nodes. Removing + the CAMO pairing while attempting to use CAMO leads to errors + and prevents further transactions. - Uncouple the pair by deconfiguring CAMO either by resetting `bdr.camo_origin_for` and `bdr.camo_parter_of` (when upgrading from BDR 3.7.x) or by using `bdr.remove_camo_pair` (on BDR 4.x). -- Upgrade the two nodes to PGD 5.0 +- Upgrade the two nodes to PGD 5.0. - Create a dedicated node group for the two nodes and move them into that node group. - Create a [Commit Scope](../durability/commit-scopes) for this node group and thus the pair of nodes to use CAMO. - Reactivate CAMO protection again by either setting a `default_commit_scope` or by changing the clients to explicitly set - `bdr.commit_scope` (instead of `bdr.enable_camo`) for their sessions + `bdr.commit_scope` instead of `bdr.enable_camo` for their sessions or transactions. -- If necessary, allow clients to connect to the CAMO protected nodes, +- If necessary, allow clients to connect to the CAMO protected nodes again. ## Upgrade preparation @@ -187,7 +186,7 @@ version of Postgres you're upgrading to. Upgrading to a new major version of Postgres is more complicated than upgrading to a minor version. -EDB Postgres Distributed provides a `bdr_pg_upgrade` command line utility, +EDB Postgres Distributed provides a bdr_pg_upgrade command line utility, which you can use to do [in-place Postgres major version upgrades](bdr_pg_upgrade). !!! 
Note @@ -198,7 +197,7 @@ which you can use to do [in-place Postgres major version upgrades](bdr_pg_upgrad ## Upgrade check and validation After you upgrade your PGD node, you can verify the current -version of BDR binary like this: +version of the binary: ```sql SELECT bdr.bdr_version(); From 20800a72f4fdadbbb5e85e2b7c84f7a7a0bea9fd Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Fri, 23 Jun 2023 10:42:56 -0400 Subject: [PATCH 03/10] Update product_docs/docs/pgd/5/routing/raft/index.mdx Co-authored-by: Dj Walker-Morgan <126472455+djw-m@users.noreply.github.com> --- product_docs/docs/pgd/5/routing/raft/index.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/product_docs/docs/pgd/5/routing/raft/index.mdx b/product_docs/docs/pgd/5/routing/raft/index.mdx index c8a0c6ec1ff..5370b46a3ce 100644 --- a/product_docs/docs/pgd/5/routing/raft/index.mdx +++ b/product_docs/docs/pgd/5/routing/raft/index.mdx @@ -10,7 +10,7 @@ PGD manages its metadata using a Raft model where a top-level group spans all t Raft is an industry-accepted algorithm for making decisions though achieving *consensus* from a group of separate nodes in a distributed system. !!! -For certain operations in the top-level group, a Raft leader must be both established and connected. Examples of these operations include adding and removing nodes and allocating ranges for [galloc](../../sequences/#pgd-global-sequences) sequences. +For certain operations in the top-level group, it's essential that a Raft leader must be both established and connected. Examples of these operations include adding and removing nodes and allocating ranges for [galloc](../../sequences/#pgd-global-sequences) sequences. It also means that an absolute majority of nodes in the top-level group (one half of them plus one) must be able to reach each other. 
So, in a top-level group with five nodes, at least three of the nodes must be reachable by each other to establish a Raft leader. From 3bd3ad78f00141cfb31cc0157d5fd08cf9cc3dba Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Fri, 23 Jun 2023 10:43:36 -0400 Subject: [PATCH 04/10] Update product_docs/docs/pgd/5/routing/raft/index.mdx Co-authored-by: Dj Walker-Morgan <126472455+djw-m@users.noreply.github.com> --- product_docs/docs/pgd/5/routing/raft/index.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/product_docs/docs/pgd/5/routing/raft/index.mdx b/product_docs/docs/pgd/5/routing/raft/index.mdx index 5370b46a3ce..c90cf398bc6 100644 --- a/product_docs/docs/pgd/5/routing/raft/index.mdx +++ b/product_docs/docs/pgd/5/routing/raft/index.mdx @@ -22,7 +22,7 @@ You can configure proxy routing on a per-node group basis in PGD 5, but the reco ## Global routing -Global routing uses the top-level group to manage the proxy routing. All writable data nodes (not witness or subscribe-only nodes) in the group are eligible to become write leader for all proxies. Connections to proxies in the top-level group are routed to data nodes in the top-level group. +Global routing uses the top-level group to manage the proxy routing. All writable data nodes (not witness or subscribe-only nodes) in the group are eligible to become write leader for all proxies. Connections to proxies within the top-level group will be routed to data nodes within the top-level group. With global routing, there's only one write leader for the entire top-level group. 
From 98e90c973c35d64547e87fdebfe9f146dba6c68c Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Fri, 23 Jun 2023 10:43:50 -0400 Subject: [PATCH 05/10] Update product_docs/docs/pgd/5/routing/raft/index.mdx Co-authored-by: Dj Walker-Morgan <126472455+djw-m@users.noreply.github.com> --- product_docs/docs/pgd/5/routing/raft/index.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/product_docs/docs/pgd/5/routing/raft/index.mdx b/product_docs/docs/pgd/5/routing/raft/index.mdx index c90cf398bc6..bcbec528556 100644 --- a/product_docs/docs/pgd/5/routing/raft/index.mdx +++ b/product_docs/docs/pgd/5/routing/raft/index.mdx @@ -16,7 +16,7 @@ It also means that an absolute majority of nodes in the top-level group (one hal ## Proxy routing -One function that also uses Raft is proxy routing. Proxy routing requires that the proxies can coordinate writing to a data node in their group of nodes. This data node is the write leader. If the write leader goes offline, the proxies need to be able to switch to a new write leader, selected by the data nodes, to maintain continuity for connected applications. +One function that also uses Raft is proxy routing. Proxy routing requires that the proxies can coordinate writing to a data node within their group of nodes. This data node is the write leader. If the write leader goes offline, the proxies need to be able to switch to a new write leader, selected by the data nodes, to maintain continuity for connected applications. You can configure proxy routing on a per-node group basis in PGD 5, but the recommended configurations are *global* and *local* routing. 
From dc368a7d99f8010851cab2edeba8248596c41d51 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Fri, 23 Jun 2023 10:44:06 -0400 Subject: [PATCH 06/10] Update product_docs/docs/pgd/5/routing/raft/index.mdx Co-authored-by: Dj Walker-Morgan <126472455+djw-m@users.noreply.github.com> --- product_docs/docs/pgd/5/routing/raft/index.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/product_docs/docs/pgd/5/routing/raft/index.mdx b/product_docs/docs/pgd/5/routing/raft/index.mdx index bcbec528556..41415d0db6e 100644 --- a/product_docs/docs/pgd/5/routing/raft/index.mdx +++ b/product_docs/docs/pgd/5/routing/raft/index.mdx @@ -28,7 +28,7 @@ With global routing, there's only one write leader for the entire top-level grou ## Local routing -Local routing uses subgroups, often mapped to locations, to manage the proxy routing in the subgroup. Local routing is often used for geographical separation of writes. It's important for them to continue routing even when the top-level consensus is lost. +Local routing uses subgroups, often mapped to locations, to manage the proxy routing within the subgroup. Local routing is often used for geographical separation of writes. It's important for them to continue routing even when the top-level consensus is lost. That's because PGD allows queries and asynchronous data manipulation (DMLs) to work even when the top-level consensus is lost. But using the top-level consensus, as is the case with global routing, means that new write leaders can't be elected when that consensus is lost. Local groups can't rely on the top-level consensus without adding an independent consensus mechanism and its added complexity. 
From e9c1a0f1d7865373a887b13b236d4014a02f61a5 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Fri, 23 Jun 2023 10:44:27 -0400 Subject: [PATCH 07/10] Update product_docs/docs/pgd/5/routing/raft/index.mdx Co-authored-by: Dj Walker-Morgan <126472455+djw-m@users.noreply.github.com> --- product_docs/docs/pgd/5/routing/raft/index.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/product_docs/docs/pgd/5/routing/raft/index.mdx b/product_docs/docs/pgd/5/routing/raft/index.mdx index 41415d0db6e..6f017cd5739 100644 --- a/product_docs/docs/pgd/5/routing/raft/index.mdx +++ b/product_docs/docs/pgd/5/routing/raft/index.mdx @@ -32,7 +32,7 @@ Local routing uses subgroups, often mapped to locations, to manage the proxy rou That's because PGD allows queries and asynchronous data manipulation (DMLs) to work even when the top-level consensus is lost. But using the top-level consensus, as is the case with global routing, means that new write leaders can't be elected when that consensus is lost. Local groups can't rely on the top-level consensus without adding an independent consensus mechanism and its added complexity. -To elegantly address this, PGD 5 introduced subgroup Raft support. Supgroup Raft support allows the subgroups in a PGD top-level group to elect the leaders they need independently. They do this by forming devolved Raft groups that can elect write leaders independent of other subgroups or the top-level Raft consensus. Connections to proxies in the subgroup then route to data nodes in the subgroup. +PGD 5 introduced subgroup Raft support to elegantly address this issue. Subgroup Raft support allows the subgroups in a PGD top-level group to elect the leaders they need independently. They do this by forming devolved Raft groups that can elect write leaders independent of other subgroups or the top-level Raft consensus. Connections to proxies in the subgroup then route to data nodes within the subgroup. 
With local routing, there's a write leader for each subgroup. From 2f6090f648d062f6ef58dec7070f0d833f2dea64 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Fri, 23 Jun 2023 10:44:38 -0400 Subject: [PATCH 08/10] Update product_docs/docs/pgd/5/tpa.mdx Co-authored-by: Dj Walker-Morgan <126472455+djw-m@users.noreply.github.com> --- product_docs/docs/pgd/5/tpa.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/product_docs/docs/pgd/5/tpa.mdx b/product_docs/docs/pgd/5/tpa.mdx index f60cced92eb..8bdc50322b2 100644 --- a/product_docs/docs/pgd/5/tpa.mdx +++ b/product_docs/docs/pgd/5/tpa.mdx @@ -48,7 +48,7 @@ The available configuration options include: | `--architecture` | Required. Set to `PGD-Always-ON` for EDB Postgres Distributed deployments. | | `–-postgresql `<br/>or<br/>`--edb-postgres-advanced `<br/>or<br/>`--edb-postgres-extended ` | Required. Specifies the distribution and version of Postgres to use. For more details, see [Cluster configuration: Postgres flavour and version](/tpa/latest/tpaexec-configure/#postgres-flavour-and-version). | | `--redwood` or `--no-redwood` | Required when `--edb-postgres-advanced` flag is present. Specifies whether Oracle database compatibility features are desired. | -| `--location-names l1 l2 l3` | Required. Specifies the number and name of the locations to deploy PGD to. | +| `--location-names l1 l2 l3` | Required. Specifies the names of the locations to deploy PGD to. | | `--data-nodes-per-location N` | Specifies the number of data nodes per location. Default is 3. | | `--add-witness-node-per-location` | For an even number of data nodes per location, adds witness nodes to allow for local consensus. Enabled by default for 2 data node locations. | | `--add-proxy-nodes-per-location` | Whether to separate PGD proxies from data nodes and how many to configure. By default one proxy is configured and cohosted for each data node. 
| From fcb3ea389273cae8be8cc8b8d10c68375294848e Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Tue, 20 Jun 2023 14:20:41 -0400 Subject: [PATCH 09/10] Basic cleanup of release text Other cleanup tasks First edit of Backup topic --- product_docs/docs/pgd/5/routing/raft/index.mdx | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/product_docs/docs/pgd/5/routing/raft/index.mdx b/product_docs/docs/pgd/5/routing/raft/index.mdx index 6f017cd5739..c8a0c6ec1ff 100644 --- a/product_docs/docs/pgd/5/routing/raft/index.mdx +++ b/product_docs/docs/pgd/5/routing/raft/index.mdx @@ -10,29 +10,29 @@ PGD manages its metadata using a Raft model where a top-level group spans all t Raft is an industry-accepted algorithm for making decisions though achieving *consensus* from a group of separate nodes in a distributed system. !!!
-For certain operations in the top-level group, it's essential that a Raft leader must be both established and connected. Examples of these operations include adding and removing nodes and allocating ranges for [galloc](../../sequences/#pgd-global-sequences) sequences. +For certain operations in the top-level group, a Raft leader must be both established and connected. Examples of these operations include adding and removing nodes and allocating ranges for [galloc](../../sequences/#pgd-global-sequences) sequences. It also means that an absolute majority of nodes in the top-level group (one half of them plus one) must be able to reach each other. So, in a top-level group with five nodes, at least three of the nodes must be reachable by each other to establish a Raft leader. ## Proxy routing -One function that also uses Raft is proxy routing. Proxy routing requires that the proxies can coordinate writing to a data node within their group of nodes. This data node is the write leader. If the write leader goes offline, the proxies need to be able to switch to a new write leader, selected by the data nodes, to maintain continuity for connected applications. +One function that also uses Raft is proxy routing. Proxy routing requires that the proxies can coordinate writing to a data node in their group of nodes. This data node is the write leader. If the write leader goes offline, the proxies need to be able to switch to a new write leader, selected by the data nodes, to maintain continuity for connected applications. You can configure proxy routing on a per-node group basis in PGD 5, but the recommended configurations are *global* and *local* routing. ## Global routing -Global routing uses the top-level group to manage the proxy routing. All writable data nodes (not witness or subscribe-only nodes) in the group are eligible to become write leader for all proxies. Connections to proxies within the top-level group will be routed to data nodes within the top-level group. 
+Global routing uses the top-level group to manage the proxy routing. All writable data nodes (not witness or subscribe-only nodes) in the group are eligible to become write leader for all proxies. Connections to proxies in the top-level group are routed to data nodes in the top-level group. With global routing, there's only one write leader for the entire top-level group. ## Local routing -Local routing uses subgroups, often mapped to locations, to manage the proxy routing within the subgroup. Local routing is often used for geographical separation of writes. It's important for them to continue routing even when the top-level consensus is lost. +Local routing uses subgroups, often mapped to locations, to manage the proxy routing in the subgroup. Local routing is often used for geographical separation of writes. It's important for them to continue routing even when the top-level consensus is lost. That's because PGD allows queries and asynchronous data manipulation (DMLs) to work even when the top-level consensus is lost. But using the top-level consensus, as is the case with global routing, means that new write leaders can't be elected when that consensus is lost. Local groups can't rely on the top-level consensus without adding an independent consensus mechanism and its added complexity. -PGD 5 introduced subgroup Raft support to elegantly address this issue. Subgroup Raft support allows the subgroups in a PGD top-level group to elect the leaders they need independently. They do this by forming devolved Raft groups that can elect write leaders independent of other subgroups or the top-level Raft consensus. Connections to proxies in the subgroup then route to data nodes within the subgroup. +To elegantly address this, PGD 5 introduced subgroup Raft support. Supgroup Raft support allows the subgroups in a PGD top-level group to elect the leaders they need independently. 
They do this by forming devolved Raft groups that can elect write leaders independent of other subgroups or the top-level Raft consensus. Connections to proxies in the subgroup then route to data nodes in the subgroup. With local routing, there's a write leader for each subgroup. From 0d0a39024a0c7ac69e7abc2332cfb79e099e7f9d Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Fri, 23 Jun 2023 13:11:03 -0400 Subject: [PATCH 10/10] Reentered lost edits --- product_docs/docs/pgd/5/routing/raft/index.mdx | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/product_docs/docs/pgd/5/routing/raft/index.mdx b/product_docs/docs/pgd/5/routing/raft/index.mdx index c8a0c6ec1ff..6f017cd5739 100644 --- a/product_docs/docs/pgd/5/routing/raft/index.mdx +++ b/product_docs/docs/pgd/5/routing/raft/index.mdx @@ -10,29 +10,29 @@ PGD manages its metadata using a Raft model where a top-level group spans all t Raft is an industry-accepted algorithm for making decisions though achieving *consensus* from a group of separate nodes in a distributed system. !!! -For certain operations in the top-level group, a Raft leader must be both established and connected. Examples of these operations include adding and removing nodes and allocating ranges for [galloc](../../sequences/#pgd-global-sequences) sequences. +For certain operations in the top-level group, it's essential that a Raft leader must be both established and connected. Examples of these operations include adding and removing nodes and allocating ranges for [galloc](../../sequences/#pgd-global-sequences) sequences. It also means that an absolute majority of nodes in the top-level group (one half of them plus one) must be able to reach each other. So, in a top-level group with five nodes, at least three of the nodes must be reachable by each other to establish a Raft leader. ## Proxy routing -One function that also uses Raft is proxy routing. 
Proxy routing requires that the proxies can coordinate writing to a data node in their group of nodes. This data node is the write leader. If the write leader goes offline, the proxies need to be able to switch to a new write leader, selected by the data nodes, to maintain continuity for connected applications. +One function that also uses Raft is proxy routing. Proxy routing requires that the proxies can coordinate writing to a data node within their group of nodes. This data node is the write leader. If the write leader goes offline, the proxies need to be able to switch to a new write leader, selected by the data nodes, to maintain continuity for connected applications. You can configure proxy routing on a per-node group basis in PGD 5, but the recommended configurations are *global* and *local* routing. ## Global routing -Global routing uses the top-level group to manage the proxy routing. All writable data nodes (not witness or subscribe-only nodes) in the group are eligible to become write leader for all proxies. Connections to proxies in the top-level group are routed to data nodes in the top-level group. +Global routing uses the top-level group to manage the proxy routing. All writable data nodes (not witness or subscribe-only nodes) in the group are eligible to become write leader for all proxies. Connections to proxies within the top-level group will be routed to data nodes within the top-level group. With global routing, there's only one write leader for the entire top-level group. ## Local routing -Local routing uses subgroups, often mapped to locations, to manage the proxy routing in the subgroup. Local routing is often used for geographical separation of writes. It's important for them to continue routing even when the top-level consensus is lost. +Local routing uses subgroups, often mapped to locations, to manage the proxy routing within the subgroup. Local routing is often used for geographical separation of writes. 
It's important for them to continue routing even when the top-level consensus is lost. That's because PGD allows queries and asynchronous data manipulation (DMLs) to work even when the top-level consensus is lost. But using the top-level consensus, as is the case with global routing, means that new write leaders can't be elected when that consensus is lost. Local groups can't rely on the top-level consensus without adding an independent consensus mechanism and its added complexity. -To elegantly address this, PGD 5 introduced subgroup Raft support. Supgroup Raft support allows the subgroups in a PGD top-level group to elect the leaders they need independently. They do this by forming devolved Raft groups that can elect write leaders independent of other subgroups or the top-level Raft consensus. Connections to proxies in the subgroup then route to data nodes in the subgroup. +PGD 5 introduced subgroup Raft support to elegantly address this issue. Subgroup Raft support allows the subgroups in a PGD top-level group to elect the leaders they need independently. They do this by forming devolved Raft groups that can elect write leaders independent of other subgroups or the top-level Raft consensus. Connections to proxies in the subgroup then route to data nodes within the subgroup. With local routing, there's a write leader for each subgroup.
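
The Raft text carried through these hunks states that an absolute majority of nodes in the top-level group ("one half of them plus one") must be able to reach each other, and gives the example of three out of five. A minimal Python sketch of that arithmetic follows. It assumes the usual Raft reading of the phrase, with the half rounded down, which is what makes the five-node example come out to three. The function names `raft_quorum` and `can_elect_leader` are hypothetical illustrations and aren't part of PGD, TPA, or the PGD CLI.

```python
def raft_quorum(node_count: int) -> int:
    """Return the smallest number of mutually reachable nodes
    that forms an absolute majority of the group.

    Assumption: "one half plus one" means floor(n / 2) + 1.
    """
    if node_count < 1:
        raise ValueError("a group needs at least one node")
    return node_count // 2 + 1


def can_elect_leader(node_count: int, reachable: int) -> bool:
    """Report whether enough nodes can reach each other to establish a leader."""
    return reachable >= raft_quorum(node_count)


if __name__ == "__main__":
    # Print the majority needed for a few group sizes.
    for size in (3, 4, 5):
        print(f"{size} nodes need {raft_quorum(size)} reachable nodes")
```

Under the same assumption, the calculation applies equally to a subgroup's own Raft group under local routing, which is why a subgroup can keep electing write leaders after the top-level consensus is lost.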