Node management - Refresh #4849

Merged Nov 3, 2023 (29 commits)
Commits
1044702
First pass reorg into sections with intro
djw-m Sep 26, 2023
2025bf8
Only link apparently to Nodes fixes.
djw-m Sep 27, 2023
3895701
Link fix
djw-m Sep 27, 2023
933d2b8
Supporting edits and tweaks for links
djw-m Sep 27, 2023
cd28a56
Remove todo, make link to pgd-cli clearer
djw-m Oct 5, 2023
2fed4b1
Update product_docs/docs/pgd/5/node_management/index.mdx
djw-m Oct 5, 2023
f5fef38
Update product_docs/docs/pgd/5/node_management/index.mdx
djw-m Oct 6, 2023
dff13da
Update product_docs/docs/pgd/5/node_management/decoding_worker.mdx
djw-m Oct 6, 2023
d0f14a8
Update product_docs/docs/pgd/5/node_management/logical_standby_nodes.mdx
djw-m Oct 6, 2023
e79b4ad
Update product_docs/docs/pgd/5/node_management/node_recovery.mdx
djw-m Oct 6, 2023
269e8ee
Update product_docs/docs/pgd/5/node_management/node_recovery.mdx
djw-m Oct 6, 2023
c7a7d19
Update product_docs/docs/pgd/5/node_management/node_recovery.mdx
djw-m Oct 6, 2023
e608bac
Update product_docs/docs/pgd/5/node_management/physical_standby_nodes…
djw-m Oct 6, 2023
b61ab42
Update product_docs/docs/pgd/5/node_management/viewing_topology.mdx
djw-m Oct 6, 2023
1991cc6
fix cli links viewing_topology.mdx
djw-m Oct 6, 2023
0ffddbe
Update product_docs/docs/pgd/5/node_management/connections_dsns_and_s…
djw-m Oct 6, 2023
6b55321
Update product_docs/docs/pgd/5/node_management/connections_dsns_and_s…
djw-m Oct 6, 2023
096fd70
Removed space from redirects
djw-m Oct 6, 2023
682b920
Read through reorged topics
ebgitelman Oct 6, 2023
caf2a06
Merged the groups-subgroups into one. Witness TBD.
djw-m Oct 12, 2023
1d78d7d
Fix a few remaining links
josh-heyer Oct 17, 2023
8347a13
Remove subscriber only groups post groups merge
djw-m Oct 17, 2023
117b683
More sections to add clarity
djw-m Oct 17, 2023
271415c
small tweaks to language
djw-m Oct 17, 2023
29405d8
WIP commit
djw-m Oct 17, 2023
a95f042
Expanded witness nodes and updated types
djw-m Oct 18, 2023
a0a842f
More edits
djw-m Oct 18, 2023
a1b130c
Fixed link to subscriber-only content
djw-m Nov 2, 2023
6a41f3b
Fix links to subscriber_only topic (again) - underscores!!
josh-heyer Nov 3, 2023
4 changes: 2 additions & 2 deletions product_docs/docs/pgd/5/choosing_server.mdx
@@ -14,7 +14,7 @@ The following table lists features of EDB Postgres Distributed that are dependen
| [Granular DDL Locking](ddl/#ddl-locking-details) | Y | Y | Y |
| [Streaming of large transactions](transaction-streaming/) | v14+ | v13+ | v14+ |
| [Distributed sequences](sequences/#pgd-global-sequences) | Y | Y | Y |
| [Subscribe-only nodes](nodes/#physical-standby-nodes) | Y | Y | Y |
| [Subscribe-only nodes](node_management/subscriber_only/) | Y | Y | Y |
| [Monitoring](monitoring/) | Y | Y | Y |
| [OpenTelemetry support](monitoring/otel/) | Y | Y | Y |
| [Parallel apply](parallelapply) | Y | Y | Y |
@@ -28,7 +28,7 @@ The following table lists features of EDB Postgres Distributed that are dependen
| [Commit At Most Once (CAMO)](durability/camo/) | N | Y | 14+ |
| [Eager Conflict Resolution](consistency/eager/) | N | Y | 14+ |
| [Lag Control](durability/lag-control/) | N | Y | 14+ |
| [Decoding Worker](nodes/#decoding-worker) | N | 13+ | 14+ |
| [Decoding Worker](node_management/decoding_worker) | N | 13+ | 14+ |
| [Lag tracker](monitoring/sql/#monitoring-outgoing-replication) | N | Y | 14+ |
| Missing partition conflict | N | Y | 14+ |
| No need for UPDATE Trigger on tables with TOAST | N | Y | 14+ |
2 changes: 1 addition & 1 deletion product_docs/docs/pgd/5/consistency/conflicts.mdx
@@ -475,7 +475,7 @@ Origin info is available only up to the point where a row is frozen. Updates arr
A node that was offline that reconnects and begins sending data changes can cause divergent
errors if the newly arrived updates are older than the frozen rows that they update. Inserts and deletes aren't affected by this situation.

We suggest that you don't leave down nodes for extended outages, as discussed in [Node restart and down node recovery](../nodes).
We suggest that you don't leave down nodes for extended outages, as discussed in [Node restart and down node recovery](../node_management).

On EDB Postgres Extended Server and EDB Postgres Advanced Server, PGD holds back the freezing of rows while a node is down. This mechanism handles this situation gracefully so you don't need to change parameter settings.

2 changes: 1 addition & 1 deletion product_docs/docs/pgd/5/index.mdx
@@ -22,8 +22,8 @@ navigation:
- upgrades
- "#Using"
- appusage
- node_management
- postgres-configuration
- nodes
- ddl
- security
- sequences
6 changes: 3 additions & 3 deletions product_docs/docs/pgd/5/monitoring/sql.mdx
@@ -105,7 +105,7 @@ and

Each node has one PGD group slot that must never have a connection to it
and is very rarely marked as active. This is normal and doesn't imply
something is down or disconnected. See [Replication slots created by PGD`](../nodes/#replication-slots-created-by-pgd).
something is down or disconnected. See [Replication slots](../node_management/replication_slots) in Node Management.

### Monitoring outgoing replication

@@ -271,7 +271,7 @@ subscription_status | replicating

### Monitoring WAL senders using LCR

If the [decoding worker](../nodes#decoding-worker) is enabled, you can monitor information about the
If the [decoding worker](../node_management/decoding_worker/) is enabled, you can monitor information about the
current logical change record (LCR) file for each WAL sender
using the function [`bdr.wal_sender_stats()`](/pgd/latest/reference/functions/#bdrwal_sender_stats). For example:

@@ -287,7 +287,7 @@ postgres=# SELECT * FROM bdr.wal_sender_stats();

If `is_using_lcr` is `FALSE`, `decoder_slot_name`/`lcr_file_name` is `NULL`.
This is the case if the decoding worker isn't enabled or the WAL sender is
serving a [logical standby](../nodes#logical-standby-nodes).
serving a [logical standby](../node_management/logical_standby_nodes/).

Also, you can monitor information about the decoding worker using the function
[`bdr.get_decoding_worker_stat()`](/pgd/latest/reference/functions/#bdrget_decoding_worker_stat). For example:
@@ -0,0 +1,55 @@
---
title: Connection DSNs and SSL (TLS)
---

Because nodes connect
using `libpq`, the DSN of a node is a [`libpq`](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNECT-SSLMODE) connection string. As such, the connection string can contain any permitted `libpq` connection
parameter, including those for SSL. The DSN must work as the
connection string from the client connecting to the node in which it's
specified. An example of such a set of parameters using a client certificate is:

```ini
sslmode=verify-full sslcert=bdr_client.crt sslkey=bdr_client.key
sslrootcert=root.crt
```

With this setup, the files `bdr_client.crt`, `bdr_client.key`, and
`root.crt` must be present in the data directory on each node, with the
appropriate permissions.
For `verify-full` mode, the server's SSL certificate is checked to
ensure that it's directly or indirectly signed with the `root.crt` certificate
authority and that the host name or address used in the connection matches the
contents of the certificate. In the case of a name, this can match a subject's
alternative name or, if there are no such names in the certificate, the
subject's common name (CN) field.
Postgres doesn't currently support subject alternative names for IP
addresses, so if the connection is made by address rather than name, it must
match the CN field.

The CN of the client certificate must be the name of the user making the
PGD connection,
which is usually the user postgres. Each node requires matching
lines permitting the connection in the `pg_hba.conf` file. For example:

```ini
hostssl all postgres 10.1.2.3/24 cert
hostssl replication postgres 10.1.2.3/24 cert
```

Another setup might be to use `SCRAM-SHA-256` passwords instead of client
certificates and not verify the server identity as long as
the certificate is properly signed. Here the DSN parameters might be:

```ini
sslmode=verify-ca sslrootcert=root.crt
```

The corresponding `pg_hba.conf` lines are:

```ini
hostssl all postgres 10.1.2.3/24 scram-sha-256
hostssl replication postgres 10.1.2.3/24 scram-sha-256
```

In such a scenario, the postgres user needs a [`.pgpass`](https://www.postgresql.org/docs/current/libpq-pgpass.html) file
containing the correct password.
@@ -0,0 +1,95 @@
---
title: Creating and joining PGD groups
navTitle: Creating and joining PGD groups
---

## Creating and joining PGD groups

For PGD, every node must connect to every other node. To make
configuration easy, when a new node joins, it configures all
existing nodes to connect to it. For this reason, every node, including
the first PGD node created, must know the [PostgreSQL connection string](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING) that other nodes
can use to connect to it. This connection string is
sometimes referred to as a data source name (DSN).

Both formats of connection string are supported.
So you can use either key-value format, like `host=myhost port=5432 dbname=mydb`,
or URI format, like `postgresql://myhost:5432/mydb`.

The SQL function [`bdr.create_node_group()`](/pgd/latest/reference/nodes-management-interfaces#bdrcreate_node_group) creates the PGD group
from the local node. Doing so activates PGD on that node and allows other
nodes to join the PGD group, which consists of only one node at that point.
At the time of creation, you must specify the connection string for other
nodes to use to connect to this node.

Once the node group is created, every further node can join the PGD
group using the [`bdr.join_node_group()`](/pgd/latest/reference/nodes-management-interfaces#bdrjoin_node_group) function.
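As a rough sketch, creating a group on the first node and then joining it from a second node looks something like the following. The node names, group name, and DSNs are placeholders, and the local node is assumed to be registered with `bdr.create_node()` before the group functions are called.

```sql
-- A minimal sketch (names and DSNs are placeholders).
-- On the first node: register the local node, then create the PGD group.
SELECT bdr.create_node('node_one', 'host=host-one dbname=bdrdb port=5432');
SELECT bdr.create_node_group('group_a');

-- On each node that joins later: register the local node, then join the
-- group through a node that's already a member.
SELECT bdr.create_node('node_two', 'host=host-two dbname=bdrdb port=5432');
SELECT bdr.join_node_group('host=host-one dbname=bdrdb port=5432', 'group_a');
```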

Alternatively, use the command-line utility [bdr_init_physical](/pgd/latest/reference/nodes/#bdr_init_physical) to
create a new node, using `pg_basebackup` or a physical standby of an existing
node. When using `pg_basebackup`, the bdr_init_physical utility can optionally
take the base backup of only the target database. The earlier
behavior was to back up the entire database cluster. Backing up only the target
database completes faster and also uses less space because it excludes
unwanted databases. If you specify only the target database, then the excluded
databases get cleaned up and removed on the new node.

When a new PGD node is joined to an existing PGD group or a node subscribes
to an upstream peer, before replication can begin, the system must copy the
existing data from the peer nodes to the local node. This copy must be
carefully coordinated so that the local and remote data starts out
identical. It's not enough to use pg_dump yourself. The BDR
extension provides built-in facilities for making this initial copy.

During the join process, the BDR extension synchronizes existing data
using the provided source node as the basis and creates all metadata
information needed for establishing itself in the mesh topology in the PGD
group. If the connection between the source and the new node disconnects during
this initial copy, restart the join process from the
beginning.

The node that's joining the cluster must not contain any schema or data
that already exists on databases in the PGD group. We recommend that the
newly joining database be empty except for the BDR extension. However,
it's important that all required database users and roles are created.

Optionally, you can skip the schema synchronization using the `synchronize_structure`
parameter of the [`bdr.join_node_group`](/pgd/latest/reference/nodes-management-interfaces#bdrjoin_node_group) function. In this case, the schema must
already exist on the newly joining node.
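For illustration only, a join that skips schema synchronization might look like the following sketch. It assumes the `synchronize_structure` parameter accepts `'none'`; check the [`bdr.join_node_group`](/pgd/latest/reference/nodes-management-interfaces#bdrjoin_node_group) reference for the exact accepted values.

```sql
-- Sketch: join without copying the schema. The schema must already exist
-- on the joining node.
SELECT bdr.join_node_group(
    join_target_dsn := 'host=host-one dbname=bdrdb port=5432',
    node_group_name := 'group_a',
    synchronize_structure := 'none'
);
```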

We recommend that you select the node that has the best connection (the
closest) as the source node for joining. Doing so lowers the time
needed for the join to finish.

The join procedure is coordinated using the Raft consensus algorithm, which
requires a majority of the existing nodes to be online and reachable.

The logical join procedure (which uses the [`bdr.join_node_group`](/pgd/latest/reference/nodes-management-interfaces#bdrjoin_node_group) function)
performs data sync doing `COPY` operations and uses multiple writers
(parallel apply) if those are enabled.

Node join can execute concurrently with other node joins for the majority of
the time taken to join. However, only one regular node at a time can be in
either of the states `PROMOTE` or `PROMOTING`. These states are typically short if
all other nodes are up and running. Otherwise the join is serialized at
this stage. Subscriber-only nodes are an exception to this rule: they
can be in the `PROMOTE` and `PROMOTING` states concurrently, so their join
process is fully concurrent.

The join process uses only one node as the source, so it can be
executed when nodes are down if a majority of nodes are available.
This approach can introduce a complication when running a logical join.
During logical join, the commit timestamp of rows copied from the source
node is set to the latest commit timestamp on the source node.
Committed changes on nodes that have a commit timestamp earlier than this
(because nodes are down or have significant lag) can conflict with changes
from other nodes. In this case, conflicts on the newly joined node can be resolved
differently from those on other nodes, causing a divergence. As a result, we recommend
not running a node join when significant replication lag exists between nodes.
If joining during such lag is unavoidable, run LiveCompare on the newly joined node to
correct any data divergence once all nodes are available and caught up.

`pg_dump` can fail because of cache-lookup failures when there's concurrent DDL
activity on the source node. Since [`bdr.join_node_group`](/pgd/latest/reference/nodes-management-interfaces#bdrjoin_node_group) uses `pg_dump`
internally, the join can fail for the same reason.
Retrying the join works in that case.
79 changes: 79 additions & 0 deletions product_docs/docs/pgd/5/node_management/decoding_worker.mdx
@@ -0,0 +1,79 @@
---
title: Decoding worker
---

PGD provides an option to enable a decoding worker process that performs
decoding once, no matter how many nodes are sent data. This option introduces a
new process, the WAL decoder, on each PGD node. One WAL sender process still
exists for each connection, but these processes now just perform the task of
sending and receiving data. Taken together, these changes reduce the CPU
overhead of larger PGD groups and also allow higher replication throughput
since the WAL sender process now spends more time on communication.

## Enabling

`enable_wal_decoder` is an option for each PGD group, which is currently
disabled by default. You can use [`bdr.alter_node_group_config()`](../reference/nodes-management-interfaces/#bdralter_node_group_config) to enable or
disable the decoding worker for a PGD group.
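For example, a sketch of toggling the option for a group named `group_a` (a placeholder name) might look like this:

```sql
-- Enable the decoding worker for the whole PGD group.
SELECT bdr.alter_node_group_config('group_a', enable_wal_decoder := true);

-- Disable it again later if needed.
SELECT bdr.alter_node_group_config('group_a', enable_wal_decoder := false);
```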

When the decoding worker is enabled, PGD stores logical change record (LCR)
files to allow buffering of changes between decoding and the point when all
subscribing nodes have received the data. LCR files are stored under the
`pg_logical` directory in each local node's data directory. The number and
size of the LCR files vary as replication lag increases, so this process also
needs monitoring. The LCRs that aren't required by any of the PGD nodes are cleaned
periodically. The interval between two consecutive cleanups is controlled by
[`bdr.lcr_cleanup_interval`](/pgd/latest/reference/pgd-settings#bdrlcr_cleanup_interval), which defaults to 3 minutes. The cleanup is
disabled when [`bdr.lcr_cleanup_interval`](/pgd/latest/reference/pgd-settings#bdrlcr_cleanup_interval) is 0.
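As an illustration, and assuming the setting can be changed like any other GUC with `ALTER SYSTEM` plus a reload, inspecting or adjusting the cleanup interval might look like:

```sql
-- Check the current interval between LCR cleanups.
SHOW bdr.lcr_cleanup_interval;

-- Example value only: clean up unneeded LCR files every 5 minutes.
ALTER SYSTEM SET bdr.lcr_cleanup_interval = '5min';
SELECT pg_reload_conf();
```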

## Disabling

When disabled, logical decoding is performed by the WAL sender process for each
node subscribing to each node. In this case, no LCR files are written.

Even when the decoding worker is enabled for a PGD group, the following
GUCs control the production and use of LCRs on each node. By default
these are `false`. To produce and use LCRs, enable the
decoding worker for the PGD group and set these GUCs to `true` on each of the nodes in the PGD group. A sketch of setting them on a node follows the list.

- [`bdr.enable_wal_decoder`](/pgd/latest/reference/pgd-settings#bdrenable_wal_decoder) — When `false`, all WAL
senders using LCRs restart to use WAL directly. When `true`,
and the decoding worker is also enabled in the PGD group configuration, a decoding worker process is
started to produce LCRs, and WAL senders use them.
- [`bdr.receive_lcr`](/pgd/latest/reference/pgd-settings#bdrreceive_lcr) — When `true` on the subscribing node, it requests that the WAL
sender on the publisher node use LCRs if they're available.
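A minimal sketch of enabling both settings on one node follows. It assumes the GUCs can be set with `ALTER SYSTEM` and picked up on reload; they can equally be set in `postgresql.conf`.

```sql
-- Run on each node in the PGD group.
ALTER SYSTEM SET bdr.enable_wal_decoder = on;  -- produce LCRs on this node
ALTER SYSTEM SET bdr.receive_lcr = on;         -- request LCRs when subscribing to publishers
SELECT pg_reload_conf();
```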


!!! Note Notes
As of now, a decoding worker decodes changes corresponding to the node where it's
running. A logical standby is sent changes from all the nodes in the PGD group
through a single source. Hence a WAL sender serving a logical standby currently can't
use LCRs.

A subscriber-only node receives changes from respective nodes directly. Hence
a WAL sender serving a subscriber-only node can use LCRs.

Even though LCRs are produced, the corresponding WAL is still retained, similar
to the case when a decoding worker isn't enabled. In the future, it might be possible
to remove WAL corresponding to the LCRs if it isn't otherwise required.
!!!

## LCR file names

For reference, the first 24 characters of an LCR file name are similar to those
in a WAL file name. The first 8 characters of the name are currently all '0'.
In the future, they're expected to represent the TimeLineId similar to the first 8
characters of a WAL segment file name. The following sequence of 16 characters
of the name is similar to the WAL segment number, which is used to track LCR
changes against the WAL stream.

However, logical changes are
reordered according to the commit order of the transactions they belong to.
Hence their placement in the LCR segments doesn't match the placement of
corresponding WAL in the WAL segments.

The set of the last 16 characters represents the
subsegment number in an LCR segment. Each LCR file corresponds to a
subsegment. LCR files are binary and variable sized. The maximum size of an
LCR file can be controlled by `bdr.max_lcr_segment_file_size`, which
defaults to 1 GB.
@@ -0,0 +1,22 @@
---
title: Groups and subgroups
---

## Groups

A PGD cluster's nodes are gathered in groups. A "top level" group always exists and is the group
to which all data nodes automatically belong. The "top level" group can also be
the direct parent of subgroups.

## Subgroups

A group can also contain zero or more subgroups. Subgroups can be used to
represent data centers or locations, allowing commit scopes to refer to the nodes
in a particular region as a whole. PGD Proxy can also make use of subgroups
to delineate the nodes available to be write leader.

The `node_group_type` value specifies the type when the subgroup is created.
Some subgroup types change the behavior of the nodes within the group. For
example, a [subscriber-only](subscriber_only) subgroup makes all the nodes
within the group subscriber-only nodes.
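For illustration, creating a subscriber-only subgroup under a top-level group might look like the following sketch. The group names are placeholders, and it assumes [`bdr.create_node_group()`](/pgd/latest/reference/nodes-management-interfaces#bdrcreate_node_group) accepts the `parent_group_name`, `join_node_group`, and `node_group_type` arguments as described in its reference.

```sql
-- Sketch: create a subgroup whose member nodes are all subscriber-only.
SELECT bdr.create_node_group(
    node_group_name := 'so_group',
    parent_group_name := 'group_a',
    join_node_group := false,          -- assumption: the creating node stays in its own group
    node_group_type := 'subscriber-only'
);
```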

@@ -0,0 +1,32 @@
---
title: Joining a heterogeneous cluster
---


A PGD 4.0 node can join an EDB Postgres Distributed cluster running 3.7.x at a specific
minimum maintenance release (such as 3.7.6) or a mix of 3.7 and 4.0 nodes.
This procedure is useful when you want to upgrade not just the PGD
major version but also the underlying PostgreSQL major
version. You can achieve this by joining a 3.7 node running on
PostgreSQL 12 or 13 to an EDB Postgres Distributed cluster running 3.6.x on
PostgreSQL 11. The new node can also
run on the same PostgreSQL major release as all of the nodes in the
existing cluster.

PGD ensures that the replication works correctly in all directions
even when some nodes are running 3.6 on one PostgreSQL major release and
other nodes are running 3.7 on another PostgreSQL major release. However,
we recommend that you quickly bring the cluster into a
homogeneous state by parting the older nodes once enough new nodes
join the cluster. Don't run any DDL that might
not be available on the older versions, or vice versa.

A node joining with a different major PostgreSQL release can't use
a physical backup taken with [`bdr_init_physical`](/pgd/latest/reference/nodes#bdr_init_physical), and the node must join
using the logical join method. Using this method is necessary because the major
PostgreSQL releases aren't on-disk compatible with each other.

When a 3.7 node joins the cluster using a 3.6 node as a
source, certain configurations, such as conflict resolution,
aren't copied from the source node. The node must be configured
after it joins the cluster.