Merge pull request #2103 from EnterpriseDB/release/2021/12-06
Release: 2021-12-06
drothery-edb authored Dec 6, 2021
2 parents 9e4d60b + e9270f3 commit 0aacfc7
Showing 55 changed files with 244 additions and 237 deletions.
11 changes: 2 additions & 9 deletions product_docs/docs/bdr/3.7/backup.mdx
@@ -236,15 +236,8 @@ of a single BDR node, optionally plus WAL archives:

The cleaning of leftover BDR metadata is achieved as follows:

1. Drop the `bdr` extension with `CASCADE`.
2. Drop all the replication origins previously created by BDR.
3. Drop any replication slots left over from BDR.
4. Fully stop and re-start PostgreSQL (important!).
5. Create the `bdr` extension.

The `DROP EXTENSION`/`CREATE EXTENSION` cycle guarantees that all the
BDR metadata from the previous cluster is removed, and that the node
can be used to grow a new BDR cluster from scratch.
1. Drop the BDR node using `bdr.drop_node` (see the sketch below).
2. Fully stop and restart PostgreSQL (important!).
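
A minimal sketch of this cleanup, assuming the local node is named `node1` (the node name and the `cascade` parameter are illustrative assumptions; check the `bdr.drop_node` signature for your exact version):

```sql
-- Drop the local BDR node and its metadata (node name is hypothetical)
SELECT bdr.drop_node(node_name := 'node1', cascade := true);
-- Then fully stop and restart PostgreSQL, e.g. from the shell:
--   pg_ctl -D /path/to/datadir restart -m fast
```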

#### Cleanup of Replication Origins

4 changes: 2 additions & 2 deletions product_docs/docs/bdr/3.7/catalogs.mdx
@@ -936,7 +936,7 @@ only one node and processed on different nodes.
| Column | Type | Description |
| ------------------ | ------ | ------------------------------------------------------------------------------------------------------------------------ |
| ap_wq_workid | bigint | The Unique ID of the work item |
| ap_wq_ruleid | int | ID of the rule listed in autopartition_rules. Rules are specified using bdr.autoscale/autopartition commands |
| ap_wq_ruleid       | int    | ID of the rule listed in autopartition_rules. Rules are specified using the bdr.autopartition command                     |
| ap_wq_relname | name | Name of the relation being autopartitioned |
| ap_wq_relnamespace | name   | Name of the namespace (schema) specified in the rule for this work item.                                                  |
| ap_wq_partname | name | Name of the partition created by the workitem |
@@ -970,7 +970,7 @@ items, independent of other nodes in the cluster.
| Column | Type | Description |
| ------------------ | ------ | ------------------------------------------------------------------------------------------------------------------------ |
| ap_wq_workid | bigint | The Unique ID of the work item |
| ap_wq_ruleid | int | ID of the rule listed in autopartition_rules. Rules are specified using bdr.autoscale/autopartition commands |
| ap_wq_ruleid       | int    | ID of the rule listed in autopartition_rules. Rules are specified using the bdr.autopartition command                     |
| ap_wq_relname | name | Name of the relation being autopartitioned |
| ap_wq_relnamespace | name   | Name of the namespace (schema) specified in the rule for this work item.                                                  |
| ap_wq_partname | name | Name of the partition created by the workitem |
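
For example, pending work items can be inspected with a simple query (a sketch assuming this catalog is exposed as `bdr.autopartition_work_queue`; the relation name is an assumption, only the columns above are confirmed):

```sql
-- Relation name is an assumption; the columns match the table above.
SELECT ap_wq_workid, ap_wq_ruleid, ap_wq_relname, ap_wq_partname
FROM bdr.autopartition_work_queue
ORDER BY ap_wq_workid;
```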
4 changes: 2 additions & 2 deletions product_docs/docs/bdr/3.7/ddl.mdx
@@ -996,8 +996,8 @@ nodes should have applied the `ALTER TABLE .. ADD CONSTRAINT ... NOT VALID`
command and made enough progress. BDR will wait for a consistent
state to be reached before validating the constraint.

Note that the new facility requires the cluster to run with RAFT protocol
version 24 and beyond. If the RAFT protocol is not yet upgraded, the old
Note that the new facility requires the cluster to run with Raft protocol
version 24 and beyond. If the Raft protocol is not yet upgraded, the old
mechanism will be used, resulting in a DML lock request.
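
As a sketch of the two-phase pattern this refers to (table and constraint names are hypothetical):

```sql
-- Phase 1: add the constraint without validating existing rows.
ALTER TABLE orders
    ADD CONSTRAINT orders_qty_positive CHECK (qty > 0) NOT VALID;

-- Phase 2: validate existing rows. With Raft protocol version 24 and
-- beyond, BDR waits for a consistent state instead of requesting a DML lock.
ALTER TABLE orders VALIDATE CONSTRAINT orders_qty_positive;
```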

!!! Note
21 changes: 17 additions & 4 deletions product_docs/docs/bdr/3.7/durability.mdx
@@ -20,11 +20,24 @@ can all be implemented individually:
eventually be applied on all nodes without further conflicts, or get
an abort directly informing the client of an error.

PGLogical (PGL) integrates with the `synchronous_commit` option of
BDR integrates with the `synchronous_commit` option of
Postgres itself, providing a variant of synchronous replication,
which can be used between BDR nodes. In addition, BDR offers
[Eager All-Node Replication](eager) and
[Commit At Most Once](camo).
which can be used between BDR nodes. BDR also offers two additional
replication modes:

- Commit At Most Once (CAMO). This feature solves the problem of knowing
  whether your transaction has committed (and replicated) or not in case of
  certain errors during COMMIT. Normally, it might be hard to know whether
  or not the COMMIT was processed. With this feature, your application can
  find out what happened, even if your new database connection is to a different
  node than your previous connection. For more information about this feature,
  see the [Commit At Most Once](camo) chapter.
- Eager Replication. This is an optional feature to avoid replication
  conflicts. Every transaction is applied on *all nodes* simultaneously,
  and commits only if no replication conflicts are detected. This feature does
  reduce performance, but provides very strong consistency guarantees.
  For more information about this feature, see the
  [Eager All-Node Replication](eager) chapter.
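
As a minimal sketch of the `synchronous_commit` integration (the standby name `node2` is a hypothetical BDR peer; the value to use depends on your cluster configuration):

```sql
-- In postgresql.conf (hypothetical peer name):
--   synchronous_standby_names = 'node2'
-- Then request synchronous behavior for a single transaction:
BEGIN;
SET LOCAL synchronous_commit = 'remote_write';
-- ... DML here; COMMIT waits for the configured peer to acknowledge
COMMIT;
```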

Postgres itself provides [Physical Streaming
Replication](https://www.postgresql.org/docs/11/warm-standby.html#SYNCHRONOUS-REPLICATION)
22 changes: 0 additions & 22 deletions product_docs/docs/bdr/3.7/functions.mdx
@@ -44,28 +44,6 @@ value:
```
MAJOR_VERSION * 10000 + MINOR_VERSION * 100 + PATCH_RELEASE
```
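
For example, version 3.7.16 would be reported as `3*10000 + 7*100 + 16 = 30716`.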

### bdr.wal_sender_stats

If the [Decoding Worker](nodes#decoding-worker) is enabled, this
view shows information about the decoder slot and current LCR
(`Logical Change Record`) segment file being read by each WAL sender.

#### Synopsis

```sql
bdr.wal_sender_stats() → setof record (pid integer, is_using_lcr boolean, decoder_slot_name TEXT, lcr_file_name TEXT)
```

#### Output columns

- `pid` - PID of the WAL sender (corresponds to `pg_stat_replication`'s `pid` column)

- `is_using_lcr` - Whether the WAL sender is sending LCR files. The next columns will be `NULL` if `is_using_lcr` is `FALSE`.

- `decoder_slot_name` - The name of the decoder replication slot.

- `lcr_file_name` - The name of the current LCR file.

## System and Progress Information Parameters

BDR exposes some parameters that can be queried via `SHOW` in `psql`
2 changes: 2 additions & 0 deletions product_docs/docs/bdr/3.7/index.mdx
@@ -114,3 +114,5 @@ Some features are only available on particular versions of Postgres server.

Features that are currently available only with EDB Postgres Extended are
expected to be available with EDB Postgres Advanced 14.

This documentation is for the Enterprise Edition of BDR3.
7 changes: 3 additions & 4 deletions product_docs/docs/bdr/3.7/known-issues.mdx
@@ -55,11 +55,10 @@ unique identifier.

- Decoding Worker works only with the default replication sets

- When Decoding Worker is enabled in a BDR node group and a BDR node is shut down
  in fast mode immediately after starting it, the shutdown may not complete
  because the WAL sender does not exit. This happens because the WAL sender waits
  for the WAL decoder to start, and the WAL decoder may never start since the
  node is shutting down. The situation can be worked around by using an immediate
  shutdown or waiting for the WAL decoder to start. The WAL decoder process is
reported in `pglogical.workers` as well as `pg_stat_activity` catalogs.
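
For reference, a hedged sketch of how the WAL decoder might be spotted at the SQL level (the name patterns are assumptions; the exact worker naming depends on the BDR/pglogical version):

```sql
-- Both pglogical.workers and pg_stat_activity report the process;
-- the LIKE patterns below are illustrative only.
SELECT pid, application_name, backend_type
FROM pg_stat_activity
WHERE backend_type LIKE '%pglogical%'
   OR application_name LIKE '%decoder%';
```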
50 changes: 29 additions & 21 deletions product_docs/docs/bdr/3.7/monitoring.mdx
@@ -293,6 +293,17 @@ If `is_using_lcr` is `FALSE`, `decoder_slot_name`/`lcr_file_name` will be `NULL`.
This will be the case if the Decoding Worker is not enabled, or the WAL sender is
serving a [logical standby](nodes#logical-standby-nodes).
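
For example, the WAL senders can be checked as follows (a sketch; the output values shown are illustrative only, not from a real cluster):

```
postgres=# SELECT * FROM bdr.wal_sender_stats();
   pid   | is_using_lcr | decoder_slot_name | lcr_file_name
---------+--------------+-------------------+---------------
 1153090 | f            |                   |
(1 row)
```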

Additionally, information about the Decoding Worker can be monitored via the function
[bdr.get_decoding_worker_stat](functions#bdr_get_decoding_worker_stat), e.g.:

```
postgres=# SELECT * FROM bdr.get_decoding_worker_stat();
pid | decoded_upto_lsn | waiting | waiting_for_lsn
---------+------------------+---------+-----------------
1153091 | 0/1E5EEE8 | t | 0/1E5EF00
(1 row)
```

## Monitoring BDR Replication Workers

All BDR workers show up in the system view `bdr.stat_activity`,
@@ -701,30 +712,19 @@ Peer replication slots should be active on all nodes at all times.
If a peer replication slot is not active, then it might mean:

- The corresponding peer is shutdown or not accessible; or
- BDR replication is broken. Grep the log file for `ERROR` or
`FATAL` and also check `bdr.worker_errors` on all nodes.
The root cause might be, for example, an incompatible DDL was
executed with DDL replication disabled on one of the nodes.
- BDR replication is broken.

The BDR group replication slot, on the other hand, is inactive most
of the time. BDR keeps this slot and advances LSN, as all other peers
have already consumed the corresponding transactions. So it is not
possible to monitor the status (active or inactive) of the group slot.
Grep the log file for `ERROR` or `FATAL` and also check `bdr.worker_errors` on
all nodes. The root cause might be, for example, that an incompatible DDL
statement was executed with DDL replication disabled on one of the nodes.

We recommend the following monitoring alert levels:
The BDR group replication slot is, however, inactive most of the time. BDR
maintains this slot and advances its LSN when all other peers have already
consumed the corresponding transactions. Consequently, it is not necessary to
monitor the status of the group slot.

- status=UNKNOWN, message=This node is not part of any BDR group
- status=OK, message=All BDR replication slots are working correctly
- status=CRITICAL, message=There is at least 1 BDR replication
slot which is inactive
- status=CRITICAL, message=There is at least 1 BDR replication
slot which is missing

The described behavior is implemented in the function
`bdr.monitor_local_replslots()`, which uses replication slot status
information returned from view `bdr.node_slots` (slot active or
inactive) to provide a local check considering all BDR node replication
slots, except the BDR group slot.
The function `bdr.monitor_local_replslots()` provides a summary of whether all
BDR node replication slots are working as expected, e.g.:

```sql
bdrdb=# SELECT * FROM bdr.monitor_local_replslots();
 status |                     message
--------+--------------------------------------------------
 OK     | All BDR replication slots are working correctly
```

One of the following status summaries will be returned:

- `UNKNOWN`: `This node is not part of any BDR group`
- `OK`: `All BDR replication slots are working correctly`
- `OK`: `This node is part of a subscriber-only group`
- `CRITICAL`: `There is at least 1 BDR replication slot which is inactive`
- `CRITICAL`: `There is at least 1 BDR replication slot which is missing`

## Monitoring Transaction COMMITs

By default, BDR transactions commit only on the local node. In that case,
61 changes: 10 additions & 51 deletions product_docs/docs/bdr/3.7/nodes.mdx
@@ -47,12 +47,6 @@ The `bdr_init_physical` utility replaces the functionality of the
`bdr_init_copy` utility from BDR1 and BDR2. It is the BDR3 equivalent of the
pglogical `pglogical_create_subscriber` utility.

!!! Warning
Only one node at the time should join the BDR node group, or be
parted from it. If a new node is being joined while there is
another join or part operation in progress, the new node will
sometimes not have consistent data after the join has finished.

When a new BDR node is joined to an existing BDR group or a node is subscribed
to an upstream peer, before replication can begin, the system must copy the
existing data from the peer node(s) to the local node. This copy must be
@@ -87,10 +81,12 @@ performs data sync doing `COPY` operations and will use multiple writers
(parallel apply) if those are enabled.

Node join can execute concurrently with other node joins for the majority of
the time taken to join. Only one regular node at a time can be in either of
the states PROMOTE or PROMOTING, which are typically fairly short.
The subscriber-only nodes are an exception to this rule, and they can be
cocurrently in PROMOTE and PROMOTING states as well.
the time taken to join. However, only one regular node at a time can be in
either of the states PROMOTE or PROMOTING; these states are typically fairly
short if all other nodes are up and running, but otherwise the join will be
serialized at this stage. The subscriber-only nodes are an exception to this
rule: they can be concurrently in PROMOTE and PROMOTING states as well, so
their join process is fully concurrent.

Note that the join process uses only one node as the source, so it can be
executed even when some nodes are down, as long as a majority of nodes are available.
@@ -871,43 +867,6 @@ as `STANDBY`.

Only one node at a time can be in either of the states PROMOTE or PROMOTING.

## Managing Shard Groups

BDR clusters may contain an array of Shard Groups for the AutoScale feature.
These are shown as a sub-node group that is composed of an array of
sub-sub node groups known as Shard Groups.

Operations that can be performed on the Shard Group are:

- Create Shard Array
- Drop Shard Array
- Repair - add new nodes to replace failed nodes
- Expand - add new Shard Groups
- Re-Balance - re-distribute data across Shard Groups

### Create/Drop

### Expand

e.g. expand from 4 Shard Groups to 8 Shard Groups

This operation can occur without interfering with user operations.

### Re-Balance

e.g. move data from where it was in a 4-node array to how it would be ideally
placed in an 8-node array.

Some portion of the data is moved from one Shard Group to another,
so this action can take an extended period, depending upon how
much data is to be moved. The data is moved one partition at a
time, so is restartable without too much wasted effort.

Note that re-balancing is optional.

This operation can occur without interfering with user operations,
even when this includes write transactions.

## Node Management Interfaces

Nodes can be added and removed dynamically using the SQL interfaces.
@@ -995,7 +954,7 @@ This function creates a BDR group with the local node as the only member of the

```sql
bdr.create_node_group(node_group_name text,
parent_group_name text,
parent_group_name text DEFAULT NULL,
join_node_group boolean DEFAULT true,
node_group_type text DEFAULT NULL)
```
@@ -1017,9 +976,8 @@ bdr.create_node_group(node_group_name text,
changes to other nodes. See [Subscriber-Only Nodes] for more details.
Datanode implies that the group represents a shard, whereas the other
values imply that the group represents respective coordinators.
Except 'subscriber-only', the rest three values are reserved for use
with a separate extension called autoscale. NULL implies a normal
general purpose node group will be created.
Apart from 'subscriber-only', the other three values are reserved for future use.
NULL implies that a normal, general-purpose node group will be created.
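
A brief usage sketch based on the signature above (group names are hypothetical):

```sql
-- Create a top-level, general-purpose BDR group on the local node
SELECT bdr.create_node_group('mygroup');

-- Create a subscriber-only subgroup underneath it
SELECT bdr.create_node_group(
    node_group_name   := 'so_group',
    parent_group_name := 'mygroup',
    join_node_group   := true,
    node_group_type   := 'subscriber-only');
```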

#### Notes

@@ -1425,6 +1383,7 @@ bdr_init_physical [OPTION] ...

- `--hba-conf` - path to the new pg_hba.conf
- `--postgresql-conf` - path to the new postgresql.conf
- `--postgresql-auto-conf` - path to the new postgresql.auto.conf
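
A hedged sketch of an invocation combining these options (the `-D`, `--node-name`, and DSN options are assumptions based on typical usage; consult the full option list for your version):

```
# Hypothetical invocation; paths, node name, and DSNs are placeholders.
bdr_init_physical -D /var/lib/pgsql/bdr_data \
    --node-name node2 \
    --remote-dsn 'host=node1 dbname=bdrdb' \
    --local-dsn 'host=node2 dbname=bdrdb' \
    --postgresql-conf /etc/bdr/postgresql.conf \
    --postgresql-auto-conf /etc/bdr/postgresql.auto.conf \
    --hba-conf /etc/bdr/pg_hba.conf
```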

#### Notes

