Merge pull request #2103 from EnterpriseDB/release/2021/12-06
Release: 2021-12-06
drothery-edb authored Dec 6, 2021
2 parents 9e4d60b + e9270f3 commit 0aacfc7
Showing 55 changed files with 244 additions and 237 deletions.
11 changes: 2 additions & 9 deletions product_docs/docs/bdr/3.7/backup.mdx
@@ -236,15 +236,8 @@ of a single BDR node, optionally plus WAL archives:

The cleaning of leftover BDR metadata is achieved as follows:

1. Drop the `bdr` extension with `CASCADE`.
2. Drop all the replication origins previously created by BDR.
3. Drop any replication slots left over from BDR.
4. Fully stop and re-start PostgreSQL (important!).
5. Create the `bdr` extension.

The `DROP EXTENSION`/`CREATE EXTENSION` cycle guarantees that all the
BDR metadata from the previous cluster is removed, and that the node
can be used to grow a new BDR cluster from scratch.
1. Drop the BDR node using `bdr.drop_node` (see the sketch below).
2. Fully stop and restart PostgreSQL (important!).
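
A minimal sketch of this cleanup, assuming the local node is named `node1` (the node name and the `cascade` parameter are illustrative assumptions; check the `bdr.drop_node` signature for your exact version):

```sql
-- Drop the local BDR node and its metadata (node name is hypothetical)
SELECT bdr.drop_node(node_name := 'node1', cascade := true);
-- Then fully stop and restart PostgreSQL, e.g. from the shell:
--   pg_ctl -D /path/to/datadir restart -m fast
```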

#### Cleanup of Replication Origins

4 changes: 2 additions & 2 deletions product_docs/docs/bdr/3.7/catalogs.mdx
@@ -936,7 +936,7 @@ only one node and processed on different nodes.
| Column | Type | Description |
| ------------------ | ------ | ------------------------------------------------------------------------------------------------------------------------ |
| ap_wq_workid | bigint | The Unique ID of the work item |
| ap_wq_ruleid | int | ID of the rule listed in autopartition_rules. Rules are specified using bdr.autoscale/autopartition commands |
| ap_wq_ruleid       | int    | ID of the rule listed in autopartition_rules. Rules are specified using the bdr.autopartition command                     |
| ap_wq_relname | name | Name of the relation being autopartitioned |
| ap_wq_relnamespace | name   | Name of the namespace (schema) specified in the rule for this work item.                                                  |
| ap_wq_partname | name | Name of the partition created by the workitem |
@@ -970,7 +970,7 @@ items, independent of other nodes in the cluster.
| Column | Type | Description |
| ------------------ | ------ | ------------------------------------------------------------------------------------------------------------------------ |
| ap_wq_workid | bigint | The Unique ID of the work item |
| ap_wq_ruleid | int | ID of the rule listed in autopartition_rules. Rules are specified using bdr.autoscale/autopartition commands |
| ap_wq_ruleid       | int    | ID of the rule listed in autopartition_rules. Rules are specified using the bdr.autopartition command                     |
| ap_wq_relname | name | Name of the relation being autopartitioned |
| ap_wq_relnamespace | name   | Name of the namespace (schema) specified in the rule for this work item.                                                  |
| ap_wq_partname | name | Name of the partition created by the workitem |
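
For example, pending work items can be inspected with a simple query (a sketch assuming this catalog is exposed as `bdr.autopartition_work_queue`; the relation name is an assumption, only the columns above are confirmed):

```sql
-- Relation name is an assumption; the columns match the table above.
SELECT ap_wq_workid, ap_wq_ruleid, ap_wq_relname, ap_wq_partname
FROM bdr.autopartition_work_queue
ORDER BY ap_wq_workid;
```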
4 changes: 2 additions & 2 deletions product_docs/docs/bdr/3.7/ddl.mdx
@@ -996,8 +996,8 @@ nodes should have applied the `ALTER TABLE .. ADD CONSTRAINT ... NOT VALID`
command and made enough progress. BDR will wait for a consistent
state to be reached before validating the constraint.

Note that the new facility requires the cluster to run with RAFT protocol
version 24 and beyond. If the RAFT protocol is not yet upgraded, the old
Note that the new facility requires the cluster to run with Raft protocol
version 24 and beyond. If the Raft protocol is not yet upgraded, the old
mechanism will be used, resulting in a DML lock request.
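
As a sketch of the two-phase pattern this refers to (table and constraint names are hypothetical):

```sql
-- Phase 1: add the constraint without validating existing rows.
ALTER TABLE orders
    ADD CONSTRAINT orders_qty_positive CHECK (qty > 0) NOT VALID;

-- Phase 2: validate existing rows. With Raft protocol version 24 and
-- beyond, BDR waits for a consistent state instead of requesting a DML lock.
ALTER TABLE orders VALIDATE CONSTRAINT orders_qty_positive;
```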

!!! Note
21 changes: 17 additions & 4 deletions product_docs/docs/bdr/3.7/durability.mdx
@@ -20,11 +20,24 @@ can all be implemented individually:
eventually be applied on all nodes without further conflicts, or get
an abort directly informing the client of an error.

PGLogical (PGL) integrates with the `synchronous_commit` option of
BDR integrates with the `synchronous_commit` option of
Postgres itself, providing a variant of synchronous replication,
which can be used between BDR nodes. In addition, BDR offers
[Eager All-Node Replication](eager) and
[Commit At Most Once](camo).
which can be used between BDR nodes. BDR also offers two additional
replication modes:

- Commit At Most Once (CAMO). This feature solves the problem of knowing
  whether your transaction has committed (and replicated) or not in case of
  certain errors during COMMIT. Normally, it might be hard to know whether
  or not the COMMIT was processed. With this feature, your application can
  find out what happened, even if your new database connection is to a different
  node than your previous connection. For more information about this feature,
  see the [Commit At Most Once](camo) chapter.
- Eager Replication. This is an optional feature to avoid replication
  conflicts. Every transaction is applied on *all nodes* simultaneously,
  and commits only if no replication conflicts are detected. This feature does
  reduce performance, but provides very strong consistency guarantees.
  For more information about this feature, see the
  [Eager All-Node Replication](eager) chapter.
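
As a minimal sketch of the `synchronous_commit` integration (the standby name `node2` is a hypothetical BDR peer; the value to use depends on your cluster configuration):

```sql
-- In postgresql.conf (hypothetical peer name):
--   synchronous_standby_names = 'node2'
-- Then request synchronous behavior for a single transaction:
BEGIN;
SET LOCAL synchronous_commit = 'remote_write';
-- ... DML here; COMMIT waits for the configured peer to acknowledge
COMMIT;
```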

Postgres itself provides [Physical Streaming
Replication](https://www.postgresql.org/docs/11/warm-standby.html#SYNCHRONOUS-REPLICATION)
22 changes: 0 additions & 22 deletions product_docs/docs/bdr/3.7/functions.mdx
@@ -44,28 +44,6 @@ value:
```
MAJOR_VERSION * 10000 + MINOR_VERSION * 100 + PATCH_RELEASE
```
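
For example, version 3.7.16 would be reported as `3*10000 + 7*100 + 16 = 30716`.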

### bdr.wal_sender_stats

If the [Decoding Worker](nodes#decoding-worker) is enabled, this
view shows information about the decoder slot and current LCR
(`Logical Change Record`) segment file being read by each WAL sender.

#### Synopsis

```sql
bdr.wal_sender_stats() → setof record (pid integer, is_using_lcr boolean, decoder_slot_name TEXT, lcr_file_name TEXT)
```

#### Output columns

- `pid` - PID of the WAL sender (corresponds to `pg_stat_replication`'s `pid` column)

- `is_using_lcr` - Whether the WAL sender is sending LCR files. The next columns will be `NULL` if `is_using_lcr` is `FALSE`.

- `decoder_slot_name` - The name of the decoder replication slot.

- `lcr_file_name` - The name of the current LCR file.

## System and Progress Information Parameters

BDR exposes some parameters that can be queried via `SHOW` in `psql`
2 changes: 2 additions & 0 deletions product_docs/docs/bdr/3.7/index.mdx
@@ -114,3 +114,5 @@ Some features are only available on particular versions of Postgres server.

Features that are currently available only with EDB Postgres Extended are
expected to be available with EDB Postgres Advanced 14.

This documentation is for the Enterprise Edition of BDR3.
7 changes: 3 additions & 4 deletions product_docs/docs/bdr/3.7/known-issues.mdx
@@ -55,11 +55,10 @@ unique identifier.

- Decoding Worker works only with the default replication sets

- When Decoding Worker is enabled in a BDR node group and a BDR node is shut down
  in fast mode immediately after starting it, the shutdown may not complete
  because the WAL sender does not exit. This happens because the WAL sender waits
  for the WAL decoder to start, and the WAL decoder may never start since the
  node is shutting down. The situation can be worked around by using an immediate
  shutdown or waiting for the WAL decoder to start. The WAL decoder process is
reported in `pglogical.workers` as well as `pg_stat_activity` catalogs.
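
For reference, a hedged sketch of how the WAL decoder might be spotted at the SQL level (the name patterns are assumptions; the exact worker naming depends on the BDR/pglogical version):

```sql
-- Both pglogical.workers and pg_stat_activity report the process;
-- the LIKE patterns below are illustrative only.
SELECT pid, application_name, backend_type
FROM pg_stat_activity
WHERE backend_type LIKE '%pglogical%'
   OR application_name LIKE '%decoder%';
```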
50 changes: 29 additions & 21 deletions product_docs/docs/bdr/3.7/monitoring.mdx
@@ -293,6 +293,17 @@ If `is_using_lcr` is `FALSE`, `decoder_slot_name`/`lcr_file_name` will be `NULL`.
This will be the case if the Decoding Worker is not enabled, or the WAL sender is
serving a [logical standby](nodes#logical-standby-nodes).
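
For example, the WAL senders can be checked as follows (a sketch; the output values shown are illustrative only, not from a real cluster):

```
postgres=# SELECT * FROM bdr.wal_sender_stats();
   pid   | is_using_lcr | decoder_slot_name | lcr_file_name
---------+--------------+-------------------+---------------
 1153090 | f            |                   |
(1 row)
```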

Additionally, information about the Decoding Worker can be monitored via the function
[bdr.get_decoding_worker_stat](functions#bdr_get_decoding_worker_stat), e.g.:

```
postgres=# SELECT * FROM bdr.get_decoding_worker_stat();
pid | decoded_upto_lsn | waiting | waiting_for_lsn
---------+------------------+---------+-----------------
1153091 | 0/1E5EEE8 | t | 0/1E5EF00
(1 row)
```

## Monitoring BDR Replication Workers

All BDR workers show up in the system view `bdr.stat_activity`,
@@ -701,30 +712,19 @@ Peer replication slots should be active on all nodes at all times.
If a peer replication slot is not active, then it might mean:

- The corresponding peer is shutdown or not accessible; or
- BDR replication is broken. Grep the log file for `ERROR` or
`FATAL` and also check `bdr.worker_errors` on all nodes.
The root cause might be, for example, an incompatible DDL was
executed with DDL replication disabled on one of the nodes.
- BDR replication is broken.

The BDR group replication slot, on the other hand, is inactive most
of the time. BDR keeps this slot and advances LSN, as all other peers
have already consumed the corresponding transactions. So it is not
possible to monitor the status (active or inactive) of the group slot.
Grep the log file for `ERROR` or `FATAL` and also check `bdr.worker_errors` on
all nodes. The root cause might be, for example, that an incompatible DDL
statement was executed with DDL replication disabled on one of the nodes.

We recommend the following monitoring alert levels:
The BDR group replication slot is, however, inactive most of the time. BDR
maintains this slot and advances its LSN when all other peers have already
consumed the corresponding transactions. Consequently, it is not necessary to
monitor the status of the group slot.

- status=UNKNOWN, message=This node is not part of any BDR group
- status=OK, message=All BDR replication slots are working correctly
- status=CRITICAL, message=There is at least 1 BDR replication
slot which is inactive
- status=CRITICAL, message=There is at least 1 BDR replication
slot which is missing

The described behavior is implemented in the function
`bdr.monitor_local_replslots()`, which uses replication slot status
information returned from view `bdr.node_slots` (slot active or
inactive) to provide a local check considering all BDR node replication
slots, except the BDR group slot.
The function `bdr.monitor_local_replslots()` provides a summary of whether all
BDR node replication slots are working as expected, e.g.:

```sql
bdrdb=# SELECT * FROM bdr.monitor_local_replslots();
 status |                     message
--------+--------------------------------------------------
 OK     | All BDR replication slots are working correctly
```

One of the following status summaries will be returned:

- `UNKNOWN`: `This node is not part of any BDR group`
- `OK`: `All BDR replication slots are working correctly`
- `OK`: `This node is part of a subscriber-only group`
- `CRITICAL`: `There is at least 1 BDR replication slot which is inactive`
- `CRITICAL`: `There is at least 1 BDR replication slot which is missing`

## Monitoring Transaction COMMITs

By default, BDR transactions commit only on the local node. In that case,
61 changes: 10 additions & 51 deletions product_docs/docs/bdr/3.7/nodes.mdx
@@ -47,12 +47,6 @@ The `bdr_init_physical` utility replaces the functionality of the
`bdr_init_copy` utility from BDR1 and BDR2. It is the BDR3 equivalent of the
pglogical `pglogical_create_subscriber` utility.

!!! Warning
Only one node at the time should join the BDR node group, or be
parted from it. If a new node is being joined while there is
another join or part operation in progress, the new node will
sometimes not have consistent data after the join has finished.

When a new BDR node is joined to an existing BDR group or a node is subscribed
to an upstream peer, before replication can begin, the system must copy the
existing data from the peer node(s) to the local node. This copy must be
@@ -87,10 +81,12 @@ performs data sync doing `COPY` operations and will use multiple writers
(parallel apply) if those are enabled.

Node join can execute concurrently with other node joins for the majority of
the time taken to join. Only one regular node at a time can be in either of
the states PROMOTE or PROMOTING, which are typically fairly short.
The subscriber-only nodes are an exception to this rule, and they can be
cocurrently in PROMOTE and PROMOTING states as well.
the time taken to join. However, only one regular node at a time can be in
either of the states PROMOTE or PROMOTING; these states are typically fairly
short if all other nodes are up and running, but otherwise the join will be
serialized at this stage. The subscriber-only nodes are an exception to this
rule: they can be concurrently in PROMOTE and PROMOTING states as well, so
their join process is fully concurrent.

Note that the join process uses only one node as the source, so it can be
executed even when some nodes are down, as long as a majority of nodes are available.
@@ -871,43 +867,6 @@ as `STANDBY`.

Only one node at a time can be in either of the states PROMOTE or PROMOTING.

## Managing Shard Groups

BDR clusters may contain an array of Shard Groups for the AutoScale feature.
These are shown as a sub-node group that is composed of an array of
sub-sub node groups known as Shard Groups.

Operations that can be performed on the Shard Group are:

- Create Shard Array
- Drop Shard Array
- Repair - add new nodes to replace failed nodes
- Expand - add new Shard Groups
- Re-Balance - re-distribute data across Shard Groups

### Create/Drop

### Expand

e.g. expand from 4 Shard Groups to 8 Shard Groups

This operation can occur without interfering with user operations.

### Re-Balance

e.g. move data from where it was in a 4-node array to how it would be ideally
placed in an 8-node array.

Some portion of the data is moved from one Shard Group to another,
so this action can take an extended period, depending upon how
much data is to be moved. The data is moved one partition at a
time, so is restartable without too much wasted effort.

Note that re-balancing is optional.

This operation can occur without interfering with user operations,
even when this includes write transactions.

## Node Management Interfaces

Nodes can be added and removed dynamically using the SQL interfaces.
@@ -995,7 +954,7 @@ This function creates a BDR group with the local node as the only member of the

```sql
bdr.create_node_group(node_group_name text,
parent_group_name text,
parent_group_name text DEFAULT NULL,
join_node_group boolean DEFAULT true,
node_group_type text DEFAULT NULL)
```
@@ -1017,9 +976,8 @@ bdr.create_node_group(node_group_name text,
changes to other nodes. See [Subscriber-Only Nodes] for more details.
Datanode implies that the group represents a shard, whereas the other
values imply that the group represents respective coordinators.
Except 'subscriber-only', the rest three values are reserved for use
with a separate extension called autoscale. NULL implies a normal
general purpose node group will be created.
Apart from 'subscriber-only', the other three values are reserved for future use.
NULL implies that a normal, general-purpose node group will be created.
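
A brief usage sketch based on the signature above (group names are hypothetical):

```sql
-- Create a top-level, general-purpose BDR group on the local node
SELECT bdr.create_node_group('mygroup');

-- Create a subscriber-only subgroup underneath it
SELECT bdr.create_node_group(
    node_group_name   := 'so_group',
    parent_group_name := 'mygroup',
    join_node_group   := true,
    node_group_type   := 'subscriber-only');
```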

#### Notes

@@ -1425,6 +1383,7 @@ bdr_init_physical [OPTION] ...

- `--hba-conf` - path to the new pg_hba.conf
- `--postgresql-conf` - path to the new postgresql.conf
- `--postgresql-auto-conf` - path to the new postgresql.auto.conf
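
A hedged sketch of an invocation combining these options (the `-D`, `--node-name`, and DSN options are assumptions based on typical usage; consult the full option list for your version):

```
# Hypothetical invocation; paths, node name, and DSNs are placeholders.
bdr_init_physical -D /var/lib/pgsql/bdr_data \
    --node-name node2 \
    --remote-dsn 'host=node1 dbname=bdrdb' \
    --local-dsn 'host=node2 dbname=bdrdb' \
    --postgresql-conf /etc/bdr/postgresql.conf \
    --postgresql-auto-conf /etc/bdr/postgresql.auto.conf \
    --hba-conf /etc/bdr/pg_hba.conf
```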

#### Notes

