First Split
Signed-off-by: Dj Walker-Morgan <[email protected]>
djw-m committed Sep 19, 2023
1 parent 76f48ca commit 9f99620
Showing 6 changed files with 239 additions and 174 deletions.
22 changes: 22 additions & 0 deletions product_docs/docs/pgd/5/durability/administering.mdx
@@ -0,0 +1,22 @@
---
title: Administering
navTitle: Administering
---

When running a PGD cluster with Group Commit, there are some things you need to
be aware of when administering the system, such as how to safely shut down and
restart nodes.

## Planned shutdown and restarts

When using Group Commit with receive confirmations, take care
with planned shutdown or restart. By default, the apply queue is consumed
prior to shutting down. However, in the `immediate` shutdown mode, the queue
is discarded at shutdown, leading to the stopped node "forgetting"
transactions in the queue. A concurrent failure of the origin node can
lead to loss of data, as if both nodes failed.

To ensure the apply queue gets flushed to disk, use either
`smart` or `fast` shutdown for maintenance tasks. This approach maintains the
required synchronization level and prevents loss of data.

54 changes: 54 additions & 0 deletions product_docs/docs/pgd/5/durability/comparing.mdx
@@ -0,0 +1,54 @@
---
title: Comparing Durability Options
navTitle: Comparing
---

## Comparison

Most options for synchronous replication available to
PGD allow for different levels of synchronization, offering different
tradeoffs between performance and protection against node or network
outages.

The following table summarizes what a client can expect from a peer
node that a transaction is replicated to, once the client has received a
COMMIT confirmation from the origin node it issued the transaction to.
For commit scopes, the Mode column refers to the confirmation requirements
of the [commit scope configuration](commit-scopes#configuration).

| Variant | Mode | Received | Visible | Durable |
|---------------------|-----------------------|----------|---------|---------|
| PGD Async | off (default) | no | no | no |
| PGD Lag Control | 'ON received' nodes | no | no | no |
| PGD Lag Control | 'ON replicated' nodes | no | no | no |
| PGD Lag Control | 'ON durable' nodes | no | no | no |
| PGD Lag Control | 'ON visible' nodes | no | no | no |
| PGD Group Commit | 'ON received' nodes | yes | no | no |
| PGD Group Commit | 'ON replicated' nodes | yes | no | no |
| PGD Group Commit | 'ON durable' nodes | yes | no | yes |
| PGD Group Commit | 'ON visible' nodes | yes | yes | yes |
| PGD CAMO | 'ON received' nodes | yes | no | no |
| PGD CAMO | 'ON replicated' nodes | yes | no | no |
| PGD CAMO | 'ON durable' nodes | yes | no | yes |
| PGD CAMO | 'ON visible' nodes | yes | yes | yes |

Reception ensures the peer operating normally can
eventually apply the transaction without requiring any further
communication, even in the face of a full or partial network
outage. A crash of a peer node might still require retransmission of
the transaction, as this confirmation doesn't involve persistent
storage. All modes considered synchronous provide this protection.

Visibility implies the transaction was applied remotely. All other
clients see the results of the transaction on all nodes, providing
this guarantee immediately after the commit is confirmed by the origin
node. Without visibility, other connected clients might not see the
results of the transaction and can experience stale reads.

Durability relates to the peer node's storage and provides protection
against loss of data after a crash and recovery of the peer node.
This can either relate to the reception of the data (as with physical
streaming replication) or to visibility (as with Group Commit).
The former eliminates the need for retransmissions after
a crash, while the latter ensures visibility is maintained across
restarts.
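
As a hedged illustration of how the Mode column maps to a commit scope, the
following sketch defines a scope that asks for durable confirmation. It reuses
`bdr.add_commit_scope` and the group name from the configuration example. The
scope name `durable_scope` is hypothetical, and the placement of the
`ON durable` clause between the node group and `GROUP COMMIT` is an assumption
to verify against the [commit scopes](commit-scopes) reference.

```sql
-- Sketch only: 'durable_scope' is a made-up name, and the position of the
-- ON durable clause in the rule is an assumption, not confirmed syntax.
SELECT bdr.add_commit_scope(
    commit_scope_name := 'durable_scope',
    origin_node_group := 'example_group',
    rule := 'ANY MAJORITY (example_group) ON durable GROUP COMMIT',
    wait_for_ready := true
);
```

According to the table, a Group Commit scope in this mode reports the
transaction as received and durable on the confirming peers, but not
necessarily visible.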
49 changes: 49 additions & 0 deletions product_docs/docs/pgd/5/durability/configuring.mdx
@@ -0,0 +1,49 @@
---
title: Configuring Commit Scopes
navTitle: Configuring
---

## Configuration

You configure commit scopes using an SQL function just like other administration
operations in PGD.

For example, you might define a basic commit scope that does Group Commit on a majority
of nodes in the example_group PGD group:

```sql
SELECT bdr.add_commit_scope(
commit_scope_name := 'example_scope',
origin_node_group := 'example_group',
rule := 'ANY MAJORITY (example_group) GROUP COMMIT',
wait_for_ready := true
);
```

You can then use the commit scope by setting the configuration variable (GUC)
`bdr.commit_scope` to that commit scope, either per transaction or globally:

```sql
BEGIN;
SET LOCAL bdr.commit_scope = 'example_scope';
...
COMMIT;
```
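
As a minimal sketch of a wider setting than a single transaction, the standard
Postgres `SET`, `SHOW`, and `RESET` commands can apply the same variable to a
whole session. This assumes the connected role is allowed to change
`bdr.commit_scope` at session level; the scope name is the one created above.

```sql
-- Session level: applies to every later transaction in this session.
SET bdr.commit_scope = 'example_scope';

-- Check which commit scope the session is currently using.
SHOW bdr.commit_scope;

-- Drop the session-level override again.
RESET bdr.commit_scope;
```

Unlike `SET LOCAL`, a plain `SET` stays in effect after `COMMIT`, so `RESET`
is needed to return the session to the value it would otherwise have.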

You can also set the default commit scope for a given PGD group:

```sql
SELECT bdr.alter_node_group_option(
node_group_name := 'example_group',
config_key := 'default_commit_scope',
config_value := 'example_scope'
);
```

The `default_commit_scope` is checked in the group tree that the given origin
node belongs to from bottom to top. The `default_commit_scope` can't be set to
the special value `local`, which means no commit scope. To select `local`,
use the `bdr.commit_scope` configuration parameter instead.

For full details of the commit scope language with all the options described,
see [Commit scopes](commit-scopes).
1 change: 1 addition & 0 deletions product_docs/docs/pgd/5/durability/group-commit.mdx
@@ -2,6 +2,7 @@
title: Group Commit
redirects:
- /pgd/latest/bdr/group-commit/
deepToC: true
---

The goal of Group Commit is to protect against data loss
216 changes: 42 additions & 174 deletions product_docs/docs/pgd/5/durability/index.mdx
@@ -2,132 +2,57 @@
title: Durability and performance options

navigation:
- overview
- configuring
- comparing
- commit-scopes
- group-commit
- camo
- lag-control
- administering
- legacy-sync

redirects:
- /pgd/latest/bdr/durability/
- /pgd/latest/choosing_durability/
---

## Overview

EDB Postgres Distributed allows you to choose from several replication
configurations based on your durability, consistency, availability, and
performance needs using *commit scopes*.

In its basic configuration, EDB Postgres Distributed uses asynchronous
replication. However, commit scopes can change both the default and the
per-transaction behavior.

It's also possible to configure the legacy Postgres
synchronous replication using standard `synchronous_standby_names` in the same
way as the built-in physical or logical replication. However, commit scopes
provide much more flexibility and control over the replication behavior.

The different synchronization settings affect three properties of interest
to applications that are related but can all be implemented individually:

- Durability: Writing to multiple nodes increases crash resilience
and allows you to recover the data after a crash and restart.
- Visibility: With the commit confirmation to the client, the database
guarantees immediate visibility of the committed transaction on some
sets of nodes.
- Conflict handling: Conflicts can be handled optimistically
postcommit, with conflicts resolved when the transaction is replicated
based on commit timestamps. Or, they can be handled pessimistically
precommit. The client can rely on the transaction to eventually be
applied on all nodes without further conflicts or get an abort, directly
informing the client of an error.

Commit scopes allow two ways of controlling durability of the transaction:

- [Group Commit](group-commit). This option controls which and how many nodes
have to reach a consensus before the transaction is considered to be committable
and at what stage of replication it can be considered committed. This option also
allows you to control the visibility ordering of the transaction.
- [CAMO](camo). This option is a variant of Group Commit in which the client is part of the
consensus.
- [Lag Control](lag-control). This option controls how far behind nodes can
be in terms of replication before allowing commit to proceed.


!!! Note Legacy synchronization availability
For backward compatibility, PGD still supports configuring synchronous
replication with `synchronous_commit` and `synchronous_standby_names`. See
[Legacy synchronous replication](legacy-sync) for more on this option,
but consider using [Group Commit](group-commit) instead.
!!!

## Terms and definitions

PGD nodes take different roles during the replication of a transaction.
These are implicitly assigned per transaction and are unrelated even for
concurrent transactions.

* The *origin* is the node that receives the transaction from the
client or application. It's the node processing the transaction
first, initiating replication to other PGD nodes and responding back
to the client with a confirmation or an error.

* A *partner* node is a PGD node expected to confirm transactions
according to Group Commit requirements.

* A *commit group* is the group of all PGD nodes involved in the
commit, that is, the origin and all of its partner nodes, which can be
just a few or all peer nodes.

## Comparison

Most options for synchronous replication available to
PGD allow for different levels of synchronization, offering different
tradeoffs between performance and protection against node or network
outages.

The following table summarizes what a client can expect from a peer
node replicated to after receiving a COMMIT confirmation from
the origin node the transaction was issued to. For commit scopes the Mode
column refers to the confirmation requirements of the
[commit scope configuration](commit-scopes#configuration).

| Variant | Mode | Received | Visible | Durable |
|---------------------|-----------------------|----------|---------|---------|
| PGD Async | off (default) | no | no | no |
| PGD Lag Control | 'ON received' nodes | no | no | no |
| PGD Lag Control | 'ON replicated' nodes | no | no | no |
| PGD Lag Control | 'ON durable' nodes | no | no | no |
| PGD Lag Control | 'ON visible' nodes | no | no | no |
| PGD Group Commit | 'ON received' nodes | yes | no | no |
| PGD Group Commit | 'ON replicated' nodes | yes | no | no |
| PGD Group Commit | 'ON durable' nodes | yes | no | yes |
| PGD Group Commit | 'ON visible' nodes | yes | yes | yes |
| PGD CAMO | 'ON received' nodes | yes | no | no |
| PGD CAMO | 'ON replicated' nodes | yes | no | no |
| PGD CAMO | 'ON durable' nodes | yes | no | yes |
| PGD CAMO | 'ON visible' nodes | yes | yes | yes |

Reception ensures the peer operating normally can
eventually apply the transaction without requiring any further
communication, even in the face of a full or partial network
outage. A crash of a peer node might still require retransmission of
the transaction, as this confirmation doesn't involve persistent
storage. All modes considered synchronous provide this protection.

Visibility implies the transaction was applied remotely. All other
clients see the results of the transaction on all nodes, providing
this guarantee immediately after the commit is confirmed by the origin
node. Without visibility, other clients connected might not see the
results of the transaction and experience stale reads.

Durability relates to the peer node's storage and provides protection
against loss of data after a crash and recovery of the peer node.
This can either relate to the reception of the data (as with physical
streaming replication) or to visibility (as with Group Commit).
The former eliminates the need for retransmissions after
a crash, while the latter ensures visibility is maintained across
restarts.
EDB Postgres Distributed offers a range of synchronous modes to complement its
default asynchronous replication. These synchronous modes are configured through
commit scopes: rules that define how operations are handled and when the system
should consider a transaction committed.

The [overview](overview) introduces these concepts and some of the essential
terminology used when discussing synchronous commits.

[Comparing](comparing) compares how each option behaves.

[Configuring](configuring) shows how you can use the PGD SQL interface to set up
commit scopes which manage the various synchronous commit options.

[Commit Scopes](commit-scopes) is a more in-depth look at the syntax and structure
of commit scopes and how to define them for your needs.

[Group Commit](group-commit) focuses on the Group Commit option, where you can
define a transaction as done when a group of nodes agrees it's done.

[CAMO](camo) focuses on the Commit At Most Once option, which allows applications
to participate in confirming a transaction has been done and, in turn, ensure that
their commits happen at most once.

[Lag Control](lag-control) looks at the commit scope mechanism which regulates how
far out of sync nodes may go when a database node goes out of service.

[Administering](administering) addresses how to manage a PGD cluster that
uses Group Commit.

[Legacy Sync](legacy-sync) shows how traditional Postgres synchronous operations
can still be accessed under EDB Postgres Distributed.



<!---


## Internal timing of operations

@@ -170,62 +95,5 @@ The following table summarizes the differences.
| PGD Group Commit | apply first | before COMMIT on origin |
| PGD CAMO | apply first | before COMMIT on origin |

## Configuration

You configure commit scopes using an SQL function just like other administration
operations in PGD.

For example, you might define a basic commit scope that does Group Commit on a majority
of nodes in the example_group PGD group:

```sql
SELECT bdr.add_commit_scope(
commit_scope_name := 'example_scope',
origin_node_group := 'example_group',
rule := 'ANY MAJORITY (example_group) GROUP COMMIT',
wait_for_ready := true
);
```

You can then use the commit scope either by setting the configuration variable (GUC)
`bdr.commit_scope` either per transaction or globally to that commit scope:

```sql
BEGIN;
SET LOCAL bdr.commit_scope = 'example_scope';
...
COMMIT;
```

You can also set the default commit scope for a given PGD group:

```sql
SELECT bdr.alter_node_group_option(
node_group_name := 'example_group',
config_key := 'default_commit_scope',
config_value := 'example_scope'
);
```

The `default_commit_scope` is checked in the group tree that the given origin
node belongs to from bottom to top. The `default_commit_scope` can't be set to
the special value `local`, which means there's no way for the commit scope to use
the `bdr.commit_scope` configuration parameter.

For full details of the commit scope language with all the options described,
see [Commit scopes](commit-scopes).

## Planned shutdown and restarts

When using Group Commit with receive confirmations, take care
with planned shutdown or restart. By default, the apply queue is consumed
prior to shutting down. However, in the `immediate` shutdown mode, the queue
is discarded at shutdown, leading to the stopped node "forgetting"
transactions in the queue. A concurrent failure of the origin node can
lead to loss of data, as if both nodes failed.

To ensure the apply queue gets flushed to disk, use either
`smart` or `fast` shutdown for maintenance tasks. This approach maintains the
required synchronization level and prevents loss of data.


--->