From c688b866d0b30305cb0e5ff1cfddc8e234566c12 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman Date: Tue, 4 Jun 2024 13:06:46 -0400 Subject: [PATCH 1/2] First set of edits on PGD doc re-read --- product_docs/docs/pgd/5/appusage/behavior.mdx | 2 +- product_docs/docs/pgd/5/appusage/index.mdx | 2 +- .../docs/pgd/5/cli/discover_connections.mdx | 4 +-- product_docs/docs/pgd/5/ddl/ddl-locking.mdx | 32 +++++++++---------- 4 files changed, 18 insertions(+), 22 deletions(-) diff --git a/product_docs/docs/pgd/5/appusage/behavior.mdx b/product_docs/docs/pgd/5/appusage/behavior.mdx index fbefa0d093f..380aabc89d5 100644 --- a/product_docs/docs/pgd/5/appusage/behavior.mdx +++ b/product_docs/docs/pgd/5/appusage/behavior.mdx @@ -95,7 +95,7 @@ partitions are replicated downstream. By default, triggers execute only on the origin node. For example, an INSERT trigger executes on the origin node and is ignored when you apply the change on the target node. You can specify for triggers to execute on both the origin node -at execution time and on the target when it's replicated ("apply time") by using +at execution time and on the target when it's replicated (*apply time*) by using `ALTER TABLE ... ENABLE ALWAYS TRIGGER`. Or, use the `REPLICA` option to execute only at apply time: `ALTER TABLE ... ENABLE REPLICA TRIGGER`. diff --git a/product_docs/docs/pgd/5/appusage/index.mdx b/product_docs/docs/pgd/5/appusage/index.mdx index 1005e3e1676..3b805f76e82 100644 --- a/product_docs/docs/pgd/5/appusage/index.mdx +++ b/product_docs/docs/pgd/5/appusage/index.mdx @@ -34,4 +34,4 @@ Developing an application with PGD is mostly the same as working with any Postgr * [Table access methods](table-access-methods) (TAMs) notes the TAMs available with PGD and how to enable them. -* [Feature compatibility](feature-compatibility) shows which server features work with which commit scopes and which commit scopes can be daisychained together. 
\ No newline at end of file +* [Feature compatibility](feature-compatibility) shows which server features work with which commit scopes and which commit scopes can be daisy chained together. \ No newline at end of file diff --git a/product_docs/docs/pgd/5/cli/discover_connections.mdx b/product_docs/docs/pgd/5/cli/discover_connections.mdx index 183b53e02d9..776660607b0 100644 --- a/product_docs/docs/pgd/5/cli/discover_connections.mdx +++ b/product_docs/docs/pgd/5/cli/discover_connections.mdx @@ -84,7 +84,7 @@ As with TPA, EDB PGD for Kubernetes is very flexible, and there are multiple way Consult your configuration file to determine this information. -Establish a host name or IP address, port, database name, and username. The default database name is `bdrdb`, and the default username is enterprisedb for EDB Postgres Advanced Server and postgres for PostgreSQL and EDB Postgres Extended Server. +Establish a host name or IP address, port, database name, and username. The default database name is `bdrdb`. The default username is enterprisedb for EDB Postgres Advanced Server and postgres for PostgreSQL and EDB Postgres Extended Server. You can then assemble a connection string based on that information: @@ -93,5 +93,3 @@ You can then assemble a connection string based on that information: ``` If the deployment's configuration requires it, add `sslmode=`. - - diff --git a/product_docs/docs/pgd/5/ddl/ddl-locking.mdx b/product_docs/docs/pgd/5/ddl/ddl-locking.mdx index c2c8723aa5c..821779465ca 100644 --- a/product_docs/docs/pgd/5/ddl/ddl-locking.mdx +++ b/product_docs/docs/pgd/5/ddl/ddl-locking.mdx @@ -3,13 +3,13 @@ title: DDL locking details navTitle: Locking --- -Two kinds of locks enforce correctness of replicated DDL with PGD; the global DDL lock and the global DML lock. +Two kinds of locks enforce correctness of replicated DDL with PGD: the global DDL lock and the global DML lock. 
### The global DDL lock -The first kind is known as a global DDL lock and is used only when `ddl_locking = 'all'`. -A global DDL lock prevents any other DDL from executing on the cluster while -each DDL statement runs. This ensures full correctness in the general case but +A global DDL lock is used only when `ddl_locking = 'all'`. +This kind of lock prevents any other DDL from executing on the cluster while +each DDL statement runs. This behavior ensures full correctness in the general case but is too strict for many simple cases. PGD acquires a global lock on DDL operations the first time in a transaction where schema changes are made. This effectively serializes the DDL-executing transactions in the cluster. In @@ -24,7 +24,7 @@ The lock request is sent by the regular replication stream, and the nodes respond by the replication stream as well. So it's important that nodes (or at least a majority of the nodes) run without much replication delay. Otherwise it might take a long time for the node to acquire the DDL -lock. Once the majority of nodes agrees, the DDL execution is carried out. +lock. Once the majority of nodes agree, the DDL execution is carried out. The ordering of DDL locking is decided using the Raft protocol. DDL statements executed on one node are executed in the same sequence on all other nodes. @@ -37,17 +37,17 @@ take a long time to acquire the lock. Hence it's preferable to run DDLs from a single node or the nodes that have nearly caught up with replication changes originating at other nodes. -A global DDL Lock has to be granted by a majority of data and witness nodes, where a majority is N/2+1 +A global DDL lock must be granted by a majority of data and witness nodes, where a majority is N/2+1 of the eligible nodes. Subscriber-only nodes aren't eligible to participate. ### The global DML lock -The second kind is known as a global DML lock or relation DML lock. 
This kind of lock is used when +Known as a global DML lock or relation DML lock, this kind of lock is used when either `ddl_locking = all` or `ddl_locking = dml`, and the DDL statement might cause in-flight DML statements to fail. These failures can occur when you add or modify a constraint, such as a unique constraint, check constraint, or NOT NULL constraint. -Relation DML locks affect only one relation at a time. Relation DML -locks ensure that no DDL executes while there are changes in the queue that +Relation DML locks affect only one relation at a time. These +locks ensure that no DDL executes while changes are in the queue that might cause replication to halt with an error. To acquire the global DML lock on a table, the PGD node executing the DDL @@ -63,19 +63,17 @@ normally doesn't take an EXCLUSIVE LOCK or higher. Waiting for pending DML operations to drain can take a long time and even longer if replication is currently lagging. -This means that schema changes affecting row representation and constraints, -unlike with data changes, can be performed only while all configured nodes +This means that, unlike with data changes, schema changes affecting row representation and constraints can be performed only while all configured nodes can be reached and are keeping up reasonably well with the current write rate. If such DDL commands must be performed while a node is down, first remove the down node from the configuration. - -**All** eligible data notes must agree to grant a global DML lock before the lock is granted. +All eligible data nodes must agree to grant a global DML lock before the lock is granted. Witness and subscriber-only nodes aren't eligible to participate. If a DDL statement isn't replicated, no global locks are acquired. 
-Locking behavior is specified by the [`bdr.ddl_locking`](/pgd/latest/reference/pgd-settings#bdrddl_locking) parameter, as +Specify locking behavior with the [`bdr.ddl_locking`](/pgd/latest/reference/pgd-settings#bdrddl_locking) parameter, as explained in [Executing DDL on PGD systems](ddl-overview#executing-ddl-on-pgd-systems): - `ddl_locking = all` takes global DDL lock and, if needed, takes relation DML lock. @@ -85,12 +83,12 @@ explained in [Executing DDL on PGD systems](ddl-overview#executing-ddl-on-pgd-sy Some PGD functions make DDL changes. For those functions, DDL locking behavior applies, which is noted in the documentation for each function. -Thus, `ddl_locking = dml` is only safe when you can guarantee that +Thus, `ddl_locking = dml` is safe only when you can guarantee that no conflicting DDL is executed from other nodes. With this setting, the statements that require only the global DDL lock don't use the global locking at all. -`ddl_locking = off` is only safe when you can guarantee that there are no +`ddl_locking = off` is safe only when you can guarantee that there are no conflicting DDL and no conflicting DML operations on the database objects DDL executes on. If you turn locking off and then experience difficulties, you might lose in-flight changes to data. The user application team needs to resolve any issues caused. @@ -98,7 +96,7 @@ you might lose in-flight changes to data. The user application team needs to res In some cases, concurrently executing DDL can properly be serialized. If these serialization failures occur, the DDL might reexecute. -DDL replication isn't active on logical standby nodes until they are promoted. +DDL replication isn't active on logical standby nodes until they're promoted. 
Some PGD management functions act like DDL, meaning that they attempt to take global locks, and their actions are replicated if DDL From c943d97292e1ddab38df0091f0d287e4c4667993 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman Date: Thu, 6 Jun 2024 17:08:15 -0400 Subject: [PATCH 2/2] Additional edits to first set of PGD edits --- .../pgd/5/ddl/ddl-managing-with-pgd-replication.mdx | 8 ++++---- product_docs/docs/pgd/5/ddl/ddl-overview.mdx | 12 +++++------- 2 files changed, 9 insertions(+), 11 deletions(-) diff --git a/product_docs/docs/pgd/5/ddl/ddl-managing-with-pgd-replication.mdx b/product_docs/docs/pgd/5/ddl/ddl-managing-with-pgd-replication.mdx index be3edb13c30..c1fe779c19e 100644 --- a/product_docs/docs/pgd/5/ddl/ddl-managing-with-pgd-replication.mdx +++ b/product_docs/docs/pgd/5/ddl/ddl-managing-with-pgd-replication.mdx @@ -8,8 +8,8 @@ navTitle: Managing with replication Minimizing the impact of DDL is good operational advice for any database. These points become even more important with PGD: -- To minimize the impact of DDL, make transactions performing DDL short, - don't combine them with lots of row changes, and avoid long-running +- To minimize the impact of DDL, make transactions performing DDL short. + Don't combine them with lots of row changes, and avoid long-running foreign key or other constraint rechecks. - For `ALTER TABLE`, use `ADD CONSTRAINT NOT VALID` followed by another @@ -37,7 +37,7 @@ INDEX CONCURRENTLY`, noting that DDL replication must be disabled for the whole session because `CREATE INDEX CONCURRENTLY` is a multi-transaction command. Avoid `CREATE INDEX` on production systems since it prevents writes while it executes. -`REINDEX` is replicated in versions up to 3.6 but not with PGD 3.7 or later. +`REINDEX` is replicated in versions 3.6 and earlier but not with PGD 3.7 or later. Avoid using `REINDEX` because of the AccessExclusiveLocks it holds. 
Instead, use `REINDEX CONCURRENTLY` (or `reindexdb --concurrently`), @@ -107,7 +107,7 @@ by turning [`bdr.ddl_replication`](/pgd/latest/reference/pgd-settings#bdrddl_rep PGD prevents some DDL statements from running when it's active on a database. This protects the consistency of the system by disallowing statements that can't be replicated correctly or for which replication -isn't yet supported. +isn't yet supported. If a statement isn't permitted under PGD, you can often find another way to do the same thing. For example, you can't do an `ALTER TABLE`, diff --git a/product_docs/docs/pgd/5/ddl/ddl-overview.mdx b/product_docs/docs/pgd/5/ddl/ddl-overview.mdx index c6737253aab..aec5291d0ed 100644 --- a/product_docs/docs/pgd/5/ddl/ddl-overview.mdx +++ b/product_docs/docs/pgd/5/ddl/ddl-overview.mdx @@ -32,7 +32,7 @@ comes to DDL replication. Treating it the same is the most common issue with PGD. The main difference from table replication is that DDL replication doesn't -replicate the result of the DDL but the statement itself. This works +replicate the result of the DDL. Instead, it replicates the statement. This works very well in most cases, although it introduces the requirement that the DDL must execute similarly on all nodes. A more subtle point is that the DDL must be immutable with respect to all datatype-specific parameter settings, @@ -59,10 +59,10 @@ PGD offers three levels of protection against those problems: `ddl_locking = 'all'` is the strictest option and is best when DDL might execute from any node concurrently and you want to ensure correctness. This is the default. -`ddl_locking = 'dml'` is an option that is only safe when you execute -DDL from one node at any time. You should only use this setting -if you can completely control where DDL is executed. Executing DDL from a single node -it ensures that there are no inter-node conflicts. 
Intra-node conflicts are already +`ddl_locking = 'dml'` is an option that is safe only when you execute +DDL from one node at any time. Use this setting only +if you can completely control where DDL is executed. Executing DDL from a single node +ensures that there are no inter-node conflicts. Intra-node conflicts are already handled by PostgreSQL. `ddl_locking = 'off'` is the least strict option and is dangerous in general use. @@ -71,9 +71,7 @@ it a useful option when creating a new and empty database schema. These options can be set only by the bdr_superuser, by the superuser, or in the `postgres.conf` configuration file. - When using the [`bdr.replicate_ddl_command`](/pgd/latest/reference/functions#bdrreplicate_ddl_command), you can set this parameter directly with the third argument, using the specified [`bdr.ddl_locking`](/pgd/latest/reference/pgd-settings#bdrddl_locking) setting only for the DDL commands passed to that function. -