Restructure the Consistency/Conflicts section in PGD docs #5804

Merged
merged 24 commits into from
Jul 9, 2024
Changes from 16 commits
50de6ad
some updates
jpe442 Jun 25, 2024
7f48e0f
Fixed descriptions.
jpe442 Jun 25, 2024
030a924
Add bit to reference for testing.
jpe442 Jun 25, 2024
b9a974a
First command and list to reference.
jpe442 Jun 25, 2024
ecce35e
Main structure in place for new conflict reference page.
jpe442 Jun 25, 2024
968b02a
Extracted old reference material from main Consistency pages so it is…
jpe442 Jun 26, 2024
546ec83
Fixed links.
jpe442 Jun 26, 2024
26e5e85
Small change.
jpe442 Jun 26, 2024
db5817d
Update product_docs/docs/pgd/5/reference/conflicts.mdx
jpe442 Jun 27, 2024
9814cfb
Update product_docs/docs/pgd/5/reference/conflicts.mdx
jpe442 Jun 27, 2024
d100212
Fix bad links
jpe442 Jun 27, 2024
273852d
Reference index.mdx changes.
jpe442 Jun 27, 2024
004b5a9
Integrated fixed links.
jpe442 Jul 2, 2024
3d5eefe
Added overview page and switched avoiding with types.
jpe442 Jul 2, 2024
4e2eb87
Added index content.
jpe442 Jul 2, 2024
b246ef5
Small typos
jpe442 Jul 2, 2024
532148e
Update product_docs/docs/pgd/5/consistency/conflicts/index.mdx
jpe442 Jul 3, 2024
908b87c
Split out functions in their own file and put all the lists in their …
jpe442 Jul 3, 2024
2be59ff
dropped deepToCs
jpe442 Jul 3, 2024
1c4d51f
Fixed broken links.
jpe442 Jul 3, 2024
b70318d
Merged avoiding conflicts section into overview.
jpe442 Jul 9, 2024
2242aae
Update product_docs/docs/pgd/5/consistency/conflicts/index.mdx
jpe442 Jul 9, 2024
5cb5f65
Update product_docs/docs/pgd/5/consistency/conflicts/00_conflicts_ove…
jpe442 Jul 9, 2024
89fcbaf
add whitespace for easier reading in code
jpe442 Jul 9, 2024
774 changes: 0 additions & 774 deletions product_docs/docs/pgd/5/consistency/conflicts.mdx

This file was deleted.

@@ -0,0 +1,34 @@
---
title: Overview
description: Conflicts section overview.
deepToC: true
---

EDB Postgres Distributed is an active/active or multi-master DBMS. If used asynchronously, writes to the same or related rows from multiple different nodes can result in data conflicts when using standard data types.

Conflicts aren't errors. In most cases, they are events that PGD can detect and resolve as they occur. Resolving them depends on the nature of the application and the meaning of the data, so it's important for PGD to provide the application with a range of choices for how to resolve conflicts.

By default, conflicts are resolved at the row level. When changes from two nodes conflict, PGD picks either the local or remote tuple and discards the other. For example, the commit timestamps might be compared for the two conflicting changes and the newer one kept. This approach ensures that all nodes converge to the same result and establishes commit-order-like semantics on the whole cluster.
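
The row-level choice described above can be illustrated with a minimal Python sketch. This is a simulation, not PGD code: the `RowVersion` type, the `resolve` function, and the use of the origin node name as a tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowVersion:
    value: str
    commit_ts: float  # commit timestamp of the change
    origin: str       # originating node; used here only as a tie-breaker (assumption)


def resolve(local: RowVersion, remote: RowVersion) -> RowVersion:
    """Keep the version with the newer commit timestamp and discard the other."""
    # Ordering by (timestamp, origin) gives every node the same total order,
    # so each node picks the same winner regardless of which side is "local".
    return max(local, remote, key=lambda v: (v.commit_ts, v.origin))


a = RowVersion("billing: 1 Main St", commit_ts=100.0, origin="node_a")
b = RowVersion("billing: 9 Elm Ave", commit_ts=105.0, origin="node_b")

# Each node sees its own change as local and the other's as remote.
on_node_a = resolve(local=a, remote=b)
on_node_b = resolve(local=b, remote=a)

print(on_node_a == on_node_b, on_node_a.value)
```

Because the comparison doesn't depend on which version is local, both simulated nodes keep the same row, which is the convergence property the paragraph describes.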

Conflict handling is configurable, as described in [Conflict resolution](04_conflict_resolution). PGD can detect conflicts and handle them differently for each table using conflict triggers, described in [Stream triggers](../../striggers).

Column-level conflict detection and resolution is available with PGD, as described in [CLCD](../column-level-conflicts).

By default, all conflicts are logged to `bdr.conflict_history`. If conflicts are possible, then table owners must monitor for them and analyze how to avoid them or make plans to handle them regularly as an application task. The [LiveCompare](/livecompare/latest) tool is also available to scan regularly for divergence.

Some clustering systems use distributed lock mechanisms to prevent concurrent access to data. These can perform reasonably when servers are very close to each other but can't support geographically distributed applications where very low latency is critical for acceptable performance.

Distributed locking is essentially a pessimistic approach. PGD advocates an optimistic approach, which is to avoid conflicts where possible but allow some types of conflicts to occur and resolve them when they arise.

## How conflicts happen

Inter-node conflicts arise as a result of sequences of events that can't happen if all the involved transactions happen concurrently on the same node. Because the nodes exchange changes only after the transactions commit, each transaction is individually valid on the node it committed on. It isn't valid if applied on another node that did other conflicting work at the same time.

Since PGD replication essentially replays the transaction on the other nodes, the replay operation can fail if there's a conflict between a transaction being applied and a transaction that was committed on the receiving node.
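
As a rough illustration of that sequence of events, the following Python sketch simulates two nodes that each accept an insert for the same key and only discover the clash when changes are replayed. The `Node` class, its `outbox` list, and `apply_from` are hypothetical names for this simulation and don't correspond to PGD internals.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.rows = {}    # primary key -> value; stands in for a UNIQUE index
        self.outbox = []  # locally committed inserts waiting to be replicated

    def insert(self, key, value):
        # Only a local uniqueness check is possible: the node can't see
        # what other nodes are committing at the same time.
        if key in self.rows:
            raise KeyError(f"duplicate key {key!r} on {self.name}")
        self.rows[key] = value
        self.outbox.append((key, value))

    def apply_from(self, other):
        """Replay another node's committed inserts and return detected conflicts."""
        conflicts = []
        for key, value in other.outbox:
            if key in self.rows and self.rows[key] != value:
                # Valid on the origin node, but it clashes with local work.
                conflicts.append((key, self.rows[key], value))
            else:
                self.rows[key] = value
        return conflicts


node_a, node_b = Node("node_a"), Node("node_b")
node_a.insert(1, "alice")  # succeeds locally on node_a
node_b.insert(1, "bob")    # also succeeds locally on node_b

conflicts_on_a = node_a.apply_from(node_b)
conflicts_on_b = node_b.apply_from(node_a)
print(conflicts_on_a)
print(conflicts_on_b)
```

Both inserts pass their local checks, and each node reports a conflict only at replay time. The sketch stops at detection and leaves resolution out.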

Most conflicts can't happen when all transactions run on a single node because Postgres has inter-transaction communication mechanisms to prevent them. Examples of these mechanisms are `UNIQUE` indexes, `SEQUENCE` operations, row and relation locking, and `SERIALIZABLE` dependency tracking. All of these mechanisms are ways to communicate between ongoing transactions to prevent undesirable concurrency issues.

PGD doesn't have a distributed transaction manager or lock manager. That's part of why it performs well with latency and network partitions. As a result, transactions on different nodes execute entirely independently from each other when using the default, which is lazy replication. Less independence between nodes can avoid conflicts altogether, which is why PGD also offers Eager Replication for when this is important.
@@ -0,0 +1,24 @@
---
title: Avoiding or tolerating conflicts
description: How to avoid database conflicts.
deepToC: true
---

In most cases, you can design the application to avoid or tolerate conflicts.

Conflicts can happen only if things are happening at the same time on multiple nodes. The simplest way to avoid conflicts is to only ever write to one node or to only ever write to a specific row in a specific way from one specific node at a time.

This avoidance happens naturally in many applications. For example, many consumer applications allow only the owning user to change data, such as changing the default billing address on an account. Such data changes seldom have update conflicts.

Suppose you make a change just before a node goes down, so the change seems to be lost. You might then make the same change again on another node, leading to two updates on different nodes. When the down node comes back up, it tries to send the older change to the other nodes. That change is rejected because the most recent update of the data is kept.

To prevent `INSERT`/`INSERT` conflicts, use [global sequences](../../sequences/#pgd-global-sequences).
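
The reason non-overlapping key generation removes this class of conflict can be sketched in Python. This is a generic step-and-offset scheme chosen for illustration. It isn't a description of how PGD global sequences are implemented, and `node_id_stream` is a hypothetical helper.

```python
import itertools


def node_id_stream(node_index: int, node_count: int):
    """Yield node_index, node_index + node_count, ... so streams never overlap."""
    return itertools.count(start=node_index, step=node_count)


# Three simulated nodes, each drawing four IDs independently.
streams = [node_id_stream(i, 3) for i in range(3)]
ids_per_node = [list(itertools.islice(s, 4)) for s in streams]
all_ids = [i for ids in ids_per_node for i in ids]

print(ids_per_node)
print(len(set(all_ids)) == len(all_ids))
```

Because each node draws from a disjoint set of values, two nodes can never insert rows with the same generated key, even without communicating.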

For applications that assign relationships between objects, such as a room-booking application, applying `update_if_newer` might not give an acceptable business outcome. That is, it isn't useful to confirm to two people separately that they have booked the same room. The simplest resolution is to use Eager Replication to ensure that only one booking succeeds. More complex ways might be possible depending on the application. For example, you can assign 100 seats to each node and allow those to be booked by a writer on that node. When a node has none left locally, which typically happens only after most seats are reserved, use a distributed locking scheme or Eager Replication.
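
The per-node seat allocation idea can be sketched as follows. This is a minimal Python simulation under stated assumptions: three nodes, 300 seats split evenly, and hypothetical names (`SeatAllocator`, `partition_seats`). The coordinated fallback path is only signaled, not implemented.

```python
class SeatAllocator:
    """Holds the block of seats that one node is allowed to book on its own."""

    def __init__(self, local_seats):
        self.available = list(local_seats)

    def book(self):
        """Return a seat from this node's block, or None when the block is exhausted."""
        if self.available:
            return self.available.pop(0)
        # The caller must fall back to a coordinated path here, such as
        # distributed locking or Eager Replication (not modeled in this sketch).
        return None


def partition_seats(total, nodes):
    """Split seats 1..total into equal, disjoint blocks, one per node."""
    per_node = total // len(nodes)
    return {
        name: SeatAllocator(range(i * per_node + 1, (i + 1) * per_node + 1))
        for i, name in enumerate(nodes)
    }


allocators = partition_seats(300, ["node_a", "node_b", "node_c"])

first_a = allocators["node_a"].book()  # from node_a's block
first_b = allocators["node_b"].book()  # from node_b's block; can't clash with node_a

# Drain the rest of node_a's 100 seats, then try once more.
for _ in range(99):
    allocators["node_a"].book()
exhausted = allocators["node_a"].book()

print(first_a, first_b, exhausted)
```

Bookings drawn from disjoint blocks can't conflict, so coordination is needed only once a node's block runs out.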

Another technique for ensuring certain types of updates occur only from one specific node is to route different types of transactions through different nodes. For example:

- Receiving parcels on one node but delivering parcels using another node
- A service application where orders are input on one node, work is prepared on a second node, and results are then served back to customers on a third

Frequently, the best course is to allow conflicts to occur and design the application to work with PGD's conflict resolution mechanisms to cope with the conflict.