Merge pull request #4885 from EnterpriseDB/content/biganimal/upm-24211
josh-heyer authored Nov 29, 2023
2 parents 10893b4 + a8837d1 commit 473eb41
Showing 11 changed files with 647 additions and 83 deletions.

This file was deleted.

@@ -0,0 +1,64 @@
---
title: "Distributed high availability"
---

Distributed high-availability clusters are powered by [EDB Postgres Distributed](/pgd/latest/). They use multi-master logical replication to deliver more advanced cluster management than a system based on physical replication. Distributed high-availability clusters let you deploy a cluster across multiple regions or in a single region. For use cases where high availability across regions is a major concern, a cluster deployment with distributed high availability enabled can provide two data groups with a witness group in a third region.

This configuration provides a true active-active solution as each data group is configured to accept writes.

Distributed high-availability clusters support both EDB Postgres Advanced Server and EDB Postgres Extended Server database distributions.

Distributed high-availability clusters contain one or two data groups. Your data groups can contain either three data nodes or two data nodes and one witness node. At any given time, one of these data nodes is the leader and accepts writes, while the rest are referred to as [shadow nodes](/pgd/latest/terminology/#write-leader). We recommend that you don't use two data nodes and one witness node in production unless you use asynchronous [commit scopes](/pgd/latest/durability/commit-scopes/).

[PGD Proxy](/pgd/latest/routing/proxy) routes all application traffic to the leader node, which acts as the principal write target to reduce the potential for data conflicts. PGD Proxy leverages a distributed consensus model to determine availability of the data nodes in the cluster. On failure or unavailability of the leader, PGD Proxy elects a new leader and redirects application traffic. Together with the core capabilities of EDB Postgres Distributed, this mechanism of routing application traffic to the leader node enables fast failover and switchover.

The witness node/witness group doesn't host data but exists for management purposes. It supports operations that require a consensus, for example, in case of an availability zone failure.
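
The value of a witness can be illustrated with a simple majority rule. The following Python sketch is a simplified model only, not the consensus implementation used by EDB Postgres Distributed. The function name `has_majority` and the voter counts are assumptions chosen for illustration.

```python
def has_majority(reachable_voters: int, total_voters: int) -> bool:
    """Return True when the reachable voters form a strict majority."""
    if total_voters <= 0:
        raise ValueError("total_voters must be positive")
    if not 0 <= reachable_voters <= total_voters:
        raise ValueError("reachable_voters must be between 0 and total_voters")
    return reachable_voters > total_voters // 2


# Hypothetical layout: two data nodes plus one witness node,
# each in its own availability zone (three voters in total).
zone_lost_with_witness = has_majority(reachable_voters=2, total_voters=3)

# Hypothetical layout without a witness: two voters, one of them lost.
zone_lost_without_witness = has_majority(reachable_voters=1, total_voters=2)

print(zone_lost_with_witness, zone_lost_without_witness)
```

In this model, losing one of three zones still leaves a majority, while losing one of two voters doesn't. That's the gap a witness fills without hosting any data.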

!!!Note
Operations against a distributed high-availability cluster leverage the [EDB Postgres Distributed switchover](/pgd/latest/cli/command_ref/pgd_switchover/) feature, which provides subsecond interruptions during planned lifecycle operations.

## Single data location

A configuration with a single data location has one data group and either:

- Two data nodes (one lead and one shadow) and a witness node, each in a separate availability zone

![region(2 data + 1 witness)](../images/image5.png)

- Three data nodes (one lead and two shadow nodes), each in a separate availability zone

![region(3 data)](../images/image3.png)

## Multiple data locations and witness node

A configuration with multiple data locations has two data groups that contain either:

- Three data nodes:

    - A data node and two shadow nodes in one region
    - The same configuration in another region
    - A witness node in a third region

![region(2 data + 1 shadow) + region(2 data + 1 shadow) + region(1 witness)](../images/eha.png)

- Two data nodes (not recommended for production):

    - A data node, a shadow node, and a witness node in one region
    - The same configuration in another region
    - A witness node in a third region

![region(2 data + 1 witness) + region(2 data + 1 witness) + region(1 witness)](../images/2dn-1wn-2dn-1wn-1wg.png)
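
The layouts described on this page can be summarized as a small validation routine. This Python sketch is illustrative only: `DataGroup` and `validate_cluster` are hypothetical names, not part of any BigAnimal API or CLI, and the rules encode only what this page states.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataGroup:
    """Hypothetical description of one data group."""
    data_nodes: int
    witness_nodes: int = 0


def validate_cluster(groups: List[DataGroup], has_witness_group: bool) -> List[str]:
    """Return a list of problems; an empty list means the layout matches this page."""
    problems = []
    if len(groups) not in (1, 2):
        problems.append("a cluster contains one or two data groups")
    for index, group in enumerate(groups, start=1):
        # Allowed shapes: three data nodes, or two data nodes and one witness node.
        if (group.data_nodes, group.witness_nodes) not in ((3, 0), (2, 1)):
            problems.append(
                f"data group {index}: expected 3 data nodes, "
                "or 2 data nodes and 1 witness node"
            )
    if len(groups) == 2 and not has_witness_group:
        problems.append("two data groups need a witness in a third region")
    return problems


# Multiple data locations, three data nodes per group, witness in a third region.
print(validate_cluster([DataGroup(3), DataGroup(3)], has_witness_group=True))
```

Returning a list of problems rather than raising on the first one lets a caller report every mismatch at once.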

### Cross-cloud service providers (CSP) witness node

By default, the cloud service provider selected for the data groups is preselected for the witness node.

To guard against cloud service provider failures, you can designate a witness node on a different cloud service provider than the data groups. This configuration can enable a three-region configuration even if a single cloud provider offers only two regions in the jurisdiction where you're allowed to deploy your cluster.

Cross-cloud service provider witness nodes are available with AWS, Azure, and Google Cloud using your own cloud account and BigAnimal's cloud account. This option is enabled by default and applies to both multi-region configurations available with PGD. For witness nodes, you pay only for the infrastructure used, which is reflected in the pricing estimate.

## For more information

For instructions on creating a distributed high-availability cluster using the BigAnimal portal, see [Creating a distributed high-availability cluster](../getting_started/creating_a_cluster/creating_an_eha_cluster/).

For instructions on creating, retrieving information from, and managing a distributed high-availability cluster using the BigAnimal CLI, see [Using the BigAnimal CLI](/biganimal/latest/reference/cli/managing_clusters/#managing-distributed-high-availability-clusters).
@@ -0,0 +1,17 @@
---
title: "Supported cluster types"
deepToC: true
redirects:
- 02_high_availibility
navigation:
- single_node
- primary_standby_highavailability
- distributed_highavailability
---

BigAnimal supports three cluster types:
- [Single node](./single_node)
- [Primary/standby high availability](./primary_standby_highavailability)
- [Distributed high availability](./distributed_highavailability)

You choose the type of cluster you want on the [Create Cluster](https://portal.biganimal.com/create-cluster) page in the [BigAnimal](https://portal.biganimal.com) portal.
@@ -0,0 +1,27 @@
---
title: "Primary/standby high availability"
---

The primary/standby high-availability option minimizes downtime when failures occur. Primary/standby high-availability clusters, which consist of one *primary* and one or two *standby replicas*, are configured automatically, with standby replicas staying up to date through physical streaming replication.

If read-only workloads are enabled, then standby replicas serve the read-only workloads. In a two-node cluster, the single standby replica serves read-only workloads. In a three-node cluster, both standby replicas serve read-only workloads. The connections are made to the two standby replicas randomly and on a per-connection basis.

In cloud regions with availability zones, clusters are provisioned across zones to provide fault tolerance in the face of a data center failure.

In case of temporary or permanent unavailability of the primary, a standby replica becomes the primary.

![BigAnimal Cluster4](../images/high-availability.png)

Incoming client connections are always routed to the current primary. In case of failure of the primary, a standby replica is promoted to primary, and new connections are routed to the new primary. When the old primary recovers, it rejoins the cluster as a standby replica.
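
The routing and promotion behavior described above can be pictured with a toy model. This Python sketch isn't how BigAnimal implements failover. The `Cluster` class and the node names are hypothetical, and promoting the first standby in the list is an assumption made only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Toy model of a primary/standby cluster (hypothetical, for illustration)."""
    primary: str
    standbys: List[str] = field(default_factory=list)

    def route(self) -> str:
        """New client connections always go to the current primary."""
        return self.primary

    def fail_primary(self) -> str:
        """Promote a standby replica and return the name of the failed node."""
        if not self.standbys:
            raise RuntimeError("no standby replica available to promote")
        failed = self.primary
        # Assumption: the first standby in the list is the one promoted.
        self.primary = self.standbys.pop(0)
        return failed

    def rejoin(self, node: str) -> None:
        """A recovered node rejoins as a standby replica, not as the primary."""
        if node == self.primary or node in self.standbys:
            raise ValueError(f"{node} is already part of the cluster")
        self.standbys.append(node)


cluster = Cluster(primary="node-1", standbys=["node-2", "node-3"])
old_primary = cluster.fail_primary()  # node-1 fails, a standby is promoted
cluster.rejoin(old_primary)           # node-1 recovers and rejoins as a standby
print(cluster.route(), cluster.standbys)
```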

## Standby replicas

By default, replication is synchronous to one standby replica and asynchronous to the other. That is, one standby replica must confirm that a transaction record was written to disk before the client receives acknowledgment of a successful commit.

In a cluster with one primary and one replica (a two-node primary/standby high-availability cluster), you run the risk of the cluster being unavailable for writes because it doesn't have the same level of reliability as a three-node cluster. BigAnimal disables synchronous replication during maintenance operations of a two-node cluster to ensure write availability. You can also change from the default synchronous replication for a two-node cluster to asynchronous replication on a per-session or per-transaction basis.

In PostgreSQL terms, `synchronous_commit` is set to `on`, and `synchronous_standby_names` is set to `ANY 1 (replica-1, replica-2)`. You can modify this behavior on a per-transaction, per-session, per-user, or per-database basis using `SET` or `ALTER` commands.
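
As a rough illustration of those scopes, the following Python sketch builds the corresponding SQL statements as strings. The helper name `synchronous_commit_sql` and the role name in the usage line are hypothetical. The statements themselves use standard PostgreSQL `SET`, `SET LOCAL`, `ALTER ROLE`, and `ALTER DATABASE` syntax. Whether a particular override is appropriate for your cluster is outside the scope of this sketch.

```python
def synchronous_commit_sql(scope: str, value: str, target: str = "") -> str:
    """Build a statement that overrides synchronous_commit at the given scope."""
    allowed_values = {"on", "off", "local", "remote_write", "remote_apply"}
    if value not in allowed_values:
        raise ValueError(f"unsupported synchronous_commit value: {value!r}")

    if scope == "transaction":
        # SET LOCAL takes effect only inside a transaction block.
        return f"SET LOCAL synchronous_commit TO '{value}';"
    if scope == "session":
        return f"SET synchronous_commit TO '{value}';"
    if scope in ("user", "database"):
        if not target:
            raise ValueError(f"a {scope} name is required")
        keyword = "ROLE" if scope == "user" else "DATABASE"
        # Quote the identifier, doubling any embedded double quotes.
        quoted = '"' + target.replace('"', '""') + '"'
        return f"ALTER {keyword} {quoted} SET synchronous_commit TO '{value}';"
    raise ValueError(f"unknown scope: {scope!r}")


# Hypothetical role name, used only for illustration.
print(synchronous_commit_sql("transaction", "off"))
print(synchronous_commit_sql("user", "local", "reporting_user"))
```

Settings applied with `ALTER ROLE` or `ALTER DATABASE` take effect for new sessions, not for sessions that are already connected.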

Since BigAnimal replicates to only one node synchronously, some standby replicas in three-node clusters might experience replication lag. Also, if you override the BigAnimal synchronous replication configuration, then the standby replicas are inconsistent.
@@ -0,0 +1,9 @@
---
title: "Single node"
---

For nonproduction use cases where high availability isn't a primary concern, a cluster deployed without high availability enabled provides one primary with no standby replicas for failover or read-only workloads.

In case of unrecoverable failure of the primary, a restore from a backup is required.

![BigAnimal Cluster4](../images/single-node.png)