---
navTitle: Overview
title: HARP Functionality Overview
---

HARP is a new approach to High Availability primarily designed for BDR
clusters, though it can also manage standard Primary plus physical streaming
replication (PSR) topologies. It leverages consensus-driven Quorum to
determine the correct connection endpoint in a semi-exclusive manner to
prevent unintended multi-node writes from an application.

## The Importance of Quorum

The central purpose of HARP is to enforce full Quorum on any Postgres cluster
it manages. Quorum is simply a term for a voting body that requires a certain
minimum number of members to be present before a decision can be made. Or
perhaps even more simply: Majority Rules.

For any vote to end in a result other than a tie, an odd number of nodes must
constitute the full cluster membership. Quorum, however, does not strictly
demand this restriction; a simple majority suffices. This means that in a
cluster of N nodes, Quorum requires a minimum of N/2+1 nodes to hold a
meaningful vote: a five-node cluster needs at least three participants, and
can therefore survive the loss of two nodes.
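
As a purely illustrative sketch (not part of HARP itself), the same rule can
be expressed in a few lines of Go, where integer division handles even and
odd memberships alike:

```go
package main

import "fmt"

// quorum returns the minimum number of nodes required for a
// meaningful vote in a cluster of n members: N/2+1.
func quorum(n int) int {
	return n/2 + 1
}

func main() {
	for _, n := range []int{3, 4, 5} {
		// 3 nodes -> 2, 4 nodes -> 3, 5 nodes -> 3
		fmt.Printf("%d-node cluster: quorum %d, tolerates %d failure(s)\n",
			n, quorum(n), n-quorum(n))
	}
}
```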

All of this ensures the cluster is always in agreement regarding which node
should be "in charge". For a BDR cluster consisting of multiple nodes, this
determines which node is the primary write target. HARP designates this node
as the Lead Master.

## Reducing Write Targets

Ignoring the concept of Quorum, or applying it insufficiently, can lead to a
Split Brain scenario where the "correct" write target is ambiguous or
unknowable. In a standard Postgres cluster, it is important that only a
single node is ever writable and sending replication traffic to the
remaining nodes.

Even in Multi-Master-capable approaches such as BDR, it can be beneficial to
reduce the amount of conflict management necessary to derive identical data
across the cluster. In clusters that consist of multiple BDR nodes per
physical location or region, this usually means a single BDR node acts as a
"Leader" while the remaining nodes are "Shadows". These Shadow nodes are
still writable, but doing so is discouraged unless absolutely necessary.

By leveraging Quorum, it's possible for all nodes to agree on exactly which
Postgres node should represent the entire cluster, or a local BDR region.
Any nodes that lose contact with the remainder of the Quorum, or are
overruled by it, by definition cannot become the cluster Leader.

This prevents Split Brain situations where writes unintentionally reach two
Postgres nodes. Unlike technologies such as VPNs, proxies, load balancers,
or DNS, a Quorum-derived consensus cannot be circumvented by
misconfiguration or network partitions. So long as it's possible to contact
the Consensus Layer to determine the state of the Quorum maintained by HARP,
only one target is ever valid.

## Basic Architecture

The design of HARP essentially consists of two parts: a Manager and a Proxy.
The following diagram describes how these interact with a single Postgres
instance:

![HARP Unit](images/ha-unit.png)

The Consensus Layer is an external entity where HARP Manager maintains
information it learns about its assigned Postgres node, and HARP Proxy
translates this information to a valid Postgres node target. Because Proxy
obtains the node target from the Consensus Layer, several such instances may
exist independently.

When using BDR itself as the Consensus Layer, each server node resembles
this variant instead:

![HARP Unit w/BDR Consensus](images/ha-unit-bdr.png)

In either case, each unit consists of the following elements:

* A Postgres or EDB instance
* A Consensus Layer resource, meant to track various attributes of the
  Postgres instance
* A HARP Manager process to convey the state of the Postgres node to the
  Consensus Layer
* A HARP Proxy service that directs traffic to the proper Lead Master node,
  as derived from the Consensus Layer

Not every application stack has access to additional node resources
specifically for the Proxy component, so it can be combined with the
application server to simplify the stack itself.

This is a typical design using two BDR nodes in a single Data Center,
organized in a Lead Master / Shadow Master configuration:

![HARP Cluster](images/ha-ao.png)

Note that when using BDR itself as the HARP Consensus Layer, at least three
fully qualified BDR nodes must be present to ensure a Quorum majority.

![HARP Cluster w/BDR Consensus](images/ha-ao-bdr.png)

(Not shown in the above diagram are connections between BDR nodes.)
## How it Works

When managing a BDR cluster, HARP maintains at most one "Leader" node per
defined Location. Canonically this is referred to as the Lead Master. Other
BDR nodes eligible to take this position remain in Shadow Master state until
such time as they assume the Leader role.

Applications may contact the current Leader only through the Proxy service.
Since the Consensus Layer requires Quorum agreement before conveying Leader
state, any and all Proxy services will direct traffic to that node.

At a high level, this is ultimately what prevents application interaction
with multiple nodes simultaneously.

### Determining a Leader

As an example, consider the role of Lead Master within a locally subdivided
BDR Always-On group as may exist within a single data center. When any
Postgres or Manager resource is started, and after a configurable refresh
interval, the following must occur:

1. The Manager checks the status of its assigned Postgres resource.
    - If Postgres is not running, try again after a configurable timeout.
    - If Postgres is running, continue.
2. The Manager checks the status of the Leader lease in the Consensus Layer.
    - If the lease is unclaimed, acquire it and assign it the identity of
      the Postgres instance assigned to this Manager. The lease duration is
      configurable, but setting it too low may result in unexpected
      leadership transitions.
    - If the lease is already claimed by us, renew the lease TTL.
    - Otherwise do nothing.

Obviously a lot more happens here, but this simplified version should
explain what's happening. The Leader lease can only be held by one node, and
if it's held elsewhere, HARP Manager gives up and tries again later, as
illustrated in the sketch below.
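
The following Go sketch models this loop against an etcd Consensus Layer.
The key name, lease TTL, node identity, and intervals are all hypothetical
illustrations; HARP's actual implementation is considerably more involved:

```go
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

const (
	leaderKey = "/harp/dca/leader" // hypothetical lease key for Location dca
	self      = "bdr-node-1"       // identity of the managed Postgres instance
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints: []string{"host1:2379", "host2:2379", "host3:2379"},
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	for {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		tick(ctx, cli)
		cancel()
		time.Sleep(10 * time.Second) // configurable refresh interval
	}
}

// tick performs one iteration of the simplified Manager loop: claim the
// Leader lease if unclaimed, renew it if it is ours, otherwise do nothing.
func tick(ctx context.Context, cli *clientv3.Client) {
	// Step 1, checking the local Postgres instance, is omitted here.
	resp, err := cli.Get(ctx, leaderKey)
	if err != nil {
		log.Print(err) // Consensus Layer unreachable: no decision possible
		return
	}

	switch {
	case len(resp.Kvs) == 0:
		// Unclaimed: attach our identity to a short-lived lease, claiming
		// the key atomically only if it still does not exist.
		lease, err := cli.Grant(ctx, 30) // lease TTL in seconds
		if err != nil {
			return
		}
		txnResp, err := cli.Txn(ctx).
			If(clientv3.Compare(clientv3.CreateRevision(leaderKey), "=", 0)).
			Then(clientv3.OpPut(leaderKey, self, clientv3.WithLease(lease.ID))).
			Commit()
		if err == nil && txnResp.Succeeded {
			log.Printf("%s acquired the Leader lease", self)
		}
	case string(resp.Kvs[0].Value) == self:
		// Already ours: renew the lease TTL.
		cli.KeepAliveOnce(ctx, clientv3.LeaseID(resp.Kvs[0].Lease))
	default:
		// Held elsewhere: give up and try again on the next interval.
	}
}
```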

!!! Note
    Depending on the chosen Consensus Layer, rather than repeatedly looping
    to check the status of the Leader lease, HARP will subscribe to
    notifications instead. In this case, it can respond immediately any time
    the state of the lease changes, rather than polling. Currently this
    functionality is restricted to the etcd Consensus Layer.
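
A minimal sketch of that subscription model, reusing the client, imports,
and hypothetical `leaderKey` from the sketch above, could use etcd's watch
API:

```go
// watchLeader reacts to Leader-lease changes pushed by etcd instead of
// polling for them. cli and leaderKey are as in the previous sketch.
func watchLeader(ctx context.Context, cli *clientv3.Client) {
	for watchResp := range cli.Watch(ctx, leaderKey) {
		for _, ev := range watchResp.Events {
			switch ev.Type {
			case clientv3.EventTypePut:
				log.Printf("leader claimed or renewed by %s", ev.Kv.Value)
			case clientv3.EventTypeDelete:
				log.Print("leader lease released or expired")
			}
		}
	}
}
```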

This means HARP itself does not hold elections or manage Quorum; this is
delegated to the Consensus Layer. The act of obtaining the lease must be
acknowledged by a Quorum of the Consensus Layer, so if the request succeeds,
that node leads the cluster in that Location.

### Connection Routing

Once the role of the Lead Master is established, connections are handled
with a similarly deterministic result as reflected by HARP Proxy. Consider a
case where HARP Proxy needs to determine the connection target for a
particular backend resource:

1. HARP Proxy interrogates the Consensus Layer for the current Lead Master
   in its configured Location.
2. If this is unset or in transition:
    - New client connections to Postgres are barred, but clients accumulate
      and remain in a paused state until a Lead Master appears.
    - Existing client connections are allowed to complete the current
      transaction, and are then reverted to the same pending state as new
      connections.
3. Client connections are forwarded to the Lead Master.

Note that the interplay demonstrated in this case does not require any
interaction with either HARP Manager or Postgres. The Consensus Layer itself
is the source of all truth from the Proxy's perspective, as the sketch below
makes concrete.
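
To make that flow concrete, here is a hedged Go sketch of the routing
decision, again assuming the etcd layer, imports, and key naming of the
earlier sketches; the pause/resume mechanics of a real proxy are reduced to
a simple wait loop:

```go
// routeTarget resolves the connection target for a client. It consults only
// the Consensus Layer: while leadership is unset or in transition, the
// caller is held in a paused state until a Lead Master appears.
func routeTarget(ctx context.Context, cli *clientv3.Client) (string, error) {
	for {
		resp, err := cli.Get(ctx, leaderKey)
		if err != nil {
			return "", err // Consensus Layer unreachable: no valid target
		}
		if len(resp.Kvs) > 0 {
			return string(resp.Kvs[0].Value), nil // forward to the Lead Master
		}
		select { // leadership unset: keep the client pending, then re-check
		case <-ctx.Done():
			return "", ctx.Err()
		case <-time.After(time.Second):
		}
	}
}
```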

### Colocation

The arrangement of the work units is such that their organization must
follow these principles:

1. The Manager and Postgres units must exist concomitantly within the same
   node.
2. The contents of the Consensus Layer dictate the prescriptive role of all
   operational work units.

This delegates cluster Quorum responsibilities to the Consensus Layer
itself, while HARP leverages it for critical role assignments and key/value
storage. Neither storage nor retrieval will succeed if the Consensus Layer
is inoperable or unreachable, thus preventing rogue Postgres nodes from
accepting connections.

As a result, the Consensus Layer should generally exist outside of HARP or
HARP-managed nodes for maximum safety. Our reference diagrams reflect this
in order to encourage such separation, though it is not required.

!!! Note
    In order to operate and manage cluster state, BDR contains its own
    implementation of the Raft Consensus model. HARP may be configured to
    leverage this same layer to reduce reliance on external dependencies and
    to preserve server resources. However, there are certain drawbacks to
    this approach that are discussed in further depth in the section on the
    [Consensus Layer](09_consensus-layer).

## Recommended Architecture and Use

HARP was primarily designed to represent a BDR Always-On architecture which
resides within two (or more) Data Centers and consists of at least five BDR
nodes. This count does not include any Logical Standby nodes.

The current and standard representation of this can be seen in the following
diagram:

![BDR Always-On Reference Architecture](images/bdr-ao-spec.png)

In this diagram, HARP Manager would exist on BDR Nodes 1-4. The initial
state of the cluster would be that BDR Node 1 is the Lead Master of DC A,
and BDR Node 3 is the Lead Master of DC B.

This would result in any HARP Proxy resource in DC A connecting to BDR Node
1, and likewise the HARP Proxy resource in DC B connecting to BDR Node 3.

!!! Note
    While this diagram only shows a single HARP Proxy per DC, this is merely
    illustrative and should not be considered a Single Point of Failure. Any
    number of HARP Proxy nodes may exist, and they will all direct
    application traffic to the same node.

### Location Configuration

In order for multiple BDR nodes to be eligible to take the Lead Master lock
in a Location, a Location must be defined within the `config.yml`
configuration file.

To reproduce the diagram above, we would have these lines in the
`config.yml` configuration for BDR Nodes 1 and 2:

```yaml
location: dca
```

And for BDR Nodes 3 and 4:

```yaml
location: dcb
```

This applies to any HARP Proxy nodes which are designated in those
respective data centers as well.
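
Putting the documented pieces together, a node's `config.yml` for DC A might
pair the `location` setting with the `dcs` section covered in the
Installation chapter. This is only a minimal sketch; real configurations
include additional fields:

```yaml
# Minimal illustrative sketch combining two documented settings;
# an actual HARP config.yml carries further fields.
location: dca
dcs:
  driver: etcd
  endpoints:
    - host1:2379
    - host2:2379
    - host3:2379
```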

### BDR 3.7 Compatibility

BDR 3.7 and above offers more direct Location definition by assigning a
Location to the BDR node itself. This is done by calling the following SQL
API function while connected to the BDR node. So for BDR Nodes 1 and 2, we
might do this:

```sql
SELECT bdr.set_node_location('dca');
```

And for BDR Nodes 3 and 4:

```sql
SELECT bdr.set_node_location('dcb');
```

Afterwards, future versions of HARP Manager would derive the `location`
field directly from BDR itself. This HARP functionality is not available
yet, so we recommend using both this function and the `location` setting in
`config.yml` until HARP reports compatibility with this BDR API method.

---
navTitle: Installation
title: Installation
---

A standard installation of HARP includes two system services:

* HARP Manager (`harp_manager`) on the node being managed
* HARP Proxy (`harp_router`) elsewhere

There are generally two ways to install and configure these services to
manage Postgres for proper Quorum-based connection routing.

## Software Versions

HARP depends on external software, which must meet the minimum versions
listed here.

| Software  | Minimum Version |
|-----------|-----------------|
| etcd      | 3.4             |
| PgBouncer | 1.14            |

## TPAexec

The easiest way to install and configure HARP is to use EDB's TPAexec
utility for cluster deployment and management. For details on this software,
see the [TPAexec product page](https://access.2ndquadrant.com/customer_portal/sw/tpa/).

!!! Note
    TPAexec is currently only available through an EULA specifically
    dedicated to BDR cluster deployments. If you are unable to access the
    above URL, please contact your sales or account representative for more
    information.

TPAexec itself must be configured to recognize that cluster routing should
be managed through HARP by ensuring the TPA `config.yml` file contains these
attributes:

```yaml
cluster_vars:
  failover_manager: harp
```

!!! Note
    Versions of TPAexec prior to 21.1 require a slightly different approach:

    ```yaml
    cluster_vars:
      enable_harp: true
    ```

After this, HARP will be installed by invoking the regular `tpaexec`
commands for making cluster modifications:

```bash
tpaexec provision ${CLUSTER_DIR}
tpaexec deploy ${CLUSTER_DIR}
```

No other modifications should be necessary, barring cluster-specific
considerations.

## Package Installation

Currently CentOS/RHEL packages are provided via the EDB packaging
infrastructure. For details, see the [HARP product page](https://access.2ndquadrant.com/customer_portal/sw/harp/).

### etcd Packages

Currently `etcd` packages for many popular Linux distributions are not
available via their standard public repositories. EDB has therefore packaged
`etcd` for RHEL and CentOS versions 7 and 8, Debian, and variants such as
Ubuntu LTS. Again, access to our HARP package repository is necessary to use
these packages.

## Consensus Layer

HARP requires a distributed consensus layer to operate. Currently this must
be either `bdr` or `etcd`. If using fewer than three BDR nodes, it may
become necessary to rely on `etcd`; otherwise, any BDR service outage would
reduce the consensus layer to a single node, preventing node consensus and
disabling Postgres routing.

### etcd

If using `etcd` as the consensus layer, `etcd` must be installed either
directly on the Postgres nodes, or in some separate location they can
access.

To set `etcd` as the consensus layer, include this in the HARP `config.yml`
configuration file:

```yaml
dcs:
  driver: etcd
  endpoints:
    - host1:2379
    - host2:2379
    - host3:2379
```

When using TPAexec, all configured etcd endpoints will be entered here
automatically.

### BDR

The `bdr` native consensus layer is available from BDR 3.6.21 and 3.7.3
onward. This Consensus Layer model requires no supplementary software when
managing routing for a BDR cluster.

As previously mentioned, to ensure Quorum is possible in the cluster, always
use more than two nodes so that BDR's consensus layer remains responsive
during node maintenance or outages.

To set BDR as the consensus layer, include this in the `config.yml`
configuration file:

```yaml
dcs:
  driver: bdr
  endpoints:
    - host=host1 dbname=bdrdb user=harp_user
    - host=host2 dbname=bdrdb user=harp_user
    - host=host3 dbname=bdrdb user=harp_user
```

As shown here, the endpoints for a BDR consensus layer follow the standard
Postgres DSN connection format.