Merge pull request #1505 from EnterpriseDB/release/2021-06-25
Release/2021 06 25

Former-commit-id: c6dd916
jericson-edb authored Jun 25, 2021
2 parents 0476a80 + 13ba36a commit 46a47e0
Showing 114 changed files with 6,730 additions and 163 deletions.
60 changes: 60 additions & 0 deletions product_docs/docs/efm/3.10/efm_pgpool_ha_guide/01_introduction.mdx
@@ -0,0 +1,60 @@
---
title: "Architecture Overview"

legacyRedirectsGenerated:
# This list is generated by a script. If you need to add entries, use the `legacyRedirects` key.
- "/edb-docs/d/edb-postgres-failover-manager/user-guides/high-availability-scalability-guide/3.10/introduction.html"
---

This guide explains how best to configure Failover Manager and Pgpool to leverage the benefits they provide for Advanced Server. Using the reference architecture described in the Architecture section, you can learn how to achieve high availability by implementing an automatic failover mechanism (with Failover Manager) while scaling the system for larger workloads and a greater number of concurrent clients. For read-intensive or mixed workloads, horizontal scaling (read scalability) is achieved with Pgpool.

The architecture described in this document has been developed and tested for EFM 3.10, EDB Pgpool, and Advanced Server 12.

Documentation for Advanced Server and Failover Manager is available from EnterpriseDB at:

<https://www.enterprisedb.com/docs/>

Documentation for Pgpool-II can be found at:

<http://www.pgpool.net/docs/latest/en/html>

## Failover Manager Overview

Failover Manager is a high-availability module that monitors the health of a Postgres streaming replication cluster and verifies failures quickly. When a database failure occurs, Failover Manager can automatically promote a streaming replication Standby node into a writable Primary node to ensure continued performance and protect against data loss with minimal service interruption.

**Basic EFM Architecture Terminology**

A Failover Manager cluster comprises EFM processes that reside on the following hosts on a network:

- A **Primary** node is the Primary database server that is servicing database clients.
- One or more **Standby nodes** are streaming replication servers associated with the Primary node.
- The **Witness node** confirms assertions of either the Primary or a Standby in a failover scenario. If, during a failure, the Primary finds itself in a partition containing half or more of the nodes, it stays Primary. For this reason, EFM supports running in a cluster with an even number of agents.
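
These roles are reflected in each agent's `efm.properties` file. The following is a minimal illustrative sketch only (the address is a placeholder and the surrounding database settings are omitted); it is not a complete configuration from this guide:

```text
# efm.properties on the witness node (placeholder address)
bind.address=192.168.10.4:7800    # address:port this EFM agent binds to
is.witness=true                   # a witness runs no local database

# On the Primary and Standby nodes, is.witness=false and the db.* properties
# point at the local Postgres instance.
```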

## Pgpool-II Overview

Pgpool-II (Pgpool) is an open-source application that provides connection pooling and load balancing for horizontal scalability of SELECT queries on multiple Standbys in EPAS and community Postgres clusters. For every backend, a `backend_weight` parameter can set the ratio of read traffic directed to that backend node. To prevent read traffic on the Primary node, the `backend_weight` parameter can be set to 0. In such cases, data modification language (DML) queries (i.e., INSERT, UPDATE, and DELETE) are still sent to the Primary node, while read queries are load-balanced to the Standbys, providing scalability with mixed and read-intensive workloads.
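
For example, a minimal sketch of this weighting in `pgpool.conf` (host values are placeholders, not addresses from this guide) sends all read traffic to the Standbys and none to the Primary:

```text
backend_hostname0 = 'primary_IP'
backend_weight0 = 0        # no SELECT load balancing to the Primary
backend_hostname1 = 'standby1_IP'
backend_weight1 = 1        # Standbys share the read traffic equally
backend_hostname2 = 'standby2_IP'
backend_weight2 = 1
```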

EnterpriseDB supports the following Pgpool functionality:

- Load balancing
- Connection pooling
- High availability
- Connection limits

### PCP Overview

Pgpool provides an interface called PCP that administrators can use to perform management operations, such as retrieving the status of Pgpool or terminating Pgpool processes remotely. PCP commands are UNIX commands that manage Pgpool over the network.
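
For example, assuming the PCP interface has been configured (the host, port, and user values below are placeholders), the following commands query Pgpool status and re-attach a backend node:

```text
pcp_watchdog_info -h <delegate_IP> -p <pcp_port> -U <pcp_user> -w      # show watchdog/leader status
pcp_node_info     -h <delegate_IP> -p <pcp_port> -U <pcp_user> -w 0    # show status of backend node 0
pcp_attach_node   -h <delegate_IP> -p <pcp_port> -U <pcp_user> -w 1    # re-attach backend node 1
```

The `-w` flag suppresses the password prompt and reads credentials from the PCPPASSFILE described later in this guide.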

### Pgpool Watchdog

`watchdog` is an optional subprocess of Pgpool that provides a high availability feature. Features added by `watchdog` include:

- Health checking of the Pgpool service
- Mutual monitoring of other watchdog processes
- Changing leader/Standby state when certain faults are detected
- Automatic virtual IP address assignment synchronized with server switching
- Automatic registration of a server as a Standby during recovery

More information about the `Pgpool watchdog` component can be found at:

<http://www.pgpool.net/docs/latest/en/html/tutorial-watchdog.html>
32 changes: 32 additions & 0 deletions product_docs/docs/efm/3.10/efm_pgpool_ha_guide/02_architecture.mdx
@@ -0,0 +1,32 @@
---
title: "Architecture"

legacyRedirectsGenerated:
# This list is generated by a script. If you need to add entries, use the `legacyRedirects` key.
- "/edb-docs/d/edb-postgres-failover-manager/user-guides/high-availability-scalability-guide/3.10/architecture.html"
---

![A typical EFM and Pgpool configuration](images/edb_ha_architecture.png)

The sample architecture diagram shows four nodes as described in the table below:

| **Systems** | **Components** |
| ------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Primary Pgpool/EFM witness node | The Primary Pgpool node runs only Pgpool and the EFM witness, leaving as many resources as possible available to Pgpool. During normal operation (no Pgpool failovers), the Primary Pgpool node holds the virtual IP address, and all applications connect to Pgpool through the virtual IP address. Pgpool forwards all write traffic to the Primary Database node and balances all read traffic across the Standby nodes. On the Primary Pgpool node, the EFM witness process ensures that a quorum of at least three EFM agents remains available even if one of the database nodes fails, for example when a node is already unavailable due to maintenance or failure and another failure occurs. |
| Primary Database node | The Primary Database node runs only Postgres (Primary) and EFM, leaving all resources to Postgres. Read/write traffic (i.e., INSERT, UPDATE, DELETE) is forwarded to this node by the Primary Pgpool node. |
| Standby nodes | The Standby nodes run Postgres (Standby), EFM, and an inactive Pgpool process. If the Primary database fails, EFM promotes Postgres on one of these Standby nodes to handle read/write traffic. If the Primary Pgpool fails, the Pgpool watchdog activates Pgpool on one of the Standby nodes, which attaches the VIP and handles forwarding of application connections to the database nodes. Note that in a double failure situation (both the Primary Pgpool node and the Primary Database node fail), both of these Primary processes might end up on the same node. |

This architecture:

- Achieves high availability by providing two Standbys that can be promoted in case of a Primary Postgres node failure.
- Achieves high availability by providing at least three Pgpool processes in a watchdog configuration.
- Increases performance with mixed and read-intensive workloads by introducing increased read scalability with more than one Standby for load balancing.
- Reduces load on the Primary Database node by redirecting read-only traffic through the Primary Pgpool node.
- Prevents resource contention between Pgpool and Postgres on the Primary Database node. By not running Pgpool on the Primary Database node, the Primary Postgres process can utilize as many resources as possible.
- Prevents resource contention between Pgpool and Postgres on the Primary Pgpool node. By not running Standby databases on the Primary Pgpool node, Pgpool can utilize as many resources as possible.
- Optionally, synchronous replication can be set up to achieve near-zero data loss in a failure event.

!!! Note
The architecture also allows you to completely separate the three virtual machines running Postgres from the three virtual machines running Pgpool. This kind of setup requires two extra virtual machines, but it is a better choice if you want to prevent resource contention between Pgpool and Postgres in failover scenarios. In this setup, the architecture can run without an extra seventh node running the EFM witness process. To increase failure resolution, EFM witness agents could be deployed on the Pgpool servers.

![Deployment of EFM and Pgpool on separate virtual machines](images/edb_ha_architecture_separate_VM.png)
@@ -0,0 +1,161 @@
---
title: "Implementing High Availability with Pgpool"

legacyRedirectsGenerated:
# This list is generated by a script. If you need to add entries, use the `legacyRedirects` key.
- "/edb-docs/d/edb-postgres-failover-manager/user-guides/high-availability-scalability-guide/3.10/components_ha_pgpool.html"
---

Failover Manager monitors the health of Postgres nodes; in the event of a database failure, Failover Manager performs an automatic failover to a Standby node. Note that Pgpool does not monitor the health of backend nodes and will not perform failover to any Standby nodes.

## Configuring Failover Manager

Failover Manager provides functionality that removes failed database nodes from Pgpool load balancing; it can also re-attach nodes to Pgpool when they are returned to the Failover Manager cluster. To configure EFM for high availability using Pgpool, you must set the following properties in the cluster properties file:

```text
pgpool.enable=<true/false>
pcp.user=<user that invokes the PCP commands>
pcp.host=<virtual IP used by Pgpool; same as the Pgpool parameter 'delegate_IP'>
pcp.port=<port on which Pgpool listens for PCP commands>
pcp.pass.file=<absolute path of PCPPASSFILE>
pgpool.bin=<absolute path of the Pgpool bin directory>
```
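
As an illustration only, a populated set of these properties might look like the following; every value below is a placeholder, not a value prescribed by this guide:

```text
pgpool.enable=true
pcp.user=enterprisedb                  # placeholder PCP user
pcp.host=172.16.10.100                 # placeholder; matches delegate_IP in pgpool.conf
pcp.port=9898                          # default PCP port
pcp.pass.file=/var/lib/efm/.pcppass    # placeholder path to the PCPPASSFILE
pgpool.bin=/usr/pgpool-4.1/bin         # placeholder Pgpool bin directory
```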

## Configuring Pgpool

This section lists the configuration of some important parameters in the `pgpool.conf` file to integrate Pgpool-II with EFM.

**Backend node setting**

There are three PostgreSQL backend nodes: one Primary and two Standby nodes. Configure them using the `backend_*` configuration parameters in `pgpool.conf`, and use equal backend weights for all nodes so that read queries are distributed equally among them.

```text
backend_hostname0 = 'server1_IP'
backend_port0 = 5444
backend_weight0 = 1
backend_flag0 = 'DISALLOW_TO_FAILOVER'
backend_hostname1 = 'server2_IP'
backend_port1 = 5444
backend_weight1 = 1
backend_flag1 = 'DISALLOW_TO_FAILOVER'
backend_hostname2 = 'server3_IP'
backend_port2 = 5444
backend_weight2 = 1
backend_flag2 = 'DISALLOW_TO_FAILOVER'
```

**Enable Load-balancing and streaming replication mode**

Set the following configuration parameters in the `pgpool.conf` file to enable load balancing and streaming replication mode:

```text
master_slave_mode = on
master_slave_sub_mode = 'stream'
load_balance_mode = on
```

**Disable health-checking and failover**

Health checking and failover must be handled by EFM; therefore, they must be disabled on the Pgpool-II side. To disable health checking and failover on the Pgpool-II side, assign the following values:

```text
health_check_period = 0
fail_over_on_backend_error = off
failover_if_affected_tuples_mismatch = off
failover_command = ''
failback_command = ''
```

Ensure the following while setting up the values in the `pgpool.conf` file:

- Keep the value of `wd_priority` in `pgpool.conf` different on each node. The node with the highest value gets the highest priority.
- The properties `backend_hostname0`, `backend_hostname1`, `backend_hostname2`, and so on are shared properties (in EFM terms) and must hold the same values in the `pgpool.conf` file on all nodes.
- Update the correct network interface in the `if_up_cmd`, `if_down_cmd`, and `arping_cmd` properties in the `pgpool.conf` file.
- Add the properties `heartbeat_destination0`, `heartbeat_destination1`, `heartbeat_destination2`, and so on, according to the number of nodes, in the `pgpool.conf` file on every node. Here, `heartbeat_destination0` should be the IP address or hostname of the local node; a sketch follows this list.
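
The following is a minimal sketch of these per-node additions for server 2 (host values are placeholders, and the heartbeat port shown is the common default rather than a value prescribed by this guide):

```text
# pgpool.conf on server 2
heartbeat_destination0 = 'server2_IP'    # local node first, per the note above
heartbeat_destination_port0 = 9694
heartbeat_destination1 = 'server3_IP'
heartbeat_destination_port1 = 9694
heartbeat_destination2 = 'server4_IP'
heartbeat_destination_port2 = 9694
wd_priority = 1                          # must differ on each node; highest value wins
```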

**Setting up PCP**

Failover Manager's Pgpool integration uses the PCP interface, so you need to set up PCP and the PCPPASSFILE to allow PCP connections without a password prompt.

To set up PCP: <http://www.pgpool.net/docs/latest/en/html/configuring-pcp-conf.html>

To set up PCPPASSFILE: <https://www.pgpool.net/docs/latest/en/html/pcp-commands.html>
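
As a sketch (the user name and password below are placeholders), the PCPPASSFILE uses the format `hostname:port:username:password` and must be readable only by its owner:

```text
# ~/.pcppass (or the file named by the PCPPASSFILE environment variable)
*:9898:enterprisedb:pcp_password
```

Set the file permissions with `chmod 600` so that PCP commands can read it without prompting for a password.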

Note that load balancing is turned on to ensure read scalability by distributing read traffic across the Standby nodes.

Health checking and error-triggered backend failover have been turned off, as Failover Manager is responsible for performing health checks and triggering failover. It is not advisable for Pgpool to perform health checking in this case, so as not to create a conflict with Failover Manager or prematurely perform failover.

Finally, `search_primary_node_timeout` has been set to a low value to ensure prompt recovery of Pgpool services upon a Failover Manager-triggered failover.
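
For example, a low timeout can be set in `pgpool.conf` as follows (the value shown is illustrative, not a value prescribed by this guide):

```text
search_primary_node_timeout = 3    # seconds to search for a new Primary after failover
```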

## Virtual IP Addresses

Both Pgpool-II and Failover Manager provide functionality to employ a virtual IP for seamless failover. While both provide this capability, in this design the Pgpool-II leader is the process that receives application connections through the virtual IP, so virtual IP management is performed by the Pgpool-II watchdog system. The EFM VIP feature has no beneficial effect in this design and must be disabled.

Note that if the active Pgpool instance fails (the Primary Pgpool server in our sample architecture), the next available Standby Pgpool instance (according to watchdog priority) is activated and takes charge as the leader Pgpool instance.

## Configuring Pgpool-II Watchdog

Watchdog provides high availability for Pgpool-II nodes. This section lists the configuration required for watchdog on each Pgpool-II node.

**Common watchdog configurations on all Pgpool nodes**

The following configuration parameters enable and configure the watchdog. The interval and retry values can be adjusted depending upon the requirements and testing results.

```text
use_watchdog = on # enable watchdog
wd_port = 9000 # watchdog port, can be changed
delegate_IP = 'Virtual IP address'
wd_lifecheck_method = 'heartbeat'
wd_interval = 10 # we can lower this value for quick detection
wd_life_point = 3
# virtual IP control
ifconfig_path = '/sbin' # ifconfig command path
if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'    # startup delegate IP command
if_down_cmd = 'ifconfig eth0:0 down'    # shutdown delegate IP command
arping_path = '/usr/sbin' # arping command path
```

!!! Note
Replace the value of eth0 with the network interface on your system. See [Chapter 5](05_appendix_b/#configuration-for-number-of-connections-and-pooling) for tuning the number of connections, and pooling configuration.

**Watchdog configurations on server 2**

```text
other_pgpool_hostname0 = 'server 3 IP/hostname'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
other_pgpool_hostname1 = 'server 4 IP/hostname'
other_pgpool_port1 = 9999
other_wd_port1 = 9000
wd_priority = 1
```

**Watchdog configurations on server 3**

```text
other_pgpool_hostname0 = 'server 2 IP/hostname'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
other_pgpool_hostname1 = 'server 4 IP/hostname'
other_pgpool_port1 = 9999
other_wd_port1 = 9000
wd_priority = 3
```

**Watchdog configurations on server 4**

```text
other_pgpool_hostname0 = 'server 2 IP/hostname'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
other_pgpool_hostname1 = 'server 3 IP/hostname'
other_pgpool_port1 = 9999
other_wd_port1 = 9000
wd_priority = 5 # use high watchdog priority on server 4
```
