Added diagrams to overview
nastena1606 committed Jan 15, 2025
1 parent bcb0094 commit 22f6672
Showing 12 changed files with 290 additions and 170 deletions.
Binary file added docs/_images/diagrams/ha-overview-backup.png
Binary file added docs/_images/diagrams/ha-overview-failover.png
Binary file added docs/_images/diagrams/ha-overview-replication.png
4 changes: 2 additions & 2 deletions docs/solutions/ha-architecture.md
@@ -1,8 +1,8 @@
# Architecture layout
# Architecture

As we discussed in the [overview of high availability](high-availability.md), the minimalist approach to a highly available deployment is a three-node PostgreSQL cluster with cluster management and failover mechanisms, a load balancer, and a backup/restore solution.

The following diagram shows this architecture.
The following diagram shows this architecture with the tools we recommend.

![Architecture of the three-node, single primary PostgreSQL cluster](../_images/diagrams/ha-architecture-patroni.png)

218 changes: 103 additions & 115 deletions docs/solutions/ha-etcd.md
@@ -1,142 +1,130 @@
# Configure etcd distributed store

The distributed configuration store provides a reliable way to store data that needs to be accessed by large-scale distributed systems. The most popular implementation of the distributed configuration store is etcd. etcd is deployed as a cluster for fault tolerance and requires a majority of members (n/2+1) to agree on updates to the cluster state; for this reason, the cluster has an odd number of members. An etcd cluster helps establish a consensus among nodes during a failover and manages the configuration for the three PostgreSQL instances.
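
For example, a three-node etcd cluster has a quorum of two members, so it keeps accepting updates if one node fails; a five-node cluster tolerates the loss of two nodes.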

This document provides the configuration for etcd version 3.5.x. For how to configure an etcd cluster with earlier versions of etcd, read the blog post by _Fernando Laudares Camargos_ and _Jobin Augustine_: [PostgreSQL HA with Patroni: Your Turn to Test Failure Scenarios](https://www.percona.com/blog/postgresql-ha-with-patroni-your-turn-to-test-failure-scenarios/)

If you [installed the software from tarballs](../tarball.md), check how you [enable etcd](../enable-extensions.md#etcd).

The `etcd` cluster is first started on one node, and then the subsequent nodes are added to the first node using the `add` command.
In our implementation we use the etcd distributed configuration store. [Refresh your knowledge about etcd](high-availability.md#etcd).

!!! note

If you [installed the software from tarballs](../tarball.md), you must first [enable etcd](../enable-extensions.md#etcd) before configuring it.

Users with a deeper understanding of how etcd works can configure and start all etcd nodes at once and bootstrap the cluster using one of the following methods:

* Static - for the case when the IP addresses of the cluster nodes are known
* Discovery service - for the case when the IP addresses of the cluster nodes are not known ahead of time.

See the [How to configure etcd nodes simultaneously](../how-to.md#how-to-configure-etcd-nodes-simultaneously) section for details.

### Configure `node1`

1. Create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own. Replace the node name and IP address with the actual name and IP address of your node.

```yaml title="/etc/etcd/etcd.conf.yaml"
name: 'node1'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.1:2380
listen-peer-urls: http://10.104.0.1:2380
advertise-client-urls: http://10.104.0.1:2379
listen-client-urls: http://10.104.0.1:2379
```
To get started with an `etcd` cluster, you need to bootstrap it. This means setting up the initial configuration and starting the etcd nodes so that they can form a cluster. The following bootstrapping mechanisms are available:

2. Start the `etcd` service to apply the changes on `node1`.
* Static - for the case when the IP addresses of the cluster nodes are known
* Discovery service - for the case when the IP addresses of the cluster nodes are not known ahead of time.

Since we know the IP addresses of the nodes, we use the static method. To use the discovery service instead, refer to the [etcd documentation :octicons-external-link-16:](https://etcd.io/docs/v3.5/op-guide/clustering/#etcd-discovery){:target="_blank"}.

We will configure and start all etcd nodes in parallel. You can do this either by modifying each node's configuration file or by using the command line options. Use whichever method you prefer.

### Method 1. Modify the configuration file

1. Create the etcd configuration file on every node. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.

=== "node1"

```yaml title="/etc/etcd/etcd.conf.yaml"
name: 'node1'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.1:2380
listen-peer-urls: http://10.104.0.1:2380
advertise-client-urls: http://10.104.0.1:2379
listen-client-urls: http://10.104.0.1:2379
```

=== "node2"

```yaml title="/etc/etcd/etcd.conf.yaml"
name: 'node2'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.2:2380
listen-peer-urls: http://10.104.0.2:2380
advertise-client-urls: http://10.104.0.2:2379
listen-client-urls: http://10.104.0.2:2379
```

=== "node3"

```yaml title="/etc/etcd/etcd.conf.yaml"
name: 'node3'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.3:2380
listen-peer-urls: http://10.104.0.3:2380
advertise-client-urls: http://10.104.0.3:2379
listen-client-urls: http://10.104.0.3:2379
```

2. Enable and start the `etcd` service on all nodes:

```{.bash data-prompt="$"}
$ sudo systemctl enable --now etcd
$ sudo systemctl start etcd
$ sudo systemctl status etcd
```

3. Check the etcd cluster members on `node1`:
During the node start, etcd searches for the other cluster nodes defined in the configuration. If the other nodes are not running yet, the start may fail with a quorum timeout. This is expected behavior. Try starting all nodes again at the same time so that the etcd cluster is created.
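
If the service does fail to start, the etcd logs usually show the reason (for example, a peer that is not reachable yet). A quick way to inspect them, assuming etcd runs as the `etcd` systemd unit shown above:

```{.bash data-prompt="$"}
$ sudo journalctl -u etcd -n 50 --no-pager
```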

```{.bash data-prompt="$"}
$ sudo etcdctl member list --write-out=table --endpoints=http://10.104.0.1:2379
```

Sample output:

```{.text .no-copy}
+------------------+---------+-------+----------------------------+----------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+-------+----------------------------+----------------------------+------------+
| 9d2e318af9306c67 | started | node1 | http://10.104.0.1:2380 | http://10.104.0.1:2379 | false |
+------------------+---------+-------+----------------------------+----------------------------+------------+
```

4. Add `node2` to the cluster. Run the following command on `node1`:

```{.bash data-prompt="$"}
$ sudo etcdctl member add node2 --peer-urls=http://10.104.0.2:2380
```

??? example "Sample output"

```{.text .no-copy}
Added member named node2 with ID 10042578c504d052 to cluster
etcd_NAME="node2"
etcd_INITIAL_CLUSTER="node2=http://10.104.0.2:2380,node1=http://10.104.0.1:2380"
etcd_INITIAL_CLUSTER_STATE="existing"
```
--8<-- "check-etcd.md"
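
In addition to the membership check above, you can verify that every member answers on its client URL. This is an optional sanity check that assumes the client URLs configured in this example:

```{.bash data-prompt="$"}
$ sudo etcdctl endpoint health --endpoints=http://10.104.0.1:2379,http://10.104.0.2:2379,http://10.104.0.3:2379
```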

### Configure `node2`
### Method 2. Start etcd nodes with command line options

1. Create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.
1. On each etcd node, set the environment variables for the cluster members, the cluster token, and the cluster state:

```yaml title="/etc/etcd/etcd.conf.yaml"
name: 'node2'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: existing
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.2:2380
listen-peer-urls: http://10.104.0.2:2380
advertise-client-urls: http://10.104.0.2:2379
listen-client-urls: http://10.104.0.2:2379
```

3. Start the `etcd` service to apply the changes on `node2`:

```{.bash data-prompt="$"}
$ sudo systemctl enable --now etcd
$ sudo systemctl start etcd
$ sudo systemctl status etcd
```

### Configure `node3`

1. Add `node3` to the cluster. **Run the following command on `node1`:**

```{.bash data-prompt="$"}
$ sudo etcdctl member add node3 --peer-urls=http://10.104.0.3:2380
TOKEN=PostgreSQL_HA_Cluster_1
CLUSTER_STATE=new
NAME_1=node1
NAME_2=node2
NAME_3=node3
HOST_1=10.104.0.1
HOST_2=10.104.0.2
HOST_3=10.104.0.3
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
```

2. On `node3`, create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.

```yaml title="/etc/etcd/etcd.conf.yaml"
name: 'node3'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: existing
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.3:2380
listen-peer-urls: http://10.104.0.3:2380
advertise-client-urls: http://10.104.0.3:2379
listen-client-urls: http://10.104.0.3:2379
```
2. Start each etcd node in parallel using the following command:

3. Start the `etcd` service to apply the changes.
=== "node1"

```{.bash data-prompt="$"}
$ sudo systemctl enable --now etcd
$ sudo systemctl start etcd
$ sudo systemctl status etcd
```
```{.bash data-prompt="$"}
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
```

4. Check the etcd cluster members.
=== "node2"

```{.bash data-prompt="$"}
$ sudo etcdctl member list
```
```{.bash data-prompt="$"}
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
```

??? example "Sample output"
=== "node3"

```{.bash data-prompt="$"}
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
```
```{.text .no-copy}
2d346bd3ae7f07c4: name=node2 peerURLs=http://10.104.0.2:2380 clientURLs=http://10.104.0.2:2379 isLeader=false
8bacb519ebdee8db: name=node3 peerURLs=http://10.104.0.3:2380 clientURLs=http://10.104.0.3:2379 isLeader=false
c5f52ea2ade25e1b: name=node1 peerURLs=http://10.104.0.1:2380 clientURLs=http://10.104.0.1:2379 isLeader=true
```

--8<-- "check-etcd.md"
67 changes: 67 additions & 0 deletions docs/solutions/ha-haproxy.md
@@ -0,0 +1,67 @@
# Configure HAProxy

HAProxy is the load balancer and the single point of entry to your PostgreSQL cluster for client applications. A client application accesses the HAProxy URL and sends its read/write requests there. Behind the scenes, HAProxy routes write requests to the primary node and read requests to the secondaries in a round-robin fashion so that no secondary instance is unnecessarily loaded. To make this happen, provide different ports in the HAProxy configuration file. In this deployment, writes are routed to port 5000 and reads to port 5001.

This way, a client application doesn’t need to know which node in the underlying cluster is the current primary. HAProxy sends connections to a healthy node (as long as there is at least one healthy node available) and ensures that client application requests are never rejected.
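
As a quick illustration of this routing, the following sketch connects through HAProxy once on each port; it assumes the load balancer host is reachable as `HAProxy-demo` and that the `postgres` user can authenticate:

```{.bash data-prompt="$"}
$ psql -h HAProxy-demo -p 5000 -U postgres -c 'SELECT pg_is_in_recovery();'   # routed to the primary, returns "f"
$ psql -h HAProxy-demo -p 5001 -U postgres -c 'SELECT pg_is_in_recovery();'   # routed to a replica, returns "t"
```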

1. Install HAProxy on the `HAProxy-demo` node:

```{.bash data-prompt="$"}
$ sudo apt install percona-haproxy
```

2. The HAProxy configuration file path is: `/etc/haproxy/haproxy.cfg`. Specify the following configuration in this file.

```
global
    maxconn 100

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen primary
    bind *:5000
    option httpchk /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
    server node3 node3:5432 maxconn 100 check port 8008

listen standbys
    balance roundrobin
    bind *:5001
    option httpchk /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
    server node3 node3:5432 maxconn 100 check port 8008
```


HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and route the requests appropriately.
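
To see what these health checks observe, you can query the Patroni REST API directly from any node. This assumes the REST API listens on port 8008, as referenced by `check port 8008` in the configuration above:

```{.bash data-prompt="$"}
$ curl -s -o /dev/null -w '%{http_code}\n' http://node1:8008/primary   # 200 on the current primary, 503 otherwise
$ curl -s -o /dev/null -w '%{http_code}\n' http://node1:8008/replica   # 200 on a healthy replica, 503 otherwise
```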

3. Restart HAProxy:

```{.bash data-prompt="$"}
$ sudo systemctl restart haproxy
```

4. Check the HAProxy logs to see if there are any errors:

```{.bash data-prompt="$"}
$ sudo journalctl -u haproxy.service -n 100 -f
```
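
You can also open the statistics page that the `listen stats` section above exposes on port 7000, either in a browser or, as a rough check, with `curl` on the HAProxy node:

```{.bash data-prompt="$"}
$ curl -s http://localhost:7000/ | head
```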
4 changes: 2 additions & 2 deletions docs/solutions/ha-install-software.md
@@ -70,8 +70,8 @@ Run the following commands as root or with `sudo` privileges.
3. Stop and disable all installed services:

```{.bash data-prompt="$"}
$ sudo systemctl stop {etcd,patroni,postgresql}
$ sudo systemctl disable {etcd,patroni,postgresql}
$ sudo systemctl stop {etcd,patroni,postgresql-{{pgversion}}}
$ sudo systemctl disable {etcd,patroni,postgresql-{{pgversion}}}
```

4. Even though Patroni can use an existing Postgres installation, remove the data directory to force it to initialize a new Postgres cluster instance.
7 changes: 6 additions & 1 deletion docs/solutions/ha-measure.md
@@ -1,6 +1,11 @@
# Measuring high availability

The need for high availability is determined by the business requirements, potential risks, and operational limitations. The level of high availability depends on how much downtime you can bear without negatively impacting your users and how much data loss you can tolerate during the system outage.
The need for high availability is determined by the business requirements, potential risks, and operational limitations (e.g. the more components you add to your infrastructure, the more complex and time-consuming it is to maintain).

The level of high availability depends on the following:

* how much downtime you can bear without negatively impacting your users, and
* how much data loss you can tolerate during a system outage.

Availability is measured by establishing a measurement time frame and dividing the time the system was available by the total length of that time frame. This ratio will rarely equal one, which corresponds to 100% availability. At Percona, we don’t consider a solution to be highly available if it is not at least 99%, or two nines, available.
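
For example, a system that accumulates 7 hours 12 minutes of downtime over a 30-day window was available for 712.8 of 720 hours, which works out to exactly 99%, or two nines.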
