From be8096ce9ba98e532df8a8976fc0d8bde2b074c4 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman Date: Thu, 30 Nov 2023 12:26:13 -0500 Subject: [PATCH 1/7] edits to big animal high availability content PR4885 --- .../distributed_highavailability.mdx | 12 ++++++------ .../primary_standby_highavailability.mdx | 2 +- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx b/product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx index 5e06ba49951..cecf054cfa7 100644 --- a/product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx +++ b/product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx @@ -2,7 +2,7 @@ title: "Distributed high availability" --- -Distributed high-availability clusters are powered by [EDB Postgres Distributed](/pgd/latest/). They use multi-master logical replication to deliver more advanced cluster management compared to a physical replication-based system. Distributed high-availability clusters let you deploy a cluster across multiple regions or a single region. For use cases where high availability across regions is a major concern, a cluster deployment with distributed high availability enabled can provide two data groups with a witness group in a third region +Distributed high-availability clusters are powered by [EDB Postgres Distributed](/pgd/latest/). They use multi-master logical replication to deliver more advanced cluster management compared to a physical replication-based system. Distributed high-availability clusters let you deploy a cluster across multiple regions or a single region. For use cases where high availability across regions is a major concern, a cluster deployment with distributed high availability enabled can provide two data groups with a witness group in a third region. 
This configuration provides a true active-active solution as each data group is configured to accept writes. @@ -19,13 +19,13 @@ The witness node/witness group doesn't host data but exists for management purpo ## Single data location -A configuration with single data location has one data group and either: +A configuration with a single data location has one data group and either: -- Two data nodes with one lead and one shadow and a witness node each in separate availability zones +- Two data nodes with one lead, one shadow, and a witness node, each in separate availability zones ![region(2 data + 1 witness)](../images/image5.png) -- Three data nodes with one lead and two shadow nodes each in separate availability zones +- Three data nodes with one lead and two shadow nodes, each in separate availability zones ![region(3 data)](../images/image3.png) @@ -53,9 +53,9 @@ A configuration with multiple data locations has two data groups that contain ei By default, the cloud service provider selected for the data groups is preselected for the witness node. -To guard against cloud service provider failures, you can designate a witness node on a different cloud service provider than the data groups. This configuration can enable a three-region configuration even if a single cloud provider only offers two regions in the jurisdiction you are allowed to deploy your cluster in. +To guard against cloud service provider failures, you can designate a witness node on a cloud service provider different from the one for data groups. This configuration can enable a three-region configuration even if a single cloud provider offers only two regions in the jurisdiction you're allowed to deploy your cluster in. -Cross-cloud service provider witness nodes are available with AWS, Azure, and Google Cloud using your own cloud account and BigAnimal's cloud account. This option is enabled by default and applies to both multi-region configurations available with PGD. 
For witness nodes you only pay for the used infrastructure, which is reflected in the pricing estimate. +Cross-cloud service provider witness nodes are available with AWS, Azure, and Google Cloud using your own cloud account and BigAnimal's cloud account. This option is enabled by default and applies to both multi-region configurations available with PGD. For witness nodes you pay only for the used infrastructure, which is reflected in the pricing estimate. ## For more information diff --git a/product_docs/docs/biganimal/release/overview/02_high_availability/primary_standby_highavailability.mdx b/product_docs/docs/biganimal/release/overview/02_high_availability/primary_standby_highavailability.mdx index f644e3998b7..088213a3ea6 100644 --- a/product_docs/docs/biganimal/release/overview/02_high_availability/primary_standby_highavailability.mdx +++ b/product_docs/docs/biganimal/release/overview/02_high_availability/primary_standby_highavailability.mdx @@ -2,7 +2,7 @@ title: "Primary/standby high availability" --- -The Primary/Standby High Availability option is provided to minimize downtime in cases of failures. Primary/standby high-availability clusters—one *primary* and one or two *standby replicas*—are configured automatically, with standby replicas staying up to date through physical streaming replication. +The Primary/Standby High Availability option is provided to minimize downtime in cases of failures. Primary/standby high-availability clusters—one *primary* and one or two *standby replicas*—are configured automatically. Standby replicas stay up to date through physical streaming replication. If read-only workloads are enabled, then standby replicas serve the read-only workloads. In a two-node cluster, the single standby replica serves read-only workloads. In a three-node cluster, both standby replicas serve read-only workloads. The connections are made to the two standby replicas randomly and on a per-connection basis. 
From 18d483b9ccadb81a6f804607f87722dc4abbacc4 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman Date: Thu, 30 Nov 2023 12:27:48 -0500 Subject: [PATCH 2/7] Update product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx --- .../02_high_availability/distributed_highavailability.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx b/product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx index cecf054cfa7..1ce708499fb 100644 --- a/product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx +++ b/product_docs/docs/biganimal/release/overview/02_high_availability/distributed_highavailability.mdx @@ -55,7 +55,7 @@ By default, the cloud service provider selected for the data groups is preselect To guard against cloud service provider failures, you can designate a witness node on a cloud service provider different from the one for data groups. This configuration can enable a three-region configuration even if a single cloud provider offers only two regions in the jurisdiction you're allowed to deploy your cluster in. -Cross-cloud service provider witness nodes are available with AWS, Azure, and Google Cloud using your own cloud account and BigAnimal's cloud account. This option is enabled by default and applies to both multi-region configurations available with PGD. For witness nodes you pay only for the used infrastructure, which is reflected in the pricing estimate. +Cross-cloud service provider witness nodes are available with AWS, Azure, and Google Cloud using your own cloud account and BigAnimal's cloud account. This option is enabled by default and applies to both multi-region configurations available with PGD. For witness nodes you pay only for the infrastructure used, which is reflected in the pricing estimate. 
## For more information From 57f599e43da5cf0393e797845e45483017d75ced Mon Sep 17 00:00:00 2001 From: sergioenterprisedb <86610980+sergioenterprisedb@users.noreply.github.com> Date: Fri, 1 Dec 2023 08:29:27 +0100 Subject: [PATCH 3/7] Replace orace by oracle Replace orace by oracle --- product_docs/docs/epas/16/index.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/product_docs/docs/epas/16/index.mdx b/product_docs/docs/epas/16/index.mdx index 4d55246181d..a57540a542e 100644 --- a/product_docs/docs/epas/16/index.mdx +++ b/product_docs/docs/epas/16/index.mdx @@ -38,7 +38,7 @@ All of these features are available in Postgres mode and [Oracle compatibility m - [Oracle-compatible custom data types](reference/sql_reference/02_data_types/) - [Oracle keywords](reference/sql_reference/01_sql_syntax/) - [Oracle functions](reference/sql_reference/03_functions_and_operators/) -- [Orace-style catalog views](reference/oracle_compatibility_reference/epas_compat_cat_views/) +- [Oracle-style catalog views](reference/oracle_compatibility_reference/epas_compat_cat_views/) - [Additional compatibility with Oracle MERGE](reference/oracle_compatibility_reference/epas_compat_sql/65a_merge.mdx). EDB also makes available a [full suite of tools and utilities](tools_utilities_and_components) that helps you monitor and manage your EDB Postgres Advanced Server deployment. From a032f952bd06c1f97523eb60e70bedda98e19e83 Mon Sep 17 00:00:00 2001 From: Dj Walker-Morgan <126472455+djw-m@users.noreply.github.com> Date: Tue, 5 Dec 2023 11:04:28 +0000 Subject: [PATCH 4/7] Fix GRANT ROLE reference Resolve BDR-4128 - "Incorrect statement". 
--- product_docs/docs/pgd/5/security/pgd-predefined-roles.mdx | 3 --- 1 file changed, 3 deletions(-) diff --git a/product_docs/docs/pgd/5/security/pgd-predefined-roles.mdx b/product_docs/docs/pgd/5/security/pgd-predefined-roles.mdx index ce4f2643bcd..2ceb0ef914e 100644 --- a/product_docs/docs/pgd/5/security/pgd-predefined-roles.mdx +++ b/product_docs/docs/pgd/5/security/pgd-predefined-roles.mdx @@ -7,9 +7,6 @@ extension is dropped from a database, the roles continue to exist. You need to drop them manually if dropping is required. This practice allows PGD to be used in multiple databases on the same PostgreSQL instance without problem. -The `GRANT ROLE` DDL statement doesn't participate in PGD replication. Thus, -execute this on each node of a cluster. - ### bdr_superuser - ALL PRIVILEGES ON ALL TABLES IN SCHEMA BDR From c765150ca3a7121be4e2fc231e78a7a34a1533ee Mon Sep 17 00:00:00 2001 From: Petr Jelinek Date: Mon, 4 Dec 2023 18:24:59 +0100 Subject: [PATCH 5/7] PGD5: initial shot at manual installation quickstart --- .../pgd/5/quickstart/quick_start_manual.mdx | 309 ++++++++++++++++++ 1 file changed, 309 insertions(+) create mode 100644 product_docs/docs/pgd/5/quickstart/quick_start_manual.mdx diff --git a/product_docs/docs/pgd/5/quickstart/quick_start_manual.mdx b/product_docs/docs/pgd/5/quickstart/quick_start_manual.mdx new file mode 100644 index 00000000000..cd10be46ba8 --- /dev/null +++ b/product_docs/docs/pgd/5/quickstart/quick_start_manual.mdx @@ -0,0 +1,309 @@ +--- +title: "Deploying an EDB Postgres Distributed example cluster on Linux hosts by hand" +navTitle: "Deploying on Linux hosts by hand" +description: > + A quick demonstration of deploying a PGD architecture on Linux hosts by hand +--- + +## Prerequisites + +### Configure your Linux hosts + +You will need to provision four hosts for this quick start. Each host should have a +[supported Linux operating system](https://www.enterprisedb.com/resources/platform-compatibility#bdr) +installed. + +!!! 
Note On machine provisioning +AWS users can follow [an Amazon guide](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html) +on how to provision EC2 linux instances. +Azure users can follow [a Microsoft guide](https://learn.microsoft.com/en-us/azure/virtual-machines/linux/quick-create-portal?tabs=ubuntu) on how to provision Azure VMs loaded with Linux. Google Cloud Platform users can follow [a Google guide](https://cloud.google.com/compute/docs/create-linux-vm-instance) on how to provision GCP VMs with Linux loaded. You can use any virtual machine technology to host a Linux instance, too. Refer to your virtualization platform's documentation for instructions on how to create instances with Linux loaded on them. + +Whichever cloud or VM platform you use, you need to make sure that each instance is accessible by SSH and that each instance can connect to the other instances. They can connect through either the public network or over a VPC for the cloud platforms. You can connect through your local network for on-premises VMs. +!!! + +In this quick start, you will install PGD nodes onto three hosts configured in the cloud. Each of these hosts in this example is installed with Rocky Linux. Each has a public IP address to go with its private IP address. + +| Host name | Public IP | Private IP | +| ----------- | ------------------------ | -------------- | +| linuxhost-1 | 172.19.16.27 | 192.168.2.247 | +| linuxhost-2 | 172.19.16.26 | 192.168.2.41 | +| linuxhost-3 | 172.19.16.25 | 192.168.2.254 | + +These are example IP addresses. Substitute them with your own public and private IP addresses as you progress through the quick start. + +### Set up a host admin user + +Each machine requires a user account to use for installation. For simplicity, use a user with the same name on all the hosts. On each host, also configure the user so that you can SSH into the host without being prompted for a password. Be sure to give that user sudo privileges on the host. 
On the four hosts, the user rocky is already configured with sudo privileges. + +### Set up repository access + +Before you begin the installation process: + +- Install Postgres on the same host (not needed for witness nodes) + + - See [Installing EDB Postgres Advanced Server](/epas/latest/epas_inst_linux) + + - See [PostgreSQL Downloads](https://www.postgresql.org/download/) + +- Set up the EDB repository + + Setting up the repository is a one-time task. If you have already set up your repository, you don't need to perform this step. + + To determine if your repository exists, enter this command: + + `dnf repolist | grep enterprisedb` + + If no output is generated, the repository isn't installed. + + To set up the EDB repository: + + 1. Go to [EDB repositories](https://www.enterprisedb.com/repos-downloads). + + 2. Select the button that provides access to the EDB repository. + + 3. Select the platform and software that you want to download. + + + + 4. Follow the instructions for setting up the EDB repository. + EDB Postgres Distributed packages come from the `postgres_distributed` repository. + +- Install the EPEL repository + + ```shell + sudo yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm + ``` + +## Install the package + +Install PGD5 packages for [EDB Postgres Advanced Server v15](https://www.enterprisedb.com/docs/epas/latest/): + +```shell +sudo yum -y install edb-bdr5-epas15 +``` + +We also need to install the proxy and the CLI packages: + +```shell +sudo yum -y install edb-pgd5-proxy +sudo yum -y install edb-pgd5-cli +``` + + + +## Initial Postgres configuration + +We need to configure Postgres in order to be able to use PGD with it.
+ +Ensure that the Postgres configuration on every host includes the following: + +- `shared_preload_libraries = 'bdr'` - this loads the binary itself, you may add additional + extensions here as needed, the order generally does not matter +- `wal_level = logical` - enables extra information logging +- `track_commit_timestamp = on` - this enables the feature that writes the timestamp of a COMMIT to + the transaction log + +There are other configuration options that might need adjusting depending on the cluster size. +Those are documented in the [Postgres configuration](/pgd/latest/postgres-configuration) chapter +of the PGD documentation. + +We won't need to adjust anything else in Postgres config for our example cluster with 3 nodes +in a single location. + +!!! Note + Make sure you restart Postgres after changing the above parameters. + +## PGD setup + +First, we need to create a Postgres database which will represent the PGD node on each host. + +``` +CREATE DATABASE bdrdb; +``` + +Once the database is created, connect to it and install the BDR extension there. + +``` +CREATE EXTENSION BDR; +``` + +Adding PGD Replication user +Adding PGD Proxy user + +After that we need to create a PGD node record. + +``` +SELECT bdr.create_node('linuxhost-1', 'host=linuxhost-1 dbname=bdrdb'); +``` + +Do this on each node, changing the host name and connection string appropriately. + +Then, on the first node (let's assume it's the `linuxhost-1` node), create PGD node groups that +will represent the cluster and the location (`dc1` in our case). + +``` +SELECT bdr.create_node_group('pgd'); +SELECT bdr.create_node_group('dc1', 'pgd'); +``` + +Once this is done, connect to the other two hosts and join the `dc1` group with the +following command: + +``` +SELECT bdr.join_node_group('host=linuxhost-1 dbname=bdrdb', 'dc1'); +``` + +Note the first parameter here, which is the connection string to the node on which we +created the `dc1` group.
+ +It may take a while for these commands to finish as the node membership updates. If the +process is taking more than a few seconds, the function will start reporting NOTICE messages +with the current state of the joining process. + +Once everything is finished, we have an EDB Postgres Distributed cluster up and running. + +Now we can check the cluster shape with: + + + +``` +SELECT * FROM bdr.node_summary; +``` + +And the replication status of all nodes with: + +``` +SELECT * FROM bdr.group_subscription_summary; +``` + +This, however, is not the full setup. We need to make the cluster usable by applications +without worrying about failover and conflicts. For that, we need to set up transparent routing +to the write leader. + +## Routing setup + +Before we start proxy setup, we need to set up automatic connection routing in PGD itself. + +To do that, connect to any of the PGD nodes and execute the following: + +``` +SELECT bdr.alter_node_group_option('dc1', 'enable_proxy_routing', 'true'); +``` + +When this is done, we need to tell PGD that we plan to add proxies and store configuration +for them. We'll put a proxy on every PGD node and keep the default behavior, so we simply create +a record for each of them. + +Again, on any PGD node (can be the same one as above), execute this command: + +``` +SELECT bdr.create_proxy('pgd-proxy', 'dc1'); +``` + +This creates the default configuration for 3 proxies that will route connections transparently +to the write leader of the `dc1` group. + +### PGD Proxy setup + +Now, we need to configure the proxies themselves. + +Locate the configuration file of the pgd-proxy (`/etc/edb/pgd-proxy`) on every node, +and put the following in it.
+ +```yaml +log-level: info # debug, info, warn, error +log-encoder: text # text, json +cluster: + name: pgd + endpoints: + - host=linuxhost-1 dbname=bdrdb + - host=linuxhost-2 dbname=bdrdb + - host=linuxhost-3 dbname=bdrdb + proxy: + name: pgd-proxy + endpoint: "host=localhost port=6432 dbname=bdrdb" +``` + +The endpoints should contain connection strings to all PGD nodes in the same group (`dc1`). + +The name of the proxy configuration must correspond to the name created above with the +`bdr.create_proxy` command. + +!!! Note + The proxy must listen on a different port than Postgres itself. Here we use port + 6432, which is the port applications must connect to. + +Now we can start the proxies. On every host, start the pgd-proxy service using systemd. + +```shell +sudo systemctl restart pgd-proxy +``` + +Once this is done, we can connect to one of the proxies. + +``` +psql -h linuxhost-1,linuxhost-2,linuxhost-3 -p 6432 bdrdb +__OUTPUT__ +psql (15.2.0, server 15.2.0) +SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off) +Type "help" for help. + +bdrdb=# +``` + +Using all 3 hostnames will result in psql connecting automatically to whichever proxy is up, +and the proxy will route the connection to the current write leader. + +You're connected to the Postgres database using PGD Proxy and can start issuing SQL commands. + +To leave the SQL client, enter `exit`. + +### Using PGD CLI + +The pgd utility, also known as the PGD CLI, lets you control and manage your EDB Postgres Distributed cluster. +We've already installed it on all nodes in the first step.
+ +You can use it to check the cluster's health by running `pgd check-health`: + +```shell +pgd --dsn 'host=linuxhost-1 dbname=bdrdb' check-health +__OUTPUT__ +Check Status Message +----- ------ ------- +ClockSkew Ok All BDR node pairs have clockskew within permissible limit +Connection Ok All BDR nodes are accessible +Raft Ok Raft Consensus is working correctly +Replslots Ok All BDR replication slots are working correctly +Version Ok All nodes are running same BDR versions +``` + +Or, you can use `pgd show-nodes` to ask PGD to show you the data-bearing nodes in the cluster: + +```shell +pgd --dsn 'host=linuxhost-1 dbname=bdrdb' show-nodes +__OUTPUT__ +Node Node ID Group Type Current State Target State Status Seq ID +---- ------- ----- ---- ------------- ------------ ------ ------ +linuxhost-1 2710197610 dc1_subgroup data ACTIVE ACTIVE Up 1 +linuxhost-2 3490219809 dc1_subgroup data ACTIVE ACTIVE Up 3 +linuxhost-3 2111777360 dc1_subgroup data ACTIVE ACTIVE Up 2 +``` + +Similarly, use `pgd show-proxies` to display the proxy connection nodes: + +```shell +pgd --dsn 'host=linuxhost-1 dbname=bdrdb' show-proxies +__OUTPUT__ +Proxy Group Listen Addresses Listen Port +----- ----- ---------------- ----------- +linuxhost-1 dc1_subgroup [0.0.0.0] 6432 +linuxhost-2 dc1_subgroup [0.0.0.0] 6432 +linuxhost-3 dc1_subgroup [0.0.0.0] 6432 +``` + +## Explore your cluster + +* [Connect to your database](connecting_applications) to applications +* [Explore failover](further_explore_failover) with hands-on exercises +* [Understand conflicts](further_explore_conflicts) by creating and monitoring them +* [Next steps](next_steps) in working with your cluster From 12627a10dd64a29a87e7ba6ebf6626ea65432b9e Mon Sep 17 00:00:00 2001 From: Dj Walker-Morgan Date: Wed, 6 Dec 2023 08:44:10 +0000 Subject: [PATCH 6/7] Updated for ordering of pgaudit Signed-off-by: Dj Walker-Morgan --- product_docs/docs/pgd/5/postgres-configuration.mdx | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git 
a/product_docs/docs/pgd/5/postgres-configuration.mdx b/product_docs/docs/pgd/5/postgres-configuration.mdx index b275b9c96e2..4318c54fc25 100644 --- a/product_docs/docs/pgd/5/postgres-configuration.mdx +++ b/product_docs/docs/pgd/5/postgres-configuration.mdx @@ -16,7 +16,8 @@ To run correctly, PGD requires these Postgres settings: - `wal_level` — Must be set to `logical`, since PGD relies on logical decoding. - `shared_preload_libraries` — Must contain `bdr`, although it can contain - other entries before or after, as needed. However, don't include `pglogical`. + other entries before or after, as needed. Some libraries, such as `pgaudit`, + must come after `bdr`. Don't include `pglogical` in the list. - `track_commit_timestamp` — Must be set to `on` for conflict resolution to retrieve the timestamp for each conflicting row. From e50ca522ad70c8bfb5084e124eb32d4ed0f7f54c Mon Sep 17 00:00:00 2001 From: Dj Walker-Morgan Date: Wed, 6 Dec 2023 09:34:22 +0000 Subject: [PATCH 7/7] Update text, remove accidental included file Signed-off-by: Dj Walker-Morgan --- .../docs/pgd/5/postgres-configuration.mdx | 5 +- .../pgd/5/quickstart/quick_start_manual.mdx | 309 ------------------ 2 files changed, 2 insertions(+), 312 deletions(-) delete mode 100644 product_docs/docs/pgd/5/quickstart/quick_start_manual.mdx diff --git a/product_docs/docs/pgd/5/postgres-configuration.mdx b/product_docs/docs/pgd/5/postgres-configuration.mdx index 4318c54fc25..3ed58e9cf95 100644 --- a/product_docs/docs/pgd/5/postgres-configuration.mdx +++ b/product_docs/docs/pgd/5/postgres-configuration.mdx @@ -15,9 +15,8 @@ For PGD's own settings, see the [PGD settings reference](reference/pgd-settings) To run correctly, PGD requires these Postgres settings: - `wal_level` — Must be set to `logical`, since PGD relies on logical decoding. -- `shared_preload_libraries` — Must contain `bdr`, although it can contain - other entries before or after, as needed. Some libraries, such as `pgaudit`, - must come after `bdr`. 
Don't include `pglogical` in the list. +- `shared_preload_libraries` — Must start with `bdr`, before other + entries. Don't include `pglogical` in this list - `track_commit_timestamp` — Must be set to `on` for conflict resolution to retrieve the timestamp for each conflicting row. diff --git a/product_docs/docs/pgd/5/quickstart/quick_start_manual.mdx b/product_docs/docs/pgd/5/quickstart/quick_start_manual.mdx deleted file mode 100644 index cd10be46ba8..00000000000 --- a/product_docs/docs/pgd/5/quickstart/quick_start_manual.mdx +++ /dev/null @@ -1,309 +0,0 @@ ---- -title: "Deploying an EDB Postgres Distributed example cluster on Linux hosts by hand" -navTitle: "Deploying on Linux hosts by hand" -description: > - A quick demonstration of deploying a PGD architecture on Linux hosts by hand ---- - -## Prerequisites - -### Configure your Linux hosts - -You will need to provision four hosts for this quick start. Each host should have a -[supported Linux operating system](https://www.enterprisedb.com/resources/platform-compatibility#bdr) -installed. - -!!! Note On machine provisioning -AWS users can follow [an Amazon guide](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html) -on how to provision EC2 linux instances. -Azure users can follow [a Microsoft guide](https://learn.microsoft.com/en-us/azure/virtual-machines/linux/quick-create-portal?tabs=ubuntu) on how to provision Azure VMs loaded with Linux. Google Cloud Platform users can follow [a Google guide](https://cloud.google.com/compute/docs/create-linux-vm-instance) on how to provision GCP VMs with Linux loaded. You can use any virtual machine technology to host a Linux instance, too. Refer to your virtualization platform's documentation for instructions on how to create instances with Linux loaded on them. - -Whichever cloud or VM platform you use, you need to make sure that each instance is accessible by SSH and that each instance can connect to the other instances. 
They can connect through either the public network or over a VPC for the cloud platforms. You can connect through your local network for on-premises VMs. -!!! - -In this quick start, you will install PGD nodes onto three hosts configured in the cloud. Each of these hosts in this example is installed with Rocky Linux. Each has a public IP address to go with its private IP address. - -| Host name | Public IP | Private IP | -| ----------- | ------------------------ | -------------- | -| linuxhost-1 | 172.19.16.27 | 192.168.2.247 | -| linuxhost-2 | 172.19.16.26 | 192.168.2.41 | -| linuxhost-3 | 172.19.16.25 | 192.168.2.254 | - -These are example IP addresses. Substitute them with your own public and private IP addresses as you progress through the quick start. - -### Set up a host admin user - -Each machine requires a user account to use for installation. For simplicity, use a user with the same name on all the hosts. On each host, also configure the user so that you can SSH into the host without being prompted for a password. Be sure to give that user sudo privileges on the host. On the four hosts, the user rocky is already configured with sudo privileges. - -### Set up repository access - -Before you begin the installation process: - -- Install Postgres on the same host (not needed for witness nodes) - - - See [Installing EDB Postgres Advanced Server](/epas/latest/epas_inst_linux) - - - See [PostgreSQL Downloads](https://www.postgresql.org/download/) - -- Set up the EDB repository - - Setting up the repository is a one-time task. If you have already set up your repository, you don't need to perform this step. - - To determine if your repository exists, enter this command: - - `dnf repolist | grep enterprisedb` - - If no output is generated, the repository isn't installed. - - To set up the EDB repository: - - 1. Go to [EDB repositories](https://www.enterprisedb.com/repos-downloads). - - 2. Select the button that provides access to the EDB repository. - - 3.
Select the platform and software that you want to download. - - - - 4. Follow the instructions for setting up the EDB repository. - EDB Postgres Distributed packages come from the `postgres_distributed` repository. - -- Install the EPEL repository - - ```shell - sudo yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm - ``` - -## Install the package - -Install PGD5 packages for [EDB Postgres Advanced Server v15](https://www.enterprisedb.com/docs/epas/latest/): - -```shell -sudo yum -y install edb-bdr5-epas15 -``` - -We also need to install the proxy and the CLI packages: - -```shell -sudo yum -y install edb-pgd5-proxy -sudo yum -y install edb-pgd5-cli -``` - - - -## Initial Postgres configuration - -We need to configure Postgres in order to be able to use PGD with it. - -Ensure that the Postgres configuration on every host includes the following: - -- `shared_preload_libraries = 'bdr'` - this loads the binary itself, you may add additional - extensions here as needed, the order generally does not matter -- `wal_level = logical` - enables extra information logging -- `track_commit_timestamp = on` - this enables the feature that writes the timestamp of a COMMIT to - the transaction log - -There are other configuration options that might need adjusting depending on the cluster size. -Those are documented in the [Postgres configuration](/pgd/latest/postgres-configuration) chapter -of the PGD documentation. - -We won't need to adjust anything else in Postgres config for our example cluster with 3 nodes -in a single location. - -!!! Note - Make sure you restart Postgres after changing the above parameters. - -## PGD setup - -First, we need to create a Postgres database which will represent the PGD node on each host. - -``` -CREATE DATABASE bdrdb; -``` - -Once the database is created, connect to it and install the BDR extension there.
- -``` -CREATE EXTENSION BDR; -``` - -Adding PGD Replication user -Adding PGD Proxy user - -After that we need to create a PGD node record. - -``` -SELECT bdr.create_node('linuxhost-1', 'host=linuxhost-1 dbname=bdrdb'); -``` - -Do this on each node, changing the host name and connection string appropriately. - -Then, on the first node (let's assume it's the `linuxhost-1` node), create PGD node groups that -will represent the cluster and the location (`dc1` in our case). - -``` -SELECT bdr.create_node_group('pgd'); -SELECT bdr.create_node_group('dc1', 'pgd'); -``` - -Once this is done, connect to the other two hosts and join the `dc1` group with the -following command: - -``` -SELECT bdr.join_node_group('host=linuxhost-1 dbname=bdrdb', 'dc1'); -``` - -Note the first parameter here, which is the connection string to the node on which we -created the `dc1` group. - -It may take a while for these commands to finish as the node membership updates. If the -process is taking more than a few seconds, the function will start reporting NOTICE messages -with the current state of the joining process. - -Once everything is finished, we have an EDB Postgres Distributed cluster up and running. - -Now we can check the cluster shape with: - - - -``` -SELECT * FROM bdr.node_summary; -``` - -And the replication status of all nodes with: - -``` -SELECT * FROM bdr.group_subscription_summary; -``` - -This, however, is not the full setup. We need to make the cluster usable by applications -without worrying about failover and conflicts. For that, we need to set up transparent routing -to the write leader. - -## Routing setup - -Before we start proxy setup, we need to set up automatic connection routing in PGD itself. - -To do that, connect to any of the PGD nodes and execute the following: - -``` -SELECT bdr.alter_node_group_option('dc1', 'enable_proxy_routing', 'true'); -``` - -When this is done, we need to tell PGD that we plan to add proxies and store configuration -for them.
We'll put proxy on every PGD node and keep default behavior, so we simply create -record for each of them. - -Again, on any PGD node (can be same one as above), execute these commands. - -``` -SELECT bdr.create_proxy('pgd-proxy', 'dc1'); -``` - -This creates default configuration for 3 proxies that will route connections transparently -to the write leader of the `dc1` group. - -### PGD Proxy setup - -Now, we need to configure the proxies themselves. - -Locate the configuration file of the pgd-proxy (`/etc/edb/pgd-proxy`) or every node, -and put following in it. - -```yaml -log-level: info # debug, info, warn, error -log-encoder: text # text, json -cluster: - name: pgd - endpoints: - - host=linuxhost-1 dbname=bdrdb - - host=linuxhost-2 dbname=bdrdb - - host=linuxhost-3 dbname=bdrdb - proxy: - name: pgd-proxy - endpoint: "host=localhost port=6432 dbname=bdrdb" -``` - -The endpoints should contain connection strings to all PGD nodes in the same group (`dc1`). - -The name of the proxy configuration must correspond to the name created above with the -`bdr.create_proxy` command. - -!!! Note - The proxy must listen on different port that Postgres itself, here we use the port - 6432, this will be the port application must connect to. - -Now we can start the proxies. On every host, start the pgd-proxy service using systemd. - -```shell -sudo systemctl restart pgd-proxy -``` - -Once this is done, we can connect to one of the proxies. - -``` -psql -h linuxhost-1,linuxhost-2,linuxhost-3 -p 6432 bdrdb -__OUTPUT__ -psql (15.2.0, server 15.2.0) -SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off) -Type "help" for help. - -bdrdb=# -``` - -Using all 3 hostnames will result in psql connecting automatically to whichever proxy is up, -and the proxy will route the connection to current write leader. - -You're connected to the Postgres database using PGD Proxy and can start issuing SQL commands. - -To leave the SQL client, enter `exit`. 
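
The multi-host `psql` call above can also be written as a single libpq connection string, which is the form most applications accept. The following is a minimal sketch, assuming the host names, port 6432, and the `bdrdb` database from this walkthrough; the variable names are illustrative and not part of PGD. libpq tries the listed hosts in order, and a single `port` value applies to all of them.

```shell
# Sketch only: variable names are assumptions made for illustration.
# Hosts, port, and database are the ones used in this walkthrough.
proxy_hosts="linuxhost-1,linuxhost-2,linuxhost-3"
proxy_port=6432
database=bdrdb

# libpq accepts a comma-separated host list; one port applies to every host.
conninfo="host=${proxy_hosts} port=${proxy_port} dbname=${database}"

# Print the string so it can be copied into an application's configuration.
printf '%s\n' "$conninfo"

# With a running cluster, the same string can be passed straight to psql:
#   psql "$conninfo"
```

Keeping all three proxies in one string means the application needs no separate logic to pick a proxy; the client library falls through to the next host if one is down.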
- -### Using PGD CLI - -The pgd utility, also known as the PGD CLI, lets you control and manage your EDB Postgres Distributed cluster. -We've already installed it on all nodes in the first step. - -You can use it to check the cluster's health by running `pgd check-health`: - -```shell -pgd --dsn 'host=linuxhost-1 dbname=bdrdb' check-health -__OUTPUT__ -Check Status Message ------ ------ ------- -ClockSkew Ok All BDR node pairs have clockskew within permissible limit -Connection Ok All BDR nodes are accessible -Raft Ok Raft Consensus is working correctly -Replslots Ok All BDR replication slots are working correctly -Version Ok All nodes are running same BDR versions -``` - -Or, you can use `pgd show-nodes` to ask PGD to show you the data-bearing nodes in the cluster: - -```shell -pgd --dsn 'host=linuxhost-1 dbname=bdrdb' show-nodes -__OUTPUT__ -Node Node ID Group Type Current State Target State Status Seq ID ----- ------- ----- ---- ------------- ------------ ------ ------ -linuxhost-1 2710197610 dc1_subgroup data ACTIVE ACTIVE Up 1 -linuxhost-2 3490219809 dc1_subgroup data ACTIVE ACTIVE Up 3 -linuxhost-3 2111777360 dc1_subgroup data ACTIVE ACTIVE Up 2 -``` - -Similarly, use `pgd show-proxies` to display the proxy connection nodes: - -```shell -pgd --dsn 'host=linuxhost-1 dbname=bdrdb' show-proxies -__OUTPUT__ -Proxy Group Listen Addresses Listen Port ------ ----- ---------------- ----------- -linuxhost-1 dc1_subgroup [0.0.0.0] 6432 -linuxhost-2 dc1_subgroup [0.0.0.0] 6432 -linuxhost-3 dc1_subgroup [0.0.0.0] 6432 -``` - -## Explore your cluster - -* [Connect to your database](connecting_applications) to applications -* [Explore failover](further_explore_failover) with hands-on exercises -* [Understand conflicts](further_explore_conflicts) by creating and monitoring them -* [Next steps](next_steps) in working with your cluster
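
The `pgd check-health` example earlier points at a single node's DSN. As a rough sketch, the same check can be repeated against each of the three hosts, which can help confirm that every node is reachable on its own. The loop and the fallback that only prints the command when `pgd` isn't installed are assumptions added for illustration; the `--dsn` option and `check-health` command are the ones shown above.

```shell
# Sketch only: loops over the three example hosts from this walkthrough.
for host in linuxhost-1 linuxhost-2 linuxhost-3; do
  dsn="host=${host} dbname=bdrdb"
  if command -v pgd >/dev/null 2>&1; then
    # pgd is available: run the health check against this node.
    pgd --dsn "$dsn" check-health
  else
    # pgd is not installed here: show the command that would be run.
    echo "pgd --dsn '${dsn}' check-health"
  fi
done
```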