diff --git a/product_docs/docs/tpa/23/ansible-and-sudo.mdx b/product_docs/docs/tpa/23/ansible-and-sudo.mdx index 18a6be0d309..7b92de59f76 100644 --- a/product_docs/docs/tpa/23/ansible-and-sudo.mdx +++ b/product_docs/docs/tpa/23/ansible-and-sudo.mdx @@ -5,26 +5,27 @@ originalFilePath: ansible-and-sudo.md --- TPA uses Ansible with sudo to execute tasks with elevated privileges -on target instances. This page explains how Ansible uses sudo (which is -in no way TPA-specific), and the consequences to systems managed +on target instances. +It's important to understand how Ansible uses sudo (which isn't specific to TPA) +and the consequences to systems managed with TPA. TPA needs root privileges; -- to install packages (required packages using the operating system's - native package manager, and optional packages using pip) -- to stop, reload and restart services (i.e Postgres, repmgr, efm, etcd, - haproxy, pgbouncer etc.) -- to perform a variety of other tasks (e.g., gathering cluster facts, - performing switchover, setting up cluster nodes) +- To install packages (required packages using the operating system's + native package manager and optional packages using pip) +- To stop, reload, and restart services (that is, Postgres, repmgr, efm, etcd, + haproxy, pgbouncer, and so on) +- To perform a variety of other tasks (such as gathering cluster facts, + performing switchover, and setting up cluster nodes) -TPA also needs to be able to use sudo. You can make it ssh in as root -directly by setting `ansible_user: root`, but it will still use sudo to -execute tasks as other users (e.g., postgres). +TPA also must be able to use sudo. You can make it ssh in as root +directly by setting `ansible_user: root`, but it still uses sudo to +execute tasks as other users (for example, postgres). ## Ansible sudo invocations -When Ansible runs a task using sudo, you will see a process on the +When Ansible runs a task using sudo, you see a process on the target instance that looks something like this: ``` @@ -33,104 +34,107 @@ target instance that looks something like this: /usr/bin/python2'"'"' && sleep 0' ``` -People who were expecting something like `sudo yum install -y xyzpkg` -are often surprised by this. By and large, most tasks in Ansible will -invoke a Python interpreter to execute Python code, rather than -executing recognisable shell commands. (Playbooks may execute `raw` +Users who were expecting something like `sudo yum install -y xyzpkg` +are often surprised by this. By and large, most tasks in Ansible +invoke a Python interpreter to execute Python code rather than +executing recognizable shell commands. (Playbooks can execute `raw` shell commands, but TPA uses such tasks only to bootstrap a Python interpreter.) Ansible modules contain Python code of varying complexity, and an -Ansible playbook is not just a shell script written in YAML format. -There is no way to “extract” shell commands that would do the same thing +Ansible playbook isn't just a shell script written in YAML format. +There's no way to “extract” shell commands that do the same thing as executing an arbitrary Ansible playbook. -There is one significant consequence of how Ansible uses sudo: [privilege -escalation must be general](https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_privilege_escalation.html#privilege-escalation-must-be-general). That, it is not possible -to limit sudo invocations to specific commands in sudoers.conf, -as some administrators are used to doing. Most tasks will just invoke python. 
-You could have restricted sudo access to python if it were not -for the random string in every command—but once Python is running as root, +One significant consequence of how Ansible uses sudo is that [privilege +escalation must be general](https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_privilege_escalation.html#privilege-escalation-must-be-general). It isn't possible +to limit sudo invocations to specific commands in `sudoers.conf`, +as some administrators are used to doing. Most tasks just invoke Python. +You could have restricted sudo access to Python if it weren't +for the random string in every command. However, once Python is running as root, there's no effective limit on what it can do anyway. -Executing Python modules on target hosts is just the way Ansible works. -None of this is specific to TPA in any way, and these considerations -would apply equally to any other Ansible playbook. +Executing Python modules on target hosts is how Ansible works. +None of this is specific to TPA, and these considerations +apply equally to any other Ansible playbook. ## Recommendations -- Use SSH public key-based authentication to access target instances. +- Use SSH public-key-based authentication to access target instances. - Allow the SSH user to execute sudo commands without a password. -- Restrict access by time, rather than by command. +- Restrict access by time rather than by command. -TPA needs access only when you are first setting up your cluster or -running `tpaexec deploy` again to make configuration changes, e.g., +TPA needs access only when you're first setting up your cluster or +running `tpaexec deploy` again to make configuration changes, for example, during a maintenance window. Until then, you can disable its access -entirely (a one-line change for both ssh and sudo). +entirely, which is a one-line change for both ssh and sudo. During deployment, everything Ansible does is generally predictable -based on what the playbooks are doing and what parameters you provide, -and each action is visible in the system logs on the target instances, -as well as the Ansible log on the machine where tpaexec itself runs. +based on what the playbooks are doing and the parameters you provide. +Each action is visible in the system logs on the target instances +as well as in the Ansible log on the machine where tpaexec runs. -Ansible's focus is less to impose fine-grained restrictions on what -actions may be executed and more to provide visibility into what it does -as it executes, so elevated privileges are better assigned and managed +Ansible's focus is less to impose fine-grained restrictions on the +actions you can execute and more to provide visibility into what it does +as it executes. Thus elevated privileges are better assigned and managed by time rather than by scope. ## SSH and sudo passwords -We *strongly* recommend setting up password-less SSH key authentication -and password-less sudo access, but it is possible to use passwords too. +We strongly recommend setting up passwordless SSH key authentication +and passwordless sudo access. However, it's possible to use passwords too. If you set `ANSIBLE_ASK_PASS=yes` and `ANSIBLE_BECOME_ASK_PASS=yes` -in your environment before running tpaexec, Ansible will prompt you to +in your environment before running tpaexec, Ansible prompts you to enter a login password and a sudo password for the remote servers. 
It -will then negotiate the login/sudo password prompt on the remote server -and send the password you specify (which will make your playbooks take -noticeably longer to run). +then negotiates the login/sudo password prompt on the remote server +and sends the password you specify, which makes your playbooks take +noticeably longer to run. -We do not recommend this mode of operation because we feel it is a more +We don't recommend this mode of operation because it's a more effective security control to completely disable access through a particular account when not needed than to use a combination of passwords to restrict access. Using public key authentication for ssh provides an effective control over who can access the server, and it's -easier to protect a single private key per authorised user than it is to +easier to protect a single private key per authorized user than it is to protect a shared password or multiple shared passwords. Also, if you -limit access at the ssh/sudo level to when it is required, the passwords -do not add any extra security during your maintenance window. +limit access at the ssh/sudo level to when it's required, the passwords +don't add any extra security during your maintenance window. ## sudo options -To use Ansible with sudo, you must not set `requiretty` in sudoers.conf. +To use Ansible with sudo, don't set `requiretty` in `sudoers.conf`. If needed, you can change the sudo options that Ansible uses -(`-H -S -n`) by setting `become_flags` in the -`[privilege_escalation]` section of ansible.cfg, or -`ANSIBLE_BECOME_FLAGS` in the environment, or `ansible_become_flags` -in the inventory. All three methods are equivalent, but please change -the sudo options only if there is a specific need to do so. The defaults +(`-H -S -n`) by setting either: + +- `become_flags` in the `[privilege_escalation]` section of `ansible.cfg` +- `ANSIBLE_BECOME_FLAGS` in the environment +- `ansible_become_flags` in the inventory + +All three methods are equivalent, but change +the sudo options only if there's a specific need to do so. The defaults were chosen for good reasons. For example, removing `-S -n` will cause -tasks to timeout if password-less sudo is incorrectly configured. +tasks to time out if passwordless sudo is incorrectly configured. ## Logging -For playbook executions, the sudo logs will show mostly invocations of -Python (just as it will show only an invocation of bash when someone -uses `sudo -i`). +For playbook executions, the sudo logs show mostly invocations of +Python, just as it shows only an invocation of bash when +`sudo -i` is used. -For more detail, the syslog will show the exact arguments to each module +For more detail, the syslog shows the exact arguments to each module invocation on the target instance. For a higher-level view of why that -module was invoked, the ansible.log on the controller shows what that +module was invoked, the `ansible.log` on the controller shows what that task was trying to do, and the result. -If you want even more detail, or an independent source of audit data, +If you want even more detail or an independent source of audit data, you can run auditd on the server and use the SELinux log files. You can -get still more fine-grained syscall-level information from bpftrace/bcc -(e.g., opensnoop shows every file opened on the system, and execsnoop -shows every process executed on the system). You can do any or all of +get still more fine-grained syscall-level information from bpftrace/bcc. 
+(For example, opensnoop shows every file opened on the system, and execsnoop +shows every process executed on the system.) You can do any or all of these things, depending on your needs, with the obvious caveat of increasing overhead with increased logging. @@ -138,8 +142,8 @@ increasing overhead with increased logging. The [installation instructions for TPA](INSTALL/) -mention sudo only as shorthand for “run these commands as root somehow”. -Once TPA is installed and you have run `tpaexec setup`, TPA -itself does not require elevated privileges on the local machine. (But +mention sudo only as shorthand for “run these commands as root somehow.” +Once TPA is installed and you've run `tpaexec setup`, TPA +doesn't require elevated privileges on the local machine. (But if you use Docker, you must run tpaexec as a user that belongs to a -group that is permitted to connect to the Docker daemon.) +Unix group that has permission to connect to the Docker daemon.) diff --git a/product_docs/docs/tpa/23/architecture-BDR-Always-ON.mdx b/product_docs/docs/tpa/23/architecture-BDR-Always-ON.mdx index 3dbd94b0ae5..2b27631ab0c 100644 --- a/product_docs/docs/tpa/23/architecture-BDR-Always-ON.mdx +++ b/product_docs/docs/tpa/23/architecture-BDR-Always-ON.mdx @@ -5,28 +5,28 @@ originalFilePath: architecture-BDR-Always-ON.md --- EDB Postgres Distributed 3.7 or 4 in an Always-ON -configuration, suitable for use in test and production. +configuration is suitable for use in test and production. This architecture requires a subscription to the legacy 2ndQuadrant repositories, and some options require a subscription to EDB Repos 1.0. See [How TPA uses 2ndQuadrant and EDB repositories](reference/2q_and_edb_repositories/) -for more detail on this topic. +for more detail. The BDR-Always-ON architecture has four variants, which can be selected with the `--layout` configure option: -1. bronze: 2×bdr+primary, bdr+witness, barman, 2×harp-proxy +- `bronze`: 2×bdr+primary, bdr+witness, barman, 2×harp-proxy -2. silver: bronze, with bdr+witness promoted to bdr+primary, and barman +- `silver`: bronze, with bdr+witness promoted to bdr+primary, and barman moved to separate location -3. gold: two symmetric locations with 2×bdr+primary, 2×harp-proxy, +- `gold`: two symmetric locations with 2×bdr+primary, 2×harp-proxy, and barman each; plus a bdr+witness in a third location -4. platinum: gold, but with one bdr+readonly (logical standby) added to +- `platinum`: gold, but with one bdr+readonly (logical standby) added to each of the main locations -You can check EDB's Postgres Distributed Always On Architectures +See EDB's Postgres Distributed Always On Architectures [whitepaper](https://www.enterprisedb.com/promote/bdr-always-on-architectures) for the detailed layout diagrams. @@ -36,8 +36,7 @@ This architecture is meant for use with PGD versions 3.7 and 4. ### Overview of configuration options -An example invocation of `tpaexec configure` for this architecture -is shown below. +This example shows an invocation of `tpaexec configure` for this architecture: ```shell tpaexec configure ~/clusters/bdr \ @@ -49,59 +48,55 @@ tpaexec configure ~/clusters/bdr \ --harp-consensus-protocol bdr ``` -You can list all available options using the help command. +You can list all available options using the `help` command. ```shell tpaexec configure --architecture BDR-Always-ON --help ``` -The tables below describe the mandatory options for BDR-Always-ON -and additional important options. 
-More detail on the options is provided in the following section. +#### Mandatory options -#### Mandatory Options - -| Option | Description | -| ----------------------------------------------------- | ----------------------------------------------------------------------------------------- | -| `--architecture` (`-a`) | Must be set to `BDR-Always-ON`. | -| Postgres flavour and version (e.g. `--postgresql 14`) | A valid [flavour and version specifier](tpaexec-configure/#postgres-flavour-and-version). | -| `--layout` | One of `bronze`, `silver`, `gold`, `platinum`. | -| `--harp-consensus-protocol` | One of `bdr`, `etcd`. | +| Option | Description | +| ------------------------------- | ------------------------------------------------------------------------------------------- | +| `--architecture` (`-a`) | Must be set to `BDR-Always-ON`. | +| `--postgresql 14` (for example) | Any valid [flavour and version specifier](tpaexec-configure/#postgres-flavour-and-version). | +| `--layout` | One of `bronze`, `silver`, `gold`, `platinum`. | +| `--harp-consensus-protocol` | One of `bdr`, `etcd`. | <br/><br/> -#### Additional Options +#### Additional options -| Option | Description | Behaviour if omitted | -| ---------------------- | ----------------------------------------------------------------------------------------------- | ------------------------- | -| `--platform` | One of `aws`, `docker`, `bare`. | Defaults to `aws`. | -| `--enable-camo` | Sets two data nodes in each location as CAMO partners. | CAMO will not be enabled. | -| `--bdr-database` | The name of the database to be used for replication. | Defaults to `bdrdb`. | -| `--enable-harp-probes` | Enable http(s) api endpoints for harp such as `health/is-ready` to allow probing harp's health. | Disabled by default. | +| Option | Description | Behavior if omitted | +| ---------------------- | ----------------------------------------------------------------------------------------------- | -------------------- | +| `--platform` | One of `aws`, `docker`, `bare`. | Defaults to `aws`. | +| `--enable-camo` | Sets two data nodes in each location as CAMO partners. | CAMO isn't enabled. | +| `--bdr-database` | The name of the database to be used for replication. | Defaults to `bdrdb`. | +| `--enable-harp-probes` | Enable http(s) api endpoints for harp such as `health/is-ready` to allow probing harp's health. | Disabled by default. | <br/><br/> ### More detail about BDR-Always-ON configuration -You must specify `--layout layoutname` to set one of the supported BDR -use-case variations. The permitted arguments are bronze, silver, gold, and -platinum. The bronze, gold and platinum layouts have a PGD witness node -to ensure odd number of nodes for Raft consensus majority. Witness nodes do -not participate in the data replication. +Specify `--layout layoutname` to choose a layout: +`bronze`, `silver`, `gold`, or +`platinum`. The bronze, gold, and platinum layouts have a PGD witness node +to ensure an odd number of nodes for Raft consensus majority. Witness nodes don't +participate in the data replication. You must specify `--harp-consensus-protocol protocolname`. The supported -protocols are bdr and etcd; see [`Configuring HARP`](reference/harp/) for more details. +protocols are bdr and etcd. See [`Configuring HARP`](reference/harp/) for more details. -You may optionally specify `--bdr-database dbname` to set the name of -the database with PGD enabled (default: bdrdb). 
+You can optionally specify `--bdr-database dbname` to set the name of +the database with PGD enabled (default: `bdrdb`). -You may optionally specify `--enable-camo` to set the pair of PGD +You can optionally specify `--enable-camo` to set the pair of PGD primary instances in each region to be each other's CAMO partners. -You may optionally specify `--enable-harp-probes [{http, https}]` to -enable http(s) api endpoints that will allow to easily probe harp's health. +You can optionally specify `--enable-harp-probes [{http, https}]` to +enable http(s) api endpoints that allow you to easily probe harp's health. -Please note we enable HARP2 by default in BDR-Always-ON architecture. +HARP2 is enabled by default in the BDR-Always-ON architecture. -You may also specify any of the options described by +You can also specify any of the options described by [`tpaexec help configure-options`](tpaexec-configure/). diff --git a/product_docs/docs/tpa/23/architecture-M1.mdx b/product_docs/docs/tpa/23/architecture-M1.mdx index ca8b39ddccf..001c25d265a 100644 --- a/product_docs/docs/tpa/23/architecture-M1.mdx +++ b/product_docs/docs/tpa/23/architecture-M1.mdx @@ -4,57 +4,55 @@ originalFilePath: architecture-M1.md --- -A Postgres cluster with a primary and a streaming replica, one Barman +The M1 architecture is a Postgres cluster with a primary and a streaming replica, one Barman server, and any number of additional replicas cascaded from the first -one. This architecture is suitable for production and is also suited to -testing, demonstrating and learning due to its simplicity and ability to +one. This architecture is suitable for production. It's also suited to +testing, demonstrating, and learning due to its simplicity and ability to be configured with no proprietary components. -If you select subscription-only EDB software with this architecture -it will be sourced from EDB Repos 2.0 and you will need to provide a token. +If you select subscription-only EDB software with this architecture, +it's sourced from EDB Repos 2.0. See [How TPA uses 2ndQuadrant and EDB repositories](reference/2q_and_edb_repositories/) -for more detail on this topic. +for more detail. ## Default layout By default, the primary has one read-only replica attached in the same -location; the replica, in turn, has one cascaded replica attached in a +location. The replica, in turn, has one cascaded replica attached in a different location, where the Barman server is also configured to take backups from the primary. ![Cluster with cascading replication](images/m1.png) -If there is an even number of Postgres nodes, the Barman node is -additionally configured as a witness. This ensures that the -number of nodes is always odd, which is convenient when -enabling automatic failover. +If the number of Postgres nodes is even, the Barman node is +also configured as a witness. Having an odd number of nodes +in total helps to establish consensus in case of automatic failover. ## Application and backup failover The M1 architecture implements failover management in that it ensures -that a replica will be promoted to take the place of the primary should -the primary become unavailable. However it *does not provide any -automatic facility to reroute application traffic to the primary*. If -you require, automatic failover of application traffic you will need to -configure this at the application itself (for example using multi-host +that a replica is promoted to take the place of the primary if +the primary becomes unavailable.
However it doesn't provide any +automatic facility to reroute application traffic to the primary. If +you require automatic failover of application traffic, you must +configure this at the application (for example using multi-host connections) or by using an appropriate proxy or load balancer and the facilities offered by your selected failover manager. -The above is also true of the connection between the backup node and the -primary created by TPA. The backup will not be automatically adjusted to -target the new primary in the event of failover, instead it will remain -connected to the original primary. If you are performing a manual -failover and wish to connect the backup to the new primary, you may -simply re-run `tpaexec deploy`. If you wish to automatically change the -backup source, you should implement this using your selected failover -manager as noted above. +This is also true of the connection between the backup node and the +primary created by TPA. The backup isn't automatically adjusted to +target the new primary in the event of failover. Instead, it remains +connected to the original primary. If you're performing a manual +failover and want to connect the backup to the new primary, you can +rerun `tpaexec deploy`. If you want to automatically change the +backup source, implement this using your selected failover +manager. ## Cluster configuration ### Overview of configuration options -An example invocation of `tpaexec configure` for this architecture -is shown below. +This example shows an invocation of `tpaexec configure` for this architecture: ```shell tpaexec configure ~/clusters/m1 \ @@ -65,17 +63,13 @@ tpaexec configure ~/clusters/m1 \ --failover-manager repmgr ``` -You can list all available options using the help command. +You can list all available options using the `help` command: ```shell tpaexec configure --architecture M1 --help ``` -The tables below describe the mandatory options for M1 -and additional important options. -More detail on the options is provided in the following section. - -#### Mandatory Options +#### Mandatory options | Parameter | Description | | ---------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | @@ -85,9 +79,9 @@ More detail on the options is provided in the following section. <br/><br/> -#### Additional Options +#### Additional options -| Parameter | Description | Behaviour if omitted | +| Parameter | Description | Behavior if omitted | | ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------- | | `--platform` | One of `aws`, `docker`, `bare`. | Defaults to `aws`. | | `--num-cascaded-replicas` | The number of cascaded replicas from the first replica. | Defaults to 1. | @@ -98,8 +92,8 @@ More detail on the options is provided in the following section. ### More detail about M1 configuration -You may optionally specify `--num-cascaded-replicas N` to request N +You can optionally specify `--num-cascaded-replicas N` to request N cascaded replicas (including 0 for none; default: 1). -You may also specify any of the options described by +You can also specify any of the options described by [`tpaexec help configure-options`](tpaexec-configure/). 
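As an illustration of the M1 layout described above, a minimal `config.yml` `instances` sketch might look like the following. The instance names, node numbers, and the `backup` reference are hypothetical, and the file that `tpaexec configure` generates for a real cluster will differ in its details.

```yaml
# Hypothetical M1-style excerpt: one primary, one replica, one cascaded
# replica, and a Barman server that backs up the primary.
instances:
- Name: alpha
  node: 1
  role:
  - primary
  backup: backup-one
- Name: bravo
  node: 2
  role:
  - replica
  upstream: alpha
- Name: charlie
  node: 3
  role:
  - replica
  upstream: bravo
- Name: backup-one
  node: 4
  role:
  - barman
```

Each replica names its `upstream` instance, which is how the cascaded replica in the default layout is expressed, and the `backup` setting on the primary names the Barman instance that takes its backups.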
diff --git a/product_docs/docs/tpa/23/architecture-PGD-Always-ON.mdx b/product_docs/docs/tpa/23/architecture-PGD-Always-ON.mdx index 95feae18deb..8574f2c078d 100644 --- a/product_docs/docs/tpa/23/architecture-PGD-Always-ON.mdx +++ b/product_docs/docs/tpa/23/architecture-PGD-Always-ON.mdx @@ -7,15 +7,14 @@ originalFilePath: architecture-PGD-Always-ON.md EDB Postgres Distributed 5 in an Always-ON configuration, suitable for use in test and production. -This architecture is valid for use with EDB Postgres Distributed 5 only +This architecture is valid for use with EDB Postgres Distributed version 5 only and requires a subscription to [EDB Repos 2.0](reference/2q_and_edb_repositories/). ## Cluster configuration ### Overview of configuration options -An example invocation of `tpaexec configure` for this architecture -is shown below. +This example shows an invocation of `tpaexec configure` for this architecture: ```shell tpaexec configure ~/clusters/pgd-ao \ @@ -29,17 +28,13 @@ tpaexec configure ~/clusters/pgd-ao \ --data-nodes-per-location 2 ``` -You can list all available options using the help command. +You can list all available options using the `help` option. ```shell tpaexec configure --architecture PGD-Always-ON --help ``` -The table below describes the mandatory options for PGD-Always-ON -and additional important options. -More detail on the options is provided in the following section. - -#### Mandatory Options +#### Mandatory options | Options | Description | | ----------------------------------------------------- | ----------------------------------------------------------------------------------------- | @@ -49,9 +44,9 @@ More detail on the options is provided in the following section. <br/><br/> -#### Additional Options +#### Additional options -| Options | Description | Behaviour if omitted | +| Options | Description | Behavior if omitted | | -------------------------------- | ----------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------- | | `--platform` | One of `aws`, `docker`, `bare`. | Defaults to `aws`. | | `--location-names` | A space-separated list of location names. The number of locations is equal to the number of names supplied. | TPA will configure a single location with three data nodes. | @@ -60,68 +55,67 @@ More detail on the options is provided in the following section. | `--add-proxy-nodes-per-location` | The number of proxy nodes in each location. | PGD-proxy will be installed on each data node. | | `--enable-camo` | Sets two data nodes in each location as CAMO partners. | CAMO will not be enabled. | | `--bdr-database` | The name of the database to be used for replication. | Defaults to `bdrdb`. | -| `--enable-pgd-probes` | Enable http(s) api endpoints for pgd-proxy such as `health/is-ready` to allow probing proxy's health. | Disabled by default. | +| `--enable-pgd-probes` | Enable http(s) API endpoints like `health/is-ready` to probe pgd-proxy's health. | Disabled by default. | <br/><br/> ### More detail about PGD-Always-ON configuration -A PGD-Always-ON cluster comprises a number of locations, preferably odd, -each with the same number of data nodes, again preferably odd. If you do -not specify any `--location-names`, the default is to use a single +A PGD-Always-ON cluster comprises a number of locations, preferably odd. +Each location has the same number of data nodes, also preferably odd. 
If you don't +specify any `--location-names`, the default is to use a single location with three data nodes. -Location names for the cluster are specified as +You can specify location names for the cluster with `--location-names dc1 dc2 …`. A location represents an independent -data centre that provides a level of redundancy, in whatever way -this definition makes sense to your use case. For example, AWS -regions, your own data centres, or any other designation to identify +data center that provides a level of redundancy, in whatever way +this definition makes sense to your use case. Examples include AWS +regions, your own data centers, or any other designation to identify where your servers are hosted. !!! Note for AWS users - If you are using TPA to provision an AWS cluster, the locations will - be mapped to separate availability zones within the `--region` you + If you're using TPA to provision an AWS cluster, the locations will + be mapped to separate availability zones in the `--region` you specify. - You may specify multiple `--regions`, but TPA does not currently set + You can specify multiple `--regions`, but TPA doesn't currently set up VPC peering to allow instances in different regions to - communicate with each other. For a multi-region cluster, you will - need to set up VPC peering yourself. + communicate with each other. For a multi-region cluster, you must + set up VPC peering yourself. Use `--data-nodes-per-location N` to specify the number of data -nodes in each location. The minimum number is 2, the default is 3. +nodes in each location. The minimum number is 2. The default is 3. -If you specify an even number of data nodes per location, TPA will add -an extra witness node to each location automatically. This retains the -ability to establish reliable consensus while allowing cost savings (a -witness has minimal hardware requirements compared to the data nodes). +If you specify an even number of data nodes per location, TPA +adds an extra witness node to each location. The witness +preserves the ability to reliably establish consensus without +the hardware requirements and extra cost of another data node. A cluster with only two locations would entirely lose the ability to -establish global consensus if one of the locations were to fail. We -recommend adding a third witness-only location (which contains no data -nodes, only a witness node, again used to reliably establish consensus). -Use `--witness-only-location loc` to designate one of your locations as -a witness. - -By default, every data node (in every location) will also run PGD-Proxy -for connection routing. To create separate PGD-Proxy instances instead, +establish global consensus if one of the locations were to fail. +You can retain the ability to establish consensus despite a single-location failure. +To do so, we recommend adding a third witness-only location that contains only +a witness node. Use `--witness-only-location loc` to designate one of +your locations as a witness. + +By default, every data node in every location also runs PGD Proxy +for connection routing. To create separate PGD Proxy instances instead, use `--add-proxy-nodes-per-location 3` (or however many proxies you want to add). -Depending on your use-case, you must specify `--pgd-proxy-routing local` -or `global` to configure how PGD-Proxy will route connections to a write -leader. Local routing will make every PGD-Proxy route to a write leader -within its own location (suitable for geo-sharding applications). 
Global -routing will make every proxy route to a single write leader, elected -amongst all available data nodes across all locations. +Depending on your use case, you must specify `--pgd-proxy-routing local` +or `global` to configure how PGD Proxy routes connections to a write +leader. Local routing makes every PGD Proxy route to a write leader +within its own location (suitable for geosharding applications). Global +routing makes every proxy route to a single write leader, elected +among all available data nodes across all locations. + +You can optionally specify: -You may optionally specify `--bdr-database dbname` to set the name of -the database with BDR enabled (default: bdrdb). +- `--bdr-database dbname` to set the name of the database with BDR enabled (default: bdrdb) -You may optionally specify `--enable-camo` to set two data nodes in -each region as CAMO partners. +- `--enable-camo` to set two data nodes in each region as CAMO partners -You may optionally specify `--enable-pgd-probes [{http, https}]` to -enable http(s) api endpoints that will allow to easily probe proxy's health. +- `--enable-pgd-probes [{http, https}]` to enable http(s) api endpoints that allow you to easily probe proxy's health -You may also specify any of the options described by +You can also specify any of the options described by [`tpaexec help configure-options`](tpaexec-configure/). diff --git a/product_docs/docs/tpa/23/configure-cluster.mdx b/product_docs/docs/tpa/23/configure-cluster.mdx index 49b7e4c1f2f..753117d9e04 100644 --- a/product_docs/docs/tpa/23/configure-cluster.mdx +++ b/product_docs/docs/tpa/23/configure-cluster.mdx @@ -5,19 +5,20 @@ originalFilePath: configure-cluster.md --- With TPA, the way to make any configuration change to a cluster is -to edit config.yml and run the provision/deploy/test cycle. The process -is carefully designed to be idempotent, and to make changes only in -response to a change in the configuration or a change on the instances. +to edit `config.yml` and run the provision/deploy/test cycle. The process +is carefully designed to be idempotent and to make changes only in +response to a change in the configuration or on the instances. -The [`tpaexec configure`](tpaexec-configure/) command will generate -a sensible config.yml file for you, but it covers only the most common +The [`tpaexec configure`](tpaexec-configure/) command generates +a sensible `config.yml` file for you, but it covers only the most common topology and configuration options. If you need something beyond the defaults, or you need to make changes after provisioning the cluster, -you will need to edit config.yml anyway. +you need to edit `config.yml`. -This page is an overview of the configuration mechanisms available. -There's a separate page with more details about the specific -[variables you can set to customise the deployment process](configure-instance/). +An overview of the configuration mechanisms available follows. +For more details about the specific +variables you can set to customize the deployment process, +see [Instance configuration](configure-instance/). ## config.yml @@ -52,61 +53,67 @@ instances: ``` These three definitions are central to your cluster configuration. The -file may contain many other definitions (including platform-specific +file might contain many other definitions (including platform-specific details), but the list of `instances` with `vars` set either for one instance or for the whole cluster are the basic building blocks of every TPA configuration. 
All [`tpaexec configure`](tpaexec-configure/) -options translate to config.yml variables in -some way. A single option may affect several variables (e.g., -`--bdr-version` could set `postgres_version`, -`tpa_2q_repositories`, `edb_repositories`, `extra_postgres_extensions`, and so on), but -you can always accomplish with an editor what you could by running the +options translate to `config.yml` variables in +some way. A single option can affect several variables. (For example, +`--bdr-version` might set `postgres_version`, +`tpa_2q_repositories`, `edb_repositories`, `extra_postgres_extensions`, and so on.) But +you can always accomplish with an editor the same things you can by running the command. -In terms of YAML syntax, config.yml as a whole represents a hash with -keys such as `cluster_vars` and `instances`. **You must ensure that -each key is defined only once.** If you were to inadvertently repeat the -`cluster_vars`, say, the second definition would completely override -the former, and your next deployment could make unintended changes -because of missing (shadowed) variables. +In terms of YAML syntax, `config.yml` as a whole represents a hash with +keys such as `cluster_vars` and `instances`. -TPA checks the consistency of the overall cluster topology (for -example, if you declare an instance with the role "replica", you must +!!! Note + You must ensure that + each key is defined only once. If you were to inadvertently repeat the + `cluster_vars`, say, the second definition would completely override + the first, and your next deployment could make unintended changes + because of missing (shadowed) variables. + +TPA checks the consistency of the overall cluster topology. For +example, if you declare an instance with the replica role, you must also declare the name of its upstream instance, and that instance must -exist), but it will not prevent you from setting any variable you like -on the instances. You must exercise due caution, and try out changes in +exist. However, TPA doesn't prevent you from setting variables +on the instances. Exercise due caution and try out changes in a test environment before rolling them out into production. ## Variables In Ansible terminology, most configuration settings are “inventory -variables”—TPA will translate `cluster_vars` into `group_vars` -(that apply to the cluster as a whole) and each instance's `vars` into -`host_vars` in the inventory during provisioning, and deployment will -use the inventory values. After you change config.yml, **you must -remember to run** `tpaexec provision` **before** `tpaexec deploy`. - -Any variable can be set for the entire cluster, or an individual host, -or both; host variables override group variables. In practice, setting -`x: 42` in `cluster_vars` is no different from setting it in every -host's `vars`. A host that needs `x` during deployment will see the -value 42 either way. A host will always see the most specific value, so -it is convenient to set some default value for the group and override it +variables.” TPA translates `cluster_vars` into `group_vars` +that apply to the cluster as a whole and each instance's `vars` into +`host_vars` in the inventory during provisioning. Deployment then +uses the inventory values. + +!!! Note + After you change `config.yml`, + remember to run `tpaexec provision` before `tpaexec deploy`. + +You can set any variable for the entire cluster, an individual host, +or both. Host variables override group variables. 
In practice, setting +`x: 42` in `cluster_vars` is the same as setting it in every +host's `vars`. A host that needs `x` during deployment sees the +value 42 either way. A host always sees the most specific value, so +it's convenient to set some default value for the group and override it for specific instances as required. Whenever possible, defining variables in `cluster_vars` and overriding -them for specific instances results in a concise configuration that is -easier to review and change (less repetition). Beyond that, it's up to +them for specific instances results in a concise configuration. Because +there's less repetition, it's easier to review and change. Beyond that, it's up to you to decide whether any given setting makes more sense as a group or host variable. ## Cluster variables -The keys under `cluster_vars` may map to any valid YAML type, and will -be translated directly into group variables in the Ansible inventory: +The keys under `cluster_vars` can map to any valid YAML type and are +translated directly into group variables in the Ansible inventory: ```yaml cluster_vars: @@ -118,15 +125,15 @@ cluster_vars: bdr.trace_replay: true ``` -In this case, `tpaexec provision` will write three variables (a +In this case, `tpaexec provision` writes three variables (a string, a list, and a hash) to the inventory in `group_vars/tag_Cluster_name/01-cluster_name.yml`. ## Instance variables -This documentation uses the term “instance variables” to refer to any -variables that are defined for a specific instance in config.yml. For -example, here's a typical instance definition: +We use the term “instance variables” to refer to any +variables that are defined for a specific instance in `config.yml`. This +example shows a typical instance definition: ```yaml instances: @@ -156,27 +163,27 @@ instances: ``` The variables defined in this instance's `vars` will all become host -variables in the inventory, but all host vars in the inventory do not +variables in the inventory, but not all host vars in the inventory come from `vars` alone. Some other instance settings, including -`platform`, `location`, `volumes`, and `role` are also copied to the -inventory as host vars (but you cannot define these settings under -`vars` or `cluster_vars` instead). - -The settings outside `vars` may describe the properties of the instance -(e.g., `Name` and `node`) or its place in the topology of the cluster -(e.g., `role`, `backup`) or they may be platform-specific attributes -(e.g., instance `type` and `volumes`). Other than knowing that they -cannot be defined under `vars`, it is rarely necessary to distinguish -between these instance “settings” and instance “variables”. - -In this case, `tpaexec provision` will write a number of host +`platform`, `location`, `volumes`, and `role`, are also copied to the +inventory as host `vars`. However, you can't define these settings under +`vars` or `cluster_vars` instead. + +The settings outside `vars` can describe the properties of the instance +(such as `Name` and `node`) or its place in the topology of the cluster +(such as `role` and `backup`). Or they can be platform-specific attributes +(such as instance `type` and `volumes`). Other than knowing that they +can't be defined under `vars`, it's rarely necessary to distinguish +between these instance settings and instance variables. + +In this case, `tpaexec provision` writes a number of host variables to the inventory in `host_vars/unwind/01-instance_vars.yml`.
## instance_defaults -This is a mechanism to further reduce repetition in -config.yml. It is most useful for instance settings that cannot be -defined as `cluster_vars`. For example, you could write the following: +This setting is a mechanism to further reduce repetition in +`config.yml`. It's most useful for instance settings that can't be +defined as `cluster_vars`. For example, you can write the following: ```yaml instance_defaults: @@ -195,29 +202,29 @@ instances: Whatever you specify under `instance_defaults` serves as the default for every entry in `instances`. In this example, it saves spelling out the -`platform` and `type` of each instance, and makes it easier to change +`platform` and `type` of each instance and makes it easier to change all your instances to a different type. If any instance specifies a -different value, it will of course take precedence over the default. +different value, that value takes precedence over the default. -It may help to think of `instance_defaults` as being a macro facility to -use in defining `instances`. What is ultimately written to the inventory +It might help to think of `instance_defaults` as being a macro facility to +use in defining `instances`. What's ultimately written to the inventory comes from the (expanded) definition of `instances` alone. If you're trying to decide whether to put something in `cluster_vars` or `instance_defaults`, it probably belongs in the former unless it -*cannot* be defined as a variable (e.g., `platform` or `type`), which is -true for many platform-specific properties (such as AWS resource tags) -that are used only in provisioning, and not during deployment. +can't be defined as a variable (for example, `platform` or `type`). This is +true for many platform-specific properties, such as AWS resource tags, +that are used only in provisioning and not during deployment. The `instance_defaults` mechanism does nothing to stop you from using it -to fill in the `vars` for an instance (default hash values are merged -with any hash specified in the `instances` entry). However, there is no +to fill in the `vars` for an instance. (Default hash values are merged +with any hash specified in the `instances` entry.) However, there isn't a particular advantage to doing this rather than setting the same default in `cluster_vars` and overriding it for an instance if necessary. When in doubt, use `cluster_vars`. ## Locations -You can also specify a list of `locations` in config.yml: +You can also specify a list of `locations` in `config.yml`: ```yaml locations: @@ -239,15 +246,15 @@ instances: ``` If an instance specifies `location: first` (or `location: 0`), the -settings under that location serve as defaults for that instance. Again, -just like `instance_defaults`, an instance may override the defaults -that it inherits from its location. And again, you can use this feature -to fill in `vars` for an instance. This can be useful if you have some -defaults that apply to only half your instances, and different values -for the other half (as with the platform-specific settings in the -example above). +settings under that location serve as defaults for that instance. +Just like `instance_defaults`, an instance can override the defaults +that it inherits from its location. Similarly, you can use this feature +to fill in `vars` for an instance. This approach can be useful if you have some +defaults that apply to only half your instances and different values +for the other half, as with the platform-specific settings in the +example. 
Locations represent a collection of settings that instances can “opt in” -to. You can use them to stand for different data centres, AWS regions, -Docker hosts, or something else entirely. TPA does not expect or +to. You can use them to stand for different data centers, AWS regions, +Docker hosts, or something else. TPA doesn't expect or enforce any particular interpretation. diff --git a/product_docs/docs/tpa/23/images/m1.dot b/product_docs/docs/tpa/23/images/m1.dot index 5776b84b88d..d25334e30ae 100644 --- a/product_docs/docs/tpa/23/images/m1.dot +++ b/product_docs/docs/tpa/23/images/m1.dot @@ -1,4 +1,4 @@ -# © Copyright EnterpriseDB UK Limited 2015-2023 - All rights reserved. +# © Copyright EnterpriseDB UK Limited 2015-2024 - All rights reserved. digraph M1 { backup [shape=box]; diff --git a/product_docs/docs/tpa/23/index.mdx b/product_docs/docs/tpa/23/index.mdx index 06019dc8b2b..58e630134c9 100644 --- a/product_docs/docs/tpa/23/index.mdx +++ b/product_docs/docs/tpa/23/index.mdx @@ -40,7 +40,7 @@ originalFilePath: index.md --- -© Copyright EnterpriseDB UK Limited 2015-2023 - All rights reserved. +© Copyright EnterpriseDB UK Limited 2015-2024 - All rights reserved. ## Introduction diff --git a/product_docs/docs/tpa/23/platform-aws.mdx b/product_docs/docs/tpa/23/platform-aws.mdx index b19f594f85e..929bd3fb552 100644 --- a/product_docs/docs/tpa/23/platform-aws.mdx +++ b/product_docs/docs/tpa/23/platform-aws.mdx @@ -357,6 +357,17 @@ write access to this bucket. During provisioning, tpaexec will provide instances with read-only access to the cluster_bucket through the instance profile. +### Elastic IP addresses + +To use elastic IP addresses, set `assign_elastic_ip` to `true` in +config.yml, either in `instance_defaults` to affect all the instances in your +cluster or individually on the separate instances as required. By +default, this will allocate a new elastic ip address and assign it to +the new instance. +To use an elastic IP address that has already been allocated but not yet +assigned, use `elastic_ip: 34.252.55.252`, substituting in your +allocated address. + ### Instance profile (optional) ``` diff --git a/product_docs/docs/tpa/23/reference/2q_and_edb_repositories.mdx b/product_docs/docs/tpa/23/reference/2q_and_edb_repositories.mdx index f98c5f1df6b..89a237ff81d 100644 --- a/product_docs/docs/tpa/23/reference/2q_and_edb_repositories.mdx +++ b/product_docs/docs/tpa/23/reference/2q_and_edb_repositories.mdx @@ -4,16 +4,15 @@ originalFilePath: 2q_and_edb_repositories.md --- -This page explains the package sources from which TPA can download EDB -(including 2ndQuadrant) software, how the source varies depending on the -selected software, and how to configure access to each source. +TPA can download EDB software +(including 2ndQuadrant) from several package sources, depending on the selected software. -Note that this page only describes the special configuration options and -logic for EDB and 2ndQuadrant sources. Arbitrary +Only the special configuration options and logic for EDB and 2ndQuadrant sources are +described here. You can add arbitrary [yum](yum_repositories/) or [apt](apt_repositories/) repositories -can be added independently of the logic described here. Likewise, -packages can be [downloaded in advance](tpaexec-download-packages/) -and added to a [local repository](local-repo/) if preferred. +independently of this logic. Likewise, you can [download +packages in advance](tpaexec-download-packages/) +and add them to a [local repository](local-repo/) if you prefer. 
## Package sources used by TPA @@ -25,12 +24,11 @@ is available from more than one source. - [EDB Repos 1.0](https://www.enterprisedb.com/repos/legacy) - [2ndQuadrant Repos](https://techsupport.enterprisedb.com/customer_portal/sw/) -By default, TPA will [select sources and repositories automatically](#how-sources-are-selected-by-default) -based on the architecture and other options you have specified, so it is -not generally necessary to change these. However, you will need to +By default, TPA [selects sources and repositories](#how-sources-are-selected-by-default) +based on the architecture and other options you specify. It's +generally not necessary to change these. However, before running `tpaexec deploy`, you must ensure that you have a valid subscription for all the sources used and -that you have [exported the token](#authenticating-with-package-sources) -before running `tpaexec deploy` or the operation will fail. +that you [exported the token](#authenticating-with-package-sources). Otherwise, the operation fails. !!! Note EDB is in the process of publishing all software through Repos 2.0, @@ -38,44 +36,46 @@ before running `tpaexec deploy` or the operation will fail. ## Authenticating with package sources -To use [EDB Repos 2.0](https://www.enterprisedb.com/repos/) you must -`export EDB_SUBSCRIPTION_TOKEN=xxx` before you run tpaexec. You can get +To use [EDB Repos 2.0](https://www.enterprisedb.com/repos/), before you run tpaexec, you must run +`export EDB_SUBSCRIPTION_TOKEN=xxx`. You can get your subscription token from [the web interface](https://www.enterprisedb.com/repos-downloads). To use [2ndQuadrant repositories](https://techsupport.enterprisedb.com/customer_portal/sw/), -you must `export TPA_2Q_SUBSCRIPTION_TOKEN=xxx` before you run -tpaexec. You can get your subscription token from the 2ndQuadrant -Portal, under "Company info" in the left menu, then "Company". Some +before you run tpaexec, you must run `export TPA_2Q_SUBSCRIPTION_TOKEN=xxx`. +You can get your subscription token from the 2ndQuadrant +Portal. In the left menu, under **Company info**, select **Company**. Some repositories are available only by prior arrangement. -To use [EDB Repos 1.0](https://www.enterprisedb.com/repos/legacy) you +To use [EDB Repos 1.0](https://www.enterprisedb.com/repos/legacy), you must create a text file that contains your access credentials in the -`username:password` format and run `export -EDB_REPO_CREDENTIALS_FILE=/path/to/credentials/file` before you run -tpaexec. +`username:password` format. Before you run tpaexec, run: -If you do not have an account for any of the sites listed, you can -register for access at -<https://www.enterprisedb.com/user/register?destination=/repository-access-request> +``` +export EDB_REPO_CREDENTIALS_FILE=/path/to/credentials/file +``` + +If you don't have an account for any of the sites listed, you can +register for access at the [Account Registration page](https://www.enterprisedb.com/user/register?destination=/repository-access-request). ## How sources are selected by default -If the PGD-Always-ON architecture is selected, repositories will be -selected from EDB Repos 2.0 and all software will be sourced +If you select the PGD-Always-ON architecture, repositories are +selected from EDB Repos 2.0, and all software is sourced from these repositories. -If the M1 architecture is selected and no proprietary EDB software is -selected, all packages will be sourced from PGDG. 
If any proprietary EDB -software is selected, all packages will be sourced from EDB Repos 2.0. +If you select the M1 architecture and don't select any proprietary EDB software, +all packages are sourced from PGDG. If you select any proprietary EDB +software, all packages are sourced from EDB Repos 2.0. For the BDR-Always-ON architecture, the default source is -2ndQuadrant, and the necessary repositories will be added from this -source. In addition, the PGDG repositories will be used for community -packages such as PostgreSQL and etcd as required. -If EDB software not available in the 2ndQuadrant repos is required -(e.g. EDB Advanced Server), repositories will be selected from EDB Repos +EDB Repos 2.0 for new clusters, and the necessary repositories are added from this +source. (Existing clusters use 2ndQuadrant repositories until you reconfigure them.) +In addition, the PGDG repositories are used for community +packages, such as PostgreSQL and etcd, as required. +If EDB software that isn't available in the 2ndQuadrant repos is required +(such as EDB Postgres Advanced Server), TPA selects the repositories from EDB Repos 1.0. ## Specifying EDB 2.0 repositories @@ -91,14 +91,14 @@ cluster_vars: - postgres_distributed ``` -This example will configure the `enterprise` and `postgres_distributed` -repositories, giving access to EPAS and PGD5 products. -On Debian or Ubuntu systems, it will use the APT repository, and on -RedHat or SLES systems, it will use the rpm repositories, through the yum -or zypper frontends, respectively. +This example configures the `enterprise` and `postgres_distributed` +repositories, giving access to EDB Postgres Advanced Server and PGD version 5 products. +On Debian or Ubuntu systems, it uses the apt repository. +RedHat systems use the rpm repositories through the yum front end. +SLES systems use the rpm repositories through the zypper front end. -If any EDB repositories are specified, any 2ndQuadrant repositories -specified will be ignored and no EDB Repos 1.0 will be installed. +If you specify any EDB repositories, any 2ndQuadrant repositories +specified are ignored and no EDB Repos 1.0 are installed. ## Specifying 2ndQuadrant repositories @@ -113,15 +113,15 @@ cluster_vars: - products/bdr3/release ``` -This example will install the pglogical3 and bdr3 release repositories. -On Debian and Ubuntu systems, it will use the APT repository, and on -RedHat systems, it will use the YUM repository. +This example installs the pglogical3 and bdr3 release repositories. +On Debian and Ubuntu systems, it uses the apt repository, and on +RedHat systems, it uses the yum repository. The `dl/default/release` repository is always installed by default, -unless you +unless you either: -- explicitly set `tpa_2q_repositories: []`, or -- have at least one entry in `edb_repositories`. +- Explicitly set `tpa_2q_repositories: []` +- Have at least one entry in `edb_repositories` -Either or the above will result in no 2ndQuadrant repositories being +Either of these actions results in no 2ndQuadrant repositories being installed.
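To make the repository selection explicit in `config.yml`, you can combine the two mechanisms just described: list the `edb_repositories` you need and set `tpa_2q_repositories` to an empty list. The following fragment is illustrative only. The repository names are the ones from the examples above, and either setting on its own is already enough to leave 2ndQuadrant repositories uninstalled.

```yaml
# Illustrative cluster_vars fragment: take packages from EDB Repos 2.0
# and explicitly configure no 2ndQuadrant repositories.
cluster_vars:
  edb_repositories:
  - enterprise
  - postgres_distributed
  tpa_2q_repositories: []
```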
diff --git a/product_docs/docs/tpa/23/reference/air-gapped.mdx b/product_docs/docs/tpa/23/reference/air-gapped.mdx index 4a80131fc49..baeb7b3a640 100644 --- a/product_docs/docs/tpa/23/reference/air-gapped.mdx +++ b/product_docs/docs/tpa/23/reference/air-gapped.mdx @@ -4,43 +4,50 @@ originalFilePath: air-gapped.md --- -In a security controlled environment where no direct connection to the -Internet is allowed, it is necessary to provide all packages needed by -TPA to complete the deployment. This can be done via a local-repo on -each node in the cluster. TPA supports the addition of custom -repositories on each node via a -[local-repo](local-repo/) and the required packages can be downloaded -using the [download-packages](tpaexec-download-packages/) command. +In a security-controlled environment where no direct connection to the +Internet is allowed, you must provide all packages needed by +TPA to complete the deployment on each node of the cluster. You can supply +those packages using whatever method you choose, for example, +by way of shared network repos or local repos preconfigured on each node. + +You can alternatively use the local-repo approach that TPA provides, as described in +[Creating and using a local repository](local-repo/). If you use this approach, +you still need to make sure that TPA's local repo has all the required packages +needed during the deployment of each node. + +To help with this requirement when using the local-repo approach, TPA provides +the [`download-packages`](tpaexec-download-packages/) command. +This command can populate a local repository created using the local-repo approach. +Use this command to ensure that you download all the required packages needed for the deployment to succeed. ## Preparation -Choose an internet connected machine where you can install TPA and -follow the instructions below to either copy an existing cluster +Choose an internet-connected machine where you can install TPA. +Follow these instructions to either copy an existing cluster configuration or create a new cluster. !!! Note - If the air-gapped server does not already have TPA installed, - follow the instructions - [here](../INSTALL/#installing-tpa-without-internet-or-network-access-air-gapped) + If TPA isn't already installed on the air-gapped server, + follow [these instructions](../INSTALL/#installing-tpa-without-internet-or-network-access-air-gapped) to install it. If you have an existing cluster in a disconnected environment, all you -need on the internet connected host is the config.yml. Create a -directory and copy that file into it then run `tpaexec relink` on that -directory to generate the remaining files that would normally be created +need on the internet-connected host is the `config.yml` file. Create a +directory and copy that file into it. Then run `tpaexec relink` on that +directory to generate the remaining files that are normally created by `tpaexec configure`. Alternatively, to create a new configuration for an environment where -the target instances will not have network access, configure a new -cluster with this option: +the target instances don't have network access, configure a new +cluster using this option: ``` -tpaexec configure --use-local-repo-only … +tpaexec configure --use-local-repo-only … ``` -This will do everything that `--enable-local-repo` does, and disable the +This command does everything that `--enable-local-repo` does and disables the configuration for all other package repositories.
-this also includes disabling access to subscription-based services.
+it also disables access to subscription-based services.
 
 In an existing cluster, you can set `use_local_repo_only: yes` in
 `config.yml`:
 
@@ -50,42 +57,41 @@ cluster_vars:
   use_local_repo_only: yes
 ```
 
-Note: that you do not need separate cluster configurations for internet
-connected and disconnected environments, the options below work in both.
+You don't need separate cluster configurations for internet-connected
+and disconnected environments. The options that follow work in both.
 
-More info on [using local-repo for distributing packages](local-repo/)
+See [Creating and using a local repository](local-repo/) for more information.
 
 ## Downloading packages
 
-On the internet connected machine, ensure that you
-have [docker installed](../platform-docker/) and run:
+On the internet-connected machine with
+[docker installed](../platform-docker/), run:
 
 ```shell
 tpaexec download-packages cluster-dir --os <OS> --os-version <version>
 ```
 
-See detailed description for
+See the detailed description for the
 [package downloader](tpaexec-download-packages/).
 
 ## Copying packages to the target environment
 
-The resulting repository will be contained in the
+The resulting repository is in the
 `cluster-dir/local-repo` directory. This is a complete package repo for
-the target OS. Copy this directory, from the connected controller to the
+the target OS. Copy this directory from the connected controller to the
 disconnected controller that will be used to deploy the cluster. Place
 the directory in the same place, beneath the cluster directory. TPA
-will then copy packages to the instances automatically when `deploy` is
-run.
+then copies packages to the instances when you run `deploy`.
 
 ## Deploying in a disconnected environment
 
-Ensure that the cluster config.yml has been configured as above in
+Make sure that the cluster `config.yml` is configured as described in
 [Preparation](#preparation). Run `tpaexec provision` and `deploy` as you
-would normally.
+do normally.
 
 ## Updating in a disconnected environment
 
-You can use the [upgrade](../tpaexec-upgrade/) command to
-perform updates in an air-gapped environment. Prior to running this
-command you must run `download-packages` on the connected controller and
+You can use the [`upgrade`](../tpaexec-upgrade/) command to
+perform updates in an air-gapped environment. Before running this
+command, you must run `download-packages` on the connected controller and
 copy the updated repository to the disconnected controller.
diff --git a/product_docs/docs/tpa/23/reference/apt_repositories.mdx b/product_docs/docs/tpa/23/reference/apt_repositories.mdx
index b238058de12..52335ef4d51 100644
--- a/product_docs/docs/tpa/23/reference/apt_repositories.mdx
+++ b/product_docs/docs/tpa/23/reference/apt_repositories.mdx
@@ -4,10 +4,10 @@ originalFilePath: apt_repositories.md
 ---
 
-This page explains how to configure APT package repositories on Debian
+You can configure APT package repositories on Debian
 and Ubuntu systems.
 
-You can define named repositories in `apt_repositories`, and decide
+You can define named repositories in `apt_repositories` and decide
 which ones to use by listing the names in `apt_repository_list`:
 
 ```yaml
@@ -24,11 +24,12 @@ cluster_vars:
     - Example
 ```
 
-This configuration would install the GPG key (with id `key_id`,
-obtained from `key_url`) and a new entry under
-`/etc/apt/sources.list.d` with the given `repo` line (or lines)
-for the PGDG repository (which is already defined by default) and the
-new Example repository.
+This configuration installs:
+
+- The GPG key (with id `key_id`, obtained from `key_url`)
+- A new entry under `/etc/apt/sources.list.d` with the given `repo` line (or lines)
+  for the PGDG repository (which is already defined by default)
+- The new Example repository
 
 When you configure additional repositories, remember to include PGDG in
 `apt_repository_list` if you still want to install PGDG packages.
diff --git a/product_docs/docs/tpa/23/reference/artifacts.mdx b/product_docs/docs/tpa/23/reference/artifacts.mdx
index da0d942eb06..0cec130e702 100644
--- a/product_docs/docs/tpa/23/reference/artifacts.mdx
+++ b/product_docs/docs/tpa/23/reference/artifacts.mdx
@@ -32,22 +32,22 @@ cluster_vars:
 The following types are supported:
 
 - Use `path` to create or remove and change the ownership or mode of
-  files and directories (takes the same parameters as Ansible's `file`
-  module, which it uses internally)
+  files and directories. (This type takes the same parameters as Ansible's `file`
+  module, which it uses internally.)
 - Use `file` to copy a file from the controller and set the ownership
-  and mode (uses `copy`)
+  and mode (uses `copy`).
 - Use `archive` to extract files from an archive to a specified location
-  (uses `unarchive`)
+  (uses `unarchive`).
 - Use `directory` to rsync a directory from the controller to target
-  instances (uses `synchronize`)
+  instances (uses `synchronize`).
 
-The example shows one entry for each of the above artifact types, but
-you can use these or any other parameters that the corresponding Ansible
+The example shows one entry for each of these artifact types.
+You can use these or any other parameters that the corresponding Ansible
 module accepts.
 
-Copying files and directories to target instances is a common-enough
-need that this feature provides a convenient shortcut you can use
-instead of writing a [custom hook](../tpaexec-hooks/).
+Copying files and directories to target instances is a common need.
+Defining artifacts can be a convenient shortcut to
+writing a [custom hook](../tpaexec-hooks/).
diff --git a/product_docs/docs/tpa/23/reference/barman.mdx b/product_docs/docs/tpa/23/reference/barman.mdx
index e53b021dd71..3c46f52c480 100644
--- a/product_docs/docs/tpa/23/reference/barman.mdx
+++ b/product_docs/docs/tpa/23/reference/barman.mdx
@@ -4,8 +4,8 @@ originalFilePath: barman.md
 ---
 
-When an instance is given the `barman` role in config.yml, TPA will
-configure it as a [Barman](https://pgbarman.org/) server to take backups
+When an instance has the `barman` role in `config.yml`, TPA
+configures the instance as a [Barman](https://pgbarman.org/) server to take backups
 of any other instances that name it in their `backup` setting.
 
 ```yaml
@@ -21,32 +21,32 @@ instances:
 ```
 
 Multiple `postgres` instances can have the same Barman server named as
-their `backup`; equally, one `postgres` instance can have a list of
-Barman servers named as its `backup` and backups will be taken to all
+their `backup`. Any `postgres` instance can have a list of
+Barman servers named as its `backup`. In this case, backups are taken to all
 of the named servers.
 
-The default Barman configuration will connect to PostgreSQL using
-pg_receivewal to take continuous backups of WAL, and will take a full
+The default Barman configuration connects to PostgreSQL using
+`pg_receivewal` to take continuous backups of WAL. It takes a full
 backup of the instance using rsync over ssh twice weekly. Full backups
 and WAL are retained for long enough to enable recovery to any point in
-the last 4 weeks.
+the previous 4 weeks.
 
 ## Barman configuration
 
-On each barman server, a global configuration file will be created
-as `/etc/barman.conf`; this contains default values for many barman
-configuration variables. For each postgres server being backed up,
-an additional Barman configuration file is created: to back up the
-server `one`, the file will be `/etc/barman.d/one.conf`, and the backups
-will be stored in `/var/lib/barman/one`. The file and directory names
-are taken from the backed-up instance's `backup_name` setting, defaulting
-to the instance's name.
+On each Barman server, a global configuration file is created
+as `/etc/barman.conf`. This file contains default values for many Barman
+configuration variables. For each Postgres server being backed up,
+an additional Barman configuration file is created. For example, to back up the
+server `one`, the file is `/etc/barman.d/one.conf`, and the backups
+are stored in `/var/lib/barman/one`. The file and directory names
+are taken from the backed-up instance's `backup_name` setting. The default for this setting
+is the instance name.
 
-The following variables can be set on the backed-up instance and are
+You can set the following variables on the backed-up instance. They are
 passed through into Barman's configuration with the prefix `barman_`
-removed:
+removed.
 
-| variable | default |
+| Variable | Default |
 | ------------------------------- | -------------------------- |
 | barman_archiver | false |
 | barman_log_file | /var/log/barman.log |
@@ -69,20 +69,22 @@ removed:
 
 ## Backup scheduling
 
-TPA installs a cron job in `/etc/cron.d/barman` which will run every
-minute and invoke `barman cron` to perform maintenance tasks.
+TPA installs a cron job in `/etc/cron.d/barman` that runs every
+minute and invokes `barman cron` to perform maintenance tasks.
 
-For each instance being backed up, it installs another cron job in
-`/etc/cron.d/<backup_name>` which takes the backups of that instance.
+For each instance being backed up, TPA installs another cron job in
+`/etc/cron.d/<backup_name>` that takes the backups of that instance.
 This job runs as determined by the `barman_backup_interval` variable for
-the instance, with the default being to take backups at 04:00 every
+the instance. The default is to take backups at 04:00 every
 Wednesday and Saturday.
 
 ## SSH keys
 
-TPA will generate ssh key pairs for the `postgres` and `barman`
-users and install them into the respective ~/.ssh directories, and add
-them to each other's authorized_keys file. The postgres user must be
-able to ssh to the barman server in order to archive WAL segments (if
+TPA generates ssh key pairs for the postgres and barman
+users and installs them into the respective `~/.ssh` directories. Keys for
+the postgres user are added to the barman `authorized_keys` file, and
+keys for the barman user are added to the postgres `authorized_keys` file.
+The postgres user must be
+able to ssh to the Barman server to archive WAL segments (if
 configured), and the barman user must be able to ssh to the Postgres
 instance to take or restore backups.
diff --git a/product_docs/docs/tpa/23/reference/bdr.mdx b/product_docs/docs/tpa/23/reference/bdr.mdx
index 3a2be32508d..dbdae4754c4 100644
--- a/product_docs/docs/tpa/23/reference/bdr.mdx
+++ b/product_docs/docs/tpa/23/reference/bdr.mdx
@@ -11,63 +11,61 @@ and 5.x.
 
 Access to PGD packages is through EDB's package repositories only.
 You must have a valid EDB subscription token to download the packages.
 
-This documentation touches on several aspects of PGD configuration, but
-we refer you to the [PGD
-documentation](https://enterprisedb.com/docs/pgd/latest/) for an
-authoritative description of the details.
+This documentation touches on several aspects of PGD configuration, but
+for an authoritative description of the details,
+see the [PGD documentation](https://enterprisedb.com/docs/pgd/latest/).
 
-## Introduction
+## How TPA approaches PGD installation
 
-TPA will install PGD and any dependencies on all PGD instances along
-with Postgres itself.
+TPA installs PGD and any dependencies on all PGD instances. It also installs Postgres.
 
 After completing the basic Postgres setup and starting Postgres, TPA
-will then create the `bdr_database` and proceed to set up a PGD cluster
-through the various steps described below.
+then creates `bdr_database` and proceeds to set up a PGD cluster
+using the steps that follow.
 
 ## Installation
 
-TPA will install the correct PGD packages, depending on the version
-and flavour of Postgres in use (e.g., Postgres, Postgres Extended, or
-EPAS).
+TPA installs the correct PGD packages based on the version
+and flavor of Postgres in use (for example, Postgres, Postgres Extended, or
+EDB Postgres Advanced Server).
 
-Set `bdr_version` to determine which major version of PGD to install
-(i.e., 3, 4, 5). Set `bdr_package_version` to determine which exact
-package to install (e.g., '5.0\*' to install the latest 5.0.x).
+Set `bdr_version` to determine the major version of PGD to install
+(that is, 3, 4, or 5). Set `bdr_package_version` to determine the
+package to install. For example, use `5.0*` to install the latest 5.0.x package.
 
 ## Overview of cluster setup
 
 After installing the required packages, configuring Postgres to load
-PGD, and starting the server, TPA will go on to set up PGD nodes,
+PGD, and starting the server, TPA sets up PGD nodes,
 groups, replication sets, and other resources.
 
 Here's a summary of the steps TPA performs:
 
-- Create a PGD node (using bdr.create_node()) for each participating
-  instance
+- Create a PGD node (using `bdr.create_node()`) for each participating
  instance.
 
-- Create one or more PGD node groups (using bdr.create_node_group())
-  depending on `bdr_node_groups`
+- Create one or more PGD node groups (using `bdr.create_node_group()`)
+  based on `bdr_node_groups`.
 
-- Create replication sets, if required, to control exactly which changes
-  are replicated (depending on node group type and memberships, e.g.,
-  subscriber-only and witness nodes may need special handling)
+- Create replication sets, if required, to control the changes that
+  are replicated. This decision is based on node group type and memberships. For example,
+  subscriber-only and witness nodes might need special handling.
 
-- Join the relevant node groups on the individual instances
+- Join the relevant node groups on the individual instances.
 
 - Perform additional configuration, such as enabling subgroup RAFT or
   proxy routing.
 
-(This process involves executing a complex sequence of queries, some on
-each instance in turn, and others in parallel. To make the steps easier
+This process involves executing a complex sequence of queries, some on
+each instance in turn and others in parallel. To make the steps easier
 to follow, TPA designates an arbitrary PGD primary instance as the
-"first_bdr_primary" for the cluster, and uses this instance to execute
+`first_bdr_primary` for the cluster and uses this instance to execute
 most of these queries. The instance is otherwise not special, and its
-identity is not significant to the PGD configuration itself.)
+identity isn't significant to the PGD configuration.
 
 ## Instance roles
 
-Every instance with `bdr` in its `role` is a PGD instance, and
+Every instance with `bdr` in its role is a PGD instance and
 implicitly also a `postgres` server instance.
 
 A PGD instance with `readonly` in its role is a logical standby node
@@ -75,44 +73,44 @@ A PGD instance with `readonly` in its role is a logical standby node
 for promotion.
 
 A PGD instance with `subscriber-only` in its role is a subscriber-only
-node, which receives replicated changes but does not publish them.
+node, which receives replicated changes but doesn't publish them.
 
 A PGD instance with `witness` in its role is a witness node.
 
-Every PGD instance described above is implicitly also a `primary`
-instance. The exception is an instance with `replica` in its role; that
+Each of these PGD instances is implicitly also a `primary`
+instance. The exception is an instance with `replica` in its role. That
 setting indicates a physical streaming replica of an upstream PGD instance. Such
-instances are not included in any recommended PGD architecture, and not
+instances aren't included in any recommended PGD architecture and aren't
 currently supported by TPA.
 
 ## Configuration settings
 
-The settings mentioned below should ordinarily be set in `cluster_vars`,
-so that they are set uniformly for all the PGD instances in the cluster.
-You can set different values on different instances in some cases (e.g.,
-`bdr_database`), but in other cases, the result is undefined (e.g., all
-instances must have exactly the same value of `bdr_node_groups`).
+The settings that follow are ordinarily set in `cluster_vars`
+so that they're set uniformly for all the PGD instances in the cluster.
+You can set different values on different instances in some cases, for example,
+`bdr_database`. In other cases, though, the result is undefined. For example, all
+instances must have exactly the same value of `bdr_node_groups`.
 
 We strongly recommend defining your PGD configuration by setting uniform
 values for the whole cluster under `cluster_vars`.
 
 ### bdr_database
 
-The `bdr_database` (default: bdrdb) will be initialised with PGD.
+The `bdr_database` (default: `bdrdb`) is initialized with PGD.
 
 ### bdr_node_group
 
 The setting of `bdr_node_group` (default: based on the cluster name)
-identifies which PGD cluster an instance should be a part of. It is also
-used to identify a particular cluster for external components (e.g.,
-pgd-proxy or harp-proxy).
+identifies the PGD cluster that an instance belongs to. It's also
+used to identify a particular cluster for external components, such as
+pgd-proxy or harp-proxy.
 
 ### bdr_node_groups
 
-This is a list of PGD node groups that must be created before the group
-join stage (if the cluster requires additional subgroups).
+This setting is a list of PGD node groups that must be created before the group-join
+stage (if the cluster requires additional subgroups).
 
-In general, `tpaexec configure` will generate an appropriate value based
+In general, `tpaexec configure` generates an appropriate value based
 on the selected architecture.
 
 ```yaml
@@ -129,23 +127,23 @@ cluster_vars:
 
 The first entry must be for the cluster's `bdr_node_group`.
 
-Each subsequent entry in the list must specify a `parent_group_name`,
-and may specify the `node_group_type` (optional).
+Each subsequent entry in the list must specify a `parent_group_name`
+and can optionally specify the `node_group_type`.
 
-Each entry may also have an optional key/value mapping of group options.
+Each entry can also have an optional key/value mapping of group options.
 The available options vary by PGD version.
 
 ### bdr_child_group
 
 If `bdr_child_group` is set for an instance (to the name of a group that
-is mentioned in `bdr_node_groups`), it will join that group instead of
+is mentioned in `bdr_node_groups`), it joins that group instead of
 `bdr_node_group`.
 
 ### bdr_commit_scopes
 
-This is an optional list of
+This setting is an optional list of
 [commit scopes](https://www.enterprisedb.com/docs/pgd/latest/reference/commit-scopes/)
-that must exist in the PGD database (available for PGD 4.1 and above).
+that must exist in the PGD database (available for PGD 4.1 and later).
 
 ```yaml
 cluster_vars:
@@ -159,31 +157,31 @@ cluster_vars:
   …
 ```
 
-Each entry must specify the `name` of the commit scope, the name of the
-`origin` group, and the commit scope `rule`. The groups must correspond
+Each entry must specify a `name` value for the commit scope, the name of the
+`origin` group, and a commit scope `rule` value. The groups must correspond
 to entries in `bdr_node_groups`.
 
-If you set `bdr_commit_scopes` explicitly, TPA will create, alter, or
-drop commit scopes as needed to ensure that the database matches the
-configuration. If you do not set it, it will leave existing commit
+If you set `bdr_commit_scopes` explicitly, TPA creates, alters, or
+drops commit scopes as needed to ensure that the database matches the
+configuration. If you don't set it, TPA leaves existing commit
 scopes alone.
 
 ## Miscellaneous notes
 
 ### Hooks
 
-TPA invokes the bdr-node-pre-creation, bdr-post-group-creation, and
-bdr-pre-group-join [hooks](../tpaexec-hooks/) during the PGD cluster
+TPA invokes the `bdr-node-pre-creation`, `bdr-post-group-creation`, and
+`bdr-pre-group-join` [hooks](../tpaexec-hooks/) during the PGD cluster
 setup process.
 
 ### Database collations
 
 TPA checks that the PGD database on every instance in a cluster has
-the same collation (LC_COLLATE) setting. Having different collations in
-databases in the same PGD cluster is a data loss risk.
+the same collation (`LC_COLLATE`) setting. Having different collations in
+databases in the same PGD cluster risks data loss.
 
 ## Older versions of PGD
 
 TPA no longer actively supports or tests the deployment of BDR v1
 (with a patched version of Postgres 9.4), v2 (with Postgres 9.6), or
-any PGD versions below v3.7.
+any PGD versions before v3.7.
diff --git a/product_docs/docs/tpa/23/reference/harp.mdx b/product_docs/docs/tpa/23/reference/harp.mdx
index d9734256f85..d1a76df27a0 100644
--- a/product_docs/docs/tpa/23/reference/harp.mdx
+++ b/product_docs/docs/tpa/23/reference/harp.mdx
@@ -82,6 +82,10 @@ by setting `harp_dcs_user: username` under cluster_vars.
 TPA will use `harp_dcs_user` setting to create a readonly user and set
 it up in the DCS configuration.
 
+## Configuring a separate user for harp manager
+
+If you want harp manager to use a separate user, you can specify that by setting `harp_manager_user: username` under `cluster_vars`. TPA uses that setting to create a new user and grant it the `bdr_superuser` role.
+
 ## Custom SSL password command
 
 The command provided by `harp_ssl_password_command` will be used by HARP
diff --git a/product_docs/docs/tpa/23/reference/postgresql.conf.mdx b/product_docs/docs/tpa/23/reference/postgresql.conf.mdx
index d714ce5db3d..5109ffe9dd4 100644
--- a/product_docs/docs/tpa/23/reference/postgresql.conf.mdx
+++ b/product_docs/docs/tpa/23/reference/postgresql.conf.mdx
@@ -119,6 +119,19 @@ you can set a fully-quoted value under
 [`postgres_conf_settings`](#postgres_conf_settings). In this case, the
 value is set in `conf.d/9900-tpa_postgres_conf_settings.conf`.
 
+## Postgres log
+
+The default log file is defined as `/var/log/postgres/postgres.log`. If you
+need to change that, you can set `postgres_log_file` in `config.yml`:
+
+```yaml
+cluster_vars:
+  [...]
+  postgres_log_file: '/srv/fantastic_logs/pg_server.log'
+```
+
+TPA takes care of creating the directories and rotating the log when needed.
+
 ## Making changes by hand
 
 There are two ways you can override anything in the TPA-generated
diff --git a/product_docs/docs/tpa/23/rel_notes/index.mdx b/product_docs/docs/tpa/23/rel_notes/index.mdx
index b0afc87a0ab..708ab82ead9 100644
--- a/product_docs/docs/tpa/23/rel_notes/index.mdx
+++ b/product_docs/docs/tpa/23/rel_notes/index.mdx
@@ -2,6 +2,7 @@ title: Trusted Postgres Architect release notes
 navTitle: "Release notes"
 navigation:
+  - tpa_23.28_rel_notes
  - tpa_23.27_rel_notes
  - tpa_23.26_rel_notes
  - tpa_23.25_rel_notes
@@ -25,6 +26,7 @@ The Trusted Postgres Architect documentation describes the latest version of Tru
 
 | Version | Release date |
 | ---------------------------- | ------------ |
+| [23.28](tpa_23.28_rel_notes) | 23 Jan 2024 |
 | [23.27](tpa_23.27_rel_notes) | 19 Dec 2023 |
 | [23.26](tpa_23.26_rel_notes) | 30 Nov 2023 |
 | [23.25](tpa_23.25_rel_notes) | 14 Nov 2023 |
diff --git a/product_docs/docs/tpa/23/rel_notes/tpa_23.28_rel_notes.mdx b/product_docs/docs/tpa/23/rel_notes/tpa_23.28_rel_notes.mdx
new file mode 100644
index 00000000000..37c71b2935b
--- /dev/null
+++ b/product_docs/docs/tpa/23/rel_notes/tpa_23.28_rel_notes.mdx
@@ -0,0 +1,27 @@
+---
+title: Trusted Postgres Architect 23.28 release notes
+navTitle: "Version 23.28"
+---
+
+Released: 23 Jan 2024
+
+New features, enhancements, bug fixes, and other changes in Trusted Postgres Architect 23.28 include the following:
+
+| Type | Description |
+| ---- |------------ |
+| Enhancement | Added a new option `postgres_log_file`. This option sets the Postgres log file, whether logging through stderr or syslog. The default is `/var/log/postgres/postgres.log`, the previously hard-coded value. |
+| Enhancement | Added a new hook `barman-pre-config`. This hook is invoked after Barman is installed and its user is set up but before it is configured. It can be used for installing certificate files or for other tasks that need the barman user to exist but must be done before Barman is started. |
+| Enhancement | The key `elastic_ip` on an AWS instance in `config.yml` can be set to an elastic IP address that has already been allocated in order to assign it to this instance. |
+| Change | In Patroni clusters, TPA now sets up replicas before handing over control of the cluster to Patroni, rather than setting up the primary only and letting Patroni set up the replicas. |
+| Change | If `harp_manager_user` is defined in `config.yml`, TPA will create the user `harp_manager_user`, belonging to the `bdr_superuser` role, and set HARP manager to operate as this user instead of as the `postgres` superuser. |
+| Bug Fix | Fixed an issue whereby TPA would erroneously attempt to install repmgr on an EFM cluster. |
+| Bug Fix | Fixed an issue whereby TPA would return a non-zero exit code when the warning about 2q repositories was displayed despite deploy having succeeded. |
+| Bug Fix | TPA will now interpret wildcards correctly on Debian-family systems when downloading packages for offline use. |
+| Bug Fix | Fixed an issue whereby TPA would attempt to use incorrect package names for repmgr when installing from PGDG repositories. |
+| Bug Fix | Fixed Barman connection failure when using SELinux and a custom Barman home directory. |
+| Bug Fix | TPA will now use the correct cluster name in `show-password` and `store-password` commands when it is different from the directory name. |
+| Bug Fix | TPA will now error out cleanly if unavailable 2ndQuadrant repository keys are required. |
+| Bug Fix | TPA will now sanitize hostnames correctly when the `--cluster-prefixed-hostnames` option is used. |
+| Bug Fix | TPA will now ensure packages are correctly copied to the remote host when upgrading a cluster using a local repo. |
+
+
diff --git a/product_docs/docs/tpa/23/tpaexec-hooks.mdx b/product_docs/docs/tpa/23/tpaexec-hooks.mdx
index ede9c8f6ded..2ce4199c92b 100644
--- a/product_docs/docs/tpa/23/tpaexec-hooks.mdx
+++ b/product_docs/docs/tpa/23/tpaexec-hooks.mdx
@@ -83,6 +83,14 @@ Postgres and creating users, databases, and extensions.
 
 You can use this hook to execute SQL commands, for example, to perform
 custom extension configuration or create database objects.
 
+### barman-pre-config
+
+TPA invokes `hooks/barman-pre-config.yml` after installing Barman and
+setting up Barman users, but before generating any Barman configuration.
+
+You can use this hook, for example, to perform any tasks related to
+Barman certificate files or mount points.
+
 ### harp-config
 
 TPA invokes `hooks/harp-config.yml` after generating HARP configuration
diff --git a/scripts/source/process-tpa-docs-auto.sh b/scripts/source/process-tpa-docs-auto.sh
index e67d00291ef..cb853508d1b 100755
--- a/scripts/source/process-tpa-docs-auto.sh
+++ b/scripts/source/process-tpa-docs-auto.sh
@@ -14,10 +14,10 @@ rm -rf $SOURCE_CHECKOUT
 if [ -z $2 ]
 then
     # Get the TPA tree into /tmp/tpaforimport
-    git clone https://github.com/EnterpriseDB/tpa.git $SOURCE_CHECKOUT
+    git clone https://github.com/EnterpriseDB/tpa-internal.git $SOURCE_CHECKOUT
 else
     # Get specific branch of the TPA tree into /tmp/tpaforimport
-    git clone -b $2 https://github.com/EnterpriseDB/tpa.git $SOURCE_CHECKOUT
+    git clone -b $2 https://github.com/EnterpriseDB/tpa-internal.git $SOURCE_CHECKOUT
 fi
 
 # convert inputs to actual directory names, in case a relative path is passed in.
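
As an illustration of the new `barman-pre-config` hook described in the tpaexec-hooks changes above, a minimal `hooks/barman-pre-config.yml` sketch might copy a certificate into place after the barman user exists but before Barman is configured. The file names and paths here are hypothetical:

```yaml
# hooks/barman-pre-config.yml (illustrative sketch; file names and paths are hypothetical)
- name: Copy a TLS certificate into place for the barman user
  copy:
    src: files/barman-server.crt        # hypothetical file kept in the cluster directory
    dest: /var/lib/barman/barman-server.crt
    owner: barman
    group: barman
    mode: "0600"
```

Because the hook runs before any Barman configuration is generated, the certificate is already in place by the time Barman starts.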