Merge pull request #5559 from EnterpriseDB/efm/4_9_release_changes
Efm 4.9 release changes
gvasquezvargas authored May 15, 2024
2 parents 212c63d + 339a584 commit f140529
Showing 13 changed files with 103 additions and 128 deletions.
@@ -21,7 +21,7 @@ redirects:
{{ super() }}
{% endblock product_prerequisites %}
{% block postinstall %}
Where `<4x>` is the version of Failover Manager that you're installing. For example, if you're installing version 4.8, the package name is `edb-efm48`.
Where `<4x>` is the version of Failover Manager that you're installing. For example, if you're installing version 4.9, the package name is `edb-efm49`.
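
For example, on a RHEL-compatible host with the EDB repository already configured (a hedged sketch; the repository setup and the use of `dnf` are assumptions, not part of this page):

```shell
# Installs Failover Manager 4.9; assumes the EDB repository is already enabled
sudo dnf -y install edb-efm49
```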

The installation process creates a user named efm that has privileges to invoke scripts that control the Failover Manager service for clusters owned by enterprisedb or postgres.

@@ -15,7 +15,7 @@ Each node in a Failover Manager cluster has a properties file (by default, named
After completing the Failover Manager installation, make a working copy of the template before modifying the file contents:

```text
# cp /etc/edb/efm-4.8/efm.properties.in /etc/edb/efm-4.8/efm.properties
# cp /etc/edb/efm-4.9/efm.properties.in /etc/edb/efm-4.9/efm.properties
```

After copying the template file, change the owner of the file to efm:
@@ -224,42 +224,15 @@ db.bin=

<div id="db_data_dir" class="registered_link"></div>

Use the `db.data.dir` property to specify the location to write a recovery file on the primary node of the cluster during promotion. This property is required on primary and standby nodes. It isn't required on a dedicated witness node.
Use the `db.data.dir` property to specify the location where a `standby.signal` or `recovery.conf` file will be created. This property is required on primary and standby nodes. It isn't required on a dedicated witness node.

```ini
# For database version 12 and up, this is the directory where a
# standby.signal file will exist for a standby node. For previous
# versions, this is the location of the db recovery.conf file on
# the node.
# After a failover, the recovery.conf files on remaining standbys are
# changed to point to the new primary db (a copy of the original is made
# first). On a primary node, a recovery.conf file will be written during
# failover and promotion to ensure that the primary node can not be
# restarted as the primary database.
# This corresponds to database environment variable PGDATA and should
# be same as the output of query 'show data_directory;' on respective
# database.
# This is the directory where a standby.signal file will exist for a standby node.
# If a primary database fails, a recovery.conf file will be written in this
# location to ensure that the failed database can not be restarted as the
# primary database.
# This corresponds to database environment variable PGDATA and should be same
# as the output of query 'show data_directory;' on respective database.
db.data.dir=
```
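
As an illustration, a filled-in value might look like this (the path is hypothetical and must match the output of `show data_directory;` on that node):

```ini
# Illustrative data directory for an EDB Postgres Advanced Server instance
db.data.dir=/var/lib/edb/as15/data
```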

@@ -269,14 +242,13 @@ Use the `db.config.dir` property to specify the location of database configurati

```ini
# Specify the location of database configuration files if they are
# not contained in the same location as the recovery.conf or
# standby.signal file. This is most likely the case for Debian
# installations. The location specified will be used as the -D value
# (the location of the data directory for the cluster) when calling
# pg_ctl to start or stop the database. If this property is blank,
# the db.data.dir location specified by the db.data.dir property will
# be used. This corresponds to the output of query 'show config_file;'
# on respective database.
# not contained in the same location as the standby.signal
# file. This is most likely the case for Debian installations. The location
# specified will be used as the -D value (the location of the data directory
# for the cluster) when calling pg_ctl to start or stop the database.
# If this property is blank, the db.data.dir location specified by the
# db.data.dir property will be used.
# This corresponds to the output of query 'show config_file;' on respective database.
db.config.dir=
```
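
For instance, on a Debian-style installation the two properties might be set along these lines (both paths are illustrative):

```ini
# Illustrative Debian-style layout: configuration files live apart
# from the data directory.
db.config.dir=/etc/postgresql/15/main
db.data.dir=/var/lib/postgresql/15/main
```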

@@ -315,10 +287,10 @@ For information about configuring and using SSL, see [Secure TCP/IP Connections
Use the `user.email` property to specify an email address (or multiple email addresses) to receive notifications sent by Failover Manager.

```ini
# Email address(es) for notifications. The value of this
# property must be the same across all agents. Multiple email
# addresses must be separated by space. If using a notification
# script instead, this property can be left blank.
# Email address(es) for notifications. The value of this property must
# be the same across all agents. Multiple email addresses must
# be separated by space. This is required if not using a 'script.notification'
# script. Either/both can be used.
user.email=
```
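
For example, with two hypothetical recipients:

```ini
# Illustrative addresses; separate multiple recipients with a space
user.email=dba@example.com oncall@example.com
```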

@@ -506,10 +478,10 @@ Use the `node.timeout` property to specify the number of seconds for an agent to
node.timeout=50
```

!!! Summary/comparison of timeout properties
- The `local.*` properties are for failure detection of an agent's local database.
- The `node.timeout` property is for failure detection of other nodes.
- The `remote.timeout` property limits how long agents wait for responses from other agents.
!!!note Summary/comparison of timeout properties
- The `local.*` properties are for failure detection of an agent's local database.
- The `node.timeout` property is for failure detection of other nodes.
- The `remote.timeout` property limits how long agents wait for responses from other agents.
!!!
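
As a hedged illustration of how these settings sit together in the cluster properties file (the property names follow the usual EFM `local.*`/`remote.timeout` naming, and all values are illustrative rather than recommendations):

```ini
# Illustrative values only; tune them for your environment.
# Seconds between checks of the local database:
local.period=10
# Seconds to wait for a local database health check to respond:
local.timeout=60
# Final wait, in seconds, before declaring the local database failed:
local.timeout.final=10
# Seconds of silence before another node is considered failed:
node.timeout=50
# Upper bound, in seconds, on waiting for responses from other agents:
remote.timeout=10
```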

<div id="encrypt_agent_messages" class="registered_link"></div>

@@ -584,7 +557,7 @@ To perform maintenance on the primary database when `primary.shutdown.as.failure

<div id="update_physical_slots_period" class="registered_link"></div>

Use the `update.physical.slots.period` property to define the slot advance frequency for database version 12 and later. When `update.physical.slots.period` is set to a positive integer value, the primary agent reads the current `restart_lsn` of the physical replication slots after every `update.physical.slots.period` seconds and sends this information with its `pg_current_wal_lsn` and `primary_slot_name` (if it is set in the postgresql.conf file) to the standbys. The physical slots must already exist on the primary for the agent to find them. If physical slots do not already exist on the standbys, standby agents create the slots and then update `restart_lsn` parameter for these slots. A non-promotable standby doesn't create new slots but updates them if they exist.
Use the `update.physical.slots.period` property to define the slot advance frequency. When `update.physical.slots.period` is set to a positive integer value, the primary agent reads the current `restart_lsn` of the physical replication slots after every `update.physical.slots.period` seconds and sends this information with its `pg_current_wal_lsn` and `primary_slot_name` (if it is set in the postgresql.conf file) to the standbys. The physical slots must already exist on the primary for the agent to find them. If physical slots don't already exist on the standbys, standby agents create the slots and then update the `restart_lsn` parameter for these slots. A non-promotable standby doesn't create new slots but updates them if they exist.

Before updating the `restart_lsn` value of a slot, the agent checks to see if an `xmin` value has been set, which may happen if this was previously a primary node. If an `xmin` value has been set for the slot, the agent drops and recreates the slot before updating the `restart_lsn` value.
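
A minimal sketch of enabling the feature in the cluster properties file (the 10-second interval is illustrative):

```ini
# Illustrative: send physical slot positions to the standbys every 10 seconds
update.physical.slots.period=10
```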

@@ -690,15 +663,14 @@ auto.failover=true

<div id="auto_reconfigure" class="registered_link"></div>

Use the `auto.reconfigure` property to instruct Failover Manager to enable or disable automatic reconfiguration of remaining standby servers after the primary standby is promoted to primary. Set the property to `true` (the default) to enable automatic reconfiguration or `false` to disable automatic reconfiguration. This property isn't required on a dedicated witness node. If you're using EDB Postgres Advanced Server or PostgreSQL version 11, the `recovery.conf` file is backed up during the reconfiguration process.
Use the `auto.reconfigure` property to instruct Failover Manager to enable or disable automatic reconfiguration of remaining standby servers after the primary standby is promoted to primary. Set the property to `true` (the default) to enable automatic reconfiguration or `false` to disable automatic reconfiguration. This property isn't required on a dedicated witness node.

```ini
# After a standby is promoted, Failover Manager will attempt to
# update the remaining standbys to use the new primary. For database
# versions before 12, Failover Manager will back up recovery.conf.
# Then it will change the host parameter of the primary_conninfo entry
# in recovery.conf or postgresql.auto.conf, and restart the database.
# The restart command is contained in either the efm_db_functions or
# update the remaining standbys to use the new primary. Failover
# Manager will change the host parameter of the primary_conninfo
# entry in postgresql.auto.conf and restart the database. The
# restart command is contained in either the efm_db_functions or
# efm_root_functions file; default when not running db as an os
# service is: "pg_ctl restart -m fast -w -t <timeout> -D <directory>"
# where the timeout is the local.timeout property value and the
@@ -35,7 +35,7 @@ This example shows using the `encrypt` utility to encrypt a password for the `ac
# efm encrypt acctg
This utility will generate an encrypted password for you to place in
your Failover Manager cluster property file:
/etc/edb/efm-4.8/acctg.properties
/etc/edb/efm-4.9/acctg.properties
Please enter the password and hit enter:
Please enter the password again to confirm:
The encrypted password is: 516b36fb8031da17cfbc010f7d09359c
@@ -49,17 +49,11 @@ db.password.encrypted=516b36fb8031da17cfbc010f7d09359c
After receiving your encrypted password, paste the password into the properties file and start the Failover Manager service. If there's a problem with the encrypted password, the Failover Manager service doesn't start:

```text
[witness@localhost ~]# systemctl start edb-efm-4.8
Job for edb-efm-4.8.service failed because the control process exited with error code. See "systemctl status edb-efm-4.8.service" and "journalctl -xe" for details.
[witness@localhost ~]# systemctl start edb-efm-4.9
Job for edb-efm-4.9.service failed because the control process exited with error code. See "systemctl status edb-efm-4.9.service" and "journalctl -xe" for details.
```

If you receive this message when starting the Failover Manager service, see the startup log `/var/log/efm-4.8/startup-efm.log` for more information.

If you are using RHEL/CentOS 7.x or RHEL/Rocky Linux/AlmaLinux 8.x, startup information is also available with the following command:

```shell
systemctl status edb-efm-4.8
```
If you receive this message when starting the Failover Manager service, see the startup log `/var/log/efm-4.<x>/startup-efm.log` for more information.
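
For example, to review recent startup output on a 4.9 installation (substitute your installed version):

```shell
sudo tail -n 50 /var/log/efm-4.9/startup-efm.log
```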

To prevent a cluster from inadvertently connecting to the database of another cluster, the cluster name is incorporated into the encrypted password. If you modify the cluster name, you must re-encrypt the database password and update the cluster properties file.
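
For example, if a cluster were renamed to the hypothetical name `sales`, the password would need to be re-encrypted for that name and the new value pasted into `db.password.encrypted` in the renamed properties file:

```shell
# 'sales' is an illustrative cluster name
efm encrypt sales
```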

@@ -15,7 +15,7 @@ Each node in a Failover Manager cluster has a cluster members file (by default n
After completing the Failover Manager installation, make a working copy of the template:

```shell
cp /etc/edb/efm-4.8/efm.nodes.in /etc/edb/efm-4.8/efm.nodes
cp /etc/edb/efm-4.9/efm.nodes.in /etc/edb/efm-4.9/efm.nodes
```

After copying the template file, change the owner of the file to efm:
@@ -21,7 +21,7 @@ The `efm_db_functions` or `efm_root_functions` scripts perform management functi

The `sudoers` file contains entries that allow the user efm to control the Failover Manager service for clusters owned by postgres or enterprisedb. You can modify a copy of the `sudoers` file to grant permission to efm to manage Postgres clusters owned by other users.

The `efm-48` file is located in `/etc/sudoers.d` and contains the following entries:
The `efm-49` file is located in `/etc/sudoers.d` and contains the following entries:

```text
# Copyright EnterpriseDB Corporation, 2014-2021. All Rights Reserved.
@@ -36,24 +36,24 @@ The `efm-48` file is located in `/etc/sudoers.d` and contains the following entr
# If you run your db service under a non-default account, you will need to copy
# this file to grant the proper permissions and specify the account in your efm
# cluster properties file by changing the 'db.service.owner' property.
efm ALL=(postgres) NOPASSWD: /usr/edb/efm-4.8/bin/efm_db_functions
efm ALL=(enterprisedb) NOPASSWD: /usr/edb/efm-4.8/bin/efm_db_functions
efm ALL=(postgres) NOPASSWD: /usr/edb/efm-4.9/bin/efm_db_functions
efm ALL=(enterprisedb) NOPASSWD: /usr/edb/efm-4.9/bin/efm_db_functions
# Allow user 'efm' to sudo efm_root_functions as 'root' to write/delete the PID file,
# validate the db.service.owner property, etc.
efm ALL=(ALL) NOPASSWD: /usr/edb/efm-4.8/bin/efm_root_functions
efm ALL=(ALL) NOPASSWD: /usr/edb/efm-4.9/bin/efm_root_functions
# Allow user 'efm' to sudo efm_address as root for VIP tasks.
efm ALL=(ALL) NOPASSWD: /usr/edb/efm-4.8/bin/efm_address
efm ALL=(ALL) NOPASSWD: /usr/edb/efm-4.9/bin/efm_address
# Allow user 'efm' to sudo efm_pgpool_functions as root for pgpool tasks.
efm ALL=(ALL) NOPASSWD: /usr/edb/efm-4.8/bin/efm_pgpool_functions
efm ALL=(ALL) NOPASSWD: /usr/edb/efm-4.9/bin/efm_pgpool_functions
# relax tty requirement for user 'efm'
Defaults:efm !requiretty
```

If you're using Failover Manager to monitor clusters that are owned by users other than postgres or enterprisedb, make a copy of the `efm-48` file. Then modify the content to allow the user to access the `efm_functions` script to manage their clusters.
If you're using Failover Manager to monitor clusters that are owned by users other than postgres or enterprisedb, make a copy of the `efm-49` file. Then modify the content to allow the user to access the `efm_functions` script to manage their clusters.
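
A hedged sketch for a hypothetical database service account named `dbowner`:

```shell
# 'dbowner' is a hypothetical account; adjust the name to your cluster owner
sudo cp /etc/sudoers.d/efm-49 /etc/sudoers.d/efm-49-dbowner
# In the copy, grant efm the same access for that account, for example:
#   efm ALL=(dbowner) NOPASSWD: /usr/edb/efm-4.9/bin/efm_db_functions
```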

If an agent can't start because of permission problems, make sure the default `/etc/sudoers` file contains the following line at the end of the file:

@@ -89,17 +89,17 @@ To run Failover Manager without sudo, you must select a database process owner w
```shell
su - enterprisedb

cp /etc/edb/efm-4.8/efm.properties.in <directory/cluster_name>.properties
cp /etc/edb/efm-4.9/efm.properties.in <directory/cluster_name>.properties

cp /etc/edb/efm-4.8/efm.nodes.in <directory>/<cluster_name>.nodes
cp /etc/edb/efm-4.9/efm.nodes.in <directory>/<cluster_name>.nodes
```

Then, modify the cluster properties file, providing the name of the user in the `db.service.owner` property. Also make sure that the `db.service.name` property is blank. Without sudo, you can't run services without root access.

After modifying the configuration, the new user can control Failover Manager with the following command:

```shell
/usr/edb/efm-4.8/bin/runefm.sh start|stop <directory/cluster_name>.properties
/usr/edb/efm-4.9/bin/runefm.sh start|stop <directory/cluster_name>.properties
```

Where `<directory/cluster_name>.properties` specifies the full path of the cluster properties file. The user provides the full path to the properties file whenever the nondefault user is controlling agents or using the `efm` script.
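
For example, a nondefault user might start an agent for a hypothetical `acctg` cluster like this:

```shell
/usr/edb/efm-4.9/bin/runefm.sh start /home/enterprisedb/acctg.properties
```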
18 changes: 9 additions & 9 deletions product_docs/docs/efm/4/05_using_efm.mdx
@@ -132,13 +132,13 @@ Where:

During switchover:

- For server version 11, the `recovery.conf` file is copied from an existing standby to the primary node. For server version 12 and later, the `primary_conninfo` and `restore_command` parameters are copied and stored in memory.
- The `primary_conninfo` and `restore_command` parameters are copied from an existing standby to the primary node and stored in memory.
- The primary database is stopped.
- If you are using a VIP, the address is released from the primary node.
- A standby is promoted to replace the primary node and acquires the VIP.
- The address of the new primary node is added to the `recovery.conf` file or the `primary_conninfo` details are stored in memory.
- If the `application.name` property is set for this node, the application_name property is added to the `recovery.conf` file or the `primary_conninfo` information is stored in memory.
- If you're using server version 12 or later, the recovery settings that were stored in memory are written to the `postgresql.auto.conf` file. A `standby.signal` file is created.
- The address of the new primary node is added to the `primary_conninfo` details stored in memory.
- If the `application.name` property is set for this node, the `application_name` property is added to the `primary_conninfo` information stored in memory.
- The recovery settings that were stored in memory are written to the `postgresql.auto.conf` file. A `standby.signal` file is created.
- The old primary is started; the agent resumes monitoring it as a standby.
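
To make the end state concrete, after these steps a reconfigured standby's `postgresql.auto.conf` might contain an entry along these lines (host, port, user, and application name are all illustrative):

```ini
# Illustrative recovery setting written during switchover; an empty
# standby.signal file is created in the data directory alongside it.
primary_conninfo = 'host=192.0.2.10 port=5444 user=replicator application_name=standby1'
```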

During a promotion, the primary agent releases the virtual IP address. If it isn't a switchover, a `recovery.conf` file is created in the directory specified by the `db.data.dir` property. The `recovery.conf` file is used to prevent the old primary database from starting until the file is removed, preventing the node from starting as a second primary in the cluster. If the promotion is part of a switchover, recovery settings are handled as described above.
@@ -271,11 +271,11 @@ After creating the `acctg.properties` and `sales.properties` files, create a ser

If you're using RHEL/CentOS 7.x or RHEL/Rocky Linux/AlmaLinux 8.x, copy the service file `/usr/lib/systemd/system/edb-efm-4.<x>.service` to `/etc/systemd/system` with a new name that's unique for each cluster.

For example, if you have two clusters named `acctg` and `sales` managed by Failover Manager 4.8, the unit file names might be `efm-acctg.service` and `efm-sales.service`. You can create them with:
For example, if you have two clusters named `acctg` and `sales` managed by Failover Manager 4.9, the unit file names might be `efm-acctg.service` and `efm-sales.service`. You can create them with:

```shell
cp /usr/lib/systemd/system/edb-efm-4.8.service /etc/systemd/system/efm-acctg.service
cp /usr/lib/systemd/system/edb-efm-4.8.service /etc/systemd/system/efm-sales.service
cp /usr/lib/systemd/system/edb-efm-4.9.service /etc/systemd/system/efm-acctg.service
cp /usr/lib/systemd/system/edb-efm-4.9.service /etc/systemd/system/efm-sales.service
```

Then use `systemctl edit` to edit the `CLUSTER` variable in each unit file, changing the specified cluster name from `efm` to the new cluster name.
@@ -286,15 +286,15 @@ In this example, edit the `acctg` cluster by running `systemctl edit efm-acctg.s
```ini
[Service]
Environment=CLUSTER=acctg
PIDFile=/run/efm-4.8/acctg.pid
PIDFile=/run/efm-4.9/acctg.pid
```

Edit the `sales` cluster by running `systemctl edit efm-sales.service` and write:

```ini
[Service]
Environment=CLUSTER=sales
PIDFile=/run/efm-4.8/sales.pid
PIDFile=/run/efm-4.9/sales.pid
```
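
After editing both unit files, reload systemd and start the services in the usual way, for example:

```shell
sudo systemctl daemon-reload
sudo systemctl enable --now efm-acctg.service efm-sales.service
```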

!!!Note
