Commit
Merge pull request #2492 from EnterpriseDB/release/2022-03-25
Release: 2022-03-25
drothery-edb authored Mar 25, 2022
2 parents 6624a55 + d32a789 commit 7ae0e05
Showing 76 changed files with 1,137 additions and 862 deletions.
95 changes: 57 additions & 38 deletions advocacy_docs/kubernetes/cloud_native_postgresql/api_reference.mdx

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -92,7 +92,7 @@ only write inside a single Kubernetes cluster, at any time.
!!! Tip
If you are interested in a PostgreSQL architecture where all instances accept writes,
please take a look at [BDR (Bi-Directional Replication) by EDB](https://www.enterprisedb.com/docs/bdr/latest/).
For Kubernetes, BDR will have its own Operator, expected late in 2021.
For Kubernetes, BDR will have its own Operator, expected later in 2022.

However, for business continuity objectives it is fundamental to:

@@ -35,7 +35,8 @@ You can archive the backup files in any service that is supported
by the Barman Cloud infrastructure. That is:

- [AWS S3](https://aws.amazon.com/s3/)
- [Microsoft Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/).
- [Microsoft Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/)
- [Google Cloud Storage](https://cloud.google.com/storage/)

You can also use any compatible implementation of the
supported services.
@@ -318,6 +319,71 @@ In that case, `<account-name>` is the first component of the path.
This is required if you are testing the Azure support via the Azure Storage
Emulator or [Azurite](https://github.com/Azure/Azurite).

### Google Cloud Storage

Currently, the operator supports two authentication methods for Google Cloud Storage:
one assumes that the pod is running inside a Google Kubernetes Engine cluster, while
the other relies on the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.

#### Running inside Google Kubernetes Engine

This is one of the easiest ways to create a backup, and it only requires
the following configuration:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
[...]
spec:
backup:
barmanObjectStore:
destinationPath: "gs://<destination path here>"
googleCredentials:
gkeEnvironment: true
```

This tells the operator that the cluster is running inside Google Kubernetes
Engine, meaning that no credentials need to be provided to upload the files.

!!! Important
This method requires carefully defined permissions for the cluster
and its pods, which have to be granted by a cluster administrator.

#### Using authentication

Following the [instructions from Google](https://cloud.google.com/docs/authentication/getting-started),
you will get a JSON file that contains all the information required to authenticate.

The content of the JSON file must be provided using a `Secret` that can be created
with the following command:

```shell
kubectl create secret generic backup-creds --from-file=gcsCredentials=gcs_credentials_file.json
```

This creates a `Secret` named `backup-creds`, which can then be referenced in the YAML manifest like this:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
[...]
spec:
backup:
barmanObjectStore:
destinationPath: "gs://<destination path here>"
googleCredentials:
applicationCredentials:
name: backup-creds
key: gcsCredentials
```

Now the operator will use the credentials to authenticate against Google Cloud Storage.

!!! Important
This authentication method creates a JSON file inside the container with all the
information needed to access your Google Cloud Storage bucket. As a consequence, anyone
who gains access to the pod also has write permissions to the bucket.

## On-demand backups

To request a new backup, you need to create a new Backup resource
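
A minimal `Backup` manifest might look like the following sketch; the backup
name `backup-example` and the target cluster name `cluster-example` are
illustrative assumptions, not values from this document:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Backup
metadata:
  name: backup-example
spec:
  cluster:
    # Name of an existing Cluster resource with a configured barmanObjectStore
    name: cluster-example
```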
14 changes: 7 additions & 7 deletions advocacy_docs/kubernetes/cloud_native_postgresql/bootstrap.mdx
@@ -77,8 +77,8 @@ method or the `recovery` one. An external cluster needs to have:
cluster - that is, base backups and WAL archives.

!!! Note
A recovery object store is normally an AWS S3 or an Azure Blob Storage
compatible source that is managed by Barman Cloud.
A recovery object store is normally an AWS S3, or an Azure Blob Storage,
or a Google Cloud Storage source that is managed by Barman Cloud.

When only the streaming connection is defined, the source can be used for the
`pg_basebackup` method. When only the recovery object store is defined, the
@@ -678,7 +678,7 @@ file on the source PostgreSQL instance:
host replication streaming_replica all md5
```

The following manifest creates a new PostgreSQL 14.1 cluster,
The following manifest creates a new PostgreSQL 14.2 cluster,
called `target-db`, using the `pg_basebackup` bootstrap method
to clone an external PostgreSQL cluster defined as `source-db`
(in the `externalClusters` array). As you can see, the `source-db`
@@ -693,7 +693,7 @@ metadata:
name: target-db
spec:
instances: 3
imageName: quay.io/enterprisedb/postgresql:14.1
imageName: quay.io/enterprisedb/postgresql:14.2
bootstrap:
pg_basebackup:
@@ -713,7 +713,7 @@ spec:
```

All the requirements must be met for the clone operation to work, including
the same PostgreSQL version (in our case 14.1).
the same PostgreSQL version (in our case 14.2).

#### TLS certificate authentication

@@ -728,7 +728,7 @@ in the same Kubernetes cluster.
This example can be easily adapted to cover an instance that resides
outside the Kubernetes cluster.

The manifest defines a new PostgreSQL 14.1 cluster called `cluster-clone-tls`,
The manifest defines a new PostgreSQL 14.2 cluster called `cluster-clone-tls`,
which is bootstrapped using the `pg_basebackup` method from the `cluster-example`
external cluster. The host is identified by the read/write service
in the same cluster, while the `streaming_replica` user is authenticated
@@ -743,7 +743,7 @@ metadata:
name: cluster-clone-tls
spec:
instances: 3
imageName: quay.io/enterprisedb/postgresql:14.1
imageName: quay.io/enterprisedb/postgresql:14.2
bootstrap:
pg_basebackup:
39 changes: 37 additions & 2 deletions advocacy_docs/kubernetes/cloud_native_postgresql/cnp-plugin.mdx
@@ -77,7 +77,7 @@ Cluster in healthy state
Name: sandbox
Namespace: default
System ID: 7039966298120953877
PostgreSQL Image: quay.io/enterprisedb/postgresql:14.1
PostgreSQL Image: quay.io/enterprisedb/postgresql:14.2
Primary instance: sandbox-2
Instances: 3
Ready instances: 3
@@ -121,7 +121,7 @@ Cluster in healthy state
Name: sandbox
Namespace: default
System ID: 7039966298120953877
PostgreSQL Image: quay.io/enterprisedb/postgresql:14.1
PostgreSQL Image: quay.io/enterprisedb/postgresql:14.2
Primary instance: sandbox-2
Instances: 3
Ready instances: 3
@@ -277,4 +277,39 @@ The following command will reload all configurations for a given cluster:

```shell
kubectl cnp reload [cluster_name]
```

### Maintenance

The `kubectl cnp maintenance` command helps you modify one or more clusters across namespaces
by setting the maintenance window values. It changes the following fields:

- `.spec.nodeMaintenanceWindow.inProgress`
- `.spec.nodeMaintenanceWindow.reusePVC`

The command accepts `set` or `unset` as an argument: `set` sets `inProgress` to `true`,
while `unset` sets it to `false`.

By default, `reusePVC` is always set to `false` unless the `--reusePVC` flag is passed.

The plugin asks for confirmation, showing the list of clusters to modify and their new values;
if you accept, the change is applied to all the clusters in the list.

If you want to put all the PostgreSQL clusters in your Kubernetes cluster into
maintenance, you just need to run the following command:

```shell
kubectl cnp maintenance set --all-namespaces
```

You'll then see the list of all the clusters to update:

```shell
The following are the new values for the clusters
Namespace Cluster Name Maintenance reusePVC
--------- ------------ ----------- --------
default cluster-example true false
default pg-backup true false
test cluster-example true false
Do you want to proceed? [y/n]: y
```
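
Once the maintenance work is done, the same clusters can be taken out of the
maintenance window with the `unset` argument described above (a sketch
mirroring the previous command):

```shell
# Set inProgress back to false on all clusters, in all namespaces
kubectl cnp maintenance unset --all-namespaces
```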
@@ -104,6 +104,24 @@ authentication (see the ["Authentication" section](#authentication) below).
Containers run as the `pgbouncer` system user, and access to the `pgbouncer`
database is only allowed via local connections, through `peer` authentication.

### Certificates

By default, the PgBouncer pooler uses the same certificates that are used by the
cluster itself. However, if the user provides their own certificates, the pooler
accepts secrets of the following types:

1. Basic Auth
2. TLS
3. Opaque

In the Opaque case, the pooler looks for the following specific keys:

- `tls.crt`
- `tls.key`

In other words, an Opaque secret with these keys is treated like a TLS secret.
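
For example, a TLS secret holding these two keys can be created with the
standard `kubectl` command; the secret name and file paths below are
placeholders, not values from this document:

```shell
# Creates a Secret of type kubernetes.io/tls with the keys tls.crt and tls.key
kubectl create secret tls pooler-server-cert \
  --cert=server.crt \
  --key=server.key
```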

## Authentication

**Password based authentication** is the only supported method for clients of
@@ -121,6 +139,33 @@ Internally, our implementation relies on PgBouncer's `auth_user` and `auth_query
- removes all the above when it detects that a cluster does not have
any pooler associated with it

!!! Important
If you specify your own secrets, the operator will not automatically integrate the Pooler.

To manually integrate the Pooler when you have specified your own secrets, you must run the following queries from inside your cluster.

1. Create the role:

```sql
CREATE ROLE cnp_pooler_pgbouncer WITH LOGIN;
```

2. For each application database, grant the permission for `cnp_pooler_pgbouncer` to connect to it:

```sql
GRANT CONNECT ON DATABASE { database name here } TO cnp_pooler_pgbouncer;
```

3. Connect to each application database and create the authentication function inside it:

```sql
CREATE OR REPLACE FUNCTION user_search(uname TEXT)
  RETURNS TABLE (usename name, passwd text)
  AS 'SELECT usename, passwd FROM pg_shadow WHERE usename=$1;'
  LANGUAGE sql SECURITY DEFINER;
REVOKE ALL ON FUNCTION user_search(text) FROM public;
GRANT EXECUTE ON FUNCTION user_search(text) TO cnp_pooler_pgbouncer;
```
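
Assuming the function and grants were created as above, you can sanity-check the
setup from the same application database; `streaming_replica` is used here only
as an example of a role that exists in these clusters, and the returned password
hash depends on your installation:

```sql
-- Should return one row (usename, passwd) if the user exists
SELECT * FROM user_search('streaming_replica');
```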

## PodTemplates

You can take advantage of pod templates specification in the `template`
@@ -33,7 +33,7 @@ Native PostgreSQL overrides it with its instance manager.
in a **Primary with multiple/optional Hot Standby Servers Architecture**
only.

EnterpriseDB provides and supports public container images for Cloud Native
EDB provides and supports public container images for Cloud Native
PostgreSQL and publishes them on
[Quay.io](https://quay.io/repository/enterprisedb/postgresql).

@@ -27,7 +27,7 @@ PostgreSQL container images are available at

You can use Cloud Native PostgreSQL with EDB Postgres Advanced
too. You need to request a trial license key from the
[EnterpriseDB website](https://cloud-native.enterprisedb.com).
[EDB website](https://cloud-native.enterprisedb.com).

EDB Postgres Advanced container images are available at
[quay.io/enterprisedb/edb-postgres-advanced](https://quay.io/repository/enterprisedb/edb-postgres-advanced).
79 changes: 79 additions & 0 deletions advocacy_docs/kubernetes/cloud_native_postgresql/failover.mdx
@@ -0,0 +1,79 @@
---
title: 'Automated failover'
originalFilePath: 'src/failover.md'
product: 'Cloud Native Operator'
---

In the case of unexpected errors on the primary, the cluster will go into
**failover mode**. This may happen, for example, when:

- The primary pod has a disk failure
- The primary pod is deleted
- The `postgres` container on the primary has any kind of sustained failure

In the failover scenario, the primary cannot be assumed to be working properly.

After cases like the ones above, the readiness probe for the primary pod will start
failing. This will be picked up in the controller's reconciliation loop. The
controller will initiate the failover process, in two steps:

1. First, it will mark the `TargetPrimary` as `pending`. This change of state will
force the primary pod to shut down, ensuring that the WAL receivers on the replicas
stop. The cluster will be marked as being in failover phase ("Failing over").
2. Once all WAL receivers are stopped, there will be a leader election, and a
new primary will be named. The chosen instance will initiate promotion to
primary, and, after this is completed, the cluster will resume normal operations.
Meanwhile, the former primary pod will restart, detect that it is no longer
the primary, and become a replica node.

!!! Important
The two-phase procedure helps ensure the WAL receivers can stop in an orderly
fashion, and that the failing primary will not start streaming WALs again upon
restart. These safeguards prevent timeline discrepancies between the new primary
and the replicas.

During the time the failing primary is being shut down:

1. It will first try a PostgreSQL *fast shutdown* with
`.spec.switchoverDelay` seconds as the timeout. This graceful shutdown will attempt
to archive pending WALs.
2. If the fast shutdown fails, or its timeout is exceeded, a PostgreSQL
*immediate shutdown* is initiated.

!!! Info
"Fast" mode does not wait for PostgreSQL clients to disconnect and will
terminate an online backup in progress. All active transactions are rolled back
and clients are forcibly disconnected, then the server is shut down.
"Immediate" mode will abort all PostgreSQL server processes immediately,
without a clean shutdown.

## RTO and RPO impact

Failover may result in the service being impacted and/or data being lost:

1. During the time when the primary has started to fail, and before the controller
starts failover procedures, queries in transit, WAL writes, checkpoints, and
similar operations may fail.
2. Once the fast shutdown command has been issued, the cluster will no longer
accept connections, so service will be impacted but no data
will be lost.
3. If the fast shutdown fails, the immediate shutdown will stop any pending
processes, including WAL writing. Data may be lost.
4. During the time the primary is shutting down and a new primary hasn't yet
started, the cluster will operate without a primary and thus be impaired - but
with no data loss.

!!! Note
The timeout that controls fast shutdown is set by `.spec.switchoverDelay`,
as in the case of a switchover. Increasing the time for fast shutdown is safer
from an RPO point of view, but possibly delays the return to normal operation -
negatively affecting RTO.

!!! Warning
As already mentioned in the ["Instance Manager" section](instance_manager.md)
when explaining the switchover process, the `.spec.switchoverDelay` option
affects the RPO and RTO of your PostgreSQL database. Setting it to a low value
might favor RTO over RPO but lead to data loss at cluster level and/or backup
level. On the other hand, setting it to a high value might remove the risk of
data loss while leaving the cluster without an active primary for a longer time
during the switchover.
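
As a sketch, the delay discussed above is set directly on the `Cluster`
resource; the cluster name and the value below (15 minutes, expressed in
seconds) are illustrative assumptions only:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  # Timeout, in seconds, for the fast shutdown attempted before
  # falling back to an immediate shutdown during failover/switchover
  switchoverDelay: 900
```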
@@ -9,7 +9,7 @@ PostgreSQL can face on a Kubernetes cluster during its lifetime.

!!! Important
In case the failure scenario you are experiencing is not covered by this
section, please immediately contact EnterpriseDB for support and assistance.
section, please immediately contact EDB for support and assistance.

!!! Seealso "Postgres instance manager"
Please refer to the ["Postgres instance manager" section](instance_manager.md)
@@ -125,9 +125,9 @@ pod will be created from a backup of the current primary. The pod
will be added again to the `-r` service and to the `-ro` service when ready.

If the failed pod is the primary, the operator will promote the active pod
with status ready and the lowest replication lag, then point the `-rw`service
with status ready and the lowest replication lag, then point the `-rw` service
to it. The failed pod will be removed from the `-r` service and from the
`-ro` service.
`-rw` service.
Other standbys will start replicating from the new primary. The former
primary will use `pg_rewind` to synchronize itself with the new one if its
PVC is available; otherwise, a new standby will be created from a backup of the
@@ -140,7 +140,7 @@ to solve the problem manually.

!!! Important
In such cases, please do not perform any manual operation without the
support and assistance of EnterpriseDB engineering team.
support and assistance of EDB engineering team.

From version 1.11.0 of the operator, you can use the
`k8s.enterprisedb.io/reconciliationLoop` annotation to temporarily disable the
