Commit
Merge pull request #2492 from EnterpriseDB/release/2022-03-25
Release: 2022-03-25
drothery-edb authored Mar 25, 2022
2 parents 6624a55 + d32a789 commit 7ae0e05
Showing 76 changed files with 1,137 additions and 862 deletions.
95 changes: 57 additions & 38 deletions advocacy_docs/kubernetes/cloud_native_postgresql/api_reference.mdx

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -92,7 +92,7 @@ only write inside a single Kubernetes cluster, at any time.
!!! Tip
If you are interested in a PostgreSQL architecture where all instances accept writes,
please take a look at [BDR (Bi-Directional Replication) by EDB](https://www.enterprisedb.com/docs/bdr/latest/).
For Kubernetes, BDR will have its own Operator, expected late in 2021.
For Kubernetes, BDR will have its own Operator, expected later in 2022.

However, for business continuity objectives it is fundamental to:

@@ -35,7 +35,8 @@ You can archive the backup files in any service that is supported
by the Barman Cloud infrastructure. That is:

- [AWS S3](https://aws.amazon.com/s3/)
- [Microsoft Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/).
- [Microsoft Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/)
- [Google Cloud Storage](https://cloud.google.com/storage/)

You can also use any compatible implementation of the
supported services.
@@ -318,6 +319,71 @@ In that case, `<account-name>` is the first component of the path.
This is required if you are testing the Azure support via the Azure Storage
Emulator or [Azurite](https://github.com/Azure/Azurite).

### Google Cloud Storage

Currently, the operator supports two authentication methods for Google Cloud Storage:
one assumes that the pod is running inside a Google Kubernetes Engine cluster, while
the other relies on the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.

#### Running inside Google Kubernetes Engine

This is one of the easiest ways to create a backup, and it only requires
the following configuration:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
[...]
spec:
backup:
barmanObjectStore:
destinationPath: "gs://<destination path here>"
googleCredentials:
gkeEnvironment: true
```

This tells the operator that the cluster is running inside Google Kubernetes
Engine, meaning that no credentials need to be provided to upload the files.

!!! Important
This method requires carefully defined permissions for the cluster
and its pods, which have to be granted by a cluster administrator.

#### Using authentication

Following the [instructions from Google](https://cloud.google.com/docs/authentication/getting-started),
you will get a JSON file that contains all the information required to authenticate.

The content of the JSON file must be provided using a `Secret` that can be created
with the following command:

```shell
kubectl create secret generic backup-creds --from-file=gcsCredentials=gcs_credentials_file.json
```

This creates a `Secret` named `backup-creds`, which can then be referenced in the YAML manifest like this:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
[...]
spec:
backup:
barmanObjectStore:
destinationPath: "gs://<destination path here>"
googleCredentials:
applicationCredentials:
name: backup-creds
key: gcsCredentials
```

Now the operator will use the credentials to authenticate against Google Cloud Storage.

!!! Important
This authentication method creates a JSON file inside the container with all the
information needed to access your Google Cloud Storage bucket. As a consequence, anyone
who gains access to the pod also has write permissions to the bucket.

## On-demand backups

To request a new backup, you need to create a new Backup resource
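
A minimal `Backup` manifest might look like the following sketch; the backup
name `backup-example` and the target cluster name `cluster-example` are
illustrative assumptions, not values from this document:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Backup
metadata:
  name: backup-example
spec:
  cluster:
    # Name of an existing Cluster resource with a configured barmanObjectStore
    name: cluster-example
```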
14 changes: 7 additions & 7 deletions advocacy_docs/kubernetes/cloud_native_postgresql/bootstrap.mdx
@@ -77,8 +77,8 @@ method or the `recovery` one. An external cluster needs to have:
cluster - that is, base backups and WAL archives.

!!! Note
A recovery object store is normally an AWS S3 or an Azure Blob Storage
compatible source that is managed by Barman Cloud.
A recovery object store is normally an AWS S3, or an Azure Blob Storage,
or a Google Cloud Storage source that is managed by Barman Cloud.

When only the streaming connection is defined, the source can be used for the
`pg_basebackup` method. When only the recovery object store is defined, the
@@ -678,7 +678,7 @@ file on the source PostgreSQL instance:
host replication streaming_replica all md5
```

The following manifest creates a new PostgreSQL 14.1 cluster,
The following manifest creates a new PostgreSQL 14.2 cluster,
called `target-db`, using the `pg_basebackup` bootstrap method
to clone an external PostgreSQL cluster defined as `source-db`
(in the `externalClusters` array). As you can see, the `source-db`
@@ -693,7 +693,7 @@ metadata:
name: target-db
spec:
instances: 3
imageName: quay.io/enterprisedb/postgresql:14.1
imageName: quay.io/enterprisedb/postgresql:14.2
bootstrap:
pg_basebackup:
@@ -713,7 +713,7 @@ spec:
```

All the requirements must be met for the clone operation to work, including
the same PostgreSQL version (in our case 14.1).
the same PostgreSQL version (in our case 14.2).

#### TLS certificate authentication

@@ -728,7 +728,7 @@ in the same Kubernetes cluster.
This example can be easily adapted to cover an instance that resides
outside the Kubernetes cluster.

The manifest defines a new PostgreSQL 14.1 cluster called `cluster-clone-tls`,
The manifest defines a new PostgreSQL 14.2 cluster called `cluster-clone-tls`,
which is bootstrapped using the `pg_basebackup` method from the `cluster-example`
external cluster. The host is identified by the read/write service
in the same cluster, while the `streaming_replica` user is authenticated
@@ -743,7 +743,7 @@ metadata:
name: cluster-clone-tls
spec:
instances: 3
imageName: quay.io/enterprisedb/postgresql:14.1
imageName: quay.io/enterprisedb/postgresql:14.2
bootstrap:
pg_basebackup:
39 changes: 37 additions & 2 deletions advocacy_docs/kubernetes/cloud_native_postgresql/cnp-plugin.mdx
@@ -77,7 +77,7 @@ Cluster in healthy state
Name: sandbox
Namespace: default
System ID: 7039966298120953877
PostgreSQL Image: quay.io/enterprisedb/postgresql:14.1
PostgreSQL Image: quay.io/enterprisedb/postgresql:14.2
Primary instance: sandbox-2
Instances: 3
Ready instances: 3
@@ -121,7 +121,7 @@ Cluster in healthy state
Name: sandbox
Namespace: default
System ID: 7039966298120953877
PostgreSQL Image: quay.io/enterprisedb/postgresql:14.1
PostgreSQL Image: quay.io/enterprisedb/postgresql:14.2
Primary instance: sandbox-2
Instances: 3
Ready instances: 3
@@ -277,4 +277,39 @@ The following command will reload all configurations for a given cluster:

```shell
kubectl cnp reload [cluster_name]
```

### Maintenance

The `kubectl cnp maintenance` command helps you modify one or more clusters across namespaces
by setting the maintenance window values. It changes the following fields:

- `.spec.nodeMaintenanceWindow.inProgress`
- `.spec.nodeMaintenanceWindow.reusePVC`

The command accepts `set` or `unset` as an argument: `set` sets `inProgress` to `true`,
while `unset` sets it to `false`.

By default, `reusePVC` is always set to `false` unless the `--reusePVC` flag is passed.

The plugin asks for confirmation, showing the list of clusters to modify and their new values;
if you accept, the change is applied to all the clusters in the list.

If you want to put all the PostgreSQL clusters in your Kubernetes cluster into
maintenance, you just need to run the following command:

```shell
kubectl cnp maintenance set --all-namespaces
```

You'll then see the list of all the clusters to update:

```shell
The following are the new values for the clusters
Namespace Cluster Name Maintenance reusePVC
--------- ------------ ----------- --------
default cluster-example true false
default pg-backup true false
test cluster-example true false
Do you want to proceed? [y/n]: y
```
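
Once the maintenance work is done, the same clusters can be taken out of the
maintenance window with the `unset` argument described above (a sketch
mirroring the previous command):

```shell
# Set inProgress back to false on all clusters, in all namespaces
kubectl cnp maintenance unset --all-namespaces
```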
@@ -104,6 +104,24 @@ authentication (see the ["Authentication" section](#authentication) below).
Containers run as the `pgbouncer` system user, and access to the `pgbouncer`
database is only allowed via local connections, through `peer` authentication.

### Certificates

By default, the PgBouncer pooler uses the same certificates that are used by the
cluster itself. However, if the user provides their own certificates, the pooler
accepts secrets of the following types:

1. Basic Auth
2. TLS
3. Opaque

In the Opaque case, the pooler looks for the following specific keys:

- `tls.crt`
- `tls.key`

In other words, an Opaque secret with these keys is treated like a TLS secret.
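
For example, a TLS secret holding these two keys can be created with the
standard `kubectl` command; the secret name and file paths below are
placeholders, not values from this document:

```shell
# Creates a Secret of type kubernetes.io/tls with the keys tls.crt and tls.key
kubectl create secret tls pooler-server-cert \
  --cert=server.crt \
  --key=server.key
```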

## Authentication

**Password based authentication** is the only supported method for clients of
@@ -121,6 +139,33 @@ Internally, our implementation relies on PgBouncer's `auth_user` and `auth_query
- removes all the above when it detects that a cluster does not have
any pooler associated with it

!!! Important
If you specify your own secrets, the operator will not automatically integrate the Pooler.

To manually integrate the Pooler when you have specified your own secrets, you must run the following queries from inside your cluster.

1. Create the role:

```sql
CREATE ROLE cnp_pooler_pgbouncer WITH LOGIN;
```

2. For each application database, grant the permission for `cnp_pooler_pgbouncer` to connect to it:

```sql
GRANT CONNECT ON DATABASE { database name here } TO cnp_pooler_pgbouncer;
```

3. Connect to each application database and create the authentication function inside it:

```sql
CREATE OR REPLACE FUNCTION user_search(uname TEXT)
  RETURNS TABLE (usename name, passwd text)
  AS 'SELECT usename, passwd FROM pg_shadow WHERE usename=$1;'
  LANGUAGE sql SECURITY DEFINER;
REVOKE ALL ON FUNCTION user_search(text) FROM public;
GRANT EXECUTE ON FUNCTION user_search(text) TO cnp_pooler_pgbouncer;
```
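
Assuming the function and grants were created as above, you can sanity-check the
setup from the same application database; `streaming_replica` is used here only
as an example of a role that exists in these clusters, and the returned password
hash depends on your installation:

```sql
-- Should return one row (usename, passwd) if the user exists
SELECT * FROM user_search('streaming_replica');
```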

## PodTemplates

You can take advantage of pod templates specification in the `template`
@@ -33,7 +33,7 @@ Native PostgreSQL overrides it with its instance manager.
in a **Primary with multiple/optional Hot Standby Servers Architecture**
only.

EnterpriseDB provides and supports public container images for Cloud Native
EDB provides and supports public container images for Cloud Native
PostgreSQL and publishes them on
[Quay.io](https://quay.io/repository/enterprisedb/postgresql).

@@ -27,7 +27,7 @@ PostgreSQL container images are available at

You can use Cloud Native PostgreSQL with EDB Postgres Advanced
too. You need to request a trial license key from the
[EnterpriseDB website](https://cloud-native.enterprisedb.com).
[EDB website](https://cloud-native.enterprisedb.com).

EDB Postgres Advanced container images are available at
[quay.io/enterprisedb/edb-postgres-advanced](https://quay.io/repository/enterprisedb/edb-postgres-advanced).
79 changes: 79 additions & 0 deletions advocacy_docs/kubernetes/cloud_native_postgresql/failover.mdx
@@ -0,0 +1,79 @@
---
title: 'Automated failover'
originalFilePath: 'src/failover.md'
product: 'Cloud Native Operator'
---

In the case of unexpected errors on the primary, the cluster will go into
**failover mode**. This may happen, for example, when:

- The primary pod has a disk failure
- The primary pod is deleted
- The `postgres` container on the primary has any kind of sustained failure

In the failover scenario, the primary cannot be assumed to be working properly.

After cases like the ones above, the readiness probe for the primary pod will start
failing. This will be picked up in the controller's reconciliation loop. The
controller will initiate the failover process, in two steps:

1. First, it will mark the `TargetPrimary` as `pending`. This change of state will
force the primary pod to shut down, ensuring that the WAL receivers on the replicas
stop. The cluster will be marked as being in failover phase ("Failing over").
2. Once all WAL receivers are stopped, there will be a leader election, and a
new primary will be named. The chosen instance will initiate promotion to
primary, and, after this is completed, the cluster will resume normal operations.
Meanwhile, the former primary pod will restart, detect that it is no longer
the primary, and become a replica node.

!!! Important
The two-phase procedure helps ensure the WAL receivers can stop in an orderly
fashion, and that the failing primary will not start streaming WALs again upon
restart. These safeguards prevent timeline discrepancies between the new primary
and the replicas.

During the time the failing primary is being shut down:

1. It will first try a PostgreSQL *fast shutdown* with
`.spec.switchoverDelay` seconds as the timeout. This graceful shutdown will attempt
to archive pending WALs.
2. If the fast shutdown fails, or its timeout is exceeded, a PostgreSQL
*immediate shutdown* is initiated.

!!! Info
"Fast" mode does not wait for PostgreSQL clients to disconnect and will
terminate an online backup in progress. All active transactions are rolled back
and clients are forcibly disconnected, then the server is shut down.
"Immediate" mode will abort all PostgreSQL server processes immediately,
without a clean shutdown.

## RTO and RPO impact

Failover may result in the service being impacted and/or data being lost:

1. During the time when the primary has started to fail, and before the controller
starts failover procedures, queries in transit, WAL writes, checkpoints, and
similar operations may fail.
2. Once the fast shutdown command has been issued, the cluster will no longer
accept connections, so service will be impacted but no data
will be lost.
3. If the fast shutdown fails, the immediate shutdown will stop any pending
processes, including WAL writing. Data may be lost.
4. During the time the primary is shutting down and a new primary hasn't yet
started, the cluster will operate without a primary and thus be impaired - but
with no data loss.

!!! Note
The timeout that controls fast shutdown is set by `.spec.switchoverDelay`,
as in the case of a switchover. Increasing the time for fast shutdown is safer
from an RPO point of view, but possibly delays the return to normal operation -
negatively affecting RTO.

!!! Warning
As already mentioned in the ["Instance Manager" section](instance_manager.md)
when explaining the switchover process, the `.spec.switchoverDelay` option
affects the RPO and RTO of your PostgreSQL database. Setting it to a low value
might favor RTO over RPO but lead to data loss at cluster level and/or backup
level. On the other hand, setting it to a high value might remove the risk of
data loss while leaving the cluster without an active primary for a longer time
during the switchover.
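
As a sketch, the delay discussed above is set directly on the `Cluster`
resource; the cluster name and the value below (15 minutes, expressed in
seconds) are illustrative assumptions only:

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  # Timeout, in seconds, for the fast shutdown attempted before
  # falling back to an immediate shutdown during failover/switchover
  switchoverDelay: 900
```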
@@ -9,7 +9,7 @@ PostgreSQL can face on a Kubernetes cluster during its lifetime.

!!! Important
In case the failure scenario you are experiencing is not covered by this
section, please immediately contact EnterpriseDB for support and assistance.
section, please immediately contact EDB for support and assistance.

!!! Seealso "Postgres instance manager"
Please refer to the ["Postgres instance manager" section](instance_manager.md)
@@ -125,9 +125,9 @@ pod will be created from a backup of the current primary. The pod
will be added again to the `-r` service and to the `-ro` service when ready.

If the failed pod is the primary, the operator will promote the active pod
with status ready and the lowest replication lag, then point the `-rw`service
with status ready and the lowest replication lag, then point the `-rw` service
to it. The failed pod will be removed from the `-r` service and from the
`-ro` service.
`-rw` service.
Other standbys will start replicating from the new primary. The former
primary will use `pg_rewind` to synchronize itself with the new one if its
PVC is available; otherwise, a new standby will be created from a backup of the
@@ -140,7 +140,7 @@ to solve the problem manually.

!!! Important
In such cases, please do not perform any manual operation without the
support and assistance of EnterpriseDB engineering team.
support and assistance of EDB engineering team.

From version 1.11.0 of the operator, you can use the
`k8s.enterprisedb.io/reconciliationLoop` annotation to temporarily disable the
