Merge pull request #1457 from EnterpriseDB/release/2021-06-11
Release: 2021-06-11 (corrected)
Former-commit-id: 85e5290
josh-heyer authored Jun 11, 2021
2 parents 1a2d985 + 1a2f5fb commit 31f3a88
Showing 38 changed files with 843 additions and 136 deletions.
270 changes: 189 additions & 81 deletions advocacy_docs/kubernetes/cloud_native_postgresql/api_reference.mdx

@@ -124,3 +124,4 @@ The `-app` credentials are the ones that should be used by applications
connecting to the PostgreSQL cluster.

The `-superuser` ones are supposed to be used only for administrative purposes.

268 changes: 265 additions & 3 deletions advocacy_docs/kubernetes/cloud_native_postgresql/bootstrap.mdx
@@ -4,6 +4,11 @@ originalFilePath: 'src/bootstrap.md'
product: 'Cloud Native Operator'
---

!!! Note
    When referring to "PostgreSQL cluster" in this section, the same
    concepts apply to both PostgreSQL and EDB Postgres Advanced, unless
    stated otherwise.

This section describes the options you have to create a new
PostgreSQL cluster and the design rationale behind them.

Expand Down Expand Up @@ -34,9 +39,13 @@ The `initdb` bootstrap method is used.

We currently support the following bootstrap methods:

- `initdb`: initialise an empty PostgreSQL cluster
- `recovery`: create a PostgreSQL cluster restoring from an existing backup
and replaying all the available WAL files.
- `initdb`: initialize an empty PostgreSQL cluster
- `recovery`: create a PostgreSQL cluster by restoring from an existing backup
and replaying all the available WAL files or up to a given point in time
- `pg_basebackup`: create a PostgreSQL cluster by cloning an existing one of the
same major version using `pg_basebackup` via streaming replication protocol -
useful if you want to migrate databases to Cloud Native PostgreSQL, even
from outside Kubernetes.

## initdb

@@ -306,3 +315,256 @@ spec:
targetName: "maintenance-activity"
exclusive: false
```

## pg_basebackup

The `pg_basebackup` bootstrap mode lets you create a new cluster (*target*) as
an exact physical copy of an existing and **binary compatible** PostgreSQL
instance (*source*), through a valid *streaming replication* connection.
The source instance can be either a primary or a standby PostgreSQL server.

The primary use case for this method is **migration** to Cloud Native PostgreSQL,
either from outside Kubernetes or within Kubernetes (for example, from another operator).

!!! Warning
    The current implementation creates a *snapshot* of the source PostgreSQL
    instance when the cloning process completes and immediately starts
    the created cluster. See ["Current limitations"](#current-limitations) below for details.

Similar to the case of the `recovery` bootstrap method, once the clone operation
completes, the operator will take ownership of the target cluster, starting from
the first instance. This includes overriding some configuration parameters, as
required by Cloud Native PostgreSQL, resetting the superuser password, creating
the `streaming_replica` user, managing the replicas, and so on. The resulting
cluster will be completely independent of the source instance.

!!! Important
    Configuring the network between the target instance and the source instance
    goes beyond the scope of Cloud Native PostgreSQL documentation, as it depends
    on the actual context and environment.

The streaming replication client on the target instance, which will be
transparently managed by `pg_basebackup`, can authenticate itself on the source
instance in any of the following ways:

1. via [username/password](#usernamepassword-authentication)
2. via [TLS client certificate](#tls-certificate-authentication)

The latter is recommended if you connect to a source managed
by Cloud Native PostgreSQL or configured for TLS authentication.
The first option is, however, the most common form of authentication to a
PostgreSQL server in general, and might be the easiest if the source
instance runs in a traditional environment outside Kubernetes.
Both cases are explained below.

### Requirements

The following requirements apply to the `pg_basebackup` bootstrap method:

- target and source must have the same hardware architecture
- target and source must have the same major PostgreSQL version
- source must not have any tablespace defined (see ["Current limitations"](#current-limitations) below)
- source must be configured with a `max_wal_senders` value high enough to grant
  access from the target for this one-off operation, providing at least
  one *walsender* for the backup plus one for WAL streaming
- the network between source and target must be configured to enable the target
instance to connect to the PostgreSQL port on the source instance
- source must have a role with `REPLICATION LOGIN` privileges and must accept
connections from the target instance for this role in `pg_hba.conf`, preferably
via TLS (see ["About the replication user"](#about-the-replication-user) below)
- target must be able to successfully connect to the source PostgreSQL instance
using a role with `REPLICATION LOGIN` privileges

!!! Seealso
    For further information, please refer to the
    ["Planning" section for Warm Standby](https://www.postgresql.org/docs/current/warm-standby.html#STANDBY-PLANNING),
    the
    [`pg_basebackup` page](https://www.postgresql.org/docs/current/app-pgbasebackup.html)
    and the
    ["High Availability, Load Balancing, and Replication" chapter](https://www.postgresql.org/docs/current/high-availability.html)
    in the PostgreSQL documentation.

### About the replication user

As explained in the requirements section, you need to have a user
with either the `SUPERUSER` or, preferably, just the `REPLICATION`
privilege in the source instance.

If the source database is created with Cloud Native PostgreSQL, you
can reuse the `streaming_replica` user and take advantage of TLS client
certificate authentication (which, by default, is the only allowed
connection method for `streaming_replica`).

For all other cases, including outside Kubernetes, please verify that
you already have a user with the `REPLICATION` privilege, or create
a new one by following the instructions below.

As the `postgres` user on the source system, run:

```console
createuser -P --replication streaming_replica
```

Enter the password at the prompt and save it for later, as you
will need to add it to a secret in the target instance.

!!! Note
    Although the name is not important, we will use `streaming_replica`
    for the sake of simplicity. Feel free to change it as you like,
    provided you adapt the instructions in the following sections.

### Username/Password authentication

The first authentication method supported by Cloud Native PostgreSQL
with the `pg_basebackup` bootstrap is based on username and password matching.

Make sure you have the following information before you start the procedure:

- location of the source instance, identified by a hostname or an IP address
and a TCP port
- replication username (`streaming_replica` for simplicity)
- password

You might need to add a line similar to the following to the `pg_hba.conf`
file on the source PostgreSQL instance:

```
# A more restrictive rule for TLS and IP of origin is recommended
host replication streaming_replica all md5
```

The following manifest creates a new PostgreSQL 13.3 cluster,
called `target-db`, using the `pg_basebackup` bootstrap method
to clone an external PostgreSQL cluster defined as `source-db`
(in the `externalClusters` array). As you can see, the `source-db`
definition points to the `source-db.foo.com` host and connects as
the `streaming_replica` user, whose password is stored in the
`password` key of the `source-db-replica-user` secret.

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: target-db
spec:
  instances: 3
  imageName: quay.io/enterprisedb/postgresql:13.3
  bootstrap:
    pg_basebackup:
      source: source-db
  storage:
    size: 1Gi
  externalClusters:
  - name: source-db
    connectionParameters:
      host: source-db.foo.com
      user: streaming_replica
    password:
      name: source-db-replica-user
      key: password
```

All the requirements must be met for the clone operation to work, including
the same PostgreSQL version (in our case 13.3).
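
The manifest above expects a secret named `source-db-replica-user` with a
`password` key in the same namespace as the target cluster. As a minimal
sketch, such a secret could be declared as follows. The `Opaque` type and
the placeholder value are assumptions made for illustration, not
requirements stated by this documentation:

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Must match the `password.name` reference in the Cluster manifest
  name: source-db-replica-user
type: Opaque
stringData:
  # Must match the `password.key` reference in the Cluster manifest.
  # Placeholder: use the password chosen for the replication user.
  password: "REPLACE_WITH_REPLICATION_PASSWORD"
```

With `stringData`, Kubernetes encodes the value for you, so the password
can be written in plain text in the manifest; avoid committing such a file
to version control.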

### TLS certificate authentication

The second authentication method supported by Cloud Native PostgreSQL
with the `pg_basebackup` bootstrap is based on TLS client certificates.
This is the recommended approach from a security standpoint.

The following example clones an existing PostgreSQL cluster (`cluster-example`)
in the same Kubernetes cluster.

!!! Note
    This example can be easily adapted to cover an instance that resides
    outside the Kubernetes cluster.

The manifest defines a new PostgreSQL 13.3 cluster called `cluster-clone-tls`,
which is bootstrapped using the `pg_basebackup` method from the `cluster-example`
external cluster. The host is identified by the read/write service
in the same cluster, while the `streaming_replica` user is authenticated
using the provided key, certificate, and certificate authority
information (the key and certificate in the `cluster-example-replication`
secret, and the certificate authority in the `cluster-example-ca` secret).

```yaml
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: cluster-clone-tls
spec:
  instances: 3
  imageName: quay.io/enterprisedb/postgresql:13.3
  bootstrap:
    pg_basebackup:
      source: cluster-example
  storage:
    size: 1Gi
  externalClusters:
  - name: cluster-example
    connectionParameters:
      host: cluster-example-rw.default.svc
      user: streaming_replica
      sslmode: verify-full
    sslKey:
      name: cluster-example-replication
      key: tls.key
    sslCert:
      name: cluster-example-replication
      key: tls.crt
    sslRootCert:
      name: cluster-example-ca
      key: ca.crt
```
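
When the source is not managed by Cloud Native PostgreSQL, the secrets
referenced by `sslKey`, `sslCert`, and `sslRootCert` are not created for
you. The following is only a sketch of what equivalent secrets might look
like: the names `source-db-replication` and `source-db-ca` are hypothetical,
and the values are placeholders that must be replaced with base64-encoded
PEM data issued for the replication user by the source's certificate
authority.

```yaml
# Hypothetical secret holding the client certificate and private key
apiVersion: v1
kind: Secret
metadata:
  name: source-db-replication
type: kubernetes.io/tls
data:
  tls.crt: REPLACE_WITH_BASE64_CLIENT_CERTIFICATE
  tls.key: REPLACE_WITH_BASE64_CLIENT_PRIVATE_KEY
---
# Hypothetical secret holding the CA certificate that signed the
# source server's certificate
apiVersion: v1
kind: Secret
metadata:
  name: source-db-ca
type: Opaque
data:
  ca.crt: REPLACE_WITH_BASE64_CA_CERTIFICATE
```

The `sslKey`, `sslCert`, and `sslRootCert` entries of the external cluster
definition would then reference these names with the same `tls.key`,
`tls.crt`, and `ca.crt` keys.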

### Current limitations

#### Missing tablespace support

Cloud Native PostgreSQL does not currently include full declarative management
of PostgreSQL global objects, namely roles, databases, and tablespaces.
While roles and databases are copied from the source instance to the target
cluster, tablespaces require a capability that this version of
Cloud Native PostgreSQL is missing: definition and management of additional
persistent volumes. When dealing with base backups and tablespaces, PostgreSQL
itself requires that the exact mount points in the source instance
also exist in the target instance, which in our case means the pods in Kubernetes
that Cloud Native PostgreSQL manages. For this reason, you cannot directly
migrate to Cloud Native PostgreSQL a PostgreSQL instance that takes advantage
of tablespaces (you first need to remove them from the source or, if your
organization requires this feature, contact EDB to prioritize it).

#### Snapshot copy

The `pg_basebackup` method takes a snapshot of the source instance in the form of
a PostgreSQL base backup. All transactions written between the start of
the backup and its successful completion will be streamed to the target
instance using a second connection (see the `--wal-method=stream` option for
`pg_basebackup`).

Once the backup is completed, the new instance will be started on a new timeline
and diverge from the source.
For this reason, it is advised to stop all write operations to the source database
before migrating to the target database in Kubernetes.

!!! Important
    Before you attempt a migration, you must test both the procedure
    and the applications. In particular, it is fundamental that you run the migration
    procedure as many times as needed to systematically measure the downtime of your
    applications in production. Feel free to contact EDB for assistance.

Future versions of Cloud Native PostgreSQL will enable users to control
PostgreSQL's continuous recovery mechanism via Write-Ahead Log (WAL) shipping
by creating a new cluster that is a replica of another PostgreSQL instance.
This will open up two main use cases:

- replication over different Kubernetes clusters in Cloud Native PostgreSQL
- *0 cutover time* migrations to Cloud Native PostgreSQL with the `pg_basebackup`
bootstrap method