feat(cluster): Recovery using pg_basebackup #252

Merged
18 changes: 18 additions & 0 deletions charts/cluster/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -207,6 +207,24 @@ refer to the [CloudNativePG Documentation](https://cloudnative-pg.io/documentat
| recovery.google.gkeEnvironment | bool | `false` | |
| recovery.google.path | string | `"/"` | |
| recovery.method | string | `"backup"` | Available recovery methods: * `backup` - Recovers a CNPG cluster from a CNPG backup (PITR supported). Needs to be on the same cluster in the same namespace. * `object_store` - Recovers a CNPG cluster from a barman object store (PITR supported). * `pg_basebackup` - Recovers a CNPG cluster via the streaming replication protocol. Useful if you want to migrate databases to CloudNativePG, even from outside Kubernetes. # TODO |
| recovery.pgBaseBackup.database | string | `"app"` | |
| recovery.pgBaseBackup.owner | string | `""` | |
| recovery.pgBaseBackup.secret | string | `""` | |
| recovery.pgBaseBackup.source.database | string | `"app"` | |
| recovery.pgBaseBackup.source.host | string | `""` | |
| recovery.pgBaseBackup.source.passwordSecret.create | bool | `false` | Whether to create a secret for the password |
| recovery.pgBaseBackup.source.passwordSecret.key | string | `"password"` | The key in the secret containing the password |
| recovery.pgBaseBackup.source.passwordSecret.name | string | `""` | Name of the secret containing the password |
| recovery.pgBaseBackup.source.passwordSecret.value | string | `""` | The password value to use when creating the secret |
| recovery.pgBaseBackup.source.port | int | `5432` | |
| recovery.pgBaseBackup.source.sslCertSecret.key | string | `""` | |
| recovery.pgBaseBackup.source.sslCertSecret.name | string | `""` | |
| recovery.pgBaseBackup.source.sslKeySecret.key | string | `""` | |
| recovery.pgBaseBackup.source.sslKeySecret.name | string | `""` | |
| recovery.pgBaseBackup.source.sslMode | string | `"verify-full"` | |
| recovery.pgBaseBackup.source.sslRootCertSecret.key | string | `""` | |
| recovery.pgBaseBackup.source.sslRootCertSecret.name | string | `""` | |
| recovery.pgBaseBackup.source.username | string | `""` | |
| recovery.pitrTarget.time | string | `""` | Time in RFC3339 format |
| recovery.provider | string | `"s3"` | One of `s3`, `azure` or `google` |
| recovery.s3.accessKey | string | `""` | |
10 changes: 5 additions & 5 deletions charts/cluster/docs/Recovery.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,18 +10,18 @@ You can find more information about the recovery process in the [CNPG documentat
There are 3 types of recovery possible with CNPG:
* Recovery from a backup object in the same Kubernetes namespace.
* Recovery from a Barman Object Store, that could be located anywhere.
* Streaming replication from an operating cluster using `pg_basebackup` (not supported by the chart yet).
* Streaming replication from an operating cluster using `pg_basebackup`.

When performing a recovery you are strongly advised to use the same configuration and PostgreSQL version as the original cluster.

To begin, create a `values.yaml` that contains the following:

1. Set `mode: recovery` to indicate that you want to bootstrap the new cluster from an existing one.
2. Set the `recovery.method` to the type of recovery you want to perform.
3. Set either the `recovery.backupName` or the Barman Object Store configuration - i.e. `recovery.provider` and appropriate S3, Azure or GCS configuration.
4. Optionally set the `recovery.pitrTarget.time` in RFC3339 format to perform a point-in-time recovery.
4. Retain the identical PostgreSQL version and configuration as the original cluster.
5. Make sure you don't use the same backup section name as the original cluster. We advise you change the `path` within the storage location if you want to reuse the same storage location/bucket.
3. Set either the `recovery.backupName` or the Barman Object Store configuration, i.e. `recovery.provider` and the appropriate S3, Azure or GCS configuration. For `pg_basebackup`, complete the `recovery.pgBaseBackup` section instead.
4. Optionally set the `recovery.pitrTarget.time` in RFC3339 format to perform a point-in-time recovery (not applicable to `pg_basebackup`).
5. Retain the identical PostgreSQL version and configuration as the original cluster.
6. Make sure you don't use the same backup section name as the original cluster. We advise you change the `path` within the storage location if you want to reuse the same storage location/bucket.
One pattern is adding a version number at the end of the path, e.g. `/v1` or `/v2` after each recovery procedure.

Example recovery configurations can be found in the [examples](../examples) directory.
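
As a rough illustration of step 3 for `pg_basebackup`, a `values.yaml` could look like the sketch below. The key names follow the `recovery.pgBaseBackup.*` entries listed in the chart README; the host name and the secret names (`source-db.example.com`, `source-db-ca`, `source-db-replica-password`) and the `ca.crt` key are placeholders, not values taken from this PR.

```yaml
mode: "recovery"

recovery:
  method: "pg_basebackup"
  pgBaseBackup:
    source:
      host: "source-db.example.com"        # placeholder host of the source server
      port: 5432
      username: "streaming_replica"
      database: "app"
      sslMode: "verify-full"               # chart default; needs a root certificate to verify against
      sslRootCertSecret:
        name: "source-db-ca"               # placeholder: existing secret holding the CA certificate
        key: "ca.crt"                      # placeholder key name
      passwordSecret:
        name: "source-db-replica-password" # placeholder: existing secret holding the password
        key: "password"
```

If the source accepts client certificate authentication instead of a password, `sslKeySecret` and `sslCertSecret` can be set in place of `passwordSecret`.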
14 changes: 14 additions & 0 deletions charts/cluster/examples/recovery-pg_basebackup.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
mode: "recovery"

recovery:
  method: "pg_basebackup"
  pgBaseBackup:
    source:
      host: "source-db.foo.com"
      username: "streaming_replica"
      passwordSecret:
        name: "source-db-replica-password"

cluster:
  instances: 1

backups:
  enabled: false
47 changes: 32 additions & 15 deletions charts/cluster/templates/NOTES.txt
Original file line number Diff line number Diff line change
Expand Up @@ -41,22 +41,39 @@ Configuration
{{- range (rest .Values.backups.scheduledBackups) -}}
{{ $scheduledBackups = printf "%s, %s" $scheduledBackups .name }}
{{- end -}}
{{- if eq (len .Values.backups.scheduledBackups) 0 }}
{{- $scheduledBackups = "None" -}}
{{- end -}}

{{- $mode := .Values.mode -}}
{{- $source := "" -}}
{{- if eq .Values.mode "recovery" }}
{{- $mode = printf "%s (%s)" .Values.mode .Values.recovery.method -}}
{{- if eq .Values.recovery.method "pg_basebackup" }}
{{- $source = printf "postgresql://%s@%s:%.0f/%s" .Values.recovery.pgBaseBackup.source.username .Values.recovery.pgBaseBackup.source.host .Values.recovery.pgBaseBackup.source.port .Values.recovery.pgBaseBackup.source.database -}}
{{- end -}}
{{- end -}}

╭───────────────────┬────────────────────────────────────────────────────────╮
│ Configuration │ Value │
┝━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥
│ Cluster mode │ {{ (printf "%-54s" .Values.mode) }} │
│ Type │ {{ (printf "%-54s" .Values.type) }} │
│ Image │ {{ include "cluster.color-info" (printf "%-54s" (include "cluster.imageName" .)) }} │
│ Instances │ {{ include (printf "%s%s" "cluster.color-" $redundancyColor) (printf "%-54s" (toString .Values.cluster.instances)) }} │
│ Backups │ {{ include (printf "%s%s" "cluster.color-" (ternary "ok" "error" .Values.backups.enabled)) (printf "%-54s" (ternary "Enabled" "Disabled" .Values.backups.enabled)) }} │
│ Backup Provider │ {{ (printf "%-54s" (title .Values.backups.provider)) }} │
│ Scheduled Backups │ {{ (printf "%-54s" $scheduledBackups) }} │
│ Storage │ {{ (printf "%-54s" .Values.cluster.storage.size) }} │
│ Storage Class │ {{ (printf "%-54s" (default "Default" .Values.cluster.storage.storageClass)) }} │
│ PGBouncer │ {{ (printf "%-54s" (ternary "Enabled" "Disabled" .Values.pooler.enabled)) }} │
│ Monitoring │ {{ include (printf "%s%s" "cluster.color-" (ternary "ok" "error" .Values.cluster.monitoring.enabled)) (printf "%-54s" (ternary "Enabled" "Disabled" .Values.cluster.monitoring.enabled)) }} │
╰───────────────────┴────────────────────────────────────────────────────────╯
╭───────────────────┬──────────────────────────────────────────────────────────╮
│ Configuration │ Value │
┝━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥
│ Cluster mode │ {{ printf "%-56s" $mode }} │
│ Type │ {{ printf "%-56s" .Values.type }} │
│ Image │ {{ include "cluster.color-info" (printf "%-56s" (include "cluster.imageName" .)) }} │
{{- if eq .Values.mode "recovery" }}
│ Source │ {{ printf "%-56s" $source }} │
{{- end }}
│ Instances │ {{ include (printf "%s%s" "cluster.color-" $redundancyColor) (printf "%-56s" (toString .Values.cluster.instances)) }} │
│ Backups │ {{ include (printf "%s%s" "cluster.color-" (ternary "ok" "error" .Values.backups.enabled)) (printf "%-56s" (ternary "Enabled" "Disabled" .Values.backups.enabled)) }} │
{{- if .Values.backups.enabled }}
│ Backup Provider │ {{ printf "%-56s" (title .Values.backups.provider) }} │
│ Scheduled Backups │ {{ printf "%-56s" $scheduledBackups }} │
{{- end }}
│ Storage │ {{ printf "%-56s" .Values.cluster.storage.size }} │
│ Storage Class │ {{ printf "%-56s" (default "Default" .Values.cluster.storage.storageClass) }} │
│ PGBouncer │ {{ printf "%-56s" (ternary "Enabled" "Disabled" .Values.pooler.enabled) }} │
│ Monitoring │ {{ include (printf "%s%s" "cluster.color-" (ternary "ok" "error" .Values.cluster.monitoring.enabled)) (printf "%-56s" (ternary "Enabled" "Disabled" .Values.cluster.monitoring.enabled)) }} │
╰───────────────────┴──────────────────────────────────────────────────────────╯

{{ if not .Values.backups.enabled }}
{{- include "cluster.color-error" "Warning! Backups not enabled. Recovery will not be possible! Do not use this configuration in production.\n" }}
47 changes: 46 additions & 1 deletion charts/cluster/templates/_bootstrap.tpl
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,50 @@ bootstrap:
{{- end -}}
{{- else if eq .Values.mode "recovery" -}}
bootstrap:
{{- if eq .Values.recovery.method "pg_basebackup" }}
  pg_basebackup:
    source: pgBaseBackupSource
    {{ with .Values.recovery.pgBaseBackup.database }}
    database: {{ . }}
    {{- end }}
    {{ with .Values.recovery.pgBaseBackup.owner }}
    owner: {{ . }}
    {{- end }}
    {{ with .Values.recovery.pgBaseBackup.secret }}
    secret:
      {{- toYaml . | nindent 6 }}
    {{- end }}

externalClusters:
  - name: pgBaseBackupSource
    connectionParameters:
      host: {{ .Values.recovery.pgBaseBackup.source.host | quote }}
      port: {{ .Values.recovery.pgBaseBackup.source.port | quote }}
      user: {{ .Values.recovery.pgBaseBackup.source.username | quote }}
      dbname: {{ .Values.recovery.pgBaseBackup.source.database | quote }}
      sslmode: {{ .Values.recovery.pgBaseBackup.source.sslMode | quote }}
    {{- /* Also reference the password when the chart creates the secret under its default name. */}}
    {{- if or .Values.recovery.pgBaseBackup.source.passwordSecret.create .Values.recovery.pgBaseBackup.source.passwordSecret.name }}
    password:
      name: {{ default (printf "%s-pg-basebackup-password" (include "cluster.fullname" .)) .Values.recovery.pgBaseBackup.source.passwordSecret.name }}
      key: {{ .Values.recovery.pgBaseBackup.source.passwordSecret.key }}
    {{- end }}
    {{- if .Values.recovery.pgBaseBackup.source.sslKeySecret.name }}
    sslKey:
      name: {{ .Values.recovery.pgBaseBackup.source.sslKeySecret.name }}
      key: {{ .Values.recovery.pgBaseBackup.source.sslKeySecret.key }}
    {{- end }}
    {{- if .Values.recovery.pgBaseBackup.source.sslCertSecret.name }}
    sslCert:
      name: {{ .Values.recovery.pgBaseBackup.source.sslCertSecret.name }}
      key: {{ .Values.recovery.pgBaseBackup.source.sslCertSecret.key }}
    {{- end }}
    {{- if .Values.recovery.pgBaseBackup.source.sslRootCertSecret.name }}
    sslRootCert:
      name: {{ .Values.recovery.pgBaseBackup.source.sslRootCertSecret.name }}
      key: {{ .Values.recovery.pgBaseBackup.source.sslRootCertSecret.key }}
    {{- end }}

{{- else }}
recovery:
{{- with .Values.recovery.pitrTarget.time }}
recoveryTarget:
Expand All @@ -38,9 +82,10 @@ bootstrap:
externalClusters:
- name: objectStoreRecoveryCluster
barmanObjectStore:
serverName: {{ default (include "cluster.fullname" .) .Values.recovery.clusterName }}
serverName: {{ .Values.recovery.clusterName }}
{{- $d := dict "chartFullname" (include "cluster.fullname" .) "scope" .Values.recovery "secretPrefix" "recovery" -}}
{{- include "cluster.barmanObjectStoreConfig" $d | nindent 4 }}
{{- end }}
{{- else }}
{{ fail "Invalid cluster mode!" }}
{{- end }}
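
To make the effect of the `pg_basebackup` branch easier to picture, the fragment below sketches what it would be expected to render into the `Cluster` spec for the values used in this PR's chainsaw test (`02-pg_basebackup-cluster.yaml`). It is hand-derived from the template rather than captured from `helm template`, so exact whitespace and the surrounding fields will differ; `database: app` comes from the chart default for `recovery.pgBaseBackup.database`.

```yaml
# Hand-derived sketch of the rendered fragment, not actual helm output.
bootstrap:
  pg_basebackup:
    source: pgBaseBackupSource
    database: app                 # chart default; owner and secret are empty, so they are omitted

externalClusters:
  - name: pgBaseBackupSource
    connectionParameters:
      host: "source-cluster-rw"
      port: "5432"                # quoted because connection parameters are strings
      user: "streaming_replica"
      dbname: "mygooddb"
      sslmode: "require"
    sslKey:
      name: source-cluster-replication
      key: tls.key
    sslCert:
      name: source-cluster-replication
      key: tls.crt
```

No `password` or `sslRootCert` block appears here because the test values leave the corresponding secret names empty.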
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{{- if and (eq .Values.mode "recovery") (eq .Values.recovery.method "pg_basebackup") .Values.recovery.pgBaseBackup.source.passwordSecret.create }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ default (printf "%s-pg-basebackup-password" (include "cluster.fullname" .)) .Values.recovery.pgBaseBackup.source.passwordSecret.name }}
data:
  {{ .Values.recovery.pgBaseBackup.source.passwordSecret.key }}: {{ required ".Values.recovery.pgBaseBackup.source.passwordSecret.value required when creating a password secret." .Values.recovery.pgBaseBackup.source.passwordSecret.value | b64enc | quote }}
{{- end }}
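
A minimal sketch of values that would exercise this Secret template, assuming the key layout from the chart README. The host, secret name, and password value are placeholders; the name is set explicitly so the cluster's `externalClusters` entry references the same secret that is created here.

```yaml
mode: "recovery"

recovery:
  method: "pg_basebackup"
  pgBaseBackup:
    source:
      host: "source-db.example.com"        # placeholder host
      username: "streaming_replica"
      passwordSecret:
        create: true                       # ask the chart to create the Secret
        name: "source-db-replica-password" # placeholder name, set explicitly
        key: "password"
        value: "change-me"                 # placeholder; avoid committing real passwords
```

With these values the template should produce a `Secret` named `source-db-replica-password` whose `password` key holds the base64-encoded value. Leaving `value` empty while `create` is `true` makes rendering fail through the `required` call.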
5 changes: 2 additions & 3 deletions charts/cluster/test/monitoring/chainsaw-test.yaml
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
##
# This is a test that verifies that non-default configuration options are correctly propagated to the CNPG cluster.
# P.S. This test is not designed to have a good running configuration, it is designed to test the configuration propagation!
# This is a test that checks if PodMonitors, ConfigMaps and PrometheusRules are correctly provisioned when requested.
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
Expand All @@ -11,7 +10,7 @@ spec:
assert: 20s
cleanup: 30s
steps:
- name: Install the non-default configuration cluster
- name: Install the monitoring cluster
try:
- script:
content: |
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: source-cluster
status:
  readyInstances: 1
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
mode: "standalone"
cluster:
  instances: 1
backups:
  enabled: false
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: data-write
status:
  succeeded: 1
30 changes: 30 additions & 0 deletions charts/cluster/test/postgresql-pg_basebackup/01-data_write.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: data-write
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: data-write
          env:
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: source-cluster-superuser
                  key: username
            - name: DB_PASS
              valueFrom:
                secretKeyRef:
                  name: source-cluster-superuser
                  key: password
            - name: DB_URI
              value: postgres://$(DB_USER):$(DB_PASS)@source-cluster-rw:5432
          image: alpine:3.19
          command: ['sh', '-c']
          args:
            - |
              apk --no-cache add postgresql-client kubectl
              psql "$DB_URI" -c "CREATE DATABASE mygooddb;"
              psql "$DB_URI/mygooddb" -c "CREATE TABLE mygoodtable (id serial PRIMARY KEY);"
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-basebackup-cluster
status:
  readyInstances: 2
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
mode: "recovery"
recovery:
  method: "pg_basebackup"
  pgBaseBackup:
    source:
      host: "source-cluster-rw"
      database: "mygooddb"
      username: "streaming_replica"
      sslMode: "require"
      sslKeySecret:
        name: source-cluster-replication
        key: tls.key
      sslCertSecret:
        name: source-cluster-replication
        key: tls.crt

cluster:
  instances: 2

backups:
  enabled: false
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: data-test
status:
  succeeded: 1
23 changes: 23 additions & 0 deletions charts/cluster/test/postgresql-pg_basebackup/03-data_test.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: data-test
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: data-test
          env:
            - name: DB_URI
              valueFrom:
                secretKeyRef:
                  name: pg-basebackup-cluster-superuser
                  key: uri
          image: alpine:3.19
          command: ['sh', '-c']
          args:
            - |
              apk --no-cache add postgresql-client
              DB_URI=$(echo $DB_URI | sed "s|/\*|/|" )
              test "$(psql "${DB_URI}mygooddb" -t -c 'SELECT EXISTS (SELECT FROM information_schema.tables WHERE table_name = $$mygoodtable$$)' --csv -q 2>/dev/null)" = "t"
64 changes: 64 additions & 0 deletions charts/cluster/test/postgresql-pg_basebackup/chainsaw-test.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
##
# This is a test that provisions a source PostgreSQL cluster, writes data to it, and then bootstraps a second cluster from it via pg_basebackup.
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: postgresql-pg-basebackup
spec:
  timeouts:
    apply: 1s
    assert: 2m
    cleanup: 1m
  steps:
    - name: Install the external PostgreSQL cluster
      try:
        - script:
            content: |
              helm upgrade \
                --install \
                --namespace $NAMESPACE \
                --values ./00-source-cluster.yaml \
                --wait \
                source ../../
        - assert:
            file: ./00-source-cluster-assert.yaml
        - apply:
            file: ./01-data_write.yaml
        - assert:
            file: ./01-data_write-assert.yaml
    - name: Install the pg_basebackup cluster
      timeouts:
        assert: 5m
      try:
        - script:
            content: |
              helm upgrade \
                --install \
                --namespace $NAMESPACE \
                --values ./02-pg_basebackup-cluster.yaml \
                --wait \
                pg-basebackup ../../
        - assert:
            file: ./02-pg_basebackup-cluster-assert.yaml
      catch:
        - describe:
            apiVersion: postgresql.cnpg.io/v1
            kind: Cluster
    - name: Verify the data from step 1 exists
      try:
        - apply:
            file: ./03-data_test.yaml
        - assert:
            file: ./03-data_test-assert.yaml
      catch:
        - describe:
            apiVersion: batch/v1
            kind: Job
        - podLogs:
            selector: batch.kubernetes.io/job-name=data-test
    - name: Cleanup
      try:
        - script:
            content: |
              helm uninstall --namespace $NAMESPACE source
              helm uninstall --namespace $NAMESPACE pg-basebackup