Add replication settings
dragoangel committed Aug 28, 2024
1 parent 573c2cf commit 2145416
Showing 23 changed files with 758 additions and 332 deletions.
97 changes: 56 additions & 41 deletions charts/cluster/README.md

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions charts/cluster/docs/Recovery.md
@@ -10,18 +10,18 @@ You can find more information about the recovery process in the [CNPG documentat
There are 3 types of recovery possible with CNPG:
* Recovery from a backup object in the same Kubernetes namespace.
* Recovery from a Barman Object Store, that could be located anywhere.
* Streaming replication from an operating cluster using `pg_basebackup` (not supported by the chart yet).
* Streaming replication from an operating cluster using `pgBasebackup`.

When performing a recovery you are strongly advised to use the same configuration and PostgreSQL version as the original cluster.

To begin, create a `values.yaml` that contains the following:

1. Set `mode: recovery` to indicate that you want to bootstrap the new cluster from an existing one.
2. Set the `recovery.method` to the type of recovery you want to perform.
3. Set either the `recovery.backupName` or the Barman Object Store configuration - i.e. `recovery.provider` and appropriate S3, Azure or GCS configuration.
4. Optionally set the `recovery.pitrTarget.time` in RFC3339 format to perform a point-in-time recovery.
4. Retain the identical PostgreSQL version and configuration as the original cluster.
5. Make sure you don't use the same backup section name as the original cluster. We advise you change the `path` within the storage location if you want to reuse the same storage location/bucket.
3. Configure `recovery.methodSettings` for the selected `recovery.method`.
4. Optionally set the `recovery.pitrTarget.time` in RFC3339 format to perform a point-in-time recovery (if supported by the chosen `recovery.method`).
5. Use the same major PostgreSQL version as the original cluster, with the same or a newer minor version.
6. Make sure you don't use the same backup section name as the original cluster. We advise you change the `path` within the storage location if you want to reuse the same storage location/bucket.
One pattern is adding a version number at the end of the path, e.g. `/v1` or `/v2` after each recovery procedure.

Example recovery configurations can be found in the [examples](../examples) directory.
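
The numbered steps above can be sketched as a single `values.yaml`. This is a minimal illustration only: the key layout mirrors `examples/recovery-backup.yaml` from this commit, while the backup name and the timestamp are placeholder values, not taken from a real cluster.

```yaml
# Sketch only: layout follows examples/recovery-backup.yaml in this commit.
mode: recovery

recovery:
  method: backup
  methodSettings:
    backup:
      # Placeholder: name of an existing Backup object in the same namespace.
      backupName: my-cluster-daily-backup
  pitrTarget:
    # Optional; placeholder RFC 3339 timestamp for point-in-time recovery.
    time: "2024-08-28T12:00:00Z"

cluster:
  instances: 1
```

Leaving out `recovery.pitrTarget.time` should restore to the end of the available backup data rather than to a specific point in time.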
4 changes: 3 additions & 1 deletion charts/cluster/examples/recovery-backup.yaml
@@ -2,7 +2,9 @@ mode: recovery

recovery:
method: backup
backupName: "database-clustermarket-database-daily-backup-1683244800"
methodSettings:
backup:
backupName: database-clustermarket-database-daily-backup-1683244800

cluster:
instances: 1
2 changes: 1 addition & 1 deletion charts/cluster/examples/recovery-object_store.yaml
@@ -1,7 +1,7 @@
mode: recovery

recovery:
method: object_store
method: objectStorage
clusterName: "cluster-name-to-recover-from"
provider: s3
s3:
15 changes: 15 additions & 0 deletions charts/cluster/examples/recovery-pg_basebackup-password.yaml
@@ -0,0 +1,15 @@
mode: recovery

recovery:
method: pgBasebackup
methodSettings:
pgBasebackup:
host: source-db.foo.com
user: streaming_replica
auth: password
authDetails:
password: |-
replication-password
cluster:
instances: 1
23 changes: 23 additions & 0 deletions charts/cluster/examples/recovery-pg_basebackup-tls.yaml
@@ -0,0 +1,23 @@
mode: recovery

recovery:
method: pgBasebackup
methodSettings:
pgBasebackup:
host: source-db.foo.com
user: streaming_replica
auth: tls
authDetails:
tls:
key: |-
-----BEGIN PRIVATE KEY-----
-----END PRIVATE KEY-----
crt: |-
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----
ca: |-
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----
cluster:
instances: 1
11 changes: 11 additions & 0 deletions charts/cluster/examples/recovery-volume_snapshot.yaml
@@ -0,0 +1,11 @@
mode: recovery

recovery:
  method: volumeSnapshot
  methodSettings:
    volumeSnapshot:
      storageSnapshotName: database-clustermarket-database-snapshot
      walSnapshotName: wal-clustermarket-database-snapshot

cluster:
  instances: 1
11 changes: 9 additions & 2 deletions charts/cluster/templates/_backup.tpl
@@ -14,13 +14,20 @@ backup:
encryption: {{ .Values.backups.objectStorage.data.encryption }}
jobs: {{ .Values.backups.objectStorage.data.jobs }}

{{- $d := dict "chartFullname" (include "cluster.fullname" .) "scope" .Values.backups.objectStorage "secretPrefix" "backup" }}
{{- $d := dict "chartFullname" (include "cluster.fullname" .) "scope" .Values.backups.objectStorage "secretPrefix" "backup" "existingSecret" .Values.backups.existingSecret }}
{{- include "cluster.barmanObjectStoreConfig" $d | nindent 2 }}
{{- end }}
{{- if (not (empty .Values.backups.volumeSnapshot.className )) }}
{{- with .Values.backups.volumeSnapshot }}
volumeSnapshot:
{{- toYaml . | nindent 4 }}
className: {{ .className }}
{{- if (not (empty .walClassName)) }}
walClassName: {{ .walClassName }}
{{- end }}
online: {{ .online }}
onlineConfiguration:
immediateCheckpoint: {{ .onlineConfiguration.immediateCheckpoint }}
waitForArchive: {{ .onlineConfiguration.waitForArchive }}
{{- end }}
{{- end }}
{{- end }}
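
As a rough illustration, the `volumeSnapshot` block above reads the following keys from `.Values.backups.volumeSnapshot`. This sketch is inferred from the template alone and does not reproduce the chart's shipped defaults; both class names are placeholders.

```yaml
# Sketch only: keys are the ones the template above reads.
backups:
  volumeSnapshot:
    # Placeholder VolumeSnapshotClass name; the block is rendered only when this is non-empty.
    className: csi-snapclass
    # Optional placeholder; left out of the rendered manifest when empty.
    walClassName: csi-snapclass-wal
    online: true
    onlineConfiguration:
      immediateCheckpoint: false
      waitForArchive: true
```

Listing the fields explicitly in the template, instead of passing the whole map through `toYaml`, means that only these keys reach the rendered `Cluster` manifest.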
6 changes: 3 additions & 3 deletions charts/cluster/templates/_barman_object_store.tpl
@@ -21,7 +21,7 @@
{{- if empty .scope.destinationPath }}
destinationPath: "s3://{{ required "You need to specify S3 bucket if destinationPath is not specified." .scope.providerSettings.s3.bucket }}{{ .scope.providerSettings.s3.path }}"
{{- end }}
{{- $secretName := coalesce .scope.secret.name (printf "%s-%s-s3-creds" .chartFullname .secretPrefix) }}
{{- $secretName := coalesce .existingSecret.name (printf "%s-%s-s3-creds" .chartFullname .secretPrefix) }}
s3Credentials:
accessKeyId:
name: {{ $secretName }}
@@ -33,7 +33,7 @@
{{- if empty .scope.destinationPath }}
destinationPath: "https://{{ required "You need to specify Azure storageAccount if destinationPath is not specified." .scope.providerSettings.azure.storageAccount }}.{{ .scope.providerSettings.azure.serviceName }}.core.windows.net/{{ .scope.providerSettings.azure.containerName }}{{ .scope.providerSettings.azure.path }}"
{{- end }}
{{- $secretName := coalesce .scope.secret.name (printf "%s-%s-azure-creds" .chartFullname .secretPrefix) }}
{{- $secretName := coalesce .existingSecret.name (printf "%s-%s-azure-creds" .chartFullname .secretPrefix) }}
azureCredentials:
{{- if .scope.providerSettings.azure.inheritFromAzureAD }}
inheritFromAzureAD: true
@@ -59,7 +59,7 @@
{{- if empty .scope.destinationPath }}
destinationPath: "gs://{{ required "You need to specify Google storage bucket if destinationPath is not specified." .scope.providerSettings.google.bucket }}{{ .scope.providerSettings.google.path }}"
{{- end }}
{{- $secretName := coalesce .scope.secret.name (printf "%s-%s-google-creds" .chartFullname .secretPrefix) }}
{{- $secretName := coalesce .existingSecret.name (printf "%s-%s-google-creds" .chartFullname .secretPrefix) }}
googleCredentials:
gkeEnvironment: {{ .scope.providerSettings.google.gkeEnvironment }}
{{- if not .scope.providerSettings.google.gkeEnvironment }}
91 changes: 83 additions & 8 deletions charts/cluster/templates/_bootstrap.tpl
@@ -21,19 +21,38 @@ bootstrap:
{{- printf "- %s" . | nindent 6 }}
{{- end -}}
{{- end -}}
{{- else if eq .Values.mode "recovery" -}}
{{- else if or (eq .Values.mode "recovery") (eq .Values.mode "replica") -}}
{{- if eq .Values.mode "replica" -}}
replica:
{{- if eq .Values.replica.topology "standalone" -}}
enabled: true
{{- if not (empty .Values.replica.topologySettings.minApplyDelay) }}
minApplyDelay: {{ .Values.replica.topologySettings.minApplyDelay }}
{{- end }}
{{- else if eq .Values.replica.topology "distributed" -}}
{{- if .Values.replica.topologySettings.distributed.primary -}}
primary: {{ include "cluster.fullname" . }}
{{- else }}
primary: {{ include "cluster.replicaSource" . }}
{{- end }}
{{- end }}
source: {{ include "cluster.replicaSource" . }}
{{- end }}
bootstrap:
recovery:
{{- if and (has .Values.recovery.method (list "backup" "objectStorage" "volumeSnapshot")) (not (eq .Values.mode "replica")) }}
{{- with .Values.recovery.pitrTarget.time }}
recoveryTarget:
targetTime: {{ . }}
{{- end }}
{{- end }}
{{- if eq .Values.recovery.method "backup" }}
backup:
name: {{ .Values.recovery.backupName }}
{{- else if eq .Values.recovery.method "object_store" }}
name: {{ .Values.recovery.backup.name }}
{{- else if eq .Values.recovery.method "objectStorage" }}
source: objectStoreRecoveryCluster
{{- else if eq .Values.recovery.method "volume_snapshot" }}
{{- else if eq .Values.recovery.method "volumeSnapshot" }}
source: volumeSnapshotRecoveryCluster
volumeSnapshots:
storage:
apiGroup: snapshot.storage.k8s.io
@@ -43,18 +62,74 @@ bootstrap:
apiGroup: snapshot.storage.k8s.io
kind: VolumeSnapshot
name: {{ .Values.recovery.volumeSnapshot.walSnapshotName }}
{{- else if eq .Values.recovery.method "pgBasebackup" }}
pg_basebackup:
source: pgBasebackupRecoveryCluster
{{- if or (not (empty .Values.recovery.methodSettings.pgBasebackup.database)) (not (eq .Values.mode "replica")) }}
{{ with .Values.recovery.methodSettings.pgBasebackup.database }}
database: {{ . }}
{{- end }}
{{ with .Values.recovery.methodSettings.pgBasebackup.owner }}
owner: {{ . }}
{{- end }}
{{ with .Values.recovery.methodSettings.pgBasebackup.ownerSecret }}
secret:
name: {{ . }}
{{- end }}
{{- end }}
{{- end }}

{{- if eq .Values.recovery.method "object_store" }}
{{- if eq .Values.recovery.method "objectStorage" }}
externalClusters:
{{- if eq .Values.replica.topology "distributed" }}
- name: {{ include "cluster.fullname" . }}
barmanObjectStore:
serverName: {{ include "cluster.fullname" . }}
{{- $d := dict "chartFullname" (include "cluster.fullname" .) "scope" .Values.backups.objectStorage "secretPrefix" "backup" "existingSecret" .Values.backups.existingSecret -}}
{{- include "cluster.barmanObjectStoreConfig" $d | nindent 4 }}
{{- end }}
- name: objectStoreRecoveryCluster
barmanObjectStore:
serverName: {{ default (include "cluster.fullname" .) .Values.recovery.clusterName }}
{{- $d := dict "chartFullname" (include "cluster.fullname" .) "scope" .Values.recovery "secretPrefix" "recovery" -}}
{{- $d := dict "chartFullname" (include "cluster.fullname" .) "scope" .Values.recovery.methodSettings.objectStorage "secretPrefix" "recovery" "existingSecret" .Values.recovery.existingSecret -}}
{{- include "cluster.barmanObjectStoreConfig" $d | nindent 4 }}
{{- else if eq .Values.recovery.method "pgBasebackup" }}
externalClusters:
- name: pgBasebackupRecoveryCluster
connectionParameters:
host: {{ .Values.recovery.methodSettings.pgBasebackup.connectionParameters.host }}
port: {{ .Values.recovery.methodSettings.pgBasebackup.connectionParameters.port | quote }}
user: {{ .Values.recovery.methodSettings.pgBasebackup.connectionParameters.user }}
{{- if and (eq .Values.mode "replica") (not (empty .Values.recovery.methodSettings.pgBasebackup.database)) -}}
dbname: {{ default .Values.recovery.methodSettings.pgBasebackup.database .Values.recovery.methodSettings.pgBasebackup.connectionParameters.database }}
{{- else if eq .Values.mode "recovery" }}
dbname: postgres
{{- end }}
{{- if eq .Values.recovery.methodSettings.auth "password" }}
sslmode: require
{{- else if eq .Values.recovery.methodSettings.auth "tls" }}
sslmode: verify-full
{{- end }}
{{- $secretName := coalesce .Values.recovery.existingSecret.name (printf "%s-recovery-pgbb-creds" (include "cluster.fullname" .)) }}
{{- if eq .Values.recovery.methodSettings.auth "password" }}
password:
{{- if .Values.recovery.methodSettings.pgBasebackup.sourcePassword }}
name: {{ printf $secretName }}
{{- end }}
key: password
{{- else if eq .Values.recovery.methodSettings.auth "tls" }}
sslKey:
name: {{ printf $secretName }}
key: tls.key
sslCert:
name: {{ printf $secretName }}
key: tls.crt
sslRootCert:
name: {{ printf $secretName }}
key: ca.crt
{{- end }}
{{- end }}

{{- else }}
{{- else }}
{{ fail "Invalid cluster mode!" }}
{{- end }}
{{- end }}
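
For orientation, a values fragment that would exercise the `replica` branch with `pgBasebackup` might look like the sketch below. Every key path is read off the template above, and none of it is confirmed against the chart's `values.yaml`. The host and user are borrowed from the example files in this commit, while the port and `minApplyDelay` values are placeholders. Credentials are left out on purpose.

```yaml
# Sketch only: key paths are inferred from _bootstrap.tpl; check values.yaml before use.
mode: replica

replica:
  topology: standalone
  topologySettings:
    # Optional placeholder; rendered as replica.minApplyDelay when set.
    minApplyDelay: 1h

recovery:
  method: pgBasebackup
  methodSettings:
    # The template reads this as .Values.recovery.methodSettings.auth.
    auth: password
    pgBasebackup:
      connectionParameters:
        host: source-db.foo.com
        port: 5432
        user: streaming_replica
```

With `auth: password` the template sets `sslmode: require`; with `auth: tls` it sets `sslmode: verify-full` and references the `tls.key`, `tls.crt` and `ca.crt` keys of the credentials secret.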
111 changes: 110 additions & 1 deletion charts/cluster/templates/_helpers.tpl
@@ -69,6 +69,115 @@ If a custom imageName is available, use it, otherwise use the defaults based on
{{- end }}
{{- end -}}

{{/*
Recovery enabled
Check that recovery method is set to one of supported methods in methodSettings
*/}}
{{- define "cluster.recoveryEnabled" -}}
{{- $method := .Values.recovery.method -}}
{{- $methodSettings := .Values.recovery.methodSettings -}}
{{- if and $method (hasKey $methodSettings $method) }}
{{- (printf "%s" "true") }}
{{- else if and $method (not (hasKey $methodSettings $method)) }}
{{- fail (printf "The specified method '%s' does not correspond to any supported method in .Values.recovery.methodSettings" $method) }}
{{- (printf "%s" "false") }}
{{- else }}
{{- (printf "%s" "false") }}
{{- end }}
{{- end }}

{{/*
objectStorage recovery enabled
Check that provider is set to one of supported providers in providerSettings
*/}}
{{- define "cluster.objectStorageRecoveryEnabled" -}}
{{- $provider := .Values.recovery.methodSettings.objectStorage.provider -}}
{{- $providerSettings := .Values.recovery.methodSettings.objectStorage.providerSettings -}}
{{- if and $provider (hasKey $providerSettings $provider) }}
{{- (printf "%s" "true") }}
{{- else if and $provider (not (hasKey $providerSettings $provider)) }}
{{- fail (printf "The specified provider '%s' does not correspond to any supported provider in .Values.recovery.methodSettings.objectStorage.providerSettings" $provider) }}
{{- (printf "%s" "false") }}
{{- else }}
{{- (printf "%s" "false") }}
{{- end }}
{{- end }}

{{/*
pgBasebackup auth recovery enabled
Check that pgBasebackup auth is set to one of supported options in authDetails
*/}}
{{- define "cluster.pgBasebackupAuthRecoveryEnabled" -}}
{{- $auth := .Values.recovery.methodSettings.auth -}}
{{- $authDetails := .Values.recovery.methodSettings.authDetails -}}
{{- if and $auth (hasKey $authDetails $auth) }}
{{- (printf "%s" "true") }}
{{- else if and $auth (not (hasKey $authDetails $auth)) }}
{{- fail (printf "The specified auth '%s' does not correspond to any supported option in .Values.recovery.methodSettings.authDetails" $auth) }}
{{- (printf "%s" "false") }}
{{- else }}
{{- (printf "%s" "false") }}
{{- end }}
{{- end }}

{{/*
Replica enabled
Check that replica topology is set to one of the supported topologies in topologySettings
*/}}
{{- define "cluster.replicaEnabled" -}}
{{- $topology := .Values.replica.topology -}}
{{- $topologySettings := .Values.replica.topologySettings -}}
{{- if and $topology (hasKey $topologySettings $topology) }}
{{- (printf "%s" "true") }}
{{- else if and $topology (not (hasKey $topologySettings $topology)) }}
{{- fail (printf "The specified topology '%s' does not correspond to any supported topology in .Values.replica.topologySettings" $topology) }}
{{- (printf "%s" "false") }}
{{- else }}
{{- (printf "%s" "false") }}
{{- end }}
{{- end }}

{{/*
Replica source
Defines which source to use
Check that the recovery method is set to one of the supported methods
*/}}
{{- define "cluster.replicaSource" -}}
{{- $topology := .Values.replica.topology -}}
{{- $recoveryMethod := .Values.recovery.method -}}
{{- if and $topology $recoveryMethod }}
{{- if eq $recoveryMethod "objectStorage" }}
{{- (printf "objectStoreRecoveryCluster") }}
{{- else if eq $recoveryMethod "pgBasebackup" }}
{{- (printf "pgBasebackupRecoveryCluster") }}
{{- else }}
{{- fail (printf "The specified recovery method '%s' is not supported as a replica source; use objectStorage or pgBasebackup" $recoveryMethod) }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Replica distributed enabled
Check that both the recovery method and the backups are set to objectStorage
*/}}
{{- define "cluster.replicaDistributedEnabled" -}}
{{- $topology := .Values.replica.topology -}}
{{- $recoveryMethod := .Values.recovery.method -}}
{{- $backupsObjectStorage := .Values.backups.objectStorage.provider -}}
{{- if and $topology $recoveryMethod }}
{{- if eq $topology "distributed" }}
{{- if and (eq $recoveryMethod "objectStorage") (not (empty $backupsObjectStorage)) }}
{{- (printf "%s" "true") }}
{{- else }}
{{- fail (printf "Replica in distributed topology mode requires setting up both recovery and backups to objectStorage") }}
{{- (printf "%s" "false") }}
{{- end }}
{{- else }}
{{- (printf "%s" "false") }}
{{- end }}
{{- end }}
{{- end }}

{{/*
objectStorage backups enabled
Check that provider is set to one of supported providers in providerSettings
@@ -79,7 +188,7 @@ Check that provider is set to one of supported providers in providerSettings
{{- if and $provider (hasKey $providerSettings $provider) }}
{{- (printf "%s" "true") }}
{{- else if and $provider (not (hasKey $providerSettings $provider)) }}
{{- fail (printf "The specified provider '%s' does not correspond to any supported provider in providerSettings" $provider) }}
{{- fail (printf "The specified provider '%s' does not correspond to any supported provider in .Values.backups.objectStorage.providerSettings" $provider) }}
{{- (printf "%s" "false") }}
{{- else }}
{{- (printf "%s" "false") }}