diff --git a/product_docs/docs/postgres_for_kubernetes/1/applications.mdx b/product_docs/docs/postgres_for_kubernetes/1/applications.mdx index 9c8a6b2f660..84cf029a86c 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/applications.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/applications.mdx @@ -4,15 +4,10 @@ originalFilePath: 'src/applications.md' --- Applications are supposed to work with the services created by EDB Postgres for Kubernetes -in the same Kubernetes cluster: +in the same Kubernetes cluster. -- `[cluster name]-rw` -- `[cluster name]-ro` -- `[cluster name]-r` - -Those services are entirely managed by the Kubernetes cluster and -implement a form of Virtual IP as described in the -["Service" page of the Kubernetes Documentation](https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies). +For more information on services and how to manage them, please refer to the +["Service management"](service_management.md) section. !!! Hint It is highly recommended using those services in your applications, @@ -85,5 +80,7 @@ connecting to the PostgreSQL cluster, and correspond to the user *owning* the database. The `-superuser` ones are supposed to be used only for administrative purposes, -and correspond to the `postgres` user. Since version 1.21, superuser access -over the network is disabled by default. +and correspond to the `postgres` user. + +!!! Important + Superuser access over the network is disabled by default. diff --git a/product_docs/docs/postgres_for_kubernetes/1/architecture.mdx b/product_docs/docs/postgres_for_kubernetes/1/architecture.mdx index 415d7d7ca22..0e3a681edca 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/architecture.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/architecture.mdx @@ -3,17 +3,30 @@ title: 'Architecture' originalFilePath: 'src/architecture.md' --- -This section covers the main architectural aspects you need to consider -when deploying PostgreSQL in Kubernetes. - -!!! Important - We encourage you to read an article that we've written for the CNCF blog - with title ["Recommended Architectures for PostgreSQL in Kubernetes"](https://www.cncf.io/blog/2023/09/29/recommended-architectures-for-postgresql-in-kubernetes/). - -!!! Important - If you are deploying PostgreSQL in a self-managed Kubernetes environment, - please make sure you read the ["Kubernetes architecture"](#kubernetes-architecture) - below when you start planning your journey to the Cloud Native world. +!!! Hint + For a deeper understanding, we recommend reading our article on the CNCF + blog post titled ["Recommended Architectures for PostgreSQL in Kubernetes"](https://www.cncf.io/blog/2023/09/29/recommended-architectures-for-postgresql-in-kubernetes/), + which provides valuable insights into best practices and design + considerations for PostgreSQL deployments in Kubernetes. + +This documentation page provides an overview of the key architectural +considerations for implementing a robust business continuity strategy when +deploying PostgreSQL in Kubernetes. These considerations include: + +- **[Deployments in *stretched*](#multi-availability-zone-kubernetes-clusters) + vs. [*non-stretched* clusters](#single-availability-zone-kubernetes-clusters)**: + Evaluating the differences between deploying in stretched clusters (across 3 + or more availability zones) versus non-stretched clusters (within a single + availability zone). 
+- [**Reservation of `postgres` worker nodes**](#reserving-nodes-for-postgresql-workloads): Isolating PostgreSQL workloads by + dedicating specific worker nodes to `postgres` tasks, ensuring optimal + performance and minimizing interference from other workloads. +- [**PostgreSQL architectures within a single Kubernetes cluster**](#postgresql-architecture): + Designing effective PostgreSQL deployments within a single Kubernetes cluster + to meet high availability and performance requirements. +- [**PostgreSQL architectures across Kubernetes clusters for disaster recovery**](#deployments-across-kubernetes-clusters): + Planning and implementing PostgreSQL architectures that span multiple + Kubernetes clusters to provide comprehensive disaster recovery capabilities. ## Synchronizing the state @@ -87,10 +100,18 @@ section below for details on how you can design your PostgreSQL clusters within the same Kubernetes cluster through shared-nothing deployments at the storage, worker node, and availability zone levels. -Moreover, you can take advantage of additional [Kubernetes clusters](#deployments-across-kubernetes-clusters), -by using them to host "passive" PostgreSQL replica clusters. This should be -used primarily for DR, read-only operations, or cross-region availability, -even though failovers and promotions in this case must be done manually. +Additionally, you can leverage [Kubernetes clusters](#deployments-across-kubernetes-clusters) +to deploy distributed PostgreSQL topologies hosting "passive" +[PostgreSQL replica clusters](replica_cluster.md) in different regions and +managing them via declarative configuration. This setup is ideal for disaster +recovery (DR), read-only operations, or cross-region availability. + +!!! Important + Each operator deployment can only manage operations within its local + Kubernetes cluster. For operations across Kubernetes clusters, such as + controlled switchover or unexpected failover, coordination must be handled + manually (through GitOps, for example) or by using a higher-level cluster + management tool. ![Example of a multiple Kubernetes cluster architecture distributed over 3 regions each with 3 independent data centers](./images/k8s-architecture-multi.png) @@ -105,14 +126,14 @@ EDB Postgres for Kubernetes clusters suffer a failure. This scenario is typical of self-managed on-premise Kubernetes clusters, where only one data center is available. -Single availability zone Kubernetes is unfortunately the only viable option -where just **two (2) data centers** are available within reach of a low latency -connection (normally in the same metropolitan area): having only two zones -precludes users from creating a multi-availability zone Kubernetes cluster -(as the minimum number of -3 zones is not reached) and forces them to create two different Kubernetes -clusters in an active/passive configuration, where the second cluster is used -primarily for Disaster Recovery. +Single availability zone Kubernetes clusters are the only viable option when +only **two data centers** are available within reach of a low-latency +connection (typically in the same metropolitan area). Having only two zones +prevents the creation of a multi-availability zone Kubernetes cluster, which +requires a minimum of three zones. As a result, users must create two separate +Kubernetes clusters in an active/passive configuration, with the second cluster +primarily used for Disaster Recovery (see +the [replica cluster feature](replica_cluster.md)). 
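+
+The following is a minimal, illustrative sketch of how the passive member of
+such an active/passive pair could be declared with the replica cluster
+feature. The cluster names (`pg-dc-a`, `pg-dc-b`) and the object store details
+are placeholders, not values taken from this documentation; refer to the
+["Replica Clusters" section](replica_cluster.md) for the authoritative
+configuration options.
+
+```yaml
+apiVersion: postgresql.k8s.enterprisedb.io/v1
+kind: Cluster
+metadata:
+  name: pg-dc-b                # passive cluster in the second data center
+spec:
+  instances: 3
+  bootstrap:
+    recovery:
+      source: pg-dc-a          # clone from the active cluster's backup
+  replica:
+    primary: pg-dc-a           # current primary in the distributed topology
+    source: pg-dc-a            # cluster this one replicates from
+  externalClusters:
+    - name: pg-dc-a
+      barmanObjectStore:
+        # placeholder object store used for WAL shipping and backups
+        destinationPath: s3://backups/pg-dc-a
+  storage:
+    size: 1Gi
+```
+
+In this model, the active cluster in the first data center would carry a
+matching `replica` stanza (with itself as `primary`), so that the roles of the
+two clusters can later be swapped declaratively.
+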
![Example of a Kubernetes architecture with only 2 data centers](./images/k8s-architecture-2-az.png) @@ -135,14 +156,104 @@ deployments at the storage and worker node levels only. For HA, in such a scenario it becomes even more important that the PostgreSQL instances be located on different worker nodes and do not share the same storage. -For DR, you can push the SPoF above the single zone, by -using additional -[Kubernetes clusters](#deployments-across-kubernetes-clusters) to -host "passive" PostgreSQL replica clusters. As with other Kubernetes workloads in -this scenario, promotion of a Kubernetes cluster as primary must be done -manually. As explained below, no automated failover across Kubernetes clusters -is available for PostgreSQL at the moment with EDB Postgres for Kubernetes, as the operator -can only work within a single Kubernetes cluster. +For DR, you can push the SPoF above the single zone, by using additional +[Kubernetes clusters](#deployments-across-kubernetes-clusters) to define a +distributed topology hosting "passive" [PostgreSQL replica clusters](replica_cluster.md). +As with other Kubernetes workloads in this scenario, promotion of a Kubernetes +cluster as primary must be done manually. + +Through the [replica cluster feature](replica_cluster.md), you can define a +distributed PostgreSQL topology and coordinate a controlled switchover between +data centers by first demoting the primary cluster and then promoting the +replica cluster, without the need to re-clone the former primary. While failover +is now fully declarative, automated failover across Kubernetes clusters is not +within EDB Postgres for Kubernetes' scope, as the operator can only function within a single +Kubernetes cluster. + +!!! Important + EDB Postgres for Kubernetes provides all the necessary primitives and probes to + coordinate PostgreSQL active/passive topologies across different Kubernetes + clusters through a higher-level operator or management tool. + +### Reserving nodes for PostgreSQL workloads + +Whether you're operating in a multi-availability zone environment or, more +critically, within a single availability zone, we strongly recommend isolating +PostgreSQL workloads by dedicating specific worker nodes exclusively to +`postgres` in production. A Kubernetes worker node dedicated to running +PostgreSQL workloads is referred to as a **Postgres node** or `postgres` node. +This approach ensures optimal performance and resource allocation for your +database operations. + +!!! Hint + As a general rule of thumb, deploy Postgres nodes in multiples of + three—ideally with one node per availability zone. Three nodes is + an optimal number because it ensures that a PostgreSQL cluster with three + instances (one primary and two standby replicas) is distributed across + different nodes, enhancing fault tolerance and availability. + +In Kubernetes, this can be achieved using node labels and taints in a +declarative manner, aligning with Infrastructure as Code (IaC) practices: +labels ensure that a node is capable of running `postgres` workloads, while +taints help prevent any non-`postgres` workloads from being scheduled on that +node. + +!!! Important + This methodology is the most straightforward way to ensure that PostgreSQL + workloads are isolated from other workloads in terms of both computing + resources and, when using locally attached disks, storage. 
While different + PostgreSQL clusters may share the same node, you can take this a step further + by using labels and taints to ensure that a node is dedicated to a single + instance of a specific `Cluster`. + +#### Proposed node label + +EDB Postgres for Kubernetes recommends using the `node-role.kubernetes.io/postgres` label. +Since this is a reserved label (`*.kubernetes.io`), it can only be applied +after a worker node is created. + +To assign the `postgres` label to a node, use the following command: + +```sh +kubectl label node node-role.kubernetes.io/postgres= +``` + +To ensure that a `Cluster` resource is scheduled on a `postgres` node, you must +correctly configure the `.spec.affinity.nodeSelector` stanza in your manifests. +Here’s an example: + +```yaml +spec: + # + affinity: + # + nodeSelector: + node-role.kubernetes.io/postgres: "" +``` + +#### Proposed node taint + +EDB Postgres for Kubernetes recommends using the `node-role.kubernetes.io/postgres` taint. + +To assign the `postgres` taint to a node, use the following command: + +```sh +kubectl taint node node-role.kubernetes.io/postgres=:noSchedule +``` + +To ensure that a `Cluster` resource is scheduled on a node with a `postgres` taint, you must correctly configure the `.spec.affinity.tolerations` stanza in your manifests. +Here’s an example: + +```yaml +spec: + # + affinity: + # + tolerations: + - key: node-role.kubernetes.io/postgres + operator: Exists + effect: NoSchedule +``` ## PostgreSQL architecture @@ -151,10 +262,14 @@ streaming replication to manage multiple hot standby replicas within the same Kubernetes cluster, with the following specifications: - One primary, with optional multiple hot standby replicas for HA + - Available services for applications: - `-rw`: applications connect only to the primary instance of the cluster - - `-ro`: applications connect only to hot standby replicas for read-only-workloads - - `-r`: applications connect to any of the instances for read-only workloads + - `-ro`: applications connect only to hot standby replicas for + read-only-workloads (optional) + - `-r`: applications connect to any of the instances for read-only + workloads (optional) + - Shared-nothing architecture recommended for better resilience of the PostgreSQL cluster: - PostgreSQL instances should reside on different Kubernetes worker nodes and share only the network - as a result, instances should not share @@ -163,6 +278,13 @@ Kubernetes cluster, with the following specifications: - PostgreSQL instances should reside in different availability zones within the same Kubernetes cluster / region +!!! Important + You can configure the above services through the `managed.services` section + in the `Cluster` configuration. This can be done by reducing the number of + services and selecting the type (default is `ClusterIP`). For more details, + please refer to the ["Service Management" section](service_management.md) + below. + The below diagram provides a simplistic view of the recommended shared-nothing architecture for a PostgreSQL cluster spanning across 3 different availability zones, running on separate nodes, each with dedicated local storage for @@ -226,9 +348,9 @@ Applications can also access any PostgreSQL instance through the ## Deployments across Kubernetes clusters !!! Info - EDB Postgres for Kubernetes supports deploying PostgreSQL across multiple - Kubernetes clusters through a feature called **Replica Cluster**, - which is described in this section. 
+ EDB Postgres for Kubernetes supports deploying PostgreSQL across multiple Kubernetes + clusters through a feature that allows you to define a distributed PostgreSQL + topology using replica clusters, as described in this section. In a distributed PostgreSQL cluster there can only be a single PostgreSQL instance acting as a primary at all times. This means that applications can @@ -242,29 +364,17 @@ However, for business continuity objectives it is fundamental to: - reduce global **recovery time objectives** (RTO) by taking advantage of PostgreSQL replication beyond the primary Kubernetes cluster (High Availability) -In order to address the above concerns, EDB Postgres for Kubernetes introduces the -concept of a *PostgreSQL Replica Cluster*. Replica clusters are the -EDB Postgres for Kubernetes way to enable multi-cluster deployments in private, public, -hybrid, and multi-cloud contexts. - -A replica cluster is a separate `Cluster` resource: - -1. having either `pg_basebackup` or full `recovery` as the `bootstrap` - option from a defined external source cluster -2. having the `replica.enabled` option set to `true` -3. replicating from a defined external cluster identified by `replica.source`, - normally located outside the Kubernetes cluster -4. replaying WAL information received from the recovery object store - (using PostgreSQL's `restore_command` parameter), or via streaming - replication (using PostgreSQL's `primary_conninfo` parameter), or any of - the two (in case both the `barmanObjectStore` and `connectionParameters` - are defined in the external cluster) -5. accepting only read connections, as supported by PostgreSQL's Hot Standby - -!!! Seealso - Please refer to the ["Bootstrap" section](bootstrap.md) for more information - about cloning a PostgreSQL cluster from another one (defined in the - `externalClusters` section). +In order to address the above concerns, EDB Postgres for Kubernetes introduces the concept of +a PostgreSQL Topology that is distributed across different Kubernetes clusters +and is made up of a primary PostgreSQL cluster and one or more PostgreSQL +replica clusters. +This feature is called **distributed PostgreSQL topology with replica clusters**, +and it enables multi-cluster deployments in private, public, hybrid, and +multi-cloud contexts. + +A replica cluster is a separate `Cluster` resource that is in continuous +recovery, replicating from another source, either via WAL shipping from a WAL +archive or via streaming replication from a primary or a standby (cascading). The diagram below depicts a PostgreSQL cluster spanning over two different Kubernetes clusters, where the primary cluster is in the first Kubernetes @@ -274,30 +384,46 @@ of disaster and unavailability of the first one. ![An example of multi-cluster deployment with a primary and a replica cluster](./images/multi-cluster.png) -A replica cluster can have the same architecture of the primary cluster. In -place of the primary instance, a replica cluster has a **designated primary** +A replica cluster can have the same architecture as the primary cluster. +Instead of a primary instance, a replica cluster has a **designated primary** instance, which is a standby server with an arbitrary number of cascading standby servers in streaming replication (symmetric architecture). -The designated primary can be promoted at any time, making the replica cluster -a primary cluster capable of accepting write connections. 
+The designated primary can be promoted at any time, transforming the replica +cluster into a primary cluster capable of accepting write connections. +This is typically triggered by: + +- **Human decision:** You choose to make the other PostgreSQL cluster (or the + entire Kubernetes cluster) the primary. To avoid data loss and ensure that + the former primary can follow without being re-cloned (especially with large + data sets), you first demote the current primary, then promote the designated + primary using the API provided by EDB Postgres for Kubernetes. +- **Unexpected failure:** If the entire Kubernetes cluster fails, you might + experience data loss, but you need to fail over to the other Kubernetes + cluster by promoting the PostgreSQL replica cluster. !!! Warning - EDB Postgres for Kubernetes does not perform any cross-cluster switchover - or failover at the moment. Such operation must be performed manually - or delegated to a multi-cluster/federated cluster aware authority. - Each PostgreSQL cluster is independent from any other. + EDB Postgres for Kubernetes cannot perform any cross-cluster automated failover, as it + does not have authority beyond a single Kubernetes cluster. Such operations + must be performed manually or delegated to a multi-cluster/federated + cluster-aware authority. + +!!! Important + EDB Postgres for Kubernetes allows you to control the distributed topology via + declarative configuration, enabling you to automate these procedures as part of + your Infrastructure as Code (IaC) process, including GitOps. The designated primary in the above example is fed via WAL streaming (`primary_conninfo`), with fallback option for file-based WAL shipping through the `restore_command` and `barman-cloud-wal-restore`. -EDB Postgres for Kubernetes allows you to define multiple replica clusters. +EDB Postgres for Kubernetes allows you to define topologies with multiple replica clusters. You can also define replica clusters with a lower number of replicas, and then increase this number when the cluster is promoted to primary. !!! Seealso "Replica clusters" - Please refer to the ["Replica Clusters" section](replica_cluster.md) for more - information about physical replica clusters work and how you can configure - read-only clusters in different Kubernetes cluster to improve your global - disaster recovery and HA strategy. + Please refer to the ["Replica Clusters" section](replica_cluster.md) for + more detailed information on how physical replica clusters operate and how to + define a distributed topology with read-only clusters across different + Kubernetes clusters. This approach can significantly enhance your global + disaster recovery and high availability (HA) strategy. diff --git a/product_docs/docs/postgres_for_kubernetes/1/backup.mdx b/product_docs/docs/postgres_for_kubernetes/1/backup.mdx index 89a74a8f498..9a37c5f9a64 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/backup.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/backup.mdx @@ -3,14 +3,6 @@ title: 'Backup' originalFilePath: 'src/backup.md' --- -!!! Important - With version 1.21, backup and recovery capabilities in EDB Postgres for Kubernetes - have sensibly changed due to the introduction of native support for - [Kubernetes Volume Snapshots](backup_volumesnapshot.md). - Up to that point, backup and recovery were available only for object - stores. Please carefully read this section and the [recovery](recovery.md) - one if you have been a user of EDB Postgres for Kubernetes 1.15 through 1.20. 
- PostgreSQL natively provides first class backup and recovery capabilities based on file system level (physical) copy. These have been successfully used for more than 15 years in mission critical production databases, helping diff --git a/product_docs/docs/postgres_for_kubernetes/1/backup_barmanobjectstore.mdx b/product_docs/docs/postgres_for_kubernetes/1/backup_barmanobjectstore.mdx index b76847462a2..1ec2df60615 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/backup_barmanobjectstore.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/backup_barmanobjectstore.mdx @@ -151,24 +151,28 @@ spec: backupRetentionPolicy: "keep" ``` -## Extra options for the backup command +## Extra options for the backup and WAL commands -You can append additional options to the `barman-cloud-backup` command by using +You can append additional options to the `barman-cloud-backup` and `barman-cloud-wal-archive` commands by using the `additionalCommandArgs` property in the -`.spec.backup.barmanObjectStore.data` section. -This property is a list of strings that will be appended to the -`barman-cloud-backup` command. +`.spec.backup.barmanObjectStore.data` and `.spec.backup.barmanObjectStore.wal` sections respectively. +This properties are lists of strings that will be appended to the +`barman-cloud-backup` and `barman-cloud-wal-archive` commands. + For example, you can use the `--read-timeout=60` to customize the connection reading timeout. -For additional options supported by `barman-cloud-backup` you can refer to the -official barman documentation [here](https://www.pgbarman.org/documentation/). + +For additional options supported by `barman-cloud-backup` and `barman-cloud-wal-archive` commands you can refer to the +official barman documentation [here](https://www.pgbarman.org/documentation/). If an option provided in `additionalCommandArgs` is already present in the -declared options in the `barmanObjectStore` section, the extra option will be +declared options in its section (`.spec.backup.barmanObjectStore.data` or `.spec.backup.barmanObjectStore.wal`), the extra option will be ignored. The following is an example of how to use this property: +For backups: + ```yaml apiVersion: postgresql.k8s.enterprisedb.io/v1 kind: Cluster @@ -182,3 +186,19 @@ spec: - "--min-chunk-size=5MB" - "--read-timeout=60" ``` + +For WAL files: + +```yaml +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: Cluster +[...] +spec: + backup: + barmanObjectStore: + [...] + wal: + additionalCommandArgs: + - "--max-concurrency=1" + - "--read-timeout=60" +``` diff --git a/product_docs/docs/postgres_for_kubernetes/1/backup_recovery.mdx b/product_docs/docs/postgres_for_kubernetes/1/backup_recovery.mdx index 8080d7e0fae..1ee141ade78 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/backup_recovery.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/backup_recovery.mdx @@ -3,12 +3,4 @@ title: 'Backup and Recovery' originalFilePath: 'src/backup_recovery.md' --- -Until EDB Postgres for Kubernetes 1.20, this page used to contain both the backup and -recovery phases of a PostgreSQL cluster. The reason was that EDB Postgres for Kubernetes -supported only backup and recovery object stores. - -Version 1.21 introduces support for the Kubernetes `VolumeSnapshot` API, -providing more possibilities for the end user. - -As a result, [backup](backup.md) and [recovery](recovery.md) are now in two -separate sections. +[Backup](backup.md) and [recovery](recovery.md) are in two separate sections. 
diff --git a/product_docs/docs/postgres_for_kubernetes/1/backup_volumesnapshot.mdx b/product_docs/docs/postgres_for_kubernetes/1/backup_volumesnapshot.mdx index 5d677f3911d..3a085374575 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/backup_volumesnapshot.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/backup_volumesnapshot.mdx @@ -3,15 +3,6 @@ title: 'Backup on volume snapshots' originalFilePath: 'src/backup_volumesnapshot.md' --- -!!! Warning - The initial release of volume snapshots (version 1.21.0) only supported - cold backups, which required fencing of the instance. This limitation - has been waived starting with version 1.21.1. Given the minimal impact of - the change on the code, maintainers have decided to backport this feature - immediately instead of waiting for version 1.22.0 to be out, and make online - backups the default behavior on volume snapshots too. If you are planning - to rely instead on cold backups, make sure you follow the instructions below. - !!! Warning As noted in the [backup document](backup.md), a cold snapshot explicitly set to target the primary will result in the primary being fenced for diff --git a/product_docs/docs/postgres_for_kubernetes/1/before_you_start.mdx b/product_docs/docs/postgres_for_kubernetes/1/before_you_start.mdx index ce8994e945e..f22f568b0f2 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/before_you_start.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/before_you_start.mdx @@ -12,6 +12,12 @@ specific to Kubernetes and PostgreSQL. : A *node* is a worker machine in Kubernetes, either virtual or physical, where all services necessary to run pods are managed by the control plane node(s). +[Postgres Node](architecture.md#reserving-nodes-for-postgresql-workloads) +: A *Postgres node* is a Kubernetes worker node dedicated to running PostgreSQL + workloads. This is achieved by applying the `node-role.kubernetes.io` label and + taint, as [proposed by EDB Postgres for Kubernetes](architecture.md#reserving-nodes-for-postgresql-workloads). + It is also referred to as a `postgres` node. + [Pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/) : A *pod* is the smallest computing unit that can be deployed in a Kubernetes cluster and is composed of one or more containers that share network and diff --git a/product_docs/docs/postgres_for_kubernetes/1/bootstrap.mdx b/product_docs/docs/postgres_for_kubernetes/1/bootstrap.mdx index b5b642327ce..99609d40444 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/bootstrap.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/bootstrap.mdx @@ -279,10 +279,43 @@ spec: `-d`), this technique is deprecated and will be removed from future versions of the API. -You can also specify a custom list of queries that will be executed -once, just after the database is created and configured. These queries will -be executed as the *superuser* (`postgres`), connected to the `postgres` -database: +### Executing Queries After Initialization + +You can specify a custom list of queries that will be executed once, +immediately after the cluster is created and configured. These queries will be +executed as the *superuser* (`postgres`) against three different databases, in +this specific order: + +1. The `postgres` database (`postInit` section) +2. The `template1` database (`postInitTemplate` section) +3. 
The application database (`postInitApplication` section) + +For each of these sections, EDB Postgres for Kubernetes provides two ways to specify custom +queries, executed in the following order: + +- As a list of SQL queries in the cluster's definition (`postInitSQL`, + `postInitTemplateSQL`, and `postInitApplicationSQL` stanzas) +- As a list of Secrets and/or ConfigMaps, each containing a SQL script to be + executed (`postInitSQLRefs`, `postInitTemplateSQLRefs`, and + `postInitApplicationSQLRefs` stanzas). Secrets are processed before ConfigMaps. + +Objects in each list will be processed sequentially. + +!!! Warning + Use the `postInit`, `postInitTemplate`, and `postInitApplication` options + with extreme care, as queries are run as a superuser and can disrupt the entire + cluster. An error in any of those queries will interrupt the bootstrap phase, + leaving the cluster incomplete and requiring manual intervention. + +!!! Important + Ensure the existence of entries inside the ConfigMaps or Secrets specified + in `postInitSQLRefs`, `postInitTemplateSQLRefs`, and + `postInitApplicationSQLRefs`, otherwise the bootstrap will fail. Errors in any + of those SQL files will prevent the bootstrap phase from completing + successfully. + +The following example runs a single SQL query as part of the `postInitSQL` +stanza: ```yaml apiVersion: postgresql.k8s.enterprisedb.io/v1 @@ -305,18 +338,9 @@ spec: size: 1Gi ``` -!!! Warning - Please use the `postInitSQL`, `postInitApplicationSQL` and - `postInitTemplateSQL` options with extreme care, as queries are run as a - superuser and can disrupt the entire cluster. An error in any of those queries - interrupts the bootstrap phase, leaving the cluster incomplete. - -### Executing queries after initialization - -Moreover, you can specify a list of Secrets and/or ConfigMaps which contains -SQL script that will be executed after the database is created and configured. -These SQL script will be executed using the **superuser** role (`postgres`), -connected to the database specified in the `initdb` section: +The example below relies on `postInitApplicationSQLRefs` to specify a secret +and a ConfigMap containing the queries to run after the initialization on the +application database: ```yaml apiVersion: postgresql.k8s.enterprisedb.io/v1 @@ -342,18 +366,9 @@ spec: ``` !!! Note - The SQL scripts referenced in `secretRefs` will be executed before the ones - referenced in `configMapRefs`. For both sections the SQL scripts will be - executed respecting the order in the list. Inside SQL scripts, each SQL - statement is executed in a single exec on the server according to the - [PostgreSQL semantics](https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-MULTI-STATEMENT), - comments can be included, but internal command like `psql` cannot. - -!!! Warning - Please make sure the existence of the entries inside the ConfigMaps or - Secrets specified in `postInitApplicationSQLRefs`, otherwise the bootstrap will - fail. Errors in any of those SQL files will prevent the bootstrap phase to - complete successfully. + Within SQL scripts, each SQL statement is executed in a single exec on the + server according to the [PostgreSQL semantics](https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-MULTI-STATEMENT). + Comments can be included, but internal commands like `psql` cannot. 
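+
+As a quick, illustrative sketch (the object names `app-init-sql-secret` and
+`app-init-sql-config`, as well as the key names, are placeholders rather than
+the ones used in the example above), the referenced Secret and ConfigMap can
+be created from plain SQL files with `kubectl`:
+
+```sh
+# Create a Secret and a ConfigMap holding SQL scripts that
+# postInitApplicationSQLRefs can reference (names and keys are placeholders)
+kubectl create secret generic app-init-sql-secret \
+  --from-file=init.sql=./init.sql
+
+kubectl create configmap app-init-sql-config \
+  --from-file=seed.sql=./seed.sql
+```
+
+Remember that, as noted earlier, Secrets are processed before ConfigMaps, and
+the scripts within each list run in the order in which they are declared.
+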
### Compatibility Features @@ -530,7 +545,7 @@ file on the source PostgreSQL instance: host replication streaming_replica all md5 ``` -The following manifest creates a new PostgreSQL 16.3 cluster, +The following manifest creates a new PostgreSQL 16.4 cluster, called `target-db`, using the `pg_basebackup` bootstrap method to clone an external PostgreSQL cluster defined as `source-db` (in the `externalClusters` array). As you can see, the `source-db` @@ -545,7 +560,7 @@ metadata: name: target-db spec: instances: 3 - imageName: quay.io/enterprisedb/postgresql:16.3 + imageName: quay.io/enterprisedb/postgresql:16.4 bootstrap: pg_basebackup: @@ -565,7 +580,7 @@ spec: ``` All the requirements must be met for the clone operation to work, including -the same PostgreSQL version (in our case 16.3). +the same PostgreSQL version (in our case 16.4). #### TLS certificate authentication @@ -580,7 +595,7 @@ in the same Kubernetes cluster. This example can be easily adapted to cover an instance that resides outside the Kubernetes cluster. -The manifest defines a new PostgreSQL 16.3 cluster called `cluster-clone-tls`, +The manifest defines a new PostgreSQL 16.4 cluster called `cluster-clone-tls`, which is bootstrapped using the `pg_basebackup` method from the `cluster-example` external cluster. The host is identified by the read/write service in the same cluster, while the `streaming_replica` user is authenticated @@ -595,7 +610,7 @@ metadata: name: cluster-clone-tls spec: instances: 3 - imageName: quay.io/enterprisedb/postgresql:16.3 + imageName: quay.io/enterprisedb/postgresql:16.4 bootstrap: pg_basebackup: diff --git a/product_docs/docs/postgres_for_kubernetes/1/certificates.mdx b/product_docs/docs/postgres_for_kubernetes/1/certificates.mdx index 4120f1fe8ae..d7802b3f0f3 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/certificates.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/certificates.mdx @@ -29,6 +29,12 @@ primarily operates in two modes: You can also choose a hybrid approach, where only part of the certificates is generated outside CNP. +!!! Note + The operator and instances verify server certificates against the CA only, + disregarding the DNS name. This approach is due to the typical absence of DNS + names in user-provided certificates for the `-rw` service used for + communication within the cluster. + ## Operator-managed mode By default, the operator generates a single CA and uses it for both client and @@ -66,7 +72,7 @@ is passed as `ssl_ca_file` to all the instances so it can verify client certificates it signed. The private key is stored in the same secret and used to sign client certificates generated by the `kubectl cnp` plugin. -#### Client \`streaming_replica\`\` certificate +#### Client `streaming_replica` certificate The operator uses the generated self-signed CA to sign a client certificate for the user `streaming_replica`, storing it in a secret of type @@ -92,6 +98,12 @@ the following parameters: The operator still creates and manages the two secrets related to client certificates. +!!! Note + The operator and instances verify server certificates against the CA only, + disregarding the DNS name. This approach is due to the typical absence of DNS + names in user-provided certificates for the `-rw` service used for + communication within the cluster. + !!! Note If you want ConfigMaps and secrets to be reloaded by instances, you can add a label with the key `k8s.enterprisedb.io/reload` to it. 
Otherwise you must reload the diff --git a/product_docs/docs/postgres_for_kubernetes/1/declarative_hibernation.mdx b/product_docs/docs/postgres_for_kubernetes/1/declarative_hibernation.mdx index e35817bd367..26c47451dc5 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/declarative_hibernation.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/declarative_hibernation.mdx @@ -61,7 +61,7 @@ $ kubectl cnp status Cluster Summary Name: cluster-example Namespace: default -PostgreSQL Image: quay.io/enterprisedb/postgresql:16.3 +PostgreSQL Image: quay.io/enterprisedb/postgresql:16.4 Primary instance: cluster-example-2 Status: Cluster in healthy state Instances: 3 diff --git a/product_docs/docs/postgres_for_kubernetes/1/declarative_role_management.mdx b/product_docs/docs/postgres_for_kubernetes/1/declarative_role_management.mdx index 9e5ac43a2d6..f6d4ede40a4 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/declarative_role_management.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/declarative_role_management.mdx @@ -153,14 +153,8 @@ never expires, mirroring the behavior of PostgreSQL. Specifically: allowing `VALID UNTIL NULL` in the `ALTER ROLE` SQL statement) !!! Warning - The declarative role management feature has changed behavior since its - initial version (1.20.0). In 1.20.0, a role without a `passwordSecret` would - lead to setting the password to NULL in PostgreSQL. - In practice there is little difference from 1.20.0. - New roles created without `passwordSecret` will have a `NULL` password. - The relevant change is when using the managed roles to manage roles that - had been previously created. In 1.20.0, doing this might inadvertently - result in setting existing passwords to `NULL`. + New roles created without `passwordSecret` will have a `NULL` password + inside PostgreSQL. ### Password hashed diff --git a/product_docs/docs/postgres_for_kubernetes/1/evaluation.mdx b/product_docs/docs/postgres_for_kubernetes/1/evaluation.mdx index 96aee4bcaff..d8be04054d8 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/evaluation.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/evaluation.mdx @@ -9,31 +9,36 @@ The process is different between Community PostgreSQL and EDB Postgres Advanced ## Evaluating using PostgreSQL -By default, EDB Postgres for Kubernetes installs the latest available -version of Community PostgreSQL. +By default, EDB Postgres for Kubernetes installs the latest available version of Community PostgreSQL. -No license key is required. The operator automatically generates an implicit trial license for the cluster that lasts for -30 days. This trial license is ideal for evaluation, proof of concept, integration with CI/CD pipelines, and so on. +No license key is required. The operator automatically generates an implicit trial license for the cluster that lasts for 30 days. This trial license is ideal for evaluation, proof of concept, integration with CI/CD pipelines, and so on. PostgreSQL container images are available at [quay.io/enterprisedb/postgresql](https://quay.io/repository/enterprisedb/postgresql). ## Evaluating using EDB Postgres Advanced Server -You can use EDB Postgres for Kubernetes with EDB Postgres Advanced Server. You will need a trial key to use EDB Postgres Advanced Server. +There are two ways to obtain the EDB Postgres Advanced Server image for evaluation purposes. The easiest is through the EDB Image Repository, where all you’ll need is an EDB account to auto generate a repository access token. 
The other way is to download the image through [quay.io](http://quay.io) and request a trial license key from EDB support. -!!! Note Obtaining your trial key - You can request a key from the **[EDB Postgres for Kubernetes Trial License Request](https://cloud-native.enterprisedb.com/trial/)** page. You will also need to be signed into your EDB Account. If you do not have an EDB Account, you can [register for one](https://www.enterprisedb.com/accounts/register) on the EDB site. +### EDB Image Repository -Once you have received the license key, you can use EDB Postgres Advanced Server -by setting in the `spec` section of the `Cluster` deployment configuration file: +You can use EDB Postgres for Kubernetes with EDB Postgres Advanced Server. You can access the image by obtaining a repository access token to EDB’s image repositories. -- `imageName` to point to the `quay.io/enterprisedb/edb-postgres-advanced` repository -- `licenseKey` to your license key (in the form of a string) +### Obtaining your access token + +You can request a repository access token from the [EDB Repositories Download](https://www.enterprisedb.com/repos-downloads) page. You will also need to be signed into your EDB account. If you don't have an EDB Account, you can [register for one](https://www.enterprisedb.com/accounts/register) on the EDB site. + +### Quay Image Repository + +If you want to use the Quay image repository, you’ll need a trial license key for access to use the images. To request a trial license key for EDB Postgres Kubernetes please contact your sales representative or you can contact our EDB Technical Support Team by email at [techsupport@enterprisedb.com](mailto:techsupport@enterprisedb.com) or file a ticket on our support portal . Please allow 24 hours for your license to be generated and delivered to you and if you need any additional support please do not hesitate to contact us. -EDB Postgres Advanced container images are available at -[quay.io/enterprisedb/edb-postgres-advanced](https://quay.io/repository/enterprisedb/edb-postgres-advanced). +Once you have your license key, EDB Postgres Advanced container images will be available at + +You can then use EDB Postgres Advanced Server by setting in the `spec` section of the `Cluster` deployment configuration file: + +- `imageName` to point to the quay.io/enterprisedb/edb-postgres-advanced repository +- `licenseKey` to your license key (in the form of a string) -To see how `imageName` and `licenseKey` is set, refer to the [cluster-full-example](../samples/cluster-example-full.yaml) file from the the [configuration samples](samples.md) section. +To see how `imageName` and `licenseKey` is set, refer to the [cluster-full-example](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/samples/cluster-example-full.yaml) file from the [configuration samples](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/samples/) section. ## Further Information diff --git a/product_docs/docs/postgres_for_kubernetes/1/expose_pg_services.mdx b/product_docs/docs/postgres_for_kubernetes/1/expose_pg_services.mdx deleted file mode 100644 index b111a67f282..00000000000 --- a/product_docs/docs/postgres_for_kubernetes/1/expose_pg_services.mdx +++ /dev/null @@ -1,137 +0,0 @@ ---- -title: 'Exposing Postgres Services' -originalFilePath: 'src/expose_pg_services.md' ---- - -This section explains how to expose a PostgreSQL service externally, allowing access -to your PostgreSQL database **from outside your Kubernetes cluster** using -NGINX Ingress Controller. 
- -If you followed the [QuickStart](./quickstart.md), you should have by now -a database that can be accessed inside the cluster via the -`cluster-example-rw` (primary) and `cluster-example-r` (read-only) -services in the `default` namespace. Both services use port `5432`. - -Let's assume that you want to make the primary instance accessible from external -accesses on port `5432`. A typical use case, when moving to a Kubernetes -infrastructure, is indeed the one represented by **legacy applications** -that cannot be easily or sustainably "containerized". A sensible workaround -is to allow those applications that most likely reside in a virtual machine -or a physical server, to access a PostgreSQL database inside a Kubernetes cluster -in the same network. - -!!! Warning - Allowing access to a database from the public network could expose - your database to potential attacks from malicious users. Ensure you - secure your database before granting external access or that your - Kubernetes cluster is only reachable from a private network. - -For this example, you will use [NGINX Ingress Controller](https://kubernetes.github.io/ingress-nginx/), -since it is maintained directly by the Kubernetes project and can be set up -on every Kubernetes cluster. Many other controllers are available (see the -[Kubernetes documentation](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/) -for a comprehensive list). - -We assume that: - -- the NGINX Ingress controller has been deployed and works correctly -- it is possible to create a service of type `LoadBalancer` in your cluster - -!!! Important - Ingresses are only required to expose HTTP and HTTPS traffic. While the NGINX - Ingress controller can, not all Ingress objects can expose arbitrary ports or - protocols. - -The first step is to create a `tcp-services` `ConfigMap` whose data field -contains info on the externally exposed port and the namespace, service and -port to point to internally. - -```yaml -apiVersion: v1 -kind: ConfigMap -metadata: - name: tcp-services - namespace: ingress-nginx -data: - 5432: default/cluster-example-rw:5432 -``` - -Then, if you've installed NGINX Ingress Controller as suggested in their -documentation, you should have an `ingress-nginx` service. You'll have to add -the 5432 port to the `ingress-nginx` service to expose it. -The ingress will redirect incoming connections on port 5432 to your database. - -```yaml -apiVersion: v1 -kind: Service -metadata: - name: ingress-nginx - namespace: ingress-nginx - labels: - app.kubernetes.io/name: ingress-nginx - app.kubernetes.io/part-of: ingress-nginx -spec: - type: LoadBalancer - ports: - - name: http - port: 80 - targetPort: 80 - protocol: TCP - - name: https - port: 443 - targetPort: 443 - protocol: TCP - - name: postgres - port: 5432 - targetPort: 5432 - protocol: TCP - selector: - app.kubernetes.io/name: ingress-nginx - app.kubernetes.io/part-of: ingress-nginx -``` - -You can use [`cluster-expose-service.yaml`](../samples/cluster-expose-service.yaml) and apply it -using `kubectl`. - -!!! Warning - If you apply this file directly, you will overwrite any previous change - in your `ConfigMap` and `Service` of the Ingress - -Now you will be able to reach the PostgreSQL Cluster from outside your Kubernetes cluster. - -!!! Important - Make sure you configure `pg_hba` to allow connections from the Ingress. 
- -## Testing on Minikube - -On Minikube you can setup the ingress controller running: - -```sh -minikube addons enable ingress -``` - -You can then patch the deployment to allow access on port 5432. -Create a file called `patch.yaml` with the following content: - -```yaml -spec: - template: - spec: - containers: - - name: controller - ports: - - containerPort: 5432 - hostPort: 5432 -``` - -and apply it to the `ingress-nginx-controller` deployment: - -```sh -kubectl patch deployment ingress-nginx-controller --patch "$(cat patch.yaml)" -n ingress-nginx -``` - -You can access the primary from your machine running: - -```sh -psql -h $(minikube ip) -p 5432 -U postgres -``` diff --git a/product_docs/docs/postgres_for_kubernetes/1/failure_modes.mdx b/product_docs/docs/postgres_for_kubernetes/1/failure_modes.mdx index a1aab1641cf..5777b5aeb72 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/failure_modes.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/failure_modes.mdx @@ -178,9 +178,8 @@ to solve the problem manually. In such cases, please do not perform any manual operation without the support and assistance of EDB engineering team. -From version 1.11.0 of the operator, you can use the -`k8s.enterprisedb.io/reconciliationLoop` annotation to temporarily disable the -reconciliation loop on a selected PostgreSQL cluster, as follows: +You can use the `k8s.enterprisedb.io/reconciliationLoop` annotation to temporarily disable +the reconciliation loop for a specific PostgreSQL cluster, as shown below: ```yaml metadata: diff --git a/product_docs/docs/postgres_for_kubernetes/1/image_catalog.mdx b/product_docs/docs/postgres_for_kubernetes/1/image_catalog.mdx index bd4c73a38b7..451bb0415ee 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/image_catalog.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/image_catalog.mdx @@ -35,7 +35,7 @@ spec: - major: 15 image: quay.io/enterprisedb/postgresql:15.6 - major: 16 - image: quay.io/enterprisedb/postgresql:16.3 + image: quay.io/enterprisedb/postgresql:16.4 ``` **Example of a Cluster-Wide Catalog using `ClusterImageCatalog` Resource:** @@ -50,7 +50,7 @@ spec: - major: 15 image: quay.io/enterprisedb/postgresql:15.6 - major: 16 - image: quay.io/enterprisedb/postgresql:16.3 + image: quay.io/enterprisedb/postgresql:16.4 ``` A `Cluster` resource has the flexibility to reference either an `ImageCatalog` diff --git a/product_docs/docs/postgres_for_kubernetes/1/installation_upgrade.mdx b/product_docs/docs/postgres_for_kubernetes/1/installation_upgrade.mdx index caa8ddcfe62..7aa84bbebb6 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/installation_upgrade.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/installation_upgrade.mdx @@ -23,12 +23,12 @@ The operator can be installed using the provided [Helm chart](https://github.com The operator can be installed like any other resource in Kubernetes, through a YAML manifest applied via `kubectl`. 
-You can install the [latest operator manifest](https://get.enterprisedb.io/cnp/postgresql-operator-1.23.3.yaml) +You can install the [latest operator manifest](https://get.enterprisedb.io/cnp/postgresql-operator-1.24.0.yaml) for this minor release as follows: ```sh kubectl apply --server-side -f \ - https://get.enterprisedb.io/cnp/postgresql-operator-1.23.3.yaml + https://get.enterprisedb.io/cnp/postgresql-operator-1.24.0.yaml ``` You can verify that with: @@ -79,7 +79,7 @@ For example, you can install the latest snapshot of the operator with: ```sh curl -sSfL \ - https://raw.githubusercontent.com/cloudnative-pg/artifacts/release-1.23/manifests/operator-manifest.yaml | \ + https://raw.githubusercontent.com/cloudnative-pg/artifacts/release-1.24/manifests/operator-manifest.yaml | \ kubectl apply --server-side -f - ``` @@ -88,12 +88,13 @@ specific minor release, you can just run: ```sh curl -sSfL \ - https://raw.githubusercontent.com/cloudnative-pg/artifacts/release-1.23/manifests/operator-manifest.yaml | \ + https://raw.githubusercontent.com/cloudnative-pg/artifacts/main/manifests/operator-manifest.yaml | \ kubectl apply --server-side -f - ``` !!! Important - Snapshots are not supported by the EDB Postgres for Kubernetes and not intended for production usage. + Snapshots are not supported by the EDB Postgres for Kubernetes Community, and are not + intended for use in production. ## Details about the deployment @@ -173,10 +174,10 @@ plugin for `kubectl`. an upgrade of the operator will trigger a switchover on your PostgreSQL cluster, causing a (normally negligible) downtime. -Since version 1.10.0, the rolling update behavior can be replaced with in-place -updates of the instance manager. The latter don't require a restart of the -PostgreSQL instance and, as a result, a switchover in the cluster. -This behavior, which is disabled by default, is described below. +The default rolling update behavior can be replaced with in-place updates of +the instance manager. This approach does not require a restart of the +PostgreSQL instance, thereby avoiding a switchover within the cluster. This +feature, which is disabled by default, is described in detail below. ### In-place updates of the instance manager @@ -188,11 +189,11 @@ However, this behavior can be changed via configuration to enable in-place updates of the instance manager, which is the PID 1 process that keeps the container alive. -Internally, any instance manager from version 1.10 of EDB Postgres for Kubernetes -supports injection of a new executable that will replace the existing one, -once the integrity verification phase is completed, as well as graceful -termination of all the internal processes. When the new instance manager -restarts using the new binary, it adopts the already running *postmaster*. +Internally, each instance manager in EDB Postgres for Kubernetes supports the injection of a +new executable that replaces the existing one after successfully completing an +integrity verification phase and gracefully terminating all internal processes. +Upon restarting with the new binary, the instance manager seamlessly adopts the +already running *postmaster*. As a result, the PostgreSQL process is unaffected by the update, refraining from the need to perform a switchover. The other side of the coin, is that @@ -234,19 +235,61 @@ When versions are not directly upgradable, the old version needs to be removed before installing the new one. This won't affect user data but only the operator itself. 
-### Upgrading to 1.23.0, 1.22.3 or 1.21.5 +### Upgrading to 1.24.0 or 1.23.4 !!! Important We encourage all existing users of EDB Postgres for Kubernetes to upgrade to version - 1.23.0 or at least to the latest stable version of the minor release you are - currently using (namely 1.22.2 or 1.21.4). + 1.24.0 or at least to the latest stable version of the minor release you are + currently using (namely 1.23.4). !!! Warning Every time you are upgrading to a higher minor release, make sure you go through the release notes and upgrade instructions of all the intermediate minor releases. For example, if you want to move - from 1.21.x to 1.23, make sure you go through the release notes - and upgrade instructions for 1.22 and 1.23. + from 1.22.x to 1.24, make sure you go through the release notes + and upgrade instructions for 1.23 and 1.24. + +#### From Replica Clusters to Distributed Topology + +One of the key enhancements in EDB Postgres for Kubernetes 1.24.0 is the upgrade of the +replica cluster feature. + +The former replica cluster feature, now referred to as the "Standalone Replica +Cluster," is no longer recommended for Disaster Recovery (DR) and High +Availability (HA) scenarios that span multiple Kubernetes clusters. Standalone +replica clusters are best suited for read-only workloads, such as reporting, +OLAP, or creating development environments with test data. + +For DR and HA purposes, EDB Postgres for Kubernetes now introduces the Distributed Topology +strategy for replica clusters. This new strategy allows you to build PostgreSQL +clusters across private, public, hybrid, and multi-cloud environments, spanning +multiple regions and potentially different cloud providers. It also provides an +API to control the switchover operation, ensuring that only one cluster acts as +the primary at any given time. + +This Distributed Topology strategy enhances resilience and scalability, making +it a robust solution for modern, distributed applications that require high +availability and disaster recovery capabilities across diverse infrastructure +setups. + +You can seamlessly transition from a previous replica cluster configuration to a +distributed topology by modifying all the `Cluster` resources involved in the +distributed PostgreSQL setup. Ensure the following steps are taken: + +- Configure the `externalClusters` section to include all the clusters involved + in the distributed topology. We strongly suggest using the same configuration + across all `Cluster` resources for maintainability and consistency. +- Configure the `primary` and `source` fields in the `.spec.replica` stanza to + reflect the distributed topology. The `primary` field should contain the name + of the current primary cluster in the distributed topology, while the `source` + field should contain the name of the cluster each `Cluster` resource is + replicating from. It is important to note that the `enabled` field, which was + previously set to `true` or `false`, should now be unset (default). + +For more information, please refer to +the ["Distributed Topology" section for replica clusters](replica_cluster.md#distributed-topology). + +### Upgrading to 1.23 from a previous minor version #### User defined replication slots @@ -305,213 +348,3 @@ kubectl apply --server-side --force-conflicts -f Henceforth, `kube-apiserver` will be automatically acknowledged as a recognized manager for the CRDs, eliminating the need for any further manual intervention on this matter. 
- -### Upgrading to 1.22 from a previous minor version - -EDB Postgres for Kubernetes continues to adhere to the security-by-default approach. As of -version 1.22, the usage of the `ALTER SYSTEM` command is now disabled by -default. - -The reason behind this choice is to ensure that, by default, changes to the -PostgreSQL configuration in a database cluster controlled by EDB Postgres for Kubernetes are -allowed only through the Kubernetes API. - -At the same time, we are providing an option to enable `ALTER SYSTEM` if you -need to use it, even temporarily, from versions 1.22.0, 1.21.2, and 1.20.5, -by setting `.spec.postgresql.enableAlterSystem` to `true`, as in the following -excerpt: - -```yaml -... - postgresql: - enableAlterSystem: true -... -``` - -Clusters in 1.22 will have `enableAlterSystem` set to `false` by default. -If you want to retain the existing behavior, in 1.22, you need to explicitly -set `enableAlterSystem` to `true` as shown above. - -In versions 1.21.2 and 1.20.5, and later patch releases in the 1.20 and 1.21 -branches, `enableAlterSystem` will be set to `true` by default, keeping with -the existing behavior. If you don't need to use `ALTER SYSTEM`, we recommend -that you set `enableAlterSystem` explicitly to `false`. - -!!! Important - You can set the desired value for `enableAlterSystem` immediately - following your upgrade to version 1.22.0, 1.21.2, or 1.20.5, as shown in - the example above. - -### Upgrading to 1.21 from a previous minor version - -With the goal to keep improving out-of-the-box the *convention over -configuration* behavior of the operator, EDB Postgres for Kubernetes changes the default -value of several knobs in the following areas: - -- startup and shutdown control of the PostgreSQL instance -- self-healing -- security -- labels - -!!! Warning - Please read carefully the list of changes below, and how to modify the - `Cluster` manifests to retain the existing behavior if you don't want to - disrupt your existing workloads. Alternatively, postpone the upgrade to - until you are sure. In general, we recommend adopting these default - values unless you have valid reasons not to. - -#### Superuser access disabled - -Pushing towards *security-by-default*, EDB Postgres for Kubernetes now disables access -`postgres` superuser access via the network in all new clusters, unless -explicitly enabled. - -If you want to ensure superuser access to the PostgreSQL cluster, regardless -which version of EDB Postgres for Kubernetes you are running, we advise you to explicitly -declare it by setting: - -```yaml -spec: - ... - enableSuperuserAccess: true -``` - -#### Replication slots for HA - -Replication slots for High Availability are enabled by default. - -If you want to ensure replication slots are disabled, regardless of which -version of EDB Postgres for Kubernetes you are running, we advise you to explicitly declare -it by setting: - -```yaml -spec: - ... - replicationSlots: - highAvailability: - enabled: false -``` - -#### Delay for PostgreSQL shutdown - -Up to 1.20.2, [the `stopDelay` parameter](instance_manager.md#shutdown-control) -was set to 30 seconds. Despite the recommendations to change and tune this -value, almost all the cases we have examined during support incidents or -community issues show that this value is left unchanged. - -The [new default value is 1800 seconds](https://github.com/EnterpriseDB/cloud-native-postgres/commit/9f7f18c5b9d9103423a53d180c0e2f2189e71c3c), -the equivalent of 30 minutes. 
- -The new `smartShutdownTimeout` parameter has been introduced to define -the maximum time window within the `stopDelay` value reserved to complete -the `smart` shutdown procedure in PostgreSQL. During this time, the -Postgres server rejects any new connections while waiting for all regular -sessions to terminate. - -Once elapsed, the remaining time up to `stopDelay` will be reserved for -PostgreSQL to complete its duties regarding WAL commitments with both the -archive and the streaming replicas to ensure the cluster doesn't lose any data. - -If you want to retain the old behavior, you need to set explicitly: - -```yaml -spec: - ... - stopDelay: 30 -``` - -And, **after** the upgrade has completed, specify `smartShutdownTimeout`: - -```yaml -spec: - ... - stopDelay: 30 - smartShutdownTimeout: 15 -``` - -#### Delay for PostgreSQL startup - -Up to 1.20.2, [the `startDelay` parameter](instance_manager.md#startup-liveness-and-readiness-probes) -was set to 30 seconds, and EDB Postgres for Kubernetes used this parameter as -`initialDelaySeconds` for the Kubernetes liveness probe. Given that all the -supported Kubernetes releases provide [startup probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-startup-probes), -`startDelay` is now automatically divided into periods of 10 seconds of -duration each. - -!!! Important - In order to add the `startupProbe`, each pod needs to be restarted. - As a result, when you upgrade the operator, a one-time rolling - update of the cluster will be executed even in the online update case. - -Despite the recommendations to change and tune this value, almost all the cases -we have examined during support incidents or community issues show that this -value is left unchanged. Given that this parameter influences the startup of -a PostgreSQL instance, a low value of `startDelay` would cause Postgres -never to reach a consistent recovery state and be restarted indefinitely. - -For this reason, `startDelay` has been [raised by default to 3600 seconds](https://github.com/EnterpriseDB/cloud-native-postgres/commit/4f4cd96bc6f8e284a200705c11a2b41652d58146), -the equivalent of 1 hour. - -If you want to retain the existing behavior using the new implementation, you -can do that by explicitly setting: - -```yaml -spec: - ... - startDelay: 30 -``` - -#### Delay for PostgreSQL switchover - -Up to 1.20.2, [the `switchoverDelay` parameter](instance_manager.md#shutdown-of-the-primary-during-a-switchover) -was set by default to 40000000 seconds (over 15 months) to simulate a very long -interval. - -The [default value has been lowered to 3600 seconds](https://github.com/EnterpriseDB/cloud-native-postgres/commit/9565f9f2ebab8bc648d9c361198479974664c322), -the equivalent of 1 hour. - -If you want to retain the old behavior, you need to set explicitly: - -```yaml -spec: - ... - switchoverDelay: 40000000 -``` - -#### Labels - -In version 1.18, we deprecated the `postgresql` label in pods to identify the -name of the cluster, and replaced it with the more canonical `k8s.enterprisedb.io/cluster` -label. The `postgresql` label is no longer maintained. - -Similarly, from this version, the `role` label is deprecated. The new label -`k8s.enterprisedb.io/instanceRole` is now used, and will entirely replace the `role` label -in a future release. 
- -#### Shortcut for keeping the existing behavior - -If you want to explicitly keep the behavior of EDB Postgres for Kubernetes up to version -1.20.2 (we advise not to), you need to set these values in all your `Cluster` -definitions **before upgrading** to a higher version: - -```yaml -spec: - ... - # Changed in 1.21.0, 1.20.3 and 1.19.5 - startDelay: 30 - stopDelay: 30 - switchoverDelay: 40000000 - # Changed in 1.21.0 only - enableSuperuserAccess: true - replicationSlots: - highAvailability: - enabled: false -``` - -Once the upgrade is completed, also add: - -```yaml -spec: - ... - smartShutdownTimeout: 15 -``` diff --git a/product_docs/docs/postgres_for_kubernetes/1/instance_manager.mdx b/product_docs/docs/postgres_for_kubernetes/1/instance_manager.mdx index 0ea0adac6aa..52fb97fc013 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/instance_manager.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/instance_manager.mdx @@ -101,3 +101,44 @@ the WAL files. By default it is set to `3600` (1 hour). In case of primary pod failure, the cluster will go into failover mode. Please refer to the ["Failover" section](failover.md) for details. + +## Disk Full Failure + +Storage exhaustion is a well known issue for PostgreSQL clusters. +The [PostgreSQL documentation](https://www.postgresql.org/docs/current/disk-full.html) +highlights the possible failure scenarios and the importance of monitoring disk +usage to prevent it from becoming full. + +The same applies to EDB Postgres for Kubernetes and Kubernetes as well: the +["Monitoring" section](monitoring.md#predefined-set-of-metrics) +provides details on checking the disk space used by WAL segments and standard +metrics on disk usage exported to Prometheus. + +!!! Important + In a production system, it is critical to monitor the database + continuously. Exhausted disk storage can lead to a database server shutdown. + +!!! Note + The detection of exhausted storage relies on a storage class that + accurately reports disk size and usage. This may not be the case in simulated + Kubernetes environments like Kind or with test storage class implementations + such as `csi-driver-host-path`. + +If the disk containing the WALs becomes full and no more WAL segments can be +stored, PostgreSQL will stop working. EDB Postgres for Kubernetes correctly detects this issue +by verifying that there is enough space to store the next WAL segment, +and avoids triggering a failover, which could complicate recovery. + +That allows a human administrator to address the root cause. + +In such a case, if supported by the storage class, the quickest course of action +is currently to: + +1. Expand the storage size of the full PVC +2. Increase the size in the `Cluster` resource to the same value + +Once the issue is resolved and there is sufficient free space for WAL segments, +the Pod will restart and the cluster will become healthy. + +See also the ["Volume expansion" section](storage.md#volume-expansion) of the +documentation. diff --git a/product_docs/docs/postgres_for_kubernetes/1/labels_annotations.mdx b/product_docs/docs/postgres_for_kubernetes/1/labels_annotations.mdx index 99badbeaca3..cf7c1cef737 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/labels_annotations.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/labels_annotations.mdx @@ -197,7 +197,7 @@ These predefined annotations are managed by EDB Postgres for Kubernetes. that ensures that the WAL archive is empty before writing data. Use at your own risk. 
-`k8s.enterprisedb.io/skipEmptyWalArchiveCheck` +`k8s.enterprisedb.io/skipWalArchiving` : When set to `true` on a `Cluster` resource, the operator disables WAL archiving. This will set `archive_mode` to `off` and require a restart of all PostgreSQL instances. Use at your own risk. diff --git a/product_docs/docs/postgres_for_kubernetes/1/monitoring.mdx b/product_docs/docs/postgres_for_kubernetes/1/monitoring.mdx index e0dca3758a5..d99b067b85c 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/monitoring.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/monitoring.mdx @@ -12,16 +12,15 @@ originalFilePath: 'src/monitoring.md' ## Monitoring Instances For each PostgreSQL instance, the operator provides an exporter of metrics for -[Prometheus](https://prometheus.io/) via HTTP, on port 9187, named `metrics`. +[Prometheus](https://prometheus.io/) via HTTP or HTTPS, on port 9187, named `metrics`. The operator comes with a [predefined set of metrics](#predefined-set-of-metrics), as well as a highly configurable and customizable system to define additional queries via one or more `ConfigMap` or `Secret` resources (see the ["User defined metrics" section](#user-defined-metrics) below for details). !!! Important - Starting from version 1.11, EDB Postgres for Kubernetes already installs - [by default a set of predefined metrics](#default-set-of-metrics) in - a `ConfigMap` called `default-monitoring`. + EDB Postgres for Kubernetes, by default, installs a set of [predefined metrics](#default-set-of-metrics) + in a `ConfigMap` named `default-monitoring`. !!! Info You can inspect the exported metrics by following the instructions in @@ -62,15 +61,16 @@ by specifying a list of one or more databases in the `target_databases` option. A specific PostgreSQL cluster can be monitored using the [Prometheus Operator's](https://github.com/prometheus-operator/prometheus-operator) resource -[PodMonitor](https://github.com/prometheus-operator/prometheus-operator/blob/v0.47.1/Documentation/api.md#podmonitor). -A PodMonitor correctly pointing to a Cluster can be automatically created by the operator by setting -`.spec.monitoring.enablePodMonitor` to `true` in the Cluster resource itself (default: false). +[PodMonitor](https://github.com/prometheus-operator/prometheus-operator/blob/v0.75.1/Documentation/api.md#podmonitor). + +A `PodMonitor` that correctly points to the Cluster can be automatically created by the operator by setting +`.spec.monitoring.enablePodMonitor` to `true` in the Cluster resource itself (default: `false`). !!! Important Any change to the `PodMonitor` created automatically will be overridden by the Operator at the next reconciliation cycle, in case you need to customize it, you can do so as described below. -To deploy a `PodMonitor` for a specific Cluster manually, you can just define it as follows, changing it as needed: +To deploy a `PodMonitor` for a specific Cluster manually, define it as follows and adjust as needed: ```yaml apiVersion: monitoring.coreos.com/v1 @@ -86,14 +86,58 @@ spec: ``` !!! Important - Make sure you modify the example above with a unique name as well as the - correct cluster's namespace and labels (we are using `cluster-example`). + Ensure you modify the example above with a unique name, as well as the + correct cluster's namespace and labels (e.g., `cluster-example`). !!! Important - Label `postgresql`, used in previous versions of this document, is deprecated - and will be removed in the future. 
Please use the label `k8s.enterprisedb.io/cluster` + The `postgresql` label, used in previous versions of this document, is deprecated + and will be removed in the future. Please use the `k8s.enterprisedb.io/cluster` label instead to select the instances. +### Enabling TLS on the Metrics Port + +To enable TLS communication on the metrics port, configure the `.spec.monitoring.tls.enabled` +setting to `true`. This setup ensures that the metrics exporter uses the same +server certificate used by PostgreSQL to secure communication on port 5432. + +!!! Important + Changing the `.spec.monitoring.tls.enabled` setting will trigger a rolling restart of the Cluster. + +If the `PodMonitor` is managed by the operator (`.spec.monitoring.enablePodMonitor` set to `true`), +it will automatically contain the necessary configurations to access the metrics via TLS. + +To manually deploy a `PodMonitor` suitable for reading metrics via TLS, define it as follows and +adjust as needed: + +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: PodMonitor +metadata: + name: cluster-example +spec: + selector: + matchLabels: + "k8s.enterprisedb.io/cluster": cluster-example + podMetricsEndpoints: + - port: metrics + scheme: https + tlsConfig: + ca: + secret: + name: cluster-example-ca + key: ca.crt + serverName: cluster-example-rw +``` + +!!! Important + Ensure you modify the example above with a unique name, as well as the + correct Cluster's namespace and labels (e.g., `cluster-example`). + +!!! Important + The `serverName` field in the metrics endpoint must match one of the names + defined in the server certificate. If the default certificate is in use, + the `serverName` value should be in the format `-rw`. + ### Predefined set of metrics Every PostgreSQL instance exporter automatically exposes a set of predefined @@ -176,7 +220,7 @@ cnp_collector_up{cluster="cluster-example"} 1 # HELP cnp_collector_postgres_version Postgres version # TYPE cnp_collector_postgres_version gauge -cnp_collector_postgres_version{cluster="cluster-example",full="16.3"} 16.3 +cnp_collector_postgres_version{cluster="cluster-example",full="16.4"} 16.4 # HELP cnp_collector_last_failed_backup_timestamp The last failed backup as a unix timestamp # TYPE cnp_collector_last_failed_backup_timestamp gauge @@ -445,6 +489,29 @@ A list of basic monitoring queries can be found in the [`default-monitoring.yaml` file](../default-monitoring.yaml) that is already installed in your EDB Postgres for Kubernetes deployment (see ["Default set of metrics"](#default-set-of-metrics)). +#### Example of a user defined metric with predicate query + +The `predicate_query` option allows the user to execute the `query` to collect the metrics only under the specified conditions. +To do so the user needs to provide a predicate query that returns at most one row with a single `boolean` column. + +The predicate query is executed in the same transaction as the main query and against the same databases. 
+ +```yaml +some_query: | + predicate_query: | + SELECT + some_bool as predicate + FROM some_table + query: | + SELECT + count(*) as rows + FROM some_table + metrics: + - rows: + usage: "GAUGE" + description: "number of rows" +``` + #### Example of a user defined metric running on multiple databases If the `target_databases` option lists more than one database @@ -551,6 +618,8 @@ Here is a short description of all the available fields: - `target_databases`: a list of databases to run the `query` against, or a [shell-like pattern](#example-of-a-user-defined-metric-running-on-multiple-databases) to enable auto discovery. Overwrites the default database if provided. + - `predicate_query`: a SQL query that returns at most one row and one `boolean` column to run on the target database. + The system evaluates the predicate and if `true` executes the `query`. - `metrics`: section containing a list of all exported columns, defined as follows: - ``: the name of the column returned by the query - `name`: override the `ColumnName` of the column in the metric, if defined @@ -724,6 +793,12 @@ And then run: kubectl exec -ti curl -- curl -s ${POD_IP}:9187/metrics ``` +If you enabled TLS metrics, run instead: + +```shell +kubectl exec -ti curl -- curl -sk https://${POD_IP}:9187/metrics +``` + In case you want to access the metrics of the operator, you need to point to the pod where the operator is running, and use TCP port 8080 as target. diff --git a/product_docs/docs/postgres_for_kubernetes/1/openshift.mdx b/product_docs/docs/postgres_for_kubernetes/1/openshift.mdx index 1828a7bcd23..cedc83f302f 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/openshift.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/openshift.mdx @@ -88,6 +88,41 @@ it would take a full region outage to bring down your cluster. Moreover, you can take advantage of multiple OpenShift clusters in different regions by setting up replica clusters, as previously mentioned. +### Reserving Nodes for PostgreSQL Workloads + +For optimal performance and resource allocation in your PostgreSQL database +operations, it is highly recommended to isolate PostgreSQL workloads by +dedicating specific worker nodes solely to `postgres` in production. This is +particularly crucial whether you're operating in a single availability zone or +a multi-availability zone environment. + +A worker node in OpenShift that is dedicated to running PostgreSQL workloads is +commonly referred to as a **Postgres node** or `postgres` node. + +This dedicated approach ensures that your PostgreSQL workloads are not +competing for resources with other applications, leading to enhanced stability +and performance. + +For further details, please refer to the ["Reserving Nodes for PostgreSQL Workloads" section within the broader "Architecture"](architecture.md#reserving-nodes-for-postgresql-workloads) +documentation. The primary difference when working in OpenShift involves how +labels and taints are applied to the nodes, as described below. + +To label a node as a `postgres` node, execute the following command: + +```sh +oc label node node-role.kubernetes.io/postgres= +``` + +To apply a `postgres` taint to a node, use the following command: + +```sh +oc adm taint node node-role.kubernetes.io/postgres=:NoSchedule +``` + +By correctly labeling and tainting your nodes, you ensure that only PostgreSQL +workloads are scheduled on these dedicated nodes via affinity and tolerations, +reinforcing the stability and performance of your database environment. 
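+
+For reference, the following minimal `Cluster` excerpt shows one way to target
+such nodes with EDB Postgres for Kubernetes, combining a node selector with a
+matching toleration. The cluster name and storage size are purely illustrative:
+
+```yaml
+apiVersion: postgresql.k8s.enterprisedb.io/v1
+kind: Cluster
+metadata:
+  name: cluster-example
+spec:
+  instances: 3
+  storage:
+    size: 1Gi
+  affinity:
+    # Schedule instances only on nodes labeled as postgres nodes
+    nodeSelector:
+      node-role.kubernetes.io/postgres: ""
+    # Tolerate the NoSchedule taint applied to those nodes
+    tolerations:
+      - key: node-role.kubernetes.io/postgres
+        operator: Exists
+        effect: NoSchedule
+```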
+ ## Important OpenShift concepts To understand how the EDB Postgres for Kubernetes operator fits in an OpenShift environment, @@ -996,7 +1031,7 @@ enabled, so you can peek the `cnp_` prefix: ![Prometheus queries](./images/openshift/prometheus-queries.png) It is easy to define Alerts based on the default metrics as `PrometheusRules`. -You can find some examples of rules in the [cnp-prometheusrule.yaml](../samples/monitoring/cnp-prometheusrule.yaml) +You can find some examples of rules in the [prometheusrule.yaml](../samples/monitoring/prometheusrule.yaml) file, which you can download. Before applying the rules, again, some OpenShift setup may be necessary. diff --git a/product_docs/docs/postgres_for_kubernetes/1/operator_capability_levels.mdx b/product_docs/docs/postgres_for_kubernetes/1/operator_capability_levels.mdx index e73ef65fcc1..0ee58532691 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/operator_capability_levels.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/operator_capability_levels.mdx @@ -123,6 +123,20 @@ proposed patch for PostgreSQL, called [failover slots](https://wiki.postgresql.org/wiki/Failover_slots), and also supports user defined physical replication slots on the primary. +### Service Configuration + +By default, EDB Postgres for Kubernetes creates three Kubernetes [services](service_management.md) +for applications to access the cluster via the network: + +- One pointing to the primary for read/write operations. +- One pointing to replicas for read-only queries. +- A generic one pointing to any instance for read operations. + +You can disable the read-only and read services via configuration. +Additionally, you can leverage the service template capability +to create custom service resources, including load balancers, to access +PostgreSQL outside Kubernetes. This is particularly useful for DBaaS purposes. + ### Database configuration The operator is designed to manage a PostgreSQL cluster with a single @@ -430,18 +444,13 @@ label, or a transaction ID. This capability is built on top of the full restore one and supports all the options available in [PostgreSQL for PITR](https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY-TARGET). -### Zero-data-loss clusters through synchronous replication - -Achieve *zero data loss* (RPO=0) in your local high-availability EDB Postgres for Kubernetes -cluster through quorum-based synchronous replication support. The operator provides -two configuration options that control the minimum and maximum number of -expected synchronous standby replicas available at any time. The operator -reacts accordingly, based on the number of available and ready PostgreSQL -instances in the cluster. It uses the following formula for the quorum (`q`): +### Zero-Data-Loss Clusters Through Synchronous Replication -``` -1 <= minSyncReplicas <= q <= maxSyncReplicas <= readyReplicas -``` +Achieve *zero data loss* (RPO=0) in your local high-availability EDB Postgres for Kubernetes +cluster with support for both quorum-based and priority-based synchronous +replication. The operator offers a flexible way to define the number of +expected synchronous standby replicas available at any time, and allows +customization of the `synchronous_standby_names` option as needed. ### Replica clusters @@ -457,7 +466,7 @@ Notably, the source PostgreSQL instance can exist outside the Kubernetes environment, whether in a physical or virtual setting. 
Replica clusters can be instantiated through various methods, including volume -snapshots, a recovery object store (utilizing the Barman Cloud backup format), +snapshots, a recovery object store (using the Barman Cloud backup format), or streaming using `pg_basebackup`. Both WAL file shipping and WAL streaming are supported. The deployment of replica clusters significantly elevates the business continuity posture of PostgreSQL databases within Kubernetes, @@ -465,7 +474,30 @@ extending across multiple data centers and facilitating hybrid and multi-cloud setups. (While anticipating Kubernetes federation native capabilities, manual switchover across data centers remains necessary.) - +Additionally, the flexibility extends to creating delayed replica clusters +intentionally lagging behind the primary cluster. This intentional lag aims to +minimize the Recovery Time Objective (RTO) in the event of unintended errors, +such as incorrect `DELETE` or `UPDATE` SQL operations. + +### Distributed Database Topologies + +Leverage replica clusters to +define [distributed database topologies](replica_cluster.md#distributed-topology) +for PostgreSQL that span across various Kubernetes clusters, facilitating hybrid +and multi-cloud deployments. With EDB Postgres for Kubernetes, you gain powerful capabilities, +including: + +- **Declarative Primary Control**: Easily specify which PostgreSQL cluster acts + as the primary. +- **Seamless Primary Switchover**: Effortlessly demote the current primary and + promote another PostgreSQL cluster, typically located in a different region, + without needing to re-clone the former primary. + +This setup can efficiently operate across two or more regions, can rely entirely +on object stores for replication, and guarantees a maximum RPO (Recovery Point +Objective) of 5 minutes. This advanced feature is uniquely provided by +EDB Postgres for Kubernetes, ensuring robust data integrity and continuity across diverse +environments. ### Tablespace support diff --git a/product_docs/docs/postgres_for_kubernetes/1/operator_conf.mdx b/product_docs/docs/postgres_for_kubernetes/1/operator_conf.mdx index 373b455a194..42f60e726c6 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/operator_conf.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/operator_conf.mdx @@ -65,12 +65,6 @@ The namespace where the operator looks for the `PULL_SECRET_NAME` secret is wher you installed the operator. If the operator is not able to find that secret, it will ignore the configuration parameter. -!!! Warning - Previous versions of the operator copied the `PULL_SECRET_NAME` secret inside - the namespaces where you deploy the PostgreSQL clusters. From version "1.11.0" - the behavior changed to match the previous description. The pull secrets - created by the previous versions of the operator are unused. - ## Defining an operator config map The example below customizes the behavior of the operator, by defining a diff --git a/product_docs/docs/postgres_for_kubernetes/1/pg4k.v1.mdx b/product_docs/docs/postgres_for_kubernetes/1/pg4k.v1.mdx index f6c6f85aa9b..28424fd9fb7 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/pg4k.v1.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/pg4k.v1.mdx @@ -1158,8 +1158,8 @@ option for initdb (default: empty, resulting in PostgreSQL default: 16MB)

[]string -

List of SQL queries to be executed as a superuser immediately -after the cluster has been created - to be used with extreme care +

List of SQL queries to be executed as a superuser in the postgres +database right after the cluster has been created - to be used with extreme care (by default empty)

@@ -1168,7 +1168,7 @@ after the cluster has been created - to be used with extreme care

List of SQL queries to be executed as a superuser in the application -database right after is created - to be used with extreme care +database right after the cluster has been created - to be used with extreme care (by default empty)

@@ -1177,7 +1177,7 @@ database right after is created - to be used with extreme care

List of SQL queries to be executed as a superuser in the template1 -after the cluster has been created - to be used with extreme care +database right after the cluster has been created - to be used with extreme care (by default empty)

@@ -1190,13 +1190,41 @@ instance using logical backup (pg_dump and pg_restore) postInitApplicationSQLRefs
-PostInitApplicationSQLRefs +SQLRefs -

PostInitApplicationSQLRefs points references to ConfigMaps or Secrets which -contain SQL files, the general implementation order to these references is -from all Secrets to all ConfigMaps, and inside Secrets or ConfigMaps, -the implementation order is same as the order of each array +

List of references to ConfigMaps or Secrets containing SQL files +to be executed as a superuser in the application database right after +the cluster has been created. The references are processed in a specific order: +first, all Secrets are processed, followed by all ConfigMaps. +Within each group, the processing order follows the sequence specified +in their respective arrays. +(by default empty)

+ + +postInitTemplateSQLRefs
+SQLRefs + + +

List of references to ConfigMaps or Secrets containing SQL files +to be executed as a superuser in the template1 database right after +the cluster has been created. The references are processed in a specific order: +first, all Secrets are processed, followed by all ConfigMaps. +Within each group, the processing order follows the sequence specified +in their respective arrays. +(by default empty)

+ + +postInitSQLRefs
+SQLRefs + + +

List of references to ConfigMaps or Secrets containing SQL files +to be executed as a superuser in the postgres database right after +the cluster has been created. The references are processed in a specific order: +first, all Secrets are processed, followed by all ConfigMaps. +Within each group, the processing order follows the sequence specified +in their respective arrays. (by default empty)

@@ -1485,6 +1513,31 @@ this can be omitted. +
+ +## ClusterMonitoringTLSConfiguration + +**Appears in:** + +- [MonitoringConfiguration](#postgresql-k8s-enterprisedb-io-v1-MonitoringConfiguration) + +

ClusterMonitoringTLSConfiguration is the type containing the TLS configuration +for the cluster's monitoring

+ + + + + + + + +
FieldDescription
enabled
+bool +
+

Enable TLS for the monitoring endpoint. +Changing this option will force a rollout of all instances.

+
+
## ClusterSpec @@ -2016,6 +2069,14 @@ any plugin to be loaded with the corresponding configuration

during a switchover or a failover

+lastPromotionToken [Required]
+string + + +

LastPromotionToken is the last verified promotion token that +was used to promote a replica cluster

+ + pvcCount
int32 @@ -2264,6 +2325,16 @@ This field is reported when .spec.failoverDelay is populated or dur

SwitchReplicaClusterStatus is the status of the switch to replica cluster

+demotionToken
+string + + +

DemotionToken is a JSON token containing the information +from pg_controldata such as Database system identifier, Latest checkpoint's +TimeLineID, Latest checkpoint's REDO location, Latest checkpoint's REDO +WAL file, and Time of latest checkpoint

+ + @@ -2289,7 +2360,7 @@ This field is reported when .spec.failoverDelay is populated or dur - [MonitoringConfiguration](#postgresql-k8s-enterprisedb-io-v1-MonitoringConfiguration) -- [PostInitApplicationSQLRefs](#postgresql-k8s-enterprisedb-io-v1-PostInitApplicationSQLRefs) +- [SQLRefs](#postgresql-k8s-enterprisedb-io-v1-SQLRefs)

ConfigMapKeySelector contains enough information to let you locate the key of a ConfigMap

@@ -3113,6 +3184,13 @@ by the instance manager

Database roles managed by the Cluster

+services
+ManagedServices + + +

Services managed by the Cluster

+ + @@ -3154,6 +3232,76 @@ with an explanation of the cause

+
+ +## ManagedService + +**Appears in:** + +- [ManagedServices](#postgresql-k8s-enterprisedb-io-v1-ManagedServices) + +

ManagedService represents a specific service managed by the cluster. +It includes the type of service and its associated template specification.

+ + + + + + + + + + + + + + +
FieldDescription
selectorType [Required]
+ServiceSelectorType +
+

SelectorType specifies the type of selectors that the service will have. +Valid values are "rw", "r", and "ro", representing read-write, read, and read-only services.

+
updateStrategy [Required]
+ServiceUpdateStrategy +
+

UpdateStrategy describes how the service differences should be reconciled

+
serviceTemplate [Required]
+ServiceTemplateSpec +
+

ServiceTemplate is the template specification for the service.

+
+ +
+ +## ManagedServices + +**Appears in:** + +- [ManagedConfiguration](#postgresql-k8s-enterprisedb-io-v1-ManagedConfiguration) + +

ManagedServices represents the services managed by the cluster.

+ + + + + + + + + + + +
FieldDescription
disabledDefaultServices
+[]ServiceSelectorType +
+

DisabledDefaultServices is a list of service types that are disabled by default. +Valid values are "r", and "ro", representing read, and read-only services.

+
additional [Required]
+[]ManagedService +
+

Additional is a list of additional managed services specified by the user.

+
+
## Metadata @@ -3174,6 +3322,13 @@ not using the core data types.

+ + + @@ -3241,6 +3396,14 @@ Default: false.

Enable or disable the PodMonitor

+ + + @@ -3555,6 +3718,13 @@ plugin regarding the WAL management

plugin regarding the Backup management

+ + +
FieldDescription
name [Required]
+string +
+

The name of the resource. Only supported for certain types

+
labels
map[string]string
tls
+ClusterMonitoringTLSConfiguration +
+

Configure TLS communication for the metrics endpoint. +Changing tls.enabled option will force a rollout of all instances.

+
podMonitorMetricRelabelings
[]github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1.RelabelConfig
status [Required]
+string +
+

Status contain the status reported by the plugin through the SetStatusInCluster interface

+
@@ -3832,39 +4002,6 @@ Pooler name should never match with any cluster name within the same namespace.<

PoolerType is the type of the connection pool, meaning the service we are targeting. Allowed values are rw and ro.

-
- -## PostInitApplicationSQLRefs - -**Appears in:** - -- [BootstrapInitDB](#postgresql-k8s-enterprisedb-io-v1-BootstrapInitDB) - -

PostInitApplicationSQLRefs points references to ConfigMaps or Secrets which -contain SQL files, the general implementation order to these references is -from all Secrets to all ConfigMaps, and inside Secrets or ConfigMaps, -the implementation order is same as the order of each array

- - - - - - - - - - - -
FieldDescription
secretRefs
-[]SecretKeySelector -
-

SecretRefs holds a list of references to Secrets

-
configMapRefs
-[]ConfigMapKeySelector -
-

ConfigMapRefs holds a list of references to ConfigMaps

-
-
## PostgresConfiguration @@ -3885,6 +4022,13 @@ the implementation order is same as the order of each array

PostgreSQL configuration options (postgresql.conf)

+synchronous
+SynchronousReplicaConfiguration + + +

Configuration of the PostgreSQL synchronous replication feature

+ + pg_hba
[]string @@ -4070,6 +4214,22 @@ cluster

+ + + + + + @@ -4087,6 +4247,25 @@ object store or via streaming through pg_basebackup. Refer to the Replica clusters page of the documentation for more information.

+ + + + + +
FieldDescription
self [Required]
+string +
+

Self defines the name of this cluster. It is used to determine if this is a primary +or a replica cluster, comparing it with primary

+
primary [Required]
+string +
+

Primary defines which Cluster is defined to be the primary in the distributed PostgreSQL cluster, based on the +topology specified in externalClusters

+
source [Required]
string
promotionToken [Required]
+string +
+

A demotion token generated by an external cluster used to +check if the promotion requirements are met.

+
minApplyDelay [Required]
+meta/v1.Duration +
+

When replica mode is enabled, this parameter allows you to replay +transactions only when the system time is at least the configured +time past the commit time. This provides an opportunity to correct +data loss errors. Note that when this parameter is set, a promotion +token cannot be used.

+
@@ -4381,6 +4560,40 @@ files to S3. It can be provided in two alternative ways:

+
+ +## SQLRefs + +**Appears in:** + +- [BootstrapInitDB](#postgresql-k8s-enterprisedb-io-v1-BootstrapInitDB) + +

SQLRefs holds references to ConfigMaps or Secrets +containing SQL files. The references are processed in a specific order: +first, all Secrets are processed, followed by all ConfigMaps. +Within each group, the processing order follows the sequence specified +in their respective arrays.

+ + + + + + + + + + + +
FieldDescription
secretRefs
+[]SecretKeySelector +
+

SecretRefs holds a list of references to Secrets

+
configMapRefs
+[]ConfigMapKeySelector +
+

ConfigMapRefs holds a list of references to ConfigMaps

+
+
## ScheduledBackupSpec @@ -4538,10 +4751,10 @@ Overrides the default settings specified in the cluster '.backup.volumeSnapshot. - [MonitoringConfiguration](#postgresql-k8s-enterprisedb-io-v1-MonitoringConfiguration) -- [PostInitApplicationSQLRefs](#postgresql-k8s-enterprisedb-io-v1-PostInitApplicationSQLRefs) - - [S3Credentials](#postgresql-k8s-enterprisedb-io-v1-S3Credentials) +- [SQLRefs](#postgresql-k8s-enterprisedb-io-v1-SQLRefs) +

SecretKeySelector contains enough information to let you locate the key of a Secret

@@ -4716,12 +4929,29 @@ service account

+
+ +## ServiceSelectorType + +(Alias of `string`) + +**Appears in:** + +- [ManagedService](#postgresql-k8s-enterprisedb-io-v1-ManagedService) + +- [ManagedServices](#postgresql-k8s-enterprisedb-io-v1-ManagedServices) + +

ServiceSelectorType describes a valid value for generating the service selectors. +It indicates which type of service the selector applies to, such as read-write, read, or read-only

+
## ServiceTemplateSpec **Appears in:** +- [ManagedService](#postgresql-k8s-enterprisedb-io-v1-ManagedService) + - [PoolerSpec](#postgresql-k8s-enterprisedb-io-v1-PoolerSpec)

ServiceTemplateSpec is a structure allowing the user to set @@ -4749,6 +4979,18 @@ More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api- +

+ +## ServiceUpdateStrategy + +(Alias of `string`) + +**Appears in:** + +- [ManagedService](#postgresql-k8s-enterprisedb-io-v1-ManagedService) + +

ServiceUpdateStrategy describes how the changes to the managed service should be handled

+
## SnapshotOwnerReference @@ -4919,6 +5161,82 @@ physical replication slots

+
+ +## SynchronousReplicaConfiguration + +**Appears in:** + +- [PostgresConfiguration](#postgresql-k8s-enterprisedb-io-v1-PostgresConfiguration) + +

SynchronousReplicaConfiguration contains the configuration of the +PostgreSQL synchronous replication feature. +Important: at this moment, also .spec.minSyncReplicas and .spec.maxSyncReplicas +need to be considered.

+ + + + + + + + + + + + + + + + + + + + +
FieldDescription
method [Required]
+SynchronousReplicaConfigurationMethod +
+

Method to select synchronous replication standbys from the listed +servers, accepting 'any' (quorum-based synchronous replication) or +'first' (priority-based synchronous replication) as values.

+
number [Required]
+int +
+

Specifies the number of synchronous standby servers that +transactions must wait for responses from.

+
maxStandbyNamesFromCluster
+int +
+

Specifies the maximum number of local cluster pods that can be +automatically included in the synchronous_standby_names option in +PostgreSQL.

+
standbyNamesPre
+[]string +
+

A user-defined list of application names to be added to +synchronous_standby_names before local cluster pods (the order is +only useful for priority-based synchronous replication).

+
standbyNamesPost
+[]string +
+

A user-defined list of application names to be added to +synchronous_standby_names after local cluster pods (the order is +only useful for priority-based synchronous replication).

+
+ +
+ +## SynchronousReplicaConfigurationMethod + +(Alias of `string`) + +**Appears in:** + +- [SynchronousReplicaConfiguration](#postgresql-k8s-enterprisedb-io-v1-SynchronousReplicaConfiguration) + +

SynchronousReplicaConfigurationMethod configures whether to use +quorum based replication or a priority list

+
## TDEConfiguration @@ -5236,5 +5554,39 @@ will be processed one at a time. It accepts a positive integer as a value - with 1 being the minimum accepted value.

+archiveAdditionalCommandArgs [Required]
+[]string + + +

Additional arguments that can be appended to the 'barman-cloud-wal-archive' +command-line invocation. These arguments provide flexibility to customize +the WAL archive process further, according to specific requirements or configurations.

+

Example: +In a scenario where specialized backup options are required, such as setting +a specific timeout or defining custom behavior, users can use this field +to specify additional command arguments.

+

Note: +It's essential to ensure that the provided arguments are valid and supported +by the 'barman-cloud-wal-archive' command, to avoid potential errors or unintended +behavior during execution.

+ + +restoreAdditionalCommandArgs [Required]
+[]string + + +

Additional arguments that can be appended to the 'barman-cloud-wal-restore' +command-line invocation. These arguments provide flexibility to customize +the WAL restore process further, according to specific requirements or configurations.

+

Example: +In a scenario where specialized backup options are required, such as setting +a specific timeout or defining custom behavior, users can use this field +to specify additional command arguments.

+

Note: +It's essential to ensure that the provided arguments are valid and supported +by the 'barman-cloud-wal-restore' command, to avoid potential errors or unintended +behavior during execution.

+ + diff --git a/product_docs/docs/postgres_for_kubernetes/1/preview_version.mdx b/product_docs/docs/postgres_for_kubernetes/1/preview_version.mdx new file mode 100644 index 00000000000..6ebbec0d22b --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/preview_version.mdx @@ -0,0 +1,40 @@ +--- +title: 'Preview Versions' +originalFilePath: 'src/preview_version.md' +--- + +EDB Postgres for Kubernetes candidate releases are pre-release versions made available for +testing before the community issues a new generally available (GA) release. +These versions are feature-frozen, meaning no new features are added, and are +intended for public testing prior to the final release. + +!!! Important + EDB Postgres for Kubernetes release candidates are not intended for use in production + systems. + +## Purpose of Release Candidates + +Release candidates are provided to the community for extensive testing before +the official release. While a release candidate aims to be identical to the +initial release of a new minor version of EDB Postgres for Kubernetes, additional changes may +be implemented before the GA release. + +## Community Involvement + +The stability of each EDB Postgres for Kubernetes minor release significantly depends on the +community's efforts to test the upcoming version with their workloads and +tools. Identifying bugs and regressions through user testing is crucial in +determining when we can finalize the release. + +## Usage Advisory + +The EDB Postgres for Kubernetes Community strongly advises against using preview versions of +EDB Postgres for Kubernetes in production environments or active development projects. Although +EDB Postgres for Kubernetes undergoes extensive automated and manual testing, beta releases +may contain serious bugs. Features in preview versions may change in ways that +are not backwards compatible and could be removed entirely. + +## Current Preview Version + +There are currently no preview versions available. + diff --git a/product_docs/docs/postgres_for_kubernetes/1/quickstart.mdx b/product_docs/docs/postgres_for_kubernetes/1/quickstart.mdx index 369e0cfbef5..aedcc448496 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/quickstart.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/quickstart.mdx @@ -167,7 +167,7 @@ kubectl get pods -l k8s.enterprisedb.io/cluster= !!! Important Note that we are using `k8s.enterprisedb.io/cluster` as the label. In the past you may have seen or used `postgresql`. This label is being deprecated, and - will be dropped in the future. Please use `cngp.io/cluster`. + will be dropped in the future. Please use `k8s.enterprisedb.io/cluster`. By default, the operator will install the latest available minor version of the latest major version of PostgreSQL when the operator was released. diff --git a/product_docs/docs/postgres_for_kubernetes/1/recovery.mdx b/product_docs/docs/postgres_for_kubernetes/1/recovery.mdx index 39b6598642f..e5df505549f 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/recovery.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/recovery.mdx @@ -24,14 +24,7 @@ from the archive. WAL files are pulled from the defined *recovery object store*. -Base backups can be taken either on object stores or using volume snapshots -(from version 1.21). - -!!! Warning - Recovery using volume snapshots had an initial release on 1.20.1. Because of - the amount of progress on the feature for 1.21.0, to use volume - snapshots, we strongly advise you to upgrade to 1.21 or more advanced - releases. 
+Base backups can be taken either on object stores or using volume snapshots. You can achieve recovery from a *recovery object store* in two ways: diff --git a/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_22_6_rel_notes.mdx b/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_22_6_rel_notes.mdx new file mode 100644 index 00000000000..f9ae43f2b07 --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_22_6_rel_notes.mdx @@ -0,0 +1,12 @@ +--- +title: "EDB Postgres for Kubernetes 1.22.6 release notes" +navTitle: "Version 1.22.6" +--- + +Released: 26 Aug 2024 + +This release of EDB Postgres for Kubernetes includes the following: + +| Type | Description | +| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| Upstream merge | Merged with community CloudNativePG 1.22.6. See the community [Release Notes](https://cloudnative-pg.io/documentation/1.22/release_notes/v1.22/). | diff --git a/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_23_4_rel_notes.mdx b/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_23_4_rel_notes.mdx new file mode 100644 index 00000000000..ba93bfd7a3c --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_23_4_rel_notes.mdx @@ -0,0 +1,12 @@ +--- +title: "EDB Postgres for Kubernetes 1.23.4 release notes" +navTitle: "Version 1.23.4" +--- + +Released: 26 Aug 2024 + +This release of EDB Postgres for Kubernetes includes the following: + +| Type | Description | +| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| Upstream merge | Merged with community CloudNativePG 1.23.4. See the community [Release Notes](https://cloudnative-pg.io/documentation/1.23/release_notes/v1.23/). | diff --git a/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_24_0_rel_notes.mdx b/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_24_0_rel_notes.mdx new file mode 100644 index 00000000000..9bfb3d8e23f --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/rel_notes/1_24_0_rel_notes.mdx @@ -0,0 +1,12 @@ +--- +title: "EDB Postgres for Kubernetes 1.24.0 release notes" +navTitle: "Version 1.24.0" +--- + +Released: 26 Aug 2024 + +This release of EDB Postgres for Kubernetes includes the following: + +| Type | Description | +| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| Upstream merge | Merged with community CloudNativePG 1.24.0. See the community [Release Notes](https://cloudnative-pg.io/documentation/1.24/release_notes/v1.24/). 
| diff --git a/product_docs/docs/postgres_for_kubernetes/1/rel_notes/index.mdx b/product_docs/docs/postgres_for_kubernetes/1/rel_notes/index.mdx index 61da24ef703..b5f985471b5 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/rel_notes/index.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/rel_notes/index.mdx @@ -4,10 +4,13 @@ navTitle: "Release notes" redirects: - ../release_notes navigation: +- 1_24_0_rel_notes +- 1_23_4_rel_notes - 1_23_3_rel_notes - 1_23_2_rel_notes - 1_23_1_rel_notes - 1_23_0_rel_notes +- 1_22_6_rel_notes - 1_22_5_rel_notes - 1_22_4_rel_notes - 1_22_3_rel_notes @@ -102,10 +105,13 @@ The EDB Postgres for Kubernetes documentation describes the major version of EDB | Version | Release date | Upstream merges | | -------------------------- | ------------ | ------------------------------------------------------------------------------------------- | +| [1.24.0](1_24_0_rel_notes) | 26 Aug 2024 | Upstream [1.24.0](https://cloudnative-pg.io/documentation/1.24/release_notes/v1.24/) | +| [1.23.4](1_23_4_rel_notes) | 26 Aug 2024 | Upstream [1.23.4](https://cloudnative-pg.io/documentation/1.23/release_notes/v1.23/) | | [1.23.3](1_23_3_rel_notes) | 01 Aug 2024 | Upstream [1.23.3](https://cloudnative-pg.io/documentation/1.23/release_notes/v1.23/) | | [1.23.2](1_23_2_rel_notes) | 13 Jun 2024 | Upstream [1.23.2](https://cloudnative-pg.io/documentation/1.23/release_notes/v1.23/) | | [1.23.1](1_23_1_rel_notes) | 29 Apr 2024 | Upstream [1.23.1](https://cloudnative-pg.io/documentation/1.23/release_notes/v1.23/) | | [1.23.0](1_23_0_rel_notes) | 24 Apr 2024 | Upstream [1.23.0](https://cloudnative-pg.io/documentation/1.23/release_notes/v1.23/) | +| [1.22.6](1_22_6_rel_notes) | 26 Aug 2024 | Upstream [1.22.6](https://cloudnative-pg.io/documentation/1.22/release_notes/v1.22/) | | [1.22.5](1_22_5_rel_notes) | 01 Aug 2024 | Upstream [1.22.5](https://cloudnative-pg.io/documentation/1.22/release_notes/v1.22/) | | [1.22.4](1_22_4_rel_notes) | 13 Jun 2024 | Upstream [1.22.4](https://cloudnative-pg.io/documentation/1.22/release_notes/v1.22/) | | [1.22.3](1_22_3_rel_notes) | 24 Apr 2024 | Upstream [1.22.3](https://cloudnative-pg.io/documentation/1.22/release_notes/v1.22/) | diff --git a/product_docs/docs/postgres_for_kubernetes/1/replica_cluster.mdx b/product_docs/docs/postgres_for_kubernetes/1/replica_cluster.mdx index c55a355ced3..51dd6a5e3b7 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/replica_cluster.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/replica_cluster.mdx @@ -3,103 +3,419 @@ title: 'Replica clusters' originalFilePath: 'src/replica_cluster.md' --- -A replica cluster is an independent EDB Postgres for Kubernetes `Cluster` resource that has -the main characteristic to be in replica from another Postgres instance, -ideally also managed by EDB Postgres for Kubernetes. Normally, a replica cluster is in another -Kubernetes cluster in another region. Replica clusters can be cascading too, -and they can solely rely on object stores for replication of the data from -the source, as described further down. - -The diagram below - taken from the ["Architecture" -section](architecture.md#deployments-across-kubernetes-clusters) containing more -information about this capability - shows just an example of architecture -that you can implement with replica clusters. +A replica cluster is a EDB Postgres for Kubernetes `Cluster` resource designed to +replicate data from another PostgreSQL instance, ideally also managed by +EDB Postgres for Kubernetes. 
+ +Typically, a replica cluster is deployed in a different Kubernetes cluster in +another region. These clusters can be configured to perform cascading +replication and can rely on object stores for data replication from the source, +as detailed further down. + +There are primarily two use cases for replica clusters: + +1. **Disaster Recovery and High Availability**: Enhance disaster recovery and, + to some extent, high availability of a EDB Postgres for Kubernetes cluster across different + Kubernetes clusters, typically located in different regions. In EDB Postgres for Kubernetes + terms, this is known as a ["Distributed Topology"](replica_cluster.md#distributed-topology). + +2. **Read-Only Workloads**: Create standalone replicas of a PostgreSQL cluster + for purposes such as reporting or Online Analytical Processing (OLAP). These + replicas are primarily for read-only workloads. In EDB Postgres for Kubernetes terms, this + is referred to as a ["Standalone Replica Cluster"](replica_cluster.md#standalone-replica-clusters). + +For example, the diagram below — taken from the ["Architecture" section](architecture.md#deployments-across-kubernetes-clusters) +— illustrates a distributed PostgreSQL topology spanning two Kubernetes +clusters, with a symmetric replica cluster primarily serving disaster recovery +purposes. ![An example of multi-cluster deployment with a primary and a replica cluster](./images/multi-cluster.png) -## Basic concepts +## Basic Concepts + +EDB Postgres for Kubernetes builds on the PostgreSQL replication framework, allowing you to +create and synchronize a PostgreSQL cluster from an existing source cluster +using the replica cluster feature — described in this section. The source can +be a primary cluster or another replica cluster (cascading replication). + +### About PostgreSQL Roles + +A replica cluster operates in continuous recovery mode, meaning no changes to +the database, including the catalog and global objects like roles or databases, +are permitted. These changes are deferred until the `Cluster` transitions to +primary. During this phase, global objects such as roles remain as defined in +the source cluster. EDB Postgres for Kubernetes applies any local redefinitions once the +cluster is promoted. + +If you are not planning to promote the cluster (e.g., for read-only workloads) +or if you intend to detach completely from the source cluster +once the replica cluster is promoted, you don't need to take any action. +This is normally the case of the ["Standalone Replica Cluster"](replica_cluster.md#standalone-replica-clusters). + +If you are planning to promote the cluster at some point, EDB Postgres for Kubernetes will +manage the following roles and passwords when transitioning from replica +cluster to primary: + +- the application user +- the superuser (if you are using it) +- any role defined using the [declarative interface](declarative_role_management.md) + +If your intention is to seamlessly ensure that the above roles and passwords +don't change, you need to define the necessary secrets for the above in each +`Cluster`. +This is normally the case of the ["Distributed Topology"](replica_cluster.md#distributed-topology). 
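+
+As a minimal sketch of what this can look like, the following excerpt (with
+purely illustrative secret and role names) pins the password of the `postgres`
+superuser and of a declaratively managed role to pre-created secrets; you would
+replicate these secrets, with identical content, in every Kubernetes cluster
+participating in the topology:
+
+```yaml
+spec:
+  ...
+  # Secret holding the postgres superuser credentials (if superuser access is enabled)
+  superuserSecret:
+    name: shared-superuser-secret
+  managed:
+    roles:
+      - name: app_reporting
+        ensure: present
+        login: true
+        # Secret holding this role's password
+        passwordSecret:
+          name: shared-app-reporting-secret
+```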
+ +### Bootstrapping a Replica Cluster + +The first step is to bootstrap the replica cluster using one of the following +methods: + +- **Streaming replication** via `pg_basebackup` +- **Recovery from a volume snapshot** +- **Recovery from a Barman Cloud backup** in an object store + +For detailed instructions on cloning a PostgreSQL server using `pg_basebackup` +(streaming) or recovery (volume snapshot or object store), refer to the +["Bootstrap" section](bootstrap.md#bootstrap-from-another-cluster). + +### Configuring Replication + +Once the base backup for the replica cluster is available, you need to define +how changes will be replicated from the origin using PostgreSQL continuous +recovery. There are three main options: + +1. **Streaming Replication**: Set up streaming replication between the replica + cluster and the source. This method requires configuring network connections + and implementing appropriate administrative and security measures to ensure + seamless data transfer. +2. **WAL Archive**: Use the WAL (Write-Ahead Logging) archive stored in an + object store. WAL files are regularly transferred from the source cluster to + the object store, from where the `barman-cloud-wal-restore` utility retrieves + them for the replica cluster. +3. **Hybrid Approach**: Combine both streaming replication and WAL archive + methods. PostgreSQL can manage and switch between these two approaches as + needed to ensure data consistency and availability. + +### Defining an External Cluster + +When configuring the external cluster, you have the following options: + +- **`barmanObjectStore` section**: + - Enables use of the WAL archive, with EDB Postgres for Kubernetes automatically setting + the `restore_command` in the designated primary instance. + - Allows bootstrapping the replica cluster from an object store using the + `recovery` section if volume snapshots are not feasible. +- **`connectionParameters` section**: + - Enables bootstrapping the replica cluster via streaming replication using + the `pg_basebackup` section. + - EDB Postgres for Kubernetes automatically sets the `primary_conninfo` option in the + designated primary instance, initiating a WAL receiver process to connect + to the source cluster and receive data. + +### Backup and Symmetric Architectures + +The replica cluster can perform backups to a reserved object store from the +designated primary, supporting symmetric architectures in a distributed +environment. This architectural choice is crucial as it ensures the cluster is +prepared for promotion during a controlled data center switchover or a failover +following an unexpected event. + +### Distributed Architecture Flexibility + +You have the flexibility to design your preferred distributed architecture for +a PostgreSQL database, choosing from: + +- **Private Cloud**: Spanning multiple Kubernetes clusters in different data + centers. +- **Public Cloud**: Spanning multiple Kubernetes clusters in different regions. +- **Hybrid Cloud**: Combining private and public clouds. +- **Multi-Cloud**: Spanning multiple Kubernetes clusters across different + regions and Cloud Service Providers. + +## Setting Up a Replica Cluster + +To set up a replica cluster from a source cluster, follow these steps to create +a cluster YAML file and configure it accordingly: + +1. **Define External Clusters**: + - In the `externalClusters` section, specify the replica cluster. 
+ - For a distributed PostgreSQL topology aimed at disaster recovery (DR) and + high availability (HA), this section should be defined for every + PostgreSQL cluster in the distributed database. + +2. **Bootstrap the Replica Cluster**: + - **Streaming Bootstrap**: Use the `pg_basebackup` section for bootstrapping + via streaming replication. + - **Snapshot/Object Store Bootstrap**: Use the `recovery` section to + bootstrap from a volume snapshot or an object store. + +3. **Continuous Recovery Strategy**: Define this in the `.spec.replica` stanza: + - **Distributed Topology**: Configure using the `primary`, `source`, and + `self` fields along with the distributed topology defined in + `externalClusters`. This allows EDB Postgres for Kubernetes to declaratively control the + demotion of a primary cluster and the subsequent promotion of a replica cluster + using a promotion token. + - **Standalone Replica Cluster**: Enable continuous recovery using the + `enabled` option and set the `source` field to point to an + `externalClusters` name. This configuration is suitable for creating replicas + primarily intended for read-only workloads. + +Both the Distributed Topology and the Standalone Replica Cluster strategies for +continuous recovery are thoroughly explained below. + +## Distributed Topology + +!!! Important + The Distributed Topology strategy was introduced in EDB Postgres for Kubernetes 1.24. + +### Planning for a Distributed PostgreSQL Database + +As Dwight Eisenhower famously said, "Planning is everything", and this holds +true for designing PostgreSQL architectures in Kubernetes. + +First, conceptualize your distributed topology on paper, and then translate it +into a EDB Postgres for Kubernetes API configuration. This configuration primarily involves: + +- The `externalClusters` section, which must be included in every `Cluster` + definition within your distributed PostgreSQL setup. +- The `.spec.replica` stanza, specifically the `primary`, `source`, and + (optionally) `self` fields. + +For example, suppose you want to deploy a PostgreSQL cluster distributed across +two Kubernetes clusters located in Southern Europe and Central Europe. + +In this scenario, assume you have EDB Postgres for Kubernetes installed in the Southern +Europe Kubernetes cluster, with a PostgreSQL `Cluster` named `cluster-eu-south` +acting as the primary. This cluster has continuous backup configured with a +local object store. This object store is also accessible by the PostgreSQL +`Cluster` named `cluster-eu-central`, installed in the Central European +Kubernetes cluster. Initially, `cluster-eu-central` functions as a replica +cluster. Following a symmetric approach, it also has a local object store for +continuous backup, which needs to be read by `cluster-eu-south`. The recovery +in this setup relies solely on WAL shipping, with no streaming connection +between the two clusters. + +Here’s how you would configure the `externalClusters` section for both +`Cluster` resources: -EDB Postgres for Kubernetes relies on the foundations of the PostgreSQL replication -framework even when a PostgreSQL cluster is created from an existing one (source) -and kept synchronized through the -[replica cluster](architecture.md#deployments-across-kubernetes-clusters) feature. The source -can be a primary cluster or another replica cluster (cascading replica cluster). 
+```yaml +# Distributed topology configuration +externalClusters: + - name: cluster-eu-south + barmanObjectStore: + destinationPath: s3://cluster-eu-south/ + # Additional configuration + - name: cluster-eu-central + barmanObjectStore: + destinationPath: s3://cluster-eu-central/ + # Additional configuration +``` -The first step is to bootstrap the replica cluster, choosing among one of the -available methods: +The `.spec.replica` stanza for the `cluster-eu-south` PostgreSQL primary +`Cluster` should be configured as follows: -- streaming replication, via `pg_basebackup` -- recovery from a volume snapshot -- recovery from a Barman Cloud backup in an object store +```yaml +replica: + primary: cluster-eu-south + source: cluster-eu-central +``` -Please refer to the ["Bootstrap" section](bootstrap.md#bootstrap-from-another-cluster) -for information on how to clone a PostgreSQL server using either -`pg_basebackup` (streaming) or `recovery` (volume snapshot or object store). +Meanwhile, the `.spec.replica` stanza for the `cluster-eu-central` PostgreSQL +replica `Cluster` should be configured as: -Once the replica cluster's base backup is available, you need to define how -changes are replicated from the origin, through PostgreSQL continuous recovery. -There are two options: +```yaml +replica: + primary: cluster-eu-south + source: cluster-eu-south +``` -- use streaming replication between the replica cluster and the source - (this will certainly require some administrative and security related - work to be done to make sure that the network connection between the - two clusters are correctly setup) -- use the WAL archive (on an object store) to fetch the WAL files that are - regularly shipped from the source to the object store and pulled by - `barman-cloud-wal-restore` in the replica cluster -- any of the two +In this configuration, when the `primary` field matches the name of the +`Cluster` resource (or `.spec.replica.self` if a different one is used), the +current cluster is considered the primary in the distributed topology. +Otherwise, it is set as a replica from the `source` (in this case, using the +Barman object store). -All you have to do is actually define an external cluster. +This setup allows you to efficiently manage a distributed PostgreSQL +architecture across multiple Kubernetes clusters, ensuring both high +availability and disaster recovery through controlled switchover of a primary +PostgreSQL cluster using declarative configuration. -If the external cluster contains a `barmanObjectStore` section: +Controlled switchover in a distributed topology is a two-step process +involving: -- you'll be able to use the WAL archive, and EDB Postgres for Kubernetes will automatically - set the `restore_command` in the designated primary instance -- you'll be able to bootstrap the replica cluster from an object store - using the `recovery` section, in case you cannot take advantage of - volume snapshots +- Demotion of a primary cluster to a replica cluster +- Promotion of a replica cluster to a primary cluster -If the external cluster contains a `connectionParameters` section: +These processes are described in the next sections. -- you'll be able to bootstrap the replica cluster via streaming replication - using the `pg_basebackup` section -- EDB Postgres for Kubernetes will automatically set the `primary_conninfo` - option in the designated primary instance, so that a WAL receiver - process is started to connect to the source cluster and receive data +!!! 
Important + Before you proceed, ensure you review the ["About PostgreSQL Roles" section](#about-postgresql-roles) + above and use identical role definitions, including secrets, in all + `Cluster` objects participating in the distributed topology. -The created replica cluster can perform backups in a reserved object store from -the designated primary, enabling symmetric architectures in a distributed -fashion. +### Demoting a Primary to a Replica Cluster -You have full flexibility and freedom to decide your favorite -distributed architecture for a PostgreSQL database by choosing: +EDB Postgres for Kubernetes provides the functionality to demote a primary cluster to a +replica cluster. This action is typically planned when transitioning the +primary role from one data center to another. The process involves demoting the +current primary cluster (e.g., `cluster-eu-south`) to a replica cluster and +subsequently promoting the designated replica cluster (e.g., +`cluster-eu-central`) to primary when fully synchronized. -- a private cloud spanning over multiple Kubernetes clusters in different data - centers -- a public cloud spanning over multiple Kubernetes clusters in different - regions -- a mix of the previous two (hybrid) -- a public cloud spanning over multiple Kubernetes clusters in different - regions and on different Cloud Service Providers +Provided you have defined an external cluster in the current primary `Cluster` +resource that points to the replica cluster that's been selected to become the +new primary, all you need to do is change the `primary` field as follows: -## Setting up a replica cluster +```yaml +replica: + primary: cluster-eu-central + source: cluster-eu-central +``` -To set up a replica cluster from a source cluster, we need to create a cluster YAML -file and define the following parts accordingly: +When the primary PostgreSQL cluster is demoted, write operations are no +longer possible. EDB Postgres for Kubernetes then: -- define the `externalClusters` section in the replica cluster -- define the bootstrap part for the replica cluster. We can either bootstrap via - streaming using the `pg_basebackup` section, or bootstrap from a volume snapshot - or an object store using the `recovery` section -- define the continuous recovery part (`.spec.replica`) in the replica cluster. All - we need to do is to enable the replica mode through option `.spec.replica.enabled` - and set the `externalClusters` name in option `.spec.replica.source` +1. Archives the WAL file containing the shutdown checkpoint as a `.partial` + file in the WAL archive. -#### Example using pg_basebackup +2. Generates a `demotionToken` in the status, a base64-encoded JSON structure + containing relevant information from `pg_controldata` such as the system + identifier, the timestamp, timeline ID, REDO location, and REDO WAL file of the + latest checkpoint. + +The first step is necessary to demote/promote using solely the WAL archive to +feed the continuous recovery process (without streaming replication). + +The second step, generation of the `.status.demotionToken`, will ensure a +smooth demotion/promotion process, without any data loss and without rebuilding +the former primary. + +At this stage, the former primary has transitioned to a replica cluster, +awaiting WAL data from the new global primary: `cluster-eu-central`. 
+
+To proceed with promoting the other cluster, you need to retrieve the
+`demotionToken` from `cluster-eu-south` using the following command:
+
+```sh
+kubectl get cluster cluster-eu-south \
+  -o jsonpath='{.status.demotionToken}'
+```
+
+Alternatively, you can obtain the `demotionToken` with the `cnp` plugin by
+checking the cluster's status. The token is listed under the `Demotion token`
+section.
+
+!!! Note
+    The `demotionToken` obtained from `cluster-eu-south` will serve as the
+    `promotionToken` for `cluster-eu-central`.
+
+You can verify the role change using the `cnp` plugin, checking the status of
+the cluster:
+
+```shell
+kubectl cnp status cluster-eu-south
+```
+
+### Promoting a Replica to a Primary Cluster
+
+To promote a PostgreSQL replica cluster (e.g., `cluster-eu-central`) to a
+primary cluster and make the designated primary an actual primary instance,
+you need to perform the following steps simultaneously:
+
+1. Set the `.spec.replica.primary` to the name of the current replica cluster
+   to be promoted (e.g., `cluster-eu-central`).
+2. Set the `.spec.replica.promotionToken` with the value obtained from the
+   former primary cluster (refer to ["Demoting a Primary to a Replica Cluster"](replica_cluster.md#demoting-a-primary-to-a-replica-cluster)).
+
+The updated `replica` section in `cluster-eu-central`'s spec should look like
+this:
+
+```yaml
+replica:
+  primary: cluster-eu-central
+  promotionToken: <demotion token retrieved from cluster-eu-south>
+  source: cluster-eu-south
+```
-
-This **first example** defines a replica cluster using streaming replication in
-both bootstrap and continuous recovery. The replica cluster connects to the
-source cluster using TLS authentication.
+!!! Warning
+    It is crucial to apply the changes to the `primary` and `promotionToken`
+    fields simultaneously. If the promotion token is omitted, a failover will be
+    triggered, necessitating a rebuild of the former primary.
+
+After making these adjustments, EDB Postgres for Kubernetes will initiate the promotion of
+the replica cluster to a primary cluster. Initially, EDB Postgres for Kubernetes will wait
+for the designated primary cluster to replicate all Write-Ahead Logging (WAL)
+information up to the specified Log Sequence Number (LSN) contained in the
+token. Once this target is achieved, the promotion process will commence. The
+new primary cluster will switch timelines, archive the history file and new
+WAL, thereby unblocking the replication process in the `cluster-eu-south`
+cluster, which will then operate as a replica.
+
+To verify the role change, use the `cnp` plugin to check the status of the
+cluster:
+
+```shell
+kubectl cnp status cluster-eu-central
+```
+
+This command will provide you with the current status of `cluster-eu-central`,
+confirming its promotion to primary.
+
+By following these steps, you ensure a smooth and controlled promotion process,
+minimizing disruption and maintaining data integrity across your PostgreSQL
+clusters.
+
+## Standalone Replica Clusters
+
+!!! Important
+    Standalone Replica Clusters were previously known as Replica Clusters
+    before the introduction of the Distributed Topology strategy in EDB Postgres for Kubernetes
+    1.24.
+ +In EDB Postgres for Kubernetes, a Standalone Replica Cluster is a PostgreSQL cluster in +continuous recovery with the following configurations: + +- `.spec.replica.enabled` set to `true` +- A physical replication source defined via the `.spec.replica.source` field, + pointing to an `externalClusters` name + +When `.spec.replica.enabled` is set to `false`, the replica cluster exits +continuous recovery mode and becomes a primary cluster, completely detached +from the original source. + +!!! Warning + Disabling replication is an **irreversible** operation. Once replication is + disabled and the designated primary is promoted to primary, the replica cluster + and the source cluster become two independent clusters definitively. + +!!! Important + Standalone replica clusters are suitable for several use cases, primarily + involving read-only workloads. If you are planning to setup a disaster + recovery solution, look into "Distributed Topology" above. + +### Main Differences with Distributed Topology + +Although Standalone Replica Clusters can be used for disaster recovery +purposes, they differ from the "Distributed Topology" strategy in several key +ways: + +- **Lack of Distributed Database Concept**: Standalone Replica Clusters do not + support the concept of a distributed database, whether in simple forms (two + clusters) or more complex configurations (e.g., three clusters in a circular + topology). +- **No Global Primary Cluster**: There is no notion of a global primary cluster + in Standalone Replica Clusters. +- **No Controlled Switchover**: A Standalone Replica Cluster can only be + promoted to primary. The former primary cluster must be re-cloned, as + controlled switchover is not possible. + +Failover is identical in both strategies, requiring the former primary to be +re-cloned if it ever comes back up. + +### Example of Standalone Replica Cluster using `pg_basebackup` + +This **first example** defines a standalone replica cluster using streaming +replication in both bootstrap and continuous recovery. The replica cluster +connects to the source cluster using TLS authentication. You can check the [sample YAML](../samples/cluster-example-replica-streaming.yaml) in the `samples/` subdirectory. @@ -121,6 +437,7 @@ user are set to the default, `app`. If the PostgreSQL cluster being restored uses different names, you must specify them as documented in [Configure the application database](bootstrap.md#configure-the-application-database). You should also consider copying over the application user secret from the original cluster and keep it synchronized with the source. +See ["About PostgreSQL Roles"](#about-postgresql-roles) for more details. In the `externalClusters` section, remember to use the right namespace for the host in the `connectionParameters` sub-section. @@ -146,7 +463,7 @@ in case the replica cluster is in a separate namespace. key: ca.crt ``` -#### Example using a Backup from an object store +### Example of Standalone Replica Cluster from an object store The **second example** defines a replica cluster that bootstraps from an object store using the `recovery` section and continuous recovery using both streaming @@ -173,6 +490,7 @@ user are set to the default, `app`. If the PostgreSQL cluster being restored uses different names, you must specify them as documented in [Configure the application database](recovery.md#configure-the-application-database). 
You should also consider copying over the application user secret from the original cluster and keep it synchronized with the source. +See ["About PostgreSQL Roles"](#about-postgresql-roles) for more details. In the `externalClusters` section, take care to use the right namespace in the `endpointURL` and the `connectionParameters.host`. @@ -202,7 +520,7 @@ a backup of the source cluster has been created already. clusters, and that all the necessary secrets which hold passwords or certificates are properly created in advance. -#### Example using a Volume Snapshot +### Example using a Volume Snapshot If you use volume snapshots and your storage class provides snapshots cross-cluster availability, you can leverage that to @@ -223,60 +541,70 @@ uses different names, you must specify them as documented in [Configure the application database](recovery.md#configure-the-application-database). You should also consider copying over the application user secret from the original cluster and keep it synchronized with the source. +See ["About PostgreSQL Roles"](#about-postgresql-roles) for more details. -## Demoting a Primary to a Replica Cluster +## Delayed replicas -EDB Postgres for Kubernetes provides the functionality to demote a primary cluster to a -replica cluster. This action is typically planned when transitioning the -primary role from one data center to another. The process involves demoting the -current primary cluster (e.g., cluster-eu-south) to a replica cluster and -subsequently promoting the designated replica cluster (e.g., -`cluster-eu-central`) to primary when fully synchronized. -Provided you have defined an external cluster in the current primary `Cluster` -resource that points to the replica cluster that's been selected to become the -new primary, all you need to do is to enable replica mode and define the source -as follows: +EDB Postgres for Kubernetes supports the creation of **delayed replicas** through the +[`.spec.replica.minApplyDelay` option](pg4k.v1.md#postgresql-k8s-enterprisedb-io-v1-ReplicaClusterConfiguration), +leveraging PostgreSQL's +[`recovery_min_apply_delay`](https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-RECOVERY-MIN-APPLY-DELAY). -```yaml - replica: - enabled: true - source: cluster-eu-central -``` +Delayed replicas are designed to intentionally lag behind the primary database +by a specified amount of time. This delay is configurable using the +`.spec.replica.minApplyDelay` option, which maps to the underlying +`recovery_min_apply_delay` parameter in PostgreSQL. -## Promoting the designated primary in the replica cluster +The primary objective of delayed replicas is to mitigate the impact of +unintended SQL statement executions on the primary database. This is especially +useful in scenarios where operations such as `UPDATE` or `DELETE` are performed +without a proper `WHERE` clause. -To promote a replica cluster (e.g. `cluster-eu-central`) to a primary cluster -and make the designated primary a real primary, all you need to do is to -disable the replica mode in the replica cluster through the option -`.spec.replica.enabled`: +To configure a delay in a replica cluster, adjust the +`.spec.replica.minApplyDelay` option. This parameter determines how much time +the replicas will lag behind the primary. For example: ```yaml - replica: - enabled: false - source: cluster-eu-south + # ... + replica: + enabled: true + source: cluster-example + # Enforce a delay of 8 hours + minApplyDelay: '8h' + # ... 
``` -If you have first demoted the `cluster-eu-south` and waited for -`cluster-eu-central` to be in sync, once `cluster-eu-central` starts as -primary, the `cluster-eu-south` cluster will seamlessly start as a replica -cluster, without the need of re-cloning. +The above example helps safeguard against accidental data modifications by +providing a buffer period of 8 hours to detect and correct issues before they +propagate to the replicas. -If you disable replica mode without prior demotion, the replica cluster and the -source cluster will become two separate clusters. +Monitor and adjust the delay as needed based on your recovery time objectives +and the potential impact of unintended primary database operations. -When replica mode is disabled, the **designated primary** in the replica -cluster will be promoted to be that cluster's **primary**. +The main use cases of delayed replicas can be summarized into: -You can verify the role change using the `cnp` plugin, checking the status of -the cluster which was previously the replica: +1. mitigating human errors: reduce the risk of data corruption or loss + resulting from unintentional SQL operations on the primary database -```shell -kubectl cnp -n status cluster-eu-central -``` +2. recovery time optimization: facilitate quicker recovery from unintended + changes by having a delayed replica that allows you to identify and rectify + issues before changes are applied to other replicas. -!!! Note - Disabling replication is an **irreversible** operation. Once replication is - disabled and the designated primary is promoted to primary, the replica cluster - and the source cluster become two independent clusters definitively. Ensure to - follow the demotion procedure correctly to avoid unintended consequences. +3. enhanced data protection: safeguard critical data by introducing a time + buffer that provides an opportunity to intervene and prevent the propagation of + undesirable changes. + +!!! Warning + The `minApplyDelay` option of delayed replicas cannot be used in + conjunction with `promotionToken`. + +By integrating delayed replicas into your replication strategy, you can enhance +the resilience and data protection capabilities of your PostgreSQL environment. +Adjust the delay duration based on your specific needs and the criticality of +your data. +!!! Important + Always measure your goals. Depending on your environment, it might be more + efficient to rely on volume snapshot-based recovery for faster outcomes. + Evaluate and choose the approach that best aligns with your unique requirements + and infrastructure. diff --git a/product_docs/docs/postgres_for_kubernetes/1/replication.mdx b/product_docs/docs/postgres_for_kubernetes/1/replication.mdx index ff24b536e26..8a683640daa 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/replication.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/replication.mdx @@ -37,12 +37,25 @@ technologies, physical replication was first introduced in PostgreSQL 8.2 (2006) through WAL shipping from the primary to a warm standby in continuous recovery. -PostgreSQL 9.0 (2010) enhanced it with WAL streaming and read-only replicas via -*hot standby*, while 9.1 (2011) introduced synchronous replication at the -transaction level (for RPO=0 clusters). Cascading replication was released with -PostgreSQL 9.2 (2012). 
The foundations of logical replication were laid in -PostgreSQL 9.4, while version 10 (2017) introduced native support for the -publisher/subscriber pattern to replicate data from an origin to a destination. +PostgreSQL 9.0 (2010) introduced WAL streaming and read-only replicas through +*hot standby*. In 2011, PostgreSQL 9.1 brought synchronous replication at the +transaction level, supporting RPO=0 clusters. Cascading replication was added +in PostgreSQL 9.2 (2012). The foundations for logical replication were +established in PostgreSQL 9.4 (2014), and version 10 (2017) introduced native +support for the publisher/subscriber pattern to replicate data from an origin +to a destination. The table below summarizes these milestones. + +| Version | Year | Feature | +| :-----: | :--: | --------------------------------------------------------------------- | +| 8.2 | 2006 | Warm Standby with WAL shipping | +| 9.0 | 2010 | Hot Standby and physical streaming replication | +| 9.1 | 2011 | Synchronous replication (priority-based) | +| 9.2 | 2012 | Cascading replication | +| 9.4 | 2014 | Foundations of logical replication | +| 10 | 2017 | Logical publisher/subscriber and quorum-based synchronous replication | + +This table highlights key PostgreSQL replication features and their respective +versions. ## Streaming replication support @@ -95,7 +108,217 @@ transparently configures replicas to take advantage of `restore_command` when in continuous recovery. As a result, PostgreSQL can use the WAL archive as a fallback option whenever pulling WALs via streaming replication fails. -## Synchronous replication +## Synchronous Replication + +EDB Postgres for Kubernetes supports both +[quorum-based and priority-based synchronous replication for PostgreSQL](https://www.postgresql.org/docs/current/warm-standby.html#SYNCHRONOUS-REPLICATION). + +!!! Warning + Please be aware that synchronous replication will halt your write + operations if the required number of standby nodes to replicate WAL data for + transaction commits is unavailable. In such cases, write operations for your + applications will hang. This behavior differs from the previous implementation + in EDB Postgres for Kubernetes but aligns with the expectations of a PostgreSQL DBA for this + capability. + +While direct configuration of the `synchronous_standby_names` option is +prohibited, EDB Postgres for Kubernetes allows you to customize its content and extend +synchronous replication beyond the `Cluster` resource through the +[`.spec.postgresql.synchronous` stanza](pg4k.v1.md#postgresql-k8s-enterprisedb-io-v1-SynchronousReplicaConfiguration). + +Synchronous replication is disabled by default (the `synchronous` stanza is not +defined). When defined, two options are mandatory: + +- `method`: either `any` (quorum) or `first` (priority) +- `number`: the number of synchronous standby servers that transactions must + wait for responses from + +### Quorum-based Synchronous Replication + +PostgreSQL's quorum-based synchronous replication makes transaction commits +wait until their WAL records are replicated to at least a certain number of +standbys. To use this method, set `method` to `any`. + +#### Migrating from the Deprecated Synchronous Replication Implementation + +This section provides instructions on migrating your existing quorum-based +synchronous replication, defined using the deprecated form, to the new and more +robust capability in EDB Postgres for Kubernetes. 
+ +Suppose you have the following manifest: + +```yaml +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: Cluster +metadata: + name: angus +spec: + instances: 3 + + minSyncReplicas: 1 + maxSyncReplicas: 1 + + storage: + size: 1G +``` + +You can convert it to the new quorum-based format as follows: + +```yaml +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: Cluster +metadata: + name: angus +spec: + instances: 3 + + storage: + size: 1G + + postgresql: + synchronous: + method: any + number: 1 +``` + +!!! Important + The primary difference with the new capability is that PostgreSQL will + always prioritize data durability over high availability. Consequently, if no + replica is available, write operations on the primary will be blocked. However, + this behavior is consistent with the expectations of a PostgreSQL DBA for this + capability. + +### Priority-based Synchronous Replication + +PostgreSQL's priority-based synchronous replication makes transaction commits +wait until their WAL records are replicated to the requested number of +synchronous standbys chosen based on their priorities. Standbys listed earlier +in the `synchronous_standby_names` option are given higher priority and +considered synchronous. If a current synchronous standby disconnects, it is +immediately replaced by the next-highest-priority standby. To use this method, +set `method` to `first`. + +!!! Important + Currently, this method is most useful when extending + synchronous replication beyond the current cluster using the + `maxStandbyNamesFromCluster`, `standbyNamesPre`, and `standbyNamesPost` + options explained below. + +### Controlling `synchronous_standby_names` Content + +By default, EDB Postgres for Kubernetes populates `synchronous_standby_names` with the names +of local pods in a `Cluster` resource, ensuring synchronous replication within +the PostgreSQL cluster. You can customize the content of +`synchronous_standby_names` based on your requirements and replication method +(quorum or priority) using the following optional parameters in the +`.spec.postgresql.synchronous` stanza: + +- `maxStandbyNamesFromCluster`: the maximum number of pod names from the local + `Cluster` object that can be automatically included in the + `synchronous_standby_names` option in PostgreSQL. +- `standbyNamesPre`: a list of standby names (specifically `application_name`) + to be prepended to the list of local pod names automatically listed by the + operator. +- `standbyNamesPost`: a list of standby names (specifically `application_name`) + to be appended to the list of local pod names automatically listed by the + operator. + +!!! Warning + You are responsible for ensuring the correct names in `standbyNamesPre` and + `standbyNamesPost`. EDB Postgres for Kubernetes expects that you manage any standby with an + `application_name` listed here, ensuring their high availability. Incorrect + entries can jeopardize your PostgreSQL database uptime. 
+ +### Examples + +Here are some examples, all based on a `cluster-example` with three instances: + +If you set: + +```yaml +postgresql: + synchronous: + method: any + number: 1 +``` + +The content of `synchronous_standby_names` will be: + +```console +ANY 1 (cluster-example-2, cluster-example-3) +``` + +If you set: + +```yaml +postgresql: + synchronous: + method: any + number: 1 + maxStandbyNamesFromCluster: 1 + standbyNamesPre: + - angus +``` + +The content of `synchronous_standby_names` will be: + +```console +ANY 1 (angus, cluster-example-2) +``` + +If you set: + +```yaml +postgresql: + synchronous: + method: any + number: 1 + maxStandbyNamesFromCluster: 0 + standbyNamesPre: + - angus + - malcolm +``` + +The content of `synchronous_standby_names` will be: + +```console +ANY 1 (angus, malcolm) +``` + +If you set: + +```yaml +postgresql: + synchronous: + method: first + number: 2 + maxStandbyNamesFromCluster: 1 + standbyNamesPre: + - angus + standbyNamesPost: + - malcolm +``` + +The `synchronous_standby_names` option will look like: + +```console +FIRST 2 (angus, cluster-example-2, malcolm) +``` + +## Synchronous Replication (Deprecated) + +!!! Warning + Prior to EDB Postgres for Kubernetes 1.24, only the quorum-based synchronous replication + implementation was supported. Although this method is now deprecated, it will + not be removed anytime soon. + The new method prioritizes data durability over self-healing and offers + more robust features, including priority-based synchronous replication and full + control over the `synchronous_standby_names` option. + It is recommended to gradually migrate to the new configuration method for + synchronous replication, as explained in the previous paragraph. + +!!! Important + The deprecated method and the new method are mutually exclusive. EDB Postgres for Kubernetes supports the configuration of **quorum-based synchronous streaming replication** via two configuration options called `minSyncReplicas` @@ -155,6 +378,11 @@ Postgres pod are. For more information on the general pod affinity and anti-affinity rules, please check the ["Scheduling" section](scheduling.md). +!!! Warning + The `.spec.postgresql.syncReplicaElectionConstraint` option only applies to the + legacy implementation of synchronous replication + (see ["Synchronous Replication (Deprecated)"](replication.md#synchronous-replication-deprecated)). + As an example use-case for this feature: in a cluster with a single sync replica, we would be able to ensure the sync replica will be in a different availability zone from the primary instance, usually identified by the `topology.kubernetes.io/zone` @@ -243,7 +471,7 @@ For details, please refer to the Here follows a brief description of the main options: `.spec.replicationSlots.highAvailability.enabled` -: if `true`, the feature is enabled (`true` is the default since 1.21) +: if `true`, the feature is enabled (`true` is the default) `.spec.replicationSlots.highAvailability.slotPrefix` : the prefix that identifies replication slots managed by the operator diff --git a/product_docs/docs/postgres_for_kubernetes/1/samples.mdx b/product_docs/docs/postgres_for_kubernetes/1/samples.mdx index bf370afd530..52ab68303fb 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/samples.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/samples.mdx @@ -115,6 +115,13 @@ your PostgreSQL cluster. Declares a role with the `managed` stanza. Includes password management with Kubernetes secrets. 
+## Managed services + +**Cluster with managed services** +: [`cluster-example-managed-services.yaml`](../samples/cluster-example-managed-services.yaml): + Declares a service with the `managed` stanza. Includes default service disabled and new + `rw` service template of `LoadBalancer` type defined. + ## Declarative tablespaces **Cluster with declarative tablespaces** diff --git a/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-example-full.yaml b/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-example-full.yaml index ef6de34f5e5..d4b8b6c50ea 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-example-full.yaml +++ b/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-example-full.yaml @@ -35,7 +35,7 @@ metadata: name: cluster-example-full spec: description: "Example of cluster" - imageName: quay.io/enterprisedb/postgresql:16.3 + imageName: quay.io/enterprisedb/postgresql:16.4 # imagePullSecret is only required if the images are located in a private registry # imagePullSecrets: # - name: private_registry_access diff --git a/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-example-managed-services.yaml b/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-example-managed-services.yaml new file mode 100644 index 00000000000..087a5afd864 --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-example-managed-services.yaml @@ -0,0 +1,24 @@ +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: Cluster +metadata: + name: cluster-example-managed-services +spec: + instances: 1 + storage: + size: 1Gi + + managed: + services: + ## disable the default services + disabledDefaultServices: ["ro", "r"] + additional: + - selectorType: rw + serviceTemplate: + metadata: + name: "test-rw" + labels: + test-label: "true" + annotations: + test-annotation: "true" + spec: + type: LoadBalancer diff --git a/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-replica-tls.yaml b/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-replica-tls.yaml index c879dcb2d67..ce5e2654839 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-replica-tls.yaml +++ b/product_docs/docs/postgres_for_kubernetes/1/samples/cluster-replica-tls.yaml @@ -10,7 +10,7 @@ spec: source: cluster-example replica: - enabled: true + primary: cluster-example source: cluster-example storage: diff --git a/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-dc-a.yaml b/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-dc-a.yaml new file mode 100644 index 00000000000..90f66c5c612 --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-dc-a.yaml @@ -0,0 +1,71 @@ +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: Cluster +metadata: + name: cluster-dc-a +spec: + instances: 3 + primaryUpdateStrategy: unsupervised + + storage: + storageClass: csi-hostpath-sc + size: 1Gi + + backup: + barmanObjectStore: + destinationPath: s3://backups/ + endpointURL: http://minio:9000 + s3Credentials: + accessKeyId: + name: minio + key: ACCESS_KEY_ID + secretAccessKey: + name: minio + key: ACCESS_SECRET_KEY + wal: + compression: gzip + + replica: + self: cluster-dc-a + primary: cluster-dc-a + source: cluster-dc-b + + externalClusters: + - name: cluster-dc-a + barmanObjectStore: + serverName: cluster-dc-a + destinationPath: s3://backups/ + endpointURL: http://minio:9000 + s3Credentials: + accessKeyId: + name: minio + key: ACCESS_KEY_ID + secretAccessKey: + name: minio + key: ACCESS_SECRET_KEY + 
wal: + compression: gzip + - name: cluster-dc-b + barmanObjectStore: + serverName: cluster-dc-b + destinationPath: s3://backups/ + endpointURL: http://minio:9000 + s3Credentials: + accessKeyId: + name: minio + key: ACCESS_KEY_ID + secretAccessKey: + name: minio + key: ACCESS_SECRET_KEY + wal: + compression: gzip +--- +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: ScheduledBackup +metadata: + name: cluster-dc-a-backup +spec: + schedule: '0 0 0 * * *' + backupOwnerReference: self + cluster: + name: cluster-dc-a + immediate: true \ No newline at end of file diff --git a/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-dc-b.yaml b/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-dc-b.yaml new file mode 100644 index 00000000000..4523eaffb87 --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-dc-b.yaml @@ -0,0 +1,75 @@ +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: Cluster +metadata: + name: cluster-dc-b +spec: + instances: 3 + primaryUpdateStrategy: unsupervised + + storage: + storageClass: csi-hostpath-sc + size: 1Gi + + backup: + barmanObjectStore: + destinationPath: s3://backups/ + endpointURL: http://minio:9000 + s3Credentials: + accessKeyId: + name: minio + key: ACCESS_KEY_ID + secretAccessKey: + name: minio + key: ACCESS_SECRET_KEY + wal: + compression: gzip + + bootstrap: + recovery: + source: cluster-dc-a + + replica: + self: cluster-dc-b + primary: cluster-dc-a + source: cluster-dc-a + + externalClusters: + - name: cluster-dc-a + barmanObjectStore: + serverName: cluster-dc-a + destinationPath: s3://backups/ + endpointURL: http://minio:9000 + s3Credentials: + accessKeyId: + name: minio + key: ACCESS_KEY_ID + secretAccessKey: + name: minio + key: ACCESS_SECRET_KEY + wal: + compression: gzip + - name: cluster-dc-b + barmanObjectStore: + serverName: cluster-dc-b + destinationPath: s3://backups/ + endpointURL: http://minio:9000 + s3Credentials: + accessKeyId: + name: minio + key: ACCESS_KEY_ID + secretAccessKey: + name: minio + key: ACCESS_SECRET_KEY + wal: + compression: gzip +--- +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: ScheduledBackup +metadata: + name: cluster-dc-b-backup +spec: + schedule: '0 0 0 * * *' + backupOwnerReference: self + cluster: + name: cluster-dc-b + immediate: true \ No newline at end of file diff --git a/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-test.yaml b/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-test.yaml new file mode 100644 index 00000000000..2ceb8e3776f --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/samples/dc/cluster-test.yaml @@ -0,0 +1,25 @@ +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: Cluster +metadata: + name: cluster-test +spec: + instances: 3 + primaryUpdateStrategy: unsupervised + + storage: + storageClass: csi-hostpath-sc + size: 1Gi + + backup: + barmanObjectStore: + destinationPath: s3://backups/ + endpointURL: http://minio:9000 + s3Credentials: + accessKeyId: + name: minio + key: ACCESS_KEY_ID + secretAccessKey: + name: minio + key: ACCESS_SECRET_KEY + wal: + compression: gzip diff --git a/product_docs/docs/postgres_for_kubernetes/1/scheduling.mdx b/product_docs/docs/postgres_for_kubernetes/1/scheduling.mdx index 01e233eb871..9478c1f44b9 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/scheduling.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/scheduling.mdx @@ -21,21 +21,40 @@ section in the definition of the cluster, which supports: - node selectors - tolerations -!!! 
Info - EDB Postgres for Kubernetes does not support pod templates for finer control - on the scheduling of workloads. While they were part of the initial concept, - the development team decided to postpone their introduction in a newer - version of the API (most likely v2 of CNP). +## Pod Affinity and Anti-Affinity -## Pod affinity and anti-affinity +Kubernetes provides mechanisms to control where pods are scheduled using +*affinity* and *anti-affinity* rules. These rules allow you to specify whether +a pod should be scheduled on particular nodes (*affinity*) or avoided on +specific nodes (*anti-affinity*) based on the workloads already running there. +This capability is technically referred to as **inter-pod +affinity/anti-affinity**. -Kubernetes allows you to control which nodes a pod should (*affinity*) or -should not (*anti-affinity*) be scheduled, based on the actual workloads already -running in those nodes. -This is technically known as **inter-pod affinity/anti-affinity**. +By default, EDB Postgres for Kubernetes configures cluster instances to preferably be +scheduled on different nodes, while `pgBouncer` instances might still run on +the same nodes. -EDB Postgres for Kubernetes by default will configure the cluster's instances -preferably on different nodes, resulting in the following `affinity` definition: +For example, given the following `Cluster` specification: + +```yaml +apiVersion: postgresql.k8s.enterprisedb.io/v1 +kind: Cluster +metadata: + name: cluster-example +spec: + instances: 3 + imageName: quay.io/enterprisedb/postgresql:16.4 + + affinity: + enablePodAntiAffinity: true # Default value + topologyKey: kubernetes.io/hostname # Default value + podAntiAffinityType: preferred # Default value + + storage: + size: 1Gi +``` + +The `affinity` configuration applied in the instance pods will be: ```yaml affinity: @@ -44,72 +63,63 @@ affinity: - podAffinityTerm: labelSelector: matchExpressions: - - key: postgresql + - key: k8s.enterprisedb.io/cluster operator: In values: - cluster-example + - key: k8s.enterprisedb.io/podRole + operator: In + values: + - instance topologyKey: kubernetes.io/hostname weight: 100 ``` -As a result of the following Cluster spec: +With this setup, Kubernetes will *prefer* to schedule a 3-node PostgreSQL +cluster across three different nodes, assuming sufficient resources are +available. -```yaml -apiVersion: postgresql.k8s.enterprisedb.io/v1 -kind: Cluster -metadata: - name: cluster-example -spec: - instances: 3 - imageName: quay.io/enterprisedb/postgresql:16.3 +### Requiring Pod Anti-Affinity - affinity: - enablePodAntiAffinity: true #default value - topologyKey: kubernetes.io/hostname #defaul value - podAntiAffinityType: preferred #default value +You can modify the default behavior by adjusting the settings mentioned above. - storage: - size: 1Gi -``` +For example, setting `podAntiAffinityType` to `required` will enforce +`requiredDuringSchedulingIgnoredDuringExecution` instead of +`preferredDuringSchedulingIgnoredDuringExecution`. -Therefore, Kubernetes will *prefer* to schedule a 3-node PostgreSQL cluster over 3 -different nodes - resources permitting. +However, be aware that this strict requirement may cause pods to remain pending +if resources are insufficient—this is particularly relevant when using [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) +for automated horizontal scaling in a Kubernetes cluster. -The aforementioned default behavior can be changed by tweaking the above settings. +!!! 
Seealso "Inter-pod Affinity and Anti-Affinity" + For more details, refer to the [Kubernetes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity). -`podAntiAffinityType` can be set to `required`: resulting in -`requiredDuringSchedulingIgnoredDuringExecution` being used instead of -`preferredDuringSchedulingIgnoredDuringExecution`. Please, be aware that such a -strong requirement might result in pending instances in case resources are not -available (which is an expected condition when using -[Cluster Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) -for automated horizontal scaling of a Kubernetes cluster). +### Topology Considerations -!!! Seealso "Inter-pod affinity and anti-affinity" - More information on this topic is in the - [Kubernetes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity). +In cloud environments, you might consider using `topology.kubernetes.io/zone` +as the `topologyKey` to ensure pods are distributed across different +availability zones rather than just nodes. For more options, see +[Well-Known Labels, Annotations, and Taints](https://kubernetes.io/docs/reference/labels-annotations-taints/). -Another possible value for `topologyKey` in a cloud environment can be -`topology.kubernetes.io/zone`, to be sure pods will be spread across -availability zones and not just nodes. Please refer to -["Well-Known Labels, Annotations and Taints"](https://kubernetes.io/docs/reference/labels-annotations-taints/) -for more options. +### Disabling Anti-Affinity Policies -You can disable the operator's generated anti-affinity policies by setting -`enablePodAntiAffinity` to false. +If needed, you can disable the operator-generated anti-affinity policies by +setting `enablePodAntiAffinity` to `false`. -Additionally, in case a more fine-grained control is needed, you can specify a -list of custom pod affinity or anti-affinity rules via the -`additionalPodAffinity` and `additionalPodAntiAffinity` configuration -attributes. These rules will be added to the ones generated by the operator, -if enabled, or passed transparently otherwise. +### Fine-Grained Control with Custom Rules + +For scenarios requiring more precise control, you can specify custom pod +affinity or anti-affinity rules using the `additionalPodAffinity` and +`additionalPodAntiAffinity` configuration attributes. These custom rules will +be added to those generated by the operator, if enabled, or used directly if +the operator-generated rules are disabled. !!! Note - You have to pass to `additionalPodAntiAffinity` or `additionalPodAffinity` - the whole content of `podAntiAffinity` or `podAffinity` that is expected by the - Pod spec (please look at the following YAML as an example of having only one - instance of PostgreSQL running on every worker node, regardless of which - PostgreSQL cluster they belong to). + When using `additionalPodAntiAffinity` or `additionalPodAffinity`, you must + provide the full `podAntiAffinity` or `podAffinity` structure expected by the + Pod specification. The following YAML example demonstrates how to configure + only one instance of PostgreSQL per worker node, regardless of which PostgreSQL + cluster it belongs to: ```yaml additionalPodAntiAffinity: @@ -148,3 +158,49 @@ for tolerations. !!! 
Seealso "Taints and Tolerations" More information on taints and tolerations can be found in the [Kubernetes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/). + +## Isolating PostgreSQL workloads + +!!! Important + Before proceeding, please ensure you have read the + ["Architecture"](architecture.md) section of the documentation. + +While you can deploy PostgreSQL on Kubernetes in various ways, we recommend +following these essential principles for production environments: + +- **Exploit Availability Zones:** If possible, take advantage of availability + zones (AZs) within the same Kubernetes cluster by distributing PostgreSQL + instances across different AZs. +- **Dedicate Worker Nodes:** Allocate specific worker nodes for PostgreSQL + workloads through the `node-role.kubernetes.io/postgres` label and taint, + as detailed in the [Reserving Nodes for PostgreSQL Workloads](architecture.md#reserving-nodes-for-postgresql-workloads) + section. +- **Avoid Node Overlap:** Ensure that no instances from the same PostgreSQL + cluster are running on the same node. + +As explained in greater detail in the previous sections, EDB Postgres for Kubernetes +provides the flexibility to configure pod anti-affinity, node selectors, and +tolerations. + +Below is a sample configuration to ensure that a PostgreSQL `Cluster` is +deployed on `postgres` nodes, with its instances distributed across different +nodes: + +```yaml + # + affinity: + enablePodAntiAffinity: true + topologyKey: kubernetes.io/hostname + podAntiAffinityType: required + nodeSelector: + node-role.kubernetes.io/postgres: "" + tolerations: + - key: node-role.kubernetes.io/postgres + operator: Exists + effect: NoSchedule + # +``` + +Despite its simplicity, this setup ensures optimal distribution and isolation +of PostgreSQL workloads, leading to enhanced performance and reliability in +your production environment. diff --git a/product_docs/docs/postgres_for_kubernetes/1/security.mdx b/product_docs/docs/postgres_for_kubernetes/1/security.mdx index d37cf838d3e..2e1695419cd 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/security.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/security.mdx @@ -100,11 +100,17 @@ the cluster (PostgreSQL included). ### Role Based Access Control (RBAC) -The operator interacts with the Kubernetes API server with a dedicated service -account called `postgresql-operator-manager`. In Kubernetes this is installed -by default in the `postgresql-operator-system` namespace, with a cluster role -binding between this service account and the `postgresql-operator-manager` -cluster role which defines the set of rules/resources/verbs granted to the operator. +The operator interacts with the Kubernetes API server using a dedicated service +account named `postgresql-operator-manager`. This service account is typically installed in +the operator namespace, commonly `postgresql-operator-system`. However, the namespace may vary +based on the deployment method (see the subsection below). + +In the same namespace, there is a binding between the `postgresql-operator-manager` service +account and a role. The specific name and type of this role (either `Role` or +`ClusterRole`) also depend on the deployment method. This role defines the +necessary permissions required by the operator to function correctly. To learn +more about these roles, you can use the `kubectl describe clusterrole` or +`kubectl describe role` commands, depending on the deployment method. 
 For OpenShift specificities on this matter, please consult the
 ["Red Hat OpenShift" section](openshift.md#predefined-rbac-objects), in particular
 ["Pre-defined RBAC objects" section](openshift.md#predefined-rbac-objects).
@@ -118,7 +124,7 @@ For OpenShift specificities on this matter, please consult the
 
 Below we provide some examples and, most importantly, the reasons why
 EDB Postgres for Kubernetes requires full or partial management of standard Kubernetes
-namespaced resources.
+namespaced or non-namespaced resources.
 
 `configmaps`
 : The operator needs to create and manage default config maps for
@@ -171,11 +177,43 @@ namespaced resources.
   validate them before starting the restore process.
 
 `nodes`
-: The operator needs to get the labels for Affinity and AntiAffinity, so it can
-  decide in which nodes a pod can be scheduled preventing the replicas to be
-  in the same node, specially if nodes are in different availability zones. This
-  permission is also used to determine if a node is schedule or not, avoiding
-  the creation of pods that cannot be created at all.
+: The operator needs to get the labels for Affinity and AntiAffinity so it can
+  decide on which nodes a pod can be scheduled. This is useful, for example, to
+  prevent the replicas from being scheduled on the same node - especially
+  important if nodes are in different availability zones. This
+  permission is also used to determine whether a node is schedulable, preventing
+  the creation of pods on unschedulable nodes, or triggering a switchover if
+  the primary lives on an unschedulable node.
+
+#### Deployments and `ClusterRole` Resources
+
+As mentioned above, each deployment method may have variations in the namespace
+location of the service account, as well as the names and types of role
+bindings and respective roles.
+
+##### Via Kubernetes Manifest
+
+When installing EDB Postgres for Kubernetes using the Kubernetes manifest, permissions are
+set to `ClusterRoleBinding` by default. You can inspect the permissions
+required by the operator by running:
+
+```sh
+kubectl describe clusterrole postgresql-operator-manager
+```
+
+##### Via OLM
+
+From a security perspective, the Operator Lifecycle Manager (OLM) provides a
+more flexible deployment method. It allows you to configure the operator to
+watch either all namespaces or specific namespaces, enabling more granular
+permission management.
+
+!!! Info
+    OLM allows you to deploy the operator in its own namespace and configure it
+    to watch specific namespaces used for EDB Postgres for Kubernetes clusters. This setup helps
+    to contain permissions and restrict access more effectively.
+
+#### Why Are ClusterRole Permissions Needed?
 
 The operator currently requires `ClusterRole` permissions to read `nodes` and
 `ClusterImageCatalog` objects. All other permissions can be namespace-scoped (i.e., `Role`) or
@@ -197,6 +235,7 @@ some resources is correctly updated and to access the config maps and secrets
 that are associated with that Postgres cluster. Such calls are performed
 through a dedicated `ServiceAccount` created by the operator that shares the
 same PostgreSQL `Cluster` resource name.
 
 !!! Important
     The operand can only access a specific and limited subset of resources
@@ -365,28 +404,27 @@ section of the Kubernetes documentation for further information.
EDB Postgres for Kubernetes exposes ports at operator, instance manager and operand levels, as listed in the table below: -| System | Port number | Exposing | Name | Certificates | Authentication | -| :--------------- | :---------- | :------------------ | :--------------- | :----------- | :------------- | -| operator | 9443 | webhook server | `webhook-server` | TLS | Yes | -| operator | 8080 | metrics | `metrics` | no TLS | No | -| instance manager | 9187 | metrics | `metrics` | no TLS | No | -| instance manager | 8000 | status | `status` | no TLS | No | -| operand | 5432 | PostgreSQL instance | `postgresql` | optional TLS | Yes | +| System | Port number | Exposing | Name | TLS | Authentication | +| :--------------- | :---------- | :------------------ | :--------------- | :------- | :------------- | +| operator | 9443 | webhook server | `webhook-server` | Yes | Yes | +| operator | 8080 | metrics | `metrics` | No | No | +| instance manager | 9187 | metrics | `metrics` | Optional | No | +| instance manager | 8000 | status | `status` | Yes | No | +| operand | 5432 | PostgreSQL instance | `postgresql` | Optional | Yes | ### PostgreSQL The current implementation of EDB Postgres for Kubernetes automatically creates -passwords and `.pgpass` files for the the database owner and, only +passwords and `.pgpass` files for the database owner and, only if requested by setting `enableSuperuserAccess` to `true`, for the `postgres` superuser. !!! Warning - Prior to EDB Postgres for Kubernetes 1.21, `enableSuperuserAccess` was set to `true` by - default. This change has been implemented to improve the security-by-default - posture of the operator, fostering a microservice approach where changes to - PostgreSQL are performed in a declarative way through the `spec` of the - `Cluster` resource, while providing developers with full powers inside the - database through the database owner user. + `enableSuperuserAccess` is set to `false` by default to improve the + security-by-default posture of the operator, fostering a microservice approach + where changes to PostgreSQL are performed in a declarative way through the + `spec` of the `Cluster` resource, while providing developers with full powers + inside the database through the database owner user. As far as password encryption is concerned, EDB Postgres for Kubernetes follows the default behavior of PostgreSQL: starting from PostgreSQL 14, diff --git a/product_docs/docs/postgres_for_kubernetes/1/service_management.mdx b/product_docs/docs/postgres_for_kubernetes/1/service_management.mdx new file mode 100644 index 00000000000..4650b19cf04 --- /dev/null +++ b/product_docs/docs/postgres_for_kubernetes/1/service_management.mdx @@ -0,0 +1,135 @@ +--- +title: 'Service Management' +originalFilePath: 'src/service_management.md' +--- + +A PostgreSQL cluster should only be accessed via standard Kubernetes network +services directly managed by EDB Postgres for Kubernetes. For more details, refer to the +["Service" page of the Kubernetes Documentation](https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies). + +EDB Postgres for Kubernetes defines three types of services for each `Cluster` resource: + +- `rw`: Points to the primary instance of the cluster (read/write). +- `ro`: Points to the replicas, where available (read-only). +- `r`: Points to any PostgreSQL instance in the cluster (read). 
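+
+For example, an application running in the same Kubernetes cluster as a
+`Cluster` named `cluster-example` would typically point its connection at the
+`rw` service for read/write workloads. The following hypothetical `Deployment`
+excerpt uses standard libpq environment variables; service naming is described
+below, and credentials would come from the application user secret (not shown):
+
+```yaml
+env:
+  - name: PGHOST
+    value: cluster-example-rw
+  - name: PGPORT
+    value: "5432"
+  - name: PGDATABASE
+    value: app
+  - name: PGUSER
+    value: app
+```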
+
+By default, EDB Postgres for Kubernetes creates all the above services for a `Cluster`
+resource, with the following conventions:
+
+- The name of the service follows this format: `[cluster name]-[service type]`
+  (for example, `cluster-example-rw`).
+- All services are of type `ClusterIP`.
+
+!!! Important
+    Default service names are reserved for EDB Postgres for Kubernetes usage.
+
+While this setup covers most use cases for accessing PostgreSQL within the same
+Kubernetes cluster, EDB Postgres for Kubernetes offers flexibility to:
+
+- Disable the creation of the `ro` and/or `r` default services.
+- Define your own services using the standard `Service` API provided by
+  Kubernetes.
+
+You can mix these two options.
+
+A common scenario arises when using EDB Postgres for Kubernetes in database-as-a-service
+(DBaaS) contexts, where access to the database from outside the Kubernetes
+cluster is required. In such cases, you can create your own service of type
+`LoadBalancer`, if available in your Kubernetes environment.
+
+## Disabling Default Services
+
+You can disable any or all of the `ro` and `r` default services through the
+[`managed.services.disabledDefaultServices` option](pg4k.v1.md#postgresql-k8s-enterprisedb-io-v1-ManagedServices).
+
+!!! Important
+    The `rw` service is essential and cannot be disabled because EDB Postgres for Kubernetes
+    relies on it to ensure PostgreSQL replication.
+
+For example, if you want to remove both the `ro` (read-only) and `r` (read)
+services, you can use this configuration:
+
+```yaml
+# ...
+managed:
+  services:
+    disabledDefaultServices: ["ro", "r"]
+```
+
+## Adding Your Own Services
+
+!!! Important
+    When defining your own services, you cannot use any of the default reserved
+    service names that follow the `[cluster name]-[service type]` convention. It
+    is your responsibility to pick a unique name for the service in the
+    Kubernetes namespace.
+
+You can define a list of additional services through the
+[`managed.services.additional` stanza](pg4k.v1.md#postgresql-k8s-enterprisedb-io-v1-ManagedService)
+by specifying the service type (e.g., `rw`) in the `selectorType` field
+and optionally the `updateStrategy`.
+
+The `serviceTemplate` field gives you access to the standard Kubernetes API for
+the network `Service` resource, allowing you to define both the `metadata` and
+the `spec` sections as you like.
+
+You must provide a `name` to the service and avoid defining the `selector`
+field, as it is managed by the operator.
+
+!!! Warning
+    Service templates give you unlimited possibilities in terms of configuring
+    network access to your PostgreSQL database. This translates into greater
+    responsibility on your end to ensure that services work as expected.
+    EDB Postgres for Kubernetes has no control over the service configuration, except honoring
+    the selector.
+
+The `updateStrategy` field allows you to control how the operator
+updates a service definition. By default, the operator uses the `patch`
+strategy, applying changes directly to the service.
+Alternatively, the `recreate` strategy deletes the existing service and
+recreates it from the template.
+
+!!! Warning
+    The `recreate` strategy will cause a service disruption with every
+    change. However, it may be necessary for modifying certain
+    parameters that can only be set during service creation.
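+
+As a sketch, the following excerpt opts a single additional read-only service
+into the `recreate` strategy by setting the field next to `selectorType` (the
+service name is illustrative and must not clash with the default ones):
+
+```yaml
+# ...
+managed:
+  services:
+    additional:
+      - selectorType: ro
+        updateStrategy: recreate
+        serviceTemplate:
+          metadata:
+            name: "mydb-ro-ext"
+          spec:
+            type: ClusterIP
+```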
+ +For example, if you want to have a single `LoadBalancer` service for your +PostgreSQL database primary, you can use the following excerpt: + +```yaml +# +managed: + services: + additional: + - selectorType: rw + serviceTemplate: + metadata: + name: "mydb-lb" + labels: + test-label: "true" + annotations: + test-annotation: "true" + spec: + type: LoadBalancer +``` + +The above example also shows how to set metadata such as annotations and labels +for the created service. + +### About Exposing Postgres Services + +There are primarily three use cases for exposing your PostgreSQL service +outside your Kubernetes cluster: + +- Temporarily, for testing. +- Permanently, for **DBaaS purposes**. +- Prolonged period/permanently, for **legacy applications** that cannot be + easily or sustainably containerized and need to reside in a virtual machine + or physical machine outside Kubernetes. This use case is very similar to DBaaS. + +Be aware that allowing access to a database from the public network could +expose your database to potential attacks from malicious users. + +!!! Warning + Ensure you secure your database before granting external access, or make + sure your Kubernetes cluster is only reachable from a private network. diff --git a/product_docs/docs/postgres_for_kubernetes/1/ssl_connections.mdx b/product_docs/docs/postgres_for_kubernetes/1/ssl_connections.mdx index 44a4ee9c9b6..b3ce8c2bb0c 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/ssl_connections.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/ssl_connections.mdx @@ -176,7 +176,7 @@ Output: version -------------------------------------------------------------------------------------- ------------------ -PostgreSQL 16.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121 (Red Hat +PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), 64-bit (1 row) ``` diff --git a/product_docs/docs/postgres_for_kubernetes/1/storage.mdx b/product_docs/docs/postgres_for_kubernetes/1/storage.mdx index 17da406805d..f526be3575d 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/storage.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/storage.mdx @@ -470,15 +470,43 @@ To use a pre-provisioned volume in EDB Postgres for Kubernetes: cluster.) Make sure you check for any pods stuck in `Pending` after you deploy the cluster. If the condition persists, investigate why it's happening. -## Block storage considerations (Ceph/ Longhorn) - -Most block storage solutions in Kubernetes recommend having multiple replicas -of a volume to improve resiliency. This works well for workloads that don't -have resiliency built into the application. However, EDB Postgres for Kubernetes has this -resiliency built directly into the Postgres `Cluster` through the number of -instances and the persistent volumes that are attached to them. - -In these cases, it makes sense to define the storage class used by the Postgres -clusters as one replica. By having additional replicas defined in the storage -solution (like Longhorn and Ceph), you might incur what's known as write -amplification, unnecessarily increasing disk I/O and space used. +## Block storage considerations (Ceph/Longhorn) + +Most block storage solutions in Kubernetes, such as Longhorn and Ceph, +recommend having multiple replicas of a volume to enhance resiliency. This +approach works well for workloads that lack built-in resiliency. 
+ +However, EDB Postgres for Kubernetes integrates this resiliency directly into the Postgres +`Cluster` through the number of instances and the persistent volumes attached +to them, as explained in ["Synchronizing the state"](architecture.md#synchronizing-the-state). + +As a result, defining additional replicas at the storage level can lead to +write amplification, unnecessarily increasing disk I/O and space usage. + +For EDB Postgres for Kubernetes usage, consider reducing the number of replicas at the block storage +level to one, while ensuring that no single point of failure (SPoF) exists at +the storage level for the entire `Cluster` resource. This typically means +ensuring that a single storage host—and ultimately, a physical disk—does not +host blocks from different instances of the same `Cluster`, in alignment with +the broader *shared-nothing architecture* principle. + +In Longhorn, you can mitigate this risk by enabling strict-local data locality +when creating a custom storage class. Detailed instructions for creating a +volume with strict-local data locality are available [here](https://longhorn.io/docs/1.7.0/high-availability/data-locality/). +This setting ensures that a pod’s data volume resides on the same node as the +pod itself. + +Additionally, your Postgres `Cluster` should have [pod anti-affinity rules](scheduling.md#isolating-postgresql-workloads) +in place to ensure that the operator deploys pods across different nodes, +allowing Longhorn to place the data volumes on the corresponding hosts. If +needed, you can manually relocate volumes in Longhorn by temporarily setting +the volume replica count to 2, reducing it afterward, and then removing the old +replica. If a host becomes corrupted, you can use the [`cnp` plugin to destroy](kubectl-plugin.md#destroy) +the affected instance. EDB Postgres for Kubernetes will then recreate the instance on another +host and replicate the data. + +In Ceph, this can be configured through CRUSH rules. The documentation for +configuring CRUSH rules is available +[here](https://rook.io/docs/rook/latest-release/CRDs/Cluster/external-cluster/topology-for-external-mode/?h=topology). +These rules aim to ensure one volume per pod per node. You can also relocate +volumes by importing them into a different pool. diff --git a/product_docs/docs/postgres_for_kubernetes/1/troubleshooting.mdx b/product_docs/docs/postgres_for_kubernetes/1/troubleshooting.mdx index c92dddbbfaa..e44d8347a4c 100644 --- a/product_docs/docs/postgres_for_kubernetes/1/troubleshooting.mdx +++ b/product_docs/docs/postgres_for_kubernetes/1/troubleshooting.mdx @@ -221,7 +221,7 @@ Cluster in healthy state Name: cluster-example Namespace: default System ID: 7044925089871458324 -PostgreSQL Image: quay.io/enterprisedb/postgresql:16.3-3 +PostgreSQL Image: quay.io/enterprisedb/postgresql:16.4-3 Primary instance: cluster-example-1 Instances: 3 Ready instances: 3 @@ -297,7 +297,7 @@ kubectl describe cluster -n | grep "Image Name" Output: ```shell - Image Name: quay.io/enterprisedb/postgresql:16.3-3 + Image Name: quay.io/enterprisedb/postgresql:16.4-3 ``` !!! Note @@ -456,11 +456,6 @@ You can list the backups that have been created for a named cluster with: kubectl get backup -l k8s.enterprisedb.io/cluster= ``` -!!! Important - Backup labelling has been introduced in version 1.10.0 of EDB Postgres for Kubernetes. - So only those resources that have been created with that version or - a higher one will contain such a label. 
-
 
 ## Storage information
 
 Sometimes it is useful to double-check the StorageClass used by the cluster to have
@@ -647,14 +642,24 @@ kubectl cp POD:/var/lib/postgresql/data/pgdata/core.14177 core.14177
 
 You now have the file. Make sure you free the space on the server by removing the
 core dumps.
 
-## Some common issues
+## Some known issues
 
 ### Storage is full
 
-If one or more pods in the cluster are in `CrashloopBackoff` and logs
-suggest this could be due to a full disk, you probably have to increase the
-size of the instance's `PersistentVolumeClaim`. Please look at the
-["Volume expansion" section](storage.md#volume-expansion) in the documentation.
+If the storage is full, the PostgreSQL pods will not be able to write new
+data; if the disk containing the WAL segments fills up, PostgreSQL will shut
+down.
+
+If you see messages in the logs about the disk being full, you should increase
+the size of the affected PVC. You can do this by editing the PVC and changing
+the `spec.resources.requests.storage` field. After that, you should also update
+the `Cluster` resource with the new size to apply the same change to all the pods.
+Please look at the ["Volume expansion" section](storage.md#volume-expansion) in the documentation.
+
+If the space for WAL segments is exhausted, the pod will crash-loop and the
+cluster status will report `Not enough disk space`. Increasing the size first
+in the PVC and then in the `Cluster` resource resolves the issue. See also
+the ["Disk Full Failure" section](instance_manager.md#disk-full-failure).
 
 ### Pods are stuck in `Pending` state
 
@@ -713,8 +718,7 @@ Cluster is stuck in "Creating a new replica", while pod logs don't show
 relevant problems.
 This has been found to be related to the next issue
 [on connectivity](#networking-is-impaired-by-installed-network-policies).
 
-From releases 1.20.1, 1.19.3, and 1.18.5, networking issues will be more clearly
-reflected in the status column as follows:
+Networking issues are reflected in the status column as follows:
 
 ```text
 Instance Status Extraction Error: HTTP communication issue
 
diff --git a/product_docs/docs/postgres_for_kubernetes/1/use_cases.mdx b/product_docs/docs/postgres_for_kubernetes/1/use_cases.mdx
index 3b20172c3e4..813300a5cec 100644
--- a/product_docs/docs/postgres_for_kubernetes/1/use_cases.mdx
+++ b/product_docs/docs/postgres_for_kubernetes/1/use_cases.mdx
@@ -42,7 +42,8 @@ Another possible use case is to manage your PostgreSQL database inside
 Kubernetes, while having your applications outside of it (for example in a
 virtualized environment). In this case, PostgreSQL is represented by an IP
 address (or host name) and a TCP port, corresponding to the defined Ingress
-resource in Kubernetes.
+resource in Kubernetes (normally a `LoadBalancer` service type as explained
+in the ["Service Management"](service_management.md) page).
 
 The application can still benefit from a TLS connection to PostgreSQL.
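+
+For instance, assuming the `rw` service is exposed through a `LoadBalancer`
+whose external address is `203.0.113.10` (a placeholder value used purely for
+illustration), the external application's configuration might look like the
+following minimal sketch; stricter TLS modes such as `verify-full` additionally
+require distributing the cluster's CA certificate to the client:
+
+```yaml
+# Hypothetical configuration of an application running outside Kubernetes;
+# all values are placeholders. The host is the external address assigned by
+# the LoadBalancer in front of the "rw" service, and sslmode requests an
+# encrypted connection.
+database:
+  uri: "postgresql://app@203.0.113.10:5432/app?sslmode=require"
+```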