From e970b5ee51cc433a39a024387cc65f23aea20f76 Mon Sep 17 00:00:00 2001 From: drothery-edb Date: Tue, 12 Sep 2023 11:00:51 -0400 Subject: [PATCH 1/8] PGD for Kubernetes --- .../docs/postgres_distributed_for_kubernetes/1/index.mdx | 2 ++ 1 file changed, 2 insertions(+) diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx index c11e9f94e50..c8f6d89d572 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx @@ -102,3 +102,5 @@ please refer to the [API reference](api_reference.md). *[Postgres, PostgreSQL and the Slonik Logo](https://www.postgresql.org/about/policies/trademarks/) are trademarks or registered trademarks of the PostgreSQL Community Association of Canada, and used with their permission.* + + From ca16e14e1bf682a87b8a00a2a5a879a2a01cb84a Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Tue, 12 Sep 2023 16:14:01 -0400 Subject: [PATCH 2/8] First set of edits to pgd for kubernetes doc --- .../1/api_reference.md.in | 8 +- .../1/api_reference.mdx | 8 +- .../1/architecture.mdx | 142 +++++++++--------- .../1/backup.mdx | 90 +++++------ .../1/before_you_start.mdx | 38 ++--- .../1/certificates.mdx | 35 +++-- 6 files changed, 157 insertions(+), 164 deletions(-) diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/api_reference.md.in b/product_docs/docs/postgres_distributed_for_kubernetes/1/api_reference.md.in index 5e70e3ff2ae..7998937baf2 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/api_reference.md.in +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/api_reference.md.in @@ -1,13 +1,11 @@ -# API Reference +# API reference -EDB Postgres Distributed for Kubernetes extends the Kubernetes API defining the -custom resources you find below. +EDB Postgres Distributed for Kubernetes extends the Kubernetes API by defining the +custom resources that follow. All the resources are defined in the `pgd.k8s.enterprisedb.io/v1beta1` API. -Below you will find a description of the defined resources: - {{ range $ -}} diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/api_reference.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/api_reference.mdx index 75bba2b67c3..ae63da33e12 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/api_reference.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/api_reference.mdx @@ -1,16 +1,14 @@ --- -title: 'API Reference' +title: 'API reference' originalFilePath: 'src/api_reference.md' --- -EDB Postgres Distributed for Kubernetes extends the Kubernetes API defining the -custom resources you find below. +EDB Postgres Distributed for Kubernetes extends the Kubernetes API by defining the +custom resources that follow. All the resources are defined in the `pgd.k8s.enterprisedb.io/v1beta1` API. 
-Below you will find a description of the defined resources: - - [Backup](#Backup) diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/architecture.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/architecture.mdx index bbd7e477926..da804ccca23 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/architecture.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/architecture.mdx @@ -3,7 +3,7 @@ title: 'Architecture' originalFilePath: 'src/architecture.md' --- -This section covers the main architectural aspects you need to consider +Consider these main architectural aspects when deploying EDB Postgres Distributed in Kubernetes (PG4K-PGD). PG4K-PGD is a @@ -13,45 +13,47 @@ running in private, public, hybrid, or multi-cloud environments. ## Relationship with EDB Postgres Distributed -[EDB Postgres Distributed (PGD)](https://www.enterprisedb.com/docs/pgd/latest/) +[EDB Postgres Distributed (PGD)](/pgd/latest/) is a multi-master implementation of Postgres designed for high performance and availability. PGD generally requires deployment using -[*Trusted Postgres Architect*, (TPA)](https://www.enterprisedb.com/docs/pgd/latest/tpa/), -a tool that uses [Ansible](https://www.ansible.com) for provisioning and -deployment of PGD clusters. +[Trusted Postgres Architect (TPA)](https://www.enterprisedb.com/docs/pgd/latest/tpa/), +a tool that uses [Ansible](https://www.ansible.com) to provision and +deploy PGD clusters. PG4K-PGD offers a different way of deploying PGD clusters, leveraging containers -and Kubernetes, with the added advantages that the resulting architecture is -self-healing and robust, managed through declarative configuration, and that it -takes advantage of the vast and growing Kubernetes ecosystem. +and Kubernetes. The advantages are that the resulting architecture: + +- Is self-healing and robust. +- Is managed through declarative configuration. +- Takes advantage of the vast and growing Kubernetes ecosystem. ## Relationship with EDB Postgres for Kubernetes -A PGD cluster consists of one or more *PGD Groups*, each having one or more *PGD -Nodes*. A PGD node is a Postgres database. PG4K-PGD internally +A PGD cluster consists of one or more *PGD groups*, each having one or more *PGD +nodes*. A PGD node is a Postgres database. PG4K-PGD internally manages each PGD node using the `Cluster` resource as defined by EDB Postgres -for Kubernetes (PG4K), specifically a `Cluster` with a single instance (i.e. no +for Kubernetes (PG4K), specifically a cluster with a single instance (that is, no replicas). -The single PostgreSQL instance created by each `Cluster` can be configured -declaratively via the +You can configure the single PostgreSQL instance created by each cluster +declaratively using the [`.spec.cnp` section](api_reference.md#CnpConfiguration) -of the PGD Group spec. +of the PGD group spec. In PG4K-PGD, as in PG4K, the underlying database implementation is responsible -for data replication. However, it is important to note that *failover* and -*switchover* work differently, entailing Raft election and the nomination of new -write leaders. PG4K only handles the deployment and healing of data nodes. +for data replication. However, it's important to note that failover and +switchover work differently, entailing Raft election and nominating new +write leaders. PG4K handles only the deployment and healing of data nodes. ## Managing PGD using PG4K-PGD The PG4K-PGD operator can manage the complete lifecycle of PGD clusters. 
As -such, in addition to PGD Nodes (represented as single-instance `Clusters`), it +such, in addition to PGD nodes (represented as single-instance clusters), it needs to manage other objects associated with PGD. PGD relies on the Raft algorithm for distributed consensus to manage node -metadata, specifically agreement on a *write leader*. Consensus among data +metadata, specifically agreement on a write leader. Consensus among data nodes is also required for operations such as generating new global sequences or performing distributed DDL. @@ -59,14 +61,14 @@ These considerations force additional actors in PGD above database nodes. PG4K-PGD manages the following: -- Data nodes: as mentioned previously, a node is a database, and is managed - via PG4K, creating a `Cluster` with a single instance. +- Data nodes: as mentioned previously, a node is a database and is managed + by PG4K, creating a cluster with a single instance. - [Witness nodes](https://www.enterprisedb.com/docs/pgd/latest/nodes/#witness-nodes) - are basic database instances that do not participate in data - replication; their function is to guarantee that consensus is possible in - groups with an even number of data nodes, or after network partitions. Witness + are basic database instances that don't participate in data + replication. Their function is to guarantee that consensus is possible in + groups with an even number of data nodes or after network partitions. Witness nodes are also managed using a single-instance `Cluster` resource. -- [PGD Proxies](https://www.enterprisedb.com/docs/pgd/latest/routing/proxy/): +- [PGD proxies](https://www.enterprisedb.com/docs/pgd/latest/routing/proxy/) act as Postgres proxies with knowledge of the write leader. PGD proxies need information from Raft to route writes to the current write leader. @@ -74,22 +76,22 @@ PG4K-PGD manages the following: PGD groups assume full mesh connectivity of PGD nodes. Each node must be able to connect to every other node, using the appropriate connection string (a -`libpq`-style DSN). Write operations don't need to be sent to every node. PGD -will take care of replicating data after it's committed to one node. +`libpq`-style DSN). Write operations don't need to be sent to every node. PGD +takes care of replicating data after it's committed to one node. -For performance, it is often recommendable to send write operations mostly to a -single node, the *write leader*. Raft is used to identify which node is the -write leader, and to hold metadata about the PGD nodes. PGD Proxies are used to -transparently route writes to write leaders, and to quickly pivot to the new +For performance, we often recommend sending write operations mostly to a +single node: the write leader. Raft identifies the node that's the +write leader and holds metadata about the PGD nodes. PGD proxies +transparently route writes to write leaders and can quickly pivot to the new write leader in case of switchover or failover. -It is possible to configure *Raft subgroups*, each of which can maintain a -separate write leader. In PG4K-PGD, a PGD Group containing a PGD Proxy -automatically comprises a Raft subgroup. +It's possible to configure *Raft subgroups*, each of which can maintain a +separate write leader. In PG4K-PGD, a PGD group containing a PGD proxy +comprises a Raft subgroup. 
-There are two kinds of routing available with PGD Proxies: +Two kinds of routing are available with PGD proxies: -- Global routing uses the top-level Raft group, and maintains one global write +- Global routing uses the top-level Raft group and maintains one global write leader. - Local routing uses subgroups to maintain separate write leaders. Local routing is often used to achieve geographical separation of writes. @@ -97,19 +99,20 @@ There are two kinds of routing available with PGD Proxies: In PG4K-PGD, local routing is used by default, and a configuration option is available to select global routing. -You can find more information in the -[PGD documentation of routing with Raft](https://www.enterprisedb.com/docs/pgd/latest/routing/raft/). +For more information, see +[Proxies, Raft, and Raft subgroups](/pgd/latest/routing/raft/) in the PGD documentation. -### PGD Architectures and High Availability +### PGD architectures and high availability -EDB proposes several recommended architectures to make good use of PGD's -distributed multi-master capabilities and to offer high availability. +To make good use of PGD's +distributed multi-master capabilities and to offer high availability, +we recommend several architectures . The Always On architectures are built from either one group in a single location or two groups in two separate locations. -Please refer to the -[PGD architecture document](https://www.enterprisedb.com/docs/pgd/latest/architectures/) -for further information. +See +[Choosing your architecture](/pgd/latest/architectures/) in the PGD documentation +for more information. ## Deploying PGD on Kubernetes @@ -118,35 +121,30 @@ adaptations are necessary to translate PGD into the Kubernetes ecosystem. ### Images and operands -PGD can be configured to run one of three Postgres distributions. Please refer -to the -[PGD documentation](https://www.enterprisedb.com/docs/pgd/latest/choosing_server/) -to understand the features of each distribution. +PGD can be configured to run one of three Postgres distributions. See +[Choosing a Postgres distribution](/pgd/latest/choosing_server/) +in the PGD documentation to understand the features of each distribution. To function in Kubernetes, containers are provided for each Postgres distribution. These are the *operands*. In addition, the operator images are kept in those same repositories. -Please refer to [the document on registries](private_registries.md) +See [EDB private image registries](private_registries.md) for details on accessing the images. ### Kubernetes architecture -We reproduce some of the points of the -[PG4K document on Kubernetes architecture](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/architecture/), -to which we refer you for further depth. - Kubernetes natively provides the possibility to span separate physical locations -–also known as data centers, failure zones, or more frequently **availability -zones**– connected to each other via redundant, low-latency, private network -connectivity. +connected to each other by way of redundant, low-latency, private network +connectivity. These physical locations are also known as data centers, failure zones, or, +more frequently, *availability zones*. Being a distributed system, the recommended minimum number of availability zones -for a **Kubernetes cluster** is three (3), in order to make the control plane +for a Kubernetes cluster is three to make the control plane resilient to the failure of a single zone. 
This means that each data center is active at any time and can run workloads simultaneously. -PG4K-PGD can be installed within a +PG4K-PGD can be installed in a [single Kubernetes cluster](#single-kubernetes-cluster) or across [multiple Kubernetes clusters](#multiple-kubernetes-clusters). @@ -154,31 +152,31 @@ or across ### Single Kubernetes cluster A multi-availability-zone Kubernetes architecture is typical of Kubernetes -services managed by Cloud Providers. Such an architecture enables the PG4K-PGD +services managed by cloud providers. Such an architecture enables the PG4K-PGD and the PG4K operators to schedule workloads and nodes across availability -zones, considering all zones active: +zones, considering all zones active. ![Kubernetes cluster spanning over 3 independent data centers](./images/k8s-architecture-3-az.png) PGD clusters can be deployed in a single Kubernetes cluster and take advantage -of Kubernetes availability zones to enable High Availability architectures, +of Kubernetes availability zones to enable high-availability architectures, including the Always On recommended architectures. -The *Always On Single Location* architecture shown in the -[PGD Architecture document](https://www.enterprisedb.com/docs/pgd/latest/architectures/): -![Always On Single Region](./images/always_on_1x3_updated.png) +You can realize the Always On single location architecture shown in +[Choosing your architecture](/pgd/latest/architectures/) in the PGD documentation on +a single Kubernetes cluster with three availability zones. -can be realized on single kubernetes cluster with 3 availability zones. +![Always On Single Region](./images/always_on_1x3_updated.png) -The PG4K-PGD operator can control the *scheduling* of pods (i.e. which pods go -to which data center) using affinity, tolerations and node selectors, as is the +The PG4K-PGD operator can control the scheduling of pods (that is, which pods go +to which data center) using affinity, tolerations, and node selectors, as is the case with PG4K. Individual scheduling controls are available for proxies as well as nodes. -Please refer to the +See the [Kubernetes documentation on scheduling](https://kubernetes.io/docs/concepts/scheduling-eviction/), -as well as the [PG4K documents](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/scheduling/) -for further information. +and [Scheduling](/postgres_for_kubernetes/latest/scheduling/) in the PG4K documentation +for more information. ### Multiple Kubernetes clusters @@ -187,7 +185,7 @@ reliably communicate with each other. ![Multiple Kubernetes clusters](./images/k8s-architecture-multi.png) -[Always On multi-location PGD architectures](https://www.enterprisedb.com/docs/pgd/latest/architectures/) +[Always On multi-location PGD architectures](/pgd/latest/architectures/) can be realized on multiple Kubernetes clusters that meet the connectivity requirements. -More information can be found in the ["Connectivity"](connectivity.md) section. \ No newline at end of file +For more information, see [Connectivity](connectivity.md). 
\ No newline at end of file diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/backup.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/backup.mdx index be4b57ebf4b..11f1d005918 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/backup.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/backup.mdx @@ -6,31 +6,31 @@ originalFilePath: 'src/backup.md' EDB Postgres Distributed for Kubernetes (PG4K-PGD) supports *online/hot backup* of PGD clusters through physical backup and WAL archiving on an object store. This means that the database is always up (no downtime required) and that -Point In Time Recovery is available. +point-in-time recovery is available. ## Common object stores -Multiple object store are supported, such as `AWS S3`, `Microsoft Azure Blob Storage`, -`Google Cloud Storage`, `MinIO Gateway`, or any S3 compatible provider. +Multiple object stores are supported, such as AWS S3, Microsoft Azure Blob Storage, +Google Cloud Storage, MinIO Gateway, or any S3-compatible provider. Given that PG4K-PGD configures the connection with object stores by relying on -EDB Postgres for Kubernetes (PG4K), please refer to the [PG4K Cloud provider support](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#cloud-provider-support) -documentation for additional depth. +EDB Postgres for Kubernetes (PG4K), see the [PG4K cloud provider support](/postgres_for_kubernetes/latest/backup_recovery/#cloud-provider-support) +documentation for more information. !!! Important - In the PG4K documentation you'll find the Cloud Provider configuration section - available at `spec.backup.barmanObjectStore`. Note that in PG4K-PGD examples, the object store section is found at a + The PG4K documentation's Cloud Provider configuration section is + available at `spec.backup.barmanObjectStore`. In PG4K-PGD examples, the object store section is at a different path: `spec.backup.configuration.barmanObjectStore`. ## WAL archive -WAL archiving is the process that sends `WAL files` to the object storage, and it's essential to -execute *online/hot backups*, or Point in Time recovery (PITR). -In PG4K-PGD, each PGD Node will be set up to archive WAL files in the object store independently. +WAL archiving is the process that sends WAL files to the object storage, and it's essential to +execute online/hot backups or point-in-time recovery (PITR). +In PG4K-PGD, each PGD node is set up to archive WAL files in the object store independently. -The WAL archive is defined in the PGDGroup `spec.backup.configuration.barmanObjectStore` stanza, +The WAL archive is defined in the PGD group `spec.backup.configuration.barmanObjectStore` stanza and is enabled as soon as a destination path and cloud credentials are set. -You can choose to compress WAL files before they are uploaded, and/or encrypt them. -Parallel WAL archiving can also be enabled. +You can choose to compress WAL files before uploading them, encrypt them, or both. +In addition, you can enable parallel WAL archiving. ```yaml apiVersion: pgd.k8s.enterprisedb.io/v1beta1 kind: PGDGroup [...] spec: [...] backup: configuration: barmanObjectStore: [...] wal: compression: gzip encryption: AES256 maxParallel: 8 ``` -For further information, refer to the [PG4K WAL archiving](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#wal-archiving) documentation. +For more information, see the [PG4K WAL archiving](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#wal-archiving) documentation.
## Scheduled backups Scheduled backups are the recommended way to configure your backup strategy in PG4K-PGD. -When the PGDGroup `spec.backup.configuration.barmanObjectStore` stanza is configured, the operator will select one of the -PGD data nodes as the elected "Backup Node", for which it will automatically create a `Scheduled Backup` resource. +When the PGD group `spec.backup.configuration.barmanObjectStore` stanza is configured, the operator selects one of the +PGD data nodes as the elected backup node for which it automatically creates a `Scheduled Backup` resource. The `.spec.backup.cron.schedule` field allows you to define a cron schedule specification, expressed -in the [https://pkg.go.dev/github.com/robfig/cron#hdr-CRON_Expression_Format]\(Go `cron` package format). +in the [Go `cron` package format](https://pkg.go.dev/github.com/robfig/cron#hdr-CRON_Expression_Format). ```yaml apiVersion: pgd.k8s.enterprisedb.io/v1beta1 @@ -71,32 +71,32 @@ spec: immediate: true ``` -Scheduled Backups can be suspended if necessary by setting `.spec.backup.cron.suspend` to true. This will -prevent any new backup from being scheduled while the option is set to true. +You can suspend scheduled backups by setting `.spec.backup.cron.suspend` to `true`. This setting +prevents any new backup from being scheduled. -In case you want to execute a backup as soon as the ScheduledBackup resource is created -you can set `.spec.backup.cron.immediate` to true. +If you want to execute a backup as soon as the `ScheduledBackup` resource is created, +you can set `.spec.backup.cron.immediate` to `true`. -`.spec.backupOwnerReference` indicates which ownerReference should be used +`.spec.backupOwnerReference` indicates the `ownerReference` to use in the created backup resources. The choices are: -- *none:* no owner reference for created backup objects -- *self:* sets the Scheduled backup object as owner of the backup -- *cluster:* sets the cluster as owner of the backup +- `none` — No owner reference for created backup objects. +- `self` — Sets the scheduled backup object as owner of the backup. +- `cluster` — Sets the cluster as owner of the backup. !!! Note - The `PG4K` ScheduledBackup object contains an additional option named `cluster` to specify the - Cluster to be backed up. This option is currently not supported by `PG4K-PGD`, and will be + The PG4K `ScheduledBackup` object contains the `cluster` option to specify the + cluster to back up. This option is currently not supported by PG4K-PGD and is ignored if specified. -In case an elected "Backup node" is deleted, the operator will transparently elect a new "Backup Node" -and reconcile the Scheduled Backup resource accordingly. +If an elected backup node is deleted, the operator transparently elects a new backup node +and reconciles the `Scheduled Backup` resource accordingly. ## Retention policies PG4K-PGD can manage the automated deletion of backup files from the backup -object store, using **retention policies** based on the recovery window. -This process will also take care of removing unused WAL files and WALs associated with backups +object store using retention policies based on the recovery window. +This process also takes care of removing unused WAL files and WALs associated with backups that are scheduled for deletion. 
You can define your backups with a retention policy of 30 days as follows: @@ -111,34 +111,34 @@ spec: retentionPolicy: "30d" ``` -For further information, refer to the [PG4K Retention policies](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#retention-policies) documentation. +For more information, see the [PG4K retention policies](/postgres_for_kubernetes/latest/backup_recovery/#retention-policies) documentation. !!! Important - Currently, the retention policy will only be applied for the elected "Backup Node" - backups and WAL files. Given that each other PGD node also archives its own WALs - independently, it is your responsibility to manage the lifecycle of those WAL files, - for example by leveraging the object storage data retention policy. - Also, in case you have an object storage data retention policy set up on every PGD Node + Currently, the retention policy is applied only for the elected backup node + backups and WAL files. Given that every other PGD node also archives its own WALs + independently, it's your responsibility to manage the lifecycle of those WAL files, + for example by leveraging the object storage data-retention policy. + Also, in case you have an object storage data retention policy set up on every PGD node directory, make sure it's not overlapping or interfering with the retention policy managed by the operator. ## Compression algorithms Backups and WAL files are uncompressed by default. However, multiple compression algorithms are -supported. For more information, refer to the [PG4K Compression algorithms](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#compression-algorithms) documentation. +supported. For more information, see the [PG4K compression algorithms](/postgres_for_kubernetes/latest/backup_recovery/#compression-algorithms) documentation. ## Tagging of backup objects -It's possible to specify tags as key-value pairs for the backup objects, namely base backups, WAL files and history files. -For more information, refer to the [PG4K document on Tagging of backup objects](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#tagging-of-backup-objects). +It's possible to specify tags as key-value pairs for the backup objects, namely base backups, WAL files, and history files. +For more information, see the PG4K documentation about [tagging of backup objects](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#tagging-of-backup-objects). -## On-demand backups of a PGD Node +## On-demand backups of a PGD node -A PGD Node is represented as single-instance PG4K `Cluster` object. +A PGD node is represented as single-instance PG4K `Cluster` object. As such, in case of need, it's possible to request an on-demand backup -of a specific PGD Node by creating a PG4K `Backup` resource. -In order to do that, you can directly refer to the [PG4K On-demand backups](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#on-demand-backups) documentation. +of a specific PGD node by creating a PG4K `Backup` resource. +To do that, see the [PG4K on-demand backups](/postgres_for_kubernetes/latest/backup_recovery/#on-demand-backups) documentation. !!! 
Hint - You can retrieve the list of PG4K Clusters that make up your PGDGroup + You can retrieve the list of PG4K clusters that make up your PGD group by running: `kubectl get cluster -l k8s.pgd.enterprisedb.io/group=my-pgd-group -n my-namespace` \ No newline at end of file diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/before_you_start.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/before_you_start.mdx index cdb6cb725ab..3fa579da709 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/before_you_start.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/before_you_start.mdx @@ -1,16 +1,16 @@ --- -title: 'Before You Start' +title: 'Before you start' originalFilePath: 'src/before_you_start.md' --- -Before we get started, it is essential to go over some terminology that is +Before you get started, review the terminology that's specific to Kubernetes and PGD. ## Kubernetes terminology [Node](https://kubernetes.io/docs/concepts/architecture/nodes/) : A *node* is a worker machine in Kubernetes, either virtual or physical, where - all services necessary to run pods are managed by the control plane node(s). + all services necessary to run pods are managed by the control plane nodes. [Pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/) : A *pod* is the smallest computing unit that can be deployed in a Kubernetes @@ -24,26 +24,26 @@ specific to Kubernetes and PGD. on. [Secret](https://kubernetes.io/docs/concepts/configuration/secret/) -: A *secret* is an object that is designed to store small amounts of sensitive - data such as passwords, access keys, or tokens, and use them in pods. +: A *secret* is an object that's designed to store small amounts of sensitive + data such as passwords, access keys, or tokens and use them in pods. -[Storage Class](https://kubernetes.io/docs/concepts/storage/storage-classes/) +[Storage class](https://kubernetes.io/docs/concepts/storage/storage-classes/) : A *storage class* allows an administrator to define the classes of storage in a cluster, including provisioner (such as AWS EBS), reclaim policies, mount options, volume expansion, and so on. -[Persistent Volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) +[Persistent volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) : A *persistent volume* (PV) is a resource in a Kubernetes cluster that represents storage that has been either manually provisioned by an administrator or dynamically provisioned by a *storage class* controller. A PV - is associated with a pod using a *persistent volume claim* and its lifecycle is + is associated with a pod using a *persistent volume claim*, and its lifecycle is independent of any pod that uses it. Normally, a PV is a network volume, especially in the public cloud. A [*local persistent volume* (LPV)](https://kubernetes.io/docs/concepts/storage/volumes/#local) is a persistent volume that exists only on the particular node where the pod that uses it is running. -[Persistent Volume Claim](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) +[Persistent volume claim](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) : A *persistent volume claim* (PVC) represents a request for storage, which might include size, access mode, or a particular storage class. Similar to how a pod consumes node resources, a PVC consumes the resources of a PV. @@ -55,7 +55,7 @@ specific to Kubernetes and PGD. 
projects, departments, teams, and so on. [RBAC](https://kubernetes.io/docs/reference/access-authn-authz/rbac/) -: *Role Based Access Control* (RBAC), also known as *role-based security*, is a +: *Role-based access control* (RBAC), also known as *role-based security*, is a method used in computer systems security to restrict access to the network and resources of a system to authorized users only. Kubernetes has a native API to control roles at the namespace and cluster level and associate them with @@ -63,7 +63,7 @@ specific to Kubernetes and PGD. [CRD](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) : A *custom resource definition* (CRD) is an extension of the Kubernetes API - and allows developers to create new data types and objects, *called custom + and allows developers to create new data types and objects, called *custom resources*. [Operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) @@ -75,13 +75,13 @@ specific to Kubernetes and PGD. [`kubectl`](https://kubernetes.io/docs/reference/kubectl/overview/) : `kubectl` is the command-line tool used to manage a Kubernetes cluster. -EDB Postgres Distributed for Kubernetes requires a Kubernetes version supported by the community. Please refer to the -["Supported releases"](https://www.enterprisedb.com/resources/platform-compatibility#pgdk8s) page for details. +EDB Postgres Distributed for Kubernetes requires a Kubernetes version supported by the community. See +["Supported releases"](https://www.enterprisedb.com/resources/platform-compatibility#pgdk8s) for details. ## PGD terminology -Please refer to the -[PGD terminology page for further information](https://www.enterprisedb.com/docs/pgd/latest/terminology/). +For more information, see the +[Terminology](https://www.enterprisedb.com/docs/pgd/latest/terminology/) in the PGD documentation. [Node](https://www.enterprisedb.com/docs/pgd/latest/terminology/#node) : A PGD database instance. @@ -93,22 +93,22 @@ Please refer to the : A planned change in connection between the application and the active database node in a cluster, typically done for maintenance. [Write leader](https://www.enterprisedb.com/docs/pgd/latest/terminology/#write-leader) -: In always-on architectures, a node is selected as the correct connection endpoint for applications. This node is called the write leader. The write leader is selected by consensus of a quorum of proxy nodes. +: In Always On architectures, a node is selected as the correct connection endpoint for applications. This node is called the *write leader*. The write leader is selected by consensus of a quorum of proxy nodes. ## Cloud terminology Region -: A *region* in the Cloud is an isolated and independent geographic area +: A *region* in the cloud is an isolated and independent geographic area organized in *availability zones*. Zones within a region have very little round-trip network latency. Zone -: An *availability zone* in the Cloud (also known as *zone*) is an area in a +: An *availability zone* in the cloud (also known as *zone*) is an area in a region where resources can be deployed. Usually, an availability zone corresponds to a data center or an isolated building of the same data center. 
## What to do next -Now that you have familiarized with the terminology, you can decide to +Now that you have familiarized with the terminology, you can [test EDB Postgres Distributed for Kubernetes (PG4K-PGD) on your laptop using a local cluster](quickstart.md) before deploying the operator in your selected cloud environment. \ No newline at end of file diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/certificates.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/certificates.mdx index cc9e2463f39..ea76af2dca3 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/certificates.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/certificates.mdx @@ -3,28 +3,27 @@ title: 'Certificates' originalFilePath: 'src/certificates.md' --- -EDB Postgres Distributed for Kubernetes has been designed to natively support TLS certificates. -In order to set up a PGD cluster, each PGD node require: +EDB Postgres Distributed for Kubernetes was designed to natively support TLS certificates. +To set up a PGD cluster, each PGD node requires: -- a server Certification Authority (CA) certificate -- a server TLS certificate signed by the server Certification Authority -- a client Certification Authority (CA) certificate -- a streaming replication client certificate generated by the client Certification Authority +- A server certification authority (CA) certificate +- A server TLS certificate signed by the server CA +- A client CA certificate +- A streaming replication client certificate generated by the client CA !!! Note - You can find all the secrets used by each PGD Node and the expiry dates in - the Cluster (PGD Node) Status. + You can find all the secrets used by each PGD node and the expiry dates in + the cluster (PGD node) status. -EDB Postgres Distributed for Kubernetes is very flexible when it comes to TLS certificates, and -primarily operates in two modes: +EDB Postgres Distributed for Kubernetes is very flexible when it comes to TLS certificates. It operates +primarily in two modes: -1. **operator managed**: certificates are internally - managed by the operator in a fully automated way, and signed using a CA created - by EDB Postgres Distributed for Kubernetes -2. **user provided**: certificates are +1. **Operator managed** — Certificates are internally + managed by the operator in a fully automated way and signed using a CA created + by EDB Postgres Distributed for Kubernetes. +2. **User provided** — Certificates are generated outside the operator and imported in the cluster definition as - secrets - EDB Postgres Distributed for Kubernetes integrates itself with cert-manager (see - examples below) + secrets. EDB Postgres Distributed for Kubernetes integrates itself with cert-manager. -You can find further information in the -[EDB Postgres for Kubernetes documentation](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/certificates/). \ No newline at end of file +For more information, see +[Certificates](/postgres_for_kubernetes/latest/certificates/) in the EDB Postgres for Kubernetes documentation. 
\ No newline at end of file From d41d63f74abd4e47eb30b886d23acf46908979a3 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Tue, 12 Sep 2023 18:30:44 -0400 Subject: [PATCH 3/8] Second set of edits on pgd for kubernetes --- .../1/connectivity.mdx | 173 +++++++++--------- .../1/index.mdx | 54 +++--- .../1/installation_upgrade.mdx | 38 ++-- 3 files changed, 131 insertions(+), 134 deletions(-) diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx index fd50828c2ba..642eb88a21e 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx @@ -3,30 +3,30 @@ title: 'Connectivity' originalFilePath: 'src/connectivity.md' --- -This section provides information about secure network communications within a -PGD Cluster, covering the following topics: +Information about secure network communications in a +PGD cluster includes: -- [services](#services) -- [domain names resolution](#domain-names-resolution) using fully qualified domain names (FQDN) +- [Services](#services) +- [Domain names resolution](#domain-names-resolution) using fully qualified domain names (FQDN) - [TLS configuration](#tls-configuration) -\!!! Notice - Although the above topics might seem unrelated to each other, they all +!!! Notice + Although these topics might seem unrelated to each other, they all participate in the configuration of the PGD resources to make them universally identifiable and accessible over a secure network. ## Services -Resources in a PGD Cluster are accessible through Kubernetes services. -Every PGDGroup manages several of them, namely: +Resources in a PGD cluster are accessible through Kubernetes services. +Every PGD group manages several of them, namely: -- one service per node, used for internal communications (*node service*) -- a *group service*, to reach any node in the group, used primarily by PG4K-PGD +- One service per node, used for internal communications (*node service*) +- A *group service* to reach any node in the group, used primarily by PG4K-PGD to discover a new group in the cluster -- a *proxy service*, to enable applications to reach the write leader of the - group, transparently using PGD proxy +- A *proxy service* to enable applications to reach the write leader of the + group transparently using PGD Proxy -For an example using these services, see [Connecting an application to a PGD cluster](#connecting-to-a-pgd-cluster-from-an-application). +For an example that uses these services, see [Connecting an application to a PGD cluster](#connecting-to-a-pgd-cluster-from-an-application). ![Basic architecture of an EDB Postgres Distributed for Kubernetes PGD group](./images/pg4k-pgd-basic-architecture.png) @@ -34,65 +34,64 @@ Each service is generated from a customizable template in the `.spec.connectivit section of the manifest. All services must be reachable using their fully qualified domain name (FQDN) -from all the PGD nodes in all the Kubernetes clusters (see below in this -section). +from all the PGD nodes in all the Kubernetes clusters. (See [Domain names resolution](#domain-names-resolutions).) 
PG4K-PGD provides a service templating framework that gives you the -availability to easily customize services at the following 3 levels: +availability to easily customize services at the following three levels: Node Service Template -: Each PGD node is reachable using a service which can be configured in the +: Each PGD node is reachable using a service that can be configured in the `.spec.connectivity.nodeServiceTemplate` section. Group Service Template -: Each PGD group has a group service that is a single entry point for the +: Each PGD group has a group service that's a single entry point for the whole group and that can be configured in the `.spec.connectivity.groupServiceTemplate` section. Proxy Service Template : Each PGD group has a proxy service to reach the group write leader through - the PGD proxy, and can be configured in the `.spec.connectivity.proxyServiceTemplate` + the PGD proxy and can be configured in the `.spec.connectivity.proxyServiceTemplate` section. This is the entry-point service for the applications. -You can use templates to create a LoadBalancer service, and/or to add arbitrary -annotations and labels to a service in order to integrate with other components -available in the Kubernetes system (i.e. to create external DNS names or tweak +You can use templates to create a LoadBalancer service or to add arbitrary +annotations and labels to a service to integrate with other components +available in the Kubernetes system (that is, to create external DNS names or tweak the generated load balancer). ## Domain names resolution -PG4K-PGD ensures that all resources in a PGD Group have a fully qualified -domain name (FQDN) by adopting a convention that uses the PGD Group name as a prefix +PG4K-PGD ensures that all resources in a PGD group have a fully qualified +domain name (FQDN) by adopting a convention that uses the PGD group name as a prefix for all of them. -As a result, it expects that you define the domain name of the PGD Group. This -can be done through the `.spec.connectivity.dns` section which controls how the -FQDN for the resources are generated, with two fields: +As a result, it expects that you define the domain name of the PGD group. This +can be done through the `.spec.connectivity.dns` section, which controls how the +FQDN for the resources are generated with two fields: -- `domain`: domain name to be used by all the objects in the PGD group (mandatory); -- `hostSuffix`: suffix to be added to each service in the PGD group (optional). +- `domain` — Domain name for all the objects in the PGD group to use (mandatory). +- `hostSuffix` — Suffix to add to each service in the PGD group (optional). -## TLS Configuration +## TLS configuration -PG4K-PGD requires that resources in a PGD Cluster communicate over a secure +PG4K-PGD requires that resources in a PGD cluster communicate over a secure connection. It relies on PostgreSQL's native support for [SSL connections](https://www.postgresql.org/docs/current/libpq-ssl.html) to encrypt client/server communications using TLS protocols for increased security. Currently, PG4K-PGD requires that [cert-manager](https://cert-manager.io/) is installed. -Cert-manager has been chosen as the tool to provision dynamic certificates, -given that it is widely recognized as the de facto standard in a Kubernetes +Cert-manager was chosen as the tool to provision dynamic certificates +given that it's widely recognized as the de facto standard in a Kubernetes environment. 
The `spec.connectivity.tls` section describes how the communication between the -nodes should happen: +nodes happens: - `mode` is an enumeration describing how the server certificates are verified during PGD group nodes communication. It accepts the following values, as - documented in ["SSL Support"](https://www.postgresql.org/docs/current/libpq-ssl.html#LIBPQ-SSL-SSLMODE-STATEMENTS) - from the PostgreSQL documentation: + documented in [SSL Support](https://www.postgresql.org/docs/current/libpq-ssl.html#LIBPQ-SSL-SSLMODE-STATEMENTS) + in the PostgreSQL documentation: - `verify-full` - `verify-ca` @@ -100,59 +99,59 @@ nodes should happen: - `serverCert` defines the server certificates used by the PGD group nodes to accept requests. - The clients validate this certificate depending on the passed TLS mode; - refer to the previous point for the accepted values. + The clients validate this certificate depending on the passed TLS mode. + It accepts the same values as `mode`. -- `clientCert` defines the `streaming_replica` user certificate that will - be used by the nodes to authenticate each other. +- `clientCert` defines the `streaming_replica` user certificate + used by the nodes to authenticate each other. -### Server TLS Configuration +### Server TLS configuration -The server certificate configuration is specified in `.spec.connectivity.tls.serverCert.certManager` -section of the PGDGroup custom resource. +The server certificate configuration is specified in the `.spec.connectivity.tls.serverCert.certManager` +section of the `PGDGroup` custom resource. -The following assumptions have been made for this section to work: +The following assumptions were made for this section to work: - An issuer `.spec.connectivity.tls.serverCert.certManager.issuerRef` is available for the domain `.spec.connectivity.dns.domain` and any other domain used by - `.spec.connectivity.tls.serverCert.certManager.altDnsNames` -- There is a secret containing the public certificate of the CA - used by the issuer `.spec.connectivity.tls.serverCert.caCertSecret` + `.spec.connectivity.tls.serverCert.certManager.altDnsNames`. +- There's a secret containing the public certificate of the CA + used by the issuer `.spec.connectivity.tls.serverCert.caCertSecret`. -The `.spec.connectivity.tls.serverCert.certManager` is used to create a per node -cert-manager certificate request -The resulting certificate will be used by the underlying Postgres instance +The `.spec.connectivity.tls.serverCert.certManager` is used to create a per-node +cert-manager certificate request. +The resulting certificate is used by the underlying Postgres instance to terminate TLS connections. -The operator will add the following altDnsNames to the certificate: +The operator adds the following altDnsNames to the certificate: - `$node$hostSuffix.$domain` - `$groupName$hostSuffix.$domain` !!! Important - It's your responsibility to add in `.spec.connectivity.tls.serverCert.certManager.altDnsNames` - any name required from the underlying networking architecture - (e.g., load balancers used by the user to reach the nodes). + It's your responsibility to add to `.spec.connectivity.tls.serverCert.certManager.altDnsNames` + any name required from the underlying networking architecture, + for example, load balancers used by the user to reach the nodes. -### Client TLS Configuration +### Client TLS configuration The operator requires client certificates to be dynamically provisioned -via cert-manager (recommended approach) or pre-provisioned via secrets. 
+using cert-manager (recommended approach) or pre-provisioned using secrets. -#### Dynamic provisioning via Cert-manager +#### Dynamic provisioning via cert-manager -The client certificates configuration is managed by `.spec.connectivity.tls.clientCert.certManager` +The client certificates configuration is managed by the `.spec.connectivity.tls.clientCert.certManager` section of the PGDGroup custom resource. -The following assumptions have been made for this section to work: +The following assumptions were made for this section to work: - An issuer `.spec.connectivity.tls.clientCert.certManager.issuerRef` is available - and will sign a certificate with the common name `streaming_replica` -- There is a secret containing the public certificate of the CA - used by the issuer `.spec.connectivity.tls.clientCert.caCertSecret` + and signs a certificate with the common name `streaming_replica`. +- There's a secret containing the public certificate of the CA + used by the issuer `.spec.connectivity.tls.clientCert.caCertSecret`. -The operator will use the configuration under `.spec.connectivity.tls.clientCert.certManager` +The operator uses the configuration under `.spec.connectivity.tls.clientCert.certManager` to create a certificate request per the `streaming_replica` Postgres user. -The resulting certificate will be used to secure communication between the nodes. +The resulting certificate is used to secure communication between the nodes. #### Pre-provisioned certificates via secrets @@ -160,26 +159,26 @@ Alternatively, you can specify a secret containing the pre-provisioned client certificate for the streaming replication user through the `.spec.connectivity.tls.clientCert.preProvisioned.streamingReplica.secretRef` option. The certificate lifecycle in this case is managed entirely by a third party, -either manually or automated, by simply updating the content of the secret. +either manually or automated, by updating the content of the secret. ## Connecting to a PGD cluster from an application -Connecting to a PGD Group from an application running inside the same Kubernetes cluster -or from outside the cluster is a simple procedure. In both cases, you will connect to -the proxy service of the PGD Group as the `app` user. The proxy service is a LoadBalancer -service that will route the connection to the write leader of the PGD Group. +Connecting to a PGD group from an application running inside the same Kubernetes cluster +or from outside the cluster is a simple procedure. In both cases, you connect to +the proxy service of the PGD group as the `app` user. The proxy service is a LoadBalancer +service that routes the connection to the write leader of the PGD group. ### Connecting from inside the cluster When connecting from inside the cluster, you can use the proxy service name to connect -to the PGD Group. The proxy service name is composed of the PGD Group name and the (optional) +to the PGD group. The proxy service name is composed of the PGD group name and the optional host suffix defined in the `.spec.connectivity.dns` section of the PGDGroup custom resource. -For example, if the PGD Group name is `my-group` and the host suffix is `.my-domain.com`, -the proxy service name will be `my-group.my-domain.com`. +For example, if the PGD group name is `my-group`, and the host suffix is `.my-domain.com`, +the proxy service name is `my-group.my-domain.com`. -Before connecting you will need to get the password for the app user from the app user -secret. 
The naming format of the secret is `my-group-app` for a PGD Group named `my-group`. +Before connecting, you need to get the password for the app user from the app user +secret. The naming format of the secret is `my-group-app` for a PGD group named `my-group`. You can get the username and password from the secret with the following commands: ```sh kubectl get secret my-group-app -o jsonpath='{.data.username}' | base64 --decode kubectl get secret my-group-app -o jsonpath='{.data.password}' | base64 --decode ``` -With this you now have all the pieces for a connection string to the PGD Group: +With this, you now have all the pieces for a connection string to the PGD group: ```text postgresql://<app-user>:<app-password>@<proxy-service-name>:5432/<database> ``` -or for a `psql` invocation: +Or, for a `psql` invocation: ```sh psql -U <app-user> -h <proxy-service-name> ``` -where `app-user` and `app-password` are the values you got from the secret, +Where `app-user` and `app-password` are the values you got from the secret, and `database` is the name of the database you want to connect -to (the default is `app` for the app user.) +to. (The default is `app` for the app user.) ### Connecting from outside the Kubernetes cluster When connecting from outside the Kubernetes cluster, in the general case, -the [*Ingress*](https://kubernetes.io/docs/concepts/services-networking/ingress/) resource or a [*Load Balancer*](https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer) will be necessary. +the [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) resource or a [load balancer](https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer) is necessary. Check your cloud provider or local installation for more information about the -behavior of them in your environment. +Check your cloud provider or local installation for more information about their +behavior in your environment. -Ingresses and Load Balancers require a Pod selector to forward connection to -the PGD proxies. When configuring them, we suggest to use the following labels: +Ingresses and load balancers require a pod selector to forward connection to +the PGD proxies. When configuring them, we suggest using the following labels: - `k8s.pgd.enterprisedb.io/group`: set the the PGD group name -- `k8s.pgd.enterprisedb.io/workloadType`: set to `pgd-proxy` +- `k8s.pgd.enterprisedb.io/group` — Set the PGD group name. +- `k8s.pgd.enterprisedb.io/workloadType` — Set to `pgd-proxy`. If using Kind or other solutions for local development, the easiest way to -access the PGD Group from outside is to use port forwarding +access the PGD group from outside is to use port forwarding to the proxy service. You can use the following command to forward port 5432 on your local machine to the proxy service: ```sh kubectl port-forward svc/my-group.my-domain.com 5432:5432 ``` -where `my-group.my-domain.com` is the proxy service name from the previous example. +Where `my-group.my-domain.com` is the proxy service name from the previous example.
\ No newline at end of file diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx index c8f6d89d572..add40911869 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx @@ -29,8 +29,8 @@ directoryDefaults: displayBanner: Preview release v0.7.1 --- -**EDB Postgres Distributed for Kubernetes** (`pg4k-pgd`, or PG4K-PGD) is an -operator designed to manage **EDB Postgres Distributed** workloads on +EDB Postgres Distributed for Kubernetes (`pg4k-pgd`, or PG4K-PGD) is an +operator designed to manage EDB Postgres Distributed (PGD) workloads on Kubernetes, with traffic routed by PGD Proxy. The main custom resource that the operator provides is called `PGDGroup`. @@ -40,45 +40,45 @@ Architectures can also be deployed across different Kubernetes clusters. ## Before you start EDB Postgres Distributed for Kubernetes provides you with a way to deploy -EDB Postgres Distributed in a Kubernetes environment. As a result, it -is fundamental that you have read the -["EDB Postgres Distributed" documentation](https://www.enterprisedb.com/docs/pgd/latest/). +EDB Postgres Distributed in a Kubernetes environment. Therefore, we recommend +reading the +[EDB Postgres Distributed documentation](/pgd/latest/). -The following chapters are very important to start working with EDB Postgres -Distributed for Kubernetes: +To start working with EDB Postgres +Distributed for Kubernetes, read the following in the PGD documentation: -- [Terminology](https://www.enterprisedb.com/docs/pgd/latest/terminology/) -- [PGD Overview](https://www.enterprisedb.com/docs/pgd/latest/overview/) -- [Choosing your architecture](https://www.enterprisedb.com/docs/pgd/latest/architectures/) -- [Choosing a Postgres distribution](https://www.enterprisedb.com/docs/pgd/latest/choosing_server/) +- [Terminology](/pgd/latest/terminology/) +- [PGD overview](https://www.enterprisedb.com/docs/pgd/latest/overview/) +- [Choosing your architecture](/pgd/latest/architectures/) +- [Choosing a Postgres distribution](/pgd/latest/choosing_server/) -For advanced usage and maximum customization, it is also important to familiarize with -["EDB Postgres for Kubernetes" (PG4K) documentation](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/), -as described in the ["Architecture" section](architecture.md#relationship-with-edb-postgres-for-kubernetes). +For advanced usage and maximum customization, it's also important to be familiar with the +[EDB Postgres for Kubernetes (PG4K) documentation](/postgres_for_kubernetes/latest/), +as described in [Architecture](architecture.md#relationship-with-edb-postgres-for-kubernetes). 
## Supported Kubernetes distributions EDB Postgres Distributed for Kubernetes is available for: -- Kubernetes version 1.23 or higher through a Helm Chart -- Red Hat OpenShift version 4.10 or higher through the Red Hat OpenShift - Certified Operator only +- Kubernetes version 1.23 or later through a Helm chart +- Red Hat OpenShift version 4.10 or later only through the Red Hat OpenShift + certified operator ## Requirements EDB Postgres Distributed for Kubernetes requires that the Kubernetes/OpenShift -clusters hosting the distributed PGD cluster have been prepared by you to cater for: +clusters hosting the distributed PGD cluster were prepared by you to cater for: -- the Public Key Infrastructure (PKI) encompassing all the Kubernetes clusters - the PGD Global Group is spread across, as mTLS is required to authenticate - and authorize all nodes in the mesh topology and guarantee encrypted communication +- The public key infrastructure (PKI) encompassing all the Kubernetes clusters + the PGD Global Group is spread across. mTLS is required to authenticate + and authorize all nodes in the mesh topology and guarantee encrypted communication. - Networking infrastructure across all Kubernetes clusters involved in the PGD Global Group to ensure that each node can communicate with each other -EDB Postgres Distributed for Kubernetes also requires Cert Manager 1.10 or higher. +EDB Postgres Distributed for Kubernetes also requires Cert Manager 1.10 or later. !!! Seealso "About connectivity" - Please refer to the ["Connectivity" section](connectivity.md) for more information. + See [Connectivity](connectivity.md) for more information. -#### Exposed Ports +#### Exposed ports -EDB Postgres Distributed for Kubernetes exposes ports at operator, instance manager and operand -levels, as listed in the table below: +EDB Postgres Distributed for Kubernetes exposes ports at operator, instance manager, and operand +levels, as shown in the table. | System | Port number | Exposing | Name | Certificates | Authentication | | :--------------- | :---------- | :------------------ | :--------------- | :----------- | :------------- | @@ -222,26 +222,26 @@ levels, as listed in the table below: ### PGD -The current implementation of EDB Postgres Distributed for Kubernetes automatically creates -passwords for the `postgres` superuser and the database owner. +The current implementation of EDB Postgres Distributed for Kubernetes creates +passwords for the postgres superuser and the database owner. -As far as encryption of password is concerned, EDB Postgres Distributed for Kubernetes follows +As far as encryption of passwords is concerned, EDB Postgres Distributed for Kubernetes follows the default behavior of PostgreSQL: starting from PostgreSQL 14, -`password_encryption` is by default set to `scram-sha-256`, while on earlier -versions it is set to `md5`. +`password_encryption` is by default set to `scram-sha-256`. On earlier +versions, it's set to `md5`. !!! Important - Please refer to the ["Connection DSNs and SSL"](https://www.enterprisedb.com/docs/pgd/latest/nodes/#connection-dsns-and-ssl-tls) - section in the PGD documentation for details. + See [Connection DSNs and SSL](/pgd/latest/nodes/#connection-dsns-and-ssl-tls) + in the PGD documentation for details. -You can disable management of the `postgres` user password via secrets by setting +You can disable management of the postgres user password using secrets by setting `enableSuperuserAccess` to `false` in the `cnp` section of the spec. !!! 
Note The operator supports toggling the `enableSuperuserAccess` option. When you - disable it on a running cluster, the operator will ignore the content of the secret, - remove it (if previously generated by the operator) and set the password of the - `postgres` user to `NULL` (de facto disabling remote access through password authentication). + disable it on a running cluster, the operator ignores the content of the secret, + removes it (if previously generated by the operator), and sets the password of the + postgres user to `NULL`, in effect disabling remote access through password authentication. ### Storage From a7bae87a1cf07f632d0e61554d64b0e14b1dfcd4 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Thu, 21 Sep 2023 16:46:44 -0400 Subject: [PATCH 7/8] Second read of EDB PGD for Kubernetes - first batch --- .../1/architecture.mdx | 53 +++++++++---------- .../1/backup.mdx | 44 +++++++-------- .../1/before_you_start.mdx | 13 +++-- .../1/certificates.mdx | 2 +- .../1/connectivity.mdx | 25 +++++---- .../1/index.mdx | 10 ++-- .../1/openshift.mdx | 18 +++---- .../1/private_registries.mdx | 8 +-- .../1/quickstart.mdx | 18 +++---- .../1/recovery.mdx | 8 +-- .../1/security.mdx | 8 +-- .../1/ssl_connections.mdx | 2 +- 12 files changed, 103 insertions(+), 106 deletions(-) diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/architecture.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/architecture.mdx index da804ccca23..afb3bb0f2c0 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/architecture.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/architecture.mdx @@ -4,9 +4,9 @@ originalFilePath: 'src/architecture.md' --- Consider these main architectural aspects -when deploying EDB Postgres Distributed in Kubernetes (PG4K-PGD). +when deploying EDB Postgres Distributed in Kubernetes. -PG4K-PGD is a +EDB Postgres Distributed for Kubernetes is a [Kubernetes operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) designed to deploy and manage EDB Postgres Distributed clusters running in private, public, hybrid, or multi-cloud environments. @@ -17,11 +17,11 @@ running in private, public, hybrid, or multi-cloud environments. is a multi-master implementation of Postgres designed for high performance and availability. PGD generally requires deployment using -[Trusted Postgres Architect (TPA)](https://www.enterprisedb.com/docs/pgd/latest/tpa/), +[Trusted Postgres Architect (TPA)](/pgd/latest/tpa/), a tool that uses [Ansible](https://www.ansible.com) to provision and deploy PGD clusters. -PG4K-PGD offers a different way of deploying PGD clusters, leveraging containers +EDB Postgres Distributed for Kubernetes offers a different way of deploying PGD clusters, leveraging containers and Kubernetes. The advantages are that the resulting architecture: @@ -31,9 +31,9 @@ and Kubernetes. The advantages are that the resulting architecture: ## Relationship with EDB Postgres for Kubernetes A PGD cluster consists of one or more *PGD groups*, each having one or more *PGD -nodes*. A PGD node is a Postgres database.
EDB Postgres Distributed for Kubernetes internally manages each PGD node using the `Cluster` resource as defined by EDB Postgres -for Kubernetes (PG4K), specifically a cluster with a single instance (that is, no +for Kubernetes, specifically a cluster with a single instance (that is, no replicas). You can configure the single PostgreSQL instance created by each cluster @@ -41,14 +41,14 @@ declaratively using the [`.spec.cnp` section](api_reference.md#CnpConfiguration) of the PGD group spec. -In PG4K-PGD, as in PG4K, the underlying database implementation is responsible +In EDB Postgres Distributed for Kubernetes, as in EDB Postgres for Kubernetes, the underlying database implementation is responsible for data replication. However, it's important to note that failover and switchover work differently, entailing Raft election and nominating new -write leaders. PG4K handles only the deployment and healing of data nodes. +write leaders. EDB Postgres for Kubernetes handles only the deployment and healing of data nodes. -## Managing PGD using PG4K-PGD +## Managing PGD using EDB Postgres Distributed for Kubernetes -The PG4K-PGD operator can manage the complete lifecycle of PGD clusters. As +The EDB Postgres Distributed for Kubernetes operator can manage the complete lifecycle of PGD clusters. As such, in addition to PGD nodes (represented as single-instance clusters), it needs to manage other objects associated with PGD. @@ -59,10 +59,10 @@ or performing distributed DDL. These considerations force additional actors in PGD above database nodes. -PG4K-PGD manages the following: +EDB Postgres Distributed for Kubernetes manages the following: -- Data nodes: as mentioned previously, a node is a database and is managed - by PG4K, creating a cluster with a single instance. +- Data nodes. A node is a database and is managed + by EDB Postgres for Kubernetes, creating a cluster with a single instance. - [Witness nodes](https://www.enterprisedb.com/docs/pgd/latest/nodes/#witness-nodes) are basic database instances that don't participate in data replication. Their function is to guarantee that consensus is possible in @@ -75,7 +75,7 @@ PG4K-PGD manages the following: ### Proxies and routing PGD groups assume full mesh connectivity of PGD nodes. Each node must be able to -connect to every other node, using the appropriate connection string (a +connect to every other node using the appropriate connection string (a `libpq`-style DSN). Write operations don't need to be sent to every node. PGD takes care of replicating data after it's committed to one node. @@ -86,7 +86,7 @@ transparently route writes to write leaders and can quickly pivot to the new write leader in case of switchover or failover. It's possible to configure *Raft subgroups*, each of which can maintain a -separate write leader. In PG4K-PGD, a PGD group containing a PGD proxy +separate write leader. In EDB Postgres Distributed for Kubernetes, a PGD group containing a PGD proxy comprises a Raft subgroup. Two kinds of routing are available with PGD proxies: @@ -96,7 +96,7 @@ Two kinds of routing are available with PGD proxies: - Local routing uses subgroups to maintain separate write leaders. Local routing is often used to achieve geographical separation of writes. -In PG4K-PGD, local routing is used by default, and a configuration option is +In EDB Postgres Distributed for Kubernetes, local routing is used by default, and a configuration option is available to select global routing. 
For more information, see @@ -106,17 +106,16 @@ For more information, see To make good use of PGD's distributed multi-master capabilities and to offer high availability, -we recommend several architectures . +we recommend several architectures. The Always On architectures are built from either one group in a single location or two groups in two separate locations. -See -[Choosing your architecture](/pgd/latest/architectures/) in the PGD documentation +See [Choosing your architecture](/pgd/latest/architectures/) in the PGD documentation for more information. ## Deploying PGD on Kubernetes -PG4K-PGD leverages Kubernetes to deploy and manage PGD clusters. As such, some +EDB Postgres Distributed for Kubernetes leverages Kubernetes to deploy and manage PGD clusters. As such, some adaptations are necessary to translate PGD into the Kubernetes ecosystem. ### Images and operands @@ -144,7 +143,7 @@ for a Kubernetes cluster is three to make the control plane resilient to the failure of a single zone. This means that each data center is active at any time and can run workloads simultaneously. -PG4K-PGD can be installed in a +EDB Postgres Distributed for Kubernetes can be installed in a [single Kubernetes cluster](#single-kubernetes-cluster) or across [multiple Kubernetes clusters](#multiple-kubernetes-clusters). @@ -152,8 +151,8 @@ or across ### Single Kubernetes cluster A multi-availability-zone Kubernetes architecture is typical of Kubernetes -services managed by cloud providers. Such an architecture enables the PG4K-PGD -and the PG4K operators to schedule workloads and nodes across availability +services managed by cloud providers. Such an architecture enables the EDB Postgres Distributed for Kubernetes +and the EDB Postgres for Kubernetes operators to schedule workloads and nodes across availability zones, considering all zones active. ![Kubernetes cluster spanning over 3 independent data centers](./images/k8s-architecture-3-az.png) @@ -162,20 +161,20 @@ PGD clusters can be deployed in a single Kubernetes cluster and take advantage of Kubernetes availability zones to enable high-availability architectures, including the Always On recommended architectures. -You can realize the Always On single location architecture shown in +You can realize the Always On, single-location architecture shown in [Choosing your architecture](/pgd/latest/architectures/) in the PGD documentation on a single Kubernetes cluster with three availability zones. ![Always On Single Region](./images/always_on_1x3_updated.png) -The PG4K-PGD operator can control the scheduling of pods (that is, which pods go +The EDB Postgres Distributed for Kubernetes operator can control the scheduling of pods (that is, which pods go to which data center) using affinity, tolerations, and node selectors, as is the -case with PG4K. Individual scheduling controls are available for proxies as well +case with EDB Postgres for Kubernetes. Individual scheduling controls are available for proxies as well as nodes. See the [Kubernetes documentation on scheduling](https://kubernetes.io/docs/concepts/scheduling-eviction/), -and [Scheduling](/postgres_for_kubernetes/latest/scheduling/) in the PG4K documentation +and [Scheduling](/postgres_for_kubernetes/latest/scheduling/) in the EDB Postgres for Kubernetes documentation for more information. 
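To make the scheduling discussion above concrete, here is a minimal sketch of zone-aware placement. It assumes that the PG4K-style `affinity` block is passed through the PGD group's `.spec.cnp` section mentioned earlier; treat the field names below (other than `.spec.cnp` itself) as illustrative rather than a confirmed part of the `PGDGroup` API, and check the API reference for the exact schema.

```yaml
# Sketch only: spread the single-instance Clusters that back a PGD group
# across availability zones. All other required PGDGroup settings are omitted.
apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: region-a
spec:
  cnp:
    affinity:
      # Assumed PG4K-style pass-through: ask the scheduler to place each
      # node's pod in a different availability zone.
      topologyKey: topology.kubernetes.io/zone
```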
### Multiple Kubernetes clusters diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/backup.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/backup.mdx index 11f1d005918..45e2ab42101 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/backup.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/backup.mdx @@ -3,29 +3,29 @@ title: 'Backup on object stores' originalFilePath: 'src/backup.md' --- -EDB Postgres Distributed for Kubernetes (PG4K-PGD) supports *online/hot backup* of +EDB Postgres Distributed for Kubernetes supports *online/hot backup* of PGD clusters through physical backup and WAL archiving on an object store. This means that the database is always up (no downtime required) and that -point-in-time recovery is available. +point-in-time recovery (PITR) is available. ## Common object stores Multiple object stores are supported, such as AWS S3, Microsoft Azure Blob Storage, Google Cloud Storage, MinIO Gateway, or any S3-compatible provider. -Given that PG4K-PGD configures the connection with object stores by relying on -EDB Postgres for Kubernetes (PG4K), see the [PG4K cloud provider support](/postgres_for_kubernetes/latest/backup_recovery/#cloud-provider-support) +Given that EDB Postgres Distributed for Kubernetes configures the connection with object stores by relying on +EDB Postgres for Kubernetes, see the [EDB Postgres for Kubernetes cloud provider support](/postgres_for_kubernetes/latest/backup_recovery/#cloud-provider-support) documentation for more information. !!! Important - The PG4K documentation's Cloud Provider configuration section is - available at `spec.backup.barmanObjectStore`. In PG4K-PGD examples, the object store section is at a + The EDB Postgres for Kubernetes documentation's Cloud Provider configuration section is + available at `spec.backup.barmanObjectStore`. In EDB Postgres Distributed for Kubernetes examples, the object store section is at a different path: `spec.backup.configuration.barmanObjectStore`. ## WAL archive WAL archiving is the process that sends WAL files to the object storage, and it's essential to -execute online/hot backups or point-in-time recovery (PITR). -In PG4K-PGD, each PGD node is set up to archive WAL files in the object store independently. +execute online/hot backups or PITR. +In EDB Postgres Distributed for Kubernetes, each PGD node is set up to archive WAL files in the object store independently. The WAL archive is defined in the PGD group `spec.backup.configuration.barmanObjectStore` stanza and is enabled as soon as a destination path and cloud credentials are set. @@ -47,13 +47,13 @@ spec: maxParallel: 8 ``` -For more information, see the [PG4K WAL archiving](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#wal-archiving) documentation. +For more information, see the [EDB Postgres for Kubernetes WAL archiving](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#wal-archiving) documentation. ## Scheduled backups -Scheduled backups are the recommended way to configure your backup strategy in PG4K-PGD. +Scheduled backups are the recommended way to configure your backup strategy in EDB Postgres Distributed for Kubernetes. When the PGD group `spec.backup.configuration.barmanObjectStore` stanza is configured, the operator selects one of the -PGD data nodes as the elected backup node for which it automatically creates a `Scheduled Backup` resource. 
+PGD data nodes as the elected backup node for which it creates a `Scheduled Backup` resource. The `.spec.backup.cron.schedule` field allows you to define a cron schedule specification, expressed in the [Go `cron` package format](https://pkg.go.dev/github.com/robfig/cron#hdr-CRON_Expression_Format). @@ -85,8 +85,8 @@ in the created backup resources. The choices are: - `cluster` — Sets the cluster as owner of the backup. !!! Note - The PG4K `ScheduledBackup` object contains the `cluster` option to specify the - cluster to back up. This option is currently not supported by PG4K-PGD and is + The EDB Postgres for Kubernetes `ScheduledBackup` object contains the `cluster` option to specify the + cluster to back up. This option is currently not supported by EDB Postgres Distributed for Kubernetes and is ignored if specified. If an elected backup node is deleted, the operator transparently elects a new backup node @@ -94,12 +94,12 @@ and reconciles the `Scheduled Backup` resource accordingly. ## Retention policies -PG4K-PGD can manage the automated deletion of backup files from the backup +EDB Postgres Distributed for Kubernetes can manage the automated deletion of backup files from the backup object store using retention policies based on the recovery window. This process also takes care of removing unused WAL files and WALs associated with backups that are scheduled for deletion. -You can define your backups with a retention policy of 30 days as follows: +You can define your backups with a retention policy of 30 days: ```yaml apiVersion: pgd.k8s.enterprisedb.io/v1beta1 @@ -111,7 +111,7 @@ spec: retentionPolicy: "30d" ``` -For more information, see the [PG4K retention policies](/postgres_for_kubernetes/latest/backup_recovery/#retention-policies) documentation. +For more information, see the [EDB Postgres for Kubernetes retention policies](/postgres_for_kubernetes/latest/backup_recovery/#retention-policies) in the EDB Postgres for Kubernetes documentation. !!! Important Currently, the retention policy is applied only for the elected backup node @@ -125,20 +125,20 @@ For more information, see the [PG4K retention policies](/postgres_for_kubernetes ## Compression algorithms Backups and WAL files are uncompressed by default. However, multiple compression algorithms are -supported. For more information, see the [PG4K compression algorithms](/postgres_for_kubernetes/latest/backup_recovery/#compression-algorithms) documentation. +supported. For more information, see the [EDB Postgres for Kubernetes compression algorithms](/postgres_for_kubernetes/latest/backup_recovery/#compression-algorithms) documentation. ## Tagging of backup objects It's possible to specify tags as key-value pairs for the backup objects, namely base backups, WAL files, and history files. -For more information, see the PG4K documentation about [tagging of backup objects](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/backup_recovery/#tagging-of-backup-objects). +For more information, see the EDB Postgres for Kubernetes documentation about [tagging of backup objects](/postgres_for_kubernetes/latest/backup_recovery/#tagging-of-backup-objects). ## On-demand backups of a PGD node -A PGD node is represented as single-instance PG4K `Cluster` object. +A PGD node is represented as single-instance EDB Postgres for Kubernetes `Cluster` object. As such, in case of need, it's possible to request an on-demand backup -of a specific PGD node by creating a PG4K `Backup` resource. 
-To do that, see the [PG4K on-demand backups](/postgres_for_kubernetes/latest/backup_recovery/#on-demand-backups) documentation. +of a specific PGD node by creating a EDB Postgres for Kubernetes `Backup` resource. +To do that, see [EDB Postgres for Kubernetes on-demand backups](/postgres_for_kubernetes/latest/backup_recovery/#on-demand-backups) in the EDB Postgres for Kubernetes documentation. !!! Hint - You can retrieve the list of PG4K clusters that make up your PGD group + You can retrieve the list of EDB Postgres for Kubernetes clusters that make up your PGD group by running: `kubectl get cluster -l k8s.pgd.enterprisedb.io/group=my-pgd-group -n my-namespace` \ No newline at end of file diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/before_you_start.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/before_you_start.mdx index ae1a863ba17..87a07e5259a 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/before_you_start.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/before_you_start.mdx @@ -19,9 +19,8 @@ specific to Kubernetes and PGD. [Service](https://kubernetes.io/docs/concepts/services-networking/service/) : A *service* is an abstraction that exposes as a network service an - application that runs on a group of pods and standardizes important features - such as service discovery across applications, load balancing, failover, and so - on. + application that runs on a group of pods and standardizes important features, + such as service discovery across applications, load balancing, and failover. [Secret](https://kubernetes.io/docs/concepts/configuration/secret/) : A *secret* is an object that's designed to store small amounts of sensitive @@ -34,7 +33,7 @@ specific to Kubernetes and PGD. [Persistent volume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) : A *persistent volume* (PV) is a resource in a Kubernetes cluster that - represents storage that has been either manually provisioned by an + represents storage that was either manually provisioned by an administrator or dynamically provisioned by a *storage class* controller. A PV is associated with a pod using a *persistent volume claim*, and its lifecycle is independent of any pod that uses it. Normally, a PV is a network volume, @@ -80,7 +79,7 @@ EDB Postgres Distributed for Kubernetes requires a Kubernetes version supported ## PGD terminology -For more information, see the +For more information, see [Terminology](https://www.enterprisedb.com/docs/pgd/latest/terminology/) in the PGD documentation. [Node](https://www.enterprisedb.com/docs/pgd/latest/terminology/#node) @@ -103,12 +102,12 @@ Region round-trip network latency. Zone -: An *availability zone* in the cloud (also known as *zone*) is an area in a +: An *availability zone* in the cloud (also known as a *zone*) is an area in a region where resources can be deployed. Usually, an availability zone corresponds to a data center or an isolated building of the same data center. ## What to do next Now that you have familiarized with the terminology, you can -[test EDB Postgres Distributed for Kubernetes (PG4K-PGD) on your laptop using a local cluster](quickstart.md) before +[test EDB Postgres Distributed for Kubernetes on your laptop using a local cluster](quickstart.md) before deploying the operator in your selected cloud environment. 
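Looping back to the scheduled-backup fields edited in `backup.mdx` above, the following sketch shows how the pieces named there — the `spec.backup.cron.schedule` field in the Go `cron` format together with the `spec.backup.configuration` stanza holding `barmanObjectStore` and `retentionPolicy` — might fit together in one manifest. The group name and bucket path are placeholders, and the object-store credentials and WAL settings are omitted.

```yaml
apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: my-pgd-group
spec:
  backup:
    cron:
      # Go cron format with a leading seconds field: every day at midnight.
      schedule: "0 0 0 * * *"
    configuration:
      # Keep 30 days of backups and the WALs needed to restore them.
      retentionPolicy: "30d"
      barmanObjectStore:
        # Placeholder destination; point this at your own bucket.
        destinationPath: "s3://backups/my-pgd-group"
```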
\ No newline at end of file diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/certificates.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/certificates.mdx index ea76af2dca3..c0a4f8c098e 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/certificates.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/certificates.mdx @@ -4,7 +4,7 @@ originalFilePath: 'src/certificates.md' --- EDB Postgres Distributed for Kubernetes was designed to natively support TLS certificates. -To set up a PGD cluster, each PGD node requires: +To set up an PGD cluster, each PGD node requires: - A server certification authority (CA) certificate - A server TLS certificate signed by the server CA diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx index 642eb88a21e..a5c81e25a7f 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx @@ -21,7 +21,7 @@ Resources in a PGD cluster are accessible through Kubernetes services. Every PGD group manages several of them, namely: - One service per node, used for internal communications (*node service*) -- A *group service* to reach any node in the group, used primarily by PG4K-PGD +- A *group service* to reach any node in the group, used primarily by EDB Postgres Distributed for Kubernetes to discover a new group in the cluster - A *proxy service* to enable applications to reach the write leader of the group transparently using PGD Proxy @@ -33,10 +33,10 @@ For an example that uses these services, see [Connecting an application to a PGD Each service is generated from a customizable template in the `.spec.connectivity` section of the manifest. -All services must be reachable using their fully qualified domain name (FQDN) -from all the PGD nodes in all the Kubernetes clusters. (See [Domain names resolution](#domain-names-resolutions).) +All services must be reachable using their FQDN +from all the PGD nodes in all the Kubernetes clusters. See [Domain names resolution](#domain-names-resolutions). -PG4K-PGD provides a service templating framework that gives you the +EDB Postgres Distributed for Kubernetes provides a service templating framework that gives you the availability to easily customize services at the following three levels: Node Service Template @@ -60,11 +60,10 @@ the generated load balancer). ## Domain names resolution -PG4K-PGD ensures that all resources in a PGD group have a fully qualified -domain name (FQDN) by adopting a convention that uses the PGD group name as a prefix +EDB Postgres Distributed for Kubernetes ensures that all resources in a PGD group have a FQDN by adopting a convention that uses the PGD group name as a prefix for all of them. -As a result, it expects that you define the domain name of the PGD group. This +As a result, it expects you to define the domain name of the PGD group. This can be done through the `.spec.connectivity.dns` section, which controls how the FQDN for the resources are generated with two fields: @@ -75,14 +74,14 @@ FQDN for the resources are generated with two fields: -PG4K-PGD requires that resources in a PGD cluster communicate over a secure +EDB Postgres Distributed for Kubernetes requires that resources in a PGD cluster communicate over a secure connection. 
It relies on PostgreSQL's native support for [SSL connections](https://www.postgresql.org/docs/current/libpq-ssl.html) to encrypt client/server communications using TLS protocols for increased security. -Currently, PG4K-PGD requires that [cert-manager](https://cert-manager.io/) is installed. +Currently, EDB Postgres Distributed for Kubernetes requires that [cert-manager](https://cert-manager.io/) is installed. Cert-manager was chosen as the tool to provision dynamic certificates -given that it's widely recognized as the de facto standard in a Kubernetes +given that it's widely recognized as the standard in a Kubernetes environment. The `spec.connectivity.tls` section describes how the communication between the @@ -136,7 +135,7 @@ The operator adds the following altDnsNames to the certificate: ### Client TLS configuration The operator requires client certificates to be dynamically provisioned -using cert-manager (recommended approach) or pre-provisioned using secrets. +using cert-manager (the recommended approach) or pre-provisioned using secrets. #### Dynamic provisioning via cert-manager @@ -180,14 +179,14 @@ the proxy service name is `my-group.my-domain.com`. Before connecting, you need to get the password for the app user from the app user secret. The naming format of the secret is `my-group-app` for a PGD group named `my-group`. -You can get the username and password from the secret with the following commands: +You can get the username and password from the secret using the following commands: ```sh kubectl get secret my-group-app -o jsonpath='{.data.username}' | base64 --decode kubectl get secret my-group-app -o jsonpath='{.data.password}' | base64 --decode ``` -With this, you now have all the pieces for a connection string to the PGD group: +With this, you have all the pieces for a connection string to the PGD group: ```text postgresql://:@:5432/ diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx index add40911869..59a381222af 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/index.mdx @@ -29,7 +29,7 @@ directoryDefaults: displayBanner: Preview release v0.7.1 --- -EDB Postgres Distributed for Kubernetes (`pg4k-pgd`, or PG4K-PGD) is an +EDB Postgres Distributed for Kubernetes (`pg4k-pgd`) is an operator designed to manage EDB Postgres Distributed (PGD) workloads on Kubernetes, with traffic routed by PGD Proxy. @@ -48,12 +48,12 @@ To start working with EDB Postgres Distributed for Kubernetes, read the following in the PGD documentation: - [Terminology](/pgd/latest/terminology/) -- [PGD overview](https://www.enterprisedb.com/docs/pgd/latest/overview/) +- [PGD overview](/pgd/latest/overview/) - [Choosing your architecture](/pgd/latest/architectures/) - [Choosing a Postgres distribution](/pgd/latest/choosing_server/) For advanced usage and maximum customization, it's also important to be familiar with the -[EDB Postgres for Kubernetes (PG4K) documentation](/postgres_for_kubernetes/latest/), +[EDB Postgres for Kubernetes documentation](/postgres_for_kubernetes/latest/), as described in [Architecture](architecture.md#relationship-with-edb-postgres-for-kubernetes). 
## Supported Kubernetes distributions @@ -70,10 +70,10 @@ EDB Postgres Distributed for Kubernetes requires that the Kubernetes/OpenShift clusters hosting the distributed PGD cluster were prepared by you to cater for: - The public key infrastructure (PKI) encompassing all the Kubernetes clusters - the PGD Global Group is spread across. mTLS is required to authenticate + the PGD global group is spread across. mTLS is required to authenticate and authorize all nodes in the mesh topology and guarantee encrypted communication. - Networking infrastructure across all Kubernetes clusters involved in the - PGD Global Group to ensure that each node can communicate with each other + PGD global group to ensure that each node can communicate with each other EDB Postgres Distributed for Kubernetes also requires Cert Manager 1.10 or later. diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/openshift.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/openshift.mdx index 6f0aa58dbf8..4f1885c3d4a 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/openshift.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/openshift.mdx @@ -59,8 +59,8 @@ Hat OperatorHub directly from your OpenShift dashboard. 4. In the Operator Installation page, select: - - The installation mode. ([Cluster-wide](#cluster-wide-installation) is currently the - only mode.) + - The installation mode. [Cluster-wide](#cluster-wide-installation) is currently the + only mode. - The update channel (currently **preview**). - The approval strategy, following the availability on the marketplace of a new release of the operator, certified by Red Hat: @@ -81,12 +81,12 @@ From the web console, for **Installation mode**, select **All namespaces on the On installation, the operator is visible in all namespaces. In case there were problems during installation, check the logs in any pods in the -openshift-operators project on the Workloads > Pods page +`openshift-operators` project on the **Workloads > Pods** page as you would with any other OpenShift operator. !!! Important "Beware" By choosing the cluster-wide installation you, can't easily move to a - single project installation later. + single-project installation later. ## Creating a PGD cluster @@ -146,14 +146,14 @@ region-c 0 1 PGDGroup - Healthy To deploy PGD in multiple OpenShift clusters in multiple regions, you must first establish a way for the PGD groups to communicate with each other. The recommended way of achieving this with multiple OpenShift clusters is to use [Submariner](https://submariner.io/getting-started/quickstart/openshift/). Configuring the connectivity is outside the -scope of this document. However, once you've established connectivity between the OpenShift clusters you can deploy +scope of this documentation. However, once you've established connectivity between the OpenShift clusters, you can deploy PGD groups synced with one another. !!! Warning This example assumes you're deploying three PGD groups, one in each OpenShift cluster, and that you established connectivity between the OpenShift clusters using Submariner. -Similar to the [single cluster example](#using-pgd-in-a-single-openshift-cluster-in-a-single-region), this example creates +Similar to the [single-cluster example](#using-pgd-in-a-single-openshift-cluster-in-a-single-region), this example creates two data PGD groups and one witness group. In contrast to that example, each group lives in a different OpenShift cluster. 
@@ -170,7 +170,7 @@ This example uses a self-signed certificate that has a single certificate authority used for all certificates on all the OpenShift clusters. The example refers to the OpenShift clusters as `OpenShift Cluster A`, `OpenShift Cluster B`, and -`OpenShift Cluster C`. In OpenShift, an installation of the PG4K-PGD-Operator from OperatorHub includes an +`OpenShift Cluster C`. In OpenShift, an installation of the EDB Postgres Distributed for Kubernetes operator from OperatorHub includes an installation of the cert-manager operator. We recommend creating and managing certificates with cert-manager. 1. Create a namespace to hold `OpenShift Cluster A`, and in it also create the needed objects for a self-signed certificate. Assuming @@ -268,7 +268,7 @@ spec: ``` !!! Important - The format of the hostnames in the `discovery` section differs from the single cluster + The format of the hostnames in the `discovery` section differs from the single-cluster example. That's because Submariner is being used to connect the OpenShift clusters, and Submariner uses the `..svc.clusterset.local` domain to route traffic between the OpenShift clusters. `region-a-group` is the name of the service to be created for the PGD group named `region-a`. @@ -330,7 +330,7 @@ spec: oc apply -f region-b.yaml -n pgd-group ``` -1. Finally, you can switch context to `OpenShift Cluster C` and create the third PGD group. The YAML for the PGD +1. You can switch context to `OpenShift Cluster C` and create the third PGD group. The YAML for the PGD group is: ```yaml diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/private_registries.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/private_registries.mdx index 8abcb8c47c6..5c07733f271 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/private_registries.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/private_registries.mdx @@ -11,7 +11,7 @@ container image registries under `docker.enterprisedb.com`. Access to the private registries requires an account with EDB and is reserved for EDB customers with a valid [subscription plan](https://www.enterprisedb.com/products/plans-comparison#selfmanagedenterpriseplan). Credentials are run through your EDB account. - For trials, see the [Trials](#trials). + For trials, see [Trials](#trials). ## Which repository to choose? @@ -46,7 +46,7 @@ is an EDB Repos 2.0 section where a repo token appears obscured. Next to the repo token is a **Copy Token** button to copy the token and an eye icon for looking at the content of the token. -Use the repo token as the password when you log in to EDB +Use the repo token as the password when you log in to the EDB container registry. ### Example with `docker login` @@ -54,7 +54,7 @@ container registry. You can log in using Docker from your terminal. We suggest that you copy the repo token using **Copy Token**. The `docker` command prompts you for a username and a password. 
-The username is the repo you're trying to access +The username is the repo you're trying to access, and the password is the token you just copied: ```sh @@ -73,7 +73,7 @@ of the repository, and follow the instructions in ## Operand images EDB Postgres Distributed for Kubernetes is an operator that supports running -Postgres Distributed (PGD) version 5 on three PostgreSQL distributions: +EDB Postgres Distributed (PGD) version 5 on three PostgreSQL distributions: - PostgreSQL - EDB Postgres Advanced Server diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/quickstart.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/quickstart.mdx index 3ae3997ed27..bdd094e8917 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/quickstart.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/quickstart.mdx @@ -3,8 +3,8 @@ title: 'Quick start' originalFilePath: 'src/quickstart.md' --- -You cam test an EDB Postgres Distributed (PGD) cluster on your -laptop or computer using EDB Postgres Distributed for Kubernetes (PG4K-PGD) +You can test an EDB Postgres Distributed (PGD) cluster on your +laptop or computer using EDB Postgres Distributed for Kubernetes on a single local Kubernetes cluster built with [Kind](https://kind.sigs.k8s.io/). @@ -23,10 +23,10 @@ cluster on your local Kubernetes installation so you can experiment with it. ## Part 1 - Set up the local Kubernetes playground Install Kind, a tool for running local Kubernetes -clusters using Docker container nodes. (Kind stands for *Kubernetes IN Docker.*) +clusters using Docker container nodes. (Kind stands for Kubernetes IN Docker.) If you already have access to a Kubernetes cluster, you can skip to Part 2. -Install `kind` on your environment following the instructions in [Kind Quick Start](https://kind.sigs.k8s.io/docs/user/quick-start). +Install Kind on your environment following the instructions in [Kind Quick Start](https://kind.sigs.k8s.io/docs/user/quick-start). Then, create a Kubernetes cluster: ```sh @@ -50,7 +50,7 @@ As with any other deployment in Kubernetes, to deploy a PGD cluster you need to apply a configuration file that defines your desired `PGDGroup` resources that make up a PGD cluster. -Some sample files are included in the PG4K-PGD repository. The +Some sample files are included in the EDB Postgres Distributed for Kubernetes repository. The [flexible_3regions.yaml](../samples/flexible_3regions.yaml) manifest contains the definition of a PGD cluster with two data groups and a global witness node spread across three regions. Each data group consists of two data nodes @@ -66,15 +66,15 @@ You can deploy the `flexible-3-regions` example by saving it first and running: kubectl apply -f flexible_3regions.yaml ``` -You can check that the pods are being created with the `get pods` command: +You can check that the pods are being created using the `get pods` command: ```sh kubectl get pods ``` The pods are being created as part of PGD nodes. As described in the -[architecture document](architecture.md), they're implemented on top -of PG4K clusters. +[Architecture](architecture.md), they're implemented on top +of EDB Postgres for Kubernetes clusters. You can list the clusters then, which shows the PGD nodes: @@ -89,7 +89,7 @@ region-a-3 91s 1 1 Cluster in healthy state region-a-3-1 ``` Ultimately, the PGD nodes are created as part of the PGD groups -that make up our PGD cluster. +that make up your PGD cluster. 
```sh $ kubectl get pgdgroups diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/recovery.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/recovery.mdx index ca64c896a55..3a50eec8c7d 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/recovery.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/recovery.mdx @@ -3,10 +3,10 @@ title: 'Recovery' originalFilePath: 'src/recovery.md' --- -In EDB Postgres Distributed for Kubernetes (PG4K-PGD), recovery is available as a way +In EDB Postgres Distributed for Kubernetes, recovery is available as a way to bootstrap a new PGD group starting from an available physical backup of a PGD node. The recovery can't be performed in-place on an existing PGD group. -PG4K-PGD also supports point-in-time recovery, which allows you to restore a PGDGroup up to +EDB Postgres Distributed for Kubernetes also supports point-in-time recovery, which allows you to restore a PGDGroup up to any point in time, from the first available backup in your catalog to the last archived WAL. (Having a WAL archive is mandatory in this case.) @@ -19,7 +19,7 @@ Before recovering from a backup, take care to apply the following considerations - When recovering in a newly created namespace, remember to first set up a cert-manager CA issuer before deploying the recovered PGDGroup. -For more information, see [PG4K recovery - Additional considerations](/postgres_for_kubernetes/latest/bootstrap/#additional-considerations) in the EDB Postgres for Kubernetes documentation. +For more information, see [EDB Postgres for Kubernetes recovery - Additional considerations](/postgres_for_kubernetes/latest/bootstrap/#additional-considerations) in the EDB Postgres for Kubernetes documentation. ## Recovery from an object store @@ -166,4 +166,4 @@ spec: ## Recovery targets Beyond PITR are other recovery target criteria you can use. -For more information on all the available recovery targets, see [PG4K recovery targets](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/bootstrap/#point-in-time-recovery-pitr) in the EDB Postgres for Kubernetes documentation. \ No newline at end of file +For more information on all the available recovery targets, see [EDB Postgres for Kubernetes recovery targets](/postgres_for_kubernetes/latest/bootstrap/#point-in-time-recovery-pitr) in the EDB Postgres for Kubernetes documentation. \ No newline at end of file diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/security.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/security.mdx index ac238d63222..3dcc286cafa 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/security.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/security.mdx @@ -130,10 +130,10 @@ namespaced resources. To see all the permissions required by the operator, you can run `kubectl describe clusterrole pgd-operator-manager-role`. -PG4K-PGD internally manages the PGD nodes using the `Cluster` resource as defined by EDB Postgres -for Kubernetes (PG4K). See the -[EDB Postgres for Kubernetes documentation](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/security/) -for the list of permissions used by the PG4K operator service account. +EDB Postgres Distributed for Kubernetes internally manages the PGD nodes using the `Cluster` resource as defined by EDB Postgres +for Kubernetes. 
See the +[EDB Postgres for Kubernetes documentation](/postgres_for_kubernetes/latest/security/) +for the list of permissions used by the EDB Postgres for Kubernetes operator service account. ### Calls to the API server made by the instance manager diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/ssl_connections.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/ssl_connections.mdx index 43cfca1963b..bd38e5b5d94 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/ssl_connections.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/ssl_connections.mdx @@ -9,7 +9,7 @@ originalFilePath: 'src/ssl_connections.md' The EDB Postgres Distributed for Kubernetes operator was designed to work with TLS/SSL for both encryption in transit and authentication on server and client sides. PGD nodes are created as cluster -resources using the EDB Postgres for Kubernetes (PG4K) operator. This +resources using the EDB Postgres for Kubernetes operator. This includes deploying a certification authority (CA) to create and sign TLS client certificates. From 1ff1f08a30c11d4aefee4e7cf0ade453b9734098 Mon Sep 17 00:00:00 2001 From: Betsy Gitelman <93718720+ebgitelman@users.noreply.github.com> Date: Tue, 26 Sep 2023 14:31:37 -0400 Subject: [PATCH 8/8] rest of edits to EDB PGD for Kubernetes --- .../1/connectivity.mdx | 4 +-- .../1/recovery.mdx | 26 +++++++++--------- .../1/samples.mdx | 2 +- .../1/security.mdx | 27 +++++++++---------- .../1/ssl_connections.mdx | 2 +- .../1/use_cases.mdx | 10 +++---- .../1/using_pgd.mdx | 16 +++++------ 7 files changed, 43 insertions(+), 44 deletions(-) diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx index a5c81e25a7f..0090c42296c 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/connectivity.mdx @@ -140,7 +140,7 @@ using cert-manager (the recommended approach) or pre-provisioned using secrets. #### Dynamic provisioning via cert-manager The client certificates configuration is managed by the `.spec.connectivity.tls.clientCert.certManager` -section of the PGDGroup custom resource. +section of the `PGDGroup` custom resource. The following assumptions were made for this section to work: - An issuer `.spec.connectivity.tls.clientCert.certManager.issuerRef` is available @@ -171,7 +171,7 @@ service that routes the connection to the write leader of the PGD group. When connecting from inside the cluster, you can use the proxy service name to connect to the PGD group. The proxy service name is composed of the PGD group name and the optional -host suffix defined in the `.spec.connectivity.dns` section of the PGDGroup custom resource. +host suffix defined in the `.spec.connectivity.dns` section of the `PGDGroup` custom resource. For example, if the PGD group name is `my-group`, and the host suffix is `.my-domain.com`, the proxy service name is `my-group.my-domain.com`. 
diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/recovery.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/recovery.mdx index 3a50eec8c7d..e2b8444a5de 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/recovery.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/recovery.mdx @@ -5,19 +5,19 @@ originalFilePath: 'src/recovery.md' In EDB Postgres Distributed for Kubernetes, recovery is available as a way to bootstrap a new PGD group starting from an available physical backup of a PGD node. -The recovery can't be performed in-place on an existing PGD group. -EDB Postgres Distributed for Kubernetes also supports point-in-time recovery, which allows you to restore a PGDGroup up to +The recovery can't be performed in place on an existing PGD group. +EDB Postgres Distributed for Kubernetes also supports point-in-time recovery (PITR), which allows you to restore a PGD group up to any point in time, from the first available backup in your catalog to the last archived -WAL. (Having a WAL archive is mandatory in this case.) +WAL. Having a WAL archive is mandatory in this case. ## Prerequisite -Before recovering from a backup, take care to apply the following considerations: +Before recovering from a backup: - Make sure that the PostgreSQL configuration (`.spec.cnp.postgresql.parameters`) of the - recovered cluster is compatible with the original one from a physical replication standpoint, . + recovered cluster is compatible with the original one from a physical replication standpoint. -- When recovering in a newly created namespace, remember to first set up a cert-manager CA issuer before deploying the recovered PGDGroup. +- When recovering in a newly created namespace, first set up a cert-manager CA issuer before deploying the recovered PGD group. For more information, see [EDB Postgres for Kubernetes recovery - Additional considerations](/postgres_for_kubernetes/latest/bootstrap/#additional-considerations) in the EDB Postgres for Kubernetes documentation. @@ -25,7 +25,7 @@ For more information, see [EDB Postgres for Kubernetes recovery - Additional con You can recover from a PGD node backup created by Barman Cloud and stored on supported object storage. -For example, given a PGDGroup named `pgdgroup-example` with three instances with backups available, your object storage contains a directory for each node: +For example, given a PGD group` named `pgdgroup-example` with three instances with backups available, your object storage contains a directory for each node: `pgdgroup-example-1`, `pgdgroup-example-2`, `pgdgroup-example-3` @@ -62,8 +62,8 @@ spec: !!! Important Make sure to correctly configure the WAL section according to the source cluster. - In the example, since the `pgdgroup-example` PGDGroup uses `compression` - and `encryption`, make sure to set the proper parameters also in the PGDGroup + In the example, since the `pgdgroup-example` PGD group uses `compression` + and `encryption`, make sure to set the proper parameters also in the PGD group that's being created by the `restore`. !!! Note @@ -73,12 +73,12 @@ spec: for this scenario and tune the value of this parameter for your environment. It makes a difference when you need it. -## Point-in-time recovery (PITR) from an object store +## PITR from an object store Instead of replaying all the WALs up to the latest one, after extracting a base backup, you can ask PostgreSQL to stop replaying -WALs at any given point in time. +WALs at any point in time. 
PostgreSQL uses this technique to achieve PITR. -The presence of a WAL archive is mandatory. +(The presence of a WAL archive is mandatory.) This example defines a time-base target for the recovery: @@ -123,7 +123,7 @@ The `.spec.restore.recoveryTarget.backupID` option allows you to specify a base which to start the recovery process. By default, this value is empty. If you assign a value to it, the operator uses that backup as the base for the recovery. The value must be in the form of a Barman backup ID. -This example recovers a new PGDGroup from a specific backupID of the +This example recovers a new PGD group from a specific backupID of the `pgdgroup-backup-1` PGD node: ```yaml diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/samples.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/samples.mdx index ff5735e6093..7e54016add4 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/samples.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/samples.mdx @@ -4,7 +4,7 @@ originalFilePath: 'src/samples.md' --- !!! Important - The available dxamples are for demonstration and + The available examples are for demonstration and experimentation purposes only. These examples are configuration files for setting up diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/security.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/security.mdx index 3dcc286cafa..c7a69c946d3 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/security.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/security.mdx @@ -14,7 +14,7 @@ analyzed at three layers: code, container, and cluster. !!! Seealso "About the 4C's Security Model" See [The 4C's Security Model in Kubernetes](https://www.enterprisedb.com/blog/4cs-security-model-kubernetes) - blog article to get a better understanding and context of the approach EDB + blog article for a better understanding and context of the approach EDB takes with security in EDB Postgres Distributed for Kubernetes. ## Code @@ -25,8 +25,7 @@ including security problems. EDB uses a popular open-source linter for Go called GolangCI-Lint can run several linters on the same source code. One of these is [Golang Security Checker](https://github.com/securego/gosec), or `gosec`. -`gosec` is a linter that scans the abstract syntactic tree of the source against a set of rules aimed at -the discovery of well-known vulnerabilities, threats, and weaknesses hidden in +`gosec` is a linter that scans the abstract syntactic tree of the source against a set of rules aimed at discovering well-known vulnerabilities, threats, and weaknesses hidden in the code. These threads include hard-coded credentials, integer overflows, SQL injections, and others. !!! Important @@ -36,7 +35,7 @@ the code. These threads include hard-coded credentials, integer overflows, SQL i ## Container -Every container image that's part of EDB Postgres Distributed for Kubernetes is automatically built by way of CI/CD pipelines following every commit. +Every container image that's part of EDB Postgres Distributed for Kubernetes is built by way of CI/CD pipelines following every commit. Such images include not only those of the operator but also of the operands, specifically every supported PostgreSQL version. 
In the pipelines, images are scanned with: @@ -65,16 +64,16 @@ The following guidelines and frameworks were taken into account for container-le ## Cluster Security at the cluster level takes into account all Kubernetes components that -form both the control plane and the nodes, as well as the applications that run in -the cluster (PostgreSQL included). +form both the control plane and the nodes as well as the applications that run in +the cluster, including PostgreSQL. -### Role Based Access Control (RBAC) +### Role-based access control (RBAC) The operator interacts with the Kubernetes API server with a dedicated service account called pgd-operator-controller-manager. In Kubernetes this account is installed by default in the `pgd-operator-system` namespace. A cluster role binds between this service account and the pgd-operator-controller-manager -cluster role that defines the set of rules/resources/verbs granted to the operator. +cluster role that defines the set of rules, resources, and verbs granted to the operator. RedHat OpenShift directly manages the operator RBAC entities by way of [Operator Lifecycle @@ -105,12 +104,12 @@ namespaced resources. `secrets` : Unless you provide certificates and passwords to your data nodes, the operator adopts the "convention over configuration" paradigm by - self-provisioning random-generated passwords and TLS certificates, and by + self-provisioning random-generated passwords and TLS certificates and by storing them in secrets. `serviceaccounts` : The operator needs to create a service account to - enable the PGDGroup recovery job to retrieve the backup objects from + enable the `PGDGroup` recovery job to retrieve the backup objects from the object store where they reside. `services` @@ -164,8 +163,8 @@ EDB Postgres Distributed for Kubernetes doesn't require privileged mode for cont The PostgreSQL containers run as the postgres system user. No component requires running as root. Likewise, volumes access doesn't require privileged mode or root privileges. -Proper ermissions must be assigned by the Kubernetes platform or administrators. -The PostgreSQL containers run with a read-only root filesystem (that is, no writable layer). +Proper permissions must be assigned by the Kubernetes platform or administrators. +The PostgreSQL containers run with a read-only root filesystem, that is, no writable layer. The operator explicitly sets the required security contexts. @@ -180,7 +179,7 @@ and SELinux context. article. !!! Warning "Security context constraints and namespaces" - As stated by [Openshift documentation](https://docs.openshift.com/container-platform/latest/authentication/managing-security-context-constraints.html#role-based-access-to-ssc_configuring-internal-oauth), + As stated in the [Openshift documentation](https://docs.openshift.com/container-platform/latest/authentication/managing-security-context-constraints.html#role-based-access-to-ssc_configuring-internal-oauth), SCCs aren't applied in the default namespaces (`default`, `kube-system`, `kube-public`, `openshift-node`, `openshift-infra`, `openshift`). Don't use them to run pods. CNP clusters deployed in those namespaces @@ -226,7 +225,7 @@ The current implementation of EDB Postgres Distributed for Kubernetes creates passwords for the postgres superuser and the database owner. 
As far as encryption of passwords is concerned, EDB Postgres Distributed for Kubernetes follows -the default behavior of PostgreSQL: starting from PostgreSQL 14, +the default behavior of PostgreSQL: starting with PostgreSQL 14, `password_encryption` is by default set to `scram-sha-256`. On earlier versions, it's set to `md5`. diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/ssl_connections.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/ssl_connections.mdx index bd38e5b5d94..47d71bcf954 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/ssl_connections.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/ssl_connections.mdx @@ -1,5 +1,5 @@ --- -title: 'Client TLS/SSL Connections' +title: 'Client TLS/SSL connections' originalFilePath: 'src/ssl_connections.md' --- diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/use_cases.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/use_cases.mdx index 5ff7f57ea32..831bcd9cf1d 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/use_cases.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/use_cases.mdx @@ -13,7 +13,7 @@ at the same time and need to run in a traditional environment such as a VM. The following is a summary of the basic considerations. See the -[EDB Postgres for Kubernetes documentation](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/use_cases/) +[EDB Postgres for Kubernetes documentation](/postgres_for_kubernetes/latest/use_cases/) for more detail. ## Case 1: Applications inside Kubernetes @@ -24,7 +24,7 @@ namespace inside a Kubernetes cluster. ![Application and Database inside Kubernetes](./images/apps-in-k8s.png) The application, normally stateless, is managed as a standard deployment, -with multiple replicas spread over different Kubernetes node, and internally +with multiple replicas spread over different Kubernetes nodes and internally exposed through a ClusterIP service. The service is exposed externally to the end user through an Ingress and the @@ -32,12 +32,12 @@ provider's load balancer facility by way of HTTPS. ## Case 2: Applications outside Kubernetes -Another possible use case is to manage your Postgres Distributed database inside +Another possible use case is to manage your PGD database inside Kubernetes while having your applications outside of it, for example, in a -virtualized environment. In this case, Postgres Distributed is represented by an IP +virtualized environment. In this case, PGD is represented by an IP address or host name and a TCP port, corresponding to the defined Ingress resource in Kubernetes. -The application can still benefit from a TLS connection to Postgres Distributed. +The application can still benefit from a TLS connection to PGD. 
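For example — purely as an illustration, since the externally reachable host name, the credentials, and the CA bundle all depend on how the Ingress or load balancer and the cluster certificates are set up in your environment — such an external client could turn on TLS verification through standard libpq options:

```sh
# Hypothetical external connection; replace host, user, and CA path with
# the values exposed by your own Ingress/load balancer and PKI.
psql "host=pgd.example.com port=5432 dbname=app user=app sslmode=verify-ca sslrootcert=ca.crt"
```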
![Application outside Kubernetes](./images/apps-outside-k8s.png) \ No newline at end of file diff --git a/product_docs/docs/postgres_distributed_for_kubernetes/1/using_pgd.mdx b/product_docs/docs/postgres_distributed_for_kubernetes/1/using_pgd.mdx index e3310accd59..4c971e5752a 100644 --- a/product_docs/docs/postgres_distributed_for_kubernetes/1/using_pgd.mdx +++ b/product_docs/docs/postgres_distributed_for_kubernetes/1/using_pgd.mdx @@ -1,11 +1,11 @@ --- -title: 'Managing EDB Postgres Distributed databases' +title: 'Managing EDB Postgres Distributed (PGD) databases' originalFilePath: 'src/using_pgd.md' --- As described in the [architecture document](architecture.md), EDB Postgres Distributed for Kubernetes is an operator created to deploy -Postgres Distributed (PGD) databases. +PGD databases. It provides an alternative over deployment with TPA, and by leveraging the Kubernetes ecosystem, it can offer self-healing and declarative control. The operator is also responsible of the backup and restore operations. @@ -13,7 +13,7 @@ See [Backup](backup.md). However, many of the operations and control of PGD clusters aren't managed by the operator. -The pods created by EDB Postgres Distributed for Kubernetes come with +The pods created by EDB Postgres Distributed for Kubernetes come with the [PGD CLI](https://www.enterprisedb.com/docs/pgd/latest/cli/) installed. You can use this tool, for example, to execute a switchover. @@ -41,7 +41,7 @@ location-a-proxy-0 1/1 Running 0 2h location-a-proxy-1 1/1 Running 0 2h ``` -The proxy nodes have `proxy` in the name. Choose one and get a command +The proxy nodes have `proxy` in the name. Choose one, and get a command prompt in it: ```shell @@ -91,19 +91,19 @@ location-a-3 1403922770 location-a data ACTIVE ACTIVE Up 3 ## Accessing the database -In the [use cases](use_cases.md), you can find a discussion on using the +In [Use cases](use_cases.md) is a discussion on using the database within the Kubernetes cluster versus from outside. In [Connectivity](connectivity.md), you can find a discussion on services, which is relevant for accessing the database from applications. However you implement your system, your applications must use the proxy -service to connect to reap the benefits of Postgres Distributed and +service to connect to reap the benefits of PGD and of the increased self-healing capabilities added by the EDB Postgres Distributed for Kubernetes operator. !!! Important As per the EDB Postgres for Kubernetes defaults, data nodes are - created with a database called `app`, owned by a user named `app`, in + created with a database called `app` and owned by a user named `app`, in contrast to the `bdrdb` database described in the EDB Postgres Distributed documentation. You can configure these values in the `cnp` section of the manifest. @@ -121,7 +121,7 @@ kubectl exec -n my-namespace -ti location-a-1-1 -- psql ``` In the familiar territory of psql, remember that the default -created database is named `app` (see warning above). +created database is named `app` (see previous warning). ```terminal postgres=# \c app