Merge pull request #3391 from EnterpriseDB/release/2022-11-29

Release: 2022-11-29
EnterpriseDB · github-actions · Nov 29, 2022 · bf30eb2
2 parents 1acb1dd + 559794e
commit bf30eb2

Showing 21 changed files with 1,223 additions and 3 deletions.
diff --git a/advocacy_docs/pg_extensions/advanced_storage_pack/configuring.mdx b/advocacy_docs/pg_extensions/advanced_storage_pack/configuring.mdx
@@ -0,0 +1,22 @@
+---
+title: Configuring Advanced Storage Pack
+navTitle: Configuring
+---
+
+Place the extension module implementing the custom TAM in `shared_preload_libraries` so that it loads early during Postgres startup. This step ensures that the extension is available before the first access to a table based on the given TAM. For example, set the parameter in `postgresql.conf`, replacing `<extension_name>` with `autocluster` or `refdata`:
+
+```ini
+shared_preload_libraries = '$libdir/<extension_name>'
+```
+
+After restarting the server, execute the SQL command to create the extension. This command creates the extension only in the database you are connected to, so you must rerun it in each database where the extension is used:
+
+```sql
+CREATE EXTENSION <extension_name>;
+```
+
+In databases where the extension has been created, you can create tables that use the TAM the extension provides:
+
+```sql
+CREATE TABLE mytable USING <extension_name>;
+```
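+
+To confirm that each step took effect, you can query the standard PostgreSQL catalogs. The following is a minimal sketch rather than an official procedure. It assumes the table is named `mytable`, as in the previous example, and a Postgres version that supports table access methods:
+
+```sql
+-- Check that the library is listed in the preload setting.
+SHOW shared_preload_libraries;
+
+-- Check that the extension exists in the current database.
+SELECT extname, extversion FROM pg_extension;
+
+-- List the table access methods registered in this database ('t' = table).
+SELECT amname FROM pg_am WHERE amtype = 't';
+
+-- Check which access method a given table uses.
+SELECT c.relname, a.amname
+  FROM pg_class c
+  JOIN pg_am a ON a.oid = c.relam
+ WHERE c.relname = 'mytable';
+```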
diff --git a/advocacy_docs/pg_extensions/advanced_storage_pack/index.mdx b/advocacy_docs/pg_extensions/advanced_storage_pack/index.mdx
@@ -0,0 +1,37 @@
+---
+title: EDB Advanced Storage Pack
+navigation:
+- rel_notes
+- installing
+- configuring
+---
+
+EDB Advanced Storage Pack provides advanced storage options for PostgreSQL databases in the form of Table Access Method (TAM) extensions. These storage options can enhance the performance and reliability of databases without requiring application changes.
+
+For tables whose access patterns are known in advance, a targeted TAM that makes different trade-offs may be preferable. For instance, if a given table in an application is INSERT-only and the rows never receive any updates, using a specialized TAM for this table that has INSERT-specific optimizations could be considered.
+
+EnterpriseDB offers two TAMs in the Advanced Storage Pack:
+
+## Autocluster 
+
+The Autocluster TAM provides faster access to clustered data by keeping track of the last inserted row for any value in a side-table. New rows can then be added to the same data blocks as previous rows, keeping the data clustered, which reduces access time to related data. This feature is achieved by maintaining rows with the same key values clustered together so that an index scan for a specific key can find all the rows close together and doesn't need to retrieve as many table pages to satisfy the query.
+
+## Refdata
+
+The Refdata TAM is optimized for mostly-static data, which contains occasional INSERTs and very few DELETEs and UPDATEs. For database schemas that utilize foreign keys to reference data, this TAM can provide performance gains of 5-10% and increased scalability. This feature is achieved by taking an exclusive lock on the reference table whenever it is modified, blocking out concurrent modifications by any other session as well as modifications to tables which reference the table. For example:
+
+```sql
+CREATE TABLE department (
+	department_id	SERIAL PRIMARY KEY,
+	department_name	TEXT
+) USING refdata;
+
+CREATE TABLE employee (
+	...
+	department_id	INTEGER NOT NULL REFERENCES department(department_id)
+);
+```
+
+The `employee` table is just a standard heap table; only the `department` table uses the `refdata` TAM. Inserts and updates of the employee table don't take out row level locks on the department table, thereby saving query time, avoiding the need to update the rows in the department table, and avoiding the need to write out the referred-to department table rows to disk and to the write ahead log.
+
+If updates to the `department` table are frequent, using the Refdata TAM isn't advisable, because concurrent modifications to it and to the `employee` table are then blocked. If changes to the `department` table are infrequent, the faster changes to the `employee` table and the reduced write-ahead log traffic may well be worth this cost.
diff --git a/advocacy_docs/pg_extensions/advanced_storage_pack/installing.mdx b/advocacy_docs/pg_extensions/advanced_storage_pack/installing.mdx
@@ -0,0 +1,79 @@
+---
+title: Installing Advanced Storage Pack
+navTitle: Installing
+---
+
+The Advanced Storage Pack is supported on the same platforms as the Postgres distribution you are using. Support for Advanced Storage Pack starts with Postgres 12, the first release that provides the table access method interface. For details, see:
+- [EDB Postgres Advanced Server Product Compatibility](https://www.enterprisedb.com/platform-compatibility#epas)
+
+- [PostgreSQL Product Compatibility](https://www.enterprisedb.com/resources/platform-compatibility#pg)
+
+- [EDB Postgres Distributed (includes EDB Postgres Extended)](https://www.enterprisedb.com/resources/platform-compatibility#bdr)
+
+## Prerequisites
+
+Before you begin the installation process:
+
+- Install Postgres
+  - [Installing EDB Postgres Advanced Server](/epas/latest/epas_inst_linux/installing_epas_using_edb_repository/)
+
+  - [Installing PostgreSQL](https://www.postgresql.org/download/) 
+
+  - [Installing EDB Postgres Distributed (includes EDB Postgres Extended)](https://www.enterprisedb.com/docs/pgd/latest/deployments/tpaexec/)
+
+- Set up the repository
+
+  Setting up the repository is a one-time task. If you have already set up your repository, you do not need to perform this step.
+
+  To set up the repository, go to [EDB repositories](https://www.enterprisedb.com/repos-downloads) and follow the instructions provided there.
+
+
+## Install the package
+
+The syntax for the RPM package install command is:
+
+```shell
+sudo <package-manager> -y install edb-<postgres><postgres_version>-advanced-storage-pack<major_version>-<full_version>
+```
+
+And the syntax for the Debian package install command is:
+
+```shell
+sudo <package-manager> -y install edb-<postgres><postgres_version>-advanced-storage-pack-<major_version>-<full_version>
+```
+
+where: 
+- `<package-manager>` is the package manager used with your operating system:
+
+  | Package manager |             Operating system     |
+  | --------------- | -------------------------------- |
+  | dnf             | RHEL 8 and derivatives           |
+  | yum             | RHEL 7 and derivatives, CentOS 7 |
+  | zypper          | SLES                             |
+  | apt-get         | Debian and derivatives           |
+
+- `<postgres>` is the distribution of Postgres you are using:
+
+  |    Postgres distribution     |   Value    |
+  | ---------------------------- | ---------- |
+  | PostgreSQL                   | pg         |
+  | EDB Postgres Advanced Server | as         |
+  | EDB Postgres Extended        | pgextended |
+
+- `<postgres_version>` is the version of Postgres you are using.
+
+- `<major_version>` is the major version of the extension you are installing. 
+
+- `<full_version>` is the full version of the extension you are installing.
+
+For example, to install Advanced Storage Pack 1.0.0 for EDB Postgres Advanced Server 14 on a RHEL 8 platform:
+
+```shell
+sudo dnf -y install edb-as14-advanced-storage-pack1-1.0.0
+```
+
+And to install Advanced Storage Pack 1.0.0 for EDB Postgres Advanced Server 14 on a Debian 11 platform:
+
+```shell
+sudo apt-get -y install edb-as14-advanced-storage-pack-1-1.0.0
+```
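+
+After the package is installed, one way to check that the server can see the extension files is to query the standard `pg_available_extensions` view. This is a sketch that assumes the extension names are `autocluster` and `refdata`, as used in the configuration instructions:
+
+```sql
+-- installed_version is NULL until CREATE EXTENSION is run in this database.
+SELECT name, default_version, installed_version
+  FROM pg_available_extensions
+ WHERE name IN ('autocluster', 'refdata');
+```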
diff --git a/...cacy_docs/pg_extensions/advanced_storage_pack/rel_notes/asp_1.0.0_rel_notes.mdx b/...cacy_docs/pg_extensions/advanced_storage_pack/rel_notes/asp_1.0.0_rel_notes.mdx
@@ -0,0 +1,10 @@
+---
+title: Release notes for Advanced Storage Pack version 1.0.0
+navTitle: "Version 1.0.0"
+---
+
+This release of Advanced Storage Pack includes:
+
+|  Type   |                                Description                                 |
+| ------- | -------------------------------------------------------------------------- |
+| Feature | This is the initial release and includes the Refdata and Autocluster TAMs. |
diff --git a/advocacy_docs/pg_extensions/advanced_storage_pack/rel_notes/index.mdx b/advocacy_docs/pg_extensions/advanced_storage_pack/rel_notes/index.mdx
@@ -0,0 +1,18 @@
+---
+title: Advanced Storage Pack release notes
+navTitle: "Release notes"
+indexCards: none
+---
+The Advanced Storage Pack documentation describes the latest version of Advanced Storage Pack,
+including minor releases and patches. These release notes
+cover what was new in each release. For new functionality introduced
+in a minor or patch release, there are also indicators in the content
+about the release that introduced the feature.
+
+|           Version           | Release Date |
+| --------------------------- | ------------ |
+| [1.0.0](asp_1.0.0_rel_notes) | 2022 Nov 30  |
+
+
+
+
diff --git a/advocacy_docs/pg_extensions/advanced_storage_pack/using.mdx b/advocacy_docs/pg_extensions/advanced_storage_pack/using.mdx
@@ -0,0 +1,201 @@
+---
+title: Using EDB Advanced Storage Pack
+navTitle: Using
+---
+
+The following are scenarios where the EDB Advanced Storage Pack TAMs are useful.
+
+## Refdata example
+
+A scenario where Refdata is useful is a reference table of all the New York Stock Exchange (NYSE) stock symbols and their corporate names. This data is expected to change very rarely and to be referenced frequently from a table tracking all stock trades for the entire market (as in the [Advanced Autocluster example](#advanced-autocluster-example)), so Refdata can be used instead of heap to increase performance.
+
+```sql
+CREATE SEQUENCE nyse_symbol_id_seq;
+CREATE TABLE nyse_symbol (
+	nyse_symbol_id INTEGER NOT NULL PRIMARY KEY DEFAULT NEXTVAL('nyse_symbol_id_seq'),
+	symbol TEXT NOT NULL,
+	name TEXT NOT NULL
+) USING refdata;
+```
+## Autocluster example
+
+Autocluster is useful for Internet of Things (IoT) data, which is usually append-only and inserted as many rows that relate to each other. With heap instead of Autocluster, Postgres can't cluster these related rows together, so accessing the set of rows touches many data blocks, which can be very slow and I/O heavy.
+
+The following example is for a network of IoT thermostats that report a house's temperature and temperature setting every 60 seconds:
+
+```sql
+CREATE TABLE iot (
+    thermostat_id        bigint,
+    recordtime           timestamp,
+    measured_temperature float4,
+    temperature_setting  float4
+) USING autocluster;
+```
+
+Using Autocluster, rows with the same `thermostat_id` are clustered together and are easier to access:
+
+```sql
+CREATE INDEX ON iot USING btree(thermostat_id);
+SELECT autocluster.autocluster(
+    rel := 'iot'::regclass,
+    cols := '{1}',
+    max_objects := 10000
+);
+```
+
+!!! Note
+The `cols` parameter should match the number of columns specified in `USING btree()`. In this case, only `thermostat_id` is listed so the value is `{1}`. 
+!!!
+
+Populate the table with the `thermostat_id` and `recordtime` data:
+
+```sql
+-- recordtime is a timestamp column, so each value is a quoted date and time.
+-- The date used here is illustrative.
+INSERT INTO iot (thermostat_id, recordtime) VALUES (456, '2022-11-18 12:01');
+INSERT INTO iot (thermostat_id, recordtime) VALUES (8945, '2022-11-18 04:55');
+INSERT INTO iot (thermostat_id, recordtime) VALUES (456, '2022-11-18 15:32');
+INSERT INTO iot (thermostat_id, recordtime) VALUES (6785, '2022-11-18 01:36');
+INSERT INTO iot (thermostat_id, recordtime) VALUES (456, '2022-11-18 19:25');
+INSERT INTO iot (thermostat_id, recordtime) VALUES (5678, '2022-11-18 03:44');
+```
+
+When you select the data from the `iot` table, you can see from the ctid location that the rows with the same `thermostat_id` were clustered together:
+
+```sql
+SELECT ctid, thermostat_id, recordtime FROM iot;
+__OUTPUT__
+ ctid  | thermostat_id |     recordtime
+-------+---------------+---------------------
+ (0,1) |           456 | 2022-11-18 12:01:00
+ (2,2) |          8945 | 2022-11-18 04:55:00
+ (0,2) |           456 | 2022-11-18 15:32:00
+ (3,2) |          6785 | 2022-11-18 01:36:00
+ (0,3) |           456 | 2022-11-18 19:25:00
+ (2,5) |          5678 | 2022-11-18 03:44:00
+(6 rows)
+```
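+
+One way to see the effect of clustering is to count how many distinct data blocks hold the rows for each thermostat. This is a sketch that uses only standard PostgreSQL features. The cast of `ctid` through `text` to `point` is a common idiom for extracting the block number, not something specific to Autocluster:
+
+```sql
+-- The block number is the first component of ctid, written as (block,offset).
+SELECT thermostat_id,
+       count(*) AS row_count,
+       count(DISTINCT (ctid::text::point)[0]::bigint) AS blocks_used
+  FROM iot
+ GROUP BY thermostat_id
+ ORDER BY thermostat_id;
+
+-- Show how many buffers a lookup for one thermostat touches.
+EXPLAIN (ANALYZE, BUFFERS)
+SELECT * FROM iot WHERE thermostat_id = 456;
+```
+
+A lower `blocks_used` value for a given `row_count` suggests that the rows for that key are stored closer together.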
+
+## Advanced Autocluster example
+
+This example is a more advanced use of Autocluster than the previous one. It references the NYSE table from the [Refdata example](#refdata-example) and clusters the rows by stock symbol, making it easier to find the latest trades for a stock.
+
+Start with the NYSE table from the Refdata example: 
+
+```sql
+CREATE SEQUENCE nyse_symbol_id_seq;
+CREATE TABLE nyse_symbol (
+	nyse_symbol_id INTEGER NOT NULL PRIMARY KEY DEFAULT NEXTVAL('nyse_symbol_id_seq'),
+	symbol TEXT NOT NULL,
+	name TEXT NOT NULL
+) USING refdata;
+```
+
+Then, create a frequently updated table containing NYSE trades that references the mostly static stock symbols in the Refdata table. Cluster the rows on the stock symbol to make it easier to look up the most recent trades for a given stock:
+
+```sql
+CREATE TABLE nyse_trade (
+	nyse_symbol_id INTEGER NOT NULL REFERENCES nyse_symbol(nyse_symbol_id),
+	trade_time TIMESTAMP NOT NULL DEFAULT NOW(),
+	trade_price	FLOAT8 NOT NULL CHECK(trade_price >= 0.0),
+	trade_volume BIGINT NOT NULL CHECK(trade_volume >= 1)
+) USING autocluster;
+CREATE INDEX ON nyse_trade USING BTREE(nyse_symbol_id);
+SELECT autocluster.autocluster(
+	rel := 'nyse_trade'::regclass,
+	cols := '{1}',
+	max_objects := 3000
+);
+__OUTPUT__
+ autocluster 
+-------------
+
+(1 row)
+```
+
+Create a view to facilitate inserting by symbol name rather than by ID:
+
+```sql
+CREATE VIEW nyse_trade_symbol AS
+	SELECT ns.symbol, nt.trade_time, nt.trade_price, nt.trade_volume
+		FROM nyse_symbol ns
+		JOIN nyse_trade nt
+		  ON ns.nyse_symbol_id = nt.nyse_symbol_id;
+CREATE RULE stock_insert AS ON INSERT TO nyse_trade_symbol
+	DO INSTEAD INSERT INTO nyse_trade
+		(SELECT ns.nyse_symbol_id, NEW.trade_time, NEW.trade_price, NEW.trade_volume
+			FROM nyse_symbol ns
+			WHERE ns.symbol = NEW.symbol
+		);
+```
+
+For more information on creating a view, see the [PostgreSQL documentation](https://www.postgresql.org/docs/current/sql-createview.html).
+
+Pre-populate the static data (shortened for brevity): 
+
+```sql
+INSERT INTO nyse_symbol (symbol, name) VALUES
+	('A', 'Agilent Technologies'),
+	('AA', 'Alcoa Corp'),
+	('AAC', 'Ares Acquisition Corp Cl A'),
+	('AAIC', 'Arlington Asset Investment Corp'),
+	('AAIN', 'Arlington Asset Investment Corp 6.000%'),
+	('AAN', 'Aarons Holdings Company'),
+	('AAP', 'Advance Auto Parts Inc'),
+	('AAQC', 'Accelerate Acquisition Corp Cl A'),
+	('ZTR', 'Virtus Total Return Fund Inc'),
+	('ZTS', 'Zoetis Inc Cl A'),
+	('ZUO', 'Zuora Inc'),
+	('ZVIA', 'Zevia Pbc Cl A'),
+	('ZWS', 'Zurn Elkay Water Solutions Corp'),
+	('ZYME', 'Zymeworks Inc');
+ANALYZE nyse_symbol;
+```
+
+Insert stock trades over a given time range on Friday, November 18, 2022 (shortened for brevity):
+
+```sql 
+\timing
+INSERT INTO nyse_trade_symbol VALUES ('NSC', 'Fri Nov 18 09:51:32 2022', 248.100000, 98778);
+Time: 32.349 ms
+INSERT INTO nyse_trade_symbol VALUES ('BOE', 'Fri Nov 18 09:51:32 2022', 9.640000, 72973);
+Time: 1.055 ms
+INSERT INTO nyse_trade_symbol VALUES ('LOMA', 'Fri Nov 18 09:51:32 2022', 6.180000, 41632);
+Time: 0.927 ms
+INSERT INTO nyse_trade_symbol VALUES ('LXP', 'Fri Nov 18 09:51:32 2022', 10.670000, 85768);
+Time: 0.941 ms
+INSERT INTO nyse_trade_symbol VALUES ('ABBV', 'Fri Nov 18 09:51:32 2022', 155.000000, 46842);
+Time: 0.916 ms
+INSERT INTO nyse_trade_symbol VALUES ('AGD', 'Fri Nov 18 09:51:32 2022', 9.360000, 90684);
+Time: 0.669 ms
+INSERT INTO nyse_trade_symbol VALUES ('PAGS', 'Fri Nov 18 11:14:31 2022', 12.985270, 34734);
+Time: 0.849 ms
+INSERT INTO nyse_trade_symbol VALUES ('KTF', 'Fri Nov 18 11:14:31 2022', 8.435753, 73719);
+Time: 0.679 ms
+INSERT INTO nyse_trade_symbol VALUES ('AES', 'Fri Nov 18 11:14:31 2022', 28.072732, 549);
+Time: 0.667 ms
+INSERT INTO nyse_trade_symbol VALUES ('LIN', 'Fri Nov 18 11:14:31 2022', 334.617829, 39838);
+Time: 0.665 ms
+INSERT INTO nyse_trade_symbol VALUES ('DTB', 'Fri Nov 18 11:14:31 2022', 18.679245, 55863);
+Time: 0.680 ms
+ANALYZE nyse_trade;
+Time: 73.832 ms
+```
+
+Select the ctid from the data for a given stock symbol to see in the output how it has been clustered together:
+
+```sql
+SELECT ctid, * FROM nyse_trade WHERE nyse_symbol_id = 1000 ORDER BY trade_time DESC LIMIT 10;
+__OUTPUT__
+   ctid    | nyse_symbol_id |        trade_time        | trade_price | trade_volume 
+-----------+----------------+--------------------------+-------------+--------------
+ (729,71)  |           1000 | Fri Nov 18 11:13:51 2022 |   11.265938 |        72662
+ (729,22)  |           1000 | Fri Nov 18 11:08:39 2022 |   11.262747 |        50897
+ (729,20)  |           1000 | Fri Nov 18 11:08:30 2022 |   11.267203 |        37120
+ (729,9)   |           1000 | Fri Nov 18 11:07:21 2022 |   11.269852 |          792
+ (729,6)   |           1000 | Fri Nov 18 11:07:02 2022 |   11.268067 |        46221
+ (632,123) |           1000 | Fri Nov 18 11:04:46 2022 |   11.272623 |        97874
+ (632,118) |           1000 | Fri Nov 18 11:04:28 2022 |   11.271794 |        65579
+ (632,14)  |           1000 | Fri Nov 18 10:55:45 2022 |   11.268543 |         8557
+ (632,2)   |           1000 | Fri Nov 18 10:54:45 2022 |    11.26414 |        94078
+ (506,126) |           1000 | Fri Nov 18 10:54:01 2022 |   11.264657 |        89641
+(10 rows)
+```
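+
+Because the view joins trades to their symbols, a similar lookup can also be written against the symbol name. This is a sketch that assumes the `nyse_trade_symbol` view defined earlier and uses the symbol `NSC` from the sample inserts:
+
+```sql
+-- Latest 10 trades for one stock, looked up by symbol rather than by ID.
+SELECT symbol, trade_time, trade_price, trade_volume
+  FROM nyse_trade_symbol
+ WHERE symbol = 'NSC'
+ ORDER BY trade_time DESC
+ LIMIT 10;
+```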
+