Fixes for linkses
Signed-off-by: Dj Walker-Morgan <[email protected]>
djw-m committed Nov 4, 2024
1 parent b4f8bc7 commit 5aa5078
Showing 5 changed files with 24 additions and 62 deletions.
6 changes: 3 additions & 3 deletions advocacy_docs/edb-postgres-ai/analytics/external_tables.mdx
@@ -13,7 +13,7 @@ External tables allow you to access and query data stored in S3-compatible objec

* An EDB Postgres AI account and a Lakehouse node.
* An S3-compatible object storage location with data stored as Delta Lake Tables.
* See [Bringing your own data](../loadingdata) for more information on how to prepare your data.
* See [Bringing your own data](reference/loadingdata) for more information on how to prepare your data.
* Credentials to access the S3-compatible object storage location, unless it is a public bucket.
* These credentials will be stored within the database. We recommend creating a separate user with limited permissions for this purpose.

@@ -26,12 +26,12 @@ Using an S3 bucket that isn't in the same region as your node will

## Creating an External Storage Location

The first step is to create an external storage location that references the S3-compatible object storage where your data resides. A storage location is an object within the database that you refer to in order to access the data; each storage location has a name for this purpose.

Creating a named storage location is performed with SQL by executing the `pgaa.create_storage_location` function.
`pgaa` is the name of the extension and namespace that provides the functionality to query external storage locations.
The `create_storage_location` function takes as parameters a name for the new storage location and the URI of the S3-compatible object storage location.
The function optionally can take a third parameter, `options`, which is a JSON object for specifying optional settings, detailed in the [functions reference](reference/functions#create_storage_location).
The function optionally can take a third parameter, `options`, which is a JSON object for specifying optional settings, detailed in the [functions reference](reference/functions#pgaacreate_storage_location).
For example, in the options, you can specify the access key ID and secret access key for the storage location to enable access to a private bucket.
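For a private bucket, the credentials go in the `options` JSON. The following is a minimal sketch with placeholder values; the exact option key names here are an assumption, so check the functions reference for the authoritative list:

```sql
-- Register a private bucket; the option keys used for the access key ID and
-- secret access key are assumed names (see the functions reference).
SELECT pgaa.create_storage_location(
    'private_store',
    's3://my-private-bucket',
    '{"access_key_id": "AKIAEXAMPLE", "secret_access_key": "example-secret"}'
);
```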

The following example creates an external table that references a public S3-compatible object storage location:
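A minimal sketch of such an example follows; the bucket and table names are placeholders, and the `CREATE TABLE ... () USING PGAA` options follow the pattern used elsewhere in these docs:

```sql
-- Register the public bucket as a named storage location (no credentials needed).
SELECT pgaa.create_storage_location('public_store', 's3://my-public-bucket');

-- Create an external table over the Delta Table stored under my_schema/my_table/.
CREATE TABLE public.my_table ()
USING PGAA
WITH (pgaa.storage_location = 'public_store', pgaa.path = 'my_schema/my_table');
```
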
47 changes: 15 additions & 32 deletions advocacy_docs/edb-postgres-ai/analytics/quick_start.mdx
@@ -81,50 +81,33 @@ Persistent data in system tables (users, roles, etc) is stored in an attached
block storage device and will survive a restart or backup/restore cycle.
* Only Postgres 16 is supported.

For more notes about supported instance sizes,
see [Reference - Supported AWS instances](./reference/#supported-aws-instances).
For more notes about supported instance sizes, see [Reference - Supported AWS instances](./reference/instances).

## Operating a Lakehouse node

### Connect to the node

You can connect to the Lakehouse node with any Postgres client, in the same way that you connect to any other cluster from EDB Postgres AI Cloud Service (formerly known as BigAnimal): navigate to the cluster detail page and copy its connection string.

For example, you might copy the `.pgpass` blob into `~/.pgpass` (making sure to replace `$YOUR_PASSWORD` with the password you provided when launching the cluster).
Then you can copy the connection string and use it as an argument to `psql` or `pgcli`.

In general, you should be able to connect to the database with any Postgres client.
We expect all introspection queries to work, and if you find one that doesn't, then that's a bug.
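For example, ordinary catalog queries such as the following sketch should behave exactly as they do on any other Postgres instance:

```sql
-- List the schemas and tables visible to the connected role,
-- using the standard pg_catalog views.
SELECT schemaname, tablename
FROM pg_catalog.pg_tables
ORDER BY schemaname, tablename;
```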

### Understand the constraints

* Every cluster uses EPAS or PGE. So expect to see boilerplate tables from those flavors in the installation when you connect.
* Queryable data (like the benchmarking datasets) is stored in object storage as Delta Tables. Every cluster comes pre-loaded to point to a storage bucket with benchmarking data inside (TPC-H, TPC-DS, Clickbench) at scale factors 1 and 10.
* Only AWS is supported at the moment. Bring Your Own Account (BYOA) is not supported.
* You can deploy a cluster in any region that is activated in your EDB Postgres AI Account. Each region has a bucket with a copy of the
benchmarking data, and so when you launch a cluster, it will use the benchmarking data in the location closest to it.
* The cluster is ephemeral. None of the data is stored on the hard drive, except for data in system tables, e.g., roles, users, and grants.
If you restart the cluster, or back up the cluster and then restore it, it will restore these system tables. But the data in object storage will
remain untouched.
* The cluster supports READ ONLY queries of the data in object storage (but it supports write queries to system tables for creating users, etc.), as the sketch after this list illustrates. You cannot write directly to object storage. You cannot create new tables.
* If you want to load your own data into object storage,
see [Reference - Bring your own data](./reference/#advanced-bring-your-own-data).
* If you want to load your own data into object storage, see [Reference - Bring your own data](reference/loadingdata).
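As a rough illustration of these constraints, the following sketch uses a hypothetical benchmark schema and table name; substitute one of the pre-loaded datasets listed in the next section:

```sql
-- Read-only analytical queries over the object-storage-backed tables work.
SELECT count(*) FROM tpch_sf_1.lineitem;  -- hypothetical schema/table name

-- Writes to system tables work and survive restarts (roles, users, grants).
CREATE ROLE analyst LOGIN PASSWORD 'change-me';

-- Writes to the object-storage data and new ordinary tables are rejected:
-- INSERT INTO tpch_sf_1.lineitem SELECT * FROM tpch_sf_1.lineitem;  -- fails
-- CREATE TABLE public.scratch (id int);                             -- fails
```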

## Inspect the benchmark datasets

@@ -140,7 +123,7 @@ The available benchmarking datasets are:
* 1 Billion Row Challenge

For more details on benchmark datasets,
see [Reference - Available benchmarking datasets](./reference/#available-benchmarking-datasets).
see [Reference - Available benchmarking datasets](./reference/datasets).

## Query the benchmark datasets

@@ -26,3 +26,5 @@ export AWS_SECRET_ACCESS_KEY="..."
```

This will export the data from the `some_table` table in the `test-db` database to a Delta Table in the `my_schema/my_table` path in the `my-bucket` bucket.

You can now query this table in the Lakehouse node by creating an external table that references the Delta Table in the `my_schema/my_table` path. See [External Tables](../external_tables) for the details on how to do that.
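A minimal sketch of that step, assuming a storage location named `my_store` has already been created for `s3://my-bucket` as described in External Tables:

```sql
-- Map the exported Delta Table at my_schema/my_table to a queryable external table.
CREATE TABLE public.my_table ()
USING PGAA
WITH (pgaa.storage_location = 'my_store', pgaa.path = 'my_schema/my_table');

-- Verify that the exported rows are visible.
SELECT count(*) FROM public.my_table;
```
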
28 changes: 3 additions & 25 deletions advocacy_docs/edb-postgres-ai/analytics/reference/loadingdata.mdx
@@ -15,41 +15,19 @@ However, this comes with some major caveats (which will eventually be resolved):

### Caveats

* The tables must be stored as [Delta Lake Tables](http://github.com/delta-io/delta/blob/master/PROTOCOL.md) within the location
* The tables must be stored as [Delta Lake Tables](http://github.com/delta-io/delta/blob/master/PROTOCOL.md) within the location.
* A "Delta Lake Table" (or "Delta Table") is a folder of Parquet files along with some JSON metadata.
* Each table must be prefixed with `$schema/$table/`, where `$schema` and `$table` are valid Postgres identifiers (i.e., < 64 characters).
* For example, this is a valid Delta Table that will be recognized by Beacon Analytics:
* `my_schema/my_table/{part1.parquet, part2.parquet, _delta_log}`
* These `$schema` and `$table` identifiers will be queryable in the Lakehouse node, e.g.:
* These `$schema` and `$table` identifiers will be queryable in the Postgres Lakehouse node, e.g.:
* `SELECT count(*) FROM my_schema.my_table;`
* This Delta Table will NOT be recognized by Lakehouse Analytics (missing a schema):
* This Delta Table will NOT be recognized by the Postgres Lakehouse node (missing a schema):
* `my_table/{part1.parquet, part2.parquet, _delta_log}`

### Loading data into your bucket

You can use the `lakehouse-loader` utility to export data from an arbitrary Postgres instance to Delta Tables in a storage bucket.
See [Delta Lake Table Tools](delta_tables) for more information on how to obtain and use that utility.

### Querying your own data

By default, each Lakehouse node is configured to point to a bucket with benchmarking datasets inside.
To point it to a different bucket, you can call the `pgaa.create_storage_location` function:

```sql
SELECT pgaa.create_storage_location('mystore', 's3://my-bucket');
```

You will then be able to create a table that references the Delta Table in the bucket:

```sql
CREATE TABLE public.tablename () USING PGAA WITH (pgaa.storage_location = 'mystore', pgaa.path = 'schemaname/tablename');
```

Which you can then query:

```sql
SELECT COUNT(*) FROM public.tablename;
```

For further details, see the [External Tables](../external_tables) documentation.

@@ -31,5 +31,4 @@ To be precise:
* Managed Storage Locations can only be created in EDB-hosted AWS regions
* Lakehouse Sync can only sync from source databases in EDB-hosted AWS regions

These limitations will be removed as we continue to improve the product. Eventually, we will support BYOA, as well as Azure and GCP, for all Lakehouse use cases.
