PGScatalog · nebfield · Nov 29, 2023 · Oct 3, 2023 · Oct 5, 2023 · Oct 5, 2023
diff --git a/docs/getting-started.rst b/docs/getting-started.rst
@@ -97,7 +97,7 @@ parameter:
     --pgs_id PGS001229 # one score
     --pgs_id PGS001229,PGS001405 # many scores separated by , (no spaces)
 
-.. note:: You can also select scores associated with traits (``--efo_id``) and
+.. note:: You can also select scores associated with traits (``--trait_efo``) and
           publications (``--pgp_id``)
 
 If you would like to use a custom scoring file not published in the PGS Catalog,
@@ -129,11 +129,17 @@ for more information). If your custom PGS was in GRCh37 an example would look li
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 To enable genetic ancestry similarity calculations and PGS normalisation,
-download our pre-built reference database:
+download one of our pre-built reference databases:
 
 .. code-block:: console
 
-    $ wget https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_calc.tar.zst
+    $ wget https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_HGDP+1kGP_v1.tar.zst
+
+This database contains a merged 1000 Genomes and Human Genome Diversity Project (HGDP) reference panel and is the recommended default.
+
+You may prefer to use 1000 Genomes only::
+
+    $ wget https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_1000G_v1.tar.zst
 
 See :ref:`ancestry` for more details.
 
@@ -149,7 +155,7 @@ they match the scoring file genome build.
         -profile <docker/singularity/conda> \
         --input samplesheet.csv --target_build GRCh37 \
         --pgs_id PGS001229 \
-        --run_ancestry pgsc_calc.tar.zst 
+        --run_ancestry pgsc_HGDP+1kGP_v1.tar.zst
 
 Congratulations, you've now (`hopefully`) calculated some scores!
 |:partying_face:|

diff --git a/docs/how-to/ancestry.rst b/docs/how-to/ancestry.rst
@@ -6,17 +6,29 @@ How do I normalise calculated scores across different genetic ancestry groups?
 Download reference data
 -----------------------
 
-The fastest method of getting started is to download our reference panel:
+The fastest method of getting started is to download a `reference panel`_:
 
 .. code-block:: console
 
-    $ wget https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_calc.tar.zst
+    $ wget https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_1000G_v1.tar.zst
 
-The reference panel is based on 1000 Genomes. It was originally downloaded from
-the PLINK 2 `resources section`_. To minimise file size INFO annotations are
-excluded. KING pedigree corrections were enabled.
+This example reference panel is based on 1000 Genomes (`Nature 2015`_).
+
+We also provide a reference panel that combines 1000 Genomes with data from the Human Genome
+Diversity Project derived from the gnomAD release (v3.1, `Koenig, Yohannes et al. bioRxiv 2023`_),
+which includes additional samples and ancestry groups:
+
+.. code-block:: console
+
+    $ wget https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_HGDP+1kGP_v1.tar.zst
 
 .. _`resources section`: https://www.cog-genomics.org/plink/2.0/resources
+.. _`reference panel`: https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/
+.. _`Nature 2015`: https://doi.org/10.1038/nature15393
+.. _`Koenig, Yohannes et al. bioRxiv 2023`: https://doi.org/10.1101/2023.01.23.525248
+
+.. note:: These reference databases are not compatible with the test profile.
+  The test profile is not biologically meaningful and is only used to check that the workflow is installed correctly.
 
 Bootstrap reference data
 ~~~~~~~~~~~~~~~~~~~~~~~~
@@ -34,6 +46,6 @@ To enable genetic similarity analysis and score normalisation, just include the
 .. code-block:: console
 
     $ nextflow run pgscatalog/pgsc_calc -profile test,docker \
-        --run_ancestry path/to/reference/pgsc_calc.tar.zst
+        --run_ancestry path/to/reference/pgsc_HGDP+1kGP_v1.tar.zst
 
 The ``--run_ancestry`` parameter requires the path to the reference database.
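
A minimal sketch of how the two panels documented above could be selected in a script, assuming ``bash`` and ``wget`` are available. The names ``pgsc_ref_url`` and ``fetch_panel`` are hypothetical helpers for illustration and are not part of pgsc_calc; only the URLs come from the documentation in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of pgsc_calc).
# Base URL taken from the documentation changes above.
BASE_URL="https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources"

# Print the download URL for a named reference panel.
pgsc_ref_url() {
    case "$1" in
        HGDP+1kGP|1000G) echo "${BASE_URL}/pgsc_${1}_v1.tar.zst" ;;
        *) echo "unknown panel: $1" >&2; return 1 ;;
    esac
}

# Download the panel archive unless it is already in the current
# directory, then print the local file name.
fetch_panel() {
    local url file
    url="$(pgsc_ref_url "$1")" || return 1
    file="$(basename "$url")"
    [ -f "$file" ] || wget "$url" || return 1
    echo "$file"
}
```

The file name printed by ``fetch_panel HGDP+1kGP`` could then be passed to ``--run_ancestry``. Skipping the download when the archive already exists avoids re-fetching a multi-gigabyte file on every run.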
diff --git a/docs/how-to/database.rst b/docs/how-to/database.rst
@@ -8,17 +8,18 @@ A reference database is required to run some parts of the workflow:
 - Automatic genetic ancestry assignment with Principal Component Analysis
 - PGS normalisation methods that account for genetic ancestry
 
-.. note:: It's simplest to download the reference database we have hosted at the
-          PGS Catalog
+.. note:: It's simplest to download a reference database that we host on the
+          PGS Catalog FTP server
 
 Download reference database
 ---------------------------
 
-A reference database is available to download here:
+Reference databases created by the PGS Catalog are available to download here:
 
-``https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_calc.tar.zst``
+- ``https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_1000G_v1.tar.zst``
+- ``https://ftp.ebi.ac.uk/pub/databases/spot/pgs/resources/pgsc_HGDP+1kGP_v1.tar.zst``
 
-The database is about 7GB and supports both GRCh37 and GRCh38 input target
+The databases are about 7GB and 16GB, and both support GRCh37 and GRCh38 input target
 genomes.
 
 Once the reference database is included, remember you must include the ``--run_ancestry``