From 8353d40124ec50c56949e3415fb8d0b9eac5677a Mon Sep 17 00:00:00 2001
From: Cornelius Roemer
Date: Tue, 30 Apr 2024 18:35:51 +0200
Subject: [PATCH 1/2] Exclude 3 overdiverged clade I sequences

---
 phylogenetic/README.md                       | 18 +++++++++++-------
 phylogenetic/defaults/exclude_accessions.txt |  5 +++++
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/phylogenetic/README.md b/phylogenetic/README.md
index 6c29f4c..865c361 100644
--- a/phylogenetic/README.md
+++ b/phylogenetic/README.md
@@ -16,24 +16,28 @@ If you're unfamiliar with Nextstrain builds, you may want to follow our
 The easiest way to run this pathogen build is using the Nextstrain
 command-line tool from within the `phylogenetic/` directory:
 
-    cd phylogenetic/
-    nextstrain build .
+```bash
+cd phylogenetic/
+nextstrain build .
+```
 
 Once you've run the build, you can view the results with:
 
-    nextstrain view .
+```bash
+nextstrain view .
+```
 
 ### Example build
 
 You can run an example build using the example data provided in this repository via:
 
-```
+```bash
 nextstrain build . --configfile build-configs/ci/config.yaml
 ```
 
 When the build has finished running, view the output Auspice trees via:
 
-```
+```bash
 nextstrain view .
 ```
 
@@ -53,7 +57,7 @@ If you analyze and plan to publish using these data, please contact these labs f
 Within the analysis pipeline, these data are fetched from data.nextstrain.org and written to `data/` with:
 
 ```bash
-nextstrain build . data/sequences.fasta data/metadata.tsv
+nextstrain build . data/sequences.fasta.xz data/metadata.tsv.gz
 ```
 
 ### Run analysis pipeline
@@ -107,7 +111,7 @@ It can also be used as a small subset of real-world data.
 Example data should be updated every time metadata schema is changed or a new clade/lineage emerges.
 To update, run:
 
-```sh
+```bash
 nextstrain build . update_example_data -F \
     --configfiles build-configs/ci/config.yaml build-configs/chores/config.yaml
 ```
diff --git a/phylogenetic/defaults/exclude_accessions.txt b/phylogenetic/defaults/exclude_accessions.txt
index 2ff012e..f3866b1 100644
--- a/phylogenetic/defaults/exclude_accessions.txt
+++ b/phylogenetic/defaults/exclude_accessions.txt
@@ -74,3 +74,8 @@ PP098595
 PP098578
 
 HM172544 # cidofovir-resistant lab strain that is derived from DQ011155 (h/t Andrew Rambaut)
+
+TMP0003 # Overdiverged 23MPX1786C
+TMP0045 # Overdiverged RDC-NKV-GOM-MPOX-004
+
+NC_003310 # Overdiverged RefSeq NC_003310

From 64bd2258ad66039c6143b4a1a8048f75ad4ead7e Mon Sep 17 00:00:00 2001
From: Cornelius Roemer
Date: Tue, 30 Apr 2024 18:40:06 +0200
Subject: [PATCH 2/2] Update clade I to include South Kivu cluster

---
 phylogenetic/defaults/clades.tsv | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/phylogenetic/defaults/clades.tsv b/phylogenetic/defaults/clades.tsv
index 3867491..d35cfbb 100644
--- a/phylogenetic/defaults/clades.tsv
+++ b/phylogenetic/defaults/clades.tsv
@@ -2,8 +2,8 @@ clade	gene	site	alt
 
 outgroup	nuc	179226	T
 
-clade I	nuc	86502	T
-clade I	nuc	35352	A
+clade I	nuc	87560	T
+clade I	nuc	136015	A
 
 clade II	nuc	86502	G
 clade II	nuc	150970	A
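
Reviewer note: to make the effect of the `clades.tsv` change concrete, here is a minimal Python sketch of how such a table can be read. It assumes that a sample is assigned to a clade only when it carries every allele listed for that clade, and that `site` is a 1-based nucleotide position. The function names and the `alleles` mapping are hypothetical and are not part of this repository; the workflow itself performs the real assignment.

```python
import csv
from collections import defaultdict

# Rows copied from the updated clades.tsv in PATCH 2/2 (tab-separated).
CLADES_TSV = (
    "clade\tgene\tsite\talt\n"
    "\n"
    "outgroup\tnuc\t179226\tT\n"
    "\n"
    "clade I\tnuc\t87560\tT\n"
    "clade I\tnuc\t136015\tA\n"
    "\n"
    "clade II\tnuc\t86502\tG\n"
    "clade II\tnuc\t150970\tA\n"
)


def load_clade_definitions(text):
    """Group (gene, site, alt) rows by clade name, skipping blank lines."""
    definitions = defaultdict(list)
    lines = [line for line in text.splitlines() if line.strip()]
    for row in csv.DictReader(lines, delimiter="\t"):
        definitions[row["clade"]].append(
            (row["gene"], int(row["site"]), row["alt"])
        )
    return dict(definitions)


def matching_clades(definitions, alleles):
    """Return the clades whose listed alleles are all present.

    `alleles` is a hypothetical mapping of (gene, site) -> observed state;
    the "all listed alleles must match" rule is an assumption of this sketch.
    """
    return [
        clade
        for clade, rows in definitions.items()
        if all(alleles.get((gene, site)) == alt for gene, site, alt in rows)
    ]
```

Under these assumptions, a sample with `T` at 87560 and `A` at 136015 matches `clade I`, while a sample carrying only one of the two alleles matches nothing, which is why both defining sites had to be replaced together.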