phylogenetic: Clade I build failed during filter #283
Comments
Hrm, I thought this was a simple fix of just updating the config with:
diff --git a/phylogenetic/defaults/clade-i/config.yaml b/phylogenetic/defaults/clade-i/config.yaml
index e8ad850..dfda6a8 100644
--- a/phylogenetic/defaults/clade-i/config.yaml
+++ b/phylogenetic/defaults/clade-i/config.yaml
@@ -19,7 +19,7 @@ auspice_name: "mpox_clade-I"
filter:
min_date: 1900
min_length: 100000
- exclude_where: 'clade!=I'
+ exclude_where: 'clade!=I clade!=Ia clade!=Ib'
However, this doesn't work because of how the augur filter handles multiple values for the
|
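A minimal sketch of why the combined condition above ends up excluding everything, assuming each `exclude_where` condition is evaluated on its own and a record is dropped as soon as any one of them matches. This is a toy model in plain Python, not augur's implementation; the helper names `matches` and `excluded` and the sample records are hypothetical.

```python
# Toy model (not augur's code): each "column!=value" or "column=value"
# condition is evaluated independently, and a record is excluded as soon
# as ANY condition matches it.

def matches(record, condition):
    """Return True if the record satisfies one exclude condition."""
    if "!=" in condition:
        column, value = condition.split("!=", 1)
        return record.get(column, "") != value
    column, value = condition.split("=", 1)
    return record.get(column, "") == value


def excluded(record, conditions):
    """A record is excluded if any single condition matches it."""
    return any(matches(record, c) for c in conditions)


# Hypothetical metadata records.
records = [
    {"strain": "a", "clade": "I"},
    {"strain": "b", "clade": "Ia"},
    {"strain": "c", "clade": "Ib"},
    {"strain": "d", "clade": "IIb"},
]

conditions = "clade!=I clade!=Ia clade!=Ib".split()
kept = [r["strain"] for r in records if not excluded(r, conditions)]
print(kept)  # prints []
```

Under that assumption, every clade label differs from at least two of `I`, `Ia`, and `Ib`, so each record matches at least one condition and nothing survives the filter.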
What about |
Without making changes to the workflow itself, there are two paths forward:
Option (1): exclude every other clade value explicitly.
diff --git a/phylogenetic/defaults/clade-i/config.yaml b/phylogenetic/defaults/clade-i/config.yaml
index e8ad850..4103fba 100644
--- a/phylogenetic/defaults/clade-i/config.yaml
+++ b/phylogenetic/defaults/clade-i/config.yaml
@@ -19,7 +19,7 @@ auspice_name: "mpox_clade-I"
filter:
min_date: 1900
min_length: 100000
- exclude_where: 'clade!=I'
+ exclude_where: 'clade=II clade=IIa clade=IIb clade=outgroup clade=""'
Option (2): drop `exclude_where` and pass a `--query` through `other_filters`.
diff --git a/phylogenetic/defaults/clade-i/config.yaml b/phylogenetic/defaults/clade-i/config.yaml
index e8ad850..f40e848 100644
--- a/phylogenetic/defaults/clade-i/config.yaml
+++ b/phylogenetic/defaults/clade-i/config.yaml
@@ -19,7 +19,6 @@ auspice_name: "mpox_clade-I"
filter:
min_date: 1900
min_length: 100000
- exclude_where: 'clade!=I'
### We don't want to subsample, so specify a config which is essentially a no-op
@@ -27,6 +26,7 @@ subsample:
everything:
group_by: ""
sequences_per_group: ""
+ other_filters: "--query \"(clade == 'I' | clade == 'Ia' | clade == 'Ib')\""
## align
max_indel: 10000
Option (1) is less change but probably more maintenance, since it will need to be updated whenever a new clade needs to be excluded. Option (2) is safer in that it will always include only clade I, and it will only need to be updated if there are new child clades of clade I. |
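The maintenance trade-off between the two options can be sketched with a toy comparison in plain Python. The clade values are taken from the two diffs; the label `III` is made up to stand in for any clade added in the future, and the function names are hypothetical.

```python
# Toy comparison of the two options on hypothetical clade labels.
# "III" is a made-up label standing in for any future, non-Clade I addition.

OPTION_1_EXCLUDE = {"II", "IIa", "IIb", "outgroup", ""}  # values excluded in option (1)
OPTION_2_INCLUDE = {"I", "Ia", "Ib"}                      # values queried in option (2)


def keep_option_1(clade):
    """Exclude-list approach: keep anything not explicitly excluded."""
    return clade not in OPTION_1_EXCLUDE


def keep_option_2(clade):
    """Include-list approach: keep only the listed Clade I labels."""
    return clade in OPTION_2_INCLUDE


clades = ["I", "Ia", "Ib", "II", "IIa", "IIb", "outgroup", "", "III"]
kept_1 = [c for c in clades if keep_option_1(c)]
kept_2 = [c for c in clades if keep_option_2(c)]
print(kept_1)  # prints ['I', 'Ia', 'Ib', 'III']
print(kept_2)  # prints ['I', 'Ia', 'Ib']
```

In this sketch the unknown label slips through the exclude list but not the include list. The flip side is that a hypothetical new child clade of clade I would be dropped by option (2) until it is added to the query.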
+1 for Option (2): it's more logically direct for the Clade I build to "include Clade I and subclades", not "exclude all other subclades". |
Why not update the workflow to accept a |
I just noticed that |
Oh sorry, I meant to suggest |
Ah, gotcha. Yeah, can do! |
Use `--query` to filter to Clade I and children clades since we have migrated to Nextclade v3 that includes clades I, Ia, and Ib. Resolves #283
For my understanding, this wasn't a nextclade v2 → v3 thing, it was a result of #281 changing the nextclade dataset from "hMPXV" to "MPXV", right? (It doesn't appear that any mpox nextclade datasets have been updated in the last few months, which was my other thought.) |
When using Nextclade v2 to download datasets, it only downloads the v2 datasets so it was using the MPXV/ancestral dataset from 2023-08-01. After switching to Nextclade v3, it now uses the latest dataset that includes the new Ia/Ib clades. |
Wow - that's a gotcha right there. Thanks for clarifying! |
Since we've migrated ingest to Nextclade v3 in #281, the clade labels have been updated to include Ia and Ib. This led to the automated build for Clade I to fail during the filter step.