Visualisation of simplified backbone phylogenies #222
This can similarly be done for the global, all-time Nextstrain tree. Instead of the Pango lineage labels, we would use the Nextstrain clade definitions. For a first pass, I think the easiest way is to use the nucleotide definitions of each Nextstrain clade to identify the corresponding node in our ARGs. We should see the same clade hierarchy. Some useful files for this analysis are here:
I also just came across this list of mutations in the founder sequences of the Nextstrain clades, assembled by Richard Neher. Tagging the list here in case it comes in handy.
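As a rough illustration of the first-pass idea above (locating a clade's node from its nucleotide definition), here is a minimal sketch in plain Python. It assumes a single tree stored as a child-to-parent dict and a dict of mutations per branch. The names `find_clade_node`, `parent` and `branch_mutations`, and the mutation labels, are all made up for illustration and are not real clade definitions. It also ignores reversions and the multiple parents a node can have in an ARG.

```python
def path_to_root(node, parent):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent.get(node)
    return path


def find_clade_node(defining, parent, branch_mutations):
    """Return the node closest to the root whose ancestry carries every
    defining mutation, or None if no node does.

    Assumption: mutations are never reverted further down the tree.
    """
    best, best_depth = None, None
    for node in parent:
        path = path_to_root(node, parent)
        carried = set()
        for n in path:
            # Mutations stored on n sit on the branch directly above n.
            carried |= branch_mutations.get(n, set())
        if defining <= carried and (best_depth is None or len(path) < best_depth):
            best, best_depth = node, len(path)
    return best


# Toy tree: 0 is the root; 1 is a child of 0; 2 and 3 are children of 1; 4 is a child of 2.
parent = {0: None, 1: 0, 2: 1, 3: 1, 4: 2}
# Hypothetical mutation labels, not taken from any real clade definition.
branch_mutations = {1: {"A100G"}, 2: {"C200T"}, 3: {"G300A"}, 4: {"T400C"}}

clade_node = find_clade_node({"A100G", "C200T"}, parent, branch_mutations)
```

In this toy tree both node 2 and node 4 carry the full definition, and the sketch picks node 2 because it is closer to the root.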
Jerome and I thought of another way, of intermediate hackiness: we could use Ana's lineage imputation method and simply find the earliest node for each imputed Pango lineage. We could test this by looking at the proportion of times that the standard lineage-defining mutations occur above this node (NB: if it is a unary node, we should include nodes below it too).
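A minimal sketch of the two steps in this suggestion, under some loud assumptions: the imputed lineages are available as a plain node-to-lineage dict, node times are measured backwards from the present (so the earliest node has the largest time), and the tree is a simple child-to-parent dict. All function and variable names, and the mutation labels, are hypothetical. The unary-node caveat from the comment is not handled here.

```python
def earliest_node_per_lineage(imputed_lineage, node_time):
    """Map each imputed lineage to its oldest node.

    Assumption: time runs backwards from the present, so larger means older.
    """
    earliest = {}
    for node, lineage in imputed_lineage.items():
        current = earliest.get(lineage)
        if current is None or node_time[node] > node_time[current]:
            earliest[lineage] = node
    return earliest


def fraction_defining_above(node, defining, parent, branch_mutations):
    """Fraction of the defining mutations found on the branch directly
    above `node` or on any branch further up towards the root."""
    seen = set()
    while node is not None:
        seen |= branch_mutations.get(node, set())
        node = parent.get(node)
    return len(defining & seen) / len(defining)


# Toy data: a chain 0 -> 1 -> 2 -> 3 -> 4 with made-up lineage labels and times.
imputed_lineage = {1: "B.1", 2: "B.1", 3: "B.1.1", 4: "B.1.1"}
node_time = {1: 30.0, 2: 20.0, 3: 15.0, 4: 5.0}
parent = {0: None, 1: 0, 2: 1, 3: 2, 4: 3}
branch_mutations = {1: {"A100G"}, 3: {"C200T"}}  # hypothetical labels

roots = earliest_node_per_lineage(imputed_lineage, node_time)
score = fraction_defining_above(roots["B.1.1"], {"C200T", "G300A"}, parent, branch_mutations)
```

Averaging `score` over all lineages would give one possible summary of how well the earliest imputed nodes line up with the standard lineage-defining mutations.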
A simple way to visually compare the backbones of the Viridian UShER tree and our pandemic-scale ARG is probably to use the Pango lineage roots, excluding the Pango recombinants. The Pango lineage roots are already labelled in the UShER tree, but it is trickier to get the corresponding nodes in our ARGs. Once we have those nodes identified, we could simplify down to only those nodes (n = 2,131). For a cleaner view, we could exclude the less evolutionarily/epidemiologically relevant Pango lineages.
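To make the "simplify down to only those nodes" step concrete, here is a small sketch that reduces a tree to a backbone of lineage-root nodes. It assumes a single tree held as a child-to-parent dict, whereas nodes in an ARG can have several parents, so this only covers the single-tree case. Recombinants are detected by a lineage name starting with "X", which is a simplification because aliased descendants of recombinants are not caught. The names `backbone`, `is_recombinant` and `lineage_root` are hypothetical.

```python
def is_recombinant(lineage):
    """Crude check: treat names starting with 'X' as Pango recombinants."""
    return lineage.startswith("X")


def backbone(lineage_root, parent):
    """Return {kept_node: nearest kept ancestor, or None} for the
    non-recombinant lineage roots, skipping all unlabelled nodes."""
    kept = {
        node for lineage, node in lineage_root.items()
        if not is_recombinant(lineage)
    }
    simplified = {}
    for node in kept:
        ancestor = parent.get(node)
        # Walk up until we hit another kept node or run past the root.
        while ancestor is not None and ancestor not in kept:
            ancestor = parent.get(ancestor)
        simplified[node] = ancestor
    return simplified


# Toy data: a chain 0 -> 1 -> 2 -> 3 -> 4 -> 5 with made-up lineage roots.
parent = {0: None, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
lineage_root = {"B": 0, "B.1": 2, "B.1.1": 5, "XA": 4}

simplified = backbone(lineage_root, parent)
```

The same child-to-parent output could be built from the labelled roots in the UShER tree, which would let the two backbones be compared node for node.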