ENH: Show unaliased Pango lineage in tooltip #984

Open
corneliusroemer opened this issue Sep 3, 2022 · 5 comments · May be fixed by #985
Labels
package: nextclade_web t:feat Type: request of a new feature, functionality, enchancement

Comments

@corneliusroemer
Member

Even though I spend a lot of time working with Pango lineages, I still get confused about what the aliases correspond to.

It would be amazing if we could show the unaliased Pango lineages in a tooltip.

This should be pretty easy to do:

Where do we get the aliases from? We could store them in the tree, in the .extra properties for Nextclade. Or we could put them in virus properties or in an optional input file, or download them directly from GitHub (https://raw.githubusercontent.com/cov-lineages/pango-designation/master/pango_designation/alias_key.json).

Then we just need a bit of JS code to notice when the feature should be switched on (that is, when the dataset contains the Nextclade Pango column). After that it's a simple lookup: take the letter prefix of the lineage, look it up in alias_key.json, and tag the remaining dot-separated numbers onto the result.
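The lookup described above could be sketched roughly as follows. This is only an illustration, not the code from the linked PR: `unaliasLineage` and `AliasKey` are hypothetical names, and the small `aliasKey` object is a hand-picked excerpt standing in for the real alias_key.json. It assumes that the file maps each letter prefix to a string (empty for the root lineages A and B) and maps recombinant prefixes to a list of parents, which are left untouched here.

```typescript
// Assumed shape of alias_key.json: prefix -> expansion (string),
// or prefix -> list of parents for recombinants.
type AliasKey = Record<string, string | string[]>

// Hand-picked excerpt for illustration only; load the real file in practice.
const aliasKey: AliasKey = {
  A: '',
  B: '',
  AY: 'B.1.617.2',
  BA: 'B.1.1.529',
  XA: ['B.1.1.7', 'B.1.177'],
}

// Expand the letter prefix of a lineage and tag the remaining numbers back on.
function unaliasLineage(lineage: string, aliases: AliasKey): string {
  const parts = lineage.split('.')
  const prefix = parts[0] ?? ''
  const expansion = aliases[prefix]
  // Unknown prefixes, root lineages (empty string) and recombinants (arrays)
  // have nothing to expand, so the lineage is returned unchanged.
  if (typeof expansion !== 'string' || expansion === '') {
    return lineage
  }
  return [expansion, ...parts.slice(1)].join('.')
}

console.log(unaliasLineage('AY.4.2', aliasKey)) // B.1.617.2.4.2
```

A tooltip would then only need to show the result next to the aliased name whenever the two differ.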

This could be very helpful for a lot of people to learn what new aliases correspond to.

I know that @chaoran-chen is working on a similar feature for covSpectrum.

See here for why this would be useful:
image

@corneliusroemer corneliusroemer added t:feat Type: request of a new feature, functionality, enchancement needs triage Mark for review and label assignment package: nextclade_web labels Sep 3, 2022
@ivan-aksamentov ivan-aksamentov linked a pull request Sep 5, 2022 that will close this issue
@ivan-aksamentov ivan-aksamentov removed the needs triage Mark for review and label assignment label Sep 5, 2022
@katievigil

Hi, what is the difference between the Pango lineage (Nextclade) column and the unaliased one?
[screenshot]

@ivan-aksamentov
Member

ivan-aksamentov commented Mar 27, 2023

@lucyintheskyzzz Pango lineages sometimes go a bit crazy, like B.1.23.456.7.89, so at some point the Pango folks invented aliases. In Nextclade, the "lineage" column nowadays shows the alias, and the "unaliased" column shows either a partially aliased lineage or the canonical one, i.e. with aliasing undone to some degree or entirely.

You can see how aliases are helpful on some of the example sequences:
https://clades.nextstrain.org/?dataset-name=sars-cov-2&input-fasta=example
Sort by "Pango lineage" and scroll to the rows containing "AY" lineages.

[screenshot]

This is not specific to Nextclade itself. Aliases are a Pango thing, and Nextclade mimics Pango. But yes, our column names are a bit backwards (for legacy reasons), and perhaps our usage of the jargon term "unaliased" is a bit frivolous.

@katievigil

Hi @ivan-aksamentov, do you know where I can find out whether my sample has del 69-70?

@ivan-aksamentov
Member

@lucyintheskyzzz If you mouse-hover a value in the column "Del", a tooltip will show up displaying a list of nucleotide deletions and a list of amino acid deletions. If S:H69- and S:V70- are in the list, then Nextclade detected these deletions in your sample. Note that if your sequence is incomplete (missing this particular region) or has non-ACGT nucleotides in this region, Nextclade will be unable to detect deletions there and they will not show up.

[screenshot]

Alternatively, you can mouse-hover the black markers in sequence view (last column) and, if detected, deletions will also be displayed there:

[screenshot]

You can also download the TSV output file:

[screenshot]

and open it in Excel, Google Sheets or other spreadsheet software, or analyze it programmatically. The aaDeletions column will contain all detected amino acid deletions:
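For the programmatic route, a rough sketch could look like the one below. It takes the TSV contents as a string and reports, per sequence, whether both S:H69- and S:V70- appear in the aaDeletions column. The function name `findSpike6970Deletions` is hypothetical, and the sketch assumes that the file has a seqName column and that aaDeletions is a comma-separated list; check both against your own output file.

```typescript
// Returns a map from sequence name to whether both S:H69- and S:V70- were reported.
// Assumes tab-separated input with a header row containing "seqName" and "aaDeletions".
function findSpike6970Deletions(tsv: string): Map<string, boolean> {
  const lines = tsv.split(/\r?\n/).filter((line) => line.length > 0)
  const header = (lines[0] ?? '').split('\t')
  const nameIdx = header.indexOf('seqName')
  const delIdx = header.indexOf('aaDeletions')
  if (nameIdx < 0 || delIdx < 0) {
    throw new Error('Expected "seqName" and "aaDeletions" columns in the header')
  }

  const result = new Map<string, boolean>()
  for (const line of lines.slice(1)) {
    const cells = line.split('\t')
    // Assumed format: deletions separated by commas, e.g. "S:H69-,S:V70-"
    const deletions = new Set((cells[delIdx] ?? '').split(',').map((d) => d.trim()))
    result.set(cells[nameIdx] ?? '', deletions.has('S:H69-') && deletions.has('S:V70-'))
  }
  return result
}

// Made-up two-row example, not real Nextclade output.
const exampleTsv = 'seqName\taaDeletions\nsampleA\tS:H69-,S:V70-,S:Y144-\nsampleB\t\n'
console.log(findSpike6970Deletions(exampleTsv))
```

The same caveat as for the tooltip applies: a missing entry can also mean that the region was not covered or contained ambiguous nucleotides.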

[screenshot]

Finally, you can download the translated polypeptide sequences (in FASTA format), open them in any alignment viewer software (such as AliView or https://alignmentviewer.org/) and then see if columns 69 and 70 have - in them:
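If you would rather check the translated sequences with a script than in a viewer, something along these lines might do. It is a sketch under the assumption that the peptide sequences in the FASTA file are aligned to the reference, so that column 69 is simply the 69th character of each sequence; `parseFasta` and `isDeletedAt` are hypothetical helper names.

```typescript
// Minimal FASTA parser: maps each record name to its (possibly multi-line) sequence.
function parseFasta(text: string): Map<string, string> {
  const records = new Map<string, string>()
  let name: string | undefined
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim()
    if (line.startsWith('>')) {
      name = line.slice(1)
      records.set(name, '')
    } else if (name !== undefined && line.length > 0) {
      records.set(name, (records.get(name) ?? '') + line)
    }
  }
  return records
}

// True if every given 1-based alignment column holds the gap character "-".
function isDeletedAt(alignedPeptide: string, positions: number[]): boolean {
  return positions.every((pos) => alignedPeptide[pos - 1] === '-')
}

// Made-up aligned peptide with gaps at columns 69 and 70.
const exampleFasta = `>sampleA\n${'M'.repeat(68)}--${'S'.repeat(5)}\n`
for (const [seqName, seq] of parseFasta(exampleFasta)) {
  console.log(seqName, isDeletedAt(seq, [69, 70]))
}
```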

[screenshot]

P.S. I suggest you open a new issue for each question, suggestion or bug report, or join our discussion forum (https://discussion.nextstrain.org), instead of posting into unrelated issues. This will make it easier for developers and other users to navigate in GitHub issues.

@katievigil

OK thanks!
