Virus protein accessions without Ensembl mapping #19
Comments
Arthur Brady commented: You could submit UniProt accessions, if they exist, and describe the proteins just as proteins, not genes. We do not in principle support draft genomic data because of its basic instability, although if there’s some entity or organization governing COVID gene nomenclature, using data from such a source might be a possibility. Ensembl only provides gene IDs for selected model organisms (although there are a large number of them). For genes from organisms not represented in Ensembl, we can import IDs from other spaces (as we have done for GlyTouCan IDs not present in PubChem), but we would still need some sort of ID-issuing authority to have created stable identifiers for the genetic objects in question. Until/unless that’s done, we won’t be able to integrate draft (or anonymous) data alongside stable identifiers, because unstable identifiers cannot be reliably cross-referenced over time.
Hi @jeet-vora and @ReneRanzinger, I think this is still an open issue. As Arthur mentioned in his comment, if you know of a reliable, stable, authoritative source for viral genes (COVID and HCV), we can import those IDs into our controlled vocabulary so that you can start using them in your next submission (next year). Let us know. Thanks.
There is no gene nomenclature resource for viruses as of now. We use UniProt for virus genes, as it is currently the best curated resource for virus proteins and genes. Another option is to use NCBI Gene. In GlyGen, the protein- and gene-related information comes from UniProt, and most entries can be mapped to an Ensembl Gene ID and/or an NCBI GeneID.
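As a rough illustration of the fallback discussed here (prefer an Ensembl Gene ID, then an NCBI GeneID, and otherwise submit the entry as a protein under its UniProt accession), a minimal Python sketch might look like the following. The field names, the priority order, and the placeholder identifiers are all assumptions made for the example; they are not taken from GlyGen or from the submission format.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical preference order for gene-level ID spaces (an assumption for this sketch).
ID_PRIORITY = ("ensembl_gene_id", "ncbi_gene_id")


def pick_stable_id(record: Dict[str, Optional[str]]) -> Tuple[str, str]:
    """Return (id_space, identifier) for one protein record.

    Falls back to the UniProt accession when no gene-level ID is available,
    in which case the entry would be described as a protein, not a gene.
    """
    for field in ID_PRIORITY:
        value = record.get(field)
        if value:
            return field, value
    return "uniprot_accession", record["uniprot_accession"]


# Placeholder records; these identifiers are invented and do not refer to real entries.
records: List[Dict[str, Optional[str]]] = [
    {"uniprot_accession": "ACC_HOST_1", "ensembl_gene_id": "ENSG_PLACEHOLDER", "ncbi_gene_id": "1001"},
    {"uniprot_accession": "ACC_VIRUS_1", "ensembl_gene_id": None, "ncbi_gene_id": "2002"},
    {"uniprot_accession": "ACC_VIRUS_2", "ensembl_gene_id": None, "ncbi_gene_id": None},
]

chosen = [pick_stable_id(r) for r in records]
for (space, identifier), rec in zip(chosen, records):
    print(rec["uniprot_accession"], "->", space, identifier)
```

Records that end up in the `uniprot_accession` bucket are the ones that would still need an identifier from some ID-issuing authority before they could be treated as genes.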
Hi @jeet-vora, NCBI Gene sounds like a good option if that works for you. Will you be able to provide us with a list of NCBI Gene IDs for your set of viral genes? Thanks.
Hi Jessica and Arthur,
In GlyGen we have protein and glycan data for medically important virus species such as SARS-CoV and HCV. We are planning to submit the data for these species; however, the proteins do not have Ensembl mappings, as viruses do not have chromosomes.
Do you have any suggestions on how we can tackle this in order to submit the data? Thanks
We are also adding mouse and rat data, so if any issue arises we will bring it to your attention. Maybe in a few days we can get together on a call to discuss these issues, including the ones reported by Rene.