Can GenomicAnnotations.jl deal with multi-exon genes? #9

Closed
Xiao-Zhong opened this issue Sep 28, 2022 · 3 comments

Comments

@Xiao-Zhong

Xiao-Zhong commented Sep 28, 2022

Hi, I'm trying to convert a GenBank file to GFF with the package, but it fails. For example, one gene is shown below:
Input (Screenshot 1): https://www.ncbi.nlm.nih.gov/nuccore/7525012
[Image: Screen Shot 2022-09-28 at 10 30 09 pm]

Code (Screenshot 2):
[Image: Screen Shot 2022-09-28 at 10 36 05 pm]

Output (Screenshot 3):
[Image: Screen Shot 2022-09-28 at 10 37 43 pm]

As shown, the CDS in the GFF output doesn't have an intron between the two exons. CDS sequences extracted from coordinates like these will be wrong, because they include the intron. Thank you!
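
To make the problem concrete, here is a minimal Python sketch (not GenomicAnnotations.jl code). The toy sequence and the exon coordinates are invented for illustration; they are not taken from the record linked above. It compares extracting a CDS exon by exon with extracting it as one span from the first exon's start to the last exon's end.

```python
# Toy genomic sequence (hypothetical): two exons in uppercase,
# separated by an intron in lowercase.
genome = "ATGGCC" + "gtaag" + "TTTTAA"

# Hypothetical 1-based, inclusive exon coordinates, as GenBank would
# write them in a location such as join(1..6,12..17).
exons = [(1, 6), (12, 17)]


def extract(seq, start, end):
    """Return seq[start..end] using 1-based, inclusive coordinates."""
    return seq[start - 1:end]


# Correct: concatenate the exons, skipping the intron.
spliced = "".join(extract(genome, s, e) for s, e in exons)

# Wrong: treat the CDS as a single span covering both exons.
spanned = extract(genome, exons[0][0], exons[-1][1])

print(spliced)
print(spanned)
```

The spanned version still contains the lowercase intron bases, so any translation of it would be shifted or interrupted.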

@kdyrhage
Member

The intron information is stored in the Locus, so in theory yes. GFF.Writer ignores it, though. I only ever work with bacterial data so I don't know what the expected output is. Could you give an example of what you would like the result to look like?

@Xiao-Zhong
Author

Thanks! Please check the links below:
https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md
https://www.ncbi.nlm.nih.gov/genbank/genomes_gff/

Oh. I see. You only focused on bacterial data, single-exon genes.
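
As a rough illustration of the output described in the GFF3 specification linked above, a CDS that spans several exons is written as one line per segment, all sharing the same ID. The Python sketch below is not the GenomicAnnotations.jl implementation; the function names, the `seqid`, `source` and `feature_id` values, and the coordinates are all hypothetical. It only handles simple locations of the form `a..b`, `join(...)` and `complement(join(...))`, with no partial markers (`<`, `>`) and no per-segment `complement`.

```python
import re


def parse_location(location):
    """Parse a simple GenBank location into (strand, sorted segments).

    Assumption: only a..b, join(...) and complement(...) wrappers occur.
    """
    strand = "+"
    m = re.fullmatch(r"complement\((.*)\)", location)
    if m:
        strand, location = "-", m.group(1)
    m = re.fullmatch(r"join\((.*)\)", location)
    if m:
        location = m.group(1)
    segments = [tuple(int(x) for x in part.split(".."))
                for part in location.split(",")]
    return strand, sorted(segments)


def cds_rows(seqid, source, location, feature_id):
    """Build one GFF3 CDS line per segment, all sharing the same ID."""
    strand, segments = parse_location(location)
    # Phase is tracked in translation order (5' to 3' on the coding strand).
    ordered = segments if strand == "+" else segments[::-1]
    rows, consumed = [], 0
    for start, end in ordered:
        # Bases to skip at the start of this segment to reach a codon start.
        phase = (3 - consumed % 3) % 3
        rows.append("\t".join([seqid, source, "CDS", str(start), str(end),
                               ".", strand, str(phase), f"ID={feature_id}"]))
        consumed += end - start + 1
    return rows


# Hypothetical two-exon CDS.
for row in cds_rows("chr1", "example", "join(100..151,300..400)", "cds1"):
    print(row)
```

Keeping a shared ID across the lines is what lets downstream tools reassemble the exons into one coding sequence instead of reading a single span that includes the intron.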

@kdyrhage
Member

v0.3.8 fixes this.
