SL sites for TcCLB.511029.20 (Non-esmer) inferred for wrong direction #1

Open
khughitt opened this issue Nov 3, 2015 · 0 comments
khughitt commented Nov 3, 2015

When attempting to parse the results from the UTR analysis for T. cruzi CL Brener Non-Esmeraldo-like, the detected SL sites for one of the genes (TcCLB.511029.20) appear to be incorrect.

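The gene is on the positive strand, so its SL sites should fall upstream of the CDS start (152083). Instead, all of the detected sites (152726-152786) lie downstream of the CDS end (152706) and are reported on the negative strand.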

So far, I have only encountered this for this one gene, and the cause of the problem is not immediately obvious, so I am just going to document the problem for now.

tcruzi_infecting_hsapiens_amastigote_nonesmer_sl_sorted.gff

TcChr35-P utr_analysis.py trans_splice_site 152726 152726 3 - . ID=TcCLB.511029.20.sl.5;Name=TcCLB.511029.20;description=kinetoplast-associated+protein+3+(KAP3)
TcChr35-P utr_analysis.py trans_splice_site 152727 152727 1 - . ID=TcCLB.511029.20.sl.4;Name=TcCLB.511029.20;description=kinetoplast-associated+protein+3+(KAP3)
TcChr35-P utr_analysis.py trans_splice_site 152731 152731 2 - . ID=TcCLB.511029.20.sl.6;Name=TcCLB.511029.20;description=kinetoplast-associated+protein+3+(KAP3)
TcChr35-P utr_analysis.py trans_splice_site 152765 152765 1 - . ID=TcCLB.511029.20.sl.3;Name=TcCLB.511029.20;description=kinetoplast-associated+protein+3+(KAP3)
TcChr35-P utr_analysis.py trans_splice_site 152785 152785 95 - . ID=TcCLB.511029.20.sl.1;Name=TcCLB.511029.20;description=kinetoplast-associated+protein+3+(KAP3)
TcChr35-P utr_analysis.py trans_splice_site 152786 152786 10 - . ID=TcCLB.511029.20.sl.2;Name=TcCLB.511029.20;description=kinetoplast-associated+protein+3+(KAP3)

GFF (TriTrypDB-8.1_TcruziCLBrenerNon-Esmeraldo-like.gff)

TcChr35-P TriTrypDB gene 152083 152706 . + . ID=TcCLB.511029.20;Name=TcCLB.511029.20;description=kinetoplast-associated+protein+3+%28KAP3%29;size=624;web_id=TcCLB.511029.20;locus_tag=TcCLB.511029.20;size=624;Alias=KAP3,Tc00.1047053511029.20:pep,Tc00.1047053511029.20:mRNA,Tc00.1047053511029.20,Tc00.1047053511029.20:exon:1,TcCLB.511029.20,6032.t00002
TcChr35-P TriTrypDB mRNA 152083 152706 . + . ID=rna_TcCLB.511029.20-1;Name=TcCLB.511029.20-1;description=TcCLB.511029.20-1;size=624;Parent=TcCLB.511029.20;Ontology_term=GO:0006323,GO:0005759,GO:0020023,GO:0003677;Dbxref=ApiDB:TcCLB.511029.20,taxon:9000000025
TcChr35-P TriTrypDB CDS 152083 152706 . + 0 ID=cds_TcCLB.511029.20-1;Name=cds;description=.;size=624;Parent=rna_TcCLB.511029.20-1
TcChr35-P TriTrypDB exon 152083 152706 . + . ID=exon_TcCLB.511029.20-1;Name=exon;description=exon;size=624;Parent=rna_TcCLB.511029.20-1
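A minimal sketch of a sanity check that would flag this case when comparing the two GFF files. It is not part of utr_analysis.py; the function names `parse_gff_line` and `sl_site_is_suspicious` are hypothetical, and the check assumes that a valid SL site sits on the same strand as its gene and on the 5' side of the CDS.

```python
def parse_gff_line(line):
    """Extract the fields needed for the check from one GFF line."""
    # Real GFF is tab-separated; splitting on any whitespace also handles
    # the space-separated excerpts pasted in this issue.
    fields = line.split(None, 8)
    return {
        "seqid": fields[0],
        "type": fields[2],
        "start": int(fields[3]),
        "end": int(fields[4]),
        "strand": fields[6],
    }


def sl_site_is_suspicious(cds, site):
    """Return True if the SL site is on the opposite strand or not 5' of the CDS."""
    if site["strand"] != cds["strand"]:
        return True
    if cds["strand"] == "+":
        # Positive strand: the site should lie before the CDS start.
        return site["start"] >= cds["start"]
    # Negative strand: the site should lie after the CDS end.
    return site["end"] <= cds["end"]
```

Applied to the CDS line above and any of the six trans_splice_site lines, the check returns True, both because the strands differ and because the site coordinate is greater than the CDS start.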

matched_reads_R1.csv

HWI-1KL118:27:C0PJ6ACXX:7:1208:11217:103147,TcCLB.511029.20,TcChr35-P,-,-,152696,152784,CTGTACTATATTGATCGCACTGCTGAATTTCAGCCGTTATTTTGTTCATCCATCCATCAACGGGGAGTGAAGAGCCAACAGCAATAAAAAAATGCTTCGAC,CTGTACTATATTG,TGTTCTTCACAGA,CTGTACTATATTG,TGTTCTTCACAGA,152785
HWI-1KL118:27:C0PJ6ACXX:7:2101:1394:148845,TcCLB.511029.20,TcChr35-P,-,-,152696,152784,CTGTACTATATTGATCGCACTGCTGAATTTCAGCCGTTATTTTGTTCATCCATCCATCAACGGGGAGTGAAGAGCCAACAGCAATAAAAAAATGCTTCGAC,CTGTACTATATTG,TGTTCTTCACAGA,CTGTACTATATTG,TGTTCTTCACAGA,152785
HWI-1KL118:27:C0PJ6ACXX:7:1107:8283:186762,TcCLB.511029.20,TcChr35-P,-,-,152699,152784,TTTCTGTACTATATTGATCGCACTGCTGAATTTCAGCCGTTATTTTGTTCATCCATCCATCAACGGGGAGTGAAGAGCCAACAGCAATAAAAAAATGCTTC,TTTCTGTACTATATTG,GTTTGTTCTTCACAGA,TTTCTGTACTATATTG,GTTTGTTCTTCACAGA,152785
HWI-1KL118:27:C0PJ6ACXX:7:1101:12969:15716,TcCLB.511029.20,TcChr35-P,-,-,152700,152784,GTTTCTGTACTATATTGATCGCACTGCTGAATTTCAGCCGTTATTTTGTTCATCCATCCATCAACGGGGAGTGAAGAGCCAACAGCAATAAAAAAATGCTT,GTTTCTGTACTATATTG,CGTTTGTTCTTCACAGA,GTTTCTGTACTATATTG,CGTTTGTTCTTCACAGA,152785
HWI-1KL118:27:C0PJ6ACXX:7:1103:9460:28433,TcCLB.511029.20,TcChr35-P,-,-,152700,152784,GTTTCTGTACTATATTGATCGCACTGCTGAATTTCAGCCGTTATTTTGTTCATCCATCCATCAACGGGGAGTGAAGAGCCAACAGCAATAAAAAAATGCTT,GTTTCTGTACTATATTG,CGTTTGTTCTTCACAGA,GTTTCTGTACTATATTG,CGTTTGTTCTTCACAGA,152785
HWI-1KL118:27:C0PJ6ACXX:7:2304:6149:193330,TcCLB.511029.20,TcChr35-P,-,-,152700,152784,GTTTCTGTACTATATTGATCGCACTGCTGAATTTCAGCCGTTATTTTGTTCATCCATCCATCAACGGGGAGTGAAGAGCCAACAGCAATAAAAAAATGCTT,GTTTCTGTACTATATTG,CGTTTGTTCTTCACAGA,GTTTCTGTACTATATTG,CGTTTGTTCTTCACAGA,152785
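The rows above can be summarized against the CDS coordinates with a short helper. This is only a sketch: the CSV has no header here, so the column meanings are assumptions (column 0 read ID, column 3 strand, columns 5 and 6 the 1-based alignment start and end, last column the inferred SL site), and `summarize_read` is a hypothetical name.

```python
CDS_START, CDS_END = 152083, 152706  # from the TriTrypDB GFF entry for TcCLB.511029.20


def summarize_read(row):
    """Summarize one matched_reads row (a list of strings) relative to the CDS."""
    start, end = int(row[5]), int(row[6])  # assumed: 1-based, inclusive alignment span
    # Number of bases shared between the alignment and the CDS (0 if disjoint).
    overlap = max(0, min(end, CDS_END) - max(start, CDS_START) + 1)
    return {
        "read": row[0],
        "strand": row[3],          # assumed: strand the read was assigned to
        "start": start,
        "end": end,
        "cds_overlap": overlap,
        "sl_site": int(row[-1]),   # assumed: inferred SL site position
    }
```

Under these assumptions, the reads spanning 152696-152784 overlap the last 11 bases of the CDS, and those starting at 152700 overlap the last 7. In every row shown the final field (152785) is the alignment end plus one, which is where a site would be placed if the read were treated as coming from a negative-strand gene.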

GFF.parse() (Python)

Out[18]: SeqFeature(FeatureLocation(ExactPosition(152082), ExactPosition(152706), strand=1), type='gene', id='TcCLB.511029.20')
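Note that the parsed start (152082) differs from the GFF start (152083) only because Biopython locations are 0-based and half-open, while GFF coordinates are 1-based and inclusive, so this is probably not the cause of the problem. A small sketch of the conversion, with hypothetical helper names:

```python
def gff_to_zero_based(start, end):
    """Convert a 1-based inclusive GFF interval to a 0-based half-open one."""
    return start - 1, end


def zero_based_to_gff(start, end):
    """Convert a 0-based half-open interval back to 1-based inclusive GFF coordinates."""
    return start + 1, end


start, end = gff_to_zero_based(152083, 152706)
length = end - start  # half-open intervals give the length directly
```

The resulting length agrees with the `size=624` attribute in the GFF entry, and the strand is parsed as `strand=1`, so the annotation itself appears to be read correctly.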