Starting with comma in alt field #67

Open
olekto opened this issue Nov 19, 2015 · 0 comments

Hi, I was trying to filter a VCF created with the lobSTR 4.0 beta version when the filtering script crashed with this error:
Traceback (most recent call last):
  File "/projects/cees/bin/lobstr/lobSTR-bin-Linux-x86_64-4.0.0/share/lobSTR/scripts/lobSTR_filter_vcf.py", line 132, in <module>
    for record in reader:
  File "/cluster/software/VERSIONS/python_packages-2.7_5/cluster/software/VERSIONS/python2-2.7.10/lib/python2.7/site-packages/vcf/parser.py", line 539, in next
    alt = self._map(self._parse_alt, row[4].split(','))
  File "/cluster/software/VERSIONS/python_packages-2.7_5/cluster/software/VERSIONS/python2-2.7.10/lib/python2.7/site-packages/vcf/parser.py", line 347, in _map
    for x in iterable]
  File "/cluster/software/VERSIONS/python_packages-2.7_5/cluster/software/VERSIONS/python2-2.7.10/lib/python2.7/site-packages/vcf/parser.py", line 515, in _parse_alt
    elif str[0] == '.' and len(str) > 1:
IndexError: string index out of range
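
As far as I can tell from the last two frames, the parser splits the ALT column on commas and then indexes the first character of each piece, which fails if a piece is empty. Here is a minimal sketch of that mechanism. `starts_with_dot` is a hypothetical stand-in for the check shown in the traceback, not PyVCF's actual `_parse_alt`, and the ALT value is an illustrative one with a leading comma:

```python
def starts_with_dot(allele):
    # Same shape as the condition in the traceback: allele[0] is evaluated
    # first, so an empty string raises IndexError before len() is checked.
    return allele[0] == '.' and len(allele) > 1


alt_field = ",AAATAAAA"          # illustrative ALT value with a leading comma
alleles = alt_field.split(',')   # the leading comma produces an empty first item

try:
    starts_with_dot(alleles[0])
    error = None
except IndexError as exc:
    error = exc

print(alleles)
print(repr(error))
```

Swapping the two conditions (checking the length first) would avoid the crash in this sketch, but the empty allele would still be there in the record.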

The following entry is likely the culprit:
LG15 1591293 . AAATAAAAATAAAATAAAA ,AAATAAAA,AAATAAAAATAAAATAAAAAAAATAAAAT 669.516 . END=1591311;MOTIF=AAATA;NS=10;REF=3.8;RL=19;RU=AAATA;VT=STR;RPA=0,1.6,5.8 GT:ALLREADS:AML:DISTENDS:DP:GB:PL:Q:SB:STITCH 0/3:0|1;10|1:0.980102/0.961652:10:2:0/10:16,22,191,22,125,122,0,26,24,20:0.941778:2:0 0/3:0|2;10|3:0.999776/0.999994:28:5:0/10:52,67,478,67,347,341,0,52,49,37:0.999769:1.44444:0 0/3:0|1;10|1:0.980102/0.961652:51:2:0/10:16,22,191,22,125,122,0,26,24,20:0.941778:4.25:0 3/3:10|2:0.999969/0.999969:67:2:10/10:44,50,197,50,197,197,5,6,6,0:0.499146:2:0 0/3:0|4;10|2:1/0.997848:41.6667:6:0/10:26,44,574,44,311,299,0,105,99,87:0.997848:26:0 2/3:-11|4;-1|1;10|2:0.999984/0.999784:8:7:-11/10:165,149,354,44,191,176,140,112,0,385:0.999784:26:0 1/0:-19|1;0|7;10|1:0.998671/0.999881:-3.875:9:-19/0:71,0,740,29,283,288,73,161,180,233:0.998671:4.25:0 0/0:0|1:0.993932/0.993932:-28:1:0/0:0,3,98,3,33,30,3,29,27,26:0.330906:5:0 3/3:10|4:1/1:1.5:4:10/10:89,101,395,101,395,395,11,12,12,0:0.798921:5:0 0/3:0|2;10|1:0.999907/0.940512:6.66667:3:0/10:13,22,287,22,155,149,0,52,49,43:0.940419:9.25:0

The ALT field starts with a comma, so its first allele is an empty string, which I don't think the VCF spec allows.
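
To check whether other records have the same problem, something like the following could be run over the VCF before filtering. This is only a sketch: `empty_alt_records` is a name I made up, it assumes tab-separated columns with ALT as the fifth one, and the sample lines are shortened, made-up stand-ins for a real file (the `##fileformat` header in particular is an assumption):

```python
def empty_alt_records(lines):
    """Yield (line_number, chrom, pos) for records with an empty ALT allele."""
    for number, line in enumerate(lines, start=1):
        # Skip header lines and blank lines.
        if line.startswith('#') or not line.strip():
            continue
        fields = line.rstrip('\n').split('\t')
        if len(fields) < 5:
            continue  # not enough columns to contain an ALT field
        # A leading, trailing, or doubled comma yields an empty item here.
        if any(allele == '' for allele in fields[4].split(',')):
            yield number, fields[0], fields[1]


# Shortened, made-up sample input; a real file would be opened and passed in.
sample = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "LG15\t1591293\t.\tAAATAAAAATAAAATAAAA\t,AAATAAAA\t669.516\t.\tVT=STR\n",
    "LG15\t100\t.\tA\tT\t50\t.\tVT=STR\n",
]
hits = list(empty_alt_records(sample))
print(hits)
```

This only reports the affected records. I would not simply drop the empty allele, since the genotype indices in the sample columns refer to positions in the ALT list.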
allelotype was run on a set of BWA-aligned BAM files, like this:
allelotype --command classify --bam bam1,bam2,bam3,bam4,bam5,bam6,bam7,bam8,bam9,bam10
--strinfo genome_strinfo.tab --noise_model /projects/cees/bin/lobstr/lobSTR-bin-Linux-x86_64-3.0.2/share/lobSTR/models/illumina_v2.0.3
--index-prefix genome_index/lobSTR_ --out genome_ncc
--filter-mapq0 --realign --max-repeats-in-ends 3 --min-read-end-match 10 >allelotype_ncc.out 2> allelotype_ncc.err
