Bam processing (WIP) #19

Open
wants to merge 45 commits into
base: dev

Conversation

averagehat
Contributor

Blocked by VDBWRAIR/bioframes#5; see also #7 (comment).
This also needs a simple test that ensures jip is cooperating correctly, à la this but more straightforward, I reckon.

I added a dependency on bioframes and will add a dependency on biotest. I will probably want to contribute a way to generate FASTA records there, and maybe even something for VCFs.

This includes:

  • [ ] tagging, sorting, and indexing the BAM with a modified version of ngs_mapper's tagreads.py (which writes to a new file or stdout and tags all reads as required by freebayes)
  • jip job for freebayes
  • new implementation of consensus.py
  • tests for jip jobs
  • tests for consensus.py

@averagehat
Contributor Author

NB: I need to fix the Python contracts; I forgot how to get them to work with SeqRecord and non-list iterables.

@averagehat
Contributor Author

A consensus can have deletions at either end of the reference. When all of the reads support this, those positions are not reported in the pileup at all, and freebayes won't catch that the consensus is shorter at the ends (at least I think this is what happens).
The way around this is to trim the consensus at whichever ends of the pileup are empty.
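
A minimal sketch of that trimming step, not the code in this PR. It assumes a hypothetical `depths` list holding the pileup depth at each reference position, and a sequence that is still in reference coordinates (one base per position, before any indels from the VCF shift things). The name `trim_ends` is made up for illustration.

```python
def trim_ends(seq, depths):
    """Drop leading and trailing positions whose pileup depth is zero.

    `seq` and `depths` are assumed to be the same length and in
    reference coordinates. Zero-depth positions in the interior are
    kept; only the uncovered ends are removed.
    """
    if len(seq) != len(depths):
        raise ValueError("seq and depths must have the same length")
    covered = [i for i, depth in enumerate(depths) if depth > 0]
    if not covered:
        # No position has any coverage, so nothing survives the trim.
        return ""
    return seq[covered[0]:covered[-1] + 1]


# Two uncovered positions on each end are trimmed; the interior gap stays.
trimmed = trim_ends("ACGTACGT", [0, 0, 5, 7, 0, 6, 0, 0])
```

Interior positions with no coverage are left alone on purpose, since how to report those is a separate decision from fixing the ends.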

@averagehat
Contributor Author

Freebayes entries can look like this:

H3N2/CY074920/Managua/2010/PA_3 207 .   AC  GT,AT   8859.33 .   AB=0.432024,0.564955;ABP=16.295,15.1404;AC=1,1;AF=0.5,0.5;AN=2;AO=143,187;CIGAR=2X,1M1X;DP=331;DPB=331;DPRA=0,0;EPP=39.4698,4.41537;EPPR=0;GTI=0;LEN=2,1;MEANALT=3,3;MQM=60,60;MQMR=0;NS=1;NUMALT=2;ODDS=716.828;PAIRED=0.503497,0.754011;PAIREDR=0;PAO=0,0;PQA=0,0;PQR=0;PRO=0;QA=4862,6499;QR=0;RO=0;RPL=110,89;RPP=93.0429,3.95088;RPPR=0;RPR=33,98;RUN=1,1;SAF=101,129;SAP=55.8697,61.5472;SAR=42,58;SRF=0;SRP=0;SRR=0;TYPE=mnp,snp;technology.ILLUMINA=1,1 GT:DP:RO:QR:AO:QA:GL    1/2:331:0:0:143,187:4862,6499:-922.526,-528.256,-485.208,-394.293,0,-338.001

Note that although base-by-base information is not provided, information is provided for both possible mutations, which is good.
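
To make the per-alternate layout concrete, here is a rough sketch that splits such a record into one dict per ALT allele using only the standard library. It assumes the line is tab-separated as in a real VCF, and that the requested INFO keys (`AO`, `TYPE`, `CIGAR`) carry one comma-separated value per alternate. The function names are hypothetical and this is not how consensus.py does it.

```python
def parse_info(info):
    """Turn a VCF INFO column ('K=V;K=V;FLAG') into a dict of strings."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def split_alts(line, keys=("AO", "TYPE", "CIGAR")):
    """Return one dict per ALT allele of a tab-separated VCF record.

    Each key in `keys` is assumed to hold one value per alternate
    allele, in the same order as the ALT column. Values stay strings.
    """
    cols = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, _filter, info = cols[:8]
    info_fields = parse_info(info)
    records = []
    for i, allele in enumerate(alt.split(",")):
        record = {"CHROM": chrom, "POS": int(pos), "REF": ref, "ALT": allele}
        for key in keys:
            record[key] = info_fields[key].split(",")[i]
        records.append(record)
    return records


# Abbreviated version of the entry above (most INFO fields left out).
example = "\t".join([
    "H3N2/CY074920/Managua/2010/PA_3", "207", ".", "AC", "GT,AT",
    "8859.33", ".", "AO=143,187;CIGAR=2X,1M1X;DP=331;TYPE=mnp,snp",
])
alts = split_alts(example)
```

For the entry above this yields two records: `GT` with `TYPE=mnp` and 143 supporting observations, and `AT` with `TYPE=snp` and 187.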

@averagehat added this to the Bam Processing milestone Feb 25, 2016
@averagehat self-assigned this Feb 25, 2016
@averagehat
Contributor Author

I propose testing this by adding it to the map pipeline in #20
ready to merge

@averagehat
Contributor Author

I need to think about whether the example here undercuts this implementation. I don't think it does.

@averagehat
Contributor Author

Notice that the build errors start after a change to the readme file in 272a115. This is another setuptools issue. There hasn't been an update to the pyvcf package in 3 months. Also, the distribute package has been deprecated for some time? I don't know. @necrolyte2 have you seen this one before?

@necrolyte2
Member

I've already forgotten what we did last time that fixed that type of error.
I hate setuptools/distribute and just want to punch it in the face

2 participants