wrong haplotype frequencies #14
Comments
Marco, as a sanity check I made an artificial GLF file: I took the first columns of common.30.hap and summed them. If geno==0 I output .98 .01 .01, if geno==1 I output .01 .98 .01, and if geno==2 I output .01 .01 .98. I should recover a match to the first two haplotypes in the POSTERIORS file, and indeed this is what I saw. So in real life we should see some intermediate accuracy if only some of the 30 SNPs are informative. Let me know how this works out for you.
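The sanity check above can be sketched roughly as follows. This is only an illustration, not the project's code: it assumes the two haplotypes are already loaded as lists of 0/1 alleles (one entry per SNP), and the function name glf_rows and the toy data are hypothetical. The actual layout of common.30.hap and of the GLF file is not shown in this thread.

```python
def glf_rows(hap_a, hap_b, hi=0.98, lo=0.01):
    """Build one genotype-likelihood row per SNP from two haplotypes.

    hap_a, hap_b: lists of 0/1 alleles (assumed in-memory representation).
    The genotype is the allele sum (0, 1 or 2); the matching genotype gets
    probability `hi`, the other two get `lo`.
    """
    rows = []
    for a, b in zip(hap_a, hap_b):
        geno = a + b
        row = [lo, lo, lo]
        row[geno] = hi
        rows.append(row)
    return rows


# Toy data, not taken from common.30.hap.
hap_a = [0, 1, 1, 0]
hap_b = [0, 0, 1, 1]
for row in glf_rows(hap_a, hap_b):
    print(" ".join(f"{p:.2f}" for p in row))
```

Because nearly all of the probability mass sits on the true genotype at every SNP, the posterior should concentrate on the pair of haplotypes that was summed.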
Actually, I think this is working right. In the second step the program builds sub-haplotypes from the intersection of the chip SNPs and the reference haplotype SNPs, and then computes the frequencies of those sub-haplotypes. Try this: copy your current bim and glf files, and in the copies delete the second and third rows. Edit your settings XML to point to the new bim and glf, then look at the frequencies. They should no longer be uniform.
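To illustrate the intersection idea, here is a minimal sketch of how sub-haplotype frequencies could be computed. It is an assumption-laden toy, not the program's implementation: the function name sub_haplotype_freqs, the SNP identifiers, and the four haplotypes are all hypothetical.

```python
from collections import Counter


def sub_haplotype_freqs(haplotypes, ref_snps, chip_snps):
    """Frequencies of reference haplotypes restricted to chip SNPs.

    haplotypes: list of allele tuples, one allele per entry in ref_snps.
    ref_snps:   SNP identifiers in reference order.
    chip_snps:  SNP identifiers present on the chip (rows in BIM/GLF).
    """
    chip = set(chip_snps)
    # Indices of reference SNPs that are also on the chip.
    keep = [i for i, snp in enumerate(ref_snps) if snp in chip]
    # Collapse each haplotype to the kept positions and count duplicates.
    counts = Counter(tuple(h[i] for i in keep) for h in haplotypes)
    total = sum(counts.values())
    return {sub: n / total for sub, n in counts.items()}


ref_snps = ["snp1", "snp2", "snp3"]
haplotypes = [(0, 0, 1), (0, 1, 1), (0, 1, 0), (1, 1, 0)]

# All SNPs typed: every haplotype is distinct, so frequencies are uniform.
print(sub_haplotype_freqs(haplotypes, ref_snps, ref_snps))
# Middle SNP dropped: two haplotypes collapse to (0, 1), so it is not uniform.
print(sub_haplotype_freqs(haplotypes, ref_snps, ["snp1", "snp3"]))
```

When every reference SNP is on the chip and all reference haplotypes are distinct, each sub-haplotype is seen once and the frequencies come out uniform. Removing rows makes some haplotypes collapse onto the same sub-haplotype, which is why the frequencies stop being uniform.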
The basic issue here is that I forgot that in Version 2.0 I made things easier, so that the user no longer has to manually fill in .3 .3 .3 for the non-genotyped SNPs. They only need to fill in the rows in BIM and GLF for SNPs that are on the chip, and the software is supposed to figure out the rest using the intersection idea for making sub-haplotypes that I just described above.
Thanks for the explanation, this is reassuring.
I'm using the common.30 example as discussed in issue #11, and I added some printouts in 20913ec (to activate them, set debug_haplotype to true and recompile). The output contains the following:
Note how the counts are correct in the first block but are set to 1 in the second block. Further down, the printout of haplotype frequencies is the following:
So it seems that we are indeed losing the correct haplotype frequencies.