rhomap results with extremely high 4Ner/kb #21
Some thoughts below; although I will caution that it has been many years
since I have thought about these issues.
1. Can I use summary.txt to draw the results, or do I need to run stat on rates.txt to get res.txt and then plot that?
For rhomap, you should be able to use the summary.txt file.
2. Do you have recommended parameters for rhomap? The LDhat 2.2 manual only gives recommended parameters for interval.
From memory, the default parameters are reasonable for human data.
3. The weirdest thing is that my results are extremely high in 4Ner/kb. In some regions the rate reaches 867 (4Ner/kb), and when I tried rhomap on another population's data I even saw 2000 (4Ner/kb). Are these values accurate? Also, I used the whole 5 Mb MHC region of SNPs to calculate the recombination rate; do I need to split it into smaller regions?
In some respects, this isn't particularly strange. For regions with high
recombination rates, a point estimate of the recombination rate is very
difficult to obtain. One way to think about this: as the recombination
rate in a hotspot gets higher, SNPs on either side of the region
become increasingly uncorrelated, and therefore contain little
information about the recombination rate in the hotspot.
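To illustrate the point above, here is a minimal Python sketch. It uses the classical Ohta-Kimura approximation for expected LD between two sites, (10 + rho) / (22 + 13 rho + rho^2), where rho is the total 4Ner between the two sites. This is an assumption chosen for illustration only: it is not the composite likelihood that rhomap or interval actually evaluate, and the rho values are arbitrary. It only shows that expected correlation is already close to zero at moderate rho, so very different large rates predict almost the same data.

```python
def expected_r2(rho: float) -> float:
    """Ohta-Kimura approximation to expected LD between two sites.

    rho is the total population-scaled recombination rate (4Ner) between
    the two sites, not a per-kb rate.
    """
    return (10.0 + rho) / (22.0 + 13.0 * rho + rho * rho)


# Arbitrary illustrative values of total rho between two flanking SNPs.
for rho in (1, 10, 100, 1000):
    print(f"rho = {rho:>4}: expected r^2 ~ {expected_r2(rho):.5f}")
```

Under this approximation, the change in expected LD between rho = 100 and rho = 1000 is far smaller than the change between rho = 1 and rho = 10, which is one way to see why point estimates inside strong hotspots are unstable.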
If you are concerned about the rate estimates from rhomap, I think it is
sensible to continue to use interval. In general, interval is a more robust
algorithm for rate estimation (although it doesn't allow inference of
hotspot locations as easily).
Adam Auton
Thank you very much for your reply, which is a great help to me.
@auton1
This looks somewhat more discordant than I would expect. What parameters
are you using? I also note that the units on the y-axis appear different.
On Sun, 5 Jun 2022 at 00:49, masen0407 wrote:
@auton1 <https://github.com/auton1>
I want to confirm the accuracy of the following results.
I plotted the interval and rhomap results for YRI (96 samples, so n = 192). Interval:
[image: Screen Shot 2022-06-05 at 3 43 13 PM]
<https://user-images.githubusercontent.com/55567673/172040780-8ac42366-1dbd-4e55-ab5b-da8d9d0d9c6e.png>
and rhomap:
[image: Screen Shot 2022-06-05 at 3 43 53 PM]
<https://user-images.githubusercontent.com/55567673/172040779-bb739cd4-73fb-4d71-904b-4f5233437f14.png>
Both plots cover the MHC region.
1. Can both plots identify recombination hotspot regions?
2. Without Ne, and if I only want to find recombination hotspot regions,
which would you recommend?
Adam Auton
@auton1
The recombination rate "hotspots" correspond to segmental duplication loci. These might be violating the model assumptions. On the other hand, based on what we're seeing in the HPRC, it is quite possible that there is a lot of recombination-like activity in these loci. This is the subgraph of the MHC in the HPRC pggb graph; it approximately corresponds to your reference range. From annotation using gfaestus, it's clear that the big bubble, which would correspond to the highest peak in your plot, corresponds to the MHC Class II genes. This is taken from three slides starting here: https://docs.google.com/presentation/d/1qDSHpi1i2esmnIuiBA0g5EOGzvq8jUnkIw1yFFrp4hw/edit#slide=id.g12e89fff832_0_117
@ekg @auton1 Many papers use LDhat to estimate recombination rates, such as https://www.nature.com/articles/ng1885 and https://doi.org/10.1016/j.jgg.2022.03.006. Their results may be very different from my rhomap plot. How can I find population-specific recombination hotspot regions? (The result need not be in cM/Mb; I just want to confirm where the recombination rate is higher.)
@auton1
For starters, I suggest converting the interval output to rho / kb (i.e.
scaling by physical distance, in kb, between SNPs), which may resolve some
of the discrepancies.
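The rescaling suggested above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it assumes you have already parsed SNP positions (in kb) and one rho value per adjacent SNP pair into plain lists, and that those rho values are totals between SNPs rather than rates that are already per kb. Whether that holds for your output depends on the units in your locs file, so check the LDhat manual first. The function name `rho_per_kb` and the example numbers are hypothetical.

```python
def rho_per_kb(positions_kb, rho_between):
    """Scale per-interval rho by the physical distance (kb) between adjacent SNPs.

    positions_kb: SNP positions in kb, strictly increasing.
    rho_between:  one rho value per adjacent SNP pair (assumed to be a total
                  between the two SNPs, not already a per-kb rate).
    """
    if len(rho_between) != len(positions_kb) - 1:
        raise ValueError("need exactly one rho value per adjacent SNP pair")
    rates = []
    for left, right, rho in zip(positions_kb, positions_kb[1:], rho_between):
        dist_kb = right - left
        if dist_kb <= 0:
            raise ValueError("positions must be strictly increasing")
        rates.append(rho / dist_kb)
    return rates


# Hypothetical example: the same total rho over intervals of different length.
positions_kb = [10.0, 10.5, 12.5, 12.75]
rho_between = [1.0, 1.0, 1.0]
print(rho_per_kb(positions_kb, rho_between))
```

The same total rho spread over a short interval gives a much higher per-kb rate than over a long one, which is why plots in different units are hard to compare directly.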
I'd also suggest increasing the hotspot penalty in rhomap.
Is your lkgen command correct? You said you had 96 samples, which would
imply 192 haplotypes? (Is your data phased?)
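One quick way to sanity-check the sample count raised above is to read the header of the sites file. The sketch below assumes the LDhat sites header has the form "n L flag", with flag 1 for haplotype (phased) data and flag 2 for genotype (unphased) data, and that for genotype data the lookup table needs 2n sequences. Please verify both assumptions against the LDhat manual. The function name and the example headers are hypothetical.

```python
def haplotypes_needed(sites_header: str) -> int:
    """Return the number of sequences the lookup table should cover.

    Assumes an LDhat-style header 'n L flag', where flag 1 means haplotype
    data (n sequences) and flag 2 means genotype data (n diploid individuals).
    """
    fields = sites_header.split()
    if len(fields) < 3:
        raise ValueError("expected a header of the form 'n L flag'")
    n, flag = int(fields[0]), int(fields[2])
    if flag == 1:
        return n          # phased: one row per haplotype
    if flag == 2:
        return 2 * n      # unphased: two haplotypes per individual
    raise ValueError(f"unexpected data-type flag: {flag}")


# Hypothetical headers for 96 diploid samples, phased and unphased.
print(haplotypes_needed("192 5000 1"))
print(haplotypes_needed("96 5000 2"))
```

If the value returned differs from the -nseq passed to lkgen, the lookup table and the data disagree about the sample size.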
On Sun, 5 Jun 2022 at 17:47, masen0407 wrote:
I used vcftools to generate ldhat.sites and ldhat.locs, then ran
lkgen -lk lk_n192_t0.001 -nseq 96
to get a new lookup table, new_lk.txt.
With rhomap:
rhomap -lk new_lk.txt -burn 100000 -its 1000000 -samp 100 -seq ldhat.sites -loc ldhat.locs
With interval:
interval -lk new_lk.txt -samp 2000 -its 1000000 -bpen 5 -seq ldhat.sites -loc ldhat.locs
stat -input rates.txt
And yes, the y-axes appear different, which is why I want to understand the
extremely high 4Ner/kb values in the rhomap results.
Adam Auton
Thank you for your software. I'm trying to use it to find recombination hotspots in the MHC region. I used the SNP data for the chr6 MHC region from the YRI population in KGP. After phasing, I ran rhomap directly on the data for the whole 5 Mb region.
The running parameters are as follows:
rhomap -lk new_lk.txt -burn 100000 -its 1100000 -samp 100 -seq ldhat.sites -loc ldhat.locs
The likelihood lookup file used was the n=192, theta=0.001 per site file.
After the calculation, I want to identify the hotspot regions in the MHC region. I have the following questions.
1. Can I use summary.txt to draw the results, or do I need to run stat on rates.txt to get res.txt and then plot that?
2. Do you have recommended parameters for rhomap? The LDhat 2.2 manual only gives recommended parameters for interval.
3. The weirdest thing is that my results are extremely high in 4Ner/kb. In some regions the rate reaches 867 (4Ner/kb), and when I tried rhomap on another population's data I even saw 2000 (4Ner/kb). Are these values accurate? Also, I used the whole 5 Mb MHC region of SNPs to calculate the recombination rate; do I need to split it into smaller regions?