A bug in ieugwasr::ld_matrix_local #38
Comments
Is this fixed yet? |
Try reading a file whose fifth column contains only "T", such as:
$ cat test.bim
1 rs38 0 32 T C
1 rs11 0 357 T C
1 rs96 0 733 T A
When I try it, I get the following:
> bim <- read.table("test.bim", stringsAsFactors=F)
> bim2 <- read.table("test.bim", stringsAsFactors=F, tryLogical=F)
> bim == bim2
V1 V2 V3 V4 V5 V6
[1,] TRUE TRUE TRUE TRUE FALSE TRUE
[2,] TRUE TRUE TRUE TRUE FALSE TRUE
[3,] TRUE TRUE TRUE TRUE FALSE TRUE
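Notice that the fifth column is all FALSE: without "tryLogical=F", read.table parses the all-"T" column as logical TRUE, which does not equal the character "T".
This behavior does not affect the LD calculation by plink, because plink does not use the bim that ieugwasr::ld_matrix_local reads into R.
However, when the function returns the LD matrix with "with_alleles=TRUE", it mislabels the columns and rows.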
The behavior is not new, as it has been reported at
https://www.biostars.org/p/258586/
R 4.3.0 introduced "tryLogical", according to the release notes:
"type.convert() and hence read.table() get new option tryLogical = TRUE with back compatible default. When set to false, converts "F" or "T" columns to character."
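For R versions older than 4.3.0, one alternative is to declare the column types explicitly with the long-standing colClasses argument of read.table, so no type guessing happens at all. A minimal sketch, assuming the six-column bim layout shown above; the helper name read_bim and the column names are hypothetical and not part of ieugwasr:

```r
# Hypothetical helper (not part of ieugwasr): read a PLINK .bim file with
# explicit column types so an all-"T" allele column is never parsed as logical.
read_bim <- function(path) {
  read.table(
    path,
    header = FALSE,
    stringsAsFactors = FALSE,
    colClasses = c("character", "character", "numeric",
                   "integer", "character", "character"),
    col.names = c("chr", "snp", "cm", "pos", "a1", "a2")
  )
}

# Recreate the example file from this thread in a temporary location.
tmp <- tempfile(fileext = ".bim")
writeLines(c("1 rs38 0 32 T C",
             "1 rs11 0 357 T C",
             "1 rs96 0 733 T A"), tmp)

bim <- read_bim(tmp)
class(bim$a1)  # "character"
bim$a1         # "T" "T" "T"
```

Because the types are fixed up front, the result does not depend on whether tryLogical is available.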
I am not in a position to update the ieugwasr source code on GitHub.
Instead, I made a workaround that post-processes the LD matrix by fixing the column and row labels.
I incorporated this fix into the TwoSampleMR::dat_to_MRInput function.
If anybody is interested in this fix, please let me know.
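The workaround script itself is not reproduced in this thread. Purely as an illustration of what such post-processing might look like, the sketch below assumes the labels have the form rsid_A1_A2 and that a misread allele appears as the literal string "TRUE". The function name fix_ld_labels is hypothetical and this is not the author's code:

```r
# Hypothetical sketch (not the author's script): repair LD matrix labels in
# which an all-"T" allele column was pasted as the logical value TRUE.
# Assumes labels look like "rsid_A1_A2".
fix_ld_labels <- function(ld) {
  fix <- function(x) {
    x <- sub("_TRUE_", "_T_", x, fixed = TRUE)  # misread first allele
    sub("_TRUE$", "_T", x)                      # misread second allele
  }
  rownames(ld) <- fix(rownames(ld))
  colnames(ld) <- fix(colnames(ld))
  ld
}

# Toy 3 x 3 matrix carrying the mislabeled names.
labels <- c("rs38_TRUE_C", "rs11_TRUE_C", "rs96_TRUE_A")
ld <- diag(3)
dimnames(ld) <- list(labels, labels)

fixed <- fix_ld_labels(ld)
rownames(fixed)
```

Only the dimnames change; the LD values themselves are left untouched.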
Sincerely,
Sangsoo Kim
On Mon, Apr 15, 2024 at 10:53 AM, leoarrow1 ***@***.***> wrote:
… I'm using R 4.3.3, and I did not experience this issue.
image.png (view on web)
<https://github.com/MRCIEU/ieugwasr/assets/127536391/a2050ee2-2aaf-43af-85a1-094a054035f7>
|
Thank you for your prompt response.
I have carefully reviewed your message and the provided code snippet. If I understand the code correctly, the ld_matrix_local function uses PLINK to extract genotype data for the given SNPs (rsIDs) from the reference panel. The code below is a snippet from the source for this step:
```
# source code for your convenience to check
fun1 <- paste0(shQuote(plink_bin, type=shell),
" --bfile ", shQuote(bfile, type=shell),
" --extract ", shQuote(fn, type=shell),
" --make-just-bim ",
" --keep-allele-order ",
" --out ", shQuote(fn, type=shell)
)
system(fun1)
```
Subsequently, the bim file is read with read.table: `bim <- read.table(paste0(fn, ".bim"), stringsAsFactors=FALSE)`. This is where the bim file is used. If `with_alleles = TRUE`, the function also uses the bim file to attach the allele names to the rownames and colnames. Hence, the bim file is used in `ld_matrix_local`.
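To see how a misread allele column would end up in the labels, the sketch below imitates a labelling step with paste(). It assumes the labels are built by joining the SNP ID and the two allele columns with "_"; that is an assumption about the function's internals for illustration, not a quote from its source:

```r
# Illustration only: V5 parsed as logical versus V5 kept as character.
bim_bad <- data.frame(
  V2 = c("rs38", "rs11", "rs96"),
  V5 = c(TRUE, TRUE, TRUE),        # what read.table guesses for an all-"T" column
  V6 = c("C", "C", "A"),
  stringsAsFactors = FALSE
)
bim_ok <- bim_bad
bim_ok$V5 <- rep("T", 3)           # the intended allele codes

# Assumed label format: rsid_A1_A2.
labels_bad <- paste(bim_bad$V2, bim_bad$V5, bim_bad$V6, sep = "_")
labels_ok  <- paste(bim_ok$V2,  bim_ok$V5,  bim_ok$V6,  sep = "_")

labels_bad  # "rs38_TRUE_C" "rs11_TRUE_C" "rs96_TRUE_A"
labels_ok   # "rs38_T_C"    "rs11_T_C"    "rs96_T_A"
```

Labels like these would fail to match allele-aware identifiers built elsewhere, which may explain harmonisation problems downstream.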
I am currently encountering difficulties with the SuSiE-coloc analysis, as it indicates that the LD matrix is not consistent with the z-scores of the summary data. I am eager to learn more about your method for calculating the LD matrix, as I am still struggling with this aspect in order to perform a coloc-susie analysis.
BTW the attached file and the image mentioned in the correspondence were not received. Thank you for the assistance.
Best,
Ao
|
This file is an R script, containing dat2MRInput and Harmonise_LD_dat functions. Sincerely. |
Really appreciate it! That's really helpful! Thanks, Ao |
The following is a result of reading a bim file generated by plink, as implemented in ld_matrix_local.
In this case, V5 contains only "T" alleles. However, read.table in R guessed that this column must be logical because it is filled with only "T".
In R 4.3.0+ this behavior can be suppressed by supplying the option "tryLogical=F", whose result is: