
failed convergence and over-shrinkage #3

Open
weiw23 opened this issue Dec 9, 2022 · 5 comments

@weiw23

weiw23 commented Dec 9, 2022

Hi @jeffspence ,

I tried VILMA with the following inputs:

  1. the 1.7 GB HapMap LD matrix provided with this package;
  2. summary statistics with over 950K variants overlapping the LD matrix;
  3. K = 60 and K = 30 mixture components,

and I found the following issues (the AUC for the binary phenotype is close to 0.5):

  1. The VILMA estimates are over-shrunk: a large proportion of the summary-statistics effects are shrunk to 0;
  2. I got a "failed to converge" message for both K = 60 and K = 30;
  3. a few variants were estimated to have very large effects: their betas in the summary statistics are close to 0 (p-value > 0.05), but the posterior estimates are far from zero.

Can you please advise on how to resolve these issues?
Thanks a lot.
Wei

@jeffspence
Owner

Hi Wei,

It definitely sounds like something is going wrong. Are the variants that are shrunk to zero listed as having either missing summary statistics or missing LD information in the output tsv file?
It's also possible that there is a bug in vilma -- dichotomous traits have not been thoroughly tested, but I would need a bit more information to get to the bottom of it.
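If it's convenient, a quick check along these lines would answer the first question (just a rough sketch; the column names are guesses, so adjust them to whatever estimates.tsv actually contains):

```python
# Rough sketch: check whether the zeroed-out variants coincide with
# missing summary statistics or missing LD information.
# The column names (posterior_beta, missing_sumstats, missing_ld)
# are guesses, not confirmed vilma output.
import pandas as pd

est = pd.read_csv("estimates.tsv", sep="\t")

zeroed = est["posterior_beta"] == 0.0
print("variants shrunk to zero:", zeroed.sum())
print("of those, missing sumstats:", est.loc[zeroed, "missing_sumstats"].sum())
print("of those, missing LD:", est.loc[zeroed, "missing_ld"].sum())
```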

Thanks for using the software and hopefully we can get it working for you,
Jeff

@weiw23
Author

weiw23 commented Dec 13, 2022

Hi Jeff,

Thanks a lot for your quick response.
In the estimates.tsv file there are around 950K rows, and all of them have FALSE in both the missing_sumstats and missing LD columns.
I plotted the estimated betas against the summary-statistics betas (marginal betas); the points fall either close to the diagonal or on the y = 0 line, which suggests the shrinkage is doing reasonable work in principle. However, far too many betas are shrunk to zero.
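Roughly, the plot was something like this (a sketch; the beta column names are placeholders for whatever estimates.tsv actually calls them):

```python
# Sketch of the diagnostic plot: posterior mean beta vs. the marginal
# beta from the summary statistics. Column names are placeholders.
import matplotlib.pyplot as plt
import pandas as pd

est = pd.read_csv("estimates.tsv", sep="\t")
plt.scatter(est["marginal_beta"], est["posterior_beta"], s=2, alpha=0.3)
plt.axline((0, 0), slope=1, color="red", linewidth=1)  # the diagonal y = x
plt.axhline(0.0, color="gray", linewidth=1)            # the y = 0 line many points sit on
plt.xlabel("marginal beta (summary statistics)")
plt.ylabel("posterior mean beta (VILMA)")
plt.show()
```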
I don't think this is due to the binary phenotype, since the input is just the summary statistics (I will try a quantitative trait as well).
In my summary statistics, the sample size is much larger and the signal is much stronger.
Can you please advise on how to debug this issue?

Thanks again.
Wei

@weiw23
Author

weiw23 commented Dec 13, 2022

@jeffspence
May I ask whether you have tried the 1.7 GB LD matrix with summary statistics from a different GWAS cohort (in the same population)?

@weiw23
Author

weiw23 commented Dec 20, 2022

I used VILMA to build my own LD matrix from plink data and then ran the VILMA fit step to get a PRS. The posterior means are still over-shrunk to zero.
So this over-shrinkage issue might not be related to the LD matrix.
Can you please advise on which part you think could cause this issue? The LD threshold, or some other parameter?
Thanks a lot.

@jeffspence
Owner

Hi @weiw23

I have not used the provided LD matrix on other samples, but I have used other LD matrices on the sample used to generate the provided LD matrix, which should get at the same point. Accuracy goes down with misspecified LD matrices, but I have not encountered anything like what you've described, so I don't think it's the LD matrix.

It would be easiest for me if you could provide some summary stats (maybe by email) that reproduce this issue, so that I can track down what's going wrong. If that's not possible, then I would guess that the issue is one of the following:

  1. vilma tries to automatically select a reasonable range of effect sizes, and something could be going wrong there. vilma should produce a file like *covariance.pkl, and you could check the elements of the vector stored in that file (see the sketch after this list). Those elements correspond to the different variances in the mixture-of-Gaussians prior. I would double-check that they span the orders of magnitude you would expect for the squared true effect sizes. If they are all too small, then something is going wrong with the automatic gridding, and that would result in lots of effects getting shrunk to 0.
  2. There is some kind of numerical issue somewhere (e.g., underflow). This would be a bit harder to track down, but it would be good to fix any issues of this kind if possible, so I would be interested in getting to the root of the problem.
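For the check in point 1, something like this should do (a minimal sketch; I'm assuming the pickle deserializes to an array-like of per-component variances, and the filename is a placeholder for your actual output prefix):

```python
# Minimal sketch: inspect the prior variances stored in *covariance.pkl.
# Assumes the pickle holds an array-like of per-component variances;
# "output.covariance.pkl" is a placeholder filename.
import pickle

import numpy as np

with open("output.covariance.pkl", "rb") as f:
    cov = pickle.load(f)

variances = np.sort(np.asarray(cov, dtype=float).ravel())
print("number of mixture components:", variances.size)
print("smallest variance: %.3e" % variances[0])
print("largest variance:  %.3e" % variances[-1])
# These should span the orders of magnitude you expect for squared
# true effect sizes; if they are all tiny, the automatic gridding has
# gone wrong and most effects will be shrunk to 0.
```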

Hope this helps,
Jeff
