
Shared a lookup table for n = 320 (theta = 0.01) #14

Open
jshoyer opened this issue Jul 7, 2020 · 2 comments

jshoyer commented Jul 7, 2020

In case anyone is interested, I created a new likelihood lookup table for n = 320 sequences/chromosomes -- see https://zenodo.org/record/3934350
That seemed sufficiently large and computationally expensive to make sharing worthwhile.
I would have created a pull request, but the table is too large for GitHub (207.5 MB compressed, 806.8 MB uncompressed), and centralized distribution of the tables via Git is not disk-space-efficient anyway. Ideas for helping people discover the file would be welcome.
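For anyone who wants to fetch the table programmatically rather than clicking through the record page, the file list can be pulled from Zenodo's public REST API. A minimal sketch, assuming curl and jq are installed; the JSON field names follow Zenodo's records API and are worth double-checking against the actual response:

```bash
# Query Zenodo record 3934350 (where the lookup table lives, since it is
# too large for this repository) and print each attached file's name and
# direct download link.
curl -s https://zenodo.org/api/records/3934350 \
  | jq -r '.files[] | "\(.key)\t\(.links.self)"'
```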
Feel free to close this issue whenever.

jshoyer changed the title from "Shared a table for n = 320 (theta = 0.01)" to "Shared a lookup table for n = 320 (theta = 0.01)" on Jul 9, 2020

enocI21 commented Aug 10, 2020

Dear jshoyer,

Thank you very much for sharing your table; I would like to ask you a few questions.
How long did it take you to calculate that table? I have tried to calculate mine for 250 animals (500 sequences/chromosomes, 2.6 million configurations), but it is too slow: the analysis only uses about 2% of a server with 150 GB of RAM and 32 cores, so at that rate it would take roughly two years to finish the table. My eventual goal is a table for 1000 animals, since I have at least 13,000 genotypes. Is there any advice you can give me?
Is there a way to allocate more memory and cores to the calculation to speed up the process?
Thanks for your answer.


jshoyer commented Aug 11, 2020

I used both Slurm job arrays and GNU parallel to parallelize the computations with the --splits flag to ldhat complete. See the job script that I included in the Zenodo record: https://zenodo.org/record/3934350/files/ldhat-complete-n320-t0.01-split10000fold.sbatch

The job will still take quite a while with 32 CPU cores. I used hundreds of cores.
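For orientation, the overall pattern is roughly the sketch below: a Slurm job array in which each array task uses GNU parallel to run a batch of splits of the complete computation. The 10000-fold split echoes the "split10000fold" in the script's filename, but the per-task batching, the split-flag syntax (--splits/--split), the likelihood-grid settings (-rhomax 100 -n_pts 101), and the output prefix are illustrative assumptions only; the exact invocation is in the .sbatch file on the Zenodo record.

```bash
#!/bin/bash
#SBATCH --job-name=ldhat-complete-n320
#SBATCH --array=1-1000         # 1000 array tasks, each handling a batch of splits
#SBATCH --cpus-per-task=10     # GNU parallel fans out within each task
#SBATCH --mem=8G
#SBATCH --time=7-00:00:00

# Hypothetical split bookkeeping: 10000 total splits, 10 per array task.
SPLITS_TOTAL=10000
PER_TASK=10
FIRST=$(( (SLURM_ARRAY_TASK_ID - 1) * PER_TASK + 1 ))
LAST=$((  SLURM_ARRAY_TASK_ID * PER_TASK ))

# Run one split of the lookup-table computation per parallel slot.
# The flag names below are assumed -- check the real sbatch script for
# the exact complete invocation.
seq "$FIRST" "$LAST" | parallel -j "$SLURM_CPUS_PER_TASK" \
    ./complete -n 320 -rhomax 100 -n_pts 101 -theta 0.01 \
               --splits "$SPLITS_TOTAL" --split {} \
               -prefix n320_t0.01_split{}_
```

The main lever for speed is widening the array (more concurrent tasks across nodes) rather than giving a single process more memory, which is why a single 32-core node stays slow no matter how much RAM it has.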
