Extremely Slow in cal_ot_mat
#15
Comments
Unfortunately, the computation of the earth-mover distance is time-consuming, so there are two possible approaches:
1) Reduce to top genes and coarse-grain cells. This is the way used in the tutorial and the relevant code looks like
2) Limit the cell-cell pairs for which the distance is calculated. The function
For an in-depth discussion, please have a look at this article for the R version of the tool: https://klugerlab.github.io/GeneTrajectory/articles/fast_computation.html
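To illustrate why coarse-graining cells helps, here is a minimal sketch of a single earth-mover distance written as a linear program with SciPy. It is not the package's implementation; the helper name `emd_lp` and the toy three-point cost matrix are assumptions made for illustration only. The point is that one gene pair needs one unknown per (cell, cell) combination, so the problem size grows with the square of the number of (meta-)cells.

```python
import numpy as np
from scipy.optimize import linprog


def emd_lp(a, b, cost):
    """Earth-mover distance between histograms a and b (hypothetical helper).

    a and b must be non-negative and carry the same total mass;
    cost[i, j] is the cost of moving one unit of mass from point i to point j.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cost = np.asarray(cost, dtype=float)
    n, m = cost.shape
    # One unknown per (source, target) pair: n * m transport variables,
    # flattened row-major so that plan[i, j] sits at index i * m + j.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0  # mass leaving source i equals a[i]
    for j in range(m):
        A_eq[n + j, j::m] = 1.0           # mass arriving at target j equals b[j]
    b_eq = np.concatenate([a, b])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun


# Toy example: three points on a line, cost = distance between positions.
positions = np.arange(3)
cost = np.abs(positions[:, None] - positions[None, :])
d_far = emd_lp([1, 0, 0], [0, 0, 1], cost)       # all mass moves two steps
d_near = emd_lp([0.5, 0.5, 0], [0, 0.5, 0.5], cost)
print(d_far, d_near)
```

With a dense formulation like this, 500 meta-cells give up to 500 × 500 = 250,000 unknowns per gene pair, whereas 1570 un-coarsened cells would give up to 1570 × 1570 = 2,464,900.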
Thanks @fra-pcmgf. However, I have already reduced to top genes and coarse-grained the cells, as the tutorial did. The only difference is that I set
genes = select_top_genes(..., n_variable_genes=3000)
# .......
gene_expression_updated, graph_dist_updated = coarse_grain_adata(..., features=genes, n=500)
The "Improve the computation efficiency of gene-gene distance matrix" documentation says that,
But currently this
I'm just wondering whether such a long computation time is normal for a dataset of this size, or whether it is some kind of bug. I'm not sure and have no idea about these questions.
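As a rough back-of-the-envelope check on the question above: a gene-gene distance matrix is symmetric, so it needs one transport problem per unordered gene pair, and that count grows quadratically with the number of genes. The sketch below only counts pairs; the helper name `n_gene_pairs` is hypothetical, and the values 500 and 1000 are arbitrary comparison points, not tutorial defaults. Note also that `n_variable_genes=3000` is treated here as an upper bound on the number of genes actually selected, which is an assumption.

```python
def n_gene_pairs(n_genes: int) -> int:
    """Number of unordered gene pairs, i.e. entries above the matrix diagonal."""
    return n_genes * (n_genes - 1) // 2


# Hypothetical gene counts for comparison; 3000 is the value used in this thread.
for n_genes in (500, 1000, 3000):
    print(f"{n_genes} genes -> {n_gene_pairs(n_genes):,} pairwise distances")

# How much more work 3000 genes is than 500, assuming equal cost per pair.
ratio = n_gene_pairs(3000) / n_gene_pairs(500)
print(f"3000 vs 500 genes: about {ratio:.0f}x as many pairs")
```

With up to 3000 genes there are up to 4,498,500 pairs, so a run that takes minutes for a few hundred genes can plausibly take hours here without any bug being involved.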
Can you try to run the mouse dermal tutorial? It uses the defaults for
Sorry for the late response. I have just run the mouse tutorial, with all parameters kept the same as in the tutorial. My CPUs are 2 × Intel Xeon E5-2680 v4, and all 28 cores are used during the calculation. I manually stopped the
So it's even stranger that a smaller dataset, whose shape is 1570 × 19241, takes more time, with
Any idea?
The speed on the mouse tutorial (you can see the ETA of 48:17 in the progress bar) looks similar to mine, so I don't think there is any issue with package versions or your installation. I'm not sure why the progress bar doesn't show. It could be because the parallel computation never starts (e.g. if the matrix is large and the machine runs out of memory, or there are issues with Python's multiprocessing), but I can't think of why that would happen if you can run the tutorial, which has a similar size. Can you check that you are running For example running
returns for the mouse tutorial
Dear developers,
I have a tiny anndata object whose shape is 1570 × 19241. But when I ran cal_ot_mat as in the tutorial "Gene Trajectory Python tutorial: Human myeloid", progress was extremely slow and no progress bar was shown. It has taken more than 90 minutes without any result, and it is still running. The system monitor suggests all cores are fully occupied.
This is very weird for such a tiny dataset. Any help is greatly appreciated.
Here is my code.
My anndata object,
My packages,
Python version 3.9.19