tophatfusion_post process does not fully utilise the CPUs it gets #23
Comments
Currently testing the suggested solution. |
According to the current tests, a smaller split improves CPU usage, but not the run time.
Because a job's runtime on a farm can be affected by other processes, hardware differences between nodes, network connections, etc., I'm now testing it using Docker on an FCE instance. In the whole tophatfusion_post process, filtering the fusion events from the split files takes slightly more than half of the total runtime. We have no control over the other steps in the process, as they all run under tophat_fusion_post.py, which is an external tool. So if smaller splits do not help, there is no other way to tune this process's performance. Furthermore, in the smaller split size (5000) run (under Singularity on Sanger Farm4), one fusion event is lost:
|
Running tophat fusion on farm4 and on an FCE instance with the same split settings returned different results. In most cases, results on FCE have one fewer spanning read. This could be because mapping was done separately. |
Tested three tophat_fusion split sizes: 50,000, 10,000 and 5,000, using Docker on an FCE instance with 16 cores.
Split_5000: 65892.73user 2417.95system 4:31:44elapsed 418%CPU
Split_10000: 58742.54user 2230.02system 4:01:17elapsed 421%CPU
Split_50000: 48274.74user 2209.20system 3:51:01elapsed 364%CPU
It seems that 20,000/25,000 may provide a sweet spot. |
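As a sanity check on the figures above: GNU time's %CPU is simply (user + system) / elapsed, truncated to a whole percent. A minimal script (timings copied from the runs above) reproduces the reported numbers:

```python
# Recompute GNU time's %CPU = (user + system) / elapsed for the three runs.
# The timings are copied verbatim from the measurements above.

def elapsed_seconds(stamp):
    """Convert an elapsed stamp like '4:31:44' to seconds."""
    secs = 0.0
    for part in stamp.split(":"):
        secs = secs * 60 + float(part)
    return secs

runs = {
    "Split_5000":  (65892.73, 2417.95, "4:31:44"),
    "Split_10000": (58742.54, 2230.02, "4:01:17"),
    "Split_50000": (48274.74, 2209.20, "3:51:01"),
}

for name, (user, system, elapsed) in runs.items():
    pct = 100.0 * (user + system) / elapsed_seconds(elapsed)
    print(f"{name}: {int(pct)}%CPU")  # truncate, matching the reported figures
```

With 16 cores available, ~420% CPU means roughly 4 of the 16 cores are busy on average, which matches the under-utilisation this issue describes.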
|
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_25000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_20000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_10000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_5000/108988.tophat-fusion.normals.filtered.strand.txt)
269d268
< 7:23253332-M:8463 GPNMB 7 23253332 MT-ATP8 M 8463 1 19 1 0.00 + +
It also seems that split-5000 affects output. I'll leave the split size unchanged. |
@byb121 have you considered the impact of only allowing 4 threads? i.e. if 16 cores run in 4h but only use ~400% CPU, what do 12/8/4 CPUs do to runtime and memory usage? If the additional threads don't speed it up and are wasted for the bulk of the run, we should be reducing them. |
deFuse uses CPUs very efficiently:
STAR-Fusion cannot use more than 400% CPU, but it is quite quick compared to tophat and deFuse. Its run time stabilises at about 80 minutes when more than 4 cores are given. If we group the three tools in one script, I think using 8 cores is our best choice. |
More CPU/wall time/RAM stats: https://github.com/cancerit/cgpRna/wiki/7.-CPU,-Wall-Time-and-Max-RAM |
As @keiranmraine reported: the pipeline version is requesting 16 CPUs, but the load is only ~2; looking at the point where tophat-fusion-post is executing, there are only 5 processes under it.
We suspected this is because sometimes there aren't many fusion events to split into smaller chunks for tophatfusion_post to fill its threads with. Reducing the number on this line could lead to more, smaller splits of fusion events, so that hopefully tophatfusion_post can achieve better CPU usage and a shorter runtime.
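The suspected effect can be sketched as follows. This is illustrative Python, not the actual tophat_fusion_post.py code: when the split size is too large, there are fewer chunks than worker threads, so most threads never receive any work.

```python
# Illustrative sketch of why a smaller split size helps fill threads:
# fewer events per split -> more chunks -> enough work units for all workers.
from concurrent.futures import ThreadPoolExecutor

def chunk(events, split_size):
    """Yield successive slices of at most split_size events."""
    for i in range(0, len(events), split_size):
        yield events[i:i + split_size]

def filter_chunk(events):
    # Stand-in for the real per-split fusion-filtering step.
    return [e for e in events if e % 2 == 0]

events = list(range(100_000))   # pretend fusion events
threads = 16

for split_size in (50_000, 5_000):
    chunks = list(chunk(events, split_size))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        kept = sum(len(r) for r in pool.map(filter_chunk, chunks))
    # split_size=50,000 -> only 2 chunks, so at most 2 of 16 threads have work;
    # split_size=5,000  -> 20 chunks, enough to keep all 16 threads busy.
    print(f"split_size={split_size}: {len(chunks)} chunks, {kept} events kept")
```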