tophatfusion_post process does not fully utilise CPUs it gets #23

Closed
byb121 opened this issue Jul 12, 2019 · 9 comments

byb121 commented Jul 12, 2019

As @keiranmraine reported: the pipeline version requests 16 CPUs, but the load is only ~2. At the point where tophat-fusion-post is executing, there are only 5 processes running under it.

We suspect this is because there are sometimes too few fusion events to split into enough chunks to keep all of tophatfusion_post's threads busy. Reducing the number on this line could therefore produce more, smaller splits of fusion events, so that hopefully tophatfusion_post gets better CPU usage and a shorter runtime.
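
To illustrate the suspicion, here is a minimal sketch. It assumes the fusion events are cut into fixed-size chunks and that each chunk keeps at most one thread busy; the function names and the event count are hypothetical and are not taken from cgpRna or tophat-fusion-post.

```python
import math


def chunk_count(n_events: int, split_size: int) -> int:
    """Number of fixed-size chunks needed for n_events (assumed splitting scheme)."""
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    return math.ceil(n_events / split_size)


def busy_threads(n_events: int, split_size: int, threads: int) -> int:
    """Upper bound on threads that have work, assuming one chunk per thread at a time."""
    return min(threads, chunk_count(n_events, split_size))


# Hypothetical example: 120,000 candidate events on 16 threads.
for split_size in (50_000, 5_000):
    print(split_size, busy_threads(120_000, split_size, 16))
```

With these made-up numbers, a split size of 50,000 yields 3 chunks, so at most 3 of the 16 threads have work, while a split size of 5,000 yields 24 chunks, enough to occupy all 16.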

byb121 self-assigned this Jul 12, 2019

byb121 commented Jul 12, 2019

Currently testing the suggested solution.

byb121 commented Jul 24, 2019

According to the current tests, a smaller split size improves CPU usage but not the run time.

$ tail -n 3 108988_infuse_tophat_1/logs_tophat/Sanger_CGP_Tophat_Implement_tophatfusion_post.0.err
[Wed Jul 17 02:32:08 2019] Run complete [06:44:17 elapsed]
73607.32user 15274.00system 6:44:18elapsed 366%CPU (0avgtext+0avgdata 12813588maxresident)k
78497348inputs+968984outputs (70559major+7871198468minor)pagefaults 0swaps
$ tail -n 3 108988_infuse_tophat_2/logs_tophat/Sanger_CGP_Tophat_Implement_tophatfusion_post.0.err
[Sat Jul 20 09:39:52 2019] Run complete [10:45:22 elapsed]
184473.51user 32143.11system 10:45:23elapsed 559%CPU (0avgtext+0avgdata 12811840maxresident)k
78512770inputs+969768outputs (65834major+7857753864minor)pagefaults 0swaps
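
The %CPU figure in these logs is (user + system) / elapsed. A small sketch of that arithmetic follows, assuming the lines are in GNU time's default output format; the helper names are hypothetical.

```python
import re

# Matches the first line of GNU time's default summary, e.g.
# "73607.32user 15274.00system 6:44:18elapsed 366%CPU ..."
TIME_RE = re.compile(
    r"(?P<user>[\d.]+)user\s+(?P<system>[\d.]+)system\s+"
    r"(?P<elapsed>[\d:.]+)elapsed\s+(?P<cpu>\d+)%CPU"
)


def elapsed_to_seconds(text: str) -> float:
    """Convert 'h:mm:ss' or 'm:ss.ss' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarise(line: str) -> dict:
    """Return CPU seconds, wall seconds and average busy cores for one summary line."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError("not a GNU time summary line")
    cpu_seconds = float(match["user"]) + float(match["system"])
    wall_seconds = elapsed_to_seconds(match["elapsed"])
    return {
        "cpu_seconds": cpu_seconds,
        "wall_seconds": wall_seconds,
        "busy_cores": cpu_seconds / wall_seconds,
    }


run_1 = summarise("73607.32user 15274.00system 6:44:18elapsed 366%CPU")
run_2 = summarise("184473.51user 32143.11system 10:45:23elapsed 559%CPU")
print(round(run_1["busy_cores"], 2), round(run_2["busy_cores"], 2))
```

Read this way, the second run keeps more cores busy on average (about 5.6 versus 3.7) but spends roughly 2.4 times as many CPU seconds, which is why the higher %CPU does not translate into a shorter wall time.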

Because a job's runtime on a farm can be affected by other processes, hardware differences between farm nodes, network connections, etc., I'm now testing it using Docker on an FCE instance.

In the whole tophatfusion_post process, filtering the fusion events from the split files takes slightly more than half of the total runtime. We have no control over the other steps in the process, as they all run under tophat_fusion_post.py, which is an external tool. So if smaller splits do not help, there is no other way for us to tune this process's performance.

Furthermore, in the run with the smaller split size (5000), under Singularity on Sanger Farm4, one fusion event is lost:

$ diff <(cut -f 1,3- 108988_infuse_tophat_1/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_infuse_tophat_2/108988.tophat-fusion.normals.filtered.strand.txt)
269d268
< 7:23253332-M:8463	GPNMB	7	23253332	MT-ATP8	M	8463	1	19	1	0.00	+	+

byb121 commented Jul 24, 2019

Running tophat fusion on farm4 and on an FCE instance with the same split settings returned different results. In most cases, the results on FCE have one fewer spanning read.

This could be because mapping was done separately.

byb121 commented Jul 25, 2019

Tested three tophat_fusion split sizes: 50,000, 10,000 and 5,000 (using Docker on an FCE instance).

16 cores

Split_5000: 65892.73user 2417.95system 4:31:44elapsed 418%CPU
Split_10000: 58742.54user 2230.02system 4:01:17elapsed 421%CPU
Split_50000: 48274.74user 2209.20system 3:51:01elapsed 364%CPU

It seems that 20,000/25,000 may provide a sweet spot.

byb121 commented Jul 26, 2019

Split_20000: 52726.10user 2074.86system 4:06:54elapsed 369%CPU
Split_25000: 50817.56user 2031.27system 3:57:39elapsed 370%CPU
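
To compare the five split sizes side by side, the timings can be tabulated with a short sketch like this one. The figures are copied from the comments above; the helper names are hypothetical.

```python
# (user seconds, system seconds, wall time) per split size,
# copied from the 16-core Docker runs reported in this thread.
RUNS = {
    5_000: (65892.73, 2417.95, "4:31:44"),
    10_000: (58742.54, 2230.02, "4:01:17"),
    20_000: (52726.10, 2074.86, "4:06:54"),
    25_000: (50817.56, 2031.27, "3:57:39"),
    50_000: (48274.74, 2209.20, "3:51:01"),
}


def hms_to_seconds(text: str) -> int:
    """Convert 'h:mm:ss' to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def table(runs: dict) -> list:
    """Return (split_size, wall_seconds, cpu_seconds) rows sorted by wall time."""
    rows = [
        (size, hms_to_seconds(wall), round(user + system, 2))
        for size, (user, system, wall) in runs.items()
    ]
    return sorted(rows, key=lambda row: row[1])


for size, wall, cpu in table(RUNS):
    print(f"{size:>6}  wall={wall:>6}s  cpu={cpu:>9.2f}s")
```

On these single runs, the 50,000 split has both the shortest wall time and the lowest total CPU time. With only one run per setting, differences of a few minutes could be noise.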

byb121 commented Jul 26, 2019

$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_25000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_20000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_10000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_5000/108988.tophat-fusion.normals.filtered.strand.txt)
269d268
< 7:23253332-M:8463	GPNMB	7	23253332	MT-ATP8	M	8463	1	19	1	0.00	+	+

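It also seems that a split size of 5,000 affects the output: the 25,000, 20,000 and 10,000 runs match the 50,000 run exactly, and only the 5,000 run drops a fusion event.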
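
The `cut -f 1,3-` in the commands above drops the second tab-separated column before comparing. A rough Python sketch of the same check follows. Unlike `diff`, it compares rows as sets, so it ignores ordering and duplicates; the second-column values and the second row in the example data are made up for illustration.

```python
def fusion_keys(lines):
    """Return the set of rows with the second tab-separated column removed."""
    keys = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        keys.add(tuple(fields[:1] + fields[2:]))
    return keys


def lost_events(reference_lines, candidate_lines):
    """Rows present in the reference output but missing from the candidate."""
    return fusion_keys(reference_lines) - fusion_keys(candidate_lines)


# Hypothetical miniature inputs; the real files have more columns and rows.
reference = [
    "7:23253332-M:8463\trun_a\tGPNMB\t7\t23253332\tMT-ATP8\tM\t8463\n",
    "1:100-2:200\trun_a\tGENE1\t1\t100\tGENE2\t2\t200\n",
]
candidate = [
    "1:100-2:200\trun_b\tGENE1\t1\t100\tGENE2\t2\t200\n",
]

lost = lost_events(reference, candidate)
print(sorted(lost))
```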

I'll leave the split size unchanged.

keiranmraine commented

@byb121 have you considered the impact of only allowing 4 threads? I.e. if 16 cores runs in 4h but only uses ~400% CPU, what do 12/8/4 CPUs do to runtime and memory usage? If the additional threads don't speed it up and are wasted for the bulk of the run, we should be reducing them.
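
The point about wasted threads can be put in numbers with a minimal sketch. It assumes, purely for illustration, that the job stays at ~400% CPU whatever the core count, which is exactly what would need measuring; the function name is hypothetical.

```python
def utilisation(cpu_percent: float, cores: int) -> float:
    """Fraction of the requested cores kept busy on average."""
    return cpu_percent / (100.0 * cores)


# Assumption: %CPU stays near 400 regardless of how many cores are requested.
for cores in (16, 12, 8, 4):
    print(cores, f"{utilisation(400, cores):.0%}")
```

Under that assumption, a 16-core request keeps only a quarter of its cores busy on average, while a 4-core request would be fully used.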

keiranmraine reopened this Jul 29, 2019

byb121 commented Aug 8, 2019

For tophat-fusion, there's little performance difference between 16 and 8 cores, but if reduced to 4 cores, it takes 4 hours longer to run compared to 9.5 hours on 8 cores.

Defuse uses CPUs very efficiently:

defuse_fusion_defuse.time     16 cores    8 cores     4 cores
CPU                           1159%       668%        370%
Wall time (hh:mm:ss)          06:15:08    10:18:23    18:41:36
Max resident size (kbytes)    6,124,448   6,124,588   6,124,900
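
One way to quantify "very efficiently" is speedup and parallel efficiency relative to the 4-core run. The sketch below uses the wall times from the table above; the helper names are hypothetical.

```python
# Wall times (hh:mm:ss) for defuse, copied from the table above.
DEFUSE_WALL = {16: "06:15:08", 8: "10:18:23", 4: "18:41:36"}


def hms_to_seconds(text: str) -> int:
    """Convert 'hh:mm:ss' to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def scaling(wall_times: dict, baseline_cores: int) -> dict:
    """Map cores -> (speedup, efficiency) relative to the baseline run."""
    base = hms_to_seconds(wall_times[baseline_cores])
    result = {}
    for cores, wall in wall_times.items():
        speedup = base / hms_to_seconds(wall)
        efficiency = speedup / (cores / baseline_cores)
        result[cores] = (speedup, efficiency)
    return result


for cores, (speedup, efficiency) in sorted(scaling(DEFUSE_WALL, 4).items()):
    print(f"{cores:>2} cores: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Going from 4 to 8 cores gives about a 1.8x speedup (about 91% efficiency), and from 4 to 16 cores about 3.0x (about 75% efficiency).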

Star-fusion cannot use more than 400% CPU, but it is quite quick compared to tophat and defuse. Its run time is stable at about 80 minutes when more than 4 cores are given.

If we group the three tools in one script, I think using 8 cores is our best choice.

byb121 commented Aug 9, 2019

More CPU/wall time/RAM stats: https://github.com/cancerit/cgpRna/wiki/7.-CPU,-Wall-Time-and-Max-RAM

byb121 closed this as completed Aug 9, 2019