tophatfusion_post process does not fully utilise the CPUs it gets #23
Comments
Currently testing the suggested solution. |
According to the current tests, a smaller split improves CPU usage, but not the run time.
Because a job's runtime on a farm can be affected by other processes, hardware differences between nodes, network connections, etc., I'm now testing it using Docker on an FCE instance. In the whole tophatfusion_post process, filtering the fusion events from the split files takes slightly more than half of the total runtime. We have no control over the other steps in the process, as they all run under tophat_fusion_post.py, which is an external tool. So if smaller splits do not help, there is no other way to tune this process's performance. Furthermore, in the smaller split size (5000) run (under Singularity on Sanger Farm4), one fusion event is lost:
|
Running tophat fusion on farm4 and on an FCE instance with the same split settings returned different results. In most cases, results on FCE have one fewer spanning read. This could be because mapping was done separately. |
Tested three tophat_fusion split sizes: 50,000, 10,000 and 5,000, using Docker on an FCE instance with 16 cores.
Split_5000: 65892.73user 2417.95system 4:31:44elapsed 418%CPU
Split_10000: 58742.54user 2230.02system 4:01:17elapsed 421%CPU
Split_50000: 48274.74user 2209.20system 3:51:01elapsed 364%CPU
It seems that 20,000/25,000 may provide a sweet spot. |
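As a sanity check on the figures above: GNU time's %CPU is simply (user + system) / elapsed, truncated to a whole percent. A minimal script (timings copied from the runs above) reproduces the reported numbers:

```python
# Recompute GNU time's %CPU = (user + system) / elapsed for the three runs.
# The timings are copied verbatim from the measurements above.

def elapsed_seconds(stamp):
    """Convert an elapsed stamp like '4:31:44' to seconds."""
    secs = 0.0
    for part in stamp.split(":"):
        secs = secs * 60 + float(part)
    return secs

runs = {
    "Split_5000":  (65892.73, 2417.95, "4:31:44"),
    "Split_10000": (58742.54, 2230.02, "4:01:17"),
    "Split_50000": (48274.74, 2209.20, "3:51:01"),
}

for name, (user, system, elapsed) in runs.items():
    pct = 100.0 * (user + system) / elapsed_seconds(elapsed)
    print(f"{name}: {int(pct)}%CPU")  # truncate, matching the reported figures
```

With 16 cores available, ~420% CPU means roughly 4 of the 16 cores are busy on average, which matches the under-utilisation this issue describes.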
|
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_25000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_20000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_10000/108988.tophat-fusion.normals.filtered.strand.txt)
$ diff <(cut -f 1,3- 108988_tophatfusion_50000/108988.tophat-fusion.normals.filtered.strand.txt) <(cut -f 1,3- 108988_tophatfusion_5000/108988.tophat-fusion.normals.filtered.strand.txt)
269d268
< 7:23253332-M:8463 GPNMB 7 23253332 MT-ATP8 M 8463 1 19 1 0.00 + +
It also seems that split-5000 affects output. I'll leave the split size unchanged. |
@byb121 have you considered the impact of only allowing 4 threads? i.e. if 16 cores run in 4h but only use ~400% CPU, what do 12/8/4 CPUs do to runtime and memory usage? If the additional threads don't speed it up and are wasted for the bulk of the run, we should be reducing them. |
deFuse uses CPUs very efficiently:
STAR-Fusion cannot use more than 400% CPU, but it is quite quick compared to tophat and deFuse. Its run time stabilises at about 80 minutes when more than 4 cores are given. If we group the three tools in one script, I think using 8 cores is our best choice. |
More CPU/wall time/RAM stats: https://github.com/cancerit/cgpRna/wiki/7.-CPU,-Wall-Time-and-Max-RAM |
As @keiranmraine reported: the pipeline version is requesting 16 CPUs, but the load is only ~2; looking at the point where tophat-fusion-post is executing, there are only 5 processes under it.
We suspected this is because sometimes there aren't many fusion events to split into smaller chunks for tophatfusion_post to fill its threads with. Reducing the number on this line could lead to more, smaller splits of fusion events, so that hopefully tophatfusion_post can achieve better CPU usage and a shorter runtime.
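The suspected effect can be sketched as follows. This is illustrative Python, not the actual tophat_fusion_post.py code: when the split size is too large, there are fewer chunks than worker threads, so most threads never receive any work.

```python
# Illustrative sketch of why a smaller split size helps fill threads:
# fewer events per split -> more chunks -> enough work units for all workers.
from concurrent.futures import ThreadPoolExecutor

def chunk(events, split_size):
    """Yield successive slices of at most split_size events."""
    for i in range(0, len(events), split_size):
        yield events[i:i + split_size]

def filter_chunk(events):
    # Stand-in for the real per-split fusion-filtering step.
    return [e for e in events if e % 2 == 0]

events = list(range(100_000))   # pretend fusion events
threads = 16

for split_size in (50_000, 5_000):
    chunks = list(chunk(events, split_size))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        kept = sum(len(r) for r in pool.map(filter_chunk, chunks))
    # split_size=50,000 -> only 2 chunks, so at most 2 of 16 threads have work;
    # split_size=5,000  -> 20 chunks, enough to keep all 16 threads busy.
    print(f"split_size={split_size}: {len(chunks)} chunks, {kept} events kept")
```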