
Performance (speed up) is poor #240

Open

vikasrs opened this issue Sep 13, 2019 · 1 comment

vikasrs commented Sep 13, 2019

Hi,
My specs:
iMac 4 Ghz Intel core i7
32 GB DDR3
AMD Radeon R9 M395X 4096 MB
MacOS 10.13.6

I built and installed ngraph-bridge with the PlaidML backend from source and tested execution times. I am not seeing any meaningful speedup using nGraph with either the CPU or the PlaidML backend over regular TensorFlow (no nGraph).

I am running the classify example suggested in the docs in a loop, so I am not just measuring the first execution.
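
For reference, the timing loop is essentially the following (a simplified sketch: the graph file, image file, and tensor names below are placeholders based on the stock Inception classify example, and the label-mapping code is omitted):

import time
import tensorflow as tf
import ngraph_bridge  # importing the bridge enables the nGraph rewrite pass

# Placeholder paths and tensor names standing in for the classify example's inputs.
with tf.gfile.GFile("inception_graph.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name="")

image_data = tf.gfile.GFile("image.jpg", "rb").read()

with tf.Session() as sess:
    softmax = sess.graph.get_tensor_by_name("softmax:0")
    for i in range(10):
        print(i)
        start = time.time()
        sess.run(softmax, feed_dict={"DecodeJpeg/contents:0": image_data})
        print("Elapsed: ", time.time() - start)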

nGraph with PlaidML: I can see in Activity Monitor that GPU utilization spikes considerably. However, I don't see any speedup compared to the run without nGraph.
(iteration # followed by elapsed time for sess.run in seconds)

2019-09-13 08:11:57.699701: I /Users/vikasrs/Documents/Software/ngraph-bridge/ngraph_bridge/ngraph_rewrite_pass.cc:235] NGraph using backend: PLAIDML:0
0
Elapsed:  5.9642627239227295
1
Elapsed:  0.050453901290893555
2
Elapsed:  0.05062294006347656
3
Elapsed:  0.05094504356384277
4
Elapsed:  0.05070328712463379
5
Elapsed:  0.05025196075439453
6
Elapsed:  0.050736188888549805
7
Elapsed:  0.050640106201171875
8
Elapsed:  0.050331830978393555
9
Elapsed:  0.050276756286621094

nGraph with CPU backend

2019-09-13 08:25:41.420098: I /Users/vikasrs/Documents/Software/ngraph-bridge/ngraph_bridge/ngraph_rewrite_pass.cc:235] NGraph using backend: CPU
0
Elapsed:  4.661229133605957
1
Elapsed:  0.1206521987915039
2
Elapsed:  0.12163519859313965
3
Elapsed:  0.12114787101745605
4
Elapsed:  0.12049198150634766
5
Elapsed:  0.12010717391967773
6
Elapsed:  0.12009716033935547
7
Elapsed:  0.12112116813659668
8
Elapsed:  0.12095427513122559
9
Elapsed:  0.1200251579284668

No nGraph

0
Elapsed:  3.480649948120117
1
Elapsed:  0.05639004707336426
2
Elapsed:  0.05907034873962402
3
Elapsed:  0.05813884735107422
4
Elapsed:  0.061283111572265625
5
Elapsed:  0.05951809883117676
6
Elapsed:  0.05975770950317383
7
Elapsed:  0.05814003944396973
8
Elapsed:  0.05785775184631348
9
Elapsed:  0.057444095611572266

Is this the expected result or am I missing something?

@denise-k

Hi! I’m from the PlaidML team. We wanted to address the slowness you are seeing when using the PlaidML backend with nGraph.

We are aware of performance issues that arise when converting nGraph ops to the PlaidML level, and we have been actively working on fixing the most egregious of them.

In the longer term, we have also been working on new ways of representing operations in PlaidML which will facilitate even more efficient lowering from nGraph to PlaidML. If you'd like to test out our newer code, you can set the environment variable USE_STRIPE to 1.
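
For example (a sketch, assuming you drive the model from a Python script), the variable just needs to be visible in the process environment before the bridge and backend are initialized:

import os

# Sketch: set USE_STRIPE before the nGraph bridge / PlaidML backend is loaded,
# so the backend sees it when it initializes. Equivalent to exporting the
# variable in the shell before launching the script.
os.environ["USE_STRIPE"] = "1"

import tensorflow as tf   # noqa: E402
import ngraph_bridge      # noqa: E402  (imported after the env var is set)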

Additionally, we'd like to mention a caveat specific to the example you provided:

  • PlaidML works best with sufficiently large networks and larger batch sizes. This is because GPU-based computations have overhead associated with data transfers between the CPU and GPU that you don't see when you use the CPU by itself. This may be why you don't see a performance benefit from using your GPU + PlaidML for the inference example you specified; the sketch below illustrates how batching helps amortize that overhead.
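
To illustrate (a minimal, self-contained sketch rather than the actual classify graph), leaving the batch dimension dynamic lets a single sess.run process many images at once, so the host-to-device transfer cost is shared across the whole batch:

import numpy as np
import tensorflow as tf

# Toy graph, not Inception: the leading batch dimension is left as None, so the
# same sess.run call works for a batch of 1 or a batch of 32.
images = tf.placeholder(tf.float32, shape=[None, 299, 299, 3], name="images")
features = tf.layers.conv2d(images, filters=8, kernel_size=3, activation=tf.nn.relu)
scores = tf.reduce_mean(features, axis=[1, 2, 3])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.rand(32, 299, 299, 3).astype(np.float32)  # stand-in data
    out = sess.run(scores, feed_dict={images: batch})  # 32 images in one run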

If you have any questions about this, please feel free to reach out to our team by filing an issue or joining the collective nGraph + PlaidML workspace.
