
Performance (speed up) is poor #240

Open

vikasrs opened this issue Sep 13, 2019 · 1 comment

vikasrs commented Sep 13, 2019

Hi,
My specs:
iMac 4 Ghz Intel core i7
32 GB DDR3
AMD Radeon R9 M395X 4096 MB
MacOS 10.13.6

I built and installed ngraph-bridge with the PlaidML backend from source and tested execution times. I am not seeing any meaningful speedup using nGraph with either the CPU or the PlaidML backend over regular TensorFlow (no nGraph).

I am running the classify example suggested in the docs in a loop, so I am not just measuring the first execution.
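
For reference, the timing loop is essentially the following (a simplified sketch: the graph file, image file, and tensor names below are placeholders based on the stock Inception classify example, and the label-mapping code is omitted):

import time
import tensorflow as tf
import ngraph_bridge  # importing the bridge enables the nGraph rewrite pass

# Placeholder paths and tensor names standing in for the classify example's inputs.
with tf.gfile.GFile("inception_graph.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name="")

image_data = tf.gfile.GFile("image.jpg", "rb").read()

with tf.Session() as sess:
    softmax = sess.graph.get_tensor_by_name("softmax:0")
    for i in range(10):
        print(i)
        start = time.time()
        sess.run(softmax, feed_dict={"DecodeJpeg/contents:0": image_data})
        print("Elapsed: ", time.time() - start)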

nGraph with PlaidML: I can see in Activity Monitor that GPU utilization spikes considerably. However, I don't see any speedup compared to the run without nGraph.
(iteration # followed by elapsed time for sess.run in seconds)

2019-09-13 08:11:57.699701: I /Users/vikasrs/Documents/Software/ngraph-bridge/ngraph_bridge/ngraph_rewrite_pass.cc:235] NGraph using backend: PLAIDML:0
0
Elapsed:  5.9642627239227295
1
Elapsed:  0.050453901290893555
2
Elapsed:  0.05062294006347656
3
Elapsed:  0.05094504356384277
4
Elapsed:  0.05070328712463379
5
Elapsed:  0.05025196075439453
6
Elapsed:  0.050736188888549805
7
Elapsed:  0.050640106201171875
8
Elapsed:  0.050331830978393555
9
Elapsed:  0.050276756286621094

nGraph with CPU backend

2019-09-13 08:25:41.420098: I /Users/vikasrs/Documents/Software/ngraph-bridge/ngraph_bridge/ngraph_rewrite_pass.cc:235] NGraph using backend: CPU
0
Elapsed:  4.661229133605957
1
Elapsed:  0.1206521987915039
2
Elapsed:  0.12163519859313965
3
Elapsed:  0.12114787101745605
4
Elapsed:  0.12049198150634766
5
Elapsed:  0.12010717391967773
6
Elapsed:  0.12009716033935547
7
Elapsed:  0.12112116813659668
8
Elapsed:  0.12095427513122559
9
Elapsed:  0.1200251579284668

No nGraph

0
Elapsed:  3.480649948120117
1
Elapsed:  0.05639004707336426
2
Elapsed:  0.05907034873962402
3
Elapsed:  0.05813884735107422
4
Elapsed:  0.061283111572265625
5
Elapsed:  0.05951809883117676
6
Elapsed:  0.05975770950317383
7
Elapsed:  0.05814003944396973
8
Elapsed:  0.05785775184631348
9
Elapsed:  0.057444095611572266

Is this the expected result or am I missing something?

@denise-k

Hi! I’m from the PlaidML team. We wanted to address the slowness you are seeing when using the PlaidML backend with nGraph.

We are aware of performance issues that arise when converting nGraph ops to the PlaidML level, and we have been actively working on fixing the most egregious of them.

In the longer term, we have also been working on new ways of representing operations in PlaidML which will facilitate even more efficient lowering from nGraph to PlaidML. If you'd like to test out our newer code, you can set the environment variable USE_STRIPE to 1.
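
For example (a sketch, assuming you drive the model from a Python script), the variable just needs to be visible in the process environment before the bridge and backend are initialized:

import os

# Sketch: set USE_STRIPE before the nGraph bridge / PlaidML backend is loaded,
# so the backend sees it when it initializes. Equivalent to exporting the
# variable in the shell before launching the script.
os.environ["USE_STRIPE"] = "1"

import tensorflow as tf   # noqa: E402
import ngraph_bridge      # noqa: E402  (imported after the env var is set)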

Additionally, we'd like to mention a caveat specific to the example you provided:

  • PlaidML works best with sufficiently large networks and larger batch sizes. This is because GPU-based computations have overhead associated with data transfers between the CPU and GPU that you don't see when you use the CPU by itself. This may be why you don't see a performance benefit from using your GPU + PlaidML for the inference example you specified; the sketch below illustrates how batching helps amortize that overhead.
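
To illustrate (a minimal, self-contained sketch rather than the actual classify graph), leaving the batch dimension dynamic lets a single sess.run process many images at once, so the host-to-device transfer cost is shared across the whole batch:

import numpy as np
import tensorflow as tf

# Toy graph, not Inception: the leading batch dimension is left as None, so the
# same sess.run call works for a batch of 1 or a batch of 32.
images = tf.placeholder(tf.float32, shape=[None, 299, 299, 3], name="images")
features = tf.layers.conv2d(images, filters=8, kernel_size=3, activation=tf.nn.relu)
scores = tf.reduce_mean(features, axis=[1, 2, 3])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.rand(32, 299, 299, 3).astype(np.float32)  # stand-in data
    out = sess.run(scores, feed_dict={images: batch})  # 32 images in one run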

If you have any questions about this, please feel free to reach out to our team by filing an issue or joining the collective nGraph + PlaidML workspace.
