LSTM 8x slow on gpu #367
Comments
Is there a repro script that you would be able to provide us? Otherwise, it would help if you could send us the device placement logs. Run
out.txt @PatriceVignola
There are two issues that I noticed here at a cursory glance:
I tried a heavy model with Dense layers, and on the GPU it is faster than on the CPU. The GPU usage stats are low again, but I think that must be a problem with the stats themselves. When will the first change be added, or will it be added at all? I can run tests any time. Thanks for the answers.
I might be too late, but I think the 89 °C is the problem. Try to cool it down; it might just be a throttling issue.
Train on 36090 samples
2022-04-25 21:56:12.505195: I tensorflow/stream_executor/platform/default/dso_loader.cc:97] Successfully opened dynamic library C:\Users\onurb\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python/directml.24bfac66e4ee42ec393a5fb471412d0177bc7bcf.dll
2022-04-25 21:56:12.506028: I tensorflow/stream_executor/platform/default/dso_loader.cc:97] Successfully opened dynamic library dxgi.dll
2022-04-25 21:56:12.509302: I tensorflow/stream_executor/platform/default/dso_loader.cc:97] Successfully opened dynamic library d3d12.dll
2022-04-25 21:56:12.961954: I tensorflow/core/common_runtime/dml/dml_device_cache.cc:250] DirectML device enumeration: found 1 compatible adapters.
2022-04-25 21:56:12.962441: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2022-04-25 21:56:12.966749: I tensorflow/core/common_runtime/dml/dml_device_cache.cc:186] DirectML: creating device on adapter 0 (AMD Radeon(TM) Graphics)
2022-04-25 21:56:13.055907: I tensorflow/stream_executor/platform/default/dso_loader.cc:97] Successfully opened dynamic library Kernel32.dll
36090/36090 - 232s - loss: 0.0014 - acc: 0.0396
Train on 36090 samples
When using only the CPU it takes 30-40s, so there is a huge difference. It also looks like the GPU is not taking any load.
I am using a 4750U APU.
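To put a number on the gap, here is a rough sketch that pulls the epoch time out of the Keras progress line in the log above and compares it with the 30-40s CPU range. It assumes the 232s figure and the CPU figure both measure one epoch over the same 36090 samples, which the report does not state explicitly. The helper names `epoch_seconds` and `slowdown` are made up for this illustration.

```python
import re

# Progress line copied from the log in this issue.
GPU_LINE = "36090/36090 - 232s - loss: 0.0014 - acc: 0.0396"

# CPU-only range reported in this issue, in seconds.
# Assumption: it covers the same single epoch as the GPU line.
CPU_LOW_S, CPU_HIGH_S = 30, 40


def epoch_seconds(progress_line):
    """Extract the whole-second duration from a '... - 232s - ...' line."""
    match = re.search(r"- (\d+)s -", progress_line)
    if match is None:
        raise ValueError("no epoch time found in: " + progress_line)
    return int(match.group(1))


def slowdown(gpu_seconds, cpu_seconds):
    """How many times longer the GPU run took than the CPU run."""
    return gpu_seconds / cpu_seconds


gpu_s = epoch_seconds(GPU_LINE)
best_case = slowdown(gpu_s, CPU_HIGH_S)   # slowest CPU run
worst_case = slowdown(gpu_s, CPU_LOW_S)   # fastest CPU run
print(f"GPU epoch time: {gpu_s}s")
print(f"Slowdown vs CPU: {best_case:.1f}x to {worst_case:.1f}x")
```

Under that assumption the slowdown comes out between roughly 5.8x and 7.7x, which is in line with the "8x" in the title.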