LSTM 8x slow on gpu #367

onurberkay · 2022-04-25T19:05:04Z

Train on 36090 samples
2022-04-25 21:56:12.505195: I tensorflow/stream_executor/platform/default/dso_loader.cc:97] Successfully opened dynamic library C:\Users\onurb\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python/directml.24bfac66e4ee42ec393a5fb471412d0177bc7bcf.dll
2022-04-25 21:56:12.506028: I tensorflow/stream_executor/platform/default/dso_loader.cc:97] Successfully opened dynamic library dxgi.dll
2022-04-25 21:56:12.509302: I tensorflow/stream_executor/platform/default/dso_loader.cc:97] Successfully opened dynamic library d3d12.dll
2022-04-25 21:56:12.961954: I tensorflow/core/common_runtime/dml/dml_device_cache.cc:250] DirectML device enumeration: found 1 compatible adapters.
2022-04-25 21:56:12.962441: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2022-04-25 21:56:12.966749: I tensorflow/core/common_runtime/dml/dml_device_cache.cc:186] DirectML: creating device on adapter 0 (AMD Radeon(TM) Graphics)
2022-04-25 21:56:13.055907: I tensorflow/stream_executor/platform/default/dso_loader.cc:97] Successfully opened dynamic library Kernel32.dll
36090/36090 - 232s - loss: 0.0014 - acc: 0.0396
Train on 36090 samples

With only the CPU it takes 30-40 s, so there is a huge difference. It also looks like the GPU is not taking any load.
I am using a 4750U APU.

PatriceVignola · 2022-04-25T19:25:39Z

Is there a repro script that you would be able to provide us? Otherwise, it would help if you could send us the device placement logs. Run tf.debugging.set_log_device_placement(True) before redirecting the output to a file.
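Once the placement output has been redirected to a file, it can be summarized per device. The following is a minimal sketch, assuming each placement line contains a device string of the form `device:<TYPE>:<index>` (for example `device:DML:0` or `device:CPU:0`). The helper name `count_placements` and the sample lines are illustrative only and are not taken from the log attached to this issue.

```python
import re
from collections import Counter

# Assumption: placement log lines mention the device as ".../device:<TYPE>:<index>".
DEVICE_RE = re.compile(r"device:([A-Za-z_]+):(\d+)")


def count_placements(log_text):
    """Count how many log lines mention each device (e.g. 'DML:0', 'CPU:0')."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DEVICE_RE.search(line)
        if match:
            counts[f"{match.group(1)}:{match.group(2)}"] += 1
    return counts


# Illustrative lines only; a real log would be read from the redirected file.
sample = """\
lstm/MatMul: (MatMul): /job:localhost/replica:0/task:0/device:DML:0
lstm/Qr: (Qr): /job:localhost/replica:0/task:0/device:CPU:0
dense/BiasAdd: (BiasAdd): /job:localhost/replica:0/task:0/device:DML:0
"""

placements = count_placements(sample)
for device, n in placements.most_common():
    print(device, n)
```

A large share of ops landing on the CPU device would suggest that much of the model is not running on the GPU at all.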

onurberkay · 2022-04-25T19:46:59Z

out.txt @PatriceVignola
It is just simple code. It needs a few libraries to run: pip install yfinance / pip install scikit-learn / pip install matplotlib
https://www.online-python.com/2Pa6iM1QZ3

PatriceVignola · 2022-04-25T20:41:39Z

There are 2 issues that I could notice here at a cursory glance:

1. The model uses a Qr operator internally, which isn't supported on DML (it isn't supported on CUDA either, but they "fake" register it to run on the CPU in order to enable device colocation on CUDA). We can do the same thing that CUDA does here and register it the same way for DML, and we might see some marginal perf improvements.
2. The fact that you only have 1% load on the GPU is worrying. On my desktop, I see at least 40% throughout the whole training process when running the script that you linked. We haven't really tested tensorflow-directml on AMD APUs yet, but our experience with many integrated graphics in the past is that it's just faster to run everything on the CPU. For integrated graphics to be worthwhile, they have to be powerful enough to justify transferring data between the CPU and the GPU. I'll see if I can get my hands on a 4750 and investigate more.
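The trade-off described for integrated graphics can be written as a simple break-even check. This is only a rough sketch: the function names are made up for illustration, and the timings are hypothetical values chosen to show both outcomes, not measurements from this issue.

```python
def gpu_step_time(compute_s, transfer_s):
    """Time for one step on the GPU: compute plus CPU<->GPU data transfer."""
    return compute_s + transfer_s


def offload_pays_off(cpu_compute_s, gpu_compute_s, transfer_s):
    """True when GPU compute plus transfer beats running entirely on the CPU."""
    return gpu_step_time(gpu_compute_s, transfer_s) < cpu_compute_s


# Hypothetical per-step timings in seconds, chosen only to illustrate the trade-off.
weak_igpu = offload_pays_off(cpu_compute_s=0.010, gpu_compute_s=0.008, transfer_s=0.005)
strong_gpu = offload_pays_off(cpu_compute_s=0.010, gpu_compute_s=0.002, transfer_s=0.005)
print(weak_igpu, strong_gpu)
```

In this model a GPU that is only slightly faster than the CPU loses once the transfer cost is added, while a much faster GPU still comes out ahead.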

onurberkay · 2022-04-25T21:39:03Z

I tried a heavy model with Dense layers, and on the GPU it is faster than on the CPU. The GPU usage stats are low again, but I think the stats themselves must be wrong. Will the first change be added, and if so, when? I can run tests at any time. Thanks for the answers.
model.add(Dense(2000,kernel_regularizer=regularizers.l2(0.00000000001)))
model.add(Dense(2000,kernel_regularizer=regularizers.l2(0.00000000001)))
model.add(Dense(2000,kernel_regularizer=regularizers.l2(0.00000000001)))
model.add(Dense(2000,kernel_regularizer=regularizers.l2(0.00000000001)))
model.add(Dense(2000,kernel_regularizer=regularizers.l2(0.00000000001)))
model.add(Dense(2000,kernel_regularizer=regularizers.l2(0.00000000001)))

RichardErkhov · 2022-06-03T19:47:34Z

I might be too late, but I think 89°C is the problem. Try to cool it down; it might just be a throttling issue.
