
LSTM training is super slow on GPU #34

Open
phgilde opened this issue Aug 6, 2020 · 7 comments

Comments


phgilde commented Aug 6, 2020

This training loop takes more than a second per epoch with tensorflow-directml, but only a fraction of a second with standard tensorflow.
It actually doesn't work at all (the error is NaN after a couple of iterations), but I already opened another issue for that.

Code:

import tensorflow as tf
import numpy as np
from tensorflow import keras
import matplotlib.pyplot as plt
import time
from datetime import timedelta

# Target function: a sine wave sampled over [0, 50].
def fn(x):
    return tf.sin(x)

seq_length = 200
x = tf.linspace(tf.constant(0, dtype=tf.float32), 50, seq_length)
y = fn(x)

# A single LSTM layer used directly as the model; it maps a zero input of
# shape (1, seq_length, 1) to an output of shape (1, seq_length, n_outputs).
n_outputs = 50
model = keras.layers.LSTM(n_outputs, return_sequences=True)
optimizer = keras.optimizers.Adam(learning_rate=1e-3)
loss_fn = keras.losses.MSE

# Eager training loop: one forward/backward pass per epoch, timed overall.
loss_history = []
epochs = 2_000
out_epochs = 10
start = time.time()
for epoch in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = model(tf.zeros(shape=(1, seq_length, 1)))
        y_pred_data = y_pred[0, :, 0]
        loss = loss_fn(y, y_pred_data)
    loss_history.append(loss.numpy())
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    if epoch % out_epochs == 0:
        print(f"Epoch {epoch}: Loss = {loss} ({timedelta(seconds=time.time()-start)})")

System: Intel i5-7200U with Intel HD graphics 620
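(Editor's note: as a sketch only, the same gradient step wrapped in tf.function, in case per-step eager dispatch overhead contributes to the gap; this reuses the objects defined in the script above and has not been timed on tensorflow-directml.)

@tf.function
def train_step(y_true):
    # Same forward/backward pass as the loop above, traced into a graph once
    # so each epoch avoids re-dispatching every op eagerly.
    with tf.GradientTape() as tape:
        y_pred = model(tf.zeros(shape=(1, seq_length, 1)))
        loss = loss_fn(y_true, y_pred[0, :, 0])
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Drop-in replacement for the body of the loop above:
# loss = train_step(y)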

@PatriceVignola (Contributor)

Thank you for reporting this, @phgilde. Are you running this script on Windows or WSL?


phgilde commented Aug 7, 2020

@PatriceVignola I'm running this on Windows.

jstoecker transferred this issue from microsoft/DirectML on Sep 17, 2020
@jstoecker (Contributor)

We've implemented the single-step/block-based LSTM/GRU/RNN ops, but these are really better suited to CPU architectures. Models typically use the multi-step cuDNN ops when executing on a GPU device. It's not surprising that there's some more work to do here to make DML perform better with recurrent networks.
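(Editor's note: for context, in stock TF 2.x the Keras LSTM layer only takes the fused multi-step cuDNN path when it is constructed with the documented defaults; a minimal sketch of those conditions, which is standard TensorFlow behavior and not DirectML-specific:)

import tensorflow as tf
from tensorflow import keras

# With these (default) arguments, stock TF 2.x can dispatch the layer to the
# fused multi-step cuDNN kernel on an NVIDIA GPU; changing activation,
# recurrent_activation, recurrent_dropout, unroll, or use_bias forces the
# generic step-by-step implementation instead.
lstm = keras.layers.LSTM(
    50,
    activation="tanh",
    recurrent_activation="sigmoid",
    recurrent_dropout=0.0,
    unroll=False,
    use_bias=True,
    return_sequences=True,
)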

@wchao1115

@phgilde What GPU are you running this with? You mentioned standard tensorflow, and that your config is with Intel HD graphics. Is this training script running on the CPU?

@ghostlypi

I've had the same issue on an RX 560. In Task Manager, neither the GPU nor the CPU seems to take on any load.
[screenshot: Task Manager showing GPU and CPU utilization]

@onurberkay

I have the same problem with a 4750U AMD APU; the GPU load doesn't even reach 1-2%.

@PatriceVignola (Contributor)

@onurberkay What does tf.config.list_physical_devices() give you?
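(Editor's note: for anyone else checking this, a quick way to run that call; the exact device names reported by tensorflow-directml may differ from stock TF.)

import tensorflow as tf

# List every physical device TensorFlow can see; a working tensorflow-directml
# install should report a non-CPU device here in addition to the CPU.
print(tf.config.list_physical_devices())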
