
Running out of memory in colab #2

Open
Fredrum opened this issue Dec 15, 2019 · 2 comments

Fredrum commented Dec 15, 2019

I'm going through and running your training setup in a Google Colab notebook, but it keeps crashing when I get to this section:

trainer = MultiGPUTrainer('trainer', make_model,
                          TrainerClass=FixedOrderTrainer, sampler_opts=dict(samples_per_line=1),
                          optimizer_opts=dict(base_lr=1.4e-3, warmup_time=16000))

Sometimes this line causes the crash instead:
sess.run(tf.global_variables_initializer())

This is the error message:

ResourceExhaustedError: OOM when allocating tensor with shape[512,769954] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node trainer/worker_0/trainer_mod/trainer/worker_0/mod/logits/W/Adam/Assign (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

I have tried restarting the runtime many times with no change, and I also tried setting the GPU count to 1, but it still runs out of memory.

Do you have any ideas how I can work around this?

Cheers, Fred
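A back-of-envelope check makes the OOM unsurprising. The shape [512,769954] comes straight from the error trace; the extra-copy multipliers below assume standard TensorFlow Adam behaviour (two slot variables per parameter plus a gradient buffer), which is not specific to this repo:

```python
# Memory cost of the logits weight matrix named in the OOM trace.
elems = 512 * 769954                # shape [512, 769954] from the error
one_copy_gib = elems * 4 / 2**30    # float32 = 4 bytes per element

# Adam keeps two slot variables (m and v) per parameter, and backprop
# needs a gradient buffer, so roughly 4 copies are live at once.
working_set_gib = one_copy_gib * 4

print(f"one copy:          {one_copy_gib:.2f} GiB")    # -> 1.47 GiB
print(f"with Adam + grads: {working_set_gib:.2f} GiB")  # -> 5.87 GiB
```

Nearly 6 GiB for this one layer, before activations or any other variables, is a tight fit on a single Colab GPU.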

@TIXFeniks (Owner)

Hello! We originally trained the model on 8 GPUs, each with far more memory than is available in Colab. Consider using FixedOrderTrainer directly, without the multi-GPU wrapper (Colab only provides 1 GPU per notebook). Does that fix the issue?
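For reference, the suggestion would look something like the sketch below. This mirrors the MultiGPUTrainer call from the issue; the exact FixedOrderTrainer constructor signature is an assumption and should be checked against the repo:

```python
# Hypothetical sketch — verify FixedOrderTrainer's actual arguments in this repo.
# Drops the MultiGPUTrainer wrapper so only the single Colab GPU is used.
trainer = FixedOrderTrainer('trainer', make_model,
                            sampler_opts=dict(samples_per_line=1),
                            optimizer_opts=dict(base_lr=1.4e-3,
                                                warmup_time=16000))
```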


Fredrum commented Dec 21, 2019

Thank you, I'll try that too.
I managed to get it to start training, but it was going very slowly on Colab, so I gave up; it would have taken a few weeks, I think.
I'm going to work through some basic tutorials I've found and go from there, to build a better foundational understanding of this stuff.
