
RuntimeError on Colab Notebook, Training T5 on WikiSQL, RuntimeError: output with shape [16, 8, 1, 1] doesn't match the broadcast shape [16, 8, 1, 64] #11

Open
eshehadi opened this issue May 15, 2023 · 6 comments

Comments

@eshehadi

I am running the colab notebook shared here:

https://github.com/mrm8488/shared_colab_notebooks/blob/bf6d578042bbb393e8cfcb336e2909c9f460b91c/T5_wikiSQL_multitask_with_HF_transformers.ipynb

When I get to trainer.evaluate() I get the following error message:

RuntimeError: output with shape [16, 8, 1, 1] doesn't match the broadcast shape [16, 8, 1, 64]

I've attempted to search for solutions, but I can't find many instances where this type of error comes up with NLP training. It seems to most often occur with image raster data.
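A minimal NumPy analogue of this failure mode, using the shapes from the error message (this is only an illustration of the broadcast rule being violated, not the notebook's actual T5 code):

```python
import numpy as np

# Attention-score-like tensors: [batch, heads, query_len, key_len]
scores = np.zeros((16, 8, 1, 1))
bias = np.zeros((16, 8, 1, 64))

# Broadcasting scores against bias yields shape (16, 8, 1, 64), which
# cannot be written back into the (16, 8, 1, 1) output buffer, so the
# in-place add fails with a broadcast-shape error much like the one above.
try:
    np.add(scores, bias, out=scores)
except ValueError as e:
    print(e)
```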

I would greatly appreciate any insight that you may have. Thanks!

Eric

@dharma610

@mrm8488 I faced the same issue, could you please help us out here?

@eshehadi
Author

> @mrm8488 I faced the same issue, could you please help us out here?

I think it has to do with the transformers version. I tried running the code on my local machine as opposed to Colab and had to downgrade transformers to an earlier version.

@eshehadi eshehadi reopened this Jun 27, 2023
@asksonu

asksonu commented Jul 3, 2023

@eshehadi, which transformers version worked for you? I tried 4.30.2, 4.29.0, and 4.30.0 (in Google Colab), and got the same error with all of them.

@asksonu

asksonu commented Jul 4, 2023

I figured out that the problem was the padding and truncation of the input and output tokens in the function convert_to_features. The error disappeared after I replaced

```python
input_encodings = tokenizer.batch_encode_plus(example_batch['input'], pad_to_max_length=True, max_length=64)
target_encodings = tokenizer.batch_encode_plus(example_batch['target'], pad_to_max_length=True, max_length=64)
```

with

```python
input_encodings = tokenizer.batch_encode_plus(example_batch['input'], truncation=True, padding="max_length", max_length=64)
target_encodings = tokenizer.batch_encode_plus(example_batch['target'], truncation=True, padding="max_length", max_length=64)
```
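One plausible reading of why this fix works (not verified against the transformers source): the deprecated pad_to_max_length=True pads short sequences but, without truncation=True, sequences longer than 64 tokens may not be cut down, so batch shapes can disagree with the expected [batch, 64]. A hypothetical pure-Python sketch (function name invented) of the guarantee that truncation=True plus padding="max_length" gives:

```python
# Every sequence comes out exactly max_length tokens long, so batches
# always stack to a fixed [batch_size, max_length] shape.
def pad_and_truncate(token_ids, max_length=64, pad_id=0):
    token_ids = token_ids[:max_length]  # truncation=True: cut long sequences
    # padding="max_length": pad short sequences up to the fixed length
    return token_ids + [pad_id] * (max_length - len(token_ids))

short_seq = pad_and_truncate(list(range(10)))
long_seq = pad_and_truncate(list(range(100)))
assert len(short_seq) == len(long_seq) == 64
```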

@calam1

calam1 commented Jul 7, 2023

> @eshehadi , which transformers version has worked for you ? I tried 4.30.2, 4.29.0 and 4.30.0 (in google colab), all of them I was getting same error.

I changed it to 4.26.0 to get past the shape error.
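For anyone pinning the version in Colab, a typical setup cell would be the following (the version number comes from the comment above; restart the runtime after installing):

```shell
pip install transformers==4.26.0
```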

@calam1

calam1 commented Jul 7, 2023

> I figured the problem was with padding and truncation of input and output tokens in the function convert_to_features. Error disappeared after I have replaced below
>
> ```python
> input_encodings = tokenizer.batch_encode_plus(example_batch['input'], pad_to_max_length=True, max_length=64)
> target_encodings = tokenizer.batch_encode_plus(example_batch['target'], pad_to_max_length=True, max_length=64)
> ```
>
> with
>
> ```python
> input_encodings = tokenizer.batch_encode_plus(example_batch['input'], truncation=True, padding="max_length", max_length=64)
> target_encodings = tokenizer.batch_encode_plus(example_batch['target'], truncation=True, padding="max_length", max_length=64)
> ```

Unfortunately for me, while running transformers 4.30.2 (the latest at the time), making the padding change did not resolve the problem. I had to downgrade transformers to 4.26.0 (nearby minor versions may also work; I did not try them).
