
RuntimeError on Colab Notebook, Training T5 on WikiSQL, RuntimeError: output with shape [16, 8, 1, 1] doesn't match the broadcast shape [16, 8, 1, 64] #11

Open
eshehadi opened this issue May 15, 2023 · 6 comments

Comments

@eshehadi

I am running the colab notebook shared here:

https://github.com/mrm8488/shared_colab_notebooks/blob/bf6d578042bbb393e8cfcb336e2909c9f460b91c/T5_wikiSQL_multitask_with_HF_transformers.ipynb

When I get to trainer.evaluate() I get the following error message:

RuntimeError: output with shape [16, 8, 1, 1] doesn't match the broadcast shape [16, 8, 1, 64]

I've attempted to search for solutions, but I can't find many instances where this type of error comes up with NLP training. It seems to most often occur with image raster data.
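A minimal NumPy analogue of this failure mode, using the shapes from the error message (this is only an illustration of the broadcast rule being violated, not the notebook's actual T5 code):

```python
import numpy as np

# Attention-score-like tensors: [batch, heads, query_len, key_len]
scores = np.zeros((16, 8, 1, 1))
bias = np.zeros((16, 8, 1, 64))

# Broadcasting scores against bias yields shape (16, 8, 1, 64), which
# cannot be written back into the (16, 8, 1, 1) output buffer, so the
# in-place add fails with a broadcast-shape error much like the one above.
try:
    np.add(scores, bias, out=scores)
except ValueError as e:
    print(e)
```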

I would greatly appreciate any insight that you may have. Thanks!

Eric

@dharma610

@mrm8488 I faced the same issue, could you please help us out here?

@eshehadi
Author

> @mrm8488 I faced the same issue, could you please help us out here?

I think it has to do with the transformers version. I tried running the code on my local machine as opposed to Colab and had to downgrade transformers to an earlier version.

@eshehadi eshehadi reopened this Jun 27, 2023
@asksonu

asksonu commented Jul 3, 2023

@eshehadi, which transformers version worked for you? I tried 4.30.2, 4.29.0, and 4.30.0 (in Google Colab), and got the same error with all of them.

@asksonu

asksonu commented Jul 4, 2023

I figured out that the problem was the padding and truncation of the input and output tokens in the function convert_to_features. The error disappeared after I replaced

```python
input_encodings = tokenizer.batch_encode_plus(example_batch['input'], pad_to_max_length=True, max_length=64)
target_encodings = tokenizer.batch_encode_plus(example_batch['target'], pad_to_max_length=True, max_length=64)
```

with

```python
input_encodings = tokenizer.batch_encode_plus(example_batch['input'], truncation=True, padding="max_length", max_length=64)
target_encodings = tokenizer.batch_encode_plus(example_batch['target'], truncation=True, padding="max_length", max_length=64)
```
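One plausible reading of why this fix works (not verified against the transformers source): the deprecated pad_to_max_length=True pads short sequences but, without truncation=True, sequences longer than 64 tokens may not be cut down, so batch shapes can disagree with the expected [batch, 64]. A hypothetical pure-Python sketch (function name invented) of the guarantee that truncation=True plus padding="max_length" gives:

```python
# Every sequence comes out exactly max_length tokens long, so batches
# always stack to a fixed [batch_size, max_length] shape.
def pad_and_truncate(token_ids, max_length=64, pad_id=0):
    token_ids = token_ids[:max_length]  # truncation=True: cut long sequences
    # padding="max_length": pad short sequences up to the fixed length
    return token_ids + [pad_id] * (max_length - len(token_ids))

short_seq = pad_and_truncate(list(range(10)))
long_seq = pad_and_truncate(list(range(100)))
assert len(short_seq) == len(long_seq) == 64
```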

@calam1

calam1 commented Jul 7, 2023

> @eshehadi , which transformers version has worked for you ? I tried 4.30.2, 4.29.0 and 4.30.0 (in google colab), all of them I was getting same error.

I changed it to 4.26.0 to get past the shape error.
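For anyone pinning the version in Colab, a typical setup cell would be the following (the version number comes from the comment above; restart the runtime after installing):

```shell
pip install transformers==4.26.0
```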

@calam1

calam1 commented Jul 7, 2023

> I figured the problem was with padding and truncation of input and output tokens in the function convert_to_features. Error disappeared after I have replaced below
>
> ```python
> input_encodings = tokenizer.batch_encode_plus(example_batch['input'], pad_to_max_length=True, max_length=64)
> target_encodings = tokenizer.batch_encode_plus(example_batch['target'], pad_to_max_length=True, max_length=64)
> ```
>
> with
>
> ```python
> input_encodings = tokenizer.batch_encode_plus(example_batch['input'], truncation=True, padding="max_length", max_length=64)
> target_encodings = tokenizer.batch_encode_plus(example_batch['target'], truncation=True, padding="max_length", max_length=64)
> ```

Unfortunately for me, while running transformers 4.30.2 (the latest at the time), making the padding change did not resolve the problem. I had to downgrade transformers to 4.26.0 (nearby minor versions may also work; I did not try them).
