Some layers ('encoder.embed_tokens.weight' and 'shared.weight') are not loaded when loading converted safetensors into transformers.MT5EncoderModel
#520 · Open · WongGawa opened this issue Aug 23, 2024 · 0 comments
I converted pytorch_model.bin to model.safetensors locally for the mt5-xxl model, then compared the last_hidden_state produced by transformers.MT5EncoderModel when loading from pytorch_model.bin versus the converted safetensors file. The outputs differ because encoder.embed_tokens.weight and shared.weight are not loaded into MT5EncoderModel from the converted safetensors, so those layers are newly initialized.
Information
The official example scripts
My own modified scripts
Reproduction
Convert pytorch_model.bin to model.safetensors with convert.py.
Load both pytorch_model.bin and the converted model.safetensors into transformers.MT5EncoderModel.
Compute the difference between the last_hidden_state outputs of the two checkpoints.
Expected behavior
The same output, or at least agreement within a small margin of error.
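The comparison in the last reproduction step can be sketched as a max-absolute-difference check. A minimal, hedged illustration using plain nested lists in place of the real last_hidden_state tensors (the values and tolerance below are made up for demonstration):

```python
# Hedged sketch of comparing two last_hidden_state outputs within a tolerance.
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between equal-shape nested lists."""
    if isinstance(a, list):
        return max(max_abs_diff(x, y) for x, y in zip(a, b))
    return abs(a - b)


# Stand-ins for the outputs from pytorch_model.bin and model.safetensors.
ref = [[0.10, -0.25], [0.33, 0.07]]
test = [[0.10, -0.25000003], [0.33000001, 0.07]]

# Identical weights should produce only tiny numeric drift; freshly
# initialized embeddings would blow far past any reasonable tolerance.
assert max_abs_diff(ref, test) < 1e-5
```

With real tensors the same check is typically done via `torch.allclose` or `(a - b).abs().max()`.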