Some layers are not loaded ('encoder.embed_tokens.weight' and 'shared.weight') when loading converted safetensors into transformers.MT5EncoderModel. #520

WongGawa · 2024-08-23T03:31:52Z

System Info

I converted pytorch_model.bin to model.safetensors locally for the mt5-xxl model.
I then compared the last_hidden_state output of transformers.MT5EncoderModel when loaded from mt5-xxl's pytorch_model.bin against the output when loaded from the converted model.safetensors.
The outputs differ because encoder.embed_tokens.weight and shared.weight are not loaded into MT5EncoderModel from the converted safetensors, so those layers are newly initialized.
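
One way to check whether the two weights were actually written to the converted file is to read the safetensors header directly. The sketch below relies only on the documented file layout (an 8-byte little-endian header length followed by a JSON header keyed by tensor name); the helper names `read_safetensors_keys` and `report_missing` are hypothetical and are not part of convert.py or transformers.

```python
import json
import struct


def read_safetensors_keys(path):
    """Return the tensor names stored in a .safetensors file.

    Only the JSON header is read, so no tensor data is loaded into memory.
    """
    with open(path, "rb") as f:
        # The first 8 bytes hold the header size as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return sorted(name for name in header if name != "__metadata__")


def report_missing(path, expected=("shared.weight", "encoder.embed_tokens.weight")):
    """Return the expected tensor names that are absent from the file."""
    stored = set(read_safetensors_keys(path))
    return [name for name in expected if name not in stored]
```

Calling `report_missing("model.safetensors")` returns the names from the issue title that are not stored in that file. If the converted checkpoint is split into several shard files, each shard would need to be checked separately (an assumption about the local setup, not something stated above).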

Information

The official example scripts
My own modified scripts

Reproduction

Convert pytorch_model.bin to model.safetensors with convert.py.
Load both pytorch_model.bin and the converted model.safetensors into transformers.MT5EncoderModel.
Compute the difference between the last_hidden_state produced from pytorch_model.bin and the one produced from model.safetensors.
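
The comparison in these steps could look roughly like the following. This is a minimal sketch, not the script used for the report: it assumes a local directory holding both pytorch_model.bin and the converted model.safetensors, and the helper names `last_hidden_state` and `max_abs_diff` are hypothetical. `use_safetensors` and `output_loading_info` are existing `from_pretrained` arguments.

```python
import torch


def last_hidden_state(model_dir, text, use_safetensors):
    """Encode `text` and return (last_hidden_state, missing_keys)."""
    # Imported lazily so max_abs_diff can be used without transformers.
    from transformers import AutoTokenizer, MT5EncoderModel

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    # output_loading_info=True also returns which keys were not found
    # in the checkpoint and were therefore newly initialized.
    model, info = MT5EncoderModel.from_pretrained(
        model_dir,
        use_safetensors=use_safetensors,
        output_loading_info=True,
    )
    model.eval()
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    return hidden, info["missing_keys"]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two tensors."""
    return (a.float() - b.float()).abs().max().item()


if __name__ == "__main__":
    model_dir = "./mt5-xxl"  # hypothetical local path
    text = "Hello world"
    ref, _ = last_hidden_state(model_dir, text, use_safetensors=False)
    out, missing = last_hidden_state(model_dir, text, use_safetensors=True)
    print("missing keys with safetensors:", missing)
    print("max abs diff:", max_abs_diff(ref, out))
```

A non-empty `missing` list for the safetensors load would correspond to the warning about newly initialized layers described in this issue.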

Expected behavior

The same output, or at least outputs that agree within a certain margin of error.
