Fix pad token id in config (#394)
### Description

This PR sets `pad_token_id` in `genai_config.json` to a single value
(the first EOS token id) when a model does not specify a pad token id
but does specify a list of EOS token ids.

### Motivation and Context

When the pad token id is not specified, `pad_token_id` in
`genai_config.json` falls back to the value of `eos_token_id`. If
`eos_token_id` is a list of EOS token ids, `pad_token_id` becomes a
list as well. This causes a parsing issue in ONNX Runtime GenAI, which
expects a single pad token id.

This PR also fixes [this
issue](#384).
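
The fallback described above can be sketched as a standalone function. This is only an illustration of the logic in the changed line, not code from `builder.py`: the helper name `resolve_pad_token_id`, the `SimpleNamespace` stand-ins for a model config, and the token ids are all assumptions made for the example.

```python
from types import SimpleNamespace


def resolve_pad_token_id(config):
    """Return a single pad token id, falling back to the EOS token id(s).

    Hypothetical helper that mirrors the conditional expression in the diff.
    """
    pad = getattr(config, "pad_token_id", None)
    if pad is not None:
        # An explicit pad token id always wins, including 0.
        return pad
    eos = config.eos_token_id
    # When several EOS ids are listed, use the first one as the pad id.
    return eos[0] if isinstance(eos, list) else eos


# Stand-in configs; the token ids are made up for illustration.
multi_eos = SimpleNamespace(eos_token_id=[7, 8], pad_token_id=None)
single_eos = SimpleNamespace(eos_token_id=2)
explicit_pad = SimpleNamespace(eos_token_id=[7, 8], pad_token_id=0)

print(resolve_pad_token_id(multi_eos))
print(resolve_pad_token_id(single_eos))
print(resolve_pad_token_id(explicit_pad))
```

In each case the result is a single integer, which is the shape the description says ONNX Runtime GenAI expects for `pad_token_id`.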
kunal-vaishnavi authored May 3, 2024
1 parent e82ab3d commit 1f3776d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/python/py/models/builder.py
@@ -228,7 +228,7 @@ def make_genai_config(self, model_name_or_path, extra_kwargs, out_dir):
 "num_key_value_heads": self.num_kv_heads,
 },
 "eos_token_id": config.eos_token_id,
-"pad_token_id": config.pad_token_id if hasattr(config, "pad_token_id") and config.pad_token_id is not None else config.eos_token_id,
+"pad_token_id": config.pad_token_id if hasattr(config, "pad_token_id") and config.pad_token_id is not None else config.eos_token_id[0] if isinstance(config.eos_token_id, list) else config.eos_token_id,
 "type": self.model_type[ : self.model_type.find("For")].lower(),
 "vocab_size": self.vocab_size,
 },