forked from OpenGPTX/lm-evaluation-harness
Commit
Fix chat template; fix leaderboard math (EleutherAI#2475)
* batch commit
* Revert "batch commit" This reverts commit d859d1c.
* batch commit
* checkout from main
* checkout from main
* checkout from main
* checkout from main
* checkout from main
* cleanup
* cleanup
* cleanup
* cleanup
* cleanup
* cleanup
* cleanup
* cleanup
* cleanup
* Chat template fix (OpenGPTX#7)
* cleanup
* cleanup
* cleanup
* linting
* fix tests
* add ifeval install to new_task CI
* Revert "add ifeval install to new_task CI" This reverts commit 1d19449.
* adds leaderboard tasks (#1)
* adds leaderboard tasks
* Delete lm_eval/tasks/leaderboard/leaderboard_chat_template.yaml
* add readme
* Delete lm_eval/tasks/leaderboard/mmlu_pro/mmlu_pro_chat_template.yaml
* modify readme
* fix bbh task
* fix bbh salient task
* modify the readme
* Delete lm_eval/tasks/leaderboard/ifeval/README.md
* Delete lm_eval/tasks/leaderboard/math/README.md
* add leaderboard to the tasks repertory
* add anouncment about new leaderbaord tasks
* linting
* Update README.md Co-authored-by: Hailey Schoelkopf <[email protected]>
* installs ifeval dependency in new_task github workflow
---------
Co-authored-by: Nathan Habib <[email protected]>
Co-authored-by: Hailey Schoelkopf <[email protected]>
* fix math parser
* fix math parser
* fix version
* add warning about chat template
---------
Co-authored-by: Nathan Habib <[email protected]>
Co-authored-by: Nathan Habib <[email protected]>
Co-authored-by: Nathan Habib <[email protected]>
Co-authored-by: Hailey Schoelkopf <[email protected]>
Co-authored-by: Nathan Habib <[email protected]>
1 parent
bd80a6c
commit 77c811e
Showing 6 changed files with 71 additions and 16 deletions.
@@ -0,0 +1,26 @@
# Chat Template Delimiter Handling Update

## Overview
This change modifies how the target delimiter is handled when a chat template is applied during request construction for likelihood-based and multiple-choice tasks. When `apply_chat_template` is set to `True`, the target delimiter is now an empty string rather than the configured delimiter.

## Background
By default, the system inserts a target delimiter (typically a single space, `" "`) between the context and the target text when constructing prompts. The full string is built as:
```
doc_to_text(doc) + target_delimiter + doc_to_target(doc)
```

This works well for base models, where the model is expected to predict a single space followed by the answer. Chat models, however, have their own formatting conventions that handle spacing differently.
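The default construction described above can be sketched in a few lines of Python. This is only an illustration: `build_full_string` is a hypothetical helper, and `doc_to_text` / `doc_to_target` are assumed here to be plain callables over a dict-like document, which is simpler than the harness's actual task objects.

```python
from typing import Callable, Mapping

Doc = Mapping[str, str]


def build_full_string(
    doc: Doc,
    doc_to_text: Callable[[Doc], str],
    doc_to_target: Callable[[Doc], str],
    target_delimiter: str = " ",  # assumed default: a single space
) -> str:
    """Join context and target with the delimiter (hypothetical helper)."""
    return doc_to_text(doc) + target_delimiter + doc_to_target(doc)


# Illustrative document and formatting functions, not taken from any real task.
doc = {"question": "What color is the sky?", "answer": "blue"}
full = build_full_string(
    doc,
    doc_to_text=lambda d: f"Question: {d['question']}\nAnswer:",
    doc_to_target=lambda d: d["answer"],
)
print(repr(full))
```

With a base model, the single space before `blue` is part of what the model is scored on.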

## The Change
- When `apply_chat_template=True`, the target delimiter is now the empty string (`""`) instead of the default whitespace.
- This prevents the default delimiter from interfering with the chat template's own formatting.
- This matters most for multiple-choice tasks, where the template itself handles spacing.
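The rule in the list above amounts to a single conditional. The sketch below assumes a hypothetical function name, `resolve_target_delimiter`; the harness may implement the same behavior differently or inline.

```python
def resolve_target_delimiter(configured_delimiter: str, apply_chat_template: bool) -> str:
    """Pick the delimiter placed between context and target.

    Hypothetical helper: returns the empty string when a chat template is
    applied, otherwise the delimiter configured for the task.
    """
    return "" if apply_chat_template else configured_delimiter


# The configured delimiter is ignored once a chat template is in use.
chat_delimiter = resolve_target_delimiter(" ", apply_chat_template=True)
base_delimiter = resolve_target_delimiter(" ", apply_chat_template=False)
print(repr(chat_delimiter), repr(base_delimiter))
```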

## Example
```
# Before (with default delimiter " ")
<user>Question: What color is the sky?\nAnswer:<assistant> blue
# After
<user>Question: What color is the sky?\nAnswer:<assistant>blue
```
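The two strings in the example can be reproduced with a toy template. This sketch assumes a made-up `render_toy_chat` function that wraps the context in the `<user>` and `<assistant>` markers used above; real chat templates are model-specific and usually more elaborate.

```python
def render_toy_chat(user_text: str) -> str:
    """Wrap the context in illustrative <user>/<assistant> markers (not a real template)."""
    return f"<user>{user_text}<assistant>"


def build_chat_request(context: str, target: str, target_delimiter: str) -> str:
    """Apply the toy template to the context, then append delimiter and target."""
    return render_toy_chat(context) + target_delimiter + target


context = "Question: What color is the sky?\nAnswer:"

# Before the change: the default single-space delimiter follows the template.
before = build_chat_request(context, "blue", target_delimiter=" ")
# After the change: the delimiter is empty when a chat template is applied.
after = build_chat_request(context, "blue", target_delimiter="")

print(repr(before))
print(repr(after))
```

The only difference between the two results is the space after the assistant marker, which is the character the change removes.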