change glianorex to test split (#2332)
* change glianorex to test set

* nit

* fix test; doc_to_target can be str for multiple_choice

* nit
baberabb authored Sep 26, 2024
1 parent af92448 commit 7d24238
Showing 6 changed files with 21 additions and 5 deletions.
5 changes: 5 additions & 0 deletions lm_eval/tasks/glianorex/README.md
@@ -18,3 +18,8 @@ All tasks are multiple choice questions with 4 options, only one correct option.

- `glianorex_en`: Evaluates the accuracy on 264 questions in English.
- `glianorex_fr`: Evaluates the accuracy on 264 questions in French.

+#### Change Log
+
+* (all tasks) 2024-09-23 -- 1.0
+  * Switched the `test_split` from `train` to `test`.
4 changes: 3 additions & 1 deletion lm_eval/tasks/glianorex/glianorex.yaml
@@ -1,7 +1,7 @@
task: glianorex
dataset_path: maximegmd/glianorex
output_type: multiple_choice
-test_split: train
+test_split: test
doc_to_text: !function preprocess_glianorex.doc_to_text
doc_to_target: !function preprocess_glianorex.doc_to_target
doc_to_choice: [ 'A', 'B', 'C', 'D' ]
@@ -12,3 +12,5 @@ metric_list:
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
+metadata:
+  version: 1.0
4 changes: 3 additions & 1 deletion lm_eval/tasks/glianorex/glianorex_en.yaml
@@ -1,7 +1,7 @@
task: glianorex_en
dataset_path: maximegmd/glianorex
output_type: multiple_choice
-test_split: train
+test_split: test
doc_to_text: !function preprocess_glianorex.doc_to_text
doc_to_target: !function preprocess_glianorex.doc_to_target
process_docs: !function preprocess_glianorex.filter_english
@@ -13,3 +13,5 @@ metric_list:
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
+metadata:
+  version: 1.0
4 changes: 3 additions & 1 deletion lm_eval/tasks/glianorex/glianorex_fr.yaml
@@ -1,7 +1,7 @@
task: glianorex_fr
dataset_path: maximegmd/glianorex
output_type: multiple_choice
-test_split: train
+test_split: test
doc_to_text: !function preprocess_glianorex.doc_to_text
doc_to_target: !function preprocess_glianorex.doc_to_target
process_docs: !function preprocess_glianorex.filter_french
@@ -13,3 +13,5 @@ metric_list:
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
+metadata:
+  version: 1.0
3 changes: 2 additions & 1 deletion lm_eval/tasks/glianorex/preprocess_glianorex.py
@@ -7,7 +7,8 @@ def doc_to_text(doc) -> str:
    return f"Question: {doc['question']}\n{answers}Answer:"


-def doc_to_target(doc) -> int:
+def doc_to_target(doc) -> str:
+    # answer_idx is `A`, `B`, `C`, `D` etc.
    return doc["answer_idx"]


6 changes: 5 additions & 1 deletion tests/test_tasks.py
@@ -101,7 +101,11 @@ def test_doc_to_target(self, task_class, limit):
        )
        _array_target = [task.doc_to_target(doc) for doc in arr]
        if task._config.output_type == "multiple_choice":
-            assert all(isinstance(label, int) for label in _array_target)
+            # TODO<baber>: label can be string or int; add better test conditions
+            assert all(
+                (isinstance(label, int) or isinstance(label, str))
+                for label in _array_target
+            )

    def test_build_all_requests(self, task_class, limit):
        task_class.build_all_requests(rank=1, limit=limit, world_size=1)
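
Note on the relaxed assertion: the commit lets `doc_to_target` return either an `int` or a `str` for `multiple_choice` tasks. The sketch below shows one way a letter label such as `answer_idx` could be resolved against the `doc_to_choice` list, and what a stricter test condition (per the TODO) might check. `resolve_label` and the sample `doc` are hypothetical and written only for illustration; this is not the harness's own code.

```python
from typing import Sequence, Union

# Mirrors doc_to_choice in glianorex.yaml.
CHOICES = ["A", "B", "C", "D"]


def resolve_label(label: Union[int, str], choices: Sequence[str]) -> int:
    """Hypothetical helper: map a target label to an index into `choices`.

    Accepts an in-range integer index or a string that appears in `choices`.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(label, bool):
        raise TypeError("label must be int or str, not bool")
    if isinstance(label, int):
        if not 0 <= label < len(choices):
            raise ValueError(f"index {label} is out of range")
        return label
    if isinstance(label, str):
        if label not in choices:
            raise ValueError(f"{label!r} is not one of {list(choices)}")
        return choices.index(label)
    raise TypeError(f"label must be int or str, got {type(label).__name__}")


# Made-up document; only the `answer_idx` field name comes from the diff above.
doc = {"answer_idx": "C"}
print(resolve_label(doc["answer_idx"], CHOICES))
```

A check along these lines would reject labels that are neither a valid index nor a listed choice, which the `isinstance`-only assertion cannot catch.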