Commit

Add script for instruction building dataset
binkjakub committed Dec 12, 2024
1 parent 9b7929d commit 0387b7a
Showing 6 changed files with 611 additions and 254 deletions.
4 changes: 2 additions & 2 deletions configs/dataset/pl-court-frankowe-instruct.yaml
@@ -1,6 +1,6 @@
-name: data/datasets/pl/sprawy_frankowe/instructions_08_12_2024
+name: data/datasets/pl/sprawy_frankowe
 prompt_field: prompt
 context_field: context
 output_field: output

-max_output_tokens: 800 # 1050 is the real maximum, we take 0.95 quantile value
+max_output_tokens: 800
3 changes: 3 additions & 0 deletions data/datasets/pl/sprawy_frankowe/.gitignore
@@ -0,0 +1,3 @@
+/train.jsonl
+/test.jsonl
+/dataset_info.json