Commit

Merge branch 'master' of github.com:OpenNMT/OpenNMT-py into T5
vince62s committed Jul 11, 2023
2 parents a1170bf + 2cf6ae0 commit 72cf518
Showing 3 changed files with 65 additions and 1 deletion.
62 changes: 62 additions & 0 deletions eval_llm/MMLU/llama33b-onmt.txt
@@ -0,0 +1,62 @@
ACC-abstract_algebra: 0.3700
ACC-anatomy: 0.5185
ACC-astronomy: 0.6118
ACC-business_ethics: 0.5800
ACC-clinical_knowledge: 0.5547
ACC-college_biology: 0.5833
ACC-college_chemistry: 0.3800
ACC-college_computer_science: 0.4400
ACC-college_mathematics: 0.3600
ACC-college_medicine: 0.5376
ACC-college_physics: 0.3137
ACC-computer_security: 0.6800
ACC-conceptual_physics: 0.4723
ACC-econometrics: 0.3333
ACC-electrical_engineering: 0.4690
ACC-elementary_mathematics: 0.3413
ACC-formal_logic: 0.3571
ACC-global_facts: 0.3900
ACC-high_school_biology: 0.6419
ACC-high_school_chemistry: 0.3793
ACC-high_school_computer_science: 0.5800
ACC-high_school_european_history: 0.7152
ACC-high_school_geography: 0.7273
ACC-high_school_government_and_politics: 0.8187
ACC-high_school_macroeconomics: 0.5590
ACC-high_school_mathematics: 0.2741
ACC-high_school_microeconomics: 0.5588
ACC-high_school_physics: 0.3311
ACC-high_school_psychology: 0.7596
ACC-high_school_statistics: 0.4676
ACC-high_school_us_history: 0.7696
ACC-high_school_world_history: 0.7637
ACC-human_aging: 0.6861
ACC-human_sexuality: 0.6718
ACC-international_law: 0.7603
ACC-jurisprudence: 0.6574
ACC-logical_fallacies: 0.6994
ACC-machine_learning: 0.3750
ACC-management: 0.7573
ACC-marketing: 0.8333
ACC-medical_genetics: 0.6100
ACC-miscellaneous: 0.7752
ACC-moral_disputes: 0.6503
ACC-moral_scenarios: 0.3855
ACC-nutrition: 0.6471
ACC-philosophy: 0.6656
ACC-prehistory: 0.6667
ACC-professional_accounting: 0.4326
ACC-professional_law: 0.4342
ACC-professional_medicine: 0.5441
ACC-professional_psychology: 0.6144
ACC-public_relations: 0.6818
ACC-security_studies: 0.6367
ACC-sociology: 0.7761
ACC-us_foreign_policy: 0.8300
ACC-virology: 0.5000
ACC-world_religions: 0.7953
ACC-all: 0.5701
total run time 10761.85


Llama Paper 33B: 57.8
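
To read the overall score against the paper figure quoted above, the result lines can be parsed and compared directly. This is a minimal sketch, assuming only the `ACC-<subject>: <score>` line format visible in this file; the `parse_acc` helper and the shortened `RESULTS` excerpt are illustrative and not part of the commit.

```python
import re

# Short excerpt of the results above; the full file has one line per subject.
RESULTS = """\
ACC-abstract_algebra: 0.3700
ACC-marketing: 0.8333
ACC-all: 0.5701
total run time 10761.85
"""

PAPER_33B = 57.8  # "Llama Paper 33B" figure quoted above, in percent


def parse_acc(text):
    """Return {subject: accuracy} for every 'ACC-<subject>: <float>' line."""
    scores = {}
    for line in text.splitlines():
        match = re.fullmatch(r"ACC-(\w+):\s*([0-9.]+)", line.strip())
        if match:  # lines such as "total run time ..." are skipped
            scores[match.group(1)] = float(match.group(2))
    return scores


scores = parse_acc(RESULTS)
overall_pct = scores["all"] * 100  # fraction -> percent
gap = PAPER_33B - overall_pct      # percentage points below the paper figure
print(f"ONMT: {overall_pct:.2f}  paper: {PAPER_33B:.1f}  gap: {gap:.2f} points")
```

On the numbers in this file, the reported `ACC-all` of 0.5701 sits about 0.79 percentage points below the paper's 57.8.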
2 changes: 1 addition & 1 deletion onmt/translate/translation.py
@@ -203,7 +203,7 @@ def log(self, sent_number):

     if self.word_aligns is not None:
         pred_align = self.word_aligns[0]
-        pred_align_pharaoh = build_align_pharaoh(pred_align)
+        pred_align_pharaoh, _ = build_align_pharaoh(pred_align)
         pred_align_sent = " ".join(pred_align_pharaoh)
         msg.append("ALIGN: {}\n".format(pred_align_sent))

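
The one-line change in `log` suggests that `build_align_pharaoh` returns two values, of which only the first is needed for the `ALIGN:` message. The sketch below illustrates why the unpacking matters. `build_align_pharaoh_stub` is a hypothetical stand-in written for this example, not the OpenNMT-py implementation, and the shape of its second return value is an assumption.

```python
def build_align_pharaoh_stub(valid_alignment):
    """Hypothetical stand-in, NOT the OpenNMT-py implementation.

    Assumes the helper returns two lists: Pharaoh-style "src-tgt" pairs
    and a second per-pair value that the logging code does not need.
    """
    pairs, scores = [], []
    for tgt_id, (src_id, score) in enumerate(valid_alignment):
        pairs.append(f"{src_id}-{tgt_id}")
        scores.append(score)
    return pairs, scores


# Made-up (src_id, score) entry for each target token.
alignment = [(0, 0.9), (2, 0.7), (1, 0.8)]

# Old call style: the whole tuple is bound, so join() receives lists, not strings.
result = build_align_pharaoh_stub(alignment)
old_error = None
try:
    " ".join(result)
except TypeError as err:
    old_error = str(err)

# New call style from the diff: keep the pairs, discard the second value.
pred_align_pharaoh, _ = build_align_pharaoh_stub(alignment)
pred_align_sent = " ".join(pred_align_pharaoh)
print("ALIGN: {}".format(pred_align_sent))
```

Under that assumption, binding the return value to a single name makes `" ".join(...)` raise a `TypeError`, while the unpacked form yields the space-separated alignment string.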
2 changes: 2 additions & 0 deletions tools/convert_T5.py
@@ -354,6 +354,7 @@ def __init__(self, model_path: str):
vocabs["src"] = src_vocab
vocabs["tgt"] = src_vocab
vocabs["data_task"] = "seq2seq"

vocabs["decoder_start_token"] = "<blank>"

onmt_cp["vocab"] = {}
@@ -575,3 +576,4 @@ def __init__(self, model_path: str):
opt.tokenizer_model,
)
print("With OpenNMT-py use the model with the .new extension or rename it")
