Launcher Mcore T5 model for training/sft/eval/peft #383

huvunvidia · 2024-07-15T14:54:30Z

No description provided.

thomasdhc · 2024-07-30T20:24:19Z

launcher_scripts/conf/peft/t5/squad.yaml

@@ -14,7 +14,7 @@ trainer:
  devices: 8
  accelerator: gpu
  num_nodes: 1
-  precision: bf16
+  precision: 16


Is the precision meant to be 16 here?

@thomasdhc Yes, it is meant to be 16.
In our PEFT experiments, we have only tested thoroughly with 16 so far, not yet with bf16.
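
For readers weighing the two settings: the sketch below is an illustration, not part of this PR. It assumes that `precision: 16` selects IEEE half precision (fp16) and `precision: bf16` selects bfloat16, as in PyTorch Lightning's trainer. The helper names `to_fp16` and `to_bf16` are hypothetical, and the bf16 conversion truncates rather than rounds to keep the example short. It shows the trade-off in plain Python: fp16 keeps more mantissa bits but overflows above roughly 65504, while bf16 keeps the float32 exponent range at lower resolution.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision and back (5 exponent bits, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values beyond the fp16 range; model that as overflow to infinity.
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Truncate x to bfloat16 and back (8 exponent bits, 7 mantissa bits).

    bfloat16 is the upper 16 bits of a float32, so masking the lower 16 bits
    gives a truncating (not round-to-nearest) conversion.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    for value in (1.001, 100000.0):
        print(f"{value}: fp16={to_fp16(value)}, bf16={to_bf16(value)}")
```

The overflow behaviour is one reason fp16 mixed-precision training normally relies on loss scaling, whereas bf16 usually does not need it. Which setting behaves better for a given PEFT run still has to be checked empirically, as the reply above notes.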

thomasdhc

LGTM

Huy Vu2 added 4 commits July 15, 2024 07:52

first commit, containing training/sft/eval/peft

8f3fce4

update launcher_scripts/conf/peft/t5/squad.yaml

1167219

adding mcore_t5 for all T5 and mT5 configs

7b1ea61

fix merge conflict

b220795

thomasdhc reviewed Jul 30, 2024

thomasdhc approved these changes Jul 30, 2024

Merge branch 'main' into huvu/mcore_t5_launcher

92b1034

thomasdhc merged commit 07a8f78 into main Jul 31, 2024
3 checks passed