Commit

feat: add FSDP to standalone
Signed-off-by: Sébastien Han <[email protected]>
leseb committed Oct 10, 2024
1 parent 46a8374 commit d92ecfd
Showing 2 changed files with 6 additions and 6 deletions.
6 changes: 3 additions & 3 deletions standalone/standalone.py
@@ -129,7 +129,7 @@
--max_batch_len=20000 \
--seed=42 \
--cpu_offload_optimizer \
- --sharding_strategy=FULL_SHARD \
+ --distributed_training_framework fsdp \
--is_granite \
--checkpoint_at_epoch
command:
@@ -188,14 +188,14 @@
--output_dir=/tmp/model \
--num_epochs={epoch_num} \
--effective_batch_size=3840 \
- --learning_rate=2e-6 \
+ --learning_rate=1e-4 \
--num_warmup_steps=800 \
--save_samples=0 \
--log_level=INFO \
--max_batch_len=20000 \
--seed=42 \
--cpu_offload_optimizer \
- --sharding_strategy=FULL_SHARD \
+ --distributed_training_framework fsdp \
--is_granite \
--checkpoint_at_epoch
command:
6 changes: 3 additions & 3 deletions standalone/standalone.tpl
@@ -114,7 +114,7 @@ spec:
--max_batch_len=20000 \
--seed=42 \
--cpu_offload_optimizer \
- --sharding_strategy=FULL_SHARD \
+ --distributed_training_framework fsdp \
--is_granite \
--checkpoint_at_epoch
command:
@@ -173,14 +173,14 @@ spec:
--output_dir=/tmp/model \
--num_epochs={epoch_num} \
--effective_batch_size=3840 \
- --learning_rate=2e-6 \
+ --learning_rate=1e-4 \
--num_warmup_steps=800 \
--save_samples=0 \
--log_level=INFO \
--max_batch_len=20000 \
--seed=42 \
--cpu_offload_optimizer \
- --sharding_strategy=FULL_SHARD \
+ --distributed_training_framework fsdp \
--is_granite \
--checkpoint_at_epoch
command:
