Commit

Merge pull request #371 from malay-nagda/malay/token_drop
Malay/token drop
erhoo82 authored Jul 9, 2024
2 parents 70278f9 + b885571 commit 74d8409
Showing 2 changed files with 5 additions and 0 deletions.
3 changes: 3 additions & 0 deletions launcher_scripts/conf/training/mixtral/mixtral_8x3b.yaml
@@ -47,6 +47,9 @@ exp_manager:
 model:
   mcore_gpt: true
   moe_grouped_gemm: true
+  moe_token_dispatcher_type: alltoall
+  moe_pad_expert_input_to_capacity: True
+  moe_expert_capacity_factor: 1.0
   micro_batch_size: 1
   global_batch_size: 128
   rampup_batch_size: null
2 changes: 2 additions & 0 deletions launcher_scripts/conf/training/mixtral/mixtral_8x7b.yaml
@@ -48,6 +48,8 @@ model:
   mcore_gpt: true
   moe_grouped_gemm: true
   moe_token_dispatcher_type: alltoall
+  moe_pad_expert_input_to_capacity: True
+  moe_expert_capacity_factor: 1.0
   moe_aux_loss_coeff: 0.01
   micro_batch_size: 1
   global_batch_size: 256
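
Note on the two added keys: a rough Python sketch of what a capacity factor and pad-to-capacity usually mean in MoE routing. This is an illustration under assumptions, not the Megatron-Core implementation. The function names, the `pad_id` sentinel, and the example sequence length of 4096 are hypothetical. The 8 experts with top-2 routing are assumed from the Mixtral architecture and are not stated in this diff.

```python
import math


def expert_capacity(num_tokens, top_k, num_experts, capacity_factor):
    """Per-expert token budget (illustrative formula, assumed).

    Each token is routed to top_k experts, so there are num_tokens * top_k
    assignments. A perfectly balanced router gives each expert an equal
    share; capacity_factor scales that share.
    """
    return math.ceil(num_tokens * top_k / num_experts * capacity_factor)


def pad_or_drop(token_ids, capacity, pad_id=-1):
    """Force one expert's input to exactly `capacity` entries.

    Assignments beyond the budget are dropped; a short input is padded
    with pad_id so every expert sees the same static shape.
    """
    kept = token_ids[:capacity]
    return kept + [pad_id] * (capacity - len(kept))


# Hypothetical numbers: 4096 tokens in a micro-batch, 8 experts, top-2.
cap = expert_capacity(num_tokens=4096, top_k=2, num_experts=8, capacity_factor=1.0)
print(cap)
print(pad_or_drop([10, 11, 12], capacity=2))
print(pad_or_drop([10], capacity=3))
```

With a factor of 1.0 each expert's budget equals its share under perfectly balanced routing, so any imbalance leads to dropped assignments on overloaded experts and padding on underloaded ones.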
