FMoE core
The former mp_group is renamed to slice_group to indicate that all workers in the group receive the same input batch and each processes a slice of that input. mp_group will be deprecated in the next release.
ROCm is supported.
FMoELinear is moved to a stand-alone file.
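To illustrate the slice_group semantics described above, here is a minimal sketch in plain Python of how workers that all hold the same batch could each pick out their own slice. The helper name `slice_for_rank` and the near-even contiguous split are assumptions made for illustration; the actual slicing logic in FMoE may differ.

```python
def slice_for_rank(batch, rank, world_size):
    """Return the contiguous slice of `batch` that worker `rank` processes.

    Hypothetical helper: every worker in the group holds the same `batch`
    and keeps only its own share. Leftover items go to the lowest ranks.
    """
    per_rank, remainder = divmod(len(batch), world_size)
    start = rank * per_rank + min(rank, remainder)
    end = start + per_rank + (1 if rank < remainder else 0)
    return batch[start:end]


# Every worker sees the same 10-item batch; a group of 4 splits it up.
batch = list(range(10))
slices = [slice_for_rank(batch, rank, 4) for rank in range(4)]
```

Concatenating the slices in rank order gives back the original batch, so no item is dropped or processed twice.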
Grouped data parallel
Any group name is supported through its relative tag name.
Load balancing
A brand-new balancing strategy, SWIPE, contributed by the authors of a (currently unpublished) paper.
A has_loss property is added to each gate to indicate whether its balance loss should be collected.
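As a rough sketch of how a training loop might use has_loss, the snippet below sums balance losses only from gates that report one. The two gate classes and the `get_loss` accessor are hypothetical stand-ins written for this example, not the library's own classes; only the has_loss name comes from the notes above.

```python
class PlainGateSketch:
    """Hypothetical gate with no balancing term."""
    has_loss = False


class BalancedGateSketch:
    """Hypothetical gate that records a balance loss on each forward pass."""
    has_loss = True

    def __init__(self, loss):
        self._loss = loss

    def get_loss(self):  # assumed accessor name, for illustration only
        return self._loss


def collect_balance_loss(gates):
    """Sum the balance loss over gates whose has_loss flag is set."""
    return sum(g.get_loss() for g in gates if getattr(g, "has_loss", False))


gates = [PlainGateSketch(), BalancedGateSketch(0.25), BalancedGateSketch(0.5)]
total = collect_balance_loss(gates)
```

Checking the flag first means gates without a balancing term never need to implement a loss accessor.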
Megatron-LM support
Experts are partitioned by tensor model parallelism in mp_group, instead of expert parallelism.