v0.2.1
Load balancing
- Fix gradient for balance loss.
Misc
- Fix typos.
- Update benchmark interface.
- Remove some redundant code for performance improvement.
- Enable USE_NCCL by default.
- Compatibility for PyTorch <1.8.0 and >=1.8.0.
Megatron adaption
- Patch for numerical correctness of gradient clipping.
- Support for pipeline parallelism.