forked from NVIDIA/Megatron-LM
Does OverlappedDistributedOptimizer support using pipeline parallelism > 1 and data parallelism > 1 at the same time? #37
Comments
It seems PP still has problems at the moment; the author recommends using TP.
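I skimmed the code and it does not look supported. If it really is unsupported, that seems not very compatible with Megatron's configuration.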
It does run.
I suggest reading the code; getting it to run does not necessarily mean it runs correctly. You can do a global search for
Commit afddb84 added support for regular PP; virtual PP is not supported yet.
Where is the function record_grad_accumulation_boundary defined?
@li-yi-dong
OverlappedDistributedOptimizer does not seem to support pipeline parallelism?