As the number of classes grows, the parameters can be sharded across GPUs, but the allgather of hidden features on each GPU also consumes GPU memory #76

Open
gobigrassland opened this issue Nov 6, 2020 · 1 comment

@gobigrassland

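As the number of classes grows, the classification-layer parameters can be sharded across GPUs, but the allgather of the hidden features on each GPU also consumes GPU memory. As the number of classes increases, adding GPUs keeps the memory used by each GPU's shard of the fc-layer parameters constant, but the hidden features x still grow with the number of GPUs. Since single-GPU memory is limited, this restricts how far a linear increase in the number of classes can be handled simply by adding GPUs. This problem is raised in the paper “Partial FC: training 10 million identities on a single machine”.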
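To make the scaling concrete, here is a rough sketch of the per-GPU element counts. It assumes the fc weight is split evenly by class across GPUs and that every GPU allgathers the full batch of features; the function name and the example sizes (feature dimension 512, per-GPU batch 64) are hypothetical and not taken from this project.

```python
def per_gpu_elements(num_classes, feat_dim, per_gpu_batch, num_gpus):
    """Return (weight_shard, gathered_features) element counts on one GPU.

    Assumes an even class-wise split of the fc weight and an allgather
    that gives every GPU the features of the whole global batch.
    """
    # Each GPU holds num_classes / num_gpus rows of the fc weight.
    weight_shard = (num_classes // num_gpus) * feat_dim
    # After allgather, each GPU holds features from all num_gpus ranks.
    gathered_features = num_gpus * per_gpu_batch * feat_dim
    return weight_shard, gathered_features


# Double the classes and the GPUs together (hypothetical sizes).
small = per_gpu_elements(1_000_000, 512, 64, 8)
large = per_gpu_elements(2_000_000, 512, 64, 16)
print(small, large)
```

Under these assumptions the weight shard per GPU stays the same when classes and GPUs double together, while the gathered feature tensor doubles, which is the growth described above.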

@sandyhouse sandyhouse self-assigned this Nov 6, 2020
@sandyhouse

Thank you for the feedback. We will look into this issue first.
