Is the EMA model saved correctly? #180

Open
Blueskyvvvvv opened this issue Aug 10, 2024 · 1 comment

Comments

@Blueskyvvvvv

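I'd like to ask how saving of the EMA model is implemented here:
[image]
I don't see the model being explicitly modified here. Is that handled inside DeepSpeed?

I checked the saved model and ema_model and found that their parameters are exactly identical, so I'm not sure whether I'm using it incorrectly. Any clarification would be appreciated. Thanks.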

@1049451037
Member

1049451037 commented Aug 12, 2024

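This is handled by the optimizer implemented inside sat:

https://github.com/THUDM/SwissArmyTransformer/blob/main/sat/ops/fused_ema_adam.py

The EMA update is fused directly into the optimizer's update step.

It should be fine. When we trained CogVLM earlier, we observed that the EMA weights scored somewhat higher in evaluation.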
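
For readers unfamiliar with the idea, here is a rough sketch of what fusing EMA into an optimizer step can look like. It is not the code from `fused_ema_adam.py`: the class name `FusedEmaAdamSketch`, its arguments, and the toy quadratic loss are all hypothetical, and it uses plain Python floats instead of tensors. It only illustrates that the EMA copy lives inside the optimizer and is updated in the same loop as the weights, so the training script never touches it explicitly.

```python
import math


class FusedEmaAdamSketch:
    """Toy Adam optimizer that also keeps an EMA copy of every parameter.

    Hypothetical illustration only; not the sat implementation.
    """

    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, ema_decay=0.999):
        self.params = params            # live weights, updated in place
        self.ema_params = list(params)  # EMA copy, starts equal to the weights
        self.lr, self.betas, self.eps = lr, betas, eps
        self.ema_decay = ema_decay
        self.m = [0.0] * len(params)    # first-moment estimates
        self.v = [0.0] * len(params)    # second-moment estimates
        self.t = 0                      # step counter for bias correction

    def step(self, grads):
        self.t += 1
        b1, b2 = self.betas
        d = self.ema_decay
        for i, g in enumerate(grads):
            # Standard Adam update of the live weight.
            self.m[i] = b1 * self.m[i] + (1 - b1) * g
            self.v[i] = b2 * self.v[i] + (1 - b2) * g * g
            m_hat = self.m[i] / (1 - b1 ** self.t)
            v_hat = self.v[i] / (1 - b2 ** self.t)
            self.params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)
            # "Fused" part: the EMA copy is refreshed in the same loop iteration.
            self.ema_params[i] = d * self.ema_params[i] + (1 - d) * self.params[i]


# Toy usage: minimize sum(w^2), whose gradient is 2 * w.
weights = [1.0, -2.0]
opt = FusedEmaAdamSketch(weights, lr=0.1, ema_decay=0.9)
for _ in range(5):
    grads = [2.0 * w for w in weights]
    opt.step(grads)
print("weights:", weights)
print("ema:    ", opt.ema_params)
```

In this toy setup the EMA copy trails the live weights after a few steps, so the two sets of values differ. Whether the same holds for a particular saved checkpoint depends on how the real optimizer is configured and on which tensors are written out.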
