Does GLM-4 use DeepNorm? #580

Open
XuRuihan opened this issue Oct 8, 2024 · 0 comments

XuRuihan commented Oct 8, 2024

Feature request

Does GLM-4 use Pre-Norm or DeepNorm (Post-Norm)?

Motivation

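According to the Technical Report, GLM-4 should have carried over DeepNorm from GLM-130B.
However, the config file published on Hugging Face sets apply_residual_connection_post_layernorm=False, which suggests that Post-Norm is not used. The same file also sets post_layer_norm=True, but that parameter only applies a layer norm at the end of the decoder.
So which one is correct, and should DeepNorm be used or not?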
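To make the three options in the question concrete, here is a minimal pure-Python sketch of the residual layouts being compared. It is not taken from the GLM-4 code: `sublayer`, `run_stack`, and the block function names are hypothetical stand-ins, the layer norm has no learned scale or bias, and `alpha` is the DeepNorm residual scaling factor chosen arbitrarily here.

```python
import math


def layer_norm(x, eps=1e-5):
    # Plain layer norm over one vector, without learned scale or bias.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def sublayer(x):
    # Hypothetical stand-in for an attention or MLP sublayer.
    return [0.5 * v + 1.0 for v in x]


def pre_norm_block(x):
    # Pre-Norm: x + f(LN(x)); the residual stream itself is never normalized.
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]


def post_norm_block(x):
    # Post-Norm: LN(x + f(x)); the normalization sits after the residual add.
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


def deepnorm_block(x, alpha):
    # DeepNorm: LN(alpha * x + f(x)); Post-Norm with an up-weighted residual.
    return layer_norm([alpha * a + b for a, b in zip(x, sublayer(x))])


def run_stack(x, block, num_layers, final_norm):
    # Apply the same block num_layers times, then an optional final layer norm.
    for _ in range(num_layers):
        x = block(x)
    return layer_norm(x) if final_norm else x


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    print(run_stack(x, pre_norm_block, num_layers=2, final_norm=True))
    print(run_stack(x, lambda h: deepnorm_block(h, alpha=2.0), 2, False))
```

Under the reading given in the question, apply_residual_connection_post_layernorm=False together with post_layer_norm=True would correspond to `run_stack(x, pre_norm_block, n, final_norm=True)`, whereas DeepNorm would correspond to `deepnorm_block` inside every layer. Whether that reading matches the released GLM-4 modeling code is exactly what is being asked.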

Your contribution

See above.
