why use cls not use avg #6

Open
luoqishuai opened this issue Sep 4, 2023 · 1 comment


@luoqishuai

I see that pooler_type in the training parameters is usually set to cls rather than avg.
Setting it to avg in InfoCSE raises an error:

pooler_output = pooler_output.view((batch_size, num_sent, pooler_output.size(-1)))
RuntimeError: shape '[16,2,768]' is invalid for input of size 29884416

May I ask why CLS is used instead of AVG?
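
For anyone reading the error above, the numbers in it can be checked by hand. The sketch below uses only the two figures from the traceback (the target shape and the reported input size) and shows that the tensor passed to view holds many more elements than one 768-wide vector per sentence. That is what you would see if some extra dimension had not been reduced before the reshape, though the trace alone does not say which one. The cls_pool and avg_pool helpers are hypothetical plain-Python illustrations of the two pooling modes on nested lists, not InfoCSE's actual code; they only show that both modes are expected to return one vector per sequence.

```python
from math import prod

# Numbers copied from the error message in this issue.
target_shape = (16, 2, 768)   # (batch_size, num_sent, hidden_size)
actual_elements = 29884416    # "input of size ..." reported by PyTorch

# view() only succeeds when the element counts match exactly.
expected_elements = prod(target_shape)
ratio, remainder = divmod(actual_elements, expected_elements)
print(expected_elements, ratio, remainder)


def cls_pool(hidden):
    """Hypothetical helper: first token of each sequence, [N][T][H] -> [N][H]."""
    return [seq[0] for seq in hidden]


def avg_pool(hidden, mask):
    """Hypothetical helper: masked mean over tokens, [N][T][H] -> [N][H]."""
    pooled = []
    for seq, m in zip(hidden, mask):
        n_real = sum(m)        # number of non-padding tokens
        width = len(seq[0])    # hidden size
        pooled.append([
            sum(tok[j] for tok, keep in zip(seq, m) if keep) / n_real
            for j in range(width)
        ])
    return pooled


# Toy input: 1 sequence, 3 tokens (the last one is padding), hidden size 2.
toy_hidden = [[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]]
toy_mask = [[1, 1, 0]]
cls_vec = cls_pool(toy_hidden)
avg_vec = avg_pool(toy_hidden, toy_mask)
```

In this toy case both helpers return a single 2-element vector for the single sequence, so a later reshape to (batch_size, num_sent, hidden_size) would have exactly the element count it expects.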

@donglifang622

bert = BertModel.from_pretrained(model_args.model_name_or_path)
model.bert.load_state_dict(bert.state_dict())
Why does loading BERT here raise an error?
