
What is the purpose of the copy function call in the iterator_sm70 file? #2750

Answered by lzhangzz
vicety asked this question in Q&A
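This call zeroes the out-of-bounds values ahead of time; otherwise they may contain NaN or INF. Setting them to some arbitrary value also works, presumably because those positions get masked out by the attention score mask.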
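To make the reasoning concrete, here is a minimal host-side C++ sketch, not the actual iterator_sm70 code. It assumes the out-of-bounds slots end up multiplied by a masked attention weight of exactly 0; the names `weighted_sum` and `tile_result` and the tile contents are hypothetical. Under that assumption any finite fill value drops out of the sum, while NaN or INF still contaminates it, because `0 * NaN` and `0 * INF` are both NaN in IEEE 754 arithmetic.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted sum over one tile: sum_i p[i] * v[i].
// p[i] is assumed to be exactly 0 for masked (out-of-bounds) positions.
float weighted_sum(const std::vector<float>& p, const std::vector<float>& v) {
    float acc = 0.f;
    for (std::size_t i = 0; i < p.size(); ++i) {
        acc += p[i] * v[i];
    }
    return acc;
}

// Hypothetical tile with 2 valid slots and 2 out-of-bounds slots.
// oob_fill stands in for whatever the out-of-bounds slots contain.
float tile_result(float oob_fill) {
    const std::vector<float> p = {0.25f, 0.75f, 0.f, 0.f};        // masked weights are 0
    const std::vector<float> v = {1.f, 2.f, oob_fill, oob_fill};  // tail is out of bounds
    return weighted_sum(p, v);
}

// True when the fill value leaks into the result as NaN or INF.
bool contaminates(float oob_fill) {
    return !std::isfinite(tile_result(oob_fill));
}
```

With this setup, `tile_result(0.f)` and `tile_result(123.f)` give the same finite result, whereas passing NaN or INF as the fill makes `contaminates` return true. That is why uninitialized memory is unsafe here even when the positions are masked.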

Replies: 1 comment

Answer selected by vicety