relative distance computation #13

Open
imj2185 opened this issue Feb 6, 2021 · 1 comment

imj2185 commented Feb 6, 2021

Hello.

While reading the code and comparing it against the Music Transformer paper, I had a question, so I am posting it here.

  1. Section 3.4 of the paper computes the relative distance and uses it in a dot-product term, but in the code you draw E from a random distribution instead:

self.E = torch.randn([self.max_seq, int(self.dh)], requires_grad=False)

Is this an intentional departure from the paper? (I sketch my understanding of the paper's computation below.)
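For reference, this is roughly how I understand the Section 3.4 computation for a single head. It is just my own sketch, not code from this repo, assuming queries q of shape (L, d_h), a learned relative embedding matrix E of the same shape, and a causal mask that takes care of positions right of the diagonal:

import torch
import torch.nn.functional as F

def relative_logits(q, E):
    # q: (L, d_h) queries for one head; E: (L, d_h) learned relative embeddings
    L = q.size(0)
    qe = q @ E.t()             # (L, L) dot products against the relative embeddings
    qe = F.pad(qe, (1, 0))     # prepend a dummy column -> (L, L + 1)
    qe = qe.reshape(L + 1, L)  # the paper's "skewing" trick
    return qe[1:, :]           # (L, L) S_rel, added to Q K^T before the softmax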

Thank you.

serkansulun commented Jun 14, 2021

Bump. Can someone explain the usage of
self.E = torch.randn([self.max_seq, int(self.dh)], requires_grad=False)
while calculating relative attention? Also, this tensor isn't registered as a parameter or buffer, so it prevents reproducibility when the model is reloaded.
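One possible fix, as a sketch of what I assume the intent is (not the repo's actual code): register E so that it is learned, as in the paper, and serialized with the model:

import torch
import torch.nn as nn

class RelativeAttentionHead(nn.Module):  # hypothetical module name, for illustration only
    def __init__(self, max_seq, dh):
        super().__init__()
        # Learned relative position embeddings: trained with the model and
        # saved in / restored from the state_dict, unlike a plain torch.randn tensor.
        self.E = nn.Parameter(torch.randn(max_seq, dh))
        # Or, to keep E fixed but still serialized and reproducible:
        # self.register_buffer("E", torch.randn(max_seq, dh))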
