
First of all, thank you for sharing. You say, "This backward pass only computes derivatives, so it is not considered backpropagation." Hinton points out that one shouldn't need to know the specific details of the forward computation in order to take derivatives. Doesn't this step contradict the point made in the paper? Thank you. #6

Open
zym1599 opened this issue Jan 11, 2023 · 2 comments


zym1599 commented Jan 11, 2023

First of all, thank you for sharing. You say, "This backward pass only computes derivatives, so it is not considered backpropagation." Hinton points out that one shouldn't need to know the specific details of the forward computation in order to take derivatives. Doesn't this step contradict the point made in the paper? Thank you.


makrout commented Jan 20, 2023

Backpropagation requires gradients to flow backward across layers, and hence lacks the locality of the updates used here. In this code, however, gradients do not flow across layers; they are used only for local updates. For this reason, the layers have decoupled backward passes, and these lines of code perform the local update through the layer's own cost function. This can be viewed as one-layer backpropagation, and it could be substituted with other estimation/optimization methods that do not require gradients. In other words, the code uses local gradients without loss of generality and does not go against Hinton's claim.
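To make the decoupled update concrete, here is a minimal PyTorch-style sketch (not this repo's actual code; the class name, goodness definition, threshold, and learning rate are illustrative assumptions, following the sum-of-squared-activations goodness from Hinton's paper). Each layer owns its optimizer, detaches its inputs, and calls `backward()` only on its own cost, so no gradient ever crosses a layer boundary:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalLayer(nn.Module):
    """Illustrative Forward-Forward-style layer with a fully local update."""

    def __init__(self, in_dim, out_dim, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.threshold = threshold
        # Each layer owns its optimizer, so updates never leave the layer.
        self.opt = torch.optim.Adam(self.linear.parameters(), lr=lr)

    def forward(self, x):
        return torch.relu(self.linear(x))

    def goodness(self, x):
        # "Goodness" = sum of squared activations per sample.
        return self.forward(x).pow(2).sum(dim=1)

    def local_update(self, x_pos, x_neg):
        # Detached inputs: the backward pass below cannot reach earlier layers.
        g_pos = self.goodness(x_pos.detach())
        g_neg = self.goodness(x_neg.detach())
        # Push goodness above the threshold on positive data and below it
        # on negative data (softplus margin loss).
        loss = F.softplus(torch.cat([
            self.threshold - g_pos,
            g_neg - self.threshold,
        ])).mean()
        self.opt.zero_grad()
        loss.backward()  # derivatives of this layer's cost w.r.t. its own
                         # weights only: the "1-layer backpropagation" above
        self.opt.step()
        # Hand detached activations to the next layer.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```

Training then just calls `local_update` layer by layer on (positive, negative) pairs; because every input is detached, the backward pass of one layer is invisible to the others.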

@taomanwai

"Other estimation/optimization methods that do not require gradients" <- any suggestions?
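For reference, one family of candidates is zeroth-order (derivative-free) optimization, such as SPSA-style perturbation or evolution strategies, since the layer's cost only needs to be evaluated, not differentiated. Below is a minimal sketch under stated assumptions: the helper name, `lr`, and `eps` are hypothetical, it applies a two-point SPSA-style estimate per weight tensor, and it is not code from this repository:

```python
import torch

def spsa_local_update(layer_cost, params, lr=1e-3, eps=1e-3):
    """layer_cost: zero-argument callable returning the layer's scalar cost.
    params: this layer's weight tensors. (Hypothetical helper, for illustration.)"""
    with torch.no_grad():
        for p in params:
            # Random +/-1 (Rademacher) perturbation direction.
            delta = torch.randint(0, 2, p.shape, device=p.device).to(p.dtype) * 2 - 1
            p.add_(eps * delta)
            c_plus = float(layer_cost())
            p.sub_(2 * eps * delta)
            c_minus = float(layer_cost())
            p.add_(eps * delta)  # restore the original weights
            # Two-point estimate of the directional derivative, then an SGD step.
            g_hat = (c_plus - c_minus) / (2 * eps) * delta
            p.sub_(lr * g_hat)
```

This estimate would stand in for `loss.backward()` in the local update; the rest of the layer-wise training loop stays the same.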
