First of all, thank you for sharing. You say, "This backward pass only calculates derivatives, so it is not considered backpropagation." Hinton points out that the details needed to compute derivatives do not have to be known. Doesn't this step go against the point of the paper? Thank you.
#6
Open
zym1599 opened this issue
Jan 11, 2023
· 2 comments
zym1599 changed the title from the original Chinese wording to its English translation on Jan 11, 2023
Backpropagation requires gradients to flow backward across layers, so its weight updates are not local. Here, gradients do not cross layer boundaries; each one is used only to update the layer that produced it. The backward passes of the layers are therefore decoupled, which is why these lines of code perform the update through each layer's own cost function. You can think of this as one-layer backpropagation, and it could be replaced by any other estimation or optimization method that does not need gradients. In other words, using local gradients here loses no generality and does not contradict Hinton's claim.
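To make the "local gradient" point concrete, here is a minimal, dependency-free sketch. It is not the repository's code. The names `LocalLayer`, `local_step`, `train_layerwise`, `THRESHOLD`, and `LR` are hypothetical, and the layer cost (softplus of the gap between a sum-of-squares goodness and a threshold) is an assumption chosen for illustration. Each layer differentiates only its own cost with respect to its own weights, and its output is handed to the next layer as plain numbers, so nothing is propagated backward between layers.

```python
import math
import random

THRESHOLD = 2.0  # hypothetical goodness threshold
LR = 0.03        # hypothetical learning rate


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softplus(z):
    return math.log1p(math.exp(z))


class LocalLayer:
    """One ReLU layer trained only on its own cost (illustrative sketch)."""

    def __init__(self, n_in, n_out, rng):
        self.w = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_out)]

    def forward(self, x):
        # Normalize the input so only its direction reaches this layer.
        norm = math.sqrt(sum(v * v for v in x)) + 1e-8
        xn = [v / norm for v in x]
        h = [max(0.0, sum(wi * xi for wi, xi in zip(row, xn))) for row in self.w]
        return xn, h

    def local_step(self, x, sign):
        """One gradient step on this layer's cost; sign=+1 positive, -1 negative."""
        xn, h = self.forward(x)
        goodness = sum(v * v for v in h)
        z = -sign * (goodness - THRESHOLD)
        # d softplus(z) / d goodness = sigmoid(z) * (-sign)
        dloss_dgoodness = -sign * sigmoid(z)
        for j, row in enumerate(self.w):
            for i in range(len(row)):
                # d goodness / d w[j][i] = 2 * h[j] * xn[i]; zero for inactive units.
                row[i] -= LR * dloss_dgoodness * 2.0 * h[j] * xn[i]
        return softplus(z)  # cost measured before the update


def train_layerwise(layers, pos, neg, steps=200):
    """Train each layer in turn; inputs to later layers are treated as constants."""
    for layer in layers:
        for _ in range(steps):
            layer.local_step(pos, +1)
            layer.local_step(neg, -1)
        # Pass activations on as plain numbers: no gradient crosses this boundary.
        pos = layer.forward(pos)[1]
        neg = layer.forward(neg)[1]
    return pos, neg


if __name__ == "__main__":
    rng = random.Random(0)
    net = [LocalLayer(4, 8, rng), LocalLayer(8, 8, rng)]
    train_layerwise(net, [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0])
```

The gradient inside `local_step` is written out by hand, which shows that the update needs nothing from any other layer. Because only `local_step` touches the weights, it could in principle be replaced by a gradient-free optimizer without changing `train_layerwise`.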