Empirical variances don't match Table 1 #6
Oh yeah, and I could just be calculating the gradient variances wrong; am currently using: T.var(T.grad(expressions.loss_train, l_hidden.W)) Should probably be: T.mean(T.var(T.grad(expressions.loss_train, l_hidden.W), axis=0)) Repeating with this instead; it will take a while to get the results.
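For concreteness, here is a minimal Theano sketch of the two estimates side by side. The softmax layer and the variables `x`, `y`, `W` are hypothetical stand-ins for the repo's `expressions.loss_train` and `l_hidden.W`, which aren't shown here:

```python
import numpy as np
import theano
import theano.tensor as T

# Hypothetical stand-ins for expressions.loss_train and l_hidden.W.
x = T.matrix('x')   # minibatch of inputs
y = T.ivector('y')  # integer class labels
W = theano.shared(
    np.random.randn(784, 10).astype(theano.config.floatX), name='W')

p = T.nnet.softmax(T.dot(x, W))
loss_train = T.nnet.categorical_crossentropy(p, y).mean()

grad_W = T.grad(loss_train, W)

# Original estimate: variance over *all* entries of the gradient tensor.
var_flat = T.var(grad_W)

# Proposed estimate: variance along the first axis of the gradient,
# averaged over the remaining axis (per-column variance across rows).
var_axis0 = T.mean(T.var(grad_W, axis=0))

variance_fn = theano.function([x, y], [var_flat, var_axis0])
```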
Unsurprisingly, initial results suggest that's not going to fix it.
Comment from the talk: we could be having problems with normalisation in this calculation, and should normalise the variance estimate in some way.
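One possible normalisation, continuing the Theano sketch above (this is my own guess at what was meant, not something from the talk): divide the variance by the mean squared gradient, so the overall gradient scale cancels and the numbers are comparable across layers and parameterizations:

```python
# Relative variance: an assumed, scale-free estimate; not necessarily
# the normalisation suggested at the talk.
eps = 1e-8  # guards against division by zero for near-zero gradients
var_relative = T.var(grad_W) / (T.mean(grad_W ** 2) + eps)
```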
Looking at the paper's code Tim sent me, have found the following differences:
Have implemented these changes in the code, and the results match a lot better. Unfortunately, there are still some problems, chiefly that the variance is increasing after training for 100 epochs. Will have to look at the code again to figure out why this is the case.
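To pin down when the variance starts growing, one could log the estimate once per epoch. In the loop below, `train_epoch`, `x_val`, and `y_val` are hypothetical placeholders for the repo's training loop and a held-out batch, and `variance_fn` is the compiled function from the sketch above:

```python
history = []
for epoch in range(100):
    train_epoch()  # hypothetical: one full pass over the training data
    v_flat, v_axis0 = variance_fn(x_val, y_val)  # fixed held-out minibatch
    history.append((epoch, float(v_flat), float(v_axis0)))
    print('epoch %3d  var(flat)=%.3e  mean var(axis=0)=%.3e' % history[-1])
```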
The results in Table 1 are very bad, because it actually looks like the variance is higher for the local reparameterization than for the single weight samples (a sketch of the two estimators being compared follows below). Possible reasons for this:
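For reference, here is a rough NumPy sketch of the two estimators being compared (shapes and names are made up): drawing a single weight matrix per minibatch versus the local reparameterization, which samples the pre-activations directly from their implied Gaussian:

```python
import numpy as np

rng = np.random.RandomState(0)
M, K, J = 128, 784, 300            # minibatch size, fan-in, fan-out
A = rng.randn(M, K)                # layer input
theta = 0.01 * rng.randn(K, J)     # posterior means of the weights
sigma2 = np.full((K, J), 1e-4)     # posterior variances of the weights

# (a) Single weight sample: one W drawn per minibatch, shared by all
#     examples, so the sampling noise is correlated across the minibatch.
W = theta + np.sqrt(sigma2) * rng.randn(K, J)
B_weight_sample = A.dot(W)

# (b) Local reparameterization: sample each pre-activation b_mj from
#     N(gamma_mj, delta_mj), with gamma = A.theta and delta = (A**2).sigma2,
#     giving independent noise per example.
gamma = A.dot(theta)
delta = (A ** 2).dot(sigma2)
B_local = gamma + np.sqrt(delta) * rng.randn(M, J)
```

Since (b) injects independent noise per example, averaging gradients over the minibatch should reduce its variance relative to (a), which is why a higher variance for the local reparameterization looks suspicious.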