Incorrectly pytree recognition by KFAC optimizer #273

Open
Uernd opened this issue Oct 4, 2024 · 1 comment

Comments

@Uernd

Uernd commented Oct 4, 2024

First of all, thanks to the KFAC team for this contribution!

While using the KFAC optimizer to train an ANN, I noticed that it seems to have trouble understanding the structure of the parameter tree when a parameter is used more than once in constructing the neural network.

If the original ANN is denoted f(params, inputs), then simply using the modified ANN F(params, inputs) = f(params, inputs) + f(params, inputs) makes the program throw an error. I tried using functools.partial to fix the parameters, but then the program seems to get stuck. If I use vmap, some of the parameters are labelled as 'orphan', and in my experiments this affects the optimization process.
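To make the pattern concrete, here is a minimal sketch in plain JAX of what "the parameter is used more than once" looks like in the traced graph. The one-layer `f`, the shapes, and the helper `count_dense_ops` are hypothetical stand-ins for illustration, not the network from my actual code and not part of the KFAC library. It only counts matmul operations in the jaxpr and does not call the optimizer.

```python
import jax
import jax.numpy as jnp
from jax import lax


def f(params, inputs):
    # Hypothetical one-layer network standing in for the original ANN.
    w, b = params
    return lax.tanh(lax.dot(inputs, w) + b)


def F(params, inputs):
    # The modified network: the same params feed two separate calls to f.
    return f(params, inputs) + f(params, inputs)


def count_dense_ops(fn, params, inputs):
    # Trace fn and count the matmul (dot_general) equations in its jaxpr.
    closed = jax.make_jaxpr(fn)(params, inputs)
    return sum(eqn.primitive.name == "dot_general" for eqn in closed.jaxpr.eqns)


params = (jnp.ones((3, 2)), jnp.zeros((2,)))
inputs = jnp.ones((4, 3))

n_f = count_dense_ops(f, params, inputs)  # weight w consumed by one matmul
n_F = count_dense_ops(F, params, inputs)  # weight w consumed by two matmuls
```

In `F` the same weight array is an input to two matmul equations, which is the situation described above.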

Is there already a way to avoid these issues? Would you consider updating the optimizer to fix this bug?

Thanks again for the well-designed optimizer!

@james-martens
Collaborator

The K-FAC optimizer doesn't currently support a parameter being used more than once in the graph. This doesn't rule out RNNs and transformers, since they usually use each parameter only once in the graph, just with an operation that has a time dimension. As of a week or two ago, the behavior when such a parameter is found is to automatically register it as "generic", which falls back to a crude curvature approximation. If you use the TNT feature in the code, you can get generically-registered layers to do something more useful.
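One possible way to keep a shared parameter to a single use in the graph, in the spirit of the "time dimension" remark above, is to stack the inputs and apply the layer once. The following is only a sketch in plain JAX under stated assumptions: the one-layer `f`, the two inputs `x1` and `x2`, and the function names are hypothetical, and whether the optimizer then registers the layer as intended, and whether the resulting curvature estimate suits a given loss, is not confirmed here.

```python
import jax
import jax.numpy as jnp
from jax import lax


def f(params, inputs):
    # Hypothetical one-layer network with weight w and bias b.
    w, b = params
    return lax.tanh(lax.dot(inputs, w) + b)


def two_calls(params, x1, x2):
    # Shared params used twice: two matmuls against the same weight.
    return f(params, x1) + f(params, x2)


def one_call(params, x1, x2):
    # Stack both inputs along the batch axis so the weight is used once,
    # then split the output back into the two halves and combine them.
    stacked = jnp.concatenate([x1, x2], axis=0)
    y1, y2 = jnp.split(f(params, stacked), 2, axis=0)
    return y1 + y2


params = (jnp.arange(6.0).reshape(3, 2) / 10.0, jnp.zeros((2,)))
x1 = jnp.ones((4, 3))
x2 = 2.0 * jnp.ones((4, 3))

same = bool(jnp.allclose(two_calls(params, x1, x2), one_call(params, x1, x2)))

# Number of matmul equations in the traced graph of the stacked version.
eqns = jax.make_jaxpr(one_call)(params, x1, x2).jaxpr.eqns
n_dots = sum(eqn.primitive.name == "dot_general" for eqn in eqns)
```

Both versions compute the same output; the stacked one differs only in that the weight enters a single matmul. Note that this assumes `x1` and `x2` have the same batch size.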
