First of all, thanks to the K-FAC team for their contribution!
While using the K-FAC optimizer to train an ANN, I noticed that the optimizer seems to have trouble understanding the structure of the parameter tree when a parameter is used more than once in constructing the neural network.
If the original ANN is denoted f(params, inputs), then simply using a modified ANN F(params, inputs) = f(params, inputs) + f(params, inputs) makes the program throw an error. I tried functools.partial to fix the parameters, but the program then appears to hang. If I instead use vmap, some of the parameters are labelled as 'orphan', and in my experiments this affects the optimization process.
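To make the failure pattern concrete, here is a minimal sketch (illustrative names only, not from the K-FAC codebase) of the parameter-reuse structure described above:

```python
# Hypothetical minimal sketch of the parameter-reuse pattern.
# f uses each parameter exactly once; F traces the same params twice.

def f(params, x):
    # a single "layer": one weight and one bias, each used once
    w, b = params
    return w * x + b

def F(params, x):
    # the same parameters now appear twice in the computation graph,
    # which is the pattern that reportedly confuses the optimizer
    return f(params, x) + f(params, x)
```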
Is there already a method to avoid these issues? Would you consider updating the optimizer to fix this bug?
Thanks again for the well-designed optimizer!
The K-FAC optimizer doesn't currently support parameters being used more than once in the graph. This doesn't rule out RNNs and transformers, since they usually use each parameter only once in the graph, just with an operation that has a time dimension. As of a week or two ago, the behavior upon finding such a parameter is to automatically register it as "generic", which falls back to a crude curvature approximation. If you use the TNT feature in the code, you can get generically-registered layers to do something more useful.
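One possible workaround (my own suggestion, not an official fix) is to algebraically rewrite the model so that each parameter appears exactly once in the traced graph. For the specific example in this issue, F = f + f can be folded into a scalar multiply, which is mathematically identical but single-use:

```python
# Sketch of rewriting a parameter-reusing model into a single-use one.
# Names are illustrative; f stands in for the original network.

def f(params, x):
    w, b = params
    return w * x + b

def F_reused(params, x):
    # params traced twice -- the unsupported pattern
    return f(params, x) + f(params, x)

def F_single_use(params, x):
    # equivalent output, but each parameter is used only once
    return 2.0 * f(params, x)
```

This only helps when the reuse can be factored out; more general weight sharing would still hit the "generic" registration path described above.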