Add Support for KFAC Optimization in LSTM and GRU Layers #188

neuronphysics · 2023-11-22T16:02:17Z

Feature

I would like to request support for LSTM and GRU layers in the existing Kronecker-Factored Approximate Curvature (KFAC) optimizer. Currently, most of the KFAC optimizer classes are tailored to linear and 2D convolution layers. Extending them to cover RNN layers would be a significant enhancement.

Proposal

The proposal is to add KFAC support for LSTM and GRU layers to the KFAC optimizer. This would involve adapting the optimizer to compute the required statistics and to carry out the computations for the chain-structured linear Gaussian graphical model for LSTM and GRU layers, for which I could not find any public implementation.
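As a rough illustration of the kind of statistics involved, the sketch below estimates the two Kronecker factors for a single weight matrix that is reused at every time step, and applies the damped inverse to a gradient. It assumes, for simplicity, that contributions from different time steps are independent, so it ignores the temporal correlations that the chain-structured model is meant to capture. The names `kron_factors`, `precondition`, and `damping` are hypothetical and are not part of the kfac_jax API.

```python
import numpy as np

def kron_factors(inputs, out_grads):
    """Estimate Kronecker factors for a weight shared across time steps.

    inputs:    array of shape (T, B, d_in), layer inputs per step and example
    out_grads: array of shape (T, B, d_out), loss gradients w.r.t. the
               layer's pre-activation outputs
    Assumption (not from the thread): time steps are treated as independent,
    so both factors are plain averages over time and batch.
    """
    T, B, d_in = inputs.shape
    a = inputs.reshape(T * B, d_in)
    g = out_grads.reshape(T * B, -1)
    A = a.T @ a / (T * B)   # input second-moment factor, (d_in, d_in)
    G = g.T @ g / (T * B)   # output-gradient factor, (d_out, d_out)
    return A, G

def precondition(grad_W, A, G, damping=1e-3):
    """Return G_d^{-1} @ grad_W @ A_d^{-1} for grad_W of shape (d_out, d_in).

    This equals applying the inverse of the damped Kronecker product
    (A_d kron G_d) to the column-stacked gradient.
    """
    A_d = A + damping * np.eye(A.shape[0])
    G_d = G + damping * np.eye(G.shape[0])
    X = np.linalg.solve(G_d, grad_W)        # G_d^{-1} grad_W
    return np.linalg.solve(A_d, X.T).T      # (...) A_d^{-1}, A_d is symmetric
```

The point of the factorization is that only a `d_in x d_in` and a `d_out x d_out` matrix need to be stored and inverted, rather than the full `(d_in * d_out)`-dimensional curvature matrix.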

Motivation

LSTM and GRU layers are foundational components for working with sequential data and time-series analysis. I wonder how much KFAC could improve the training of models that use LSTM and GRU layers by providing accurate approximations of the Fisher information matrix. With support for these layers in the KFAC optimizer, researchers could apply KFAC to a wider array of models, including those used in reinforcement learning.

Additional Context

I have full confidence that the repository maintainers, particularly the first author of the paper "Kronecker-Factored Curvature Approximations for Recurrent Neural Networks", are the best people to extend KFAC support to LSTM and GRU layers within the KFAC optimizer. Such an extension would be a valuable addition to this useful repository.

I appreciate your consideration of this feature request. Thank you.

james-martens · 2023-11-23T14:13:45Z

Yeah, support for recurrent networks is something we have partially implemented internally. If there's interest, I guess we could try to get this out sooner.

neuronphysics · 2023-11-23T14:36:56Z

Great to hear that support for recurrent networks is partially implemented. There's definitely interest in this feature, and making it public sooner would be much appreciated, especially for its application in RL models, which is my main interest.

neuronphysics · 2024-04-22T17:11:44Z

Is there any update on publishing the KFAC code for RNNs?

james-martens · 2024-04-24T22:15:58Z

Sorry, no. Others and I have been very busy and haven't had time. If you're interested in using a Kronecker-factored method compatible with RNNs out of the box, you could try Shampoo or TNT, which make fewer assumptions about the structure of the network. I imagine that these are implemented in some open-source library, but I don't know which one specifically. We might eventually release support for these approaches in kfac_jax, but I have no timeline for that.
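For readers unfamiliar with Shampoo, a minimal sketch of its update for a single weight matrix might look like the following. It is an illustration of the published update rule (accumulate `grad @ grad.T` and `grad.T @ grad`, then precondition with their inverse fourth roots), not code from kfac_jax or any particular library. The function names, the learning rate, and the `eps` initialization are assumptions chosen for the example.

```python
import numpy as np

def inv_root(M, p):
    """Inverse p-th root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(M)
    w = np.maximum(w, 1e-12)          # guard against tiny negative round-off
    return (V * w ** (-1.0 / p)) @ V.T

def shampoo_init(shape, eps=1e-4):
    """Initial left/right statistics for a weight of the given 2D shape."""
    m, n = shape
    return eps * np.eye(m), eps * np.eye(n)

def shampoo_step(W, grad, L, R, lr=0.1):
    """One Shampoo update for a matrix parameter W with gradient grad."""
    L = L + grad @ grad.T             # left statistics, (m, m)
    R = R + grad.T @ grad             # right statistics, (n, n)
    update = inv_root(L, 4) @ grad @ inv_root(R, 4)
    return W - lr * update, L, R
```

Because the statistics depend only on the shape of each parameter's gradient, the same step applies unchanged to the weight matrices of an LSTM or GRU, which is why no layer-specific structure is needed.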
