feature request: hessian #77

Open
jbrea opened this issue May 27, 2022 · 4 comments

@jbrea

jbrea commented May 27, 2022

How much work would it be to add something like valhessian!(H, schain, x, p), where H is the Hessian of a simple chain with loss?

@chriselrod
Contributor

It's possible to implement hessians (I'd suggest a forward over reverse approach), but would of course take some work.

A PR implementing that would gladly be accepted and I could provide instructions.

Long term, the plan for this library is to rely on Enzyme for its derivatives, in which case this shouldn't be a problem (in theory).
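
For readers unfamiliar with the forward-over-reverse idea mentioned above, here is a minimal sketch. It does not use SimpleChains internals: `loss`, `grad`, `hvp`, and `hessian_columns` are hypothetical names, and the toy one-unit model with a hand-written gradient only stands in for a chain with loss. The one library call is `ForwardDiff.derivative`, used as an assumed choice of forward mode; this is not necessarily how a real `valhessian!` would be implemented.

```julia
using ForwardDiff

# Toy stand-in for a chain with loss (NOT SimpleChains code):
# a single tanh unit with weights p[1:2] and bias p[3], squared error on (x, y).
loss(p, x, y) = 0.5 * (tanh(p[1] * x[1] + p[2] * x[2] + p[3]) - y)^2

# Hand-written reverse pass, playing the role of a gradient routine.
# It is written generically so that dual numbers can flow through it.
function grad(p, x, y)
    z = p[1] * x[1] + p[2] * x[2] + p[3]
    a = tanh(z)
    dz = (a - y) * (1 - a^2)          # d(loss)/dz
    return [dz * x[1], dz * x[2], dz]
end

# Forward over reverse: differentiate the gradient along direction v.
# d/dt grad(p + t*v) at t = 0 equals H(p) * v, without ever forming H.
hvp(p, v, x, y) = ForwardDiff.derivative(t -> grad(p .+ t .* v, x, y), 0.0)

# Full Hessian, one column per unit vector (reasonable for small parameter counts).
function hessian_columns(p, x, y)
    n = length(p)
    H = zeros(n, n)
    for j in 1:n
        e = zeros(n)
        e[j] = 1.0
        H[:, j] = hvp(p, e, x, y)
    end
    return H
end

p = [0.3, -0.2, 0.1]
x = [1.0, 2.0]
y = 0.5
H = hessian_columns(p, x, y)
```

If only Hessian-vector products are needed, as in matrix-free second-order methods, `hvp` alone suffices; the full Hessian costs one forward-mode pass over the gradient per parameter.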

@jbrea
Author

jbrea commented May 27, 2022

Cool, thanks!

Currently I would only need hessians for chains with TurboDense layers. But I looked a bit into the code of dense!, and got a bit lost :)

> Long term, the plan for this library is to rely on Enzyme for its derivatives

I was trying to use SimpleChains with Enzyme, but crashed julia (1.7.2). Should it already work with a newer version of julia?

@chriselrod
Contributor

> I was trying to use SimpleChains with Enzyme, but crashed julia (1.7.2). Should it already work with a newer version of julia?

It probably shouldn't crash, but it also probably won't perform well until it can switch to the rewritten LoopVectorization.
The rewrite is ongoing, and still a ways away, but the major reason it will help here is that it'll be able to run its optimizations after Enzyme generates the AD code, instead of before as is currently the case.

@RS-Coop

RS-Coop commented Feb 13, 2024

Hello,

I am interested in training small models constructed with SimpleChains using second-order non-convex optimization methods, hence my interest in this thread.

I have a decent understanding of how to implement fast (mixed mode) matrix-free Hessian-vector products using Enzyme, and I am able to do this for regular Julia functions.

It seems that support for such functionality is not currently available (?), but this thread hints at it eventually being so. @chriselrod Can you comment on this? Thanks for the time!

Cheers,
Cooper
