Releases: ClashLuke/HeavyBall
v1.3.0
faster, less memory, minor fixes
- LaProp/Adam/... are now compilable
- `fused_hook` and `hook_optimizer_into_model`, reducing memory usage by fusing backward pass with optimizer step
- fewer inplace ops, giving better compilations and cleaner code
- scaling ("graft", "scale", "none") for Muon, allowing Adam#Muon at minimal cost
- `storage_dtype` argument is implemented again
- LaProp is correctly implemented, ADOPT is more stable
- via @ethansmith2000: cleaner, more maintainable `defaults`, reducing the surface for potential errors
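The `Adam#Muon` notation above reads as "step size from Adam, direction from Muon". As a rough illustration of what norm grafting means in general, the sketch below rescales one update so that it has the norm of another. It is not HeavyBall's code: the `graft` helper, the flattened-list representation, and the example values are all assumptions made for illustration, and the library's "graft" mode may differ in detail.

```python
import math


def norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def graft(magnitude_source, direction_source, eps=1e-12):
    """Hypothetical helper: keep the direction of `direction_source`,
    but rescale it to the norm of `magnitude_source`."""
    scale = norm(magnitude_source) / (norm(direction_source) + eps)
    return [scale * x for x in direction_source]


# Made-up per-parameter updates, flattened to lists for illustration.
adam_update = [0.03, -0.04, 0.0]  # supplies the step size
muon_update = [0.6, 0.0, -0.8]    # supplies the direction

grafted = graft(adam_update, muon_update)
print(grafted)
```

The extra cost on top of computing both updates is two norms and one multiply per tensor, which fits the "minimal cost" wording in the note.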
Stability, Muon and Fixes
- `utils`
  - bugfixes impacting SFAdamW and RMSProp
  - breaking: `zeroth_power_method` no longer supports `eigh` and doesn't allow specification of the number of newtonschulz iterations
  - faster newtonschulz5 (via @tysam-code)
  - PSGD preconditioner dampening (via @evanatyourservice)
- `chainable`
  - implementation of `nesterov_momentum`, `heavyball_momentum` and `orthogonalize_update`
- core
  - heavyball.Muon (by chaining `nesterov_momentum` and `orthogonalize_update`); Muon supports gradient and update clipping out of the box
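To make the "Muon by chaining" idea concrete, here is a minimal, dependency-free sketch of a Nesterov momentum step followed by a five-step quintic Newton-Schulz orthogonalization. The function names mirror the ones in the notes, but the signatures, the `state` dict, and the list-of-rows matrix format are assumptions for illustration and are not HeavyBall's actual API. The coefficients are the ones used in the public Muon reference code.

```python
import math


# Plain-Python matrix helpers (lists of rows) so the sketch has no dependencies.
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def axpby(alpha, a, beta, b):
    """Elementwise alpha * a + beta * b."""
    return [[alpha * x + beta * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def nesterov_momentum(state, grad, beta=0.9):
    """Hypothetical stand-in: update the buffer, then look ahead along it."""
    state["buf"] = axpby(beta, state["buf"], 1.0, grad)  # buf = beta * buf + grad
    return axpby(1.0, grad, beta, state["buf"])          # grad + beta * buf


def orthogonalize_update(update, steps=5, eps=1e-7):
    """Hypothetical stand-in: quintic Newton-Schulz iteration that pushes
    all singular values of `update` toward 1."""
    a, b, c = 3.4445, -4.7750, 2.0315
    fro = math.sqrt(sum(v * v for row in update for v in row))
    x = [[v / (fro + eps) for v in row] for row in update]
    tall = len(x) > len(x[0])
    if tall:  # iterate on the wide orientation so the Gram matrix is small
        x = transpose(x)
    for _ in range(steps):
        gram = matmul(x, transpose(x))
        poly = axpby(b, gram, c, matmul(gram, gram))  # b * A + c * A @ A
        x = axpby(a, x, 1.0, matmul(poly, x))         # a * X + poly @ X
    return transpose(x) if tall else x


def muon_step(state, grad):
    """Chain the two transforms: momentum first, then orthogonalization."""
    return orthogonalize_update(nesterov_momentum(state, grad))


state = {"buf": [[0.0, 0.0], [0.0, 0.0]]}
grad = [[3.0, 0.0], [0.0, 4.0]]
update = muon_step(state, grad)
print(update)
```

With these coefficients the iteration does not converge to exactly orthogonal output. It trades precision for speed, leaving singular values roughly in the 0.7 to 1.2 range after a few steps.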