Apex FusedAdam vs PyTorch Adam #42
Comments
Cool, thank you! I did not know about this one and need to test it out. It would be interesting to see whether it affects performance. Or do you mean that it minimizes the loss more effectively? In that case the optimizer would need fewer training steps, which could speed up training considerably. What exactly is your training setup for the garden scene (resolution, iterations)?
My Python-based implementation trains the default mipNeRF360 garden scene configuration (i.e. 30k iterations at a resolution of 1297x840) in 16:57 (29.48 it/s). This is after replacing PyTorch Adam with FusedAdam. The change is literally just installing the apex library, making all parameters contiguous when initializing them, removing the set_to_none=True argument (apex FusedAdam does this by default), and then replacing torch.optim.Adam() with apex.optimizers.FusedAdam(). As far as I know, FusedAdam applies the updates to all parameters at once (hence the term "fused"). So the improvement is not due to better optimization but purely due to a more performant implementation.
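For anyone who wants to try the swap, here is a minimal sketch of the steps described above. It is not the code from either implementation: `make_adam`, `demo_step`, the tensor shape, and the learning rate are hypothetical names and values chosen for illustration. It assumes apex was built with its CUDA extensions and falls back to torch.optim.Adam otherwise. Passing `adam_w_mode=False` is my own addition so that FusedAdam follows plain Adam semantics rather than its AdamW default.

```python
import torch


def make_adam(param_groups, lr=1e-3, eps=1e-15):
    """Return apex FusedAdam when it is usable, otherwise torch.optim.Adam.

    Hypothetical helper, not part of apex or PyTorch.
    """
    if torch.cuda.is_available():
        try:
            from apex.optimizers import FusedAdam
            # adam_w_mode=False selects plain Adam instead of the AdamW default.
            # set_grad_none defaults to True, so zero_grad() needs no argument.
            return FusedAdam(param_groups, lr=lr, eps=eps, adam_w_mode=False)
        except (ImportError, RuntimeError):
            # apex is missing or was built without its CUDA extensions.
            pass
    return torch.optim.Adam(param_groups, lr=lr, eps=eps)


def demo_step():
    """Run one optimizer step on a dummy parameter; return it before and after."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # A transposed view is non-contiguous, so copy it into contiguous memory
    # before wrapping it in a Parameter.
    raw = torch.randn(3, 1000, device=device).t()
    positions = torch.nn.Parameter(raw.contiguous())

    optimizer = make_adam([{"params": [positions], "lr": 1.6e-4}])

    before = positions.detach().clone()
    loss = (positions ** 2).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()  # no set_to_none argument, as described above
    return before, positions


if __name__ == "__main__":
    before, after = demo_step()
    print("parameter updated:", not torch.equal(before, after.detach()))
```

The fallback only keeps the sketch runnable on machines without apex or a GPU. In a real training script you would probably want the import to fail loudly instead.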
Hey, this looks like a cool project!
I was wondering whether you have tried replacing PyTorch Adam with the FusedAdam implementation from NVIDIA Apex (https://nvidia.github.io/apex/optimizers.html). Note that while PyTorch itself includes a fused Adam implementation, I found that it did not work properly when I first tested it earlier this year.
I just tested this (PyTorch Adam -> Apex FusedAdam) within my re-implementation of the official Python implementation. On my 4090, training for the garden scene went from 24 minutes to about 18 minutes, i.e. a 25% reduction in training time, or roughly 33% higher throughput.
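To make the numbers easier to compare, here is a small sketch of how the 24 and 18 minute timings translate into a speedup. The helper name `compare_runs` is hypothetical, and the inputs are the rounded values quoted above.

```python
def compare_runs(baseline_min, new_min):
    """Return (fraction of wall-clock time saved, throughput speedup factor)."""
    time_saved = (baseline_min - new_min) / baseline_min
    speedup = baseline_min / new_min
    return time_saved, speedup


time_saved, speedup = compare_runs(24.0, 18.0)
print(f"{time_saved:.0%} less wall-clock time, {speedup:.2f}x throughput")
# prints "25% less wall-clock time, 1.33x throughput"
```

So "33%" describes the gain in iterations per second, while the training time itself drops by a quarter.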
As I prefer Python-based implementations for research-driven development, I will not look into this for your project myself, so I thought I would just let you know.