Initial Delay in Image Generation with Flux Schnell on H100 #24
Comments
The slowdown comes from torch.compile compilation. Generation should speed up once compilation finishes, but the initial generation may take a while, and so may the first generation at each newly requested image shape. The initial slowdown is much more reasonable with torch nightly, or any torch > 2.4.x, since I believe compilation was made quite a bit faster there, or at least it is faster on my machine. I barely notice compilation time anymore, though I do have a beefy computer.
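Since the compile cost is paid on the first call and again for each new image shape, one way to keep it away from real requests is to warm up every resolution you expect to serve at startup. Below is a minimal sketch of that idea, not this repo's API: `warm_up_shapes`, `make_fake_generate`, and the `generate(width, height)` callable are hypothetical names, and the fake generator only imitates a slow first call per shape so the example runs without a GPU.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

Shape = Tuple[int, int]  # (width, height)


def warm_up_shapes(
    generate: Callable[[int, int], object],
    shapes: Iterable[Shape],
    runs_per_shape: int = 2,
) -> Dict[Shape, List[float]]:
    """Call `generate` a few times per shape and record wall-clock seconds.

    `generate` is a hypothetical stand-in for whatever function produces an
    image at a given width and height. With a real GPU pipeline you would
    also need to synchronize the device before reading the clock.
    """
    timings: Dict[Shape, List[float]] = {}
    for width, height in shapes:
        per_shape: List[float] = []
        for _ in range(runs_per_shape):
            start = time.perf_counter()
            generate(width, height)
            per_shape.append(time.perf_counter() - start)
        timings[(width, height)] = per_shape
    return timings


def make_fake_generate(compile_seconds: float = 0.05) -> Callable[[int, int], Shape]:
    """Build a fake generator whose first call per shape is artificially slow."""
    seen = set()

    def fake_generate(width: int, height: int) -> Shape:
        if (width, height) not in seen:
            seen.add((width, height))
            time.sleep(compile_seconds)  # imitates one-time compilation
        return (width, height)

    return fake_generate


if __name__ == "__main__":
    results = warm_up_shapes(make_fake_generate(), [(1024, 1024), (768, 1344)])
    for shape, times in results.items():
        print(shape, [f"{t:.3f}s" for t in times])
```

In this sketch the first timing for each shape includes the simulated compile and the second does not, which mirrors the behavior described above: slow once per shape, fast afterwards.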
Thanks so much for your reply! I really appreciate it.
I have tested this slowdown on an H100 and an RTX 4090. It is around 1 minute with regular torch and around 3-7 seconds with torch nightly.
Yeah, so essentially using nightly is significantly better.
I'm still experiencing a slowdown with the initial compilation on an H100 with torch nightly builds (2.6.0.dev20240918+cu124).
I think it depends. Compilation cost varies with the torch version. I think nightly was 2.5.0 or 2.5.1 at the time, though I'm not sure. So it could be that you need one of those two versions for the fastest compile time.
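When comparing compile times across installs, it helps to know which release series each one belongs to. The following is a small sketch, assuming version strings shaped like the one quoted above; `release_series` and `newer_than_2_4` are hypothetical helper names, and the 2.4 threshold is only the "torch > 2.4.x" rule of thumb from this thread, not a guarantee about any particular build.

```python
import re
from typing import Tuple


def release_series(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.6.0.dev20240918+cu124'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def newer_than_2_4(version: str) -> bool:
    """True if the version is in a release series after 2.4 (for example 2.5 or 2.6)."""
    return release_series(version) > (2, 4)


if __name__ == "__main__":
    # With torch installed, you could pass str(torch.__version__) instead.
    for v in ("2.4.1", "2.5.0", "2.6.0.dev20240918+cu124"):
        print(v, release_series(v), newer_than_2_4(v))
```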
Thank you very much for your incredible work @aredden!
I wanted to ask you about something I've noticed when using Flux Schnell on an H100 with compile_extras and compile_blocks. After running the three warmups of Flux Schnell, the first image I generate takes about 45 seconds to start its first iteration, but subsequent images generate quickly. Is this normal? Is there any way to avoid this initial delay?
I appreciate your help in advance.