The speed of drawing is not satisfactory #26
Comments
I am a bit confused. Where are you getting 3.32 iterations per second? Total generation time doesn't mean as much as the it/s speed. You also need to take into account the image size and the number of steps you decide to generate with 🤔
These are the generation parameters I used, and the final generation time is 7000 ms, not 300 ms.
The speeds you are getting look normal to me. The model does 3.32 forward passes per second, which is relatively close to the max TFLOPS of a 4090 if you're generating an image at 1024x1024. If you want more speed, you can shrink the image by setting height or width to less than 1024. Or you can use schnell, which generates an image in 4 steps instead of 24, at a bit less quality. Another thing you can try is a Flux hyper LoRA, which lets you reduce the number of steps to ~8. The speeds that H100s get are very different from the speeds that you get with a 4090. The H100's max fp8 throughput is absolutely gigantic, around 1500-2000 TFLOPS, vs. a 4090, which "only" (still a lot) gets ~330 TFLOPS with fp8.
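To make the arithmetic behind that explicit, here is a rough sketch of how it/s and step count translate into sampling time. The helper name `estimated_seconds` is made up for illustration and is not part of this project. It assumes the it/s rate stays at the 3.32 reported in this thread for every step count, which may not hold exactly for schnell or a hyper LoRA, and it ignores text encoding, VAE decode, and any network overhead.

```python
def estimated_seconds(steps: int, its_per_second: float) -> float:
    """Estimate sampling-loop time only (hypothetical helper).

    Ignores text encoding, VAE decode, model loading, and request overhead.
    """
    if its_per_second <= 0:
        raise ValueError("its_per_second must be positive")
    return steps / its_per_second


if __name__ == "__main__":
    # 3.32 it/s is the rate reported in this thread for 1024x1024 on a 4090.
    # Assumption: the same rate applies to the other step counts.
    rate = 3.32
    for label, steps in [
        ("24 steps", 24),
        ("hyper LoRA, ~8 steps", 8),
        ("schnell, 4 steps", 4),
    ]:
        print(f"{label}: {estimated_seconds(steps, rate):.2f} s")
```

At 24 steps this comes to roughly 7.2 s, which lines up with the ~7000 ms reported above, so the total time is what you would expect from the step count rather than a sign that something is broken.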
I have run into a new problem. I successfully built the project on a 4090 on the RunPod platform, but image generation did not become twice as fast. It still takes 7000 milliseconds, the same as the original project. Please tell me the reason; I can't solve this problem. Thank you very much.