The speed of drawing is not satisfactory #26

Open
lvjin521 opened this issue Oct 14, 2024 · 4 comments

Comments

@lvjin521

[3 images attached]

I have run into a new problem. I successfully built the project on a 4090 on the RunPod platform, but image generation did not become twice as fast. It still takes about 7000 milliseconds, the same as the original project. Could you tell me the reason? I can't solve this problem myself. Thank you very much.

@Viper373


I have also encountered the same problem and hope to receive a reply as soon as possible!!!

@aredden
Owner

aredden commented Oct 15, 2024

I am a bit confused. Where are you getting 3.32 iterations per second? Total generation time doesn't mean as much as the it/s speed. You also need to take into account the image size and the number of steps you decide to generate with 🤔

@lvjin521
Author

{
  "prompt": "A detailed and adorable illustration of a small dog. The dog should be fluffy with big, expressive eyes, floppy ears, and a playful expression. It should be sitting on the ground with its tail wagging slightly, surrounded by a warm, cozy environment that enhances the cuteness of the scene. The colors should be soft and gentle, with warm lighting that makes the dog look even more endearing.",
  "width": 1024,
  "height": 1024,
  "num_steps": 24,
  "guidance": 3.5,
  "seed": 2
}

These are the generation parameters I used. The final generation time is 7000 ms, not 300 ms.
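A quick back-of-the-envelope check may help relate the numbers in this thread. The sketch below assumes that total generation time is roughly the step count divided by the iteration rate, ignoring text encoding, VAE decoding, and other overhead. The function name `estimate_seconds` is illustrative only and not part of the project.

```python
def estimate_seconds(num_steps: int, its_per_second: float) -> float:
    """Rough denoising time: number of steps divided by iterations per second."""
    if its_per_second <= 0:
        raise ValueError("its_per_second must be positive")
    return num_steps / its_per_second


# Values taken from this thread: 24 steps at 3.32 it/s.
per_step = estimate_seconds(1, 3.32)
total = estimate_seconds(24, 3.32)

print(f"per step: {per_step * 1000:.0f} ms")
print(f"24 steps: {total * 1000:.0f} ms")
```

Under that assumption, a single step takes about 0.3 s and the full 24-step image about 7.2 s, which is in line with the reported 7000 ms.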

@aredden
Owner

aredden commented Oct 18, 2024

The speeds you are getting look normal to me. The model does 3.32 forward passes per second, which is relatively close to what the max TFLOPS of a 4090 allows when generating an image at 1024x1024.

If you want more speed, you can shrink the image by setting height or width to less than 1024. You can also use schnell, which generates an image in 4 steps instead of 24, at somewhat lower quality. Another option is a Flux hyper LoRA, which lets you reduce the number of steps to ~8.

The speeds that H100s get are very different from the speeds you get with a 4090. An H100's max TFLOPS for fp8 is absolutely gigantic, around 1500-2000 TFLOPS, versus a 4090, which "only" (still a lot) gets ~330 TFLOPS with fp8.
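As a rough illustration of the options above, the sketch below estimates generation time for each step count. It assumes the 3.32 it/s rate stays the same across all options, which may not hold for a different model variant or with a LoRA loaded, and it ignores fixed overhead. It also computes the ratio of the quoted peak fp8 figures; real-world speedups will not necessarily match that ratio.

```python
ITS_PER_SECOND = 3.32  # rate reported in this thread for a 4090 at 1024x1024

# Step counts mentioned above; the labels are descriptive only.
options = {
    "24 steps (current settings)": 24,
    "~8 steps (hyper LoRA)": 8,
    "4 steps (schnell)": 4,
}

# Assumption: the iteration rate is identical for every option.
estimates = {label: steps / ITS_PER_SECOND for label, steps in options.items()}
for label, seconds in estimates.items():
    print(f"{label}: ~{seconds:.1f} s")

# Ratio of the quoted peak fp8 throughput figures (H100 vs 4090).
RTX_4090_TFLOPS = 330
h100_ratio_low = 1500 / RTX_4090_TFLOPS
h100_ratio_high = 2000 / RTX_4090_TFLOPS
print(f"H100 peak fp8 is roughly {h100_ratio_low:.1f}x to {h100_ratio_high:.1f}x a 4090")
```

Reducing the image size would help as well, but how much depends on how the iteration rate scales with resolution, so it is left out of this estimate.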
