The speed of drawing is not satisfactory #26

lvjin521 · 2024-10-14T05:54:10Z

I have encountered a new problem. I successfully built the project on a 4090 on the RunPod platform, but image generation did not become twice as fast. It still takes about 7000 milliseconds, the same as the original project. Please tell me the reason; I can't solve this problem myself. Thank you very much.

Viper373 · 2024-10-15T07:00:32Z

I have also encountered the same problem and hope to receive a reply as soon as possible!!!

aredden · 2024-10-15T15:34:50Z

I am a bit confused. Where are you getting 3.32 iterations per second? Total generation time doesn't mean as much as the it/s speed. You also need to take into account the image size and the number of steps you decide to generate with 🤔

lvjin521 · 2024-10-16T01:13:55Z

{
  "prompt": "A detailed and adorable illustration of a small dog. The dog should be fluffy with big, expressive eyes, floppy ears, and a playful expression. It should be sitting on the ground with its tail wagging slightly, surrounded by a warm, cozy environment that enhances the cuteness of the scene. The colors should be soft and gentle, with warm lighting that makes the dog look even more endearing.",
  "width": 1024,
  "height": 1024,
  "num_steps": 24,
  "guidance": 3.5,
  "seed": 2
}

These are the generation parameters I used. The final generation time is 7000 ms, not 300 ms.
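For reference, the numbers above can be converted into an average it/s figure. This is a minimal sketch, not part of the project: the function name `iterations_per_second` is hypothetical, and it assumes the whole 7000 ms is spent on the 24 denoising steps. In practice the total probably also includes text encoding and image decoding, so the real per-step rate would be somewhat higher.

```python
def iterations_per_second(num_steps: int, total_ms: float) -> float:
    """Average denoising steps per second over the whole request.

    Assumption: total_ms covers only the denoising loop. Any fixed
    overhead (prompt encoding, decoding, network) makes this an
    underestimate of the true per-step rate.
    """
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    return num_steps / (total_ms / 1000.0)


# Values reported in this thread: 24 steps, about 7000 ms in total.
rate = iterations_per_second(num_steps=24, total_ms=7000)
print(f"{rate:.2f} it/s")  # prints "3.43 it/s"
```

A rate in this range, rather than the raw 7000 ms, is the number to compare between setups, since total time changes with the step count and image size.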

aredden · 2024-10-18T00:08:32Z

The speeds you are getting look normal to me. The model does 3.32 forward passes per second, which is relatively close to the max TFLOPS of a 4090 when generating an image at 1024x1024.

If you want more speed, you can shrink the image by setting height or width to less than 1024. You can also use schnell, which generates an image in 4 steps instead of 24, at slightly lower quality. Another option is a Flux hyper LoRA, which lets you reduce the number of steps to ~8.

The speeds that H100s get are very different from the speeds you get with a 4090. An H100's max fp8 throughput is absolutely gigantic, around 1500-2000 TFLOPS, vs. a 4090, which "only" (still a lot) gets ~330 TFLOPS with fp8.
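As a rough illustration of the step-count options, the sketch below estimates generation time from the 3.32 it/s figure. It is a back-of-the-envelope calculation under stated assumptions: the helper `estimated_seconds` is hypothetical, the per-step cost is assumed to be the same for schnell and for a hyper LoRA as for the 24-step run at 1024x1024, and fixed overhead outside the denoising loop is ignored.

```python
IT_PER_S_4090 = 3.32  # rate quoted in this thread for 1024x1024 on a 4090


def estimated_seconds(num_steps: int, it_per_s: float = IT_PER_S_4090) -> float:
    """Estimated denoising time, assuming a constant per-step cost."""
    if it_per_s <= 0:
        raise ValueError("it_per_s must be positive")
    return num_steps / it_per_s


# Step counts mentioned in this thread.
for label, steps in [
    ("24 steps (settings in this thread)", 24),
    ("hyper LoRA, ~8 steps", 8),
    ("schnell, 4 steps", 4),
]:
    print(f"{label}: ~{estimated_seconds(steps):.2f} s")

# Ratio of the peak fp8 figures quoted above (approximate peak numbers,
# not measured throughput).
H100_FP8_TFLOPS = (1500, 2000)
RTX4090_FP8_TFLOPS = 330
low, high = (t / RTX4090_FP8_TFLOPS for t in H100_FP8_TFLOPS)
print(f"H100 / 4090 peak fp8 ratio: ~{low:.1f}x to ~{high:.1f}x")
```

Under these assumptions the estimates come out to roughly 7.2 s, 2.4 s, and 1.2 s, and the quoted peak figures put an H100 at about 4.5x to 6x a 4090. Real speedups depend on how close each card gets to its peak.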
