Hi,

Thanks so much for your work. I'm using an H100 for acceleration experiments, but I can only achieve around 6 it/s for Flux-dev inference. Here's the demo script I'm running:
```python
import io
import time

import torch

from flux_pipeline import FluxPipeline

pipe = FluxPipeline.load_pipeline_from_config_path(
    "configs/config-dev-prequant.json"  # or whatever your config is
)

# compile model
pipe.model.to(memory_format=torch.channels_last)
pipe.model = torch.compile(pipe.model)

for i in range(10):
    output_jpeg_bytes: io.BytesIO = pipe.generate(
        # Required args:
        prompt="A beautiful asian woman in traditional clothing with golden hairpin and blue eyes, wearing a red kimono with dragon patterns",
        # Optional args:
        width=1024,
        height=1024,
        num_steps=20,
        guidance=3.5,
        seed=13456,
        strength=0.8,
    )
```
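(For reference, since `time` is imported above but unused, here's a rough way to turn that loop into an it/s measurement. This is a minimal sketch that reuses `pipe` from the script and assumes `pipe.generate` blocks until the JPEG bytes are ready:)

```python
import time

# Rough throughput check; pipe.generate returns encoded JPEG bytes,
# so the GPU work has finished by the time each call returns.
runs, steps_per_run = 10, 20
start = time.time()
for _ in range(runs):
    pipe.generate(
        prompt="A beautiful asian woman in traditional clothing",
        width=1024,
        height=1024,
        num_steps=steps_per_run,
        guidance=3.5,
        seed=13456,
    )
elapsed = time.time() - start
# Includes text-encoder and VAE-decode time, so this slightly
# underestimates the it/s shown by the sampler's progress bar.
print(f"~{runs * steps_per_run / elapsed:.2f} it/s")
```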
Would you please have a look and see if there's any mistake I might have made? Thanks!
That is interesting. I would try setting the flow_dtype to bfloat16, since otherwise it'll use my torch-cublas-hgemm kernels, which only really give speedups on consumer GPUs. Also, you shouldn't compile the model yourself before inference. Since you've set 'compile blocks' and 'compile extras' to true (which you have), the model will get compiled on its own; I'd recommend just letting that happen instead of calling torch.compile manually.
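For example, the relevant fields in the config JSON would look roughly like this. The key names below (`flow_dtype`, `compile_blocks`, `compile_extras`) are inferred from the wording above, so double-check them against the example configs shipped with the repo:

```json
{
    "flow_dtype": "bfloat16",
    "compile_blocks": true,
    "compile_extras": true
}
```

With those set, the `pipe.model = torch.compile(pipe.model)` line can simply be dropped from the demo script.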