I have this predict function:
def predict(self) -> Any:
    """Run a single prediction on the model"""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # Report total VRAM in GiB before loading the pipeline
    vram = int(torch.cuda.get_device_properties(0).total_memory / (1024 * 1024 * 1024))
    print("VRAM", vram)
    # Load FLUX in bfloat16, move it to the GPU, then enable CPU offload
    pipe = FluxPipeline.from_pretrained(flux_path, torch_dtype=torch.bfloat16).to(device)
    pipe.enable_model_cpu_offload()
    prompt = "A cat holding a sign that says hello world"
    image = pipe(
        prompt,
        height=1024,
        width=1024,
        guidance_scale=3.5,
        num_inference_steps=50,
        max_sequence_length=512,
        generator=torch.Generator("cpu").manual_seed(0),
    ).images[0]
    image.save("flux-dev.png")
    return "flux-dev.png"
I have 24GB of VRAM (the vram variable reports 23) on an NVIDIA GeForce RTX 4090.
But when I run sudo cog predict --setup-timeout 3600 I get an out-of-memory error. Flux should be able to run in 22GB, so I wonder if it is something related to cog/WSL/Docker?
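To rule out the container seeing less of the GPU than the host does, I can also print the free and total memory PyTorch reports from inside the container. A minimal sketch of that check (report_gpu_memory is just a debugging helper I made up; torch.cuda.mem_get_info() returns free and total bytes for the current device):

import torch

def report_gpu_memory() -> None:
    # Hypothetical helper: print what GPU memory this process/container can actually see.
    if not torch.cuda.is_available():
        print("CUDA is not available inside the container")
        return
    free_bytes, total_bytes = torch.cuda.mem_get_info()  # (free, total) in bytes for the current device
    gib = 1024 ** 3
    print(f"GPU memory: {free_bytes / gib:.1f} GiB free of {total_bytes / gib:.1f} GiB total")

If the total shown there is well below 24GB, or the free amount is already low before the pipeline is loaded, that would point at the WSL/Docker setup rather than the model itself.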