
not run #44

Open
ayttop opened this issue Oct 27, 2024 · 30 comments

Comments

@ayttop commented Oct 27, 2024

It does not run on Colab T4.

from OmniGen import OmniGenPipeline
import torch
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import transformers
transformers.logging.set_verbosity_error()
device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1", device_map=device)

Text to Image

images = pipe(
    prompt="A curly-haired man in a red shirt is drinking tea.",
    height=768,
    width=512,
    guidance_scale=1,
    seed=0,
    separate_cfg_infer=True,
    num_inference_steps=1,
    num_images_per_prompt=1,
    use_kv_cache=True,
)
images[0].save("example_t2i.png")  # save output PIL Image



TypeError Traceback (most recent call last)
in <cell line: 8>()
6 transformers.logging.set_verbosity_error()
7 device = "cuda" if torch.cuda.is_available() else "cpu"
----> 8 pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1", device_map=device)
9
10 # Text to Image

TypeError: OmniGenPipeline.from_pretrained() got an unexpected keyword argument 'device_map'

@ayttop (Author) commented Oct 27, 2024

It also does not run with accelerate/bitsandbytes:
from OmniGen import OmniGenPipeline
from accelerate import init_empty_weights
import bitsandbytes as bnb

# Initialize the model with empty weights to save memory
with init_empty_weights():
    pipe = OmniGenPipeline.from_pretrained(
        "Shitao/OmniGen-v1",
        device_map="auto",        # Automatically maps model layers to available devices
        torch_dtype=bnb.float16,  # Set data type for bitsandbytes
        load_in_4bit=True,        # Load model in 4-bit precision using bitsandbytes
    )

@staoxiao (Contributor)

Current code doesn't support quantization. We will consider this in the future.

@able2608

Apparently someone did try to implement quantization; however, it is still a WIP and might be somewhat fiddly to use. Check out this PR if you are interested: #29.
After downloading it you may need to tweak some files, as discussed in the PR thread, to get it to work. Also, Colab system RAM (yes, RAM, not VRAM) is capped at 12 GB for free-tier users, so the quantization process will be slow at best and will probably OOM outright for now. It nearly filled the 16 GB of RAM on my Windows 11 system and required extensive offloading to disk while quantizing. However, judging from the VRAM usage on my system, once quantization is done the model might fit in the T4's VRAM. You may want to wait for the code to be optimized further.
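For anyone attempting this on Colab, a quick way to check how much headroom is actually available before starting the quantization (a minimal sketch; psutil and torch come preinstalled on Colab):

import psutil
import torch

# Free system RAM -- this is what the quantization step described above is bound by
print(f"Free RAM: {psutil.virtual_memory().available / 1e9:.1f} GB")

# Free / total VRAM on the T4, if a GPU runtime is attached
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()
    print(f"Free VRAM: {free / 1e9:.1f} / {total / 1e9:.1f} GB")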

@nitinmukesh commented Oct 28, 2024

@able2608 @staoxiao

It is working on low VRAM.

[screenshot: vlcsnap-2024-10-29-00h58m15s559]

Try this: https://www.youtube.com/watch?v=9ZXmXA2AJZ4

@ayttop (Author) commented Oct 28, 2024

2024-10-28 22:00:46.180633: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-28 22:00:46.455676: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-28 22:00:46.544035: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-28 22:00:47.043118: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-28 22:00:49.018393: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run gradio deploy from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
Fetching 10 files: 100% 10/10 [00:00<00:00, 124460.06it/s]

But it does not work on Colab T4.

@ayttop (Author) commented Oct 28, 2024

!git clone https://github.com/Manni1000/OmniGen.git

%cd OmniGen

!pip install -e .

!pip install gradio spaces

!apt install net-tools -y

!netstat -an | grep 7860

from google.colab import output

!python /content/OmniGen/app.py

@ayttop (Author) commented Oct 28, 2024

!pip install -r /content/OmniGen/requirements.txt

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
gcsfs 2024.6.1 requires fsspec==2024.6.1, but you have fsspec 2024.5.0 which is incompatible.
torchaudio 2.1.1+cu121 requires torch==2.1.1, but you have torch 2.3.1+cu121 which is incompatible.
Successfully installed fsspec-2024.5.0 torch-2.3.1+cu121 torchvision-0.18.1+cu121 triton-2.3.1

@ayttop (Author) commented Oct 28, 2024

2024-10-28 22:29:35.060112: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-28 22:29:35.093217: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-28 22:29:35.103173: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-28 22:29:35.126043: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-28 22:29:36.350897: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run gradio deploy from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
Fetching 10 files: 100% 10/10 [00:00<00:00, 50472.97it/s]
[screenshot: Screenshot 2024-10-28 153022]

@ayttop (Author) commented Oct 28, 2024

It does not run on the Colab T4 GPU.


@ayttop (Author) commented Oct 28, 2024

[screenshot: Screenshot 2024-10-28 153256]

@ayttop (Author) commented Oct 28, 2024

On Colab TPU:

!python /content/OmniGen/app.py
/usr/local/lib/python3.10/dist-packages/gradio/utils.py:980: UserWarning: Expected 11 arguments for function <function generate_image at 0x7d4a6ec5e290>, received 10.
warnings.warn(
/usr/local/lib/python3.10/dist-packages/gradio/utils.py:984: UserWarning: Expected at least 11 arguments for function <function generate_image at 0x7d4a6ec5e290>, received 10.
warnings.warn(

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run gradio deploy from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
/usr/local/lib/python3.10/dist-packages/gradio/helpers.py:987: UserWarning: Unexpected argument. Filling with None.
warnings.warn("Unexpected argument. Filling with None.")
Fetching 10 files: 100% 10/10 [00:00<00:00, 93832.30it/s]
Loading safetensors
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 2018, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1567, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 846, in wrapper
    response = f(*args, **kwargs)
  File "/content/OmniGen/app.py", line 51, in generate_image
    output = pipe(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/content/OmniGen/OmniGen/pipeline.py", line 189, in __call__
    generator = torch.Generator(device=self.device).manual_seed(seed)
RuntimeError: manual_seed expected a long, but got bool
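For reference, torch.Generator.manual_seed only accepts an integer, so a bool reaching it is consistent with the Gradio warnings above about a mismatched argument count (11 expected, 10 received), which can shift a checkbox value into the seed slot. A minimal reproduction of just this error, outside the app:

import torch

gen = torch.Generator(device="cpu")
gen.manual_seed(0)     # fine: integer seed
gen.manual_seed(True)  # RuntimeError: manual_seed expected a long, but got bool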

@ayttop (Author) commented Oct 28, 2024

[screenshot: Screenshot 2024-10-28 160844]

@werruww commented Oct 29, 2024

Collecting cloud-tpu-client==0.10
Downloading cloud_tpu_client-0.10-py3-none-any.whl.metadata (1.2 kB)
Collecting torch==1.13.0
Downloading torch-1.13.0-cp310-cp310-manylinux1_x86_64.whl.metadata (23 kB)
Collecting torchvision==0.14.0
Downloading torchvision-0.14.0-cp310-cp310-manylinux1_x86_64.whl.metadata (11 kB)
Collecting torchtext==0.14.0
Downloading torchtext-0.14.0-cp310-cp310-manylinux1_x86_64.whl.metadata (6.9 kB)
ERROR: Could not find a version that satisfies the requirement torch_xla==1.13 (from versions: 2.1.0rc5, 2.1.0, 2.2.0, 2.3.0, 2.4.0, 2.5.0)
ERROR: No matching distribution found for torch_xla==1.13

@yuezewang (Collaborator)

(quoting the original post)
not run on colab t4
[...]
TypeError: OmniGenPipeline.from_pretrained() got an unexpected keyword argument 'device_map'

Hello, you should remove the device_map=device argument:

# The pipeline will detect a valid GPU device automatically
pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")  # just remove ', device_map=device'
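For completeness, a minimal end-to-end run once the pipeline is loaded without device_map (generation parameters copied from the original post above):

from OmniGen import OmniGenPipeline

pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")  # no device_map

images = pipe(
    prompt="A curly-haired man in a red shirt is drinking tea.",
    height=768,
    width=512,
    guidance_scale=1,
    seed=0,
    separate_cfg_infer=True,
    num_inference_steps=1,
    num_images_per_prompt=1,
    use_kv_cache=True,
)
images[0].save("example_t2i.png")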

@werruww commented Oct 31, 2024

The problem is that I want to run it on a Colab T4, where system RAM is only 12 GB, so I want to either quantize it, save it, and then use it on the T4, or load it with accelerate and device_map=device.

@werruww commented Oct 31, 2024

@yuezewang

It does not run on the Colab T4 GPU.

Your session crashed after using all available RAM.

from OmniGen import OmniGenPipeline

pipe = OmniGenPipeline.from_pretrained("goodasdgood/OmniGen_quantization")

# Text to Image
images = pipe(
    prompt="A curly-haired man in a red shirt is drinking tea.",
    height=1024,
    width=1024,
    guidance_scale=2.5,
    seed=0,
)
images[0].save("example_t2i.png")  # save output PIL Image

@werruww commented Oct 31, 2024

@yuezewang

Where is the path to the quantized model?

@Ordoumpozanis commented Nov 1, 2024

In order to run it and bypass the device error, I just exposed the device globally in pipeline.py:

# Define device globally (optional)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f'Device = {device}')

class OmniGenPipeline:
    def __init__(
        self,
        vae: AutoencoderKL,
        model: OmniGen,
        processor: OmniGenProcessor,
    ):
        self.vae = vae
        self.model = model
        self.processor = processor
        self.model.to(torch.bfloat16)
        self.model.eval()
        self.vae.eval()

        self.model_cpu_offload = False

Then replace every self.device with device, which is now global, and it will work.
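A slightly less invasive variant of the same idea (a sketch, assuming the rest of pipeline.py only reads the device through self.device) is to assign the global device onto the instance in __init__, so no find-and-replace is needed:

import torch

# Detect the device once, globally, as above
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class OmniGenPipeline:
    def __init__(self, vae, model, processor):
        self.vae = vae
        self.model = model
        self.processor = processor
        self.device = device  # existing self.device reads keep working unchanged
        self.model.to(torch.bfloat16)
        self.model.eval()
        self.vae.eval()
        self.model_cpu_offload = False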

@Qarqor5555555

Question: if quantization is added to the loading function, is the quantized model then not stored on the hard disk? Is this method different from converting a model to 4-bit, like Unsloth?

@Qarqor5555555

Question: does applying quantization in the loading function differ from the method of converting the model to 4-bit, like Unsloth?

@ayttop (Author) commented Nov 1, 2024

NameError: name 'is_torch_npu_available' is not defined. Did you mean: 'is_torch_xla_available'?

@ayttop (Author) commented Nov 1, 2024

from OmniGen import OmniGenPipeline
import torch

pipe = OmniGenPipeline.from_pretrained("C:/Users/m/Desktop/4/OmniGen-v1")

# Text to Image
images = pipe(
    prompt="car.",
    height=64,
    width=64,
    num_inference_steps=2,
    guidance_scale=2,
    seed=0,
)
images[0].save("example_t2i.png")  # save output PIL Image

NameError: name 'is_torch_npu_available' is not defined. Did you mean: 'is_torch_xla_available'?

@ronfromhp

(quoting @nitinmukesh) It is working on low VRAM ... try this https://www.youtube.com/watch?v=9ZXmXA2AJZ4

Wait, how are you getting it 30 times faster than mine?

[screenshot]

This is for the exact same prompt.

@staoxiao (Contributor) commented Nov 2, 2024

@ronfromhp, do you have a GPU? Running on CPU is very slow. You can try the latest code and refer to https://github.com/VectorSpaceLab/OmniGen/blob/main/docs/inference.md#requiremented-resources for inference times.

@ronfromhp commented Nov 2, 2024

@staoxiao, I have an RTX 4050 laptop GPU with 6 GB of VRAM, so it must be running slowly because of that. But I tried the forked repo of the person I was replying to (#44 (comment)), and he seems to have a quantized model working that is roughly 50-100 times faster on my GPU.

@nitinmukesh

@ronfromhp

Can you confirm that my fork is working fine for you and that generation is fast? Other viewers of my channel have confirmed that it works well.

@ronfromhp

@nitinmukesh, up to a certain point it is fast, but it breaks down past that point, for example if I give it two input image prompts and ask for a 1080p output. Then it falls back to 280 sec/step. I'd describe it as a sigmoid curve: if you exceed a certain threshold, it becomes roughly 50 times slower.

@NormalMultiaccount

Has anyone gotten OmniGen to run on Colab?
