Cant get maxperf running #10

blacklig · 2023-12-06T22:11:40Z

Hi, thanks for the great project, but I am struggling to get it running.

I have NVIDIA driver 530, CUDA 12.1, torch 2.1.0, Python 3.10, xformers 0.22, so I would say all requirements are met...

I installed the stable-fast version corresponding to my setup via pip3: https://github.com/chengzeyi/stable-fast/releases/download/v0.0.13.post3/stable_fast-0.0.13.post3+torch210cu121-cp310-cp310-manylinux2014_x86_64.whl

Yet when I try to run it I get this nasty error. I don't know where to start or what could be wrong, as it looks like it is coming from stable-fast.

Does anyone have a clue what might be wrong?

(venv) sd@sd:~/Playground/ArtSpew$ python3 maxperf.py
Loading pipeline components...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 15.28it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at huggingface/diffusers#254 .
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/cuda/graphs.py:88: UserWarning: The CUDA Graph is empty. This usually means that the graph was attempted to be captured on wrong device or stream. (Triggered internally at ../aten/src/ATen/cuda/CUDAGraph.cpp:192.)
super().capture_end()
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:159: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
obj_type = tensors[start].item()
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:218: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size = tensors[start].item()
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:228: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
size = tensors[start].item()
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:214: TracerWarning: Converting a tensor to a Python list might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
return bytes(tensors[start].tolist()), start + 1
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py:66: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if input_shape[-1] > 1 or self.sliding_window is not None:
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py:137: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if past_key_values_length > 0:
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py:273: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py:281: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if causal_attention_mask.size() != (bsz, 1, tgt_len, src_len):
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py:313: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:23: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
return torch.tensor([num], dtype=torch.int64)
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:253: TracerWarning: torch.Tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
return super().new(cls, x, *args, **kwargs)
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:123: TracerWarning: torch.as_tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
return (torch.as_tensor(tuple(obj), dtype=torch.uint8), )
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/schedulers/scheduling_euler_discrete.py:353: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
if len(index_candidates) > 1:
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/schedulers/scheduling_euler_discrete.py:358: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
self._step_index = step_index.item()
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/utils/flat_tensors.py:197: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
return bool(tensors[start].item()), start + 1
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py:878: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if dim % default_overall_up_factor != 0:
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/models/resnet.py:265: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert hidden_states.shape[1] == self.channels
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/models/resnet.py:271: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert hidden_states.shape[1] == self.channels
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/models/resnet.py:173: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert hidden_states.shape[1] == self.channels
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/models/resnet.py:186: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if hidden_states.shape[0] >= 64:
/usr/bin/ld: skipping incompatible /lib/i386-linux-gnu/libcuda.so when searching for -lcuda
/usr/bin/ld: skipping incompatible /lib/i386-linux-gnu/libcuda.so when searching for -lcuda
/usr/bin/ld: cannot find -lcuda: No such file or directory
/usr/bin/ld: skipping incompatible /lib/i386-linux-gnu/libcuda.so when searching for -lcuda
/usr/bin/ld: skipping incompatible /lib/i386-linux-gnu/libcuda.so when searching for -lcuda
collect2: error: ld returned 1 exit status
Traceback (most recent call last):
File "/home/sd/Playground/ArtSpew/maxperf.py", line 256, in <module>
mw = MainWindow()
File "/home/sd/Playground/ArtSpew/maxperf.py", line 177, in __init__
self.genImage()
File "/home/sd/Playground/ArtSpew/maxperf.py", line 209, in genImage
images = genit(0, prompts=prompts, batchSize=batchSize, nSteps=1)
File "/home/sd/Playground/ArtSpew/maxperf.py", line 234, in genit
images = pipe(
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 918, in __call__
noise_pred = self.unet(
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/cuda/graphs.py", line 29, in dynamic_graphed_callable
cached_callable = simple_make_graphed_callable(
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/cuda/graphs.py", line 46, in simple_make_graphed_callable
return make_graphed_callable(callable,
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/cuda/graphs.py", line 75, in make_graphed_callable
callable(*tree_copy(example_inputs),
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/jit/trace_helper.py", line 62, in wrapper
return traced_module(*args, **kwargs)
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/jit/trace_helper.py", line 119, in forward
outputs = self.module(*self.convert_inputs(args, kwargs))
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):

graph(%input, %num_groups, %weight, %bias, %eps, %cudnn_enabled):
%y : Tensor = sfast_triton::group_norm_silu(%input, %num_groups, %weight, %bias, %eps)
~~~~~~~~~~~~ <--- HERE
return (%y)
RuntimeError: CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmp264mff0j/main.c', '-O3', '-I/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/triton/common/../third_party/cuda/include', '-I/usr/include/python3.10', '-I/tmp/tmp264mff0j', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmp264mff0j/group_norm_4d_channels_last_forward_collect_stats_kernel.cpython-310-x86_64-linux-gnu.so', '-L/lib/x86_64-linux-gnu', '-L/lib/i386-linux-gnu', '-L/lib/i386-linux-gnu']' returned non-zero exit status 1.

At:
/usr/lib/python3.10/subprocess.py(369): check_call
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/triton/common/build.py(103): _build
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/triton/compiler/make_launcher.py(37): make_stub
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/triton/compiler/compiler.py(614): compile
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/triton/runtime/jit.py(532): run
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/triton/__init__.py(35): new_func
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/triton/runtime/autotuner.py(305): run
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/triton/runtime/autotuner.py(305): run
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/triton/ops/group_norm.py(425): group_norm_forward
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/triton/torch_ops.py(188): forward
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/autograd/function.py(539): apply
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/triton/torch_ops.py(226): group_norm_silu
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/jit/trace_helper.py(119): forward
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/jit/trace_helper.py(62): wrapper
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/cuda/graphs.py(75): make_graphed_callable
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/cuda/graphs.py(46): simple_make_graphed_callable
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/sfast/cuda/graphs.py(29): dynamic_graphed_callable
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py(918): __call__
/home/sd/Playground/ArtSpew/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py(115): decorate_context
/home/sd/Playground/ArtSpew/maxperf.py(234): genit
/home/sd/Playground/ArtSpew/maxperf.py(209): genImage
/home/sd/Playground/ArtSpew/maxperf.py(177): __init__
/home/sd/Playground/ArtSpew/maxperf.py(256): <module>


aifartist · 2023-12-06T22:35:46Z

@blacklig
Do you have gcc installed?
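
Separately, the `ld` lines in the log say `skipping incompatible /lib/i386-linux-gnu/libcuda.so` and then `cannot find -lcuda`. That reads as if the linker only saw a 32-bit `libcuda.so` and no 64-bit one it could link against. For `-lcuda` the linker looks for the unversioned name `libcuda.so`, so a directory holding only `libcuda.so.1` would not satisfy it. Below is a minimal sketch to check this, assuming the two directories that gcc received via `-L` in the log are the relevant ones. The helper names `elf_class` and `find_libcuda` are made up for illustration and are not part of stable-fast or Triton.

```python
import os

# ELF header: bytes 0-3 are the magic number, byte 4 is the class
# (1 = 32-bit object, 2 = 64-bit object).
ELF_MAGIC = b"\x7fELF"
ELF_CLASS = {1: "32-bit", 2: "64-bit"}


def elf_class(path):
    """Return '32-bit', '64-bit', or None if path is not a readable ELF file."""
    try:
        with open(path, "rb") as f:
            header = f.read(5)
    except OSError:
        # Missing file, broken symlink, or no read permission.
        return None
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None
    return ELF_CLASS.get(header[4])


def find_libcuda(dirs):
    """Map every libcuda.so* file found in dirs to its ELF class."""
    found = {}
    for d in dirs:
        if not os.path.isdir(d):
            continue
        for name in sorted(os.listdir(d)):
            if name.startswith("libcuda.so"):
                path = os.path.join(d, name)
                found[path] = elf_class(path)
    return found


if __name__ == "__main__":
    # Assumption: these are the directories from the -L flags in the log.
    search_dirs = ["/lib/x86_64-linux-gnu", "/lib/i386-linux-gnu"]
    for path, cls in find_libcuda(search_dirs).items():
        print(path, cls)
```

If the x86_64 directory shows no plain `libcuda.so` reported as 64-bit, that would be consistent with the linker error, though it does not by itself say how the driver install ended up that way.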

blacklig · 2023-12-07T07:10:46Z

Hi, thanks for paying attention :)

I have gcc 11.4

I installed build-essential before doing anything else. It is actually a very clean install... The only "messing" I did: I started with driver 535, which nvidia-smi labeled as CUDA 12.2, although I think I had no CUDA installed at all (at least it wasn't anywhere to be found), and libraries were failing as if they were built for CUDA 12.1, etc. So I installed CUDA 12.1, and now nvidia-smi reports driver 530 instead of 535, so did it somehow downgrade the driver as well? But I think this probably won't be the issue here...

Or could you share your setup? My specs are older than yours, but they should be OK: i7-7700K, RTX 3090, 64 GB RAM.
