Text-to-Video Model: Potat v1

12:47:35-275745 ERROR Arguments: args=('task(c5jnmnvhq3xjo9w)', 'woman, sitting on couch, female curvy, detailed face, perfect face, correct eyes, hairstyles, detailed muzzle, detailed mouth, five fingers, proper hands, proper shading, proper lighting, detailed character, high quality,', 'worst quality, bad quality, (text), ((signature, watermark)), extra limb, deformed hands, deformed feet, multiple tails, deformed, disfigured, poorly drawn face, mutated, extra limb, ugly, face out of frame, oversaturated, sketch, comic, no pupils, simple background, ((blurry)), mutation, intersex, bad anatomy, disfigured,', [], 20, 0, 26, True, False, False, False, 1, 1, 6, 6, 0.7, 0, 0.5, 1, 1, -1.0, -1.0, 0, 0, 0, 512, 512, False, 0.3, 2, 'None', False, 20, 0, 0, 10, 0, '', '', 0, 0, 0, 0, False, 4, 0.95, False, 0.6, 1, '#000000', 0, [], 11, 1, 'None', 'None', 'None', 'None', 0.5, 0.5, 0.5, 0.5, None, None, None, None, 0, 0, 0, 0, 1, 1, 1, 1, None, None, None, None, False, '', 'None', 16, 'None', 1, True, 'None', 2, True, 1, 0, True, 'none', 3, 4, 0.25, 0.25, 3, 1, 1, 0.8, 8, 64, True, True, 0.5, 600.0, 1.0, 1, 1, 0.5, 0.5, 'OpenGVLab/InternVL-14B-224px', False, False, 'positive', 'comma', 0, False, False, '', 'None', '', 1, '', 'None', 1, True, 10, 'Potat v1', True, 24, 'GIF', 2, True, 1, 0, 0, '', [], 0, '', [], 0, '', [], False, True, False, False, False, False, 0, 'None', [], 'FaceID Base', True, True, 1, 1, 1, 0.5, False, 'person', 1, 0.5, True) kwargs={}
12:47:35-284260 ERROR gradio call: AttributeError
  modules\call_queue.py:31 in f: res = func(*args, **kwargs)
  modules\txt2img.py:89 in txt2img: processed = scripts.scripts_txt2img.run(p, *args)
  modules\scripts.py:483 in run: processed = script.run(p, *parsed)
  scripts\text2video.py:88 in run: sd_models.reload_model_weights(op='model')
  modules\sd_models.py:1572 in reload_model_weights: checkpoint_info = info or select_checkpoint(op=op)
  modules\sd_models.py:248 in select_checkpoint: checkpoint_info = get_closet_checkpoint_match(model_checkpoint)
  modules\sd_models.py:197 in get_closet_checkpoint_match: if search_string.startswith('huggingface/'):
AttributeError: 'CheckpointInfo' object has no attribute 'startswith'

Text-to-Video Model: ZeroScope v2 Dark

12:50:00-451738 ERROR Arguments: args=('task(yfgrwdtd3i1wg4r)', 'woman, sitting on couch, female curvy, detailed face, perfect face, correct eyes, hairstyles, detailed muzzle, detailed mouth, five fingers, proper hands, proper shading, proper lighting, detailed character, high quality,', 'worst quality, bad quality, (text), ((signature, watermark)), extra limb, deformed hands, deformed feet, multiple tails, deformed, disfigured, poorly drawn face, mutated, extra limb, ugly, face out of frame, oversaturated, sketch, comic, no pupils, simple background, ((blurry)), mutation, intersex, bad anatomy, disfigured,', [], 20, 7, 26, True, False, False, False, 1, 1, 6, 6, 0.7, 0, 0.5, 1, 1, -1.0, -1.0, 0, 0, 0, 512, 512, False, 0.3, 2, 'None', False, 20, 0, 0, 10, 0, '', '', 0, 0, 0, 0, False, 4, 0.95, False, 0.6, 1, '#000000', 0, [], 11, 1, 'None', 'None', 'None', 'None', 0.5, 0.5, 0.5, 0.5, None, None, None, None, 0, 0, 0, 0, 1, 1, 1, 1, None, None, None, None, False, '', 'None', 16, 'None', 1, True, 'None', 2, True, 1, 0, True, 'none', 3, 4, 0.25, 0.25, 3, 1, 1, 0.8, 8, 64, True, True, 0.5, 600.0, 1.0, 1, 1, 0.5, 0.5, 'OpenGVLab/InternVL-14B-224px', False, False, 'positive', 'comma', 0, False, False, '', 'None', '', 1, '', 'None', 1, True, 10, 'ZeroScope v2 Dark', True, 24, 'GIF', 2, True, 1, 0, 0, '', [], 0, '', [], 0, '', [], False, True, False, False, False, False, 0, 'None', [], 'FaceID Base', True, True, 1, 1, 1, 0.5, False, 'person', 1, 0.5, True) kwargs={}
12:50:00-459258 ERROR gradio call: TypeError
  modules\call_queue.py:31 in f: res = func(*args, **kwargs)
  modules\txt2img.py:89 in txt2img: processed = scripts.scripts_txt2img.run(p, *args)
  modules\scripts.py:483 in run: processed = script.run(p, *parsed)
  scripts\text2video.py:75 in run: if model['path'] in shared.opts.sd_model_checkpoint:
TypeError: argument of type 'CheckpointInfo' is not iterable

Text-to-Video Model: ModelScope 1.7b

13:02:06-745445 ERROR Processing: args={'prompt': ['woman, sitting on couch, female curvy, detailed eyes, perfect eyes, detailed face, perfect face, perfectly rendered face, correct eyes, hairstyles, detailed muzzle, detailed mouth, five fingers, proper hands, proper shading, proper lighting, detailed character, high quality,'], 'negative_prompt': ['worst quality, bad quality, (text), ((signature, watermark)), extra limb, deformed hands, deformed feet, multiple tails, deformed, disfigured, poorly drawn face, mutated, extra limb, ugly, face out of frame, oversaturated, sketch, comic, no pupils, simple background, ((blurry)), mutation, intersex, bad anatomy, disfigured,'], 'guidance_scale': 6, 'generator': [<torch._C.Generator object at 0x0000017C89FBA530>], 'callback_steps': 1, 'callback': <function diffusers_callback_legacy at 0x0000017C8BF3ECB0>, 'num_inference_steps': 20, 'eta': 1.0, 'output_type': 'latent', 'width': 320, 'height': 320, 'num_frames': 16} input must be 4-dimensional
13:02:06-750699 ERROR Processing: RuntimeError
  modules\processing_diffusers.py:122: output = shared.sd_model(**base_args)
  venv\lib\site-packages\torch\utils:115: return func(*args, **kwargs)
  venv\lib\site-packages\diffusers\p:597: noise_pred = self.unet(latent_model_input,
  venv\lib\site-packages\torch\nn\mo:1532: return self._call_impl(*args, **kwargs)
  venv\lib\site-packages\torch\nn\mo:1541: return forward_call(*args, **kwargs)
  ... 12 frames hidden ...
  venv\lib\site-packages\torch\nn\mo:1541: return forward_call(*args, **kwargs)
  venv\lib\site-packages\torch\nn\mo:610: return self._conv_forward(input, self.weight, self.bias)
  venv\lib\site-packages\torch\nn\mo:605: return F.conv3d(input, weight, bias, self.stride, self.padding, self.dil
  modules\dml\amp\autocast_mode.py:43: setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: f
  modules\dml\amp\autocast_mode.py:15: return op(*args, **kwargs)
RuntimeError: input must be 4-dimensional

Image-to-Video Model: VGen

13:08:33-673173 WARNING Pipeline class change failed: type=DiffusersTaskType.IMAGE_2_IMAGE pipeline=I2VGenXLPipeline AutoPipeline can't find a pipeline linked to I2VGenXLPipeline for None
13:08:34-378645 INFO Base: class=I2VGenXLPipeline
13:08:47-883849 ERROR Processing: args={'prompt': ['woman, sitting on couch, female curvy, detailed eyes, perfect eyes, detailed face, perfect face, perfectly rendered face, correct eyes, hairstyles, detailed muzzle, detailed mouth, five fingers, proper hands, proper shading, proper lighting, detailed character, high quality,'], 'negative_prompt': ['worst quality, bad quality, (text), ((signature, watermark)), extra limb, deformed hands, deformed feet, multiple tails, deformed, disfigured, poorly drawn face, mutated, extra limb, ugly, face out of frame, oversaturated, sketch, comic, no pupils, simple background, ((blurry)), mutation, intersex, bad anatomy, disfigured,'], 'guidance_scale': 6, 'generator': [<torch._C.Generator object at 0x0000026E161C7150>], 'num_inference_steps': 20, 'eta': 1.0, 'output_type': 'pil', 'width': 512, 'height': 512, 'image': <PIL.Image.Image image mode=RGB size=512x512 at 0x26E118AE500>, 'num_frames': 16, 'target_fps': 8, 'decode_chunk_size': 8} the dimesion of at::Tensor must be 4 or lower, but got 5
13:08:47-888378 ERROR Processing: RuntimeError
  modules\processing_diffusers.py:122: output = shared.sd_model(**base_args)
  venv\lib\site-packages\torch\utils:115: return func(*args, **kwargs)
  venv\lib\site-packages\diffusers\p:640: image_latents = self.prepare_image_latents(
  venv\lib\site-packages\diffusers\p:466: image_latents = image_latents.repeat(num_videos_per_prompt, 1
RuntimeError: the dimesion of at::Tensor must be 4 or lower, but got 5

Stable Video Diffusion Model: SVD XT 1.1

13:12:03-607975 ERROR Processing: args={'generator': <torch._C.Generator object at 0x000001F873C34810>, 'callback_on_step_end': <function diffusers_callback at 0x000001F84F665D80>, 'callback_on_step_end_tensor_inputs': ['latents'], 'num_inference_steps': 20, 'output_type': 'pil', 'image': <PIL.Image.Image image mode=RGB size=1024x576 at 0x1F8531FC610>, 'width': 1024, 'height': 576, 'num_frames': 14, 'decode_chunk_size': 6, 'motion_bucket_id': 128, 'noise_aug_strength': 0.1, 'min_guidance_scale': 1, 'max_guidance_scale': 3} the dimesion of at::Tensor must be 4 or lower, but got 5
13:12:03-611978 ERROR Processing: RuntimeError
  modules\processing_diffusers.py:122: output = shared.sd_model(**base_args)
  venv\lib\site-packages\torch\utils:115: return func(*args, **kwargs)
  venv\lib\site-packages\diffusers\p:524: image_latents = image_latents.unsqueeze(1).repeat(1, num_fram
RuntimeError: the dimesion of at::Tensor must be 4 or lower, but got 5
13:12:03-690490 WARNING Pipeline class change failed: type=DiffusersTaskType.TEXT_2_IMAGE pipeline=StableVideoDiffusionPipeline AutoPipeline can't find a pipeline linked to StableVideoDiffusionPipeline for None
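The Potat v1 and ZeroScope v2 Dark failures look like the same root cause: `shared.opts.sd_model_checkpoint` holds a `CheckpointInfo` object where the code expects a string, so both `search_string.startswith('huggingface/')` and `model['path'] in shared.opts.sd_model_checkpoint` raise. A minimal standalone sketch of the mismatch and one possible defensive coercion — `CheckpointInfo`, its `title` attribute, and `get_closest_match` here are illustrative stand-ins, not the actual SD.Next classes or the real fix:

```python
class CheckpointInfo:
    """Stand-in for SD.Next's CheckpointInfo (illustrative only)."""
    def __init__(self, title):
        self.title = title  # hypothetical string attribute holding the model name

def get_closest_match(search):
    # The original code assumes `search` is a string; passing a CheckpointInfo
    # raises AttributeError: 'CheckpointInfo' object has no attribute 'startswith'.
    # A defensive guard could coerce the object back to a string first:
    if not isinstance(search, str):
        search = getattr(search, 'title', str(search))
    return search.startswith('huggingface/')

print(get_closest_match(CheckpointInfo('huggingface/some-model')))  # True
print(get_closest_match('local-model.safetensors'))                 # False
```

Coercing via a string attribute is only a workaround sketch; the real fix probably belongs wherever `sd_model_checkpoint` is assigned a `CheckpointInfo` instead of a name.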
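The ModelScope, VGen, and SVD failures all end the same way: DirectML rejects a tensor with more than four dimensions, and video pipelines inherently build 5-D latents shaped (batch, frames, channels, height, width). If that reading is right, this is a torch-directml limitation rather than an SD.Next bug. A pure-Python shape walk-through (no torch needed) of what the truncated diffusers call `image_latents.unsqueeze(1).repeat(1, num_fram...` would produce, assuming it completes as `repeat(1, num_frames, 1, 1, 1)` and a plausible (not logged) starting latent shape:

```python
def unsqueeze(shape, dim):
    """Mimic torch.Tensor.unsqueeze on a plain shape tuple."""
    s = list(shape)
    s.insert(dim, 1)
    return tuple(s)

def repeat(shape, reps):
    """Mimic torch.Tensor.repeat on a plain shape tuple."""
    return tuple(d * r for d, r in zip(shape, reps))

# image_latents start as [batch, channels, height, width]; 72x128 assumes
# the usual 8x VAE downscale of the 1024x576 input in the SVD log above.
image_latents = (1, 4, 72, 128)
num_frames = 14  # from the SVD XT 1.1 args

five_d = repeat(unsqueeze(image_latents, 1), (1, num_frames, 1, 1, 1))
print(five_d, len(five_d))  # (1, 14, 4, 72, 128) 5 -> exceeds a 4-D tensor limit
```

Every frame-expansion step in these pipelines produces such a 5-D tensor, which would explain why all three video scripts fail on DirectML while ordinary image generation works.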
Issue Description
I tried using the three video-generation scripts with DirectML, but none of them worked:
Text-to-Video in Text models: Potat v1, ZeroScope v2 Dark, ModelScope 1.7b
Image-to-Video in Image models: VGen
Stable Video Diffusion in Image models: SVD XT 1.1
Version Platform Description
2024-07-16 12:38:07,164 | sd | INFO | launch | Starting SD.Next
2024-07-16 12:38:07,169 | sd | INFO | installer | Logger: file="C:\StabilityMatrix\Data\Packages\SD.Next\sdnext.log" level=INFO size=899852 mode=append
2024-07-16 12:38:07,171 | sd | INFO | installer | Python version=3.10.11 platform=Windows bin="C:\StabilityMatrix\Data\Packages\SD.Next\venv\Scripts\python.exe" venv="C:\StabilityMatrix\Data\Packages\SD.Next\venv"
2024-07-16 12:38:07,474 | sd | INFO | installer | Version: app=sd.next updated=2024-07-10 hash=2ec6e9ee branch=master url=https://github.com/vladmandic/automatic/tree/master ui=main
2024-07-16 12:38:08,050 | sd | INFO | launch | Platform: arch=AMD64 cpu=AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD system=Windows release=Windows-10-10.0.22631-SP0 python=3.10.11
2024-07-16 12:38:08,053 | sd | DEBUG | installer | Torch allocator: "garbage_collection_threshold:0.80,max_split_size_mb:512"
2024-07-16 12:38:08,054 | sd | DEBUG | installer | Torch overrides: cuda=False rocm=False ipex=False diml=True openvino=False
2024-07-16 12:38:08,054 | sd | DEBUG | installer | Torch allowed: cuda=False rocm=False ipex=False diml=True openvino=False
2024-07-16 12:38:08,054 | sd | INFO | installer | Using DirectML Backend
2024-07-16 09:35:37,397 | sd | DEBUG | launch | Starting module: <module 'webui' from 'C:\StabilityMatrix\Data\Packages\SD.Next\webui.py'>
2024-07-16 09:35:37,397 | sd | INFO | launch | Command line args: ['--medvram', '--autolaunch', '--use-directml'] medvram=True autolaunch=True use_directml=True
2024-07-16 09:35:37,399 | sd | DEBUG | launch | Env flags: []
2024-07-16 09:37:38,790 | sd | INFO | loader | Load packages: {'torch': '2.3.1+cpu', 'diffusers': '0.29.1', 'gradio': '3.43.2'}
2024-07-16 09:37:42,767 | sd | DEBUG | shared | Read: file="config.json" json=35 bytes=1548 time=0.000
2024-07-16 09:37:42,821 | sd | INFO | shared | Engine: backend=Backend.DIFFUSERS compute=directml device=privateuseone:0 attention="Dynamic Attention BMM" mode=no_grad
2024-07-16 09:37:42,979 | sd | INFO | shared | Device: device=AMD Radeon RX 6600M n=1 directml=0.2.2.dev240614
2024-07-16 09:37:42,987 | sd | DEBUG | shared | Read: file="html\reference.json" json=45 bytes=25986 time=0.006
2024-07-16 09:38:04,704 | sd | DEBUG | init | ONNX: version=1.18.1 provider=DmlExecutionProvider, available=['AzureExecutionProvider', 'CPUExecutionProvider']
Relevant log output
See the per-model error logs at the top of this report.
Backend: Diffusers
UI: Standard
Branch: Master
Model: StableDiffusion 1.5