Python API code for text-to-video generation? #633
How long a video can we generate, and what would be the best model combinations?

Comments
Hi @SutirthaChakraborty, with opensora_hpcai we support generating 720P videos of 16 seconds (408 frames). For the model combination, do you mean combining a text-to-image model with an image-to-video model? If so, I would suggest using SD3 or Flux.1 (in PR) for T2I generation, followed by DynamiCrafter for I2V generation, for the best visual quality. If you prefer long videos, you may use opensora_hpcai to do I2V. Thanks for your attention to our AIGC kit.
Hi @SamitHuang, thanks for your detailed reply.
Sure.
>>> import mindspore
>>> from mindone.diffusers import StableDiffusion3Pipeline
>>> pipe = StableDiffusion3Pipeline.from_pretrained(
... "stabilityai/stable-diffusion-3-medium-diffusers",
... mindspore_dtype=mindspore.float16,
... )
>>> prompt = "A cat holding a sign that says hello world"
>>> image = pipe(prompt)[0][0]
>>> image.save("sd3.png")
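If you would rather try Flux.1 once the PR mentioned above lands, here is a minimal sketch, assuming the new pipeline mirrors the SD3 API shown above; the FluxPipeline name, checkpoint ID, and dtype are assumptions based on the upstream diffusers API, not confirmed in this thread.
>>> import mindspore
>>> from mindone.diffusers import FluxPipeline  # assumed name, mirroring upstream diffusers
>>> pipe = FluxPipeline.from_pretrained(
...     "black-forest-labs/FLUX.1-schnell",  # assumed checkpoint ID
...     mindspore_dtype=mindspore.bfloat16,  # assumed dtype; float16 may also work
... )
>>> image = pipe("A cat holding a sign that says hello world")[0][0]  # same output indexing as the SD3 example
>>> image.save("flux.png")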
Is there no direct way to run DynamiCrafter?
Sorry that it can only run with …
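Assuming DynamiCrafter here is run through its example scripts rather than a diffusers-style pipeline API, a minimal sketch of driving such a script from Python follows; the script path, flags, and prompt are hypothetical placeholders, not the actual mindone interface, so check examples/dynamicrafter in the repo for the real entry point.
>>> import subprocess
>>> subprocess.run(
...     [
...         "python", "examples/dynamicrafter/inference.py",  # hypothetical script path
...         "--image", "sd3.png",  # image produced by the SD3 step above
...         "--prompt", "the cat waves its sign",  # hypothetical flag and prompt
...     ],
...     check=True,  # raise if the script exits with a non-zero status
... )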