Add hpcai OpenSora v1.2 - 3D VAE inference #560
Conversation
image_latents = self.vae.encode(x)
image_latents = image_latents * self.scale_factor
# image_latents = ops.stop_gradient(self.vae.encode(x))
image_latents = ops.stop_gradient(self.vae.module.encode(x) * self.vae.scale_factor)
The scale factor is independent of which VAE is used, so I think it is better to keep this parameter in the diffusion-pipeline scope instead of the VAE scope.
Once it is passed to the VAE, it becomes a member of the VAE. It should be configurable, but hpcai hardcodes it and trained with this fixed value, which is not really proper for VAEs trained on different data.
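For illustration, a minimal sketch of the pipeline-scope alternative being discussed; the class name DiffusionPipeline and the SD-style default value 0.18215 are assumptions for the sketch, not this project's actual API:

from mindspore import Tensor, ops

class DiffusionPipeline:
    def __init__(self, vae, scale_factor: float = 0.18215):
        # the pipeline, not the VAE, owns the scaling constant, so swapping
        # in a different VAE only requires passing the value it was trained with
        self.vae = vae
        self.scale_factor = scale_factor

    def vae_encode(self, x: Tensor) -> Tensor:
        # the frozen VAE contributes no gradients
        latents = ops.stop_gradient(self.vae.encode(x))
        return latents * self.scale_factor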
I get this: AttributeError: The 'VideoAutoencoderPipeline' object has no attribute 'module'.
And this: AttributeError: The 'VideoAutoencoderPipeline' object has no attribute 'scale_factor'.
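If both VAE wrappers need to be supported at this call site, one defensive option is getattr fallbacks; this is a sketch only (self.scale_factor as the pipeline-level default is hypothetical), not the fix actually merged:

# unwrap .module when present, and fall back to a pipeline-level
# default when the wrapper exposes no scale_factor of its own
vae_core = getattr(self.vae, "module", self.vae)
scale = getattr(self.vae, "scale_factor", self.scale_factor)
image_latents = ops.stop_gradient(vae_core.encode(x) * scale)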
@@ -71,40 +71,26 @@ def vae_decode(self, x: Tensor) -> Tensor:
        Return:
            y: (b H W 3), batch of images, normalized to [0, 1]
        """
        b, c, h, w = x.shape

        if self.micro_batch_size is None:
micro_batch_size is removed?
No, it's wrapped in the VAE class:
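As a rough sketch of what that wrapping could look like inside the VAE class (_decode is a hypothetical single-shot helper; the real VideoAutoencoderKL may differ):

from mindspore import ops

class VideoAutoencoderKL:           # sketch of the wrapper, not the real class
    def decode(self, z):
        if self.micro_batch_size is None:
            return self._decode(z)  # _decode: hypothetical one-shot decode
        # otherwise split the latent batch into chunks to bound peak device memory
        chunks = [
            self._decode(z[i : i + self.micro_batch_size])
            for i in range(0, z.shape[0], self.micro_batch_size)
        ]
        return ops.cat(chunks, axis=0)

The construction site in the script then just forwards the argument: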
vae = VideoAutoencoderKL(
    config=SD_CONFIG, ckpt_path=args.vae_checkpoint, micro_batch_size=args.vae_micro_batch_size
)
elif args.vae_dtype == 'OpenSoraVAE_V1_2"':
Suggested change:
- elif args.vae_dtype == 'OpenSoraVAE_V1_2"':
+ elif args.vae_type == "OpenSoraVAE_V1_2":
ckpt_path: "models/opensora_v1.2_stage3.ckpt"
t5_model_dir: "models/t5-v1_1-xxl/"

vae_model_type: "OpenSoraVAE_V1_2"
Suggested change:
- vae_model_type: "OpenSoraVAE_V1_2"
+ vae_type: OpenSoraVAE_V1_2
Please set the temporal compression VAE_T_COMPRESS and input_size.
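For reference, a minimal sketch of the latent-size computation with temporal compression; the factors 4 (temporal) and 8 (spatial) are assumed OpenSora v1.2 defaults, not values taken from this diff:

import math

VAE_T_COMPRESS = 4  # assumed temporal compression of the OpenSora v1.2 VAE
VAE_S_COMPRESS = 8  # assumed spatial compression

def get_latent_size(num_frames, height, width):
    # ceil-divide so a partially filled compression window still yields a latent step
    t = math.ceil(num_frames / VAE_T_COMPRESS)
    h = math.ceil(height / VAE_S_COMPRESS)
    w = math.ceil(width / VAE_S_COMPRESS)
    return t, h, w

# e.g. a 16x256x256 clip maps to a 4x32x32 latent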
fixed
return latent_size


class VideoAutoencoderPipelineConfig(PretrainedConfig):
I don't think it's good to use PretrainedConfig with our project. It's better to move these parameters under __init__ in VideoAutoencoderPipeline.
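Something along these lines, sketched with illustrative parameter names rather than the actual signature, would keep the configuration explicit without PretrainedConfig:

import mindspore.nn as nn

class VideoAutoencoderPipeline(nn.Cell):
    def __init__(
        self,
        spatial_vae,                  # 2D (per-frame) VAE backbone
        temporal_vae,                 # temporal VAE backbone
        scale_factor: float = 1.0,    # configurable per checkpoint
        micro_batch_size: int = None, # optional chunked encode/decode
    ):
        super().__init__()
        self.spatial_vae = spatial_vae
        self.temporal_vae = temporal_vae
        self.scale_factor = scale_factor
        self.micro_batch_size = micro_batch_size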
9c5e789
* add vae 3d enc-dec
* update test
* dev save
* testing
* spatial vae test pass
* fix
* add vae param list
* fix name order
* add shape
* add shape
* order pnames
* ordered temporal pnames
* vae 3d recons ok
* update docs
* add test scripts
* add convert script
* adapt to 910b
* support ms2.3 5d GN
* rm test files
* fix format
* debug infer
* add sample t2v yaml
* fix i2v
* update comment
* fix format
* rm tmp test
* fix docs
* fix var name
* fix latent shape compute
* add info
* fix image enc/dec
* fix format
* adapt new vae in training
* fix dtype
* pad bf16 fixed by cast to fp16
* fix ops.pad bf16 with fp32 cast
* replace pad with concat
* replace pad_at_dim with concat for bf16
What does this PR do?
Fixes # (issue)
Adds # (feature)
OpenSora v1.2 VAE: passed inference tests on 910*.
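As a usage illustration only (the OpenSoraVAE_V1_2 constructor arguments shown are assumptions based on the configs in this PR, and the import path is omitted), an encode/decode round trip might look like:

import mindspore as ms
from mindspore import ops

# hypothetical construction mirroring the config values above
vae = OpenSoraVAE_V1_2(
    ckpt_path="models/opensora_v1.2_stage3.ckpt",
    micro_batch_size=4,
)

# (b, c, t, h, w) video clip, values normalized to [-1, 1]
video = ops.randn(1, 3, 16, 256, 256, dtype=ms.float32)

latents = vae.encode(video)  # temporally and spatially compressed latents
recon = vae.decode(latents)  # reconstructed clip with the input shape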
Before submitting
Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@xxx