issues about applying vae encode to channel-wise concat image condition #38

Open
Dawn-LX opened this issue Dec 8, 2024 · 0 comments

Dawn-LX commented Dec 8, 2024

from einops import rearrange

def tensor_to_vae_latent(t, vae):
    # t: video tensor of shape (batch, frames, channels, height, width)
    video_length = t.shape[1]

    # Fold the frame dimension into the batch so the image VAE can encode every frame at once.
    t = rearrange(t, "b f c h w -> (b f) c h w")

    latents = vae.encode(t).latent_dist.sample()
    latents = rearrange(latents, "(b f) c h w -> b f c h w", f=video_length)
    latents = latents * vae.config.scaling_factor

    return latents

# Encode the conditioning pixels, keep only the first frame's latent,
# then divide the scaling_factor back out so the channel-concat condition stays unscaled.
conditional_latents = tensor_to_vae_latent(conditional_pixel_values, vae)[:, 0, :, :, :]

conditional_latents = conditional_latents / vae.config.scaling_factor
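For context, a minimal sketch (with assumed names such as noisy_latents, and torch already imported) of how such an unscaled condition latent is typically broadcast over frames and channel-wise concatenated with the noisy video latents before the UNet:

# Sketch only: noisy_latents of shape (b, f, c, h, w) is an assumption for illustration.
num_frames = noisy_latents.shape[1]
repeated_condition = conditional_latents.unsqueeze(1).repeat(1, num_frames, 1, 1, 1)
unet_input = torch.cat([noisy_latents, repeated_condition], dim=2)  # concat along the channel dim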

NOTE: here SVD rescales the latent back, i.e., following InstructPix2Pix, the channel-concat image condition does not have the VAE's scaling_factor applied.
However, unlike InstructPix2Pix, SVD uses vae.encode(x).latent_dist.sample() for the channel-concat image, whereas InstructPix2Pix uses vae.encode(x).latent_dist.mode().
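For comparison, a minimal sketch of the InstructPix2Pix-style conditioning encode, assuming the same diffusers-style vae object as above: the condition latent is the deterministic mode of the posterior and no scaling_factor is applied.

# Hedged sketch of the InstructPix2Pix-style image-condition encoding (for comparison only).
def encode_image_condition_ip2p_style(image, vae):
    # image: (batch, channels, height, width) conditioning image
    # .mode() takes the posterior mean deterministically; no scaling_factor is applied.
    return vae.encode(image).latent_dist.mode()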
