Bug Report: Manual Update of audio_emotion Not Reflecting Correctly #32

nazmulAbid-cell · 2024-12-19T09:14:57Z

@ltzheng
When manually updating the audio_emotion variable in inference.py (line 142) to set it to a specific emotion (e.g., disgusted with value 1), the output does not reflect the updated emotion.

Steps to Reproduce
Modify inference.py at line 142:

    audio_emotion = torch.full((audio_emb.shape[0],), 1, dtype=torch.int32, device=device)
    num_emotion_classes = 9

This sets audio_emotion to 1 (disgusted) for all embeddings.

Run the inference pipeline.

Expected Behavior
The output should correctly reflect the manually set audio_emotion value.

Observed Behavior
The output does not reflect the manually assigned emotion (disgusted).
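
For anyone reproducing this, a minimal sketch like the one below may help rule out the override tensor itself as the cause. It is not taken from the repository: the stand-in audio_emb with shape (8, 768) and the helper check_emotion_override are hypothetical, and the sketch only confirms that the tensor holds one in-range class index per embedding.

```python
import torch

# Hypothetical stand-in for the real audio embeddings; the shape is an assumption.
audio_emb = torch.zeros(8, 768)
device = audio_emb.device

num_emotion_classes = 9
emotion_id = 1  # "disgusted", per the report above

# Same override as in the report: one class index per embedding.
audio_emotion = torch.full(
    (audio_emb.shape[0],), emotion_id, dtype=torch.int32, device=device
)


def check_emotion_override(audio_emotion, audio_emb, num_classes):
    """Hypothetical helper: True if there is one valid class index per embedding."""
    same_length = audio_emotion.shape == (audio_emb.shape[0],)
    in_range = bool(((audio_emotion >= 0) & (audio_emotion < num_classes)).all())
    return same_length and in_range


override_ok = check_emotion_override(audio_emotion, audio_emb, num_emotion_classes)
print(override_ok, audio_emotion.unique().tolist())
```

If this check passes, the tensor is well formed, and the missing effect more likely comes from how the model uses the emotion input than from the override.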


ltzheng · 2024-12-19T11:34:31Z

The preview model we released may not effectively generalize the emotion offset to all reference images, so this function was not explicitly included. However, as a workaround, you can supply a reference image showing a different emotion to adjust the emotion in the generated video.
