DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation [AAAI2025]

arXiv | Project Page


Official PyTorch code release of "DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation"

@misc{kim2024deeptalkdynamicemotionembedding,
      title={DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation}, 
      author={Jisoo Kim and Jungbin Cho and Joonho Park and Soonmin Hwang and Da Eun Kim and Geon Kim and Youngjae Yu},
      year={2024},
      eprint={2408.06010},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.06010}, 
}

📨 News

🛩️ 12/Dec/24 - Released the training code

🛩️ 11/Dec/24 - Released the inference & rendering code

🛩️ 10/Dec/24 - DEEPTalk has been accepted to AAAI 2025

⚙️ Settings

❗ Clone this repo recursively using

git clone --recurse-submodules <repository_url>

or update submodules recursively using

git submodule update --init --recursive

Note that spectre requires git-lfs, which can be installed with

conda install conda-forge::git-lfs
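
If the LFS-tracked files inside the spectre submodule did not come down during cloning, you can fetch them manually. A minimal sketch, assuming the submodule sits at ./DEEPTalk/externals/spectre as the checkpoint paths later in this README suggest:

cd DEEPTalk/externals/spectre   # submodule path assumed from this README
git lfs install
git lfs pull                    # fetch the LFS-tracked files
cd -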

Environment

Create the conda environment and install PyTorch.

conda create -n deeptalk python=3.9
conda activate deeptalk
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

Install PyTorch3D and the other requirements. Refer to this page for PyTorch3D installation details.

pip install pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py39_cu113_pyt1121/download.html
pip install -r requirements.txt

Install osmesa and ffmpeg for headless rendering and audio/video processing.

conda install menpo::osmesa
conda install conda-forge::ffmpeg
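
If offscreen rendering still fails after installing osmesa, you may need to point PyOpenGL at the osmesa backend. This is a common requirement for headless rendering in general, not something specific to this repo:

# tell PyOpenGL to render offscreen via osmesa
export PYOPENGL_PLATFORM=osmesa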

For training DEEPTalk in stage 2, we use nvdiffrast. Install nvdiffrast from its repository.

git clone https://github.com/NVlabs/nvdiffrast.git
cd nvdiffrast
git checkout v0.3.1
python setup.py install
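
A quick way to confirm the nvdiffrast install succeeded is to import its PyTorch binding (the CUDA extensions themselves are JIT-compiled on first use):

# sanity-check the install; should print 'nvdiffrast OK'
python -c "import nvdiffrast.torch as dr; print('nvdiffrast OK')"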

Download Checkpoints

Download the DEE, FER, TH-VQVAE, and DEEPTalk checkpoints from here.

  • DEE.pt: Place in ./DEE/checkpoint
  • FER.pth: Place in ./FER/checkpoint
  • TH-VQVAE.pth: Place in ./DEEPTalk/checkpoint/TH-VQVAE
  • DEEPTalk.pth: Place in ./DEEPTalk/checkpoint/DEEPTalk

Download emotion2vec_base.pt from the emotion2vec repository.

  • emotion2vec_base.pt: Place in ./DEE/models/emo2vec/checkpoint

Download the LRS3_V_WER32.3 model from the Spectre repository. (❗ This is needed for Stage 2 training.)

  • Place the LRS3_V_WER32.3 folder at ./DEEPTalk/externals/spectre/data/data/LRS3_V_WER32.3
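
Taken together, the placements above map to the following directories relative to the repository root. A sketch that simply pre-creates them before you copy the downloaded files in (folders that already ship with the repo are unaffected by mkdir -p):

# pre-create the expected checkpoint locations (run from the repo root)
mkdir -p DEE/checkpoint FER/checkpoint
mkdir -p DEE/models/emo2vec/checkpoint
mkdir -p DEEPTalk/checkpoint/TH-VQVAE DEEPTalk/checkpoint/DEEPTalk
mkdir -p DEEPTalk/externals/spectre/data/data
# then move each download into place, e.g.
# mv ~/Downloads/DEE.pt DEE/checkpoint/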

Download Files

Download the following files from the RingNet project.

  • FLAME_sample.ply: Place in ./DEEPTalk/models/flame_models
  • flame_dynamic_embedding.npy: Place in ./DEEPTalk/models/flame_models
  • flame_static_embedding.pkl: Place in ./DEEPTalk/models/flame_models

Download the FLAME model file from the FLAME website.

  • generic_model.pkl: Place in ./DEEPTalk/models/flame_models

Download the head_template files from the FLAME website. (❗ These are needed for Stage 2 training.)

  • related issue
  • head_template.jpg: Place in ./DEEPTalk/models/flame_models/geometry
  • head_template.mtl: Place in ./DEEPTalk/models/flame_models/geometry
  • head_template.obj: Place in ./DEEPTalk/models/flame_models/geometry
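
After the downloads above, ./DEEPTalk/models/flame_models should roughly look like the sketch below (layout inferred from the placement notes in this section):

mkdir -p DEEPTalk/models/flame_models/geometry   # run from the repo root
# DEEPTalk/models/flame_models/
# ├── FLAME_sample.ply
# ├── flame_dynamic_embedding.npy
# ├── flame_static_embedding.pkl
# ├── generic_model.pkl
# └── geometry/
#     ├── head_template.jpg
#     ├── head_template.mtl
#     └── head_template.obj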

🛹 Inference

Run the following command to generate a video. Results will be saved in ./DEEPTalk/outputs.

cd DEEPTalk
python demo.py --audio_path {raw audio file (.wav) or sampled audio (.npy)}
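
For example, with your own recording (the audio path below is just a placeholder):

python demo.py --audio_path /path/to/speech_sample.wav   # output video lands in ./DEEPTalk/outputs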

📚 Dataset

Download Data

Download the MEAD dataset from here.

Process Data

Use the reconstruction method from EMOCAv2 to reconstruct FLAME parameters from MEAD.

Open an issue if you're having trouble processing MEAD. We might be able to provide the exact parameters.

🏋️ Training

1. Train TH-VQVAE on MEAD FLAME parameters

Make a copy of ./DEEPTalk/checkpoint/TH-VQVAE/config_TH-VQVAE.json and change arguments such as data.data_dir or name to train your own model. Then run

cd DEEPTalk
python train_VQVAE.py --config {your config path}
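
For example, with a copied config (the file name my_TH-VQVAE.json is arbitrary; set data.data_dir to your processed MEAD directory and pick a new name before launching):

cd DEEPTalk
cp checkpoint/TH-VQVAE/config_TH-VQVAE.json checkpoint/TH-VQVAE/my_TH-VQVAE.json
# edit data.data_dir and name inside my_TH-VQVAE.json, then:
python train_VQVAE.py --config checkpoint/TH-VQVAE/my_TH-VQVAE.json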

2. Train DEEPTalk stage1

Make a copy of ./DEEPTalk/checkpoint/DEEPTalk/config_stage1.json and change arguments such as data.data_dir or name to train your own model. Then run

cd DEEPTalk
python train_DEEPTalk_stage1.py --DEEPTalk_config {your config path}

3. Train DEEPTalk stage2

Make a copy of ./DEEPTalk/checkpoint/DEEPTalk/config.json and change arguments such as data.data_dir or name to train your own model. Then run

cd DEEPTalk
python train_DEEPTalk_stage2.py --DEEPTalk_config {your config path} --checkpoint {stage1 trained model checkpoint path}
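
For example, continuing from a stage 1 run (both paths below are placeholders for your own config copy and stage 1 checkpoint):

cd DEEPTalk
python train_DEEPTalk_stage2.py \
  --DEEPTalk_config checkpoint/DEEPTalk/my_config.json \
  --checkpoint /path/to/stage1_checkpoint.pth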

Acknowledgements

We gratefully acknowledge the open-source projects that served as the foundation for our work.

License

This code is released under the MIT License.

Please note that our project relies on various other libraries, including FLAME, PyTorch3D, and Spectre, as well as several datasets.
