Official PyTorch code release of "DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation"
@misc{kim2024deeptalkdynamicemotionembedding,
title={DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation},
author={Jisoo Kim and Jungbin Cho and Joonho Park and Soonmin Hwang and Da Eun Kim and Geon Kim and Youngjae Yu},
year={2024},
eprint={2408.06010},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2408.06010},
}
🛩️ 12/Dec/24 - Released the training code
🛩️ 11/Dec/24 - Released the inference & rendering code
🛩️ 10/Dec/24 - DEEPTalk is accepted to AAAI 2025
❗ Clone this repo recursively using
git clone --recurse-submodules <repository_url>
or update submodules recursively using
git submodule update --init --recursive
Note that Spectre requires git-lfs, which can be installed with
conda install conda-forge::git-lfs
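After installing git-lfs, you will typically also need to initialize it and pull the LFS-tracked files inside the Spectre submodule (standard git-lfs usage; the submodule path below assumes the default checkout location):
git lfs install
cd DEEPTalk/externals/spectre
git lfs pull
cd ../../..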
Create the conda environment and install PyTorch
conda create -n deeptalk python=3.9
conda activate deeptalk
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
Install PyTorch3D and the other requirements. Refer to this page for PyTorch3D installation details.
pip install pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py39_cu113_pyt1121/download.html
pip install -r requirements.txt
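As an optional sanity check, you can verify that PyTorch sees CUDA and that PyTorch3D imports cleanly:
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
python -c "import pytorch3d; print(pytorch3d.__version__)"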
Install osmesa and ffmpeg for headless rendering and audio/video processing.
conda install menpo::osmesa
conda install conda-forge::ffmpeg
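If offscreen rendering fails, you may need to tell PyOpenGL to use the osmesa backend before running the demo or training scripts (a common setup for headless rendering, not specific to this repo):
export PYOPENGL_PLATFORM=osmesa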
For training DEEPTalk in stage 2, we use nvdiffrast. Install nvdiffrast from this repo.
git clone https://github.com/NVlabs/nvdiffrast.git
cd nvdiffrast
git checkout v0.3.1
python setup.py install
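You can confirm the nvdiffrast install with a quick import of its PyTorch bindings:
python -c "import nvdiffrast.torch as dr; print('nvdiffrast OK')"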
Download the DEE, FER, TH-VQVAE, and DEEPTalk checkpoints from here.
- DEE.pt: Place in
./DEE/checkpoint
- FER.pth: Place in
./FER/checkpoint
- TH-VQVAE.pth: Place in
./DEEPTalk/checkpoint/TH-VQVAE
- DEEPTalk.pth: Place in
./DEEPTalk/checkpoint/DEEPTalk
Download emotion2vec_base.pt from the emotion2vec repository.
- emotion2vec_base.pt: Place in
./DEE/models/emo2vec/checkpoint
Download the LRS3_V_WER32.3 model from the Spectre repository. (❗ This is needed for Stage 2 training)
- Place the LRS3_V_WER32.3 folder at
./DEEPTalk/externals/spectre/data/data/LRS3_V_WER32.3
Download the following files from the RingNet project.
- FLAME_sample.ply: Place in
./DEEPTalk/models/flame_models
- flame_dynamic_embedding.npy: Place in
./DEEPTalk/models/flame_models
- flame_static_embedding.pkl: Place in
./DEEPTalk/models/flame_models
Download FLAME files from the FLAME website.
- generic_model.pkl: Place in
./DEEPTalk/models/flame_models
Download the head_template files from the FLAME website. (❗ This is needed for Stage 2 training)
- related issue
- head_template.jpg: Place in
./DEEPTalk/models/flame_models/geometry
- head_template.mtl: Place in
./DEEPTalk/models/flame_models/geometry
- head_template.obj: Place in
./DEEPTalk/models/flame_models/geometry
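For reference, after completing the steps above the downloaded files should be laid out as follows (paths collected from the placement notes above):
DEE/checkpoint/DEE.pt
DEE/models/emo2vec/checkpoint/emotion2vec_base.pt
FER/checkpoint/FER.pth
DEEPTalk/checkpoint/TH-VQVAE/TH-VQVAE.pth
DEEPTalk/checkpoint/DEEPTalk/DEEPTalk.pth
DEEPTalk/externals/spectre/data/data/LRS3_V_WER32.3/
DEEPTalk/models/flame_models/FLAME_sample.ply
DEEPTalk/models/flame_models/flame_dynamic_embedding.npy
DEEPTalk/models/flame_models/flame_static_embedding.pkl
DEEPTalk/models/flame_models/generic_model.pkl
DEEPTalk/models/flame_models/geometry/head_template.jpg
DEEPTalk/models/flame_models/geometry/head_template.mtl
DEEPTalk/models/flame_models/geometry/head_template.obj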
Run the following command to generate a video. Results will be saved in ./DEEPTalk/outputs.
cd DEEPTalk
python demo.py --audio_path {raw audio file (.wav) or sampled audio (.npy)}
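For example, with your own WAV file (hypothetical path; .npy arrays of sampled audio work the same way):
python demo.py --audio_path ./my_audio.wav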
Download the MEAD dataset from here.
Use the reconstruction method from EMOCA v2 to reconstruct FLAME parameters from MEAD.
Open an issue if you're having trouble processing MEAD; we may be able to provide the exact parameters.
Make a copy of ./DEEPTalk/checkpoint/TH-VQVAE/config_TH-VQVAE.json and change arguments such as data.data_dir or name to train your own model.
Then run
cd DEEPTalk
python train_VQVAE.py --config {your config path}
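For reference, the arguments mentioned above typically sit in the config roughly like this (a minimal sketch with hypothetical values; consult config_TH-VQVAE.json for the actual schema, which also applies to the stage 1 and stage 2 configs below):
{
  "name": "my_experiment",
  "data": {
    "data_dir": "/path/to/processed/MEAD"
  }
}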
Make a copy of ./DEEPTalk/checkpoint/DEEPTalk/config_stage1.json and change arguments such as data.data_dir or name to train your own model.
Then run
cd DEEPTalk
python train_DEEPTalk_stage1.py --DEEPTalk_config {your config path}
Make a copy of ./DEEPTalk/checkpoint/DEEPTalk/config.json and change arguments such as data.data_dir or name to train your own model.
Then run
cd DEEPTalk
python train_DEEPTalk_stage2.py --DEEPTalk_config {your config path} --checkpoint {stage1 trained model checkpoint path}
We gratefully acknowledge the open-source projects that served as the foundation for our work.
This code is released under the MIT License.
Please note that our project relies on various other libraries, including FLAME, PyTorch3D, and Spectre, as well as several datasets.