This repo is the official implementation of "Generating Human Motion in 3D Scenes from Text Descriptions".
[2024/11/02] We release the training code.
[2024/10/21] We release the visualization code.
[2024/06/09] We first release the test & evaluation code.
conda create -n most python=3.9
conda activate most
# install pytorch
conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.6 -c pytorch -c nvidia
# install pytorch3d
pip install pytorch3d-0.7.2-cp39-cp39-linux_x86_64.whl
# install other requirements
cat requirements.txt | sed -e '/^\s*-.*$/d' -e '/^\s*#.*$/d' -e '/^\s*$/d' | awk '{split($0, a, "#"); if (length(a) > 1) print a[1]; else print $0;}' | awk '{split($0, a, "@"); if (length(a) > 1) print a[2]; else print $0;}' | xargs -n 1 pip install
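To see what this filtering pipeline does, here is a toy demonstration on a made-up requirements file (the package names and URL are placeholders): it drops editable-install lines, comments, and blank lines, strips trailing comments, and keeps only the URL part of `name @ url` entries.

```shell
# demo: run the same filter over a toy requirements file (contents are made up)
cat > /tmp/req_demo.txt <<'EOF'
# a comment line
-e .
numpy==1.23  # pinned version
mypkg @ https://example.com/mypkg.whl
EOF
cat /tmp/req_demo.txt \
  | sed -e '/^\s*-.*$/d' -e '/^\s*#.*$/d' -e '/^\s*$/d' \
  | awk '{split($0, a, "#"); if (length(a) > 1) print a[1]; else print $0;}' \
  | awk '{split($0, a, "@"); if (length(a) > 1) print a[2]; else print $0;}'
# prints the cleaned entries: numpy==1.23 and https://example.com/mypkg.whl
```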
# install MoST lib
pip install -e . --no-build-isolation --no-deps
NOTE:
- pytorch3d: download link.
- If you want to run stage 1, please uncomment shapely, tenacity, openai, and scikit-learn in requirements.txt.
- Download ScanNet v2 from link. We only need the files ending with *_vh_clean_2.ply, *_vh_clean.aggregation.json, and *_vh_clean_2*segs.json.
- Link to data/:
mkdir data
ln -s /path/to/scannet data/ScanNet
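If you want to sanity-check the link before preprocessing, a quick dry run with a placeholder directory (the path here is made up) looks like:

```shell
# dry run: link a placeholder directory instead of the real ScanNet path
mkdir -p data /tmp/scannet_placeholder
ln -sfn /tmp/scannet_placeholder data/ScanNet
ls -l data/ScanNet    # shows the symlink and its target
```

`-sfn` replaces any stale link in place, which is convenient when you move the dataset later.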
- Preprocess by running:
python tools/preprocess_scannet.py
Files will be saved in data/scannet_preprocess.
- Download the HUMANISE dataset from link.
- Link to data/:
mkdir data
ln -s /path/to/humanise data/HUMANISE
(Only needed if you want to train the models yourself.)
- Please follow HUMOR to download and preprocess the AMASS dataset.
- Link to data/:
ln -s /path/to/amass_processed data/amass_preprocess
- Download SMPLX models from link.
- Put the smplx folder under the data/smpl_models folder:
mkdir data/smpl_models
mv smplx data/smpl_models/
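After the steps above, the data/ directory should roughly look like the following (the exact contents of each folder depend on your downloads; this sketch is an assumption, not a verified listing):

```
data/
├── ScanNet -> /path/to/scannet
├── scannet_preprocess/
├── HUMANISE -> /path/to/humanise
├── amass_preprocess -> /path/to/amass_processed
└── smpl_models/
    └── smplx/
```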
- Pretrained weights are shared in link. Please download and unzip the archive, then put the most_release folder under the out folder:
mv most_release out/release
Here we use ground-truth object detection results for ScanNet scenes (from the HUMANISE dataset). If you want to test on a new scene, please follow GroupFree3D to obtain object bounding boxes.
python tools/locate_target.py -c configs/locate/locate_chatgpt.yaml
We use the Azure OpenAI service; please refer to this link and this link.
python tools/generate_results.py -c configs/test/generate.yaml
The results will be saved in out/test.
python tools/evaluate_results.py -c configs/test/evaluate.yaml
The generated results are shared in link. You can use your own generated results, or download ours and unzip them as the out/test folder.
We use the wis3d library to visualize the results. To prepare the visualization:
python tools/visualize_results.py -c configs/test/visualize.yaml
Then, in a terminal:
wis3d --vis_dir out/vis3d --host ${HOST} --port ${PORT}
You can then view the results at ${HOST}:${PORT}.
Train the trajectory model:
python tools/train_net.py -c configs/train/trajgen/traj_amass.yaml task amass_traj
Train the motion model:
python tools/train_net.py -c configs/train/motiongen/motion_amass.yaml task amass_motion
The outputs and models will be saved in out/train/.
Train the trajectory model:
python tools/train_net.py -c configs/train/trajgen/traj_humanise.yaml task humanise_traj resume True resume_model_dir out/train/amass_traj/model
Train the motion model:
python tools/train_net.py -c configs/train/motiongen/motion_humanise.yaml task humanise_motion resume True resume_model_dir out/train/amass_motion/model
@inproceedings{cen2024text_scene_motion,
title={Generating Human Motion in 3D Scenes from Text Descriptions},
  author={Cen, Zhi and Pi, Huaijin and Peng, Sida and Shen, Zehong and Yang, Minghui and Zhu, Shuai and Bao, Hujun and Zhou, Xiaowei},
booktitle={CVPR},
year={2024}
}