This is the official PyTorch implementation of VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data.
pip install -r requirement.txt
python setup.py develop
The directory tree should look like this:
${ROOT}
|-- data
    |-- MSCOCO
    |   |-- annotations
    |   |   |-- person_keypoints_train2017.json
    |   |-- images
    |   |   |-- train2017
    |-- MuCo-3DHP
    |   |-- images
    |   |   |-- augmented_set
    |   |   |-- unaugmented_set
    |   |-- MuCo-3DHP.json
    |-- MuPoTS-3D
    |   |-- cameras.pkl
    |   |-- images
    |   |   |-- TS1
    |   |   |-- ...
    |   |   |-- TS20
    |   |-- MuPoTS-3D.json
    |-- panoptic-toolbox
    |   |-- data
    |   |-- data_hmor
    |   |   |-- 160224_haggling1
    |   |   |-- 160224_mafia1
    |   |   |-- ...
    |   |   |-- train_cam.pkl
    |   |   |-- val_cam.pkl
    |   |-- clean_train.pkl
    |   |-- clean_valid.pkl
    |   |-- pack.py
|-- models
    |-- pose_resnet_152_384x288.pth.tar
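Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `find_missing` and the `EXPECTED_PATHS` list are hypothetical, and the paths are copied from the tree (entries elided with `...` are not checked). Some entries may only exist once all download and packing steps are complete.

```python
from pathlib import Path

# Relative paths copied from the directory tree; "..." entries are skipped.
EXPECTED_PATHS = [
    "data/MSCOCO/annotations/person_keypoints_train2017.json",
    "data/MSCOCO/images/train2017",
    "data/MuCo-3DHP/images/augmented_set",
    "data/MuCo-3DHP/images/unaugmented_set",
    "data/MuCo-3DHP/MuCo-3DHP.json",
    "data/MuPoTS-3D/cameras.pkl",
    "data/MuPoTS-3D/images/TS1",
    "data/MuPoTS-3D/images/TS20",
    "data/MuPoTS-3D/MuPoTS-3D.json",
    "data/panoptic-toolbox/data",
    "data/panoptic-toolbox/data_hmor/train_cam.pkl",
    "data/panoptic-toolbox/data_hmor/val_cam.pkl",
    "data/panoptic-toolbox/clean_train.pkl",
    "data/panoptic-toolbox/clean_valid.pkl",
    "data/panoptic-toolbox/pack.py",
    "models/pose_resnet_152_384x288.pth.tar",
]


def find_missing(root):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

Calling `find_missing(".")` from `${ROOT}` returns an empty list when every listed file and directory is present, and otherwise the entries that still need to be downloaded or generated.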
- Download MSCOCO parsed data [data].
- Download MuCo parsed and composited data [data] provided by [RootNet].
- Download MuPoTS parsed data [images][annotations].
- Download Panoptic by following the instructions in panoptic-toolbox and extract the data under `${ROOT}/data/panoptic-toolbox/data`. Download the [annotations] provided by [HMOR], put them under `${ROOT}/data/panoptic-toolbox/`, and run `python pack.py`.
- Download the pretrained ResNet-152 model pose_resnet_152_384x288.pth.tar provided by [Simple Baselines] and put it under `${ROOT}/models/`.
We train on 4 NVIDIA V100 GPUs, each with 32 GB of memory.
Train the 2D pose estimation and human detection backbone with 2 GPUs (Panoptic configuration):
python run/train_3d.py --cfg configs/coco/backbone_res152_mix_panoptic.yaml --gpus 2
Train the root depth estimator and 3D pose estimator with 4 GPUs (Panoptic configuration):
python run/train_3d.py --cfg configs/panoptic/synthesize_full.yaml --gpus 4
Train the 2D pose estimation and human detection backbone with 2 GPUs (MuCo configuration):
python run/train_3d.py --cfg configs/coco/backbone_res152_mix_muco.yaml --gpus 2
Train the root depth estimator and 3D pose estimator with 4 GPUs (MuCo configuration):
python run/train_3d.py --cfg configs/muco/synthesize_full.yaml --gpus 4
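The two training stages for each configuration can be chained from a small driver script. This is a sketch under assumptions rather than something shipped with the repository: `TRAIN_STAGES`, `build_train_commands`, and `run_training` are hypothetical names, and running the stages sequentially assumes the 3D estimator stage is meant to follow the backbone stage, as the ordering of the commands suggests. The config paths and GPU counts are copied from the commands above.

```python
import shlex
import subprocess

# (config path, GPU count) pairs copied from the README commands, in listed order.
TRAIN_STAGES = {
    "panoptic": [
        ("configs/coco/backbone_res152_mix_panoptic.yaml", 2),
        ("configs/panoptic/synthesize_full.yaml", 4),
    ],
    "muco": [
        ("configs/coco/backbone_res152_mix_muco.yaml", 2),
        ("configs/muco/synthesize_full.yaml", 4),
    ],
}


def build_train_commands(setting):
    """Return the training commands for `setting` as argument lists."""
    return [
        ["python", "run/train_3d.py", "--cfg", cfg, "--gpus", str(gpus)]
        for cfg, gpus in TRAIN_STAGES[setting]
    ]


def run_training(setting, dry_run=True):
    """Print each stage's command; execute it only when dry_run is False."""
    for cmd in build_train_commands(setting):
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence if a stage exits with an error.
            subprocess.run(cmd, check=True)
```

With the default `dry_run=True`, `run_training("panoptic")` only prints the two commands, which makes it easy to review them before launching a long run from `${ROOT}`.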
Our pre-trained models are available for download from Google Drive or OneDrive.
Inference with 4 GPUs (Panoptic configuration):
python run/validate_3d.py --cfg configs/panoptic/synthesize_full_inference.yaml --gpus 4
Inference with 4 GPUs (MuCo configuration):
python run/validate_3d.py --cfg configs/muco/synthesize_full_inference.yaml --gpus 4
The results are written to `${ROOT}/mupots_results/`. Use the evaluation code provided with the MuPoTS-3D dataset to evaluate them.
If our code helps your research, please consider citing the following paper:
@inproceedings{su2022virtualpose,
title={VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data},
author={Su, Jiajun and Wang, Chunyu and Ma, Xiaoxuan and Zeng, Wenjun and Wang, Yizhou},
booktitle={European Conference on Computer Vision},
pages={55--71},
year={2022},
organization={Springer}
}
This repo is built on https://github.com/microsoft/voxelpose-pytorch.