We provide four models of varying size: Sapiens-0.3B, Sapiens-0.6B, Sapiens-1B, and Sapiens-2B. In general, performance improves with model size.
We use an off-the-shelf detector for top-down pose estimation. Please install the detector, download its checkpoint, and set the paths appropriately.
- Install mmdet:
export SAPIENS_ROOT=/path/to/sapiens
cd $SAPIENS_ROOT/engine; pip install -e .
cd $SAPIENS_ROOT/cv; pip install -e .
cd $SAPIENS_ROOT/det; pip install -e .
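Before running the three editable installs, it can help to confirm that SAPIENS_ROOT points at a checkout containing the engine, cv, and det directories. The following is a minimal sketch of such a check; the function name check_sapiens_layout is hypothetical and not part of the repository.

```shell
# Hypothetical helper: report which install directories are missing under a root.
# Prints one "missing: ..." line per absent directory and returns non-zero if any is absent.
check_sapiens_layout() {
    local root="$1"
    local status=0
    for sub in engine cv det; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing: $root/$sub"
            status=1
        fi
    done
    return "$status"
}

# SAPIENS_ROOT falls back to the placeholder path, so this is safe to run as-is.
if check_sapiens_layout "${SAPIENS_ROOT:-/path/to/sapiens}"; then
    echo "layout OK, ready to run the pip installs"
fi
```

If the check reports a missing directory, fix SAPIENS_ROOT before running the pip install commands.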
You can also skip the bounding box detector by removing the --det-config and --det-checkpoint arguments from the scripts. In this case, the entire image is used as input.
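One way to make the detector optional without hand-editing the command each time is to build the two flags together and leave them out as a pair. This is only a sketch: the helper det_args and the example paths are hypothetical, and the actual scripts may assemble their arguments differently.

```shell
# Hypothetical helper: print both detector flags, or nothing if either path is empty.
# Note: this simple form assumes the paths contain no whitespace.
det_args() {
    if [ -n "$1" ] && [ -n "$2" ]; then
        printf '%s %s %s %s\n' --det-config "$1" --det-checkpoint "$2"
    fi
}

# With a detector: both flags are emitted.
echo "with detector:    $(det_args /path/to/det_config.py /path/to/det_checkpoint.pth)"

# Without a detector: no flags, so the entire image would be used as input.
echo "without detector: $(det_args "" "")"
```

The output of det_args could then be appended to the inference command in place of the hard-coded flags.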
17 keypoints: best for general in-the-wild scenarios with body keypoints only, adhering to the COCO keypoint format.
Coming Soon!
133 keypoints: offers the second-best generalization with body, face, hands, and feet keypoints, following the COCO-WholeBody keypoint format.
Coming Soon!
308 keypoints: predicts the largest number of keypoints, including 274 detailed face keypoints, following the Sociopticon keypoint format.
Model | Checkpoint Path |
---|---|
Sapiens-1B | $SAPIENS_LITE_CHECKPOINT_ROOT/pose/checkpoints/sapiens_1b/sapiens_1b_goliath_coco_wholebody_mpii_crowdpose_aic_best_goliath_AP_640_$MODE.pt2 |
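The checkpoint path in the table ends in $MODE, so it has to be expanded for the mode you run. Below is a small sketch of that expansion. It assumes MODE takes one of the values torchscript, bfloat16, or float16 (the script directory names), and the function name is hypothetical.

```shell
# Hypothetical helper: expand the Sapiens-1B pose checkpoint path for one mode.
# Assumption: valid modes are torchscript, bfloat16, and float16.
sapiens_1b_pose_checkpoint() {
    local mode="$1"
    case "$mode" in
        torchscript|bfloat16|float16) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "${SAPIENS_LITE_CHECKPOINT_ROOT}/pose/checkpoints/sapiens_1b/sapiens_1b_goliath_coco_wholebody_mpii_crowdpose_aic_best_goliath_AP_640_${mode}.pt2"
}

# Placeholder root so the snippet runs even when the variable is unset.
SAPIENS_LITE_CHECKPOINT_ROOT="${SAPIENS_LITE_CHECKPOINT_ROOT:-/path/to/checkpoints}"
CHECKPOINT="$(sapiens_1b_pose_checkpoint bfloat16)"

# Warn early instead of failing later inside the inference script.
[ -f "$CHECKPOINT" ] || echo "checkpoint not found: $CHECKPOINT"
```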
- Navigate to the script directory for your mode (one of torchscript, bfloat16, or float16):
cd $SAPIENS_LITE_ROOT/scripts/demo/[torchscript,bfloat16,float16]
- For 17-keypoint estimation (uncomment your model config line for inference):
./pose_keypoints17.sh
- For 133-keypoint estimation (uncomment your model config line for inference):
./pose_keypoints133.sh
- For 308-keypoint estimation (uncomment your model config line for inference; we recommend using face crops for better results):
./pose_keypoints308.sh
Define INPUT for your image directory and OUTPUT for results. Visualization and keypoints in JSON format are saved to OUTPUT.
Customize LINE_THICKNESS, RADIUS, and KPT_THRES as needed. Adjust BATCH_SIZE, JOBS_PER_GPU, TOTAL_GPUS, and VALID_GPU_IDS for multi-GPU configurations.
Note: we skip the keypoint skeleton visualization in the interest of speed.
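As an illustration of the settings described above, the sketch below sets each variable and shows one possible way a list of images could be divided across jobs. All values are placeholders, the helper images_for_job is hypothetical, and the split shown (total jobs = JOBS_PER_GPU x TOTAL_GPUS, images spread as evenly as possible) is an assumption, not necessarily how the provided scripts do it.

```shell
# Placeholder values; only the variable names come from the scripts.
INPUT=/path/to/images
OUTPUT=/path/to/output
LINE_THICKNESS=3
RADIUS=6
KPT_THRES=0.3
BATCH_SIZE=8
JOBS_PER_GPU=1
TOTAL_GPUS=2
VALID_GPU_IDS="0 1"   # written as a plain string here; the scripts may use another form

# Assumed scheme: one job per (GPU, slot) pair.
TOTAL_JOBS=$((JOBS_PER_GPU * TOTAL_GPUS))

# Hypothetical helper: how many of $1 images job $2 (0-based) gets across $3 jobs.
# The first (total % jobs) jobs each take one extra image.
images_for_job() {
    local total="$1"
    local job="$2"
    local jobs="$3"
    local base=$((total / jobs))
    local extra=$((total % jobs))
    if [ "$job" -lt "$extra" ]; then
        echo $((base + 1))
    else
        echo "$base"
    fi
}

echo "total jobs: $TOTAL_JOBS"
echo "job 0 would handle $(images_for_job 101 0 "$TOTAL_JOBS") of 101 images"
```

Raising JOBS_PER_GPU increases the number of concurrent jobs per device, so it is worth checking GPU memory headroom before doing so.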