BlendMimic3D is a pioneering synthetic dataset developed using Blender, designed to enhance Human Pose Estimation (HPE) research. This dataset features diverse scenarios including self-occlusions, object-based occlusions, and out-of-frame occlusions, tailored for the development and testing of advanced HPE models. All data were extracted using our BlendMimic3D-DataExtractor. For more information, interactive demos, and downloads related to the BlendMimic3D project, please visit our project webpage.
BlendMimic3D bridges the gap in existing datasets by offering realistic scenarios, facilitating the development of algorithms capable of handling complex real-world challenges.
Existing HPE datasets have significantly contributed to the advancement of pose estimation technologies. However, the complexity and variability of real-world scenarios demand more versatile and comprehensive datasets. BlendMimic3D addresses these needs by providing detailed scenarios that mimic real-life challenges.
- Realistic Environments: Spans simple environments resembling the Human3.6M dataset as well as shopping activities and multi-person contexts, simulating real-world settings.
- Diverse Occlusion Scenarios: Specifically addresses self-occlusions, object-based occlusions, and out-of-frame occlusions.
- Multi-Perspective Capture: Utilizes four cameras to capture diverse human movements and interactions from multiple angles.
- Pixel-Perfect Annotations: Offers detailed annotations for 2D keypoints, 3D keypoints, and occlusion data.
- Videos: A collection of videos capturing a range of actions from four different camera perspectives.
- Camera Parameters: Includes intrinsic and extrinsic parameters for camera calibration.
- 3D and 2D Keypoint Positions: Provides both 3D and 2D positions of keypoints for comprehensive pose estimation.
- Occlusion Data: Contains a binary array indicating, for each frame, which keypoints are occluded.
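To make these components concrete, the sketch below projects the 3D keypoints into one camera view using the intrinsic/extrinsic parameters and masks occluded joints with the occlusion array. The file names, array shapes, and the convention that 1 marks an occluded keypoint are assumptions for illustration, not the dataset's guaranteed format.

```python
import numpy as np

# Illustrative file names and formats; the actual on-disk layout may differ.
X = np.load("S1/D3_Positions/action.npy")      # (frames, joints, 3) world-space 3D keypoints
occ = np.load("S1/Occlusions/action.npy")      # (frames, joints) binary flags (assumed 1 = occluded)
K = np.load("S1/Cameras/cam0_intrinsics.npy")  # 3x3 intrinsic matrix
R = np.load("S1/Cameras/cam0_R.npy")           # 3x3 rotation (extrinsic)
t = np.load("S1/Cameras/cam0_t.npy")           # (3,) translation (extrinsic)

# World -> camera coordinates: X_cam = R @ X + t, applied per joint.
X_cam = X @ R.T + t                            # (frames, joints, 3)

# Perspective projection with the intrinsics: divide by depth, then map to pixels.
xy = X_cam[..., :2] / X_cam[..., 2:3]          # normalized image coordinates
uv = xy @ K[:2, :2].T + K[:2, 2]               # pixel coordinates, (frames, joints, 2)

# Mask out occluded joints before, e.g., computing a 2D reprojection error.
uv_visible = np.where(occ[..., None] == 0, uv, np.nan)
```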
BlendMimic3D is designed for versatility and can be easily integrated with existing HPE models. The dataset is structured to facilitate direct comparisons with Human3.6M, enabling researchers to evaluate and benchmark their models effectively.
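Because the dataset is structured for direct comparison with Human3.6M, standard evaluation protocols carry over. A minimal sketch of MPJPE (mean per-joint position error), the usual Human3.6M metric:

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: average Euclidean distance between
    predicted and ground-truth 3D joints, the standard Human3.6M metric."""
    # pred, gt: (frames, joints, 3) arrays in the same units (e.g. millimetres).
    return np.linalg.norm(pred - gt, axis=-1).mean()
```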
The BlendMimic3D dataset is organized to ensure comprehensive coverage of a broad range of human pose estimation challenges:
- Scenarios: Three distinct scenarios ranging from simple environments to more complex and realistic settings.
- Subjects: Three synthetic subjects, each performing a variety of actions; scenes range from single-person to multi-person, with up to three subjects present at once.
- Videos: A total of 128 videos, each with an average duration of 20 seconds (600 frames at 30 fps), capturing a wide array of actions from four different camera perspectives.
- Data Structure: For each subject, the dataset is organized into folders for Videos, D2_Positions, D3_Positions, Occlusions, and Cameras, containing files with their respective information.
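As a concrete illustration of this layout, the sketch below walks one plausible on-disk arrangement. The folder names follow the description above; the per-action file naming and the `.npy` format are assumptions, not part of the dataset specification.

```python
from pathlib import Path

root = Path("BlendMimic3D")
for subject in ("S1", "S2", "S3"):
    subj = root / subject
    # Use the 3D keypoint files to enumerate actions, then pair up the other modalities.
    for kp3d_file in sorted((subj / "D3_Positions").glob("*.npy")):
        action = kp3d_file.stem
        kp2d_file = subj / "D2_Positions" / f"{action}.npy"
        occ_file = subj / "Occlusions" / f"{action}.npy"
        videos = sorted((subj / "Videos").glob(f"{action}*"))  # one clip per camera
        cameras = subj / "Cameras"
        # ... load each modality and match it with the corresponding camera parameters
```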
This dataset is designed to reflect the structure of the Human3.6M dataset, categorized by subject and action. The synthetic subjects, designated S1, S2, and S3, each cover 14 distinct actions: S1 focuses on self-occlusions, S2 on object-based and out-of-frame occlusions, and S3 combines occlusions with multi-person scenarios in a retail environment.
If you use our dataset in your research, please cite our paper:
[citation format]
This work was supported by LARSyS funding (DOI: 10.54499/LA/P/0083/2020, 10.54499/UIDP/50009/2020, and 10.54499/UIDB/50009/2020) and by grant 10.54499/2022.07849.CEECIND/CP1713/CT0001, through Fundação para a Ciência e a Tecnologia, and by the SmartRetail project [PRR - C645440011-00000062], through IAPMEI - Agência para a Competitividade e Inovação.