Because of the licenses of the respective raw datasets, we recommend that users download the raw data from the official websites and then organize it as described in this guide. The detailed steps are as follows.
- Download ScanNet v2 data HERE. Link or move the folder to this level of directory.
- Download 3RScan data HERE. Link or move the folder to this level of directory.
- Download Matterport3D data HERE. Link or move the folder to this level of directory.
- Download ARKitScenes data HERE. Link or move the folder to this level of directory.
- Download EmbodiedScan data and extract it here. Currently, please fill in the form, and we will reply with the data download link.
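For the "link" option in the download steps, a minimal Python sketch follows. It assumes the raw dataset was downloaded somewhere outside the repository. The helper name `link_dataset` and the example path are hypothetical and not part of EmbodiedScan.

```python
from pathlib import Path


def link_dataset(source, data_root, name):
    """Symlink a downloaded dataset folder into data_root under `name`.

    `link_dataset` is an illustrative helper, not an EmbodiedScan API.
    """
    source = Path(source).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {source}")
    data_root = Path(data_root)
    data_root.mkdir(parents=True, exist_ok=True)
    target = data_root / name
    if target.is_symlink() or target.exists():
        # Something is already in place; leave it untouched.
        return target
    target.symlink_to(source, target_is_directory=True)
    return target


# Example call (the source path is a placeholder):
# link_dataset("/path/to/downloads/scannet", "data", "scannet")
```

Linking rather than moving keeps the original download intact. The folder names under `data` should match the ones in the directory structure in this guide (`scannet`, `3rscan`, `matterport3d`, `arkitscenes`).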
The directory structure should then look like this:
data
├── scannet
│ ├── scans
│ │ ├── <scene_id>
│ │ ├── ...
├── 3rscan
│ ├── <scene_id>
│ ├── ...
├── matterport3d
│ ├── <scene_id>
│ ├── ...
├── arkitscenes
│ ├── Training
│ │ ├── <scene_id>
│ │ ├── ...
│ ├── Validation
│ │ ├── <scene_id>
│ │ ├── ...
├── embodiedscan_occupancy
├── embodiedscan_infos_train.pkl
├── embodiedscan_infos_val.pkl
├── embodiedscan_infos_test.pkl
├── embodiedscan_train_vg.json
├── embodiedscan_val_vg.json
├── embodiedscan_test_vg.json
├── embodiedscan_train_mini_vg.json (mini set)
├── embodiedscan_val_mini_vg.json (mini set)
├── embodiedscan_train_vg_all.json (w/ complex prompts)
├── embodiedscan_val_vg_all.json (w/ complex prompts)
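Before moving on, it can help to confirm that the layout matches the tree above. The following is a small sketch, not part of the EmbodiedScan tooling. `missing_entries` is a hypothetical helper, and the list of required entries is an assumption that treats the mini-set and complex-prompt files as optional.

```python
from pathlib import Path

# Entries taken from the directory tree above. Treating the
# *_mini_vg.json and *_vg_all.json files as optional is an assumption.
EXPECTED_ENTRIES = [
    "scannet/scans",
    "3rscan",
    "matterport3d",
    "arkitscenes/Training",
    "arkitscenes/Validation",
    "embodiedscan_occupancy",
    "embodiedscan_infos_train.pkl",
    "embodiedscan_infos_val.pkl",
    "embodiedscan_infos_test.pkl",
    "embodiedscan_train_vg.json",
    "embodiedscan_val_vg.json",
    "embodiedscan_test_vg.json",
]


def missing_entries(data_root="data"):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

Calling `missing_entries("data")` from the project root returns an empty list when every listed folder and file is present. It follows symlinks, so linked dataset folders count as present.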
- Enter the project root directory and extract the images by running:
python embodiedscan/converter/generate_image_scannet.py --dataset_folder data/scannet/
# generate_image_scannet.py can be very slow because it extracts images from .sens files. Add --fast to generate only the images used by EmbodiedScan.
python embodiedscan/converter/generate_image_3rscan.py --dataset_folder data/3rscan/
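The two extraction commands can also be driven from a single Python script, for example to toggle `--fast` in one place. This is only a sketch under a few assumptions: it is run from the project root, `build_commands` and `run_all` are hypothetical helper names, and `--fast` is passed only to the ScanNet converter because that is the only script the guide mentions it for.

```python
import subprocess
import sys


def build_commands(data_root="data", fast=False):
    """Build the argument lists for the two image-extraction scripts."""
    scannet = [
        sys.executable,
        "embodiedscan/converter/generate_image_scannet.py",
        "--dataset_folder",
        f"{data_root}/scannet/",
    ]
    if fast:
        # Only generate the images used by EmbodiedScan.
        scannet.append("--fast")
    rscan = [
        sys.executable,
        "embodiedscan/converter/generate_image_3rscan.py",
        "--dataset_folder",
        f"{data_root}/3rscan/",
    ]
    return [scannet, rscan]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Example call from the project root:
# run_all(build_commands(fast=True))
```

Using `sys.executable` runs the converters with the same interpreter (and environment) as the calling script, and `check=True` raises an error if a converter exits with a non-zero status.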
After extraction, the directory structure should look like this:
data
├── scannet
│ ├── scans
│ │ ├── <scene_id>
│ │ ├── ...
│ ├── posed_images
│ │ ├── <scene_id>
│ │ │ ├── *.jpg
│ │ │ ├── *.png
│ │ ├── ...
├── 3rscan
│ ├── <scene_id>
│ │ ├── sequence
│ │ │ ├── *.color.jpg
│ │ │ ├── *.depth.pgm
│ ├── ...
├── matterport3d
│ ├── <scene_id>
│ ├── ...
├── arkitscenes
│ ├── Training
│ │ ├── <scene_id>
│ │ ├── ...
│ ├── Validation
│ │ ├── <scene_id>
│ │ ├── ...
├── embodiedscan_occupancy
├── embodiedscan_infos_train.pkl
├── embodiedscan_infos_val.pkl
├── embodiedscan_infos_test.pkl
├── embodiedscan_train_vg.json
├── embodiedscan_val_vg.json
├── embodiedscan_train_mini_vg.json
├── embodiedscan_val_mini_vg.json
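To spot scenes where image extraction produced nothing, the tree above can be checked with a short script. This is a sketch based only on the file patterns shown in the tree (`posed_images/<scene_id>/*.jpg` for ScanNet and `<scene_id>/sequence/*.color.jpg` for 3RScan). `scenes_missing_images` is a hypothetical helper name.

```python
from pathlib import Path


def _has_match(folder, pattern):
    """True if folder exists and contains at least one file matching pattern."""
    return folder.is_dir() and any(folder.glob(pattern))


def scenes_missing_images(data_root="data"):
    """List scene ids that have no extracted color images."""
    root = Path(data_root)
    missing = {"scannet": [], "3rscan": []}

    scans = root / "scannet" / "scans"
    if scans.is_dir():
        for scene in sorted(p for p in scans.iterdir() if p.is_dir()):
            posed = root / "scannet" / "posed_images" / scene.name
            if not _has_match(posed, "*.jpg"):
                missing["scannet"].append(scene.name)

    rscan = root / "3rscan"
    if rscan.is_dir():
        for scene in sorted(p for p in rscan.iterdir() if p.is_dir()):
            if not _has_match(scene / "sequence", "*.color.jpg"):
                missing["3rscan"].append(scene.name)

    return missing
```

A scene in the result is not necessarily an error. For instance, if `--fast` was used, scenes that EmbodiedScan does not use may legitimately have no images.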
- Also extract the EmbodiedScan occupancy annotations by running the following from the project root:
python embodiedscan/converter/extract_occupancy_ann.py --src data/embodiedscan_occupancy --dst data