OV-3DET: Open-Vocabulary Point-Cloud Object Detection without 3D Annotation

OV-3DET: An Open Vocabulary 3D DETector.

OV-3DET: Open-Vocabulary Point-Cloud Object Detection without 3D Annotation,
Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer and Shanghang Zhang,
Accepted to CVPR 2023.

Features

Detects 3D objects specified by text prompts.
Training OV-3DET requires no 3D annotations.

Installation

See the installation instructions (INSTALL.md).

Dataset preparation

See dataset instructions, or directly download the processed dataset.

Training OV-3DET

Phase 1

Learn to localize 3D objects from a pretrained 2D detector:

# ScanNet
bash scripts/scannet_train_loc.sh
# SUN RGB-D
bash scripts/sunrgbd_train_loc.sh

Phase 2

Learn to classify 3D objects from a pretrained 2D vision-language model:

# ScanNet
bash scripts/scannet_train_dtcc.sh
# SUN RGB-D
bash scripts/sunrgbd_train_dtcc.sh
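
The two phases can be chained for one dataset with a small wrapper. The following is a minimal sketch, not part of the repository: `run_phases` and the `DRY_RUN` variable are hypothetical names introduced here, and the sketch assumes it is run from the repository root and that Phase 2 is meant to follow Phase 1, as the numbering suggests. Any checkpoint paths are whatever the training scripts themselves configure.

```shell
#!/usr/bin/env bash
# Sketch: run both OV-3DET training phases in order for one dataset.
# Assumes the current directory is the repository root (where scripts/ lives).
set -euo pipefail

# run_phases is a hypothetical helper, not part of the repository.
# $1: dataset key, either "scannet" or "sunrgbd".
# Set DRY_RUN=1 to print the commands instead of executing them.
run_phases() {
  local dataset="$1"
  case "$dataset" in
    scannet|sunrgbd) ;;
    *) echo "unknown dataset: $dataset" >&2; return 1 ;;
  esac

  local phase script
  # "loc" is the Phase 1 script suffix, "dtcc" the Phase 2 suffix.
  for phase in loc dtcc; do
    script="scripts/${dataset}_train_${phase}.sh"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "bash $script"
    else
      bash "$script"
    fi
  done
}
```

For example, `DRY_RUN=1 run_phases sunrgbd` prints the two commands without launching training, which is a cheap way to check the script names before a long run.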

Evaluate OV-3DET

To evaluate OV-3DET, run:

# ScanNet
bash scripts/evaluate_scannet.sh
# SUN RGB-D
bash scripts/evaluate_sunrgbd.sh
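
To evaluate on both datasets in one go and keep the output for later comparison, something like the sketch below may help. `evaluate_all` and the `eval_logs` default directory are hypothetical names introduced here, not part of the repository, and the sketch assumes it is run from the repository root.

```shell
#!/usr/bin/env bash
# Sketch: run both evaluation scripts and keep a log file per dataset.
set -euo pipefail

# evaluate_all is a hypothetical helper, not part of the repository.
# $1: directory for log files (default: eval_logs).
evaluate_all() {
  local log_dir="${1:-eval_logs}"
  mkdir -p "$log_dir"

  local dataset script
  for dataset in scannet sunrgbd; do
    script="scripts/evaluate_${dataset}.sh"
    if [ ! -f "$script" ]; then
      echo "missing $script (run from the repository root)" >&2
      return 1
    fi
    # tee keeps the output on screen while also writing the log file.
    bash "$script" 2>&1 | tee "$log_dir/${dataset}.log"
  done
}
```

Calling `evaluate_all` with no argument writes `eval_logs/scannet.log` and `eval_logs/sunrgbd.log`. Because of `pipefail`, a failing evaluation script stops the run instead of being masked by `tee`.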

Pretrained Models

We provide pretrained model weights for both Phase 1 and Phase 2.

Dataset	Phase	Epochs	Model weights
ScanNet	1	400	weights
ScanNet	2	50	weights
SUN RGB-D	1	400	weights
SUN RGB-D	2	50	weights

Acknowledgement

This codebase is built on 3DETR [1], CLIP [2] and Detic [3]. We sincerely appreciate their contributions!

[1] An end-to-end transformer model for 3d object detection. ICCV. 2021.
[2] Learning transferable visual models from natural language supervision. ICML. 2021.
[3] Detecting twenty-thousand classes using image-level supervision. ECCV. 2022.

Citation

If you find this repository helpful, please consider citing our work:

@inproceedings{lu2023open,
  title={Open-Vocabulary Point-Cloud Object Detection without 3D Annotation},
  author={Lu, Yuheng and Xu, Chenfeng and Wei, Xiaobao and Xie, Xiaodong and Tomizuka, Masayoshi and Keutzer, Kurt and Zhang, Shanghang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}

