NanoDet

Super fast and lightweight anchor-free object detection model. Real-time on mobile devices.

⚡Super lightweight: Model file is only 1.8 MB.
⚡Super fast: 97fps(10.23ms) on mobile ARM CPU.
😎Training friendly: Much lower GPU memory cost than other models. A batch size of 80 fits on a GTX 1060 6GB.
😎Easy to deploy: Provides a C++ implementation and an Android demo based on the ncnn inference framework.

NEWS!!!

[2021.02.03] EfficientNet-Lite and RepVGG backbones are now supported. Please check the config folder.
[2021.01.10] NanoDet-g, which has a lower memory access cost and is designed for edge NPUs or GPUs, is now available! Check config/nanodet-g.yml and download: COCO pre-trained model(Google Drive) | (Baidu Netdisk) code:otcd
[2020.12.19] MNN Python and C++ demos are available.
[2020.12.05] Pascal VOC .xml format datasets are now supported! Refer to config/nanodet_custom_xml_dataset.yml.
[2020.12.01] Many thanks to nihui: you can now try NanoDet running in a web browser! 👉 https://nihui.github.io/ncnn-webassembly-nanodet/

Benchmarks

| Model | Resolution | COCO mAP | Latency (ARM 4xCore) | FLOPS | Params | Model Size (ncnn fp16) |
|---|---|---|---|---|---|---|
| NanoDet-m | 320*320 | 20.6 | 10.23ms | 0.72B | 0.95M | 1.8MB |
| NanoDet-m | 416*416 | 21.7 | 16.44ms | 1.2B | 0.95M | 1.8MB |
| NanoDet-g | 416*416 | 22.9 | Not Designed For ARM | 4.2B | 3.81M | 7.7MB |
| YoloV3-Tiny | 416*416 | 16.6 | 37.6ms | 5.62B | 8.86M | 33.7MB |
| YoloV4-Tiny | 416*416 | 21.7 | 32.81ms | 6.96B | 6.06M | 23.0MB |

Note:

Performance is measured on Kirin 980(4xA76+4xA55) ARM CPU based on ncnn. You can test latency on your phone with ncnn_android_benchmark.
NanoDet mAP(0.5:0.95) is validated on the COCO val2017 dataset with no test-time augmentation.
YOLO mAP is taken from Scaled-YOLOv4: Scaling Cross Stage Partial Network.
NanoDet-g is designed for edge NPU, GPU or TPU with high parallel computing power but low memory bandwidth. It has much lower memory access cost than NanoDet-m.
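
As a quick check on how the latency column relates to the headline frame rate, the sketch below converts the per-frame latencies from the table into frames per second. The helper name `latency_ms_to_fps` is made up for this example; the numbers are the ones listed above.

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Latencies copied from the benchmark table (Kirin 980, ncnn).
latencies_ms = {
    "NanoDet-m 320*320": 10.23,
    "NanoDet-m 416*416": 16.44,
    "YoloV3-Tiny 416*416": 37.6,
    "YoloV4-Tiny 416*416": 32.81,
}

for name, ms in latencies_ms.items():
    print(f"{name}: {latency_ms_to_fps(ms):.1f} fps")
```

For NanoDet-m at 320*320 this gives roughly 97.8 fps, which matches the "97fps" figure once rounded down.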

NanoDet is an FCOS-style one-stage anchor-free object detection model that uses ATSS for target sampling and Generalized Focal Loss for classification and box regression. Please refer to these papers for more details.

FCOS: Fully Convolutional One-Stage Object Detection

ATSS: Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection

Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection
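
To make the box regression idea concrete, here is a minimal sketch of how an FCOS-style point plus a Generalized Focal Loss style distribution can be decoded into a box. It assumes each of the four sides (left, top, right, bottom) is predicted as logits over a small set of integer bins, that the distance is the expectation of that distribution, and that it is scaled by the feature-map stride. The function names, the bin count of 8, and the stride of 8 are illustrative assumptions, not values taken from the NanoDet code.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_distance(logits: Sequence[float], stride: float) -> float:
    """Expectation of the discrete distribution over bins 0..n-1, in pixels."""
    probs = softmax(logits)
    return stride * sum(i * p for i, p in enumerate(probs))


def decode_box(
    center: Tuple[float, float],
    side_logits: Sequence[Sequence[float]],
    stride: float,
) -> Tuple[float, float, float, float]:
    """Turn a point and four side distributions (l, t, r, b) into x1, y1, x2, y2."""
    cx, cy = center
    left, top, right, bottom = (expected_distance(l, stride) for l in side_logits)
    return (cx - left, cy - top, cx + right, cy + bottom)


# Hypothetical input: uniform logits over 8 bins for every side, stride 8.
uniform = [0.0] * 8
box = decode_box((100.0, 100.0), [uniform] * 4, stride=8.0)
print(box)
```

With uniform logits the expected bin is 3.5, so each side sits 28 pixels from the point. A sharper distribution moves the expectation toward its peak bin.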

Chinese introduction on Zhihu | QQ discussion group: 908606542 (answer: 炼丹)

Demo

Android demo

Android demo project is in demo_android_ncnn folder. Please refer to Android demo guide.

NCNN C++ demo

C++ demo based on ncnn is in demo_ncnn folder. Please refer to Cpp demo guide.

MNN demo

Inference using Alibaba's MNN framework is in the demo_mnn folder, which includes Python and C++ inference code. Please refer to the MNN demo guide.

Pytorch demo

First, install the requirements and set up NanoDet following the installation guide. Then download the COCO pre-trained weights:

👉COCO pretrain weight for torch>=1.6(Google Drive) | (Baidu Netdisk) code:6au1

👉COCO pretrain weight for torch<=1.5(Google Drive) | (Baidu Netdisk) code:topw

Inference images

python demo/demo.py image --config CONFIG_PATH --model MODEL_PATH --path IMAGE_PATH

Inference video

python demo/demo.py video --config CONFIG_PATH --model MODEL_PATH --path VIDEO_PATH

Inference webcam

python demo/demo.py webcam --config CONFIG_PATH --model MODEL_PATH --camid YOUR_CAMERA_ID

Install

Requirements

Linux or MacOS
CUDA >= 10.0
Python >= 3.6
Pytorch >= 1.3
Experimental support for Windows (Note: Windows does not support distributed training before PyTorch 1.7)
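
The requirements above can be checked before installing anything else. The following is a minimal sketch, not part of NanoDet: `parse_version` and `check_environment` are hypothetical helpers, and the thresholds are simply the ones listed above. It does not check the CUDA version.

```python
import platform
import sys
from typing import List, Optional, Tuple

MIN_PYTHON = (3, 6)
MIN_TORCH = (1, 3)
MIN_TORCH_WINDOWS_DISTRIBUTED = (1, 7)


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract (major, minor) from a version string such as '1.7.1+cu110'."""
    parts = []
    for piece in text.split("+")[0].split(".")[:2]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def check_environment(torch_version: Optional[str] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python >= 3.6 is required")
    if torch_version is None:
        try:
            import torch  # only needed when no version string is passed in
            torch_version = torch.__version__
        except ImportError:
            problems.append("PyTorch is not installed")
    if torch_version is not None:
        version = parse_version(torch_version)
        if version < MIN_TORCH:
            problems.append("PyTorch >= 1.3 is required")
        elif platform.system() == "Windows" and version < MIN_TORCH_WINDOWS_DISTRIBUTED:
            problems.append("distributed training on Windows needs PyTorch >= 1.7")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```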

Step

Create a conda virtual environment and then activate it.

 conda create -n nanodet python=3.8 -y
 conda activate nanodet

Install pytorch

conda install pytorch torchvision cudatoolkit=11.0 -c pytorch

Install requirements

pip install Cython termcolor numpy tensorboard pycocotools matplotlib pyaml opencv-python tqdm

Setup NanoDet

git clone https://github.com/RangiLyu/nanodet.git
cd nanodet
python setup.py develop

Model Zoo

NanoDet supports a variety of backbones. Go to the config folder to see the sample training config files.

| Model | Backbone | Resolution | COCO mAP | FLOPS | Params | Pre-train weight |
|---|---|---|---|---|---|---|
| NanoDet-m | ShuffleNetV2 1.0x | 320*320 | 20.6 | 0.72B | 0.95M | Download |
| NanoDet-g | Custom CSP Net | 416*416 | 22.9 | 4.2B | 3.81M | Download |
| NanoDet-EfficientLite | EfficientNet-Lite0 | 320*320 | 24.7 | 1.72B | 3.11M | Download |
| NanoDet-EfficientLite | EfficientNet-Lite1 | 416*416 | 30.3 | 4.06B | 4.01M | Download |
| NanoDet-EfficientLite | EfficientNet-Lite2 | 512*512 | 32.6 | 7.12B | 4.71M | Download |
| NanoDet-RepVGG | RepVGG-A0 | 416*416 | 27.8 | 11.3B | 6.75M | Download |

How to Train

Prepare dataset

If your dataset annotations are in Pascal VOC xml format, refer to config/nanodet_custom_xml_dataset.yml.

Otherwise, convert your dataset annotations to MS COCO format (COCO annotation format details).

Prepare config file

Copy and modify an example yml config file in config/ folder.

Change save_dir to where you want to save the model.

Change num_classes in model->arch->head.

Change the image path and annotation path in both data->train and data->val.

Set gpu, workers and batch size in device to fit your device.

Set total_epochs, lr and lr_schedule according to your dataset and batchsize.
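
The edits listed above can be sketched as a small function over the parsed config. This is an illustration on a plain nested dict rather than NanoDet's own config loader. Only `num_classes` under model->arch->head, data->train, data->val, and `total_epochs` come from the steps above; the `save_dir` key follows the name used in the Visualize Logs section, and the `img_path`, `ann_path`, and `schedule` key names are assumptions that should be checked against the example yml files.

```python
import copy
from typing import Any, Dict


def customize_config(
    base: Dict[str, Any],
    save_dir: str,
    num_classes: int,
    train_paths: Dict[str, str],
    val_paths: Dict[str, str],
    total_epochs: int,
) -> Dict[str, Any]:
    """Return a modified copy of a config dict; the input is left untouched."""
    cfg = copy.deepcopy(base)
    cfg["save_dir"] = save_dir
    cfg["model"]["arch"]["head"]["num_classes"] = num_classes
    cfg["data"]["train"].update(train_paths)
    cfg["data"]["val"].update(val_paths)
    # Assumed location of total_epochs; verify against your yml file.
    cfg["schedule"]["total_epochs"] = total_epochs
    return cfg


# Hypothetical stand-in for a config loaded from one of the example yml files.
example_base = {
    "save_dir": "workspace/example",
    "model": {"arch": {"head": {"num_classes": 80}}},
    "data": {
        "train": {"img_path": "coco/train2017", "ann_path": "coco/train.json"},
        "val": {"img_path": "coco/val2017", "ann_path": "coco/val.json"},
    },
    "schedule": {"total_epochs": 190},
}

custom = customize_config(
    example_base,
    save_dir="workspace/my_dataset",
    num_classes=3,
    train_paths={"img_path": "my_data/train", "ann_path": "my_data/train.json"},
    val_paths={"img_path": "my_data/val", "ann_path": "my_data/val.json"},
    total_epochs=100,
)
print(custom["model"]["arch"]["head"]["num_classes"])
```

Working on a deep copy keeps the example config reusable as a template for several datasets.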

If you want to modify the network, data augmentation or other things, please refer to Config File Detail.

Start training

For single GPU, run
```
python tools/train.py CONFIG_PATH
```
For multi-GPU, NanoDet uses distributed training. (Note: Windows does not support distributed training before PyTorch 1.7.) Please run
```
python -m torch.distributed.launch --nproc_per_node=GPU_NUM --master_port 29501 tools/train.py CONFIG_PATH
```
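
If training is launched from a script, the choice between the two commands above can be wrapped in a small helper. This is only a sketch: `build_train_command` is a hypothetical name, and it uses nothing beyond the flags already shown above.

```python
from typing import List


def build_train_command(
    config_path: str, gpu_num: int = 1, master_port: int = 29501
) -> List[str]:
    """Build the argument list for single-GPU or multi-GPU training."""
    if gpu_num < 1:
        raise ValueError("gpu_num must be at least 1")
    if gpu_num == 1:
        return ["python", "tools/train.py", config_path]
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={gpu_num}",
        "--master_port", str(master_port),
        "tools/train.py", config_path,
    ]


print(" ".join(build_train_command("config/nanodet-m.yml", gpu_num=2)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.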
Visualize Logs

TensorBoard logs are saved in the save_dir that you set in the config file.

To visualize tensorboard logs, run:
```
cd <YOUR_SAVE_DIR>
tensorboard --logdir ./logs
```

How to Deploy

NanoDet provides C++ and Android demos based on the ncnn library.

Convert model

To convert a NanoDet PyTorch model to ncnn, use this route: pytorch->onnx->ncnn

To export onnx model, run tools/export.py.
```
python tools/export.py --cfg_path ${CONFIG_PATH} --model_path ${PYTORCH_MODEL_PATH}
```
Then use onnx-simplifier to simplify the ONNX structure.
```
python -m onnxsim ${INPUT_ONNX_MODEL} ${OUTPUT_ONNX_MODEL}
```
Run onnx2ncnn from the ncnn tools to generate the ncnn .param and .bin files.

After that, use ncnnoptimize to optimize the ncnn model.
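
The whole pytorch->onnx->ncnn route can be written down as an ordered list of commands. The sketch below only builds the argument lists and does not run anything. It assumes that `onnx2ncnn` and `ncnnoptimize` are on the PATH, that `onnx2ncnn` takes the ONNX file followed by the output .param and .bin paths, and that `ncnnoptimize` takes input and output .param/.bin paths followed by a storage flag (65536 for fp16, 0 for fp32). Those positional arguments reflect my understanding of the ncnn tools and should be confirmed against the ncnn wiki. The output file names and the location of the exported ONNX file are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def build_conversion_commands(
    config_path: str,
    weights_path: str,
    onnx_path: str,
    out_dir: str,
    fp16: bool = True,
) -> List[List[str]]:
    """Return the four conversion steps as argument lists, in order."""
    out = PurePosixPath(out_dir)
    simplified = str(out / "nanodet-sim.onnx")
    raw_param, raw_bin = str(out / "nanodet.param"), str(out / "nanodet.bin")
    opt_param, opt_bin = str(out / "nanodet-opt.param"), str(out / "nanodet-opt.bin")
    storage_flag = "65536" if fp16 else "0"  # assumed ncnnoptimize flag values
    return [
        # 1. Export the PyTorch model to ONNX (onnx_path is where it ends up).
        ["python", "tools/export.py", "--cfg_path", config_path,
         "--model_path", weights_path],
        # 2. Simplify the ONNX graph with onnx-simplifier.
        ["python", "-m", "onnxsim", onnx_path, simplified],
        # 3. Convert ONNX to ncnn .param and .bin files.
        ["onnx2ncnn", simplified, raw_param, raw_bin],
        # 4. Optimize the ncnn model.
        ["ncnnoptimize", raw_param, raw_bin, opt_param, opt_bin, storage_flag],
    ]


for cmd in build_conversion_commands(
    "config/nanodet-m.yml", "nanodet_m.pth", "output.onnx", "ncnn_out"
):
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` so that a failure in one step stops the pipeline before the next one runs.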

If you have questions about converting to an ncnn model, refer to the ncnn wiki: https://github.com/Tencent/ncnn/wiki

Run NanoDet model with C++

Please refer to demo_ncnn.

Run NanoDet on Android

Please refer to android_demo.

Thanks

https://github.com/Tencent/ncnn

https://github.com/open-mmlab/mmdetection

https://github.com/implus/GFocal

https://github.com/cmdbug/YOLOv5_NCNN

https://github.com/rbgirshick/yacs
