English | 简体中文
This directory contains models optimized for mobile applications, at present the following models included:
Backbone | Architecture | Input | Image/gpu 1 | Lr schd | Box AP | Download | PaddleLite Model Download |
---|---|---|---|---|---|---|---|
MobileNetV3 Small | SSDLite | 320 | 64 | 400K (cosine) | 16.2 | Link | Link |
MobileNetV3 Small | SSDLite Quant 2 | 320 | 64 | 400K (cosine) | 15.4 | Link | Link |
MobileNetV3 Large | SSDLite | 320 | 64 | 400K (cosine) | 23.3 | Link | Link |
MobileNetV3 Large | SSDLite Quant 2 | 320 | 64 | 400K (cosine) | 22.6 | Link | Link |
MobileNetV3 Large w/ FPN | Cascade RCNN | 320 | 2 | 500k (cosine) | 25.0 | Link | Link |
MobileNetV3 Large w/ FPN | Cascade RCNN | 640 | 2 | 500k (cosine) | 30.2 | Link | Link |
MobileNetV3 Large | YOLOv3 | 320 | 8 | 500K | 27.1 | Link | Link |
MobileNetV3 Large | YOLOv3 Prune 2 | 320 | 8 | - | 24.6 | Link | Link |
Notes:
- [1] All models are trained on 8 GPUs.
- [2] See the note section on SSDLite quantization。
- [3] See the note section on how YOLO head is pruned.
-
Models are benched on following chipsets with Paddle-Lite 2.6 (to be released)
- Qualcomm Snapdragon 625
- Qualcomm Snapdragon 835
- Qualcomm Snapdragon 845
- Qualcomm Snapdragon 855
- HiSilicon Kirin 970
- HiSilicon Kirin 980
-
With 1 CPU thread (latency numbers are in ms)
SD625 SD835 SD845 SD855 Kirin 970 Kirin 980 SSDLite Large 289.071 134.408 91.933 48.2206 144.914 55.1186 SSDLite Large Quant SSDLite Small 122.932 57.1914 41.003 22.0694 61.5468 25.2106 SSDLite Small Quant YOLOv3 baseline 1082.5 435.77 317.189 155.948 536.987 178.999 YOLOv3 prune 253.98 131.279 89.4124 48.2856 122.732 55.8626 Cascade RCNN 320 286.526 125.635 87.404 46.184 149.179 52.9994 Cascade RCNN 640 1115.66 495.926 351.361 189.722 573.558 207.917 -
With 4 CPU threads (latency numbers are in ms)
SD625 SD835 SD845 SD855 Kirin 970 Kirin 980 SSDLite Large 107.535 51.1382 34.6392 20.4978 50.5598 24.5318 SSDLite Large Quant SSDLite Small 51.5704 24.5156 18.5486 11.4218 24.9946 16.7158 SSDLite Small Quant YOLOv3 baseline 413.486 184.248 133.624 75.7354 202.263 126.435 YOLOv3 prune 98.5472 53.6228 34.4306 21.3112 44.0722 31.201 Cascade RCNN 320 131.515 59.6026 39.4338 23.5802 58.5046 36.9486 Cascade RCNN 640 473.083 224.543 156.205 100.686 231.108 138.391
We use a complete quantitative training method to train the SSDLite model. It is trained for a total of 400,000 rounds with the 8-card GPU. We freeze res_conv1
and se_block
. The command used is listed bellow:
python slim/quantization/train.py --not_quant_pattern res_conv1 se_block \
-c configs/ssd/ssdlite_mobilenet_v3_large.yml \
--eval
For more quantization tutorials, please refer to Model Quantization Compression Tutorial
We pruned the YOLO-head and distill the pruned model with YOLOv3-ResNet34 as the teacher, which has a higher mAP on COCO (31.4 with 320*320 input).
The following configurations can be used for pruning:
-
Prune with fixed ratio, overall prune ratios is 86%
--pruned_params="yolo_block.0.0.0.conv.weights,yolo_block.0.0.1.conv.weights,yolo_block.0.1.0.conv.weights,yolo_block.0.1.1.conv.weights,yolo_block.0.2.conv.weights,yolo_block.0.tip.conv.weights,yolo_block.1.0.0.conv.weights,yolo_block.1.0.1.conv.weights,yolo_block.1.1.0.conv.weights,yolo_block.1.1.1.conv.weights,yolo_block.1.2.conv.weights,yolo_block.1.tip.conv.weights,yolo_block.2.0.0.conv.weights,yolo_block.2.0.1.conv.weights,yolo_block.2.1.0.conv.weights,yolo_block.2.1.1.conv.weights,yolo_block.2.2.conv.weights,yolo_block.2.tip.conv.weights" \ --pruned_ratios="0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.875,0.875,0.875,0.875,0.875,0.875"
-
Prune filters using FPGM algorithm:
--prune_criterion=geometry_median
- More models configurations
- Quantized models