Awesome-model-compression-and-acceleration

A collection of papers on model compression and acceleration that I consider worth reading and plan to read myself. If you have any suggestions for the list, please open a PR or an issue. Thank you.

Survey

A Survey of Model Compression and Acceleration for Deep Neural Networks [arXiv '17]
Recent Advances in Efficient Computation of Deep Convolutional Neural Networks [arXiv '18]
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better

Model and structure

MobileNetV2: Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation [arXiv '18, Google]
NASNet: Learning Transferable Architectures for Scalable Image Recognition [arXiv '17, Google]
DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices [AAAI'18, Samsung]
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices [arXiv '17, Megvii]
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications [arXiv '17, Google]
CondenseNet: An Efficient DenseNet using Learned Group Convolutions [arXiv '17]
Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video [arXiv '17]
Shift-based Primitives for Efficient Convolutional Neural Networks [WACV'18]

Quantization

The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning [ICML'17]
Compressing Deep Convolutional Networks using Vector Quantization [arXiv'14]
Quantized Convolutional Neural Networks for Mobile Devices [CVPR '16]
Fixed-Point Performance Analysis of Recurrent Neural Networks [ICASSP'16]
Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations [arXiv'16]
Loss-aware Binarization of Deep Networks [ICLR'17]
Towards the Limit of Network Quantization [ICLR'17]
Deep Learning with Low Precision by Half-wave Gaussian Quantization [CVPR'17]
ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks [arXiv'17]
Training and Inference with Integers in Deep Neural Networks [ICLR'18]
Deep Learning with Limited Numerical Precision [ICML'15]

Pruning

Binarized neural network

Low-rank Approximation

Distilling

System

Some optimization techniques

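Eliminate redundant computation
Unroll loops
Use SIMD instructions
OpenMP
Convert to fixed-point arithmetic
Avoid non-contiguous memory reads and writes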
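
As a rough illustration of the fixed-point item above, here is a minimal sketch of a dot product carried out entirely on integers. The helper names (`to_fixed`, `from_fixed`, `fixed_dot`) and the choice of 8 fractional bits are assumptions made for this example and do not come from any paper in the list. Real deployments usually pick the bit width per layer and saturate on overflow.

```python
# Hypothetical helpers for illustration only; 8 fractional bits is an assumed format.

def to_fixed(x, frac_bits=8):
    """Convert a float to a signed integer with `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))


def from_fixed(q, frac_bits=8):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_dot(a_q, b_q, frac_bits=8):
    """Dot product of two fixed-point vectors using integer arithmetic only.

    Each product carries 2 * frac_bits fractional bits, so the sum is
    accumulated at that wider precision and shifted back once at the end.
    """
    acc = 0
    for a, b in zip(a_q, b_q):
        acc += a * b
    return acc >> frac_bits


weights = [to_fixed(v) for v in [0.5, -0.25, 1.0]]
inputs = [to_fixed(v) for v in [2.0, 4.0, -1.5]]
result = from_fixed(fixed_dot(weights, inputs))
print(result)
```

Shifting once after accumulation, rather than after every multiplication, keeps the rounding error to a single truncation and also removes a repeated operation from the inner loop, which ties in with the first item in the list.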

References
