
Here is a brief list of existing model speedup techniques, with links:

Compression
Knowledge distillation
- Distilling the Knowledge in a Neural Network
- Sequence-Level Knowledge Distillation
Low-bit networks
Not sure which category this belongs to
- Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications
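
As a rough illustration of the knowledge distillation idea listed above, here is a minimal sketch of the soft-target term only: the teacher's logits are softened with a temperature and the student is penalized by cross-entropy against them. The function names (`softmax`, `distillation_loss`), the example logits, and the default temperature are assumptions made for illustration and are not taken from the linked papers. A full training setup would typically also combine this term with the usual hard-label loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature (hypothetical helper)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Made-up logits for a 3-class problem (illustration only).
teacher = [5.0, 2.0, 0.5]
student = [4.0, 2.5, 0.2]
loss = distillation_loss(student, teacher)
print(loss)
```

A higher temperature flattens the teacher distribution, so the student also sees how the teacher ranks the wrong classes, not only which class it prefers.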

Blog posts:

How to Quantize Neural Networks with TensorFlow
Compression of neural networks
Compressing and regularizing deep neural networks
Deep Compression and EIE slides and video
Pruning deep neural networks to make them fast and small (in PyTorch)

Comments and contributions are welcome!