Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition

Lucas Liebenwein*, Alaa Maalouf*, Dan Feldman, Daniela Rus

*Equal contribution

We present a global compression framework for deep neural networks that automatically analyzes each layer to identify the optimal per-layer compression ratio, while simultaneously achieving the desired overall compression. Our algorithm hinges on the idea of compressing each convolutional (or fully-connected) layer by slicing its channels into multiple groups and applying a low-rank decomposition to each group.

We frame the compression problem as an optimization problem in which we minimize the maximum compression error across layers, and we propose an efficient algorithm to solve it.

Compared to a manual solution (i.e., manually compressing each layer), our algorithm (Automatic Layer-wise Decomposition Selector, or ALDS) automatically determines the decomposition for each layer, enabling higher compression ratios at the same level of accuracy without substantial manual hyperparameter tuning.
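
To make the two ingredients concrete, below is a small, self-contained PyTorch sketch (not the ALDS implementation from the paper or the torchprune package): a group-wise truncated SVD of a 2-D weight matrix, and a naive greedy rank selection that targets the minimax error under a global parameter budget. All function names and the greedy heuristic are illustrative assumptions.

import torch


def grouped_lowrank(weight, num_groups, rank):
    # Approximate a 2-D (out x in) weight matrix by splitting its input
    # channels into num_groups groups and keeping a rank-`rank` SVD per group.
    approx = torch.zeros_like(weight)
    for idx in torch.chunk(torch.arange(weight.shape[1]), num_groups):
        U, S, Vh = torch.linalg.svd(weight[:, idx], full_matrices=False)
        r = min(rank, S.numel())
        approx[:, idx] = U[:, :r] @ torch.diag(S[:r]) @ Vh[:r, :]
    return approx


def relative_error(weight, approx):
    return (torch.norm(weight - approx) / torch.norm(weight)).item()


def pick_ranks_minimax(weights, num_groups, budget_ratio):
    # Greedy illustration of the minimax objective: repeatedly raise the rank
    # of the currently worst-approximated layer until the global parameter
    # budget (budget_ratio * original size) would be exceeded.
    ranks = [1 for _ in weights]
    total_params = sum(w.numel() for w in weights)

    def decomposed_params(w, r):
        out_dim, in_dim = w.shape
        # k groups of rank-r factors cost k * r * out + r * in parameters
        return num_groups * r * out_dim + r * in_dim

    while True:
        errors = [relative_error(w, grouped_lowrank(w, num_groups, r))
                  for w, r in zip(weights, ranks)]
        worst = max(range(len(weights)), key=lambda i: errors[i])
        ranks[worst] += 1
        if sum(decomposed_params(w, r) for w, r in zip(weights, ranks)) > budget_ratio * total_params:
            ranks[worst] -= 1  # undo the step that broke the budget
            return ranks


# toy usage on two random "layers"
layers = [torch.randn(64, 128), torch.randn(256, 256)]
print(pick_ranks_minimax(layers, num_groups=4, budget_ratio=0.25))

The actual ALDS algorithm uses a more refined layer-wise analysis; the sketch only illustrates the grouping, the low-rank factorization, and the minimax flavor of the rank selection.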

Setup

Check out the main README.md and the respective packages for more information on the code base.

Overview

Implementation of main algorithm (ALDS)

Our main algorithm (Automatic Layer-wise Decomposition Selector or ALDS) is integrated into the torchprune package.

The implementation can be found here.
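
For a rough idea of how the package might be invoked programmatically, here is a hypothetical usage sketch; every torchprune name below (NetHandle, ALDSNet, compress, keep_ratio) is an assumption and should be verified against the torchprune package itself.

# Hypothetical usage sketch only: the torchprune class and method names here
# are assumptions; check the torchprune package for the actual API.
import torch
import torchprune as tp
from torchvision.models import resnet18

model = resnet18(num_classes=10)

# tiny random dataset so the snippet is self-contained
data = torch.utils.data.TensorDataset(
    torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))
loader = torch.utils.data.DataLoader(data, batch_size=4)
loss_handle = torch.nn.CrossEntropyLoss()

net = tp.util.net.NetHandle(model, "resnet18")     # wrap the network
compressed = tp.ALDSNet(net, loader, loss_handle)  # ALDS compression wrapper
compressed.compress(keep_ratio=0.5)                # keep ~50% of the parameters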

Run compression experiments

The experiment configurations are located here. To reproduce the experiments for a specific configuration, run:

python -m experiment.main param/cifar/prune/resnet20.yaml

Visualize results

You should be able to retrieve the nominal prune-accuracy trade-offs from the data/results folder.

You can also visualize the results using the results_viewer.py script:

python results_viewer.py

Run it from inside the script folder. The script can also be run interactively as a Jupyter notebook.

Load network checkpoint

If you want to use the network checkpoints in your own experiments or code, follow the load_networks.py script. It should be self-explanatory.
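
For orientation, a generic PyTorch sketch of checkpoint loading follows; the checkpoint path, the architecture, and the "state_dict" key are all assumptions here, so defer to load_networks.py for the actual file layout and wrapping used by this repository.

# Generic PyTorch checkpoint loading; path, architecture, and keys are assumed.
import torch
from torchvision.models import resnet18  # stand-in architecture for illustration

model = resnet18(num_classes=10)

# hypothetical checkpoint path; the real files live under data/results
checkpoint = torch.load("data/results/checkpoint.pt", map_location="cpu")

# many checkpoints store the weights under a "state_dict" key; otherwise treat
# the whole file as a state dict
state_dict = checkpoint.get("state_dict", checkpoint)
model.load_state_dict(state_dict)
model.eval()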

Hyperparameter sweeps

If you want to run a hyperparameter sweep over different amounts of retraining, you can run:

python -m experiment.main param/cifar/retrainsweep/resnet20.yaml

Note that this experiment is quite expensive in terms of required compute, since it repeats the compression experiment for different amounts of retraining and different compression methods over multiple repetitions.

To visualize the results, use the retrain_sweep.py script:

python retrain_sweep.py

Run it from inside the script folder. The script can also be run interactively as a Jupyter notebook.

Citation

Please cite the following paper when using our work.

Paper link

Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition

Bibtex

@inproceedings{liebenwein2021alds,
 author = {Lucas Liebenwein and Alaa Maalouf and Dan Feldman and Daniela Rus},
 booktitle = {Advances in Neural Information Processing Systems},
 title = {Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition},
 url = {https://arxiv.org/abs/2107.11442},
 volume = {34},
 year = {2021}
}