Neural Arithmetic Units

This code encompasses two publications. The ICLR paper is still in review; please respect the double-blind review process.

Figure: performance of our proposed NMU model.

Publications

SEDL Workshop at NeurIPS 2019

Reproduction study of the Neural Arithmetic Logic Unit (NALU). We propose an improved evaluation criterion for arithmetic tasks, including a "converged at" metric and a "sparsity error" metric. Results will be presented at SEDL|NeurIPS 2019. Read paper.

@inproceedings{maep-madsen-johansen-2019,
    author={Andreas Madsen and Alexander Rosenberg Johansen},
    title={Measuring Arithmetic Extrapolation Performance},
    booktitle={Science meets Engineering of Deep Learning at 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)},
    address={Vancouver, Canada},
    journal={CoRR},
    volume={abs/1910.01888},
    month={October},
    year={2019},
    url={http://arxiv.org/abs/1910.01888},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    arxivId = {1910.01888},
    eprint={1910.01888}
}
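
The "sparsity error" metric mentioned above can be illustrated with a small sketch. It assumes the metric is the largest distance of any weight magnitude from the nearest of 0 and 1, i.e. max over weights of min(|w|, 1 - |w|). The function name `sparsity_error` and the flat-list weight format are illustrative only and are not taken from the stable_nalu code.

```python
def sparsity_error(weights):
    """Return max over weights of min(|w|, 1 - |w|).

    Assumed definition: the value is 0 when every weight is exactly
    -1, 0, or 1, and 0.5 in the worst case (a weight of +/-0.5).
    """
    return max(min(abs(w), 1.0 - abs(w)) for w in weights)


# All weights are in {-1, 0, 1}, so the error is zero.
print(sparsity_error([1.0, 0.0, -1.0]))

# One weight sits halfway between 0 and 1, the worst case.
print(sparsity_error([0.5, 1.0]))
```

A low value indicates that the learned weights are close to the discrete values an exact arithmetic solution would use.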

ICLR 2020 (Spotlight)

Our main contribution includes a theoretical analysis of the optimization challenges with the NALU. Based on these difficulties, we propose several improvements. Read paper.

@inproceedings{mnu-madsen-johansen-2020,
    author = {Andreas Madsen and Alexander Rosenberg Johansen},
    title = {{Neural Arithmetic Units}},
    booktitle = {8th International Conference on Learning Representations, ICLR 2020},
    volume = {abs/2001.05016},
    year = {2020},
    url = {http://arxiv.org/abs/2001.05016},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    arxivId = {2001.05016},
    eprint={2001.05016}
}
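
For readers unfamiliar with the two units the paper proposes, the following is a minimal sketch of their forward passes. It assumes the NMU computes z_j = prod_i (W_ji * x_i + 1 - W_ji) with weights in [0, 1], and the NAU computes z_j = sum_i W_ji * x_i with weights in [-1, 1]. The function names are hypothetical, the code is plain Python rather than the PyTorch used in this repository, and it omits the weight clamping and regularization described in the paper.

```python
from math import prod


def nmu_forward(W, x):
    """Assumed NMU forward pass: a weight of 1 includes x[i] in the
    product, a weight of 0 replaces it with the identity 1."""
    return [prod(w * xi + 1.0 - w for w, xi in zip(row, x)) for row in W]


def nau_forward(W, x):
    """Assumed NAU forward pass: a plain linear layer without bias,
    with weights intended to end up in {-1, 0, 1}."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


x = [2.0, 3.0, 5.0]
# Selects x[0] and x[1] and multiplies them; x[2] is ignored.
print(nmu_forward([[1.0, 1.0, 0.0]], x))
# Computes x[0] - x[1]; x[2] is ignored.
print(nau_forward([[1.0, -1.0, 0.0]], x))
```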

Install

python3 setup.py develop

This will install this code under the name stable-nalu, and the following dependencies if missing: numpy, tqdm, torch, scipy, pandas, tensorflow, torchvision, tensorboard, tensorboardX.
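
To confirm the install worked, a quick check like the one below can report which modules are still not importable. This is a sketch, not part of the repository: the helper `missing_modules` is hypothetical, and it assumes each dependency is imported under the same name as listed above and that the package itself is imported as `stable_nalu`.

```python
from importlib.util import find_spec

# Import names assumed to match the dependency names listed above.
DEPENDENCIES = [
    "numpy", "tqdm", "torch", "scipy", "pandas",
    "tensorflow", "torchvision", "tensorboard", "tensorboardX",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "stable_nalu" is the assumed import name of this package.
    missing = missing_modules(DEPENDENCIES + ["stable_nalu"])
    print("missing:", missing if missing else "none")
```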

Experiments used in the paper

All experiment results shown in the paper can be reproduced exactly using fixed seeds. The lfs_batch_jobs directory contains bash scripts for submitting jobs to an LFS queue. The bsub command and its arguments can be replaced with python3 or an equivalent command for another queue system.

The export directory contains Python scripts for converting the TensorBoard results into CSV files, and R scripts for presenting those results as they appear in the paper.

Naming changes

The naming conventions in the code differ from those in the paper. The following translations can be used:

Linear: --layer-type linear
ReLU: --layer-type ReLU
ReLU6: --layer-type ReLU6
NAC-add: --layer-type NAC
NAC-mul: --layer-type NAC --nac-mul normal
NAC-sigma: --layer-type PosNAC --nac-mul normal
NAC-nmu: --layer-type ReRegualizedLinearPosNAC --nac-mul normal --first-layer ReRegualizedLinearNAC
NALU: --layer-type NALU
NAU: --layer-type ReRegualizedLinearNAC
NMU: --layer-type ReRegualizedLinearNAC --nac-mul mnac
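
When scripting experiments, the translations above can be kept in a lookup table. The sketch below copies the flags verbatim from the list; the names `PAPER_TO_FLAGS` and `layer_flags` are hypothetical helpers and do not exist in the repository.

```python
# Paper model name -> command-line flags, copied from the list above.
PAPER_TO_FLAGS = {
    "Linear": "--layer-type linear",
    "ReLU": "--layer-type ReLU",
    "ReLU6": "--layer-type ReLU6",
    "NAC-add": "--layer-type NAC",
    "NAC-mul": "--layer-type NAC --nac-mul normal",
    "NAC-sigma": "--layer-type PosNAC --nac-mul normal",
    "NAC-nmu": ("--layer-type ReRegualizedLinearPosNAC --nac-mul normal "
                "--first-layer ReRegualizedLinearNAC"),
    "NALU": "--layer-type NALU",
    "NAU": "--layer-type ReRegualizedLinearNAC",
    "NMU": "--layer-type ReRegualizedLinearNAC --nac-mul mnac",
}


def layer_flags(paper_name):
    """Return the flags for a paper model name as an argv-style list."""
    return PAPER_TO_FLAGS[paper_name].split()


print(layer_flags("NMU"))
```

Note that the spelling `ReRegualized` is how the identifier appears in the flags above, so it is kept unchanged here.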

Extra experiments

There are 4 experiments in total; they correspond to the experiments in the NALU paper.

python3 experiments/simple_function_static.py --help # 4.1 (static)
python3 experiments/sequential_mnist.py --help # 4.2

Example using the NMU on the multiplication problem:

python3 experiments/simple_function_static.py \
    --operation mul --layer-type ReRegualizedLinearNAC --nac-mul mnac \
    --seed 0 --max-iterations 5000000 --verbose \
    --name-prefix test --remove-existing-data

The --verbose flag logs internal network measures to TensorBoard. You can access TensorBoard with:

tensorboard --logdir tensorboard
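
The NMU example command above can also be generated for several seeds from Python. The sketch below only builds and prints the command lines, using exactly the flags shown in the example; the helper `nmu_mul_command` is hypothetical and not part of the repository.

```python
import shlex


def nmu_mul_command(seed, name_prefix="test"):
    """Build the argv list for the NMU multiplication example."""
    return [
        "python3", "experiments/simple_function_static.py",
        "--operation", "mul",
        "--layer-type", "ReRegualizedLinearNAC", "--nac-mul", "mnac",
        "--seed", str(seed), "--max-iterations", "5000000", "--verbose",
        "--name-prefix", name_prefix, "--remove-existing-data",
    ]


# Print one shell-quoted command per seed; nothing is executed here.
for seed in range(3):
    print(shlex.join(nmu_mul_command(seed)))
```

To actually launch a run, each list could be passed to `subprocess.run(cmd, check=True)` from the repository root. How `--remove-existing-data` interacts with runs that share a name prefix is not described here, so it is worth checking before running a sweep.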

