Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
Sandeep Subramanian, Adam Trischler, Yoshua Bengio & Christopher Pal
ICLR 2018
GenSen is a technique for learning general-purpose, fixed-length representations of sentences via multi-task training. These representations are useful for transfer and low-resource learning. For details, please refer to the ICLR paper.
We provide a distributed implementation of the paper using PyTorch with Horovod, along with pre-trained models and code to evaluate these models on a variety of transfer learning benchmarks. This code is based on the GitHub codebase from Maluuba, which we have refactored in the following ways:
- Support distributed training using PyTorch with Horovod
- Clean and refactor the original code in a more structured form
- Change the training file (`train.py`) from running indefinitely to stopping when the validation loss reaches a local minimum
- Update the code from Python 2.7 to 3+ and PyTorch from 0.2 or 0.3 to 1.0.1
- Add some necessary comments
- Add some code for training on AzureML platform
- Fix a bug where training raised an error when the batch size was set to 1
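
The stopping behavior mentioned in the list above can be illustrated with a minimal sketch. The exact criterion used in the training file is not described here, so everything below is an assumption: the `EarlyStopping` class, the `run_until_minimum` helper, and the `patience` and `min_delta` parameters are hypothetical names chosen for illustration, not part of this codebase.

```python
class EarlyStopping:
    """Hypothetical helper: signal a stop once validation loss stops improving."""

    def __init__(self, patience=1, min_delta=0.0):
        # patience: consecutive non-improving evaluations tolerated before stopping
        # min_delta: smallest decrease in loss that counts as an improvement
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_evals = 0

    def step(self, val_loss):
        """Record one validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


def run_until_minimum(val_losses, patience=1):
    """Walk a sequence of validation losses and return (stop_index, best_loss).

    In a real training loop each loss would come from evaluating the model
    on a held-out set; a plain list stands in for that here.
    """
    stopper = EarlyStopping(patience=patience)
    for i, loss in enumerate(val_losses):
        if stopper.step(loss):
            return i, stopper.best_loss
    return len(val_losses) - 1, stopper.best_loss
```

With `patience=1` the loop stops at the first evaluation where the loss fails to improve, which treats the previous value as a local minimum. A larger `patience` tolerates short plateaus or noisy evaluations before stopping.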
Requirements:
- Python 3+
- PyTorch 1.0.1
- nltk
- h5py
- numpy
- scikit-learn
@article{subramanian2018learning,
title={Learning general purpose distributed sentence representations via large scale multi-task learning},
author={Subramanian, Sandeep and Trischler, Adam and Bengio, Yoshua and Pal, Christopher J},
journal={arXiv preprint arXiv:1804.00079},
year={2018}
}