Supplementary code for the NeurIPS submission "Sequence Modeling with Unconstrained Generation Order" (arXiv). This code trains and applies a machine translation model that can generate sequences in arbitrary order.
- A machine with some CPU (preferably 4+ cores) and at least one GPU
  - Optimal performance is reached when running on 8 GPUs
- Some popular Linux x64 distribution
  - Tested on Ubuntu 16.04; should work fine on any popular 64-bit Linux and even macOS
  - Windows and 32-bit systems may require heavy wizardry to run
- When in doubt, use Docker, preferably GPU-enabled (i.e. nvidia-docker)
- Set up the environment
  - Clone or download this repo. `cd` yourself to its root directory.
  - Get a Python distribution. Anaconda works fine.
  - Install packages from `requirements.txt`
- Prepare data
  - Grab the WMT English-Russian dataset from http://statmt.org/ (or another language pair of your choosing)
  - Tokenize it with mosestokenizer or any other reasonable tokenizer. It is also recommended that you lowercase the data.
  - Learn and apply BPE with subword-nmt
  - You can find example preprocessing pipelines here.
- Run Jupyter Notebook
  - All the training notebooks are in the `./notebooks/` folder
  - Before you run the first cell, optionally set `%env CUDA_VISIBLE_DEVICES=###` to the devices that you plan to use.
  - Follow the code as it loads data, trains the model and reports training progress.
  - NOTE: The BLEU metric measured in the notebook is not the one used for evaluation. See sacrebleu.
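
To make the "learn and apply BPE" part of data preparation more concrete, here is a toy, pure-Python sketch of byte-pair encoding on already tokenized, lowercased text. It is not the subword-nmt implementation and is not part of this repo. The function names (`learn_bpe`, `apply_bpe`, `merge_pair`), the `</w>` end-of-word marker and the tie-breaking rule are assumptions chosen for illustration. For real experiments, use subword-nmt as described in the steps above.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge operations from an iterable of text lines."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter()
    for line in corpus:
        for word in line.split():
            vocab[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself
        # so that the result is deterministic (an illustrative choice).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite the vocabulary with the chosen pair merged.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment a single word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Tiny usage example on a made-up corpus.
merges = learn_bpe(["low low low low lower"], num_merges=3)
segments = apply_bpe("lower", merges)
```

Frequent words end up as a single symbol, while rarer words are split into smaller pieces that the model has seen elsewhere. Concatenating the returned segments always gives back the original word plus the end-of-word marker.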
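
The optional device selection in the notebook step can also be done from plain Python, for example when a notebook is converted to a script. The sketch below sets the same `CUDA_VISIBLE_DEVICES` environment variable that the `%env` magic sets. The helper name `select_gpus` and the device indices are hypothetical. The variable has to be set before the deep learning framework initializes CUDA, which is why the step above asks you to set it before running the first cell.

```python
import os


def select_gpus(device_ids):
    """Restrict this process to the given GPU indices, e.g. [0, 1].

    Hypothetical helper: it only sets CUDA_VISIBLE_DEVICES and does not
    check that the devices actually exist.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Example: expose only the first two GPUs to this process.
visible = select_gpus([0, 1])
```

With this setting, the frameworks running in the process see only the listed devices, renumbered from zero.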