Code for Label-Agnostic Sequence Labeling by Copying Nearest Neighbors.
Update 10/2021: An updated version of the code, which uses the latest versions of the Huggingface transformers and datasets libraries, is now in the update branch.
The code was developed and tested with pytorch-pretrained-bert 0.5.1 (so you may need to do something like `pip install pytorch-pretrained-bert==0.5.1`).
All the data is in `data.tar.gz`.
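For example, to unpack it (assuming the archive extracts into the `data/` directory referenced below):

tar -xzf data.tar.gz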
Trained models can be downloaded from here.
To train the neighbor-based NER model, run:
python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -detach_db -save mynermodel.pt
By default the above script will save a database file to the argument of `-db_fi`. Once the database file has been saved, you can rerun the above with `-load_saved_db` to avoid retokenizing everything.
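For example, a subsequent run might look something like the following (this is just the training command above with `-load_saved_db` added):

python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -load_saved_db -detach_db -save mynermodel.pt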
The other models can be trained analogously, by substituting in the correct data files in `data/`; see the options in `train_words.py`. For POS tasks, the `-acc_eval` flag should be used, and we used a batch size of 20 and a learning rate of 2e-5.
To evaluate on the development set, run:
python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -load_saved_db -train_from mynermodel.pt -eval_ne_per_sent 100 -pred_shard_size 64 -just_eval dev -val_sent_fi data/conll2003/conll2003-dev.words -val_tag_fi data/conll2003/conll2003-dev.nertags
To run evaluation with recomputed neighbors (under the fine-tuned, rather than the pretrained, BERT), use the option `-just_eval dev-newne`.
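For example, something like the following (the development-set command above with the `-just_eval` argument changed):

python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -load_saved_db -train_from mynermodel.pt -eval_ne_per_sent 100 -pred_shard_size 64 -just_eval dev-newne -val_sent_fi data/conll2003/conll2003-dev.words -val_tag_fi data/conll2003/conll2003-dev.nertags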
You can run transfer evaluation as follows:
python -u train_words.py -cuda -train_from mynermodel.pt -eval_ne_per_sent 100 -pred_shard_size 64 -align_strat first -just_eval "dev-newne" -zero_shot -sent_fi data/onto/train.words -tag_fi data/onto/train.ner -val_sent_fi data/onto/dev.words -val_tag_fi data/onto/dev.ner
The `-dp_pred` and `-c` options can be used for DP decoding.
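For example, these can be appended to the development-set evaluation command above, where `<c-value>` is a placeholder for whatever constant you pass to `-c` (see the options in `train_words.py`):

python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -load_saved_db -train_from mynermodel.pt -eval_ne_per_sent 100 -pred_shard_size 64 -just_eval dev -val_sent_fi data/conll2003/conll2003-dev.words -val_tag_fi data/conll2003/conll2003-dev.nertags -dp_pred -c <c-value>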