Code for Label-Agnostic Sequence Labeling by Copying Nearest Neighbors.
Update 10/2021: An updated version of the code, which uses the latest versions of the Huggingface transformers and datasets libraries, is now in the update branch.
The code was developed and tested with pytorch-pretrained-bert 0.5.1 (so you may need to do something like `pip install pytorch-pretrained-bert==0.5.1`).
All the data is in `data.tar.gz`.
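For example, to unpack it (assuming the archive extracts into the `data/` directory referenced below):

tar -xzf data.tar.gz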
Trained models can be downloaded from here.
To train the neighbor-based NER model, run:
python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -detach_db -save mynermodel.pt
By default the above script will save a database file to the argument of `-db_fi`. Once the database file has been saved, you can rerun the above with `-load_saved_db` to avoid retokenizing everything.
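For example, a subsequent run might look something like the following (this is just the training command above with `-load_saved_db` added):

python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -load_saved_db -detach_db -save mynermodel.pt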
The other models can be trained analogously, by substituting in the correct data files in `data/`; see the options in `train_words.py`. For POS tasks, the `-acc_eval` flag should be used, and we used a batch size of 20 and a learning rate of 2e-5.
To evaluate on the development set, run:
python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -load_saved_db -train_from mynermodel.pt -eval_ne_per_sent 100 -pred_shard_size 64 -just_eval dev -val_sent_fi data/conll2003/conll2003-dev.words -val_tag_fi data/conll2003/conll2003-dev.nertags
To run evaluation with recomputed neighbors (under the fine-tuned, rather than the pretrained, BERT), use the option `-just_eval dev-newne`.
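For example, something like the following (the development-set command above with the `-just_eval` argument changed):

python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -load_saved_db -train_from mynermodel.pt -eval_ne_per_sent 100 -pred_shard_size 64 -just_eval dev-newne -val_sent_fi data/conll2003/conll2003-dev.words -val_tag_fi data/conll2003/conll2003-dev.nertags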
You can run transfer evaluation as follows:
python -u train_words.py -cuda -train_from mynermodel.pt -eval_ne_per_sent 100 -pred_shard_size 64 -align_strat first -just_eval "dev-newne" -zero_shot -sent_fi data/onto/train.words -tag_fi data/onto/train.ner -val_sent_fi data/onto/dev.words -val_tag_fi data/onto/dev.ner
The `-dp_pred` and `-c` options can be used for DP decoding.
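For example, these can be appended to the development-set evaluation command above, where `<c-value>` is a placeholder for whatever constant you pass to `-c` (see the options in `train_words.py`):

python -u train_words.py -cuda -db_fi ner-cased-first-b16-n50.pt -load_saved_db -train_from mynermodel.pt -eval_ne_per_sent 100 -pred_shard_size 64 -just_eval dev -val_sent_fi data/conll2003/conll2003-dev.words -val_tag_fi data/conll2003/conll2003-dev.nertags -dp_pred -c <c-value>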