
# Astro-mT5

This repository contains the code and paper for [Astro-mT5: Entity Extraction from Astrophysics Literature using mT5 Language Model], accepted at the AACL-IJCNLP 2022 Workshop.

## Abstract

Scientific research requires reading and extracting relevant information from existing literature in an effective way. To gain insights over a collection of scientific documents, extracting entities and recognizing their types is considered one of the important tasks, and numerous studies have been conducted in this area. In our study, we introduce a framework for entity recognition and identification on a NASA astrophysics dataset, which was published as part of the DEAL shared task. We use a pre-trained multilingual model, built on a natural language processing framework, for the given sequence-labeling task. Experiments show that our model, Astro-mT5, outperforms the existing baseline on astrophysics-related information extraction. Our paper is available at work.

## Setup

### Install Package Dependencies

```shell
git clone https://github.com/flairNLP/flair.git
cd flair
git checkout add-t5-encoder-support
pip3 install -e .
```

To run the experiments, `run_ner.py` and `test.py` must be placed inside the `flair` directory.

## Training

The main training command is:

```shell
python3 run_ner.py --dataset_name NER_MASAKHANE \
    --model_name_or_path google/mt5-large \
    --layers -1 \
    --subtoken_pooling first_last \
    --hidden_size 256 \
    --batch_size 4 \
    --learning_rate 5e-05 \
    --num_epochs 100 \
    --use_crf True \
    --output_dir /content/mt5-large
```
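For readers who want a sense of the interface behind those flags, the following is a hypothetical sketch of the argument parser that `run_ner.py` is assumed to expose, reconstructed only from the command above (the real script ships with the flair fork; names and defaults here are illustrative):

```python
# Hypothetical reconstruction of the run_ner.py CLI from the flags above;
# the actual script in the flair fork may differ.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Fine-tune an mT5 encoder for NER")
    parser.add_argument("--dataset_name", required=True)            # e.g. NER_MASAKHANE
    parser.add_argument("--model_name_or_path", default="google/mt5-large")
    parser.add_argument("--layers", default="-1")                   # encoder layers to use
    parser.add_argument("--subtoken_pooling", default="first_last") # subword pooling strategy
    parser.add_argument("--hidden_size", type=int, default=256)
    parser.add_argument("--batch_size", type=int, default=4)
    parser.add_argument("--learning_rate", type=float, default=5e-05)
    parser.add_argument("--num_epochs", type=int, default=100)
    # Booleans arrive as the strings "True"/"False" on the command line.
    parser.add_argument("--use_crf", type=lambda s: s.lower() == "true", default=True)
    parser.add_argument("--output_dir", required=True)
    return parser
```

Note that `--use_crf True` is parsed as a string, so a plain `type=bool` would treat even `"False"` as truthy; the lambda above handles that.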

## Testing

After training, you can find the best checkpoint on the dev set according to the evaluation results. To evaluate it, run:

```shell
python3 test.py
```
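A minimal sketch of how the best checkpoint can be located, assuming flair's trainer writes `best-model.pt` (the best dev-set model) and `final-model.pt` into the `--output_dir` given above; the helper name is our own:

```python
# Sketch only: assumes flair's ModelTrainer saves "best-model.pt" and
# "final-model.pt" into the training output directory.
from pathlib import Path


def best_checkpoint(output_dir: str) -> Path:
    """Return the best dev-set checkpoint if present, else the final one."""
    out = Path(output_dir)
    best = out / "best-model.pt"
    return best if best.exists() else out / "final-model.pt"
```

`test.py` would then load the returned checkpoint and report evaluation scores on the test split.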