This work focuses on the extraction of Adverse Drug Reactions (ADRs) from ADR-related tweets and from sentences extracted from PubMed abstracts.
Our paper "An Attentive Neural Sequence Labeling Model for Adverse Drug Reactions Mentions Extraction" has been accepted by IEEE Access.
You can find more details about our paper (in Chinese) on my blog post: Sequence labeling with embedding-level attention.
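For readers who want the core idea without reading the paper, below is a minimal sketch of embedding-level attention for sequence labeling. It is an illustrative reconstruction, not the code in model.py: the bilinear scoring matrix `W`, the self-masking, the concatenation of the context vector with the token embedding, and all names are assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def embedding_attention(E, W):
    """E: (seq_len, dim) word embeddings of one sentence.
    W: (dim, dim) trainable scoring matrix (hypothetical).
    Returns (seq_len, 2*dim): each token's embedding concatenated with an
    attention-weighted context vector over the other embeddings."""
    scores = E @ W @ E.T                # (seq_len, seq_len) pairwise scores
    np.fill_diagonal(scores, -np.inf)   # mask attention of a token to itself
    alpha = softmax(scores, axis=-1)    # attention weights per token
    context = alpha @ E                 # (seq_len, dim) context vectors
    return np.concatenate([E, context], axis=-1)

# Toy usage: 5 tokens with 300-d (GloVe-sized) embeddings
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 300))
W = rng.normal(scale=0.01, size=(300, 300))
augmented = embedding_attention(E, W)
print(augmented.shape)  # (5, 600); such features would feed a BiLSTM tagger
```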
We use two datasets in our paper. The first is a Twitter dataset, which was used in the paper Deep learning for pharmacovigilance: recurrent neural network architectures for labeling adverse drug reactions in Twitter posts. The second is ADE-Corpus-V2, which was used in the paper An Attentive Sequence Model for Adverse Drug Event Extraction from Biomedical Text and is available online: https://sites.google.com/site/adecorpus/home/document.
Because it is against Twitter's Terms of Service to publish the text of tweets, we cannot provide the Twitter dataset. You can obtain it from the authors of Deep learning for pharmacovigilance: recurrent neural network architectures for labeling adverse drug reactions in Twitter posts, so that your copy stays consistent with the one used in our paper.
Please get these two datasets ready and put them in twitter_adr/data and pubmed_adr/data, respectively. The PubMed dataset is already provided in pubmed_adr/data.
For both datasets, we use the pretrained 300-dimensional GloVe word embeddings; please download them and put them in twitter_adr/embeddings.
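If you are unsure how the embeddings file is consumed, the snippet below shows a standard way to load a GloVe text file into a lookup matrix. This is a generic sketch, not the repository's loading code; the file name `glove.840B.300d.txt` and the `vocab` mapping are assumptions for illustration.

```python
import numpy as np

def load_glove(path, vocab, dim=300):
    """Build an embedding matrix for `vocab` (word -> row index) from a
    GloVe text file. Words missing from GloVe keep a small random vector."""
    rng = np.random.default_rng(42)
    matrix = rng.normal(scale=0.1, size=(len(vocab), dim)).astype("float32")
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if word in vocab and len(values) == dim:  # skip malformed lines
                matrix[vocab[word]] = np.asarray(values, dtype="float32")
    return matrix

# Hypothetical usage; adjust the file name to the GloVe variant you download.
# vocab = {"aspirin": 0, "headache": 1, ...}
# embeddings = load_glove("twitter_adr/embeddings/glove.840B.300d.txt", vocab)
```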
The scripts twitter_adr/data_processing.py and pubmed_adr/data_processing.py preprocess the two datasets, respectively.
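As background for what such preprocessing produces, ADR extraction is usually cast as token-level tagging, where each token receives a label such as B-ADR, I-ADR, or O. The toy converter below illustrates that encoding; the exact tag scheme and annotation format used by the data_processing.py scripts may differ, so treat this as an assumption-laden example.

```python
def bio_encode(tokens, spans):
    """tokens: list of (token, start_offset) pairs; spans: list of
    (start, end) character spans of ADR mentions. Returns one tag per token."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        inside = False
        for i, (tok, off) in enumerate(tokens):
            if start <= off < end:
                tags[i] = "I-ADR" if inside else "B-ADR"
                inside = True
    return tags

# "felt dizzy and nauseous" with "dizzy and nauseous" annotated as an ADR
tokens = [("felt", 0), ("dizzy", 5), ("and", 11), ("nauseous", 15)]
print(bio_encode(tokens, [(5, 23)]))  # ['O', 'B-ADR', 'I-ADR', 'I-ADR']
```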
The files twitter_adr/model.py and pubmed_adr/model.py contain the model code and generate the predictions, and approximateMatch.py is the script that performs approximate matching and prints the model's results.
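For intuition on the evaluation, approximate matching typically counts a predicted entity as correct if it overlaps a gold entity, rather than requiring exact boundaries. The sketch below implements that overlap criterion; it is a simplified stand-in for approximateMatch.py, whose exact rules may differ.

```python
def overlaps(a, b):
    """True if two (start, end) token spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]

def approx_f1(gold_spans, pred_spans):
    """Approximate-match precision/recall/F1 over entity spans: a span
    counts as a hit if it overlaps any span on the other side."""
    tp_pred = sum(any(overlaps(p, g) for g in gold_spans) for p in pred_spans)
    tp_gold = sum(any(overlaps(g, p) for p in pred_spans) for g in gold_spans)
    precision = tp_pred / len(pred_spans) if pred_spans else 0.0
    recall = tp_gold / len(gold_spans) if gold_spans else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# A prediction off by one token boundary still counts as a match
print(approx_f1(gold_spans=[(2, 5)], pred_spans=[(3, 5)]))  # (1.0, 1.0, 1.0)
```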
These are the results obtained by our model. The F1 on the Twitter dataset is about 0.84, more than 10% above the previous state of the art (SOTA), and the F1 on the PubMed dataset is about 0.91, about 5% above the previous SOTA. Because results vary across runs, we recommend running the model several times and averaging the results.
Because our model essentially addresses a sequence labeling task, it can be generalized to any token-level classification task, such as Named Entity Recognition (NER) or Part-of-Speech (POS) tagging. Next, I want to validate our model on larger datasets and explore the effects of pre-training.
I want to give a sincere shout-out to the following work: