A BERT-based Automated Information Extraction System of Radiology Reports for Bone Fracture Detection and Diagnosis

Lightweight-Integration-Limited/BoneBert

BoneBert: Information Extraction for Bone Fracture Detection and Diagnosis

BoneBert is our proposed information extraction system for bone X-ray radiology reports. Built on Google's BERT, it retrieves the details of bone fracture detection and diagnosis from report text. The "semi-supervised" model is first trained on annotations generated by a handcrafted rule-based labelling system (BonePert) and then fine-tuned on a small set of expert annotations.

Please refer to our paper for more details.

Data

data/
|-- train.csv  / training set
|-- val.csv    / validation set
|-- test.csv   / test set

Here, sample.csv gives a few samples from the training set.
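Before running anything, it can help to check that the three splits are in place and to see which columns they contain. The following is a minimal sketch rather than part of the repository: the helper name `summarize_csv` is hypothetical, and it assumes each file is UTF-8 encoded and starts with a header row.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, number_of_data_rows) for a CSV file.

    Assumption: the first row of the file is a header row.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # empty list if the file is empty
        n_rows = sum(1 for _ in reader)    # count the remaining rows
    return header, n_rows


if __name__ == "__main__":
    # File names follow the data/ layout listed above.
    for name in ("train.csv", "val.csv", "test.csv"):
        path = Path("data") / name
        if path.exists():
            header, n_rows = summarize_csv(path)
            print(f"{name}: {n_rows} rows, columns = {header}")
        else:
            print(f"{name}: not found")
```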

BonePert

Setting Up the Environment

Install the required Python packages.

$ pip install -r requirements.txt

Usage

$ python run_bonepert.py

By default, the code uses the manually expanded rule base in BonePert+. Alternatively, you can use the rule base in BonePert.
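To give a feel for what rule-based labelling means here, the toy sketch below labels a sentence by matching a fracture mention and a simple negation cue. It is an illustration only: the patterns, the label names, and the function `label_sentence` are hypothetical and are not taken from the BonePert or BonePert+ rule bases, which are considerably richer.

```python
import re

# Hypothetical toy rules for illustration; NOT the BonePert rule base.
MENTION = re.compile(r"\bfractur(?:e|es|ed)\b", re.IGNORECASE)
NEGATION = re.compile(
    r"\b(?:no|without|negative for)\b[^.]*\bfractur", re.IGNORECASE
)


def label_sentence(sentence):
    """Assign one of three toy labels to a single report sentence."""
    if not MENTION.search(sentence):
        return "not_mentioned"
    # A negation cue earlier in the same sentence flips the label.
    if NEGATION.search(sentence):
        return "negative"
    return "positive"


if __name__ == "__main__":
    for text in (
        "There is a fracture of the distal radius.",
        "No acute fracture is seen.",
        "Normal alignment.",
    ):
        print(f"{label_sentence(text):>13}  {text}")
```

Labels produced this way are noisy, which is why the model is later fine-tuned on expert annotations.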

BoneBert

Setting Up the Environment

There are two ways of setting up the environment.

1. Docker (recommended)

Build a Docker image from Dockerfile.

$ nvidia-docker build -t bonebert .

Start a container using the image just built.

$ nvidia-docker run -t -d \
--env PYTHONPATH=. \
--env NVIDIA_VISIBLE_DEVICES=all \
--env MODEL_DIR=/model \
--env DATA_DIR=/data \
--env TRAIN_OUTPUT_DIR=/output_train \
--env FINETUNE_OUTPUT_DIR=/output_finetune \
--mount type=bind,source=/$(pwd)/bert/run_bluebert_ner.py,target=/bonebert/bluebert/run_bluebert_ner.py \
-v /$(pwd)/data:/data \
-v /$(pwd)/output_train:/output_train \
-v /$(pwd)/output_finetune:/output_finetune \
bonebert

2. Manual

Clone ncbi-nlp/bluebert and install all the required packages using BlueBert's requirements.txt.

$ pip install -r requirements.txt

Replace the bluebert/run_bluebert_ner.py file with run_bluebert_ner.py.

Usage

To train a BoneBert model on a GPU, please ensure that you have at least 8GB of GPU memory.

1. Convert the data from CSV format to BERT format.

$ python run_convert_to_bert.py
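For readers unfamiliar with token-level input for BERT sequence labelling, the sketch below writes one token and its label per line, separated by a tab, with a blank line between sentences. This CoNLL-style layout is an assumption made for illustration; the function `to_conll` and the label names are hypothetical, and the exact output of run_convert_to_bert.py may differ.

```python
def to_conll(sentences):
    """Serialise labelled sentences into a CoNLL-style string.

    `sentences` is a list of sentences, each a list of (token, label) pairs.
    Assumption: one "token<TAB>label" line per token and one blank line
    after every sentence.
    """
    lines = []
    for sentence in sentences:
        for token, label in sentence:
            lines.append(f"{token}\t{label}")
        lines.append("")  # blank line marks the sentence boundary
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    example = [
        [("No", "O"), ("fracture", "B-FRACTURE")],
        [("Normal", "O")],
    ]
    print(to_conll(example))
```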

2. Train with Labels from BonePert

$ nvidia-docker exec -it [container-id] \
python bluebert/run_bluebert_ner.py \
--do_prepare=true \
--do_train=true \
--do_predict=true \
--task_name=extra \
--vocab_file=$MODEL_DIR/vocab.txt \
--bert_config_file=$MODEL_DIR/bert_config.json \
--init_checkpoint=$MODEL_DIR/bert_model.ckpt \
--num_train_epochs=30.0 \
--do_lower_case=true \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_OUTPUT_DIR

3. Fine-tune with Expert Annotations

$ nvidia-docker exec -it [container-id] \
python bluebert/run_bluebert_ner.py \
--do_prepare=true \
--do_train=true \
--do_predict=true \
--task_name=fracture \
--vocab_file=$MODEL_DIR/vocab.txt \
--bert_config_file=$MODEL_DIR/bert_config.json \
--init_checkpoint=$TRAIN_OUTPUT_DIR/bert_model.ckpt-9645 \
--num_train_epochs=30.0 \
--do_lower_case=true \
--data_dir=$DATA_DIR \
--output_dir=$FINETUNE_OUTPUT_DIR
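The step number at the end of the --init_checkpoint value (9645 above) depends on the size of the training set and the batch size, so your run may end on a different step. The sketch below looks up the highest-numbered checkpoint in a directory. It is not part of the repository: the helper `latest_checkpoint` is hypothetical, and it assumes TensorFlow 1.x checkpoints, where each checkpoint has a companion file whose name ends in `.ckpt-<step>.index`.

```python
import re
from pathlib import Path

# Matches names such as "model.ckpt-9645.index" and captures the prefix and step.
_INDEX_FILE = re.compile(r"^(?P<prefix>.+\.ckpt-(?P<step>\d+))\.index$")


def latest_checkpoint(output_dir):
    """Return the checkpoint prefix with the highest step, or None if absent."""
    best = None  # (step, prefix)
    for path in Path(output_dir).glob("*.index"):
        match = _INDEX_FILE.match(path.name)
        if match is None:
            continue
        step = int(match.group("step"))  # compare numerically, not as text
        if best is None or step > best[0]:
            best = (step, match.group("prefix"))
    return None if best is None else str(Path(output_dir) / best[1])


if __name__ == "__main__":
    # Directory name taken from the Docker volume mounts above.
    print(latest_checkpoint("output_train"))
```

The returned path, with no file extension, is the form that --init_checkpoint expects.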

4. Convert the results to CSV format.

$ python run_analyse_bert.py

Citing BoneBert

@InProceedings{10.1007/978-3-030-74251-5_21,
  author="Dai, Zhihao and Li, Zhong and Han, Lianghao",
  title="BoneBert: A BERT-based Automated Information Extraction System of Radiology Reports for Bone Fracture Detection and Diagnosis",
  booktitle="Advances in Intelligent Data Analysis XIX",
  year="2021",
  publisher="Springer International Publishing",
  address="Cham",
  pages="263--274",
  isbn="978-3-030-74251-5"
}

Acknowledgments

The code is adapted from ncbi-nlp/NegBio and ncbi-nlp/BlueBERT.

We are grateful to the authors of NegBio, CheXpert-labeller, and BlueBERT.
