The main difference is that NegBio does not work for Portuguese, so we adapt Brazilian Negex triggers to detect negation and uncertainty. This code uses the EasyNegex repository's implementation of Negex.
Please install the following dependencies or use the Dockerized labeler (see below).
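To give a rough idea of what trigger-based detection means, here is a minimal sketch, assuming a much simplified scheme in which a trigger phrase found in the few words before a term marks that term as negated or uncertain. It does not use the EasyNegex API, and the trigger lists, the `classify_mention` function, and the `window` parameter are hypothetical names chosen for illustration, not the triggers or code shipped with this labeler.

```python
import re

# Illustrative Portuguese triggers only (assumption); NOT the trigger
# list shipped with EasyNegex or used by this labeler.
NEGATION_TRIGGERS = ["sem", "ausência de", "não há"]
UNCERTAINTY_TRIGGERS = ["possível", "sugestivo de"]


def classify_mention(sentence, term, window=5):
    """Return 'negated', 'uncertain', 'affirmed', or None if term is absent."""
    text = sentence.lower()
    term_pos = text.find(term.lower())
    if term_pos == -1:
        return None
    # Only inspect the last `window` words that precede the term.
    preceding = " ".join(text[:term_pos].split()[-window:])
    labelled_triggers = (
        ("negated", NEGATION_TRIGGERS),
        ("uncertain", UNCERTAINTY_TRIGGERS),
    )
    for label, triggers in labelled_triggers:
        for trigger in triggers:
            # Word boundaries avoid matching a trigger inside a longer word.
            if re.search(r"\b" + re.escape(trigger) + r"\b", preceding):
                return label
    return "affirmed"


print(classify_mention("Sem sinais de derrame pleural.", "derrame pleural"))
print(classify_mention("Possível consolidação no lobo inferior.", "consolidação"))
```

The real Negex algorithm also handles triggers that follow the term and phrases that end a trigger's scope, which this sketch leaves out.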
- Clone the EasyNegex repository to the root of this repository:
git clone https://github.com/fuchsfelipel/easyNegex
- Make the virtual environment:
conda env create -f environment.yml
- Activate the virtual environment:
conda activate chexpert-label
- Install NLTK data:
python -m nltk.downloader universal_tagset punkt wordnet
Place reports in a headerless, single-column CSV file {reports_path}. Each report must be enclosed in quotes if (1) it contains a comma or (2) it spans multiple lines. See sample_reports.csv (with output labeled_reports.csv) for an example.
python label.py --reports_path {reports_path}
Run python label.py --help for descriptions of all of the command-line arguments.
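If your reports live in Python rather than in a file, one way to produce the input format described above (no header, one column, quotes where needed) is Python's standard `csv` module. This is a sketch only: the helper names `reports_to_csv_text` and `write_reports`, the file name `my_reports.csv`, and the example report texts are assumptions for illustration and are not part of this repository.

```python
import csv
import io


def reports_to_csv_text(reports):
    """Serialize reports as a headerless, single-column CSV string."""
    buffer = io.StringIO()
    # QUOTE_MINIMAL quotes a field only when it needs it, for example
    # when it contains a comma or a line break.
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL)
    for report in reports:
        writer.writerow([report])  # one report per row, no header row
    return buffer.getvalue()


def write_reports(reports, reports_path):
    """Write the CSV text to reports_path (hypothetical helper)."""
    # newline="" stops Python from translating the line endings that
    # the csv module already produced.
    with open(reports_path, "w", newline="", encoding="utf-8") as f:
        f.write(reports_to_csv_text(reports))


example_reports = [
    "Sem derrame pleural.",
    "Coração normal, sem alterações.",   # contains a comma
    "Linha 1\nLinha 2",                  # spans multiple lines
]
print(reports_to_csv_text(example_reports))
```

Calling `write_reports(example_reports, "my_reports.csv")` would then give a file that can be passed to `--reports_path`. Double quotes inside a report are doubled by the `csv` module, which is the usual CSV escaping convention.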
This repository builds upon the work of CheXpert, Negex and EasyNegex.
If you're using the BRAX labeler, which is forked from the CheXpert labeling tool, please cite both the BRAX dataset and CheXpert.