Causal Relation Extraction From Medical Texts

Causal relation extraction and identification using Conditional Random Fields. This project was carried out under the guidance of our faculty member, Mr. Tirthankar Dasgupta.

Link to the project presentation.

Introduction

A causal relation is a relation between two events: a cause and an effect. The cause produces the effect, and the effect is the result of the cause.

Example: “Hunger is the most common cause of crying in a young baby.” Here the cause is “hunger” and the effect is “crying”.
The present work focuses on detecting and extracting causal relations from medical-domain text.

From the point of view of detecting Causal Relations, the following distinctions may be useful:

• Marked or unmarked: a causation is marked if a specific linguistic unit signals the relation, and unmarked otherwise. “I bought it because I read a good review” is marked; “Be careful. It’s unstable” is not.
• Ambiguity: if the mark always signals a causation, it is unambiguous (e.g. “because”). If it only sometimes signals a causation, it is ambiguous (e.g. “since”).
• Explicit or implicit: a causation is explicit if both arguments are present, and implicit if one or both are missing. “She was thrown out of the hotel after she had run naked through its halls.” is explicit; “John killed Bob.” is implicit, since the effect, Bob’s death, is not explicitly stated.

We focus on marked and explicit causations.
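The marked/unmarked and ambiguous/unambiguous distinctions can be illustrated with a small cue-word lookup. This is only a sketch: the cue lists are hypothetical and contain just the words mentioned in the examples above, and the function `classify_marking` is not part of the project code.

```python
import re

# Hypothetical cue lists for illustration; a real system would need a much
# larger, curated inventory of causal markers.
UNAMBIGUOUS_CUES = {"because"}
AMBIGUOUS_CUES = {"since", "after"}


def classify_marking(sentence: str) -> str:
    """Classify a sentence by the kind of causal cue word it contains."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if any(token in UNAMBIGUOUS_CUES for token in tokens):
        return "marked, unambiguous"
    if any(token in AMBIGUOUS_CUES for token in tokens):
        return "marked, ambiguous"
    return "unmarked"


print(classify_marking("I bought it because I read a good review"))
print(classify_marking("Be careful. It's unstable"))
```

A lookup like this only finds candidate markers. Deciding whether an ambiguous cue such as “since” is actually causal in context needs a learned model.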

Workflow

1. Data Preprocessing
2. Feature Selection and Extraction
3. Training Model
4. Testing Model Prediction Accuracy

Data Preprocessing

Extracting unique words
POS tagging and term labelling (CC: cause, EE: effect, RR: relation, i.e. the causal link word, O: null)
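As a rough illustration of the two preprocessing steps, the sketch below collects unique words and assigns the CC/EE/RR/O labels to one tokenised sentence. The helper names `unique_words` and `label_tokens`, the use of token-index sets as annotation input, and the choice to tag both words of “cause of” as RR are assumptions for this example, not the project's actual code. POS tagging is left out because it needs an external tagger.

```python
def unique_words(sentences):
    """Return the sorted set of lower-cased words across tokenised sentences."""
    return sorted({word.lower() for sentence in sentences for word in sentence})


def label_tokens(tokens, cause, effect, relation):
    """Tag each token as CC, EE, RR or O from sets of token indices."""
    labels = []
    for i in range(len(tokens)):
        if i in cause:
            labels.append("CC")
        elif i in effect:
            labels.append("EE")
        elif i in relation:
            labels.append("RR")
        else:
            labels.append("O")
    return labels


tokens = "Hunger is the most common cause of crying in a young baby".split()
# Index sets are hand-annotated here: "Hunger" is the cause, "crying" the
# effect, and "cause of" the causal link.
labels = label_tokens(tokens, cause={0}, effect={7}, relation={5, 6})
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```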

Code Snippet:-

Feature Selection and Extraction

Word Case (upper/lower)
Word POS
Word title
Type (Alphanumeric/Character)
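The features listed above could be turned into the per-token dictionaries that CRF libraries typically consume. The following is a minimal sketch: the feature names and the functions `word2features` and `sent2features` are assumptions, and “alphanumeric” is interpreted here as “contains at least one digit”, which may differ from the project's definition.

```python
def word2features(sentence, i):
    """Build a feature dict for the i-th (word, pos_tag) pair of a sentence."""
    word, pos = sentence[i]
    return {
        "word.lower": word.lower(),
        "word.isupper": word.isupper(),   # word case: all upper
        "word.islower": word.islower(),   # word case: all lower
        "word.istitle": word.istitle(),   # word title
        "pos": pos,                       # word POS
        # Assumed interpretation of the "Type" feature.
        "word.type": "alphanumeric" if any(c.isdigit() for c in word) else "character",
    }


def sent2features(sentence):
    """Feature dicts for every token in a sentence of (word, pos_tag) pairs."""
    return [word2features(sentence, i) for i in range(len(sentence))]


example = [("Hunger", "NN"), ("causes", "VBZ"), ("crying", "NN")]
features = sent2features(example)
print(features[0])
```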

Model Selection and Training

We use a Conditional Random Field (CRF), a statistical sequence-labelling model, from the sklearn-crfsuite library, and train it on our preprocessed training dataset.
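A training step along these lines might look like the sketch below. It assumes the data arrives as flat rows of (sentence id, feature dict, label), which is a guess at the dataset layout. The hyperparameter values are common starting points rather than the project's settings. The helper names `to_sequences` and `train_crf` are hypothetical, and the library import is placed inside `train_crf` so the data-shaping helper can be used without sklearn-crfsuite installed.

```python
from itertools import groupby


def to_sequences(rows):
    """Group flat (sentence_id, feature_dict, label) rows into per-sentence lists.

    Rows are assumed to be ordered so that each sentence's tokens are adjacent.
    """
    X, y = [], []
    for _, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        X.append([features for _, features, _ in group])
        y.append([label for _, _, label in group])
    return X, y


def train_crf(X_train, y_train):
    """Fit a CRF on lists of feature-dict sequences and label sequences."""
    import sklearn_crfsuite  # third-party: pip install sklearn-crfsuite

    crf = sklearn_crfsuite.CRF(
        algorithm="lbfgs",
        c1=0.1,  # L1 regularisation (assumed value)
        c2=0.1,  # L2 regularisation (assumed value)
        max_iterations=100,
        all_possible_transitions=True,
    )
    crf.fit(X_train, y_train)
    return crf


rows = [
    (0, {"word.lower": "hunger"}, "CC"),
    (0, {"word.lower": "causes"}, "RR"),
    (0, {"word.lower": "crying"}, "EE"),
    (1, {"word.lower": "rest"}, "O"),
]
X_train, y_train = to_sequences(rows)
# crf = train_crf(X_train, y_train)   # requires sklearn-crfsuite
# predictions = crf.predict(X_train)
```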

Code Snippet:-

Model Testing

The trained model is evaluated on the test data; precision, recall, and F1 score are reported.
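To make the three metrics concrete, the sketch below computes token-level precision, recall, and F1 per label from gold and predicted label sequences in plain Python. The function name `per_label_scores` and the toy sequences are illustrative assumptions; the numbers are not the project's results.

```python
from collections import Counter


def per_label_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} computed over all tokens."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_seq, pred_seq in zip(y_true, y_pred):
        for true_label, pred_label in zip(true_seq, pred_seq):
            if true_label == pred_label:
                tp[true_label] += 1
            else:
                fp[pred_label] += 1  # predicted label was wrong
                fn[true_label] += 1  # gold label was missed

    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy example: the effect token is mislabelled as O.
y_true = [["CC", "RR", "EE", "O"]]
y_pred = [["CC", "RR", "O", "O"]]
scores = per_label_scores(y_true, y_pred)
for label in sorted(scores):
    print(label, scores[label])
```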

Code Snippet:-

The Results of Conditional Random Field:-

Future Scope

To get more accurate results, deep neural sequence models such as bidirectional LSTMs could be used.
These models can reach high accuracy because they learn deep feature representations directly from the data. Their main disadvantage is that they require a very large amount of training data.

References

• University Of New Zealand
• Wikipedia
• Automatic Extraction of Causal Relations from Text using Linguistically Informed Deep Neural Networks

Author

 Prateek Gupta

Other Contributor

Special Thanks to Shivendra Pratap Singh for all his efforts and contributions.
