Commit
Showing 1 changed file with 18 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,22 @@ | ||
# 2024_Inconsistency_Detection | ||
# A Natural Language Processing Approach to Detect Inconsistencies in Death Investigation Notes Attributing Suicide Circumstances | ||
|
||
This repository contains the source code for Detecting Suicide-Cause Annotation Inconsistencies in Death Investigation Notes. | ||
|
||
## Abstract | ||
|
||
**Background**: Data accuracy is essential for scientific research and policy development. Data from the National Violent Death Reporting System (NVDRS) are widely used to discover the patterns and causal factors of death. Recent studies have suggested that annotation inconsistencies exist within the NVDRS and may lead to erroneous suicide-circumstance attributions. | ||
|
||
**Methods**: We present an empirical Natural Language Processing (NLP) approach to detect annotation inconsistencies and adopt a cross-validation-like paradigm to identify possible label errors. We analyzed 267,804 suicide death incidents between 2003 and 2020 from the NVDRS. We measured annotation inconsistency by the degree of change in the F-1 score. | ||
|
||
**Results**: Our results show that incorporating the target state’s data into the training of the suicide-circumstance classifier increases the F-1 score by 5.4% on the target state’s test set and decreases it by 1.1% on the other states’ test sets. | ||
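
The F-1 comparison described above can be illustrated with a minimal sketch. It is not taken from the repository's source code: the function names `f1_score` and `f1_change`, the binary single-label setup, and the toy labels are all assumptions made for illustration. The idea is to score the same test set twice, once with a classifier trained without the target state's data and once with it, and to read the difference in F-1 as the inconsistency signal.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F-1 score for one circumstance label (1 = circumstance present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_change(
    y_true: Sequence[int],
    pred_without_target: Sequence[int],
    pred_with_target: Sequence[int],
) -> float:
    """Change in F-1 on one test set after adding the target state's data to training."""
    return f1_score(y_true, pred_with_target) - f1_score(y_true, pred_without_target)


# Toy, made-up labels for a target state's test set (assumption, not NVDRS data).
target_true = [1, 1, 1, 0, 0, 0]
target_pred_without = [1, 0, 0, 0, 0, 1]  # classifier trained on other states only
target_pred_with = [1, 1, 1, 0, 0, 1]     # classifier trained with target state's data

delta_target = f1_change(target_true, target_pred_without, target_pred_with)
print(f"F-1 change on the target state's test set: {delta_target:+.3f}")
```

Under this reading, a large gain on the target state's test set paired with a loss on the other states' test sets would suggest that the target state's annotations follow a different coding pattern from the rest.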
|
||
**Conclusions**: We present an NLP framework to detect annotation inconsistencies, show its effectiveness in identifying and rectifying possible label errors, and propose a solution to improve the coding consistency of human annotators. | ||
|
||
## Citation | ||
|
||
Wang S, Zhou Y, Han Z, Tao C, Xiao Y, Ding Y, Ghosh J, Peng Y. Uncovering Misattributed Suicide Causes through Annotation Inconsistency Detection in Death Investigation Notes. arXiv preprint arXiv:2403.19432. 2024 Mar 28. | ||
|
||
## Acknowledgement | ||
|
||
This study was supported by AIM-AHEAD Consortium Development Program. | ||
This study was supported by the AIM-AHEAD Consortium Development Program of NIH under grant number OT2OD032581, the National Institute on Aging (NIA) of NIH under grant number RF1AG072799, the National Institute of Allergy and Infectious Diseases (NIAID) of NIH under grant number 1R01AI130460, the National Science Foundation under grant number 2145640, the Center for Health Economics of Treatment Interventions for Substance Use Disorder, HCV, and HIV (NIDA P30DA040500), and the National Institute for Health Care Management Research and Educational Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or NSF. | ||
|