Background
Charge normalization converts long, specific charge description strings into simplified, general terms for the charge.
For example, "DRIVING WHILE INTOXICATED BAC >=0.15" becomes "Driving Under the Influence of Alcohol".
Currently, this takes place in the cleaner module. The process is simple: the scraped charge text is looked up in a database that maps it to the simplified version. This database is a .json file like the one below.
[{"charge_name": "DRIVING WHILE INTOXICATED BAC >=0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >=.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC>=0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >=0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >= 0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC>=0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >=0.15-A", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC>=0.15 -A", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >=0.15- A (THP)", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC>= 0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >=0.15-A", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >= 0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC > =0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >=.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC >= 0.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC>.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC>=.15", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}
,{"charge_name": "DRIVING WHILE INTOXICATED BAC>=0.15-A", "uccs_code": "4020", "charge_desc": "Driving Under the Influence of Alcohol", "offense_category_desc": "Driving Under the Influence", "offense_type_desc": "DUI Offense"}]
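The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the cleaner module's actual code: the function names `build_index` and `lookup_charge` are hypothetical, and the data is inlined as a string here where the real module would presumably read the .json file from disk.

```python
import json

# Hypothetical inline sample with one record copied from the database above.
SAMPLE_JSON = """
[{"charge_name": "DRIVING WHILE INTOXICATED BAC >=0.15", "uccs_code": "4020",
  "charge_desc": "Driving Under the Influence of Alcohol",
  "offense_category_desc": "Driving Under the Influence",
  "offense_type_desc": "DUI Offense"}]
"""


def build_index(records):
    """Map each raw charge_name to its full record for exact-match lookup."""
    return {record["charge_name"]: record for record in records}


def lookup_charge(index, scraped_text):
    """Return the general charge description, or None without an exact match."""
    record = index.get(scraped_text)
    return record["charge_desc"] if record else None


index = build_index(json.loads(SAMPLE_JSON))

# An exact match is found.
print(lookup_charge(index, "DRIVING WHILE INTOXICATED BAC >=0.15"))
# Any difference in spacing yields no match at all.
print(lookup_charge(index, "DRIVING WHILE INTOXICATED BAC > = 0.15"))
```

Because the key is the raw string, every spelling variant needs its own record, which is why the database lists so many near-identical entries.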
Problem
However, the cleaner often cannot find the generalized form of a charge, because novel variants come through whenever a court clerk types the same charge differently from one day to the next. A charge can also carry additional text, as in "(COUNT ONE) ARSON", which prevents it from matching any entry related to arson.
Goal
Rewrite the cleaner module so that it is better at normalizing types of charges. This may involve something as simple as removing certain types of formatting or something as complex as using machine learning or AI.
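As a starting point for the simple end of that range, here is a sketch that canonicalizes formatting before the lookup and falls back to fuzzy string matching from Python's standard `difflib`. Everything named here (`canonical_key`, `build_index`, `normalize_charge`, the 0.85 cutoff) is an assumption for illustration, and the ARSON record is invented because no arson entry appears in the sample data above.

```python
import difflib
import re

# Hypothetical reference records. The DUI entry comes from the sample database;
# the ARSON entry is invented for illustration.
RECORDS = [
    {"charge_name": "DRIVING WHILE INTOXICATED BAC >=0.15",
     "charge_desc": "Driving Under the Influence of Alcohol"},
    {"charge_name": "ARSON", "charge_desc": "Arson"},
]


def canonical_key(text):
    """Reduce a charge string to a canonical form for matching."""
    text = text.upper()
    text = re.sub(r"\([^)]*\)", " ", text)          # drop parentheticals like (COUNT ONE)
    text = re.sub(r"\s*([<>=])\s*", r"\1", text)    # remove spaces around < > =
    text = re.sub(r"(?<!\d)\.(\d)", r"0.\1", text)  # .15 -> 0.15
    return re.sub(r"\s+", " ", text).strip()        # collapse whitespace


def build_index(records):
    """Index the general description by the canonical form of charge_name."""
    return {canonical_key(r["charge_name"]): r["charge_desc"] for r in records}


def normalize_charge(index, scraped_text, cutoff=0.85):
    """Try an exact match on the canonical key, then a fuzzy match above cutoff."""
    key = canonical_key(scraped_text)
    if key in index:
        return index[key]
    close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None


index = build_index(RECORDS)
for raw in ["(COUNT ONE) ARSON",
            "Driving while intoxicated BAC > = .15",
            "DRIVING WHILE INTOXICATED BAC >=0.15- A (THP)",
            "PETIT LARCENY"]:
    print(repr(raw), "->", normalize_charge(index, raw))
```

Canonicalizing both the database keys and the scraped text lets one record cover many spelling variants. The fuzzy fallback is riskier: distinct charges can differ by only a few characters (a different BAC threshold, for instance), so the cutoff would need tuning against real data, and low-confidence matches may be better flagged for review than accepted automatically.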