Extract involved crash humans and their injuries automatically from article text #14

Open
digitaldutch opened this issue Sep 1, 2024 · 0 comments

digitaldutch commented Sep 1, 2024

Currently, volunteers determine the humans involved in a crash and their injuries by copying the article's full text and selecting this data manually. Helpers click the Add button and fill in a URL to a media page. The roaddanger spider then tries to read the meta tags (JSON-LD, Twitter/X, Open Graph, etc.).
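
To illustrate the meta tag step, here is a minimal sketch in Python of reading Open Graph, Twitter/X and JSON-LD data from a page's HTML. It is not the roaddanger spider's actual code; the class `MetaTagParser` and function `read_meta_tags` are hypothetical names, and only the Python standard library is used.

```python
import json
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects Open Graph / Twitter meta tags and JSON-LD blocks (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}       # e.g. {"og:title": "..."}
        self.json_ld = []    # parsed JSON-LD objects
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # Open Graph uses "property", Twitter/X cards usually use "name".
            key = attrs.get("property") or attrs.get("name")
            if key and key.startswith(("og:", "twitter:")) and attrs.get("content"):
                self.meta[key] = attrs["content"]
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD instead of failing the whole page


def read_meta_tags(html: str) -> dict:
    """Return the meta tags and JSON-LD objects found in an HTML string."""
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    return {"meta": parser.meta, "json_ld": parser.json_ld}
```

Anything the page does not expose this way (for example the full article text) would still need to be copied by hand or fetched by other means.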

Any missing data is copied manually. From the title and full text, all important data is extracted, such as:

  • All involved humans and their mode of transportation. All transport options can be found on the data export page.
  • Their injuries: dead, injured, unharmed or unknown
  • Whether the human is a child (below 18)
  • Whether the human was intoxicated
  • Whether the human drove away or fled
  • Whether it was a one-sided crash (no other humans involved)
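
The fields above could be captured in a small structured schema that a language model is asked to fill in as JSON. The following Python sketch is only an assumption about how that might look: the names `InvolvedHuman`, `CrashExtraction`, `parse_extraction` and all JSON keys are hypothetical and do not reflect the actual roaddanger data model, and `transport_mode` is a free string here because the real list of options lives on the data export page.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Health(Enum):
    DEAD = "dead"
    INJURED = "injured"
    UNHARMED = "unharmed"
    UNKNOWN = "unknown"


@dataclass
class InvolvedHuman:
    transport_mode: str                    # hypothetical label, e.g. "bicycle"
    health: Health = Health.UNKNOWN
    is_child: Optional[bool] = None        # below 18; None if the text does not say
    is_intoxicated: Optional[bool] = None
    fled_scene: Optional[bool] = None      # drove away or fled


@dataclass
class CrashExtraction:
    humans: List[InvolvedHuman]
    one_sided: bool = False                # no other humans involved


def parse_extraction(raw_json: str) -> CrashExtraction:
    """Turn a model's JSON answer (assumed format) into typed objects."""
    data = json.loads(raw_json)
    humans = [
        InvolvedHuman(
            transport_mode=h["transport_mode"],
            health=Health(h.get("health", "unknown")),  # raises ValueError on unexpected values
            is_child=h.get("is_child"),
            is_intoxicated=h.get("is_intoxicated"),
            fled_scene=h.get("fled_scene"),
        )
        for h in data.get("humans", [])
    ]
    return CrashExtraction(humans=humans, one_sided=bool(data.get("one_sided", False)))
```

Using `None` for unstated attributes keeps "the article does not say" distinct from an explicit "no", which matters when a volunteer reviews the pre-filled form.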

Entry screen: (image: data entry screen)

Entering data would be faster if these steps could be automated using an AI language model that reads the text and then automatically selects all involved humans and their characteristics. As roaddanger.org is multilingual, it would be nice if this feature supported multiple languages.

All current crash data (full texts and all metadata, such as the involved humans) can be downloaded in JSON format from this page. This data can be used to train or test the language models.
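
As a rough idea of how the exported data could be used for testing, the Python sketch below scores a model's predicted humans against the volunteer-entered humans for one article. The structure of the export is not described here, so the dictionary keys `transport_mode` and `health` and the function names are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Tuple


def human_key(human: Dict) -> Tuple:
    """Reduce a human record to the fields being compared (assumed keys)."""
    return (human.get("transport_mode"), human.get("health"))


def score_article(labelled: List[Dict], predicted: List[Dict]) -> Tuple[float, float]:
    """Return (precision, recall) for one article.

    Humans are compared as a multiset, so two injured cyclists in the labels
    need two injured cyclists in the prediction to count as fully matched.
    """
    gold = Counter(human_key(h) for h in labelled)
    pred = Counter(human_key(h) for h in predicted)
    matched = sum((gold & pred).values())  # multiset intersection
    precision = matched / sum(pred.values()) if pred else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    return precision, recall
```

Averaging these scores over all exported articles, and per language, would give a first indication of whether the model is accurate enough to pre-fill the entry screen.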

digitaldutch changed the title from "Extract involved crash humans their injury automatically from article text" to "Extract involved crash humans and their injuries automatically from article text" on Sep 1, 2024