This repo contains supplementary material for the paper "Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types", published in the proceedings of the 2020 ACL NUSE Workshop.
paper, video @ ACL (10'), video @ ICML (5'), slides, poster, @medialab
Saldias, B., & Roy, D. (July, 2020) Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types. Proceedings of the 2020 ACL Workshop on Narrative Understanding, Storylines, and Events (NUSE). ACL.
We introduce what is, to our knowledge, the largest dataset of annotated spoken personal narratives, referred to from here on as the Roadtrip Nation (RTN) corpus. These narratives were obtained from transcripts of stories video-recorded by Roadtrip Nation (see Figure 1). In those videos, people from many backgrounds share stories about their lives and career pathways.
- The corpus comprises 10,296 narrative clauses from 594 stories, each told by a different person. Together they account for more than 10 hours of storytelling; each story averages 17.1 clauses (about 62 seconds), and each clause averages 11 tokens. You can find it here: data/Saldias&Roy-RTN_data.csv
- You can find audible stories by exploring Highlight videos in the Roadtrip Nation stories' explorer.
Figure 0: Roadtrip Nation stories' explorer.
- To explain to Mechanical Turk workers how to annotate our clauses, we followed the annotation guidelines for Labov’s sociolinguistic model of personal narratives (Labov et al., 1967) that were constructed by trained researchers in Swanson et al. (2014).
Figure 1: A fragment of a personal narrative in the RTN corpus annotated by Turkers using Labov’s model.
This material is made available under a Creative Commons Attribution 4.0 International License. Please attribute any use of this material to Saldias, B., & Roy, D. (2020) and cite it as stated in the Citation section below. To download the data, either go to our data folder or use the following command:
wget --quiet https://raw.githubusercontent.com/social-machines/acl-nuse-personal-narratives/master/data/Saldias%26Roy-RTN_data.csv
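Once the file is downloaded, the corpus statistics quoted above (stories, clauses, clauses per story, tokens per clause) can be recomputed with a few lines of Python. The following is a minimal sketch, not the authors' code: the helper `summarize_corpus` and the column names `story_id` and `clause` are hypothetical, so check the CSV header and substitute the real column names. Tokens are counted by whitespace splitting, which is an assumption and may not match the tokenization used in the paper.

```python
import csv
import io
from collections import Counter


def summarize_corpus(csv_file, story_column, clause_column):
    """Summarize a clause-per-row CSV given an open text file object.

    story_column and clause_column are hypothetical names; pass the
    actual header names found in the downloaded file.
    """
    clauses_per_story = Counter()
    total_tokens = 0
    for row in csv.DictReader(csv_file):
        clauses_per_story[row[story_column]] += 1
        # Assumption: whitespace tokenization, which may differ from the paper.
        total_tokens += len(row[clause_column].split())

    n_stories = len(clauses_per_story)
    n_clauses = sum(clauses_per_story.values())
    return {
        "stories": n_stories,
        "clauses": n_clauses,
        "clauses_per_story": n_clauses / n_stories if n_stories else 0.0,
        "tokens_per_clause": total_tokens / n_clauses if n_clauses else 0.0,
    }


# Toy in-memory data standing in for the real file (made-up rows).
toy = io.StringIO(
    "story_id,clause\n"
    "s1,I left home\n"
    "s1,It was hard\n"
    "s2,We moved to Boston again\n"
)
stats = summarize_corpus(toy, "story_id", "clause")
print(stats)
```

For the real data, open the downloaded CSV with `open(path, newline="", encoding="utf-8")` and pass the file object in place of `toy`.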
Find the Mechanical Turk requester HTML templates for the three tasks described in the paper here:
- RTN corpus annotation under Labov's framework
- Which one of the stories below is the most similar to the main story?
- In what aspects are the following personal narratives similar?
MTurk Task 1 | MTurk Task 2 |
---|---|
If you use content in this repo, please consider citing us as below:
@inproceedings{saldias-roy-2020-exploring,
title = "Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types",
author = "Saldias, Belen and
Roy, Deb",
booktitle = "Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.nuse-1.10",
pages = "78--86",
}
The authors would like to thank Roadtrip Nation for collecting and sharing their data. We also thank Swanson et al. (2014), who shared their corpus with us, and the anonymous reviewers, who provided valuable feedback. We were inspired by researchers at the Laboratory for Social Machines (LSM) at MIT who are passionate about storytelling. This project was funded by LSM member companies McKinsey & Company and Twitter.