(We didn't use Family Guy but the name stuck)
We will clean and analyze online transcripts of several popular American sitcoms (Family Guy, South Park, Seinfeld, SpongeBob) and present our findings. One of our primary goals is to gain insight into how humor changes from more to less mature sitcoms. We also compare the complexity of vocabulary between the shows. The desired outcome of this project is a text generator that can create new content in the style of each individual show.
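As a rough illustration of what a style-based text generator can look like, here is a minimal word-level Markov chain sketch. It is not necessarily the approach this project took; the function names `build_chain` and `generate`, the `order` parameter, and the sample text are all hypothetical and chosen only for this example.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state and return up to `length` words."""
    rng = random.Random(seed)
    # Sort the states so a fixed seed always picks the same starting point.
    state = rng.choice(sorted(chain))
    order = len(state)
    output = list(state)
    for _ in range(length - order):
        followers = chain.get(tuple(output[-order:]))
        if not followers:
            # Dead end: this word sequence only appeared at the end of the text.
            break
        output.append(rng.choice(followers))
    return " ".join(output[:length])


if __name__ == "__main__":
    # Placeholder text; in practice this would be one show's cleaned transcript.
    sample = "the cat sat on the mat and the cat ran to the door"
    print(generate(build_chain(sample, order=1), length=10, seed=1))
```

Training one chain per show would, under these assumptions, give one generator per style. A higher `order` copies longer phrases from the source, while a lower one produces looser, less coherent output.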
This project was originally presented at the Data Science at UCSB 2018 Project Showcase. Check out Data Science at UCSB here
- Jay Singh
- Lauren Shin
- Evan Azevedo
- Liam Abrams
- Mikaela Guerrero
- Jerry Liu
- Stevyn Fessler
- Michelle Su
Special thanks to Jason Freeberg and Timothy Nguyen for guiding us in the early and later stages of the project, respectively.
- Beautiful Soup
- pandas
- scikit-learn
See requirements.txt
We hope to learn something about the target audience of each show from its transcripts, based on the vocabulary used and the general tone of the dialogue.
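One simple way to put numbers on the vocabulary used is to compute a type-token ratio (unique words divided by total words) and an average word length for each transcript. The sketch below uses only the standard library; `tokenize` and `vocabulary_stats` are hypothetical helper names, and these metrics are an assumption about what "complexity" could mean here, not the project's confirmed method.

```python
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def vocabulary_stats(text):
    """Return basic vocabulary measures for one transcript."""
    tokens = tokenize(text)
    if not tokens:
        return {"tokens": 0, "types": 0,
                "type_token_ratio": 0.0, "avg_word_length": 0.0}
    types = set(tokens)
    return {
        "tokens": len(tokens),                        # total words
        "types": len(types),                          # distinct words
        "type_token_ratio": len(types) / len(tokens),
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens),
    }


if __name__ == "__main__":
    # Placeholder line; a real run would pass in a full cleaned transcript.
    print(vocabulary_stats("The cat and the dog"))
```

The type-token ratio tends to fall as a text gets longer, so a fair comparison between shows would use samples with the same number of words.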