ICS02: 8. Introduction to R
Thursday Feb 28, 16:00 UK = 18:00 EET
Convenors: Christopher Ohge & Gabriel Bodard (University of London)
YouTube link: https://youtu.be/X8iCDZVgWSA
Access the HTML version of the notebook, with the visualisations
This session will introduce basic programming concepts with the R language. After an introductory lesson on regular expressions, R syntax, and basic R functions, we will use the tidytext package to perform text analysis tasks.
- Christopher [CO] reads outline
- Preliminary remarks (5 min), Gabby [GB] and CO:
  - Most common programming languages in DH
  - Why R & why is it important?
- Regular expressions (15 min), GB:
  - Exercises on Gutenberg texts
- Intro to R and tidytext (40 min), CO
Before the session, make sure to download the R software package from http://www.r-project.org/.
- Click on "download R."
- Choose the appropriate CRAN mirror in your area for downloading (for me it's the UK > Imperial College London link).
- Download and install the appropriate R 3.5.2 binary for your operating system.
Then download the latest version of RStudio at https://www.rstudio.com.
- Click on "Download RStudio."
- Download the RStudio Desktop (free) version.
- Choose the appropriate installer: most of you will use RStudio 1.1.463 for either Windows Vista/7/8/10 or Mac OS X 10.6+.
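Once R and RStudio are installed, the packages used in the session can be installed from the R console. The following is only a sketch: `tidytext` and `gutenbergr` are named in this outline, while `dplyr` and `ggplot2` are an assumption, added because tidytext workflows commonly rely on them. The helper `missing_packages()` is a hypothetical name introduced for illustration, and the sketch assumes all four packages are available from your CRAN mirror.

```r
# Packages named in this outline (tidytext, gutenbergr); dplyr and ggplot2 are
# an assumption, not something the outline itself lists.
session.packages <- c("tidytext", "gutenbergr", "dplyr", "ggplot2")

# Hypothetical helper: return the packages that are not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to.install <- missing_packages(session.packages)

# Install only when run interactively, so that sourcing this script never
# triggers an unexpected download.
if (interactive() && length(to.install) > 0) {
  install.packages(to.install)
}
```

Running this in the RStudio console installs only what is missing, so it is safe to run more than once.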
- Hawkins, Laura F. 'Computational Models for Analyzing Data Collected from Reconstructed Cuneiform Syllabaries.' Digital Humanities Quarterly 12.1 (2018). Available: http://digitalhumanities.org:8081/dhq/vol/12/1/000368/000368.html (Wayback Machine version)
- Rockwell, G. 'What is Text Analysis, Really?' Literary and Linguistic Computing 18.2 (2003): 209-219. Available: http://www.geoffreyrockwell.com/publications/WhatIsTAnalysis.pdf
- Rydberg-Cox, Jeff. Statistical Methods for Studying Literature in R. Available: https://daedalus.umkc.edu/StatisticalMethods/index.html
- Silge, Julia, and David Robinson. Text Mining with R: A Tidy Approach. Available: https://www.tidytextmining.com/. Especially chapters 1–3.
- Jockers, Matthew. Text Analysis with R for Students of Literature (Springer, 2014). Especially chapters 1, 2, 6, 7, and 11.
- Regex Tester: https://www.regextester.com/
- Regex Quickstart: https://www.rexegg.com/regex-quickstart.html
- tba
- How could we create a regular expression to remove those non-words in `dickens.words.v`?
- Modify the for loop above by confining your results to only dialogue in `dickens.words.v`.
- Using the gutenbergr package, load some new text files (more than one, please) that interest you. Create a tidy tibble of the textual data and choose a visualisation method for displaying your results.
- Based on your results, posit a new question (or questions) about what you would like to investigate further. Modify one or more code blocks from Part I of the R Notebook to answer your question.
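As a starting point for the first exercise, here is one possible approach in base R, offered as a sketch rather than the expected answer. The vector below is a small hypothetical stand-in, since the real `dickens.words.v` is built in the R Notebook and is not reproduced here. The definition of a "non-word" (empty strings, digits, bare punctuation) is also an assumption.

```r
# Hypothetical stand-in for the notebook's dickens.words.v.
dickens.words.v <- c("it", "was", "the", "best", "of", "times",
                     "", "1859", "--", "don't", "_")

# Keep tokens made only of letters, optionally joined by internal
# apostrophes or hyphens (e.g. "don't"). Assumption: everything else
# counts as a non-word.
word.pattern <- "^[[:alpha:]]+(['-][[:alpha:]]+)*$"

# grepl() returns TRUE/FALSE per element; use it to subset the vector.
clean.words.v <- dickens.words.v[grepl(word.pattern, dickens.words.v)]

print(clean.words.v)
```

Subsetting with `grepl()` keeps the matching words rather than deleting the unwanted ones, which makes the rule for what counts as a word explicit and easy to adjust.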