This repository provides materials for our session on cleaning and wrangling data with janitor and forcats, part of the I2DS Tools for Data Science workshop run at the Hertie School, Berlin, in October 2023. The student-run workshop is part of the course Introduction to Data Science, taught by Simon Munzert at the Hertie School, Berlin, in Fall 2023.
This session introduces you to the intricacies of factor management in R using the "forcats" package, as well as data cleaning and tidying with the "janitor" package. Both packages support efficient data manipulation and help keep datasets clean and consistent. Have a look at our presentation to learn more!
The goals of this session are to:
- Equip you with conceptual knowledge about the "forcats" and "janitor" packages.
- Demonstrate various functions and utilities provided by both packages.
- Provide you with practice material on how to efficiently wrangle and clean data with both packages.
- Elena Dreyer
- Luis Fernando Ramirez Ruiz
- Shruti Kakade
janitor
For original material on the janitor package
forcats
- Official R Documentation (including all sublinks)
- Another R Documentation
- Wickham, H., & Grolemund, G. (2017). Factors. In R for Data Science (1st ed.). O’Reilly Media, Inc. https://r4ds.had.co.nz/factors.html
- Wickham, H., & Grolemund, G. (2023). Factors. In R for Data Science (2nd ed.). O’Reilly Media, Inc. https://r4ds.hadley.nz/factors.html
- McNamara, A., & Horton, N. J. (2017). Wrangling categorical data in R. PeerJ Preprints, 5, e3163v2. https://doi.org/10.7287/peerj.preprints.3163v2
- Cleaning and Exploring Data with the “janitor”
- RDocumentation - Janitor
- Tabyl: frequency tables for R users
- 'forcats' cheatsheet (credit for the nice images)
- Click here to take our quiz and test your knowledge of the forcats package!
- Have a look at our [practice materials](https://github.com/elenaivadreyer/01-wrangling-dreyer-ramirez-kakade/blob/main/final_presentation_files/janitor_exercises.Rmd) and try out janitor to wrangle untidy data.
The material in this repository is made available under the MIT license.
Elena Dreyer prepared the presentation slides for forcats, recorded the session on forcats, and created the Mentimeter quiz.
Luis Fernando Ramirez Ruiz prepared the presentation slides for janitor, recorded the session on janitor, and created the practice material.