-
Notifications
You must be signed in to change notification settings - Fork 1
/
09-dplyr.Rmd
59 lines (44 loc) · 2.22 KB
/
09-dplyr.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
---
title: "Data manipulation with dplyr"
output:
html_document:
toc: true
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## The tidyverse packages
- Website: http://www.tidyverse.org/packages/
- Install with `install.packages("tidyverse")`
Tidyverse is a package of packages that shares a common philosophy designed for data wrangling, exploring, and visualization. Of the many packages associated with tidyverse we've already covered a brief introduction to `ggplot2`, and will next cover `dplyr` and `tidyr`.
dplyr is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges:
- `mutate()` adds new variables that are functions of existing variables
- `select()` picks variables based on their names.
- `filter()` picks cases based on their values.
- `summarise()` reduces multiple values down to a single summary.
- `arrange()` changes the ordering of the rows.
These all combine naturally with `group_by()` which allows you to perform any operation "by group". You can learn more about them in `vignette("dplyr")`. As well as these single-table verbs, dplyr also provides a variety of two-table verbs, which you can learn about in `vignette("two-table")`.
## Pipe operator
`%>%` pipes an object forward into a function or call expression
- Basic piping
- x %>% f is equivalent to f(x)
- x %>% f(y) is equivalent to f(x, y)
- x %>% f %>% g %>% h is equivalent to h(g(f(x)))
- Using `%>%` with unary function calls
```{r, echo=TRUE}
require(tidyverse)
iris %>% head
iris %>% tail
```
More information available at: http://magrittr.tidyverse.org/
RStudio keyboard shortcut for `%>%`:
- Ctrl-Shift-M (Windows)
- Command-Shift-M (Mac)
## Lesson
1. [Dataframe manipulation with dplyr](http://swcarpentry.github.io/r-novice-gapminder/13-dplyr/index.html)
## Optional lesson
1. [Programming with dplyr](http://dplyr.tidyverse.org/articles/programming.html)
## Resources
1. [Data wrangling cheat sheet](https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf)
1. [Relational data](https://r4ds.had.co.nz/relational-data.html)
1. [What is the tidyverse?](https://rviews.rstudio.com/2017/06/08/what-is-the-tidyverse/)