my_class_reference.rmd

---
title: "Class Reference"
author: "Student name"
output:
  html_document:
    theme: cerulean
    highlight: pygments
    toc: true
    toc_float:
      collapsed: true
      smooth_scroll: false
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Introduction

Consider this a personal guide to the commands and functions you will learn. In general, when you come across an R command or function that you want to remember, put it in here along with a description of what it does and when you'd use it.


## Things I Learned on Jan. 27

The command to set a working directory is setwd(). For example:

```{r}
setwd("~/Documents/GitHub/data_journalism_2022_spring")
```

```{r}
install.packages("tidyverse")
library(tidyverse)
```


### Summarizing

I need to use group_by and summariz. Here's an example of grouping by county and calculating counts, sum and other descriptive statistics.


```{r}
ppp_maryland_loans %>%
  group_by(project_county_name) %>%
  summarise(
    count_loans = n(),
    total_loans_amount = sum(amount),
    mean_loan_amount = mean(amount),
    median_loan_amount = median(amount),
    min_loan_amount = min(amount),
    max_loan_amount = max(amount)
  ) %>%
  arrange(desc(max_loan_amount))
```
Store of a new variable object 
```{r}
"fatima" -> wv_winred_contribs %>%
  
```
### Lubridate - this package makes it easier to do the things R does with date-times and possible to do the things R does not.

### head is the title, summary is summary of data, colnames is the brief version of each column name, and glimpse is to get a brief sense of data. 

```{r}
head(primary_18)
summary(primary_18)
colnames(primary_18)
glimpse(primary_18)
```

### How to add a new column based on an existing column. Run the following code to create a new column called `percent_election_day` based on a calculation using two existing columns.

```{r}
primary_18_with_election_day_percent <- primary_18 %>%
  select(office, district, name_raw, party, jurisdiction, election_day, votes) %>% 
  mutate(
  percent_election_day = election_day/votes
)
```

### The following code organize data by the new column. 

```{r}
# better ordering?
primary_18 %>%
  select(office, district, name_raw, party, jurisdiction, election_day, votes) %>% 
  mutate(
  percent_election_day = (election_day/votes)*100
)  %>% arrange(desc(percent_election_day))
```

### Mutate is also useful for standardizing data - for example, making different spellings of, say, cities into a single one.
You'll notice that there's a mix of styles: "Baltimore" and "BALTIMORE" for example. R will think those are two different cities, and that will mean that any aggregates we create based on city won't be accurate.
Mutate - it's not just for math! And a function called `str_to_upper` that will convert a character column into all uppercase. Now we can say exactly how many donations came from Baltimore (I mean, of course, BALTIMORE).

```{r}
standardized_maryland_cities <- maryland_cities %>%
  mutate(
    upper_city = str_to_upper(city)
)
```

### filter helps with reducing number of rows while select helps with reducing number of columns

### removing some objects from the workspace write the code down in console:
> rm(maryland_expenses)