diff --git a/config.yml b/config.yml index 65bd4ebc..71d669ef 100644 --- a/config.yml +++ b/config.yml @@ -9,7 +9,7 @@ default: #grades_due: "January 26" holiday: "In recognition of [Martin Luther King Jr. Day](https://en.wikipedia.org/wiki/Martin_Luther_King_Jr._Day), there will be **no class on Monday January 15th, 2024**." google_survey: "https://forms.gle/hQ7jdHZSZoDpHAdE6" - slack_workspace: "https://intro-to-r-140-604.slack.com/" + slack_workspace: "https://daseh.slack.com/" # ta: "Padmashri Saravanan (psarava1 at jhu.edu), Alex Newman (anewma28 at jhu.edu)" ds: "Candace Savonen (csavonen at fredhutch.org)" instructors: "[Carrie Wright](https://carriewright11.github.io/) (cwright2 @ fredhutch.org), [Ava Hoffman](https://www.avahoffman.com/) (ahoffma2 at fredhutch.org), [Elizabeth Humphries]() (ehumphri at fredhutch.org)" diff --git a/modules/Intro/Intro.Rmd b/modules/Intro/Intro.Rmd index 216698ad..42054ef8 100644 --- a/modules/Intro/Intro.Rmd +++ b/modules/Intro/Intro.Rmd @@ -13,16 +13,14 @@ opts_chunk$set(comment = "") ## Welcome to class! -1. Introductions -2. Class overview -3. Getting R up and running - +1. Introductions +2. Class overview +3. Getting R up and running ```{r, fig.alt="Welcome!", out.width = "60%", echo = FALSE, fig.align='center'} knitr::include_graphics("images/welcome.jpg") ``` - [Photo by Belinda Fewings on Unsplash] ## Before we start .. 
@@ -39,10 +37,9 @@ Associate, Department of Biostatistics, JHSPH PhD in Biomedical Sciences -Email: cwright2@fredhutch.org Web: https://carriewright11.github.io +Email: [cwright2\@fredhutch.org](mailto:cwright2@fredhutch.org){.email} Web: https://carriewright11.github.io ```{r, fig.alt="Carrie's picture", out.width = "30%", echo = FALSE, fig.align='center'} -# knitr::include_graphics("https://ca.slack-edge.com/T023TPZA8LF-U024F9G49S8-9861ddd543db-192") knitr::include_graphics(here::here("modules", "Intro", "images", "carrie.png")) ``` @@ -56,7 +53,7 @@ Associate, Department of Biostatistics, JHSPH PhD in Ecology -Email: ahoffma2@fredhutch.org Web: https://avahoffman.com +Email: [ahoffma2\@fredhutch.org](mailto:ahoffma2@fredhutch.org){.email} Web: https://avahoffman.com ```{r, fig.alt="Ava's picture", out.width = "30%", echo = FALSE, fig.align='center'} knitr::include_graphics(here::here("modules", "Intro", "images", "ava.png")) @@ -70,7 +67,7 @@ Staff Scientist, Fred Hutchinson Cancer Center PhD in Molecular Epidemiology -Email: ehumphri@fredhutch.org +Email: [ehumphri\@fredhutch.org](mailto:ehumphri@fredhutch.org){.email} **NOTE** this is not her dog @@ -97,7 +94,6 @@ Like learning a spoken language, programming takes **practice**. knitr::include_graphics("images/sweeping-the-ocean.gif") ``` - ## The Learning Curve Learning R has been career changing for all of us, and we want to share that! @@ -108,61 +104,57 @@ We want you to succeed -- We will get through this together! knitr::include_graphics("images/low-five-high-five.gif") ``` - ## What is R? -- R is a language and environment for statistical computing and graphics developed in 1991 +- R is a language and environment for statistical computing and graphics developed in 1991 -- R is the open source implementation of the [S language](https://en.wikipedia.org/wiki/S_(programming_language)), which was developed by [Bell laboratories](https://ca.slack-edge.com/T023TPZA8LF-U024EN26Q0L-113294823b2c-512) in the 70s. 
+- R is the open source implementation of the [S language](https://en.wikipedia.org/wiki/S_(programming_language)), which was developed by [Bell laboratories](https://ca.slack-edge.com/T023TPZA8LF-U024EN26Q0L-113294823b2c-512) in the 70s. -- The aim of the S language, as expressed by John Chambers, is "to turn ideas into software, quickly and faithfully" +- The aim of the S language, as expressed by John Chambers, is "to turn ideas into software, quickly and faithfully" ```{r, fig.alt="Bell Labs old logo", out.width = "40%", echo = FALSE, fig.align='center'} knitr::include_graphics("https://upload.wikimedia.org/wikipedia/commons/thumb/9/98/Bell_Laboratories_logo.svg/2880px-Bell_Laboratories_logo.svg.png") ``` - -[source: http://www.r-project.org/, https://en.wikipedia.org/wiki/S_(programming_language), https://en.wikipedia.org/wiki/Bell_Labs)] +[source: , [https://en.wikipedia.org/wiki/S\_(programming_language)](https://en.wikipedia.org/wiki/S_(programming_language)){.uri}, )] ## What is R? -- **R**oss Ihaka and **R**obert Gentleman at the University of Auckland, New Zealand developed R - +- **R**oss Ihaka and **R**obert Gentleman at the University of Auckland, New Zealand developed R -- R is both [open source](https://en.wikipedia.org/wiki/Open_source) and [open development](https://en.wikipedia.org/wiki/Open-source_software_development) +- R is both [open source](https://en.wikipedia.org/wiki/Open_source) and [open development](https://en.wikipedia.org/wiki/Open-source_software_development) ```{r, fig.alt="R logo", out.width = "20%", echo = FALSE, fig.align='center'} knitr::include_graphics("images/Rlogo.png") ``` - -[source: http://www.r-project.org/, https://en.wikipedia.org/wiki/R_(programming_language)] +[source: , [https://en.wikipedia.org/wiki/R\_(programming_language)](https://en.wikipedia.org/wiki/R_(programming_language)){.uri}] ## Why R? 
-* Free (open source) +- Free (open source) -* High level language designed for statistical computing +- High level language designed for statistical computing -* Powerful and flexible - especially for data wrangling and visualization +- Powerful and flexible - especially for data wrangling and visualization -* Extensive add-on software (packages) +- Extensive add-on software (packages) -* Strong community +- Strong community ```{r, fig.alt="R-Ladies - a non-profit civil society community", out.width = "20%", echo = FALSE, fig.align='center'} knitr::include_graphics("https://github.com/rladies/branding-materials/raw/main/logo/R-LadiesGlobal_RBG_online_LogoWithText_Horizontal.png") ``` -[source: https://github.com/rladies/meetup-presentations_baltimore] -## Why not R? +[source: ] +## Why not R? -* Little centralized support, relies on online community and package developers +- Little centralized support, relies on online community and package developers -* Annoying to update +- Annoying to update -* Slower, and more memory intensive, than the more traditional programming languages (C, Perl, Python) +- Slower, and more memory intensive, than the more traditional programming languages (C, Perl, Python) ```{r, fig.alt="tortoise and hare", out.width = "40%", echo = FALSE, fig.align='center'} knitr::include_graphics("images/tortoise_hare.jpg") @@ -170,7 +162,6 @@ knitr::include_graphics("images/tortoise_hare.jpg") [[source -School vector created by nizovatina - www.freepik.com](https://www.freepik.com/vectors/school)] - ## Introductions What do you hope to get out of the class? @@ -180,16 +171,16 @@ Why do you want to use R? ```{r, fig.alt="image of rocks with word hope painted on", out.width = "60%", echo = FALSE, fig.align='center'} knitr::include_graphics("images/hope.jpg") ``` + [Photo by Nick Fewings on Unsplash] +# Logistics ## Course Website -https://daseh.org/ + -Materials will be uploaded the night before class. -We are constantly trying to improve content! 
-Please refresh/download materials before class. +Materials will be uploaded the night before class. We are constantly trying to improve content! Please refresh/download materials before class. ```{r, fig.alt="Data Science for Environmental Public Health course logo", out.width = "60%", echo = FALSE, fig.align='center'} knitr::include_graphics("../../docs/images/DaSEH_logo_transparent.png") @@ -197,40 +188,42 @@ knitr::include_graphics("../../docs/images/DaSEH_logo_transparent.png") ## Learning Objectives -- Understanding basic programming syntax -- Reading data into R -- Recoding and manipulating data -- Using add-on packages (more on what this is soon!) -- Making exploratory plots -- Performing basic statistical tests -- Writing R functions -- **Building intuition** +- Understanding basic programming syntax +- Reading data into R +- Recoding and manipulating data +- Using add-on packages (more on what this is soon!) +- Making exploratory plots +- Performing basic statistical tests +- Writing R functions +- **Building intuition** ## Course Format -ONLINE VIRTUAL COURSE -* Lecture with slides, interactive -* Lab/Practical experience -* Two 10 min breaks each day - timing may vary -* July 8-18, 10:30am - 2:00pm PST on Zoom +ONLINE VIRTUAL COURSE -IN-PERSON CODE-A-THON -* Mostly independent group work -* Frequent check-ins with instructors and other groups -* Some lectures about the practical aspects of coding -* July 29-31 (in person in Seattle) +- Lecture with slides, interactive\ +- Lab/Practical experience\ +- Two 10 min breaks each day - timing may vary\ +- July 8-18, 10:30am - 2:00pm PST on Zoom +IN-PERSON CODE-A-THON +- Mostly independent group work\ +- Frequent check-ins with instructors and other groups\ +- Some lectures about the practical aspects of coding\ +- July 29-31 (in person in Seattle) -## Surveys +## Pulse Check Survey -- Daily survey / pulse check : `r config::get("google_survey")` +`r config::get("google_survey")` + +Let us know anonymously how 
you're doing with the material. ```{r, fig.alt="Surveys count", out.width = "40%", echo = FALSE, fig.align='center'} knitr::include_graphics("images/feedback-illustration.jpeg") ``` -[[source - Banner vector created by pch.vector - www.freepik.com]("https://www.freepik.com/vectors/banner")] +[[source - Banner vector created by pch.vector - www.freepik.com](%22https://www.freepik.com/vectors/banner%22)] ## Homework @@ -246,13 +239,23 @@ If you can, we suggest working virtually with a **large monitor or two screens** knitr::include_graphics("images/monitors.jpg") ``` -[[source - reddit.com]("https://www.reddit.com/r/ProgrammerHumor/comments/11ygrjj/deducing_your_personality_from_your_monitor_setup/")] +[[source - reddit.com](%22https://www.reddit.com/r/ProgrammerHumor/comments/11ygrjj/deducing_your_personality_from_your_monitor_setup/%22)] + +# Research Survey + +## Research Survey + +We are collecting data about user experience with our course to learn more about how to improve the data science education experience. This data may ultimately be used for a research publication and reporting to the NIH. + +https://forms.gle/e2CQFDJsgyZwLV3S9 + +# Getting Started ## Installing R -* Install the [latest R version](http://cran.r-project.org/) `r config::get("r_version")` +- Install the [latest R version](http://cran.r-project.org/) `r config::get("r_version")` -* [Install RStudio](https://www.rstudio.com/products/rstudio/download/) +- [Install RStudio](https://www.rstudio.com/products/rstudio/download/) More detailed instructions [on the website](https://daseh.org/docs/module_details/day0.html). @@ -286,7 +289,6 @@ knitr::include_graphics("images/hex.png") ## Basic terms - **Function** - a function is a piece of code that allows you to do something in R. You can write your own, use functions that come directly from installing R, or use functions from additional packages. You can think of a function as **verb** in R. 
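The slide text above says a function can come from base R, from your own code, or from an add-on package. A minimal sketch of those three cases follows, assuming only a stock R installation. The name `add_two` is hypothetical and is not part of the course materials.

```r
# 1. A function that comes with base R: sum() adds its arguments together.
total <- sum(1, 20234)
print(total)

# 2. A function you write yourself.
#    `add_two` is a made-up name used only for illustration.
add_two <- function(x) {
  x + 2
}
print(add_two(5))

# 3. A function from a package, called with the package::function form.
#    The stats package ships with every R installation, so nothing
#    needs to be installed for this call to work.
print(stats::median(c(1, 3, 10)))
```

The `package::function` form is used here so the example does not depend on which packages have been loaded with `library()`.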
@@ -297,44 +299,41 @@ A function might help you add numbers together, create a plot, or organize your sum(1, 20234) ``` - ## Basic terms - **Argument** - what you pass to a function +**Argument** - what you pass to a function -- can be data like the number 1 or 20234 +- can be data like the number 1 or 20234 ```{r} sum(1, 20234) ``` -- can be options about how you want the function to work such as `digits` +- can be options about how you want the function to work such as `digits` ```{r} round(0.627, digits = 2) round(0.627, digits = 1) ``` - ## Basic terms - **Object** - an object is something that can be worked with or on in R - can be lots of different things! You can think of objects as **nouns** in R. -- a matrix of numbers -- a plot -- a function -- data +- a matrix of numbers +- a plot +- a function +- data ... many more ## Variable and Sample -- **Variable**: something measured or counted that is a characteristic about a sample +- **Variable**: something measured or counted that is a characteristic about a sample examples: temperature, length, count, color, category -- **Sample**: individuals that you have data about - +- **Sample**: individuals that you have data about - examples: people, houses, viruses etc. @@ -342,43 +341,39 @@ examples: people, houses, viruses etc. head(iris) ``` - ## Columns and Rows ```{r, fig.alt="R hex stickers for packages", out.width = "50%", echo = FALSE, fig.align='center'} knitr::include_graphics("https://keydifferences.com/wp-content/uploads/2016/09/rows-vs-column.jpg") ``` + [[source](https://keydifferences.com/difference-between-rows-and-columns.html)] -Sample = Row +Sample = Row\ Variable = Column Data objects that looks like this is often called a **data frame**. Fancier versions from the tidyverse are called **tibbles** (more on that soon!). 
- ## More on Functions and Packages -* When you download R, it has a "base" set of functions/packages (**base R**) - * You can install additional packages for your uses from [CRAN](https://cran.r-project.org/) or [GitHub](https://github.com/) - * These additional packages are written by RStudio or R users/developers (like us) - * There are also packages for bioinformatics available at [Bioconductor](https://www.bioconductor.org/) - +- When you download R, it has a "base" set of functions/packages (**base R**) + - You can install additional packages for your uses from [CRAN](https://cran.r-project.org/) or [GitHub](https://github.com/) + - These additional packages are written by RStudio or R users/developers (like us) + - There are also packages for bioinformatics available at [Bioconductor](https://www.bioconductor.org/) ```{r, fig.alt="Picture of R package stickers", out.width = "30%", echo = FALSE, fig.align='center'} knitr::include_graphics("../Intro/images/hex.png") ``` - ## Using Packages -* Not all packages available on CRAN or GitHub are trustworthy -* Posit makes [many useful packages](https://posit.co/products/open-source/rpackages/) -* How to [trust](https://simplystatistics.org/posts/2015-11-06-how-i-decide-when-to-trust-an-r-package/) an R package -* Many packages have accompanying academic papers published in peer-reviewed journals -* Widely used packages have better documentation (official and in forums) and are more likely free of errors - +- Not all packages available on CRAN or GitHub are trustworthy +- Posit makes [many useful packages](https://posit.co/products/open-source/rpackages/) +- How to [trust](https://simplystatistics.org/posts/2015-11-06-how-i-decide-when-to-trust-an-r-package/) an R package +- Many packages have accompanying academic papers published in peer-reviewed journals +- Widely used packages have better documentation (official and in forums) and are more likely free of errors ## Tidyverse and Base R: Two Dialects @@ -386,10 
+381,10 @@ We will mostly show you how to use tidyverse packages and functions. This is a newer set of packages designed for data science that can make your code more **intuitive** as compared to the original older Base R. -**Tidyverse advantages**: - - **consistent structure** - making it easier to learn how to use different packages - - particularly good for **wrangling** (manipulating, cleaning, joining) data - - more flexible for **visualizing** data +**Tidyverse advantages**:\ +- **consistent structure** - making it easier to learn how to use different packages\ +- particularly good for **wrangling** (manipulating, cleaning, joining) data\ +- more flexible for **visualizing** data Packages for the tidyverse are managed by a team of respected data scientists at Posit. @@ -405,13 +400,13 @@ We will practice this in labs :) Differs depending on the source (CRAN, GitHub, etc) -Must be done **once** for each installation of R (e.g., version 4.2 >> 4.3). +Must be done **once** for each installation of R (e.g., version 4.2 \>\> 4.3). ## Installing Packages: Dropdown Menu You can install packages from CRAN using the tool menu in RStudio: -tools > Install Packages +tools \> Install Packages ```{r, fig.alt="Install packages menu in RStudio", out.width = "20%", echo = FALSE, fig.align='center'} knitr::include_graphics("images/install_packages1.png") @@ -463,21 +458,50 @@ knitr::include_graphics("images/install_v_load.png") knitr::include_graphics("../../images/lol/install_packages.jpg") ``` +# Let's practice! + +## Installing `remotes` and `dasehr` + +Install the `remotes` package. + +```{r, eval=F} +install.packages("remotes") +``` + +
+ +Then load the package. + +```{r, eval=F} +library(remotes) +``` + +## Installing `remotes` and `dasehr` + +Next, run the following. + +It will install our custom package, `dasehr` from GitHub. + +```{r, eval=F} +install_github("fhdsl/dasehr") +``` + + +# Where to find help + ## Useful (+ mostly Free) Resources -Found on our website under the `Resources` tab: -https://daseh.org/resources.html +Found on our website under the `Resources` tab: https://daseh.org/resources.html -- videos from previous offerings of the class -- cheatsheets for each class +- videos from previous offerings of the class +- cheatsheets for each class ## Help!!! Error messages can be scary! -- Check out the FAQ/Help page on the website: https://daseh.org/help.html -- Ask questions in Slack! Copy+pasting your error messages is really helpful! -- Leverage our awesome TA time for 1:1 troubleshooting +- Check out the FAQ/Help page on the website: https://daseh.org/help.html +- Ask questions in Slack! Copy+pasting your error messages is really helpful! **We will also dedicate time today to debug any installation issues** @@ -487,14 +511,14 @@ knitr::include_graphics("images/forrest-gump-running.gif") ## Summary -- R is a powerful data visualization and analysis software language. -- Add-on **packages** like the `tidyverse` can help make R more intuitive. -- **Functions** (like verbs) perform specific tasks in R and are found within packages. -- **Arguments** within functions specify how to perform a function. -- **Objects** (like nouns) are data or variables. -- We will be both installing and loading packages. -- Materials will be updated frequently as we improve it. Please use the **Google Form survey** so you can provide feedback throughout the class! -- Lots of **resources** can be found on the website. _You will have access to the website after the class is over._ +- R is a powerful data visualization and analysis software language. 
+- Add-on **packages** like the `tidyverse` can help make R more intuitive. +- **Functions** (like verbs) perform specific tasks in R and are found within packages. +- **Arguments** within functions specify how to perform a function. +- **Objects** (like nouns) are data or variables. +- We will be both installing and loading packages. +- Materials will be updated frequently as we improve it. Please use the **Google Form survey** so you can provide feedback throughout the class! +- Lots of **resources** can be found on the website. *You will have access to the website after the class is over.* 🏠 [Class Website](https://daseh.org/) @@ -507,3 +531,4 @@ knitr::include_graphics(here::here("images/the-end-g23b994289_1280.jpg")) ``` Image by Gerd Altmann from Pixabay + diff --git a/modules/Statistics/Statistics.Rmd b/modules/Statistics/Statistics.Rmd index 41fd8ed3..c121883b 100644 --- a/modules/Statistics/Statistics.Rmd +++ b/modules/Statistics/Statistics.Rmd @@ -467,8 +467,8 @@ For example, if we want to fit a regression model where outcome is `income` and ## Linear regression -We will use our dataset about nitrate levels by quarter in public water sources in Washington. We'll load a slightly different version of this dataset, which can be found at "https://daseh.org/data/ -Nitrate_Exposure_for_WA_Public_Water_Systems_byquarter_v2_data.csv" +We will use our dataset about nitrate levels by quarter in public water sources in Washington. We'll load a slightly different version of this dataset, which can be found at +https://daseh.org/data/Nitrate_Exposure_for_WA_Public_Water_Systems_byquarter_v2_data.csv. 
 ```{r}
diff --git a/resources/dictionary.txt b/resources/dictionary.txt
index 15396aab..0c2f4c0f 100644
--- a/resources/dictionary.txt
+++ b/resources/dictionary.txt
@@ -255,6 +255,7 @@ ug
 Un
 Uncheck
 Ungroup
+uri
 Unsplash
 UseR
 useR
diff --git a/resources/ignore-urls.txt b/resources/ignore-urls.txt
index 99de380b..242b2d83 100644
--- a/resources/ignore-urls.txt
+++ b/resources/ignore-urls.txt
@@ -7,3 +7,4 @@ https://en.wikipedia.org/wiki/Path_(computing
 http://cran.us.r-project.org)
 http://www.r-project.org/,
 https://github.com/awesomedata/awesome-public-datasets,
+https://en.wikipedia.org/wiki/R_(programming_language
\ No newline at end of file