Merge pull request #711 from HeatherSavoy-USDA/main
R workshop instructions update
HeatherSavoy-USDA authored Oct 4, 2024
2 parents 56eccb5 + efe9be1 commit c86ddbc
Showing 1 changed file with 72 additions and 59 deletions.
131 changes: 72 additions & 59 deletions sn_collections/_workshops/2024-10-04-package-env-workshop-r.md
@@ -19,7 +19,7 @@ In this session, we will begin by using R from the command line. Later, we will
Multiple R versions are available in the environment module system. Note that modules are named with 'r' and the program available after the module is loaded is 'R'. With each new minor version of R you use, the `renv` package will need to be installed.

1. First, use the cluster's environment module system to find and load the version of R you want to use for your project: `module spider r` or `ml spider r` (note the lower-case 'r'!).
1. Load the version of R you'd like to use. E.g., `module load r/4.3.0` or `ml load r/4.3.0`.
1. Load the version of R you'd like to use. E.g., `module load r/4.4.0` or `ml load r/4.4.0`.
1. Run `R` to open an R session of the version of R you loaded from a module.
1. Run the R command `install.packages('renv')` to install `renv` for this version of R.
1. Run `q()` to exit R, and enter `n` when prompted to save the workspace image.
@@ -30,31 +30,37 @@ To use `renv` for package management, it needs to be associated with an R projec

1. If you are not already in your workshop directory, change into it by running `cd /90daydata/shared/$USER/`.
1. Create and change directory to a project directory:

{:.copy-code}
```
mkdir project1
cd project1
mkdir my_project
cd my_project
```
1. Start an R session with `R`.
1. Initialize `renv` by running `renv::init(settings = list(use.cache = FALSE, ppm.enabled = FALSE))`. Some handy messages will appear to describe what `renv` has done.
> **Exercise 1:** Save the following R code into a file named `exercise1.R`. (Hint: use `nano exercise1.R`.) Open an R session and initialize `renv` for the project. What happens?
> **Exercise 1:** What files have been added to the project directory? (Hint: use `ls -a` to include hidden files). What kind of content do they contain? (Hint: use `cat filename` to print file contents to the screen.)
{:.copy-code}
```
library(magrittr)
> **Exercise 2:** Return to `/90daydata/shared/$USER/`, create a new project folder `project1` and change into it, then save the following R code into a file named `exercise2.R`. (Hint: use `nano exercise2.R`.) Open an R session and initialize `renv` for the project. What kinds of messages appear now?
x <- help.search("*", package="base")
{:.copy-code}
```
library(magrittr)

N <- 5
for(i in c(1:N,N:1)){
string <- x$matches$Title %>%
sample(i) %>%
cat('\n')
}
```
x <- help.search("*", package="base")

> **Exercise 2:** What files have been added to the project directory? (Hint: use `ls -a` to include hidden files).
N <- 3
for(i in c(1:N,N:1)){
string <- x$matches$Title %>%
sample(i) %>%
cat('\n')
}
```
1. Run `R` to open a new R session.
1. Run the script with `source('exercise1.R')`.
R has to be restarted before the project library is set up; until then, our `exercise2.R` script won't run successfully.
1. Quit the current session and run `R` to open a new one.
1. Run the script with `source('exercise2.R')`.
Now we are set up with a project with an `renv` environment!
@@ -65,25 +71,30 @@ Next, we will expand our project with additional packages!
1. You can install packages into your environment as you normally would with `install.packages('PACKAGE')`, or use `renv`'s expanded installation function `renv::install('PACKAGE')`, which supports additional remote package sources, e.g., GitHub. To learn more about `renv::install()`, see the [documentation](https://rstudio.github.io/renv/reference/install.html).
1. To have `renv` save the state of the project (i.e., capture all the metadata of the used packages) in the environment configuration file called a 'lockfile', run `renv::snapshot()`.
1. If you want to assess the state of the environment (i.e., which packages are installed but not used, or which packages are used but not recorded), run `renv::status()`.
1. Let's save the script below in our project and install an old version of the `cli` package so we can simulate needing to update to the latest version next: `install.packages("https://cran.r-project.org/src/contrib/Archive/cli/cli_3.6.1.tar.gz", repos=NULL, type="source")`.
1. If we call `renv::status()`, it will tell us the project is out of sync. If we then call `renv::snapshot()`, it will update the lockfile to match.
> **Exercise 3:** Save the following R code into a file named `exercise3.R`. Open an R session. What happens? Modify the environment to be consistent and to make the program run. What does the updated program do?
{:.copy-code}
```
library(magrittr)
library(cli)

{:.copy-code}
```
library(magrittr)
library(cli)
x <- help.search("*", package="base")
N <- 7
for(i in c(1:N,N:1)){
string <- x$matches$Title %>%
sample(i) %>%
col_magenta() %>%
cat('\n')
}
```
x <- help.search("*", package="base")

N <- 3
for(i in c(1:N,N:1)){
string <- x$matches$Title %>%
sample(i) %>%
col_magenta() %>%
cat('\n')
}
```
> **Exercise 3:** Update the version of `cli` with: `install.packages('cli')`. Modify the environment to be consistent.
1. Another `renv` function to make your project environment consistent is `renv::restore()`. It helps update your project library to match your lockfile.
1. For example, if we install the `MASS` package because we think we may need it but later don't, `renv::restore(clean = TRUE)` will help remove the unused package from the project library.
1. `renv::restore()` can also be used to revert package version discrepancies like for `cli` above.
## Reproduce renv projects
@@ -92,7 +103,9 @@ In order to make environments and package management _truly_ useful, we need a m
1. The `renv` directories and files that should remain with the project are the `renv.lock` file, the `renv/activate.R` and `renv/settings.json` files, and the `.Rprofile` file. With these files, the project environment can be easily recreated, therefore helping to ensure that your code and analyses are fully reproducible.
1. If you are using git for version control for the project, `renv` adds the `renv` files that do not need to be tracked (i.e., the packages themselves) to the `.gitignore` file for you.
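As a quick check before sharing a project, the sketch below reports which of the files named above are present in a project directory. `check_renv_files` is a hypothetical shell helper written only for this illustration; it is not part of `renv`, and the file list is simply the one given in the text.

```shell
# Hypothetical helper (not part of renv): report whether the files needed
# to recreate a project environment are present in a project directory.
check_renv_files() {
  project_dir="${1:-.}"   # default to the current directory
  missing=0
  for f in renv.lock renv/activate.R renv/settings.json .Rprofile; do
    if [ -f "$project_dir/$f" ]; then
      echo "found:   $f"
    else
      echo "missing: $f"
      missing=1
    fi
  done
  # Return non-zero if any file is missing so the check can be scripted.
  return "$missing"
}
```

For example, `check_renv_files /90daydata/shared/$USER/project1` would list each file as found or missing.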
> **Exercise 4:** Create a new project directory in your workshop directory (i.e., at the same level as `project1` and not within it). Copy over all of `project1`'s files except `renv/library` and `renv/staging` (the package files) into the new project. From the new project directory, run `R`. What happens?
> **Exercise 4:** Create a new project directory in your workshop directory. Copy over the script and lockfile from `project1` into the new project. From the new project directory, run `R`. What happens? Try initializing.
> **Exercise 5:** Create another new project directory in your workshop directory. Copy over all of `project1`'s files except `renv/library` and `renv/staging` (the package files) into the new project. From the new project directory, run `R`. What happens?
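One possible way to make the copy described in Exercise 5 from the command line is sketched below. `copy_renv_project` is a hypothetical helper, not an `renv` command; it assumes it is acceptable to copy everything first and then delete the package directories from the copy.

```shell
# Hypothetical helper (not part of renv): copy a project without its
# installed packages so the library can be rebuilt from the lockfile.
copy_renv_project() {
  src="$1"
  dest="$2"
  mkdir -p "$dest"
  # Copy everything, including hidden files such as .Rprofile.
  cp -R "$src"/. "$dest"/
  # Remove the package files from the copy only; the lockfile,
  # renv/activate.R, and renv/settings.json are left in place.
  rm -rf "$dest/renv/library" "$dest/renv/staging"
}
```

Run from the workshop directory, `copy_renv_project project1 project2` would create `project2` with the scripts and `renv` configuration files but no installed packages.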
@@ -105,15 +118,15 @@ The approach of using `renv` in RStudio is very similar to using `renv` with com
Multiple R versions are available when requesting RStudio Server sessions on Open OnDemand. From the [Open OnDemand](https://atlas-ood.hpc.msstate.edu/) page, select "Interactive Apps" > "RStudio Server". You will be taken to a page with multiple input fields to configure your RStudio Server session and one of those is "R Version".
1. For the following exercise, select the following inputs:
* Account: scinet_workshop1
* Queue: short--------Max Time: 2-00:00:00
* QOS: 400thread
* **R Version: 4.3.2-EL9**
* Number of hours: 1
* Number of cores: 2
* Memory required: 8G
* Optional Slurm Arguments: \-\-reservation=scinet_workshop1
1. When you are in RStudio Server, install `renv`.
* R Version: 4.3.3
* Account Name: scinet_workshop1
* Partition: atlas
* QOS: normal 14-00:00:00
* Number of hours: 2
* Number of nodes: 1
* Number of tasks: 1
* Additional Slurm Parameters: \-\-reservation=workshop \-\-mem=8G
1. When you are in RStudio Server, install `renv`. Note that we need to install `renv` again only because we chose a different version of R.
## Creating and managing R environments with the `renv` package
@@ -123,30 +136,30 @@ Since `renv` is project specific, you need to change the working directory to th
In RStudio Server, there are also additional graphical features in the interface when `renv` is active. E.g., note that there is an "renv" button at the top of the "Packages" pane. If you click on it, there is a dropdown menu that includes shortcuts to the `renv::snapshot()` and `renv::restore()` functions.
1. Create a new project directory.
2. Initialize `renv` for the new project with `renv::init(settings = list(use.cache = FALSE, ppm.enabled = FALSE))`.
2. Initialize `renv` for the new project with `renv::init(settings = list(use.cache = FALSE, ppm.enabled = FALSE))`. If you do not see the activation message when R restarts, you will have to activate the project manually with `source("renv/activate.R")`.
3. Add at least one script to the project and install the packages it uses with either `install.packages('PACKAGE')` or `renv::install('PACKAGE')`.
4. Take a snapshot of the environment with `renv::snapshot()` to update the lockfile.
> **Exercise 5:** Use the `renv::install()` function to download the development version of the 'nsyllable' package on GitHub at 'quanteda/nsyllable'. Save the script below as an R script in your project directory. What does the program do? Use the "renv" button in the "Packages" pane to make sure this project's lockfile captures the new package. Open the `renv.lock` file to see the entry for the new package.
> **Exercise 6:** Use the `renv::install()` function to download the development version of the 'nsyllable' package on GitHub at 'quanteda/nsyllable'. Save the script below as an R script in your project directory. What does the program do? Use the "renv" button in the "Packages" pane to make sure this project's lockfile captures the new package. Open the `renv.lock` file to see the entry for the new package.
{:.copy-code}
```
library(magrittr)
library(nsyllable)
{:.copy-code}
```
library(magrittr)
library(nsyllable)

x <- help.search("*", package="base")
x$matches["nsyl"] <- nsyllable(x$matches$Title)
x <- help.search("*", package="base")
x$matches["nsyl"] <- nsyllable(x$matches$Title)

for(i in c(5,7,5)){
for(i in c(5,7,5)){

correct_length <- x$matches$nsyl == i
correct_length <- x$matches$nsyl == i

string <- x$matches$Title[correct_length] %>%
sample(1) %>%
cat('\n')
}
```
string <- x$matches$Title[correct_length] %>%
sample(1) %>%
cat('\n')
}
```
# If you don't want to use `renv`