From a29b7ec7ae7397a6f77ed455b237c598e984f72e Mon Sep 17 00:00:00 2001
From: Heather Savoy <47045484+HeatherSavoy-USDA@users.noreply.github.com>
Date: Thu, 3 Oct 2024 15:22:39 -0600
Subject: [PATCH 1/2] workshop: update OOD RStudio instructions

---
 .../2024-10-04-package-env-workshop-r.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/sn_collections/_workshops/2024-10-04-package-env-workshop-r.md b/sn_collections/_workshops/2024-10-04-package-env-workshop-r.md
index dfe444340..af80c8eec 100644
--- a/sn_collections/_workshops/2024-10-04-package-env-workshop-r.md
+++ b/sn_collections/_workshops/2024-10-04-package-env-workshop-r.md
@@ -105,14 +105,14 @@ The approach of using `renv` in RStudio is very similar to using `renv` with com
 Multiple R versions are available when requesting RStudio Server sessions on Open OnDemand. From the [Open OnDemand](https://atlas-ood.hpc.msstate.edu/) page, select "Interactive Apps" > "RStudio Server". You will be taken to a page with multiple input fields to configure your RStudio Server session and one of those is "R Version".

 1. For the following exercise, select the following inputs:
-    * Account: scinet_workshop1
-    * Queue: short--------Max Time: 2-00:00:00
-    * QOS: 400thread
-    * **R Version: 4.3.2-EL9**
-    * Number of hours: 1
-    * Number of cores: 2
-    * Memory required: 8G
-    * Optional Slurm Arguments: \-\-reservation=scinet_workshop1
+    * R Version: 4.4.0
+    * Account Name: scinet_workshop1
+    * Partition: atlas
+    * QOS: normal 14-00:00:00
+    * Number of hours: 2
+    * Number of nodes: 1
+    * Number of tasks: 1
+    * Additional Slurm Parameters: \-\-reservation=workshop \-\-mem=8G
 1. When you are in RStudio Server, install `renv`.

From efe9be19e0d662b7c9547b9015a592332c9e65ca Mon Sep 17 00:00:00 2001
From: Heather Savoy <47045484+HeatherSavoy-USDA@users.noreply.github.com>
Date: Fri, 4 Oct 2024 10:42:48 -0600
Subject: [PATCH 2/2] workshop update R instructions

---
 .../2024-10-04-package-env-workshop-r.md | 117 ++++++++++--------
 1 file changed, 65 insertions(+), 52 deletions(-)

diff --git a/sn_collections/_workshops/2024-10-04-package-env-workshop-r.md b/sn_collections/_workshops/2024-10-04-package-env-workshop-r.md
index af80c8eec..0a5c60934 100644
--- a/sn_collections/_workshops/2024-10-04-package-env-workshop-r.md
+++ b/sn_collections/_workshops/2024-10-04-package-env-workshop-r.md
@@ -19,7 +19,7 @@ In this session, we will begin by using R from the command line. Later, we will
 Multiple R versions are available in the environment module system. Note that modules are named with 'r' and the program available after the module is loaded is 'R'. With each new minor version of R you use, the `renv` package will need to be installed.

 1. First, use the cluster's environment module system to find and load the version of R you want to use for your project: `module spider r` or `ml spider r` (note the lower-case 'r'!).
-1. Load the version of R you'd like to use. E.g., `module load r/4.3.0` or `ml load r/4.3.0`.
+1. Load the version of R you'd like to use. E.g., `module load r/4.4.0` or `ml load r/4.4.0`.
 1. Run `R` to open an R session of the version of R you loaded from a module.
 1. Run the R command `install.packages('renv')` to install `renv` for this version of R.
 1. Run `q()` to exit R, and enter `n` when prompted to save the workspace image.
@@ -30,31 +30,37 @@ To use `renv` for package management, it needs to be associated with an R projec

 1. If you are not already in your workshop directory, change into it by running `cd /90daydata/shared/$USER/`.
 1. Create and change directory to a project directory:
+
+    {:.copy-code}
     ```
-    mkdir project1
-    cd project1
+    mkdir my_project
+    cd my_project
     ```
+1. Start an R session with `R`.
+1. Initialize `renv` by running `renv::init(settings = list(use.cache = FALSE, ppm.enabled = FALSE))`. Some handy messages will appear to describe what `renv` has done.

-    > **Exercise 1:** Save the following R code into a file named `exercise1.R`. (Hint: use `nano exercise1.R`.) Open an R session and initialize `renv` for the project. What happens?
+> **Exercise 1:** What files have been added to the project directory? (Hint: use `ls -a` to include hidden files.) What kind of content do they contain? (Hint: use `cat filename` to print file contents to the screen.)

-    {:.copy-code}
-    ```
-    library(magrittr)
+> **Exercise 2:** Return to `/90daydata/shared/$USER/`, create and change into a new project directory `project1`, and save the following R code into a file named `exercise2.R`. (Hint: use `nano exercise2.R`.) Open an R session and initialize `renv` for the project. What kinds of messages appear now?

-    x <- help.search("*", package="base")
+{:.copy-code}
+```
+library(magrittr)

-    N <- 5
-    for(i in c(1:N,N:1)){
-      string <- x$matches$Title %>%
-        sample(i) %>%
-        cat('\n')
-    }
-    ```
+x <- help.search("*", package="base")

-    > **Exercise 2:** What files have been added to the project directory? (Hint: use `ls -a` to include hidden files).
+N <- 3
+for(i in c(1:N,N:1)){
+  string <- x$matches$Title %>%
+    sample(i) %>%
+    cat('\n')
+}
+```

-1. Run `R` to open a new R session.
-1. Run the script with `source('exercise1.R')`.
+R has to be restarted before the project library is set up; until then, our `exercise2.R` script won't run successfully.
+
+1. Quit the current session and run `R` to open a new one.
+1. Run the script with `source('exercise2.R')`.

 Now we are set up with a project with an `renv` environment!

@@ -65,25 +71,30 @@ Next, we will expand our project with additional packages!
 1. You can install packages into your environment as you normally would with `install.packages('PACKAGE')`, or `renv` does have an expanded installation function `renv::install('PACKAGE')` that supports additional remote package sources, e.g., GitHub. If you are interested in learning more about `renv::install()`, please see the documentation [here](https://rstudio.github.io/renv/reference/install.html).
 1. To have `renv` save the state of the project (i.e., capture all the metadata of the used packages) in the environment configuration file called a 'lockfile', run `renv::snapshot()`.
 1. If you want to assess the state of the environment, (i.e., which packages are installed but not used, or which packages are used but not recorded), run `renv::status()`.
+1. Let's save the script below in our project and install an old version of the `cli` package so that we can simulate needing to update to the latest version next: `install.packages("https://cran.r-project.org/src/contrib/Archive/cli/cli_3.6.1.tar.gz", repos=NULL, type="source")`.
+1. If we call `renv::status()`, it will tell us that the project is out of sync. If we then call `renv::snapshot()`, it will update the lockfile to match.

-    > **Exercise 3:** Save the following R code into a file named `exercise3.R`. Open an R session. What happens? Modify the environment to be consistent and to make the program run. What does the updated program do?
+{:.copy-code}
+```
+library(magrittr)
+library(cli)

-    {:.copy-code}
-    ```
-    library(magrittr)
-    library(cli)
-
-    x <- help.search("*", package="base")
-
-    N <- 7
-    for(i in c(1:N,N:1)){
-      string <- x$matches$Title %>%
-        sample(i) %>%
-        col_magenta() %>%
-        cat('\n')
-    }
-    ```
+x <- help.search("*", package="base")
+
+N <- 3
+for(i in c(1:N,N:1)){
+  string <- x$matches$Title %>%
+    sample(i) %>%
+    col_magenta() %>%
+    cat('\n')
+}
+```
+> **Exercise 3:** Update the version of `cli` with `install.packages('cli')`. Modify the environment to be consistent.
+
+1. Another `renv` function for making your project environment consistent is `renv::restore()`. It updates your project library to match your lockfile.
+1. For example, if we install the `MASS` package because we think we may need it but later find we don't, `renv::restore(clean = TRUE)` will remove the unused package from the project library.
+1. `renv::restore()` can also be used to revert package version discrepancies, such as the one for `cli` above.

 ## Reproduce renv projects

@@ -92,7 +103,9 @@ In order to make environments and package management _truly_ useful, we need a m
 1. The `renv` directories and files that should remain with the project are the `renv.lock` file, the `renv/activate.R` and `renv/settings.json` files, and the `.Rprofile` file. With these files, the project environment can be easily recreated, therefore helping to ensure that your code and analyses are fully reproducible.
 1. If you are using git for version control for the project, `renv` adds the `renv` files that do not need to be tracked (i.e., the packages themselves) to the `.gitignore` file for you.

-    > **Exercise 4:** Create a new project directory in your workshop directory (i.e., at the same level as `project1` and not within it). Copy over all of `project1`'s files except `renv/library` and `renv/staging` (the package files) into the new project. From the new project directory, run `R`. What happens?
+> **Exercise 4:** Create a new project directory in your workshop directory. Copy over the script and lockfile from `project1` into the new project. From the new project directory, run `R`. What happens? Try initializing `renv`.
+
+> **Exercise 5:** Create another new project directory in your workshop directory. Copy over all of `project1`'s files except `renv/library` and `renv/staging` (the package files) into the new project. From the new project directory, run `R`. What happens?

@@ -105,7 +118,7 @@ The approach of using `renv` in RStudio is very similar to using `renv` with com
 Multiple R versions are available when requesting RStudio Server sessions on Open OnDemand. From the [Open OnDemand](https://atlas-ood.hpc.msstate.edu/) page, select "Interactive Apps" > "RStudio Server". You will be taken to a page with multiple input fields to configure your RStudio Server session and one of those is "R Version".

 1. For the following exercise, select the following inputs:
-    * R Version: 4.4.0
+    * R Version: 4.3.3
     * Account Name: scinet_workshop1
     * Partition: atlas
     * QOS: normal 14-00:00:00
@@ -113,7 +126,7 @@ Multiple R versions are available when requesting RStudio Server sessions on Ope
     * Number of nodes: 1
     * Number of tasks: 1
     * Additional Slurm Parameters: \-\-reservation=workshop \-\-mem=8G
-1. When you are in RStudio Server, install `renv`.
+1. When you are in RStudio Server, install `renv`. Note that we need to install `renv` here only because we chose a different version of R.

 ## Creating and managing R environments with the `renv` package

@@ -123,30 +136,30 @@ Since `renv` is project specific, you need to change the working directory to th
 In RStudio Server, there are also additional graphical features in the interface when `renv` is active. E.g., note that there is an "renv" button at the top of the "Packages" pane. If you click on it, there is a dropdown menu that includes shortcuts to the `renv::snapshot()` and `renv::restore()` functions.

 1. Create a new project directory.
-2. Initialize `renv` for the new project with `renv::init(settings = list(use.cache = FALSE, ppm.enabled = FALSE))`.
+2. Initialize `renv` for the new project with `renv::init(settings = list(use.cache = FALSE, ppm.enabled = FALSE))`. If you do not see the activation message when R restarts, you will have to activate `renv` manually with `source("renv/activate.R")`.
 3. Add at least one script to the project and install the packages it uses with either `install.packages('PACKAGE')` or `renv::install('PACKAGE')`.
 4. Take a snapshot of the environment with `renv::snapshot()` to update the lockfile.

-    > **Exercise 5:** Use the `renv::install()` function to download the development version of the 'nsyllable' package on GitHub at 'quanteda/nsyllable'. Save the script below as an R script in your project directory. What does the program do? Use the "renv" button in the "Packages" pane to make sure this project's lockfile captures the new package. Open the `renv.lock` file to see the entry for the new package.
+> **Exercise 6:** Use the `renv::install()` function to download the development version of the 'nsyllable' package on GitHub at 'quanteda/nsyllable'. Save the script below as an R script in your project directory. What does the program do? Use the "renv" button in the "Packages" pane to make sure this project's lockfile captures the new package. Open the `renv.lock` file to see the entry for the new package.

-    {:.copy-code}
-    ```
-    library(magrittr)
-    library(nsyllable)
+{:.copy-code}
+```
+library(magrittr)
+library(nsyllable)

-    x <- help.search("*", package="base")
-    x$matches["nsyl"] <- nsyllable(x$matches$Title)
+x <- help.search("*", package="base")
+x$matches["nsyl"] <- nsyllable(x$matches$Title)

-    for(i in c(5,7,5)){
+for(i in c(5,7,5)){

-      correct_length <- x$matches$nsyl == i
+correct_length <- x$matches$nsyl == i

-      string <- x$matches$Title[correct_length] %>%
-        sample(1) %>%
-        cat('\n')
-    }
-    ```
+string <- x$matches$Title[correct_length] %>%
+  sample(1) %>%
+  cat('\n')
+}
+```

 # If you don't want to use `renv`