⏸️ If you're reading this in RStudio, select `Visual` mode (top of pane) for easy reading.
R projects that use this template and workflow will benefit from the following:
Projects provide a self-contained environment for data analysis, where well-organised file structure and workflow can increase project reproducibility for both yourself and others. Code is formatted in a consistent style via the styler and lintr packages to increase readability.
Code is broken up into smaller fragments, which are called upon and assembled as needed. This can run faster than one large R notebook, and makes identifying errors much easier!
All project contents are backed up to GitHub - a free online repository host. Changes are also tracked or 'version controlled' via Git, allowing users to retrieve previous versions of projects.
GitHub project repositories can be shared between collaborators. Project directories also use relative pathing via the here package, making code easy to run on different devices.
Each project is isolated in its own environment, where the renv package allows users to record and use the same version of project dependencies without breaking other projects. The config package lets you save and reuse other configurations made in the project's environment.
Complete the following prerequisites for first-time use:
Sign up for a GitHub account using your personal email to ensure continued access.
Download the most recent version of R software, RStudio, Git (will also install Git Bash) and GitHub Desktop.
Git - for easy tracking/saving of code within RStudio
- Set your username for every repository on your computer
- Set your commit email address on GitHub
- Set your commit email address for every repository on your computer
GitHub Desktop - for easily committing and reverting changes
- Connect your account
- Configure your Git defaults
- Pick your integrations - select `RStudio` as your preferred text editor.
- Open RStudio > navigate to `Tools` > `Global Options…` > `Git/SVN` > select `Git.exe` from the drop-down menu.
- If you don't have an SSH key allocated, click `Create SSH Key…` > `OK` - you should now see a `Git` pane in the top-right of RStudio.
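As a quick sanity check after installing Git, the minimal sketch below asks R whether a `git` executable is visible on the system path. The helper name `git_is_available()` is hypothetical and not part of this template; it relies only on base R's `Sys.which()`.

```r
# Hypothetical helper (not part of this template): report whether a Git
# executable can be found on the system PATH.
git_is_available <- function() {
  # Sys.which() typically returns "" when the program cannot be found
  git_path <- unname(Sys.which("git"))
  !is.na(git_path) && nzchar(git_path)
}

if (git_is_available()) {
  message("Git found at: ", Sys.which("git"))
} else {
  message("Git not found - install Git, then restart RStudio.")
}
```

If Git is reported as missing even though it is installed, restarting RStudio so that it picks up the updated path is usually the first thing to try.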
Complete the following each time you want to set up a new R project from this template:
Click here > `Use this template` (green button) > `Create a new repository` > add a repository name, description and tick public/private - once generated, you will be redirected to the repository on GitHub.
Click `<> Code` (green button) > `Open with GitHub Desktop` > choose a location to save your project.
- Navigate to the R-project-template folder you created - this is your 'root' directory.
- Change the file name `Template-r-projects-workflow.Rproj` to your repository name.
- If applicable to your project, amend the license and author(s) in the `LICENSE.md` file - click here for more information about licensing.
- Go to your root directory > review folder structure and modify to suit your purposes by creating/renaming/deleting folders as needed.
- Place a blank R script file in each new folder as a place-holder for GitHub - you can copy-paste the `EMPTY.R` files from this template.
- Populate folders with initial scripts and/or data - this is generally done within the `00-code` and `01-data` sub-directories.
- Rename scripts and data according to tidyverse's file naming conventions - remember, good coding style makes R easier to read and use!
Go to your root directory > double click the `.Rproj` file to open the project in RStudio > check text in top-right corner - this should match the name of your project. Ensure you are always working within projects to reap the benefits provided by this template. You can switch and close projects using the drop-down arrow (don't do this now).
- From RStudio, go to the `Files` tab from the bottom-right pane (by default) > select your root directory and navigate to `00-code` > `00-install-dependencies.R` - this will open the script in a separate tab (top).
- Run the script to install all packages required to use this template/workflow - enter `Y` into the console when prompted to include dependencies and to update the project's lockfile. This installs the packages and records them (including their versions) in the project's lockfile. When working in projects from this template, R will use this lockfile instead of your complete inventory of packages.
- From RStudio, click the save icon (top-left) > close RStudio.
- Open GitHub Desktop > select your project from `Current repository` tab (left panel) > select the `Changes` tab to review changes made locally (i.e. on this device) - green text/files show project additions, and red text/files show deletions.
- Ensure all changes are ticked > add a summary and description in `Summary (required)` and `Description` boxes. E.g., "Project set up" and "Create new repository, clone to local, rename files, organise folders, populate folders with initial files, install dependencies."
- Click `Commit to main` to record these changes on your local `main` branch (more on this later) > select the `History` tab - you should now see a record of your past commits.
- Navigate to `Changes` tab > click `Push origin` (in blue) to "push" local changes to GitHub.com - this project is now backed up online and accessible for public/private use from any device.
- Check that your progress was logged on GitHub.com by right-clicking your project under the `Current repository` tab > `View on GitHub` > `Go to file` (top right) > `History` (top right) > select your commit message > review the changes.
Once your project is set up, your regular workflow will look something like this:
- Open GitHub Desktop > select project from `Current repository` tab > click `Fetch origin` (top panel) - this will compare your local repository or "repo" against the online repo on GitHub.com (the "origin").
- Check the message and respond accordingly:
  - "`Last fetched just now`" = your local repo is up-to-date with GitHub.com - do nothing.
  - "`Pull origin`" = a change exists on GitHub.com which is not in your local repo - click `Pull origin` to pull from GitHub.com onto your machine.
- Go to your root directory > double click the `.Rproj` file to open the project in RStudio.
- In the `Files` pane, click `00-code` > `conductor.Rmd` to open the notebook - this is where you call on scripts and data contained in sub-directories to generate output(s).
- Run the "`Current dependencies`" chunk to install and/or load packages from the project's lockfile.
- As you progress through your project, add any extra packages to the `list_packages()` vector in the "`Add/load packages during a coding session`" chunk > run the chunk to install and load these packages.
- Begin writing your code for the project in the "`Begin writing your code`" chunk of `conductor.Rmd`.
- As you complete stages in the analysis, move your code out of `conductor.Rmd` into its own separate files. I've created two templates, `template-r-script.R` and `template-r-notebook.Rmd`, in `00-code/templates` that you can use to create new files with nice structure and formatting.
- Save your code as separate files in the `00-code` directory - index these (00, 01 etc.) or create sub-folders if needed to keep track of code. Don't forget to use tidyverse's file naming conventions.
- In `conductor.Rmd`, use `source()` functions to stitch your code together and run these as needed - `source()` uses relative pathing from the root folder (via `./`) to locate and run code. E.g. `source("./00-code/00-install-dependencies.R", chdir = TRUE)`.
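The stitching step can be sketched as a small loop over the indexed scripts. The helper `source_indexed_scripts()` below is hypothetical and not part of this template; it assumes scripts are named with a two-digit prefix (e.g. `00-...R`, `01-...R`) and that the working directory is the project root, since relative paths are resolved against the working directory.

```r
# Hypothetical helper: source every indexed script (e.g. "00-...R", "01-...R")
# found directly inside `code_dir`, in alphabetical order.
source_indexed_scripts <- function(code_dir = "./00-code") {
  scripts <- sort(list.files(
    code_dir,
    pattern = "^[0-9]{2}-.*\\.R$",
    full.names = TRUE
  ))
  for (script in scripts) {
    message("Sourcing ", script)
    # chdir = TRUE temporarily switches to the script's own folder while it runs
    source(script, chdir = TRUE)
  }
  # Return the paths invisibly so the caller can inspect what was run
  invisible(scripts)
}
```

Calling `source_indexed_scripts()` from the project root would run every matching script in `00-code`, including the dependency installer. Calling `source()` on individual files, as in the example above, gives finer control over what runs and when.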
- Open GitHub Desktop > select your project from `Current repository` tab (left panel) - you should keep this open next to RStudio as you are working and use it regularly.
- When ready to commit changes to your `main` online version, save your R code > tick/untick box(es) from the `Changes` tab in GitHub Desktop > populate the `Summary (required)` and `Description` boxes - try to group changes into small, related tasks for easy tracking. If you group too many or too few tasks in a single commit, it will be harder to view or amend your changes in the future.
- When happy, click `Commit to main` > `Push origin` - your changes are now backed up to GitHub.com.
- After each coding session, add any extra package(s) to your project's lockfile by navigating to `00-code` from the `Files` pane > `00-install-dependencies.R` > add packages to the `list_packages()` vector > run the script > save and close - this will load them with the "`Current dependencies`" chunk in future sessions.
- In the "`Add/load packages during a coding session`" chunk of `conductor.Rmd`, remove the package name(s) from the `list_packages()` vector > run the chunk - check that your package(s) were installed/loaded without error.
- Before you proceed with the next step, commit and push the current version of your project to GitHub.
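One way to check that packages were installed without error is to ask R which of them it still cannot find. The sketch below uses only base R; `find_missing_packages()` and the `extra_packages` vector are hypothetical names used for illustration and are not objects defined by this template.

```r
# Hypothetical helper: return the packages from `pkgs` that cannot be loaded
# from the library R is currently using.
find_missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    # requireNamespace() returns FALSE instead of erroring when a package
    # is absent (it loads the namespace as a side effect when present)
    function(pkg) requireNamespace(pkg, quietly = TRUE),
    logical(1)
  )
  pkgs[!installed]
}

# Illustrative vector only; substitute the packages you added this session
extra_packages <- c("stats", "utils")
find_missing_packages(extra_packages)
```

An empty result means every listed package is available to the project.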
- Format your code according to the tidyverse style guide by running the "`Format code as you complete stages`" chunk line-by-line (don't click the play symbol!) and perform the following checks and/or changes as appropriate:
  - `lintr::use_lintr(type = "tidyverse")` - can change type to other styles if preferred.
  - `lintr::lint_dir()` - identifies bad formatting across all R code in entire project ("Markers" tab below). Use this to identify where checks should be made in the next line.
  - `styler::style_dir("./")` - reformats your entire project to the tidyverse style (styler's default; it does not read the configuration created by `use_lintr()`). Check the modified code identified above to ensure it works and you're happy with the change(s).
  - `lintr::lint_dir()` - identifies bad formatting that cannot be fixed by styler. Use this to manually make formatting changes according to the tidyverse style guide.
- Save all changes and commit them to GitHub.com when ready.
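If you would like to see what styler would change before it rewrites any files, a dry run is one option. The sketch below assumes that styler and lintr are installed and that your styler version supports the `dry` argument of `style_dir()`; the wrapper `preview_formatting()` is hypothetical and not part of this template.

```r
# Hypothetical helper: list files styler would modify (without modifying them)
# and collect lintr's findings for the same directory.
preview_formatting <- function(path = ".") {
  if (!requireNamespace("styler", quietly = TRUE) ||
        !requireNamespace("lintr", quietly = TRUE)) {
    message("styler and lintr must both be installed; nothing was checked.")
    return(invisible(NULL))
  }
  # dry = "on" reports which files would change but leaves them untouched
  styled <- styler::style_dir(path, dry = "on")
  lints <- lintr::lint_dir(path)
  list(
    would_change = styled$file[which(styled$changed)],
    lints = lints
  )
}
```

Running `preview_formatting("./")` from the project root returns the files styler would touch alongside the lint results, so you can review both before committing to the real `styler::style_dir("./")` call.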
You may also require these functions on a case-by-case basis:
Open GitHub Desktop > select your project from the `Current repository` tab (left panel) > complete the following based on your scenario:
- If not already committed: Go to the `Changes` tab > select the change(s) you want to revert > right click > `Discard changes…`.
- If already committed: Go to the `History` tab > select the commit(s) you want to revert > right click > `Revert changes in commit`.
Branching allows you to clone a version of your repo and develop it independently from the `main` version - this allows you to try new things without risk of messing up your previous code. It is also a good way for collaborators to work on projects independently before deciding on and committing changes.
Here is an example of how branching works:
- Bill creates an R project with GitHub and gives shared access to Bob. The `main` branch of the project includes all the data and initial scripts used for cleaning and summarising the data.
- Bob wants to try out a new technique for summarising the data without affecting Bill's code, so he creates a new branch called `Bob-summary-trial` which clones the `main` branch. He then works on `Bob-summary-trial` independently from the `main` branch.
- Bob finishes his technique and pushes this variation of the code to GitHub.com. He also creates a pull request for Bill to assess whether he wants to merge Bob's changes into `main`.
- Bill pulls `Bob-summary-trial` from GitHub.com and compares it to `main`. He likes Bob's changes and approves the pull request - Bob's code is now merged into the `main` branch of the project.
How to branch:
- Open GitHub Desktop > select your project from the `Current repository` tab (left panel) > select `Current branch` (top panel) > `New branch` > name your new branch > click `Create branch`. Under the `Changes` tab, click `Publish branch` to push the new branch to GitHub.com.
- Work under the new branch by selecting the branch from `Current branch` (top panel) in GitHub Desktop - push and pull changes as you would normally in the `main` branch.
- To request that changes made in your new branch be merged with `main`, open GitHub Desktop > select your project from the `Current repository` tab (left panel) > select the new branch from `Current branch` (top panel) > `Preview Pull Request` > check that the request states "`Merge [x] commits into [base: main] from [name of new branch].`" and that the commit history is accurate.
- When happy, click `Create pull request` > leave a title and description to describe the merge > click `Create pull request` - this request needs to be approved by the project creator.
- To approve a pull request, click `Current branch` in GitHub Desktop > select the `main` branch that you want to merge with > `Choose a branch to merge into main` > select the new branch you want to merge into `main` > `Create a merge commit` > `Push origin` - the two branches are now merged.
- If desired, delete the old branch by clicking `Current branch` > right-click the branch you want to delete > `Delete…` > tick `Yes, delete this branch on the remote` > `Delete`.
Click here for the tutorial on creating a site or blog using GitHub pages, here for linking your repo to a Digital Object Identifier (DOI), and here for adding a CITATION file to your repo to help users cite you accurately.
Package | Attribution |
---|---|
config | Allaire J (2023). _config: Manage Environment Specific Configuration Values_. R package version 0.3.2, https://CRAN.R-project.org/package=config. |
devtools | Wickham H, Hester J, Chang W, Bryan J (2022). _devtools: Tools to Make Developing R Packages Easier_. R package version 2.4.5, https://CRAN.R-project.org/package=devtools. |
fs | Hester J, Wickham H, Csárdi G (2023). _fs: Cross-Platform File System Operations Based on 'libuv'_. R package version 1.6.3, https://CRAN.R-project.org/package=fs. |
here | Müller K (2020). _here: A Simpler Way to Find Your Files_. R package version 1.0.1, https://CRAN.R-project.org/package=here. |
lintr | Hester J, Angly F, Hyde R, Chirico M, Ren K, Rosenstock A, Patil I (2023). _lintr: A 'Linter' for R Code_. R package version 3.1.0, https://CRAN.R-project.org/package=lintr. |
markdown | Xie Y, Allaire J, Horner J (2023). _markdown: Render Markdown with 'commonmark'_. R package version 1.10, https://CRAN.R-project.org/package=markdown. |
plyr | Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. URL https://www.jstatsoft.org/v40/i01/. |
renv | Ushey K, Wickham H (2023). _renv: Project Environments_. R package version 1.0.3, https://CRAN.R-project.org/package=renv. |
rmarkdown | Allaire J, Xie Y, Dervieux C, McPherson J, Luraschi J, Ushey K, Atkins A, Wickham H, Cheng J, Chang W, Iannone R (2023). _rmarkdown: Dynamic Documents for R_. R package version 2.25, https://github.com/rstudio/rmarkdown. |
rstudioapi | Ushey K, Allaire J, Wickham H, Ritchie G (2023). _rstudioapi: Safely Access the RStudio API_. R package version 0.15.0, https://CRAN.R-project.org/package=rstudioapi. |
styler | Müller K, Walthert L (2023). _styler: Non-Invasive Pretty Printing of R Code_. R package version 1.10.2, https://CRAN.R-project.org/package=styler. |