Development journal
The intent of this journal is to provide notes related to:
- Development strategies for isotopeconverter
- R package creation
These notes include some key information at the top and then the rest are in reverse chronological order, so the most recent date is first.
Just a reminder of some really useful git and GitHub commands and strategies:
Useful packages for R package development:
- available: A great package for checking a bunch of stuff related to the name of a package including: availability on CRAN and definitions on Urban Dictionary (for unintentionally offensive names).
- usethis: Serves as a wrapper for a number of the packages highlighted below. Something of a "one stop shop" for package development.
- devtools: A ton of development tools for making packages.
- roxygen2: A convenient way to generate function and package metadata.
- testthat: A package for unit tests.
- covr: Used with the codecov badge to automatically determine how much of a package has unit tests.
- rhub: Used to test if an R package can be installed on multiple platforms. Can think of it as an R-specific alternative to Travis and Appveyor that is not quite as mature.
- rmarkdown: Great for making websites and other stuff using a form of markdown and R.
- knitr: The "connector" between R, Rmarkdown, and pandoc for making lots of different documents (including html).
- pkgdown: Used to make attractive documentation websites for a package. As an example, the GitHub site below was used to produce the pkgdown.r-lib site. Seems to be a wrapper around some rmarkdown functionality.
- spelling: A spell-checker for R. Not actually sure if this is useful, but I'd be willing to check it out.
Other useful package development resources:
- Developing Packages with RStudio
- R Package Development -- John Muschelli
Continuous integration:
- travis-ci.org
- ci.appveyor
- codecov.io: Code coverage percent for unit tests in a package.
  - Used with the package covr: https://cran.r-project.org/web/packages/covr/index.html
- CRAN
- lifecycle: This may actually be specific to the tidyverse, but thought I'd check it out.
For an example of developer tags see the usethis library.
Added a devel branch for messing around with the development of the package.
Implemented a lot of what I've learned related to R package development to generate the first "working" (it can be attached, but does not have any actual functionality) version of the package.
Got Travis and Appveyor running. Also got codecov up, but there are no tests yet, so that badge isn't really that useful, yet. Here's where I can find more information about each service for this repo:
- travis: https://travis-ci.org/wetlandscapes/isotopeconverter
- Building an R Project: https://docs.travis-ci.com/user/languages/r
- appveyor: https://ci.appveyor.com/project/wetlandscapes/isotopeconverter
- codecov: https://codecov.io/gh/wetlandscapes/isotopeconverter
I think I should probably set up a devel branch for the package so I'm not over-working the continuous integration tools.
Today I continued my review of R Packages, related to testing (tests/), but also looked at some other package development resources related to usethis.
This chapter of R Packages is all about the unit test and making some automated tests to ensure code generates the expected result. Unit tests are called such because each test is meant to examine a "unit" of functionality. Tests are particularly useful for automatically ensuring that existing code is generating the intended output and that new functionalities do not impact old ones.
- devtools::use_testthat: Sets up automated checking.
  - Creates a folder, tests/testthat, for performing unit tests.
  - Modifies the DESCRIPTION file.
  - Creates tests/testthat.R for running tests.
- Modify code or test.
- devtools::test: Test package (Ctrl + Shift + T)
- Repeat until all tests pass.
Example test from the stringr package:
context("String length")
library(stringr)

test_that("str_length is number of characters", {
  expect_equal(str_length("a"), 1)
  expect_equal(str_length("ab"), 2)
  expect_equal(str_length("abc"), 3)
})
Note that the above is grouped into a hierarchy: expectations are grouped into tests, which are in turn located in files.
Questions to ask when writing tests:
- Does an output have the correct value?
- Does an output have the correct class?
- Does it produce a warning when it's supposed to?
- Does it produce an error when it's supposed to?
It's important to write both "passing" and "failing" tests. This can be done using the testthat::expect_* functions.
testthat::expect_* functions have two arguments:
- The actual result of a function.
- The expected result from the same function.
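As a minimal sketch of one "passing" and one "failing" test, assuming testthat is installed: ratio_to_delta is a hypothetical helper written only for this illustration, not a function from isotopeconverter, and the numbers are arbitrary.

```r
library(testthat)

# Hypothetical helper (illustration only): expresses a ratio relative to a
# standard ratio, in parts per thousand.
ratio_to_delta <- function(ratio, standard) {
  if (!is.numeric(ratio) || !is.numeric(standard)) {
    stop("`ratio` and `standard` must be numeric")
  }
  (ratio / standard - 1) * 1000
}

test_that("ratio_to_delta returns the expected value and class", {
  # "Passing" case: first argument is the actual result, second the expected one.
  expect_equal(ratio_to_delta(2, 2), 0)
  expect_equal(ratio_to_delta(2.2, 2), 100)
  expect_true(is.numeric(ratio_to_delta(2.2, 2)))
})

test_that("ratio_to_delta fails on non-numeric input", {
  # "Failing" case: the function is supposed to throw an error here.
  expect_error(ratio_to_delta("a", 2))
})
```

expect_equal compares with a numerical tolerance, which is why the floating point result of the second expectation can be compared against 100.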
- devtools::use_testthat: Sets up automated checking.
- devtools::test: Test package (Ctrl + Shift + T).
- testthat::test_that: Runs a test for related expectations.
  - Take advantage of the description argument. It can help one diagnose which test returned an error.
- testthat::expect_*: A series of related functions to help with unit testing. Examples:
  - testthat::expect_equal: Tests if two things are similar, with numerical tolerance (uses all.equal).
  - testthat::expect_identical: Test for exact equivalence.
  - testthat::expect_match: Matches a character vector against a regular expression.
  - testthat::expect_output: Inspects printed output. Uses regular expressions.
  - testthat::expect_message: Output includes a message. Uses regular expressions.
  - testthat::expect_warning: Output includes a warning. Uses regular expressions.
  - testthat::expect_error: Output is an error. Uses regular expressions.
  - testthat::expect_is: Checks if a result inherits from a specified class.
  - testthat::expect_false: A nice catch-all for a condition.
  - testthat::expect_true: A nice catch-all for a condition.
  - testthat::expect_equal_to_reference: Caches a result when the expectation is not easy to predict.
- testthat::skip: Used to skip a particular test. Placed in the actual function.
*_output, *_message, *_warning, and *_error can be particularly powerful when including a more specific output string. This allows one to test for multiple conditions.
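A small sketch of matching specific condition messages, assuming testthat is installed. check_ratio and its messages are hypothetical and exist only to show how one function can be tested for several distinct conditions.

```r
library(testthat)

# Hypothetical function (illustration only) that can fail in two different
# ways and warn in a third.
check_ratio <- function(x) {
  if (!is.numeric(x)) stop("`x` must be numeric")
  if (any(x < 0)) stop("`x` must be non-negative")
  if (any(x > 1)) warning("`x` is larger than 1")
  x
}

test_that("check_ratio signals the right condition for each problem", {
  # The second argument is a regular expression matched against the message.
  expect_error(check_ratio("a"), "must be numeric")
  expect_error(check_ratio(-1), "non-negative")
  expect_warning(check_ratio(2), "larger than 1")
})
```

Without the message pattern, the first two expectations could not tell the two errors apart.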
Each test should focus on some small functionality associated with a function, that way it is easier to diagnose where and how things went wrong. Using testthat::test_that the first argument (the description) should finish the sentence "Test that...". For example, from the units package:
test_that("We can concatenate units if they have the same unit", {
  x <- 1:4 * as_units("m")
  y <- 5:8 * as_units("m")
  z <- c(x, y)
  expect_equal(length(z), length(x) + length(y))
  expect_equal(x, z[1:4])
  expect_equal(y, z[1:4 + 4])
})
Notice that the desc argument in test_that can be combined into "Test that... we can concatenate units if they have the same unit."
Tests can make it less likely that changes in code will break functionality, but tests can also make it harder to purposely modify functionality. So a balance must be struck. This may help explain why some packages I'd expect to be 100 % covered by unit tests are not -- flexibility in future development.
Strategies:
- Not sure what this means, but "focus on testing the external interface to your functions."
- Containerize functionality tests as much as possible. This will allow you to modify or delete tests more easily when altering functionalities in the future.
- Focus on complicated parts, as those are the most likely to break; spend less time on the simple things that you know will work.
- Always write a test when a bug is discovered. That is, when a bug is found, write a test first, then write the code that will pass the test.
Example:
check_api <- function() {
  if (not_working()) {
    skip("API not available")
  }
}

test_that("foo api returns bar when given baz", {
  check_api()
  ...
})
This section provides an example of "refactoring", which simplifies some of the redundancies associated with the expect_* functions in some circumstances.
Imports are used to determine how a function in one package finds a function in another package. Effectively reduces conflicts in function names between packages by better defining environments. Exports avoid conflicts with external functions.
The search path is the ordered set of package environments interrogated to find a function, data, etc. The search path can be accessed via search().
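A quick way to look at the search path in a running session. The entries in the middle depend on which packages are attached, so only the two ends are commented on here.

```r
# The search path as a character vector, in lookup order.
sp <- search()
print(sp)

sp[1]           # the global environment, ".GlobalEnv", always comes first
sp[length(sp)]  # "package:base" always comes last
```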
How to attach packages:
- library(x): Used in data analysis scripts. Do not use in a package. The value x represents the package being loaded.
- requireNamespace("x", quietly = TRUE): Used in packages. Returns FALSE if the package cannot be loaded, which can be used to generate an error.
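A minimal sketch of the requireNamespace pattern inside a package function. use_optional and the default package name "somepackage" are hypothetical placeholders.

```r
# Hypothetical package function (illustration only) that relies on an
# optional dependency. "somepackage" is a placeholder, not a real requirement.
use_optional <- function(pkg = "somepackage") {
  # requireNamespace() returns FALSE instead of failing when pkg is missing,
  # so the function can raise its own, more helpful error.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this function. Please install it.",
         call. = FALSE)
  }
  invisible(TRUE)
}

# "stats" ships with R, so this call succeeds.
use_optional("stats")
```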
In a package context, Imports and Depends in the DESCRIPTION file are where packages should be loaded. Depends is used when a package is built "on top" of another. Imports loads a package, though its functions may not be available unless using the :: operator. Depends attaches a library -- it both loads it and makes its functions available without needing to use ::. A lot of the management associated with these dependencies can be taken care of using the usethis package.
The NAMESPACE file should probably be handled using the roxygen2 package. This, in part, is what the tags from roxygen2 help with.
Namespace directives:
- Exports
  - export(): functions, including S3 and S4 generics
  - exportPattern(): Export all functions that follow a certain string pattern
  - S3method(): S3 methods
  - For S4:
    - exportClasses(): S4 classes
    - exportMethods(): S4 methods
- Imports
  - import(): Import all functions from a package
  - importFrom(): Import select functions from a package
  - useDynLib(): Import a function from C.
  - For S4:
    - importClassesFrom(): S4 classes
    - importMethodsFrom(): S4 methods
- Add roxygen2 tags to an .R file.
- devtools::document: Convert roxygen comments to a .Rd file.
- Examine NAMESPACE to make sure it makes sense.
- Repeat.
Use the @export tag from roxygen2 to export a function.
Files that start with "." are not automatically exported to NAMESPACE. It's generally better to export too little than too much, especially during the development phases of a project, because internal functions can be improved without altering external performance.
There is some additional information about exporting S3, S4, and reference classes.
Different methods:
- Use a combination of names and function: package::function(). Preferred if a function is only called a couple times.
- @importFrom <package> <operator>: Preferred for calling specific operators from a package, like %>%.
- @importFrom <package> <function>: Preferred for calling a specific function.
- @import <package>: Imports all functions from the package.
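The first and third options can be sketched side by side. middle_value and spread_value are hypothetical functions, and stats is used only because it ships with R.

```r
# Option 1: call the function with an explicit namespace prefix.
# Hypothetical function (illustration only).
middle_value <- function(x) {
  stats::median(x, na.rm = TRUE)
}

# Option 2: import one function with a roxygen2 tag, then call it bare.
# After devtools::document(), NAMESPACE would contain: importFrom(stats,sd)

#' Spread of a numeric vector
#'
#' @param x A numeric vector.
#' @return The standard deviation of \code{x}, ignoring missing values.
#' @importFrom stats sd
#' @export
spread_value <- function(x) {
  sd(x, na.rm = TRUE)
}
```

In both cases the other package still has to be listed under Imports in the DESCRIPTION file so that it is installed alongside the package.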
The Imports field in the DESCRIPTION file and the import directives in the NAMESPACE file seem confusingly alike, but that is just due to a poor naming convention. Imports: ensures dependent packages are installed along with your package. Import directives actually make functions from another package available inside your package.
Some additional information provided related to S3 and S4 objects.
Notes from the blog usethis workflow for package development by Emil Hvitfeldt
This blog serves as something of an update to two other really useful blogs on package development, but focuses on elements of the usethis package:
- Writing an R package from scratch by Hilary Parker
- Writing an R package from scratch by Tomas Westlake
- Load essential packages: available, devtools, roxygen2, usethis, testthat, spelling
- Check that the name of the package is available and not associated with some weird definition: available.
Note: I may not be able to use these commands, because I've already done some of these things in a different workflow.
- usethis::create_package: Create the package.
- usethis::use_git: Set up the package to be used with git.
- usethis::use_github: Set up the package to be used with GitHub.
- Generate a license.
  - Example: usethis::use_gpl3_license.
  - Useful resources for choosing a license: https://choosealicense.com/
- usethis::use_readme_rmd: Generate a README. Adds the file to .Rbuildignore.
- Add continuous integration. Do these one at a time as each one has extra steps associated with it, highlighted in output to console.
  - usethis::use_travis
  - usethis::use_appveyor
  - usethis::use_coverage(type = c("codecov"))
- usethis::use_testthat: Adds the testthat package workflow for unit testing.
- usethis::use_spell_check: Use the spelling package to ensure spell check is done.
  - devtools::check() is used to trigger spell check.
- usethis::use_data_raw: If using raw data that needs to be created/formatted.
- usethis::use_news_md: For bigger projects with release information.
Shortcuts below are for Windows.
- Write code
- Restart R session: Ctrl+Shift+F10
- Build and reload package: Ctrl+Shift+B
- Test package: Ctrl+Shift+T
- Check package: Ctrl+Shift+E
- Document package: Ctrl+Shift+D
- usethis::use_r: Make an R function that will be added to R/. Note: Not entirely clear if this is intended for individual functions or if I can add multiple functions in a single file. The latter would be more convenient.
- usethis::use_test: Add unit tests to the function(s) created using use_r.
- usethis::use_package: Import external packages that your package will depend on.
- Special use_package cases.
  - usethis::use_rcpp: If using Rcpp.
  - usethis::use_pipe: If you want to use the pipe operator, %>%, without importing all of magrittr.
  - usethis::use_tibble: If you want to work with tibbles.
- usethis::use_vignette
- Restart R session: Ctrl+Shift+F10
- Document package: Ctrl+Shift+D
- Check package: Ctrl+Shift+E
- Restart R session: Ctrl+Shift+F10
- Document package: Ctrl+Shift+D
- Check package: Ctrl+Shift+E
- usethis::use_version: Update the version
- Helpers for tidyverse development using usethis: https://usethis.r-lib.org/reference/tidyverse.html
Today I reviewed more from R Packages. Specifically, I reviewed elements of documentation (man/ folder) and making vignettes (vignettes/).
This chapter largely focuses on using the roxygen2 package for developing documentation, as it automates a number of elements associated with making the documentation, and mingles the documents and functions together. For more on roxygen2:
Objects covered by this chapter:
- Functions
- Packages
- Requires some special formatting compared to functions.
- Classes, generics, methods
- This includes S3, S4, and reference classes.
Anatomy of help file (parts in parentheses are optional):
- Name
- (Alias)
- Title
  - First line of a roxygen2 skeleton
- Description
  - Second line of a roxygen2 skeleton
- Usage
- Arguments (parameters)
- Details
  - Third+ lines of a roxygen2 skeleton, before additional tags are included
- Value
- (Character strings)
- (Expressions)
- (Note)
- (Author(s))
- (S4 methods)
- (References)
- (See Also)
- Return
- Examples
Vocabulary:
- Blocks: Partition a function into different parts.
- Tags: Used to delineate blocks of information related to a function.
  - Use @@ for a literal @ sign.
- Formatting commands: Used to format help text, make references, etc.
Example tags
- @param <argument> <description of argument>: Indicates that the <argument> is a parameter of the function.
  - Description should include the parameter type (e.g., numeric, string, factor).
  - Can combine parameters into a single description by separating the names with commas and no spaces. Ex. @param x,y Numeric vectors.
- @inheritParams <source_function>: The new function has the same parameters as the source function, reducing the amount of typing and explanation required for similar functions.
- @inheritParams package::function: Same as @inheritParams, but allows one to extract parameter information from another package and function.
- @examples: Examples of how to use the function (like a miniature vignette).
  - Provides executable R code that will run when called from the utils::example() function. Ex. example("sum")
  - See \dontrun{} for not running some examples (because they were included to illustrate examples of failure).
- @return <description>: The output of the function.
- @seealso: Good for pointing users to other commands, packages, or resources of interest related to a function.
- @family: Similar to @seealso, but used to reference a "family" of functions that are related.
- @aliases: Alternative function names -- makes it easier to find a function.
- @keywords: Keywords related to function.
  - Often used in the context of @keywords internal to highlight it is an "internal" function in the package, not really meant to be shared with regular users, but still of interest to the package developer and others that might be interested in extending the package (thus it still has documentation).
- @docType: Particularly useful when documenting a package. Ex. @docType package.
- @slot: Used to document the slot of an S4 class.
- @rdname: Associated with S4 method documentation. Seems to be used for referencing, so information can be reused without having to repeat one's self.
- @describeIn: Also associated with S4 method documentation. Seems to be used for referencing, so information can be reused without having to repeat one's self.
  - Seems I should reference the "Documenting multiple functions in the same file" section if I need to use this in the context of an S4 package.
- @include: Used to specify the order in which S4 classes are created. Sets the Collate field in the DESCRIPTION file.
- @field: Used with reference classes. Replaces the functionality of the slot.
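Several of these tags can be seen together in one roxygen2 block. This is only a sketch: ratio_to_percent is a hypothetical function, and the title, description, and details follow the first, second, and third+ paragraph convention from the help file anatomy.

```r
#' Convert a ratio to a percentage
#'
#' Multiplies a ratio by 100 and rounds the result.
#'
#' Further paragraphs, like this one, end up in the details section.
#'
#' @param x A numeric vector of ratios.
#' @param digits A single integer: the number of decimal places to keep.
#' @return A numeric vector the same length as \code{x}.
#' @seealso \code{\link{round}}
#' @examples
#' ratio_to_percent(0.256)
#' \dontrun{
#' ratio_to_percent("a") # fails: non-numeric input
#' }
#' @export
ratio_to_percent <- function(x, digits = 1) {
  # Hypothetical function body (illustration only).
  stopifnot(is.numeric(x))
  round(x * 100, digits)
}
```

Running devtools::document() on a package containing this file would turn the comment block into man/ratio_to_percent.Rd and add an export line to NAMESPACE.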
Example formatting
More on text formatting can be found at: http://r-pkgs.had.co.nz/man.html#text-formatting
- \code{}: Similar to adding `` in markdown.
- \link{}: Use to link to a function in current or other package.
  - Ex. \code{\link{<functionname>}}
  - Ex. \code{\link[<packagename>]{<functionname>}}
- \dontrun{}: Used with the @examples tag. Can be expressed over multiple lines. Ex. \dontrun{sum("a")}.
- \eqn{}: Inline equation.
- \deqn{}: Display block equation.
- \tabular{}: For making simple tables.
  - \tab: Separates columns.
  - \cr: Separates rows.
A function for turning a dataframe into a table that can be used in a help file:
tabular <- function(df, ...) {
  stopifnot(is.data.frame(df))

  # Right-align numeric columns, left-align everything else.
  align <- function(x) if (is.numeric(x)) "r" else "l"
  col_align <- vapply(df, align, character(1))

  # Format each column, then join cells with \tab and rows with \cr.
  cols <- lapply(df, format, ...)
  contents <- do.call("paste",
    c(cols, list(sep = " \\tab ", collapse = "\\cr\n ")))

  paste("\\tabular{", paste(col_align, collapse = ""), "}{\n ",
    contents, "\n}\n", sep = "")
}

cat(tabular(mtcars[1:5, 1:5]))
Generics
Documented as regular functions.
Classes
The constructor function is what gets documented, since there are no formal definitions.
Methods
These are the "." functions associated with generics like print. It's choose your own adventure for which of these get documented, though I'd err on providing too much information, rather than too little.
Generics
Defined like a function.
Classes
Need to use a combination of @slot and setClass(). Ex:
#' An S4 class to represent a bank account.
#'
#' @slot balance A length-one numeric vector
Account <- setClass("Account",
  slots = list(balance = "numeric")
)
Methods
Must be documented. Is associated with either the class document, generic document, or its own document, depending on the complexity.
Classes
Uses @field instead of slots. Also uses a different style in which the class methods are wrapped in a setRefClass function, where the class and methods are defined.
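A sketch of the reference class style, mirroring the S4 bank account example. The deposit method is a hypothetical addition for illustration; the string on its first line acts as the method's docstring.

```r
library(methods)

#' A reference class to represent a bank account.
#'
#' @field balance A length-one numeric vector.
Account <- setRefClass("Account",
  fields = list(balance = "numeric"),
  methods = list(
    deposit = function(x) {
      "Deposit money into the account."
      # <<- modifies the field of the object itself (reference semantics).
      balance <<- balance + x
      invisible(.self)
    }
  )
)

acct <- Account$new(balance = 100)
acct$deposit(50)
acct$balance
```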
I think I can do a lot of this with the workflow I've developed using knitr/rmarkdown on other projects.
- browseVignettes: Opens up a browser (e.g., Chrome) with a list of vignettes for a named package, including the different viewing options for those vignettes (e.g., html, pdf, Rmd file, a code-only file).
  - Ex. browseVignettes("dplyr")
- vignette: Access individual vignettes.
  - If bare (no argument included), the function returns a "View" of all the vignettes available.
  - If a vignette name is provided, then the function returns that vignette.
- devtools::use_vignette
  - Use to create a vignettes/ folder (if it doesn't already exist), as well as a .Rmd file associated with the vignette.
  - Ex. devtools::use_vignette("vignette-name") will generate the file vignettes/vignette-name.Rmd.
  - It's not yet clear to me how sophisticated a vignette can be. That is, can I include supporting files (e.g., images, custom css, custom JavaScript)?
- devtools::build_vignettes can be used to build just the vignettes, though devtools::build will generate more useful results.
  - Note that neither RStudio's "Build & reload" button nor devtools::install_github() will build vignettes, because they are time consuming. devtools::install_github(build_vignettes = TRUE) will force vignettes to be built.
This is the YAML header (metadata). Below is an example header. Notice the output and vignette fields. They contain information not usual to "typical" rmarkdown html outputs. Note the line %\VignetteIndexEntry{Vignette Title}. Here, Vignette Title needs to be changed to the actual title (apparently it is not inherited from the YAML title).
---
title: "Vignette Title"
author: "Vignette Author"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette # This template was designed to work well with R packages
vignette: > # Special character used to tell YAML to interpret something as a literal string
  %\VignetteIndexEntry{Vignette Title}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---
Most of the rest of this chapter is related to rmarkdown and knitr, which I feel pretty comfortable with.
- I should review the udunits documentation to better understand how that library works, since it underlies the quantities ecosystem of packages.
  - I'm starting to think units is not flexible enough for my needs, but may be convinced otherwise upon a slightly deeper dive into the documentation.
  - If I understand how udunits works a bit more, I might be able to make some wrappers, or even a class, that "hides" some of my workarounds when using units, making the user experience a bit nicer and more intuitive.
- Review the roxygen2 documentation.
- Review testthat documentation.
Currently reviewing R Packages and came across the .onLoad idea in the Code chapter, which can be used to perform some "setup" functions when a package is loaded. This might be a good way to deal with adding custom isotope units to the units package. .onLoad is conventionally stored in a file called zzz.R in the R/ folder. Seems like I'm going to need to read R Packages at least twice -- once while getting started and again when trying to figure out what I missed.
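A rough sketch of what R/zzz.R could look like. It only sets a package option; the option name isotopeconverter.default_standard and its value are hypothetical placeholders, and registering custom units with the units package is deliberately left out because I haven't worked out that call yet.

```r
# Sketch of R/zzz.R (illustration only).
.onLoad <- function(libname, pkgname) {
  op <- options()
  # Hypothetical package-level default; the name and value are placeholders.
  op_pkg <- list(isotopeconverter.default_standard = "VSMOW")
  # Only set options that the user has not already defined.
  to_set <- !(names(op_pkg) %in% names(op))
  if (any(to_set)) options(op_pkg[to_set])
  invisible()
}
```

R calls .onLoad automatically when the package namespace is loaded, so the function is never called directly in package code.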
The Package metadata chapter in R Packages provides a basic outline of what is contained in a DESCRIPTION file. For an example description file: https://github.com/r-quantities/units/blob/master/DESCRIPTION.
- create: Creates a package. This may not be necessary if using RStudio.
- use_package: Adds package dependencies and suggests to the description file.
- load_all: Loads the package under development into the current session, roughly simulating what happens when it is installed and attached.
- document: Converts roxygen comments to a .Rd file for documentation (via roxygen2::roxygenise)
  - Seems like this would be better than completely re-building a package.
  - Downside is that links between documentation do not work unless the package is completely rebuilt.
- Full list of author roles: http://www.loc.gov/marc/relators/relaterm.html
- Common roles: cre, aut, and ctb
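The roles are supplied through person() objects, which is what the Authors@R field of a DESCRIPTION file holds. A small sketch with placeholder names and email addresses:

```r
# Placeholder names and emails (illustration only); replace with the real
# package authors.
authors <- c(
  person("First", "Last", email = "first.last@example.com",
         role = c("aut", "cre")),  # author and maintainer
  person("Another", "Contributor", role = "ctb")  # contributor
)

# Show each person with their roles.
format(authors, include = c("given", "family", "role"))
```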
R packages need, at a minimum, a description file. I think this may get generated when you create a package in RStudio.
Idea: For the table of isotope standards, include three different formats:
- R readable (as in when using the View function)
- Formatted to be used in an html output from kableExtra and knitr
- Formatted to be used in a latex output from kableExtra and knitr
Reviewed a lot of documentation from the r-quantities and related projects: units, errors, quantities, and constants. Seems like I might be able to build some functionality on top of these packages.