Martin Lysy
TMB is an R package providing a convenient interface to the CppAD C++ library for automatic differentiation. More specifically for the purpose of statistical inference, TMB provides an automatic and extremely efficient implementation of Laplace's method to approximately integrate out the latent variables of a model p(x | theta) = \int p(x, z | theta) dz via numerical optimization. TMB is extensively documented, and numerous examples indicate that it can be used to effectively handle tens to thousands of latent variables in a model.
TMB was designed for users to compile and save standalone statistical models. Distributing one or more TMB models as part of an R package requires a nontrivial compilation process (Makevars[.win]
), and some amount of boilerplate code. Thus, the purpose of TMBtools is to provide helper functions for the development of R packages which contain TMB source code, promoting effective TMB coding practices as discussed below. The main package functions are:
-
tmb_create_package()
, which creates an R package infrastructure with the proper TMB compile instructions. -
use_tmb()
, which adds TMB functionality to an existing package. -
export_models()
, which updates the package's TMB compile instructions when new models are added.
Note to Developers: While TMBtools depends on a number of packages to facilitate its work, none of these dependencies are passed on to your package, except TMB itself.
Install the R package devtools package and run:
devtools::install_github("mlysy/TMBtools")
testthat::test_package("TMBtools", reporter = "progress") # optionally run the unit tests
As per the standard TMB tutorial, a typical standalone TMB model defined in ModelA.cpp
would look something like this:
/// @file ModelA.cpp
#include <TMB.hpp>
template<class Type>
Type objective_function<Type>::operator() () {
// model definition
}
And would be compiled and called from R as follows:
TMB::compile("ModelA.cpp") # compile model
dynload(TMB::dynlib("ModelA")) # link to R session
# instantiate model object
modA <- TMB::MakeADFun(data = data_list,
parameters = param_list,
DLL = "ModelA")
modA$fn(other_param_list) # call its negative loglikelihood
modA$gr(other_param_list) # negative loglikelihood gradient
To create an R package containing ModelA
, we must convert ModelA.cpp
to a header file ModelA.hpp
and make a few minor modifications:
/// @file ModelA.hpp
// **DON'T** #include <TMB.hpp> as it is not include-guarded
#undef TMB_OBJECTIVE_PTR
#define TMB_OBJECTIVE_PTR obj
// name of function below **MUST** match filename
// (here it's ModelA)
template <class Type>
Type ModelA(objective_function<Type>* obj) {
// exactly same body as
// `objective_function<Type>::operator()`
// in `ModelA.cpp`
}
#undef TMB_OBJECTIVE_PTR
#define TMB_OBJECTIVE_PTR this
For ModelA.hpp
, ModelB.hpp
, etc. similarly defined and in the current directory (base::getwd()
), the R package containg them is created with:
TMBtools::tmb_create_package(path = "path/to/MyTMBPackage",
tmb_files = c("ModelA.hpp", "ModelB.hpp"))
Once MyTMBPackage is installed, models can be accessed with:
library(MyTMBPackage) # package needs to be loaded
# instantiate ModelA object
modA <- TMB::MakeADFun(data = c(model = "ModelA", # which model to use
data_list),
parameters = param_list,
DLL = "MyTMBPackage_TMBExports") # package's DLL
modA$fn(other_param_list) # same usage as before
# instantiate ModelB object
modB <- TMB::MakeADFun(data = c(model = "ModelB", # use ModelB
data_list_B),
parameters = param_list_B,
DLL = "MyTMBPackage_TMBExports")
modB$fn(other_param_list_B)
For more usage details, e.g., how to add new models to an existing package, please see the package vignette.
As a sanity check, you may create a fully operational example package, i.e., with unit tests passing R CMD --as-cran check
, with the sample code below. To run this code you will need to install the packages devtools, usethis, and numDeriv.
TMBtools::tmb_create_package(path = "path/to/TMBExampleTest",
example_code = TRUE,
fields = list(
`Authors@R` = 'person("Your", "Name",
email = "[email protected]",
role = c("aut", "cre"))'))
# R wrapper functions to TMB models
# note: data = list(pkg = ...) must match name of package (here TMBExampleTest)
usethis::use_template(template = "norm_ADFun.R", package = "TMBtools",
save_as = file.path("R", "norm_ADFun.R"),
data = list(pkg = "TMBExampleTest"))
usethis::use_template(template = "gamma_ADFun.R", package = "TMBtools",
save_as = file.path("R", "gamma_ADFun.R"),
data = list(pkg = "TMBExampleTest"))
# testthat tests
usethis::use_testthat()
usethis::use_package(package = "numDeriv", type = "Suggests")
usethis::use_template(template = "test-norm_ADFun.R", package = "TMBtools",
save_as = file.path("tests", "testthat",
"test-norm_ADFun.R"))
usethis::use_template(template = "test-gamma_ADFun.R", package = "TMBtools",
save_as = file.path("tests", "testthat",
"test-gamma_ADFun.R"))
# update NAMESPACE and package documentation
pkgbuild::compile_dll() # need to compile src first
devtools::document()
The last two lines set up all elements of the package, including the TMB C++ code, but without actually installing the package. The next few lines run the various package checks. NOTE: You must quit + restart R before running these to make sure things work properly on newer versions of R.
# essentially equivalent to R CMD --as-cran check
devtools::check() # on your local platform
# on winbuilder (CRAN's Windows)
# must provide a valid email address
devtools::check_win_devel()
-
TMB and other source code (e.g., Rcpp) in an R package cannot be merged into a single shared library, as documented here. Therefore, the approach taken here is to have a separate shared object for the TMB models, for which the source code is stored in
src/TMB
, and for which separate compiling instructions provided inMakevars[.win]
.Developers can include whatever Rcpp code they like in the package. The
Makevars[.win]
is set up so that R's normal C++ compiling is completely unaffected by the TMB part, and can be modified to provide e.g., additionalCXX_FLAGS
in the usual way.However, standard practice is that the name of the main shared library (the one that gets created by Rcpp) is that of the package, so the TMB shared object needs to be called something else. I have found that in order to avoid the CRAN check note
Foreign function call to a different package
, in theNAMESPACE
file, it is necessary that theuseDynLib
call to the main shared library come first, i.e., before TMB's.TMBtools::tmb_create_package()
sets things up properly assuming that roxygen2 is used to create the documentation, andTMBtools::use_tmb()
provides instructions to do so when modifying theNAMESPACE
is out of its control. -
TMBtools follows the recommendations provided here to combine all the package's TMB models into a single standalone-type meta-model
src/TMB/MyTMBPackage
. The appropriate model is selected at instantiation time using themodel
element of thedata
argument toTMB::MakeADFun
. This results in much smaller package sizes compared to multiple standalone models, and doesn't appear to impact performance. -
The meta-model implementation in TMBtools (based heavily on the approach advocated here) is something like:
/// @file MyTMBPackage_TMBExports.cpp #include <TMB.hpp> #include "ModelA.hpp" #include "ModelB.hpp" template<class Type> Type objective_function<Type>::operator() () { DATA_STRING(model); if(model == "ModelA") { return ModelA(this); } else if(model == "ModelB") { return ModelB(this); } else { Rf_error("Unknown model."); } return 0; }
The unit tests in
TMBtools/test/testthat/test-aMb.R
indicate indicate that TMB does not get confused betweenDATA_*
andPARAMETER*
arguments across.hpp
model files -- for example,DATA_VECTOR(x)
inModelA
andPARAMETER_MATRIX(x)
inModelB
-- nor doesTMB::MakeADFun
expect you to provide all arguments to all models at once. In other words, the main function bodies ofModelA.hpp
andModelB.hpp
can be defined exactly you would for standalone filesModelA.cpp
andModelB.cpp
.
- Add unit tests for
tmb_create_package()
,use_tmb()
, andexport_models()
. - Run tests with
OpenMP
enabled. - Add examples for
use_tmb()
andexport_models()
.