Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Parallelization of the projection (#235)
* Parallelize the projection (i.e., parallelize across the projected draws). A related discussion may be found in issue #77.
* Introduce global option `projpred.nprjdraws_parallel`.
* Extend docs for parallelization.
* Improve the documentation for global option `projpred.nprjdraws_parallel` (and adapt the code correspondingly).
* Increase the default for option `projpred.nprjdraws_parallel` from 100 to 200 to ensure a speed improvement even for `devtools::load_all()` instead of `library(projpred)` (for `devtools::load_all()`, parallelization seems to be a bit slower).
* Iterate over the objects in the `foreach()` loop themselves instead of their index. This should avoid the export of those (large) objects.
* Use the `.export` argument in the `foreach()` call to handle exports manually. This could avoid the unnecessary export of (large) objects which are unused.
* Use the `.noexport` argument in the `foreach()` call to handle exports manually. This could avoid the unnecessary export of (large) objects which are unused.
* Check the most important (largest) objects which should not be exported.
* Set `.noexport` manually.
* Explain why we have the global option `projpred.nprjdraws_parallel`.
* Set `.export` manually (necessary for the `doFuture` package with `options(doFuture.foreach.export = ".export")`).
* Make the parallelization of the projection an optional feature.
* Add comments concerning the parallelization.
* Rename option `projpred.nprjdraws_parallel` to `projpred.prll_prj_trigger`.
* Throw errors if packages **foreach** and **iterators** are not installed.
* Add package **parallel** to `Suggests:`.
* Add package **doFuture** to `Suggests:` (only needed for the parallel tests, though, which are not run on CRAN by default).
* Add packages **future** and **future.callr** to `Suggests:`.
* `bootstrap()`: Remove argument `oobfun` which is not used within **projpred** (and makes future modifications of `bootstrap()` more complicated).
* `bootstrap()`: Replace `seq.int()` by `seq_len()`, as recommended in `?seq.int`.
* `bootstrap()`: Improve a comment.
* Parallelize `bootstrap()`.
* Extend the docs for the parallelized projection (in ``?`projpred-package` ``).
* Test the `"noclust"` setting (for projecting from a reference model) more often.
* Avoid a failing "`offsetnew` works" test due to numerical inaccuracies for extreme values.
* Add test "PRNG is not taking place where not expected".
* Simplify the test "PRNG is not taking place where not expected".
* Rename test "PRNG is not taking place where not expected" to "non-clustered projection does not require a seed".
* Add tests for parallelization.
* Move the code for the sequential `bootstrap()` run from `setup.R` to `test_parallel.R`.
* Revert the parallelization of the bootstrap. The first reason is that in order for the parallel test results to match the sequential ones exactly, we would have to use `` doRNG::`%dorng%`() `` also in the sequential case of `bootstrap()` and then, for the sequential results in the tests, not register any **foreach** backend at all or use `registerDoSEQ()`. The second reason is that, because of the parallelization overhead, the parallelized bootstrap didn't provide a speed improvement, at least not up to the number of bootstrap samples used by default (2000).
* Even if not on CRAN, `R CMD check` imposes a limit of 2 cores (at least by the defaults used in RStudio).
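The `foreach()`-related items above (iterating over the objects themselves, setting `.export` and `.noexport` manually, and gating the parallel run behind a trigger option) can be illustrated with a minimal sketch. It is not taken from projpred's source: `fit_one_draw()`, `project_draws()`, `big_unused`, and the option name `demo.prll_trigger` are hypothetical, and the sketch assumes that the parallel path is taken once the number of draws reaches the trigger value. Only `foreach()`, `%dopar%`, `.export`, `.noexport`, and `registerDoSEQ()` are actual **foreach** API.

```r
library(foreach)

# Hypothetical stand-in for the per-draw projection (not projpred's code).
fit_one_draw <- function(mu_draw) {
  mean(mu_draw)
}

# Hypothetical helper: use foreach() only if the number of draws reaches
# `trigger` (assumed semantics); otherwise run a plain sequential lapply().
project_draws <- function(draws,
                          trigger = getOption("demo.prll_trigger", Inf)) {
  if (length(draws) < trigger) {
    return(lapply(draws, fit_one_draw))
  }
  big_unused <- numeric(1e6) # large local object the workers do not need
  foreach(
    draw_i = draws,            # iterate over the objects, not over an index
    .export = "fit_one_draw",  # export only what the loop body uses
    .noexport = "big_unused"   # explicitly keep the large object local
  ) %dopar% {
    fit_one_draw(draw_i)
  }
}

# A sequential backend keeps the sketch runnable without extra packages.
registerDoSEQ()
draws <- list(c(1, 2, 3), c(4, 5, 6))
res_seq <- project_draws(draws)               # below trigger: lapply() path
res_par <- project_draws(draws, trigger = 0L) # trigger reached: foreach() path
```

Because each iteration receives only its own element of `draws`, a parallel backend has no reason to ship the whole list to every worker, which is the motivation the commit message gives for iterating over the objects rather than their indices.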
Showing 9 changed files with 303 additions and 37 deletions.
@@ -0,0 +1,92 @@
context("parallel")

# Setup -------------------------------------------------------------------

if (run_prll) {
  trigger_default <- options(projpred.prll_prj_trigger = 0L)

  if (dopar_backend == "doParallel") {
    doParallel::registerDoParallel(ncores)
  } else if (dopar_backend == "doFuture") {
    doFuture::registerDoFuture()
    export_default <- options(doFuture.foreach.export = ".export")
    if (future_plan == "callr") {
      future::plan(future.callr::callr, workers = ncores)
    } else if (future_plan == "multisession") {
      future::plan(future::multisession, workers = ncores)
    } else if (future_plan == "multicore") {
      future::plan(future::multicore, workers = ncores)
    } else {
      stop("Unrecognized `future_plan`.")
    }
  } else {
    stop("Unrecognized `dopar_backend`.")
  }
  stopifnot(identical(foreach::getDoParWorkers(), ncores))
}

# project() ---------------------------------------------------------------

test_that("project() in parallel gives the same results as sequentially", {
  skip_if_not(run_prll)
  tstsetups <- grep("\\.glm\\.", names(prjs), value = TRUE)
  for (tstsetup in tstsetups) {
    args_prj_i <- args_prj[[tstsetup]]
    p_repr <- do.call(project, c(
      list(object = refmods[[args_prj_i$tstsetup_ref]]),
      excl_nonargs(args_prj_i)
    ))
    expect_equal(p_repr, prjs[[tstsetup]], info = tstsetup)
  }
})

# varsel() ----------------------------------------------------------------

test_that("varsel() in parallel gives the same results as sequentially", {
  skip_if_not(run_prll)
  skip_if_not(run_vs)
  tstsetups <- grep("\\.glm\\.", names(vss), value = TRUE)
  for (tstsetup in tstsetups) {
    args_vs_i <- args_vs[[tstsetup]]
    vs_repr <- do.call(varsel, c(
      list(object = refmods[[args_vs_i$tstsetup_ref]]),
      excl_nonargs(args_vs_i)
    ))
    expect_equal(vs_repr, vss[[tstsetup]], info = tstsetup)
  }
})

# cv_varsel() -------------------------------------------------------------

test_that("cv_varsel() in parallel gives the same results as sequentially", {
  skip_if_not(run_prll)
  skip_if_not(run_cvvs)
  tstsetups <- grep("\\.glm\\.", names(cvvss), value = TRUE)
  for (tstsetup in tstsetups) {
    args_cvvs_i <- args_cvvs[[tstsetup]]
    # Use suppressWarnings() because of occasional warnings concerning Pareto k
    # diagnostics:
    cvvs_repr <- suppressWarnings(do.call(cv_varsel, c(
      list(object = refmods[[args_cvvs_i$tstsetup_ref]]),
      excl_nonargs(args_cvvs_i)
    )))
    expect_equal(cvvs_repr, cvvss[[tstsetup]], info = tstsetup)
  }
})

# Teardown ----------------------------------------------------------------

if (run_prll) {
  if (dopar_backend == "doParallel") {
    doParallel::stopImplicitCluster()
  } else if (dopar_backend == "doFuture") {
    future::plan(future::sequential)
    options(doFuture.foreach.export = export_default$doFuture.foreach.export)
    rm(export_default)
  } else {
    stop("Unrecognized `dopar_backend`.")
  }

  options(projpred.prll_prj_trigger = trigger_default$projpred.prll_prj_trigger)
  rm(trigger_default)
}
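
The setup and teardown sections of this test file rely on `options()` returning the previous values of the options it sets, so that they can be restored afterwards. A minimal base R sketch of that save-and-restore pattern follows; the option name `demo.prll_trigger` is hypothetical and is assumed to be unset beforehand.

```r
# Hypothetical option name, used only for this illustration.
old <- options(demo.prll_trigger = 0L) # options() returns the previous value(s)
during <- getOption("demo.prll_trigger")

# Passing the saved list back restores the previous state. Here the option
# was unset before, so restoring it sets it to NULL, which removes it.
options(old)
after <- getOption("demo.prll_trigger")
```

Restoring via `options(old)` has the same effect as the element-wise restore used in the teardown section; either form leaves the session as it was before the tests ran.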