Overhead of future_lapply relative to parLapply/mclapply, etc. #68
There's a fair bit of overhead from capturing and relaying conditions, e.g. messages and warnings. When using the low-level Future API, this can be disabled by:
f <- future(..., conditions = NULL)
where the default is
For historical reasons, due to being able to roll out condition relaying in future.apply, furrr, and doFuture, attempting to do the same there, e.g.
f <- future_lapply(..., future.conditions = NULL)
will end up becoming
There's also an overhead from capturing and relaying the standard output. This can indeed be disabled by setting
Now, I've just pushed an update to the develop branch where it's possible to disable the condition relaying mechanism as well. Install that version and try with:
fut2 <- function(v) {
  v <- future_lapply(v, function(.x) {
    gsub("\\*$", "", .x) %>% gsub("\\*.+$", "", .) %>% unique %>%
      paste0(collapse = ",")
  }, future.stdout = NA, future.conditions = NULL)
}
I think that'll shave off 10-15% of the overhead. You can do manual specification of globals and packages via arguments
There's more optimization that can be done in the future package, so you can expect some more improvement in future (pun intended) releases - not dramatic, but improvements.
cc/ @DavisVaughan
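For readers who want to try these settings, here is a minimal sketch of a `lapply(long_list, fast_function)`-style call with output and condition relaying turned off. The input `long_list`, the function `strip_suffix()`, and the choice of two `multisession` workers are assumptions made up for illustration; `future.globals` and `future.packages` are existing `future_lapply()` arguments, but whether they are the ones meant above is also an assumption.

```r
library(future.apply)  # also attaches the future package

plan(multisession, workers = 2)  # assumption: two local background R sessions

# Hypothetical input: many short strings, so the work per element is tiny
long_list <- as.list(sprintf("gene%d*01", 1:1000))

# Hypothetical fast function: drop everything from the first "*" onwards
strip_suffix <- function(x) sub("\\*.*$", "", x)

res <- future_lapply(
  long_list,
  strip_suffix,
  future.stdout     = NA,     # do not capture or relay standard output
  future.conditions = NULL,   # do not capture or relay messages and warnings
  future.globals    = FALSE,  # skip the automatic search for globals
  future.packages   = NULL    # no extra packages to attach on the workers
)

plan(sequential)  # shut down the background workers
```

Setting `future.globals = FALSE` is only safe here because `strip_suffix()` depends on nothing outside base R; a function that refers to other objects would need them listed explicitly.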
FYI, I've shaved off some internal overhead that involves R expression manipulations in the develop version of future. You can expect that
@HenrikBengtsson FWIW you can inject
x <- quote(list(1, 2))
x[[2]] <- NULL
x
#> list(2)
x <- quote(list(1, 2))
x[[2]] <- list(NULL)
x
#> list(list(NULL), 2)
x <- quote(list(1, 2))
x[2] <- NULL
x
#> list(2)
# Aha!
x <- quote(list(1, 2))
x[2] <- list(NULL)
x
#> list(NULL, 2)
Created on 2021-03-15 by the reprex package (v1.0.0)
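The working variant in the reprex above can be wrapped in a small helper. This is only a sketch in base R, and `set_call_arg()` is a hypothetical name that does not come from any package.

```r
# Hypothetical helper: replace the i-th argument of a call with `value`,
# including the case where `value` is NULL.
set_call_arg <- function(call, i, value) {
  stopifnot(is.call(call))
  # Element 1 of a call is the function itself, so argument i sits at i + 1.
  # Single-bracket assignment of list(value) stores the value as-is;
  # assigning a bare NULL would delete the element instead.
  call[i + 1L] <- list(value)
  call
}

x <- quote(list(1, 2))
y <- set_call_arg(x, 1L, NULL)
y
#> list(NULL, 2)
```

Evaluating `y` then yields a two-element list whose first element is `NULL`, whereas `x[[2]] <- NULL` would have dropped the argument altogether.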
Hi. I eventually figured it out - it turned out that for certain types of expressions one has to do some coercion for that to work, cf. https://github.com/HenrikBengtsson/future/blob/5c52ff365fc2efcdb063e4cc98acd52d441437a5/R/000.bquote.R#L109-L116. (Now when I look at it, I can't recall in exactly what situations the
I have a number of tasks that look like:
lapply(long_list, fast_function)
and I'd like to get away from using mclapply (for reasons you've talked about before). However, in my benchmarks I see that future_lapply has a larger overhead compared to parLapply/mclapply.
Are there parameters I can tune to improve the performance on these types of tasks?
An example: