period.apply on xts object takes about 50x longer than running it on it's core data and recasting. #278

ckatsulis · 2018-11-02T19:41:50Z

Description

on large xts objects +1MM lines, period.apply is much faster using core data and recasting

Expected behavior

about equal runtimes with and without a numeric index

Minimal, reproducible example

n <- 1e6
ep <- seq(1, n, 3) 
x <- .xts(runif(n, 1, n), 1:n)
system.time(test1 <- period.apply(x, ep, sum)) 
#    user  system elapsed
#  12.637   0.010  12.653
system.time(test2 <- period.apply(coredata(x), ep, sum)) 
#    user  system elapsed
#   0.803   0.000   0.804

Session Info

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rcpp_0.12.19               roll_1.0.7                 forecast_8.4               DEoptim_2.2-4              dygraphs_1.1.1.6           PerformanceAnalytics_1.5.2 FinancialInstrument_1.3.1  quantmod_0.4-13            TTR_0.23-3                
[10] xts_0.11-1                 zoo_1.8-3                 

loaded via a namespace (and not attached):
 [1] uroot_2.0-9          lattice_0.20-35      assertthat_0.2.0     digest_0.6.15        lmtest_0.9-36        mime_0.5             R6_2.2.2             plyr_1.8.4           backports_1.1.2      futile.options_1.0.1 ggplot2_3.0.0       
[12] pillar_1.2.3         rlang_0.3.0          uuid_0.1-2           lazyeval_0.2.1       curl_3.2             misc3d_0.8-4         data.table_1.11.4    fracdiff_1.4-2       R.utils_2.6.0        R.oo_1.22.0          checkmate_1.8.5     
[23] stringr_1.3.1        pander_0.6.2         htmlwidgets_0.9      RCurl_1.95-4.10      munsell_0.5.0        shiny_1.0.5          compiler_3.5.1       httpuv_1.4.4.2       pkgconfig_2.0.1      qmao_1.6.11          urca_1.3-0          
[34] htmltools_0.3.6      nnet_7.3-12          tidyselect_0.2.5     htmlTable_1.9        tibble_1.4.2         quadprog_1.5-5       dplyr_0.7.7          later_0.7.3          bitops_1.0-6         R.methodsS3_1.7.1    grid_3.5.1          
[45] nlme_3.1-137         jsonlite_1.5         xtable_1.8-2         gtable_0.2.0         formatR_1.5          magrittr_1.5         scales_0.5.0         RcppParallel_4.4.1   stringi_1.2.3        promises_1.0.1       tseries_0.10-45     
[56] bindrcpp_0.2.2       timeDate_3043.102    futile.logger_1.4.3  plot3D_1.1.1         lambda.r_1.2.3       tools_3.5.1          RJSONIO_1.3-0        glue_1.3.0           purrr_0.2.5          yaml_2.1.19          colorspace_1.3-2    
[67] knitr_1.20           bindr_0.1.1          profvis_0.3.5

joshuaulrich · 2018-11-03T12:12:08Z

period.apply() is essentially a convenience wrapper around sapply(), so it's not particularly performant. There's a faster unexported C-based version that needs some testing before I'm comfortable exporting it and/or replacing period.apply() with it (see 7c29359).

n <- 1e6L
ep <- c(0L, seq(1L, n, 3L))
x <- .xts(runif(n, 1, n), 1:n)
system.time(test1 <- xts:::period_apply(x, ep, sum))
#    user  system elapsed
#   2.104   0.044   2.149
system.time(test2 <- period.apply(coredata(x), ep, sum))
#    user  system elapsed
#   0.705   0.019   0.725
identical(coredata(test1), test2)
# [1] TRUE

I may be able to use some ideas from simplify2array() to make it even faster. But testing... it needs much testing.

ckatsulis · 2018-11-03T15:16:52Z

Cool. Thx Josh! Chris Sent from my iPhone

…

On Nov 3, 2018, at 07:12, Joshua Ulrich ***@***.***> wrote: period.apply() is essentially a convenience wrapper around sapply(), so it's not particularly performant. There's a faster unexported C-based version that needs some testing before I'm comfortable exporting it and/or replacing period.apply() with it (see 7c29359). n <- 1e6L ep <- c(0L, seq(1L, n, 3L)) x <- .xts(runif(n, 1, n), 1:n) system.time(test1 <- xts:::period_apply(x, ep, sum)) # user system elapsed # 2.104 0.044 2.149 system.time(test2 <- period.apply(coredata(x), ep, sum)) # user system elapsed # 0.705 0.019 0.725 identical(coredata(test1), test2) # [1] TRUE — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

harvey131 · 2018-12-14T20:37:27Z

I've been able to go faster using data.table and converting the results back to an xts:

library(xts)  
n <- 1e6L  
ep <- c(0L, seq(1L, n, 3L))  

x <- .xts(runif(n, 1, n), 1:n)
system.time(test1 <- xts:::period_apply(x, ep, sum))
   user  system elapsed 
   1.25    0.00    1.25 
system.time(test2 <- period.apply(coredata(x), ep, sum))
   user  system elapsed 
   0.47    0.00    0.46 
identical(coredata(test1), test2)
[1] TRUE

period.sumDT <- function(x, INDEX)
{
  idx <- findInterval(index(x), index(x)[INDEX], left.open = T)
  dt <- data.table::data.table(idx=idx, data=coredata(x)[,1])
  dt <- as.matrix(dt[, sum(data, na.rm=T), by=idx])
  idx <- dt[,1] < length(INDEX)
  setNames( xts(dt[idx,2], index(x)[INDEX][dt[idx,1]+1]), colnames(x))
}
system.time(test3 <- period.sumDT(x, ep ))
   user  system elapsed 
   0.07    0.00    0.08 
all.equal(test1, test3)
[1] TRUE

>   period_apply <- xts:::period_apply  # not exported
>   set.seed(21)
>   x <- .xts(rnorm(1e7), 1:1e7)
>   e <- endpoints(x, "seconds", 5)
>   system.time(y <- period.apply(x, e, sum))  # current version
   user  system elapsed 
  49.17    0.03   49.20 
>   system.time(z <- period_apply(x, e, sum))  # new C version
   user  system elapsed 
   8.14    0.09    8.33 
>   all.equal(y, z)
[1] TRUE

period.sumDT <- function(x, INDEX)
{
  idx <- findInterval(index(x), index(x)[INDEX], left.open = T)
  dt <- data.table::data.table(idx=idx, data=coredata(x)[,1])
  dt <- as.matrix(dt[, sum(data, na.rm=T), by=idx])
  idx <- dt[,1] < length(INDEX)
  setNames( xts(dt[idx,2], index(x)[INDEX][dt[idx,1]+1]), colnames(x))
}
>   system.time(test3 <- period.sumDT(x, e ))
   user  system elapsed 
   0.31    0.11    0.42 
>   all.equal(y, test3)
[1] TRUE

joshuaulrich · 2018-12-15T11:53:40Z

I appreciate your comment and desire to make xts even faster. However, your code is over-optimized to this specific example.

Here are the issues that are immediately clear to me:

period.apply() needs to accept any function, not only sum()
The user's function may also return an object with more than one column
The function may also assume that each subset is an xts object

I'm also very unlikely to change the existing period.apply() function until we have adequate unit test coverage.

I also just checked my suspicion that data.table optimizes the call to sum() by avoiding any R evaluations and instead doing all the summing at the C level. So the performance improvement your example shows would not generalize to most other functions.

harvey131 · 2018-12-17T17:09:35Z

I see show improvement for other functions as well. It seems to be faster in any instance where FUN operates on an argument x that is the coredata of each interval. It could be useful as an alternative implementation of period.sum, period.min/max. I also included an example comparing to period.sum, which does not allow x to have more than one column.

This example would not work if the user needed the x argument to FUN to be an xts object, with an index attribute. It is not documented in period.apply if FUN can assume it gets an xts object, or even that x is an xts. You have an example using letters where x is not an xts object .

> library(xts)
> n <- 1e6L
> x <- .xts(runif(n, 1, n), 1:n)
> e <- endpoints(x, "seconds", 5)
> # not exported
> period_apply <- xts:::period_apply
> # period.apply using datatable
> period.apply.DT <- function(x, INDEX, FUN, ...)
+ {
+   dt <- data.table::as.data.table(x)
+   dt$period <- as.numeric(dt$index[INDEX][findInterval(dt$index, dt$index[INDEX], left.open = T)+1])
+   dt <- as.matrix(dt[, lapply(.SD, FUN, ...), by='period', .SDcols = !'index'])
+   setNames(xts(dt[,-1], structure(dt[,1], class = c("POSIXct", "POSIXt"), tz='')), colnames(x))
+ }
> # period.sum using datatable
> period.sum.DT <- function(x, INDEX, na.rm=F)
+ {
+   dt <- data.table::as.data.table(x)
+   dt$period <- as.numeric(dt$index[INDEX][findInterval(dt$index, dt$index[INDEX], left.open = T)+1])
+   dt <- as.matrix(dt[, lapply(.SD, sum, na.rm=na.rm), by='period', .SDcols = !'index'])
+   setNames(xts(dt[,-1], structure(dt[,1], class = c("POSIXct", "POSIXt"), tz='')), colnames(x))
+ }
> #
> my_check <- function(values) { all(sapply(values[-1], function(x) all.equal(values[[1]], x))) }
> # period.apply with FUN=sum
> microbenchmark::microbenchmark(period.apply(x, e, sum),
+                                period_apply(x, e, sum),
+                                period.apply.DT(x, e, sum),
+                                times=3, check=my_check, unit = 's')
Unit: seconds
                       expr       min        lq      mean    median        uq       max neval
    period.apply(x, e, sum) 4.2666698 4.2768328 4.2851441 4.2869959 4.2943812 4.3017665     3
    period_apply(x, e, sum) 0.7217397 0.7342956 0.7661842 0.7468515 0.7884065 0.8299614     3
 period.apply.DT(x, e, sum) 0.1849304 0.1883792 0.1928879 0.1918279 0.1968667 0.2019055     3
> # period.sum
> microbenchmark::microbenchmark(period.sum(x, e),
+                                period.sum.DT(x, e),
+                                times=3, check=my_check, unit = 's')
Unit: seconds
                expr       min        lq      mean   median        uq       max neval
    period.sum(x, e) 4.2866287 4.4034498 4.4782497 4.520271 4.5740602 4.6278495     3
 period.sum.DT(x, e) 0.1248247 0.1333578 0.1644498 0.141891 0.1842624 0.2266339     3
> # period.apply with FUN=min
> microbenchmark::microbenchmark(period.apply(x, e, min),
+                                period_apply(x, e, min),
+                                period.apply.DT(x, e, min),
+                                times=3, check=my_check, unit = 's')
Unit: seconds
                       expr       min        lq      mean    median        uq       max neval
    period.apply(x, e, min) 4.2764956 4.3273872 4.3512772 4.3782788 4.3886679 4.3990571     3
    period_apply(x, e, min) 0.5536805 0.5877314 0.6010733 0.6217822 0.6247697 0.6277571     3
 period.apply.DT(x, e, min) 0.1632318 0.1833229 0.1949875 0.2034139 0.2108654 0.2183168     3
> # period.apply with arbitrary function
> microbenchmark::microbenchmark(period.apply(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+                                period_apply(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+                                period.apply.DT(x, e, function(x) { sqrt(prod(x)) + min(x) }),
+                                times=3, check=my_check, unit = 's')
Unit: seconds
                                                              expr       min        lq      mean    median        uq       max neval
    period.apply(x, e, function(x) {     sqrt(prod(x)) + min(x) }) 4.5865064 4.6597227 4.7162952 4.7329391 4.7811896 4.8294402     3
    period_apply(x, e, function(x) {     sqrt(prod(x)) + min(x) }) 1.1638887 1.1998162 1.2164511 1.2357438 1.2427324 1.2497210     3
 period.apply.DT(x, e, function(x) {     sqrt(prod(x)) + min(x) }) 0.2706527 0.2719362 0.2981492 0.2732197 0.3118975 0.3505753     3

joshuaulrich · 2018-12-19T13:22:21Z

Thanks for the notes about period.sum() and others. That slowdown was due to an unnecessary as.matrix() call. The timings are more in xts' favor once those are removed:

> microbenchmark::microbenchmark(period.sum(x, e),
+                                period.sum.DT(x, e),
+                                times=3, check=my_check)
Unit: milliseconds
                expr      min       lq     mean   median       uq      max neval
    period.sum(x, e) 11.41976 11.47986 17.24279 11.53996 20.15431 28.76866     3
 period.sum.DT(x, e) 59.08953 61.27357 64.23368 63.45761 66.80575 70.15389     3

This example would not work if the user needed the x argument to FUN to be an xts object, with an index attribute. It is not documented in period.apply if FUN can assume it gets an xts object, or even that x is an xts. You have an example using letters where x is not an xts object.

You're correct that it isn't documented that FUN may expect x to be an xts object. But period.apply() calls try.xts() which attempts to convert x to xts, if possible. Practically, it is very likely that FUN expects an object similar to the object passed to period.apply(). Unfortunately, in your faster function that is only the case when the object passed to period.apply() is a data.table.

I'm very surprised at the data.table performance for the arbitrary function. It is faster than simply calling the function 200,000 times (the length of endpoints). I don't feel good about that.

d <- x[1:4]
f <- function(x) { sqrt(prod(x)) + min(x) }
system.time(for(i in seq_len(length(e))) f(d))
#    user  system elapsed 
#   0.576   0.000   0.577 
system.time(period.apply.DT(x, e, f))
#    user  system elapsed 
#    0.34    0.00    0.34

ckatsulis · 2018-12-19T15:40:42Z

Will the new version have that as.matrix call removed?

…

On Wed, Dec 19, 2018 at 7:22 AM Joshua Ulrich ***@***.***> wrote: Thanks for the notes about period.sum() and others. That slowdown was due to an unnecessary as.matrix() call. The timings are more in xts' favor once those are removed: > microbenchmark::microbenchmark(period.sum(x, e),+ period.sum.DT(x, e),+ times=3, check=my_check)Unit: milliseconds expr min lq mean median uq max neval period.sum(x, e) 11.41976 11.47986 17.24279 11.53996 20.15431 28.76866 3 period.sum.DT(x, e) 59.08953 61.27357 64.23368 63.45761 66.80575 70.15389 3 This example would not work if the user needed the x argument to FUN to be an xts object, with an index attribute. It is not documented in period.apply if FUN can assume it gets an xts object, or even that x is an xts. You have an example using letters where x is not an xts object. You're correct that it isn't *documented* that FUN may expect x to be an xts object. But period.apply() calls try.xts() which *attempts* to convert x to xts, if possible. Practically, it is very likely that FUN expects an object similar to the object passed to period.apply(). Unfortunately, in your faster function that is only the case when the object passed to period.apply() is a data.table. ------------------------------ I'm very surprised at the data.table performance for the arbitrary function. It is faster than simply calling the function 200,000 times (the length of endpoints). I don't feel good about that. f <- function(x) { sqrt(prod(x)) + min(x) }R> system.time(for(i in seq_len(length(e))) f(d)) user system elapsed 0.576 0.000 0.577 R> system.time(period.apply.DT(x, e, f)) user system elapsed 0.34 0.00 0.34 — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#278 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AHKwqvDDTcdtL9s7Gw6sfAT_6mn8ZMYFks5u6j2PgaJpZM4YMQ5d> .

The period.XXX() functions (sum, prod, min, max) only work on univariate series, so the as.matrix() call is not necessary. It makes the functions considerably slower, because it converts the xts index into row names for the matrix. See #278.

AndreMikulec · 2019-01-06T08:21:50Z

I can not repeat this test. What is 'd?'

system.time(for(i in seq_len(length(e))) f(d))

joshuaulrich · 2019-01-06T17:42:34Z

@AndreMikulec d is the first 4 rows of x. Sorry for the confusion.

d <- x[1:4]
system.time(for(i in seq_len(length(e))) f(d))
#    user  system elapsed 
#   0.543   0.000   0.544
system.time(period.apply.DT(x, e, f))
#    user  system elapsed 
#    0.31    0.00    0.31

AndreMikulec · 2019-01-10T22:51:06Z

I do not know.I am wild guessing.
Possibly multithreading is occuring.

From
https://github.com/Rdatatable/data.table/blob/d3ccd8139d3d592201e0f085c5f37a4fc5a426e3/NEWS.0.md

Contents

Following gsum and gmean, now gmin and gmax from GForce are also implemented. 
Closes part of #523. Benchmarks are also provided. 
R DT[, list(sum(x), min(y), max(z), .N), by=...] # runs by default using GForce

From
https://github.com/Rdatatable/data.table/blob/master/src/gsumm.c

Contents

. . . 
SEXP gforce(SEXP env, SEXP jsub, SEXP o, SEXP f, SEXP l, SEXP irowsArg)
. . . 
#pragma omp parallel for num_threads(getDTthreads())
. . . 
SEXP ans = PROTECT( eval(jsub, env) );

Benchmarks here are interesting.
From
Rdatatable/data.table#523

joshuaulrich added enhancement Enhancement to existing feature help wanted labels Nov 3, 2018

joshuaulrich added this to the 0.12-0 milestone Jul 4, 2019

joshuaulrich modified the milestones: 0.12-0, 0.12-1 Jul 12, 2020

joshuaulrich modified the milestones: 0.12.1, 0.12.2 Sep 9, 2020

joshuaulrich modified the milestones: 0.12.2, 0.12.3 Oct 12, 2022

joshuaulrich modified the milestones: 0.13.0, 0.13.1 Jan 16, 2023

joshuaulrich removed this from the 0.13.1 milestone Apr 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

period.apply on xts object takes about 50x longer than running it on it's core data and recasting. #278

period.apply on xts object takes about 50x longer than running it on it's core data and recasting. #278

ckatsulis commented Nov 2, 2018 •

edited by joshuaulrich

Loading

joshuaulrich commented Nov 3, 2018 •

edited

Loading

ckatsulis commented Nov 3, 2018 via email

harvey131 commented Dec 14, 2018

joshuaulrich commented Dec 15, 2018 •

edited

Loading

harvey131 commented Dec 17, 2018 •

edited

Loading

joshuaulrich commented Dec 19, 2018 •

edited

Loading

ckatsulis commented Dec 19, 2018 via email

AndreMikulec commented Jan 6, 2019

joshuaulrich commented Jan 6, 2019

AndreMikulec commented Jan 10, 2019

period.apply on xts object takes about 50x longer than running it on it's core data and recasting. #278

period.apply on xts object takes about 50x longer than running it on it's core data and recasting. #278

Comments

ckatsulis commented Nov 2, 2018 • edited by joshuaulrich Loading

Description

Expected behavior

Minimal, reproducible example

Session Info

joshuaulrich commented Nov 3, 2018 • edited Loading

ckatsulis commented Nov 3, 2018 via email

harvey131 commented Dec 14, 2018

joshuaulrich commented Dec 15, 2018 • edited Loading

harvey131 commented Dec 17, 2018 • edited Loading

joshuaulrich commented Dec 19, 2018 • edited Loading

ckatsulis commented Dec 19, 2018 via email

AndreMikulec commented Jan 6, 2019

joshuaulrich commented Jan 6, 2019

AndreMikulec commented Jan 10, 2019

ckatsulis commented Nov 2, 2018 •

edited by joshuaulrich

Loading

joshuaulrich commented Nov 3, 2018 •

edited

Loading

joshuaulrich commented Dec 15, 2018 •

edited

Loading

harvey131 commented Dec 17, 2018 •

edited

Loading

joshuaulrich commented Dec 19, 2018 •

edited

Loading