Possible inefficency of `$unconstrain_variables()` #920

crsh · 2024-02-12T19:40:52Z

For context, I have implemented methods to support CmdStanFit-objects in bridgesampling::bridge_sampler(). While the new methods work as expected, they are very slow compared to the stanfit-methods for rstan. On my 2017 MacBook Pro, My example takes about 1.000 ms with rstan::unconstrain_pars() but almost 20.000 ms with fit$unconstrain_pars():

library("cmdstanr")
library("rstan")
library("microbenchmark")

set.seed(12345)

mu <- 0
tau2 <- 0.5
sigma2 <- 1

n <- 20
theta <- rnorm(n, mu, sqrt(tau2))
y <- rnorm(n, theta, sqrt(sigma2))

### set prior parameters ###
mu0 <- 0
tau20 <- 1
alpha <- 1
beta <- 1

# models
stan_code_H0 <- 'data {
int<lower=1> n; // number of observations
vector[n] y; // observations
real<lower=0> alpha;
real<lower=0> beta;
real<lower=0> sigma2;
}
parameters {
real<lower=0> tau2; // group-level variance
vector[n] theta; // participant effects
}
model {
target += inv_gamma_lpdf(tau2 | alpha, beta);
target += normal_lpdf(theta | 0, sqrt(tau2));
target += normal_lpdf(y | theta, sqrt(sigma2));
}
'

# RStan
rstan_model_H0 <- suppressWarnings(
  stan_model(model_code = stan_code_H0, model_name="stanmodel")
)

rstan_object_H0 <- sampling(
  rstan_model_H0, data = list(y = y, n = n, alpha = alpha, beta = beta, sigma2 = sigma2),
  iter = 3500, warmup = 500, chains = 4, show_messages = FALSE, refresh = 0
)

# CmdStanR
cmd_stan_model_H0 <- write_stan_file(stan_code_H0) |>
  cmdstan_model(force_recompile = TRUE)

cmdstan_fit_H0 <- cmd_stan_model_H0$sample(
  data = list(y = y, n = n, alpha = alpha, beta = beta, sigma2 = sigma2)
  , iter_sampling = 3500, iter_warmup = 500, chains = 4, show_messages = FALSE, refresh = 0
)

cmdstan_fit_H0$init_model_methods()

# Compare unconstraining
microbenchmark(
  upars <- cmdstan_fit_H0$unconstrain_draws(draws = cmdstan_fit_H0$draws(format = "draws_df"))
  , {
    pars <- rstan::extract(rstan_object_H0, permute = FALSE)
    skeleton <- rstan:::create_skeleton(rstan_object_H0@model_pars, rstan_object_H0@par_dims)
    upars <- apply(pars, 1:2, FUN = function(theta) {
      unconstrain_pars(rstan_object_H0, rstan:::rstan_relist(theta, skeleton))
    })
  }
  , times = 3
)

        min         lq       mean    median        uq       max neval cld
 19371.8549 20333.6410 20795.4397 21295.427 21507.232 21719.037     3   a 
   806.4729   913.1353   998.0147  1019.798  1093.786  1167.774     3   b

Note that this does not include the additional compilation time required by $init_model_methods().

I have spent quite some time profiling this issue and from what I can tell, it is this line that is responsible for most of the difference:

cmdstanr/R/fit.R

Line 526 in d3b455f

    
           private$model_methods_env_$unconstrain_variables(private$model_methods_env_$model_ptr_, stan_pars)

I know next to nothing about C++, so I was wondering whether this is an issue with the implementation of $unconstrain_variables(), whether there is a problem with my setup, or whether I'm possibly making a mistake somewhere. Any pointers would be appreciated.

CmdStanR version number

packageVersion("cmdstanr")

[1] ‘0.7.1’

The text was updated successfully, but these errors were encountered:

jgabry · 2024-04-22T22:44:58Z

Thanks for reporting and sorry for the delay. @andrjohns is this speed difference to be expected?

andrjohns · 2024-05-04T22:08:57Z

@crsh Thanks for flagging! I've implemented fixes/optimisations via #960, and your example is now benchmarking for me as:

> microbenchmark(
+   upars <- cmdstan_fit_H0$unconstrain_draws()
+   , times = 3
+ )
Unit: milliseconds
                                        expr      min      lq     mean   median       uq      max neval
 upars <- cmdstan_fit_H0$unconstrain_draws() 14.29379 14.7255 15.84829 15.15721 16.62554 18.09387     3

And also just to flag that you can just call fit$unconstrain_draws() if you only want to unconstrain the internal draws. Passing an argument is only needed if you want to use the current model to unconstrain a different set of draws

andrjohns · 2024-05-06T04:21:39Z

Closed by #960

crsh mentioned this issue Feb 12, 2024

Adds support for CmdStanFit objects. #27 quentingronau/bridgesampling#38

Open

andrjohns mentioned this issue May 4, 2024

Improve efficiency of model methods, tidy code #960

Merged

2 tasks

andrjohns closed this as completed May 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible inefficency of `$unconstrain_variables()` #920

Possible inefficency of `$unconstrain_variables()` #920

crsh commented Feb 12, 2024 •

edited

Loading

jgabry commented Apr 22, 2024

andrjohns commented May 4, 2024

andrjohns commented May 6, 2024

Possible inefficency of $unconstrain_variables() #920

Possible inefficency of $unconstrain_variables() #920

Comments

crsh commented Feb 12, 2024 • edited Loading

jgabry commented Apr 22, 2024

andrjohns commented May 4, 2024

andrjohns commented May 6, 2024

Possible inefficency of `$unconstrain_variables()` #920

Possible inefficency of `$unconstrain_variables()` #920

crsh commented Feb 12, 2024 •

edited

Loading