Acceptance Ratio being 0 #4

zhenglei-gao · 2015-05-29T13:29:59Z

I think the main advantages of rcppbugs compared to JAGS and STAN is that user can have customized function in R in rcppbugs. Users can't define new functions as part of the modelling language in BUGS and JAGS. Although it could be done using low-level prgramming for all 3.

I would like to implement MCMC methods for parameter estimation and inference for a complex computer model (which takes parameters and generates simulated results), in this case TOXSWA.

In the script below, I defined a function toxswa which write the parameter values into input files, then run command line version of toxswa and extract the output. It seems working fine but the acceptance ratio from the ouput chain (300 runs) is 0. I am not sure where it went wrong.

parDat <- data.frame(name=c("kdomwb1","dt50wl","dt50wb"),
val=c(0.2183,100,100),lb=0,ub=Inf,fixed=c(1,0,0))
test.tpl <- "test.tpl"
obsDat <- read.table("CObserved.txt",skip=20)
names(obsDat) <- c("Time","Water","Sediment")
obsDat[1:2,3] <- NA
obsDat
library(reshape)
longObs <- melt(obsDat,id.vars=1) #!variable is obsGrp, value is Observed
con<-file(test.tpl,open="rt")
texts <- readLines(con)
pattern <- "\\Q@\\E[^)]*\\Q@\\E"
## pattern1 <- "\\([^)]*\\)"
toReplace <- regexpr(pattern,texts)
id <- which(toReplace>0)
ind1 <- toReplace[id]
ind2 <- attr(toReplace,"match.length")[id]
tmptext <- strsplit(texts[id],split="")
pnames <- sapply(1:length(id),function(i){
  str <- tmptext[[i]]
  str <- paste(str[(ind1[i]+1):(ind1[i]+ind2[i]-2)],collapse="")
  parNms <- gsub("[[:space:]]", "", str)
  parNms
})
close(con)
optPar <- c("dt50wl"=100,"dt50wb"=100)
logoptPar <- log(optPar)

toxswa <- function(par,fixPar=c("kdomwb1"=0.2183),tofile="C:/TOXSWA/Pesticides/fenamidone/SU.INP",
                  templateText,id,pnames=NULL,resFile="C:/TOXSWA/RUNS/Onga9707/ec001.out",...){
  ## from par to parDat
  allPars <- c(exp(par),fixPar)
  ## Here should make sure that allPars has the same order as pnames.
  if(!is.null(pnames)) if(any(names(allPars)!=pnames)) stop("Parameter vector should be rearranged!")
  ## from parDate to INPUT File
  ## Do not call genInput function to save some time.
  pattern <- "\\Q@\\E[^)]*\\Q@\\E"
  repPars <- as.character(format(allPars))
  texts[id] <- sapply(1:length(id),function(i){
    gsub(pattern,repPars[i],texts[id[i]])
  })
  fw <- file(tofile, open="wt")
  writeLines(texts,fw)
  close(fw)
  system("run.bat")
  ## Extract Ouput!
  yhat <- extractPred(resFile,model="TOXSWA",outlen=152/0.01,outInd=obsDat$Time/0.01+1,writeRes=FALSE)
  return(yhat)
}
longObs <- melt(obsDat,id.vars=1)
y <- longObs$val


### MCMC Part 
library(rcppbugs)
sepWatSed <- FALSE
if(sepWatSed) tau.y <- mcmc.gamma(
  c(sd(obsDat$Water),sd(obsDat$Sediment,na.rm=TRUE)),
  alpha=0.1,beta=0.1)  else tau.y <- mcmc.gamma(sd(y,na.rm=TRUE),alpha=0.1,beta=0.1)  
par <- mcmc.normal(rnorm(2),mu=log(313.7,77.1),tau=0.0001)
y.hat <- deterministic(toxswa,par,fixPar=c("kdomwb1"=0.2183),tofile="C:/TOXSWA/Pesticides/fenamidone/SU.INP",templateText=texts,id=id,pnames=NULL)

y.lik <- mcmc.normal(y,mu=y.hat,tau=tau.y,observed=TRUE)

m <- create.model(par, tau.y, y.hat, y.lik)
#runtime <- system.time(ans <- run.model(m, iterations=1e5L, burn=1e4L, adapt=2e3L, thin=10L))
runtime <- system.time(ans <- run.model(m, iterations=3e2L, burn=0, adapt=200, thin=2L))

print(apply(ans[["par"]],2,mean))
cat("acceptance.ratio:",get.ar(ans),"\n")
print(runtime)

The text was updated successfully, but these errors were encountered:

armstrtw · 2015-05-29T13:48:40Z

usually this happens when your initial values is out of the support range for the respective distribution. the sampler may not be able to figure out how to jump into the support range since it's just a random walk jumper...

I thought I had added some messages about initializing an mcmc.object whith values outside the support but possibly not. I'll have to check the code.

Alternatively the issue could be that the initial value is in the support range, but the variance of the jumper is so bit that all of the random walk jumps go outside the support range. I haven't looked at your code, but you may be able to narrow down the issue just from knowing which nodes have narrow support ranges...

I'm glad you are using the package. I really need to release an update as there have been numerous changes to the backend (the c++ package).

If you can't get the R bit to work you can always code up your model directly in c++ (and then print more diagnostic messages as the chain progresses). Let me know if you want to go that route.

-Whit

zhenglei-gao · 2015-06-01T08:09:46Z

Hi, Whit,

Thanks for your reply. I will try out all possibilities and get back to you. This is still my first try. My skills in c++ are at the very basic level and I will try that as the last step.

Also, I'm not sure if you noticed but in the model function there is a system call which calls another program to do the whole model calculation. In the end I would like to have a general function which can be coupled with any "computer model" and perform inference based on mcmc chains. This is similar to the program PEST which performs parameter estimation and used in industry quite widely.

Another thing in this particular model is that the two parameters are actually highly correlated. One of the reasons that I can guess is that random walk steps are too independent to jump to acceptable values for both parameters. Maybe adding an covariance structure in the priors would help but I am not sure if this can be done in rcppbugs.

zhenglei-gao · 2015-06-01T13:59:05Z

Update: just did a run with simulated dataset instead of the real ones. I also restricted in the simulation that the 2 parameters in the original model must be equal. Then the sampling actually updates well and the acceptance ratio was 0.17 for 100 samples.

I am going to check some other MCMC algorithms to make a simple comparison.

zhenglei-gao · 2015-09-28T16:04:27Z

Now I did a simulation study without calling other computer program. But the acceptance ratio is still 0.

The functions needed for this example is in a package repository at Github.

The Data Looks like

and the results are:

install_github("zhenglei-gao/SedimentWater",args="--no-multiarch")

Simulate Data

library("SedimentWater")
n <- 4
yini <- c(1,rep(0,n))
SWModel <- gen.TOXSWA(n=n)
library(ggplot2)
library(reshape)
theme_set(theme_bw())
##cat(SWModel)
names(yini) <- c("Pw",paste0("Ps",0:(n-1)))
Zs <- 2
Zsed <- rep(Zs/n,n)
names(Zsed) <- paste0("Zs",0:(n-1))
parms <- c(kpwa = log(2)/50, kpse = log(2)/50, Kd = 50,rhob=2.05,theta=0.5,Zw=5,Dp=0.05,delta=1,Zsed)

cToxswaA <- compile.ode(SWModel, language = "C", 
                        parms = parms,y=yini) 
code(cToxswaA)
sigmaS <- c(sigmaW=0.1,sigmaS=0.01)
simData <- simSW(func = cToxswaA, yini = yini, parms = parms, times = seq(0,100,by=10),sigmaS =sigmaS,plotting=TRUE)

Least square optimization

nparms <- length(parms)
res <- nls.lm(par=parms[1:3],lower=rep(0,3),fn=calcResid,data=simData,cModel=cToxswaA,fixParms=parms[4:nparms],fixInis=yini,transParms = NULL,sepn=3)
summary(res)
parms[1:3]
## Plot fitted
fitparms <- parms
fitparms[1:3] <- res$par
dta <- ode (func = cToxswaA, y = yini, parms = fitparms,times = seq(0,100,by=10))
plotSW(dta,origDta=simData)

rcppbugs fit

y <- simData$Concentration
e <- calcResid(par=res$par,data=simData,cModel=cToxswaA,fixParms=parms[4:nparms],fixInis=yini,transParms = NULL,sepn=3)
obsDat <- cast(simData,Time~Media,value="Concentration")
library(rcppbugs)
sepWatSed <- FALSE
if(sepWatSed) tau.y <- mcmc.gamma(
  c(sd(obsDat$Water),sd(obsDat$Sediment,na.rm=TRUE)),
  alpha=0.1,beta=0.1)  else {
    tau.y <- mcmc.gamma(sd(e),alpha=0.1,beta=0.1)  
}
sepn <-3
logpar <- mcmc.normal(rnorm(sepn),mu=log(res$par),tau=1)
logpar <- mcmc.normal(rnorm(sepn),mu=log(c(100,100, 5
)),tau=1)
y.hat <- deterministic(getY,logpar,data=simData,cModel=cToxswaA, fixParms=parms[4:nparms],fixInis=yini,transParms = NULL,sepn=3)

y.lik <- mcmc.normal(y,mu=y.hat,tau=tau.y,observed=TRUE)

m <- create.model(logpar, tau.y, y.hat, y.lik)
#runtime <- system.time(ans <- run.model(m, iterations=1e5L, burn=1e4L, adapt=2e3L, thin=10L))
cat("running model...\n")
runtime <- system.time(ans <- run.model(m, iterations=1e5L, burn=0, adapt=0, thin=10L))

print(apply(ans[["logpar"]],2,mean))
cat("acceptance.ratio:",get.ar(ans),"\n")
print(runtime)

The model defined in cToxswaA is as follows

##   1: #include 
##   2: 
##   3:  
##   4:  static double parms[12];
##   5: #define kpwa parms[0]
##   6: #define kpse parms[1]
##   7: #define Kd parms[2]
##   8: #define rhob parms[3]
##   9: #define theta parms[4]
##  10: #define Zw parms[5]
##  11: #define Dp parms[6]
##  12: #define delta parms[7]
##  13: #define Zs0 parms[8]
##  14: #define Zs1 parms[9]
##  15: #define Zs2 parms[10]
##  16: #define Zs3 parms[11]
##  17: 
##  18: void initpar(void (* odeparms)(int *, double *)){
##  19:     int N =12;
##  20:     odeparms(&N, parms);
##  21: }
##  22: 
##  23: void func ( int * n, double * t, double * y, double * f, double * rpar, int * ipar ) {
##  24: 
##  25:   
##  26:   
##  27:  double  Pw, Ps0, Ps1, Ps2, Ps3 ;
##  28:  
##  29:  double  dPw, dPs0, dPs1, dPs2, dPs3 ;
##  30:  
##  31:  
##  32:   
##  33:  Pw = y[0];
##  34:  Ps0 = y[1];
##  35:  Ps1 = y[2];
##  36:  Ps2 = y[3];
##  37:  Ps3 = y[4];
##  38:   
##  39:   double fdiss= 1/(1+Kd*rhob/theta);
##  40:   dPw = -kpwa*Pw-2*Dp/(Zs0)*(Pw/Zw-fdiss*Ps0/Zs0);
##  41:   dPs0 = -kpse*Ps0+2*Dp/(Zs0)*(Pw/Zw-fdiss*Ps0/Zs0)-2*Dp*fdiss/(Zs0+Zs1)*(Ps0/Zs0-Ps1/Zs1);
##  42: dPs1= -delta*kpse*Ps1+2*Dp*fdiss/(Zs0+Zs1)*(Ps0/Zs0-Ps1/Zs1)-2*Dp*fdiss/(Zs1+Zs2)*(Ps1/Zs1-Ps2/Zs2);
##  43: dPs2= -delta*kpse*Ps2+2*Dp*fdiss/(Zs1+Zs2)*(Ps1/Zs1-Ps2/Zs2)-2*Dp*fdiss/(Zs2+Zs3)*(Ps2/Zs2-Ps3/Zs3);
##  44: dPs3= -delta*kpse*Ps3+2*Dp*fdiss/(Zs2+Zs3)*(Ps2/Zs2-Ps3/Zs3);
##  45:  
##  46:  f[0] = dPw;
##  47:  f[1] = dPs0;
##  48:  f[2] = dPs1;
##  49:  f[3] = dPs2;
##  50:  f[4] = dPs3;
##  51:   
##  52: 
##  53: }

zhenglei-gao · 2015-09-28T16:06:36Z

I am not sure where it went wrong. Optimization produced sensible results but are quite far away from the true values. Is it possible that it is because there are multiple local minima?

armstrtw · 2015-09-28T16:37:10Z

If your acceptance ratio is at zero, then the chain simply cannot progress at all.

I thought that I had checks in place so that one gets a 'stop' error thrown if you construct the initial values outside the support range of the statistic, but perhaps there is a bug.

you have your model posted at that github link? I'll try to have a look.

zhenglei-gao · 2015-09-29T08:20:58Z

Yes, the model and the result should be reproducible from the above code. The R folder in that github link contains the functions used. The initial values are from the least square estimates. I will also try to use other prior variances to see if things change.

Maybe it is a stupid mistake somewhere.

armstrtw · 2015-09-29T09:48:58Z

have you tried STAN for this model?

zhenglei-gao · 2015-09-29T13:13:23Z

No, I haven't yet. Last time I checked there was no suport for ODE integrator yet. But I see now they have a working code brunch. However I need time to figure out how to define and use the ODE integrator in STAN.

zhenglei-gao · 2015-10-05T11:44:36Z

Hi Whit,

I have ran the model in STAN and the chains freeze very often. I asked in stan user mailing list here where I got an answer related to the stiffness of the ODEs. I am not sure whether this caused the freezing of the sampling or the acceptance ratio being 0.

The stan files used to simulate data and define the model were attached in the above link. The R code were also included.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Acceptance Ratio being 0 #4

Acceptance Ratio being 0 #4

zhenglei-gao commented May 29, 2015

armstrtw commented May 29, 2015

zhenglei-gao commented Jun 1, 2015

zhenglei-gao commented Jun 1, 2015

zhenglei-gao commented Sep 28, 2015

zhenglei-gao commented Sep 28, 2015

armstrtw commented Sep 28, 2015

zhenglei-gao commented Sep 29, 2015

armstrtw commented Sep 29, 2015

zhenglei-gao commented Sep 29, 2015

zhenglei-gao commented Oct 5, 2015

Acceptance Ratio being 0 #4

Acceptance Ratio being 0 #4

Comments

zhenglei-gao commented May 29, 2015

armstrtw commented May 29, 2015

zhenglei-gao commented Jun 1, 2015

zhenglei-gao commented Jun 1, 2015

zhenglei-gao commented Sep 28, 2015

zhenglei-gao commented Sep 28, 2015

armstrtw commented Sep 28, 2015

zhenglei-gao commented Sep 29, 2015

armstrtw commented Sep 29, 2015

zhenglei-gao commented Sep 29, 2015

zhenglei-gao commented Oct 5, 2015