Site icon R-bloggers

an attempt at EP-ABC from scratch, nothing more… [except for a few bugs]

[This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Following a request from one of the reviewers of our chapter Likelihood-free model choice, I tried to run EP-ABC on a toy problem and to compare it with the outcome of a random forest ABC. Literally starting from scratch, namely from the description found in Simon and Nicolas’ JASA paper.  To run my test, I chose as my modelled data an exponential Exp(λ) sample of size 20, with a Gaussian N(0,1) prior on the log parameter (here replicated 100 times):

n=20
trueda=matrix(rexp(100*n),ncol=n)
#prior values
er0=0
Q0=1

Then I applied the formulas found in the paper for approximating the evidence, first defining the log normalising constant for an unnormalised Gaussian density as on the top of p.318

Psi=function(er,Q){
  -.5*(log(Q/2/pi)-Q*er*er)}

[which exhibited a typo in the paper, as Simon Barthelmé figured out after emails exchanges that the right formulation was the inverse]

Psi=function(er,Q){
  -.5*(log(Q/2/pi)-er*er/Q)}

and iterating the site updates as described in Sections 2.2, 3.1 and Algorithms 1 and 3:

ep=function(dat,M=1e4,eps,Nep=10){
 n=length(dat)
 #global and ith natural parameters for Gaussian approximations
 #initialisation:
 Qi=rep(0,n)
 Q=Q0+sum(Qi)
 eri=rep(0,n)
 er=er0+sum(eri)
 #constants Ci in (2.6)
 normz=rep(1,n)
 for (t in 1:Nep){
  for (i in sample(1:n)){
  #site i update
   Qm=Q-Qi[i] #m for minus i
   erm=er-eri[i]
   #ABC sample
   thetas=rnorm(M,me=erm/Qm,sd=1/sqrt(Qm))
   dati=rexp(M,rate=exp(thetas))
#update by formula (3.3)
   Mh=sum((abs(dati-dat[i])<eps))
   mu=mean(thetas[abs(dati-dat[i])<eps])
  sig=var(thetas[abs(dati-dat[i])<eps])
  if (Mh>1e3){
#enough proposals accepted
   Qi[i]=1/sig-Qm
   eri[i]=mu/sig-erm
   Q=Qm+Qi[i]
   er=erm+eri[i]
#update of Ci as in formula (2.7)
#with correction bottom of p.319
  normz[i]=log(Mh/M/2/eps)- Psi(er,Q)+Psi(erm,Qm)
  }}}
#approximation of the log evidence as on p.318
return(sum(normz)+Psi(er,Q)-Psi(er0,Q0)) }

except that I made an R beginner’s mistake (!) when calling the normal simulation to be

thetas=rnorm(M,me=erm/Qm)/sqrt(Qm)

to be compared with the genuine evidence under a conjugate Exp(1) prior [values of the evidence should differ but should not differ that much]

ev=function(dat){
 n=length(dat)
 lgamma(n+1)-(n+1)*log(1+sum(dat))
 }

After correcting for those bugs and too small sample sizes, thanks to Simon kind-hearted help!, running this code results in minor discrepancies between both quantities:

evid=ovid=rep(0,100)
M=1e4 #number of ABC samples
propep=.1 #tolerance factor
for (t in 1:100){
 #tolerance scaled by data
 epsi=propep*sd(trueda[t,])
 evid[t]=ep(trueda[t,],M,epsi)
 ovid[t]=ev(trueda[t,])
 }
plot(evid,ovid,cex=.6,pch=19)

as shown by the comparison below between the evidence and the EP-ABC approximations to the evidence, called epabence (the dashed line is the diagonal):

Obviously, this short (if lengthy at my scale) experiment is not intended to conclude that EP-ABC work or does not work. (Simon also provided additional comparison with the genuine evidence, that is under the right prior, and with the zero Monte-Carlo version of the EP-ABC evidence, achieving a high concentration over the diagonal.) I just conclude that the method does require a certain amount of calibration to become stable. Calibrations steps that are clearly spelled out in the paper (adaptive choice of M, stabilisation by quasi-random samples, stabilisation by stochastic optimisation steps and importance sampling), but also imply that EP-ABC is also “far from routine use because it make take days to tune on real problems” (pp.315-316). Thinking about the reasons for such discrepancy over the past days, one possible explanation I see for the impact of the calibration parameters is that the validation of EP if any occurs at a numerical rather than Monte Carlo level, hence that the Monte Carlo error must be really small to avoid interfering with the numerical aspects.


Filed under: R, Statistics, University life Tagged: ABC, computer experiment, expectation-propagation, Gaussian approximation, implementation, Monte Carlo Statistical Methods, R

To leave a comment for the author, please follow the link and comment on their blog: R – Xi'an's Og.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.