[This was originally going to be about multilevel modelling using INLA, but to do that flexibly you need INLA’s observation matrix mechanism, and this one deserves an explanation in its own right, so I’m introducing it by way of an easier example (for which you don’t actually need the observation matrix mechanism, but it’s clearer that way and it generalises better). For more on INLA, see the INLA website or this previous post.]
In Ramsay and Silverman’s Functional Data Analysis there’s a chapter devoted to doing regression when what you are trying to predict is a scalar quantity and what you have as covariate (input) is a function. For example, you might have measured someone’s heart rate during an effort test (which can be thought of as a function $x_i(t)$ of time), and you want to predict how fast they will run a marathon (a scalar $y_i$). The linear functional regression model is

$y_i = \alpha + \int \beta(t)\,x_i(t)\,\mathrm{d}t + \epsilon_i$

with the usual assumption that the noise is Gaussian, $\epsilon_i \sim \mathcal{N}(0,\sigma^2)$.
The linear functional regression problem actually comes up quite a bit: in neuroscience and psychology, reverse correlation and classification image problems can be viewed as more or less exactly the same thing (trying to predict a neuron’s linear response from an audio signal is exactly the marathon example, with a different likelihood function). In other fields, it might be known as a signal reconstruction problem, or as an integral equation problem. In signal reconstruction we now think of the function itself as the unknown quantity, and of the data as noisy linear measurements of it, for example Fourier coefficients.
So suppose we have measured a bunch of Fourier coefficients. If the measurements are noisy we can’t just use the inverse Fourier transform: there is some uncertainty involved. The Bayesian way would be to set a prior on the underlying signal and compute its posterior given the noisy coefficients.
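Written out for the cosine measurements used below, and assuming Gaussian measurement noise with standard deviation $\sigma$, the model is

$y_k = \int f(t)\,\cos(2\pi k t)\,\mathrm{d}t + \epsilon_k, \qquad \epsilon_k \sim \mathcal{N}(0,\sigma^2),$

and we want the posterior over the whole function $f$ given $y_1,\ldots,y_m$. This is exactly the kind of structure INLA is built for.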
In latent Gaussian models you have, on the one hand, a set of latent variables $\mathbf{z}$ with a jointly Gaussian prior:

$\mathbf{z} \sim \mathcal{N}\left(\mathbf{0},\,\mathbf{Q}^{-1}\right)$ (the prior)

and:

$y_i \mid \eta_i \sim p(y_i \mid \eta_i)$ (the likelihood)

where $\boldsymbol{\eta} = \mathbf{A}\mathbf{z}$ is the vector of linear predictors, obtained by applying an observation matrix $\mathbf{A}$ to the latent variables. Usually $\mathbf{A}$ is just the identity matrix, so that each data point $y_i$ has its own latent variable $z_i$ as its linear predictor. But $\mathbf{A}$ can be any matrix, and that is exactly what we need here: each measurement is a weighted sum (a discretised integral) of the values of the unknown function.
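To make the role of the observation matrix concrete, here is a minimal sketch in plain R (the numbers are made up): the latent field has five components, but we only observe three noisy linear combinations of them.

## Toy illustration of the observation matrix mechanism
set.seed(1)
z   <- rnorm(5)                    # latent field (5 values)
A   <- matrix(runif(3 * 5), 3, 5)  # observation matrix: 3 measurements of 5 latent values
eta <- A %*% z                     # linear predictor, one entry per observation
y   <- eta + 0.1 * rnorm(3)        # noisy observations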
Here’s a concrete example. We take a known test signal $f(t)$ (the function fun in the code below) and pretend that all we get to observe are noisy linear measurements of it.
Our measurement functions will be cosines at various frequencies $k$ and phases $\varphi$ (a phase of $\pi/2$ turns the cosine into a sine), so that each measurement is a noisy Fourier-type coefficient of the signal.
To get the equivalent of a discrete Fourier transform we can restrict the frequencies to integer values, but nothing forces us to: non-integer frequencies work just as well.
We construct the matrix $\mathbf{A}$ by evaluating each measurement function on a discrete time grid $t_1,\ldots,t_n$ and dividing by the number of grid points, so that row $i$ of $\mathbf{A}$ applied to the vector $(f(t_1),\ldots,f(t_n))$ is a discretised version of the dot product between $f$ and the $i$-th measurement function.
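As a quick sanity check, here is a sketch of what the discretisation buys us, assuming the fun, mfun and measurement.matrix definitions from the full code at the end of the post: the matrix product should be close to the exact integrals.

## Check that A %*% f(ts) approximates the integrals of f times each measurement function
ts <- seq(0, 1, l = 100)
A  <- measurement.matrix(ts, ks = c(1, 2, 3), phis = c(0, 0, 0))
as.vector(A %*% fun(ts))  # discretised dot products
sapply(1:3, function(k) integrate(function(t) fun(t) * mfun(t, k, 0), 0, 1)$value)  # exact integrals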
[Image of the measurement matrix. This image makes my eyes water.]
R code to generate data looks like this:
#Generate random measurement data: noisy dot products between the signal
#and the measurement functions, approximated on a fine time grid
gendat <- function(ks, phis, noise.sigma = 0.05, n.grid = 1000) {
    nMes <- length(ks)
    tfine <- seq(0, 1, l = n.grid)
    compute.dotprod <- function(ind) mean(fun(tfine) * mfun(tfine, ks[ind], phis[ind]))
    mes <- sapply(1:nMes, compute.dotprod)
    data.frame(k = ks, y = mes + noise.sigma * rnorm(length(ks)))
}
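A quick usage example (with the argument names used above, and assuming fun and mfun from the full code at the end):

dat <- gendat(ks = 1:5, phis = rep(0, 5), noise.sigma = 0.05)  # five cosine-phase measurements
head(dat)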
The INLA call I use is:
res <- inla(y ~ -1 + f(ind, model = "rw2"),
            data = list(y = dat$y, ind = 1:length(ts)),
            control.predictor = list(A = A, compute = TRUE))
This bit uses a “rw2” (second-order random walk) model to enforce smoothness, and a variable “ind” which corresponds to the time index (it corresponds to the discretisation grid $t_1,\ldots,t_n$ on which the latent function is defined, not to the measurements). The measurement matrix $\mathbf{A}$ does not appear in the formula itself: it is passed through control.predictor, which tells INLA that the linear predictor for the data is $\mathbf{A}$ times the latent field (*).
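Once the model has run, res$summary.linear.predictor stacks the fitted measurements (the first nMes rows) on top of the posterior summary of the latent function on the time grid. Here is a minimal sketch of how to pull out the estimate of f(t), which is essentially what plot.functions below does (dat, ts and res as in the call above):

## Posterior summary of f(t): drop the first nMes rows (the fitted measurements)
## and keep the rows corresponding to the latent time grid
nMes  <- length(dat$y)
f.hat <- res$summary.linear.predictor[-(1:nMes), ]
plot(ts, f.hat[, 1], type = "l", xlab = "Time (sec.)", ylab = "f(t)")  # posterior mean
lines(ts, f.hat[, 3], lty = 2)  # 2.5% quantile
lines(ts, f.hat[, 5], lty = 2)  # 97.5% quantile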
We take measurements for a sequence of (non-integer) frequencies, at both cosine and sine phases, and add Gaussian noise. Here are the measurements along with the fitted values (the output of plot.measurements in the code below):
And here is the estimated signal, along with the true one (the output of plot.functions):
If you remove measurement noise, you’ll have the world’s slowest discrete Fourier transform! It’s actually interesting to play around with this example a bit, because there are plenty of ways to make it go completely wrong (for example when removing sine measurements).
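For the (nearly) noiseless version, something along these lines should work, assuming the run.example wrapper below with its noise.sigma argument:

## Near-noiseless measurements: the posterior mean essentially reproduces
## a very slow discrete Fourier transform of the signal
res0 <- run.example(noise.sigma = 1e-6)
plot.functions(res0)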
(*) Not only would it be clearer to have the observation matrix appear directly in the formula, but it would also make it possible to write formulas with several terms, each with its own observation matrix, which would define two latent fields with different dimensions. On the other hand, I imagine it would be a lot harder to parse.
Here’s the full R code:
##An example of functional regression in INLA
##Author: Simon Barthelmé, TU Berlin, 2012
##--------------------------------------------
##Source the file, then run
##---
##res <- run.example()
##plot.measurements(res)
##plot.functions(res)
##---
##to generate the figures in the blog post (give or take some noise)

require(INLA)
require(ggplot2)
require(plyr)

#The "signal" we're measuring (any reasonably smooth test function will do)
fun <- function(t) sin(2*pi*t) + 0.5*cos(6*pi*t)

#The measurement functions: k = freq., phi = phase
#phi = pi/2 corresponds to a sine function
mfun <- function(t, k, phi) cos(2*pi*k*t - phi)

#Generate random measurement data: noisy dot products between the signal
#and the measurement functions, approximated on a fine time grid
gendat <- function(ks, phis, noise.sigma = 0.05, n.grid = 1000) {
    nMes <- length(ks)
    tfine <- seq(0, 1, l = n.grid)
    compute.dotprod <- function(ind) mean(fun(tfine) * mfun(tfine, ks[ind], phis[ind]))
    mes <- sapply(1:nMes, compute.dotprod)
    data.frame(k = ks, y = mes + noise.sigma * rnorm(length(ks)))
}

#Generate a discrete measurement matrix: one row per measurement function,
#evaluated on the time grid ts and divided by the number of grid points
measurement.matrix <- function(ts, ks, phis) {
    t(sapply(1:length(ks), function(ind) mfun(ts, ks[ind], phis[ind]) / length(ts)))
}

#Generate some data and run INLA on them
run.example <- function(nMes = 40, noise.sigma = 0.05) {
    #Half the measurements at cosine phase, half at sine phase,
    #over a range of (mostly non-integer) frequencies
    ks <- rep(seq(0.5, 10, l = nMes/2), 2)
    phis <- rep(c(0, pi/2), each = nMes/2)
    #A is the measurement matrix
    ts <- seq(0, 1, l = 100)
    A <- measurement.matrix(ts, ks, phis)
    dat <- gendat(ks, phis, noise.sigma)
    res <- inla(y ~ -1 + f(ind, model = "rw2"),
                data = list(y = dat$y, ind = 1:length(ts)),
                control.predictor = list(A = A, compute = TRUE))
    res$ks <- ks
    res$phis <- phis
    res$ts <- ts
    res$data <- dat
    res
}

#Plot the measurements along with their fitted values
plot.measurements <- function(res) {
    nMes <- length(res$ks)
    df.measure <- data.frame(k = res$ks,
                             sine = factor((res$phis == pi/2), lab = c('cosine phase', 'sine phase')),
                             fitted.mean = res$summary.linear.predictor[(1:nMes), 1],
                             fitted.low = res$summary.linear.predictor[(1:nMes), 3],
                             fitted.up = res$summary.linear.predictor[(1:nMes), 5],
                             measured = res$data$y)
    ggplot(df.measure, aes(x = k)) +
        geom_ribbon(aes(ymin = fitted.low, ymax = fitted.up), alpha = .1) +
        geom_line(aes(y = fitted.mean)) +
        geom_point(aes(y = measured), col = "darkblue") +
        theme_bw() +
        labs(y = "Measurements", x = "Frequency (Hz)") +
        facet_wrap(~ sine)
}

#Plot the estimated function along with the true one
plot.functions <- function(res) {
    nMes <- length(res$ks)
    #In function space: the remaining rows of the linear predictor are f(t)
    df.function <- data.frame(t = res$ts,
                              fitted.mean = res$summary.linear.predictor[-(1:nMes), 1],
                              fitted.low = res$summary.linear.predictor[-(1:nMes), 3],
                              fitted.up = res$summary.linear.predictor[-(1:nMes), 5],
                              true = fun(res$ts))
    ggplot(df.function, aes(x = t)) +
        geom_ribbon(aes(ymin = fitted.low, ymax = fitted.up), alpha = .1) +
        geom_line(aes(y = fitted.mean)) +
        geom_line(aes(y = true), col = "darkred") +
        theme_bw() +
        labs(x = "Time (sec.)", y = "f(t)")
}