Probable Points and Credible Intervals, Part 1
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
After having broken the Bayesian eggs and prepared your model in your statistical kitchen the main dish is the posterior. The posterior is the posterior is the posterior, given the model and the data it contains all the information you need and anything else will be a little bit less nourishing. However, taking in the posterior in one gulp can be a bit difficult, in all but the most simple cases it will be multidimensional and difficult to plot. But even if it is one-dimensional and you could plot it (as, say, a density plot) that does not necessary mean that it is easy to see what’s going on.
One way of getting around this is to take a bite at a time and look at summaries of the marginal posteriors of the variables of interest, the two most common type of summaries being point estimates and credible intervals (an interval that covers a certain percentage of the probability distribution). Here one is faced with a choice, which of the many ways of constructing point estimates and credible intervals to choose? This is a perfectly good question that can be given an unhelpful answer (with a predictable follow-up question):
– That depends on your loss function.
– So which loss function should I use?
The reason for this exchange is that most summaries of the posterior can be seen as minimizing a loss given one or another loss function. This way of viewing posterior summaries is part of statistical decision theory, and is useful, coherent and the topic of many books (and the possible forthcoming Part 2 of this blog post).
One can, however, also view posterior summaries as just graphical summaries. That is, as compact, convenient ways of looking at the posterior, well knowing that these summaries are not the whole picture, just a convenient graphical representation. This post will go through the following six common point estimates and credible intervals from this perspective: The posterior mode, median and mean, and the standard deviation of the posterior, the quantile interval and the highest density interval. I will use the following hypothetical posteriors to showcase these summaries:
These distributions are more or less symmetric, skewed, bi-modal and short-tailed. They are also more or less commonly encountered, with the top-left symmetric heap shaped distribution being the archetypal well behaved posterior and the lower-right distribution being extremely badass bi-modal.
The Mode and Highest Density Interval
If you would represent a density plot by a point and an interval a reasonable choice would be to use the mode (the highest point) and the interval that contains the highest part of the density. Below is the mode and the highest density interval covering 95% of the probability density for the six example posteriors:
It is, of course, up to you whether you think these points and intervals represent the underlying probability densities well or not. It obviously does not work well in the bi-modal case and it might be a bit strange that the point estimate for the upper-middle distribution is pushed all the way to the left, but otherwise I think it works pretty well. For most of the densities I get a pretty good idea regarding what the underlying posterior looks like.
There is no function in base R to directly calculate these measures given a sample s
representing the posterior. A quick-n-dirty function for estimating the mode can be defined by taking the maximum of a density estimate, like this:
estimate_mode <- function(s) { d <- density(s) d$x[which.max(d$y)] } estimate_mode(s)
The are also many functions for estimating the mode in the modeest
package, for example, the half sample mode:
library(modeest) mlv(s, method = "HSM")