Laplace the Bayesianista and the Mass of Saturn
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
One of the things that had always intrigued me was something I read about the famous French mathematician Pierre Simon Laplace using Bayes’ theorem to estimate the mass of the planet Saturn in the early nineteenth century and that his method turned out to be remarkably accurate. As a matter of historical fact, Laplace (the Frenchman), working on mathematical theories of probability during the period 1812 to 1816, was entirely unaware of the prior work of Bayes (the Englishman) published posthumously in 1763. Rather, Laplace discovered a more generalized version of Bayes’ theorem on his own. Pr(M | D,I)=Pr(D | M,I)×Pr(M | I)Pr(D | I)
Although many authors refer to Laplace’s work, as well as remarking that his estimate of the mass of Saturn only differs from the modern value by about 0.5%, they never show how Laplace actually applied eqn. (???) to achieve his result. However, I just came across a translation of Laplace’s calculation and I am going to reproduce it here using R. I’ll try to stick with Laplace’s notation as much as possible and explain what it means in modern statistical terminology.
Laplace credits astronomer and former student Monsieur Alexis Bouvard with having already compiled measurements and calculations that facilitated his Bayesian approach. Out of some 129 equations for Saturn’s motion involving perturbations due to the tidal interaction between Jupiter and Saturn, as well as their moons (i.e., orbital wobbles) the following constants are distilled. z′=0.00620
In Laplace’s presentation, the mass of Saturn is then expressed as 13534.08(1+z′)
“The probability that the error of $z^{\prime}$ is comprehended within the limits $\pm~U$ is, by $\S~1$, √P√π∫ du c−P u2,The use of the letter $c$ instead of $e$ in the integral is most likely a typo in translation, since Euler had already introduced the base $e = 2.718282$ in the early part of the eighteenth century and Laplace seems to have used it in other publications. I found that if I identify the quantity $P$ in eqn. (???) (which he calls a weight) with the variance P=12σ2the integral being taken from $u = -U$ to $u = U$.”
“the probability that this mass is comprehended within the limits 13512.3 ± 1100 13534.08About this ratio, Laplace commented elsewhere:is equal to $\dfrac{11327}{11328}$.”
“Applying to them my formulae of probability I find that it is a bet of 11,000 against one that the error of this result is not 1/100 of its value or that which amounts to almost the same that after a century of new observations added to the preceding ones and examined in the same manner the new result will not differ by 1/100 from that of M. Bouvard.”Note the use of bookmaking parlance.
It becomes easier to understand what Laplace is saying if we plot the function represented by eqn. (???).
The normal probability density function is centered at zero and the relevant probability is given by the blue area under the curve between the limits $\pm U$, indicated by the vertical dashed lines. It resembles a plot of the error distribution in the mass calculation. The horizontal blue line is the full width at half max height (FWHM). The horizontal red line is twice the standard deviation ($\sigma$) corresponding to the reduced FWHM. (See Visualizing Variance) Here is the R code that computes the above plot.
# Created by NJG on Saturday, September 14, 2013 # SATURN constants z = 0.00620 P = exp(4.885683 * log(10)) sd = sqrt(1/(2*P)) pSat = 11327 / 11328 # Msat is expressed as solar mass fraction Fnorm <- function(x) { sqrt(P/pi) * exp(-P * x^2) } # Numerically integrate the normal dsn U <- 1/100 prob <- integrate(Fnorm, -U, U) # Compare the probabilities print(prob) print(pSat) # Plot the posterior dsn pts <- 1000 x <- seq(-U,U,length=pts) plot(x, Fnorm(x), type="l", xlab=expression(paste("Error in ", M[Sat])), ylab=expression(paste("Posterior PDF for ", M[Sat])) ) polygon(c(x,rev(x)), c(rep(0,pts),rev(Fnorm(x))), col=colors()[405]) abline(v=0) abline(h=0, v=c(-sd,sd),col="gray") abline(v=c(-U,U),col="blue",lty="dashed") abline(v=c(-4*sd,4*sd),col="red") HM <- sqrt(P/pi)/2 HW <- uniroot(function(x) Fnorm(x) - HM, c(0,1))$root sdHW <- 2 * HW / (2+1/3) sdHH <- 1.2 * HM lines(x=c(-HW, HW), y=c(HM,HM), col="blue", lwd=2) abline(h=HM,lty="dotted") lines(x=c(-sdHW, sdHW), y=c(sdHH, sdHH), col="red", lwd=2) arrows(x0=0,x1=sd,y0=50,y1=50,length = 0.10) text(x=sd/2, y=55, expression(sigma)) arrows(x0=0,x1=4*sd,y0=30,y1=30,length = 0.10) text(x=4*sd/2, y=35, expression(4*sigma))
One of the problems I ran into was determining the value for $U$. Laplace never states explicitly what value he uses. It turns out to be $U = 0.01$, i.e., the $1/100$ or 1% that appears in the interval that contains the mass of Saturn. Calculating the definite integral with those limits produces Pr(MSat | D,I)=0.9999117
"Newton had found, by the observations of Pound on the greatest elongation of the fourth satellite of Saturn, the mass of this planet equal to $1/3012$, that which surpasses by a sixth the preceding result. There are odds of millions of billions against one that the one of Newton is in error."What does all this mean?
- $U = \pm~0.01$ corresponds to an interval slightly less than $\pm ~4 \sigma$.
- It looks like a confidence interval, but it isn't in the modern sense because the number of measurements is essentially $n = 1$.
- Similarly, we can't speak of $M_{Sat}$ as a mean value.
- The CI that corresponds to 99% (i.e., $1 - 0.01$) is closer to $3 \sigma$
- Supposedly, that's the point of the Bayesian approach. You can get away with a good guess for the prior.
Or did Laplace just get lucky?
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.