How cold is it? A Bayesian attempt to measure temperature
[This article was first published on mages' blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
It is getting colder in London, yet it is still quite mild considering that it is late November. Well, indoors it still feels like 20°C (68°F) to me, but I have been told last week that I should switch on the heating. Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Luckily I found an old thermometer to check. The thermometer showed 18°C. Is it really below 20°C?
The thermometer is quite old and I’m not sure that is works properly anymore. So, what shall I do now? Perhaps I should consider that both measurements are uncertain and try to combine them.
I believe that I can sense the temperature within ±3°C, while I think that the thermometer still works within ±2°C. Assuming that both measurements follow a Gaussian (Normal) distribution, with the uncertainties given as standard deviations, I can use Bayes’ theorem to combine my hypothesis with the data. The posterior distribution will be Gaussian again with conjugated hyper-parameters:
[
mu=left.left(frac{mu_0}{sigma_0^2} + frac{sum_{i=1}^n x_i}{s^2}right)right/left(frac{1}{sigma_0^2} + frac{n}{s^2}right) \
sigma^2=left(frac{1}{sigma_0^2} + frac{n}{s^2}right)^{-1}
]With (K := frac{nsigma_0^2}{s^2+nsigma_0^2} ) this simplifies to:
[
mu = K, bar{x} + (1 – K), mu_0 mbox{, with } bar{x}=frac{1}{n}sum_{i=1}^n x_i\
sigma = s ,sqrt{K/n}
]In my case I have: (n=1,; x_1=18^{circ}C,; s=2^{circ}C,; mu_0=20^{circ}C,; sigma_0=3^{circ}C).
Hence, the posterior distribution has parameters (mu=18.6^{circ}C) and (sigma=1.7^{circ}C). Thus, my best guess would be that is actually a little colder than I thought. One could argue that the probability that is below 20° is 80%.
Over the last five days my perception of the temperature didn’t change, neither did the weather forecast, but the measurements showed: 18°C, 19°C, 17.5°C, 18°C, 18.5°C.
With that information the parameters update to (mu=18.3^{circ}C) and (sigma=0.9^{circ}C). I can’t deny it any longer it has got colder. The probability that is below 20°C is now at 97% and the heating is on.
Without any prior knowledge I may have used a t-test to check the measurements. But here I believe that I have information about the thermometer and my own temperature sensing abilities which I don’t want to ignore.
R code
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
temp1 <- 18 | |
K1 <- 1*9/(4+1*9) | |
(mu1 <- K1*mean(temp1) + (1 - K1)*20) | |
# 18.61538 | |
(sig1 <- 2*sqrt(K1/1)) | |
# 1.664101 | |
pnorm(20, mu1, sig1) | |
# 0.7973097 | |
curve(dnorm(x, mu1, sig1), 14, 23, | |
ylim=c(0,0.5), | |
xlab="Temperatur", | |
ylab="Density", | |
main="How cold is it?") | |
temp5 <- c(18, 19, 17.5, 18, 18.5) | |
K5 <- 5*9/(4+5*9) | |
(mu5 <- K5*mean(temp5) + (1-K5)*20) | |
# 18.34694 | |
(sig5 <- 2*sqrt(K5/5)) | |
# 0.8571429 | |
pnorm(20, mu5, sig5) | |
# 0.973108 | |
curve(dnorm(x, mu5, sig5), 14, 23, | |
col=2, add=TRUE) | |
legend("topright", bty="n",col=c(1,2), lty=1, | |
legend=c("One measurement","Five measurements")) | |
# t-test, assuming no prior knowledge | |
t.test(temp5, mu=20) | |
# | |
# One Sample t-test | |
# | |
# t = -7.0602, df = 4, p-value = 0.002123 | |
# data: temp5 | |
# alternative hypothesis: true mean is not equal to 20 | |
# 95 percent confidence interval: | |
# 17.49214 18.90786 | |
# sample estimates: | |
# mean of x | |
# 18.2 | |
# Bayesian t-test | |
library(BayesianFirstAid) | |
bayes.t.test(temp5, mu=20) | |
# |**************************************************| 100% | |
# | |
# Bayesian estimation supersedes the t test (BEST) - one sample | |
# | |
# data: temp5, n = 5 | |
# | |
# Estimates [95% credible interval] | |
# mean of temp5: 18 [17, 19] | |
# sd of temp5: 0.72 [0.24, 1.9] | |
# | |
# The mean is more than 20 by a probability of 0.003 | |
# and less than 20 by a probability of 0.997 |
Session Info
R version 3.1.2 (2014-10-31) Platform: x86_64-apple-darwin13.4.0 (64-bit) locale: [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: [1] BayesianFirstAid_0.1 rjags_3-14 coda_0.16-1 [4] lattice_0.20-29 loaded via a namespace (and not attached): [1] grid_3.1.2 MASS_7.3-35 mnormt_1.5-1 stringr_0.6.2
To leave a comment for the author, please follow the link and comment on their blog: mages' blog.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.