
Bayes vs. the Invaders! Part Four: Convergence

[This article was first published on Weird Data Science, and kindly contributed to R-bloggers.]

Sealed and Buried

In the previous three posts, we have assembled and refined our negative binomial model binding UFO sightings to human population. Most importantly, our negative binomial GLM employs a log link function to relate the linear model to the data:

$$\begin{eqnarray}
y &\sim& \mathbf{NegBinomial}(\mu, \phi)\\
\log(\mu) &=& \alpha + \beta x\\
\alpha &\sim& \mathcal{N}(0, 1)\\
\beta &\sim& \mathcal{N}(0, 1)\\
\phi &\sim& \mathbf{HalfCauchy}(2)
\end{eqnarray}$$
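For the record, this specification translates into Stan along the following lines; a minimal sketch in the spirit, if not necessarily the letter, of the code from earlier posts, with `x` the scaled population and `y` the sighting counts:

```stan
data {
  int<lower=1> N;           // number of observations
  vector[N] x;              // scaled population
  array[N] int<lower=0> y;  // sighting counts
}
parameters {
  real alpha;               // intercept
  real beta;                // slope
  real<lower=0> phi;        // dispersion
}
model {
  alpha ~ normal(0, 1);
  beta  ~ normal(0, 1);
  phi   ~ cauchy(0, 2);     // half-Cauchy via the lower bound on phi
  y ~ neg_binomial_2_log(alpha + beta * x, phi);  // log link applied to the mean
}
```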

In a basic linear regression, \(y=\alpha+\beta x\), the \(\alpha\) parameter can be interpreted as the value of \(y\) when \(x\) is 0. Increasing the value of \(x\) by 1 results in a change in the \(y\) value of \(\beta\). We have, however, been drawn far beyond such naive certainties.

The \(\alpha\) and \(\beta\) coefficients in our negative binomial GLM instead produce the \(\log\) of the expected \(y\) value: the mean, \(\mu\), of the negative binomial in our parameterisation.

With a simple rearrangement, we can begin to understand the grim effects of this transformation:

$$\begin{eqnarray}
\log(\mu) &=& \alpha + \beta x\\
\Rightarrow \quad \mu &=& \operatorname{e}^{\alpha + \beta x}
\end{eqnarray}$$

If we set \(x=0\):

$$\begin{eqnarray}
\mu_0 &=& \operatorname{e}^{\alpha}
\end{eqnarray}$$

The mean of the negative binomial when \(x\) is 0 is therefore \(\operatorname{e}^{\alpha}\). If we increase the value of \(x\) by 1:

$$\begin{eqnarray}
\mu_1 &=& \operatorname{e}^{\alpha + \beta}\\
&=& \operatorname{e}^{\alpha} \operatorname{e}^{\beta}
\end{eqnarray}$$

Which, if we recall the definition of the underlying mean of our model’s negative binomial, \(\mu_0\), above, is:
$$\mu_0 \operatorname{e}^{\beta}$$

The effect of an increase in \(x\) is therefore multiplicative with a log link: each increase of \(x\) by 1 causes the mean of the negative binomial to be further multiplied by \(\operatorname{e}^{\beta}\).
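A few lines of R make this concrete; the coefficient values here are purely illustrative, not fitted estimates:

```r
# Illustrative (not fitted) coefficient values
alpha <- 0.5
beta  <- 0.2

mu_0 <- exp(alpha)         # mean of the negative binomial when x = 0
mu_1 <- exp(alpha + beta)  # mean when x = 1

mu_1 / mu_0                # 1.2214...
exp(beta)                  # identical: each unit of x multiplies the mean by e^beta
```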

Despite this insidious complexity, in many senses our naive interpretation of these values still holds true. A higher value for the \(\beta\) coefficient does mean that the rate of sightings increases more swiftly with population.

With the full, unsettling panoply of US States laid out before us, any attempt to elucidate their many and varied deviations would be overwhelming. Broadly, we can see that both slopes and intercepts are generally restricted to a fairly close range, with the 50% and 95% credible intervals notably overlapping in many cases. Despite this, there are certain unavoidable abnormalities from which we cannot — must not — shrink.

We can conclude, then, that while the total number of sightings in Delaware are currently low, any increase in numbers of residents there appears to possess a strange fascination for visitors from beyond the inky blackness of space. By contrast, whilst our alien observers have devoted significant resources to monitoring Utah and California, their apparent willingness to devote further effort to tracking those states’ burgeoning populations is low.

Trembling Uncertainty

One of the fundamental elements of the Bayesian approach is its willing embrace of uncertainty. The output of our eldritch inferential processes is not a point estimate of the outcome, as in certain other approaches, but instead a posterior predictive distribution over those outcomes. As such, when we turn our minds to predicting new outcomes based on previously unseen data, our result is a distribution over possible values rather than a single estimate. Thus, at the dark heart of Bayesian inference is the belief that all uncertainty can be quantified as probability distributions.
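As a brief sketch of how such a posterior predictive distribution might be drawn in R (assuming a fitted rstan object `fit` exposing the `alpha`, `beta`, and `phi` parameters above, and a hypothetical new predictor value `x_new`):

```r
library(rstan)

# Extract posterior draws for the model parameters (assumes a stanfit object `fit`)
draws <- rstan::extract(fit, pars = c("alpha", "beta", "phi"))

x_new <- 1.5  # hypothetical scaled population value

# One negative binomial draw per posterior sample yields a full predictive distribution;
# R's rnbinom takes the dispersion phi as `size` and the mean as `mu`
mu    <- exp(draws$alpha + draws$beta * x_new)
y_rep <- rnbinom(length(mu), size = draws$phi, mu = mu)

# Summarise the distribution rather than reporting a single point estimate
quantile(y_rep, probs = c(0.025, 0.5, 0.975))
```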

The Bayesian approach as inculcated here has a predictive bent to it. These intricate methods lend themselves to forecasting a distribution of possibilities before the future unveils itself. Here, we gain a horrifying glimpse into the emerging occurrence of alien visitations to the US as its people busy themselves about their various concerns, scrutinised and studied, perhaps almost as narrowly as a man with a microscope might scrutinise the transient creatures that swarm and multiply in a drop of water.

Unavoidable Choices

The twisted purpose underlying this series of posts has been not only to indoctrinate others into the hideous formalities of Bayesian inference, probabilistic programming, and the arcane subtleties of the Stan programming language, but also to serve as an exercise in exposing our own minds to their horrors. As such, there is a tentative method to the madness of some of the choices made in this series, which we will now elucidate.

Perhaps the most jarring decision has been our choice to code these models directly in Stan, rather than using one of the excellent helper libraries that allow for more concise generation of the underlying Stan code. Both brms and rstanarm possess the capacity to spawn models such as ours with greater simplicity of specification and efficiency of output, thanks to a number of arcane tricks. As an exercise in internalising such forbidden knowledge, however, it is useful to confront reality unshielded by such swaddling conveniences.

In fabricating models for more practical reasons, however, we would naturally turn to these tools unless our unspeakable demands went beyond their natural scope. As a personal choice, brms appeals due to its more natural construction of readable, per-model Stan code to be compiled. This allows the grotesque internals of generated models to be inspected and, if required, twisted to whatever form we desire. rstanarm, by contrast, avoids per-model compilation by pre-compiling more generically applicable models, but its underlying Stan code is correspondingly more arcane for the unskilled neophyte.
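By way of illustration, an equivalent model might be conjured in brms along these lines; a sketch only, with the data frame `ufo_data` and its column names assumed rather than drawn from this series:

```r
library(brms)

# Hypothetical data frame with per-state sighting counts and scaled population
fit_brms <- brm(
  sightings ~ population,
  data   = ufo_data,
  family = negbinomial(),  # negative binomial with log link by default
  prior  = c(
    prior(normal(0, 1), class = "Intercept"),
    prior(normal(0, 1), class = "b")
  )
)

# Inspect the grotesque internals of the generated Stan code
stancode(fit_brms)
```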

The Stan models presented in previous posts have also been constructed as simply as possible, avoiding all but the most universally accepted tricks for improving speed and stability. Most notably, Stan provides specific functions for GLMs based on the Poisson and negative binomial distributions that apply standard link functions directly. As mentioned, we consider it more useful for personal and public indoctrination to use the basic, albeit long-form, parameterisations.
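For the curious, the compound function in question collapses the link and likelihood into a single incantation; the sketch below follows the signature in the Stan documentation, with `X` an N-by-K matrix of predictors and `beta` a corresponding K-vector of coefficients:

```stan
model {
  // In place of:  y ~ neg_binomial_2_log(alpha + beta * x, phi);
  // the fused GLM likelihood applies the log link internally:
  y ~ neg_binomial_2_log_glm(X, alpha, beta, phi);
}
```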

Last Rites

In concluding the dark descent of this series of posts on Bayesian inference, generalised linear models, and the unearthly effects of extraterrestrial visitations on humanity, we have applied numerous esoteric techniques to identify, describe, and quantify the relationship between human population and UFO sightings. The enigmatic model constructed throughout this and the previous three entries darkly implies that, while the rate of inexplicable aerial phenomena is inextricably and positively linked to humanity’s unchecked growth, there are nonetheless unseen factors that draw our non-terrestrial visitors to certain populations more than others, and that their focus and attention is ever more acute.

This series has inevitably fallen short of a full and meaningful elucidation of the techniques of Bayesian inference and Stan. From this first step on such a path, then, interested students of the bizarre and arcane would be well advised to draw on the following esoteric resources:

Until then, watch the skies and archive your data.

