Sub-Gaussian property for the Beta distribution (part 3, final)

When a Beta random variable wants to act like a Bernoulli: convergence of optimal proxy variance.
In this third and last post about the sub-Gaussian property for the Beta distribution [1] (post 1 and post 2), I would like to show the interplay with the Bernoulli distribution, as well as some connections with optimal transport (OT is a hot topic in general, and also on this blog with Pierre’s posts on Wasserstein ABC).
Let us see how sub-Gaussian proxy variances can be derived from transport inequalities. To this end, we need first to introduce the Wasserstein distance (of order 1) between two probability measures $P$ and $Q$ on a space $\mathcal{X}$. It is defined wrt a distance $d$ on $\mathcal{X}$ by
$$W_1(P,Q) = \inf_{\pi \in \Pi(P,Q)} \int_{\mathcal{X}\times\mathcal{X}} d(x,y)\,\pi(\mathrm{d}x,\mathrm{d}y),$$
where $\Pi(P,Q)$ is the set of probability measures on $\mathcal{X}\times\mathcal{X}$ with fixed marginal distributions respectively $P$ and $Q$. Then, a probability measure $P$ is said to satisfy a transport inequality with positive constant $\sigma$, if for any probability measure $Q$ dominated by $P$,
$$W_1(P,Q) \le \sigma\sqrt{2\,D(Q\|P)},$$
where $D(Q\|P)$ is the entropy, or Kullback–Leibler divergence, between $Q$ and $P$. The nice result proven by Bobkov and Götze (1999) [2] is that the constant $\sigma^2$ is a sub-Gaussian proxy variance for $P$.
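
To make the link between the transport inequality and sub-Gaussianity explicit, here is the Bobkov and Götze characterization in its usual formulation. The notation ($f$, $\lambda$, $X \sim P$) is introduced here for illustration and is not taken from the post itself.

```latex
% Transport inequality with constant \sigma (for every Q dominated by P):
\[
  W_1(P,Q) \le \sigma \sqrt{2\, D(Q \| P)}
\]
% is equivalent to a sub-Gaussian bound for every 1-Lipschitz function f
% (Lipschitz with respect to the distance d), with X \sim P:
\[
  \mathbb{E}\, \exp\!\big( \lambda \, ( f(X) - \mathbb{E} f(X) ) \big)
  \le \exp\!\Big( \frac{\lambda^2 \sigma^2}{2} \Big)
  \qquad \text{for all } \lambda \in \mathbb{R}.
\]
% When the identity map is 1-Lipschitz for d, taking f(x) = x gives
% that X itself is sub-Gaussian with proxy variance \sigma^2.
```

On $\{0,1\}$ with the Hamming metric, $|x-y| = \mathbb{1}_{x\neq y}$, so the identity map is 1-Lipschitz and the bound applies directly to a Bernoulli random variable.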
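For a discrete space $\mathcal{X}$ equipped with the Hamming metric, $d(x,y) = \mathbb{1}_{x\neq y}$, the induced Wasserstein distance reduces to the total variation distance, $W_1(P,Q) = \|P-Q\|_{\mathrm{TV}}$. In that setting, Ordentlich and Weinberger (2005) [3] proved the distribution-sensitive transport inequality:
$$\|P-Q\|_{\mathrm{TV}} \le \sqrt{\frac{1}{g(\mu_P)}\,D(Q\|P)},$$
where the function $g$ is defined by $g(\mu) = \frac{1}{1-2\mu}\ln\frac{1-\mu}{\mu}$ and the coefficient $\mu_P$ is called the balance coefficient of $P$, and is defined by $\mu_P = \max_{A\subset\mathcal{X}} \min\{P(A), 1-P(A)\}$. In particular, the Bernoulli balance coefficient is easily shown to coincide with its mean (for $\mu \le 1/2$; the case $\mu > 1/2$ follows by symmetry, since $g(\mu) = g(1-\mu)$). Hence, applying the result of Bobkov and Götze (1999) [2] to the above transport inequality yields a distribution-sensitive proxy variance of $\frac{1}{2g(\mu)} = \frac{1-2\mu}{2\ln\frac{1-\mu}{\mu}}$ for the Bernoulli with mean $\mu$, as plotted in blue above.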
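
As a sanity check, the Bernoulli case can be verified numerically. The sketch below is only an illustration: the function names and the grid of alternative means are my own choices, not code from the paper. It computes the proxy variance $1/(2g(\mu))$ and checks, over a grid of Bernoulli alternatives with mean $q$, that the total variation distance $|q-\mu|$ never exceeds $\sqrt{2\sigma^2 D(Q\|P)}$.

```python
import math


def g(mu):
    """Ordentlich-Weinberger function g(mu) = ln((1-mu)/mu) / (1-2*mu).

    The value at mu = 1/2 is set to 2, the limit of the expression.
    """
    if abs(mu - 0.5) < 1e-8:
        return 2.0
    return math.log((1.0 - mu) / mu) / (1.0 - 2.0 * mu)


def bernoulli_proxy_variance(mu):
    """Distribution-sensitive proxy variance 1 / (2 g(mu)) for Bernoulli(mu)."""
    return 1.0 / (2.0 * g(mu))


def kl_bernoulli(q, p):
    """Kullback-Leibler divergence D(Ber(q) || Ber(p)) in nats, with 0 log 0 = 0."""
    total = 0.0
    for a, b in ((q, p), (1.0 - q, 1.0 - p)):
        if a > 0.0:
            total += a * math.log(a / b)
    return total


def transport_inequality_holds(mu, n_grid=1000, tol=1e-9):
    """Check |q - mu| <= sqrt(2 sigma^2 D(Ber(q) || Ber(mu))) for q on a grid.

    A small tolerance is needed because the inequality is an equality
    at q = 1 - mu, so rounding errors could otherwise flip the comparison.
    """
    sigma2 = bernoulli_proxy_variance(mu)
    for i in range(n_grid + 1):
        q = i / n_grid
        tv = abs(q - mu)  # total variation distance between two Bernoullis
        bound = math.sqrt(2.0 * sigma2 * kl_bernoulli(q, mu))
        if tv > bound + tol:
            return False
    return True


if __name__ == "__main__":
    for mu in (0.05, 0.2, 0.5, 0.8):
        print(mu, bernoulli_proxy_variance(mu), transport_inequality_holds(mu))
```

At $\mu = 1/2$ the proxy variance is $1/4$, the distribution-free Hoeffding value, and it is strictly smaller for any other mean.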
In the Beta distribution case, we have not been able to extend this transport inequality methodology since the support is not discrete. However, a nice limiting argument holds. Consider a sequence of $\mathrm{Beta}(\alpha,\beta)$ random variables with fixed mean $\mu = \frac{\alpha}{\alpha+\beta}$ and with a sum $\alpha+\beta$ going to zero. This converges to a Bernoulli random variable with mean $\mu$, and we have shown that the limiting optimal proxy variance of such a sequence of Beta with decreasing sum $\alpha+\beta$ is the one of the Bernoulli.
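
The limiting behaviour can be illustrated numerically. The sketch below is a crude grid approximation of my own, not the method used in the paper: it evaluates the Beta moment generating function through its moment series, approximates the optimal proxy variance as the largest value of $\frac{2}{\lambda^2}\ln \mathbb{E}\,e^{\lambda(X-\mu)}$ over a grid of $\lambda$, and compares it with the Bernoulli value. The function names, the grid range and the step size are assumptions made for illustration.

```python
import math


def beta_mgf(alpha, beta, lam, n_terms=200):
    """E[exp(lam * X)] for X ~ Beta(alpha, beta), via the moment series.

    Uses E[X^k] = prod_{r<k} (alpha + r) / (alpha + beta + r), so that
    consecutive series terms differ by the factor
    lam * (alpha + k) / ((alpha + beta + k) * (k + 1)).
    """
    term, total = 1.0, 1.0
    for k in range(n_terms):
        term *= lam * (alpha + k) / ((alpha + beta + k) * (k + 1))
        total += term
    return total


def grid_proxy_variance(alpha, beta, lam_max=10.0, step=0.05):
    """Grid approximation of sup over lam of 2 ln E[exp(lam (X - mu))] / lam^2."""
    mu = alpha / (alpha + beta)
    n = int(round(lam_max / step))
    best = 0.0
    for i in range(-n, n + 1):
        if i == 0:
            continue  # the ratio is undefined at lam = 0
        lam = i * step
        log_centred_mgf = math.log(beta_mgf(alpha, beta, lam)) - lam * mu
        best = max(best, 2.0 * log_centred_mgf / lam ** 2)
    return best


def bernoulli_proxy_variance(mu):
    """Proxy variance (1 - 2 mu) / (2 ln((1 - mu) / mu)), for mu != 1/2."""
    return (1.0 - 2.0 * mu) / (2.0 * math.log((1.0 - mu) / mu))


if __name__ == "__main__":
    mu = 0.3
    print("Bernoulli:", bernoulli_proxy_variance(mu))
    for total in (10.0, 1.0, 0.1, 0.001):
        approx = grid_proxy_variance(mu * total, (1.0 - mu) * total)
        print("alpha + beta =", total, "->", approx)
```

Because the search runs over a finite grid, the result is only a lower approximation of the true optimal proxy variance, but it is enough to see the values approach the Bernoulli one as $\alpha+\beta$ shrinks.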
References
[1] Marchal, O. and Arbel, J. (2017). On the sub-Gaussianity of the Beta and Dirichlet distributions. Electronic Communications in Probability, 22:1–14. Code on GitHub.
[2] Bobkov, S. G. and Götze, F. (1999). Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. Journal of Functional Analysis, 163(1):1–28.
[3] Ordentlich, E. and Weinberger, M. J. (2005). A distribution dependent refinement of Pinsker’s inequality. IEEE Transactions on Information Theory, 51(5):1836–1840.
