The Beta distribution of the third kind (or generalised Beta prime)

[This article was first published on Saturn Elephant, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

We present the family of so-called Beta distributions of the third kind. In the context of Bayesian statistics, it is a conjugate family of prior distributions on the odds parameter of the binomial model. This distribution is known, but nobody provided a way to sample from it. We show how one can sample from this distribution in R.

Preliminaries: the (scaled) Beta prime distribution

The Beta distribution of the third kind generalizes the Beta distribution of the second kind, also known under the name Beta prime distribution.

The Beta prime distribution B(c,d,λ) is the distribution of the random variable λU1U where UB(c,d).

Its density function at x0 is 1λcB(c,d)xc1(1+xλ)c+d.

Usually the definition does not include the scale parameter λ (that is, it is usually defined for λ=1 only).

It is easy to implement a sampler for this distribution, the density function and the cumulative density function:

rbetaprime <- function(n, c, d, lambda = 1){
  stopifnot(lambda > 0)
  u <- rbeta(n, c, d)
  lambda * u/(1-u)
}
dbetaprime <- function(x, c, d, lambda = 1){
  stopifnot(lambda > 0)
  lambda/(lambda+x)^2 * dbeta(x/lambda/(1+x/lambda), c, d)
}
pbetaprime <- function(q, c, d, lambda){
  stopifnot(lambda > 0)
  pbeta(q/lambda/(1+q/lambda), c, d)
}

Beta distribution of the third kind

The Beta distribution of the third kind B3 was firstly introduced (as far as I know) in the paper Some Poisson mixtures with a hyperscale parameter, written by myself.

For parameters c>0, d>0, κR, τ>0, the density function of B3(c,d,κ,τ) is f(ϕ)=1B(c,d)2F1(c,c+dκ,c+d;11τ)ϕc1(1+ϕ)κ(1+ϕτ)c+dκ,ϕ0.

Thus, for κ=0, the B3(c,d,κ,τ) distribution equals B(c,d,τ), and for κ=c+d or τ=1, it equals B(c,d,1). Note that in general, τ is not a scale parameter.

Let’s write a R function computing this density:

library(gsl)
Gauss2F1 <- function(a,b,c,x){ 
  if(x>=0 && x<1){ # hyperg_2F1 works fine in this situation
    hyperg_2F1(a, b, c, x)
  }else{ # transform to come down to the first situation
    hyperg_2F1(c-a, b, c, 1-1/(1-x)) / (1-x)^b
  }
}
dB3 <- function(x, c, d, kappa, tau){
  stopifnot(c > 0, d > 0, tau > 0)
  if(kappa == 0){
    dbetaprime(x, c, d, tau)
  }else if(kappa == c+d){
    dbetaprime(x, c, d, 1)
  }else{
    1/beta(c,d) * x^(c-1)*(1+x)^(-kappa)/(1+x/tau)^(c+d-kappa) / 
      Gauss2F1(c, c+d-kappa, c+d, 1-1/tau)
  }
}

This distribution is related to the four-parameter generalized Beta distribution introduced by Chen & Novick in the paper Bayesian analysis for binomial models with generalized beta prior distributions (1984); this distribution takes its value in [0,1]. They are related by an elementary transformation: ΘGB4(c,d,κ,τ)Θ1ΘB3(c,d,c+dκ,1τ).

Update 2019-09-05: generalised Beta distribution

I’ve just discovered that the GB4 distribution appears in the paper On Kummer’s distributions of type two and generalized Beta distributions written by Hamza & Vallois. It is named generalised Beta distribution in this paper, it is denoted by βδ(a,b,c) and its density function at u[0,1] is given by 1β(a,b)2F1(c,a;a+b;1δ)ua1(1u)b1(1+(δ1)u)c

for a,b,δ>0 and cR.

We have the following relation: if ΦB3(c,d,κ,τ), then Φ1+Φβ1τ(c,d,κcd).

So, maybe a better name for B3 would be generalised Beta prime distribution.

Cumulative distribution function

The cumulative distribution function of B3 involves the Appell hypergeometric function F1. A Fortran implementation of this function is available in the R package appell. This package has been removed from CRAN, but you can still install it. If ΦB3(c,d,κ,τ), then, for q0, Pr(Φq)=qcF1(c;κ,c+dκ;c+1;q,qτ)cB(c,d)2F1(c,c+dκ,c+d;11τ).

Here is a R implementation. I found that it works well with the option userflag = 0 of the appellf1 function.

pB3 <- function(q, c, d, kappa, tau, userflag = 0){
  stopifnot(c > 0, d > 0, tau > 0)
  if(kappa == 0){
    pbetaprime(q, c, d, tau)
  }else if(kappa == c+d){
    pbetaprime(q, c, d, 1)
  }else{
    C <- beta(c,d) * Gauss2F1(c, c+d-kappa, c+d, 1-1/tau)
    Appell <- 
      appell::appellf1(c, kappa, c+d-kappa, c+1, -q, -q/tau, 
                       userflag = userflag)
    q^c/c * Re(Appell$val) / C
  }
}

Sampling the Beta distribution of the third kind

It is not very easy to sample the B3 distribution. In her master thesis, Myriam Chabot proved that it can be represented as a discrete mixture of B2 distributions, and we will use this result.

This result is the following one.

  • For τ<1, let K be a random variable on N whose probability mass at kN is given by 12F1(d,c+dκ,c+d,1τ)(1τ)kk!(c+dκ)k(d)k(c+d)k

    and let Φ be a random variable such that (ΦK=k)B(c,d+k,1).
    Then ΦB3(c,d,κ,τ).

  • For τ>1, let K be a random variable on N whose probability mass at kN is given by 12F1(c,c+dκ,c+d,11τ)(11τ)kk!(c+dκ)k(c)k(c+d)k

    and let Φ be a random variable such that (ΦK=k)B(c+k,d,1).
    Then ΦB3(c,d,κ,τ).

So we can sample B3(c,d,κ,τ) if we are able to sample these discrete distributions. To do so, we use the Runuran package.

library(Runuran)
pmf_unnormalized <- function(k, c, d, kappa, tau){
  out <- numeric(length(k))
  positive <- k >= 0
  k <- k[positive]
  out[positive] <- 
    if(tau < 1){
      exp(k*log(1-tau) - lfactorial(k) + 
            lnpoch(c+d-kappa,k) + lnpoch(d,k) - lnpoch(c+d,k)) 
    }else{
      exp(k*log(1-1/tau) - lfactorial(k) + 
            lnpoch(c+d-kappa,k) + lnpoch(c,k) - lnpoch(c+d,k))
    }
  out
}
NormalizingConstant <- function(c, d, kappa, tau){
  if(tau < 1){
    hyperg_2F1(d, c+d-kappa, c+d, 1-tau) 
  }else{
    hyperg_2F1(c, c+d-kappa, c+d, 1-1/tau) 
  }
}
Ksampler <- function(n, c, d, kappa, tau){
  dist <- unuran.discr.new(
    pmf = function(k) pmf_unnormalized(k, c, d, kappa, tau),
    lb = 0, ub= Inf, sum = NormalizingConstant(c, d, kappa, tau)
  )
  unuran <- unuran.new(dist, method="dgt") 
  ur(unuran, n)
}
rB3 <- function(n, c, d, kappa, tau){
  stopifnot(c > 0, d > 0, tau > 0)
  if(tau == 1 || kappa == c+d){
    rbetaprime(n, c, d, 1)
  }else if(kappa == 0){
    rbetaprime(n, c, d, tau)
  }else{
    K <- Ksampler(n, c, d, kappa, tau)
    if(tau < 1){
      rbetaprime(n, c, d + K, 1)
    }else{
      rbetaprime(n, c + K, d, 1)
    }
  }
}

Let’s check. The density:

c <- 2; d <- 3; kappa <- 4; tau <- 5
nsims <- 1000000
sims <- rB3(nsims, c, d, kappa, tau)
plot(density(sims, from = 0, to = quantile(sims, 0.95)))
curve(dB3(x, c, d, kappa, tau), add = TRUE, col = "red", 
      lty = "dashed", lwd = 2)

The cumulative distribution function:

q <- seq(0.1, 4, length.out = 10)[-6]
cdfValues <- sapply(q, function(x) pB3(x, c, d, kappa, tau))
empirical_cdf <- ecdf(sims)
plot(empirical_cdf, xlim = c(0,4))
points(q, cdfValues, pch=19)

I’ve removed the sixth value of the vector q because there is a crash of appellf1 for this value:

q <- seq(0.1, 4, length.out = 10)
pB3(q[6], c, d, kappa, tau)
## Error in appell::appellf1(c, kappa, c + d - kappa, c + 1, -q, -q/tau, : f1conv: Computation not possible

It works with the option userflag = 1:

pB3(q[6], c, d, kappa, tau, userflag = 1)
## [1] 0.849321

Finally perhaps the option userflag = 1 is a better default value:

cdfValues <- sapply(q, function(x){
  pB3(x, c, d, kappa, tau, userflag = 1)
})
plot(empirical_cdf, xlim = c(0,4))
points(q, cdfValues, pch=19)

Application to the Bayesian binomial model

Consider the binomial statistical model parameterized by the odds ratio ϕ: L(ϕx)ϕx(1+ϕ)n.

Take a B3(c,d,κ,τ) prior distribution on ϕ: π(ϕ)ϕc1(1+ϕ)κ(1+ϕτ)c+dκ
Then the posterior distribution on ϕ is B3(c+x,d+nx,κ+n,τ).

Application to the Bayesian “two Poisson samples” model

Consider the statistical model given by two independent observations xP(λS),yP(μT)

where S and T are known “sample sizes”. We parametrize the model by μ and ϕ:=λμ.

Assigning a Gamma prior distribution GB(a,b) on μ, it is not difficult to get (μϕ,x,y)G(a+x+y,b+ϕS+T).

In their paper A Bayesian framework for the ratio of two Poisson rates in the context of vaccine efficacy trials (2011), Laurent & Legrand derived the marginal likelihood on ϕ in the case of the semi-conjuguate family of prior distributions. Their result holds as long as μ and ϕ are independent under the prior distribution, and this result is ˜L(ϕx,y)ϕx(1+ρϕ)a+x+y
where ρ=ST+b.

Now, let’s assign a B(c,d,τ) prior distribution on ϕ. Then the posterior distribution on ϕ is given by π(ϕx,y)π(ϕ)˜L(ϕx,y)ϕc+x1(1+ρϕ)(a+x+y)(1+ϕτ)c+d,

and by noting that ϕτ=ρϕρτ, we recognize the scaled B3 distribution (ϕx,y)1ρ×B3(c+x,a+d+y,a+x+y,ρτ).
In particular, if τ=1ρ=T+bS, then we find (ϕx,y)B2(c+x,a+d+y,τ),
and this is the situation of the semi-conjugate family studied by Laurent & Legrand.

To leave a comment for the author, please follow the link and comment on their blog: Saturn Elephant.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)