simulating correlated Binomials [another Bernoulli factory]
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This early morning, just before going out for my daily run around The Parc, I checked X validated for new questions and came upon that one. Namely, how to simulate X a Bin(8,2/3) variate and Y a Bin(18,2/3) such that corr(X,Y)=0.5. (No reason or motivation provided for this constraint.) And I thought the following (presumably well-known) resolution, namely to break the two binomials as sums of 8 and 18 Bernoulli variates, respectively, and to use some of those Bernoulli variates as being common to both sums. For this specific set of values (8,18,0.5), since 8×18=12², the solution is 0.5×12=6 common variates. (The probability of success does not matter.) While running, I first thought this was a very artificial problem because of this occurrence of 8×18 being a perfect square, 12², and cor(X,Y)x12 an integer. A wee bit later I realised that all positive values of cor(X,Y) could be achieved by randomisation, i.e., by deciding the identity of a Bernoulli variate in X with a Bernoulli variate in Y with a certain probability ϖ. For negative correlations, one can use the (U,1-U) trick, namely to write both Bernoulli variates as
in order to minimise the probability they coincide.
I also checked this result with an R simulation
> z=rbinom(10^8,6,.66) > y=z+rbinom(10^8,12,.66) > x=z+rbinom(10^8,2,.66) cor(x,y) > cor(x,y) [1] 0.5000539
Searching on Google gave me immediately a link to Stack Overflow with an earlier solution with the same idea. And a smarter R code.
Filed under: Books, Kids, pictures, R, Running, Statistics, University life Tagged: binomial distribution, cross validated, inverse cdf, Jacob Bernoulli, Parc de Sceaux, R, random simulation, stackoverflow
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.