Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Consider the preliminary question of getting a sequence of N heads out of k tosses, with probability 1-p(N,k). The complementary probability is given by the recurrence formula
Indeed, my reasoning is that the event of no consecutive N heads out of k tosses can be decomposed according to the first occurrence of a tail out of the first N tosses. Conditioning on whether this first tail occurs at the first, second, …, nth draw leads to this recurrence relation. As I wanted to make sure, I rand the following R code
#no sequence of length N out of k draws pnk=function(N,k){ if (k<N){ p=1} else{ p=0 for (j in 1:N) p=p+pnk(N,k-j)/2^j } return(p) }
and got the following check:
> k=15 > #N=2 > 1-pnk(2,k)-sum(apply(vale[,-1]*vale[,-k],1,max))/10^6 [1] 6.442773e-05 > #N=3 > 1-pnk(3,k)-sum(apply(vale[,-(1:2)]*vale[,-c(1,k)]*vale[,-((k-1):k)],1,max))/10^6 [1] 0.0004090137
Next, the probability of getting the first consecutive N heads in m≥ N tosses is
Both first cases are self-explanatory. the third case corresponds to a tail occurring at the m−N−1th draw, followed by N heads, and prohibiting N consecutive heads prior to the m−N−1th toss. When checking by
Tsim=10^7 S=sample(c(0,1),Tsim,rep=TRUE) SS=S[-Tsim]*S[-1] out=NULL i=2 while (i<=length(SS)){ if ((SS[i]==1)&&(SS[i-1]==1)){ out=c(out,i);i=i+1} i=i+1} dif=diff((1:length(SS[-out]))[SS[-out]==1]) trobs=probs=tabulate(dif+(dif==1))/length(dif)[1:20] for (t in 1:20) trobs[t]=qmn(2,t) barplot(probs,col="orange2",ylim=c(-max(probs),max(probs))) barplot(-trobs[1:20],col="wheat",add=TRUE)
I however get a discrepancy shown in the above graph for the cases m=3,4, and N=2, which is be due to the pseudo-clever way I compute the waiting times, removing the extra 1′s… Because the probabilities to wait 3 and 4 times for 2 heads should really be both equal to 1/2³.
Now, the probability to get M heads first and N heads in m≥ N tosses (and no less) is
The third case is explained by the fact that completions of the first sequence of heads must stop (by a tail) before reaching N heads. Hence the conditional probability of waiting m tosses to get N consecutive heads given the first M consecutive heads is
The expected number can then be derived by
or
for the number of *additional* steps…
Checking for the smallest values of M and N, I got a reasonable agreement with the theoretical value of 2N+1-2M+1(established on Cross validated). (For larger values of M and N, I had to replace the recursive definition of pnk with a matrix computed once for all.)
Filed under: R, Statistics, University life Tagged: conditioning, heads and tails, R, recursion
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.