hard birthday problem
Following an X validated question, I found that WordPress now allows for direct links to pdf documents, like the above paper by my old friend Anirban Das Gupta! The question is about estimating the number M of individuals based on the number N of distinct birth dates observed over a year of T days. After looking around, I could not find a simpler representation of the probability of N=r than (1) in my answer,
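borrowed from a paper by Fisher et al. (Another Fisher!) Checking Feller (p.102) leads to the classical occupancy probability
$$\mathbb{P}(N=r)=\binom{T}{r}\sum_{\nu=0}^{r}(-1)^{\nu}\binom{r}{\nu}\left(1-\frac{T-r+\nu}{T}\right)^{M}$$
which fits simulation frequencies rather nicely, as shown using the following R line, where Nb is the number of days (T above), M the number of individuals, and T the number of simulated samples: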
Nb=365; M=30; T=1e4 # illustrative values only
apply(!apply(matrix(sample(1:Nb,T*M,replace=TRUE),T,M),1,duplicated),2,sum)
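A slightly longer sketch of this comparison, assuming the probability takes the classical occupancy form: the helper `pdistinct`, the values of `Nb` and `M`, the seed, and the name `nsim` for the number of simulated samples are all illustrative choices rather than anything from the original answer.

```r
# Probability of r distinct birthdates among M individuals over Nb days,
# assuming the classical occupancy formula
#   choose(Nb, r) * sum_nu (-1)^nu * choose(r, nu) * ((r - nu) / Nb)^M
pdistinct <- function(r, M, Nb) {
  nu <- 0:r
  choose(Nb, r) * sum((-1)^nu * choose(r, nu) * ((r - nu) / Nb)^M)
}

set.seed(1)                       # arbitrary seed, for reproducibility only
Nb <- 50; M <- 20; nsim <- 1e4    # illustrative values
# one simulated sample of M birthdates per row, then distinct values per row
sims <- apply(matrix(sample(1:Nb, nsim * M, replace = TRUE), nsim, M), 1,
              function(x) length(unique(x)))
exact <- sapply(1:M, pdistinct, M = M, Nb = Nb)   # exact probabilities, r = 1..M
freq  <- tabulate(sims, nbins = M) / nsim         # simulated frequencies
round(cbind(r = 1:M, exact = exact, freq = freq), 4)
```

The values are kept small on purpose: the alternating sum loses precision through cancellation when M and r grow, so this direct evaluation is only meant as a check for moderate sizes.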
Further, Feller (1970, pp.103-104) justifies an asymptotic Poisson approximation for the number $T-N$ of days without a birthdate, with parameter $\lambda(M)=T\exp\{-M/T\}$, from which an estimate of $M$ can be derived. With the birthday problem as illustration (pp.105-106)!
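One possible way of turning the Poisson approximation into an estimate, offered as a sketch rather than as the estimator the answer relies on: equating the observed number of empty days with a Poisson mean of the form Nb exp(-M/Nb) and solving for M. The function name `mhat` and the default of 365 days are assumptions made for the illustration, with `Nb` denoting the number of days (T in the text) to avoid R's `T` shorthand for `TRUE`.

```r
# Moment-type estimate of M, assuming the number of empty days Nb - N
# is approximately Poisson with mean lambda(M) = Nb * exp(-M / Nb):
# solving Nb - N = Nb * exp(-M / Nb) for M gives Nb * log(Nb / (Nb - N))
mhat <- function(N, Nb = 365) {
  stopifnot(N >= 0, N < Nb)   # the estimate is infinite when no day is empty
  Nb * log(Nb / (Nb - N))
}

mhat(N = 200)   # estimate when 200 distinct birthdates are observed
```

Since the same value maximises the approximate Poisson likelihood of the single count Nb - N, the moment and likelihood estimates coincide under this approximation.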
It may be that a completion from N to (R¹,R²,…) where the components are the number of days with one birthdate, two birthdates, &tc. could help design an EM algorithm that would remove the summation in (1) but I did not spend more time on the problem (than finding a SAS approximation to the probability!).