Site icon R-bloggers

Significant birthdays in the weekend

[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I am a listener to BBC‘s podcast More or Less. In the program Tim Harford looks at data with both humour and determination to find what the numbers mean. Last week he handled a listener question. Does everybody get a significant birthday (20, 30 years etc.) in a weekend. His back of the envelope answer was, yes, we think so. Given the regularity by which days shift over the years, it should happen by 60 latest. If anybody has this at 70, please contact us. This blog post tries to answer by which youngest age anybody born in the previous century has a significant birthday in the weekend and concludes that indeed by 60 latest it should have happened. Not surprising, 20 years has 2/7=28% of days. Somewhat surprising, 30, 40 and 50 are approximately equally probable (3/14=21%).
Additional note. I was not the only one who did this calculation. This weekend he mentioned some other listeners who emailed equivalent results to him.

Code

# days contains all days as month-day in a year.
days <- format(as.Date(‘1900-01-01′,format=’%Y-%m-%d’)+0:364,’%m-%d’)
tail(days)
[1] “12-26” “12-27” “12-28” “12-29” “12-30” “12-31”

# does a year day combination fall in weekend,
# format makes Sunday to ‘0’, Saturday to ‘6’
isspecday <- function(y,d) {
  i <- paste(y,d,sep=’-‘)
  format(as.Date(i,format=’%Y-%m-%d’),’%w’) %in% c(‘0′,’6’)
}

# which years have a weekend, for a given day (month day combination)
sourcedata <- function(dayno) {
  year <- 1900:2100
  days <- isspecday(year,days[dayno])
  data.frame(year=year,sunday=days)
}

# example – use fifth day
sd1 <- sourcedata(5)
# reorganize in matrix so 20 , 30.. years after get next to each other
sd2 <- sapply(seq(20,80,10),function(x) sd1$sunday[(1:100)+x])
# make columns to ages
# other years get 1000, so min() can extract minimum age
sd3 <- sd2 * rep(1,100) %o% seq(20,80,10)
sd3[sd3==0] <- 1000
tail(sd3)
       [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[95,]    20 1000 1000 1000 1000   70 1000
[96,]  1000   30 1000 1000 1000 1000   80
[97,]  1000 1000   40 1000 1000 1000   80
[98,]  1000 1000 1000   50 1000 1000 1000
[99,]  1000 1000 1000   50   60 1000 1000
[100,]   20 1000 1000 1000   60   70 1000

# result, ages
table(apply(sd3,1,min))
20 30 40 50 60 
28 22 21 22  7 

# make a wrapper, i to become day
goodyear <- function(i) {
  sd1 <- sourcedata(i)
  sd2 <- sapply(seq(20,80,10),function(x) sd1$sunday[(1:100)+x])
  sd3 <- sd2 * rep(1,100) %o% seq(20,80,10)
  sd3[sd3==0] <- 1000
  table(apply(sd3,1,min))
}
# and apply
sa <- sapply(1:365,goodyear)
# for brievety, only show last days
tail(t(sa))
       20 30 40 50 60
[360,] 28 22 22 21  7
[361,] 28 22 21 22  7
[362,] 28 21 22 21  8
[363,] 29 21 21 22  7
[364,] 29 21 21 22  7
[365,] 29 22 21 21  7
# overall
colSums(t(sa))/365
       20        30        40        50        60 
28.572603 21.430137 21.424658 21.430137  7.142466 

To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.