Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This morning, there was an interesting post entitled “why does Easter move around so much?” online on http://economist.com/blogs/economist-explains/…
In my time series classes, I keep saying that series can exhibit seasonality, but that the seasonal effect can sometimes be quite irregular. It is the case for river levels, where snowmelt can have a huge impact, at irregular times. Similarly, chocolate sales (even monthly, or quarterly) depend on Easter. Because it can fall either in March or in April, the seasonal pattern is not as regular as that of flower sales, for instance (Valentine's Day being always on February 14th, as far as I remember). If we look at the word eggs on http://google.com/trends/q=eggs…, we do observe a cycle related to Easter.
The title of the article published by http://economist.com/blogs/economist-explains/… claims that there is a lot of variability in the date of Easter. Let us check! The answer to the question “When is Easter?” can be the following (if we want a short answer): Easter Sunday is the first Sunday after the first full moon after the vernal equinox. For more details, see e.g. http://ortelius.de/east. The algorithm used to compute the date of Easter is online, on http://smart.net/~mmontes/….
> century = year/100
> G = year % 19
> K = (century - 17)/25
> I = (century - century/4 - (century - K)/3 + 19*G + 15) % 30
> I = I - (I/28)*(1 - (I/28)*(29/(I + 1))*((21 - G)/11))
> J = (year + year/4 + I + 2 - century + century/4) % 7
> L = I - J
> EasterMonth = 3 + (L + 40)/44
> EasterDay = L + 28 - 31*(EasterMonth/4)
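The steps above can be transcribed almost line for line into R, provided every division is read as an integer division. That reading is an assumption about the pseudocode, as is the function name `easter_date`, which is purely illustrative. This is only a sketch for Gregorian years, using base R's `%/%` (integer division) and `%%` (modulo).

```r
# Hypothetical helper: Easter Sunday for Gregorian years, following the
# steps listed above. Assumption: every "/" in the pseudocode is an
# integer division, written %/% in R, and every "%" is a modulo, %% in R.
easter_date <- function(year) {
  century <- year %/% 100
  G <- year %% 19
  K <- (century - 17) %/% 25
  I <- (century - century %/% 4 - (century - K) %/% 3 + 19 * G + 15) %% 30
  I <- I - (I %/% 28) * (1 - (I %/% 28) * (29 %/% (I + 1)) * ((21 - G) %/% 11))
  J <- (year + year %/% 4 + I + 2 - century + century %/% 4) %% 7
  L <- I - J
  month <- 3 + (L + 40) %/% 44        # 3 = March, 4 = April
  day <- L + 28 - 31 * (month %/% 4)
  as.Date(sprintf("%04d-%02d-%02d", year, month, day))
}

# Vectorised over years
easter_date(c(2000, 2013, 2038))
```

Note that `%/%` rounds towards minus infinity, whereas integer division in C-like languages truncates towards zero; the two only differ for negative operands, which is one more reason to keep this sketch to recent Gregorian years.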
Actually, this algorithm can be found in some R packages. Here we use the dates of Easter from AD 1000 to AD 3000,
> library(timeDate)
> E=Easter(1000:3000)
> D=as.Date(E)
> table(months(D))/2001
    april     march
0.7651174 0.2348826
(April comes before March in alphabetical order.) The distribution of the date, counted in days from March 1st, is the following,
> J=as.numeric(D-as.Date(paste("01/03/",1000:3000,sep=""),"%d/%m/%Y"))
> hist(J,breaks=seq(20,55),col="light green")
If we look at the autocorrelation function, we do observe a strong correlation at a lag of 19 years (as could be expected from the algorithm given previously, where the year enters modulo 19),
> acf(J)
But in order to get a better understanding of the dynamics, we can also look at transition matrices. Define
> Q=quantile(J,seq(0,1,by=.25))
> Q[1]=Q[1]-1
> C=cut(J,Q)
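A short aside on the `Q[1]=Q[1]-1` step: by default `cut()` builds intervals that are open on the left and closed on the right, so an observation equal to the smallest break would fall outside every class. The sketch below illustrates this on a made-up vector `x` (not the Easter data), together with the `include.lowest=TRUE` argument of `cut()`, which is another way to keep the minimum.

```r
# Toy data, purely illustrative (not the Easter series)
x <- c(21, 25, 30, 40, 55)
q <- quantile(x, seq(0, 1, by = .25))

# Default behaviour: intervals are (a, b], so min(x) gets no class
c_default <- cut(x, q)
sum(is.na(c_default))

# Alternative to lowering the first break by hand:
# close the first interval on the left as well
c_closed <- cut(x, q, include.lowest = TRUE)
anyNA(c_closed)
```

Both approaches give four classes; lowering the first break, as done above, simply keeps all interval labels in the same `(a,b]` form.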
Then, the one year transition matrix is (in %)
> k=1; n=length(C)
> B=data.frame(X1=(C[1:(n-k)]),X2=(C[(k+1):n]))
> (T=table(B$X1,B$X2))
          (20,31] (31,39] (39,46] (46,55]
  (20,31]       0       0     265     277
  (31,39]     316       0      13     182
  (39,46]     224     264       0       0
  (46,55]       1     247     211       0
> P=T/apply(T,1,sum)
> round(P*1000)/10
          (20,31] (31,39] (39,46] (46,55]
  (20,31]     0.0     0.0    48.9    51.1
  (31,39]    61.8     0.0     2.5    35.6
  (39,46]    45.9    54.1     0.0     0.0
  (46,55]     0.2    53.8    46.0     0.0
I.e. if Easter was early in the year (say in March, in the first quartile), then very likely, the year after, it will be late in the year (with about a 50% chance of falling in the third quartile, and about a 50% chance in the fourth one).
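Since the lag `k` is already a parameter in the computation of the transition matrix, the same steps can be wrapped in a small function to look at other horizons. The name `transition_matrix` and the toy sequence below are hypothetical; this is only a sketch of the idea, using `prop.table()` for the row normalisation.

```r
# Hypothetical helper: k-step transition matrix, in %, from a factor of states
transition_matrix <- function(states, k = 1) {
  n <- length(states)
  from <- states[1:(n - k)]      # state in year t
  to <- states[(k + 1):n]        # state in year t + k
  counts <- table(from, to)
  # Divide each row by its total; a state never visited gives a row of NaN
  round(100 * prop.table(counts, margin = 1), 1)
}

# Toy sequence alternating between two states (not the Easter data)
toy <- factor(c("early", "late", "early", "late", "early"))
P1 <- transition_matrix(toy, k = 1)
P2 <- transition_matrix(toy, k = 2)
P1
P2
```

With the objects defined in this post, a call such as `transition_matrix(C, k = 19)` would give the 19 year transition matrix, which could be compared with the autocorrelation at that lag.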
Arthur Charpentier
Arthur Charpentier, professor in Montréal, in Actuarial Science. Former professor-assistant at ENSAE Paristech, associate professor at Ecole Polytechnique and assistant professor in Economics at Université de Rennes 1. Graduated from ENSAE, Master in Mathematical Economics (Paris Dauphine), PhD in Mathematics (KU Leuven), and Fellow of the French Institute of Actuaries.