Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This government is committed to introducing posthumous pardons for people with certain historical sexual offence convictions who would be innocent of any crime now (British Government Spokesperson, September 2016)
Last September, the British government announced its intention to pursue what has become known as the Alan Turing law, offering exoneration to the tens of thousands of gay men convicted of historic charges. The law was finally unveiled on 20 October 2016.
This plot shows the daily views of Alan Turing’s Wikipedia page over the last 365 days:
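The daily views come from the Wikimedia pageviews REST endpoint, which expects the start and end dates as YYYYMMDD strings appended to the article path. A minimal sketch of how that request URL can be assembled without any network call is shown here; `build_pageviews_url` is a hypothetical helper introduced only for illustration, and `format()` is used as a simpler alternative to stripping punctuation from the date.

```r
# Hypothetical helper (not part of any package): builds the request URL
# for the daily pageviews of an English Wikipedia article.
build_pageviews_url <- function(article, end_date = Sys.Date(), days = 365) {
  base <- paste0("https://wikimedia.org/api/rest_v1/metrics/pageviews/",
                 "per-article/en.wikipedia/all-access/all-agents")
  date_ini <- format(end_date - days, "%Y%m%d")  # first day of the window
  date_fin <- format(end_date, "%Y%m%d")         # last day of the window
  # URLencode turns the space in "Alan Turing" into %20
  paste(base, utils::URLencode(article), "daily", date_ini, date_fin, sep = "/")
}

build_pageviews_url("Alan Turing")
```

The resulting string can be passed to `GET` exactly as in the full script at the end of the post.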
There are three huge peaks, on May 27th, July 30th, and October 29th, that can be easily detected using the AnomalyDetection package:
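To give an intuition of what flagging such peaks involves, here is a rough base R sketch on made-up numbers. It uses a robust z-score (distance from the median in units of MAD), which is a much simpler stand-in and not the algorithm the AnomalyDetection package implements; the cutoff of 5 is an arbitrary assumption.

```r
# Toy series (made-up numbers): a flat baseline with three injected spikes
views <- c(100, 104, 98, 101, 950, 99, 103, 97, 1200, 102, 100, 880, 101)

# Robust z-score: deviation from the median, scaled by the MAD
robust_z <- (views - median(views)) / mad(views)

# Flag points more than 5 robust deviations away (arbitrary cutoff)
anomalies <- which(abs(robust_z) > 5)
anomalies
```

On a real series with weekly seasonality, a rule like this would need the seasonal component removed first, which is one reason to prefer a dedicated package.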
After replacing these anomalies with a simple linear imputation, it is clear that the time series has experienced a significant impact since the last days of September:
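Linear imputation simply draws a straight line between the nearest known neighbours of each missing value. The sketch below reproduces the idea with base R's `approx()` on made-up numbers; the full script uses imputeTS for this step, so treat this only as an illustration of the principle.

```r
# Toy series (made-up numbers) where anomalies were already set to NA
views <- c(100, 110, NA, 130, NA, NA, 160)

idx   <- seq_along(views)
known <- !is.na(views)

# Interpolate linearly at every position from the known points
filled <- approx(x = idx[known], y = views[known], xout = idx)$y
filled
```

Each gap is filled with evenly spaced values between its two known neighbours, so a one-day spike is replaced by the average of the days around it.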
To estimate the number of incremental views since September 28th (the date I have chosen as the starting point), I use the CausalImpact package:
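The analysis splits the series into a pre-period (before September 28th) and a post-period (from that day on), and compares the observed post-period with what would have been expected without the announcement. The sketch below shows the split on made-up numbers and uses a deliberately naive counterfactual (the pre-period mean); CausalImpact fits a Bayesian structural time series model instead, so this is only an illustration of the bookkeeping, not of the package's method.

```r
# Toy daily series around the chosen start date (made-up numbers)
days  <- seq(as.Date("2016-09-23"), by = "day", length.out = 10)
views <- c(100, 102, 98, 100, 100, 150, 160, 155, 165, 170)

start <- as.Date("2016-09-28")
x <- sum(days < start)                    # number of pre-period observations
pre.period  <- c(1, x)
post.period <- c(x + 1, length(views))

# Naive counterfactual: views would have stayed at the pre-period mean
baseline    <- mean(views[pre.period[1]:pre.period[2]])
incremental <- views[post.period[1]:post.period[2]] - baseline

cumsum(incremental)                       # accumulated effect, day by day
```

The `pre.period` and `post.period` index pairs have the same shape as the ones passed to `CausalImpact` in the full script.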
The last plot shows the cumulative effect. After 141 days, there have been around 1 million incremental views of Alan Turing’s Wikipedia page (more than 7,000 per day), and the effect does not seem ephemeral.
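The per-day figure follows directly from the two numbers quoted above. As a quick check, using the rounded total from the text:

```r
incremental_views <- 1e6   # rounded cumulative figure quoted in the text
post_days <- 141           # length of the post-period in days

incremental_views / post_days   # average incremental views per day
```

The exact value depends on the unrounded cumulative estimate, which the summary of the fitted model would give.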
Alan Turing has won another battle, this time posthumously. Thanks to it, a lot of people have discovered his amazing legacy: long live Alan Turing.
This is the code I wrote to do the experiment:
library(httr)
library(jsonlite)
library(stringr)
library(xts)
library(highcharter)
library(AnomalyDetection)
library(imputeTS)
library(CausalImpact)
library(dplyr)

# Views last 365 days
(Sys.Date()-365) %>% str_replace_all("[[:punct:]]", "") %>% substr(1,8) -> date_ini
Sys.time() %>% str_replace_all("[[:punct:]]", "") %>% substr(1,8) -> date_fin
url="https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Alan%20Turing/daily"
paste(url, date_ini, date_fin, sep="/") %>%
  GET %>%
  content("text") %>%
  fromJSON %>%
  .[[1]] -> wikistats

# To prepare dataset for highcharter
wikistats %>%
  mutate(day=str_sub(timestamp, start = 1, end = 8)) %>%
  mutate(day=as.POSIXct(day, format="%Y%m%d", tz="UTC")) -> wikistats

# Highcharts viz
rownames(wikistats)=wikistats$day
wikistats %>% select(views) %>% as.xts %>% hchart

# Anomaly detection
wikistats %>% select(day, views) -> tsdf
tsdf %>% AnomalyDetectionTs(max_anoms=0.01, direction='both', plot=TRUE)->res
res$plot

# Imputation of anomalies
tsdf[tsdf$day %in% as.POSIXct(res$anoms$timestamp, format="%Y-%m-%d", tz="UTC"),"views"]<-NA
ts(tsdf$views, frequency = 365) %>%
  na.interpolation() %>%
  xts(order.by=wikistats$day) -> tscleaned
tscleaned %>% hchart

# Causal Impact from September 28th
x=sum(index(tscleaned)<"2016-09-28 UTC")
impact <- CausalImpact(data = tscleaned %>% as.numeric,
                       pre.period = c(1,x),
                       post.period = c(x+1,length(tscleaned)),
                       model.args = list(niter = 5000, nseasons = 7),
                       alpha = 0.05)
plot(impact)