Converting cross-sectional data with dates to weekly averages in R
First we’ll create some sample data with randomly generated dates within our time frame:
first <- as.Date("2012/01/25", "%Y/%m/%d")  # start date
last <- as.Date("2012/05/11", "%Y/%m/%d")   # end date
dt <- last - first                          # length of the time frame in days
nSamples <- 1000
set.seed(1)
# random whole-day offsets from the start date
date <- first + round(runif(nSamples) * as.numeric(dt))
Then we combine the dates with randomly generated values:
value <- sample(1:10, size = nSamples, replace = TRUE)
data <- data.frame(value, date)
Now that we have our observations, we can move on to finding the weekly averages. However, our weekly average data starts with the week ending 1/30/2012, which is a Monday, so we have to assign that date to every day in that week using the lubridate package:
library("lubridate")
data$week <- floor_date(data$date, "week") + 8
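The "+8" is needed because floor_date rounds down to the previous Sunday, and we need the label to fall on the Monday of the following week. For example, 1/25/2012 rounds down to Sunday 1/22/2012, and adding 8 days gives Monday 1/30/2012.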
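The labeling step can be checked on a handful of dates. The sketch below is illustrative only: the variable names (d, week_start, week_label, base_label) are not from the original post, and it assumes lubridate's default week start (Sunday). It also shows what appears to be an equivalent base R expression, using the "%w" weekday number.

```r
library(lubridate)

# Dates spanning one Sunday-to-Saturday week, plus the following Sunday.
d <- as.Date(c("2012-01-22", "2012-01-25", "2012-01-28", "2012-01-29"))

# floor_date() rounds each date down to the start of its week
# (Sunday, assuming lubridate's default week start).
week_start <- floor_date(d, "week")

# Adding 8 days moves from that Sunday to the Monday of the following week.
week_label <- week_start + 8

# wday() numbers days from 1 (Sunday) to 7 (Saturday) by default,
# so a value of 2 means Monday.
print(data.frame(date = d, week_start, week_label, wday = wday(week_label)))

# Base R alternative: "%w" gives the weekday number with Sunday as 0,
# so subtracting it reaches the previous Sunday without lubridate.
base_label <- d - as.integer(format(d, "%w")) + 8
print(all(base_label == week_label))
```

Note that every date from Sunday 1/22 through Saturday 1/28 receives the label 1/30, while Sunday 1/29 already belongs to the next label, 2/6. If the comparison dataset defines its weeks differently, the offset would need adjusting.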
Now we can use the ddply function from the plyr package to find the average for every week:
library("plyr")
x<-ddply(data, .(week), function(z) mean(z$value))
The ddply function splits the data by week and computes the mean of all values within each particular week.
The hard work is now done, but because our function returns an unnamed value, ddply calls the averages column V1, so we need to rename the columns before finishing:
colnames(x) <- c("week", "value")
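As a cross-check on the grouping step, the same per-week means can be computed with base R's aggregate(). This is a minimal sketch on a small made-up data frame (toy and weekly are hypothetical names, and the values are invented for illustration); it is an alternative to ddply, not the method used in this post.

```r
library(plyr)

# Toy data: three observations in one labeled week, two in the next.
toy <- data.frame(
  week  = as.Date(c("2012-01-30", "2012-01-30", "2012-01-30",
                    "2012-02-06", "2012-02-06")),
  value = c(2, 4, 6, 5, 9)
)

# plyr: split by week, take the mean of value in each piece.
# The averages land in a column that ddply names V1.
x <- ddply(toy, .(week), function(z) mean(z$value))
colnames(x) <- c("week", "value")

# Base R: the formula interface keeps the names "week" and "value",
# so no renaming step is needed.
weekly <- aggregate(value ~ week, data = toy, FUN = mean)

print(x)
print(weekly)

# Both approaches should agree on the weekly means.
print(all.equal(x$value, weekly$value))
```

Either form gives one row per week. The aggregate() version avoids the extra package and the rename, while ddply generalizes more easily when the per-group function returns several columns.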
You now have the weekly averages to compare to the other dataset.