[This article was first published on Rcrastinate, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
I’ve got a NetAtmo weather station. One can download the measurements from its web interface as a CSV file. I wanted to give time series analysis with the extraction of seasonal components (‘decomposition’) a try, so I thought it would be a good opportunity to use the temperature measurements of my weather station. The data is available.
To make things visually a little easier, I only tried this with 14 days of temperature measurements, including all measurements from November 1st, 2015 till November 14th, 2015.
The raw data looks like this (on the x-axis, there is a running number of measurements):
data <- read.table("temps.csv", header = TRUE, sep = "\t")
plot(data$temp, type = "l", bty = "n", ylab = "Temperature in °C")
Don’t be too confused about the one peak over thirty degrees – we got a particularly friendly November and sometimes, the outdoor sensor of the station is standing in the sun.
Back to business. With the next line of code, I convert the column temp to a time series. I want to extract seasonal components from the time series later, so I have to specify a frequency parameter.
data$temp <- ts(data = data$temp, frequency = 285)
Where does the 285 come from? The NetAtmo station takes one measurement every 5 minutes. So, in one hour, there should be 60 / 5 = 12 measurements and in one day there should be 60 / 5 * 24 = 288 measurements. However, if we check table(data$date), we see that on most days the station recorded three fewer measurements than that, so I use 285 as my frequency.
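The frequency check can be sketched as follows. This is a minimal example on synthetic data, not the station's export: the data frame `demo` and its simulated gaps are assumptions made for illustration, and only the column names `date` and `temp` are taken from the code above.

```r
# Synthetic stand-in for the station export (assumption: one row per
# measurement, with a `date` column holding the calendar day).
set.seed(1)
days <- as.character(seq(as.Date("2015-11-01"), by = "day", length.out = 14))

# Most days get 285 rows; two hypothetical days get the full 288.
n_per_day <- rep(285, 14)
n_per_day[c(3, 9)] <- 288

demo <- data.frame(
  date = rep(days, times = n_per_day),
  temp = rnorm(sum(n_per_day), mean = 10, sd = 3)
)

# Theoretical number of measurements per day at one reading every 5 minutes
expected_per_day <- 60 / 5 * 24

# Observed measurements per day, and the most common count among them
per_day <- table(demo$date)
count_freq <- table(as.vector(per_day))
typical_per_day <- as.integer(names(count_freq)[which.max(count_freq)])

expected_per_day   # 288
typical_per_day    # 285
```

Taking the most common daily count rather than the theoretical 288 keeps the seasonal cycle aligned with the days as they were actually recorded, at the cost of a small misalignment on the days that have a different number of rows.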
In the last step, we decompose and plot the time series.
plot(stl(data$temp, s.window = "periodic"))
The plot is divided into the raw data (again), the seasonal component as extracted by the function stl, the overall trend of the time series and the remainder, which is what is left of the data after the seasonal component and the trend have been subtracted. Nice.
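To look at the components as numbers rather than as a plot, the object returned by stl can be inspected directly. The sketch below uses a simulated series instead of the station data; the sine-shaped daily cycle, the slow upward drift and the noise level are all assumptions made for illustration.

```r
# Simulated temperature-like series: 14 "days" of 285 measurements each.
set.seed(42)
freq <- 285
n <- 14 * freq
idx <- seq_len(n)

daily_cycle <- 4 * sin(2 * pi * idx / freq)   # repeating daily pattern
drift <- seq(8, 12, length.out = n)           # slow overall trend
noise <- rnorm(n, sd = 0.5)

temp <- ts(daily_cycle + drift + noise, frequency = freq)

fit <- stl(temp, s.window = "periodic")

# The components are stored as columns of a multivariate time series
components <- fit$time.series
colnames(components)   # "seasonal" "trend" "remainder"

# The three components add up to the original series
reconstructed <- rowSums(components)
all.equal(as.numeric(temp), as.numeric(reconstructed))

# With s.window = "periodic", the seasonal pattern repeats unchanged each cycle
seasonal <- as.numeric(components[, "seasonal"])
all.equal(seasonal[1:freq], seasonal[(freq + 1):(2 * freq)])
```

Because s.window = "periodic" forces the same seasonal shape onto every day, any day-to-day change in the daily pattern ends up in the trend or the remainder rather than in the seasonal component.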