Site icon R-bloggers

(JUST RELEASED) timetk 2.0.0: Visualize Time Series Data in 1-Line of Code

[This article was first published on business-science.io, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I’m EXTACTIC ???? to announce the release of timetk version 2.0.0. This is a monumental release that significantly expands the functionality of timetk to go WAY BEYOND the original goals of the package. Now, the package is a full-featured time series visualization, data wrangling, and preprocessing toolkit.

I’ll go into detail about what you can do in 1-line of code, and I have several more articles coming on how to use several of the new features.

But first a little history…

Background

Time series has been a passion project for me since my days of forecasting sales and economic data for a manufacturing company I worked for. My one gripe has always been that I had to use 50 different packages (zoo, xts, dplyr, etc) made by 50 different people to perform common data wrangling and visualization analyses. timetk solves this problem by making a consistent approach to visualize, wrangle, and preprocess time series data inside the tidyverse and tidymodels ecosystem.

With advancements in tidy-time series, the combination of the tidyverse and time series is an amazingly powerful concept. I’m not the first one to think of this idea. In fact, Davis Vaughan created tibbletime and the “tidyverts” (Rob Hyndman, Earo Wang, and Mitchell O’Hara-Wild) have created a whole forecasting and data wrangling system using a tsibble data structure.

These are amazing packages, but they solve different needs. tibbletime focused on data wrangling. The tidyverts focused on forecasting at scale using ARIMA and company.

Visualize, wrangle, and preprocess time series data

My needs are different. I need:

So I created timetk version 2.0.0 to solve these needs.

Here’s the new mission, and what you can do in 1-line of code with timetk >= 2.0.0.

Mission

To make it easy to visualize, wrangle and preprocess time series data for forecasting and machine learning prediction.

What Can You Do in 1-Line of Code?

First step, load these R packages.

library(tidyverse)
library(lubridate)
library(timetk)

Investigate a time series…

This is fun! In 1 line of code we can visualize a dataset.

# Static ggplot
taylor_30_min %>%
    plot_time_series(date, value, .color_var = week(date),
                     .interactive = FALSE, .color_lab = "Week")

So what did we just do?

We are exploring taylor_30_min, which is a classic electricity-demand time series that comes from one of my all-time-favorite packages, forecast. It’s been updated to the tibble structure with a time stamp column called “date” and a value column called “value”.

We can plotted it using plot_time_series(). I set .interactive = FALSE to return a ggplot (I’ll do that for all of the visualizations in this tutorial). The default is to return an interactive plotly graph, which is great for shiny apps, rmarkdown HTML documents, and super powerful for exploring time series (zooming, panning, etc).

Want the an interactive plotly visualization?

Just try this code and explore the plotly visualization.

# INTERACTIVE Plotly
taylor_30_min %>%
    plot_time_series(date, value, .color_var = week(date),
                     .interactive = TRUE, .plotly_slider = TRUE, .color_lab = "Week")

Let’s pick up the pace. Here’s some more amazing visualization capabilities!

Visualize anomalies…

We can visualize anomalies for multiple time series groups. Here we use group_by() to group the time series. Note this is a different dataset, walmart_sales_weekly.

walmart_sales_weekly %>%
    group_by(Store, Dept) %>%
    plot_anomaly_diagnostics(Date, Weekly_Sales, 
                             .facet_ncol = 3, .interactive = FALSE)

Make a seasonality plot…

We can get seasonality plots.

taylor_30_min %>%
    plot_seasonal_diagnostics(date, value, .interactive = FALSE)

Inspect autocorrelation, partial autocorrelation (and cross correlations too)…

And we can search the Autocorrelation and Partial Autocorrelation.

taylor_30_min %>%
    plot_acf_diagnostics(date, value, .lags = "1 week", .interactive = FALSE)

Documentation

I have several more tutorials and a full time series course coming. Visit the timetk website documentation for more information and a complete list of function references.

Time Series Course (Coming Soon)

I teach timetk in my Time Series Analysis & Forecasting Course. If interested in learning Pro-Forecasting Strategies then join my waitlist. The course is coming soon.

You will learn:

Signup for the Time Series Course waitlist

Have questions on using Timetk for time series?

Make a comment in the chat below. ????

And, if you plan on using timetk for your business, it’s a no-brainer – Join my Time Series Course Waitlist (It’s coming, it’s really insane).

To leave a comment for the author, please follow the link and comment on their blog: business-science.io.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.