How To Forecast With Tableau And R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
After working with Tableau for the last several years, I have to admit that I’m quite impressed with the statistical capabilities of the software. It’s nowhere near the analytical powerhouse that R is, but for visualization it does a pretty good job.
As good as Tableau is, much of the statistical properties are “out of the box” and don’t offer as much configurability as some would like. Below is a walk-through of one use case that could be put into production.
All materials in this walk-through can be found on my GitHub page.
The Problem:
With forecasts, users are used to seeing the predicted value alongside the actual value. This can visually help some users determine the accuracy of the model. Tableau generates metrics such as MPE and MAPE that measure the accuracy of fit but it has a difficult time visualizing them.
Yikes, there’s some Excel ugliness!
The Solution:
Using Rob Hyndman’s forecast package in R, I can easily write the results of the fitted model at the row level, which makes visualizing the accuracy of the model much easier. Below is some simple R code using Tableau’s built-in “Superstore” data set.
To Replicate this in Tableau…
My workbook is available on my GitHub page (Tableau Public doesn’t support R scripts).
Before you get going, you need to establish a connection between R and Tableau.
1. Create an R script for Tableau
In this case, I’m creating a calculated field and using the same logic as the R script above, but this time passing it as a calculated field to return the fit of my forecast model. I’m naming this calculated field “fitR future.”
2. Determine periods to forecast
Create a parameter that tells Tableau how far ahead you want to look. This will tie in with other calculations later. I’m naming this new parameter “Periods to Forecast”.
3. Calculate Date Shift
Now that I’ve got “Periods to Forecast,” I have to tell Tableau how to look at those future dates. Note, Superstore Sales has an “Order Date” and a “Shipping Date,” here I’m using Order Date and creating a calculated field called Order Date Shift.
5. Actual vs. Predicted Values
Tableau needs to know the difference right? To accomplish this, I create one more calculated field called “ActualvsPredicted.”
Putting it All Together
Now that I’ve got all the values, I can let Tableau do what it does best. The first viz I made was by dragging my new Order Date Shift field to columns and my Predicted and Profit fields to rows. This makes a replication of the chart I plotted in R.
And of coarse, I can tweak the view until my heart’s content.
Stacking it up With Tableau’s Native Forecast
Notice back in the R code, I simply invoked the ets(ts) function to forecast. I could have passed a specific model to this function but I left it out intentionally. The forecast package in R finds the best ETS model for my data. In this case if found the best fit was an ANA model.
Likewise, Tableau will attempt to find the best model to fit your data… And the results? Like R, Tableau finds an ANA model to be the best fit. I did this little experiment to prove that Tableau’s built-in forecasting is really quite good. However, there will always be use cases for people who need a bit more control over their forecast models.
The final result…
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.