Site icon R-bloggers

Forecasting In R: The Greatest Shortcut That Failed The Ljung-Box

[This article was first published on The Dancing Economist, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Okay so you want to forecast in R, but don’t want to manually find the best model and go through the drudgery of plotting and so on.  I have recently found the perfect function for you.  Its called auto.arima and it automatically fits the best arima model to your time series. In a word- it is “brilliant”.  Lets take a look at its brilliance with the following examples.

So in our last post the last thing we plotted was de-trended GDP and were hoping to forecast it.  R makes this not only really easy, but also hilariously fun.

Just type in the following code with dt being your de-trended series.



fitdt<-auto.arima(dt)


plot(forecast(fitdt,h=25))

Here is the marvelous plot:



Here are the summary statistics:


Series: dt 
ARIMA(2,0,2)(0,0,1)[4] with zero mean     


Coefficients:
         ar1           ar2        ma1        ma2     sma1
      1.4753    -0.5169   0.0519    0.1909  0.1539
s.e.  0.1201   0.1166    0.1236   0.0909   0.0726


sigma^2 estimated as 1528:  log likelihood = -1314.11
AIC = 2640.22   AICc = 2640.55   BIC = 2661.53


In-sample error measures:
          ME         RMSE          MAE                  MPE                    MAPE             MASE 
 -0.02875014  39.08953811  22.33132382    -30.62351536  75.51060679   0.83683534

So it fitted the de-trended series to a ARMA(2,2). Lets take a look at what it fits regular ol’ GDP to.


> fitx<-auto.arima(GDP)
> plot(forecast(fitx,h=25))
> summary(fitx)


Series: GDP 
ARIMA(2,2,2)                    


Coefficients:
          ar1        ar2            ma1      ma2
        -0.078   0.4673     -0.3520  -0.5813
s.e.   0.158    0.0928      0.1611   0.1525


sigma^2 estimated as 1653:  log likelihood = -1312.34
AIC = 2634.69   AICc = 2634.93   BIC = 2652.42


In-sample error measures:
        ME       RMSE              MAE             MPE           MAPE           MASE 
 3.7546824   40.4951109   22.4820107    0.1552383    0.7111309    0.3613070 


So it fit it also to an ARMA(2,2), but we have that extra 2 there in the middle.  That 2 was clearly not there when we fit for the trendless de-trended GDP and shows up to account for the quadratic trend of the regular ol’ GDP series.

Located above is the plot of the h=25 step ahead forecast and below is some code for the actual values in list form


> fit7=forecast(fitx,h=20)
> list(fit7)



         Point Forecast    Lo 80     Hi 80              Lo 95     Hi 95
65 Q3       15124.35 15072.25 15176.45      15044.67 15204.03
65 Q4       15245.80 15148.82 15342.78      15097.48 15394.11
66 Q1       15359.95 15215.32 15504.58      15138.76 15581.15
66 Q2       15475.10 15285.46 15664.74      15185.07 15765.12
66 Q3       15586.75 15352.89 15820.62      15229.09 15944.42
66 Q4       15699.15 15423.24 15975.05      15277.19 16121.11
67 Q1       15809.85 15493.02 16126.69      15325.30 16294.41
67 Q2       15921.03 15564.75 16277.32      15376.14 16465.92
67 Q3       16031.39 15636.52 16426.26      15427.49 16635.29
67 Q4       16142.03 15709.51 16574.56      15480.54 16803.52
68 Q1       16252.27 15782.66 16721.87      15534.07 16970.47
68 Q2       16362.67 15856.53 16868.81      15588.59 17136.74
68 Q3       16472.86 15930.53 17015.20      15643.44 17302.29
68 Q4       16583.15 16004.92 17161.39       15698.82 17467.48
69 Q1       16693.34 16079.38 17307.30      15754.37 17632.31
69 Q2       16803.58 16154.01 17453.15      15810.15 17797.01
69 Q3       16913.77 16228.64 17598.89      15865.96 17961.57
69 Q4       17023.98 16303.31 17744.65      15921.81 18126.15
70 Q1       17134.17 16377.91 17890.43      15977.57 18290.77
70 Q2       17244.37 16452.46 18036.29       16033.25 18455.50



Let’s see if this model passes the Ljung-Box test for white noise.


> Box.test(fitx$resid,type=”Ljung-Box”)


Box-Ljung test


data:  fitx$resid 
X-squared = 0.0511, df = 1, p-value = 0.8212


No! It doesn’t pass the test! AHHHHHH!!! WHAT ARE WE GOING TO DO!??!?! Find out in the next post or look it up, but whatever you do-

Keep Dancin’

Steven J.

Update: This model does actually pass the “Ljung-Box” test. Please read the next post for details.


To leave a comment for the author, please follow the link and comment on their blog: The Dancing Economist.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.