Time Varying Higher Moments with the racd package.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The Autoregressive Conditional Density (ACD) model of Hansen (1994) extended GARCH models to include time variation in the higher moment parameters. It was a somewhat natural extension to the premise of time variation in the conditional mean and variance, though it probably raised more questions than it, or subsequent research have been able to answer. In my mind, some of these questions are :
- What is the rationale and source for time variation in the skew and shape parameters of a conditional density?
- Can one single source of randomness adequately model both the odd and even moment variation?
- What is the impact on the coefficients of the lower moments?
- What are the implications of the time varying density for the standardized residuals on long run relationships, simulation methods and the consistency of the parameters?
- Do the marginal benefits, in applied work, outweigh the costs of the difficult estimation?
The first question is quite open and difficult to answer. In a GARCH model, most of the variation in the underlying hypothesized density, its expansion and contraction with the news arrival, is captured by the conditional variance dynamics. Given a suitably chosen skewed and shaped distribution, extreme events can also be accommodated within this framework as can changes in the asymmetry of the distribution. If we believe that there is value in allowing higher moment parameters to vary, which we can at times infer from the estimation and forecast performance, is this the result of some underlying inadequacy of the GARCH model or does the ‘true’ and unknown data generating process actually contain such dynamics (and why)?
For the second question, stochastic volatility models have already provided one way to test this at least for GARCH models, though at a considerable computational cost. For ACD models, this is probably still an open research question.
For the third question, some evidence was presented in Harvey and Siddique (1999) where the inclusion of time varying skewness affected the persistence of the conditional variance and caused some of the asymmetries in the variance to disappear (through a reduced asymmetry coefficient in the variance dynamics). I would also actually expect that for standard GARCH models with very high persistence, the inclusion of time variation in the shape parameter (and hence more directly on the higher even moments) to lead to a reduction in such persistence as some of the extreme variations from shocks could be better accommodated. I also wonder whether the inclusion of time variation in the skew parameter would have an impact on the first moment parameters, particularly when their dynamics might include threshold effects.
The fourth question is very difficult to answer, particularly the consistency of the parameters. There is little theoretical work in the ACD literature on this and there have only been some footnotes in the research about simulations confirming that the parameters of ACD models tested had the standard \( \sqrt(N) \) consistency. I remain very skeptical on this point considering the ’3-dimensional’ nature of ACD simulation. That is, for each period \( t \) that is generated in a simulation, the density of the standardized residuals is varying, unlike GARCH model simulation where a long path results in a sampling from the standardized distribution.
The final question is the one most recent papers have focused on, and an extensive literature is available in the vignette of the racd package. The results are mixed, though one would not necessarily conclude this from just reading the abstract of any one of the papers where there is the natural tendency to portray the results in the best possible light.
Following the rather long intro, the remaining article will provide for an introduction to the usage of the racd package which shares many features with the rugarch package (since it extends it). If you have some unique insights into these models, would like to add something or comment, feel free to drop me an email. The racd package is currently hosted in the teatime repository on r-forge and there are no plans at present to release it to CRAN.
The Model Specification
The model specification follows closely the implementation in the rugarch package with the exception of the ‘distribution.model’ option which is now a list with a number of choices for defining the conditional skew and shape dynamics. For the conditional variance equation, the GARCH models which are included are the standard GARCH (‘sGARCH’), component GARCH (‘csGARCH’) and the multiplicative component GARCH (‘mcsGARCH’) for intraday data. The reason for only including these models has to do with the calculation of their long run properties and persistence which do not require simulation in light of the time variation of the higher moments.
# setup library(racd) library(rugarch) library(parallel) library(xts) data(sp500ret) plot2xts = function(x, ...) { plot(x, y = NULL, type = 'l', auto.grid = FALSE, major.ticks = 'auto', minor.ticks = FALSE, major.format = FALSE, bar.col = 'grey', candle.col = 'white', ann = TRUE, axes = TRUE, cex.main = 0.9, cex.axis = 0.9, ...) grid() } args(acdspec) ## function (variance.model = list(model = 'sGARCH', garchOrder = c(1, ## 1), external.regressors = NULL, variance.targeting = FALSE), ## mean.model = list(armaOrder = c(1, 1), include.mean = TRUE, ## archm = FALSE, arfima = FALSE, external.regressors = NULL), ## distribution.model = list(model = 'snorm', skewOrder = c(1, ## 1, 1), skewshock = 1, skewshocktype = 1, skewmodel = 'quad', ## skew.regressors = NULL, shapeOrder = c(0, 1, 1), shapeshock = 1, ## shapeshocktype = 1, shapemodel = 'quad', shape.regressors = NULL, ## exp.rate = 1), start.pars = list(), fixed.pars = list())
The distribution.model list contains the details of the conditional higher moment specification:
- model. This is the conditional distribution, with the same choice of skewed and shaped distributions as in the rugarch package.
- skewOrder (skewmodel) and shapeOrder (shapemodel). Denote the order of the skew and shape (models) under the different parameterizations available and described in the package’s vignette. Not all models have 3 parameters. For example, the ‘xar’ shape model has 2 parameters whilst the ‘xar’ in skew has 3 (since it is based on the signed rather than absolute shocks).
- skewshocktype and shapeshocktype. A value of 1 denotes the use of squared shocks while any other value denotes absolute value shocks.
- skewshock and shapeshock. A value of 1 denotes the use of the standardized residuals as shocks while any other value denotes the use of the residuals.
- skew.regressors and shape.regressors. A matrix of regressors for the skew and shape dynamics. This should be considered experimental at present, until further testing.
- exp.rate. The scaling value for the exponential transformation of the unbounded shape dynamics (the skew dynamics use a logistic transformation without extra parameters.)
For the test example, I’ll use the following specification:
spec = acdspec(mean.model = list(armaOrder = c(0, 1)), variance.model = list(variance.targeting = TRUE), distribution.model = list(model = 'nig', skewOrder = c(1, 1, 1), shapeOrder = c(1,1,1), skewmodel = 'quad', shapemodel = 'pwl'))
The Estimation
As described in the vignette, estimation in ACD models is highly nonlinear and there is no guarantee of a global optimum. For this reason, it is suggested that the problem is estimated from different starting parameters which is why the routine includes a number of solvers preceded by ‘ms’ to denote a multistart strategy. The included solvers are optim (and msoptim), ucminf (and msucminf), solnp (and mssolnp), nlminb (and msnlminb) and a local implementation of cmaes (bound constrained global solver). For the unconstrained solvers (optim and ucminf), the parameters are transformed into a bounded domain via the logistic map. The use of parallel evaluation in the multistart solvers is enabled by passing a cluster object from the parallel package, as the example which follows illustrates:
data(sp500ret) cl = makePSOCKcluster(10) fit = acdfit(spec, sp500ret, solver = 'msoptim', solver.control = list(restarts = 10), cluster = cl) ## ## *---------------------------------* ## * ACD Model Fit * ## *---------------------------------* ## ## Conditional Variance Dynamics ## ----------------------------------- ## GARCH Model : sGARCH(1,1) ## Mean Model : ARFIMA(0,0,1) ## Distribution : nig ## ## Conditional Skew Dynamics ## ----------------------------------- ## ACD Skew Model : quad(1,1,1) ## Shock type: Std.Residuals ## ## Conditional Shape Dynamics ## ----------------------------------- ## ACD Shape Model : pwl(1,1,1) ## Shock type: Std.Residuals ## ## Optimal Parameters ## ------------------------------------ ## Estimate Std. Error t value Pr(>|t|) ## mu 0.000579 0.000109 5.3235 0.000000 ## ma1 0.009191 0.014274 0.6439 0.519643 ## alpha1 0.089522 0.008213 10.8999 0.000000 ## beta1 0.902153 0.009429 95.6765 0.000000 ## skcons -0.080024 0.031859 -2.5118 0.012011 ## skalpha1 0.284640 0.055745 5.1061 0.000000 ## skgamma1 0.027768 0.013604 2.0412 0.041233 ## skbeta1 0.570018 0.093124 6.1210 0.000000 ## shcons 0.443515 0.193872 2.2877 0.022157 ## shalpha1 0.014926 0.018471 0.8081 0.419033 ## shgamma1 0.783159 0.119464 6.5556 0.000000 ## shbeta1 0.657811 0.088193 7.4588 0.000000 ## omega 0.000001 NA NA NA ## ## Robust Standard Errors: ## Estimate Std. Error t value Pr(>|t|) ## mu 0.000579 0.000107 5.43342 0.000000 ## ma1 0.009191 0.014895 0.61709 0.537178 ## alpha1 0.089522 0.010646 8.40916 0.000000 ## beta1 0.902153 0.012237 73.72243 0.000000 ## skcons -0.080024 0.030544 -2.61994 0.008795 ## skalpha1 0.284640 0.060626 4.69498 0.000003 ## skgamma1 0.027768 0.015036 1.84676 0.064782 ## skbeta1 0.570018 0.087479 6.51609 0.000000 ## shcons 0.443515 0.318005 1.39468 0.163112 ## shalpha1 0.014926 0.009775 1.52695 0.126773 ## shgamma1 0.783159 0.144317 5.42667 0.000000 ## shbeta1 0.657811 0.146672 4.48492 0.000007 ## omega 0.000001 NA NA NA ## ## LogLikelihood : 18154 ## ## Information Criteria ## ------------------------------------ ## ACD GARCH ## Akaike -6.5697 -6.5540 ## Bayes -6.5554 -6.5480 ## Shibata -6.5698 -6.5540 ## Hannan-Quinn -6.5647 -6.5519 ## ## [truncated remaining output] ## ## Elapsed time : 1.818
As can be inferred from the robust standard error, the majority of the higher moment parameters are significant, and the information criteria indicate an improvement over the non time varying GARCH equivalent model.
There are a number of extractor functions, in addition to the standard ones such as ‘fitted’,’sigma’,’residuals’, ‘quantile’ and ‘pit’, such as ‘shape’ and ‘skew’ which extract the conditional time varying skew and shape xts vectors with option for returning either the ‘transformed’ (default TRUE) or unbounded values. Additionally, the methods ‘skewness’ and ‘kurtosis’ return the implied conditional time varying skewness and excess kurtosis xts vectors. The following figure provides a visual illustration of the estimated dynamics:
par(mfrow = c(3, 2), mai = c(0.75, 0.75, 0.3, 0.3)) plot2xts(fitted(fit), col = 'steelblue', main = 'Conditional Mean') plot2xts(abs(as.xts(sp500ret)), col = 'grey', main = 'Conditional Sigma') lines(sigma(fit), col = 'steelblue') plot2xts(skew(fit, transform = FALSE), col = 'grey', main = 'Skew') lines(skew(fit), col = 'steelblue') legend('topleft', c('Unbounded', 'Bounded'), col = c('grey', 'steelblue'), lty = c(1,1), bty = 'n', cex = 0.8) plot2xts(shape(fit, transform = FALSE), col = 'grey', main = 'Shape', ylim = c(0,10)) lines(shape(fit), col = 'steelblue') legend('topleft', c('Unbounded', 'Bounded'), col = c('grey', 'steelblue'), lty = c(1,1), bty = 'n', cex = 0.8) plot2xts(skewness(fit), col = 'steelblue', main = 'Skewness') plot2xts(kurtosis(fit), col = 'steelblue', main = 'Kurtosis (ex)')
Forecasting and Filtering
Forecasting in the racd package can only be done on an estimated (ACDfit) object (unlike rugarch where a specification object can also be used), but for 1-ahead forecasting it is possible to use the acdfilter method instead. For n-ahead (n>1) forecasts, for the higher moment dynamics, this is done via simulation as there is no closed form solution as explained in the vignette. There is nothing particularly interesting to note about the acdforecast method here so I will go directly to the rolling forecast and backtesting method (acdroll). This has a number of extra options compared to the ugarchroll method and these are explained in detail in the vignette. Suffice to say, these extra options are related to restrictions on the dynamics to facilitate convergence.
roll = acdroll(spec, sp500ret, n.start = 2000, refit.every = 100, refit.window = 'recursive', solver = 'msoptim', solver.control = list(restarts = 10), cluster = cl, fixARMA = TRUE, fixGARCH = TRUE, fixUBShape = TRUE, UBSHapeAdd = 3, compareGARCH = 'LL') # check convergence(roll) if it is not zero, resubmit via the 'resume' # method. gspec = ugarchspec(mean.model = list(armaOrder = c(0, 1)), variance.model = list(variance.targeting = TRUE), distribution = 'nig') rollg = ugarchroll(gspec, sp500ret, n.start = 2000, refit.every = 100, refit.window = 'recursive') show(roll) ## ## *-------------------------------------* ## * ACD Roll * ## *-------------------------------------* ## No.Refits : 36 ## Refit Horizon : 100 ## No.Forecasts : 3523 ## GARCH Model : sGARCH(1,1) ## Mean Model : ARFIMA(0,0,1) ## ## ACD Skew Model : pwl(1,1,1) ## ACD Shape Model : pwl(1,1,1) ## Distribution : nig ## ## Forecast Density: ## Mu Sigma Skew Shape Shape(GIG) Realized ## 1995-02-03 2e-04 0.0055 0.0501 0.6855 0 0.0123 ## 1995-02-06 2e-04 0.0060 0.3130 0.1493 0 0.0052 ## 1995-02-07 2e-04 0.0060 0.3292 0.4318 0 -0.0007 ## 1995-02-08 3e-04 0.0059 0.2219 0.8460 0 0.0008 ## 1995-02-09 3e-04 0.0058 0.1515 0.9846 0 -0.0021 ## 1995-02-10 3e-04 0.0057 0.0580 1.0217 0 0.0026 ## ## .......................... ## Mu Sigma Skew Shape Shape(GIG) Realized ## 2009-01-23 0.0008 0.0273 -0.1161 1.2626 0 0.0054 ## 2009-01-26 0.0003 0.0264 -0.1010 1.6388 0 0.0055 ## 2009-01-27 0.0003 0.0255 -0.0890 1.7417 0 0.0109 ## 2009-01-28 0.0001 0.0248 -0.0623 1.6254 0 0.0330 ## 2009-01-29 -0.0004 0.0253 0.0364 0.6087 0 -0.0337 ## 2009-01-30 0.0013 0.0258 -0.1027 1.3595 0 -0.0231 ## ## Elapsed: 31.52 mins ## vartable = rbind(as.data.frame(VaRTest(alpha = 0.01, actual = roll@forecast$VaR[, 'realized'], VaR = roll@forecast$VaR[, 'alpha(1%)'])1, row.names = c('ACD(1%)')), as.data.frame(VaRTest(alpha = 0.05, actual = roll@forecast$VaR[, 'realized'], VaR = roll@forecast$VaR[, 'alpha(5%)'])1, row.names = c('ACD(5%)')), as.data.frame(VaRTest(alpha = 0.01, actual = rollg@forecast$VaR[, 'realized'], VaR = rollg@forecast$VaR[, 'alpha(1%)'])1, row.names = c('GARCH(1%)')), as.data.frame(VaRTest(alpha = 0.05, actual = rollg@forecast$VaR[, 'realized'], VaR = rollg@forecast$VaR[, 'alpha(5%)'])1, row.names = c('GARCH(5%)'))) ## ## ## print(vartable, digits = 3) ## expected.exceed actual.exceed cc.LRp cc.Decision ## ACD(1%) 35 28 0.357 Fail to Reject H0 ## ACD(5%) 176 203 1.000 Fail to Reject H0 ## GARCH(1%) 35 26 0.215 Fail to Reject H0 ## GARCH(5%) 176 191 0.571 Fail to Reject H0 ## ## print(rbind(as.data.frame(BerkowitzTest(qnorm(as.numeric(pit(roll))), tail.test = TRUE, alpha = 0.01)1, row.names = 'ACD'), as.data.frame(BerkowitzTest(qnorm(as.numeric(pit(rollg))), tail.test = TRUE, alpha = 0.01)1, row.names = 'GARCH')), digits = 4) ## uLL rLL LRp Decision ## ACD -163.2 -164.3 0.35633 fail to reject NULL ## GARCH -157.8 -160.3 0.07761 fail to reject NULL ## ## HL = cbind(HLTest(as.numeric(pit(roll)))$statistic, HLTest(as.numeric(pit(rollg)))$statistic) colnames(HL) = c('ACD', 'GARCH') print(HL, digits = 4) ## ACD GARCH ## M(1,1) 0.2349 5.787 ## M(2,2) 6.2283 19.640 ## M(3,3) 9.6166 26.599 ## M(4,4) 11.2113 28.929 ## M(1,2) 0.8564 4.306 ## M(2,1) 8.4187 24.476 ## W 18.3033 11.897
At the 1% and 5% coverage levels, neither the ACD nor GARCH models can be rejected, where the null is the correct number of and independence of the VaR exceedances, with higher p-values for the ACD model. In terms of the goodness of fit of the tail of the density, and using the tail test of Berkowitz (2001) at the 1% quantile, the ACD model appears to generate a significantly higher p-value than the GARCH model which can be rejected at the 10% level of significance. Note that the ‘pit’ method returns the probability integral transformation of the realized data given the conditional forecasted density. Another test which uses the ‘pit’ method is that of Hong and Li (2005), a Portmanteau type test, which evaluates the goodness of fit on the whole density. While both models are rejected as providing a ‘correct fit’ (W statistic), the indications from the moment based statistics (M statistics) indicate that the ACD model has significantly better fit, as can be inferred from the lower values (the statistics are distributed as N(0,1)).
Simulation
Unlike GARCH models where there is one call to the random number of size n.sim, in ACD models there are n.sim calls of size 1 because of the time variation in the conditional standardized residuals density. Simulation may be carried out either from a fitted object (using acdsim) or from a specification object (using acdpath). For the latter, this is not enabled for the mcsGARCH model. The following example provides a short illustration of the method and shows how to obtain equivalence between the simulation from fit and spec:
sim1 = acdsim(fit, n.sim = 250, m.sim = 5, rseed = 100:104) # re-define the spec without variance targeting (only used in estimation # routine) spec = acdspec(mean.model = list(armaOrder = c(0, 1)), variance.model = list(variance.targeting = FALSE), distribution.model = list(model = 'nig', skewOrder = c(1, 1, 1), shapeOrder = c(1, 1, 1), skewmodel = 'quad', shapemodel = 'pwl')) setfixed(spec) < - as.list(coef(fit)) sim2 = acdpath(spec, n.sim = 250, m.sim = 5, rseed = 100:104, prereturns = tail(sp500ret[,1], 1), presigma = as.numeric(tail(sigma(fit), 1)), preshape = as.numeric(tail(shape(fit),1)), preskew = as.numeric(tail(skew(fit), 1))) # check c(all.equal(fitted(sim1), fitted(sim2)), all.equal(sigma(sim1), sigma(sim2)), all.equal(skew(sim1), skew(sim2)), all.equal(shape(sim1), shape(sim2))) ## [1] TRUE TRUE TRUE TRUE S = skewness(sim1) K = kurtosis(sim1) R = fitted(sim1) V = sigma(sim1) par(mfrow = c(2, 2), mai = c(0.75, 0.75, 0.3, 0.3), cex.axis = 0.8) matplot(S, type = 'l', col = 1:5, main = 'Simulated Skewness', xlab = '') matplot(K, type = 'l', col = 1:5, main = 'Simulated Kurtosis (ex)', xlab = '') matplot(apply(R, 2, 'cumsum'), type = 'l', col = 1:5, main = 'Simulated Paths', ylab = '', xlab = '') matplot(V, type = 'l', col = 1:5, main = 'Simulated Sigma', xlab = '')
Conclusion
Time varying higher moments within a GARCH modelling framework (ACD) provide for a natural extension to time variation in the conditional mean and variance. Whether the added marginal benefits of their inclusion justify the complexity in their estimation remains an open empirical question which hopefully the racd package will enable to be researched in greater depth and transparency.
References
Berkowitz, J. (2001). Testing density forecasts, with applications to risk management. Journal of Business & Economic Statistics, 19(4), 465-474.
Hansen, B. E. (1994). Autoregressive Conditional Density Estimation. International Economic Review, 35(3), 705-30.
Harvey, C. R., & Siddique, A. (1999). Autoregressive conditional skewness. Journal of financial and quantitative analysis, 34(4), 465-497.
Hong, Y., and Li, H. (2005), Nonparametric specification testing for continuous-time models with applications to term structure of interest rates, Review of Financial Studies, 18(1), 37-84.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.