Inference and autoregressive processes
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Consider a (stationary) autoregressive process, say of order 2,
for some white noise with variance . Here is a code to generate such a process,
> phi1=.5 > phi2=-.4 > sigma=1.5 > set.seed(1) > n=240 > WN=rnorm(n,sd=sigma) > Z=rep(NA,n) > Z[1:2]=rnorm(2,0,1) > for(t in 3:n){Z[t]=phi1*Z[t-1]+phi2*Z[t-2]+WN[t]}
Here, we have to estimate two sets of parameters: the autoregressive coefficients, and the variance of the innovation process . There are (at least) three techniques to estimate those parameters.
- using least square regression
A natural idea is to see here a regression model, and thus, if we consider a matrix formulation,
Here we can run (conditional) ordinary least squares estimation,
> base=data.frame(Y=Z[3:n],X1=Z[2:(n-1)],X2=Z[1:(n-2)]) > regression=lm(Y~0+X1+X2,data=base) > summary(regression) Call: lm(formula = Y ~ 0 + X1 + X2, data = base) Residuals: Min 1Q Median 3Q Max -4.3491 -0.8890 -0.0762 0.9601 3.6105 Coefficients: Estimate Std. Error t value Pr(>|t|) X1 0.45107 0.05924 7.615 6.34e-13 *** X2 -0.41454 0.05924 -6.998 2.67e-11 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1.449 on 236 degrees of freedom Multiple R-squared: 0.2561, Adjusted R-squared: 0.2497 F-statistic: 40.61 on 2 and 236 DF, p-value: 6.949e-16 > regression$coefficients X1 X2 0.4510703 -0.4145365 > summary(regression)$sigma [1] 1.449276
- using Yule-Walker equations
As we’ve seen in class, we can easily get the following equations for the autocovariance functions,
which can also be written
So we just have to solve a simple linear system of equations. Note that if we divide by the variance, those equations can be written in terms of the autocorrelation functions
The code is the following
> rho1=cor(Z[1:(n-1)],Z[2:n]) > rho2=cor(Z[1:(n-2)],Z[3:n]) > A=matrix(c(1,rho1,rho1,1),2,2) > b=matrix(c(rho1,rho2),2,1) > (PHI=solve(A,b)) [,1] [1,] 0.4517579 [2,] -0.4155920
Now, we need to extract the estimated innovation process, from this set of parameters (note that it could be possible to include the variance term in Yule-Walker equations, to get a three dimensional linear equation)
> estWN=base$Y-(PHI[1]*base$X1+PHI[2]*base$X2) > sd(estWN) [1] 1.445706
This estimator is probably not the best one (we can take into account that we’ve lost two degrees of freedom), but as a starting point, let us consider this one.
- using (conditional) likelihood estimators
Finally, we can assume some distribution for the innovation process. The standard model is a Gaussian model, i.e.
In that case, the conditional log likelihood (conditional since we set the first two observations here) is
> CondLogLik=function(A,TS){ + phi1=A[1]; phi2=A[2] + sigma=A[3] ; L=0 + for(t in 3:length(TS)){ + L=L+dnorm(TS[t],mean=phi1*TS[t-1]+ + phi2*TS[t-2],sd=sigma,log=TRUE)} + return(-L)}
Now, we can run standard optimization procedures,
> LogL=function(A) CondLogLik(A,TS=Z) > optim(c(0,0,1),LogL) $par [1] 0.4509685 -0.4144938 1.4430930 $value [1] 425.0164 $counts function gradient 88 NA $convergence [1] 0 $message NULLHere, our three estimators are rather close. Actually, if we generate 1,000 time series (of size 240), those are the Box-plots of our three estimators, for the first order autoregressive coefficient
for the second one,
and finally for the standard deviation of the innovation process
All those estimators behave nicely, and are rather close. Note that they all might be biased, but they are consistent (see Davidson and MacKinnon for instance, in their book, for more details).
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.