Inference and autoregressive processes

arthur charpentier

10 years ago

[This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Consider a (stationary) autoregressive process, say of order 2,

for some white noise with variance . Here is a code to generate such a process,

> phi1=.5
> phi2=-.4
> sigma=1.5
> set.seed(1)
> n=240
> WN=rnorm(n,sd=sigma)
> Z=rep(NA,n)
> Z[1:2]=rnorm(2,0,1)
> for(t in 3:n){Z[t]=phi1*Z[t-1]+phi2*Z[t-2]+WN[t]}

Here, we have to estimate two sets of parameters: the autoregressive coefficients, and the variance of the innovation process . There are (at least) three techniques to estimate those parameters.

using least square regression

A natural idea is to see here a regression model, and thus, if we consider a matrix formulation,

Here we can run (conditional) ordinary least squares estimation,

> base=data.frame(Y=Z[3:n],X1=Z[2:(n-1)],X2=Z[1:(n-2)])
> regression=lm(Y~0+X1+X2,data=base)
> summary(regression)
 
Call:
lm(formula = Y ~ 0 + X1 + X2, data = base)
 
Residuals:
Min      1Q  Median      3Q     Max
-4.3491 -0.8890 -0.0762  0.9601  3.6105
 
Coefficients:
Estimate Std. Error t value Pr(>|t|)
X1  0.45107    0.05924   7.615 6.34e-13 ***
X2 -0.41454    0.05924  -6.998 2.67e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
Residual standard error: 1.449 on 236 degrees of freedom
Multiple R-squared: 0.2561,	Adjusted R-squared: 0.2497
F-statistic: 40.61 on 2 and 236 DF,  p-value: 6.949e-16
 
> regression$coefficients
X1         X2
0.4510703 -0.4145365
> summary(regression)$sigma
[1] 1.449276

using Yule-Walker equations

As we’ve seen in class, we can easily get the following equations for the autocovariance functions,

which can also be written

So we just have to solve a simple linear system of equations. Note that if we divide by the variance, those equations can be written in terms of the autocorrelation functions

The code is the following

> rho1=cor(Z[1:(n-1)],Z[2:n])
> rho2=cor(Z[1:(n-2)],Z[3:n])
> A=matrix(c(1,rho1,rho1,1),2,2)
> b=matrix(c(rho1,rho2),2,1)
> (PHI=solve(A,b))
[,1]
[1,]  0.4517579
[2,] -0.4155920

Now, we need to extract the estimated innovation process, from this set of parameters (note that it could be possible to include the variance term in Yule-Walker equations, to get a three dimensional linear equation)

> estWN=base$Y-(PHI[1]*base$X1+PHI[2]*base$X2)
> sd(estWN)
[1] 1.445706

This estimator is probably not the best one (we can take into account that we’ve lost two degrees of freedom), but as a starting point, let us consider this one.

using (conditional) likelihood estimators

Finally, we can assume some distribution for the innovation process. The standard model is a Gaussian model, i.e.

In that case, the conditional log likelihood (conditional since we set the first two observations here) is

> CondLogLik=function(A,TS){
+ phi1=A[1];  phi2=A[2]
+ sigma=A[3]	; L=0
+ for(t in 3:length(TS)){
+ L=L+dnorm(TS[t],mean=phi1*TS[t-1]+
+ phi2*TS[t-2],sd=sigma,log=TRUE)}
+ return(-L)}

Now, we can run standard optimization procedures,

> LogL=function(A) CondLogLik(A,TS=Z)
> optim(c(0,0,1),LogL)
$par
[1]  0.4509685 -0.4144938  1.4430930
 
$value
[1] 425.0164
 
$counts
function gradient
88       NA
 
$convergence
[1] 0
 
$message
NULL

Here, our three estimators are rather close. Actually, if we generate 1,000 time series (of size 240), those are the Box-plots of our three estimators, for the first order autoregressive coefficient

for the second one,

and finally for the standard deviation of the innovation process

All those estimators behave nicely, and are rather close. Note that they all might be biased, but they are consistent (see Davidson and MacKinnon for instance, in their book, for more details).

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics - Tag - R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.