All About Spherically Distributed Regression Errors
[This article was first published on Econometrics Beat: Dave Giles' Blog, and kindly contributed to R-bloggers.]
This post is based on a handout that I use for one of my courses, and it relates to the usual linear regression model,
y = Xβ + ε
In our list of standard assumptions about the error term in this linear multiple regression model, we include one that incorporates both homoskedasticity and the absence of autocorrelation. That is, the individual values of the errors are assumed to be generated by a random process whose variance (σ²) is constant, and all possible distinct pairs of these values are uncorrelated. This implies that the full error vector, ε, has a scalar covariance matrix, σ²Iₙ.
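As a minimal sketch of what a scalar covariance matrix looks like, the following R snippet builds σ²Iₙ and checks the two properties just described. The values n = 4 and σ² = 2 are illustrative assumptions, not taken from the handout.

```r
# Illustrative values only: n and sigma2 are assumptions, not from the post.
n      <- 4
sigma2 <- 2

# Scalar covariance matrix: sigma^2 times the n x n identity matrix.
V <- sigma2 * diag(n)
print(V)

# Homoskedasticity: every diagonal element equals sigma2.
all(diag(V) == sigma2)

# No autocorrelation: every off-diagonal element is zero.
all(V[row(V) != col(V)] == 0)
```

Both checks should return TRUE for any positive σ² and any n, since the matrix is just a constant multiple of the identity.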
We refer to this overall situation as one in which the values of the error term follow a “Spherical Distribution”. Let’s take a look at the origin of this terminology.
The following discussion is quite general, so you’ll realize that it applies to any random variables, not just the error term in our regression model. Further, so that we can look at some diagrams, let’s consider the special case of two dimensions, rather than three, so that what would be a (3-dimensional) sphere becomes a (2-dimensional) circle.
So, consider the pair of random values εᵢ and εⱼ, which we’ll generically denote x and y. (This latter terminology has nothing to do with X and y in the regression model.) The values of these two random variables are plotted in the directions of the x and y axes in the graphs which follow.
In the three-dimensional plots we will see the joint probability density function, p(x, y), in the direction of the z axis. All of these three-dimensional plots are for values in the range -3 ≤ x, y ≤ 3. Scales are given on the associated two-dimensional “contour” plots. The latter plots show “isolines” – that is, lines that join up (x, y) points that yield the same value (height) for p(x, y). These contours are exactly analogous to the contour lines that you see on a topographic map to depict the nature of the terrain. They reflect what you see when you look straight down onto the three-dimensional (bivariate) density plots.
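Plots of this kind can be sketched in base R along the following lines. This is not the original plotting code: the function name bvn_density, the grid resolution, and the viewing angles are all assumptions made for illustration, and the density is the zero-mean bivariate normal.

```r
# Bivariate normal density with zero means, written out in base R.
# vx, vy are the variances of x and y; cxy is their covariance.
bvn_density <- function(x, y, vx = 1, vy = 1, cxy = 0) {
  det_S <- vx * vy - cxy^2   # determinant of the covariance matrix
  # Quadratic form (x, y) S^{-1} (x, y)'
  q <- (vy * x^2 - 2 * cxy * x * y + vx * y^2) / det_S
  exp(-q / 2) / (2 * pi * sqrt(det_S))
}

# Grid over the range used in the post: -3 <= x, y <= 3.
x <- seq(-3, 3, length.out = 61)
y <- seq(-3, 3, length.out = 61)
z <- outer(x, y, bvn_density)   # z[i, j] = p(x[i], y[j])

# Three-dimensional density surface, then the view from directly above.
persp(x, y, z, theta = 30, phi = 25, zlab = "p(x, y)")
contour(x, y, z, asp = 1, xlab = "x", ylab = "y")
```

Setting asp = 1 in the contour call keeps the two axes on the same scale, so that circular contours are actually drawn as circles rather than being stretched by the plotting window.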
The appearances of the density plots, and the shape of the associated contour plots, depend upon the variances of x and y, and the covariance (and hence correlation) between these two random variables.
If x and y have the same variance (i.e., if εᵢ and εⱼ are homoskedastic), and if they are uncorrelated (i.e., if εᵢ and εⱼ are not autocorrelated), then the contours will form circles. If there were three random variables we would need a four-dimensional density graph, and the contours would form a sphere. Hence the term “Spherical Distribution”. If there were four or more random variables the sphere would become a “hyper-sphere”.
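To see where the circles come from, it may help to write the density out for one specific case. Assuming a bivariate normal distribution with zero means, common variance σ², and zero covariance (the normality assumption is made here only for illustration), setting the density equal to a constant height c gives:

```latex
p(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right),
\qquad
p(x, y) = c \;\Longleftrightarrow\; x^{2} + y^{2} = -2\sigma^{2}\,\ln\!\left(2\pi\sigma^{2} c\right).
```

For any height 0 < c ≤ 1/(2πσ²) the right-hand side is a non-negative constant, so each contour is a circle centred at the origin.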
If x and y have different variances, the joint density surface is no longer symmetrical in the x and y directions, and then the contour plot takes the form of an ellipse, rather than a circle. The same thing happens if x and y are correlated, even if they have the same variance. In this case, the slope of the primary axis of the ellipse is determined by the sign of the correlation between x and y.
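One way to check the direction of the ellipse numerically is through the eigenvectors of the covariance matrix: for a bivariate normal density, the primary axis of each contour lies along the eigenvector with the largest eigenvalue. The sketch below uses base R's eigen(); the helper name major_axis_slope is hypothetical and introduced only for this illustration.

```r
# Slope of the primary (major) axis of the contour ellipse
# for a 2 x 2 covariance matrix with variances vx, vy and covariance cxy.
major_axis_slope <- function(vx, vy, cxy) {
  S <- matrix(c(vx, cxy, cxy, vy), nrow = 2)
  e <- eigen(S, symmetric = TRUE)   # eigenvalues are returned in decreasing order
  v <- e$vectors[, 1]               # direction of greatest spread
  v[2] / v[1]                       # rise over run; Inf or -Inf if the axis is vertical
}

major_axis_slope(1, 1,  0.50)   # equal variances, positive correlation
major_axis_slope(1, 1, -0.75)   # equal variances, negative correlation
major_axis_slope(1, 9,  0.70)   # unequal variances, positive covariance
```

With equal variances the slope should come out at (approximately) +1 for a positive correlation and -1 for a negative one, whatever the size of the correlation. With unequal variances the sign of the slope still follows the sign of the covariance, but the axis tilts towards the variable with the larger variance.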
Some examples follow, all for the case where x and y follow a bivariate normal distribution with zero means for x and y. The plots were all done using R (of course).
E(x , y)’ = (0 , 0)’ ; var.(x) = var.(y) = 1 ; cov. (x , y) = 0
E(x , y)’ = (0 , 0)’ ; var.(x) = 1; var.(y) = 9 ; cov. (x , y) = 0
E(x , y)’ = (0 , 0)’ ; var.(x) = var.(y) = 1 ; cov. (x , y) = 0.5
E(x , y)’ = (0 , 0)’ ; var.(x) = var.(y) = 1 ; cov. (x , y) = -0.75
E(x , y)’ = (0 , 0)’ ; var.(x) = var.(y) = 1 ; cov. (x , y) = 0.99
E(x , y)’ = (0 , 0)’ ; var.(x) = 1 ; var.(y) = 9 ; cov. (x , y) = 0.7
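The six parameter sets above can be redrawn with a short base-R sketch like the one below. It is not the code used for the original figures: the helper bvn_density, the grid resolution, and the 2 × 3 panel layout are assumptions made for illustration. The snippet also reports the correlation implied by each variance and covariance combination.

```r
# The six parameter sets listed above: var(x), var(y), cov(x, y); all means are zero.
cases <- data.frame(
  vx  = c(1, 1, 1,   1,     1,    1),
  vy  = c(1, 9, 1,   1,     1,    9),
  cxy = c(0, 0, 0.5, -0.75, 0.99, 0.7)
)

# Implied correlation: cov(x, y) / sqrt(var(x) * var(y)).
cases$rho <- with(cases, cxy / sqrt(vx * vy))
print(cases)

# Zero-mean bivariate normal density (base R, no extra packages).
bvn_density <- function(x, y, vx, vy, cxy) {
  det_S <- vx * vy - cxy^2
  q <- (vy * x^2 - 2 * cxy * x * y + vx * y^2) / det_S
  exp(-q / 2) / (2 * pi * sqrt(det_S))
}

# One contour panel per case, over -3 <= x, y <= 3.
g  <- seq(-3, 3, length.out = 101)
op <- par(mfrow = c(2, 3))
for (k in seq_len(nrow(cases))) {
  z <- outer(g, g, bvn_density,
             vx = cases$vx[k], vy = cases$vy[k], cxy = cases$cxy[k])
  contour(g, g, z, asp = 1, xlab = "x", ylab = "y",
          main = sprintf("rho = %.2f", cases$rho[k]))
}
par(op)
```

Note that in the last case the covariance of 0.7 corresponds to a correlation of only 0.7/3, because the standard deviation of y is 3.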
Now, consider a final case:
Do x and y have the same means?
Do they have the same variances?
Are they correlated – if so, positively or negatively?
© 2013, David E. Giles