On Cochran Theorem (and Orthogonal Projections)

Cochran's Theorem – from "The distribution of quadratic forms in a normal system, with applications to the analysis of covariance", published in 1934 – is probably the most important one in a regression course. It is an application of a nice result on quadratic forms of Gaussian vectors. More precisely, we can prove that if \(Y\sim\mathcal{N}(0,I_d)\) is a random vector made of \(d\) independent \(\mathcal{N}(0,1)\) variables, then (i) if \(A\) is a (square) idempotent matrix, \(Y^\top AY\sim\chi^2_r\) where \(r\) is the rank of matrix \(A\), and (ii) conversely, if \(Y^\top AY\sim\chi^2_r\), then \(A\) is an idempotent matrix of rank \(r\). And just in case, \(A\) being an idempotent matrix means that \(A^2=A\), and a lot of results can be derived from that (for instance on the eigenvalues). The proof of that result (at least the (i) part) is nice: we diagonalize matrix \(A\), so that \(A=P\Delta P^\top\), with \(P\) orthogonal. Since \(A\) is an idempotent matrix, observe that \[P\Delta P^\top=A=A^2=P\Delta P^\top P\Delta P^\top=P\Delta^2P^\top\] so \(\Delta\) is a diagonal matrix satisfying \(\Delta^2=\Delta\), and the terms on the diagonal of \(\Delta\) are therefore either 0's or 1's. And because the rank of \(A\) (and of \(\Delta\)) is \(r\), there should be \(r\) 1's and \(d-r\) 0's. Now write \[Y^\top AY=Y^\top P\Delta P^\top Y=Z^\top\Delta Z\] where \(Z=P^\top Y\) satisfies \(Z\sim\mathcal{N}(0,P^\top P)\), i.e. \(Z\sim\mathcal{N}(0,I_d)\). Thus \[Z^\top\Delta Z=\sum_{i:\Delta_{i,i}=1}Z_i^2\sim\chi^2_r.\] Nice, isn't it? And there is more (that will actually be strongly connected to Cochran's theorem). Let \(A=A_1+\cdots+A_k\); then the two following statements are equivalent: (i) \(A\) is idempotent and \(\mathrm{rank}(A)=\mathrm{rank}(A_1)+\cdots+\mathrm{rank}(A_k)\); (ii) the \(A_i\)'s are idempotent and \(A_iA_j=0\) for all \(i\neq j\).
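
To get an intuition about that first result, we can check it numerically. Here is a small sketch (the idempotent matrix A, of rank 2, is built from an arbitrary orthogonal matrix, and the chi-square density is superimposed on the histogram of the quadratic forms),

set.seed(1)
d = 5
R0 = qr.Q(qr(matrix(rnorm(d*d),d,d)))   # a random orthogonal matrix
A = R0 %*% diag(c(1,1,0,0,0)) %*% t(R0) # idempotent matrix of rank r = 2
max(abs(A %*% A - A))                   # numerically zero: A is indeed idempotent
Y = matrix(rnorm(10000*d),ncol=d)       # 10,000 draws of Y ~ N(0, I_d)
quad = rowSums((Y %*% A) * Y)           # quadratic forms Y'AY
hist(quad,probability = TRUE,breaks = 50)
vu = seq(.01,15,length=601)
lines(vu,dchisq(vu,df=2),col="red")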

Now, let us talk about projections. Let \(y\) be a vector in \(\mathbb{R}^n\). Its projection on the space \(\mathcal{V}(v_1,\ldots,v_p)\) (generated by those \(p\) vectors) is the vector \(\hat y=V\hat a\) that minimizes \(\|y-Va\|\) (in \(a\)). The solution is \[\hat a=(V^\top V)^{-1}V^\top y\quad\text{and}\quad\hat y=V\hat a.\]
Matrix \(P=V(V^\top V)^{-1}V^\top\) is the orthogonal projection on \(\mathcal{V}(v_1,\ldots,v_p)\), and \(\hat y=Py\).
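
As a small illustration (a sketch, with two arbitrary vectors \(v_1,v_2\) in \(\mathbb{R}^4\)), we can construct that projection matrix in R and check that it is symmetric, idempotent, and that the residual \(y-\hat y\) is orthogonal to \(v_1\) and \(v_2\),

V = cbind(c(1,0,1,0),c(0,1,1,1))     # two (arbitrary) vectors v1, v2 in R^4
P = V %*% solve(t(V) %*% V) %*% t(V) # orthogonal projection on V(v1,v2)
max(abs(P %*% P - P))                # numerically zero: P is idempotent
max(abs(P - t(P)))                   # and symmetric
y = rnorm(4)
yhat = P %*% y                       # projection of y
t(V) %*% (y - yhat)                  # the residual is orthogonal to v1 and v2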

Now we can recall Cochran's theorem. Let \(Y\sim\mathcal{N}(\mu,\sigma^2 I_d)\) for some \(\sigma>0\) and some \(\mu\). Consider orthogonal subspaces \(F_1,\ldots,F_m\) of \(\mathbb{R}^d\), with respective dimensions \(d_i\). Let \(P_{F_i}\) be the orthogonal projection matrix on \(F_i\); then (i) the vectors \(P_{F_1}Y,\ldots,P_{F_m}Y\) are independent, with respective distributions \(\mathcal{N}(P_{F_i}\mu,\sigma^2 I_{d_i})\) (when expressed in an orthonormal basis of \(F_i\)), and (ii) the random variables \(\|P_{F_i}(Y-\mu)\|^2/\sigma^2\) are independent and \(\chi^2_{d_i}\) distributed.
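
Before visualizing the two-dimensional case, here is a quick numerical sketch of that statement, with \(\mu=0\), \(\sigma=1\), and two orthogonal subspaces of \(\mathbb{R}^4\), each of dimension 2 (spanned by the columns of an arbitrary orthogonal matrix),

set.seed(3)
d = 4
Y = matrix(rnorm(10000*d),ncol=d)    # Y ~ N(0, I_4), i.e. mu = 0 and sigma = 1
B = qr.Q(qr(matrix(rnorm(d*d),d,d))) # a random orthonormal basis of R^4
P1 = B[,1:2] %*% t(B[,1:2])          # projection on F1, spanned by the first two columns of B
P2 = B[,3:4] %*% t(B[,3:4])          # projection on F2, orthogonal to F1
N1 = rowSums((Y %*% P1)^2)           # squared norms of the projections on F1
N2 = rowSums((Y %*% P2)^2)           # and on F2
cor(N1,N2)                           # close to 0, consistent with independence
ks.test(N1,"pchisq",df=2)            # consistent with a chi-square with 2 degrees of freedom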

We can try to visualize those results. For instance, the orthogonal projection of a random vector has a Gaussian distribution. Consider a two-dimensional Gaussian vector

library(mnormt)
r = .7
s1 = 1
s2 = 1
Sig = matrix(c(s1^2,r*s1*s2,r*s1*s2,s2^2),2,2)
Sig
Y = rmnorm(n = 1000,mean=c(0,0),varcov = Sig)
plot(Y,cex=.6)
vu = seq(-4,4,length=101)
vz = outer(vu,vu,function (x,y) dmnorm(cbind(x,y), mean=c(0,0), varcov = Sig))
contour(vu,vu,vz,add=TRUE,col='blue')
abline(a=0,b=2,col="red")

Consider now the projection of points \(y=(y_1,y_2)\) on the straight line with direction vector \(u\) of slope \(a\) (say \(a=2\)). To get the projected point \(x=(x_1,x_2)\), recall that \(x_2=ax_1\) and that \(y-x\) is orthogonal to \(u\). Hence, the following code will give us the orthogonal projections

p = function(a){
  x0 = (Y[,1]+a*Y[,2])/(1+a^2)
  y0 = a*x0
  cbind(x0,y0)
}

with

P = p(2)
for(i in 1:20) segments(Y[i,1],Y[i,2],P[i,1],P[i,2],lwd=4,col="red")
points(P[,1],P[,2],col="red",cex=.7)

Now, if we look at the distribution of points on that line, we get… a Gaussian distribution, as expected,

z = sqrt(P[,1]^2+P[,2]^2)*c(-1,+1)[(P[,1]>0)*1+1]
vu = seq(-6,6,length=601)
vv = dnorm(vu,mean(z),sd(z))
hist(z,probability = TRUE,breaks = seq(-4,4,by=.25))
lines(vu,vv,col="red")

Of course, we can use the matrix representation to get the projection on \(u\) (or actually on a normalized version of that vector),

a = 2
U = c(1,a)/sqrt(a^2+1)
U
[1] 0.4472136 0.8944272
matP = U %*% solve(t(U) %*% U) %*% t(U)
matP %*% Y[1,]
           [,1]
[1,] -0.1120555
[2,] -0.2241110
P[1,]
        x0         y0 
-0.1120555 -0.2241110

(which is consistent with our manual computation). Now, in Cochran's theorem, we start with independent random variables,

Y = rmnorm(n = 1000,mean=c(0,0),varcov = diag(c(1,1)))

Then we consider the projections on \(u\) and on \(v=u^\perp\)

U = c(1,a)/sqrt(a^2+1)
matP1 = U %*% solve(t(U) %*% U) %*% t(U)
P1 = Y %*% matP1
z1 = sqrt(P1[,1]^2+P1[,2]^2)*c(-1,+1)[(P1[,1]>0)*1+1]
V = c(a,-1)/sqrt(a^2+1)
matP2 = V %*% solve(t(V) %*% V) %*% t(V)
P2 = Y %*% matP2
z2 = sqrt(P2[,1]^2+P2[,2]^2)*c(-1,+1)[(P2[,1]>0)*1+1]

We can plot those two projections

plot(z1,z2)

and observe that the two are indeed independent Gaussian variables. And (of course) their squared norms are \(\chi^2_1\) distributed.
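
We can check that last claim numerically (a small sketch, comparing the empirical distribution of \(z_1^2\) with the \(\chi^2_1\) density),

hist(z1^2,probability = TRUE,breaks = 50)
vu = seq(.01,8,length=601)
lines(vu,dchisq(vu,df=1),col="red")
cor(z1^2,z2^2)   # close to 0, consistent with independence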
