Data Analysis Lectures

[This article was first published on Reimagined Invention, and kindly contributed to R-bloggers.]

This is the first theoretical post I’m writing on this blog. I’ll review some important linear algebra concepts that I use at work; these can save you a lot of time if you work with large datasets, as I do.

  • Diagonalization
  • Cayley-Hamilton Theorem
  • Time saving steps in R

Diagonalization

Let \(\renewcommand{\vec}[1]{\boldsymbol{#1}} \newcommand{\R}{\mathbb{R}} A=\begin{bmatrix}5 & 0\\ 0 & 3\end{bmatrix}\), then

\[A^2=\begin{bmatrix}5 & 0\\ 0 & 3\end{bmatrix}\begin{bmatrix}5 & 0\\ 0 & 3\end{bmatrix}=\begin{bmatrix}5^2 & 0\\ 0 & 3^2\end{bmatrix}\]

and

\[A^3=AA^2=\begin{bmatrix}5 & 0\\ 0 & 3\end{bmatrix}\begin{bmatrix}5^2 & 0\\ 0 & 3^2\end{bmatrix}=\begin{bmatrix}5^3 & 0\\ 0 & 3^3\end{bmatrix}.\]

In general,

\[A^k=\begin{bmatrix}5^k & 0\\ 0 & 3^k\end{bmatrix}\quad\text{for } k\geq 1.\]

If \(A\) admits a useful factorization of the form \(A=PDP^{-1}\), with \(D\) a diagonal matrix (\(d_{ij}=0\) if \(i\neq j\)) and \(P\) invertible, then it is easy to obtain \(A^k\), since \(A^k=PD^kP^{-1}\) and powers of a diagonal matrix are computed entrywise.
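This is easy to check in R; a minimal sketch using the 2×2 example above, where `eigen()` supplies the pieces needed for \(P\) and \(D\):

```r
# Powers of a diagonalizable matrix via A^k = P D^k P^{-1}
A <- matrix(c(5, 0, 0, 3), nrow = 2)  # the example matrix [5 0; 0 3]
eig <- eigen(A)
P <- eig$vectors                      # columns are eigenvectors of A
k <- 3
Dk <- diag(eig$values^k)              # D^k only needs entrywise powers on the diagonal
Ak <- P %*% Dk %*% solve(P)           # equals [5^3 0; 0 3^3] = [125 0; 0 27]
```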

Recall that an eigenvector of \(A\) is a nonzero vector \(\vec{v}\in\R^n\) such that \(A\vec{v}=\lambda\vec{v}\), with \(\lambda\) being an eigenvalue of \(A\).

Theorem (Diagonalization Theorem). Let \(A\) be an \(n\times n\) matrix. \(A\) is diagonalizable if and only if \(A\) has \(n\) linearly independent eigenvectors. In fact, \(A=PDP^{-1}\), with \(D\) a diagonal matrix and \(P\) an invertible matrix, if and only if the columns of \(P\) are \(n\) linearly independent eigenvectors of \(A\). In this case, the diagonal entries of \(D\) are the eigenvalues of \(A\) that correspond to the eigenvectors in \(P\).

Proof. Let \(P\in\R^{n\times n}\) be a matrix whose columns are eigenvectors of \(A\), and let \(D\) be any diagonal matrix of eigenvalues; then

\[AP=A\begin{bmatrix}\vec{v}_1 & \vec{v}_2 & \vec{v}_3 & \cdots & \vec{v}_n\end{bmatrix}=\begin{bmatrix}A\vec{v}_1 & A\vec{v}_2 & A\vec{v}_3 & \cdots & A\vec{v}_n\end{bmatrix}\]

and

\[PD=P\begin{bmatrix}\lambda_1 & 0 & \cdots & 0\\ 0 & \lambda_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda_n\end{bmatrix}=\begin{bmatrix}\lambda_1\vec{v}_1 & \lambda_2\vec{v}_2 & \cdots & \lambda_n\vec{v}_n\end{bmatrix}.\]

Suppose that \(A\) is diagonalizable; then \(A=PDP^{-1}\), and multiplying on the right by \(P\) I obtain \(AP=PD\). By substituting the expressions for \(AP\) and \(PD\) obtained above, I get

\[A\vec{v}_1=\lambda_1\vec{v}_1,\quad A\vec{v}_2=\lambda_2\vec{v}_2,\quad\ldots,\quad A\vec{v}_n=\lambda_n\vec{v}_n.\]

Given that \(P\) is invertible, its columns \(\vec{v}_1,\ldots,\vec{v}_n\) must be linearly independent. Those columns are also different from \(\vec{0}\), so \(\lambda_1,\ldots,\lambda_n\) are eigenvalues and \(\vec{v}_1,\ldots,\vec{v}_n\) are eigenvectors.

Conversely, given any \(n\) eigenvectors of \(A\), let \(P=\begin{bmatrix}\vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n\end{bmatrix}\), and let \(d_{ij}=\lambda_i\) for \(i=j\) and \(d_{ij}=0\) otherwise. Then \(AP=PD\) holds without imposing conditions on the eigenvectors. If the eigenvectors are linearly independent, then \(P\) is invertible and \(AP=PD\) implies \(A=PDP^{-1}\).
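The identity \(AP=PD\) from the proof is easy to verify numerically; here is a quick sketch in R with an arbitrary symmetric (hence diagonalizable) matrix of my own choosing, not one from the text:

```r
# Verify AP = PD for a diagonalizable matrix
A <- matrix(c(2, 1, 1, 2), nrow = 2)  # symmetric, so it has 2 independent eigenvectors
eig <- eigen(A)
P <- eig$vectors                      # eigenvectors as columns
D <- diag(eig$values)                 # eigenvalues on the diagonal
all.equal(A %*% P, P %*% D)           # TRUE: each column satisfies A v_i = lambda_i v_i
```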

Cayley-Hamilton Theorem

Statement and Proof

Let \(A\in\R^{n\times n}\). The characteristic polynomial of \(A\) is defined as

\[P_A(\lambda)=\det(\lambda I-A),\quad\lambda\in\R.\]

Theorem (Cayley-Hamilton, 1858). Every square matrix is a root of its own characteristic polynomial.

Proof. Let \(A\in\R^{n\times n}\). The characteristic polynomial of \(A\) is a polynomial in \(\lambda\) of degree \(n\), and can be written as

\[\det(\lambda I-A)=a_n\lambda^n+a_{n-1}\lambda^{n-1}+a_{n-2}\lambda^{n-2}+a_{n-3}\lambda^{n-3}+\cdots+a_1\lambda+a_0.\]

Consider the adjoint matrix of \((\lambda I-A)\). By definition of the adjoint, this is a matrix whose components are polynomials in \(\lambda\) of degree at most \(n-1\).

Hence, it is possible to write the adjoint as

\[\operatorname{adj}(\lambda I-A)=\lambda^{n-1}B_{n-1}+\lambda^{n-2}B_{n-2}+\lambda^{n-3}B_{n-3}+\cdots+\lambda B_1+B_0,\]

where each \(B_j\in\R^{n\times n}\) is a constant matrix.

By using the adjoint property I have

\[(\lambda I-A)\,\operatorname{adj}(\lambda I-A)=\det(\lambda I-A)\,I.\]

By substituting the two expansions above into this identity I obtain

\[(\lambda I-A)\left(\sum_{i=1}^{n}\lambda^{n-i}B_{n-i}\right)=\left(\sum_{i=0}^{n}a_{n-i}\lambda^{n-i}\right)I.\]

Hence,

\[(\lambda^nB_{n-1}-\lambda^{n-1}AB_{n-1})+\cdots+(\lambda B_0-AB_0)=a_n\lambda^nI+\cdots+a_0I.\]

By equating coefficients of like powers of \(\lambda\), and then multiplying each equation as indicated,

\[
\begin{aligned}
B_{n-1} &= a_nI &\quad\text{multiply by } A^n:\quad && A^nB_{n-1} &= a_nA^n\\
B_{n-2}-AB_{n-1} &= a_{n-1}I &\quad\text{multiply by } A^{n-1}:\quad && A^{n-1}B_{n-2}-A^nB_{n-1} &= a_{n-1}A^{n-1}\\
B_{n-3}-AB_{n-2} &= a_{n-2}I &\quad\text{multiply by } A^{n-2}:\quad && A^{n-2}B_{n-3}-A^{n-1}B_{n-2} &= a_{n-2}A^{n-2}\\
&\;\;\vdots\\
B_0-AB_1 &= a_1I &\quad\text{multiply by } A:\quad && AB_0-A^2B_1 &= a_1A\\
-AB_0 &= a_0I &\quad\text{multiply by } I:\quad && -AB_0 &= a_0I.
\end{aligned}
\]

By summing these last equations, the left-hand sides telescope to zero, and I obtain

\[a_nA^n+a_{n-1}A^{n-1}+\cdots+a_1A+a_0I=0,\]

which concludes the proof.
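The theorem can also be checked numerically. For a 2×2 matrix the characteristic polynomial is \(\lambda^2-\operatorname{tr}(A)\lambda+\det(A)\), so \(p(A)\) should be the zero matrix; a sketch in R with an arbitrary example matrix:

```r
# Cayley-Hamilton check for a 2x2 matrix:
# p(lambda) = lambda^2 - tr(A) * lambda + det(A), so p(A) = 0
A <- matrix(c(1, 3, 2, 4), nrow = 2)             # A = [1 2; 3 4]
pA <- A %*% A - sum(diag(A)) * A + det(A) * diag(2)
pA                                               # the 2x2 zero matrix
```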

Application: Obtain the inverse of a matrix

The Cayley-Hamilton theorem can be used, among other things, to obtain the inverse of a matrix. To do that, I propose three steps.

1st step. Obtain the characteristic polynomial of \(A\), which is defined as

\[p(\lambda)=\det(\lambda I-A).\]

2nd step. Equate the characteristic polynomial to zero and replace \(\lambda\) by \(A\), attaching \(I\) to the terms where there is no replacement. Example:

\[p(\lambda)=\lambda^2+2\lambda+1=0\;\Rightarrow\;p(A)=A^2+2A+I=0.\]

3rd step. Solve for \(I\) and remember that \(AA^{-1}=I\). Continuing the last example:

\[I=-A(A+2I)\;\Rightarrow\;A^{-1}=-(A+2I).\]

Examples

2×2 matrix

Consider the matrix

\[A=\begin{bmatrix}1 & 2\\ 3 & 4\end{bmatrix}.\]

To obtain the inverse of A these are the steps:

1st step. Obtain the characteristic polynomial of \(A\) and equate it to zero

\[p(\lambda)=\det\begin{pmatrix}\lambda-1 & -2\\ -3 & \lambda-4\end{pmatrix}=(\lambda-1)(\lambda-4)-6.\]

2nd step. Take the last result and transform it by replacing \(\lambda\) by \(A\)

\[p(\lambda)=0\;\Rightarrow\;p(A)=(A-I)(A-4I)-6I=A^2-5A-2I=0.\]

3rd step. Factor out \(A\) in the equation \(A^2-5A-2I=0\) and solve for \(I\)

\[\tfrac{1}{2}A(A-5I)=I.\]

As \(AA^{-1}=I\), I obtain \(A^{-1}=\tfrac{1}{2}(A-5I)\) and then

\[A^{-1}=\begin{bmatrix}-2 & 1\\ 3/2 & -1/2\end{bmatrix}.\]
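The result can be checked against R's built-in `solve()`:

```r
# Verify A^{-1} = (1/2)(A - 5I) for A = [1 2; 3 4]
A <- matrix(c(1, 3, 2, 4), nrow = 2)
Ainv <- 0.5 * (A - 5 * diag(2))       # the inverse from the Cayley-Hamilton steps
A %*% Ainv                            # the 2x2 identity matrix
```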

4×4 matrix

Here the method proves to be useful.

Consider the matrix

\[A=\begin{bmatrix}1 & 0 & 0 & 0\\ 2 & 1 & 0 & 0\\ 3 & 2 & 1 & 0\\ 4 & 3 & 2 & 1\end{bmatrix}.\]

Following the last steps I obtain

\[p(\lambda)=\det(\lambda I-A)=\begin{vmatrix}\lambda-1 & 0 & 0 & 0\\ -2 & \lambda-1 & 0 & 0\\ -3 & -2 & \lambda-1 & 0\\ -4 & -3 & -2 & \lambda-1\end{vmatrix}=(\lambda-1)^4.\]

By using \((x+y)^n=\sum_{k=0}^{n}\frac{n!}{(n-k)!\,k!}x^{n-k}y^k\) I obtain

\[\det(\lambda I-A)=\lambda^4-4\lambda^3+6\lambda^2-4\lambda+1.\]

By doing the replacements I obtain

\[p(\lambda)=0\;\Rightarrow\;p(A)=A^4-4A^3+6A^2-4A+I=0\;\Rightarrow\;A(-A^3+4A^2-6A+4I)=I\]

and then \(A^{-1}=-A^3+4A^2-6A+4I\).

Now I obtain the matrices that appear in this polynomial

\[A^3=\begin{bmatrix}1 & 0 & 0 & 0\\ 6 & 1 & 0 & 0\\ 21 & 6 & 1 & 0\\ 56 & 21 & 6 & 1\end{bmatrix},\quad 4A^2=\begin{bmatrix}4 & 0 & 0 & 0\\ 16 & 4 & 0 & 0\\ 40 & 16 & 4 & 0\\ 80 & 40 & 16 & 4\end{bmatrix},\quad 6A=\begin{bmatrix}6 & 0 & 0 & 0\\ 12 & 6 & 0 & 0\\ 18 & 12 & 6 & 0\\ 24 & 18 & 12 & 6\end{bmatrix}.\]

By substituting these matrices into the polynomial, I obtain the inverse of \(A\)

\[A^{-1}=\begin{bmatrix}1 & 0 & 0 & 0\\ -2 & 1 & 0 & 0\\ 1 & -2 & 1 & 0\\ 0 & 1 & -2 & 1\end{bmatrix}.\]
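As before, the polynomial expression for the inverse can be verified against `solve()` in R:

```r
# Verify A^{-1} = -A^3 + 4A^2 - 6A + 4I for the 4x4 lower triangular example
A <- matrix(c(1, 2, 3, 4,
              0, 1, 2, 3,
              0, 0, 1, 2,
              0, 0, 0, 1), nrow = 4)  # columns of the matrix above
A2 <- A %*% A
Ainv <- -A2 %*% A + 4 * A2 - 6 * A + 4 * diag(4)
Ainv                                  # matches the inverse computed by hand
```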

Time saving steps in R

A <- rbind(c(5,0,0,0),c(0,5,0,0),c(1,4,-3,0),c(-1,2,0,-3))  # example matrix
P <- eigen(A)$vectors       # eigenvectors as columns
D <- diag(eigen(A)$values)  # diagonal matrix of eigenvalues

ptm <- proc.time()
A%*%A

##      [,1] [,2] [,3] [,4]
## [1,]   25    0    0    0
## [2,]    0   25    0    0
## [3,]    2    8    9    0
## [4,]   -2    4    0    9

proc.time() - ptm

##    user  system elapsed 
##   0.003   0.000   0.003

ptm <- proc.time()
P%*%(D%*%D)%*%solve(P)

##      [,1] [,2] [,3] [,4]
## [1,]   25    0    0    0
## [2,]    0   25    0    0
## [3,]    2    8    9    0
## [4,]   -2    4    0    9

proc.time() - ptm

##    user  system elapsed 
##   0.003   0.001   0.007

Final comment

For a small matrix this can be tedious, but for a large matrix consider that in some cases you can diagonalize and save a lot of memory if you are using, for example, R or Matlab and you need a really large correlation matrix.
