Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Problem
When a correlation or covariance matrix is not positive definite (i.e., in instances when some or all eigenvalues are negative), a cholesky decomposition cannot be performed. Sometimes, these eigenvalues are very small negative numbers and occur due to rounding or due to noise in the data. In simulation studies a known/given correlation has to be imposed on an input dataset. In such cases one has to deal with the issue of making a correlation matrix positive definite. Following are papers in the field of stochastic precipitation where such matrices are used.
- FP Brissette, M Khalili, R Leconte, Journal of Hydrology, 2007, “Efficient stochastic generation of multi-site synthetic precipitation data”
- GA Baigorria, JW Jones, Journal of Climate, 2010, “GiST: A stochastic model for generating spatially and temporally correlated daily rainfall data”
- M Mhanna and W Bauwens, International Journal of Climatology, 2012, “A stochastic space-time model for the generation of daily rainfall in the Gaza Strip”
A Solution
The paper by Rebonato and Jackel, “The most general methodology for creating a valid correlation matrix for risk management and option pricing purposes”, Journal of Risk, Vol 2, No 2, 2000, presents a methodology to create a positive definite matrix out of a non-positive definite matrix.
Below is my attempt to reproduce the example from Rebonato and Jackel (2000). The correlation matrix below is from the example.
origMat <- array(c(1, 0.9, 0.7, 0.9, 1, 0.3, 0.7, 0.3, 1), dim = c(3, 3)) origEig <- eigen(origMat) origEig$values ## [1] 2.296728 0.710625 -0.007352
As you can see, the third eigenvalue is negative. Trying a cholesky decomposition on this matrix fails, as expected.
cholStatus <- try(u <- chol(origMat), silent = FALSE) cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE)
Here, I use the method of Rebonato and Jackel (2000), as elaborated by Brissette et al. (2007), to fix the correlation matrix. As per the method, replace the negative eigenvalues with 0 (or a small positive number as Brissette et al. 2007 suggest), then normalize the new vector.
# fix the correl matrix newMat <- origMat iter <- 0 while (cholError) { iter <- iter + 1 cat("iteration ", iter, "\n") # replace -ve eigen values with small +ve number newEig <- eigen(newMat) newEig2 <- ifelse(newEig$values < 0, 0, newEig$values) # create modified matrix eqn 5 from Brissette et al 2007, inv = transp for # eig vectors newMat <- newEig$vectors %*% diag(newEig2) %*% t(newEig$vectors) # normalize modified matrix eqn 6 from Brissette et al 2007 newMat <- newMat/sqrt(diag(newMat) %*% t(diag(newMat))) # try chol again cholStatus <- try(u <- chol(newMat), silent = TRUE) cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE) } ## iteration 1 ## iteration 2 # final check eigen(newMat)$values ## [1] 2.290e+00 7.096e-01 -1.332e-15
Unresolved Issue(?)
However, as you can see, the third eigenvalue is still negative (but very close to zero). The “chol” function in R is not giving an error probably because this negative eigenvalue is within the “tolerance limits”. I would like to know what these “tolerance limits” are.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.