Association and concordance measures
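Following the course, in order to define association measures (from Kruskal (1958)) or concordance measures (from Scarsini (1984)), define a concordance function as follows: let $(X_1,Y_1)$ be a random pair with copula $C_1$, and $(X_2,Y_2)$ an independent pair with copula $C_2$, both pairs having the same continuous margins. Then define
$$Q(C_1,C_2)=\mathbb{P}\big((X_1-X_2)(Y_1-Y_2)>0\big)-\mathbb{P}\big((X_1-X_2)(Y_1-Y_2)<0\big),$$
the so-called concordance function. Thus
$$Q(C_1,C_2)=2\,\mathbb{P}\big((X_1-X_2)(Y_1-Y_2)>0\big)-1.$$
As proved last week in class,
$$Q(C_1,C_2)=4\iint_{[0,1]^2}C_2(u,v)\,dC_1(u,v)-1.$$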
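To get a feeling for the concordance function, here is a small Monte Carlo sketch. It is only an illustration under an assumption that is not in the course notes: both pairs are bivariate Gaussian with standard normal margins, so that the two copulas are Gaussian copulas with correlations `r1` and `r2`. The helper names `rbinorm` and `Q_hat` are made up for this example.

```r
# Monte Carlo sketch of the concordance function Q(C1, C2).
# Assumption: both pairs are bivariate Gaussian with N(0,1) margins,
# so C1 and C2 are Gaussian copulas with correlations r1 and r2.
set.seed(123)

# Draw N pairs from a bivariate normal with correlation r (base R only).
rbinorm <- function(N, r) {
  Z <- matrix(rnorm(2 * N), N, 2)
  Z %*% chol(matrix(c(1, r, r, 1), 2, 2))
}

# Q = P(concordant) - P(discordant), estimated on N independent pairs of pairs.
Q_hat <- function(N, r1, r2) {
  A <- rbinorm(N, r1)  # (X1, Y1), copula C1
  B <- rbinorm(N, r2)  # (X2, Y2), copula C2, independent of (X1, Y1)
  mean(sign((A[, 1] - B[, 1]) * (A[, 2] - B[, 2])))
}

Q_hat(1e5, 0.4, 0.4)   # estimate of Q(C, C)
(2 / pi) * asin(0.4)   # Kendall's tau of a Gaussian copula with r = 0.4
```

With the same copula on both sides, the estimate should be close to the closed-form value of Kendall's tau for the Gaussian copula, $\frac{2}{\pi}\arcsin(r)$.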
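Based on that function, several concordance measures can be derived. A popular measure is Kendall’s tau, from Kendall (1938), defined as $\tau=Q(C,C)$, with $Q$ the concordance function, i.e.
$$\tau=4\iint_{[0,1]^2}C(u,v)\,dC(u,v)-1,$$
which is simply $4\,\mathbb{E}[C(U,V)]-1$, where $(U,V)$ has distribution $C$. Here, computation can be tricky. Consider the following sample,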
> set.seed(1)
> n=40
> library(mnormt)
> X=rmnorm(n,c(0,0),
+ matrix(c(1,.4,.4,1),2,2))
> U=cbind(rank(X[,1]),rank(X[,2]))/(n+1)
Then, using R's built-in cor function, we can obtain Kendall’s tau easily,
> cor(X,method="kendall")[1,2]
[1] 0.3794872
To write our own code (and to understand a bit better how that coefficient is obtained), we can use
> # all index pairs (i,j) with i<j
> i=rep(1:(n-1),(n-1):1)
> j=2:n
> for(k in 3:n){j=c(j,k:n)}
> M=cbind(X[i,],X[j,])
> # a pair is concordant when both coordinate differences have the same sign
> concordant=sum((M[,1]-M[,3])*(M[,2]-M[,4])>0)
> discordant=sum((M[,1]-M[,3])*(M[,2]-M[,4])<0)
> total=n*(n-1)/2
> (K=(concordant-discordant)/total)
[1] 0.3794872
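or the following, where $Z$ indicates that one observation dominates another in both coordinates, so that its mean estimates $\mathbb{E}[C(U,V)]$ (we’ll use the random variable $Z$ quite frequently),
> i=rep(1:n,each=n)
> j=rep(1:n,n)
> Z=((X[i,1]>X[j,1])&(X[i,2]>X[j,2]))
> (K=4*mean(Z)*n/(n-1)-1)
[1] 0.3794872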
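The same pairwise counting can be wrapped in a small function. This is just a sketch, assuming there are no ties in either coordinate; the name `kendall_tau` and the simulated sample are made up for the example, and base R is used so that no extra package is needed.

```r
# Sketch: Kendall's tau from all pairwise sign products (no ties assumed).
kendall_tau <- function(x, y) {
  n  <- length(x)
  dx <- outer(x, x, "-")   # dx[i, j] = x[i] - x[j]
  dy <- outer(y, y, "-")   # dy[i, j] = y[i] - y[j]
  s  <- sign(dx * dy)      # +1 concordant, -1 discordant, 0 on the diagonal
  # keep each unordered pair once, then divide by the number of pairs
  sum(s[upper.tri(s)]) / (n * (n - 1) / 2)
}

set.seed(1)
x <- rnorm(40)
y <- 0.4 * x + sqrt(1 - 0.4^2) * rnorm(40)
kendall_tau(x, y)
cor(x, y, method = "kendall")   # should agree when there are no ties
```

Using `outer` costs memory of order $n^2$, which is fine for a sample of this size but not for very large ones.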
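Another measure is Spearman’s rank correlation, from Spearman (1904),
$$\rho=3\,Q(C,C^\perp),$$
where $Q$ is the concordance function and the second pair has distribution $C^\perp$, the independence copula $C^\perp(u,v)=uv$.
Here, $Q(C,C^\perp)=4\iint_{[0,1]^2}uv\,dC(u,v)-1$, which leads to the following expressions
$$\rho=12\iint_{[0,1]^2}uv\,dC(u,v)-3=12\,\mathbb{E}[UV]-3=\mathrm{corr}(U,V),$$
where $(U,V)$ has distribution $C$.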
Numerically, we have the following
> cor(X,method="spearman")[1,2]
[1] 0.5388368
> cor(rank(X[,1]),rank(X[,2]))
[1] 0.5388368
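To connect the numbers with the expression $12\,\mathbb{E}[UV]-3$, here is a hedged sketch with two hypothetical helpers: `spearman_rho` uses the exact finite-sample version on ranks (assuming no ties), and `spearman_approx` plugs the pseudo-observations rank/(n+1) directly into $12\,\mathbb{E}[UV]-3$. The simulated sample is an assumption made only for the illustration.

```r
# Sketch: Spearman's rho from ranks (no ties assumed).
spearman_rho <- function(x, y) {
  n <- length(x)
  r <- rank(x); s <- rank(y)
  # Pearson correlation of the ranks, written out explicitly
  12 * sum(r * s) / (n * (n^2 - 1)) - 3 * (n + 1) / (n - 1)
}

# Plug-in version of 12 E[UV] - 3 with pseudo-observations rank/(n+1).
spearman_approx <- function(x, y) {
  n <- length(x)
  u <- rank(x) / (n + 1); v <- rank(y) / (n + 1)
  12 * mean(u * v) - 3
}

set.seed(1)
x <- rnorm(40)
y <- 0.4 * x + sqrt(1 - 0.4^2) * rnorm(40)
spearman_rho(x, y)
cor(x, y, method = "spearman")   # should match the exact version
spearman_approx(x, y)            # equals the exact value times (n-1)/(n+1)
```

The plug-in version is slightly shrunk towards zero, by a factor $(n-1)/(n+1)$, which vanishes as the sample grows.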
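Note that it is also possible to write
$$\rho=3\iint_{[0,1]^2}\big([u+v-1]^2-[u-v]^2\big)\,dC(u,v).$$
Another measure is the cograduation index, from Gini (1914), obtained by substituting an L1 norm for the L2 one in the previous expression,
$$\gamma=2\iint_{[0,1]^2}\big(|u+v-1|-|u-v|\big)\,dC(u,v).$$
Note that this index can also be defined as $\gamma=Q(C,C^+)+Q(C,C^-)$, where $Q$ is the concordance function and $C^+(u,v)=\min(u,v)$, $C^-(u,v)=\max(u+v-1,0)$ are the Fréchet-Hoeffding upper and lower bounds. Here,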
> Rx=rank(X[,1]);Ry=rank(X[,2]);
> (G=2/(n^2) *(sum(abs(Rx+Ry-n-1))-
+ sum(abs(Rx-Ry))))
[1] 0.41
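As a sanity check on the rank formula, the sketch below wraps it in a hypothetical function `gini_gamma`. One assumption differs from the code above: the normalising constant is taken as `floor(n^2/2)`, which coincides with `n^2/2` when n is even (as with n=40 here) and keeps the index equal to 1 and -1 in the two extreme cases when n is odd.

```r
# Sketch: Gini's cograduation index from ranks (no ties assumed).
# Assumption: normalisation by floor(n^2 / 2), equal to n^2 / 2 for even n.
gini_gamma <- function(x, y) {
  n <- length(x)
  p <- rank(x); q <- rank(y)
  (sum(abs(p + q - n - 1)) - sum(abs(p - q))) / floor(n^2 / 2)
}

gini_gamma(1:10, 1:10)   # comonotone sample: 1
gini_gamma(1:10, 10:1)   # countermonotone sample: -1
gini_gamma(1:9, 9:1)     # odd n, countermonotone: -1
```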
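Finally, another measure is the one from Blomqvist (1950). Let $\tilde x$ denote the median of $X$, i.e.
$$\mathbb{P}(X\le\tilde x)=\mathbb{P}(X>\tilde x)=\tfrac{1}{2},$$
and similarly let $\tilde y$ denote the median of $Y$. Then define
$$\beta=\mathbb{P}\big((X-\tilde x)(Y-\tilde y)>0\big)-\mathbb{P}\big((X-\tilde x)(Y-\tilde y)<0\big),$$
or equivalently
$$\beta=4\,C\!\left(\tfrac{1}{2},\tfrac{1}{2}\right)-1.$$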
> Mx=median(X[,1]);My=median(X[,2])
> (B=4*sum((X[,1]<=Mx)*((X[,2]<=My)))/n-1)
[1] 0.4
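The two definitions of Blomqvist's coefficient can be compared on a sample. The sketch below is only an illustration: `beta_sign` and `beta_quadrant` are hypothetical names, the simulated sample is an assumption, and the two versions are expected to coincide when n is even and there are no ties (so that no observation sits exactly at a median).

```r
# Sketch: two empirical versions of Blomqvist's beta.
# Sign version: P(same side of the medians) - P(opposite sides).
beta_sign <- function(x, y) {
  mean(sign((x - median(x)) * (y - median(y))))
}

# Quadrant version: 4 * C(1/2, 1/2) - 1, with C(1/2, 1/2) estimated
# by the proportion of points below both medians.
beta_quadrant <- function(x, y) {
  4 * mean(x <= median(x) & y <= median(y)) - 1
}

set.seed(1)
x <- rnorm(40)
y <- 0.4 * x + sqrt(1 - 0.4^2) * rnorm(40)
beta_sign(x, y)
beta_quadrant(x, y)   # same value here, since n is even and there are no ties
```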