Gini coefficient, concentration measurement: an implementation in R

[This article was first published on The Beginner Programmer, and kindly contributed to R-bloggers].

Another topic we covered in statistics class was the Gini index.

The Gini index (also called the Gini ratio or Gini coefficient) measures how concentrated a transferable quantity, such as income or stock ownership, is among the members of a population.
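As a rough sketch of the quantity involved, assume the data come as distinct values x_i with frequencies f_i, sorted from smallest to largest (these symbols are labels chosen here for illustration). The normalized concentration ratio R that the script in this post computes can then be written as:

```latex
\begin{aligned}
N &= \sum_{i=1}^{k} f_i, \qquad
p_i = \frac{1}{N}\sum_{j=1}^{i} f_j, \qquad
q_i = \frac{\sum_{j=1}^{i} x_j f_j}{\sum_{j=1}^{k} x_j f_j}, \qquad
p_0 = q_0 = 0 \\
A &= \sum_{i=1}^{k} \frac{(p_i - p_{i-1})\,(q_{i-1} + q_i)}{2} \\
R &= \frac{\tfrac{1}{2} - A}{\dfrac{N-1}{2N}}
\end{aligned}
```

Here A is the area under the concentration curve through the points (p_i, q_i), obtained with the trapezoid rule, and (N-1)/(2N) is the largest value that 1/2 - A can take, so R runs from 0 (no concentration) to 1 (maximum concentration).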

For example, say you are evaluating a company and you'd like to know how its shares are divided among the shareholders. You could use the Gini index for that!
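To make the shareholder example concrete, here is a minimal sketch of the points of a concentration curve. The numbers are made up for illustration and the names `shares`, `p`, `q` and `lorenz` are hypothetical; none of them come from the data used in this post.

```r
# Hypothetical data: 'value' is the number of shares held and
# 'freq' is how many shareholders hold that amount (made-up numbers).
shares <- data.frame(
  value = c(10, 50, 200, 1000),
  freq  = c(60, 25, 10, 5)
)

# Sort by value so the curve runs from the smallest holders to the largest
shares <- shares[order(shares$value), ]

# Cumulative share of shareholders (x axis of the concentration curve)
p <- cumsum(shares$freq) / sum(shares$freq)

# Cumulative share of all the shares in circulation (y axis)
amount <- shares$value * shares$freq
q <- cumsum(amount) / sum(amount)

lorenz <- data.frame(value = shares$value, p = p, q = q)
print(lorenz)
```

In this made-up example the smallest 60% of shareholders hold under 7% of the shares, which is the kind of imbalance the index summarizes in a single number.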

I’ve calculated the index using R and random data you can download here. In case you’d like to know more about the Gini index, check here.


Here is my simple R implementation of the index.

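# Load data: column 1 holds the values, column 2 their frequencies
tb <- read.table("C:\\b.txt", header=TRUE, sep=",")
# Add 5 new columns for analysis purposes
for(i in 1:5)
{
  # cbind() returns a new object, so the result has to be assigned back
  tb <- cbind(tb, 0)
}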
# Storing the number of rows and columns
nRow <- nrow(tb)
nCol <- ncol(tb)
# Cumulative frequencies (column 3) and cumulative relative frequencies (column 4)
i <- 1
totalF <- sum(tb[,2])
while(i <= nRow)
{
  if(i==1)
  {
    tb[1,3] <- tb[1,2]
  }else{
    tb[i,3] <- tb[i-1,3]+tb[i,2]
  }
  # Divide by the actual total rather than a hard-coded 1000
  tb[i,4] <- tb[i,3]/totalF
  i <- i + 1
}
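# Amount per row (column 5) and its running total (column 6)
i <- 1
while(i <= nRow)
{
  tb[i,5] <- tb[i,1]*tb[i,2]
  if(i==1)
  {
    tb[i,6] <- tb[i,5]
  }else{
    tb[i,6] <- tb[i-1,6]+tb[i,5]
  }
  i <- i + 1
}
# Cumulative relative amount (column 7)
totalAmount <- sum(tb[,5])
i <- 1
while(i <= nRow)
{
  tb[i,7] <- tb[i,6]/totalAmount
  i <- i + 1
}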
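# Show and plot the data
tb
# Line of perfect equality (green) against the concentration curve (red)
eqX <- c(0,1)
eqY <- c(0,1)
curveX <- c(0,tb[,4])
curveY <- c(0,tb[,7])
# Axis labels belong in plot(); lines() only adds to an existing plot
plot(eqX,eqY,main="Concentration",type="l",col="green",lwd=2,xlab="Cumulative relative frequency",ylab="Cumulative relative amount")
lines(curveX,curveY,type="b",col="red",lwd=2)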
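# Calculate Gini's R concentration index
getR <- function(mat)
{
  # Work on the argument rather than the global tb and nRow
  n <- nrow(mat)
  # Area under the concentration curve, by the trapezoid rule
  area <- 0.5*mat[1,4]*mat[1,7]
  i <- 2
  while(i <= n)
  {
    area <- area + 0.5*(mat[i,4]-mat[i-1,4])*(mat[i-1,7]+mat[i,7])
    i <- i + 1
  }
  # Largest possible area between the equality line and the curve
  acmax <- (sum(mat[,2])-1)/(2*sum(mat[,2]))
  R <- (0.5 - area)/acmax
  return(R)
}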
# Print data
paste("Concentration index R is: ",getR(tb)*100,"%")
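The same calculation can also be sketched without explicit loops. The function below is a hypothetical helper (`gini_r` is not part of the script above) and it assumes the input is a vector of values `x` with matching frequencies `f`. It sorts the values first, which the script above leaves to the input file.

```r
# gini_r: hypothetical vectorized helper, not part of the original script.
# x = observed values, f = how many times each value occurs.
gini_r <- function(x, f) {
  # Sort by value so the curve is built from smallest to largest
  ord <- order(x)
  x <- x[ord]
  f <- f[ord]

  n <- sum(f)
  # Cumulative relative frequencies and amounts, both starting at 0
  p <- c(0, cumsum(f) / n)
  q <- c(0, cumsum(x * f) / sum(x * f))

  # Area under the concentration curve (trapezoid rule)
  area <- sum(diff(p) * (q[-length(q)] + q[-1]) / 2)

  # Same normalization as the script above: (N - 1) / (2N)
  max_area <- (n - 1) / (2 * n)
  (0.5 - area) / max_area
}

# Everyone holds the same amount: no concentration
equal_case <- gini_r(c(5, 5, 5), c(10, 10, 10))

# One unit out of ten holds everything: maximum concentration
extreme_case <- gini_r(c(0, 100), c(9, 1))

print(c(equal = equal_case, extreme = extreme_case))
```

The two calls at the end are a quick sanity check: identical values should give an index of about 0, and a single holder owning everything should give about 1, up to floating point error.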

Here are the results.

It looks like the data I used shows a concentration of about 24%. Cool!
