Gini coefficient, concentration measurement: an implementation in R

Posted on December 17, 2014 by Mic in R bloggers | 0 Comments

[This article was first published on The Beginner Programmer, and kindly contributed to R-bloggers.]

Another subject we took in the statistics class was the Gini index.

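The Gini index (also called the Gini ratio or Gini coefficient) measures how concentrated a transferable quantity, such as income or shares, is among the members of a group. It is 0 when everyone holds the same amount and grows toward 1 as the total ends up in fewer hands.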

For example, say you are evaluating a company and you’d like to know more about how the shares are divided among the shareholders. You could use the Gini index for that!
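To make the shareholder example concrete, here is a minimal sketch with made-up numbers. The helper `gini_simple` and both vectors are hypothetical and are not the data used in this post. It uses the standard sorted-values formula, G = sum((2i - n - 1) * x_i) / (n * sum(x)), which is a different route from the trapezoid approach in the implementation further down.

```r
# Gini coefficient of a vector of non-negative holdings (hypothetical helper).
gini_simple <- function(x) {
  x <- sort(x)                 # the formula needs the values in ascending order
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

equal_split  <- c(25, 25, 25, 25)   # every shareholder owns the same amount
one_dominant <- c(0, 0, 0, 100)     # a single shareholder owns everything

gini_simple(equal_split)    # 0: no concentration
gini_simple(one_dominant)   # 0.75, i.e. (n - 1) / n, the maximum for n = 4
```

With only n holders the unnormalized index tops out at (n - 1) / n rather than 1, which is why a normalizing factor is often applied.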

I’ve calculated the index in R on some random data, which you can download here. If you’d like to know more about the Gini index, check here.
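If you do not have the data file at hand, a compatible one can be generated. The sketch below is only a stand-in: it assumes the layout the code expects (a header row, comma separators, class values in ascending order in the first column and absolute frequencies in the second). The column names, the ten classes, the total of 1000 observations and the temporary file path are all arbitrary choices.

```r
# Build a stand-in data file (assumed layout: value, frequency).
set.seed(42)                                    # reproducible draw
value <- 1:10                                   # hypothetical classes, ascending
freq  <- as.vector(rmultinom(1, size = 1000, prob = rep(1, 10)))  # sums to 1000
tb_demo <- data.frame(value = value, freq = freq)

# Write a comma-separated file with a header row to a temporary location
demo_path <- file.path(tempdir(), "b.txt")
write.table(tb_demo, demo_path, sep = ",", row.names = FALSE, quote = FALSE)

# Read it back the same way the script does
head(read.table(demo_path, header = TRUE, sep = ","))
```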

Here is my simple R implementation of the index.

	# Load data
	tb <- read.table("C:\\b.txt",header=TRUE,sep=",")


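	# Add 5 new columns (3 to 7) for analysis purposes.
	# This assumes tb has exactly two columns: the value and its frequency.
	for(name in c("cumFreq","relCumFreq","amount","cumAmount","relCumAmount"))
	{
	  tb[[name]] <- 0
	}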

	# Storing the number of rows and columns
	nRow <- nrow(tb)
	nCol <- ncol(tb)


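	# Cumulative frequencies: column 3 is the absolute cumulative frequency,
	# column 4 the relative one (x axis of the concentration curve)
	i <- 1
	totalF <- sum(tb[,2])
	while(i <= nRow)
	{
	  if(i==1)
	  {
	    tb[1,3] <- tb[1,2]
	  }else{
	    tb[i,3] <- tb[i-1,3]+tb[i,2]
	  }
	  # Divide by the total frequency instead of a hard-coded 1000
	  tb[i,4] <- tb[i,3]/totalF
	  i <- i + 1
	}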

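	# Amount held by each class (value * frequency) in column 5
	# and its running total in column 6
	i <- 1
	while(i <= nRow)
	{
	  tb[i,5] <- tb[i,1]*tb[i,2]
	  if(i==1)
	  {
	    tb[i,6] <- tb[i,5]
	  }else{
	    tb[i,6] <- tb[i-1,6]+tb[i,5]
	  }
	  i <- i + 1
	}

	# Relative cumulative amount in column 7 (y axis of the concentration curve)
	totalAmount <- sum(tb[,5])
	i <- 1
	while(i <= nRow)
	{
	  tb[i,7] <- tb[i,6]/totalAmount
	  i <- i + 1
	}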

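	# Show and plot the data
	print(tb)
	a <- c(0,1)
	b <- c(0,1)
	curveX <- c(0,tb[,4])
	curveY <- c(0,tb[,7])
	# Green diagonal: perfectly even distribution. Red curve: observed concentration.
	# Axis labels belong in plot(); lines() does not draw them.
	plot(a,b,main="Concentration",type="l",col="green",lwd=2,
	     xlab="Relative cumulative frequency",ylab="Relative cumulative amount")
	lines(curveX,curveY,type="b",col="red",lwd=2)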


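	# Calculate Gini's R concentration index.
	# mat: table with frequencies in column 2, relative cumulative frequency
	# in column 4 and relative cumulative amount in column 7
	getR <- function(mat)
	{
	  n <- nrow(mat)
	  # Area under the concentration curve (trapezoid rule), starting at (0,0)
	  area <- 0.5*mat[1,4]*mat[1,7]
	  i <- 2
	  while(i <= n)
	  {
	    area <- area + 0.5*(mat[i,4]-mat[i-1,4])*(mat[i-1,7]+mat[i,7])
	    i <- i + 1
	  }
	  # Largest possible area between the diagonal and the curve
	  acmax <- (sum(mat[,2])-1)/(2*sum(mat[,2]))
	  R <- (0.5 - area)/acmax
	  return(R)
	}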

	# Print data
	paste("Concentration index R is: ",getR(tb)*100,"%")

Here are the results.

It looks like the data I used shows a 24% concentration. Cool!
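For comparison, the same trapezoid calculation can be sketched without explicit loops by using `cumsum()` and `diff()`. This is a rough alternative, not the code that produced the result above. The function name `gini_R` and the two toy inputs are hypothetical, and the class values are assumed to be sorted in ascending order.

```r
# Vectorized sketch of the concentration ratio R (hypothetical helper).
# value: class values in ascending order; freq: their absolute frequencies.
gini_R <- function(value, freq) {
  N <- sum(freq)
  p <- cumsum(freq) / N                          # relative cumulative frequency
  q <- cumsum(value * freq) / sum(value * freq)  # relative cumulative amount
  # Area under the concentration curve (trapezoid rule), starting at (0, 0)
  area <- sum(diff(c(0, p)) * (c(0, q[-length(q)]) + q) / 2)
  (0.5 - area) / ((N - 1) / (2 * N))             # divide by the maximum area
}

gini_R(value = c(10, 10, 10), freq = c(1, 1, 1))  # equal amounts: 0 (up to rounding)
gini_R(value = c(0, 100),     freq = c(3, 1))     # one unit holds everything: 1
```

Dividing by (N - 1) / (2N) is what lets the index reach exactly 1 when a single unit holds the whole total.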
