[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Having introduced the R-index, it is time to look how it works. For this a simple example is sufficient. What happens if a product is different from another product. To make this at least slightly realistic, three products are needed. Two products will be equal, and one, the odd product, will have a varying distance. It is assumed, that the observations which a panelist does are normally distributed (sd=1) and based on these observations. Violating these assumptions is for a future time.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Again, critical values of R-indices are given by the red and blue lines (Bi and O’Mahony 1995 respectively 2007, Journal of Sensory Studies)
In the plot is is clear that R-index is more different from 50 as the odd product is more different from the others. At a distance of about 0.7 one has a fair chance to declare the odd product as different. At a distance 1.3 the odd product will almost certainly be declared different.
It is also clear that R-index is non-linear in terms of distance. With the current number of products and panelists, under 15 to 20 (or over 80 to 85) increasing the distance has less effect than closer to 50. R-index is therefore not a sensible distance measure.
The R code
library(plyr)
library(ggplot2)
makeRanksDiff <- function(prods,nrep) {
nprod <- length(prods)
inList <- lapply(1:nrep,function(x) rank(rnorm(n=nprod,mean=prods)))
data.frame(person=factor(rep(1:nrep,each=nprod)),
prod=factor(rep(1:nprod,times=nrep)),
rank=unlist(inList))
}
# R index for two products out of a cross table
tab2Rindex <- function(t1,t2) {
Rindex <- crossprod(rev(t1)[-1],cumsum(rev(t2[-1]))) + 0.5*crossprod(t1,t2)
100*Rindex/(sum(t1)*sum(t2))
}
# faster version – no labels
FastAllRindex <- function(rankExperiment) {
crst <- xtabs(~ prod + rank,data=rankExperiment)
nprod <- nlevels(rankExperiment$prod)
Rindices <- unlist( lapply(1:(nprod-1),function(p1) {
lapply((p1+1):nprod,function(p2) tab2Rindex(crst[p1,],crst[p2,])) }) )
Rindices
}
set.seed(12)
location <- seq(0,5,by=.1)
last <- lapply(location,function(xo) {
li <- lapply(1:250,function(xi) {
re <- makeRanksDiff(prod=c(0,xo,0),nrep=25)
FastAllRindex(re)
})
li2 <- as.data.frame(do.call(rbind,li))
li2$location <- xo
li2
} )
nprod <- 3
compare <- unlist(lapply(1:(nprod-1),function(p1) {
lap <- lapply((p1+1):nprod,function(p2) {
paste(‘Prod’,p2,’vs’,’Prod’,p1) })}) )
rankmatrix <- as.data.frame(do.call(rbind,last))
summy <- ddply(rankmatrix,~location,function(xo) {
what <- grep(‘location’,names(xo) ,value=TRUE,invert=TRUE)
rb <- sapply(what,function(xi) quantile(xo[,xi],c(.025,.5,.975)))
rb <- as.data.frame(t(rb))
rb$what <- compare
rb
})
g1 <- ggplot(summy,aes(location,`50%`) )
g1 <- g1 + geom_point(aes(colour= what))
g1 <- g1+ geom_errorbar(aes(ymax = `97.5%`, ymin=`2.5%`,colour=what))
g1 <- g1 + scale_y_continuous(name=’R-index’ )
g1 <- g1 + scale_x_continuous(name=’Location of odd product’)
g1 <- g1 + geom_hline(yintercept=50 + 18.57*c(-1,1),colour=’red’)
g1 <- g1 + geom_hline(yintercept=50 + 15.21*c(-1,1),colour=’blue’)
g1
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.