Bayesian analysis of sensory profiling data III
[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Last week I extended my Bayesian model. So this week I wanted to test it with different data. There is one other data set with profiling data in R, french fries in the reshape package. ‘This data was collected from a sensory experiment conducted at Iowa State University in 2004. The investigators were interested in the effect of using three different fryer oils had on the taste of the fries’. In this post I try t analyze the data, however the documentation which I found is so limited, it is needed to second guess the objectives of the experiment.
Data
The data includes a treatment and a time variable. The time is in weeks and crossed with the other variables, suggesting that this is systematically varied and hence in itself a factor of interest. In the end I decided to cross treatments and time to generate a product factor with 30 levels. In addition I crossed time with a repetition factor so this effect could function as a random variable. As bonus, there are some missing data, which Stan does not like and are removed before analysis.data(french_fries,package=’reshape’)
head(french_fries)
vars <- names(french_fries)
fries <- reshape(french_fries,
idvar=vars[1:4],
varying=list(vars[5:9]),
v.names=’Score’,
direction=’long’,
timevar=’Descriptor’,
times=vars[5:9])
head(fries)
fries$Product <- interaction(fries$treatment,fries$time)
fries$Session <- interaction(fries$time,fries$rep)
fries$Descriptor <- factor(fries$Descriptor)
fries <- fries[complete.cases(fries),]
Model
The model is a variation on last week, because I there is no rounds information but there are sessions. In addition, I made the range for the shift variable a bit larger, because it appeared some panelists have extreme behavior.model1 <-
‘
data {
int
int
int
int
int
vector[nobs] y;
int
int
int
int
real maxy;
}
parameters {
matrix
vector
vector
vector
real
vector [npanelist] lpanelistvar;
vector [ndescriptor] ldescriptorvar;
real
real
}
transformed parameters {
vector [nobs] expect;
vector[npanelist] sensitivity;
real mlogsens;
real mlpanelistvar;
real mldescriptorvar;
real meanshift[ndescriptor];
real meansit[nsession];
vector [nobs] sigma;
mlogsens <- mean(logsensitivity);
for (i in 1:ndescriptor) {
meanshift[i] <- mean(shift[i]);
}
mlpanelistvar <- mean(lpanelistvar);
mldescriptorvar <- mean(ldescriptorvar);
for (i in 1:nsession) {
meansit[i] <- mean(sitting[i]);
}
for (i in 1:npanelist) {
sensitivity[i] <- pow(10,logsensitivity[i]-mlogsens);
}
for (i in 1:nobs) {
expect[i] <- profile[product[i],descriptor[i]]
*sensitivity[panelist[i]]
+ shift[descriptor[i],panelist[i]]-meanshift[descriptor[i]]
+ sitting[session[i],panelist[i]]-meansit[session[i]];
sigma[i] <- sqrt(varr
*exp(lpanelistvar[panelist[i]]-mlpanelistvar)
*exp(ldescriptorvar[descriptor[i]]-mldescriptorvar));
}
}
model {
logsensitivity ~ normal(0,0.1);
for (i in 1: ndescriptor) {
shift[i] ~ normal(0,sigmashift);
ldescriptorvar[i] ~ normal(0,1);
}
for (i in 1:nsession)
sitting[i] ~ normal(0,sigmasitting);
for (i in 1:npanelist)
lpanelistvar[i] ~ normal(0,1);
y ~ normal(expect,sigma);
}
generated quantities {
vector [npanelist] panelistsd;
vector [ndescriptor] descriptorsd;
for (i in 1:npanelist) {
panelistsd[i] <- sqrt(exp(lpanelistvar[i]));
}
for (i in 1:ndescriptor) {
descriptorsd[i] <- sqrt(exp(ldescriptorvar[i]));
}
}
‘
Additional code
This is just the stuff needed to get Stan runningdatainfries <- with(fries,list(
panelist=(1:nlevels(subject))[subject],
npanelist=nlevels(subject),
session=(1:nlevels(Session))[Session],
nsession=nlevels(Session),
product=(1:nlevels(Product))[Product],
nproduct=nlevels(Product),
descriptor=(1:nlevels(Descriptor))[Descriptor],
ndescriptor=nlevels(Descriptor),
y=Score,
nobs=length(Score),
maxy=15))
pars <- c('profile','sensitivity','shift','sitting',
‘panelistsd’,’descriptorsd’)
library(rstan)
fitfries <- stan(model_code = model1,
data = datainfries,
pars=pars,
iter = 1100,
warmup=100,
chains = 4)
Results
Traceplot is not shown, but the general plot provides some overview.Plot of the profile plot has been adapted given the design. This plot should convey the message the data contains. As time progresses the flavor becomes more rancid and pointy, less potato. This process starts at circa 4 weeks.Treatment 3 seems least affected, but the difference is minute.
samples <- extract(fitfries,'profile')$profile
nsamp <- dim(samples)[1]
profile <- expand.grid(Product=levels(fries$Product),
Descriptor=levels(fries$Descriptor))
profile$des <- as.numeric(profile$Descriptor)
profile$prod <- as.numeric(profile$Product)
profs <- as.data.frame(t(sapply(1:nrow(profile),function(i){
subsamples <-c(samples[1:nsamp,
profile$prod[i],profile$des[i]])
c(mean=mean(subsamples),quantile(subsamples,c(.1,.9)))
})))
names(profs) <- c('score','lmin','lmax')
profile <- cbind(profile,profs)
profile <- merge(profile,
unique(fries[,c(‘Product’,’treatment’,’time’)])
)
library(ggplot2)
profile$timen <- c(1:12)[profile$time]
profile <- profile[order(profile$timen),]
p1 <- ggplot(profile, aes(y=score, x=timen,color=treatment))
p1 + geom_path() +
scale_x_continuous(
breaks=1:nlevels(profile$time),
labels=levels(profile$time)) +
xlab(”) +
geom_ribbon(aes(ymin=lmin, ymax=lmax,fill=treatment),
alpha=.15,linetype=’blank’) +
facet_grid( Descriptor ~ . )
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.