Forecasting By Combining Expert Opinion

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

by Michael Helbraun

Michael is a member of the Revolution Analytics Sales Support team. In the following post, he shows how to synthesize a probability distribution from the opinions of multiple experts: an excellent way to construct a Bayesian prior.

There are many ways to forecast. Depending on whether historical data exist, and whether they show trend or seasonality, you might start with a particular technique. When good domain expertise is available, one effective method is to combine expert opinion via Monte Carlo simulation to generate a stochastic forecast. While this example combines three people's views of what the number might be, the same technique could also combine domain expertise with traditional analytic techniques such as time series, regression, or neural networks.

First we grab some estimates from our three experts:
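A table of this kind might look like the following sketch. The column names (Min, MostLikely, Max, Weighting) are the ones the simulation code refers to; the three rows of values are invented placeholders for illustration, not the contents of the actual Expert Estimates file.

```r
# Hypothetical stand-in for the expert estimates table; all values are invented.
# Column names follow the ones used by the simulation code in this post.
expertOpinion <- data.frame(
  Min        = c(80, 90, 100),
  MostLikely = c(100, 120, 130),
  Max        = c(130, 160, 150),
  Weighting  = c(1, 2, 1)
)

# Sanity checks: each triangle needs Min <= MostLikely <= Max,
# and the weights must be non-negative with a positive total.
stopifnot(
  all(expertOpinion$Min <= expertOpinion$MostLikely),
  all(expertOpinion$MostLikely <= expertOpinion$Max),
  all(expertOpinion$Weighting >= 0),
  sum(expertOpinion$Weighting) > 0
)

# Normalised weights: the probability of selecting each expert's value in a trial
normWeights <- expertOpinion$Weighting / sum(expertOpinion$Weighting)
print(normWeights)
```

Checking the inputs up front is cheap and catches an estimate whose most likely value falls outside its own range before any simulation is run.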

Next we generate a triangular distribution from each expert's opinion; then, for each trial, we randomly select one of the simulated values, with probability proportional to that expert's weighting:
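The two steps can be sketched in base R without RevoScaleR or the triangle and distr packages. This is an illustration under stated assumptions, not Michael's implementation: rtri() is a hypothetical helper that samples a triangular distribution by inverting its CDF (it assumes Min < Max), and the expert values are invented.

```r
# Hypothetical helper: draw n values from a triangular distribution with
# lower bound a, upper bound b and mode c, using the inverse-CDF method.
rtri <- function(n, a, b, c) {
  u  <- runif(n)
  Fc <- (c - a) / (b - a)                          # CDF value at the mode
  ifelse(u < Fc,
         a + sqrt(u * (b - a) * (c - a)),          # left of the mode
         b - sqrt((1 - u) * (b - a) * (b - c)))    # right of the mode
}

set.seed(42)
trials <- 1000

# Invented expert inputs (same column names as the post's data file)
experts <- data.frame(
  Min        = c(80, 90, 100),
  MostLikely = c(100, 120, 130),
  Max        = c(130, 160, 150),
  Weighting  = c(1, 2, 1)
)

# Step 1: one column of simulated values per expert (a trials x 3 matrix)
draws <- mapply(rtri, a = experts$Min, b = experts$Max, c = experts$MostLikely,
                MoreArgs = list(n = trials))

# Step 2: for each trial, pick one expert with probability proportional to
# the weighting, and keep that expert's value for the trial
w      <- experts$Weighting / sum(experts$Weighting)
pick   <- sample(nrow(experts), size = trials, replace = TRUE, prob = w)
merged <- draws[cbind(seq_len(trials), pick)]

# Summarise the merged forecast
print(quantile(merged, c(0.05, 0.50, 0.95)))
```

Selecting one expert per trial in this way makes the merged result a weighted mixture of the experts' triangular distributions, which is why it can have more than one peak when the experts disagree.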

The end result – a nicely merged stochastic estimate:

Michael's code (below) uses Revolution's RevoScaleR library. Notice that the rxSetComputeContext() function (line 22) sets up parallel computation using the resources of the local machine, and the rxExec() function in line 26 executes the rtriangle() function in parallel. By changing only the compute context, this same code could run in parallel using all of the resources of an LSF or Hadoop cluster.

###############################################################################
##                                                                           ##
##  Revolution R Enterprise - MCS Forecasting, combining expert opinion      ##
##                                                                           ##
###############################################################################
 
# Clear out memory for a fresh run and load required packages
rm(list = ls())
library(triangle); library(distr); library(ggplot2)
 
# read input parameters
bigDataDir <- "C:/Data/Demos/Datasets"
bigDataDir <- "C:/..."
inDataFile <- file.path(bigDataDir, "Expert Estimates.csv")
expertOpinion <- rxImport(inData = inDataFile)
 
View(expertOpinion)
 
# Set simulation parameters
trials <- 1000
rxOptions(numCoresToUse = -1)			
rxSetComputeContext("localpar")			
 
# create individual triangular distributions
orderedTri <- function(expertNum, trials) {
  revoFcast <- rxExec(FUN = rtriangle,
                      timesToRun = 1, n = trials,
                      a = expertOpinion$Min[expertNum],
                      b = expertOpinion$Max[expertNum],
                      c = expertOpinion$MostLikely[expertNum],
                      packagesToLoad = "triangle")
  return(revoFcast)
}
 
# create distribution for each of our experts
# (start from an empty list: testing is.na() on a multi-element list inside
#  if() is an error in recent versions of R)
revoFcast <- list()
for (i in seq_len(nrow(expertOpinion))) {
  revoFcast <- c(revoFcast, orderedTri(i, trials))
}
 
# prepare the results
revoFcast <- data.frame(revoFcast)
names(revoFcast) <- paste0("Expert", seq_len(nrow(expertOpinion)))
 
# check that the experts' draws are uncorrelated (off-diagonal values should be near 0)
cor(revoFcast)
 
# create a combined probability distribution and select a forecast value
# from the probability-weighted distribution
combinedDist <- function(trialNum) {
  cDist <- DiscreteDistribution(supp = as.double(revoFcast[trialNum, ]),
                                prob = expertOpinion$Weighting / sum(expertOpinion$Weighting))
  rD <- r(cDist)   # function that generates values from the distribution
  return(rD(1))    # generate/select 1 value
}
 
merged <- rxExec(FUN = combinedDist, trialNum = rxElemArg(c(1:trials)),
                 execObjects = c("revoFcast","expertOpinion"),
                 packagesToLoad = "distr") 
 
# add the forecast to our working data set
# (rxExec() returns a list with one value per trial; flatten it to a vector)
revoFcast$merged <- unlist(merged)
 
# look at our combined data set
View(revoFcast)
 
# restructure the data for plotting
histVals <- data.frame(
  Value  = c(revoFcast$Expert1, revoFcast$Expert2, revoFcast$Expert3, revoFcast$merged),
  Source = rep(c("Expert1", "Expert2", "Expert3", "Merged Opinion"), each = trials))
 
# draw our combined plot	
ggplot(histVals, aes(Value, fill = Source)) + geom_density(alpha = 0.25) + ggtitle("Combined Expert Opinion")
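
The idea of switching where the work runs without touching the simulation logic can also be illustrated with R's base parallel package. This is a swapped-in technique for readers without RevoScaleR, not the compute-context mechanism itself: runTrials() and pickOne() are hypothetical helpers, and the values and weights are invented.

```r
library(parallel)

# Hypothetical helper: apply FUN to each element of X, either serially
# (cl = NULL) or on the workers of a cluster created by makeCluster().
runTrials <- function(X, FUN, cl = NULL, ...) {
  if (is.null(cl)) lapply(X, FUN, ...) else parLapply(cl, X, FUN, ...)
}

# Toy per-trial task: select one of the candidate values using the weights
pickOne <- function(trialNum, values, w) {
  values[sample(length(values), size = 1, prob = w)]
}

values <- c(95, 120, 135)      # invented candidate values
w      <- c(0.25, 0.50, 0.25)  # invented, already normalised weights

useCluster <- FALSE  # set to TRUE to run on a local two-worker cluster

if (useCluster) {
  cl <- makeCluster(2)
  clusterSetRNGStream(cl, iseed = 123)  # reproducible random streams on the workers
  res <- runTrials(1:1000, pickOne, cl = cl, values = values, w = w)
  stopCluster(cl)
} else {
  set.seed(123)
  res <- runTrials(1:1000, pickOne, values = values, w = w)
}

merged <- unlist(res)
print(table(merged))
```

Only the branch that creates the cluster changes between the serial and parallel runs; the per-trial function is the same in both, which is the property the compute-context approach relies on.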

Download Expert Estimates, the small data file used to drive Michael's simulation.
