caretEnsemble Classification example
[This article was first published on Modern Toolmaking, and kindly contributed to R-bloggers.]
Here’s a quick demo of how to fit a binary classification model with caretEnsemble. Please note that I haven’t spent as much time debugging caretEnsemble for classification models, so there are probably more bugs than in my last post. Also note that multi-class models are not yet supported.
#Setup
rm(list = ls(all = TRUE))
gc(reset=TRUE)
set.seed(1234) #From random.org
#Libraries
library(caret)
library(devtools)
install_github('zachmayer/caretEnsemble') #Install zach's caretEnsemble package
library(caretEnsemble)
#Data
library(mlbench)
dat <- mlbench.xor(500, 2)
X <- data.frame(dat$x)
Y <- factor(ifelse(dat$classes=='1', 'Yes', 'No'))
#Split train/test
train <- runif(nrow(X)) <= .66
#Setup CV Folds
#returnData=FALSE saves some space
folds=5
repeats=1
myControl <- trainControl(method='cv', number=folds, repeats=repeats,
                          returnResamp='none', classProbs=TRUE,
                          returnData=FALSE, savePredictions=TRUE,
                          verboseIter=TRUE, allowParallel=TRUE,
                          summaryFunction=twoClassSummary,
                          index=createMultiFolds(Y[train], k=folds, times=repeats))
PP <- c('center', 'scale')
#Train some models
model1 <- train(X[train,], Y[train], method='gbm', trControl=myControl,
                tuneGrid=expand.grid(.n.trees=500, .interaction.depth=15, .shrinkage = 0.01))
model2 <- train(X[train,], Y[train], method='blackboost', trControl=myControl)
model3 <- train(X[train,], Y[train], method='parRF', trControl=myControl)
model4 <- train(X[train,], Y[train], method='mlpWeightDecay', trControl=myControl, trace=FALSE, preProcess=PP)
model5 <- train(X[train,], Y[train], method='knn', trControl=myControl, preProcess=PP)
model6 <- train(X[train,], Y[train], method='earth', trControl=myControl, preProcess=PP)
model7 <- train(X[train,], Y[train], method='glm', trControl=myControl, preProcess=PP)
model8 <- train(X[train,], Y[train], method='svmRadial', trControl=myControl, preProcess=PP)
model9 <- train(X[train,], Y[train], method='gam', trControl=myControl, preProcess=PP)
model10 <- train(X[train,], Y[train], method='glmnet', trControl=myControl, preProcess=PP)
#Make a list of all the models
all.models <- list(model1, model2, model3, model4, model5, model6, model7, model8, model9, model10)
names(all.models) <- sapply(all.models, function(x) x$method)
#Best cross-validated ROC per model (higher is better, so take the max over the tuning grid)
sort(sapply(all.models, function(x) max(x$results$ROC)))
#Make a greedy ensemble - currently can only use RMSE
greedy <- caretEnsemble(all.models, iter=1000L)
sort(greedy$weights, decreasing=TRUE)
greedy$error
#Make a linear regression ensemble
linear <- caretStack(all.models, method='glm', trControl=trainControl(method='cv'))
linear$error
#Predict for test set:
library(caTools)
preds <- data.frame(sapply(all.models, function(x){predict(x, X[!train,], type='prob')[,2]}))
preds$ENS_greedy <- predict(greedy, newdata=X[!train,])
preds$ENS_linear <- predict(linear, newdata=X[!train,], type='prob')[,2]
sort(data.frame(colAUC(preds, Y[!train])))
Right now, this code fails for me if I try a model like nnet or an SVM for stacking, so there are clearly bugs to fix.
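For readers unfamiliar with the greedy ensemble used in the listing above, here is a rough base-R sketch of the general idea: forward selection with replacement, scored by RMSE as the comment in the listing suggests. This is an illustration under those assumptions and not caretEnsemble's actual implementation; `greedy_weights`, `toy`, and the toy data are hypothetical names and values made up for the example.

```r
# Greedy forward selection with replacement (a sketch, NOT caretEnsemble's code).
# preds: matrix of held-out predictions, one named column per model.
# y:     numeric target (0/1 for a binary outcome).
greedy_weights <- function(preds, y, iter = 100L) {
  rmse <- function(p) sqrt(mean((p - y)^2))
  counts <- setNames(numeric(ncol(preds)), colnames(preds))
  running_sum <- numeric(nrow(preds))
  for (i in seq_len(iter)) {
    # RMSE of the ensemble average if each candidate model were added next
    errs <- apply(preds, 2, function(p) rmse((running_sum + p) / i))
    best <- which.min(errs)
    counts[best] <- counts[best] + 1
    running_sum <- running_sum + preds[, best]
  }
  # A model's weight is the share of iterations in which it was picked
  counts / sum(counts)
}

# Toy data (made up): model "a" is nearly perfect, model "b" is pure noise
set.seed(1)
y <- rep(c(0, 1), each = 50)
toy <- cbind(a = y * 0.98 + 0.01, b = runif(100))
w <- greedy_weights(toy, y, iter = 50L)
print(w)
```

In this toy case every iteration picks model `a`, so it ends up with all of the weight. A procedure like this has no reason to add a weaker model when one candidate is already close to perfect on the data used for selection.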
The greedy model relies 100% on the gbm, which makes sense as the gbm has an AUC of 1 on the training set. The linear model uses all of the models, and achieves an AUC of .5. This is a little weird, as the gbm, rf, SVM, and knn all achieve an AUC close to 1.0 on the training set, and I would have expected the linear model to focus on these predictions. I’m not sure if this is a bug, or a failure of my stacking model.
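One way to start narrowing this down might be to recompute AUC by hand on the stacked predictions and check what kind of scores would produce exactly 0.5. The sketch below uses the rank-sum (Mann-Whitney) identity in base R. The helper `auc_by_rank` and the vectors `y` and `p_yes` are made up for illustration; they are not results from this post, and this is not a diagnosis of the actual problem.

```r
# AUC via the rank-sum (Mann-Whitney) identity; tied scores get average ranks.
auc_by_rank <- function(score, is_positive) {
  r <- rank(score)
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up labels and scores, for illustration only
y     <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
p_yes <- c(0.9, 0.8, 0.4, 0.6, 0.2, 0.1)

auc_good     <- auc_by_rank(p_yes, y)       # scores for the positive class: 8/9
auc_flipped  <- auc_by_rank(1 - p_yes, y)   # scores for the wrong class: 1/9
auc_constant <- auc_by_rank(rep(0.5, 6), y) # same score for every row: 0.5

print(c(good = auc_good, flipped = auc_flipped, constant = auc_constant))
```

A stack that returns the same probability for every row scores exactly 0.5, and reading the wrong probability column gives one minus the true AUC, so both are worth ruling out before blaming the meta-model itself.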