Generalized Boosted Regression with a Monotonic Marginal Effect for Each Predictor
[This article was first published on Yet Another Blog in Statistical Computing » S+/R, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In the practice of risk modeling, it is sometimes mandatory to maintain a monotonic relationship between the response and each predictor. Below is a demonstration showing how to develop a generalized boosted regression with a monotonic marginal effect for each predictor.
##################################################
# FIT A GENERALIZED BOOSTED REGRESSION MODEL     #
# FOLLOWING FRIEDMAN'S GRADIENT BOOSTING MACHINE #
##################################################

library(gbm)

data1 <- read.table("/home/liuwensui/Documents/data/credit_count.txt", header = TRUE, sep = ",")
data2 <- data1[data1$CARDHLDR == 1, -1]

# Calculate the Correlation Direction Between Response and Predictors
mono <- cor(data2[, 1], data2[, -1], method = 'spearman') / abs(cor(data2[, 1], data2[, -1], method = 'spearman'))

# Train a Generalized Boosted Regression
set.seed(2012)
m <- gbm(BAD ~ ., data = data2, var.monotone = mono, distribution = "bernoulli",
         n.trees = 1000, shrinkage = 0.01, interaction.depth = 1,
         bag.fraction = 0.5, train.fraction = 0.8, cv.folds = 5, verbose = FALSE)

# Return the Optimal # of Iterations
best.iter <- gbm.perf(m, method = "cv", plot.it = FALSE)
print(best.iter)

# Calculate Variable Importance
imp <- summary(m, n.trees = best.iter, plotit = FALSE)

# Plot Variable Importance
png('/home/liuwensui/Documents/code/imp.png', width = 1000, height = 400)
par(mar = c(3, 0, 4, 0))
barplot(imp[, 2], col = gray(0:(ncol(data2) - 1) / (ncol(data2) - 1)),
        names.arg = imp[, 1], yaxt = "n", cex.names = 1)
title(main = list("Importance Rank of Predictors", font = 4, cex = 1.5))
dev.off()

# Plot Marginal Effects of Predictors
png('/home/liuwensui/Documents/code/mareff.png', width = 1000, height = 1000)
par(mfrow = c(3, 4), mar = c(1, 1, 1, 1), pty = "s")
for (i in 1:(ncol(data2) - 1)) {
  plot.gbm(m, i, best.iter)
  rug(data2[, i + 1])
}
dev.off()
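The direction vector passed to var.monotone above is obtained by dividing each Spearman correlation by its absolute value. A minimal alternative sketch using base R's sign() is shown below on synthetic data; the data frame toy and its columns x1, x2, and y are made up for illustration and are not part of the credit data set. One difference worth noting: a correlation of exactly zero yields 0, which gbm documents as an unconstrained ("arbitrary") relationship, rather than NaN.

```r
# Synthetic data (hypothetical): y rises with x1 and falls with x2
set.seed(1)
toy <- data.frame(x1 = runif(200), x2 = runif(200))
toy$y <- 2 * toy$x1 - 3 * toy$x2 + rnorm(200, sd = 0.1)

# Spearman correlation of the response with each predictor (a 1 x p matrix)
rho <- cor(toy$y, toy[, c("x1", "x2")], method = "spearman")

# sign() maps each correlation to +1, -1, or 0;
# as.vector() drops the matrix shape so the result is a plain vector
mono <- as.vector(sign(rho))
names(mono) <- colnames(rho)
print(mono)
```

The resulting vector has one entry per predictor, in the same column order as the predictors, which is the layout var.monotone expects.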
Plot of Monotonic Marginal Effects
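The plot gives a visual check of the constraint, and a numeric check can complement it. The helper below, is_monotone(), is a hypothetical name introduced here for illustration and is not part of gbm. It is a sketch that takes marginal-effect values already ordered by the predictor, plus a direction of +1 or -1, and reports whether the values ever move against that direction.

```r
# Hypothetical helper: TRUE if `effect` never moves against `direction`
#   effect    : numeric vector of marginal-effect values, ordered by the predictor
#   direction : +1 for non-decreasing, -1 for non-increasing
#   tol       : small tolerance for floating-point noise
is_monotone <- function(effect, direction, tol = 1e-8) {
  stopifnot(direction %in% c(-1, 1))
  steps <- diff(effect)              # change between consecutive grid points
  all(direction * steps >= -tol)     # no step may go against the direction
}

# Toy inputs (made up): the first is non-decreasing, the second dips
is_monotone(c(-2.1, -2.1, -1.7, -1.2), direction = 1)
is_monotone(c(-2.1, -1.7, -1.9), direction = 1)
```

Assuming the installed version of gbm supports return.grid = TRUE in plot.gbm(), the fitted values in the returned grid could be passed to this helper together with the matching entry of mono.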