R code to accompany Real-World Machine Learning (Chapters 2-4 Updates)
Abstract
I updated the R code that accompanies Chapters 2-4 of the book “Real-World Machine Learning” by Henrik Brink, Joseph W. Richards, and Mark Fetherolf to be more consistent with the listings and figures presented in the book.
rwml-R Chapters 2-4 updated
The most notable changes to rwml-R are for Chapter 4, where multiple ROC curves are plotted for a 10-class classifier and a tile plot is generated for a tuning parameter grid search. Also, for parallel computations, the doMC package was replaced with doParallel.
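As a rough illustration of the switch to doParallel, the sketch below registers a parallel backend and runs a small foreach loop. The worker count of 2 and the toy computation are assumptions for the example, not settings taken from rwml-R.

```r
library(doParallel)  # also attaches foreach and parallel

# Start a small cluster; 2 workers is an arbitrary choice for illustration.
cl <- makeCluster(2)
registerDoParallel(cl)

# Any %dopar% loop (including those run internally by caret) now uses the cluster.
res <- foreach(i = 1:4, .combine = c) %dopar% sqrt(i)

# Release the workers when finished.
stopCluster(cl)
```

Unlike doMC, which relies on forking, a cluster created with makeCluster also works on Windows, which is one common reason for making this change.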
Plotting a series of ROC curves
To be consistent with the approach followed in the book, I’ve added listings
of R code to compute the
ROC curves and AUC values “from scratch” instead of using the ROCR
package as was done previously:
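A minimal base-R sketch of the “from scratch” idea follows. It is not the listing from rwml-R or the book. The function names roc_curve, auc, and roc_per_class, and the toy labels and scores, are hypothetical and chosen for illustration only. Each distinct score is used as a threshold, and the AUC is computed with the trapezoidal rule.

```r
# ROC points for a binary problem: labels are 0/1, scores are predicted probabilities.
roc_curve <- function(true_labels, predicted_probs) {
  # One threshold per distinct score (highest first); Inf makes the curve start at (0, 0).
  thresholds <- c(Inf, sort(unique(predicted_probs), decreasing = TRUE))
  n_pos <- sum(true_labels == 1)
  n_neg <- sum(true_labels == 0)
  tpr <- sapply(thresholds, function(t) sum(predicted_probs >= t & true_labels == 1) / n_pos)
  fpr <- sapply(thresholds, function(t) sum(predicted_probs >= t & true_labels == 0) / n_neg)
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

# Area under the ROC curve via the trapezoidal rule.
auc <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# One-vs-rest curves for a multiclass model: prob_matrix has one named column per class.
roc_per_class <- function(true_classes, prob_matrix) {
  classes <- colnames(prob_matrix)
  lapply(setNames(classes, classes), function(k) {
    roc_curve(as.integer(true_classes == k), prob_matrix[, k])
  })
}

# Toy binary example (made-up data).
labels <- c(1, 1, 0, 1, 0, 0)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
roc <- roc_curve(labels, scores)
plot(roc$fpr, roc$tpr, type = "s", xlab = "False positive rate", ylab = "True positive rate")
auc(roc)
```

For a 10-class classifier, roc_per_class would return ten such data frames, one per class, which can then be drawn on a single set of axes with lines().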
Tuning model parameters in Chapter 4
The caret package is used to tune parameters via grid search for the Support Vector Machines model with a Radial Basis Function Kernel. By setting summaryFunction = twoClassSummary in trainControl, the area under the ROC curve is used to select the optimal model. For consistency with the book, tile plots were added to illustrate the process of refining the grid for the parameter search. The tile plot for the second (refined) grid search is below.
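The grid search and tile plot described above could be sketched roughly as follows. This is an illustration under several assumptions, not the rwml-R code: it uses synthetic data from caret's twoClassSim, a small made-up grid of sigma and C values, 5-fold cross-validation, and it requires the caret, kernlab, and ggplot2 packages to be installed.

```r
library(caret)
library(ggplot2)

set.seed(1)
# Synthetic two-class data (assumption; the book uses a real dataset).
dat <- twoClassSim(200)

# twoClassSummary reports ROC (AUC), Sens, and Spec; it needs class probabilities.
ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)

# Illustrative 3 x 3 grid over the two svmRadial tuning parameters.
grid <- expand.grid(sigma = 10^seq(-3, -1, length.out = 3),
                    C = 10^seq(-1, 1, length.out = 3))

fit <- train(Class ~ ., data = dat, method = "svmRadial",
             metric = "ROC", trControl = ctrl, tuneGrid = grid)

# Tile plot of cross-validated AUC over the grid.
p <- ggplot(fit$results, aes(x = factor(C), y = factor(sigma), fill = ROC)) +
  geom_tile() +
  labs(x = "C", y = "sigma", fill = "AUC")
print(p)

# Best parameter combination according to AUC.
fit$bestTune
```

A refined search would repeat the same call with a finer grid centered on fit$bestTune.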
Feedback welcome
If you have any feedback on the rwml-R project, please leave a comment below or use the Tweet button. As with any of my projects, feel free to fork the rwml-R repo and submit a pull request if you wish to contribute. For convenience, I’ve created a project page for rwml-R with the generated HTML files from knitr.