[This article was first published on Revolutions, and kindly contributed to R-bloggers].
Ricky Ho has created a 6-page PDF reference card on Big Data Machine Learning, with examples implemented in the R language. (A free registration to DZone Refcardz is required to download the PDF.) The examples cover:
- Predictive modeling overview (how to set up test and training sets in R)
- Linear regression (using lm)
- Logistic regression (using glm)
- Regression with regularization (using the glmnet package)
- Neural networks (using nnet)
- Support vector machines (using tune.svm from the e1071 package)
- Naïve Bayes models (using naiveBayes from the e1071 package)
- K-nearest-neighbors classification (using the knn function from the class package)
- Decision trees (using rpart)
- Ensembles of trees (using the randomForest package)
- Gradient boosting (using the gbm package)
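To give a flavor of the first few items on that list, here is a minimal sketch using only base R. It is not taken from the reference card: the 70/30 split, the seed, and the choice of predictor columns are assumptions made purely for illustration.

```r
# Illustrative sketch (not from the reference card): hold-out split,
# then linear and logistic regression on the built-in iris data.
set.seed(42)  # arbitrary seed, only so the split is reproducible

# Hold out roughly 30% of the rows as a test set
n <- nrow(iris)
test_idx   <- sample(n, size = round(0.3 * n))
iris_train <- iris[-test_idx, ]
iris_test  <- iris[test_idx, ]

# Linear regression with lm: predict petal width from petal length
fit_lm  <- lm(Petal.Width ~ Petal.Length, data = iris_train)
pred_lm <- predict(fit_lm, newdata = iris_test)
rmse    <- sqrt(mean((iris_test$Petal.Width - pred_lm)^2))

# Logistic regression with glm: is the flower of species virginica?
# (A single sepal measurement is used so the classes are not perfectly separable.)
iris_train$is_virginica <- as.integer(iris_train$Species == "virginica")
iris_test$is_virginica  <- as.integer(iris_test$Species == "virginica")
fit_glm <- glm(is_virginica ~ Sepal.Length, data = iris_train, family = binomial)

# type = "response" returns predicted probabilities rather than log-odds
prob       <- predict(fit_glm, newdata = iris_test, type = "response")
pred_class <- as.integer(prob > 0.5)
accuracy   <- mean(pred_class == iris_test$is_virginica)

cat("lm test RMSE:", rmse, "\n")
cat("glm test accuracy:", accuracy, "\n")
```

The same train/test pattern carries over to the other model types: fit on the training rows, call `predict` on the held-out rows, and compare against the known values.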
The examples use the traditional built-in R data sets (such as the iris data, used to create the neural network above), so there's unfortunately not much of a "big data" aspect to the reference card. But if you're just getting started with prediction and classification models in R, this cheat sheet is a useful guide.
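For readers who are just getting started, a classification run on the iris data might look something like the sketch below. It assumes the `rpart` and `class` packages are installed (both ship with standard R distributions as recommended packages); the split size, seed, and `k = 3` are arbitrary choices for illustration, not values from the reference card.

```r
# Illustrative sketch: a decision tree and k-nearest-neighbors on iris.
library(rpart)  # decision trees
library(class)  # knn

set.seed(1)  # arbitrary seed for a reproducible split
test_idx <- sample(nrow(iris), size = 50)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# Decision tree: predict Species from all four measurements
fit_tree  <- rpart(Species ~ ., data = train, method = "class")
pred_tree <- predict(fit_tree, newdata = test, type = "class")

# k-nearest-neighbors on the four numeric columns (k = 3 is an arbitrary choice)
pred_knn <- knn(train = train[, 1:4], test = test[, 1:4],
                cl = train$Species, k = 3)

# Confusion matrices: rows are predictions, columns are the true species
print(table(predicted = pred_tree, actual = test$Species))
print(table(predicted = pred_knn,  actual = test$Species))

acc_tree <- mean(pred_tree == test$Species)
acc_knn  <- mean(pred_knn  == test$Species)
```

With only 150 rows, results like these will vary noticeably with the random split, which is one more reason the built-in data sets are better suited to learning the API than to judging the methods.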
DZone Refcardz: Big Data Machine Learning Patterns for Predictive Analytics