Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
As with most predictive modeling or forecasting tasks, model validation is a critical requirement. Without it, the resulting models may be overfit or perform no better than coin flips. Model validation is the process of quantifying the model's performance, and thus of checking that the model's internal variable rankings are actually informative.
Below is a demonstration of the development and validation of an O-PLS-DA multivariate classification model for the famous Iris data set.
O-PLS-DA Model Validation Tutorial
- Data pretreatment and preparation
- Model optimization
- Permutation testing
- Internal cross-validation
- External cross-validation
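To make the permutation testing step in the outline concrete, here is a minimal sketch of the idea: score the model on the real labels, then re-score it many times on shuffled labels and see how often chance does at least as well. Everything here is an assumption for illustration. The nearest-centroid classifier is a simple stand-in, not O-PLS-DA; the function names are hypothetical; the data are a synthetic toy set, not Iris; and the score is plain resubstitution accuracy, where a real analysis would plug in a cross-validated statistic.

```python
import random


def nearest_centroid_accuracy(X, y):
    """Resubstitution accuracy of a nearest-centroid classifier.

    Stand-in model for illustration only; it is NOT O-PLS-DA.
    """
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    # One centroid (column-wise mean) per class.
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }

    def predict(row):
        # Assign the class whose centroid is closest (squared Euclidean distance).
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])),
        )

    hits = sum(predict(row) == label for row, label in zip(X, y))
    return hits / len(y)


def permutation_p_value(X, y, score, n_permutations=200, seed=0):
    """Empirical p-value: how often shuffled labels score >= the real labels."""
    rng = random.Random(seed)
    observed = score(X, y)
    shuffled = list(y)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between X and y
        if score(X, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms keep the estimate from ever being exactly zero.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Synthetic, perfectly separable toy data (assumed for illustration).
X = [[0.1 * i, 0.0] for i in range(20)] + [[10.0 + 0.1 * i, 0.0] for i in range(20)]
y = ["a"] * 20 + ["b"] * 20

observed, p_value = permutation_p_value(X, y, nearest_centroid_accuracy)
print(f"observed accuracy: {observed:.2f}, permutation p-value: {p_value:.4f}")
```

A small p-value says the observed performance is unlikely to arise from labels that carry no information, which is the sanity check permutation testing is meant to provide.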
The Iris data contains only 4 variables, but the sample sizes are favorable for demonstrating a two-tiered testing and training scheme (internal and external cross-validation). However, O-PLS really shines when building models with many correlated variables (coming soon).
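The two-tiered scheme can be sketched roughly as follows: first set aside a stratified external test set that is never touched during model building, then run k-fold cross-validation inside the remaining training samples. This is only an illustration of the bookkeeping. The function names, the 20% hold-out fraction, and the choice of 5 folds are assumptions, not settings taken from the tutorial; the only fact used about Iris is its layout of 150 samples, 50 per species.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction, rng):
    """Split sample indices into train/test, keeping class proportions equal."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)


def k_folds(indices, k, rng):
    """Yield (fit, validation) index lists for k-fold cross-validation.

    Folds are shuffled but not stratified, to keep the sketch short.
    """
    shuffled = indices[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


rng = random.Random(42)
# Iris layout: 150 samples, 50 per species.
labels = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50

# Outer tier: external test set, held back until the very end.
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.2, rng=rng)

# Inner tier: cross-validation using the training samples only.
inner_splits = list(k_folds(train_idx, k=5, rng=rng))

print(len(train_idx), "training samples,", len(test_idx), "external test samples")
for fit, validation in inner_splits:
    print("fit on", len(fit), "samples, validate on", len(validation))
```

The point of the separation is that model optimization only ever sees the inner splits, so the external test set gives an estimate of performance that the tuning process could not have influenced.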