Machine Learning : Workflow
[This article was first published on K & L Fintech Modeling, and kindly contributed to R-bloggers].
This post gives a brief introduction to the workflow of a machine learning model and the most widely used R packages before diving into the details.
Given a problem to be solved, all machine learning (ML) models take the same input but produce different outputs. It is, therefore, useful to understand a common workflow for ML models. There is no single workflow but a variety of them, so we introduce one of them here.
Sample Splitting
Construction of an ML model starts with sample splitting. The most commonly used technique is K-fold cross validation with random shuffling. For time-series or panel data, K-fold cross validation without random shuffling is used to preserve the temporal sequence (future data cannot be used as a predictor of past data). This method is called K-fold forward chaining cross validation, or forward chaining for short. The two cross validations are illustrated in the following figures.
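To make the two splitting schemes concrete, here is a minimal base R sketch. The helper names `kfold_shuffled` and `forward_chaining`, and the values of `n` and `k`, are our own illustrative assumptions. The forward chaining shown is one common variant (an expanding training window, giving k - 1 splits); other definitions exist.

```r
# Minimal sketch of the two splitting schemes (base R only).
# n, k and both helper names are illustrative assumptions.
set.seed(1)
n <- 20   # number of observations, assumed to be in time order
k <- 5    # number of folds

# K-fold with random shuffling:
# shuffle the row indices, then deal them into k folds.
kfold_shuffled <- function(n, k) {
  idx <- sample(n)                               # random permutation of 1..n
  fold_id <- rep(seq_len(k), length.out = n)     # fold label for each shuffled slot
  lapply(seq_len(k), function(i) {
    test <- sort(idx[fold_id == i])
    list(train = setdiff(seq_len(n), test), test = test)
  })
}

# Forward chaining (no shuffling):
# cut the sample into k contiguous blocks; split i trains on
# blocks 1..i and tests on block i + 1, so training data always
# precede test data in time.
forward_chaining <- function(n, k) {
  block <- sort(rep(seq_len(k), length.out = n)) # contiguous block labels
  lapply(seq_len(k - 1), function(i) {
    list(train = which(block <= i), test = which(block == i + 1))
  })
}

shuffled_splits <- kfold_shuffled(n, k)
chained_splits  <- forward_chaining(n, k)
```

In the shuffled version every observation lands in exactly one test fold. In the forward chaining version the first block is never used for testing and the training window grows with each split.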
Workflow of Machine Learning
Although there are many alternatives for each step, most ML models have the following workflow in common.
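As a rough illustration, one frequently used arrangement of the steps can be sketched in base R: hold out a test set, cross-validate a hyperparameter on the training set, refit with the best value, and evaluate once on the test set. This ordering, the simulated data, and the use of a polynomial degree in `lm` as the "hyperparameter" are all our assumptions for the sketch, not a prescribed recipe.

```r
# Sketch of a generic ML workflow in base R.
# Data, model and hyperparameter are illustrative assumptions.
set.seed(42)
n <- 200
x <- runif(n, -2, 2)
y <- sin(2 * x) + rnorm(n, sd = 0.3)
dat <- data.frame(x = x, y = y)

# 1) Hold out a test set that is not touched until the end.
test_idx <- sample(n, size = 40)
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# 2) K-fold cross validation on the training set for each
#    candidate hyperparameter value (here: polynomial degree).
k <- 5
fold_id <- sample(rep(seq_len(k), length.out = nrow(train)))
degrees <- 1:6

cv_rmse <- sapply(degrees, function(d) {
  errs <- sapply(seq_len(k), function(i) {
    fit  <- lm(y ~ poly(x, d), data = train[fold_id != i, ])
    pred <- predict(fit, newdata = train[fold_id == i, ])
    sqrt(mean((train$y[fold_id == i] - pred)^2))
  })
  mean(errs)   # average validation error over the k folds
})

# 3) Refit on the full training set with the best value.
best_degree <- degrees[which.min(cv_rmse)]
final_fit <- lm(y ~ poly(x, best_degree), data = train)

# 4) Evaluate once on the held-out test set.
test_rmse <- sqrt(mean((test$y - predict(final_fit, newdata = test))^2))
```

For time-series or panel data, the random `fold_id` in step 2 would be replaced by forward chaining splits.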
Hyperparameters and R packages
R provides many ML packages, which are updated irregularly. For some selected ML models, we use representative, time-tested, and widely used R packages in the following way.
Here, the selected ML models are Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), Artificial Neural Network (ANN), Gradient Boosting (GBoost) and Extreme Gradient Boosting (XGBoost). Numerical values for the hyperparameters of each ML model are presented as examples and are not absolute.
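Candidate hyperparameter values are usually collected in a grid before tuning. The base R sketch below builds such grids with `expand.grid`. The parameter names follow arguments of the `randomForest`, `e1071` and `xgboost` packages, but the choice of these packages and all numerical values are our assumptions for illustration only.

```r
# Sketch: hyperparameter grids as data frames (base R only).
# Package choice and all values are illustrative assumptions.
grids <- list(
  # Random Forest: number of trees and variables tried per split
  RF = expand.grid(ntree = c(100, 500), mtry = c(2, 4)),
  # SVM with a radial kernel: cost and kernel width
  SVM = expand.grid(cost = c(0.1, 1, 10), gamma = c(0.01, 0.1)),
  # XGBoost: learning rate, tree depth and boosting rounds
  XGBoost = expand.grid(eta = c(0.05, 0.1),
                        max_depth = c(3, 6),
                        nrounds = c(100, 300))
)

# Each row of a grid is one candidate setting to be cross-validated.
n_candidates <- sapply(grids, nrow)
```

The number of candidates grows multiplicatively with each added hyperparameter, which is why the grids are kept small in practice.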
Concluding Remarks
Based on this workflow, we will investigate each ML model and implement it step by step with R ML packages in a series of upcoming posts. \(\blacksquare\)