Machine Learning : Workflow

[This article was first published on K & L Fintech Modeling, and kindly contributed to R-bloggers].

This post gives a brief introduction to a typical workflow for machine learning models and to the most commonly used R packages, before later posts dive into the details.



Given a problem to be solved, all machine learning (ML) models take the same input but produce different output. It is therefore useful to understand a workflow that ML models have in common. Since there is no single workflow but a variety of them, we introduce one of them here.


Sample Splitting


Construction of an ML model starts with sample splitting. The most commonly used technique is K-fold cross validation with random shuffling. For time-series or panel data, K-fold cross validation without random shuffling is used to preserve the temporal sequence (future data cannot be used as a predictor of past data). This method is called K-fold forward chaining cross validation, or forward chaining for short. The two cross validations are illustrated in the following figures.
[Figure: cross validation in machine learning]
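
The two splitting schemes described above can be sketched in base R as follows. This is a minimal illustration, not the author's implementation: the function names kfold_shuffled and forward_chaining are hypothetical, and the forward chaining variant assumes the sample is cut into k + 1 contiguous blocks, with blocks 1 to i used for training and block i + 1 used for testing.

```r
# Shuffled K-fold: randomly assign each of n observations to one of k folds.
# Each fold serves once as the test set; the remaining folds form the training set.
kfold_shuffled <- function(n, k, seed = 1) {
  set.seed(seed)
  fold_id <- sample(rep(seq_len(k), length.out = n))
  lapply(seq_len(k), function(i) {
    list(train = which(fold_id != i), test = which(fold_id == i))
  })
}

# Forward chaining (assumed variant): keep the time order, cut the sample
# into k + 1 contiguous blocks, train on blocks 1..i and test on block i + 1.
forward_chaining <- function(n, k) {
  block_id <- ceiling(seq_len(n) * (k + 1) / n)
  lapply(seq_len(k), function(i) {
    list(train = which(block_id <= i), test = which(block_id == i + 1))
  })
}

shuffled <- kfold_shuffled(n = 20, k = 5)
chained  <- forward_chaining(n = 20, k = 4)

# In forward chaining every training index precedes every test index.
all(sapply(chained, function(s) max(s$train) < min(s$test)))
```

With shuffling, the test folds together cover every observation exactly once. With forward chaining, the training window grows from fold to fold while the test block always lies strictly after it, so no future observation leaks into training.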

Workflow of Machine Learning


Although there are many alternatives for each step, most ML models have the following workflow in common.
[Figure: Workflow of Machine Learning]
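
As a rough sketch of what such a workflow can look like in code, the example below assumes four steps: hold out a test set, use K-fold cross validation on the training set to choose a hyperparameter, refit on the full training set, and evaluate on the test set. These steps are an assumption made for illustration and may differ from the workflow in the figure. The model is also a stand-in: a polynomial regression on the built-in mtcars data, with the polynomial degree playing the role of the hyperparameter, rather than one of the ML models covered in this series.

```r
set.seed(42)
data <- mtcars
n <- nrow(data)

# Step 1 (assumed): hold out a test set that is never used for tuning.
test_idx <- sample(n, size = 8)
train <- data[-test_idx, ]
test  <- data[test_idx, ]

# Step 2 (assumed): K-fold cross validation on the training set
# for each candidate hyperparameter value (here: polynomial degree).
k <- 4
fold_id <- sample(rep(seq_len(k), length.out = nrow(train)))
degrees <- 1:3
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

cv_rmse <- sapply(degrees, function(d) {
  mean(sapply(seq_len(k), function(i) {
    fit <- lm(mpg ~ poly(hp, d), data = train[fold_id != i, ])
    rmse(train$mpg[fold_id == i],
         predict(fit, newdata = train[fold_id == i, ]))
  }))
})

# Step 3 (assumed): refit on the full training set with the best value.
best_degree <- degrees[which.min(cv_rmse)]
final_fit <- lm(mpg ~ poly(hp, best_degree), data = train)

# Step 4 (assumed): report out-of-sample error on the held-out test set.
test_rmse <- rmse(test$mpg, predict(final_fit, newdata = test))
```

For time-series or panel data, the random fold assignment in step 2 would be replaced by forward chaining so that the temporal order is preserved.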

Hyperparameters and R packages


R provides many ML packages, which are updated irregularly. For a selection of ML models, we use representative, time-tested and widely used R packages, as follows.
[Figure: hyperparameters of Machine Learning R packages]
Here, the selected ML models are Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), Artificial Neural Network (ANN), Gradient Boosting (GBoost) and Extreme Gradient Boosting (XGBoost). The numerical values given for the hyperparameters of each ML model are examples only and are not definitive.
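
Because such hyperparameter values are only examples, in practice several candidate values are usually compared. A minimal way to enumerate candidates in base R is expand.grid, sketched below. The hyperparameter names n_trees, max_depth and learning_rate and their values are hypothetical placeholders for a boosting-type model; they are not taken from the table above or from any particular package.

```r
# Hypothetical candidate values; names and numbers are illustrative only.
grid <- expand.grid(
  n_trees       = c(100, 500),
  max_depth     = c(2, 4, 6),
  learning_rate = c(0.01, 0.1)
)

# One row per combination: 2 * 3 * 2 = 12 candidate settings.
nrow(grid)
head(grid, 3)
```

Each row of the grid is one setting to be scored by cross validation, and the best-scoring row is then used for the final model.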


Concluding Remarks


Based on this workflow, we will investigate each ML model and implement it step by step with R ML packages in a series of upcoming posts. \(\blacksquare\)
