Off to CRAN! {tidyAML}
Introduction
Are you tired of spending hours tuning and testing different machine learning models for your regression or classification problems? The new R package {tidyAML} is here to simplify the process for you! tidyAML is a simple interface for automatic machine learning that fits the tidymodels framework, making it easier for you to solve regression and classification problems.
The tidyAML package has been designed with the goal of providing a simple API that automates the entire machine learning pipeline, from data preparation to model selection, training, and prediction. This means that you no longer have to spend hours tuning and testing different models; tidyAML will do it all for you, saving you time and effort.
In this initial release (version 0.0.1), tidyAML introduces a number of new features and minor fixes to improve the overall user experience. Here are some of the updates in this release:
New Features:
- make_regression_base_tbl() and make_classification_base_tbl() functions for creating base tables for regression and classification problems, respectively.
- internal_make_spec_tbl() function for making the specification table for the machine learning pipeline.
- internal_set_args_to_tune() function for setting arguments to tune the models. This has not yet been implemented in a true working fashion but might be useful for feedback in this initial release.
- create_workflow_set() function for creating a set of workflows to test different models.
- get_model(), extract_model_spec(), extract_wflw(), extract_wflw_fit(), and extract_wflw_pred() functions for extracting different parts of the machine learning pipeline.
- match_args() function for matching arguments between the base and specification tables.
Minor Fixes and Improvements:
- Updates to fast_classification_parsnip_spec_tbl() and fast_regression_parsnip_spec_tbl() to use the make_regression and make_classification functions and the internal_make_spec_tbl() function.
- Addition of a class for the base table functions and using that class in internal_make_spec_tbl().
- Update to the DESCRIPTION for R >= 3.4.0.
In conclusion, tidyAML is a game-changer for those looking to automate the machine learning pipeline. It provides a simple API that eliminates the need for manual tuning and testing of different models. With the updates in this initial release, the tidyAML package is sure to make your machine learning journey easier and more efficient.
Functions
There are too many functions to go over in this post, so you can find them all in the package's function reference.
Examples
Even though there are many functions to go over, we can showcase some with a small useful example. So let’s get at it!
library(tidyAML)
library(recipes)
library(dplyr)

rec_obj <- recipe(mpg ~ ., data = mtcars)

frt_tbl <- fast_regression(
  .data = mtcars,
  .rec_obj = rec_obj,
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)

glimpse(frt_tbl)
Rows: 2
Columns: 8
$ .model_id       <int> 1, 2
$ .parsnip_engine <chr> "lm", "glm"
$ .parsnip_mode   <chr> "regression", "regression"
$ .parsnip_fns    <chr> "linear_reg", "linear_reg"
$ model_spec      <list> [~NULL, ~NULL, NULL, regression, TRUE, NULL, lm, TRUE]…
$ wflw            <list> [cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb, mp…
$ fitted_wflw     <list> [cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb, mp…
$ pred_wflw       <list> [<tbl_df[24 x 1]>], [<tbl_df[24 x 1]>]
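To get a feel for what a single row of this table roughly corresponds to, here is a minimal sketch that builds the lm workflow by hand with parsnip and workflows. This is an assumption about the general shape of the work, not tidyAML's internal code. The object names lm_spec, wflw_manual, fit_manual and pred_manual are made up for illustration, and the model is fit on all of mtcars rather than on whatever data fast_regression() uses internally, so the numbers will not match the table above.

```r
library(recipes)
library(parsnip)
library(workflows)

# Same recipe as in the example above
rec_obj <- recipe(mpg ~ ., data = mtcars)

# A parsnip model specification: linear regression with the "lm" engine
lm_spec <- linear_reg() |>
  set_engine("lm")

# Bundle the recipe and the model spec into a workflow (hypothetical name)
wflw_manual <- workflow() |>
  add_recipe(rec_obj) |>
  add_model(lm_spec)

# Fit the workflow and predict; here both steps use the full mtcars data
fit_manual  <- fit(wflw_manual, data = mtcars)
pred_manual <- predict(fit_manual, new_data = mtcars)

head(pred_manual)
```

Repeating this for every engine and function combination by hand is the kind of bookkeeping that fast_regression() collects into one tibble.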
Now let’s go through the extractors.
The get_model() function.
get_model(frt_tbl, 2) |> glimpse()
Rows: 1
Columns: 8
$ .model_id       <int> 2
$ .parsnip_engine <chr> "glm"
$ .parsnip_mode   <chr> "regression"
$ .parsnip_fns    <chr> "linear_reg"
$ model_spec      <list> [~NULL, ~NULL, NULL, regression, TRUE, NULL, glm, TRUE…
$ wflw            <list> [cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb, mp…
$ fitted_wflw     <list> [cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb, mp…
$ pred_wflw       <list> [<tbl_df[24 x 1]>]
The extract_model_spec() function.
extract_model_spec(frt_tbl, 1)
[[1]]
Linear Regression Model Specification (regression)

Computational engine: lm
Or do multiples:
extract_model_spec(frt_tbl, 1:2)
[[1]]
Linear Regression Model Specification (regression)

Computational engine: lm

[[2]]
Linear Regression Model Specification (regression)

Computational engine: glm
The extract_wflw() function.
extract_wflw(frt_tbl, 1)
[[1]]
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────
Linear Regression Model Specification (regression)

Computational engine: lm
Or do multiples:
extract_wflw(frt_tbl, c(1, 2))
[[1]]
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────
Linear Regression Model Specification (regression)

Computational engine: lm

[[2]]
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────
Linear Regression Model Specification (regression)

Computational engine: glm
The extract_wflw_fit() function.
extract_wflw_fit(frt_tbl, 1)
[[1]]
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────

Call:
stats::lm(formula = ..y ~ ., data = data)

Coefficients:
(Intercept)          cyl         disp           hp         drat           wt
   28.21291     -1.60712      0.03458     -0.02189      0.56925     -5.69276
       qsec           vs           am         gear         carb
    0.69956      0.39398      1.50212     -0.35338      0.48289
Or do multiples:
extract_wflw_fit(frt_tbl, 1:2)
[[1]]
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────

Call:
stats::lm(formula = ..y ~ ., data = data)

Coefficients:
(Intercept)          cyl         disp           hp         drat           wt
   28.21291     -1.60712      0.03458     -0.02189      0.56925     -5.69276
       qsec           vs           am         gear         carb
    0.69956      0.39398      1.50212     -0.35338      0.48289

[[2]]
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────

Call:
stats::glm(formula = ..y ~ ., family = stats::gaussian, data = data)

Coefficients:
(Intercept)          cyl         disp           hp         drat           wt
   28.21291     -1.60712      0.03458     -0.02189      0.56925     -5.69276
       qsec           vs           am         gear         carb
    0.69956      0.39398      1.50212     -0.35338      0.48289

Degrees of Freedom: 23 Total (i.e. Null); 13 Residual
Null Deviance: 935.1
Residual Deviance: 121.5
AIC: 131
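The lm and glm coefficients above are identical, which is expected: a glm with a gaussian family and its default identity link is ordinary least squares. Here is a quick base R check of that fact. It fits on the full mtcars data, so the numbers will differ from the output above (which reports 23 total degrees of freedom, i.e. 24 rows), and the names fit_lm and fit_glm are just for illustration.

```r
# Fit the same formula with lm() and with glm() using a gaussian family
fit_lm  <- lm(mpg ~ ., data = mtcars)
fit_glm <- glm(mpg ~ ., family = gaussian, data = mtcars)

# Compare the coefficient vectors up to numerical tolerance
same_coefs <- isTRUE(all.equal(coef(fit_lm), coef(fit_glm)))
same_coefs
```

So comparing the "lm" and "glm" engines for linear_reg() is mostly a way to show the mechanics; the two fits should agree.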
Finally, the extract_wflw_pred() function.
extract_wflw_pred(frt_tbl, 2)
[[1]]
# A tibble: 24 × 1
   .pred
   <dbl>
 1  24.8
 2  26.5
 3  18.5
 4  13.9
 5  24.6
 6  29.1
 7  14.0
 8  17.9
 9  10.0
10  23.4
# … with 14 more rows
Or do multiples:
extract_wflw_pred(frt_tbl, 1:2)
[[1]]
# A tibble: 24 × 1
   .pred
   <dbl>
 1  24.8
 2  26.5
 3  18.5
 4  13.9
 5  24.6
 6  29.1
 7  14.0
 8  17.9
 9  10.0
10  23.4
# … with 14 more rows

[[2]]
# A tibble: 24 × 1
   .pred
   <dbl>
 1  24.8
 2  26.5
 3  18.5
 4  13.9
 5  24.6
 6  29.1
 7  14.0
 8  17.9
 9  10.0
10  23.4
# … with 14 more rows
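Since the extractors return a plain list of tibbles, one possible way to compare models side by side is to stack the list into a single long tibble. The sketch below uses a small stand-in list (pred_list, a hypothetical name) holding the first three predictions printed above, so that it runs without fitting anything; with the real object you would pass the result of extract_wflw_pred(frt_tbl, 1:2) instead. The column name model_position is also made up for illustration.

```r
library(dplyr)

# Stand-in for the list returned by extract_wflw_pred(frt_tbl, 1:2);
# the values are the first three predictions shown in the output above
pred_list <- list(
  tibble(.pred = c(24.8, 26.5, 18.5)),
  tibble(.pred = c(24.8, 26.5, 18.5))
)

# Stack the tibbles; .id records which list element each row came from
pred_long <- bind_rows(pred_list, .id = "model_position")

pred_long
```

Note that model_position is the position in the extracted list, which matches .model_id only when you extract the models in order starting from 1.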
Voila!