`crossvalidation` and random search for calibrating support vector machines
[This article was first published on T. Moudiki's Webpage - R, and kindly contributed to R-bloggers].
Install and load packages
options(
  repos = c(techtonique = 'https://techtonique.r-universe.dev',
            CRAN = 'https://cloud.r-project.org')
)
install.packages("crossvalidation")
library(crossvalidation)
library(e1071)
Input data
transforming model response into a factor
y <- as.factor(as.numeric(iris$Species))
explanatory variables
X <- as.matrix(iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")])
Objective – cross-validation – function to be maximized
OF <- function(xx) {
  res <- crossvalidation::crossval_ml(
    x = X, y = y,
    k = 5, repeats = 3, p = 0.8,
    fit_func = e1071::svm,
    predict_func = predict,
    packages = "e1071",
    fit_params = list(gamma = xx[1], cost = xx[2])
  )
  # default metric is accuracy
  return(res$mean_training)
}
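To make concrete what a cross-validated accuracy measures, here is a minimal sketch of plain k-fold cross-validation written by hand. It is not how `crossval_ml` is implemented: `cv_accuracy` is a hypothetical helper, it performs no repeats, and it does not hold out a separate validation set (the `p` argument above).

```r
library(e1071)

# Hypothetical helper (not part of `crossvalidation`): plain k-fold
# cross-validated accuracy of an SVM for a given gamma and cost.
cv_accuracy <- function(X, y, gamma, cost, k = 5, seed = 123) {
  set.seed(seed)
  # Randomly assign each observation to one of k folds
  fold_id <- sample(rep(seq_len(k), length.out = nrow(X)))
  acc <- vapply(seq_len(k), function(j) {
    test_idx <- which(fold_id == j)
    # Fit on all folds except fold j
    fit <- e1071::svm(x = X[-test_idx, , drop = FALSE],
                      y = y[-test_idx],
                      gamma = gamma, cost = cost)
    # Accuracy on the held-out fold j
    preds <- predict(fit, X[test_idx, , drop = FALSE])
    mean(as.character(preds) == as.character(y[test_idx]))
  }, numeric(1))
  # Average accuracy over the k folds
  mean(acc)
}

X <- as.matrix(iris[, 1:4])
y <- as.factor(as.numeric(iris$Species))
acc <- cv_accuracy(X, y, gamma = 0.2, cost = 1)
print(acc)
```

The values `gamma = 0.2` and `cost = 1` are arbitrary illustration values, not tuned ones.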
There are many ways to maximize this objective function, such as grid search, random search, or Bayesian optimization.
A naive random search optimization procedure
simulating a matrix of SVM hyperparameters
n_points <- 250
set.seed(123)
(hyperparams <- cbind.data.frame(
  gamma = runif(n = n_points, min = 0, max = 5),
  cost = 10 ^ runif(n = n_points, min = -1, max = 2)
))
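The two hyperparameters are sampled on different scales: `gamma` is uniform on [0, 5], whereas `cost` is drawn as 10 raised to a uniform exponent, which makes it log-uniform on [0.1, 100]. The short check below, using base R only, illustrates this by counting the share of draws in each decade; the decade breakpoints are chosen here for illustration.

```r
set.seed(123)
n_points <- 250

# gamma: uniform on [0, 5]
gamma <- runif(n = n_points, min = 0, max = 5)
# cost: exponent uniform on [-1, 2], so cost is log-uniform on [0.1, 100]
cost <- 10 ^ runif(n = n_points, min = -1, max = 2)

print(range(cost))

# Share of draws falling in each decade: [0.1, 1], (1, 10], (10, 100]
decades <- cut(cost, breaks = c(0.1, 1, 10, 100), include.lowest = TRUE)
print(table(decades) / n_points)
```

Each decade should receive roughly a third of the draws, so small and large costs are explored equally often, which a uniform draw on [0.1, 100] would not do.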
accuracies on the set of simulated hyperparameters
scores <- parallel::mclapply(1:n_points,
                             function(i) OF(hyperparams[i,]),
                             mc.cores = parallel::detectCores())
scores <- unlist(scores)
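One portability caveat: `parallel::mclapply` relies on forking, which is not available on Windows, where `mc.cores` must be 1. The sketch below shows one way to guard against that. To stay self-contained and fast it uses `toy_objective`, a hypothetical stand-in for `OF`, along with a smaller number of points and an assumed core count of 2.

```r
# Stand-in objective for illustration only (hypothetical, cheap to evaluate);
# in the post this role is played by OF().
toy_objective <- function(xx) -(xx[[1]] - 1)^2 - (log10(xx[[2]]))^2

set.seed(123)
n_points <- 20
hyperparams <- data.frame(
  gamma = runif(n_points, min = 0, max = 5),
  cost  = 10 ^ runif(n_points, min = -1, max = 2)
)

# Forking is unavailable on Windows, where mclapply() requires mc.cores = 1
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

scores <- parallel::mclapply(
  seq_len(n_points),
  function(i) toy_objective(hyperparams[i, ]),
  mc.cores = n_cores
)
scores <- unlist(scores)
print(length(scores))
```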
‘best’ hyperparameters and associated training set score
max_index <- which.max(scores)
xx_best <- hyperparams[max_index,]
print(xx_best)
       gamma     cost
18 0.2102977 1.101473
print(OF(xx_best))
|===================================================================================| 100%
   user  system elapsed
  0.284   0.079   0.365
[1] 0.9527778
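Since the score above is computed on the training portion of the data, it can be informative to check the selected hyperparameters on observations the model has not seen. The sketch below does this with an assumed 80/20 random split of `iris`; this split is chosen here for illustration and is not the one `crossval_ml` uses internally, so the resulting accuracy is not directly comparable to the figure above.

```r
library(e1071)

X <- as.matrix(iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")])
y <- as.factor(as.numeric(iris$Species))

# Assumed 80/20 split; not the split used internally by crossval_ml()
set.seed(123)
train_idx <- sample(seq_len(nrow(X)), size = floor(0.8 * nrow(X)))

# Hyperparameters reported as 'best' in the post's output
fit <- e1071::svm(x = X[train_idx, ], y = y[train_idx],
                  gamma = 0.2102977, cost = 1.101473)

# Accuracy on the held-out 20%
preds <- predict(fit, X[-train_idx, ])
test_accuracy <- mean(as.character(preds) == as.character(y[-train_idx]))
print(test_accuracy)
```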