Direct Optimization of Hyper-Parameter
In the previous post (https://statcompute.wordpress.com/2019/02/03/sobol-sequence-vs-uniform-random-in-hyper-parameter-optimization), it was shown how to identify the optimal hyper-parameter of a Generalized Regression Neural Network (GRNN) by searching with a Sobol sequence and with a uniform random generator, each evaluated through N-fold cross-validation. While the Sobol sequence yields a slightly better performance, outcomes from both approaches are very similar, as shown below based upon five trials with 20 samples in each. Both approaches generalize readily from one-dimensional to multi-dimensional hyper-parameter spaces, e.g. for boosting or deep learning.
net <- grnn.fit(scale(Boston[, -14]), Boston[, 14], sigma = 1)

sb_out <- Reduce(rbind, Map(function(x) grnn.cv(net, gen_sobol(0.1, 1.0, 20, x), 4, 2019), seq(1, 5)))

uf_out <- Reduce(rbind, Map(function(x) grnn.cv(net, gen_unifm(0.1, 1.0, 20, x), 4, 2019), seq(1, 5)))

Map(function(x) x[x$R2 == max(x$R2), ], list(sobol = sb_out, uniform = uf_out))
# $sobol
#  sigma        R2
# 0.5568 0.8019342
#
# $uniform
#  sigma        R2
# 0.5608 0.8019327
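The samplers gen_sobol() and gen_unifm() come from the previous post linked above. For readers landing here directly, a minimal sketch of what they might look like is given below, assuming the randtoolbox package for the Sobol sequence; the seed argument is what makes the five trials differ.

# Hypothetical reconstruction of the two samplers from the earlier post.
# gen_sobol() assumes randtoolbox::sobol() with scrambling, so the seed matters.
gen_sobol <- function(min, max, n, seed) {
  round(min + (max - min) * randtoolbox::sobol(n, dim = 1, scrambling = 3, seed = seed), 4)
}

gen_unifm <- function(min, max, n, seed) {
  set.seed(seed)
  round(min + (max - min) * runif(n), 4)
}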
Other than the random search, another way to locate the optimal hyper-parameter is to apply general-purpose optimization routines. As shown in the demonstration below, we first need to define an objective function, e.g. grnn.optim(), that returns the cross-validation R^2 to be maximized. Since these routines minimize by default, the code below minimizes the negative R^2 instead. In addition, depending on the optimization algorithm, upper and lower bounds of the parameter to be optimized may also need to be provided. Three optimization algorithms are employed in the example, including general-purpose unconstrained non-linear optimization (ucminf), particle swarm optimization (pso), and Nelder–Mead simplex optimization, with all showing outcomes comparable to the ones achieved by the random search.
net <- grnn.fit(scale(Boston[, -14]), Boston[, 14], sigma = 1)

### Objective Function: N-Fold Cross-Validation R^2 for a Given Sigma ###
grnn.optim <- function(sigma, nn, nfolds, seed) {
  dt <- nn$set
  set.seed(seed)
  folds <- caret::createFolds(1:nrow(dt), k = nfolds, list = FALSE)
  # for each fold, fit a GRNN on the remaining folds and predict the hold-out fold
  r <- do.call(rbind,
               lapply(1:nfolds,
                      function(i) data.frame(Ya = nn$Ya[folds == i],
                                             Yp = grnn.predict(grnn.fit(nn$Xa[folds != i, ], nn$Ya[folds != i], sigma),
                                                               data.frame(nn$Xa[folds == i, ])))))
  return(r2(r$Ya, r$Yp))
}
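The helpers grnn.fit(), grnn.predict(), and grnn.cv() are defined in the earlier posts of this series, as is r2(). If you need a standalone stand-in for r2(), a minimal sketch using the usual 1 - SSE/SST definition would be:

# Hypothetical stand-in for the r2() helper from the earlier posts,
# assuming R^2 is defined as 1 - SSE / SST:
r2 <- function(ya, yp) 1 - sum((ya - yp) ^ 2) / sum((ya - mean(ya)) ^ 2)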
### General-Purpose Unconstrained Non-Linear Optimization ###
op_out <- ucminf::ucminf(par = 0.5, fn = function(x) -grnn.optim(x, net, 4, 2019))
# $par
# [1] 0.5611872
# $value
# [1] -0.8019319

### Particle Swarm Optimization ###
set.seed(1)
ps_out <- pso::psoptim(par = 0.5, upper = 1.0, lower = 0.1,
                       fn = function(x) -grnn.optim(x, net, 4, 2019),
                       control = list(maxit = 20))
# $par
# [1] 0.5583358
# $value
# [1] -0.8019351

### Nelder–Mead Optimization ###
nm_out <- optim(par = 0.5, fn = function(x) -grnn.optim(x, net, 4, 2019),
                method = "Nelder-Mead", control = list(warn.1d.NelderMead = FALSE))
# $par
# [1] 0.5582031
# $value
# [1] -0.8019351
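Since sigma here is a single bounded parameter, base R's optimize(), which performs a golden-section search over an interval, is another natural fit. The call below is an additional sketch, not part of the original comparison, that maximizes the same objective over the same [0.1, 1.0] range:

### One-Dimensional Golden-Section Search (additional sketch, base R) ###
gs_out <- optimize(f = function(x) grnn.optim(x, net, 4, 2019),
                   lower = 0.1, upper = 1.0, maximum = TRUE)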