Sobol Sequence vs. Uniform Random in Hyper-Parameter Optimization

[This article was first published on S+/R – Yet Another Blog in Statistical Computing, and kindly contributed to R-bloggers.]

Tuning hyper-parameters might be the most tedious yet crucial task in various machine learning algorithms, such as neural networks, SVMs, or boosting. The configuration of hyper-parameters not only affects the computational efficiency of a learning algorithm but also determines its prediction accuracy.

Thus far, manual tuning and grid search remain the most prevalent strategies. In the paper http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf, Bergstra and Bengio showed that random search is more efficient for hyper-parameter optimization than both grid search and manual tuning. A Sobol sequence follows a similar logic: it is a series of quasi-random numbers designed to cover the space more evenly than uniform random numbers, so it can stand in for the uniform draws of a random search.
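To make "cover the space more evenly" concrete, here is a minimal base-R sketch. It assumes that, in one dimension and without scrambling, the first points of a Sobol sequence coincide (up to ordering) with the base-2 van der Corput sequence. The helper names `van_der_corput` and `max_gap` are made up for this illustration and are not part of randtoolbox.

```r
# Base-2 radical inverse: mirror the binary digits of i around the radix point.
# Illustrative stand-in for an unscrambled 1-D Sobol sequence (an assumption).
van_der_corput <- function(n) {
  vapply(seq_len(n), function(i) {
    x <- 0
    f <- 0.5
    while (i > 0) {
      x <- x + f * (i %% 2)  # add the current binary digit at weight f
      i <- i %/% 2           # move on to the next digit
      f <- f / 2
    }
    x
  }, numeric(1))
}

# Largest gap between neighbouring points on [0, 1], counting both edges.
max_gap <- function(u) max(diff(c(0, sort(u), 1)))

n <- 15
quasi <- van_der_corput(n)  # the 15 points are exactly 1/16, 2/16, ..., 15/16

set.seed(1)
pseudo <- runif(n)

max_gap(quasi)   # evenly spaced, so every gap is 1/16
max_gap(pseudo)  # random draws leave at least one wider hole
```

With 15 points there are 16 gaps that sum to 1, so no arrangement can have a largest gap below 1/16. The quasi-random points hit that bound, while uniform draws almost surely leave some stretch of the range unexplored.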

The demonstration below compares the Sobol sequence and the uniform random number generator in the hyper-parameter tuning of a General Regression Neural Network (GRNN). In this particular example, the Sobol sequence outperforms the uniform random number generator in two respects. First, it picks the hyper-parameter that yields better performance, measured by R^2, in the cross-validation. Second, the performance is more consistent across multiple trials, with a lower variance.
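For readers unfamiliar with the model, the single hyper-parameter being tuned is the smoothing parameter sigma. The sketch below is a textbook form of GRNN prediction, a Gaussian-kernel weighted average of the training responses. It is an assumption made for illustration, not the grnn package's code, and `grnn_predict_sketch` is a hypothetical name.

```r
# Gaussian-kernel weighted average of training responses (Nadaraya-Watson form).
# Illustrative only; the grnn package's internals may differ.
grnn_predict_sketch <- function(x_train, y_train, x_new, sigma) {
  x_train <- as.matrix(x_train)
  x_new <- as.matrix(x_new)
  apply(x_new, 1, function(x0) {
    d2 <- rowSums(sweep(x_train, 2, x0)^2)  # squared Euclidean distances to x0
    w <- exp(-d2 / (2 * sigma^2))           # closer training points weigh more
    sum(w * y_train) / sum(w)
  })
}

x <- matrix(c(0, 1, 2, 3), ncol = 1)
y <- c(0, 1, 4, 9)

small <- grnn_predict_sketch(x, y, x, sigma = 0.1)  # close to y itself
large <- grnn_predict_sketch(x, y, x, sigma = 100)  # close to mean(y) everywhere
```

A very small sigma memorizes the training data and a very large one flattens every prediction toward the overall mean, which is why the value has to be chosen by cross-validation.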

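# Boston housing data: 13 predictors plus the response medv in column 14
data(Boston, package = "MASS")

# Fit a GRNN on predictors x and response y with smoothing parameter sigma
grnn.fit <- function(x, y, sigma) {
  return(grnn::smooth(grnn::learn(data.frame(y, x)), sigma))
}

# Predict one row at a time in parallel. mcMap relies on forking, which is
# unavailable on Windows.
grnn.predict <- function(nn, x) {
  # Leave one core free, but never request fewer than one
  ncores <- max(1, parallel::detectCores() - 1, na.rm = TRUE)
  return(do.call(rbind,
                 parallel::mcMap(function(i) grnn::guess(nn, as.matrix(x[i, ])),
                                 seq_len(nrow(x)), mc.cores = ncores))[, 1])
}

# R-squared of predictions pre against actual values act
r2 <- function(act, pre) {
  rss <- sum((pre - act) ^ 2)
  tss <- sum((act - mean(act)) ^ 2)
  return(1 - rss / tss)
}

# For each candidate sigma, pool the out-of-fold predictions from an
# nfolds-fold cross-validation and return the sigma with the highest R2
grnn.cv <- function(nn, sigmas, nfolds, seed) {
  dt <- nn$set
  set.seed(seed)
  folds <- caret::createFolds(seq_len(nrow(dt)), k = nfolds, list = FALSE)
  cv <- function(s) {
    r <- do.call(rbind,
                 lapply(seq_len(nfolds),
                        function(i) data.frame(
                          Ya = nn$Ya[folds == i],
                          Yp = grnn.predict(grnn.fit(nn$Xa[folds != i, ], nn$Ya[folds != i], s),
                                            data.frame(nn$Xa[folds == i, ])))))
    return(data.frame(sigma = s, R2 = r2(r$Ya, r$Yp)))
  }
  r2_lst <- Reduce(rbind, Map(cv, sigmas))
  # which.max returns a single row even when several sigmas tie
  return(r2_lst[which.max(r2_lst$R2), ])
}

# n scrambled Sobol points rescaled to [min, max]
gen_sobol <- function(min, max, n, seed) {
  return(round(min + (max - min) * randtoolbox::sobol(n, dim = 1, scrambling = 1, seed = seed), 4))
}

# n uniform random points rescaled to [min, max]
gen_unifm <- function(min, max, n, seed) {
  set.seed(seed)
  return(round(min + (max - min) * runif(n), 4))
}

net <- grnn.fit(Boston[, -14], Boston[, 14], sigma = 2)

# 10 trials per generator (seeds 1 to 10); each trial draws 10 sigmas in [5, 10]
# and is scored with the same 4-fold split (seed 2019)
sobol_out <- Reduce(rbind, Map(function(x) grnn.cv(net, gen_sobol(5, 10, 10, x), 4, 2019), seq(1, 10)))
unifm_out <- Reduce(rbind, Map(function(x) grnn.cv(net, gen_unifm(5, 10, 10, x), 4, 2019), seq(1, 10)))

out <- rbind(cbind(type = rep("sobol", 10), sobol_out),
             cbind(type = rep("unifm", 10), unifm_out))

boxplot(R2 ~ type, data = out, main = "Sobol Sequence vs. Uniform Random",
        ylab = "CV RSquare", xlab = "Sequence Type")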

[Figure: box plot "Sobol Sequence vs. Uniform Random", showing CV RSquare by Sequence Type (sobol, unifm).]
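The box plot can be complemented with the per-generator mean and standard deviation of the best R2 across trials, which speak directly to the two claims of higher performance and lower variance. Below is a small sketch with a hypothetical helper, `summarize_trials`. The `toy` data frame only mimics the shape of `out` with placeholder values and does not contain results from the run above.

```r
# Mean and standard deviation of R2 per sequence type.
# Expects a data frame with columns type and R2, like `out` in the script.
summarize_trials <- function(out) {
  means <- tapply(out$R2, out$type, mean)
  sds <- tapply(out$R2, out$type, sd)
  data.frame(type = names(means),
             mean_R2 = as.numeric(means),
             sd_R2 = as.numeric(sds))
}

# Placeholder values for illustration only, not measured results
toy <- data.frame(type = rep(c("sobol", "unifm"), each = 3),
                  R2 = c(0.80, 0.81, 0.82, 0.70, 0.75, 0.80))
res <- summarize_trials(toy)
res
```

Calling `summarize_trials(out)` after the script has run would give the actual figures behind the plot.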
