Sobol Sequence vs. Uniform Random in Hyper-Parameter Optimization
Tuning hyper-parameters might be the most tedious yet crucial task in many machine learning algorithms, such as neural networks, SVMs, or boosting. The configuration of hyper-parameters not only affects the computational efficiency of a learning algorithm but also determines its prediction accuracy.
Thus far, manual tuning and grid search remain the most prevalent strategies. In the paper http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf, Bergstra and Bengio showed that random search is more efficient for hyper-parameter optimization than both grid search and manual tuning. Following similar logic, candidate values can also be drawn from a Sobol sequence, a series of quasi-random numbers designed to cover the space more evenly than uniform random numbers.
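To make the "more evenly" claim concrete, here is a minimal base-R sketch. As a stand-in it uses the base-2 van der Corput sequence, the classic one-dimensional low-discrepancy sequence, which is closely related to the first dimension of an unscrambled Sobol sequence. It is not the randtoolbox generator used in the demonstration, and the helper names `van_der_corput` and `max_gap` are made up for this illustration.

```r
# Base-2 van der Corput sequence: mirror the binary digits of i around the radix point.
van_der_corput <- function(n) {
  vapply(seq_len(n), function(i) {
    x <- 0
    f <- 0.5
    while (i > 0) {
      x <- x + f * (i %% 2)
      i <- i %/% 2
      f <- f / 2
    }
    x
  }, numeric(1))
}

# Largest gap left uncovered on [0, 1], counting both end points.
max_gap <- function(u) max(diff(c(0, sort(u), 1)))

n <- 15
qmc_gap <- max_gap(van_der_corput(n))

# Same number of pseudo-random draws, repeated over 200 trials for comparison.
set.seed(1)
rnd_gap <- replicate(200, max_gap(runif(n)))

cat("van der Corput max gap:", qmc_gap, "\n")
cat("uniform random max gap (median):", median(rnd_gap), "\n")
```

With n = 15 the quasi-random points fall exactly on the grid 1/16, ..., 15/16, so the largest gap is 1/16, whereas uniform draws always leave a larger hole somewhere. This n completes a dyadic block, which is the most favourable case; at other n the gaps are not all equal.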
The demonstration below compares the Sobol sequence and the uniform random number generator in tuning the smoothing parameter (sigma) of a General Regression Neural Network (GRNN). Each generator draws 10 candidate values between 5 and 10 in each of 10 trials with different seeds, and the best 4-fold cross-validated R^2 is kept for each trial. In this particular example, the Sobol sequence outperforms the uniform random number generator in two respects. First, it picks a hyper-parameter value that yields better performance, measured by R^2, in cross-validation. Second, the performance is more consistent across trials, with lower variance.
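For readers unfamiliar with the model, a GRNN prediction is a kernel-weighted average of the training responses, and the hyper-parameter sigma sets the kernel width. The sketch below assumes a Gaussian kernel on Euclidean distance, as in Specht's original formulation. `grnn_predict_one` is a hypothetical helper written for illustration, not the grnn package's implementation, which may differ in detail, and the toy data are made up.

```r
# GRNN prediction for one new point x0: Gaussian-kernel weighted mean of y.
grnn_predict_one <- function(x_train, y_train, x0, sigma) {
  d2 <- rowSums(sweep(x_train, 2, x0)^2)  # squared Euclidean distances to x0
  w <- exp(-d2 / (2 * sigma^2))           # kernel weights; sigma is the bandwidth
  sum(w * y_train) / sum(w)
}

# Toy data, made up for illustration: y = x^2 on a grid.
x_train <- matrix(seq(-2, 2, by = 0.5), ncol = 1)
y_train <- as.numeric(x_train^2)

# Small sigma follows nearby points; large sigma shrinks towards mean(y_train).
for (s in c(0.1, 0.5, 5)) {
  cat("sigma =", s, " prediction at x0 = 1:",
      grnn_predict_one(x_train, y_train, x0 = 1, sigma = s), "\n")
}
```

Because the fit moves from near-interpolation to a near-constant as sigma grows, the cross-validated R^2 depends strongly on which candidate values a search happens to try.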
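# Boston housing data: 13 predictors plus the response medv in column 14
data(Boston, package = "MASS")

# Fit a GRNN on predictors x and response y with smoothing parameter sigma
grnn.fit <- function(x, y, sigma) {
  return(grnn::smooth(grnn::learn(data.frame(y, x)), sigma))
}

# Predict row by row in parallel; mcMap relies on forking, so fall back to one core on Windows
grnn.predict <- function(nn, x) {
  n_cores <- if (.Platform$OS.type == "windows") 1 else max(1, parallel::detectCores() - 1, na.rm = TRUE)
  return(do.call(rbind,
                 parallel::mcMap(function(i) grnn::guess(nn, as.matrix(x[i, ])),
                                 seq_len(nrow(x)), mc.cores = n_cores))[, 1])
}

# Coefficient of determination from actual and predicted values
r2 <- function(act, pre) {
  rss <- sum((pre - act) ^ 2)
  tss <- sum((act - mean(act)) ^ 2)
  return(1 - rss / tss)
}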

# k-fold cross-validation over candidate sigmas; returns the row with the highest R2
grnn.cv <- function(nn, sigmas, nfolds, seed) {
  dt <- nn$set
  set.seed(seed)
  folds <- caret::createFolds(1:nrow(dt), k = nfolds, list = FALSE)
  # Pooled out-of-fold R2 for a single sigma
  cv <- function(s) {
    r <- do.call(rbind,
                 lapply(1:nfolds,
                        function(i) data.frame(Ya = nn$Ya[folds == i],
                                               Yp = grnn.predict(grnn.fit(nn$Xa[folds != i, ], nn$Ya[folds != i], s),
                                                                 data.frame(nn$Xa[folds == i, ])))))
    return(data.frame(sigma = s, R2 = r2(r$Ya, r$Yp)))
  }
  r2_lst <- Reduce(rbind, Map(cv, sigmas))
  # which.max keeps exactly one row per call, even if several sigmas tie
  return(r2_lst[which.max(r2_lst$R2), ])
}

# n scrambled Sobol points rescaled to [min, max]
gen_sobol <- function(min, max, n, seed) {
  return(round(min + (max - min) * randtoolbox::sobol(n, dim = 1, scrambling = 1, seed = seed), 4))
}

# n pseudo-random uniform draws rescaled to [min, max]
gen_unifm <- function(min, max, n, seed) {
  set.seed(seed)
  return(round(min + (max - min) * runif(n), 4))
}

# Initial network; grnn.cv only reuses its stored data (net$Xa, net$Ya) and refits per sigma
net <- grnn.fit(Boston[, -14], Boston[, 14], sigma = 2)

# 10 trials per generator, each drawing 10 candidate sigmas in [5, 10] with a different seed;
# the CV folds are fixed (seed 2019), so only the candidate values vary between trials
sobol_out <- Reduce(rbind, Map(function(x) grnn.cv(net, gen_sobol(5, 10, 10, x), 4, 2019), seq(1, 10)))
unifm_out <- Reduce(rbind, Map(function(x) grnn.cv(net, gen_unifm(5, 10, 10, x), 4, 2019), seq(1, 10)))

out <- rbind(cbind(type = rep("sobol", 10), sobol_out),
             cbind(type = rep("unifm", 10), unifm_out))
boxplot(R2 ~ type, data = out, main = "Sobol Sequence vs. Uniform Random",
        ylab = "CV RSquare", xlab = "Sequence Type")
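The box plot shows both claims visually; a numeric summary can make them easier to check. The sketch below assumes the `out` data frame built by the demonstration, with columns `type` and `R2`. The helper name `summarise_r2` is hypothetical.

```r
# Mean and standard deviation of the best CV R2 per generator type.
# Expects a data frame with a grouping column `type` and a numeric column `R2`.
summarise_r2 <- function(out) {
  grp <- as.character(out$type)
  m <- tapply(out$R2, grp, mean)
  s <- tapply(out$R2, grp, sd)
  data.frame(type = names(m), mean_R2 = as.numeric(m), sd_R2 = as.numeric(s),
             row.names = NULL)
}

# Run only when the demonstration has been executed in this session.
if (exists("out") && is.data.frame(out)) print(summarise_r2(out))
```

A higher `mean_R2` and a lower `sd_R2` in the sobol row would correspond to the two advantages described in the text.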