automl package (part 2/2): first steps, how to
first steps: how to
For those who will laugh at seeing deep learning applied with one hidden layer to the Iris data set of 150 records, I will say: you're perfectly right. The goal at this stage is simply to take the first steps.
fit a regression model manually (hard way)
Subject: predict Sepal.Length given the other Iris parameters. First, with gradient descent and the default hyperparameter values for learning rate (0.001) and mini-batch size (32):
input
data(iris)
xmat <- cbind(iris[,2:4], as.numeric(iris$Species))
ymat <- iris[,1]
amlmodel <- automl_train_manual(Xref = xmat, Yref = ymat)

output
(cost: mse)
cost epoch10: 20.9340400047156 (cv cost: 25.205632342013) (LR: 0.001 )
cost epoch20: 20.6280923387762 (cv cost: 23.8214521197268) (LR: 0.001 )
cost epoch30: 20.3222407903838 (cv cost: 22.1899741289456) (LR: 0.001 )
cost epoch40: 20.0217966054298 (cv cost: 21.3908446693146) (LR: 0.001 )
cost epoch50: 19.7584058034009 (cv cost: 20.7170232035934) (LR: 0.001 )
dim X: ...

input
res <- cbind(ymat, automl_predict(model = amlmodel, X = xmat))
colnames(res) <- c('actual', 'predict')
head(res)

output
     actual   predict
[1,]    5.1 -2.063614
[2,]    4.9 -2.487673
[3,]    4.7 -2.471912
[4,]    4.6 -2.281035
[5,]    5.0 -1.956937
[6,]    5.4 -1.729314

:-[] no pain, no gain …
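To put a number on how far off this first model is, a quick root mean squared error check on the training predictions helps (a minimal sketch, reusing the res matrix built above):

# RMSE of the untuned model on the training data (sketch)
sqrt(mean((res[, 'actual'] - res[, 'predict'])^2))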
After some manual fine-tuning of the learning rate, mini-batch size and number of iterations (epochs):
input
data(iris)
xmat <- cbind(iris[,2:4], as.numeric(iris$Species))
ymat <- iris[,1]
amlmodel <- automl_train_manual(Xref = xmat, Yref = ymat,
                                hpar = list(learningrate = 0.01,
                                            minibatchsize = 2^2,
                                            numiterations = 30))

output
(cost: mse)
cost epoch10: 5.55679482839698 (cv cost: 4.87492997304325) (LR: 0.01 )
cost epoch20: 1.64996951479802 (cv cost: 1.50339773126712) (LR: 0.01 )
cost epoch30: 0.647727077375946 (cv cost: 0.60142564484723) (LR: 0.01 )
dim X: ...

input
res <- cbind(ymat, automl_predict(model = amlmodel, X = xmat))
colnames(res) <- c('actual', 'predict')
head(res)

output
     actual  predict
[1,]    5.1 4.478478
[2,]    4.9 4.215683
[3,]    4.7 4.275902
[4,]    4.6 4.313141
[5,]    5.0 4.531038
[6,]    5.4 4.742847

Better results, but with human effort!
fit a regression model automatically (easy way, Mix 1)
Same subject: predict Sepal.Length given the other Iris parameters.

input
data(iris)
xmat <- as.matrix(cbind(iris[,2:4], as.numeric(iris$Species)))
ymat <- iris[,1]
start.time <- Sys.time()
amlmodel <- automl_train(Xref = xmat, Yref = ymat,
                         autopar = list(psopartpopsize = 15,
                                        numiterations = 5,
                                        nbcores = 4))
end.time <- Sys.time()
cat(paste('time elapsed:', end.time - start.time, '\n'))

output
(cost: mse)
iteration 1 particle 1 weighted err: 22.05305 (train: 19.95908 cvalid: 14.72417 ) BEST MODEL KEPT
iteration 1 particle 2 weighted err: 31.69094 (train: 20.55559 cvalid: 27.51518 )
iteration 1 particle 3 weighted err: 22.08092 (train: 20.52354 cvalid: 16.63009 )
iteration 1 particle 4 weighted err: 20.02091 (train: 19.18378 cvalid: 17.09095 ) BEST MODEL KEPT
iteration 1 particle 5 weighted err: 28.36339 (train: 20.6763 cvalid: 25.48073 )
iteration 1 particle 6 weighted err: 28.92088 (train: 20.92546 cvalid: 25.9226 )
iteration 1 particle 7 weighted err: 21.67837 (train: 20.73866 cvalid: 18.38941 )
iteration 1 particle 8 weighted err: 29.80416 (train: 16.09191 cvalid: 24.66206 )
iteration 1 particle 9 weighted err: 22.93199 (train: 20.5561 cvalid: 14.61638 )
iteration 1 particle 10 weighted err: 21.18474 (train: 19.64622 cvalid: 15.79992 )
iteration 1 particle 11 weighted err: 23.32084 (train: 20.78257 cvalid: 14.43688 )
iteration 1 particle 12 weighted err: 22.27164 (train: 20.81055 cvalid: 17.15783 )
iteration 1 particle 13 weighted err: 2.23479 (train: 1.95683 cvalid: 1.26193 ) BEST MODEL KEPT
iteration 1 particle 14 weighted err: 23.1183 (train: 20.79754 cvalid: 14.99564 )
iteration 1 particle 15 weighted err: 20.71678 (train: 19.40506 cvalid: 16.12575 )
...
iteration 4 particle 3 weighted err: 0.3469 (train: 0.32236 cvalid: 0.26104 )
iteration 4 particle 4 weighted err: 0.2448 (train: 0.07047 cvalid: 0.17943 )
iteration 4 particle 5 weighted err: 0.09674 (train: 5e-05 cvalid: 0.06048 ) BEST MODEL KEPT
iteration 4 particle 6 weighted err: 0.71267 (train: 6e-05 cvalid: 0.44544 )
iteration 4 particle 7 weighted err: 0.65614 (train: 0.63381 cvalid: 0.57796 )
iteration 4 particle 8 weighted err: 0.46477 (train: 0.356 cvalid: 0.08408 )
...
time elapsed: 2.65109273195267

input
res <- cbind(ymat, automl_predict(model = amlmodel, X = xmat))
colnames(res) <- c('actual', 'predict')
head(res)

output
     actual  predict
[1,]    5.1 5.193862
[2,]    4.9 4.836507
[3,]    4.7 4.899531
[4,]    4.6 4.987896
[5,]    5.0 5.265334
[6,]    5.4 5.683173

It's even better, with no human effort, only machine time!
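For readers new to PSO: the optimizer moves a swarm of candidate solutions ("particles") through the search space, each particle attracted both to its own best position so far and to the swarm's global best. A minimal, illustrative sketch of this update rule follows; it is not the package's implementation:

# illustrative PSO: minimize f over d dimensions (NOT the package's code)
pso_sketch <- function(f, d, npart = 15, iters = 50,
                       w = 0.7, c1 = 1.5, c2 = 1.5) {
  x <- matrix(runif(npart * d, -1, 1), npart, d)  # particle positions
  v <- matrix(0, npart, d)                        # particle velocities
  pbest <- x                                      # personal best positions
  pbestval <- apply(x, 1, f)                      # personal best costs
  g <- pbest[which.min(pbestval), ]               # global best position
  for (i in seq_len(iters)) {
    r1 <- matrix(runif(npart * d), npart, d)
    r2 <- matrix(runif(npart * d), npart, d)
    # inertia + pull toward personal best + pull toward global best (g - x)
    v <- w * v + c1 * r1 * (pbest - x) + c2 * r2 * sweep(-x, 2, g, "+")
    x <- x + v
    val <- apply(x, 1, f)
    better <- val < pbestval
    pbest[better, ] <- x[better, ]
    pbestval[better] <- val[better]
    g <- pbest[which.min(pbestval), ]
  }
  g
}
# e.g. pso_sketch(function(z) sum(z^2), d = 3) ends up near c(0, 0, 0)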
Note: users on Windows won't benefit from parallelization; the function relies on the parallel package included with base R…
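On other platforms, nbcores can be matched to the machine, for example (a sketch):

# detect available cores, leaving one free for the system (sketch)
nbcores <- max(1, parallel::detectCores() - 1)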
fit a regression model with PSO (experimental way, Mix 2)
Same subject: predict Sepal.Length given the other Iris parameters.

input
data(iris)
xmat <- as.matrix(cbind(iris[,2:4], as.numeric(iris$Species)))
ymat <- iris[,1]
amlmodel <- automl_train_manual(Xref = xmat, Yref = ymat,
                                hpar = list(modexec = 'trainwpso',
                                            numiterations = 30,
                                            psopartpopsize = 50))

output
(cost: mse)
cost epoch10: 0.113576786377019 (cv cost: 0.0967069106128153) (LR: 0 )
cost epoch20: 0.0595472259640828 (cv cost: 0.0831404427407914) (LR: 0 )
cost epoch30: 0.0494578776185938 (cv cost: 0.0538888075333611) (LR: 0 )
dim X: ...

input
res <- cbind(ymat, automl_predict(model = amlmodel, X = xmat))
colnames(res) <- c('actual', 'predict')
head(res)

output
     actual  predict
[1,]    5.1 5.028114
[2,]    4.9 4.673366
[3,]    4.7 4.738188
[4,]    4.6 4.821392
[5,]    5.0 5.099064
[6,]    5.4 5.277315

Pretty good too, even better than the manually tuned gradient descent!
But PSO is time consuming on larger data sets, so gradient descent should be preferred in that case.
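Switching back is a one-parameter change. A sketch, assuming 'trainwgrad' is the gradient descent execution mode (the counterpart of the 'trainwpso' value used above):

# sketch: same call, trained with gradient descent instead of PSO
# ('trainwgrad' assumed to be the gradient descent modexec value)
amlmodel <- automl_train_manual(Xref = xmat, Yref = ymat,
                                hpar = list(modexec = 'trainwgrad',
                                            learningrate = 0.01,
                                            numiterations = 30))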
fit a regression model with custom cost (experimental way, Mix 2)
Same subject: predict Sepal.Length given the other Iris parameters. Let's try Mean Absolute Percentage Error (MAPE) instead of Mean Squared Error.
input
data(iris)
xmat <- as.matrix(cbind(iris[,2:4], as.numeric(iris$Species)))
ymat <- iris[,1]
f <- 'J=abs((y-yhat)/y)'
f <- c(f, 'J=sum(J[!is.infinite(J)],na.rm=TRUE)')
f <- c(f, 'J=(J/length(y))')
f <- paste(f, collapse = ';')
amlmodel <- automl_train_manual(Xref = xmat, Yref = ymat,
                                hpar = list(modexec = 'trainwpso',
                                            numiterations = 30,
                                            psopartpopsize = 50,
                                            costcustformul = f))

output
(cost: custom)
cost epoch10: 0.901580275333795 (cv cost: 1.15936129555304) (LR: 0 )
cost epoch20: 0.890142834441629 (cv cost: 1.24167078564786) (LR: 0 )
cost epoch30: 0.886088388448652 (cv cost: 1.22756121243449) (LR: 0 )
dim X: ...

input
res <- cbind(ymat, automl_predict(model = amlmodel, X = xmat))
colnames(res) <- c('actual', 'predict')
head(res)

output
     actual  predict
[1,]    5.1 4.693915
[2,]    4.9 4.470968
[3,]    4.7 4.482036
[4,]    4.6 4.593667
[5,]    5.0 4.738504
[6,]    5.4 4.914144
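To double-check what the custom cost computes, the same MAPE formula can be applied directly to the predictions (a minimal sketch mirroring the formula string f above):

# MAPE, written exactly like the costcustformul string (sketch)
mape <- function(y, yhat) {
  J <- abs((y - yhat) / y)
  sum(J[!is.infinite(J)], na.rm = TRUE) / length(y)
}
mape(res[, 'actual'], res[, 'predict'])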
fit a classification model with softmax (Mix 2)
Subject: predict Species given the other Iris parameters. Softmax is available with PSO; no derivative is needed.
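As a reminder of what the output layer computes, softmax maps raw scores to class probabilities that sum to 1 (an illustrative sketch, not the package's internal code):

# numerically stable softmax (illustrative)
softmax <- function(z) {
  e <- exp(z - max(z))
  e / sum(e)
}
softmax(c(2, 1, 0.1))   # approx. 0.66 0.24 0.10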
input
data(iris)
xmat <- iris[,1:4]
lab2pred <- levels(iris$Species)
lghlab <- length(lab2pred)
iris$Species <- as.numeric(iris$Species)
# one-hot encode the labels: one column per species
ymat <- matrix(seq(from = 1, to = lghlab, by = 1), nrow(xmat), lghlab, byrow = TRUE)
ymat <- (ymat == as.numeric(iris$Species)) + 0
amlmodel <- automl_train_manual(Xref = xmat, Yref = ymat,
                                hpar = list(modexec = 'trainwpso',
                                            layersshape = c(10, 0),
                                            layersacttype = c('relu', 'softmax'),
                                            layersdropoprob = c(0, 0),
                                            numiterations = 50,
                                            psopartpopsize = 50))

output
(cost: crossentropy)
cost epoch10: 0.373706545886467 (cv cost: 0.36117608867856) (LR: 0 )
cost epoch20: 0.267034060152876 (cv cost: 0.163635821437066) (LR: 0 )
cost epoch30: 0.212054571476337 (cv cost: 0.112664100290429) (LR: 0 )
cost epoch40: 0.154158717402463 (cv cost: 0.102895917099299) (LR: 0 )
cost epoch50: 0.141037927317585 (cv cost: 0.0864623836595045) (LR: 0 )
dim X: ...

input
res <- cbind(ymat, automl_predict(model = amlmodel, X = xmat))
colnames(res) <- c(paste('act', lab2pred, sep = '_'),
                   paste('pred', lab2pred, sep = '_'))
head(res)
tail(res)

output
  act_setosa act_versicolor act_virginica pred_setosa pred_versicolor pred_virginica
1          1              0             0   0.9863481     0.003268881    0.010383018
2          1              0             0   0.9897295     0.003387193    0.006883349
3          1              0             0   0.9856347     0.002025946    0.012339349
4          1              0             0   0.9819881     0.004638452    0.013373451
5          1              0             0   0.9827623     0.003115452    0.014122277
6          1              0             0   0.9329747     0.031624836    0.035400439

    act_setosa act_versicolor act_virginica pred_setosa pred_versicolor pred_virginica
145          0              0             1  0.02549091    2.877957e-05      0.9744803
146          0              0             1  0.08146753    2.005664e-03      0.9165268
147          0              0             1  0.05465750    1.979652e-02      0.9255460
148          0              0             1  0.06040415    1.974869e-02      0.9198472
149          0              0             1  0.02318048    4.133826e-04      0.9764061
150          0              0             1  0.03696852    5.230936e-02      0.9107221
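From here, a quick training-set accuracy check takes one more step: pick the class with the highest value on each side and compare (a sketch, reusing res and lab2pred):

# class with the highest value in the actual and predicted columns (sketch)
actlab  <- lab2pred[max.col(as.matrix(res[, 1:3]))]
predlab <- lab2pred[max.col(as.matrix(res[, 4:6]))]
mean(predlab == actlab)   # share of correctly classified rows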
change the model parameters (shape …)
Same subject: predict Species given the other Iris parameters. First example: gradient descent with 2 hidden layers of 10 nodes each, using different activation functions for the hidden layers.
input
data(iris)
xmat <- iris[,1:4]
lab2pred <- levels(iris$Species)
lghlab <- length(lab2pred)
iris$Species <- as.numeric(iris$Species)
ymat <- matrix(seq(from = 1, to = lghlab, by = 1), nrow(xmat), lghlab, byrow = TRUE)
ymat <- (ymat == as.numeric(iris$Species)) + 0
amlmodel <- automl_train_manual(Xref = xmat, Yref = ymat,
                                hpar = list(layersshape = c(10, 10, 0),
                                            layersacttype = c('tanh', 'relu', ''),
                                            layersdropoprob = c(0, 0, 0)))

NB: the last activation type may be left blank (it will be set automatically).
Second example: gradient descent with no hidden layer (equivalent to logistic regression).
input
data(iris)
xmat <- iris[,1:4]
lab2pred <- levels(iris$Species)
lghlab <- length(lab2pred)
iris$Species <- as.numeric(iris$Species)
ymat <- matrix(seq(from = 1, to = lghlab, by = 1), nrow(xmat), lghlab, byrow = TRUE)
ymat <- (ymat == as.numeric(iris$Species)) + 0
amlmodel <- automl_train_manual(Xref = xmat, Yref = ymat,
                                hpar = list(layersshape = c(0),
                                            layersacttype = c('sigmoid'),
                                            layersdropoprob = c(0)))
ToDo List
- transfer learning from existing frameworks
- add autotune to other parameters (layers, dropout, …)
- CNN
- RNN
join the team!
https://github.com/aboulaboul/automl