Applying machine learning algorithms – exercises

[This article was first published on R-exercises, and kindly contributed to R-bloggers.]

INTRODUCTION

Dear reader,

If you are new to machine learning, this tutorial is a good way to introduce yourself to this exciting part of the data science world.

This post walks you step by step through a full machine learning project and leaves you with a “template” that you can reuse later on other datasets.

Before proceeding, please follow our short tutorial.

Look at the examples given and try to understand the logic behind them. Then try to solve the exercises below using R, without looking at the answers, and check your work against the solutions afterwards.

Exercise 1

Create a list named “control” that runs a 10-fold cross-validation. HINT: Use trainControl().
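If you get stuck, here is one possible sketch to compare against after trying the exercise yourself. It assumes the caret package is installed; "cv" selects k-fold cross-validation and number sets the number of folds.

```r
library(caret)  # assumption: caret is installed

# 10-fold cross-validation: method = "cv" selects k-fold CV, number sets k
control <- trainControl(method = "cv", number = 10)

# trainControl() returns a plain list of resampling settings
str(control[c("method", "number")])
```

The returned list does nothing by itself. It only takes effect once it is passed to train() through the trControl argument.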

Exercise 2

Use “Accuracy” as the metric for evaluating the models.
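A minimal sketch of how the metric might be wired in, to check after your own attempt. The built-in iris dataset and the single "lda" model are stand-ins chosen for illustration only; the exercises do not name the dataset, so substitute the data from the tutorial you followed.

```r
library(caret)  # assumption: caret is installed (lda comes from MASS, shipped with R)

metric  <- "Accuracy"                               # metric used to compare models
control <- trainControl(method = "cv", number = 10) # 10-fold cross-validation

# Assumption: iris stands in for the tutorial's training data
set.seed(7)
fit <- train(Species ~ ., data = iris, method = "lda",
             metric = metric, trControl = control)

print(fit$metric)   # the metric caret used to select the final model
print(fit$results)  # cross-validated Accuracy and Kappa
```

For classification problems caret also reports Kappa alongside Accuracy; the metric argument only decides which one is used for model selection.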

Exercise 3

Build the “LDA”, “CART”, “kNN”, “SVM” and “RF” models.

Exercise 4

Collect the resampling results of the 5 models you just built in an object named “results”. HINT: Use resamples().
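One way Exercises 3 and 4 could fit together is sketched below; look at it only after your own attempt. Several things here are assumptions rather than part of the exercises: iris as stand-in data, the caret method names ("rpart" for CART, "svmRadial" for the SVM), the seed value, and the small helper fit_model(), which is hypothetical and exists only to avoid repeating the train() call. The rpart, kernlab and randomForest packages need to be installed as well.

```r
library(caret)  # assumption: caret, rpart, kernlab and randomForest are installed

control <- trainControl(method = "cv", number = 10)
metric  <- "Accuracy"

# Hypothetical helper: resetting the seed before every fit gives each model
# the same cross-validation folds, so their results are directly comparable
fit_model <- function(method) {
  set.seed(7)
  train(Species ~ ., data = iris, method = method,
        metric = metric, trControl = control)
}

fit.lda  <- fit_model("lda")        # linear discriminant analysis
fit.cart <- fit_model("rpart")      # classification and regression tree
fit.knn  <- fit_model("knn")        # k-nearest neighbours
fit.svm  <- fit_model("svmRadial")  # SVM with a radial kernel
fit.rf   <- fit_model("rf")         # random forest

# Gather the per-fold metrics of all five models in one object
results <- resamples(list(lda = fit.lda, cart = fit.cart, knn = fit.knn,
                          svm = fit.svm, rf = fit.rf))
summary(results)
```

Using the same folds for every model matters because resamples() compares the models fold by fold.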

Exercise 5

Report the accuracy of each model by summarizing “results”. HINT: Use summary().

Exercise 6

Create a plot of the model evaluation results and compare the spread and the mean accuracy of each model. HINT: Use dotplot().
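A small sketch of the plotting step, for comparison after you have tried it. To keep it short it fits only two models ("lda" and "knn") on iris, both of which are illustrative assumptions; in the exercise you would pass the “results” object that holds all five models.

```r
library(caret)  # assumption: caret is installed (it attaches lattice for dotplot)

control <- trainControl(method = "cv", number = 10)

# Same seed before each fit so both models are evaluated on the same folds
set.seed(7)
fit.lda <- train(Species ~ ., data = iris, method = "lda",
                 metric = "Accuracy", trControl = control)
set.seed(7)
fit.knn <- train(Species ~ ., data = iris, method = "knn",
                 metric = "Accuracy", trControl = control)

results <- resamples(list(lda = fit.lda, knn = fit.knn))

# One panel per metric, with a point estimate and interval for each model
p <- dotplot(results)
print(p)
```

Models whose intervals overlap heavily are hard to tell apart on this evidence alone, which is worth keeping in mind when answering the next exercise.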

Exercise 7

Which model seems to be the most accurate?

Exercise 8

Summarize the results of the best model and print them. HINT: Use print().

Exercise 9

Run the “LDA” model directly on the validation set to create a factor named “predictions”. HINT: Use predict().

Exercise 10

Summarize the results in a confusion matrix. HINT: Use confusionMatrix().
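Exercises 9 and 10 could be approached along the following lines; check it only after trying them yourself. The validation set comes from the earlier tutorial, so the 80/20 split of iris below, the seed, and the names idx, training and validation are assumptions made to keep the sketch self-contained. Some caret versions also need the e1071 package for confusionMatrix().

```r
library(caret)  # assumption: caret is installed (e1071 may also be required)

# Assumption: hold out 20% of iris as a stand-in validation set
set.seed(7)
idx        <- createDataPartition(iris$Species, p = 0.80, list = FALSE)
training   <- iris[idx, ]
validation <- iris[-idx, ]

control <- trainControl(method = "cv", number = 10)
fit.lda <- train(Species ~ ., data = training, method = "lda",
                 metric = "Accuracy", trControl = control)

# Predicted classes for the held-out rows, returned as a factor
predictions <- predict(fit.lda, validation)

# Cross-tabulate predictions against the true labels
cm <- confusionMatrix(predictions, validation$Species)
print(cm)
```

The accuracy reported here is computed on data the model never saw during training, so it is the figure to compare against the cross-validated estimate.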
