
Machine Learning Strategy (Part 2)

[This article was first published on Philipp Probst, and kindly contributed to R-bloggers.]

This is the second blog post about machine learning strategy. It is about human-level performance, the tradeoff between bias and variance, and how to improve your algorithm iteratively.


Human-Level Performance
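Human-level performance is interesting mainly as a proxy for the Bayes error, the lowest error rate that is achievable at all. Comparing the training and dev errors against it tells you whether bias or variance is the bigger problem, which is exactly what the next section does.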

Bias and Variance

When training machine learning algorithms there is a tradeoff between bias and variance. I will differentiate here between avoidable bias and variance:

- Avoidable bias: the gap between the training error and human-level error (as a proxy for the Bayes error).
- Variance: the gap between the dev error and the training error.

Possible improvements:

- Against high avoidable bias: train a bigger model, train longer, use better optimization algorithms, or search for a better network architecture.
- Against high variance: get more training data, use regularization (e.g. L2 or dropout), use data augmentation, or again try a different architecture.

Notes:

- As long as the algorithm is worse than humans, you can get additional labeled data from humans and gain insight from manual error analysis.
- Once human-level performance is surpassed, progress typically slows down, because human-level error stops working as a proxy for the Bayes error.

Examples:
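As an illustration with made-up numbers: suppose human-level error is 1%, the training error is 8% and the dev error is 10%. Then the avoidable bias (7%) dominates the variance (2%), so reducing bias should be the priority. A minimal sketch of this bookkeeping in R (all numbers are illustrative, not from a real project):

```r
# All numbers are illustrative
human_level_error <- 0.01  # proxy for the Bayes error
training_error    <- 0.08
dev_error         <- 0.10

avoidable_bias <- training_error - human_level_error  # 0.07
variance       <- dev_error - training_error          # 0.02

if (avoidable_bias > variance) {
  message("Focus on bias: bigger model, longer training, better architecture")
} else {
  message("Focus on variance: more data, regularization, data augmentation")
}
```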

Error analysis

The following things can be done in error analysis:

- Take a sample of misclassified dev set examples (e.g. 100 of them) and inspect them manually.
- Categorize the errors and count how often each category occurs.
- For each category, estimate the maximum improvement that fixing it could yield.

Example for image classification with cats: among 100 misclassified dev set images, count how many are actually dogs, how many are big cats (lions, panthers), how many are blurry and how many are simply mislabeled. If only 5 of the 100 errors are dogs, then fixing the dog confusion can reduce the error by at most 5%.

Concentrate on the problems that cause the lion's share of the remaining errors.
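A minimal sketch of such a tally in R (the categories and counts are hypothetical, not from the original post):

```r
# Hypothetical error analysis for the cat classifier:
# categorize 100 misclassified dev set images and count each category
errors <- data.frame(
  category = c("dog", "big cat (lion, panther)", "blurry", "mislabeled", "other"),
  count    = c(8, 43, 30, 6, 13)
)
errors$share <- errors$count / sum(errors$count)

# work on the largest categories first
print(errors[order(-errors$share), ])
```

Sorting by share makes the ceiling of each possible fix obvious: with these numbers, working on the dog confusion can never buy more than 8% of the error.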

General advice:

→ "Build your first system quickly, then iterate to improve it"

→ In most cases data scientists build procedures/algorithms that are too complex rather than too simple, so think about the complexity of the machine learning system that you have built.
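In that spirit, the first version of a system can be as small as a logistic regression baseline with a dev set error to iterate against. A sketch in R (the binary task on the built-in iris data is just a stand-in for a real problem):

```r
# Quick first system: a plain logistic regression baseline
set.seed(1)
n     <- nrow(iris)
idx   <- sample(n, 0.7 * n)
train <- iris[idx, ]
dev   <- iris[-idx, ]

# made-up binary task: is the species "virginica"?
train$y <- as.integer(train$Species == "virginica")
dev$y   <- as.integer(dev$Species == "virginica")

baseline <- glm(y ~ Sepal.Length + Sepal.Width, data = train, family = binomial)
pred     <- as.integer(predict(baseline, newdata = dev, type = "response") > 0.5)
mean(pred != dev$y)  # dev error of the baseline; iterate from here
```

Once such a baseline and its dev error exist, the bias/variance analysis and the error analysis from above tell you where iterating is actually worthwhile.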

Next blog post

In the next blog post I will talk about what to do when the train/dev/test sets come from different distributions, how to learn multiple tasks at once, and the advantages and disadvantages of end-to-end learning.

This blog post is partly based on information from a deep learning course on coursera.org that I took recently. Hence, a lot of credit for this post goes to Andrew Ng, who taught that course.
