Tips for getting started on Kaggle (datamining)

[This article was first published on Doodling with Data, and kindly contributed to R-bloggers].
Ever since I heard about Kaggle.com at this year’s Bay Area Data Mining Camp, I’ve wanted to participate, but I felt somewhat intimidated.
Jeremy Howard’s “Intro to Kaggle” talk at yesterday’s MeetUp (DataMining for a Cause) was exactly what I needed, though I didn’t know it beforehand.
He had a number of tips for beginners. I am sharing some of them here, in case they help others as well.

Jeremy Howard’s Tips for Getting Started on Data Mining Competitions at Kaggle

* Visit the Kaggle site and spend at least 30 minutes there every day. Read the forum, the competition pages, and the Kaggle blog
* It is much better to join competitions that are just starting up than ones where hundreds of entries and teams are already well on their way
* Aim to make at least one submission each and every day
* Jeremy himself participates in competitions to see where he stands, and to learn and get better
* He’d start out making trivial submissions (all zeros, alternating zeros, or every entry set to the average) until his algorithm got better
* A lot of people who compete use R (and SAS, Excel or Python)
* Nearly 50% of the winning entries use Random Forest techniques.
* If you place in the top 3, that is great. But personal improvement and learning should be the goal.
* As you get better, you might get invited to “private competitions.”
* Every day, strive to do a little better and improve your submission’s score and ranking
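
The trivial submissions mentioned in the tips above (all zeros, alternating zeros, every entry set to the average) can be sketched in a few lines. This is only an illustration, not Jeremy's code: the `id` and `prediction` column names, the helper function names, and the sample data are all made up, and every competition defines its own submission format. "Alternating" is read here as 0, 1, 0, 1, which is one possible interpretation.

```python
import csv
import io
from statistics import mean


def baseline_predictions(train_targets, n_rows, strategy):
    """Return n_rows trivial predictions for the chosen strategy."""
    if strategy == "zeros":
        return [0.0] * n_rows
    if strategy == "alternating":
        # 0, 1, 0, 1, ... (an assumed reading of "alternating zeros")
        return [float(i % 2) for i in range(n_rows)]
    if strategy == "mean":
        # Predict the training-set average for every row.
        avg = mean(train_targets)
        return [avg] * n_rows
    raise ValueError(f"unknown strategy: {strategy}")


def write_submission(ids, predictions, out):
    """Write an id,prediction CSV; the column names are hypothetical."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "prediction"])
    writer.writerows(zip(ids, predictions))


# Made-up training targets and test ids, for illustration only.
train_targets = [0, 1, 1, 0, 1]
test_ids = [101, 102, 103, 104]

# Write to an in-memory buffer; swap in open("submission.csv", "w", newline="")
# to produce a real file.
buffer = io.StringIO()
predictions = baseline_predictions(train_targets, len(test_ids), "mean")
write_submission(test_ids, predictions, buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

Submitting each of these in turn gives a floor on the leaderboard, so any later model can be judged by how far it moves beyond the trivial scores.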
