The R-Podcast Episode 9: Adventures in Data Munging Part 1
[This article was first published on The R-Podcast, and kindly contributed to R-bloggers].
It’s great to be back with a new episode after an eventful break! This episode begins a series on my adventures in data munging, a.k.a. data processing. I discuss three issues that demonstrate the flexibility and versatility R brings to recoding messy values, importing inconsistent data files, and pinpointing problematic observations and variables. We also have an extended listener feedback segment with an audio installment of the “pitfalls” of R contributed by listener Frans. I hope you enjoy this episode! Keep passing along your feedback to theRcast(at)gmail.com, and stop by the forums as well.
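For readers who want a feel for these three tasks before listening, here is a minimal base R sketch. It is not the code from the episode: the file names, columns, and values are made up for illustration, and the recoding step uses a plain named lookup vector rather than the car package's `recode()` function that the episode covers.

```r
# Hypothetical example data: file names, columns, and values are invented.
tmp_dir <- tempfile("munging_")
dir.create(tmp_dir)
writeLines(c("team,goals,result", "PHI,3,W", "BOS,na,l", "NYR,2,Win"),
           file.path(tmp_dir, "game1.csv"))
file.create(file.path(tmp_dir, "game2.csv"))  # deliberately empty file

# 1. Skip empty files (0 bytes) so the import step does not fail on them.
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
non_empty <- files[file.info(files)$size > 0]

# 2. Import the remaining files, mapping inconsistent missing-value codes to NA.
games <- do.call(rbind, lapply(non_empty, read.csv,
                               na.strings = c("", "NA", "na"),
                               stringsAsFactors = FALSE))

# 3. Recode messy values with a named lookup vector (base R stand-in for
#    car::recode; unmatched values would become NA).
lookup <- c(W = "win", Win = "win", L = "loss", l = "loss")
games$result_clean <- unname(lookup[games$result])

# 4. Pinpoint problematic variables and observations.
colSums(is.na(games))            # count of missing values per variable
games[!complete.cases(games), ]  # rows with at least one missing value
```

The lookup-vector approach keeps the sketch free of package dependencies. With real data, the set of missing-value codes and the recoding rules would have to come from inspecting the files themselves.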
The following resources are mentioned in this episode:
- Summary of useR! 2012 conference: http://pairach.com/2012/06/24/user2012/
- R 2.15.1 release notes: http://cran.r-project.org/src/base/NEWS
- NHL Analysis GitHub Repo: https://github.com/thercast/nhl_analysis
- plyr package: http://plyr.had.co.nz/
- car package: http://cran.r-project.org/web/packages/car/index.html
- ProjectTemplate: http://www.projecttemplate.net/
- Hockey Summary Project: http://hsp.flyershistory.com/
- Pitfalls of R blog post: http://pitfalls-r-us.blogspot.nl/2012/07/time-zones.html
Episode 9 Time Stamps
- 00:00 The R-Podcast #009: Adventures in Data Munging Part 1
- 00:31 Introduction
- 01:38 Big news: +1
- 03:53 R 2.15.1 released
- 04:26 UseR! 2012
- 07:20 Hockey Summary Project
- 10:30 Dealing with empty files
- 15:18 Importing inconsistent data files
- 28:15 Recoding using car package
- 35:08 Useful functions for pinpointing issues
- 44:55 Listener Feedback
- 45:14 Daniel: Advice on data munging
- 55:01 Frans: Pitfalls of R
- 66:28 Wrapping up: subscribe to the podcast, theRcast(at)gmail.com, +1-269-849-9780, Twitter @theRcast, Google Plus
- 71:22 End