[This article was first published on rapporter, and kindly contributed to R-bloggers].
Sorry about the noisy post title, but it happens to be the name of the book I have been working on for the past year, which has just been published by Packt.
I do not think that reading this ~400 page book will turn everyone into a true master of R and data analysis, but I believe it can get you on the way. I wrote this book with a relatively large target audience in mind: readers with some prior R experience (such as an introductory university course or MOOC covering how to install R, load CSV files or generate a histogram), but without the time or need to walk through a complete series of books on the stats background, algorithms and domain-specific knowledge on handling different data types.
So this is not a reference book, and it does not include a single formal mathematical formula. Instead, it provides a practical introduction, many references and hands-on examples on the following topics:
- Reading data from larger text files and databases in an optimal way
- Loading data from the Web via parsing HTML, XML, JSON and interacting with APIs
- Filtering, summarizing and restructuring data
- Building and interpreting generalized linear models
- Traditional multivariate statistical methods for dimension reduction and latent variables
- Classification and clustering, including supervised and unsupervised statistical and machine learning methods
- Handling outliers and missing values
- Processing unstructured text data
- A bit of social network analysis
- Smoothing, seasonal decomposition and modeling time-series
- Visualizing spatial data
- The number of R Foundation members and R conference attendees (previously presented at the useR! 2014 and 2015 conferences besides an interactive webapp on R-activity around the world)
- The number of packages per R package maintainer
- The volume and timeline of messages and posters on the [R-help] mailing list
- Estimating the number of R users around the world
- The number of R users on Facebook and Twitter
After ~1001 sleepless nights, my #rstats book on #datascience is published w/ a free chapter https://t.co/7lS4pgN06k pic.twitter.com/MnM24P67dE— Gergely Daróczi (@daroczig) October 1, 2015
Some quick statistics on the book:
- 14 chapters
- 396 pages
- 95 packages loaded
- hflights and data.table used in 7, ggplot2 in 5, dplyr and plyr in 4, microbenchmark and MASS used in 3 chapters
- 5 reviewers
- more than 20 contributors
- 2,711 lines of the code bundle on GitHub
- 581 days between signing the author contract and the actual publication date
- around 320 e-mails sent and received with the ISBN in the subject line
- 10,000 kilometers between the places where I wrote the first and the last chapters
- and I forgot to use time tracking software after logging 174.73 hours spent on the book