[This article was first published on Pachá (Batteries Included), and kindly contributed to R-bloggers].
About the book
I obtained a copy of this book by Matthew Jockers through university access to Springer. You can also get a copy from Amazon.
This book is short and to the point. I would actually strongly recommend it to anyone interested in text mining and natural language processing.
What do I like the most about this book? That you can download the exercises from the book’s website. I downloaded the zip, extracted the folder, and created an RStudio project in that folder, and that’s it. I could then follow the explanations without having to transcribe the code from the PDF. Amazing!
Table of contents
I couldn’t find the full table of contents on the web, so I reproduce it here:
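Part | Contents |
---|---|
Part I Microanalysis | R Basics |
Part I Microanalysis | First Foray into Text Analysis with R |
Part I Microanalysis | Accessing and Comparing Word Frequency Data |
Part I Microanalysis | Token Distribution Analysis |
Part I Microanalysis | Correlation |
Part II Mesoanalysis | Measures of Lexical Variety |
Part II Mesoanalysis | Hapax Richness |
Part II Mesoanalysis | Do It KWIC |
Part II Mesoanalysis | Do It KWIC (Better) |
Part III Macroanalysis | Clustering |
Part III Macroanalysis | Classification |
Part III Macroanalysis | Topic Modeling |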