Site icon R-bloggers

Knoxville, TN: R for Text Analysis Workshop

[This article was first published on R – r4stats.com, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The Knoxville R Users Group is presenting a workshop on text analysis using R by Bob Muenchen. The workshop is free and open to the public. You can join the group at https://www.meetup.com/Knoxville-R-Users-Group. A description of the workshop follows.

R for Text Analysis

When analyzing text using R, it’s hard to know where to begin. There are 37 packages available and there is quite a lot of overlap in what they can do. This workshop will demonstrate how to do three popular approaches: dictionary-based content analysis, latent semantic analysis, and latent Dirichlet allocation. We will spend much of the time on the data preparation steps that are important to all text analysis methods including data acquisition, word stemming/lemmatization, removal of punctuation and other special characters, phrase discovery, tokenization, and so on. While the examples will focus on the automated extraction of topics in the text files, we will also briefly cover the analysis of sentiment (e.g. how positive is customer feedback?) and style (who wrote this? are they telling the truth?)

The results of each text analysis approach will be the topics found, and a numerical measure of each topic in each document. We will then merge that with numeric data and do analyses combining both types of data.

The R packages used include quanteda, lsa, topicmodels, tidytext and wordcloud; with brief coverage of tm and SnowballC. While the workshop will not be hands-on due to time constraints, the programs and data files will be available afterwards.

Where: University of Tennessee Humanities and Social Sciences Building, room 201. If the group gets too large, the location may move and a notice will be sent to everyone who RSVPs on Meetup.com or who registers at the UT workshop site below. You can also verify the location the day before via email with Bob at muenchen@utk.edu.

When: 9:05-12:05 Friday 1/27/17

Prerequisite: R language basics

Members of UT Community register at: http://workshop.utk.edu under Researcher Focused

Members of other groups please RSVP on your respective sites so I can bring enough handouts.

To leave a comment for the author, please follow the link and comment on their blog: R – r4stats.com.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.