Kaleidoscope IIb (useR! 2011)
L Collingwood – RTextTools
RTextTools is a machine learning library for automated text classification. The package builds on existing packages such as tm and randomForest. Use case: an undergrad is labelling congressional bills by hand but quits partway through. Using the data labelled so far, the remaining documents are classified automatically. The speaker gave a nice overview of machine learning techniques, but I was already familiar with them so didn’t bother making notes.
Workflow:
- Read data;
- Missed opps;
- Create Corpus;
- Train Models – SVM, SLDA, TREE, etc;
- Classify models;
- Analyze data.
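To make the workflow above concrete, here is a minimal sketch in plain Python. It does not use RTextTools or reproduce its API: the bill titles, the labels and the function names (`tokenize`, `train`, `classify`) are all made up for illustration, and a simple naive Bayes classifier stands in for the SVM, SLDA and tree models mentioned in the talk.

```python
import math
from collections import Counter, defaultdict

# Read data: hypothetical hand-labelled bill titles (not from the talk).
labelled = [
    ("a bill to expand hospital funding and medicare coverage", "health"),
    ("a bill to improve rural health clinics", "health"),
    ("a bill to authorize military construction", "defense"),
    ("a bill to fund navy ship procurement", "defense"),
]
# The documents left unlabelled when the labeller quit.
unlabelled = [
    "a bill to increase medicare coverage for hospital patients",
    "a bill to authorize navy and military spending",
]


def tokenize(text):
    """Lower-case a document and split it on whitespace."""
    return text.lower().split()


def train(docs):
    """Build per-label word counts (the 'corpus') from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the most probable label under naive Bayes with add-one smoothing."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Train, classify the remaining documents, then analyze the result.
model = train(labelled)
predictions = [classify(text, model) for text in unlabelled]
summary = Counter(predictions)
print(predictions, dict(summary))
```

In practice you would hold back part of the labelled data to check accuracy before trusting the predictions on the unlabelled documents.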
Jason Waddell – The Role of R in Lab Automation
License: free as in free beer and speech!
Summary: a scientist repeats the same experiment many times. How can we automate the analysis?
The R Service Bus allows a scientist to email or upload data, and the results are generated automatically.
High-level view
Data can arrive through various inputs, such as POP (email), XML and REST web services. Each input is added to a queue, and a pool of R servers works through the jobs. A simple configuration file handles the set-up.
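The queue-plus-pool design can be sketched in a few lines of Python. This is only an illustration of the general pattern, not the R Service Bus itself: the `CONFIG` dictionary, the job tuples, the source names and the "analysis" (a simple mean) are all hypothetical, and worker threads stand in for the pool of R servers.

```python
import queue
import threading
from statistics import mean

# Hypothetical stand-in for the configuration file.
CONFIG = {"pool_size": 3}


def worker(jobs, results, lock):
    """Take jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        job_id, source, values = job
        # The 'analysis': in the real system this would be an R script.
        outcome = {"source": source, "n": len(values), "mean": mean(values)}
        with lock:
            results[job_id] = outcome


def run_pool(submissions, pool_size):
    """Queue every submission and let a fixed pool of workers process them."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, results, lock))
        for _ in range(pool_size)
    ]
    for thread in threads:
        thread.start()
    for job in submissions:
        jobs.put(job)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker so each one stops
    for thread in threads:
        thread.join()
    return results


# Jobs arriving from different inputs all end up on the same queue.
submissions = [
    ("job-1", "email", [1.0, 2.0, 3.0]),
    ("job-2", "rest", [10.0, 20.0]),
    ("job-3", "xml-upload", [4.0]),
]
results = run_pool(submissions, CONFIG["pool_size"])
for job_id in sorted(results):
    print(job_id, results[job_id])
```

The point of the queue is that the inputs never need to know how many servers exist; changing the pool size is a configuration change only.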
Please note that the notes/talks section of this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!