Two members of RugBcn have developed an R package that eases the path to web scraping. Among the existing packages, the well-known RCurl and XML stand out. Both are enough for most situations, but they are limited when some JavaScript stands between the user and the information, for instance when the only way to reach the desired page is by clicking buttons, selecting from menus, and so on.
Relenium brings Selenium, a tool traditionally used for web testing, into R through the rJava package; Selenium is implemented in many languages, and Relenium uses the Java version. Its use is very intuitive, since it reproduces the actions that a human would perform on a web page. The webpage of the project can be found here. There is an example explaining in detail how to use it. The package is still in development, so any comments or suggestions are welcome.
We hope you enjoy it.
Lluis Ramon and Aleix Ruiz de Villa,
RugBcn (Barcelona R Users Group)