R Function GScholarScraper to Web-Scrape Google Scholar Search Results
[This article was first published on theBioBucket*, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Based on my previous post on web scraping, I coded and uploaded the function "GScholarScraper" HERE for testing!
The function will pull all (!) results, processing pages in chunks of 100 results/titles, and return a file with all titles, links, etc. It will also produce a word cloud using the words in the publication titles.
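The chunking and title-word counting described above can be sketched in a few lines of base R. This is only an illustration, not the code of GScholarScraper itself: the helper names `chunk_starts` and `title_word_freq` are hypothetical, the 0-based offsets assume a `start`-style paging parameter, and the titles are made-up sample data rather than scraped results.

```r
# Hypothetical helper (not part of GScholarScraper): start offsets needed to
# page through `total` hits in chunks of `chunk` results. Offsets are 0-based,
# which is an assumption about how the paging parameter works.
chunk_starts <- function(total, chunk = 100) {
  if (total <= 0) return(numeric(0))
  seq(0, total - 1, by = chunk)
}

# Hypothetical helper: word frequencies from a character vector of titles.
# Words shorter than `min_chars` letters are dropped as a crude stop-word filter.
title_word_freq <- function(titles, min_chars = 3) {
  words <- unlist(strsplit(tolower(titles), "[^a-z]+"))
  words <- words[nchar(words) >= min_chars]
  sort(table(words), decreasing = TRUE)
}

# 250 hits would need three requests of up to 100 results each.
starts <- chunk_starts(250)

# Made-up sample titles, standing in for scraped publication titles.
titles <- c("Web scraping with R",
            "Scraping Google Scholar",
            "R for web data")
freq <- title_word_freq(titles)
print(starts)
print(freq)
```

A frequency table like `freq` is the kind of input a word cloud needs; with the wordcloud package attached, something along the lines of `wordcloud(names(freq), as.vector(freq), min.freq = 1)` should draw it.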
Please try your own search strings and report errors, etc.!
You can source the function by running the following lines:
setwd(tempdir())
download.file("http://docs.google.com/uc?export=download&id=0B2wAunwURQNsM2EyYWNjOWYtZmFkMi00MmJhLWJmMzUtMjRiNGFiMWVkZmI2", destfile = "google_docs_script.txt", mode = "wb")

# read it and run an example:
source(paste(tempdir(), "/google_docs_script.txt", sep = ""))
ls() # the function should be listed

# remove files from tempdir:
unlink(dir())

Built and runs properly under:
R version 2.13.0 (2011-04-13) and R version 2.13.2 (2011-09-30)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] stringr_0.5 tm_0.5-6 wordcloud_1.2 Rcpp_0.9.7
loaded via a namespace (and not attached):
[1] plyr_1.5.1 slam_0.1-23
PS: The errors reported recently (see comments) have been resolved and the source code has been updated.