Download all Documents from Google Drive with R
[This article was first published on theBioBucket*, and kindly contributed to R-bloggers]. (You can report an issue about the content on this page here.)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
A commenter on my blog recently asked whether it is possible to retrieve direct links to all of your Google Documents. It is, and it is easily done with R:
# you'll need RGoogleDocs (with RCurl dependency..)
install.packages("RGoogleDocs", repos = "http://www.omegahat.org/R", type="source")
library(RGoogleDocs)

gpasswd = "mysecretpassword"
auth = getGoogleAuth("kay.cichini @ gmail.com", gpasswd)
con = getGoogleDocsConnection(auth)

CAINFO = paste(system.file(package="RCurl"), "/CurlSSL/ca-bundle.crt", sep = "")
docs <- getDocs(con, cainfo = CAINFO)

# get file references
hrefs <- lapply(docs, function(x) return(x@access["href"]))
keys <- sub(".*/full/.*%3A(.*)", "\\1", hrefs)
types <- sub(".*/full/(.*)%3A.*", "\\1", hrefs)

# make urls (for url-scheme see: http://techathlon.com/download-shared-files-google-drive/)
# put format parameter for other output formats!
pdf_urls <- paste0("https://docs.google.com/uc?export=download&id=", keys)
doc_urls <- paste0("https://docs.google.com/document/d/", keys, "/export?format=", "txt")

# download documents with your browser
gdoc_ids <- grep("document", types)
lapply(gdoc_ids, function(x) shell.exec(doc_urls[x]))

pdf_ids <- grep("pdf", types, ignore.case = T)
lapply(pdf_ids, function(x) shell.exec(pdf_urls[x]))
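To see what the two `sub()` calls and the URL templates in the script above produce, here is a small sketch that runs without RGoogleDocs or a Google login. The two `hrefs` and their ids are invented for illustration, assuming the feed links have the shape `.../full/<type>%3A<key>` that the patterns expect; `make_doc_url()` and `make_file_url()` are hypothetical helper names, not part of any package.

```r
# Hypothetical feed hrefs; the ids are made up for illustration only.
hrefs <- c(
  "https://docs.google.com/feeds/default/private/full/document%3AabcDEF123",
  "https://docs.google.com/feeds/default/private/full/pdf%3AxyzUVW456"
)

# Same patterns as in the script: the key follows "%3A" (a URL-encoded ":"),
# the type sits between "/full/" and "%3A".
keys  <- sub(".*/full/.*%3A(.*)", "\\1", hrefs)
types <- sub(".*/full/(.*)%3A.*", "\\1", hrefs)

# Hypothetical helpers wrapping the URL templates from the script.
# "format" is exposed so that other export formats can be requested.
make_doc_url <- function(key, format = "txt") {
  paste0("https://docs.google.com/document/d/", key, "/export?format=", format)
}
make_file_url <- function(key) {
  paste0("https://docs.google.com/uc?export=download&id=", key)
}

# Pick the template by document type.
is_doc <- grepl("document", types)
is_pdf <- grepl("pdf", types, ignore.case = TRUE)

urls <- character(length(keys))
urls[is_doc] <- make_doc_url(keys[is_doc])
urls[is_pdf] <- make_file_url(keys[is_pdf])

print(data.frame(type = types, key = keys, url = urls))
```

Building the URLs in one vector, rather than opening each in a browser right away, makes it easy to inspect the links first or to write them to a file.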