Mongolite 0.5: authentication and iterators
A new version of the mongolite package has appeared on CRAN. Mongolite builds on jsonlite to provide a simple, high-performance MongoDB client for R, which makes storing small or large data in a database as easy as converting it to/from JSON. Have a look at the vignette or useR2015 slides to get started with inserting, JSON queries, aggregation and map-reduce.
Authentication and mongolabs
This release fixes an issue with the authentication mechanism that was reported by Dean Attali. The new version should properly authenticate to secured mongodb servers.
Try running the code below to grab some flights data from my mongolabs server:
    # load the package
    library(mongolite)
    stopifnot(packageVersion("mongolite") >= "0.5")

    # Connect to the 'flights' dataset
    flights <- mongo("flights", url = "mongodb://readonly:[email protected]:43942/jeroen_test")

    # Count data for query
    flights$count('{"day":1,"month":1}')

    # Get data for query
    jan1_flights <- flights$find('{"day":1,"month":1}')
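Because the query is passed as a plain JSON string, it can be assembled from R values before it is handed to count() or find(). The sketch below uses only base R; day_query() is a hypothetical helper invented for illustration and is not part of mongolite.

```r
# Hypothetical helper (not part of mongolite): build the JSON query string
# for a given day and month from ordinary R values.
day_query <- function(day, month) {
  sprintf('{"day":%d,"month":%d}', as.integer(day), as.integer(month))
}

# Same query string as the literal used above
q <- day_query(1, 1)
print(q)

# With a live connection it could then be used as, for example:
# flights$count(q)
# jan1_flights <- flights$find(q)
```

For anything beyond fixed integer fields, a JSON serializer is probably a safer choice than string formatting, since it takes care of quoting and escaping.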
While debugging this, I found that mongolab is actually very cool. You can sign up for your own free (up to 500MB) mongodb server and easily create data collections with one or more read-only and/or read-write user accounts. This provides a pretty neat way to publish some data (read-only) or sync and collaborate with colleagues (read-write).
Iterators
Another feature request from some early adopters was to add support for iterators. Usually you want to use the mongo$find() method, which automatically converts data from a query into a data frame. However, sometimes you need finer control over the individual documents.
The new version adds a mongo$iterate() method to manually iterate over the individual records from a query without any automatic simplification. Using the same example query as above:
    # Connect to the 'flights' dataset
    flights <- mongo("flights", url = "mongodb://readonly:[email protected]:43942/jeroen_test")

    # Create iterator
    iter <- flights$iterate('{"day":1,"month":1}')

    # Iterate over individual records
    while(length(doc <- iter$one())){
      # do something with the row here
      print(doc)
    }
Currently the iterator has 3 methods: one(), batch(n = 1000) and page(n = 1000). The iter$one method will pop one document from the iterator (it would be called iter$next() if that was not a reserved keyword in R). Both iter$batch(n) and iter$page(n) pop n documents at once. The difference is that iter$batch returns a list of at most length n whereas iter$page returns a data frame with at most n rows.
Once the iterator is exhausted, its methods will only return NULL.
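To make the three return types concrete without needing a database connection, here is a toy stand-in written in base R. It is not mongolite's implementation: make_iterator() and the sample documents are invented for illustration, and the stand-in only mimics the behaviour described above (one document, a list of at most n documents, a data frame of at most n rows, and NULL once exhausted).

```r
# Toy stand-in for a query iterator; NOT mongolite's implementation.
# It walks over an in-memory list of documents (named lists).
make_iterator <- function(docs) {
  pos <- 0L
  # Pop up to n documents, or return NULL when nothing is left
  take <- function(n) {
    if (pos >= length(docs)) return(NULL)
    idx <- seq(pos + 1L, min(pos + n, length(docs)))
    pos <<- max(idx)
    docs[idx]
  }
  list(
    one = function() {
      out <- take(1L)
      if (is.null(out)) NULL else out[[1]]
    },
    batch = function(n = 1000) take(n),
    page = function(n = 1000) {
      out <- take(n)
      if (is.null(out)) NULL else do.call(rbind, lapply(out, as.data.frame))
    }
  )
}

# Made-up sample documents
docs <- list(
  list(day = 1, month = 1, carrier = "UA"),
  list(day = 1, month = 1, carrier = "AA"),
  list(day = 1, month = 1, carrier = "B6"),
  list(day = 1, month = 1, carrier = "DL"),
  list(day = 1, month = 1, carrier = "EV")
)

iter <- make_iterator(docs)
first <- iter$one()     # a single document (named list)
pair  <- iter$batch(2)  # a list of at most 2 documents
rest  <- iter$page(10)  # a data frame with at most 10 rows (2 remain here)
done  <- iter$one()     # NULL, because the iterator is exhausted

# Consume a fresh iterator in chunks until it returns NULL
iter2 <- make_iterator(docs)
total <- 0
while (length(chunk <- iter2$batch(2))) {
  total <- total + length(chunk)
}
```

The same while(length(...)) pattern should carry over to a real iterator, since an exhausted iterator returns NULL and length(NULL) is 0.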