Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Today I’m introducing a new version of rmongodb (which I have started to maintain): v1.8.0. Install it from GitHub:
library(devtools)
install_github("mongosoup/rmongodb@v1.8.0")
The release version will be uploaded to CRAN shortly. This release brings a lot of improvements to rmongodb:
- rmongodb now handles arrays correctly.
  - mongo.bson.to.list() has been rewritten from scratch. R’s unnamed lists are treated as arrays, named lists as objects. It also has an option that controls whether to try to simplify vanilla lists to arrays.
  - mongo.bson.from.list() has been updated.
- mongo.cursor.to.list() has been rewritten and behaves slightly differently: it no longer performs any type coercion while fetching data from a cursor.
- mongo.aggregation() has new options to match MongoDB 2.6+ features. Its second argument is now called pipeline (as it is called in the MongoDB command).
- A new function, mongo.index.TTLcreate(), creates indexes with the “time to live” property.
- R’s NA values are now converted into MongoDB null values.
- Many bug fixes (including installation problems on Windows); see the full list.
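The array and object mapping described in the list above can be sketched as a round trip through the two conversion functions. This is only a minimal illustration: the sample document is made up, and the argument name simplify is an assumption about how the simplification option is exposed (check ?mongo.bson.to.list in your installed version). No running MongoDB server should be needed for this part.

```r
library(rmongodb)

# Made-up sample document: an unnamed list sits next to a named list.
doc <- list(
  "_id" = "dummyID",
  tags  = list("string", 3.14, list(four = 4)),  # unnamed list -> BSON array
  meta  = list(mol = 42)                          # named list   -> BSON object
)

# R list -> BSON
b <- mongo.bson.from.list(doc)
print(b)  # inspect the resulting BSON structure

# BSON -> R list.
# Assumption: the simplification option is the `simplify` argument;
# FALSE asks for plain nested lists instead of simplified vectors.
back <- mongo.bson.to.list(b, simplify = FALSE)
str(back)
```

Comparing str(back) with and without simplification is a quick way to see what the option changes for your own documents.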
I want to highlight some of the changes.
The first and most important one is that rmongodb now handles arrays correctly. This issue was very annoying for many users (including me :-). Moreover, about half of the rmongodb-related questions on Stack Overflow were caused by it. In the new version of the package, mongo.bson.to.list() has been rewritten from scratch and mongo.bson.from.list() has been fixed. I tested the new behaviour heavily and everything works smoothly. Still, it is quite a big internal change, because these functions are the workhorses behind many other high-level rmongodb functions. Please test it; your feedback is very welcome. For example, here is the conversion of complex JSON into BSON using mongo.bson.from.JSON() (which internally calls mongo.bson.from.list()):
library(rmongodb)
json_string <- '{"_id": "dummyID", "arr":["string",3.14,[1,"2",[3],{"four":4}],{"mol":42}]}'
bson <- mongo.bson.from.JSON(json_string)
This will produce the following MongoDB document:
{"_id": "dummyID", "arr":["string",3.14,[1,"2",[3],{"four":4}],{"mol":42}]}
The second one is that mongo.cursor.to.list() has new behaviour: it returns a plain list of objects without any coercion. Each element of the list corresponds to one document of the underlying query result. An additional improvement is that mongo.cursor.to.list() uses R’s environments to avoid extra copying, so it is now much more efficient than the previous version (especially when fetching a lot of records from MongoDB).
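A short sketch of fetching documents as a plain list might look like the following. It assumes a MongoDB server is reachable on localhost with default settings, and the namespace test.rmongodb_demo is a hypothetical throwaway collection chosen for this example; adjust both to your setup.

```r
library(rmongodb)

# Assumptions: MongoDB runs on localhost with default settings, and
# "test.rmongodb_demo" is a hypothetical throwaway namespace.
mongo <- mongo.create()
ns <- "test.rmongodb_demo"

if (mongo.is.connected(mongo)) {
  # Insert two documents with different shapes.
  mongo.insert(mongo, ns, mongo.bson.from.list(list(name = "a", value = 1)))
  mongo.insert(mongo, ns, mongo.bson.from.list(list(name = "b", tags = list("x", "y"))))

  # Query everything in the namespace and fetch it as a list.
  cursor <- mongo.find(mongo, ns)
  docs <- mongo.cursor.to.list(cursor)

  # Each element of `docs` corresponds to one document.
  str(docs)

  mongo.drop(mongo, ns)   # clean up the demo collection
  mongo.destroy(mongo)    # close the connection
}
```

Because the documents are returned as they are, any conversion to a data frame or vector is left to the caller, who knows which fields are safe to combine.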
In the next few releases I plan to upgrade the underlying mongo-c-driver-legacy to the latest version, 0.8.1.