- An example of monkey patching a package
- 2013-07-11
- Source
Scope
This article is about R package development.
Motivation
In the same spirit as my previous post, A dirty hack for importing packages that use Depends, I wanted to use an earlier version of the excellent gdata package in one of my packages, but as an Import instead of a Depends.
At that time gdata had a bug that prevented certain functions from being used through Import. I would like to show you the hack I used to make it work. (Gregory Warnes has now fixed gdata by releasing a new version, and I am very grateful for his responsiveness and efficiency, and for allowing me to use gdata as an illustration.)
Analysis of the problem
gdata used path.package("gdata") to locate its files at runtime. But path.package can only locate a package that is attached (to the search path), and that is exactly what we want to avoid.
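The difference is easy to observe on any installed package that is not attached. Here is a minimal sketch, assuming the tools package (which ships with R) is installed but not on the search path of the current session; the code checks that assumption rather than relying on it.

```r
# Assumption: 'tools' ships with R and is usually not attached by default.
pkg <- "tools"
is_attached <- paste0("package:", pkg) %in% search()

# find.package() looks at loaded namespaces and the library paths,
# so it works whether or not the package is attached.
found <- find.package(pkg)

# path.package() only inspects the search path, so it signals an error
# when the package is not attached; we turn that error into NULL.
result <- tryCatch(path.package(pkg), error = function(e) NULL)

print(is_attached)
print(found)
print(is.null(result))
```

When the package is not attached, result is NULL while found still holds the installation directory.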
The idea of the hack
The function to use to locate a package directory when the package is not attached is find.package.
What we would like is to somehow modify the gdata package at runtime and replace the calls to path.package with calls to find.package.
When executing a call to a gdata function, R locates the path.package symbol by first looking in gdata's namespace, then in gdata's imports environment, then in the base namespace, then in the global environment and the rest of the search path.
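This lookup chain can be walked by hand with parent.env. The sketch below uses the tools namespace as a stand-in, on the assumption that gdata may not be installed; any package namespace has the same layout.

```r
# Stand-in: 'tools' ships with R; gdata's namespace is laid out the same way.
ns      <- getNamespace("tools")   # the package namespace
imports <- parent.env(ns)          # its imports environment
base_ns <- parent.env(imports)     # the base namespace
global  <- parent.env(base_ns)     # the global environment, head of the search path

# Print the name of each environment along the chain.
for (e in list(ns, imports, base_ns, global)) print(environmentName(e))

# Package namespaces are sealed (locked) once loaded.
print(environmentIsLocked(ns))
```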
Because all these namespaces are sealed, and because the prime directive is not to modify the search path, we cannot alter or add a definition in those environments. But we can insert a new environment somewhere in the chain between gdata's namespace and the base namespace, and have it provide an alternate definition of path.package.
Illustration
We are going to insert a new environment between gdata's imports environment and its parent.
This environment will contain a symbol named path.package, but with the same definition as find.package (i.e. path.package will point to find.package).
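The insertion itself can be tried on toy environments before touching a real package. In this sketch all names (base_like, imports_like, ns_like, shim, where) are hypothetical stand-ins for the base namespace, gdata's imports environment, gdata's namespace, the inserted environment and path.package respectively.

```r
# Toy chain: ns_like -> imports_like -> base_like (all hypothetical stand-ins).
base_like <- new.env(parent = emptyenv())
assign("where", function() "original", envir = base_like)
imports_like <- new.env(parent = base_like)
ns_like <- new.env(parent = imports_like)

# Resolve 'where' starting from ns_like, following the enclosure chain.
lookup <- function() get("where", envir = ns_like, inherits = TRUE)()

before <- lookup()

# Insert a shim between imports_like and its current parent;
# the shim shadows 'where' without modifying base_like.
shim <- new.env(parent = parent.env(imports_like))
assign("where", function() "patched", envir = shim)
parent.env(imports_like) <- shim

after <- lookup()
print(c(before = before, after = after))
```

The original definition in base_like is untouched; only the lookup path from ns_like has changed.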
Implementation
To achieve this, we will provide a custom .onLoad function in MyPkg, which is executed when the package is loaded.
We also define a special symbol to avoid repeated insertion of a new environment in the case that our package is unloaded then reloaded.
.onLoad <- function(libname, pkgname) {
    # Monkey patch so that gdata uses find.package instead of path.package
    gdata_imports <- parent.env(getNamespace('gdata'))
    current <- parent.env(gdata_imports)
    HACK <- '__hack__'

    # set the new env between gdata imports and base env, only if not already done
    if (! exists(HACK, envir=parent.env(gdata_imports))) {
        # make a new env, with path.package pointing to find.package
        env <- new.env(parent=current)
        assign(HACK, TRUE, envir=env)
        assign('path.package', find.package, envir=env)  ### define path.package=find.package
        parent.env(gdata_imports) <- env                 ### insert our custom env
    }
}
Conclusion
This is an example of the kind of trick we can use in pursuit of trustworthy computation. Once again, this trick is no longer needed for the current version of gdata.
Karl Forner @ Quartz Bio (with help from Gregory Warnes)