Site icon R-bloggers

RObjectTables are AWESOME

[This article was first published on Struggling Through Problems, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
< !-- begin{Schunk} !--> < !--\end{Schunk}!-->

Why isn’t everyone using the RObjectTables package? This is the best thing ever!

Here’s the basic idea of RObjectTables: An environment is an object where you can lookup names and associate them with values. And in particular its where you look up variables used in an expression. But there’s no reason you can’t take any other object that associates names with values (data.frame, list, SQL database, CSV file, filesystem, …) and use that to lookup your variables.

R already has sortof this because you can use eval() and friends on lists and data.frames. But it’s not extensible. The purpose of RObjectTables is to make it extensible.

In its current version RObjectTables is somewhat limited, because you can only attach() the created environments, not pass them to with() and such. But with extremely minor modifications this becomes possible. You can find a version with these changes on my GitHub site.

Now let me demonstrate how unbelievably useful this is. So naturally I will start with a useless example: an environment of only strings!

< !-- begin{Schunk} !--> < !-- begin{Sinput} !-->

> library(RObjectTables)

< !-- ccc -->

> db = newRClosureTable(list(

> assign = function(name, value) {

> # Not used

> },

>

> get = function(name) {

> name

> },

>

> exists = function(name) {

> T

> },

>

> remove = function(name) {

> # Not used

> },

>

> objects = function() {

> # Not used

> }

> ))

< !-- ccc -->

> with(db, x)

< !-- ccc --> < !-- end{Sinput} !-->

[1] “x”

< !--\end{Schunk}!-->

Muahahaha.

This environment, as you can see, is not terribly useful:

< !-- begin{Schunk} !--> < !-- begin{Sinput} !-->

> try(

> with(db, x + y),

> silent=T

> )

< !-- ccc -->

> geterrmessage()

< !-- ccc --> < !-- end{Sinput} !-->

[1] “Error in x + y : non-numeric argument to binary operator\n”

< !--\end{Schunk}!-->

Yes… using “+” (character) as a function… not so good. Also we don’t have to feel so bad about not implementing assign() because with `could assign.

But now let’s do something really useful.

< !-- begin{Schunk} !--> < !-- begin{Sinput} !-->

> reality = function() {

> parent = parent.frame()

> outer = new.env(parent = parent)

>

> formulas = list()

> valid = list()

> values = list()

>

> make.ref = function(name) {

> ref = list(

> db = self,

> name = name

> )

> class(ref) = ‘ref’

> ref

> }

>

> dep.table = list()

> rdep.table = list()

>

> propagate = function(name) {

> if (is.null(valid[[name]]) || !valid[[name]]) {

> }

> else {

> for (ref in rdep.table[[name]]) {

> ref$db$reset(ref$name)

> }

> }

> }

>

> reset = function(name) {

> propagate(name)

> valid[[name]] < < - F

> }

>

> add.dep = function(name, ref) {

> dep.table[[name]] < < - c(list(ref), dep.table[[name]])

> }

>

> add.rdep = function(name, ref) {

> rdep.table[[name]] < < - c(list(ref), rdep.table[[name]])

> }

>

> del.rdep = function(name, ref) {

> rdep.table[[name]] < < - setdiff(rdep.table[[name]], list(ref))

> }

>

> ptr = newRClosureTable(list(

> assign = function(name, value) {

> force(value)

> formulas[[name]] < < - function() value

> reset(name)

> },

> get = function(name) {

> if (is.null(formulas[[name]])) {

> return(tryCatch(

> get(name, outer),

> error = function(e) getUnbound()

> ))

> }

>

> this.ref = make.ref(name)

> for (ref in working.refs) {

> ref$db$add.dep(ref$name, this.ref)

> }

>

> if (is.null(valid[[name]]) || !valid[[name]]) {

> old.deps < < - dep.table[[name]]

> dep.table[[name]] < < - list()

>

> working.refs < < - c(list(this.ref), working.refs)

>

> values[[name]] < < - formulas[[name]]()

>

> working.refs < < - working.refs[1L]

>

> lost.deps = setdiff(old.deps, dep.table[[name]])

> for (ref in lost.deps) {

> ref$db$del.rdep(ref$name, this.ref)

> }

>

> gained.deps = setdiff(dep.table[[name]], old.deps)

> for (ref in gained.deps) {

> ref$db$add.rdep(ref$name, this.ref)

> }

>

> valid[[name]] < < - T

> }

>

> values[[name]]

> },

> exists = function(name) {

> !is.null(valid[[name]]) || exists(name, parent)

> },

> remove = function(name) {

> # TODO: this

> },

> objects = function(name) {

> names(valid)

> }

> ))

> class(ptr) = c(‘reality’, class(ptr))

>

> attr(ptr, ‘delayedAssign’) = (

> function(name, promise) {

> formulas[[name]] < < - promise

> reset(name)

> promise

> }

> )

>

> self = list(

> ptr = ptr,

> reset = reset,

> add.dep = add.dep,

> add.rdep = add.rdep,

> del.rdep = del.rdep

> )

> class(self) = ‘realptr’

>

> ptr

> }

< !-- ccc -->

> working.refs = list()

< !-- ccc --> < !--\end{Sinput}!--> < !--\end{Schunk}!-->

Why did I call it a “reality”? Well… uh… all the good names are taken. So think of it as storing the current version of… reality. What it does is keep track of a set of variables that depend on each other. Oh yeah and we need some way to assign into it:

< !-- begin{Schunk} !--> < !-- begin{Sinput} !-->

> `$.reality` = function(r, name) {

> get(name, envir=r)

> }

< !-- ccc -->

> `$< -.reality` = function(r, name, value) {

> expr = substitute(value)

> env = parent.frame()

>

> attr(r, ‘delayedAssign’)(

> name, function() eval(expr, env)

> )

>

> r

> }

< !-- ccc --> < !--\end{Sinput}!--> < !--\end{Schunk}!-->

Now we can write:

< !-- begin{Schunk} !--> < !-- begin{Sinput} !-->

> r = reality()

< !-- ccc -->

> r$x = 10

< !-- ccc -->

> r$y = sqrt(r$x)

< !-- ccc -->

> r$y

< !-- ccc --> < !-- end{Sinput} !-->

[1] 3.162278

< !-- begin{Sinput} !-->

> r$x = 20

< !-- ccc -->

> r$y

< !-- ccc --> < !-- end{Sinput} !-->

[1] 4.472136

< !--\end{Schunk}!-->

It is the feature I have always been wishing for. Now if only we could get .GlobalEnv to do this… You might think of attach()ing it: don’t do that. Trust me. I can’t show you because it will crash Sweave and not produce any output, but that doesn’t work yet (it would be nice if it did).

Luckily R has a useful feature that was never intended for this purpose (I find it is this way with most of R’s finest features): the debugging browser. So let us…

< !-- begin{Schunk} !--> < !-- begin{Sinput} !-->

> with.db = function(env, func1, func2) {

> if (‘reality’ %in% class(env)) {

> func1(env)

> }

> else {

> func2()

> }

> }

< !-- ccc -->

> reality.assign = function(name, rvalue) {

> env < - parent.frame()

> name < - substitute(name)

> srvalue < - substitute(rvalue)

>

> with.db(env,

> function(db) {

> name < - as.character(deparse(name))

> promise < - function() eval(srvalue, env)

>

> attr(db, ‘delayedAssign’)(name, promise)

> alist(x=)$x

> },

> function() {

> do.call(`< -`, list(name, rvalue), envir=env)

> }

> )

> }

< !-- ccc -->

> with(r, `=` < - reality.assign)

< !-- ccc -->

> run.interpreter = function(base.env) {

> options(browserNLdisabled = T)

> with(base.env, browser())

> }

< !-- ccc --> < !--\end{Sinput}!--> < !--\end{Schunk}!-->

Now I don’t know how to feed an interactive debugging session into Sweave, so here is copy-pasted this feature in use:

Browse[1]> y = sqrt(x)

Browse[1]> for (i in 1:10) {
+ x = i
+ print(y)
+ }
[1] 1
[1] 1.414214
[1] 1.732051
[1] 2
[1] 2.236068
[1] 2.449490
[1] 2.645751
[1] 2.828427
[1] 3
[1] 3.162278
Browse[1]> b = a

Browse[1]> c = b

Browse[1]> a = 1

Browse[1]> c
> run.interpreter(r)
Called from: eval(expr, envir, enclos)
Browse[1]> y = sqrt(x)

Browse[1]> for (i in 1:10) {
+ x = i
+ print(y)
+ }
[1] 1
[1] 1.414214
[1] 1.732051
[1] 2
[1] 2.236068
[1] 2.449490
[1] 2.645751
[1] 2.828427
[1] 3
[1] 3.162278
Browse[1]> z = y**2

Browse[1]> z
[1] 10
Browse[1]> x = 1:3

Browse[1]> z
[1] 1 2 3
Browse[1]> b = a

Browse[1]> c = b

Browse[1]> a = 1

Browse[1]> c
> 

Haha “c” closed the debugger 😉

So far this actually seems stable enough for general use. I wouldn’t trust it with data you care about, but I intend to make great use of it.

To leave a comment for the author, please follow the link and comment on their blog: Struggling Through Problems.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.