You can scrap it and write something better but let me keep R ;)

ellbur

11 years ago

[This article was first published on Struggling Through Problems, and kindly contributed to R-bloggers].


Ross Ihaka (via Xi’an, thanks 😉) gives a nice example to show why R is basically impossible to optimize:

> f = function() {
>   if (runif(1) > 0.5) {
>     x = 10
>   }
>   x
> }

x in the last expression could be a local or a global, and this won’t be known until runtime!
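To make that concrete, here is a minimal sketch. It is a variant of the function above, not the original: the hypothetical helper `f_where` takes the random draw as an argument (so the result is repeatable) and reports where `x` was found instead of returning it. The global `x` is also my own addition.

```r
# Hypothetical variant of f(): the draw is passed in so both paths can be shown.
f_where <- function(draw) {
  if (draw > 0.5) {
    x <- 10
  }
  # Look only in this call's own frame, not in enclosing environments.
  if (exists("x", envir = environment(), inherits = FALSE)) "local" else "global"
}

x <- 99        # a global x, so the lookup succeeds either way

f_where(0.9)   # branch taken: x lives in the function's frame
f_where(0.1)   # branch skipped: x can only be the global one
```

The same symbol on the same line resolves to two different variables depending on a value that exists only at run time.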

That’s pretty good.

I wonder if this will always return 10:

> f = function() {
>   a = 10
>   a
> }
> f()
[1] 10

Maybe we could optimize it out? No:

> regular.equals <- `=`
> `=` = function(...) {
>   with(parent.frame(), a <- 11)
> }
> f()
[1] 11
> `=` <- regular.equals

OK but assigning to `=` is really cheating. This is cheating too:

> regular.brace = `{`
> `{` = function(...) 11
> f()
[1] 11
> `{` = regular.brace

If `=` and `{` weren’t available for assignment, is there any way to make f not return 10? Well there’s this, but it’s hardly “making f not return 10”, since it just swaps out the body of f:

> body(f) = quote(11)
> f()
[1] 11

I can’t think of any other way to attack it; so if you block all those I guess we can declare this function safe 😉

How about this one:

> g = function(x, y) {
>   x + y
> }

Could we safely say that while g() is being called, f() will never be called at the same time (i.e., so that we could overlay their stacks)?

No, not even close. There are a million ways to break that:

> g(f(), f())
[1] 22

(Lazy evaluation: f() is called while executing g().)
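If you want to see the order of events for yourself, here is a small sketch. The names `f2`, `g2`, `note` and `events` are made up for this illustration and are not part of the examples above.

```r
events <- character(0)
note <- function(msg) events <<- c(events, msg)  # append to the log

f2 <- function() {
  note("f2 runs")
  10
}

g2 <- function(x, y) {
  note("g2 starts")
  x + y   # the promises for x and y are only forced here
}

result <- g2(f2(), f2())
events
```

The log shows "g2 starts" before either "f2 runs": the arguments are evaluated from inside the body of g2, not before the call.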

> normal.plus = `+`
> `+` = function(a, b) UseMethod("+")
> `+.numeric` = function(a, b) { f() * a * b }
> g(1, 2)
[1] 22
> `+` = normal.plus

(Generic `+` is not uncommon).
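You don't even need to overwrite `+` to get there, because the built-in operator already dispatches on S3 classes. A minimal sketch, with a made-up class `money` and a hypothetical constructor of the same name:

```r
# Hypothetical S3 class wrapping a single number.
money <- function(amount) structure(list(amount = amount), class = "money")

# S3 method for `+`; base R dispatches to it when the operands are "money".
`+.money` <- function(e1, e2) money(e1$amount + e2$amount)

total <- money(5) + money(7)
total$amount
```

Any function containing `x + y` may therefore end up running arbitrary user code, depending on what gets passed in.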

Well at least can we say g() will always evaluate both arguments? No:

> g(return(2), print("hi"))
[1] 2

Which can make code that looks like an error be valid:

> g(return(2), not.defined)
[1] 2

If you assume g() is strict you will knock out some perfectly legitimate R code 🙂
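A tamer illustration of the same point, as a sketch with made-up function names (`ignores_arg`, `uses_first`): an argument that is never touched is never evaluated, so even a call to stop() in that position is harmless.

```r
# Never looks at x, so the promise for x is never forced.
ignores_arg <- function(x) 3
ignores_arg(stop("never evaluated"))

# y is only forced on the else branch.
uses_first <- function(x, y) if (x > 0) "positive" else y
uses_first(1, stop("never evaluated"))

# Forcing y does raise the error; tryCatch captures it for inspection.
err <- tryCatch(uses_first(-1, stop("now evaluated")), error = function(e) e)
conditionMessage(err)
```

An optimizer that evaluated arguments eagerly would turn the first two calls into errors.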

Likewise an explicit call to return() doesn’t have to actually do anything:

> g = function(x) {
>   3
> }
> g(return(2))
[1] 3

I also find these hacks delightful:

> delayedAssign('b', {b <- 7; 8})
> a <- b
> a
[1] 8
> b
[1] 7

> x = '@'
> up = function() {
>   x. <- x
>   x <<- intToUtf8(utf8ToInt(x)+1L)
>   delayedAssign(x, up(), ass=.GlobalEnv)
>   x.
> }
> up()
[1] "@"
> A
[1] "A"
> B
[1] "B"
> C
[1] "C"
> D
[1] "D"

> backw = function(...) {
>   args = as.list(substitute(list(...)))[-1L]
>   env = parent.frame()
>   for (arg in rev(args)) {
>     res <- eval(arg, env)
>   }
>   res
> }
> environment(backw) = baseenv()
> `{` = backw
> fib = function(n) {
>   a
>   for (i in 1:n) {
>     a = b - a
>     b = a + b
>   }
>   b = 1
>   a = 1
> }
> fib(8)
[1] 34

> lex.scope = function(...) {
>   args = as.list(substitute(list(...)))[-1L]
>   parent = parent.frame()
>   env = new.env(parent=parent)
>   for (arg in args) {
>     res <- eval(arg, env)
>   }
>   res
> }
> environment(lex.scope) = baseenv()
> `{` = lex.scope
> {
>   x = 2
>   {
>     x = 3
>   }
>   x
> }
[1] 2

(Hey, that one could actually be useful).
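For what it's worth, base R offers a non-hacky way to get a similar effect: local() evaluates its argument in a fresh child environment. A small sketch (the variable names are just for illustration):

```r
x <- 2

inner <- local({
  x <- 3    # assigned in a new environment, not the enclosing one
  x * 10
})

x       # the outer x is untouched
inner   # value of the last expression in the local block
```

Unlike the `{` override, this leaves ordinary braces alone, so you opt in block by block.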

> evanescent = function(x, v) {
>   force(v)
>   env = parent.frame()
>   name = as.character(substitute(x))
>   delayedAssign(name, {
>     env[[name]] = alist(x=)$x
>     v
>   }, ass = env
>   )
> }
> environment(evanescent) = baseenv()
> `=` = evanescent
> a = 2
> a
[1] 2
> a
Error in eval(expr, envir, enclos) :
  argument "a" is missing, with no default

My thoughts on all this: R is a wonderful and flexible language, but it is not at all the right language for intense numerical calculation: it is simply too flexible to be fast. And unfortunately, whenever you’re trying some algorithm that hasn’t been done before, it will necessarily not already be written in C, and you find yourself having to make a choice: at what point is it too slow to keep in R, so that the best thing to do is to give up and rewrite it in C?

The thing is, as an interactive environment, R’s flexibility really gains something (think how you’d implement with() in Python). I personally am OK with the current system of using R for high level code and ducking to C for tight loops; it’s better than trying to make one language be good at 2 different things.
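To show how little it takes in R, here is a sketch of a with()-like function. The name `my_with` and the data frame `df` are mine, and this is only meant to be close in spirit to the base version, not a copy of it.

```r
my_with <- function(data, expr) {
  # substitute() grabs the expression the caller typed, unevaluated;
  # eval() then looks names up in `data` first, falling back to the caller.
  eval(substitute(expr), data, enclos = parent.frame())
}

df <- data.frame(height = c(1.5, 1.8), weight = c(50, 80))
scale <- 2   # lives in the calling environment, not in df

my_with(df, weight / height^2)
my_with(df, weight * scale)
```

This only works because arguments arrive as unevaluated promises and environments are first-class values, which is exactly the flexibility that makes the optimizer's life hard.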
