Today’s Assignment: Assignment

[This article was first published on Coffee and Econometrics in the Morning, and kindly contributed to R-bloggers.]
A new R user quickly discovers that there are multiple ways to store information in an object; the technical term for this is assignment. There’s = as in:

x = c(1,2,3)

and there’s <- as in:

x <- c(1,2,3)

R help on assignOps offers this explanation of the difference:
The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.
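One practical consequence of this rule shows up inside a function call. The sketch below is only an illustration and is not taken from the help page; the helper name demo_scope is made up, and median() is used simply because it takes an argument called x. Inside a call, = is read as naming an argument, whereas <- still performs an assignment.

```r
# Hypothetical helper: runs both forms in a fresh function environment
demo_scope = function() {
  # `=` inside a call names median()'s argument x; no variable x is created here
  m1 = median(x = c(1, 2, 3))
  x_after_equals = exists("x", envir = environment(), inherits = FALSE)

  # `<-` inside a call assigns x in this environment, then passes the value on
  m2 = median(x <- c(1, 2, 3))
  x_after_arrow = exists("x", envir = environment(), inherits = FALSE)

  c(equals = x_after_equals, arrow = x_after_arrow)
}

demo_scope()
```

The returned vector should report FALSE for the = form and TRUE for the <- form: only the arrow leaves a variable x behind.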
This restriction is a smart feature of R, because it catches a subtle coding error with =. When I am tired, I sometimes find myself writing a line of code that looks like this:

if(codename = "thedoctor") watch = "yes"

In this line of code, the problem is that codename = "thedoctor" isn’t a logical statement; the comparison I meant to write is codename == "thedoctor". As typed, the statement would assign the string "thedoctor" to an object called codename, if R’s language allowed it. Fortunately, R prevents this: the line is rejected with a syntax error (unexpected '=') instead of running.
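To see the parser reject the typo without stopping a script, the line can be handed to parse() as text. This is a minimal sketch; the helper name parses is hypothetical and exists only for this illustration.

```r
# Hypothetical helper: TRUE if the text parses as R code, FALSE otherwise
parses = function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

# The tired-typist version: `=` inside if() is a syntax error
typo_line = 'if(codename = "thedoctor") watch = "yes"'
parses(typo_line)

# The intended comparison uses ==
codename = "thedoctor"
watch = "no"
if(codename == "thedoctor") watch = "yes"
watch
```

The first check should come back FALSE, and after the corrected line runs, watch holds "yes".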

This feature of the R programming language helps me because, by default, I use = for assignment rather than <-. My style is not conventional, however. Popular R coding style guides (Google, 4D pie charts) suggest using <- for assignment rather than =.

Deep down, I think I use this unconventional default because I had the = versus == distinction (more generally, assignment versus logical statement) drilled into me in my first computer science class (not in R; Java). On reflection, I have three reasons to use = rather than <-. First, = is easier to type, using one keystroke and no shift keys. Second, the equals style looks cleaner to me (completely a personal preference). Third, I think assignment should take place out in the open (i.e., exactly where = is allowed). If the line of code is out in the open, both <- and = work the same.

On a final note, <- is not without its own problems in logical statements. For example, the statement

if(agediff<-2) status = "I'm older"

will cause problems, and R won’t warn you about them; you only find out when you get some really strange results later on. Here’s the code from an interactive R session where I show what can go wrong.

> agediff = -3
> if(agediff<-2) status = "I'm older"
> status
[1] "I'm older"
> agediff
[1] 2
> agediff = 4
> if(agediff<-2) status = "I'm older"
> status
[1] "I'm older"
> agediff
[1] 2
> status=NULL
> if(agediff<-2) status = "I'm older"
> status
[1] "I'm older"
> agediff
[1] 2


The problem in this code is two-fold. First, the if statement isn’t testing whether agediff < -2; R reads agediff<-2 as an assignment, and the assigned value 2 counts as TRUE, so status is set every time. Second, the value of agediff is overwritten with 2 because <- means assignment. The result is that both status and agediff have (potentially) the wrong values at the end of running the code.
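One way to confirm how R reads each spelling is to capture the expression with quote() and look at which operator it contains. This is just an illustrative sketch; the variable names no_space and with_space are made up, and nothing is actually assigned to agediff because quote() does not evaluate its argument.

```r
# quote() captures the expression without running it,
# so agediff is not touched here
no_space   = quote(agediff<-2)
with_space = quote(agediff < -2)

# The first element of each captured call is the operator R will apply
as.character(no_space[[1]])     # the assignment operator
as.character(with_space[[1]])   # the less-than operator

# A non-zero number used as an if() condition is treated as TRUE
as.logical(2)
```

Without the space, the captured operator is the assignment arrow; with it, the operator is the less-than comparison.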

Because this error can bite whether you use = or <- for assignment, it is worth knowing about regardless of your style preference. Putting spaces around operators, as spacing guidelines recommend, is a good way to avoid it.

> agediff = 4
> status = NULL
> if(agediff < -2) status = "I'm older"
> status
NULL
> agediff
[1] 4
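Spacing is not the only guard. As an alternative that is not part of the original session, wrapping the negative number in parentheses keeps the comparison intact even if the spaces get lost. A minimal sketch:

```r
agediff = 4
status = NULL

# Parentheses around the negative number keep `<` and `-` apart,
# so this is a comparison even with no spaces
if(agediff<(-2)) status = "I'm older"

status    # unchanged, because 4 is not less than -2
agediff   # unchanged, because nothing was assigned to it
```

Here status should remain NULL and agediff should remain 4, matching the spaced version.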


There are more examples in the Stack Overflow discussions on this topic (there are multiple threads), all of which are useful. They make for interesting reading.