Don’t run afoul of Scoping Rules in R!
[This article was first published on r-bloggers – Aggregate Genius, and kindly contributed to R-bloggers].
Whether you are a veteran programmer with experience dating back to Fortran or a new college grad versed in all the latest technologies, if you use R you will eventually have to worry about scoping!
Sure, we all ignore scoping when we first pick up a new language. So what if all your variables and functions are global? You are the only one using them, right?! Unless you give up on R, your code will eventually outgrow your own system, because you will have to share it with others or deliver it to someone else. That is when you need to start paying attention to your code’s quality, starting with scoping!
Let’s start at the beginning of the R coding experience. When you type commands at the R prompt, everything you create is generally added to the global scope, which makes logical sense. Little changes when you program in a .R file: it is just a series of commands executed one by one. As your code grows more sophisticated, though, you will want and need functions for reusable pieces of code, and each function gets its own scope. This more granular scoping is ideal as your codebase grows!
Seems straightforward! However, there are two big gotchas: automatic searching and misuse of the double-arrow assignment operator (<<-).
If you understand and follow the above you will be well on your way to ensuring correctly scoped variables and functions in your R code. Yes, there are mechanisms for hiding variables and getting around the standard scoping and restrictions in R. However, once you are comfortable with the basics you’ll be able to properly deal with these mechanisms – we’ll leave that set of topics for another day and another post.
I’ve written a commented R script if you would like to see examples of the above scoping rules as well as the gotchas in action. Feel free to download and use it as you see fit!
Basic scoping rules in R
Variables and Function Definitions
- By default, variables and functions defined at the top level (at the prompt or in a script) are added to the global scope.
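Here is a minimal sketch of that rule. The names `answer` and `square` are made up for illustration and are not from my script; the comments assume the code is run at the prompt or with Rscript.

```r
# Hypothetical names, used only for illustration.
answer <- 42                 # top-level variable
square <- function(n) n^2    # top-level function definition

# Both names are visible to any later top-level code.
squared_answer <- square(answer)

# At the prompt or under Rscript, top-level code runs in the global
# environment, so this comparison is TRUE in that setting.
running_in_global <- identical(environment(), globalenv())

# The function remembers the environment it was defined in.
defined_where_we_are <- identical(environment(square), environment())
```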
Inside a Function
- Variables passed into the function as arguments are visible within the function. Variables defined in the parent (calling) scope are not visible, because R looks names up where a function was defined rather than where it was called, but globally defined variables are visible. If the parent scope is the same as the global scope, those variables will be visible!
- Variables created inside the function are local to that function and its sub-components (such as functions defined inside it), and are NOT visible outside of the function.
- Each invocation of a function is independent, which means variables declared and manipulated inside a function do not retain their values between calls.
- Arguments are effectively immutable from the caller's point of view: if you change the value of an argument, what you are actually doing is changing a local copy, and the caller's variable is untouched. For ordinary values such as vectors, lists, and data frames, R does not support “call by reference”, where the arguments can be changed in the called function and the caller can then use the changed variables. This is a very important difference from other languages. In some ways it makes your code safer and easier to debug and trace; in other ways it can be inconvenient, for example when you have to return several values of different types (the usual workaround is to return a list).
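The rules above can be sketched in a few lines. This is only an illustration under my own assumptions: every name in it (`report`, `caller`, `add_pair`, `count_call`, `add_bonus`) is hypothetical and chosen for the example.

```r
# All names below are hypothetical, chosen only for this illustration.

# 1. Arguments are visible; the caller's local variables are not.
report <- function(label) {
  # `label` is an argument, so it is visible here. `caller_only_value` is
  # neither an argument nor a local, and it does not exist where `report`
  # was defined, so looking it up raises an error that we catch.
  found <- tryCatch(caller_only_value, error = function(e) "not visible")
  paste(label, found)
}
caller <- function() {
  caller_only_value <- "set in caller"
  report("caller_only_value is")
}
caller_result <- caller()

# 2. Variables created inside a function stay inside it.
add_pair <- function(a, b) {
  local_total <- a + b
  local_total
}
pair_sum <- add_pair(2, 3)
local_total_leaked <- exists("local_total", inherits = FALSE)

# 3. Each call starts fresh: `n_calls` never survives from one call to the next.
count_call <- function() {
  if (!exists("n_calls", inherits = FALSE)) n_calls <- 0
  n_calls <- n_calls + 1
  n_calls
}
first_call <- count_call()
second_call <- count_call()

# 4. Changing an argument changes a local copy, not the caller's variable.
add_bonus <- function(scores) {
  scores[1] <- scores[1] + 100
  scores
}
original <- c(10, 20, 30)
updated <- add_bonus(original)
```

After running this, `caller_result` reports that the caller's variable was not visible, `first_call` and `second_call` are both 1, and `original` is unchanged while `updated` holds the modified copy.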
General
- Curly braces {} on their own do not create reduced or isolated scopes in R. That includes the braces around the body of an if statement or a for loop.
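A small sketch of the braces rule, contrasted with a function, which does get its own scope. The variable and function names are made up for the example.

```r
# Hypothetical names, used only for illustration.

# A bare block: the variable survives after the closing brace.
{
  inside_braces <- "created inside a bare block"
}
braces_visible <- exists("inside_braces", inherits = FALSE)

# Same story for a loop body: `i` and `last_square` are still there afterwards.
for (i in 1:3) {
  last_square <- i^2
}

# A function body is different: its variables disappear when the call ends.
wrapped <- function() {
  inside_function <- "created inside a function"
  invisible(NULL)
}
wrapped()
function_visible <- exists("inside_function", inherits = FALSE)
```

Here `braces_visible` is TRUE and `function_visible` is FALSE, and after the loop `i` is 3 and `last_square` is 9.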
Watch OUT for these Gotchas
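- Automatic Searching: when a name is not found inside a function, R automatically searches the enclosing scopes, all the way up to the global scope, so a function can silently pick up a variable you never passed in.
- Double-Arrow Assignment Misuse: <<- assigns to a variable in an enclosing scope (creating a global one if none exists), so using it where <- was intended changes state outside the function.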
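Both gotchas can be sketched briefly. This is my own minimal illustration, not an excerpt from the downloadable script, and all names in it (`rate`, `price_with_rate`, `result`, `clean_bad`, `clean_good`) are hypothetical.

```r
# All names below are hypothetical, chosen only for this illustration.

# Gotcha 1: automatic searching.
rate <- 0.5
price_with_rate <- function(price) {
  # `rate` is not an argument or a local variable, so R quietly finds the
  # one defined outside the function.
  price * (1 + rate)
}
before_change <- price_with_rate(100)
rate <- 0.25                       # changing the outer variable...
after_change <- price_with_rate(100)   # ...changes what the function returns

# Gotcha 2: double-arrow assignment misuse.
result <- "important"
clean_bad <- function(values) {
  # `<<-` was typed where `<-` was intended, so this overwrites the
  # `result` defined outside the function.
  result <<- values[!is.na(values)]
  length(result)
}
n_bad <- clean_bad(c(1, NA, 3))
clobbered <- result                # no longer "important"

result <- "important"
clean_good <- function(values) {
  # `<-` creates a local `result`; the outer one is left alone.
  result <- values[!is.na(values)]
  length(result)
}
n_good <- clean_good(c(1, NA, 3))
preserved <- result                # still "important"
```

Passing `rate` in as an argument would make the first function's dependency explicit, and keeping `<<-` for the rare cases where you really mean to change outer state avoids the second surprise.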
To leave a comment for the author, please follow the link and comment on their blog: r-bloggers – Aggregate Genius.