Writing clean and readable R code the easy way
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Writing R code, especially for non-programmers like myself, can be a daunting task. You start out really motivated, following a naming convention, formatting your code in the most readable way and keeping your lines to a manageable length. But when the script grows and problems arise, when you start getting those annoying errors that you can’t seem to understand or debug, it all goes to rat-shit. You just start hacking away: creating unnecessary variables with weird names, adding hacks just to get one particular function to work, accidentally deleting lines and breaking your entire workflow, etc. At this point you are so overwhelmed that all those good practices you wanted to follow are the last thing on your mind.
So, after many (unnecessary) headaches I’ve decided to create this post to help (future) me and others that may face these same problems.
- First things first. Use RStudio. It’s a great IDE for R. If you’re still resisting it, like many people I know, don’t. Switch now and you can thank me later. I’ve tried some other IDEs but still haven’t found one that remotely compares to RStudio in functionality. It saves you a lot of time and effort.
- Use git for version control. It’s great for tracking changes in your code and for collaborative work. I personally like GitHub but there are others out there. To start using it, just create your repository on GitHub and then create your local “project with version control” in RStudio. With a simple copy-paste of the repository URL, your local project is linked to the online repository, and RStudio provides plenty of tools to interact with it without needing the command line.
- Use a naming convention. It will help you and others read and understand your code. As the R community doesn’t have an official naming convention, it doesn’t really matter which one you choose, but you should be consistent throughout your code. I personally use underscore_separated because I find it easier to read, but there are others. You can read this article for more information.
- Use coding standards. As with a naming convention, they will help you and others read and understand your code. The style guides by Hadley Wickham and Colin Gillespie are very helpful.
- Install and use packages that check your code automatically and help you clean it up. “lintr” and “goodpractice” are awesome.
- Finally, document your functions. How many times have you found yourself, a couple of months later, looking at one of your functions and unable to remember why you included that particular argument or where you came up with that equation? Although you must spend some extra time doing this, it pays off in the long run. You don’t even need to mess around with TeX, just use “roxygen2”.
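To make a few of the points above concrete, here is a minimal sketch of what a small function could look like with underscore_separated names and roxygen2-style comments. The function `celsius_to_fahrenheit()` and its arguments are made up for illustration only; they are not from any package.

```r
#' Convert a temperature from Celsius to Fahrenheit
#'
#' Hypothetical example function, used only to illustrate the layout
#' of roxygen2 comments and underscore_separated names.
#'
#' @param temp_celsius Numeric vector of temperatures in degrees Celsius.
#' @param digits Number of decimal places to round the result to.
#'
#' @return Numeric vector of temperatures in degrees Fahrenheit.
#'
#' @examples
#' celsius_to_fahrenheit(c(0, 100))
#'
#' @export
celsius_to_fahrenheit <- function(temp_celsius, digits = 1) {
  # Fail early with a clear message instead of a confusing error later on
  stopifnot(is.numeric(temp_celsius), is.numeric(digits))

  # Standard conversion formula: F = C * 9 / 5 + 32
  temp_fahrenheit <- temp_celsius * 9 / 5 + 32
  round(temp_fahrenheit, digits)
}

# Quick check in the console
celsius_to_fahrenheit(c(0, 37, 100))
```

The `#'` lines are ordinary comments as far as R is concerned, so the script runs without roxygen2 installed. Inside a package, `roxygen2::roxygenise()` should turn them into help files. Assuming the code is saved in a file such as `temperature.R` (a placeholder name), `lintr::lint("temperature.R")` would check its style, and `goodpractice::gp()` can be run on a whole package directory.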
Happy coding everyone!!!