The myth of the R learning curve
I think that the “difficult” R learning curve is a myth.
That’s because what people call the “R learning curve” is actually the combination of several disparate skill sets that are often taught as one conglomerated curriculum.
Let me explain.
Most university biology courses that teach R don’t just teach R. The goal of those classes is to help students learn how to plot and analyze their own data (and eventually use those skills for actual research).
So, is the course teaching data analysis and statistics, or is it teaching R? Usually its goal is to teach all of those things at once. That’s where the problem is.
Having worked one-on-one with many undergraduate and graduate students learning R, I have found that there is a particularly deep chasm between what it means to learn R and what it means to learn statistics. R is a programming language, but data analysis and statistics per se are mostly math. R is just a tool for doing statistics. Statistics and data analyses can be conducted with tools ranging from a calculator to Microsoft Excel. While R remains one of the best tools, there is no intrinsic link that implies R must be taught simultaneously with statistics. In fact, that’s my point.
One of the main reasons R appears to have a difficult learning curve is simply that learning R is often confounded with learning statistics at the same time. One of my goals with the courses that I teach is to separate statistics and R. If I’m going to teach a course on R, it is just about R. Once you have a solid handle on that, then we can move on to using R for learning statistics. But you need to know how to use the right tools first. That’s why I created my course on the basics of R for ecologists. It doesn’t cover any stats or data analysis, and that is intentional.
I want to outline one more reason why R appears to have a difficult learning curve.
Many of the mainstream R courses (such as the university courses I mentioned above) tend to mistake “learning R” for “learning everything in R.” The professors who teach these courses usually have many years of experience and have thus accumulated a very large tool shed of packages, functions, and operations for R (take a look at this Popular Mechanics post about some of the weirdest actual hardware tools). This tool shed then becomes the standard for what should be taught, and the course turns into cramming 10 years of experience with R into one semester. Not only is this too much to teach in such a short time, but it also takes the focus away from learning what is actually most important for simply plotting and analyzing data.
To be fair, there are a lot of great professors out there who do recognize this issue and carefully focus on the most important functions and operations when teaching R, but they seem to be uncommon.
In a recent post, I shared a cheat sheet on the most common and important functions when using R for ecology. (Click here to see the post and download the cheat sheet.) My goal there was to share the functions that are used most often and that also provide the most bang for the buck. In other words, the majority of all the code you will ever write in R comes down to just a handful of functions.
So why don’t most R courses start by focusing on those few functions? Maybe for the same reason that traditional language classes focus more on grammar and syntax than on just speaking? (Check out the Natural Order Hypothesis about learning new languages.)
To wrap this all up and summarize my point, I think that there are two primary reasons that there appears to be a difficult R learning curve and why so many students do end up having a truly difficult time with R.
First, teaching R is often confounded with teaching statistics. Pick one (preferably R first), and then move on to the other.
Second, start by teaching only the most essential and important functions. Don’t overwhelm your students with all the functions they might ever need to know. And if you know two ways to do the same thing? Just pick one.
So, what do you think about this topic? What are your Stork Beak Pliers in R?