Site icon R-bloggers

Survival skills for today’s analyst

[This article was first published on Musings of a forgetful functor, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I suffer a little from the age-old affliction of contrarianism. If a software package is used by the majority of the population, I assume it is flawed, highly limited, and its continued use will ultimately result in the downfall of the human race. Conversely, I am always extremely interested in a piece of software that has spread no further than the ivory tower in which it was first conceived.

The most longstanding example of this is my profound preference for the statistical computing language, R, over Microsoft Excel–a program in which I have begrudgingly developed an extremely high level of expertise. As every analyst knows, in the world of statistical software Excel is like McDonald’s, Burger King, Pizza Hut, and KFC all rolled into one. It is so prepackaged and devoid of customization, yet so ubiquitous that we cannot do without it. Like the fast-food chains, we loathe Excel because it always produces the same graphs, the same simple statistical analyses. Yet when we find ourselves lost in a strange, unfriendly foreign country we go running back to the grid lines of excel to order a Big Mac. As soon as we enter the jungle, our survival skills are found wanting.

Yes my friends, like it or not, Excel is here to stay although not for lack of alternatives. The fact is that it is the user-friendly nature of this program that has been the key to its success. A friend of mine once put it thus: “Excel has allowed a generation of knowledge workers to survive without being able to program.”

In truth, the driving force behind Excel’s success is simple: Excel is easy. Oh I know that the die-hards will talk about how it is a superior visual tool, and that spreadsheets allow for increased transparency in financial models. But this argument falls flat on its face when we introduce macros to the equation; if spreadsheets are about transparency then why do we add VBA scripts that the user can neither see nor understand. And if we are happy to use scripts at some level, why on earth do we need to do everything else in a cumbersome visual environment.

Furthermore, the simple ends to which Excel users put their tools is demonstrated by the tiny fraction of users use the (admittedly limited) functionality afforded by VBA. That so many users can get by without loops, functions, or any notion of encapsulation is testament to the primitive uses to which Excel is put: it is just a big button calculator with an autofill feature. Surely there are more skills that we need to survive in the analytical jungle.

Finding an alternative
As I write this, I am sure that I have just alienated the entire community of so called “Power Excel Users”. But I am sure that many engineers, economists, and scientists will agree that Excel is too limited to be the only quantitative tool that you have available in your office. The problem is finding software that you can successfully use in an office environment, and that is worth investing the time in learning.

The most important obstacle to overcome is the cost barrier. One of my friends, Rex, works for a major insurance group in their risk division. As far as I can tell, there are few organisations as willing to shell out money on analytical software as an insurance company. As a result, Rex regularly tells me about the wonderful software package that they just bought for $X million. These packages are highly customised and very user friendly (that’s why they cost big dollars). The problem is, what happens when the company’s systems change, or when you need to solve a new problem? Moreover, how does Rex do his job when he no longer has access to the software (ie, if he moves to another job). The cost of these highly customized packages means that they are not useful tools to acquire for your repertoire. As a rule of thumb, if it costs more than the latest version of Excel then assume that it is not portable: you cannot take it with you.

Enter R
Since I am a contrarian, I am sure that my advice should be taken with a grain (if not a barrel) of salt. However, I believe that there is now a viable alternative to Excel: R. R has been around for a long time, but it has taken a while to gain the following that it so rightfully deserves.

R is completely free and thus available at your fingertips wherever you go. No need to negotiate with the boss about breaking the budget for some fancy new piece of software. Download the binary, install it, and you are good to go. The advantage of this is not just that it is freely available, but that you can rely on it being available.

That just leaves its functionality, and my friends the good news is that R has functionality in spades. Take a quick look at its graphical features and you will see that almost any chart or graph you can dream of can be generated in R. In addition, the R community is continually adding new packages with new functions. In the last few years, the development of these packages has exploded in line with growth in the user base.

Transcending Excel and transitioning to R
Having used R for a reasonable amount of time, I find it hard to see why other analysts struggle day-in day-out with Excel. However, the great barrier to using R is that it is one step closer to all-out coding. Run through an interpreter, R seems strange and frightening to the non-programmer. If you have never learned a programming language, then chances are it will take you some time to shift to R.

Another issue is the need for other people to have the ability to review, check, and edit your work. Unless your boss is up to speed with R or is willing for your work to be checked by another R-literate colleague, you may have to stick with Excel for the moment.

There is, however, great scope for the analyst to grow their organisation into R over time. Whenever you are asked to do a self-contained piece of work independently, try doing it in R. I tend to go overboard and try to create advanced graphics that showcase R’s capabilities. The majority of the time, people ask how I made the graph and are then keen to see what else R can do.

Into the jungle
As the old saying goes, “to the man that has only a hammer, every problem looks like a nail”. At the moment, there are an awful lot of organisations that are filled with people who only have Excel and every problem sure looks like a spreadsheet.

I believe that analysts that fail to expand their toolkit tend to lose the ability to solve new problems. The generation of knowledge workers who are now in their 40s may have been lucky enough to survive on nothing more than their spreadsheet skills. However, as a twenty-something making my way in the business world, I cannot see how an analyst will be able to survive without some high-powered programming in their utility belt. R may not be enough on its own, but it seems like a good starting point.

Good luck in the analytical jungle.

To leave a comment for the author, please follow the link and comment on their blog: Musings of a forgetful functor.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.