Reflections on consulting part 5 – what languages and tools to learn?
What languages and tools should you learn as a math/stat consultant? To jump to the answer: Excel/VBA, SQL, R, Java, and Python.
Spreadsheets have many problems with verifiability and scalability, so why Excel?
Excel is:
- Useful for prototyping ideas quickly, either for your own use or to show to other team members
- Well-known and understood by many business users – no need to explain a new system to them
- A useful cross-check on work in other languages
- Zero marginal cost – people will assume you have Windows anyway and send you Word and Excel docs, so you have to get it
- Actually has some nice features – Solver lets you do Maximum Likelihood Estimation, Pivot Tables give quick and easy summaries
If you’re consulting, you should at least know the basics of Excel. Learning introductory VBA is also very worthwhile, and there are a number of perfectly fine analytic systems that run in Excel/VBA. Yes, there are problems with extending to a full enterprise-level system, but you can deal with that in a follow-on engagement.
SQL? Pretty much everything you’ll work with is either an input that comes from a database or an output that should be stored in one. SQL also allows for very basic data analysis, such as counts, sums, and averages by group.
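As a rough illustration of the kind of basic analysis SQL allows, here is a minimal sketch that runs a grouped summary through Python's built-in sqlite3 module. The `sales` table, its columns, and the figures in it are made up for the example; a real engagement would point at the client's own database.

```python
import sqlite3

# In-memory database so the sketch is self-contained; the table and its
# contents are hypothetical, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("north", 120.0),
        ("north", 80.0),
        ("south", 200.0),
        ("south", 50.0),
        ("south", 50.0),
    ],
)

# A grouped summary: row count, mean, and total per region.
summary = conn.execute(
    """
    SELECT region,
           COUNT(*)    AS n,
           AVG(amount) AS mean_amount,
           SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY region
    """
).fetchall()
conn.close()

for region, n, mean_amount, total in summary:
    print(f"{region}: n={n}, mean={mean_amount:.2f}, total={total:.2f}")
```

The same GROUP BY query is roughly what a Pivot Table would give you in Excel, which makes it a handy cross-check.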
R. Unlike my fellow bloggers at Win-Vector, I’m not a big fan of R. But you can do a lot of statistics in it, and it’s free, so no need for your clients to get an expensive licence. It’s also a reasonably modern language, unlike many of its competitors. Since R is rapidly becoming the statistical analysis language of choice for cash-strapped startups or clients without big legacy systems, every consultant should be able to work in it.
Java? You don’t need to be a Java expert, but you should understand enough about Java to look at programs to see what they do, and to make minor alterations – invaluable for dealing with an engineering team, if they have to implement your great ideas.
Python? Well, data is always a mess, so you should have some way of cleaning it before it gums up your shiny new models. Python’s as good as any other tool for the job, and it’s more intuitive than Perl. But this is the least important of the recommendations.
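To give a flavour of what that cleaning looks like, here is a small sketch using only the standard library. The sample data, the column names, and the rules for what counts as "missing" are all assumptions chosen for the example; real data will need its own rules.

```python
import csv
import io

# Hypothetical messy input: stray whitespace, a thousands separator,
# a placeholder for a missing amount, and a row with no name.
RAW = """name, amount ,date
 Alice ,"1,200.50",2012-01-05
BOB,n/a,2012-01-06
,300,2012-01-07
carol, 75 ,2012-01-08
"""

# Strings treated as missing values (an assumption for this sketch).
MISSING = {"", "n/a", "na", "null", "-"}


def clean_amount(text):
    """Return the amount as a float, or None if it is missing or unparseable."""
    text = text.strip().replace(",", "")
    if text.lower() in MISSING:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def clean_rows(raw_text):
    """Parse CSV text, normalise the fields, and drop unusable rows."""
    reader = csv.reader(io.StringIO(raw_text))
    header = [h.strip().lower() for h in next(reader)]
    cleaned = []
    for fields in reader:
        if not fields:  # skip completely blank lines
            continue
        record = dict(zip(header, (f.strip() for f in fields)))
        name = record.get("name", "")
        amount = clean_amount(record.get("amount", ""))
        if not name or amount is None:
            continue  # drop rows missing a name or a usable amount
        cleaned.append(
            {"name": name.title(), "amount": amount, "date": record.get("date", "")}
        )
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Dropping bad rows silently is a choice made here for brevity; in client work you would more likely log or count them so the losses can be explained.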
I haven’t included SAS on this list. On the one hand, it was the predominant language for statistical analysis for a long time, so you can often find consulting work in it. It’s also a language unlike most others, so almost worth learning for that reason alone. And it is excellent at reading in data and dealing with large amounts of it. On the other hand, the consulting rates for SAS work are generally low, and the lack of free licences makes learning it on spec a riskier move. So if you know SAS or can pick it up at work, that’s great. If not, it’s probably not worth learning it.
Loyal readers have already loyally read parts 1-3 (networking, networking, and networking) and part 4 (why consult). Stay tuned for part 6 (or maybe 6-7 if I cheat again), when we’ll delve more into the wonderful world of consulting.