So you want to be a data scientist

[This article was first published on eKonometrics, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.



The New York Times made it look so easy. Take a few courses in data science and a web-based startup will readily pay top dollars for your newly acquired skills.

Since the McKinsey Global Institute reported on the impending shortage of data crunchers, the wanna be data scientists are searching for learning opportunities in big data analytics. Newspaper coverage suggests that even with limited previous exposure to empirics, one may enroll in MOOCs or join programming boot camps to establish one’s bonafides.

In a recent blog on Forbes.com, Meta S. Brown, the author of Data Mining for Dummies, gave four reasons not to get an advanced degree in data science. I, on the other hand, believe that a structured learning environment is exactly what many need to enable the career change they have contemplated for years but have not moved on it.

It all depends on upon what kind of a learner you are. If you are a disciplined, self-motivated, self-actuated individual, you can pick up the skills by attending MOOCs or participating in coding boot camps.

But if you are like the rest of us, who once enrolled in a free online course, but didn’t complete it, you need some structure. A degree or a certificate in data science or business analytics is exactly what you need to upgrade your skills and be part of the network that will help you reorient your career.

In my book, Getting Started with Data Science, I mentioned Paul Minton, who was making $20,000 serving tables in New York. However, a three-moth programming course at the Zipfian Academy turned his life around. He earned over $100,000 in 2014 as a data scientist for a web startup in San Francisco. “Six figures, right off the bat … To me, it was astonishing,” he told The New York Times.

When the inspiring data scientists think of a career in the ‘glamorous’ world of big data and analytics, they think of Mr. Minton. His story, though a bit Cinderella-ish, is true, but rare. He works for Change.org! However, not everyone should expect a similar outcome. In addition to good fortune, Mr. Minton had majored in math in his undergraduate training, and we all know that math helps. It will be unwise, however, to assume that with almost no  empirical background, one can master the 
complex world of data and algorithms in a matter of a few weeks and be gainfully employed.

While speaking at meet-ups organized by IBM’s BigDataUniversity, I encounter dozens of enthusiasts who are keen to start training in data science but do not know where to begin. I advise them to build on their core competencies and domain knowledge. For instance, if you studied journalism or creative writing as an undergraduate, you might want to learn how to analyze socioeconomic data instead of trying to set up Hadoop clusters, a big data task best left to computer scientists and engineers.

If you are a disciplined learner, you can explore data science training offered as MOOCs. Coursera, one of the largest MOOCs platform, listed several data science courses among the top 10 most popular courses in 2015. IBM’s Big Data University (BDU) is another platform dedicated to promoting training in data science and analytics. Not only BDU offers similar resources for online learning as other platforms, it also offers cloud-based resources for hands-on training through the Data Scientist Workbench.

The Workbench provides the state-of-the-art computing solutions for regular-sized data. These include RPython, and OpenRefine. To wrangle big data, the Workbench offers Hadoop and Spark-based solutions. Such coupling of computing infrastructure with online learning resources frees the new learners from the concerns about installing and maintaining software and clustering hardware.

For learners who would prefer a structured learning environment, they also have several options. They can register for courses or certificates offered by universities’ continuing education faculties, enroll in an online graduate degree in data science, or take a more traditional approach of enrolling in a full- or part-time Master’s program.

A good place to search for learning opportunities is the KDNuggets website that maintains detailed lists of post-graduate programs in data science including full-time, part-time, and online masters and other certifications.

Once you have earned some credentials, you still have to prove your worth to future employers. If you are making a switch from another career, your experience may not be of much use in your pursuits in the data-centric world. My advice to the novice data scientists lacking experience is to ask the potential employer not necessarily for a job, but instead for a data set and a puzzle. If you can solve a data-oriented problem for a firm as part of the vetting process, you can overcome the shortcomings in your résumé.

For those who are still on the fence thinking whether to take the plunge into the world of big data and analytics, they should know that the demand for data scientists far exceeds the capacity of the universities and colleges to produce them. This is unlikely to change shortly. Act now and embrace data. 

To leave a comment for the author, please follow the link and comment on their blog: eKonometrics.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)