
Six Reasons To Learn R For Business

[This article was first published on business-science.io - Articles, and kindly contributed to R-bloggers.]

Data science for business (DS4B) is the future of business analytics, yet it is difficult to figure out where to start. The last thing you want to do is waste time with the wrong tool. Making effective use of your time involves two pieces: (1) selecting the right tool for the job, and (2) efficiently learning how to use the tool to return business value. This article focuses on the first part, explaining why R is the right choice in six points. Our next article will focus on the second part, learning R in 12 weeks.

Reason 1: R Has The Best Overall Qualities

There are a number of tools available for business analysis and business intelligence (with DS4B being a subset of this area). Each tool has its pros and cons, many of which are important in the business context. We can use these attributes to compare how each tool stacks up against the others. We did a qualitative assessment using several criteria, recorded in the Appendix code as business capability, ease of learning, trend, and cost.

Further discussion on the assessment is included in the Appendix at the end of the article.

What we saw was particularly interesting. A trendline developed, exposing a tradeoff between learning curve and DS4B capability rating. The most flexible tools are more difficult to learn but tend to have higher business capability. Conversely, the “easy-to-learn” tools are often not the best long-term tools for business or data science capability. Our opinion: go for capability over ease of use.

Of the top tools in capability, R has the best mix of desirable attributes: high data science for business capability, low cost, and very fast growth. The only downside is the learning curve. The rest of the article explains why R is so great for business.

Reason 2: R Is Data Science For Non-Computer Scientists

If you are seeking high-performance data science tools, you really have two options: R or Python. When starting out, you should pick one. It’s a mistake to try to learn both. Your choice comes down to what’s right for you. The difference between R and Python has been described in numerous infographics and debates online, but the most overlooked factor is person-programming language fit. Don’t understand what we mean? Let’s break it down.

Fact 1: Most people interested in learning data science for business are not computer scientists. They are business professionals, non-software engineers (e.g. mechanical, chemical), and other technical-to-business converts. This is important because of where each language excels.

Fact 2: Most activities in business and finance involve communication. This comes in the form of reports, dashboards, and interactive web applications that allow decision makers to recognize when things are not going well and to make well-informed decisions that improve the business.

Now that we recognize what’s important, let’s learn about the two major players in data science.

About Python

Python is a general-purpose programming language developed by software engineers that has solid programming libraries for math, statistics, and machine learning. Python has best-in-class tools for pure machine learning and deep learning, but lacks much of the infrastructure for subjects like econometrics and for communication tools such as reporting. Because of this, Python is well-suited for computer scientists and software engineers.

About R

R is a statistical programming language developed by scientists that has open source libraries for statistics, machine learning, and data science. R lends itself well to business because of its depth of topic-specific packages and its communication infrastructure. R has packages covering a wide range of topics such as econometrics, finance, and time series. R has best-in-class tools for visualization, reporting, and interactivity, which are as important to business as they are to science. Because of this, R is well-suited for scientists, engineers, and business professionals.

What Should You Do?

Don’t make the decision tougher than it is. Think about where you are coming from:

Think about what you are trying to do:

Reason 3: Learning R Is Easy With The Tidyverse

Learning R used to be a major challenge. Base R was a complex and inconsistent programming language. Structure and formality were not the top priority as in other programming languages. This all changed with the “tidyverse”, a set of packages and tools that have a consistently structured programming interface.

When tools such as dplyr and ggplot2 arrived, they flattened the learning curve by providing a consistent and structured approach to working with data. As Hadley Wickham and many others continued to evolve R, the tidyverse came to be, which includes a series of commonly used packages for data manipulation, visualization, iteration, modeling, and communication. The end result is that R is now much easier to learn (we’ll show you in our next article!).

Source: tidyverse.org

R continues to evolve in a structured manner, with advanced packages that are built on top of the tidyverse infrastructure. A new focus is being placed on modeling and algorithms, which we are excited to see. Further, the tidyverse is being extended to cover topical areas such as text (tidytext) and finance (tidyquant). For newcomers, this should give you confidence in selecting this language. R has a bright future.
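To make the "consistent and structured approach" concrete, here is a minimal sketch (not taken from the original article) that uses dplyr and ggplot2 on R's built-in mtcars data. The object names mpg_by_cyl and mpg_plot are our own illustrative choices.

```r
library(dplyr)
library(ggplot2)

# Each dplyr verb takes a data frame and returns a data frame,
# so the steps chain together with the pipe and read top to bottom.
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%                      # one group per cylinder count
  summarise(avg_mpg = mean(mpg),         # average fuel economy per group
            n_cars  = n()) %>%           # number of cars per group
  arrange(desc(avg_mpg))                 # best fuel economy first

# ggplot2 follows the same pattern: start from data, then add layers.
mpg_plot <- ggplot(mpg_by_cyl, aes(x = factor(cyl), y = avg_mpg)) +
  geom_col() +
  labs(x = "Cylinders", y = "Average miles per gallon")
```

Printing mpg_plot in an interactive session draws the chart. The same verbs (group_by, summarise, arrange) apply unchanged to a sales or customer table, which is much of what makes the tidyverse quick to pick up.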

Reason 4: R Has Brains, Muscle, And Heart

Saying R is powerful is actually an understatement. From the business context, R is like Excel on steroids! But more important than just muscle is the combination of what R offers: brains, muscle, and heart.

R has brains

R implements cutting-edge algorithms including:

These tools are used everywhere from AI products to Kaggle Competitions, and you can use them in your business analyses.

R has muscle

R has powerful tools for:

R has heart

We already talked about the infrastructure, the tidyverse, that enables the ecosystem of applications to be built using a consistent approach. It’s this infrastructure that brings life into your data analysis. The tidyverse enables:

Reason 5: R Is Built For Business

Two major advantages of R versus every other programming language are that it can produce business-ready reports and machine learning-powered web applications. Neither Python nor Tableau nor any other tool can currently do this as efficiently as R can. The two capabilities we refer to are rmarkdown for report generation and shiny for interactive web applications.

Rmarkdown

Rmarkdown is a framework for creating reproducible reports that has since been extended to building blogs, presentations, websites, books, journals, and more. It’s the technology that’s behind this blog, and it allows us to include the code with the text so that anyone can follow the analysis and see the output right with the explanation. What’s really cool is that the technology has evolved so much. Here are a few examples of its capability:
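As a rough illustration of how code and text live together in one report, the sketch below writes a tiny R Markdown file to a temporary location and renders it to HTML. The report title and contents are made up for the example, and rendering is skipped when pandoc (which rmarkdown relies on) is not installed.

```r
library(rmarkdown)

# Build the code-chunk fence programmatically so this example
# does not need literal nested fences.
fence <- strrep("`", 3)

# A minimal report: YAML header, one sentence of text, one R chunk.
rmd_lines <- c(
  "---",
  "title: \"Fuel Economy Summary\"",   # hypothetical report title
  "output: html_document",
  "---",
  "",
  "Average fuel economy in the built-in mtcars data:",
  "",
  paste0(fence, "{r}"),
  "mean(mtcars$mpg)",
  fence
)

rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)

# render() runs the chunk and weaves its output into the document.
if (pandoc_available()) {
  html_file <- render(rmd_file, quiet = TRUE)
}
```

Because the number in the report is computed when the document is rendered, re-running render() on fresh data updates the report without any copy-and-paste.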

Shiny

Source: shiny.rstudio.com

Shiny is a framework for creating interactive web applications that are powered by R. Shiny is a major consulting area for us as four of five assignments involve building a web application using shiny. It’s not only powerful, it enables non-data scientists to gain the benefit of data science via interactive decision making tools. Here’s an example of a Google Trend app built with shiny.
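For readers who have not seen shiny code, a minimal sketch of an app follows. It is not the Google Trend app mentioned above; the input and output IDs (min_hp, mpg_plot) and the use of the built-in mtcars data are assumptions made purely for illustration.

```r
library(shiny)

# User interface: one slider and one plot.
ui <- fluidPage(
  titlePanel("Fuel Economy Explorer"),
  sliderInput("min_hp", "Minimum horsepower:",
              min = 50, max = 300, value = 100),
  plotOutput("mpg_plot")
)

# Server logic: the plot re-draws whenever the slider moves.
server <- function(input, output) {
  output$mpg_plot <- renderPlot({
    cars <- mtcars[mtcars$hp >= input$min_hp, ]
    plot(cars$hp, cars$mpg,
         xlab = "Horsepower", ylab = "Miles per gallon")
  })
}

# Bundle the two pieces into an app object.
app <- shinyApp(ui = ui, server = server)
# runApp(app)  # uncomment to launch the app in an interactive session
```

The decision maker only ever sees the slider and the chart; the R code behind it could just as easily call a forecasting or machine learning model.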

Reason 6: R Community Support

Being a powerful language alone is not enough. To be successful, a language needs community support. We’ll hit on two ways that R excels in this respect: CRAN and the R Community.

CRAN: Community-Provided R Packages

CRAN is like the Apple App Store, except everything is free, super useful, and built for R. With over 14,000 packages, it has almost everything you could possibly want, from machine learning to high-performance computing to finance and econometrics! The task views cover specific areas and are one way to explore R’s offerings. CRAN is community-driven, with top open source authors such as Hadley Wickham and Dirk Eddelbuettel leading the way. Package development is a great way to contribute to the community, especially for those looking to showcase their coding skills and give back!

Community Support

You begin with R because of its capability, you stay with R because of its community. The R Community is the coolest part. It’s tight-knit, opinionated, fun, silly, and highly knowledgeable… all of the things you want in a high performing team.

Social/Web

For your #rstats holiday wish list consideration https://t.co/dXYEAYXpzK pic.twitter.com/caQu53Czy8

— R-Ladies DC (@RLadiesDC) December 2, 2017

R users can be found all over the web. A few of the popular hangouts are:

Conferences

R-focused business conferences are gaining traction in a big way. Here are a few that we attend and/or will be attending in the future:

A full list of R-conferences can be found here.

Meetups

A really cool thing about R is that many major cities have a meetup nearby. Meetups are exactly what you think: a group of R-users getting together to talk R. They are usually funded by R-Consortium. You can get a full list of meetups here.

Conclusion

R has a wide range of benefits, making it our obvious choice for Data Science for Business (DS4B). That’s not to say that Python isn’t a good choice as well, but for the wide range of business needs, there’s nothing that compares to R. In this article we saw why R is a great choice. In the next article we’ll show you how to learn R in 12 weeks.

About Business Science

Business Science specializes in “ROI-driven data science”. Our focus is machine learning and data science in business and financial applications. We build web applications and automated reports to put machine learning in the hands of decision makers. Visit Business Science or contact us to learn more!

Business Science University

Interested in learning data science for business? Enroll in Business Science University. We’ll teach you how to apply data science and machine learning in real-world business applications. We take you through the entire process of modeling problems, creating interactive data products, and distributing solutions within an organization. We are launching courses in early 2018!

Follow Business Science on Social Media

Appendix – Discussion On DS4B Tool Assessment

Here’s some additional information on the tool assessment. We have provided the code used to make the visualization, the criteria explanation, and the tool assessment.

Criteria Explanation

Our assessment of the most powerful DS4B tools was based on three criteria:

Source: Stack Overflow Trends

Individual Tool Assessment

R:

Python:

Excel:

Tableau:

PowerBI:

Matlab:

SAS:

Code for the DS4B Tool Assessment Visualization

    library(tidyverse)  # tribble(), ggplot2, %>%, str_wrap(); loaded explicitly in case tidyquant does not attach it
    library(tidyquant)  # theme_tq(), scale_*_tq(), palette_dark()
    library(ggrepel)    # geom_label_repel()

    # Qualitative ratings for each tool
    data_apps <- tribble(
      ~application, ~business_capability, ~ease_of_learning, ~trend, ~cost,
      "R",          10,                   4,                 10,     "Free",
      "Python",     7,                    4,                 10,     "Free",
      "Excel",      4,                    10,                7,      "Low",
      "Tableau",    6,                    7,                 6,      "Low",
      "PowerBI",    5,                    8,                 6,      "Low",
      "Matlab",     6,                    2,                 1,      "High",
      "SAS",        8,                    4,                 3,      "High"
    )

    # Caption text shown beneath the plot
    cap <- paste0(
      "Why R? Tools like Excel, Tableau, PowerBI are easier to learn, but have lower ",
      "Business Capability. Tools like Python, SAS, and Matlab have high ",
      "Data Science Capability, but lack the visualization and interactive ",
      "application tools needed for business. R has the best data science, visualization, ",
      " and interactive tools plus it's free!"
    )

    data_apps %>%
      ggplot(aes(x = business_capability, y = ease_of_learning,
                 color = cost, size = trend)) +
      geom_point() +
      # Label each point with the tool name (fontface, not face, sets bold text)
      geom_label_repel(aes(label = application, fill = application),
                       size = 3.5,
                       fontface = "bold", color = "white",
                       box.padding = 0.1, point.padding = 0.5,
                       segment.color = "grey50", segment.size = 1) +
      # Linear trendline showing the capability vs. learning-curve tradeoff
      geom_smooth(color = palette_dark()[[1]], method = "lm", se = FALSE,
                  show.legend = FALSE) +
      expand_limits(x = c(0, 10), y = c(0, 10)) +
      theme_tq() +
      theme(legend.direction = "vertical") +
      scale_fill_tq() +
      scale_color_tq() +
      scale_y_continuous(breaks = seq(0, 10, 2)) +
      scale_x_continuous(breaks = 0:10) +
      scale_size_continuous(range = c(2, 14)) +
      labs(title = "DS4B Tools: Capability Vs Learning Curve",
           subtitle = "R has a longer learning curve but has a massive business capability rating",
           # str_wrap() stands in for the original label_wrap_gen(115)(cap) call
           caption = str_wrap(cap, width = 115),
           x     = "Data Science For Business\nCapability Rating",
           y     = "Learning Curve Rating",
           color = "Cost",
           size  = "Trend",
           fill  = "Tool")
