Site icon R-bloggers

How to Switch from Excel to R Shiny: First Steps

[This article was first published on r – Appsilon Data Science | End­ to­ End Data Science Solutions, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

tl;dr

If you’re still using Excel or Google Sheets for business, you might already know that Excel is obsolete for many business use-cases. But how do you switch from Excel to a better alternative like R Shiny? It’s easier to get started than you might think, especially if you’re already an Excel power user.

This article will walk you through a sample migration of an analytics tool built with Excel and Google Sheets to a dashboard built with R Shiny. We’ll show you how to prepare the data for migration, how to create a simple R Shiny app, how to view/filter tables in R Shiny, how to modify a table using SQL, and also how to add interactive filters using the library ShinyWidgets.

This article shouldn’t take too much of your time to read – perhaps no more than 15 minutes. The final Shiny dashboard has only 45 lines of well-formatted code, so it won’t be too demanding to code along. You can download the dataset here.

Excel is a Bad Choice for Businesses – Here’s Why

Excel has been an excellent tool for decades due to its relative intuitiveness and the WYSIWYG (What You See Is What You Get) user interface. Still, running an important business process in an Excel workbook in 2020 is a big mistake, as there are a variety of more stable and sophisticated tools readily available.

Excel is prone to human errors. Everyone with access to an Excel document can edit all of its features, which can lead to unintentional changes or the removal of crucial functions. As a result, you can easily end up with a broken spreadsheet whenever a user accidentally (or intentionally) changes a parameter. This is a serious issue because it is currently not possible to fully version control Excel. This means that you cannot effectively keep track of changes in an Excel spreadsheet and revert to a previous version if something breaks.

To put it simply – it is not safe to keep a company’s critical information in Excel workbooks, especially when it comes to business analytics. Excel can still be useful as an alternative tool to SQL-based solutions to create simple ‘databases’ as it is very intuitive, and anyone can contribute. However, in the long run, Excel is just too fragile to handle complex business analyses.

Learn more: Excel is Obsolete: Here Are the Top 2 Alternatives from R and Python

Alternatives to Excel: Python and R

To effectively plan your next business move, you need access to features that Excel doesn’t offer, such as integration with machine learning models. You also might need a scalable way of connecting to external data sources via APIs. Due to the variety of analytical approaches that a modern business needs to draw proper conclusions, it is crucial to use a more flexible and reliable tool.

Therefore, it is necessary to move as many analytical operations as possible away from Excel. There are plenty of better-equipped tools for data analysis and data visualization, and they should not be too difficult to master for someone already proficient in Excel. 

Python and R are two of the most popular beginner-friendly programming languages. Many non-IT people find R easier and more intuitive to use than Python. This is good news because R and R Shiny alone can cover the majority of business operations stored in Excel and they really can open the door to modern, ground-breaking data analysis.

Learn More: Better than Excel: Use these R Shiny Packages Instead

Migration from Excel to R: Getting Started

Every migration starts with proper data preparation. This step might take some time if a company has a lot of Excel workbooks. The most important thing is to extract original data from analyzed data, create tables, and save them in a CSV format. If you’re just starting – start small. Pick one or two CSVs that move the needle for a start. In almost any use case, you don’t have to start with a complete dataset. Remember:

It would make sense to switch from CSV to SQL in the future, but using CSV is not a dealbreaker in the beginning. Moreover, you’ve likely performed some useful and effective analytical operations within Excel. Don’t get rid of them just yet. By all means, analyze/verify these operations one more time, and describe their logic in detail (to recreate it later in R).

Sample Case: Job Hunt Analysis

Last year, one of our consultants was searching for a full-time job, and they started tracking application processes to see a more accurate picture of their career prospects – primarily to find out which companies/industries find a single profile interesting. Information about all applications sent in 2019 and 2020 were stored in two Excel sheets (original data) and one in Google Sheets (analysis). Let’s take a quick preview of this data.

Original data:

Analysis:

As you can see in the screen recording above, the dashboard works fine. However, there are some issues we encountered while using and maintaining it:

Because of the reasons above, it makes sense to migrate this dashboard to R Shiny and see if these problems can be eliminated.

Data Preparation

In this case, data preparation was fairly easy. Two sheets were merged into one, some values got replaced (YES -> 1, NO -> empty cell) using Find and Replace, and the data was saved to a CSV file:

The next logical step is describing the features we need to migrate from Excel to R Shiny. The following table summarizes the steps pretty well:

Loading a Dataset

To start the migration process, we downloaded RStudio (a free development environment for R from our partner RStudio, PBC), found the CSV file we wanted to use, and imported it into RStudio:

After a successful import, the file appeared in the Global Environment. R had no problem with recognizing the CSV table.

New to R or want to speed up your workflow? Check out our favorite RStudio Shortcuts and Tricks.

Creating Your First R Shiny Dashboard

Let’s start simple with something that remotely resembles the original dashboard. The main goal is to make a simple app that displays the source data and filters it by Job Category.

It’s important to understand two main components of an R Shiny app – the UI (User Interface) and the server. UI is a graphic layout of an app – everything the user sees on a webpage. The server is the backend of an application. The app is stored on the computer that runs R in the form of a page that can be viewed in a web browser.

If you are a beginner with R Shiny, here’s an additional resource to help you get started: (Video Tutorial) Create and Customize a Simple Shiny Dashboard

Note: To share the R Shiny app with others, you either need to send them a copy of a script or host this page via an external web server.

To start, let’s use the most basic Shiny app template:

library(shiny)
ui <- fluidPage()
server <- function(input, output) {}
shinyApp(ui=ui, server=server)

Defining Input and Output

Input is everything the user can interact with on a website. To name a few:

Each input must have an inputId (local name, e.g. ‘value’), and a label (a description that will be displayed in an app, e.g. ‘Select value’). In addition, depending on the type of input, you can provide additional parameters that will specify/limit the actions a user can perform. For more on defining input and output, and other aspects of Shiny, read this tutorial by RStudio.

In the first draft of the app, let’s create a reactive select box from which the user can choose any job category that appears in the dataset. Therefore, besides defining inputId and a label we need a list of choices for a dropdown list (choices = TableName$ColumnName):

selectInput(‘jobcategory’, ‘Select a category’, choices = j_h$JOB_CATEGORY)

output is the second argument in fluidPage(). In this case, it is the result of actions taken by the user in inputs. It can be displayed in the form of a graph – plotOutput(), table – tableOutput(), text – textOutput(), image – imageOutput(), etc. Just like input, output needs to have an ID – outputId. We’ll display the results as a table, so let’s use the tableOutput() function and name our reactive output ‘jobhuntData’:

tableOutput(‘jobhuntData’)

Like many basic Shiny apps, our draft Shiny app is quite ugly by default. Let’s fix this with some elements: titlePanel(), sidebarLayout(), sidebarPanel(), and mainPanel().

At this point, after adding all elements to a fluidPage() function, our code looks like this:

library(shiny)

ui <- fluidPage(
    titlePanel('JOB HUNT RESULTS'),
    sidebarLayout(
        sidebarPanel(
            selectInput('jobcategory', 'Select a category', choices=j_h$JOB_CATEGORY)
        ),
        mainPanel(
            tableOutput('jobhuntData')
        )
    )
)

server <- function(input, output) {}

shinyApp(ui=ui, server=server)

We can see the filter, but there is no table yet. This is because R Shiny does not know what kind of table we want to generate. Let’s introduce server requirements to address this.

Want to Make a Beautiful Shiny App Fast? Use Appsilon’s shiny.semantic open source package, which brings the Fomantic UI library to Shiny for attractive UI and rapid development.

How to Use Shiny Server

To build the first draft of the app, we need to create a source for the tableOutput() function by using a Render Function. Render Functions (e.g. renderImage() to render an image, renderPlot() to render a plot/graph, renderText() to render text, etc.) turn an R object into HTML, and place it in a Shiny webpage. 

Below you can see how we assigned the outputId (“jobhuntData”) to a function that renders the desired output – in our case, renderTable() to render a table. Inside this function, we specified data that we want to see in the table. Please mind that input$jobcategory refers to the Input Function from the UI, and it is always equal to the current value of the input (a value selected by a user).

library(shiny)

ui <- fluidPage(
    titlePanel('JOB HUNT RESULTS'),
    sidebarLayout(
        sidebarPanel(
            selectInput('jobcategory', 'Select a category', choices=j_h2$JOB_CATEGORY)
        ),
        mainPanel(
            tableOutput('jobhuntData')
        )
    )
)

server <- function(input, output) {
    output$jobhuntData <- renderTable({
        jobcategoryFilter <- subset(j_h2, j_h2$JOB_CATEGORY == input$jobcategory)
    })
}

shinyApp(ui=ui, server=server)

The current version of the app does not look amazing, but we can see that the correct data is shown, and the server generates proper output according to the input provided by the user:

Migration – SQL and ShinyWidgets

Now that we know how to create a basic dashboard in R Shiny, we are going to migrate other features from our original dashboard. First and foremost, we had to not only create filters for all columns but also aggregate/group data by YEAR and COUNTRY. There are several ways to modify the dataset in R, but we decided to do it using an SQL SELECT statement. SQL is another topic on its own, but we recommend that you learn the basics of SQL if you work with data on a daily (or even weekly) basis.

This is one of the SQL statements we used to create an aggregated view in Google Sheets:

Below is the logic that we applied in R using the sqldf library. It enables us to see how many phone screenings, interviews, and offers we had each year in every country:

library(sqldf)

aggregated_data = sqldf(‘SELECT YEAR, COUNTRY, JOB_CATEGORY,
      COUNT(PHONE_SCREENING) AS PHONE_SCREENING, 
      COUNT(INTERVIEW) AS INTERVIEW, 
      COUNT(OFFER) AS OFFER
      FROM j_h2
      GROUP BY JOB_CATEGORY, YEAR, COUNTRY
      ORDER BY JOB_CATEGORY’)

This is how the new table “aggregated_data” looks like:

Adding multiple filters that are conditional can be a very difficult task, but the ShinyWidgets library offers a perfect solution: selectizeGroup-module. Having imported ShinyWidgets, we’ve replaced selectInput() with selecticizeGroupUI() and added one more function – callModule(). This way we have eliminated the possibility of choosing a combination that does not exist. Below you can see the entire solution:

library(sqldf)
library(shiny)
library(shinyWidgets)

aggregated_data = sqldf("SELECT YEAR, COUNTRY, JOB_CATEGORY,
      COUNT(PHONE_SCREENING) AS PHONE_SCREENING, 
      COUNT(INTERVIEW) AS INTERVIEW, 
      COUNT(OFFER) AS OFFER
      FROM j_h2
      GROUP BY JOB_CATEGORY, YEAR, COUNTRY
      ORDER BY JOB_CATEGORY")

shinyApp(
  ui = fluidPage(
    titlePanel("JOB HUNT RESULTS"),
    sidebarPanel(
      selectizeGroupUI(
        id = "fancy_filters",
        inline = FALSE,
        params = list(
          YEAR = list(inputId = "YEAR", title = "Year", placeholder = 'All'),
          COUNTRY = list(inputId = "COUNTRY", title = "Country", placeholder = 'All'),
          JOB_CATEGORY = list(inputId = "JOB_CATEGORY", title = "Job category", placeholder = 'All'),
          PHONE_SCREENING = list(inputId = "PHONE_SCREENING", title = "Number of positive replies", placeholder = 'All'),
          INTERVIEW = list(inputId = "INTERVIEW", title = "Number of interview invitations", placeholder = 'All'),
          OFFER = list(inputId = "OFFER", title = "Number of offers", placeholder = 'All')
        )
      )
    ), 
    mainPanel(
      tableOutput("jobhuntData")
    )
  ),

  server = function(input, output, session) {
    res_mod <- callModule(
      module = selectizeGroupServer,
      id = "fancy_filters",
      data = aggregated_data,
      vars = c("YEAR", "COUNTRY", "JOB_CATEGORY", "PHONE_SCREENING", "INTERVIEW", "OFFER")
    )
    output$jobhuntData <- renderTable({
      res_mod()
    })
  })


Conclusion

Working with a new tool like R Shiny can be intimidating at first, but in some ways it can be even easier to learn and understand than Excel or Google Sheets. It is more flexible in terms of adding new features or modifying existing ones. Because we replaced four tables with one, the dashboard not only looks better than our Excel and Google Sheets tool – it is also much easier to use.

Moreover, we managed to create an app where the user is in complete control of the displayed data but does not have access to the backend. This means we do not need to worry about non-technical users making accidental changes to the source code or breaking the app. We can also apply version control and store the source code of the app on services like GitHub in a way that allows us to safely revert to previous versions. This way, anyone who I want to share my code with can download it and make contributions in a controlled environment.

You deserve a great dashboard. Need help creating a beautiful, durable, and scalable enterprise Shiny app? Reach out.

Learn More

This article was originally written by Zuzanna Danowska with further edits from Appsilon team members Marcin Dubel and Dario Radečić.

Article How to Switch from Excel to R Shiny: First Steps comes from Appsilon Data Science | End­ to­ End Data Science Solutions.

To leave a comment for the author, please follow the link and comment on their blog: r – Appsilon Data Science | End­ to­ End Data Science Solutions.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.