
How to Plot Categorical Data in R-Quick Guide

[This article was first published on Methods – finnstats, and kindly contributed to R-bloggers.]

Categorical variables are variables whose values fall into a limited set of categories. Race, sex, age group, and educational level are examples of categorical variables.

This tutorial describes three approaches to plotting categorical data in R: bar charts, boxplots by group, and mosaic plots.

Approach 1: Bar Chart

The code below demonstrates how to make a bar chart that shows how often each team appears in a data frame. First, load the ggplot2 library:


library(ggplot2)

Create dataset

data <- data.frame(result = c('W', 'L', 'L', 'W', 'W', 'L', 'L', 'L', 'W', 'L'),
                   team = c('B', 'D', 'B', 'A', 'D', 'A', 'A', 'D', 'C', 'D'),
                   score = c(18, 38, 29, 28, 32, 55, 22, 48, 33, 12),
                   rebounds = c(15, 5, 9, 10, 15, 8, 9, 12, 11, 10))
head(data)
    result team score rebounds
1      W    B    18       15
2      L    D    38        5
3      L    B    29        9
4      W    A    28       10
5      W    D    32       15
6      L    A    55        8
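
Depending on the R version, data.frame() stores the text columns either as character vectors or as factors (since R 4.0.0 the default is stringsAsFactors = FALSE). As a minimal sketch that is not part of the original tutorial, the two categorical columns can be converted to factors explicitly so that their categories are available as levels:

```r
# Recreate the tutorial's data frame so this block runs on its own
data <- data.frame(result = c('W', 'L', 'L', 'W', 'W', 'L', 'L', 'L', 'W', 'L'),
                   team = c('B', 'D', 'B', 'A', 'D', 'A', 'A', 'D', 'C', 'D'),
                   score = c(18, 38, 29, 28, 32, 55, 22, 48, 33, 12),
                   rebounds = c(15, 5, 9, 10, 15, 8, 9, 12, 11, 10))

# Convert the two categorical columns to factors
data$result <- factor(data$result)
data$team <- factor(data$team)

# List the categories (levels) of each categorical variable
levels(data$result)
levels(data$team)

# Show the class of every column: factor for result and team,
# numeric for score and rebounds
sapply(data, class)
```

The plots in this tutorial work with either representation; the conversion mainly helps when the order of the categories needs to be controlled.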

Create a bar chart of teams, ordered from the most to the least frequent team

ggplot(data, aes(x = reorder(team, team, function(x) -length(x)))) +
  geom_bar(fill = 'red') +
  labs(x = 'Team')
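
The reorder() call is what sorts the bars: it ranks each team by the negative of its number of rows, so the most frequent team comes first. The base R sketch below is an illustration under that reading of the code (the variable names are chosen here, not taken from the original post). It reproduces the counts behind the bars and the resulting bar order without drawing anything:

```r
# Team column from the tutorial's data frame
team <- c('B', 'D', 'B', 'A', 'D', 'A', 'A', 'D', 'C', 'D')

# Frequency of each team, sorted from most to least frequent
team_counts <- sort(table(team), decreasing = TRUE)
print(team_counts)  # D = 4, A = 3, B = 2, C = 1

# The same ordering as used inside aes(): levels sorted by -count
ordered_team <- reorder(team, team, function(x) -length(x))
levels(ordered_team)  # "D" "A" "B" "C"
```

Dropping the minus sign in the function would sort the bars from least to most frequent instead.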


Approach 2: Boxplots by Group

Grouped boxplots are a useful way to show how a numeric variable is distributed within each level of a categorical variable.

The following code shows how to make boxplots that display the distribution of points scored by each team:

Create boxplots of points, grouped by team

ggplot(data, aes(x = team, y = score)) +
  geom_boxplot(fill = 'green')

The teams are represented on the x-axis, while the distribution of points scored by each team is represented on the y-axis.
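
To read the boxplots against actual numbers, it can help to compute the per-team summaries directly. The sketch below uses base R's tapply() and table(); it is a rough companion to the plot rather than part of the original tutorial, and the names medians and n_games are chosen here for illustration:

```r
# Recreate the tutorial's data frame so this block runs on its own
data <- data.frame(result = c('W', 'L', 'L', 'W', 'W', 'L', 'L', 'L', 'W', 'L'),
                   team = c('B', 'D', 'B', 'A', 'D', 'A', 'A', 'D', 'C', 'D'),
                   score = c(18, 38, 29, 28, 32, 55, 22, 48, 33, 12),
                   rebounds = c(15, 5, 9, 10, 15, 8, 9, 12, 11, 10))

# Median score per team: the middle line drawn inside each box
medians <- tapply(data$score, data$team, median)
print(medians)  # A = 28, B = 23.5, C = 33, D = 35

# Number of games per team: shows how much data each box is based on
n_games <- table(data$team)
print(n_games)  # A = 3, B = 2, C = 1, D = 4
```

Team C has only one game, so its box collapses to a single line at 33.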


Approach 3: Mosaic Plot

A mosaic plot is a type of graph that shows the frequencies of two categorical variables in a single figure.

The following code demonstrates how to make a mosaic plot that displays the frequencies of the categorical variables “result” and “team” in one figure:

Create the counts

counts <- table(data$result, data$team)

Create a mosaic plot

mosaicplot(counts, xlab = 'Match Result', ylab = 'Team', main = 'Wins by Team', col = 'orange')

The match result is displayed on the x-axis and the four teams on the y-axis. The area of each tile is proportional to the number of games with that combination of result and team.
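
The tiles are easier to interpret next to the table they are drawn from. As a minimal sketch (the name row_props is an assumption made here, not something from the original post), the contingency table and its row proportions can be printed like this:

```r
# Recreate the two categorical columns from the tutorial's data frame
result <- c('W', 'L', 'L', 'W', 'W', 'L', 'L', 'L', 'W', 'L')
team <- c('B', 'D', 'B', 'A', 'D', 'A', 'A', 'D', 'C', 'D')

# Contingency table: rows are match results, columns are teams
counts <- table(result, team)
print(counts)
#       team
# result A B C D
#      L 2 1 0 3
#      W 1 1 1 1

# Share of each team within every result (each row sums to 1)
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 2))
```

There are six losses and four wins, which is why the column of tiles for L is wider than the one for W. Within each column, the tile heights follow the row proportions.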

