How to Calculate Relative Frequencies in R?

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers].


In R, you will often need to calculate the relative frequencies (proportions) of the values in one or more columns of a data frame.


Fortunately, the functions in the dplyr package make this task simple. This tutorial shows how to apply them to the following data frame to get relative frequencies:

Let’s create a data frame

df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2'),
                 position = c('R2', 'R1', 'R1', 'R2', 'R2', 'R1', 'R2'),
                 points = c(102, 115, 119, 202, 132, 134, 212))

Now we can view the data frame

df
  team position points
1   P1       R2    102
2   P1       R1    115
3   P1       R1    119
4   P2       R2    202
5   P2       R2    132
6   P2       R1    134
7   P2       R2    212

Example 1: Relative Frequency of One Variable

The relative frequency of each team in the data frame can be calculated using the code below.

library(dplyr)
df %>%
  group_by(team) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
  team      n  freq
  <chr> <int> <dbl>
1 P1        3 0.429
2 P2        4 0.571

This shows that team P1 accounts for 42.9 percent of the rows in the data frame, while team P2 accounts for the remaining 57.1 percent. Note that the two proportions add up to 100 percent.
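As a quick cross-check of the dplyr result, the same proportions can be computed in base R. This is a minimal sketch, not part of the original tutorial: it rebuilds only the team column of the data frame, and the name team_freq is chosen here for illustration.

```r
# Rebuild the team column from the tutorial's data frame
df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2'))

# table() counts the rows per team; prop.table() divides each count by the total
team_freq <- prop.table(table(df$team))

# Rounded to three decimals this gives 0.429 for P1 and 0.571 for P2
round(team_freq, 3)
```

If the two approaches agree, the grouping in the dplyr pipeline is doing what you expect.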


Example 2: Relative Frequency of Multiple Variables

The relative frequency of positions by team can be calculated using the code below:

library(dplyr)
df %>%
  group_by(team, position) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
  team  position     n  freq
  <chr> <chr>    <int> <dbl>
1 P1    R1           2 0.667
2 P1    R2           1 0.333
3 P2    R1           1 0.25
4 P2    R2           3 0.75

This tells us that:

Team P1 has 66.7 percent of its players in position R1.

Team P1 has 33.3 percent of its players in position R2.

Team P2 has 25.0 percent of its players in position R1.

Team P2 has 75.0 percent of its players in position R2.
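The proportions sum to 100 percent within each team, rather than across the whole data frame, because summarise() by default drops only the last grouping variable (position). The result is still grouped by team, so sum(n) in mutate() is a per-team total. The sketch below makes that behaviour explicit with the .groups argument and, for comparison, computes proportions over all rows. The object names by_team and overall are illustrative only.

```r
library(dplyr)

df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2'),
                 position = c('R2', 'R1', 'R1', 'R2', 'R2', 'R1', 'R2'))

# Same as Example 2, with the default grouping behaviour spelled out:
# the result stays grouped by team, so sum(n) is the team total
by_team <- df %>%
  group_by(team, position) %>%
  summarise(n = n(), .groups = "drop_last") %>%
  mutate(freq = n / sum(n))

# Dropping all grouping makes sum(n) the total number of rows (7),
# so freq is each combination's share of the whole data frame
overall <- df %>%
  group_by(team, position) %>%
  summarise(n = n(), .groups = "drop") %>%
  mutate(freq = n / sum(n))

overall
```

With .groups = "drop", the four proportions are 2/7, 1/7, 1/7 and 3/7, which sum to 1 across the table instead of within each team.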


Example 3: Display Relative Frequencies as Percentages

The following code calculates the relative frequency of positions by team and displays the relative frequencies as percentages:

library(dplyr)
df %>%
  group_by(team, position) %>%
  summarise(n = n()) %>%
  mutate(freq = paste0(round(100 * n/sum(n), 0), '%'))
  team  position     n freq
  <chr> <chr>    <int> <chr>
1 P1    R1           2 67% 
2 P1    R2           1 33% 
3 P2    R1           1 25% 
4 P2    R2           3 75%
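Note that paste0() turns freq into a character column (the <chr> in the output), so it can no longer be sorted or used in arithmetic as a number. One possible variation, sketched here as a suggestion rather than as part of the original tutorial, keeps the numeric proportion and adds a separate display column built with base R's sprintf(). The column name freq_label and the object name pct are assumptions made for this example.

```r
library(dplyr)

df <- data.frame(team = c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P2'),
                 position = c('R2', 'R1', 'R1', 'R2', 'R2', 'R1', 'R2'))

pct <- df %>%
  group_by(team, position) %>%
  summarise(n = n(), .groups = "drop_last") %>%       # stay grouped by team
  mutate(freq = n / sum(n),                           # numeric proportion
         freq_label = sprintf("%.1f%%", 100 * freq)) %>%  # text for display only
  ungroup()

pct
```

Here freq stays numeric for further calculations, while freq_label holds the strings "66.7%", "33.3%", "25.0%" and "75.0%" for reporting.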
