Conquering Daily Data: How to Aggregate to Months and Years Like a Pro in R

[This article was first published on Steve's Data Tips and Tricks, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Introduction

Taming the beast of daily data can be daunting. While it captures every detail, sometimes you need a bird’s-eye view. Enter aggregation, your secret weapon for transforming daily data into monthly and yearly insights. In this post, we’ll dive into the world of R, where you’ll wield powerful tools like dplyr and lubridate to master this data wrangling art.

Packages: Gear Up with the Right Packages

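Think of R packages like your trusty toolbox. Today, we’ll need two essentials: dplyr, for grouping and summarizing data, and lubridate, for pulling months and years out of dates.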

Sample Data, Our Training Ground

Imagine you have daily sales data for a year. Each row represents a day, with columns for the date and the sales amount. Let’s create a small example:

library(dplyr)
library(lubridate)

# Generate one date per day of 2023 and a random sales figure for each
set.seed(123)
dates <- seq(as.Date('2023-01-01'), as.Date('2023-12-31'), by = 'day')
sales <- runif(365, min=5000, max=10000)

# Create our data frame
daily_data <- data.frame(date = dates, sales = sales)

# Peek at our data
head(daily_data)
        date    sales
1 2023-01-01 6437.888
2 2023-01-02 8941.526
3 2023-01-03 7044.885
4 2023-01-04 9415.087
5 2023-01-05 9702.336
6 2023-01-06 5227.782

This code generates 365 consecutive dates covering all of 2023, draws a random sales figure between 5,000 and 10,000 for each one, and stores both in a data frame called daily_data.
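
Before aggregating, it can help to confirm that the data really has one row per day. The following is a minimal sketch of such a check, assuming the same seed and date range as above; sizing the random draw with length(dates) instead of a hard-coded 365 is a small change of my own, not something the original code does.

```r
# Rebuild the sample data as in the post, but size the random draw
# from the date vector instead of hard-coding 365
set.seed(123)
dates <- seq(as.Date("2023-01-01"), as.Date("2023-12-31"), by = "day")
daily_data <- data.frame(
  date  = dates,
  sales = runif(length(dates), min = 5000, max = 10000)
)

# One row per calendar day
nrow(daily_data)                  # 365

# Consecutive dates differ by exactly one day, so there are no gaps
all(diff(daily_data$date) == 1)   # TRUE

# Sales stay inside the bounds passed to runif()
range(daily_data$sales)
```

If the date range ever changes (a leap year, say), the data frame still lines up because both columns are derived from the same vector.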

Monthly Magic – From Days to Months

Now, let’s transform this daily data into monthly insights. Here’s the incantation:

# Group data by month
monthly_data <- daily_data %>%
   # Group by month extracted from date
  group_by(month = month(date)) %>%
  # Calculate total sales for each month
  summarize(total_sales = sum(sales))

head(monthly_data)
# A tibble: 6 × 2
  month total_sales
  <dbl>       <dbl>
1     1     245675.
2     2     199109.
3     3     233764.
4     4     227888.
5     5     230928.
6     6     222015.

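Let’s break it down: group_by(month = month(date)) uses lubridate’s month() to pull the month number (1 to 12) out of each date and groups the rows by it. summarize(total_sales = sum(sales)) then adds up the sales within each group, leaving one row per month.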
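
One thing to watch: month() returns only the month number, so with more than one year of data, January 2022 and January 2023 would land in the same group. A possible way around this, sketched below, is to group by lubridate’s floor_date() instead. The two_years data frame is a hypothetical two-year data set invented for this illustration, not part of the original example.

```r
library(dplyr)
library(lubridate)

# Hypothetical two-year data set, built the same way as daily_data
set.seed(123)
two_years <- data.frame(
  date = seq(as.Date("2022-01-01"), as.Date("2023-12-31"), by = "day")
)
two_years$sales <- runif(nrow(two_years), min = 5000, max = 10000)

# floor_date() maps each date to the first day of its month,
# so the same month in different years stays in separate groups
monthly_by_year <- two_years %>%
  group_by(month = floor_date(date, unit = "month")) %>%
  summarize(total_sales = sum(sales), .groups = "drop")

nrow(monthly_by_year)   # 24, one row per calendar month
```

Because the month column is still a Date, it sorts chronologically and can be passed straight to a plotting function.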

Yearly Triumph – Conquering the Calendar

Yearning for yearly insights? Fear not! Modify the spell slightly:

# Group data by year
yearly_data <- daily_data %>%
  # Group by year extracted from date
  group_by(year = year(date)) %>%
  # Calculate average sales for each year
  summarize(average_sales = mean(sales))

head(yearly_data)
# A tibble: 1 × 2
   year average_sales
  <dbl>         <dbl>
1  2023         7494.

Here, we group by the year extracted from date and then calculate the average sales for each year.
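
summarize() is not limited to a single statistic. As a rough sketch, the same grouping can return the average, the total, and the number of days in one pass. The column names total_sales and days are my own choices for this illustration.

```r
library(dplyr)
library(lubridate)

# Same sample data as in the post
set.seed(123)
daily_data <- data.frame(
  date  = seq(as.Date("2023-01-01"), as.Date("2023-12-31"), by = "day"),
  sales = runif(365, min = 5000, max = 10000)
)

# Several summaries per year in a single summarize() call
yearly_summary <- daily_data %>%
  group_by(year = year(date)) %>%
  summarize(
    average_sales = mean(sales),
    total_sales   = sum(sales),
    days          = n(),        # number of daily rows in each year
    .groups       = "drop"
  )

yearly_summary
```

Keeping the row count next to the average makes it easy to spot a year with missing days, where the mean would otherwise look deceptively normal.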

But what about base R?

So far, we’ve used dplyr to group and summarize our data. But what if you don’t have dplyr? No problem! You can use base R functions like aggregate() to achieve the same results:

monthly_data <- aggregate(
  daily_data$sales, 
  by = list(month = format(daily_data$date, '%m')), 
  FUN = sum
  )
head(monthly_data)
  month        x
1    01 245675.1
2    02 199108.7
3    03 233764.1
4    04 227888.3
5    05 230928.0
6    06 222015.3

yearly_data <- aggregate(
  daily_data$sales, 
  by = list(year = format(daily_data$date, '%Y')), 
  FUN = mean
  )
head(yearly_data)
  year      x
1 2023 7493.8
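
Notice that the result column above is named x, which is not very descriptive. One alternative, shown here as a sketch rather than the author’s method, is the formula interface of aggregate(), which keeps the original column names. Adding a helper month column to daily_data first is an assumption made for this example.

```r
# Same sample data as in the post (base R only)
set.seed(123)
daily_data <- data.frame(
  date  = seq(as.Date("2023-01-01"), as.Date("2023-12-31"), by = "day"),
  sales = runif(365, min = 5000, max = 10000)
)

# Helper column holding the two-digit month, e.g. "01" for January
daily_data$month <- format(daily_data$date, "%m")

# Formula interface: summarize sales for each value of month
monthly_named <- aggregate(sales ~ month, data = daily_data, FUN = sum)

names(monthly_named)   # "month" "sales"
head(monthly_named)
```

The totals match the earlier aggregate() call; only the column naming differs.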

Experiment!

The magic doesn’t stop there! You can customize your aggregations to your heart’s content. Try these variations:
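
Two possible variations are sketched below, assuming the same daily_data as before: quarterly totals using lubridate’s quarter(), and monthly medians and maximums with month names as labels. These are illustrative choices of mine; any summary function that takes a numeric vector would slot in the same way.

```r
library(dplyr)
library(lubridate)

# Same sample data as in the post
set.seed(123)
daily_data <- data.frame(
  date  = seq(as.Date("2023-01-01"), as.Date("2023-12-31"), by = "day"),
  sales = runif(365, min = 5000, max = 10000)
)

# Variation 1: total sales per quarter (quarter() returns 1 to 4)
quarterly_data <- daily_data %>%
  group_by(quarter = quarter(date)) %>%
  summarize(total_sales = sum(sales), .groups = "drop")

# Variation 2: median and maximum per month, labelled with month names
monthly_stats <- daily_data %>%
  group_by(month = month(date, label = TRUE)) %>%
  summarize(
    median_sales = median(sales),
    max_sales    = max(sales),
    .groups      = "drop"
  )

quarterly_data
monthly_stats
```

With label = TRUE, month() returns an ordered factor, so the rows stay in calendar order rather than alphabetical order. The label text itself depends on your locale.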

Remember

The Takeaway

Mastering daily data aggregation is a valuable skill for any data warrior. With the help of R and your newfound knowledge, you can transform mountains of daily data into insightful monthly and yearly summaries. So, go forth, conquer your data, and share your insights with the world!

Bonus Challenge: Share your own R code and insights in the comments below! Let’s learn from each other and become daily data aggregation masters together!

To leave a comment for the author, please follow the link and comment on their blog: Steve's Data Tips and Tricks.
