Summarising Dates with Missing Values

[This article was first published on R on Alan Yeung, and kindly contributed to R-bloggers.]

This blog post is just a note that when you do a grouped summary of a date variable with min(..., na.rm = TRUE) and some groups contain only missing values, those groups return Inf (with a warning) rather than NA. Because the result is not NA, checks such as is.na() will not flag it, and this can cause issues in analysis if you are not careful.

library(tidyverse)
library(lubridate)  # dmy() and NA_Date_; attached automatically by tidyverse >= 2.0.0

df <- tibble::tribble(
    ~id,          ~dt,
    1L, "01/01/2001",
    1L,           NA,
    2L,           NA,
    2L,           NA
) %>% 
    mutate(dt = dmy(dt))

z1 <- df %>% 
    group_by(id) %>% 
    summarise(dt_min = min(dt, na.rm = TRUE),
              .groups = "drop")

z1
# A tibble: 2 × 2
#      id dt_min    
#   <int> <date>    
# 1     1 2001-01-01
# 2     2 Inf

sum(is.na(z1$dt_min))
# [1] 0
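
The same behaviour can be reproduced in base R without dplyr, which suggests where the Inf comes from: once na.rm = TRUE has dropped every value, min() is left with an empty vector and falls back to Inf with a warning, and the Date class is kept on the result. The following is a minimal sketch; the variable names are only illustrative.

```r
# A group whose dates are all missing
all_missing <- as.Date(c(NA, NA))

# min() warns "no non-missing arguments to min; returning Inf";
# the warning is suppressed here only to keep the output tidy
m <- suppressWarnings(min(all_missing, na.rm = TRUE))

class(m)        # still a Date
is.na(m)        # FALSE, so NA checks miss it
is.infinite(m)  # TRUE, the value underneath is Inf
```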

There are a couple of ways around this. First, you can use an if() statement that returns NA_Date_ when every value in the group is missing.

z2 <- df %>% 
    group_by(id) %>% 
    summarise(dt_min = if (all(is.na(dt))) NA_Date_ else min(dt, na.rm = TRUE),
              .groups = "drop")

z2
# A tibble: 2 × 2
#      id dt_min    
#   <int> <date>    
# 1     1 2001-01-01
# 2     2 NA

sum(is.na(z2$dt_min))
# [1] 1
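
If the same check is needed for several columns, the if() logic can be wrapped in a small helper. This is a sketch only: safe_min() is a hypothetical name, not a function from any package, and it assumes that an all-missing (or empty) input should give a single NA of the same class as the input.

```r
# Hypothetical helper: min() that returns NA instead of Inf
# when nothing is left after dropping missing values
safe_min <- function(x) {
  if (all(is.na(x))) {
    x[NA_integer_]          # a single NA that keeps the class of x (e.g. Date)
  } else {
    min(x, na.rm = TRUE)
  }
}

safe_min(as.Date(c("2001-01-01", NA)))  # the one non-missing date
safe_min(as.Date(c(NA, NA)))            # NA, with class Date
```

Inside summarise() this would be used as dt_min = safe_min(dt).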

Or you can use the summary functions from the hablar package.

z3 <- df %>% 
    group_by(id) %>% 
    summarise(dt_min = hablar::min_(dt),
              .groups = "drop")

z3
# A tibble: 2 × 2
#      id dt_min    
#   <int> <date>    
# 1     1 2001-01-01
# 2     2 NA

sum(is.na(z3$dt_min))
# [1] 1
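
A further option, sketched here in base R, is to leave the summary as it is and convert any non-finite results to NA afterwards. The vector below stands in for a summarised column such as z1$dt_min; it is built by hand for illustration.

```r
# Stand-in for a summarised column: one real date and one Inf
dt_min <- c(
  as.Date("2001-01-01"),
  suppressWarnings(min(as.Date(NA), na.rm = TRUE))
)

# Replace anything that is not a finite date with a missing Date
dt_min[!is.finite(dt_min)] <- as.Date(NA)

sum(is.na(dt_min))
# [1] 1
```

This keeps the summarise() call unchanged, at the cost of a separate clean-up step that is easy to forget.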

Is there a reason why R returns Inf when summarising dates? Are there any other solutions for summarising date variables that contain missing values? Leave me a comment if you know, thanks.
