Climate circles

[This article was first published on R on Dominic Royé, and kindly contributed to R-bloggers].

The climate of a place is usually presented through climographs that combine monthly precipitation and temperature in a single chart. However, it is also interesting to visualize the climate on a daily scale, showing the thermal amplitude and the daily average temperature. To do this, we calculate, for each day of the year, the averages of the daily minimum, maximum and mean temperatures.

The annual climate cycle presents a good opportunity to use a radial or polar chart which allows us to clearly visualize seasonal patterns.

Packages

We will use the following packages:

Package     Description
tidyverse   Collection of packages (visualization, manipulation): ggplot2, dplyr, purrr, etc.
fs          Provides a cross-platform, uniform interface to file system operations
lubridate   Easy manipulation of dates and times
janitor     Simple functions to examine and clean data

# install the packages if necessary

if(!require("tidyverse")) install.packages("tidyverse")
if(!require("fs")) install.packages("fs")
if(!require("lubridate")) install.packages("lubridate")
if(!require("janitor")) install.packages("janitor")

# packages

library(tidyverse)
library(lubridate)
library(fs)
library(janitor)

Preparation

Data

We download the temperature data for a selection of US cities here. Data for other cities around the world are available through the WMO or GHCN datasets at NCDC/NOAA.

Import

To import the temperature time series of each city, which are stored in several files, we apply the read_csv() function to each file using map_df(). The dir_ls() function of the fs package returns the list of files with the csv extension. The suffix _df of map() indicates that we want to join all imported tables by rows into a single one. For those with less experience with tidyverse, I recommend the short introduction in this blog post.
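As a minimal sketch of this import pattern, the toy files below stand in for the per-city downloads. The folder name, file names and values are hypothetical and only serve to show how the tables are stacked.

```r
library(purrr)
library(readr)
library(fs)

# hypothetical toy files in a temporary folder, standing in for the downloads
tmp <- path(tempdir(), "toy_meteo")
dir_create(tmp)
write_csv(data.frame(NAME = "A", TAVG = c(1.5, 2.5)), path(tmp, "a.csv"))
write_csv(data.frame(NAME = "B", TAVG = c(3.5, 4.5)), path(tmp, "b.csv"))

# list the csv files (the dot is escaped so it matches a literal ".")
files <- dir_ls(tmp, regexp = "\\.csv$")

# read every file and bind the resulting tables by rows
toy <- map_df(files, read_csv, show_col_types = FALSE)

nrow(toy)
unique(toy$NAME)
```

Each file contributes its rows to one table, which is why the NAME column is needed afterwards to tell the stations apart.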

After the import, we extract the names of the weather stations and define a vector with the new city names.

# import data
meteo <- dir_ls(regexp = "\\.csv$") %>% 
          map_df(read_csv)
meteo
## # A tibble: 211,825 x 12
##    STATION     NAME    LATITUDE LONGITUDE ELEVATION DATE        TAVG  TMAX  TMIN
##    <chr>       <chr>      <dbl>     <dbl>     <dbl> <date>     <dbl> <dbl> <dbl>
##  1 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-01   6.8    NA    NA
##  2 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-02   8.4    NA    NA
##  3 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-03  11      NA    NA
##  4 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-04  -7.2    NA    NA
##  5 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-05 -10.2    NA    NA
##  6 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-06  -4.6    NA    NA
##  7 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-07  -7.1    NA    NA
##  8 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-08  -5.8    NA    NA
##  9 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-09   2.9    NA    NA
## 10 USW00094846 CHICAG~     42.0     -87.9      202. 1950-01-10   3.9    NA    NA
## # ... with 211,815 more rows, and 3 more variables: TAVG_ATTRIBUTES <chr>,
## #   TMAX_ATTRIBUTES <chr>, TMIN_ATTRIBUTES <chr>
# station names
stats_names <- unique(meteo$NAME)
stats_names
## [1] "CHICAGO OHARE INTERNATIONAL AIRPORT, IL US"             
## [2] "LAGUARDIA AIRPORT, NY US"                               
## [3] "MIAMI INTERNATIONAL AIRPORT, FL US"                     
## [4] "HOUSTON INTERCONTINENTAL AIRPORT, TX US"                
## [5] "ATLANTA HARTSFIELD JACKSON INTERNATIONAL AIRPORT, GA US"
## [6] "SAN FRANCISCO INTERNATIONAL AIRPORT, CA US"             
## [7] "SEATTLE TACOMA AIRPORT, WA US"                          
## [8] "DENVER INTERNATIONAL AIRPORT, CO US"                    
## [9] "MCCARRAN INTERNATIONAL AIRPORT, NV US"
# new city names
cities <- c("CHICAGO", "NEW YORK", "MIAMI", 
            "HOUSTON", "ATLANTA", "SAN FRANCISCO", 
            "SEATTLE", "DENVER", "LAS VEGAS")

Modify

In the first step, we modify the original data by 1) selecting only the columns of interest, 2) filtering the period 1991-2020, 3) assigning the new city names, 4) calculating the average temperature as the mean of the maximum and minimum where it is absent, 5) creating a new variable with the day of the year, and 6) cleaning the column names. The clean_names() function of the janitor package is very useful for getting clean column names.

meteo <- select(meteo, NAME, DATE, TAVG:TMIN) %>%  
           filter(DATE >= "1991-01-01", DATE <= "2020-12-31") %>% 
            mutate(NAME = factor(NAME, stats_names, cities),
                   TAVG = ifelse(is.na(TAVG), (TMAX+TMIN)/2, TAVG),
                   yd = yday(DATE)) %>% 
            clean_names()
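The relabelling and gap-filling inside mutate() can be seen in isolation on a toy table. This is only a sketch in base R, and the temperature values are made up for illustration.

```r
# two of the station names from the post and their short labels
stations <- c("LAGUARDIA AIRPORT, NY US", "MIAMI INTERNATIONAL AIRPORT, FL US")
short    <- c("NEW YORK", "MIAMI")

# made-up rows; the second one has no mean temperature
toy <- data.frame(
  NAME = c(stations[2], stations[1], stations[1]),
  TAVG = c(25, NA, 10),
  TMAX = c(30, 12, 14),
  TMIN = c(20, 4, 6)
)

# factor(x, levels, labels) maps each level to the label in the same position
toy$NAME <- factor(toy$NAME, levels = stations, labels = short)

# where TAVG is missing, fall back to the midpoint of TMAX and TMIN
toy$TAVG <- ifelse(is.na(toy$TAVG), (toy$TMAX + toy$TMIN) / 2, toy$TAVG)

toy
```

Labels are matched to levels by position, so the order of cities has to follow the order of stats_names.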

In the next step, we calculate the average of the daily maximum, minimum and mean temperature for each day of the year. All that remains is to convert the day of the year into a dummy date. Here we use the year 2000 because it is a leap year, which gives us all 366 days.

# estimate the daily averages
meteo_yday <- group_by(meteo, name, yd) %>% 
                  summarise(ta = mean(tavg, na.rm = TRUE),
                            tmx = mean(tmax, na.rm = TRUE),
                            tmin = mean(tmin, na.rm = TRUE))
## `summarise()` has grouped output by 'name'. You can override using the `.groups` argument.
meteo_yday
## # A tibble: 3,294 x 5
## # Groups:   name [9]
##    name       yd    ta    tmx  tmin
##    <fct>   <dbl> <dbl>  <dbl> <dbl>
##  1 CHICAGO     1 -3.77  0.537 -7.86
##  2 CHICAGO     2 -2.64  1.03  -6.68
##  3 CHICAGO     3 -2.88  0.78  -6.93
##  4 CHICAGO     4 -2.86  0.753 -7.10
##  5 CHICAGO     5 -4.13 -0.137 -8.33
##  6 CHICAGO     6 -4.50 -1.15  -8.05
##  7 CHICAGO     7 -4.70 -0.493 -8.57
##  8 CHICAGO     8 -3.97  0.147 -8.02
##  9 CHICAGO     9 -3.47  0.547 -7.49
## 10 CHICAGO    10 -3.41  1.09  -7.64
## # ... with 3,284 more rows
# convert the days of the year into a dummy date
meteo_yday <- mutate(meteo_yday, yd = as_date(yd, origin = "1999-12-31"))
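The averaging and the dummy-date conversion can be followed on a tiny, made-up series covering one ordinary year and one leap year. This is a sketch under those assumptions, not the data used in the post.

```r
library(dplyr)
library(lubridate)

# made-up observations: 2019 is an ordinary year, 2020 a leap year
toy <- tibble(
  date = ymd(c("2019-01-01", "2020-01-01", "2019-12-31", "2020-12-31")),
  tavg = c(2, 4, 1, 5)
)

toy_yday <- toy %>%
  mutate(yd = yday(date)) %>%                      # day of the year, 1 to 366
  group_by(yd) %>%
  summarise(ta = mean(tavg), .groups = "drop") %>% # average over the years
  mutate(yd = as_date(yd, origin = "1999-12-31"))  # day 1 becomes 2000-01-01

toy_yday
```

Because the grouping is by day number, 31 December 2019 (day 365) and 31 December 2020 (day 366) end up in different rows: day 366 is fed only by leap years, and from March onwards the same calendar date is one day number higher in a leap year than in an ordinary one.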

Creating the climate circles

Predefinitions

We define the colour palette for the temperature scale as a vector of hex codes ordered from cold to warm hues.

col_temp <- c("#cbebf6","#a7bfd9","#8c99bc","#974ea8","#830f74",
              "#0b144f","#0e2680","#223b97","#1c499a","#2859a5",
              "#1b6aa3","#1d9bc4","#1ca4bc","#64c6c7","#86cabb",
              "#91e0a7","#c7eebf","#ebf8da","#f6fdd1","#fdeca7",
              "#f8da77","#fcb34d","#fc8c44","#f85127","#f52f26",
              "#d10b26","#9c042a","#760324","#18000c")

We create a table with the x-axis grid lines.

grid_x <- tibble(x = seq(ymd("2000-01-01"), ymd("2000-12-31"), "month"), 
                 y = rep(-10, 12), 
                 xend = seq(ymd("2000-01-01"), ymd("2000-12-31"), "month"), 
                 yend = rep(41, 12))

We define all the style elements of the graph in our own theme theme_cc().

theme_cc <- function(){ 
  
 theme_minimal(base_family = "Montserrat") %+replace%
  theme(plot.title = element_text(hjust = 0.5, colour = "white", size = 30, margin = margin(b = 20)),
        plot.caption = element_text(colour = "white", size = 9, hjust = .5, vjust = -30),
        plot.background = element_rect(fill = "black"),
        plot.margin = margin(1, 1, 2, 1, unit = "cm"),
  
        axis.text.x = element_text(face = "italic", colour = "white"),
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        
        legend.title = element_text(colour = "white"),
        legend.position = "bottom",
        legend.justification = 0.5,
        legend.text = element_text(colour = "white"),
       
        
        strip.text = element_text(colour = "white", face = "bold", size = 14),
        
        panel.spacing.y = unit(1, "lines"),
        panel.background = element_rect(fill = "black"),
        panel.grid = element_blank()
      ) 
}
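The theme relies on the Montserrat font being available to the graphics device, and the post does not show how it was loaded. One possible setup is sketched here, assuming the sysfonts and showtext packages (neither is in the package list above) and an internet connection.

```r
library(sysfonts)
library(showtext)

# assumption: the font is fetched from Google Fonts and registered under
# the family name that theme_cc() expects
font_add_google("Montserrat", family = "Montserrat")

# use showtext to render text on graphics devices opened from now on
showtext_auto()

# the family should now be listed among the registered fonts
"Montserrat" %in% font_families()
```

If the font is already installed on the system and visible to the graphics device, this step may not be needed.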

Graph

We start by building a chart for New York City only. We use geom_linerange() to draw one vertical line per day, spanning the average daily minimum to the average daily maximum temperature, and we colour each line according to the mean temperature. Finally, we adjust alpha and size to get a nicer look.

# filter New York
ny_city <- filter(meteo_yday, name == "NEW YORK") 

# graph
ggplot(ny_city) + 
  geom_linerange(aes(yd, 
                     ymax = tmx, 
                     ymin = tmin, 
                     colour = ta),
                 size=0.5, 
                 alpha = .7) + 
  scale_y_continuous(breaks = seq(-30, 50, 10), 
                     limits = c(-11, 42), 
                     expand = expansion()) +
  scale_colour_gradientn(colours = col_temp, 
                         limits = c(-12, 35), 
                         breaks = seq(-12, 34, 5)) + 
  scale_x_date(date_breaks = "month",
               date_labels = "%b") +
  labs(title = "CLIMATE CIRCLES", 
       colour = "Daily average temperature") 

To get the polar chart, we only need to add the coord_polar() function.

# polar chart
ggplot(ny_city) + 
  geom_linerange(aes(yd, 
                     ymax = tmx, 
                     ymin = tmin, 
                     colour = ta),
                 size=0.5, 
                 alpha = .7) + 
  scale_y_continuous(breaks = seq(-30, 50, 10), 
                     limits = c(-11, 42), 
                     expand = expansion()) +
  scale_colour_gradientn(colours = col_temp, 
                         limits = c(-12, 35), 
                         breaks = seq(-12, 34, 5)) + 
  scale_x_date(date_breaks = "month",
               date_labels = "%b") +
  coord_polar() +
  labs(title = "CLIMATE CIRCLES", 
       colour = "Daily average temperature") 
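For readers who want to try the polar version without downloading the station files, here is a minimal sketch on synthetic data. The sinusoidal temperatures and the y limits are invented for illustration and do not describe any real city.

```r
library(ggplot2)

# synthetic daily climate for one leap year (made-up values)
toy <- data.frame(
  yd = seq(as.Date("2000-01-01"), as.Date("2000-12-31"), by = "day")
)
doy <- as.numeric(format(toy$yd, "%j"))               # day of the year, 1 to 366
toy$ta   <- 12 - 12 * cos(2 * pi * (doy - 20) / 366)  # coldest around 20 January
toy$tmx  <- toy$ta + 4
toy$tmin <- toy$ta - 4

p <- ggplot(toy) +
  geom_linerange(aes(yd, ymin = tmin, ymax = tmx, colour = ta)) +
  # the lower limit sits at the centre of the circle, so a limit below the
  # coldest value leaves an empty hole in the middle
  scale_y_continuous(limits = c(-15, 30)) +
  scale_x_date(date_breaks = "month", date_labels = "%b") +
  coord_polar()

p
```

Removing the coord_polar() line gives the same layers in ordinary Cartesian coordinates, which makes it easy to compare the two views.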

In the final graph, we add the grid, defining the lines on the y-axis with geom_hline() and those on the x-axis with geom_segment(). The most important thing here is the facet_wrap() function, which splits the chart into one panel per city. The facets are specified with a formula. Here a one-sided formula, ~name, is enough; the row ~ column form, in which a point . stands in for a missing second variable, is the one used by facet_grid(). In addition, we make changes to the appearance of the colour bar with guides() and guide_colourbar(), and we include the theme_cc() style.

ggplot(meteo_yday) + 
  geom_hline(yintercept = c(-10, 0, 10, 20, 30, 40), 
             colour = "white", 
             size = .4) +
  geom_segment(data = grid_x , 
               aes(x = x, 
                   y = y, 
                   xend = xend, 
                   yend = yend), 
               linetype = "dashed", 
               colour = "white", 
               size = .2) +
  geom_linerange(aes(yd, 
                     ymax = tmx, 
                     ymin = tmin, 
                     colour = ta),
                 size=0.5, 
                 alpha = .7) + 
  scale_y_continuous(breaks = seq(-30, 50, 10), 
                     limits = c(-11, 42), 
                     expand = expansion())+
  scale_colour_gradientn(colours = col_temp, 
                         limits = c(-12, 35), 
                         breaks = seq(-12, 34, 5)) + 
  scale_x_date(date_breaks = "month", 
               date_labels = "%b") +
  guides(colour = guide_colourbar(barwidth = 15,
                                  barheight = 0.5, 
                                  title.position = "top")
         ) +
  facet_wrap(~name, nrow = 3) +
  coord_polar() + 
  labs(title = "CLIMATE CIRCLES", 
       colour = "Daily average temperature") +
  theme_cc()
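The two ways of writing the facet formula can be compared on a small example. The data frame below is hypothetical and only serves to produce two panels.

```r
library(ggplot2)

# hypothetical data with two groups
toy <- data.frame(
  name = rep(c("CITY A", "CITY B"), each = 3),
  x    = rep(1:3, times = 2),
  y    = c(1, 2, 3, 3, 2, 1)
)

base <- ggplot(toy, aes(x, y)) +
  geom_line()

# one-sided formula: one panel per value of name, wrapped into rows
p_wrap <- base + facet_wrap(~name, nrow = 1)

# row ~ column formula: the point means "no row variable"
p_grid <- base + facet_grid(. ~ name)

p_wrap
```

With a single faceting variable both calls give one panel per city; facet_wrap() additionally lets the panels flow over several rows through nrow, as in the 3 x 3 layout above.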
