Tidy Tuesday Energy Analysis
Tidy Tuesday: Energy (2023 – Week 23)
Introduction
In this document I’ll analyze and visualize some energy data that were the focus of Tidy Tuesday 2023 week 23. The data come from Our World in Data, and the full data set is available here. Data Source Citation: Ritchie, Roser, and Rosado (2022).
Analysis and Visualization
I’ll start by loading the tidyverse (Wickham et al. 2019) and plotly libraries and the data set. The result is a data frame with one row per country and year, with years starting in 1900.
Code
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(plotly))

owid_energy <- readr::read_csv(
  "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-06/owid-energy.csv",
  show_col_types = FALSE
)
head(owid_energy)
# A tibble: 6 × 129
  country      year iso_code population   gdp biofuel_cons_change_pct
  <chr>       <dbl> <chr>         <dbl> <dbl>                   <dbl>
1 Afghanistan  1900 AFG         4832414    NA                      NA
2 Afghanistan  1901 AFG         4879685    NA                      NA
3 Afghanistan  1902 AFG         4935122    NA                      NA
4 Afghanistan  1903 AFG         4998861    NA                      NA
5 Afghanistan  1904 AFG         5063419    NA                      NA
6 Afghanistan  1905 AFG         5128808    NA                      NA
# ℹ 123 more variables: biofuel_cons_change_twh <dbl>,
#   biofuel_cons_per_capita <dbl>, biofuel_consumption <dbl>,
#   biofuel_elec_per_capita <dbl>, biofuel_electricity <dbl>,
#   biofuel_share_elec <dbl>, biofuel_share_energy <dbl>,
#   carbon_intensity_elec <dbl>, coal_cons_change_pct <dbl>,
#   coal_cons_change_twh <dbl>, coal_cons_per_capita <dbl>,
#   coal_consumption <dbl>, coal_elec_per_capita <dbl>, …
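The one-row-per-country-and-year structure can be checked by counting duplicate keys. Below is a minimal sketch on a toy tibble; the `toy` data frame and its values are made up for illustration, and with the real data you would substitute `owid_energy`.

```r
library(dplyr)

# Toy stand-in for owid_energy (hypothetical values, not real data)
toy <- tibble::tibble(
  country = c("A", "A", "B", "B"),
  year    = c(1900, 1901, 1900, 1901),
  gdp     = c(NA, NA, 1, 2)
)

# Count rows per (country, year) key and keep any key that appears more than once
dupes <- toy %>%
  count(country, year) %>%
  filter(n > 1)

# TRUE when each country-year pair appears exactly once
nrow(dupes) == 0
```

If `dupes` has any rows, those country-year pairs are repeated and the data frame is not keyed the way the text assumes.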
- How many countries are in the dataset?
Code
length(unique(owid_energy$country))
[1] 306
That’s a lot! I’ll focus on just the United States for now.
- I also noticed that the data set goes back to 1900, but a lot of the data for earlier years are missing (NA), so I’ll filter those out as well.
- It looks like we have data for the USA from 2000-2021.
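One way to find the span of years with usable data is to drop rows where the column of interest is missing and then summarise the remaining years. This is only a sketch under assumptions: the `toy` tibble and its values are invented, and `electricity_demand` is used as the completeness check because that is the column the filter in this post relies on.

```r
library(dplyr)

# Toy stand-in for owid_energy (hypothetical values, not real data)
toy <- tibble::tibble(
  country            = rep("United States", 5),
  year               = 1998:2002,
  electricity_demand = c(NA, NA, 10, 11, 12)
)

# First and last year, and number of years, with a non-missing value
coverage <- toy %>%
  filter(country == "United States", !is.na(electricity_demand)) %>%
  summarise(
    first_year = min(year),
    last_year  = max(year),
    n_years    = n()
  )

coverage
```

Running the same pipeline on `owid_energy` should report the coverage for the real data.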
Make a new dataframe for just the USA data and remove years without data.
Code
usa <- owid_energy %>%
  filter(country == "United States") %>%
  filter(!is.na(electricity_demand))

head(usa)
# A tibble: 6 × 129
  country        year iso_code population     gdp biofuel_cons_change_pct
  <chr>         <dbl> <chr>         <dbl>   <dbl>                   <dbl>
1 United States  2000 USA       282398560 1.30e13                   14.6 
2 United States  2001 USA       285470496 1.31e13                    6.24
3 United States  2002 USA       288350240 1.33e13                   19.5 
4 United States  2003 USA       291109824 1.37e13                   35.7 
5 United States  2004 USA       293947872 1.42e13                   26.2 
6 United States  2005 USA       296842656 1.47e13                   16.8 
# ℹ 123 more variables: biofuel_cons_change_twh <dbl>,
#   biofuel_cons_per_capita <dbl>, biofuel_consumption <dbl>,
#   biofuel_elec_per_capita <dbl>, biofuel_electricity <dbl>,
#   biofuel_share_elec <dbl>, biofuel_share_energy <dbl>,
#   carbon_intensity_elec <dbl>, coal_cons_change_pct <dbl>,
#   coal_cons_change_twh <dbl>, coal_cons_per_capita <dbl>,
#   coal_consumption <dbl>, coal_elec_per_capita <dbl>, …
Renewables breakdown
- In this dataset, renewables include wind, solar, and hydro.
Code
g <- usa %>%
  select(
    year,
    renewables_share_elec,
    solar_share_elec,
    hydro_share_elec,
    wind_share_elec
  ) %>%
  tidyr::pivot_longer(
    cols = dplyr::ends_with("share_elec"),
    names_to = "FuelType",
    values_to = "Percentage"
  ) %>%
  mutate(FuelType = str_remove(FuelType, "_share_elec")) %>%
  ggplot(aes(year, Percentage)) +
  geom_line(aes(color = FuelType), linewidth = 1.5) +
  ggtitle("Percent of US electricity Generation: Renewables") +
  xlab("Year") +
  ylab("Percent")

plotly::ggplotly(g)
Observations from this plot (Figure 4):
- The share of renewable electricity production has increased sharply, approximately doubling from 2008 to 2020.
- The share of hydro generation has remained relatively constant.
- Solar and wind shares have increased significantly.
  - Wind started to increase earlier, around 2005.
  - Solar started increasing around 2012.
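A claim such as "approximately doubling" can be checked numerically by dividing the share in the later year by the share in the earlier year. The sketch below assumes a hypothetical helper, `share_ratio()`, which is not part of the original analysis, and exercises it on invented values rather than the real data.

```r
library(dplyr)

# Hypothetical helper: ratio of a column's value between two years
share_ratio <- function(df, col, from, to) {
  start <- df %>% filter(year == from) %>% pull({{ col }})
  end   <- df %>% filter(year == to) %>% pull({{ col }})
  end / start
}

# Toy values chosen only to exercise the function (not real data)
toy <- tibble::tibble(
  year                  = c(2008, 2020),
  renewables_share_elec = c(10, 20)
)

# A ratio near 2 would correspond to a doubling
share_ratio(toy, renewables_share_elec, from = 2008, to = 2020)
```

With the `usa` data frame from earlier, the equivalent call would be `share_ratio(usa, renewables_share_elec, from = 2008, to = 2020)`.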
SessionInfo
Code
sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.1.2

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Denver
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
 [1] plotly_4.10.3   lubridate_1.9.3 forcats_1.0.0   stringr_1.5.0
 [5] dplyr_1.1.3     purrr_1.0.2     readr_2.1.4     tidyr_1.3.0
 [9] tibble_3.2.1    ggplot2_3.4.4   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] utf8_1.2.4        generics_0.1.3    renv_1.0.3        stringi_1.7.12
 [5] hms_1.1.3         digest_0.6.33     magrittr_2.0.3    evaluate_0.22
 [9] grid_4.3.1        timechange_0.2.0  fastmap_1.1.1     jsonlite_1.8.7
[13] httr_1.4.7        fansi_1.0.5       crosstalk_1.2.0   viridisLite_0.4.2
[17] scales_1.2.1      lazyeval_0.2.2    cli_3.6.1         rlang_1.1.1
[21] crayon_1.5.2      ellipsis_0.3.2    bit64_4.0.5       munsell_0.5.0
[25] withr_2.5.1       yaml_2.3.7        parallel_4.3.1    tools_4.3.1
[29] tzdb_0.4.0        colorspace_2.1-0  curl_5.1.0        vctrs_0.6.4
[33] R6_2.5.1          lifecycle_1.0.3   htmlwidgets_1.6.2 bit_4.0.5
[37] vroom_1.6.4       pkgconfig_2.0.3   pillar_1.9.0      gtable_0.3.4
[41] glue_1.6.2        data.table_1.14.8 xfun_0.40         tidyselect_1.2.0
[45] rstudioapi_0.15.0 knitr_1.44        farver_2.1.1      htmltools_0.5.6.1
[49] labeling_0.4.3    rmarkdown_2.25    compiler_4.3.1
References