Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Introduction
In my last blog I shared a basic dataset listing the Prime Ministers of Canada, the start and end of their terms, and the political party they were associated with during their tenure. In this blog I share my second dataset contribution, which complements it: Canadian inflation rate data.
Note: This blog is based on my Kaggle notebook covering the topic. To get the relevant data from this blog check out my contributions: Canadian Prime Ministers and Canada Inflation Rates.
Loading and Combining the Data Together
The Canada Prime Ministers dataset is loaded much as in my previous blog. However, to get it into a format that can be combined with the inflation data, I add a variable called Term Interval, which combines the start and end dates of a Prime Minister's tenure.
    # Suppress warnings
    options(warn = -1)

    library(tidyverse) # metapackage of all tidyverse packages
    library(lubridate)

    prime_ministers <- readr::read_csv(
      '../input/canadian-prime-ministers/Canadian Prime Ministers Dataset.csv',
      show_col_types = FALSE
    ) %>%
      mutate(
        # Format dates properly
        `Term Start` = anytime::anydate(`Term Start`),
        # For Justin Trudeau's term we'll have it run up to today
        `Term End` = ifelse(Name == "Justin Trudeau",
                            lubridate::today(),
                            anytime::anydate(`Term End`)) %>%
          anytime::anydate(), # ifelse() drops the Date class, so convert back
        # An interval of the start and end date of a prime minister
        `Term Interval` = lubridate::interval(`Term Start`, `Term End`)
      )

    tail(prime_ministers)
No. | Name | Political Party | Term Start | Term End | Term Interval |
---|---|---|---|---|---|
<chr> | <chr> | <chr> | <date> | <date> | <Interval> |
18 | Brian Mulroney | Progressive Conservative | 1984-09-17 | 1993-06-24 | 1984-09-17 UTC–1993-06-24 UTC |
19 | Kim Campbell | Progressive Conservative | 1993-06-25 | 1993-11-03 | 1993-06-25 UTC–1993-11-03 UTC |
20 | Jean Chrétien | Liberal | 1993-11-04 | 2003-12-11 | 1993-11-04 UTC–2003-12-11 UTC |
21 | Paul Martin | Liberal | 2003-12-12 | 2006-02-05 | 2003-12-12 UTC–2006-02-05 UTC |
22 | Stephen Harper | Conservative | 2006-02-06 | 2015-11-03 | 2006-02-06 UTC–2015-11-03 UTC |
23 | Justin Trudeau | Liberal | 2015-11-04 | 2022-04-06 | 2015-11-04 UTC–2022-04-06 UTC |
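To show what the Term Interval column makes possible, here is a minimal sketch using Kim Campbell's term dates from the table above. The variable names are made up for illustration; `interval()` and `%within%` come from lubridate.

```r
library(lubridate)

# Kim Campbell's term, with dates copied from the table above
campbell_term <- interval(ymd("1993-06-25"), ymd("1993-11-03"))

# %within% checks whether a date falls inside an interval
in_term  <- ymd("1993-07-01") %within% campbell_term  # TRUE
out_term <- ymd("1994-01-01") %within% campbell_term  # FALSE
```

This is the test the next step relies on to decide which party was in power on a given inflation date.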
To combine the Prime Ministers dataset with the inflation data I use the mutate function and define a new field called political_party. For this I use the mapply() function (a multivariate version of sapply) to handle the details of the mapping. To fill in NA values, tidyr::fill(..., .direction="downup") is employed.
It's ugly, but it works.
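    inflation_data <- readr::read_csv(
      '../input/canada-inflation-rates-source-bank-of-canada/CPI-INFLATION-sd-1993-01-01-ed-2022-01-01.csv',
      show_col_types = FALSE
    ) %>%
      mutate(
        date = anytime::anydate(date),
        # mapply() steps through x, y and z in parallel and keeps the party
        # when the date falls within the paired interval
        political_party = mapply(
          function(x, y, z) z[x %within% y],
          x = date,
          y = prime_ministers$`Term Interval`,
          z = prime_ministers$`Political Party`
        ) %>%
          # Dates with no match come back with length zero; turn those into NA
          lapply(function(x) ifelse(length(x) == 0, NA, x)) %>%
          unlist()
      ) %>%
      # Fill the NAs from neighbouring rows, first downwards and then upwards
      tidyr::fill(political_party, .direction = "downup")

    head(inflation_data)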
date | INDINF_CPI_M | INDINF_LOWTARGET | INDINF_UPPTARGET | political_party |
---|---|---|---|---|
<date> | <dbl> | <dbl> | <dbl> | <chr> |
1993-01-01 | 2.0 | 1.972223 | 3.972223 | Liberal |
1993-02-01 | 2.4 | 1.944445 | 3.944445 | Liberal |
1993-03-01 | 1.9 | 1.916667 | 3.916667 | Liberal |
1993-04-01 | 1.8 | 1.888890 | 3.888889 | Liberal |
1993-05-01 | 1.9 | 1.861112 | 3.861111 | Liberal |
1993-06-01 | 1.7 | 1.833334 | 3.833333 | Liberal |
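One thing worth double-checking in the output above: the Prime Ministers table lists Brian Mulroney (Progressive Conservative) in office until 1993-06-24, yet early 1993 is labelled Liberal. As far as I can tell, mapply() walks its arguments in parallel and recycles the shorter ones, so each date is tested against a single term rather than all of them, and the fill step then papers over the gaps. Below is a minimal base R sketch of an alternative lookup that tests each date against every term. The `pm` data frame is a hand-typed copy of a few rows from the table above, and `party_on()` is a hypothetical helper, not part of the original notebook.

```r
# Hand-typed copy of a few rows of the Prime Ministers table (illustration only)
pm <- data.frame(
  party = c("Progressive Conservative", "Progressive Conservative",
            "Liberal", "Liberal", "Conservative"),
  start = as.Date(c("1984-09-17", "1993-06-25", "1993-11-04",
                    "2003-12-12", "2006-02-06")),
  end   = as.Date(c("1993-06-24", "1993-11-03", "2003-12-11",
                    "2006-02-05", "2015-11-03")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the party whose term contains date d, or NA if none does
party_on <- function(d, terms) {
  hit <- which(d >= terms$start & d <= terms$end)
  if (length(hit) == 0) NA_character_ else terms$party[hit[1]]
}

dates <- as.Date(c("1993-01-01", "1993-12-01", "2006-03-01"))

# Check every date against every term, one date at a time
parties <- vapply(
  seq_along(dates),
  function(i) party_on(dates[i], pm),
  character(1)
)
parties
```

With this approach no fill step is needed for dates that fall inside a listed term, and a date outside every term stays NA instead of inheriting a neighbour's label.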
Now for making the visual. With the ggthemes package and its theme_fivethirtyeight() theme, the visual looks quite nice and informative. From the visual below it's possible to see that there might be a relationship, but it is too noisy to draw conclusions from in its present form.
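A minimal sketch of how a chart along these lines could be built, assuming ggplot2 and ggthemes are installed. The `toy` data frame holds made-up values purely for illustration (only the column names match the inflation data), and the colours and title are my own choices rather than those of the original figure.

```r
library(ggplot2)
library(ggthemes)

# Made-up monthly series standing in for the real inflation data
toy <- data.frame(
  date = seq(as.Date("2005-01-01"), by = "month", length.out = 24),
  INDINF_CPI_M = 2 + sin(seq_len(24) / 3),
  political_party = rep(c("Liberal", "Conservative"), times = c(13, 11)),
  stringsAsFactors = FALSE
)

# One continuous line, coloured by the party in power at each date
p <- ggplot(toy, aes(x = date, y = INDINF_CPI_M, colour = political_party)) +
  geom_line(aes(group = 1)) +
  scale_colour_manual(values = c(Liberal = "red", Conservative = "blue")) +
  labs(title = "Inflation rate by governing party (toy data)",
       colour = "Party in power") +
  theme_fivethirtyeight()

p
```

Swapping `toy` for `inflation_data` should give the same kind of chart on the real series.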
Since this analysis is just to complement the data, a formal analysis has not been conducted. Some things to consider would be:
- Looking at the time series decomposition of the data to account for seasonality and to examine the trend component.
- Testing whether inflation under Conservative and Liberal leadership is the same or not.
- Lagging inflation by two years, as was suggested to me on the R Discord server, to give a new government time to undo or alter the policies of its predecessor.
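As a rough starting point for these three ideas, here is a base R sketch on simulated data. Everything in `toy` is made up (random inflation values and an arbitrary party split), so the numbers mean nothing; the Welch t-test also ignores autocorrelation in the series, so treat it as a first look rather than a formal test.

```r
set.seed(1)
n <- 120  # ten years of made-up monthly data

toy <- data.frame(
  date = seq(as.Date("2000-01-01"), by = "month", length.out = n),
  inflation = 2 + rnorm(n, sd = 0.5),
  political_party = rep(c("Liberal", "Conservative"), each = n / 2),
  stringsAsFactors = FALSE
)

# 1. Classical decomposition of the monthly series; keep the trend component
infl_ts <- ts(toy$inflation, start = c(2000, 1), frequency = 12)
trend <- decompose(infl_ts)$trend

# 2. Pair each month's inflation with the party in power 24 months earlier
lag_months <- 24
toy$party_lagged <- c(rep(NA, lag_months),
                      head(toy$political_party, -lag_months))

# 3. Welch two-sample t-test of inflation between the two lagged party labels
#    (rows with an NA label are dropped by the formula interface)
test <- t.test(inflation ~ party_lagged, data = toy)
test$p.value
```

On the real data, `toy$inflation` would be replaced by INDINF_CPI_M and the party column by political_party from `inflation_data`.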
If you look into any of these questions, let me know. I would love to check out the work and share it as well.
I hope you enjoy this dataset. Be sure to upvote and share around!