Description of the Parliament of Estonia from its website:
The Riigikogu is the parliament of Estonia. Its 101 members are elected at general elections for a term of four years. The Riigikogu passes laws and resolutions, exercises parliamentary supervision and ratifies international agreements.
The Parliament of Estonia meets regularly. In this post we look at how to get session attendance data from the Estonian Government Office API [1] and visualize XIII Riigikogu absence from these sessions. R is used to get and analyze the data.
Special thanks to the Estonian Government Office employees who answered my queries and showed me how to access the data I was interested in. Double thanks for reacting so quickly to fix a problem I was having!
If you are not interested in the R details, feel free to jump to the Results section.
Data
We use the jsonlite package to download parliament votings data. The required fields for accessing votings data are startDate and endDate (see /api/votings). Session attendance votings are marked as type Kohaloleku kontroll (attendance control). Let's download all votings between 1990-01-01 and 2018-09-01:
library(tidyverse)
library(jsonlite)

url <- "https://aavik.riigikogu.ee/api/votings?startDate=1990-01-01&endDate=2018-09-01&lang=et"
# as_tibble() replaces the deprecated as_data_frame()
votings <- fromJSON(url) %>%
  as_tibble()
head(votings)
## # A tibble: 6 x 5
##   uuid           title             membership sittingDateTime  votings
##   <chr>          <chr>                  <int> <chr>            <list>
## 1 04be4e82-f266… Täiskogu korrali…          8 1998-01-22T00:0… <data.fram…
## 2 6176a8ae-89ac… Täiskogu korrali…          8 1998-01-29T00:0… <data.fram…
## 3 bcb34b03-a842… Täiskogu korrali…          8 1998-02-09T00:0… <data.fram…
## 4 cd42db50-d8b7… Täiskogu korrali…          8 1998-02-11T00:0… <data.fram…
## 5 e289a577-8c64… Täiskogu korrali…          8 1998-02-18T00:0… <data.fram…
## 6 1ea1246f-056f… Täiskogu korrali…          8 1998-02-25T00:0… <data.fram…
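If you want to query other date ranges, the query string can be assembled from its parts. The following is only a sketch: votings_url() is a hypothetical helper that is not part of the original code, and it assumes the endpoint accepts the same three parameters (startDate, endDate, lang) used in the URL above.

```r
# Hypothetical helper (not in the original post): build the votings URL
# from a date range instead of hard-coding the query string.
votings_url <- function(start_date, end_date, lang = "et") {
  # as.Date() stops early on malformed dates; format() normalises the output
  start_date <- format(as.Date(start_date), "%Y-%m-%d")
  end_date <- format(as.Date(end_date), "%Y-%m-%d")
  paste0(
    "https://aavik.riigikogu.ee/api/votings",
    "?startDate=", start_date,
    "&endDate=", end_date,
    "&lang=", lang
  )
}

# Reproduces the URL used in this post
votings_api_url <- votings_url("1990-01-01", "2018-09-01")
```

The resulting string can be passed to fromJSON() exactly like the hard-coded URL.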
Voting type information lies in the votings list-column. Using str() we see that each votings data.frame also has some list-columns of its own. Since we do not need these list-columns, let's remove them to get a nice flat data.frame structure.
str(votings$votings[[1]])
## 'data.frame':    1 obs. of  15 variables:
##  $ uuid           : chr "c80bec7e-58cb-347e-ba2b-6511cf1b9b8d"
##  $ votingNumber   : int 1
##  $ type           :'data.frame': 1 obs. of  2 variables:
##   ..$ code : chr "AVALIK"
##   ..$ value: chr "Avalik"
##  $ description    : logi NA
##  $ startDateTime  : chr "1998-01-22T10:42:00"
##  $ endDateTime    : logi NA
##  $ present        : int 75
##  $ absent         : int 26
##  $ inFavor        : int 66
##  $ against        : int 3
##  $ neutral        : int 6
##  $ abstained      : int 26
##  $ relatedDraft   : logi NA
##  $ relatedDocument: logi NA
##  $ _links         :'data.frame': 1 obs. of  1 variable:
##   ..$ self:'data.frame': 1 obs. of  1 variable:
##   .. ..$ href: chr "http://aavik.riigikogu.ee/api/votings/c80bec7e-58cb-347e-ba2b-6511cf1b9b8d"

# keep only the non-list columns of every nested data.frame, then unnest
votings <- votings %>%
  mutate(votings = map(votings, function(x) {
    x[, map_chr(x, typeof) != "list"]
  })) %>%
  unnest(votings)
head(votings)
## # A tibble: 6 x 17
##   uuid  title membership sittingDateTime uuid1 votingNumber description
##   <chr> <chr>      <int> <chr>           <chr>        <int> <chr>
## 1 04be… Täis…          8 1998-01-22T00:… c80b…            1 <NA>
## 2 6176… Täis…          8 1998-01-29T00:… 43cc…            3 1. parandus
## 3 6176… Täis…          8 1998-01-29T00:… 2b57…            2 660
## 4 6176… Täis…          8 1998-01-29T00:… d729…            4 2. parandus
## 5 6176… Täis…          8 1998-01-29T00:… fb1e…            1 1. parandus
## 6 bcb3… Täis…          8 1998-02-09T00:… 2136…            1 Päevakorra…
## # ... with 10 more variables: startDateTime <chr>, endDateTime <chr>,
## #   present <int>, absent <int>, inFavor <int>, against <int>,
## #   neutral <int>, abstained <int>, relatedDraft <lgl>,
## #   relatedDocument <lgl>
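To see why the typeof() test is enough to drop nested columns, here is a small base R sketch on a made-up one-row record. The toy data frame and its values are assumptions for illustration only; they merely imitate the shape of the API response shown above.

```r
# Toy one-row record imitating a nested API response (made-up values)
toy <- data.frame(uuid = "abc", present = 75L, stringsAsFactors = FALSE)
# assigning a data.frame to a column creates a nested data.frame column
toy$type <- data.frame(code = "AVALIK", value = "Avalik")

# typeof() reports "list" for nested data.frames and for plain list-columns
col_types <- vapply(toy, typeof, character(1))

# keep only the flat columns; drop = FALSE guards the single-column case
flat <- toy[, col_types != "list", drop = FALSE]
names(flat)
```

Only uuid and present survive, because the nested type column is stored as a list internally.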
The voting type information is in the description column:
votings %>%
  count(description) %>%
  arrange(-n) %>%
  head()
## # A tibble: 6 x 2
##   description                   n
##   <chr>                     <int>
## 1 Lõpphääletus               3245
## 2 Kohaloleku kontroll        3150
## 3 Päevakorra kinnitamine      573
## 4 Lükata tagasi               427
## 5 Läbirääkimiste lôpetamine   376
## 6 1. parandus                 355
To get detailed information about a voting we use the get_voting() function. While downloading the details we also save each voting to disk, so that we can easily reuse it later without calling the API again.
get_voting <- function(uuid = NULL) {
  # uuid: voting id
  # each voting is downloaded and saved only once
  # expects a data/voting-details folder in your working directory
  files <- dir("data/voting-details/") # detailed votings local directory
  if (paste0(uuid, ".rds") %in% files) {
    y <- readRDS(paste0("data/voting-details/", uuid, ".rds"))
  } else {
    url <- glue::glue("https://aavik.riigikogu.ee/api/votings/{uuid}?lang=et")
    # wait 5 to 7 seconds between requests (the jitter belongs inside Sys.sleep)
    Sys.sleep(5 + runif(1, max = 2))
    # return NULL instead of stopping when a download fails
    y <- tryCatch(jsonlite::fromJSON(url), error = function(e) NULL)
    if (!is.null(y)) {
      saveRDS(y, paste0("data/voting-details/", uuid, ".rds"))
    } else {
      message("Could not download voting ", uuid)
    }
  }
  y
}

ids <- votings$uuid1[votings$description == "Kohaloleku kontroll"]
map(ids, get_voting)
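The download-once-then-read-from-disk idea can be tried without touching the API. The sketch below is an illustration under stated assumptions: cached_fetch() and fake_fetch() are hypothetical names, and the fake fetcher simply stands in for the real download call.

```r
# Hypothetical generic helper: fetch_fun stands in for any slow download
cached_fetch <- function(key, fetch_fun, cache_dir) {
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path)) # cache hit: nothing is downloaded
  }
  result <- tryCatch(fetch_fun(key), error = function(e) NULL)
  if (!is.null(result)) {
    saveRDS(result, path) # only successful downloads are cached
  }
  result
}

# Demo with a fake fetcher that counts how often it is called
cache_dir <- file.path(tempdir(), "voting-details-demo")
unlink(cache_dir, recursive = TRUE) # start from an empty cache
dir.create(cache_dir, showWarnings = FALSE)

n_calls <- 0
fake_fetch <- function(key) {
  n_calls <<- n_calls + 1
  list(uuid = key)
}

first <- cached_fetch("demo-id", fake_fetch, cache_dir)
second <- cached_fetch("demo-id", fake_fetch, cache_dir) # read from disk
```

After both calls the fetcher has run only once, which is what keeps repeated runs of the script from hitting the API again.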
After getting and saving all the necessary attendance check data, we use the var_faction() and var_decision() functions and some mutating to make the attendance data more accessible:
var_faction <- function(x) {
  # replace the nested faction data.frame with the faction name
  faction <- x$faction$name
  x$faction <- faction
  x
}

var_decision <- function(x) {
  # replace the nested decision data.frame with the decision value
  decision <- x$decision$value
  x$decision <- decision
  x
}

files <- dir("data/voting-details/", full.names = TRUE)

attendance <- map_df(files, function(f) {
  x <- readRDS(f)
  x$voters %>%
    select(-`_links`, -lastName, -firstName) %>%
    var_faction() %>%
    var_decision() %>%
    as_tibble() %>%
    mutate(
      votingNumber = x$votingNumber,
      startDatetime = lubridate::as_datetime(x$startDateTime)
    )
}) %>%
  left_join(votings %>% select(uuid = uuid1, description), by = "uuid") %>%
  filter(description == "Kohaloleku kontroll")
head(attendance)
## # A tibble: 6 x 6
##   fullName   active faction      decision votingNumber startDatetime
##   <chr>      <lgl>  <chr>        <chr>           <int> <dttm>
## 1 Rein Aidma FALSE  Eesti Refor… kohal            6518 2007-05-16 14:05:51
## 2 Jaak Aab   FALSE  <NA>         kohal            6518 2007-05-16 14:05:51
## 3 Peep Aru   TRUE   Eesti Refor… kohal            6518 2007-05-16 14:05:51
## 4 Hannes As… FALSE  <NA>         kohal            6518 2007-05-16 14:05:51
## 5 Meelis At… FALSE  <NA>         kohal            6518 2007-05-16 14:05:51
## 6 Ivi Eenmaa FALSE  <NA>         kohal            6518 2007-05-16 14:05:51
By now we have all the necessary data. The decision values kohal (present) and poolt (in favour) mean that the parliament member attended the meeting; puudub (absent) means that the member did not attend.
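To make the coding concrete, here is a toy base R sketch on made-up data (the member names A, B, C and their decisions are invented for illustration). It assumes the rule used in this post: a member counts as present for a day if at least one attendance check that day is kohal or poolt.

```r
# Made-up attendance checks: two checks on one day for three members
checks <- data.frame(
  fullName = c("A", "A", "B", "B", "C", "C"),
  date = as.Date("2015-03-30"),
  decision = c("kohal", "puudub", "puudub", "puudub", "poolt", "kohal"),
  stringsAsFactors = FALSE
)

# a single check counts as attended when the decision is kohal or poolt
checks$attended <- checks$decision %in% c("kohal", "poolt")

# present for the day = attended at least one check on that day
daily <- aggregate(attended ~ fullName + date, data = checks, FUN = any)
daily
```

Member A missed one of two checks but still counts as present, while member B missed both and is the only one counted as absent.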
Helpers
We use the following helper functions to make data munging and plotting easier:
dat_attendace_roll_mean <- function(x = NULL, n = 1, from = "1970-01-01", to = Sys.Date()) {
  # member has to be present (attend) in at least 1 attendance check to be
  # counted as present for the day
  #
  # x: attendance data
  # n: number of meeting days in the rolling window
  from <- as.Date(from)
  to <- as.Date(to)
  x %>%
    mutate(date = lubridate::date(startDatetime)) %>%
    filter(date >= from & date <= to) %>%
    count(date, fullName, decision) %>%
    group_by(date) %>%
    summarise(
      n_present = sum(decision %in% c("kohal", "poolt")),
      n_total = n_distinct(fullName)
    ) %>%
    ungroup() %>%
    mutate(p_present = n_present / n_total) %>%
    mutate(
      p_roll_mean = RcppRoll::roll_sumr(n_present, n = n, na.rm = TRUE) /
        RcppRoll::roll_sumr(n_total, n = n, na.rm = TRUE)
    ) %>%
    ungroup()
}

plot_attendance_roll_mean <- function(x = NULL, n = 1, from = "1970-01-01", to = Sys.Date()) {
  # plotting function for making the results part easier to read
  #
  # x: attendance data
  # n: number of meeting days in the rolling window
  gg_data1 <- dat_attendace_roll_mean(x = x, n = n, from = from, to = to)

  # first and last point with a defined rolling mean
  first <- gg_data1 %>%
    filter(!is.na(p_roll_mean)) %>%
    filter(date == min(date))
  last <- gg_data1 %>%
    filter(!is.na(p_roll_mean)) %>%
    filter(date == max(date))

  gg_data1 %>%
    ggplot(aes(date, 1 - p_roll_mean)) +
    geom_line() +
    geom_point(data = first, color = "white", size = 2) +
    geom_point(data = last, color = "white", size = 2) +
    hrbrthemes::theme_modern_rc(grid = "Y", plot_title_size = 14,
                                subtitle_size = 11,
                                plot_margin = ggplot2::margin(30, 30, 10, 30)) +
    scale_y_continuous(labels = function(x) scales::percent(x, 1),
                       limits = c(.12, .22), breaks = seq(.12, .24, .03)) +
    labs(x = NULL, y = NULL,
         title = "Absence from XIII Riigikogu sessions has increased",
         subtitle = glue::glue("{n} meeting days rolling average absence"),
         caption = "\nSource: https://aavik.riigikogu.ee/api/votings/ \n \nTheme: modern_rc from {hrbrthemes}") +
    # col_faction() is not defined in this post and no colour aesthetic is
    # mapped, so the colour scale is left commented out:
    # scale_color_manual(values = col_faction()) +
    theme(
      legend.title = element_blank(),
      text = element_text(family = "Helvetica")
    ) +
    geom_vline(xintercept = as.Date(c("2015-03-30", "2016-11-23", "2017-09-10")),
               lty = 2, alpha = .5, color = "#8e8e93") +
    annotate("text", label = "new government\ntook the oath",
             x = as.Date("2016-11-23") - 20, y = .195, size = 3, hjust = 1) +
    annotate("text", label = "end of 2017 \nsummer break",
             x = as.Date("2017-09-10") + 20, y = .135, size = 3, hjust = 0) +
    annotate("text", label = "2015-03-30 \nXIII Riigikogu \nfirst session",
             x = as.Date("2015-03-30") + 20, y = .165, size = 3, hjust = 0) +
    annotate("text", label = scales::percent(1 - first$p_roll_mean),
             x = first$date - 70, y = .134, size = 3.5, color = "white") +
    annotate("text", label = scales::percent(1 - last$p_roll_mean),
             x = last$date + 70, y = .215, size = 3.5, color = "white") +
    scale_x_date(limits = as.Date(c("2015-01-01", "2018-10-01")))
}
Results
As you read on, keep in mind that the attendance check information only shows who was present or absent at the moment of the attendance check, not during the whole day of the sitting. In the following, a member is considered absent if (s)he has missed all attendance checks of the day.
The XIII Riigikogu election was held on 1 March 2015, and the first XIII Riigikogu session took place on 30 March 2015. Six factions gained enough votes to get seats in the Riigikogu. The following plot gives a glimpse of absence from XIII Riigikogu sessions.
plot_attendance_roll_mean(x = attendance, n = 90, from = "2015-03-30")
Notes:
- A member is considered absent for a day if (s)he has missed all attendance checks of that day.
- The 90 meeting days rolling average absence is calculated by summing all absences over those 90 days (e.g. 15 members absent on day one, 8 members absent on day two, and so on) and dividing by the total number of possible attendances (since the Estonian parliament has 101 members, that would be 90 x 101).
- The 90th meeting day of the XIII Riigikogu was 2016-01-21, which makes it the first day for which the 90 days rolling average is calculated. As of writing, the last data point is from 2018-06-14.
- On 2016-11-23 Jüri Ratas' cabinet took the oath. It was preceded by the second cabinet of Taavi Rõivas (9 April 2015 to 22 November 2016 [2]), which ended when the Social Democrats and the Union of Pro Patria and Res Publica joined the opposition's no-confidence vote against it. [3]
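The rolling average described in the notes can be checked by hand on a small example. The sketch below uses made-up absence counts and a window of 3 days instead of 90; it uses stats::filter() from base R rather than the RcppRoll function used in the helper above, so treat it as an illustration of the arithmetic, not as the exact code behind the plot.

```r
# Made-up numbers of absent members on five consecutive meeting days
n_absent <- c(15, 8, 12, 10, 20)
n_total <- rep(101, length(n_absent)) # 101 seats in the Riigikogu
n <- 3                                 # window length (the post uses 90)

# Right-aligned rolling sum; the first n - 1 values are NA.
# stats::filter is called explicitly because dplyr masks filter().
roll_sum <- function(v, n) {
  as.numeric(stats::filter(v, rep(1, n), sides = 1))
}

# rolling share of absences = absences in window / possible attendances in window
p_absent <- roll_sum(n_absent, n) / roll_sum(n_total, n)
round(p_absent, 3)
```

The third value is (15 + 8 + 12) / (3 x 101), the first one for which a full window of three days is available.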
FIN
In this post we saw how to download Estonian parliament session attendance data using the Government Office API. The same approach can be used to get other votings data. There are obviously many more interesting topics one can look into using data from the API. Since the main focus of this post was to get a first feel for the API and the data, we leave further analysis for other posts.
If you happen to use the API or have any questions, please leave a comment. I am curious about your take on the data! 🙂
[1] As of writing this post, the API is in a demo/test phase.