Du Bois Visualization Challenge 04
Recreating the data visualization of W.E.B. Du Bois from the 1900 Paris Exposition using modern tools. See the challenge presentation.
This week, the data provided[1] don’t match the visualization, so we are free to be more creative…
We’ll try to show who profited from the slave trade by looking at the origin of the ships involved.
Setup
    library(tidyverse)
    library(tidygeocoder)
    library(leaflet)

    # avoid scientific notation when formatting large numbers
    options(scipen = 100)

    # compute mode (from https://stackoverflow.com/a/45216553)
    stat_mode <- function(x, return_multiple = FALSE, na.rm = FALSE) {
      if (na.rm) {
        x <- na.omit(x)
      }
      ux <- unique(x)
      freq <- tabulate(match(x, ux))
      mode_loc <- if (return_multiple) which(freq == max(freq)) else which.max(freq)
      return(ux[mode_loc])
    }
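As a quick sanity check of the helper above, the sketch below runs it on small made-up vectors (the values are illustrative, not taken from the dataset). The definition is repeated only so that the snippet runs on its own.

```r
# Same helper as in the setup, repeated so this snippet is self-contained
stat_mode <- function(x, return_multiple = FALSE, na.rm = FALSE) {
  if (na.rm) {
    x <- na.omit(x)
  }
  ux <- unique(x)
  freq <- tabulate(match(x, ux))
  mode_loc <- if (return_multiple) which(freq == max(freq)) else which.max(freq)
  ux[mode_loc]
}

# Made-up home ports recorded for one ship, with a missing value
toy_ports <- c("Nantes", NA, "Liverpool", "Nantes")
mode_single <- stat_mode(toy_ports, na.rm = TRUE)

# With a tie, the default keeps the first mode encountered
tied <- c("Nantes", "Liverpool", "Liverpool", "Nantes")
mode_first <- stat_mode(tied)
mode_all <- stat_mode(tied, return_multiple = TRUE)

# An all-missing input yields a single NA, so nothing gets imputed
mode_none <- stat_mode(c(NA_character_, NA_character_), na.rm = TRUE)
```

The last case matters for the imputation step: a ship whose home port is never recorded simply keeps a missing port instead of raising an error.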
Data
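We impute the missing values ship by ship (the most frequent home port and the median number of slaves arrived), clean the port names so that they can be geocoded, then sum the number of slaves by the ships’ home port.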
    data_04 <- read_csv("routes.csv", na = "NA")

    ports <- data_04 |>
      group_by(ship_name) |>
      mutate(
        # impute a missing home port with the ship's most frequent one
        port_origin = if_else(
          is.na(port_origin),
          stat_mode(port_origin, na.rm = TRUE),
          port_origin),
        # drop the "unspecified" suffixes
        port_geo = str_replace(
          port_origin,
          ", port unspecified|, colony unspecified|, location unspecified",
          ""),
        # recode names that the geocoder would miss or misplace
        port_geo = case_match(
          port_geo,
          "Southeast Brazil" ~ "Rio de Janeiro",
          "Princes Island" ~ "Sao Tome",
          "Para" ~ "Belém",
          "Lyme" ~ "Lyme Regis",
          "Les Sables" ~ "Les Sables d'Olonnes",
          "Charlestown" ~ "Boston",
          "Cabanas" ~ "Cabañas, Cuba",
          "Camaret" ~ "Camaret-sur-Mer",
          "Goree" ~ "Gorée",
          "Saint-Louis" ~ "Saint-Louis, Saint-Louis, Sénégal",
          "Montrose" ~ "Montrose, Scotland",
          "Salem" ~ "Salem, Massachusetts",
          "Stockton" ~ "Newcastle",
          "Cardenas" ~ "Cárdenas, Cuba",
          "Newbury" ~ "Newbury, Massachusetts",
          "Norfolk" ~ "Norfolk, Virginia",
          "Portuguese Guinea" ~ "Guinea-Bissau",
          "Ilho do Fayal" ~ "Azores",
          "Lancaster" ~ "Lancaster, UK",
          "British Americas" ~ "New England",
          "Warren" ~ "Warren, Rhode Island",
          "St. Thomas" ~ "United States Virgin Islands",
          "Danish West Indies" ~ "United States Virgin Islands",
          "Mediterranean coast (France)" ~ "Marseille",
          "Sao Tome or Princes Island" ~ "Sao Tome",
          "Spanish Caribbean, unspecified" ~ "Havana",
          "Spanish Circum-Caribbean,unspecified" ~ "Havana",
          "Catuamo and Maria Farinha" ~ "Pernambuco",
          .default = port_geo),
        # impute missing arrivals with the ship's median
        n_slaves_arrived = if_else(
          is.na(n_slaves_arrived),
          round(median(n_slaves_arrived, na.rm = TRUE)),
          n_slaves_arrived)) |>
      group_by(port_geo) |>
      summarise(n_slaves = sum(n_slaves_arrived, na.rm = TRUE)) |>
      filter(n_slaves > 0) |>
      drop_na(port_geo) |>
      arrange(desc(n_slaves))
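To make the two name-cleaning steps easier to follow, here is a minimal sketch on a handful of labels. The `raw` vector is made up in the style of the `port_origin` column, and only two of the recodings are reproduced. It assumes dplyr 1.1.0 or later, where `case_match()` is available.

```r
library(dplyr)
library(stringr)

# Made-up labels in the style of the port_origin column
raw <- c("Liverpool", "Para", "Bahia, port unspecified", "Lyme")

# Step 1: strip the "unspecified" suffixes
stripped <- str_replace(
  raw,
  ", port unspecified|, colony unspecified|, location unspecified",
  ""
)

# Step 2: recode a few names; .default keeps every other value unchanged
cleaned <- case_match(
  stripped,
  "Para" ~ "Belém",
  "Lyme" ~ "Lyme Regis",
  .default = stripped
)

print(cleaned)
```

Without `.default`, every value not listed on a left-hand side would become `NA`, which is why the pipeline passes the column itself as the fallback.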
Then we geocode:
    ports_geo <- ports |>
      geocode(port_geo, method = "osm")
Although I did clean the data, some errors in geocoding may persist.
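One way to look for such errors is sketched below. It is not part of the original workflow: `toy_geo` is a hypothetical stand-in for `ports_geo` with made-up numbers, and the longitude band is an arbitrary assumption. The sketch relies only on the fact that `geocode()` adds `lat` and `long` columns by default and leaves them `NA` when nothing is found.

```r
library(dplyr)

# Hypothetical stand-in for ports_geo (made-up counts and coordinates)
toy_geo <- tibble(
  port_geo = c("Liverpool", "Nantes", "New England", "Unknown port"),
  n_slaves = c(1000, 800, 50, 10),
  lat      = c(53.41, 47.22, 43.97, NA),
  long     = c(-2.98, -1.55, 99.00, NA)
)

# Ports the geocoder could not resolve at all
not_found <- toy_geo |>
  filter(is.na(lat) | is.na(long))

# Ports resolved to an implausible place; the band is an assumption
suspicious <- toy_geo |>
  filter(!is.na(long), long < -100 | long > 60)

print(not_found)
print(suspicious)
```

Rows caught by either filter are candidates for an extra entry in the recoding step.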
Map
    ports_geo |>
      leaflet() |>
      addTiles() |>
      addCircleMarkers(
        radius = ~ sqrt(n_slaves) / 50,
        popup = ~ paste0("<strong>", port_geo, "</strong><br />",
                         format(n_slaves, big.mark = ","),
                         " slaves shipped"))
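The square root in the radius makes the area of each circle, rather than its diameter, proportional to the count. The short base R sketch below uses made-up totals to show the scaling and the effect of the `scipen` option set in the setup.

```r
# Made-up totals, for illustration only
n_slaves <- c(10000, 40000, 1000000)

# Square-root scaling: four times the count gives twice the radius
radius <- sqrt(n_slaves) / 50

# Same option as in the setup; with R's default, large round numbers
# are formatted in scientific notation instead
options(scipen = 100)
label <- paste0(format(1000000, big.mark = ","), " slaves shipped")

print(radius)
print(label)
```

The divisor 50 only sets the overall marker size in pixels; it does not change the relative sizes.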
Footnotes
[1] The dataset seems to come from https://www.slavevoyages.org/voyage/database.