[This article was first published on r.iresmi.net, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Day 30 of 30DayMapChallenge: « The final map » (previously).
< section id="creating-a-raw-polygon-layer-of-french-postal-codes-from-points" class="level1">Creating a raw polygon layer of french postal codes from points
This data, although not a “real” administrative limit, is often used in many different applications, see for example the BNV-D.
There is no free, up-to-date, layer of polygon boundaries for french postal codes.
- The official database is just a table giving a postal code for each commune code, optionally with point coordinates. We have some postal codes shared by several communes and some communes have several postal codes (at the same point location).
- Using the national address base (BAN) we can compute the convex hulls: Contours calculés des zones codes postaux. Interesting method but old (2021) and with many holes/overlaps
- Fond de carte des codes postaux is nice (although métropole only) but old (2013).
Config
library(readr) library(dplyr) library(tidyr) library(stringr) library(purrr) library(ggplot2) library(glue) library(janitor) library(sf)
Data
# Postal codes as point by commune cp_points <- read_csv("https://datanova.laposte.fr/data-fair/api/v1/datasets/laposte-hexasmal/metadata-attachments/base-officielle-codes-postaux.csv", name_repair = make_clean_names) |> separate(geopoint, into = c("lat", "lon"), sep = ",", convert = TRUE) |> drop_na(lon, lat) |> st_as_sf(coords = c("lon", "lat"), crs = "EPSG:4326") |> filter(str_sub(code_commune_insee, 1, 3) < "987") |> mutate(proj = case_when(str_sub(code_commune_insee, 1, 3) %in% c("971", "972", "977", "978") ~ "EPSG:5490", str_sub(code_commune_insee, 1, 3) == "973" ~ "EPSG:2972", str_sub(code_commune_insee, 1, 3) == "974" ~ "EPSG:2975", str_sub(code_commune_insee, 1, 3) == "976" ~ "EPSG:4471", .default = "EPSG:2154")) # France limits, to clip voronoi polygons fr <- read_sf("https://static.data.gouv.fr/resources/admin-express-cog-simplifiee-2024-metropole-drom-saint-martin-saint-barthelemy/20240930-094021/adminexpress-cog-simpl-000-2024.gpkg", layer = "departement") |> mutate(terr = if_else(insee_reg > "06", "fx", insee_reg)) |> group_by(terr) |> summarise() # Communes limits and population to give the name of the postal code as the # biggest commune (by pop) com <- read_sf("https://static.data.gouv.fr/resources/admin-express-cog-simplifiee-2024-metropole-drom-saint-martin-saint-barthelemy/20240930-094021/adminexpress-cog-simpl-000-2024.gpkg", layer = "commune")
Processing
We will create voronoï polygons from grouped postal codes points.
st_rename_geom <- function(x, name) { names(x)[names(x) == attr(x, "sf_column")] <- name st_geometry(x) <- name return(x) } # We need to use a projection for each territory to avoid geometry errors voronoi <- function(df, k) { df_proj <- df |> st_transform(pull(k, proj)) df_proj |> st_union() |> st_voronoi() |> st_cast() |> st_intersection(st_transform(fr, pull(k, proj))) |> st_sf() |> st_rename_geom("geom") |> st_join(df_proj) |> group_by(code_postal) |> summarise() |> st_transform("EPSG:4326") } # get the names of the main town of each postal code noms <- cp_points |> st_join(com |> select(insee_com, nom, population)) |> group_by(code_postal) |> slice_max(population, n = 1, with_ties = FALSE) |> select(code_postal, nom) |> st_drop_geometry() # create the voronoï for each territory cp_poly <- cp_points |> group_by(proj) |> group_modify(voronoi) |> ungroup() |> select(-proj) |> st_sf() |> left_join(noms, join_by(code_postal))
Map
An extract only on métropole:
cp_poly |> filter(str_sub(code_postal, 1, 2) < "97") |> st_transform("EPSG:2154") |> ggplot() + geom_sf(fill = "#eeeeee", color = "#bbbbbb", linewidth = .1) + labs(title = "Codes postaux", subtitle = "France métropolitaine - 2024", caption = glue("https://r.iresmi.net/ - {Sys.Date()} data: La Poste")) + theme_void() + theme(plot.caption = element_text(size = 6, color = "darkgrey"))
Export
cp_poly |> st_write("codes_postaux_fr_2024.gpkg", delete_layer = TRUE, quiet = TRUE, layer_options = c("IDENTIFIER=Codes postaux France 2024", glue("DESCRIPTION=Métropole + DROM WGS84. d'après données La Poste + IGN Adminexpress https://r.iresmi.net/ - {Sys.Date()}")))
Get the file: polygones des codes postaux français 2024 (geopackage WGS84) (3 MB).
Caution
- Métropole + DROM (no Polynesia, New caledonia,…)
- Some polygons parts (434) cover other polygons in communes where several postal codes are present.
- 4 NAs (multipolygons without postal codes) are present covering territories outside the voronoï polygons.
- No postal code for St-Barth/St-Martin?
To leave a comment for the author, please follow the link and comment on their blog: r.iresmi.net.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.