[This article was first published on r - Brandon Bertelsen, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
NOTE: This document does not provide a complete list of FSAs, so it’s only useful for the scraping example. See Part Dieux for a more complete listing based on the economic region in question.
There’s no official database that I can find that clearly defines the economic areas in Montreal, at least none that is free. However, Wikipedia does seem to be rather well organized in this regard. Below is a small-scale scraping example that identifies the FSAs (forward sortation areas, the first three characters of a Canadian postal code) for a particular locale: Montérégie, or “Rive-Sud”, an affluent part of the greater Montreal area.
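library(rvest)

# Collect the href of every link in the three-column list on the South Shore article
links <- read_html("https://en.wikipedia.org/wiki/South_Shore_(Montreal)") %>%
  html_node(css = ".column-count-3") %>%
  html_nodes("a") %>%
  html_attr("href")

# The hrefs are relative, so prepend the site root
links <- paste0("http://en.wikipedia.org", links)

# Visit each linked page and keep the text of its first ".adr" node
listing <- list()
for (link in links) {
  listing[[link]] <- read_html(link) %>%
    html_node(".adr") %>%
    html_text()
}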
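To see what the loop body does on a single page without hitting the network, here is a minimal sketch that runs the same `.adr` selector against two inline HTML snippets. The snippets and the helper name `extract_adr` are hypothetical stand-ins, not actual Wikipedia markup. It also shows that a page without an `.adr` node yields `NA` rather than an error.

```r
library(rvest)

# Hypothetical stand-ins for two fetched pages: one has a node with
# class "adr", the other does not.
with_adr <- read_html(
  "<html><body><span class='adr'>J4B, J4W to J4Z</span></body></html>"
)
without_adr <- read_html(
  "<html><body><p>No address block here</p></body></html>"
)

# Same selector and extraction as in the loop: first ".adr" node, as text.
extract_adr <- function(page) {
  page %>% html_node(".adr") %>% html_text()
}

found  <- extract_adr(with_adr)     # the text inside the .adr node
absent <- extract_adr(without_adr)  # NA, because no node matched
```

Because a missing node becomes `NA`, the cleaning step has to drop missing values before trimming and truncating the codes.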
So far so good, but it looks like a bit of extra cleaning will be required.
> listing %>% unlist() %>% as.character %>% strsplit(",") %>% unlist()
 [1] "J3G to J3H"  "J4B"         "J4W to J4Z"  "J5R"         "J3L"
 [6] "J3L"         "J6J"         " J6K"        "J5B"         "J0L1B0"
[11] NA            "J5R"         "J3Y"         " J3Z"        " J4G to J4N"
[16] " J4T"        " J4V"        "J4V"         "J3Y"         " J3Z"
[21] " J4T"        NA            NA            "J3G 6N9"     "J3H"
[26] "J3H 2M6"     "J3L"         "J0L 1N0"     "J3N"         "J3V"
[31] "J5A"         "J0L 2A0"     "J3E"         "J5C"         "J4P"
[36] " J4R"        " J4S"        "J3L 6Z5"     "J0L 2K0"     "J3X"

# Strip leading and trailing whitespace
trim <- function(x) gsub("^\\s+|\\s+$", "", x)

listing %>%
  unlist() %>%
  as.character %>%
  strsplit(",") %>%               # one entry per comma-separated code
  unlist() %>%
  strsplit("to") %>%              # split ranges such as "J4W to J4Z" into their two ends
  unlist %>%
  na.omit() %>%                   # drop pages that had no ".adr" node
  trim %>%
  substr(start = 1, stop = 3)     # keep only the FSA (first three characters)
And we have a list of FSAs for Montérégie pulled from what we hope is a good resource:
J3G J3H J4B J4W J4Z J5R J3L J3L J6J J6K J5B J0L J5R J3Y J3Z J4G J4N J4T J4V J4V J3Y J3Z J4T J3G J3H J3H J3L J0L J3N J3V J5A J0L J3E J5C J4P J4R J4S J3L J0L J3X
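The list above still contains duplicates, and a range such as "J4W to J4Z" contributes only its two endpoints. A minimal base R sketch of one way to expand the ranges and deduplicate follows. The helper `expand_fsa` and the sample vector `raw` are hypothetical, and the sketch assumes both ends of a range share their first two characters and that every code in between is actually in use, which is not verified here.

```r
# Letters that Canada Post does not use in postal codes.
unused <- c("D", "F", "I", "O", "Q", "U")
valid_letters <- setdiff(LETTERS, unused)

# Expand one scraped entry into FSAs. A single code is truncated to its
# first three characters; "AAA to BBB" becomes every FSA between the ends.
# Assumption: both ends of a range share their first two characters.
expand_fsa <- function(entry) {
  ends <- substr(trimws(strsplit(entry, "to", fixed = TRUE)[[1]]), 1, 3)
  if (length(ends) == 1) return(ends)
  prefix <- substr(ends[1], 1, 2)
  from <- match(substr(ends[1], 3, 3), valid_letters)
  to   <- match(substr(ends[2], 3, 3), valid_letters)
  stopifnot(!is.na(from), !is.na(to))
  paste0(prefix, valid_letters[from:to])
}

# A few entries in the same shape as the scraped output (illustrative only).
raw <- c("J3G to J3H", " J4G to J4N", "J3L 6Z5", NA, "J3L")

# Drop missing values, expand each entry, then deduplicate and sort.
fsas <- sort(unique(unlist(lapply(raw[!is.na(raw)], expand_fsa))))
print(fsas)
```

Whether every expanded code belongs to the region still needs checking against a better source, which is why this remains a sketch.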