A short post today, based on recent work by @3wen (Ewen Galic, a graduate student in Rennes, spending a year in Montreal). Since we were working on a detailed French dataset (per commune), we needed a dataset containing a list of all communes, with population and location. GPS coordinates were extracted from Google, using a PHP file inspired by the Google geocoding API with PHP webpage on http://www.andrew-kirkpatrick.com/. Population was interpolated from INSEE's datasets, i.e. http://www.insee.fr/ (since the data cover a 35-year period, from 1975 to 2010, changes such as merges and splits of cities have been taken into account as carefully as possible, based on that description). A spline model has been used for all cities (with three degrees of freedom; null and negative interpolated values were set to one, since we'll be using log-linear models afterwards). Names are from that dataset, still on INSEE's website, http://www.insee.fr/.
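To make the interpolation step more concrete, here is a minimal sketch for a single commune. It is only an illustration under stated assumptions: the census years and counts are made up, and a natural cubic spline fitted with `lm()` and `splines::ns(df = 3)` is one possible reading of "a spline model with three degrees of freedom", not necessarily the exact model used to build the dataset.

```r
library(splines)

# Hypothetical counts for one commune (made-up years and values)
census <- data.frame(
  year = c(1975, 1982, 1990, 1999, 2006, 2010),
  pop  = c(120, 95, 60, 30, 12, 5)
)

# Spline regression with three degrees of freedom (assumed specification)
fit <- lm(pop ~ ns(year, df = 3), data = census)

# Interpolated population for every year from 1975 to 2010
years <- 1975:2010
pred  <- predict(fit, newdata = data.frame(year = years))

# Null or negative interpolated values are set to 1, so that log(pop) is defined
pop_hat <- ifelse(pred <= 0, 1, pred)
names(pop_hat) <- paste0("pop_", years)
```

The resulting vector has one entry per year, named like the `pop_i` columns of the dataset, and all entries are strictly positive, which is what a log-linear model needs.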
A zipped file, popfr19752010.zip, can be downloaded, but the dataset (about 24 MB) can also be read directly with the code below. Since it was hard to find such a dataset online (different files can be found, but we found none with both population and location), we have decided to upload it. Please let us know if there are problems with those data…
> base=read.csv(
+ "http://freakonometrics.free.fr/popfr19752010.csv",
+ header=TRUE)
Using that dataset, it is possible to locate all the communes in metropolitan France, for instance:
> library(maps)
> map("france")
> points(base$long,base$lat,cex=.1,col="red",pch=19)
> points(base$long,base$lat,cex=2*base$pop_2010/
+ max(base$pop_2010),col="blue",pch=19)
The variables in the dataset are:
- reg : INSEE region code (character)
- dep : INSEE département code (character; Corsica is coded 201 and 202 instead of 2A and 2B)
- com : INSEE commune code (character)
- article : article of the commune name (character)
- com_nom : name of the commune (character)
- long : longitude (numeric)
- lat : latitude (numeric)
- pop_i : estimated population at date i (set to 1 if <=0), i=1975,…,2010 (numeric)
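Since the INSEE codes are described as character variables, it may be worth forcing their type when reading the file: by default `read.csv` converts a code such as "01" to the number 1 and drops the leading zero. The sketch below shows the idea on two made-up rows passed through the `text` argument; the column order and the values are assumptions for illustration only, and the same `colClasses` argument can be added to the `read.csv` call on the actual file.

```r
# Two made-up rows in the layout described above (illustrative values only)
csv_text <- 'reg,dep,com,article,com_nom,long,lat,pop_1975,pop_2010
"01","01","001","","EXEMPLE-A",5.0,46.0,250,310
"01","201","004","","EXEMPLE-B",9.0,42.0,1,12'

# Read codes and names as character; the other columns are detected automatically
sample_base <- read.csv(
  text = csv_text, header = TRUE,
  colClasses = c(reg = "character", dep = "character", com = "character",
                 article = "character", com_nom = "character")
)

# Codes keep their leading zeros, coordinates stay numeric
str(sample_base)
```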