inegiR version 1.2
Version 1.2 of inegiR is now on CRAN, so I thought I’d write a few words (a short vignette) about what’s new or different. By the way, I’m writing in English because more people seem to read R-bloggers than my blog (no surprise there); however, the PDF manual and most of the documentation are still in Spanish.
Bug fixes
Thanks to Diego Valle, who reported a slight bug: the less common date frequencies (“bienal” and “decenal”) were not being parsed correctly.
I also added warnings and error handling for cases where the data doesn’t exist for municipalities (issue is here).
New functions
Grids
Thanks to Arturo Cardenas, who unwittingly built a new function for the DENUE part of the package; it is incorporated in this version.
As he wrote in his blog, the DENUE API only allows us to download businesses within a radius of at most 5 kilometers. However, we can get around this limitation by querying the API with a series of coordinates whose circles we know overlap each other, so that together they cover a larger square. This is a picture, taken from that post, detailing what I mean:
Each circle is, of course, 5 km in radius, so the API gives us everything inside it.
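To see why the circles have to overlap, a rough back-of-the-envelope calculation helps. The sketch below assumes the query centers sit on a square lattice and uses the common approximation of about 111.32 km per degree of latitude; both are my assumptions for illustration, not something taken from the package.

```r
# Rough sketch: how far apart can query centers be so that circles of a
# given radius still cover every point of a square lattice?
radius_km <- 5  # maximum radius the API allows, as stated above

# The point farthest from any center is the middle of a lattice cell,
# at a distance of spacing * sqrt(2) / 2. Full coverage needs that
# distance to be at most the radius, so spacing <= radius * sqrt(2).
max_spacing_km <- radius_km * sqrt(2)

# Approximate conversion to degrees of latitude
# (assumption: roughly 111.32 km per degree).
km_per_degree_lat <- 111.32
max_spacing_deg <- max_spacing_km / km_per_degree_lat

round(max_spacing_km, 2)
round(max_spacing_deg, 4)
```

In other words, centers spaced a bit more than 7 km apart would leave small gaps between circles, so a somewhat tighter spacing is the safer choice.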
The hacer_grid() function helps in this process: if we supply two corners as latitude and longitude, it creates a data.frame with a series of coordinates that form a grid like the one in the image.
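To make the idea concrete, here is a minimal sketch of how such a grid could be built in base R. The helper sketch_grid() and its step_deg argument are hypothetical names of mine; this is not the code inside hacer_grid(), just one way to get a lattice of centers between two corners.

```r
# Hypothetical helper (NOT the package's hacer_grid()): build a lattice of
# query centers between two opposite corners of a rectangle.
# step_deg = 0.06 is an illustrative default, roughly 6 to 7 km at the
# latitude of Monterrey.
sketch_grid <- function(lat1, lon1, lat2, lon2, step_deg = 0.06) {
  lat_range <- range(c(lat1, lat2))
  lon_range <- range(c(lon1, lon2))

  # Evenly spaced values starting from the smaller coordinate
  lats <- seq(lat_range[1], lat_range[2], by = step_deg)
  lons <- seq(lon_range[1], lon_range[2], by = step_deg)

  # Make sure the far edges of the rectangle get a center as well
  lats <- unique(c(lats, lat_range[2]))
  lons <- unique(c(lons, lon_range[2]))

  # Every combination of latitude and longitude is one query center
  expand.grid(lat = lats, lon = lons)
}

centers <- sketch_grid(25.686917, -100.429398, 25.612030, -100.333032)
nrow(centers)
```

Each row of centers is one point to ask the API about; the smaller the step, the more the circles overlap and the more requests are needed.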
But the more powerful denue_grid() does the interesting part. Using the former function, it also downloads the DENUE data and returns a data.frame of the unique businesses in that grid (if you want the duplicates as well, set the parameter unicos = FALSE).
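Because the circles overlap, the same business can come back from more than one request, which is why the unique step matters. The toy example below shows the idea with made-up data; the column names id and name are assumptions for illustration and are not necessarily the columns that denue_grid() returns.

```r
# Toy stand-ins for the results of two overlapping circles.
# Column names (id, name) are illustrative assumptions only.
circle_a <- data.frame(id = c(101, 102, 103),
                       name = c("A", "B", "C"),
                       stringsAsFactors = FALSE)
circle_b <- data.frame(id = c(103, 104),
                       name = c("C", "D"),
                       stringsAsFactors = FALSE)

# Stack both results: business 103 now appears twice
all_rows <- rbind(circle_a, circle_b)

# Keep only the first occurrence of each business id
unique_rows <- all_rows[!duplicated(all_rows$id), ]

nrow(all_rows)
nrow(unique_rows)
```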
Example with Grids
Here is an example with the city of Monterrey. Let’s say I want all the businesses in San Pedro (a municipality that is part of the metropolitan area).
The total area is roughly 45 kms, give or take (I know this is not geographically accurate):
I feed the function the coordinates of two opposite corners (upper left and lower right in the code below), and voilà:
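library(inegiR)

# Two opposite corners of the area of interest
upper_lat  <- 25.686917
upper_long <- -100.429398
lower_lat  <- 25.612030
lower_long <- -100.333032

token_denue <- "mytoken"

# Download every business inside the grid spanned by the two corners
sanpedro <- denue_grid(upper_lat, lower_lat,
                       upper_long, lower_long,
                       token = token_denue)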
Simple as that!
Factor productivity
By using two fairly consistent surveys that INEGI conducts on a monthly basis, I added two functions to calculate productivity by state in two important industries: manufacturing and construction.
For both cases, productivity is defined as the total value produced in the state divided by the total number of people employed in that industry in the state. Bear in mind that value produced is in thousands of pesos, so a figure of 100 means 100 thousand pesos “produced” by each person.
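As a quick illustration of that definition, here is the arithmetic with made-up numbers. These figures are invented for the example and are not INEGI data; the variable names are mine as well.

```r
# Made-up monthly figures, purely to illustrate the definition
value_produced  <- c(500000, 520000, 510000)  # thousands of pesos
occupied_people <- c(5000, 5100, 5000)        # people employed in the industry

# Productivity = value produced / people employed
productivity <- value_produced / occupied_people

productivity          # thousands of pesos per person
productivity * 1000   # the same figures expressed in pesos per person
```

The first month works out to 100, that is, 100 thousand pesos per person.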
We can get a time series simply by doing the following:
library(inegiR)
library(eem)
library(ggplot2)

# Time series for manufacturing in the state of Nuevo León:
token <- "mytoken"
pm <- series_productividad_man(token)
nl <- data.frame("Productivity" = pm$NL,
                 "Date" = as.Date(pm$Fechas))

ggplot(nl, aes(x = Date, y = Productivity)) +
  geom_line(colour = eem_colors[1]) +
  theme_eem() +
  labs(title = "Productivity in Manufacturing \n State of Nuevo León",
       y = "Thousands of pesos x person")
# Time series for construction in the state of Nuevo León:
pc <- series_productividad_const(token)
nl <- data.frame("Productivity" = pc$NL,
                 "Date" = as.Date(pc$Fechas))

ggplot(nl, aes(x = Date, y = Productivity)) +
  geom_line(colour = eem_colors[1]) +
  theme_eem() +
  labs(title = "Productivity in Construction \n State of Nuevo León",
       y = "Thousands of pesos x person")
New geography
These last two examples lead me to another point: the state names used by the functions have changed. In the first version, the state of Nuevo León was “NuevoLeon”; it is now “NL”. This is more concise, easier to read and consistent with the new constitutional name change for Mexico City (it is now “CDMX”, as opposed to “DF”).
The other advantage is that these names are consistent with Diego Valle’s mxmaps package for easily making choropleth maps (it’s available here). The package includes a nifty function to make these maps using inegiR, but now you can do this both ways!
To switch between the “old names” and the new ones, I’ve left the following catalog here:
Name of State | Previous Name | New Name |
Aguascalientes | Aguascalientes | AGS |
Baja California | BajaCalifornia | BC |
Baja California Sur | BajaCaliforniaSur | BCS |
Campeche | Campeche | CAMP |
Coahuila | Coahuila | COAH |
Colima | Colima | COL |
Chiapas | Chiapas | CHPS |
Chihuahua | Chihuahua | CHIH |
Distrito Federal | DF | CDMX |
Durango | Durango | DGO |
Guanajuato | Guanajuato | GTO |
Guerrero | Guerrero | GRO |
Hidalgo | Hidalgo | HGO |
Jalisco | Jalisco | JAL |
Estado de México | EdoMexico | MEX |
Michoacán | Michoacan | MICH |
Morelos | Morelos | MOR |
Nayarit | Nayarit | NAY |
Nuevo León | NuevoLeon | NL |
Oaxaca | Oaxaca | OAX |
Puebla | Puebla | PUE |
Querétaro | Queretaro | QRO |
Quintana Roo | QuintanaRoo | QROO |
San Luis Potosí | SanLuisPotosi | SLP |
Sinaloa | Sinaloa | SIN |
Sonora | Sonora | SON |
Tabasco | Tabasco | TAB |
Tamaulipas | Tamaulipas | TAM |
Tlaxcala | Tlaxcala | TLAX |
Veracruz | Veracruz | VER |
Yucatán | Yucatan | YUC |
Zacatecas | Zacatecas | ZAC |
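If you have old code or saved data that still uses the previous names, one simple way to translate them is a named vector used as a lookup. This is only a sketch: the data frame below is invented for the example, and I only list a few states (the rest follow the catalog above).

```r
# Lookup from previous names to new ones (only a few states shown;
# the remaining entries follow the catalog above)
old_to_new <- c(NuevoLeon = "NL", DF = "CDMX",
                EdoMexico = "MEX", Jalisco = "JAL")

# Invented data frame that still uses the previous naming
old_df <- data.frame(Fechas = as.Date(c("2015-01-01", "2015-02-01")),
                     NuevoLeon = c(1, 2),
                     DF = c(3, 4))

# Rename only the columns that appear in the lookup; leave the rest alone
hits <- names(old_df) %in% names(old_to_new)
names(old_df)[hits] <- old_to_new[names(old_df)[hits]]

names(old_df)
```

Columns that are not in the lookup, such as Fechas, keep their original names.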
If you have any suggestions or find any bugs, you can reach me on Twitter or GitHub.