
inegiR version 1.2

[This article was first published on En El Margen - R-English, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Version 1.2 of inegiR is now on CRAN, so I thought I’d write a few words (a vignette of sorts) about what’s new or different. By the way, I’m writing in English because more people seem to read R-bloggers than my blog (no surprise there); however, the PDF manual and most of the documentation are still in Spanish.

Bug fixes

Thanks to Diego Valle, who reported a slight bug: the less common date frequencies (“bienal” and “decenal”) were not being parsed correctly.

I also added warnings and error handling for cases where the data doesn’t exist for municipalities (issue is here).

New functions

Grids

Thanks to Arturo Cardenas, who unwittingly built a new function for the DENUE part of the package; it is incorporated in this version.

As he wrote in his blog, the DENUE API only lets us download businesses within a radius of at most 5 kilometers. However, we can get around this limitation by querying the API at a series of coordinates whose circles we know overlap each other, so that together they cover a larger square. This is a picture, taken from that post, detailing what I mean:

Each circle is, of course, 5 km in radius, so the API gives us everything inside it.

The hacer_grid() function helps with this by creating a data.frame of coordinates that form a grid like the one in the image, given 2 corners in latitude and longitude.
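To make the idea concrete, here is a minimal sketch in base R of how a grid of query points between two corners could be built. It is not the package's implementation: `sketch_grid()` and its `spacing_km` argument are hypothetical names, and the figure of 111.32 km per degree of latitude is an approximation.

```r
# Hypothetical helper (NOT inegiR::hacer_grid): build a grid of query points
# between two corners, spaced so that circles drawn around them can overlap.
sketch_grid <- function(lat1, lon1, lat2, lon2, spacing_km = 1.5) {
  km_per_deg_lat <- 111.32                      # approximate value
  mid_lat <- (lat1 + lat2) / 2
  # A degree of longitude shrinks with the cosine of the latitude
  km_per_deg_lon <- km_per_deg_lat * cos(mid_lat * pi / 180)

  # Convert the desired spacing from kilometers to degrees
  lat_step <- spacing_km / km_per_deg_lat
  lon_step <- spacing_km / km_per_deg_lon

  lats <- seq(min(lat1, lat2), max(lat1, lat2), by = lat_step)
  lons <- seq(min(lon1, lon2), max(lon1, lon2), by = lon_step)

  # Every combination of latitude and longitude is one point to query
  expand.grid(latitud = lats, longitud = lons)
}

grid <- sketch_grid(25.686917, -100.429398, 25.612030, -100.333032)
head(grid)
```

With a spacing of s between neighbouring points, circles of radius at least s/√2 around each point cover everything between them, which is the overlap the picture illustrates.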

But the more powerful denue_grid() does the interesting part. Using the former function, it also downloads the DENUE data and returns a data.frame of the unique businesses in that grid (if you want the duplicates as well, set the parameter unicos = FALSE).
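Because the circles overlap, the same business can come back from more than one query, which is why removing duplicates matters. The following is only a sketch of that step with made-up data; the column names `id` and `nombre` are assumptions for illustration and not necessarily what the DENUE API returns.

```r
# Made-up results from two overlapping queries (illustration only)
circle_a <- data.frame(id = c(1, 2, 3), nombre = c("A", "B", "C"))
circle_b <- data.frame(id = c(3, 4),    nombre = c("C", "D"))

# Stack the results: business 3 appears twice because the circles overlap
all_rows <- do.call(rbind, list(circle_a, circle_b))

# Keep only the first occurrence of each business id
unique_rows <- all_rows[!duplicated(all_rows$id), ]

nrow(all_rows)     # 5
nrow(unique_rows)  # 4
```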

Example with Grids

Here is an example with the city of Monterrey. Let’s say I want all the businesses in San Pedro (a municipality that is part of the metropolitan area).

The total area is roughly 45 kms, give or take (I know this is not geographically accurate):

I feed the function two opposite corners of the box (with the values below, the upper-left and lower-right ones), and voilà:

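library(inegiR)

# Two opposite corners of the bounding box around San Pedro
upper_lat  <- 25.686917
upper_long <- -100.429398
lower_lat  <- 25.612030
lower_long <- -100.333032
token_denue <- "mytoken"  # replace with your own DENUE API token

# Download the unique businesses inside the grid
sanpedro <- denue_grid(upper_lat, lower_lat, 
                       upper_long, lower_long, 
                       token = token_denue)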

Simple as that!

Factor productivity

Using two fairly consistent surveys that INEGI conducts on a monthly basis, I added two functions to calculate productivity by state in two important industries.

In both cases, productivity is defined as the total value produced in the state divided by the total number of people occupied in the industry in that state. Bear in mind that value produced is in thousands of pesos, so 100 would mean 100 thousand pesos “produced” by each person.

We can simply get a time series by doing the following:
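As a quick worked example of that definition, here is the arithmetic with made-up numbers (they are not INEGI data, and the variable names are only illustrative):

```r
# Made-up figures for two months (illustration only, not INEGI data)
value_produced <- c(50000, 62000)  # total value produced, thousands of pesos
occupied       <- c(400, 500)      # people occupied in the industry

# Productivity = value produced / people occupied
productivity <- value_produced / occupied
productivity
# 125 124  (thousands of pesos per person)
```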

library(inegiR)
library(eem)      # provides eem_colors and theme_eem()
library(ggplot2)

# Time series for Manufacturing in the state of Nuevo León:
token <- "mytoken"  # replace with your own INEGI API token
pm <- series_productividad_man(token)
nl <- data.frame("Productivity" = pm$NL, "Date" = as.Date(pm$Fechas))
ggplot(nl, aes(x = Date, y = Productivity))+
  geom_line(colour = eem_colors[1])+
  theme_eem()+
  labs(title = "Productivity in Manufacturing \n State of Nuevo León", 
        y = "Thousands of pesos per person")

# Time series for Construction in the state of Nuevo León:
pc <- series_productividad_const(token)
nl <- data.frame("Productivity" = pc$NL, "Date" = as.Date(pc$Fechas))
ggplot(nl, aes(x = Date, y = Productivity))+
  geom_line(colour = eem_colors[1])+
  theme_eem()+
  labs(title = "Productivity in Construction \n State of Nuevo León", 
        y = "Thousands of pesos per person")

New geography

These last two examples lead me to another point: the state names used in the functions have changed. In the first version, Nuevo León state was “NuevoLeon”; it has been changed to “NL”. This is more concise, easier to read and consistent with the new constitutional name change for Mexico City (it is now “CDMX”, as opposed to “DF”).

The other advantage is that these names will be consistent with Diego Valle’s mxmaps package, which makes it easy to draw choropleth maps (it’s available here). There is a nifty function included in the package to make these using inegiR, but now you can do this both ways!

To switch between the “old names” and the new ones, I’ve left the following catalog here:

Name of State       | Previous Name     | New Name
--------------------|-------------------|---------
Aguascalientes      | Aguascalientes    | AGS
Baja California     | BajaCalifornia    | BC
Baja California Sur | BajaCaliforniaSur | BCS
Campeche            | Campeche          | CAMP
Coahuila            | Coahuila          | COAH
Colima              | Colima            | COL
Chiapas             | Chiapas           | CHPS
Chihuahua           | Chihuahua         | CHIH
Distrito Federal    | DF                | CDMX
Durango             | Durango           | DGO
Guanajuato          | Guanajuato        | GTO
Guerrero            | Guerrero          | GRO
Hidalgo             | Hidalgo           | HGO
Jalisco             | Jalisco           | JAL
Estado de México    | EdoMexico         | MEX
Michoacán           | Michoacan         | MICH
Morelos             | Morelos           | MOR
Nayarit             | Nayarit           | NAY
Nuevo León          | NuevoLeon         | NL
Oaxaca              | Oaxaca            | OAX
Puebla              | Puebla            | PUE
Querétaro           | Queretaro         | QRO
Quintana Roo        | QuintanaRoo       | QROO
San Luis Potosí     | SanLuisPotosi     | SLP
Sinaloa             | Sinaloa           | SIN
Sonora              | Sonora            | SON
Tabasco             | Tabasco           | TAB
Tamaulipas          | Tamaulipas        | TAM
Tlaxcala            | Tlaxcala          | TLAX
Veracruz            | Veracruz          | VER
Yucatán             | Yucatan           | YUC
Zacatecas           | Zacatecas         | ZAC
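If you have old code or saved data that still uses the previous names, one possible way to switch is a named-vector lookup. This is just a sketch: `rename_states()` is a hypothetical helper (not part of inegiR), the lookup only includes a few rows of the catalog above, and `old_df` is made-up data.

```r
# Lookup from previous names to new names (a subset of the catalog above)
old_to_new <- c(NuevoLeon = "NL", DF = "CDMX", EdoMexico = "MEX",
                BajaCalifornia = "BC", Jalisco = "JAL")

# Made-up data.frame that uses the previous column names
old_df <- data.frame(Fechas = as.Date("2016-01-01"),
                     NuevoLeon = 1, DF = 2, Jalisco = 3)

# Hypothetical helper: rename only the columns found in the lookup
rename_states <- function(df, lookup) {
  hit <- names(df) %in% names(lookup)
  names(df)[hit] <- lookup[names(df)[hit]]
  df
}

new_df <- rename_states(old_df, old_to_new)
names(new_df)
# "Fechas" "NL" "CDMX" "JAL"
```

Columns that are not in the lookup (such as the date column) are left untouched.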

If you have any suggestions or find any bugs, you can find me on Twitter or GitHub.

To leave a comment for the author, please follow the link and comment on their blog: En El Margen - R-English.
