New R package to access World Bank data
[This article was first published on mages' blog, and kindly contributed to R-bloggers.]
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Staying on top of new CRAN packages is quite a challenge nowadays. However, thanks to Dirk’s CRANberries service I occasionally spot a new gem, such as wbstats, which appeared on CRAN last week. Like the WDI package, wbstats offers an interface to the World Bank database.
With the functions of wbstats, the World Bank data can be searched and data for several indicators requested. Unlike WDI, wbstats returns the data in a ‘long’ table with one column for all values and a separate column for the indicators. Additionally, the function wb allows me to specify how many of the most recent values (mrv) I am interested in.
Thus, to recreate the famous Gapminder chart by Hans Rosling, showing the correlation between fertility, i.e. number of children per woman, and life expectancy over time by country and region, I can write:
library(wbstats)
library(data.table)
library(googleVis)
# Download World Bank data and turn into data.table
myDT <- data.table(
  wb(indicator = c("SP.POP.TOTL",
                   "SP.DYN.LE00.IN",
                   "SP.DYN.TFRT.IN"), mrv = 60)
)
# Download country mappings
countries <- data.table(wbcountries())
# Set keys to join the data sets
setkey(myDT, iso2c)
setkey(countries, iso2c)
# Add regions to the data set, but remove aggregates
myDT <- countries[myDT][ ! region %in% "Aggregates"]
# Reshape data into a wide format
wDT <- reshape(
  myDT[, list(
    country, region, date, value, indicator)],
  v.names = "value",
  idvar = c("date", "country", "region"),
  timevar = "indicator", direction = "wide")
# Turn date, here year, from character into integer
wDT[, date := as.integer(date)]
setnames(wDT, names(wDT),
         c("Country", "Region",
           "Year", "Population",
           "Fertility", "LifeExpectancy"))
M <- gvisMotionChart(wDT, idvar = "Country",
                     timevar = "Year",
                     xvar = "LifeExpectancy",
                     yvar = "Fertility",
                     sizevar = "Population",
                     colorvar = "Region")
# Ensure Flash player is available and enabled
plot(M)
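The long-to-wide step in the script above can be tried offline. The following is only a minimal sketch on a toy table: the countries "A" and "B", the short indicator labels and all numbers are made up for illustration and are not World Bank data. It uses base R's reshape on a plain data.frame, so it runs without wbstats or a network connection.

```r
# Toy 'long' table: one row per country, year and indicator,
# mimicking the shape described above. All values are invented.
long <- data.frame(
  country   = rep(c("A", "B"), each = 4),
  date      = rep(rep(c("2000", "2001"), each = 2), times = 2),
  indicator = rep(c("Fertility", "LifeExpectancy"), times = 4),
  value     = c(3.1, 60.2, 3.0, 60.5, 1.8, 78.0, 1.7, 78.3),
  stringsAsFactors = FALSE
)

# Spread the single value column into one column per indicator.
# reshape() names the new columns "value.<indicator>" and orders them
# by first appearance of each indicator in the long table.
wide <- reshape(
  long,
  v.names   = "value",
  idvar     = c("country", "date"),
  timevar   = "indicator",
  direction = "wide"
)

# Rename by pattern instead of by position, so the result does not
# depend on the order in which the indicators happen to appear.
names(wide) <- sub("^value\\.", "", names(wide))

# Turn the year from character into integer, as in the script above
wide$date <- as.integer(wide$date)

print(wide)
```

Because the wide columns follow the order in which the indicators first occur in the data, renaming them by position, as setnames does in the script above, assumes that order is known. Checking names(wDT) before renaming is a cheap safeguard.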
If you’d like to learn more about how to create interactive charts with googleVis, then check out the free tutorial on DataCamp.
Session Info
R version 3.2.4 (2016-03-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.4 (El Capitan)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets
[6] methods   base

other attached packages:
[1] googleVis_0.5.10 data.table_1.9.6 wbstats_0.1

loaded via a namespace (and not attached):
[1] httr_1.1.0        R6_2.1.2          rsconnect_0.4.2.1
[4] tools_3.2.4       curl_0.9.7        RJSONIO_1.3-0
[7] jsonlite_0.9.19   chron_2.3-47