For this fun exercise, I analyzed more than 200 million data points using SAP HANA and R, then presented the aggregated results in HTML5 using D3, JSON, and the Google Maps APIs. The 2008 airlines data comes from the Data Expo, and I have been using the entire data set (123 million rows and 29 columns) for quite some time. See my other blogs.
The results look beautiful:
Each airport icon is clickable and when clicked displays an info-window describing the key stats for the selected airport:
I then used D3 to display the aggregated result set in the modal window (light box):
D3 made it ridiculously simple to generate a table from a JSON file.
Unfortunately, I can’t provide a live example because of the usage restrictions on the Google Maps APIs, and I am approaching my free API limits.
Fun fact: The Atlanta airport was the largest airport in 2008 on many dimensions: Total Flights Departed, Total Miles Flown, and Total Destinations. It also experienced a lower average departure delay in 2008 than Chicago O’Hare. I had always thought Chicago O’Hare was the largest US airport.
As always, I needed just 6 lines of R code, including two lines to write the data to JSON and CSV files:
################################################################################
# Uses data.table syntax; load the package first with library(data.table).
# Summarise 2008 departures for each major airport, busiest airports first.
airports.2008.hp.summary <- airports.2008.hp[major.airports,
    list(AvgDepDelay=round(mean(DepDelay, na.rm=TRUE), digits=2),
         TotalMiles=prettyNum(sum(Distance, na.rm=TRUE), big.mark=","),
         TotalFlights=length(Month),
         TotalDestinations=length(unique(Dest)),
         URL=paste("http://www.fly", Origin, ".com", sep="")),
    by=list(Origin)][order(-TotalFlights)]
setkey(airports.2008.hp.summary, Origin)
# Merge the two data tables to attach airport names, addresses and coordinates
airports.2008.hp.summary <- major.airports[airports.2008.hp.summary,
    list(Airport=airport,
         AvgDepDelay, TotalMiles, TotalFlights, TotalDestinations,
         Address=paste(airport, city, state, sep=", "),
         Lat=lat, Lng=long, URL)][order(-TotalFlights)]
# getRowWiseJson() is a helper that is not shown in this post
airports.2008.hp.summary.json <- getRowWiseJson(airports.2008.hp.summary)
# Write the summary out as JSON (for D3 and the map) and as CSV
writeLines(airports.2008.hp.summary.json, "airports.2008.hp.summary.json")
write.csv(airports.2008.hp.summary, "airports.2008.hp.summary.csv", row.names=FALSE)
##############################################################################
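The getRowWiseJson() call above relies on a helper that is not shown in this post. For readers who want to follow along, here is a minimal base R sketch of what such a helper might look like. It assumes the goal is a JSON array with one object per table row, which is the shape D3 expects when building a table. The name to_row_wise_json and the sample values are hypothetical and made up for illustration; this is not the original function.

```r
# Hypothetical stand-in for getRowWiseJson(); the original helper is not shown.
# Turns a data frame (or data.table) into a JSON array with one object per row.
to_row_wise_json <- function(df) {
  # Wrap a string in double quotes, escaping backslashes and embedded quotes.
  # (Control characters are not handled in this sketch.)
  escape_string <- function(s) {
    s <- gsub("\\", "\\\\", s, fixed = TRUE)
    s <- gsub("\"", "\\\"", s, fixed = TRUE)
    paste0("\"", s, "\"")
  }
  # Encode one cell: NA becomes null, numbers and logicals stay unquoted.
  encode_value <- function(x) {
    if (is.na(x)) return("null")
    if (is.logical(x)) return(tolower(as.character(x)))
    if (is.numeric(x)) {
      return(format(x, digits = 15, scientific = FALSE, trim = TRUE))
    }
    escape_string(as.character(x))
  }
  keys <- vapply(names(df), escape_string, character(1))
  rows <- vapply(seq_len(nrow(df)), function(i) {
    values <- vapply(seq_along(df),
                     function(j) encode_value(df[[j]][i]),
                     character(1))
    paste0("{", paste(keys, values, sep = ":", collapse = ","), "}")
  }, character(1))
  paste0("[", paste(rows, collapse = ","), "]")
}

# Made-up values, purely to show the output shape (not the real 2008 stats).
sample_airports <- data.frame(
  Origin = c("ATL", "ORD"),
  TotalFlights = c(100L, 90L),
  AvgDepDelay = c(1.5, NA),
  stringsAsFactors = FALSE
)
json <- to_row_wise_json(sample_airports)
cat(json, "\n")
```

In practice, a JSON package such as jsonlite would likely be a sturdier choice than hand-rolled string building; the sketch only illustrates the row-per-object layout.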
Happy Coding and remember the possibilities are endless!