Getting geo data into SQL Server using API and R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
R language (as well as Python) as comprehensive languages helping not only data analysis and data science tasks make it pretty easy, but is also a multiple-purpose language.
In this blog-post I will examine ways of getting data into SQL SERVER using API. I have briefly discussed this already in one of my previous blog poss, but this time, we can do it using SQL Server and R or SQL Server and Python. For the API Service I will call couple of different API services to demonstrate the ease of usage.
Most common API calls are those, retrieving information from social media networks, such as Twitter, Facebook, Slack, LinkedIn and others. These information are great for network analysis, but business also tend to get data from Mailchimp (or Mandrill), Google API, Yahoo stocks, automatic translations sites, Jira, Jenkins and many more.
Calling API in R
Before we continue, We should test a simple
Google API
Using goople API for getting address from longitute and latitude.
DROP TABLE IF EXISTS dbo.[Address]; GO CREATE TABLE [dbo].[Address]( [street_number] [varchar](255) NULL, [street_name] [varchar](255) NULL, [city_name] [varchar](255) NULL, [ZIP] [varchar](255) NULL, [Country] [varchar](255) NULL, [ID] [int] NULL ); GO CREATE TABLE dbo.LAT_LNG ( LAT VARCHAR(100) ,LNG VARCHAR(100) ,ID INT ); GO INSERT INTO dbo.LAT_LNG(LAT,LNG,ID) SELECT '46.080317','14.5178409',1 UNION ALL SELECT '48.198726','16.396652', 2 UNION ALL SELECT '-34.3594725','18.470029',3; GO
Now that we have some sample data in table, we can execute R script to get the corresponding addresses.
INSERT INTO dbo.[Address] exec sp_execute_external_script @language = N'R' ,@script = N' library(RODBC) #library(sqldf) library(RCurl) library(RJSONIO) library(plyr) Addresses <- InputDataSet Addresses$x <- as.character(Addresses$x) Addresses$y <- as.character(Addresses$y) Addresses$latlng <- paste(as.character(Addresses$y),",",as.character(Addresses$x), sep ="") url <- function(latlng, return.call = "json") { root <- "https://maps.googleapis.com/maps/api/geocode/" API_Key <- "AIzaSyDBzvtkyo1Q8uSC8oDu3dxxxxxxxxxxx" #Taken from: https://developers.google.com/maps/documentation/geocoding/start#get-a-key #Bound to my personal email: [email protected] u <- paste(root, return.call, "?latlng=", latlng, "&key=",API_Key,sep = "") return(URLencode(u)) } geoAddress <- function(latlng,verbose=FALSE) { if(verbose) cat(latlng,"\n") u <- url(latlng) doc <- getURL(u, .opts = list(ssl.verifypeer = FALSE)) #opts is RCurls additional options for undergo certiciates/SSL/VPN problems x <- fromJSON(doc,simplify = FALSE) if(x$status=="OK") { street_number <- x$results[[1]]$address_components[[1]]$short_name street_name <- x$results[[1]]$address_components[[2]]$short_name city_name <- x$results[[1]]$address_components[[3]]$short_name ZIP <- x$results[[1]]$address_components[[5]]$short_name return(c(street_number, street_name, city_name, ZIP, formatted_address)) } else { return(c(NA,NA,NA,NA,NA)) } } # create empty data.frame address_df <- data.frame(street_number=NA ,street_name=NA ,city_name=NA ,ZIP=NA ,Address=NA ,ID=NA ) for (i in 1:3) # 2500 API calls per day for free (until 18.7.2018) { read_latlng <- geoAddress(Addresses[i,4]) read_customer <- Addresses[i,3] address_df_temp <- data.frame(t(sapply(read_latlng,c))) names(address_df_temp)[1]<-paste("street_number") names(address_df_temp)[2]<-paste("street_name") names(address_df_temp)[3]<-paste("city_name") names(address_df_temp)[4]<-paste("ZIP") names(address_df_temp)[5]<-paste("Address") cus1 <- data.frame(read_customer) names(cus1)[1]<-paste("ID") address_df_temp <- data.frame(c(address_df_temp,cus1)) address_df <- rbind(address_df, address_df_temp) Sys.sleep(1) rm(address_df_temp, cus1) } OutputDataSet <- address_df' ,@input_data_1 = N'SELECT LNG AS y, LAT AS x,id FROM dbo.LAT_LNG'
And after running a select statement against the addresses table, I get the following:
Google offers so many APIs, that I only choose google maps as shown above as an example. Another favorite Google API of mine is getting the distances from same API between different points.
Happy R-ing.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.