
didYouMean() Function: Using Google to correct errors in Strings

[This article was first published on sweissblaug, and kindly contributed to R-bloggers.]
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
A function that takes a string as input and returns the “Did you mean..” or “Showing results for..” suggestion from google.com. Good for correcting misspelled names or locations.

Github

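library(RCurl)

didYouMean <- function(input) {
  # Google expects spaces in the query to be written as "+"
  input <- gsub(" ", "+", input)
  # The trailing "/" matches the "/&" end marker searched for below
  doc <- getURL(paste0("https://www.google.com/search?q=", input, "/"))

  # Positions of the two correction banners in the page (-1 when absent)
  dym <- gregexpr("Did you mean", doc, fixed = TRUE)
  srf <- gregexpr("Showing results for", doc, fixed = TRUE)

  if (length(dym[[1]]) > 1) {
    # "Did you mean" is only used when the phrase occurs more than once
    start <- dym[[1]][1]
  } else if (srf[[1]][1] != -1) {
    start <- srf[[1]][1]
  } else {
    # No suggestion found: return the input unchanged
    return(gsub("[+]", " ", input))
  }

  # The suggested query sits in a link within 1000 characters of the banner
  doc2 <- substring(doc, start, start + 1000)
  # Match the literal "q=" so the +2 offset lands on the first query character
  s1 <- gregexpr("q=", doc2, fixed = TRUE)
  s2 <- gregexpr("/&", doc2, fixed = TRUE)
  new.text <- substring(doc2, s1[[1]][1] + 2, s2[[1]][1] - 1)
  gsub("[+]", " ", new.text)
}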
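To see what the string slicing inside didYouMean() does without contacting Google, the extraction step can be tried on a hand-written snippet. This is only a minimal sketch: extractSuggestion() is a hypothetical helper, and the sample HTML is invented for illustration, so real Google markup may well look different.

```r
# Hypothetical helper: pull the suggested query out of a chunk of HTML,
# using the same "q=" ... "/&" markers as didYouMean().
extractSuggestion <- function(html, banner) {
  start <- regexpr(banner, html, fixed = TRUE)
  if (start == -1) return(NA_character_)          # banner not present
  chunk <- substring(html, start, start + 1000)   # look just after the banner
  from <- regexpr("q=", chunk, fixed = TRUE)
  to <- regexpr("/&", chunk, fixed = TRUE)
  if (from == -1 || to == -1) return(NA_character_)
  # from + 2 skips past "q="; to - 1 stops just before "/&"
  gsub("+", " ", substring(chunk, from + 2, to - 1), fixed = TRUE)
}

# Invented sample markup, not a real Google response
sample_html <- paste0(
  "<p>Did you mean: <a href=\"/search?q=george+washington/&amp;spell=1\">",
  "george washington</a></p>"
)

extractSuggestion(sample_html, "Did you mean")              # "george washington"
extractSuggestion("<p>no banner here</p>", "Did you mean")  # NA
```

Because the markers are plain substrings, any change in how Google builds its suggestion link would break the slicing, which is worth keeping in mind before relying on the result.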

So didYouMean("gorecge washington") returns "george washington".


Works well with misspelled company names, nouns, or phrases. For example, say you’re doing text analysis on Twitter and a customer raves about Carlsberg beer. The only problem is that he’s enjoying the product while tweeting (something that happens only rarely, I’m sure) and wrote “clarsburg gprou”. Not to worry!

> didYouMean("clarsburg gprou")
[1] "carlsberg group"

Or suppose you have a three-phase plan for profits. This can help you get there!

> didYouMean("clletc nuderpants")
[1] "collect underpants"
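For a text-analysis job like the Twitter example, the same call would be made once per string. A rough sketch of how that might look: correctAll() is a hypothetical wrapper, the pause between requests is an arbitrary courtesy delay, and the corrector argument exists only so the loop can be tried with a stand-in function instead of live Google queries.

```r
# Hypothetical wrapper: apply a corrector function to each string in turn.
correctAll <- function(x, corrector, pause = 1) {
  vapply(x, function(s) {
    Sys.sleep(pause)  # arbitrary delay between requests
    # Fall back to the original string if the lookup fails
    tryCatch(corrector(s), error = function(e) s)
  }, character(1), USE.NAMES = FALSE)
}

# Stand-in corrector for a dry run (no network access)
fakeCorrector <- function(s) {
  known <- c("clarsburg gprou" = "carlsberg group")
  if (s %in% names(known)) known[[s]] else s
}

correctAll(c("clarsburg gprou", "george washington"), fakeCorrector, pause = 0)
# With the real function and a hypothetical character vector `tweets`:
# correctAll(tweets, didYouMean)
```

Falling back to the original string on error means one failed request does not abort the whole batch.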

