Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
What’s the one thing that helps you add value to your company’s raw geospatial data? GEOCODING.
Geocoding is the process of converting raw physical addresses to latitude and longitude geospatial points that can be viewed on a map and used for geospatial calculations. Heck, geocoding has been known to increase my machine learning model performance by up to 10%!
Today I’m going to show you how to do Geocoding in R for FREE using tidygeocoder. Here’s what you’re learning today:
- Tutorial Part 1: How to use tidygeocoder to effortlessly geocode addresses (convert your company addresses to Lat/Long)
- Tutorial Part 2: How to do Reverse Geocoding (go from Lat/Long to Physical Addresses)
- Bonus: How to map lat/long data using Simple Features + Mapview!
R-Tips Weekly
This article is part of R-Tips Weekly, a weekly video tutorial that shows you step-by-step how to do common R coding tasks. Pretty cool, right?
This Tutorial is Available in Video
I have a companion video tutorial that gives you the bonus Mapview Shortcuts shown in this video (plus walks you through how to use it). And, I’m finding that a lot of my students prefer the dialogue that goes along with coding. So check out this video to see me running the code in this tutorial. 👇
Why Geocoding is a Must
Look, I’ve been working with customer data for a long time…
And one of the RICHEST sources of data is raw company addresses!
Think about it. If you know where a company is located, do you think that might be important to their purchasing behavior?
Well, it was for me. In fact, I found that simply adding the Latitude and Longitude information to my customer churn prediction models…
Gave my models a 10% increase in performance!
Lots of Value to Machine Learning in Raw Customer Addresses
The Latitude and Longitude were key!
And that’s just one of the benefits of working with geospatial data (and geocoding).
But you’re probably thinking geospatial data is really tough.
Listen, I get it. Geospatial data is a little weird.
But, you have good ole Matt Dancho to help you out.
And my promise is today, I’m going to get you on the right track.
So let’s fix that geospatial problem, and make one small step today. And it starts with geocoding.
Thank You to the Developer (and Community).
Before we do our deep-dive into tidygeocoder, I want to take a brief moment to thank the developers working on the tidygeocoder project: Jesse Cambon, Diego Hernangómez, Christopher Belanger and Daniel Possenriede. Without their hard work, this tutorial (and easy Geocoding) wouldn’t be possible. Thank you!
Free Gift: Cheat Sheet for my Top 100 R Packages (Special Geospatial Analysis Topics Included)
Before we dive in…
You’re going to need R packages to complete the geospatial analysis that helps your company. So why not speed up the process?
To help, I’m going to share my secret weapon…
Even I forget which R packages to use from time to time. And this cheat sheet saves me so much time. Instead of googling my way through 20,000 R packages to find a needle in a haystack, I keep my cheat sheet handy so I know which to use and when to use them. Seriously. This cheat sheet is my bible.
Once you download it, head over to page 3 and you’ll see several R packages I use frequently just for Data Analysis.
Which is important when you want to work in these fields:
- Machine Learning
- Time Series
- Financial Analysis
- Geospatial Analysis
- Text Analysis and NLP
- Shiny Web App Development
So steal my cheat sheet. It will save you a ton of time.
Tutorial: How to Geocode in R for Free with tidygeocoder
Time for geocoding with tidygeocoder. Let’s have some fun!
Step 1: Load the Libraries
Load the following libraries.
- tidyverse and tidygeocoder are the main libraries.
- But my bonus lat/long map hack uses sf and mapview.
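Here is a minimal sketch of that setup. It assumes all four packages are already installed from CRAN (for example with install.packages()).

```r
# Core libraries for this tutorial
library(tidyverse)     # data wrangling and the pipe (dplyr, readr, tibble, ...)
library(tidygeocoder)  # provides geocode() and reverse_geocode()

# Only needed for the bonus map at the end
library(sf)            # simple features: turns lat/long columns into spatial points
library(mapview)       # interactive maps
```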
Step 2: Get My Pittsburgh Pharmacies Dataset
Next, you can steal my Pittsburgh Pharmacies dataset. This dataset is a great way to test your skills with Geocoding.
Steal The Pittsburgh Pharmacies Data Set
We’ll use the Pittsburgh Pharmacies dataset (171 geocoded pharmacies) throughout the rest of this tutorial.
Get it here. It’s in the 059_geocoding folder.
Next, read the data set into R.
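Reading the data might look something like the sketch below. The file name and the CSV format are assumptions for illustration, so point pharmacies_path (a hypothetical name) at whatever file you actually downloaded from the 059_geocoding folder.

```r
library(tidyverse)

# ASSUMPTION: hypothetical file name and CSV format; adjust to match your download
pharmacies_path <- "059_geocoding/pharmacies.csv"

if (file.exists(pharmacies_path)) {
  # Read the raw pharmacy data and take a quick look at the columns
  pittsburgh_pharmacies_tbl <- read_csv(pharmacies_path)
  glimpse(pittsburgh_pharmacies_tbl)
} else {
  message("File not found. Update pharmacies_path to match your download.")
}
```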
Step 3: Geocode the Address Column to get Latitude and Longitude
Next, use the geocode() function to convert a company’s physical address to a Latitude / Longitude.
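As a rough sketch of the call, the example below geocodes a single well-known address instead of the pharmacy data. The table and column names (addresses_tbl, name, address) are made up for illustration, and method = "osm" assumes you want the free OpenStreetMap (Nominatim) service, which needs an internet connection.

```r
library(tidyverse)
library(tidygeocoder)

# Illustrative input: one public address stands in for your company addresses
addresses_tbl <- tibble(
  name    = "White House",
  address = "1600 Pennsylvania Avenue NW, Washington, DC 20500"
)

# geocode() keeps the input columns and appends lat and long columns.
# method = "osm" sends each address to the free Nominatim service.
geocoded_tbl <- addresses_tbl %>%
  geocode(address = address, method = "osm")

geocoded_tbl
```

With the pharmacy data, you would swap in your own table and the name of its address column.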
Here’s what happens…
Step 4: Reverse Geocode to go from Lat/Long to Physical Address
Sometimes you have a latitude and longitude and want a physical address. For example, your salesperson needs to know which addresses to visit (you wouldn’t send them a Lat/Long… or else they’d think you’re nuts!)
Did you know that you can reverse geocode?
You can! Here’s how to go from Latitude / Longitude to a Physical Address. (And save your inter-office reputation)
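Here is a small sketch of reverse geocoding, assuming the same free OpenStreetMap service (internet connection required). The coordinates are approximate values for downtown Pittsburgh, and the names coords_tbl and address_found are chosen for illustration.

```r
library(tidyverse)
library(tidygeocoder)

# Illustrative input: approximate coordinates for downtown Pittsburgh
coords_tbl <- tibble(
  name = "Downtown Pittsburgh",
  lat  = 40.4406,
  long = -79.9959
)

# reverse_geocode() looks up each lat/long pair and appends the result
# in a new column, named here with the address argument
reverse_tbl <- coords_tbl %>%
  reverse_geocode(
    lat     = lat,
    long    = long,
    method  = "osm",
    address = address_found
  )

reverse_tbl
```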
And you can see that reverse geocoding creates an address from Lat/Long coordinates.
Bonus: Steal My Map Hack to Visualize Lat/Long Data
Want to visualize the geocoded data?
Steal my bonus script here. (It’s in the 059_geocode.R file)
Here’s what it does in 2 lines of code:
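The bonus script itself isn’t reproduced here, so treat this as a sketch of the general idea rather than the exact code: convert the lat/long columns to simple features with st_as_sf(), then pass the result to mapview(). The two points are approximate Pittsburgh landmark coordinates used as stand-ins for the geocoded pharmacies.

```r
library(tidyverse)
library(sf)
library(mapview)

# Illustrative stand-in for the geocoded pharmacies (approximate coordinates)
points_tbl <- tibble(
  name = c("Downtown Pittsburgh", "Cathedral of Learning"),
  lat  = c(40.4406, 40.4443),
  long = c(-79.9959, -79.9532)
)

# Line 1: turn the long/lat columns into point geometries (x = long, y = lat)
# crs = 4326 is WGS84, the standard lat/long coordinate system
points_sf <- points_tbl %>%
  st_as_sf(coords = c("long", "lat"), crs = 4326)

# Line 2: draw the points on an interactive map
mapview(points_sf)
```

Note that coords takes longitude first and latitude second. Swapping them is a common reason points land in the wrong place.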
Now you can visualize all 171 Pittsburgh Pharmacies in an interactive map!
💡 Conclusions
You learned how to use the tidygeocoder library to geocode and reverse geocode. Great work! But, there’s a lot more to becoming a data scientist.
If you’d like to become a Business Data Scientist (and have an awesome career, improve your quality of life, enjoy your job, and all the fun that comes along), then I can help with that.
Do You Need Help Becoming A Business Data Scientist Right Now?
YOU know the feeling. Being unhappy with your current job.
Promotions aren’t happening. You’re stuck. Hopeless. Confused…
And you’re praying that the next data science interview will go better than the last 12…
… But you know it won’t. Not unless you take control of your career.
The good news is…
I Can Help You Speed It Up.
I’ve helped 5,897+ students learn data science for business from an elite business consultant’s perspective.
I’ve worked with Fortune 500 companies like S&P Global, Apple, MRM McCann, and more.
And I built a training program that gets my students life-changing data science careers (don’t believe me? see my testimonials here):
6-Figure Data Science Job at CVS Health ($125K)
Senior VP Of Analytics At JP Morgan ($200K)
50%+ Raises & Promotions ($150K)
Lead Data Scientist at Northwestern Mutual ($175K)
2X-ed Salary (From $60K to $120K)
2 Competing ML Job Offers ($150K)
Promotion to Lead Data Scientist ($175K)
Data Scientist Job at Verizon ($125K+)
Data Scientist Job at CitiBank ($100K + Bonus)
Whenever you are ready, here’s how I can help you:
Here’s the system that has gotten aspiring data scientists, career transitioners, and lifelong learners data science jobs and promotions…
Join My 5-Course R-Track Program
(And Become The Data Scientist You Were Meant To Be…)
P.S. – Samantha landed her NEW Data Science R Developer job at CVS Health (Fortune 500). This could be you.