Dallas R Users Group Baseball Data Dive
This past Saturday I led a data dive workshop for the Dallas R Users Group using Lahman’s baseball statistics. After providing a brief introduction to the Lahman R package and showing how to load the data and make some basic plots, I had the ~20 people in attendance begin working on the following questions:
Visualization:
Visualize how the game of baseball has changed over the years.
Visualize a meaningful statistic on the US map.
Prediction:
Is winning the World Series becoming less predictable?
Your friend Peter Daisy likes to bet on baseball games. He asserts that the best predictor of Division Winners is ERA. Is he right? If not, what is the single best predictor of Division Winners?
Scenarios:
The consultant. Nolan Ryan and Ron Washington just called and asked for your expert advice. They are going to focus on improving three statistics this coming season. Which three should they be, and why?
The agent. You found an athlete who wants to apply his talents to the game of baseball. He is right-handed, 5 feet 8 inches tall, and weighs 165 lbs. Which position makes the most sense for him to start learning and why?
The general manager. MLB has allowed you and Mark Cuban to form an American League expansion team. Mark wants you to choose the three starting outfielders. You can have any current player you want, but Mark says you can’t spend more than $15 million combined. He expects you to balance offensive and defensive performance with these players. Which players do you pick and why?
The parent. Your son is a pitcher and wants to play baseball at the best college for getting into the big leagues. Which college should he attend and why?
The idea wasn’t to complete all of the questions, but to choose one or two of interest. Most of the participants were new to R and focused on visualizing how baseball has changed over the years. Some of the more experienced R users took on the agent and general manager questions. Since the questions were somewhat open-ended, it was fun to see the different approaches and R packages people used.
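For anyone who wants a starting point on the first visualization question, here is a minimal sketch of one possible approach, not the code used at the workshop. It assumes the Lahman package is installed and uses its Teams table (columns yearID, R for runs scored, and G for games played). The helper name runs_per_game_by_year is made up for this illustration.

```r
# Hypothetical helper: league-wide runs per team game for each season.
# Expects a data frame with Teams-like columns yearID, R (runs), G (games).
runs_per_game_by_year <- function(teams) {
  # Sum runs and games over all teams within each season
  totals <- aggregate(cbind(R, G) ~ yearID, data = teams, FUN = sum)
  totals$runs_per_game <- totals$R / totals$G
  totals[order(totals$yearID), c("yearID", "runs_per_game")]
}

# Only run the plotting part if the Lahman package is available
if (requireNamespace("Lahman", quietly = TRUE)) {
  data("Teams", package = "Lahman")
  rpg <- runs_per_game_by_year(Teams)
  plot(rpg$yearID, rpg$runs_per_game, type = "l",
       xlab = "Season", ylab = "Runs per team game",
       main = "Scoring over the years")
}
```

Swapping R for another column, such as HR or SO, gives a quick look at other ways the game has shifted.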
Feel free to reply with your answers to any of these questions!