
Predicting Pizza

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. (You can report an issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

What’s the secret to the best pizza in New York? That’s what statistical consultant and R user Jared Lander sought to find out by analyzing the rankings of NY pizza joints at MenuPages.com and building a regression model for ratings based on variables like location, price, number of reviews, and pizza-oven type (gas, coal or wood). Here’s a scatterplot matrix of the data set:


Jared published his conclusions in a paper (PDF), “New York Pizza: How to Find The Best”. He used a logit analysis in R to model the five-star rank from the various variables. His conclusions? First of all, there’s a big discrepancy between critics’ “Top 10” pizza rankings and those of the general public (at least as measured at MenuPages.com), with only one of MenuPages’ Top 10 listed in the typical critic’s list. Secondly, while an Uptown location and a coal oven are both popular draws (as measured by the number of reviews), none of the variables has a significant influence on rating:

Our findings were able to discern the factors that go into a pizzeria’s popularity but did not discover much differentiation in quality. Popularity and quality are not always equivalent. It is likely that we may have just proved the old adage about pizza: “Even when it’s bad, it’s still good.”
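For readers who want to try this kind of analysis themselves, here is a rough sketch in R. It is not Jared's code or data: the data frame is simulated, the column names (`location`, `oven`, `price`, `reviews`, `rating`) are hypothetical stand-ins for the variables mentioned above, and the choice of an ordered logit via `MASS::polr()` is an assumption, since the post says only that a logit analysis was used.

```r
# Sketch only: the data below are simulated, NOT the MenuPages data from the paper.
library(MASS)  # provides polr() for ordered logit (proportional odds) models

set.seed(42)
n <- 200

# Hypothetical predictors, named after the variables mentioned in the post.
pizza <- data.frame(
  location = factor(sample(c("Downtown", "Midtown", "Uptown"), n, replace = TRUE)),
  oven     = factor(sample(c("gas", "coal", "wood"), n, replace = TRUE)),
  price    = sample(1:5, n, replace = TRUE),
  reviews  = rpois(n, lambda = 30)
)

# Simulated star ratings, drawn independently of the predictors,
# so no real relationship is built into this toy data set.
pizza$rating <- factor(sample(1:5, n, replace = TRUE), levels = 1:5, ordered = TRUE)

# Scatterplot matrix of all variables (factors are plotted as integer codes).
pairs(pizza, main = "Simulated pizzeria data")

# Ordered logit: models the five-level rating as a function of the predictors.
fit <- polr(rating ~ location + oven + price + reviews, data = pizza, Hess = TRUE)

# polr() reports t values but not p-values; a normal approximation is a common shortcut.
coefs <- coef(summary(fit))
p_values <- 2 * pnorm(abs(coefs[, "t value"]), lower.tail = FALSE)
print(round(cbind(coefs, "p value" = p_values), 3))
```

With real data, the rows of the coefficient table for location, oven type, price and review count are where a significant influence on rating would show up, if there were one.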

Slice: The ‘Moneyball’ of Pizza? Using Statistics to Find NYC’s Best Pies and Slices

