Of Height and Speed in Tennis, or Fuzziness and Techiness in College
[This article was first published on mickeymousemodels, and kindly contributed to R-bloggers.]
I thought of this after reading this post and perhaps also this one, on the Cheap Talk blog. Here’s the puzzle: in general, being tall does not make you slow; but among professional tennis players, the tall athletes do tend to be relatively sluggish. Why does this happen? Cheap Talk gives a perfectly good written explanation, and I thought I’d complement it with something graphical.
Suppose that, in the general population, the distribution of height and speed looks roughly like this:
In the code at the end of this post, height is H = 4 + U1 + U2 + U3 (in feet, judging by the range), with the U independently uniform on (0, 1). The next step is to give individuals a maximum footspeed S = 10 + U4 + U5 + U6 mph, with these U independently uniform on (0, 5). By construction, speed is independent of height, and falls more or less in a bell curve from 10 to 25 mph. Fun anecdote: my population is too slow to include Usain Bolt, whose top footspeed is close to 28 mph.
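As a quick sanity check on that construction, here is a minimal sketch that draws both attributes and looks at their range, mean, and correlation. The seed and the standalone variable names `height` and `speed` are my own choices for illustration; the formulas are the ones used in the code at the end of the post.

```r
set.seed(1)  # arbitrary seed, only so the numbers are reproducible
n <- 2000

# Each attribute is a constant plus the sum of three independent uniforms,
# which gives a bounded, roughly bell-shaped distribution.
height <- 4 + runif(n) + runif(n) + runif(n)                     # between 4 and 7
speed  <- 10 + runif(n, 0, 5) + runif(n, 0, 5) + runif(n, 0, 5)  # between 10 and 25

range(speed)        # stays inside (10, 25)
mean(speed)         # should be close to 10 + 3 * 2.5 = 17.5
cor(height, speed)  # should be close to 0, since the draws are independent
```

With only 2000 draws the sample correlation will not be exactly zero, but it should be small compared to what shows up once we start selecting people.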
Back to tennis. Let’s imagine that tennis ability increases with both height and speed — and, moreover, that those two attributes are substitutable: if you’re short (and have a weak serve), you can make up for it by being fast. With that in mind, let’s revisit the scatterplot:
As hinted in the title, you can apply the exact same thinking to fuzziness (height) and techiness (speed) among college students (professional tennis players). Literary and mathematical ability might be more-or-less independent in the general population; however, being admitted to university requires that you be excellent at one of them, or solidly above average at both; and just as before, excluding everyone else creates a downward slope, which might explain why fuzziness and techiness are negatively correlated among college students.
(Caveat: When I say “are,” I should probably say “seem to be,” because I have no hard data. Certainly the stereotype is that a CS whiz probably writes bad essays, and that a humanities superstar is likely to struggle with algorithms.)
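To make the college version concrete, here is a rough sketch under made-up assumptions: the two abilities are independent standard normal scores, "excellent" means more than 2 standard deviations above the mean, and "solidly above average" means more than 1. The names `fuzzy`, `techie`, `excellent`, `solid`, and `admitted`, and both cutoffs, are hypothetical and not based on any data.

```r
set.seed(1)  # arbitrary seed, only so the numbers are reproducible
n <- 100000

# Hypothetical, independent ability scores in standard-deviation units
fuzzy  <- rnorm(n)  # literary ability
techie <- rnorm(n)  # mathematical ability

# Hypothetical admission rule: excellent at one, or solidly above average at both
excellent <- 2  # assumed cutoff for "excellent"
solid     <- 1  # assumed cutoff for "solidly above average"
admitted <- (fuzzy > excellent) | (techie > excellent) |
            (fuzzy > solid & techie > solid)

cor(fuzzy, techie)                      # near 0 in the whole population
cor(fuzzy[admitted], techie[admitted])  # negative among the admitted
```

The exact size of the negative correlation depends entirely on the cutoffs chosen here; the point is only that the sign flips from roughly zero to negative once admission filters the sample.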
Some R code for the curious:
n <- 2000
df <- data.frame(height = 4 + runif(n) + runif(n) + runif(n),
                 speed = 10 + runif(n, 0, 5) + runif(n, 0, 5) + runif(n, 0, 5))
dev.new(height=6, width=6)
plot(df, main="Height and Speed", xlim=c(4, 7), ylim=c(10, 25), col=rgb(60, 120, 180, 30, maxColorValue=255), pch=16, cex=2)
savePlot("height_and_speed.png")
# A(n arbitrary but convenient) function returning the probability of being a good tennis player
df$p <- with(df, 1 / (1 + exp(1 - 20 * (((height / 6)^2 + (speed / 20)^2) - 2))))
# You are unlikely to be a tennis player unless ((height / 6)^2 + (speed / 20)^2) > 2
df$is.tennis.player <- (runif(n) < df$p)
# Two plots side by side
dev.new(height=6, width=12)
par(mfrow=c(1,2))
plot(df[ , c("height", "speed")], main="General Population", xlim=c(4, 7), ylim=c(10, 25), col=rgb(60, 120, 180, 30, maxColorValue=255), pch=16, cex=2)
plot(subset(df, is.tennis.player)[ , c("height", "speed")], main="Tennis Players", xlim=c(4, 7), ylim=c(10, 25), col=rgb(180, 10, 10, 30, maxColorValue=255), pch=16, cex=2)
savePlot("height_and_speed_general_population_and_tennis_players.png")
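The two panels make the point visually; to put a number on it, here is a minimal sketch that reruns the same simulation and compares the correlation between height and speed in the two groups. The selection function is copied from the code above; the seed and the names `players`, `cor.all`, and `cor.players` are my additions for illustration.

```r
set.seed(1)  # arbitrary seed, only so the numbers are reproducible
n <- 2000
df <- data.frame(height = 4 + runif(n) + runif(n) + runif(n),
                 speed = 10 + runif(n, 0, 5) + runif(n, 0, 5) + runif(n, 0, 5))

# Same (arbitrary but convenient) selection rule as in the code above
df$p <- with(df, 1 / (1 + exp(1 - 20 * (((height / 6)^2 + (speed / 20)^2) - 2))))
df$is.tennis.player <- (runif(n) < df$p)

players <- subset(df, is.tennis.player)

cor.all     <- cor(df$height, df$speed)            # expected to be near 0
cor.players <- cor(players$height, players$speed)  # expected to be negative

cat("general population:", round(cor.all, 2), "\n")
cat("tennis players:    ", round(cor.players, 2), "\n")
```

Only a small fraction of the 2000 simulated people end up as tennis players, so the second correlation is noisier than the first and will move around with the seed.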