Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Match.com will bring more love to the planet than anything since Jesus Christ (Gary Kremen, founder of Match.com)
Charlie is a brilliant 39-year-old mathematician living in Massachusetts. He lives alone and has dedicated his whole life to study. Lately he has realized that theorems and books no longer satisfy him. Charlie has realized that he needs to find love.
To find the love of his life, Charlie joined Match.com and plans to go on a date with a different girl every week for a year (52 dates in total). Charlie has a method for rating each girl according to her friendliness, her physical appearance, her punctuality, her conversation and her hobbies. This method lets him compare girls with each other. Charlie wants to pick the best of the 52 girls according to their scores, but there is a problem: he needs to be quick if he wants to woo a girl. If he takes too long to call back after the first date, she may be in the arms of another man. So Charlie has to decide immediately whether the girl is the love of his life. His plan is as follows: he will start by having some dates only to assess the candidates, without declaring his love to any of them. After this, he will try to win over the first girl who scores better than all of those first candidates.
But how many girls should he discard to maximize the probability of choosing the best one? If he discards just one, the probability of meeting a better girl on one of the following dates is very high. But will she be the best of all the girls? Probably not. On the other hand, if he discards many girls, it is very likely that the best candidate is among them.
Charlie ran a simulation in R of the 52 dates to approximate the probability of choosing the best girl for each number of discarded girls. He found that this probability is highest when he discards the first 19 girls, as the graph produced by the code below shows.
Why 19? This is one of those unexpected places where the number e turns up: the best number of discards is approximately 52/e ≈ 19.1. You can see an explanation of the phenomenon here.
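For readers who want the short version of that explanation, here is a sketch of the usual argument. The symbols are my own notation, not Charlie's: n is the number of dates, r the number of discarded girls and P(r) the probability of ending up with the best one. If the best girl turns up on date k (which happens with probability 1/n), Charlie chooses her only when the best of the previous k-1 girls was among the r discarded ones.

```latex
\begin{align*}
P(r) &= \sum_{k=r+1}^{n} \frac{1}{n}\cdot\frac{r}{k-1}
      = \frac{r}{n}\sum_{k=r+1}^{n}\frac{1}{k-1},
      \qquad 1 \le r < n \\
     &\approx \frac{r}{n}\int_{r}^{n}\frac{dt}{t}
      = -x\ln x, \qquad x = \frac{r}{n} \\
\frac{d}{dx}\left(-x\ln x\right) &= -\ln x - 1 = 0
      \;\Longrightarrow\; x = \frac{1}{e}, \qquad P \approx \frac{1}{e} \approx 0.368 \\
n = 52 &: \quad r \approx \frac{52}{e} \approx 19.13
\end{align*}
```

The integral is only an approximation to the sum, so for small n the true optimum can differ slightly from n/e. For 52 dates it rounds to the 19 that Charlie found.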
require(extrafont)  # registers the "xkcd" font family used in the theme (the font must be installed)
require(ggplot2)

n = 52       # number of dates
sims = 1000  # simulated years for each number of discards

results = data.frame(discards = numeric(0), triumphs = numeric(0))
for (i in 0:n) {
  triumphs = 0
  for (j in 1:sims) {
    opt = sample(1:n, n, replace = FALSE)  # random order of scores; n is the best girl
    if (i == 0) {
      # No discards: Charlie declares his love on the very first date
      triumphs = triumphs + (opt[1] == n)
    } else if (i < n && max(opt[1:i]) != n) {
      # The best girl is still to come: he picks the first one who beats all discarded girls
      rest = opt[(i + 1):n]
      triumphs = triumphs + (rest[min(which(rest > max(opt[1:i])))] == n)
    }
    # Otherwise the best girl was discarded (or nobody is left) and he fails
  }
  results = rbind(results, data.frame(discards = i, triumphs = triumphs / sims))
}

# Note: ggplot2 3.4.0 and later prefer linewidth over size for lines
opts = theme(
  panel.background = element_rect(fill = "darkolivegreen1"),
  panel.border = element_rect(colour = "black", fill = NA),
  axis.line = element_line(size = 0.5, colour = "black"),
  axis.ticks = element_line(colour = "black"),
  panel.grid.major = element_line(colour = "white", linetype = 1),
  panel.grid.minor = element_blank(),
  axis.text.y = element_text(colour = "black", size = 20),
  axis.text.x = element_text(colour = "black", size = 20),
  text = element_text(size = 25, family = "xkcd"),
  legend.key = element_blank(),
  legend.background = element_blank(),
  plot.title = element_text(size = 40))

ggplot(results, aes(discards, triumphs)) +
  geom_vline(xintercept = n / exp(1), size = 1, linetype = 2, colour = "black", alpha = 0.8) +
  geom_line(color = "green4", size = 1.5) +
  geom_point(color = "gray92", size = 8, pch = 16) +
  geom_point(color = "green4", size = 6, pch = 16) +
  ggtitle("How e can help you to find the love of your life") +
  xlab("Discards") + ylab("Prob. of finding the love of your life") +
  scale_x_continuous(breaks = seq(0, n, by = 2)) + opts
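With only 1000 simulated years per point the curve is a bit noisy, so it can be reassuring to compare it with exact values. The sketch below is not part of Charlie's script: the helper name exact_prob is hypothetical. It relies on the standard argument that, if the best girl shows up on date k, Charlie picks her exactly when the best of the previous k-1 girls was among the r discarded ones, which gives a success probability of (r/n) * (1/r + 1/(r+1) + ... + 1/(n-1)).

```r
# Exact probability of ending up with the best of n girls
# when the first r are discarded (hypothetical helper, for illustration only).
exact_prob <- function(r, n) {
  if (r == 0) return(1 / n)  # no discards: Charlie takes the first girl
  if (r >= n) return(0)      # everyone discarded: nobody left to choose
  (r / n) * sum(1 / (r:(n - 1)))
}

n <- 52
discards <- 0:n
probs <- vapply(discards, exact_prob, numeric(1), n = n)

best <- discards[which.max(probs)]
best        # optimal number of discards
max(probs)  # success probability at the optimum
n / exp(1)  # the n/e approximation
```

For n = 52 the optimum is 19 discards, with a success probability of about 0.374, close to 1/e. Adding these exact values to the plot as a second line would show how far the simulated points wander from them.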