In memory of Monty Hall

mrajter

3 years ago

[This article was first published on There's something about R, and kindly contributed to R-bloggers].


Some find it common knowledge; others find it weird. As a professor, I teach the Monty Hall problem regularly, and year after year I see puzzled looks from students when they hear the solution.

Image taken from http://media.graytvinc.com/images/690*388/mon+tyhall.jpg

The original and simplest scenario of the Monty Hall problem is this: you are in a prize contest, and in front of you there are three doors (A, B and C). Behind one of the doors is a prize (a car), while behind each of the other two is a loss (a goat). You first choose a door (let’s say door A). The contest host, who knows where the car is, then opens another door that hides a goat (let’s say door B) and asks whether you want to stay with your original choice or switch to the remaining door. The question is: which strategy is better?

Image taken from https://rohanurich.files.wordpress.com/2013/03/mhp-agc2.png

The answer hinges on the difference between independent and dependent events. The most common answer is that the strategy doesn’t matter because the chance is 50/50, but that is wrong. The 50/50 assumption treats the first choice (one of three doors) and the second choice (stay or switch) as independent events, like flipping a coin twice. In reality they are dependent: which door the host can open, and therefore what the second choice means, depends on the first choice.

At the first step, when you choose one of three doors, the probability that you picked the right door is 33.33%; in other words, there is a 66.67% chance that you are on a wrong door. Being offered a choice between your door and the one remaining door in the second step doesn’t change the fact that you most likely started on a wrong door. Because the host always opens a goat door, switching wins whenever your first pick was wrong, that is, in about two out of three games. Therefore, it is better to switch doors in the second step.
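
The one-third versus two-thirds split can also be checked by listing every equally likely case rather than reasoning about it. The following is a minimal sketch, not part of the original simulation; the names `cases`, `p_stay` and `p_switch` are made up for illustration, and it assumes the host always opens a goat door.

```r
# Enumerate every equally likely (car, first pick) pair: 3 x 3 = 9 cases
cases <- expand.grid(car = 1:3, pick = 1:3)

# The host always opens a goat door, so staying wins only when the first
# pick was the car, and switching wins in every other case
cases$stay_wins <- cases$car == cases$pick
cases$switch_wins <- !cases$stay_wins

p_stay <- mean(cases$stay_wins)      # share of cases won by staying
p_switch <- mean(cases$switch_wins)  # share of cases won by switching
print(c(stay = p_stay, switch = p_switch))
```

Staying wins in 3 of the 9 cases and switching in the other 6, which matches the argument above.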

Simulation using R

To explore this a bit further, and as a nice exercise in R, let’s build a small simulation of the game.

First we load the necessary packages

library(ggplot2)
library(scales)

Then we create the possible door combinations

#create door combinations
 a<-c(123,132,213,231,312,321)

What I did here was generate three-digit numbers. The first digit always says which door hides the car, and the other two digits say where the goats are.
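
As a quick check of this encoding, the digits of one combination can be pulled apart by hand. This is only a sketch: the value 231 is an arbitrary example, and it uses the modulo operator `%%` where the simulation itself uses subtraction, which gives the same digits.

```r
# Decode one combination by hand; 231 is just an example value
doors <- 231

car <- doors %/% 100            # first digit: door hiding the car
goat1 <- (doors %% 100) %/% 10  # second digit: first goat door
goat2 <- doors %% 10            # third digit: second goat door

print(c(car = car, goat1 = goat1, goat2 = goat2))
```

For 231 this reads as: the car is behind door 2 and the goats are behind doors 3 and 1.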

Now let’s prepare the vectors for the simulation

#create results vectors
 car=integer(length=100000)
 goat1=integer(length=100000)
 goat2=integer(length=100000)
 initial_choice=integer(length=100000)
 open_door=integer(length=100000)
 who_wins=character(length=100000)

Now we are ready for the simulation

#create 100,000 games
for (i in 1:100000){

  #set up a situation

  doors<-sample(a,1) #randomly pick a door combination
  car[i]<-doors %/% 100 #the first number is which door is the right door
  goat1[i]<-(doors-car[i]*100)%/%10 #where is the first wrong door
  goat2[i]<-doors-car[i]*100-goat1[i]*10 #where is the second wrong door

  #have a person select a random door
  initial_choice[i]<-sample(c(1,2,3),1)
  

  #now we open the wrong door
  if (initial_choice[i]==car[i]){
    #if the person is initially on the right door we randomly select one of the two wrong doors
    open_door[i]<-sample(c(goat1[i],goat2[i]),1)
  } else if (initial_choice[i]==goat1[i]) {
    #if the person is initially on the wrong door, we open the other wrong door
    open_door[i]<-goat2[i]
  } else {
    open_door[i]<-goat1[i]
  }

  #stayer remains by his initial choice and switcher changes his choice
  if (initial_choice[i]==car[i]){who_wins[i]="Stayer"} else {who_wins[i]="Switcher"}


}
monty_hall=data.frame(car, goat1,goat2,initial_choice,open_door,who_wins)

We now have the results of 100,000 simulated games. To put the most important result into a chart, we use ggplot2:

ggplot(data=monty_hall, aes(who_wins, fill=who_wins)) + 
  geom_bar(aes(y = after_stat(count / sum(count)))) + #share of games won; after_stat() replaces the deprecated ..count.. (ggplot2 >= 3.3.0)
  ylim(0,1)+
  ylab("Ratio")+
  xlab("Who wins?")+
  theme(legend.position = "none")
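
The chart can also be cross-checked numerically. The sketch below is a compact, vectorised variant of the game rather than the loop used above; the names `n_games`, `car_door`, `first_pick`, `winner` and `win_share` are illustrative, and the seed is arbitrary. It relies on the same rule as the loop: the stayer wins when the first pick is the car, and the switcher wins otherwise.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
n_games <- 100000

car_door <- sample(1:3, n_games, replace = TRUE)    # where the car is
first_pick <- sample(1:3, n_games, replace = TRUE)  # contestant's first choice

# Staying wins when the first pick hits the car; switching wins otherwise,
# because the host always removes the remaining goat door
winner <- ifelse(first_pick == car_door, "Stayer", "Switcher")

# Share of games won by each strategy
win_share <- prop.table(table(winner))
print(round(win_share, 3))
```

The opened door is not modelled explicitly here because it does not affect who wins, only which door the switcher ends up on.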


The switcher should win in roughly two thirds of the games, so it is definitely better to switch doors!

For further reading, see https://en.wikipedia.org/wiki/Monty_Hall_problem

Happy coding
