Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
A couple of days ago, we had a quick chat on Karl Broman‘s blog, about snakes and ladders (see http://kbroman.wordpress.com/…) with Karl and Corey (see http://bayesianbiologist.com/….), and the use of Markov Chain. I do believe that this application is truly awesome: the example is understandable by anyone, and computations (almost any kind, from what we’ve tried) are easy to perform. At the same time, some French students asked me specific details regarding some old lectures notes on Markov chains, and on some introductory example I used as a possible motivation: the stepping stone algorithm. In the notes, I just mentioned the idea of this popular generic algorithm (introduced in Sawyer (1976)) and I use simulations to show – visually – how it works. Again, it was just to motivate the course which actually did focus on the theory of Markov Chains. But those student wanted more, like how did I get the transition matrix, for instance. And that is actually not a simple question, from a computational perspective. I mean, I can easily generate this Markov Chain, but writing explicitly the transition, that was another story. Which took me a bit longer. In a very specific case…
But let us get back to the roots, and to the stepping stone algorithm. At least, one of them (the one I used in my notes) because it looks like there are several algorithm. We do consider a grid, say
First of all, we need to define the state space. Basically, we do have
So let’s try… The good thing is that it can be related to work I’ve been doing recently on binomial recombining trees (binomial being related to bi-color). First of all, our grid will be describes as follows
> h=3 > M=matrix(1:(h^2),h,h) > M [,1] [,2] [,3] [1,] 1 4 7 [2,] 2 5 8 [3,] 3 6 9
with two colors
> color=c("red","blue")
Then, we should look for neighbors, or derive an neighborhood matrix,
> d=function(i,j) dist(rbind(c((i-1)%/%h,(i-1)%%h), + c((j-1)%/%h,(j-1)%%h))) > Neighb=matrix(Vectorize(d)(rep(1:(h^2),each=h^2), + rep(1:(h^2),h^2)),h^2,h^2) > trunc(Neighb*100)/100 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [1,] 0.00 1.00 2.00 1.00 1.41 2.23 2.00 2.23 2.82 [2,] 1.00 0.00 1.00 1.41 1.00 1.41 2.23 2.00 2.23 [3,] 2.00 1.00 0.00 2.23 1.41 1.00 2.82 2.23 2.00 [4,] 1.00 1.41 2.23 0.00 1.00 2.00 1.00 1.41 2.23 [5,] 1.41 1.00 1.41 1.00 0.00 1.00 1.41 1.00 1.41 [6,] 2.23 1.41 1.00 2.00 1.00 0.00 2.23 1.41 1.00 [7,] 2.00 2.23 2.82 1.00 1.41 2.23 0.00 1.00 2.00 [8,] 2.23 2.00 2.23 1.41 1.00 1.41 1.00 0.00 1.00 [9,] 2.82 2.23 2.00 2.23 1.41 1.00 2.00 1.00 0.00 > Neighb=(Neighb<2)&(Neighb>0) > Neighb [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [1,] FALSE TRUE FALSE TRUE TRUE FALSE FALSE FALSE FALSE [2,] TRUE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE [3,] FALSE TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE [4,] TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE FALSE [5,] TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE [6,] FALSE TRUE TRUE FALSE TRUE FALSE FALSE TRUE TRUE [7,] FALSE FALSE FALSE TRUE TRUE FALSE FALSE TRUE FALSE [8,] FALSE FALSE FALSE TRUE TRUE TRUE TRUE FALSE TRUE [9,] FALSE FALSE FALSE FALSE TRUE TRUE FALSE TRUE FALSE
Now, let us explicit our 512 possible states.
> n=h^2 > states=function(x){ + Base.b=rep(0,n) + ndigits=(floor(logb(x,base=length(color)))+1) + for(i in 1:ndigits){ + Base.b[n-i+1]=(x%%length(color)) + x=(x %/% length(color))} + return(Base.b)} > M=Vectorize(states)(1:(length(color)^n-1)) > liststates=data.frame(rbind(rep(0,h^2),t(M))) > head(liststates) X1 X2 X3 X4 X5 X6 X7 X8 X9 1 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 1 3 0 0 0 0 0 0 0 1 0 4 0 0 0 0 0 0 0 1 1 5 0 0 0 0 0 0 1 0 0 6 0 0 0 0 0 0 1 0 1
(for the first six, with 0/1 digits instead of colors). For instance, if we look at a specific one, it is possible to plot the grid, using
> plotsteps=function(u){ + plot(0:h,0:h,col="white",xlab="",ylab="",axes=FALSE) + for(i in 0:(h^2-1)){ + x=i%/%h + y=i%%h + polygon(x+c(1,.1,.1,1),y+c(1,1,.1,.1), + col=color[as.numeric(u)[i+1] + 1]) + text(x+.45,y+.45,i) + }}
> plotsteps(liststates[100,])
Then, given one state, let us see what could happen next,
- let us compute all connected states: all states where we can end up in if we change one cell
- we have to check, for each connect state which cell did change
- we should compute probabilities to reach those 9 states, based on the fact that each of the cell is chosen with the same probability, and the fact that probability to change the color is based on the colors around.
- if some states cannot be reached (if a cell is surrounded by elements of the same color, so it cannot change its color), then, we should remove then from the list of reachable (possible) states.
The code will be something like the following
> listneighbour=function(i){ + start=liststates[i,] + difference2only=function(j) { + w=which(liststates[j,]!=liststates[i,]) + return((length(w)==1))} + possible=which( Vectorize(difference2only)(1:nrow(liststates))==TRUE ) + P=function(j){ + L=liststates[i,which(Neighb[which(liststates[j,]!=liststates[i,]),]==TRUE)] + T=table(as.numeric(L)) + T=T[as.character(0:(length(color)-1))] + T[is.na(T)]=0 + return(as.numeric(T)/sum(T)) + } + probability=Vectorize(P)(possible) + W=NULL + for(j in possible) W=c(W,which(liststates[j,]!=liststates[i,])) + I=1-liststates[i,W]+1 + vp=diag(probability[as.numeric(I),]) + vproba=0*vp + if(sum(vp)!=0) vproba=vp/sum(vp) + return(list( + color=liststates[i,W], + absorb=W, + possible=possible, + probability=probability, + prob=vproba)) + }
> listneighbour(100) $color X3 X4 X8 X9 X7 X6 X5 X2 X1 100 1 1 1 1 0 0 0 0 0 $absorb [1] 3 4 8 9 7 6 5 2 1 $possible [1] 36 68 98 99 104 108 116 228 356 $probability [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [1,] 1 0.8 0.6 0.6667 0.3333 0.4 0.5 0.6 0.6667 [2,] 0 0.2 0.4 0.3333 0.6667 0.6 0.5 0.4 0.3333 $prob [1] 0.17964072 0.14371257 0.10778443 0.11976048 0.11976048 [6] 0.10778443 0.08982036 0.07185629 0.05988024
> liststates[99,] X1 X2 X3 X4 X5 X6 X7 X8 X9 99 0 0 1 1 0 0 0 1 0
If we plot it (here on the right, again), we get
> plotsteps(liststates[99,])
Clearly, here, the cell in the upper corner (number 9) changed from blue to red. Now, about the probability… The probability to select cell 9 is 1/9, and given that cell 9 is chosen, the probability to go from blue to red is 2/3 (the cell is surrounded by 2 red cells, and 1 blue cell). The probability to remain blue is then 1/3. Those are the probabilities computed by our function (the table with two rows, one per color). In order to get a better understanding on the meaning of the last line, with some sort of probabilities), let us look at the following (simpler) example.
> liststates[2,] X1 X2 X3 X4 X5 X6 X7 X8 X9 2 0 0 0 0 0 0 0 0 1
> listneighbour(2) $color X9 X8 X7 X6 X5 X4 X3 X2 X1 2 1 0 0 0 0 0 0 0 0 $absorb [1] 9 8 7 6 5 4 3 2 1 $possible [1] 1 4 6 10 18 34 66 130 258 $probability [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [1,] 1 0.8 1 0.8 0.875 1 1 1 1 [2,] 0 0.2 0 0.2 0.125 0 0 0 0 $prob [1] 0.65573770 0.13114754 0.00000000 0.13114754 0.08196721 [6] 0.00000000 0.00000000 0.00000000 0.00000000
Things are pretty simple here
- if we chose cells
, then nothing change, since all the neighbors have the same color. So if we want to focus on changes (or say run the algorithm until the first color change, then choosing those cells is a waste of time) - if we chose cells
, then it could be possible to change the color. And actually, is different from (since it does have much more neighbors) - if we chose cell
, then definitively, the color will change, since all neighbors have the other color here,
Now, the probability to select cell
while if
And for the other,
while if
and if
Which are exactly the probability computed above. The point is that we compute probabilities given that a color change will actually occur. The good point is that it should faster convergence to some limiting distribution. If any.
What about our transition matrix ? Well, using a simply loop, we should get it easily
> M=matrix(0,nrow(liststates),nrow(liststates)) + for(i in 1:nrow(liststates)){ + L=listneighbour(i) + if(sum(L$prob)!=0){ + j=L$possible + M[i,j]=L$prob + } + if(sum(L$prob)==0){ + j=i + M[i,j]=1 + } + }
One can check that this matrix satisfies some properties of transition matrices. For instance, the sum per row is one,
> sum(apply(M,1,sum)!=1) [1] 0
Remember that this matrix is big, so I will not print if here. But trust me, it works (it might take a while on an old laptop, but anyone can do it). Now, if we want to visualize some paths of that chain, we can use the following algorithm. First, we need a starting point, that can be chosen randomly,
> j=sample(1:nrow(liststates),size=1)
or using a given colored grid, say
> j=100
Then we plot it,
> plotsteps(liststates[j,])
Now, the code within the loop is here
> d=rep(0,nrow(liststates)) > d[j]=1 > d=d%*%M > j=sample(1:nrow(M),size=1,prob=d) > plotsteps(liststates[j,])
Here are some examples. And indeed, we end up either with all cells in blue, or all cells in red.
|
|
|
|
Now, do we have to compute that transition matrix to produce those graph (and to generate that Markov chain) ? No. Of course not… At each step, I use a Dirac measure, and use the transition matrix just to get the probability to generate then the next state. Actually, one can write a faster and more intuitive code to generate the same chain… But I should probably keep that for another post…
Arthur Charpentier
Arthur Charpentier, professor in Montréal, in Actuarial Science. Former professor-assistant at ENSAE Paristech, associate professor at Ecole Polytechnique and assistant professor in Economics at Université de Rennes 1. Graduated from ENSAE, Master in Mathematical Economics (Paris Dauphine), PhD in Mathematics (KU Leuven), and Fellow of the French Institute of Actuaries.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.