Simulating a Queue in R
[This article was first published on Taking the Pith Out of Performance, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In the GCaP class earlier this month, we talked about the meaning of the load average (in Unix and Linux) and simulating a grocery store checkout lane, but I didn’t actually do it. So, I decided to take a shot at constructing a discrete-event simulation (as opposed to Monte Carlo simulation) of a simple M/M/1 queue in R. Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
We can make use of a lot of conveniences in R to accomplish such a simulation. For example, we don’t have to worry about random number generation, we can simply use the rexp() function for an M/M/1 queue. It may not be the fastest code on the planet but it is guaranteed to be reliable. We also have the ease of integrating PDQ (Pretty Damn Quick) for analytic comparison, as well as the nice statistical analysis and plotting capabilities available in R.
Simulation Variables
As usual, we start with a list of the necessary variables for the simulation and its instrumentation.
t.end <- 10^5 # duration of sim t.clock <- 0 # sim time Ta <- 1.3333 # interarrival period Ts <- 1.0000 # service period t1 <- 0 # time for next arrival t2 <- t.end # time for next departure tn <- t.clock # tmp var for last event time tb <- 0 # tmp var for last busy-time start n <- 0 # number in system s <- 0 # cumulative number-time product b <- 0 # total busy time c <- 0 # total completions qc <- 0 # plot instantaneous q size tc <- 0 # plot time delta plotSamples <- 100 set.seed(1)
Next, we need to write the R code to perform the actual M/M/1 simulation of arrivals into and departures from the queue.
Simulation Loop
This code meant to be pedagogic so, I haven't bothered to do anything spiffy like pre-allocating the Exp variates, for example. I based it on the example in Mac MacDougall's book Simulating Computer Systems (an oldie but a goodie), rather than the example in the more recent Introduction to Scientific Programming and Simulation Using R book, because I think there's a bug in their R code, but I didn't spend any time trying to find it. Also, that code is not instrumented.
while (t.clock < t.end) { if (t1 < t2) { # arrival event t.clock <- t1 s <- s + n * (t.clock - tn) # delta time-weighted number in queue n <- n + 1 if (t.clock < plotSamples) { qc <- append(qc,n) tc <- append(tc,t.clock) } tn <- t.clock t1 <- t.clock + rexp(1, 1/Ta) if(n == 1) { tb <- t.clock t2 <- t.clock + rexp(1, 1/Ts) # exponential interarrival period } } else { # departure event t.clock <- t2 s <- s + n * (t.clock - tn) # delta time-weighted number in queue n <- n - 1 if (t.clock < plotSamples) { qc <- append(qc,n) tc <- append(tc,t.clock) } tn <- t.clock c <- c + 1 if (n > 0) { t2 <- t.clock + rexp(1, 1/Ts) # exponential service period } else { t2 <- t.end b <- b + t.clock - tb } } }
So, now we have the simulation workhorse in place.
Instrumented Metrics
Here, we collect the instrumentation data to form some well-known performance metrics. They correspond to the definitions given in class.
u <- b/t.clock # utilization B/T N <- s/t.clock # mean queue length (see the Load Average notes) x <- c/t.clock # mean throughput C/T r <- N/x # mean residence time (from Little's law: Q = XR) q <- sum(qc)/max(tc) # estimated queue length for plot
Queue Length
This is a plot of instantaneous queue length à la load average data. This is what queueing fluctuations look like. As I point out in class, they're responsible for the usually complicated math seen in queueing-theory textbooks that can make your head hurt.
PDQ Model
For analytic comparison, we also include the corresponding PDQ-R model in the same script using the online manual for reference.
Init("") CreateOpen("w",1/Ta) # arrivals into queue CreateNode("n",CEN,FCFS) # the M/M/1 queue SetDemand("n","w",Ts) # service time Solve(CANON) # Collect individual performance metrics R <- GetResidenceTime("n","w",TRANS) Q <- GetQueueLength("n","w",TRANS) U <- GetUtilization("n","w",TRANS) X <- GetThruput(TRANS,"w")
Yes, these few lines are equivalent to the above simulation code with instrumentation, and it's guaranteed to be in steady state. Running PDQ, even in R, is essentially instantaneous. The simulation will take longer, but given the plethora of MIPS/core available today, especially on laptops, running simulations in R is entirely feasible.
Results
Finally, we can compare the simulated M/M/1 queue with the corresponding PDQ results. As usual, it's best to break them into inputs and outputs.
- Inputs:
Tsim: 1.00e+05 Ta: 1.3333, Ts: 1.0000 # times Ar: 0.7500, Sr: 1.0000 # rates
- Outputs:
Usim: 0.7477, Updq: 0.75 Xsim: 0.7495, Xpdq: 0.75 Rsim: 4.0316, Rpdq: 4.00 Qsim: 3.0219, Qpdq: 3.00
Within the expected limits of precision, we can conclude that the simulation reached steady state during the specified 105 time-steps.
No doubt, I'll go into more detail about doing simulations in R during the upcoming GDAT class in August.
To leave a comment for the author, please follow the link and comment on their blog: Taking the Pith Out of Performance.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.