[This article was first published on Taking the Pith Out of Performance, and kindly contributed to R-bloggers].
I was planning to blog about the amazing time I had at Velocity 2009 last week, when this landed in my mailbox (edited for space and privacy):
Subject: Seeking help with PDQ-R ...
Date: Thu, 25 Jun 2009 15:51:21 -0500
My name is James and I've been trying to learn to properly use PDQ after reading two of your books, "Guerrilla Capacity Planning" and "Analyzing Computer System Performance with Perl::PDQ." I'm still getting a grip on PDQ-R. ... I decided to set about re-creating the queue circuit in the study with PDQ-R as an exercise. ...
The output of my code yields:
[1] "Manual response time for class 1 is 0.864179 seconds"
[1] "PDQ-R response time for class 1 is 0.313637 seconds"
[1] "Manual response time for class 2 is 6.105397 seconds"
[1] "PDQ-R response time for class 2 is 3.552873 seconds"
[1] "Manual response time for class 3 is 4.535833 seconds"
[1] "PDQ-R response time for class 3 is 4.535833 seconds"
If you could give my code a look over and give me some hints I would really appreciate it.
It turns out that James N. had discovered a bug (gasp!) in PDQ, which is why we have users. (jk) The above output refers to a simple model of a database system comprising 3 resources (call them: cpu, disk1 and disk2) and 3 transaction streams (work1, work2, work3) and no limit on the queue lengths, i.e., an open queueing network or circuit. Here’s what my rendition looks like:
# PDQ-R model
library(pdq)

# Request rates of the 3 transaction streams into the DBMS
Xsys <- c(50/150, 80/150, 70/150)

# Service demands at each resource
Dcpu <- c(0.096, 0.615, 0.193)
Ddk1 <- c(0.088, 0.683, 0.763)
Ddk2 <- c(0.119, 0.795, 0.400)

# Start PDQ code with Init call
Init("James' DB Model")

# Define the 3 transaction workloads
workname <- 1:3
for (w in 1:3) {
    workname[w] <- sprintf("work%d", w)
    CreateOpen(workname[w], Xsys[w])
}

# Define the 3 resources
CreateNode("cpu", CEN, FCFS)
CreateNode("dk1", CEN, FCFS)
CreateNode("dk2", CEN, FCFS)

for (w in 1:3) {
    SetDemand("cpu", workname[w], Dcpu[w])
    SetDemand("dk1", workname[w], Ddk1[w])
    SetDemand("dk2", workname[w], Ddk2[w])
}

Solve(CANON)
Report()
To hunt down the problem, I rewrote the PDQ-R model in C, just in case there were any translation problems with SWIG-ing PDQ/lib into PDQ-R, Perl PDQ, PyDQ, etc.
/*
 * multiclass-open.c
 *
 * Created by NJG on Thursday, June 25, 2009
 * Updated by NJG on Sunday, June 28, 2009
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include "PDQ_Lib.h"

int main(void)
{
    extern char s1[];        // scratch string buffer from the PDQ library
    char *p;                 // dummy pointer for names
    char *devname[3];
    char *workname[3];
    int i, j, k, n, s, w;
    double actualtR[4][3];

    // Expected RT values: rows 0-2 are residence times at cpu, dk1, dk2;
    // row 3 is the system response time per workload
    double expectR[4][3] = {
        {0.174, 1.118, 0.351},
        {0.351, 2.734, 3.054},
        {0.340, 2.270, 1.142},
        {0.865, 6.122, 4.546}
    };

    // Request rates of the 3 transaction streams into the DBMS
    double Xsys[] = {50.0/150, 80.0/150, 70.0/150};

    // Service demands
    double Dcpu[] = {0.096, 0.615, 0.193};
    double Ddk1[] = {0.088, 0.683, 0.763};
    double Ddk2[] = {0.119, 0.795, 0.400};

    // Name the workloads
    for (w = 0; w < 3; w++) {
        resets(s1);
        sprintf(s1, "work%d", w + 1);
        // +1 leaves room for the terminating NUL
        if ((p = (char *) malloc(strlen(s1) + 1)) != NULL) {
            strcpy(p, s1);   // copy into assigned storage
            workname[w] = p;
        } else {
            printf("malloc failed!\n");
            exit(-1);
        }
    }
    // The names are used again below, so they are not freed here

    // Name the resources
    for (k = 0; k < 3; k++) {
        resets(s1);
        if (k == 0) sprintf(s1, "%s", "cpu");
        if (k == 1) sprintf(s1, "%s", "dk1");
        if (k == 2) sprintf(s1, "%s", "dk2");
        if ((p = (char *) malloc(strlen(s1) + 1)) != NULL) {
            strcpy(p, s1);   // copy into assigned storage
            devname[k] = p;
        } else {
            printf("malloc failed!\n");
            exit(-1);
        }
    }

    /************************** Start PDQ code **********************/
    PDQ_Init("Multiclass Test Model");

    // Create workloads
    for (w = 0; w < 3; w++) {
        s = PDQ_CreateOpen(workname[w], Xsys[w]);
    }

    // Create resources
    n = PDQ_CreateNode("cpu", CEN, FCFS);
    n = PDQ_CreateNode("dk1", CEN, FCFS);
    n = PDQ_CreateNode("dk2", CEN, FCFS);

    // Assign demands
    for (w = 0; w < 3; w++) {
        PDQ_SetDemand("cpu", workname[w], Dcpu[w]);
        PDQ_SetDemand("dk1", workname[w], Ddk1[w]);
        PDQ_SetDemand("dk2", workname[w], Ddk2[w]);
    }

    PDQ_Solve(CANON);

    printf("Expected Response Times\n");
    for (i = 0; i < 4; i++) {
        for (j = 0; j < 3; j++) {
            printf("%4.3f\t", expectR[i][j]);
        }
        printf("\n");
    }
    printf("--------------------------\n");

    printf("Actual Response Times\n");
    for (i = 0; i < 4; i++) {
        // System response times for QNM
        if (i == 3) {
            for (w = 0; w < 3; w++) {
                printf("%4.3f\t", actualtR[i][w] = PDQ_GetResponse(TRANS, workname[w]));
            }
        }
        // Residence times per resource
        if (i < 3) {
            for (w = 0; w < 3; w++) {
                printf("%4.3f\t", actualtR[i][w] =
                    PDQ_GetResidenceTime(devname[i], workname[w], TRANS));
            }
        }
        printf("\n");
    }

    // Release the name strings now that PDQ is done with them
    for (w = 0; w < 3; w++) free(workname[w]);
    for (k = 0; k < 3; k++) free(devname[k]);

    return 0;
} // main
This code also compares actual (meaning, computed by PDQ) with expected values (embedded as 2-d array) of residence times due to each workload at each resource. The "expected" values can come from any one of a number of sources such as: measurements, other models, other tools, etc. This forms the basis of the test code approach.
The problem seen by James turns out to arise from a conflict between the way resource utilizations are computed for the new multi-server queues (released in PDQ 5.0.1) and multi-class workloads. With the PDQ library corrected, the computed values agree with the expected ones, as this output shows:
[njg]~/PDQ/Test Suite/C-PDQ% ./mulclass-open
Expected Response Times
0.174   1.118   0.351
0.351   2.734   3.054
0.340   2.270   1.142
0.865   6.122   4.546   <--
--------------------------
Actual Response Times
0.175   1.118   0.351
0.352   2.728   3.048
0.340   2.274   1.144
0.866   6.120   4.543   <--
The last line in each of the above tables corresponds to the "manual" values that James was reporting in his email.
The PDQ test cases had not kept up with new code developments; one of the hazards of only having severely punctuated time to work on PDQ. The new release PDQ 5.0.2 should be available for download later this week. I'll send out an email notice at that time.