[This article was first published on Wiekvoet, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Two short items in this blog post. First, since it was not obvious how to run odfWeave() in my particular setup, the call I am using. Second, several people have been cross-tabulating logical vectors, so I wanted to play along; my version ends up more than 80 times faster than table().
odfWeave
My particular setup consists of R, 7-Zip and LibreOffice. Somehow they don't play along 100% when using odfWeave. I had that problem this spring and decided to put my solution in a post at some point. As for versions: the call worked with my previous versions, and I tested that it still runs with my new setup (R 3.1.1, LibreOffice 4.2.5.2). The only loose end is that odfWeave complains that I am re-using a directory, so I need to empty that directory manually.
# the standard example call that works for me
library(odfWeave)
demoFile <- system.file("examples", "simple.odt", package = "odfWeave")
outputFile <- gsub("simple.odt", "output.odt", demoFile)
# zipCmd: the first command zips, the second unzips; $$file$$ is filled in by odfWeave
odfWeave(demoFile, outputFile,
    workDir = 'C:\\Users\\Kees\\Documents\\tmp',
    odfWeaveControl(zipCmd =
        c("C:\\Progra~1\\7-Zip\\7z a -tzip $$file$$ . -r",
          "C:\\Progra~1\\7-Zip\\7z x -tzip $$file$$ -yr")))
# removing files
file.remove(dir('C:\\Users\\Kees\\Documents\\tmp',
    recursive = TRUE,
    full.names = TRUE))
# using a different directory
odfWeave('C:\\Users\\Kees\\Documents\\test\\testcases.odt',
    'C:\\Users\\Kees\\Documents\\test\\testout.odt',
    workDir = 'C:\\Users\\Kees\\Documents\\tmp',
    odfWeaveControl(zipCmd =
        c("C:\\Progra~1\\7-Zip\\7z a -tzip $$file$$ . -r",
          "C:\\Progra~1\\7-Zip\\7z x -tzip $$file$$ -yr")))
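The manual clean-up of the working directory could also be wrapped in a small helper. The following is a minimal sketch and not part of odfWeave: empty_dir() is a hypothetical name, and it is demonstrated on a throwaway folder under tempdir() rather than on the real workDir. Unlike file.remove() on a recursive file listing, unlink(..., recursive = TRUE) also removes leftover subdirectories.

```r
# Hypothetical helper (not part of odfWeave): remove everything inside a
# directory, including subdirectories, but keep the directory itself.
empty_dir <- function(path) {
  entries <- list.files(path, all.files = TRUE, full.names = TRUE, no.. = TRUE)
  unlink(entries, recursive = TRUE)
  invisible(path)
}

# Demonstration on a throwaway directory (an assumption for illustration)
work <- file.path(tempdir(), "odf_tmp_demo")
dir.create(file.path(work, "Pictures"), recursive = TRUE, showWarnings = FALSE)
writeLines("x", file.path(work, "content.xml"))
writeLines("y", file.path(work, "Pictures", "plot.txt"))

empty_dir(work)
remaining <- list.files(work, all.files = TRUE, no.. = TRUE)
```

Calling empty_dir() on the workDir before each odfWeave() run should leave an empty but existing directory.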
Cross table of logical vectors
This was started in Sometimes Table is not the Answer – a Faster 2×2 Table and carried on with Sometimes I feel (some) need for speed, so I wanted to add my own attempts. The aim is to make a cross table of two logical vectors in as little time as possible, which becomes important when the vectors are long. First the test data and the solutions from the previous posts.
set.seed(2014)
manual = sample(c(TRUE, FALSE), 10e6, replace = TRUE)
auto = sample(c(TRUE, FALSE), 10e6, replace = TRUE)
# count each of the four cells directly
logical.tab = function(x, y) {
    tt = sum(x & y)
    tf = sum(x & !y)
    ft = sum(!x & y)
    ff = sum(!x & !y)
    return(matrix(c(ff, tf, ft, tt), 2, 2))
}
# use the difference x - y; note this returns a vector in the order tf, ft, tt, ff
basic.tab2 = function(x, y) {
    dif = x - y
    tf = sum(dif > 0)
    ft = sum(dif < 0)
    tt = sum(x*y)
    ff = length(dif) - tt - tf - ft
    return(c(tf, ft, tt, ff))
}
tabulate(manual + auto *2+1, 4)
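To see why the tabulate() one-liner works: each pair of values is encoded as a single integer from 1 to 4, and tabulate() counts how often each code occurs. A small sketch with made-up five-element vectors (x, y, codes and counts are illustrative names, not from the benchmark):

```r
x <- c(TRUE, TRUE, FALSE, FALSE, TRUE)
y <- c(TRUE, FALSE, TRUE, FALSE, TRUE)

# Encoding: (F,F) -> 1, (T,F) -> 2, (F,T) -> 3, (T,T) -> 4
codes <- x + y * 2 + 1
codes
# 4 2 3 1 4

counts <- tabulate(codes, 4)
counts
# 1 1 1 2

# Reshaped into the same layout as logical.tab():
# rows are x = FALSE/TRUE, columns are y = FALSE/TRUE
tab <- matrix(counts, 2, 2,
              dimnames = list(x = c("FALSE", "TRUE"), y = c("FALSE", "TRUE")))
```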
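My idea was to start from the margins and work back from there: count the TRUEs in each vector and the TRUE/TRUE cell, then derive the other three cells by subtraction.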
my.tab = function(x, y) {
    tt = sum(x * y)   # both TRUE
    t1 = sum(x)       # margin: TRUEs in x
    t2 = sum(y)       # margin: TRUEs in y
    return(matrix(c(length(x)-t1-t2+tt, t1-tt, t2-tt, tt), 2, 2))
}
# same idea, but all three sums in one colSums() call
my.tab2 <- function(x, y) {
    phase1 <- colSums(cbind(x, y, x*y))
    return(matrix(c(length(x)-sum(phase1[-3])+phase1[3],
                    phase1[-3]-phase1[3],
                    phase1[3]), 2, 2))
}
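The bookkeeping behind my.tab() is inclusion-exclusion: with n pairs, t1 TRUEs in x, t2 TRUEs in y and tt pairs where both are TRUE, the FALSE/FALSE cell is n - t1 - t2 + tt. As a sanity check, here is a small sketch comparing it with table() on short random vectors. The names n, x, y, reference and agrees are only for illustration, and the check assumes there are no NA values.

```r
my.tab <- function(x, y) {
  tt <- sum(x * y)
  t1 <- sum(x)
  t2 <- sum(y)
  matrix(c(length(x) - t1 - t2 + tt, t1 - tt, t2 - tt, tt), 2, 2)
}

set.seed(1)
n <- 1000
x <- sample(c(TRUE, FALSE), n, replace = TRUE)
y <- sample(c(TRUE, FALSE), n, replace = TRUE)

# factor() with explicit levels keeps the 2x2 shape even if a level is absent
reference <- table(factor(x, levels = c(FALSE, TRUE)),
                   factor(y, levels = c(FALSE, TRUE)))

# Both are filled column-wise in the order ff, tf, ft, tt
agrees <- all(as.vector(my.tab(x, y)) == as.vector(reference))
agrees
```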
On my hardware table() is too slow to run through many microbenchmark repetitions, hence only 20 runs each. Still, my.tab() coming out more than 80 times faster than table() is not bad.
library(microbenchmark)
microbenchmark(
    logical.tab(manual, auto),
    basic.tab2(manual, auto),
    my.tab(manual, auto),
    my.tab2(manual, auto),
    tabulate(manual + auto * 2 + 1, 4),
    table(manual, auto),
    times = 20)
Unit: milliseconds
                              expr        min         lq     median         uq        max neval
         logical.tab(manual, auto)  2852.5587  2888.8590  2906.4571  2972.3916  3227.0821    20
          basic.tab2(manual, auto)   705.8153   722.5800   746.1683   765.9400   957.5435    20
              my.tab(manual, auto)   185.8359   186.6829   188.0988   224.2308   413.5623    20
             my.tab2(manual, auto)   463.2731   481.8843   487.7825   512.2563   694.1729    20
tabulate(manual + auto * 2 + 1, 4)   276.1837   300.8009   315.9451   379.7302   534.7997    20
               table(manual, auto) 15703.0576 16132.0100 16231.3342 16466.7445 19012.0273    20
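To put the medians above in relative terms, the arithmetic can be done directly in R. This sketch simply copies the median column from the output above and divides by the fastest entry; the names med and ratio are mine.

```r
# Median timings in milliseconds, copied from the benchmark output
med <- c(logical.tab = 2906.4571,
         basic.tab2  = 746.1683,
         my.tab      = 188.0988,
         my.tab2     = 487.7825,
         tabulate    = 315.9451,
         table       = 16231.3342)

# Slow-down factor of each approach relative to the fastest median
ratio <- round(med / min(med), 1)
ratio
```

On these numbers table() comes out roughly 86 times slower than my.tab(), and the tabulate() one-liner is the closest competitor.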
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.