Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In my sordid past, I was a data science consultant. One thing about data science that they don’t teach you at school is that senior managers in most large companies require reports to be in PowerPoint. Yet I like to do my more complex data science in R, and PowerPoint and R are not natural allies. As a result, creating and updating PowerPoint reports using R can be painful.
In this post, I discuss how to make R and PowerPoint work efficiently together. The underlying assumption is that R is your computational engine and that you are trying to get outputs into PowerPoint. I compare three tools for creating and updating PowerPoint reports using R: the free ReporteRs package and two commercial products, Displayr and Q.
Option 1: ReporteRs
The first approach to getting R and PowerPoint to work together is to use David Gohel’s ReporteRs. To my mind, this is the most “pure” of the approaches from an R perspective. If you are an experienced R user, this approach works in pretty much the way that you will expect it to work.
The code below creates 250 crosstabs, conducts significance tests, and, if the p-value is less than 0.05, presents a slide containing each. And, yes, I know this is p-hacking, but this post is about how to use PowerPoint and R, not how to do statistics…
# Install the required packages (only needs to be run once).
library(devtools)
devtools::install_github('davidgohel/ReporteRsjars')
devtools::install_github('davidgohel/ReporteRs')
install.packages(c('ReporteRs', 'haven', 'vcd', 'ggplot2', 'reshape2'))

library(ReporteRs)
library(haven)
library(vcd)
library(ggplot2)
library(reshape2)

dat = read_spss("http://wiki.q-researchsoftware.com/images/9/94/GSSforDIYsegmentation.sav")
filename = "c://delete//Significant crosstabs.pptx" # the document to produce
document = pptx(title = "My significant crosstabs!")
alpha = 0.05 # The level at which the statistical testing is to be done.
dependent.variable.names = c("wrkstat", "marital", "sibs", "age", "educ")
all.names = names(dat)[6:55] # 50 variables in the file (columns 6 to 55).
counter = 0
for (nm in all.names)
    for (dp in dependent.variable.names)
    {
        if (nm != dp)
        {
            v1 = dat[[nm]]
            if (is.labelled(v1))
                v1 = as_factor(v1)
            v2 = dat[[dp]]
            l1 = attr(v1, "label")
            l2 = attr(v2, "label")
            if (is.labelled(v2))
                v2 = as_factor(v2)
            # Only performing tests if 10 or fewer rows and columns.
            if (length(unique(v1)) <= 10 && length(unique(v2)) <= 10)
            {
                x = xtabs(~v1 + v2)
                x = x[rowSums(x) > 0, colSums(x) > 0]
                ch = chisq.test(x)
                p = ch$p.value
                if (!is.na(p) && p <= alpha)
                {
                    counter = counter + 1
                    # Creating the outputs.
                    crosstab = prop.table(x, 2) * 100
                    melted = melt(crosstab)
                    melted$position = 100 - as.numeric(apply(crosstab, 2, cumsum) - 0.5 * crosstab)
                    p = ggplot(melted, aes(x = v2, y = value, fill = v1)) + geom_bar(stat = 'identity')
                    p = p + geom_text(data = melted, aes(x = v2, y = position, label = paste0(round(value, 0), "%")), size = 4)
                    p = p + labs(x = l2, y = l1)
                    colnames(crosstab) = paste0(colnames(crosstab), "%")
                    #bar = ggplot() + geom_bar(aes(y = v1, x = v2), data = data.frame(v1, v2), stat="identity")
                    # Writing them to the PowerPoint document.
                    document = addSlide(document, slide.layout = "Title and Content")
                    document = addTitle(document, paste0("Standardized residuals and chart: ", l1, " by ", l2))
                    document = addPlot(doc = document, fun = print, x = p, offx = 3, offy = 1, width = 6, height = 5)
                    document = addFlexTable(doc = document, FlexTable(round(ch$stdres, 1), add.rownames = TRUE), offx = 8, offy = 2, width = 4.5, height = 3)
                }
            }
        }
    }
writeDoc(document, file = filename)
cat(paste0(counter, " tables and charts exported to ", filename, "."))
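If you want to try the filtering logic without installing ReporteRs or downloading the survey file, the following is a minimal sketch of just the "test, then keep only the significant crosstabs" step from the loop above. It uses base R only. The data is synthetic, and the names survey, group, related, constant and significant_crosstab are made up for illustration. The extra check for at least two rows and two columns is my own addition and is not in the code above.

```r
# Synthetic survey-like data; all variable names here are hypothetical.
set.seed(42)
n <- 500
group <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
# Probability of answering "Yes" differs by group, so 'related' is associated with 'group'.
p.yes <- c(A = 0.2, B = 0.5, C = 0.8)[as.character(group)]
survey <- data.frame(
  group = group,
  related = factor(ifelse(runif(n) < p.yes, "Yes", "No")),
  constant = factor(rep("Same", n))  # a single category, so there is nothing to test
)

# Returns the outputs for one crosstab, or NULL if it should not be reported.
significant_crosstab <- function(v1, v2, alpha = 0.05, max.levels = 10) {
  # Only test variables with a manageable number of categories.
  if (length(unique(v1)) > max.levels || length(unique(v2)) > max.levels)
    return(NULL)
  x <- xtabs(~ v1 + v2)
  # Drop empty rows and columns, keeping the result two-dimensional.
  x <- x[rowSums(x) > 0, colSums(x) > 0, drop = FALSE]
  # Assumption for this sketch: require a genuine two-way table.
  if (nrow(x) < 2 || ncol(x) < 2)
    return(NULL)
  test <- suppressWarnings(chisq.test(x))
  if (is.na(test$p.value) || test$p.value > alpha)
    return(NULL)
  list(p.value = test$p.value,
       column.percent = prop.table(x, 2) * 100,
       stdres = test$stdres)
}

# Loop over the candidate variables and keep only the significant crosstabs.
dependent <- "group"
kept <- list()
for (nm in setdiff(names(survey), dependent)) {
  result <- significant_crosstab(survey[[nm]], survey[[dependent]])
  if (!is.null(result))
    kept[[nm]] <- result
}
cat(length(kept), "significant crosstab(s):", paste(names(kept), collapse = ", "), "\n")
```

Each element of kept holds what one slide would need: the column percentages for the chart and the standardized residuals for the table.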
Below we see one of the admittedly ugly slides created using this code. With more time and expertise, I am sure I could have done something prettier. A cool aspect of the ReporteRs package is that you can edit the resulting file in PowerPoint and still get R to update any charts and other outputs originally created in R.
Option 2: Displayr
A completely different approach is to author the report in Displayr, and then export the resulting report from Displayr to PowerPoint.
This has advantages and disadvantages relative to using ReporteRs. I will start with the big disadvantage, in the hope of persuading you of my objectivity (disclaimer: I have no objectivity, I work at Displayr).
Each page of a Displayr report is created interactively, using a mouse and clicking and dragging things. In my earlier example using ReporteRs, I only created pages where there was a statistically significant association. Currently, there is no way of doing such a thing in Displayr.
The flipside of using a graphical user interface like Displayr’s is that it is a lot easier to create attractive visualizations. As a result, the user has much greater control over the look and feel of the report. For example, the screenshot below shows a PowerPoint document created by Displayr. All but one of the charts have been created using R, and the first two are based on a moderately complicated statistical model (a latent class rank-ordered logit model).
You can access the document used to create the PowerPoint report with R here (just sign in to Displayr first) – you can poke around and see how it all works.
A benefit of authoring a report using Displayr is that the user can access the report online, interact with it (e.g., filter the data), and then export precisely what they want. You can see this document as it is viewed by a user of the online report here.
Option 3: Q
A third approach for authoring and updating PowerPoint reports using R is to use Q, which is a Windows program designed for survey reporting (same disclaimer as with Displayr). It works by exporting and updating results to a PowerPoint document. Q has two different mechanisms for exporting R analyses to PowerPoint. First, you can export R outputs, including HTMLwidgets, created in Q directly to PowerPoint as images. Second, you can create tables using R and then have these exported as native PowerPoint objects, such as Excel charts and PowerPoint tables.
In Q, a Report contains a series of analyses. Analyses can be created either using R or using Q’s own internal calculation engine, which is designed for producing tables from survey data.
The map above (in the Displayr report) is an HTMLwidget created using the plotly R package. It draws data from a table called Region, which would also be shown in the report. (The same R code used in the Displayr example can be used in an R object within Q.) When exported to PowerPoint, it creates a page, using the PowerPoint template, where the title is Responses by region and the map appears in the middle of the page.
The screenshot below shows another chart created with R and exported to PowerPoint. The data has been extracted from Google Trends using the gtrendsR R package. However, the chart itself is a standard Excel chart, attached to a spreadsheet containing the data. These slides can then be customized using all the normal PowerPoint tools and can be automatically updated when the data is revised.
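A chart like this needs its data as a plain table, with one row per date and one column per search term. As a rough sketch of that reshaping step, the code below builds such a table in base R. The data frame is a synthetic stand-in rather than real Google Trends output; its column names (date, keyword, hits) are assumed to resemble the long format that gtrendsR returns, and the search terms and values are invented for illustration.

```r
# Synthetic stand-in for long-format search-interest data (values are invented).
trends <- data.frame(
  date = rep(as.Date(c("2017-01-01", "2017-02-01", "2017-03-01")), times = 2),
  keyword = rep(c("R", "PowerPoint"), each = 3),
  hits = c(60, 65, 70, 90, 85, 80)
)

# Reshape to a date-by-keyword table: one row per date, one column per search term.
trend.table <- tapply(trends$hits,
                      list(date = format(trends$date), keyword = trends$keyword),
                      sum)
print(trend.table)
```

A table in this shape is the kind of object that can sit behind a native chart, so the slide can be refreshed when the underlying numbers change.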
Explore the Displayr example
You can access the Displayr document used to create and update the PowerPoint report with R here (just sign in to Displayr first). Here, you can poke around and see how it all works or create your own document.