pandoc, markdown and pander
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Pandoc + markdown seem to be a great way of documenting my work.
Markdown syntax is very simple and allows to add basic formatting and figures to an otherwise simple text document, without obfuscating the actual text.
Then I simply compile the document using the pandoc command:
pandoc -o document.docx document.md pandoc -o document.pdf document.md
There are some more tricks, of course, and plenty of output formats are possible. One thing I was struggling with was that images in docx files were much too large. It turns out that the PNG graphics I generate from PDFs (which, in turn, come from R) lacked the information about density units. I was using the convert program from ImageMagick, and it turns out it is necessary to add the option -units PixelsPerInch:
convert -density 300 -units PixelsPerInch image.pdf image.png
Another thing that I found useful was the pander package. Of course, there is this whole science of generating dynamic documents and reports from R using Sweave or knittr, but at the moment I rather produce two files: a commented R pipeline and, separately, a report in markdown format.
(The reason for not using knittr is that I given that I work with some very large data sets that sometimes take ages to compute, I would have to work out the details of cacheing and handling code that takes a while to execute. Also, I want to have a document with all commands, for me, and report without any R code for everyone else).
Pander allows to create nice tables in R that can be directly copied and pasted to a markdown document (of course, pander is so much more, but this is my main use at the moment):
pander(foo, emphasize.strong.cols=1, justify="left", style="simple", digits=2, split.tables=Inf)
I was astonished how nice the resulting word file is. The PDFs, which are produced by TeX/LaTeX, I think, are actually more trouble, for example because LaTeX disregards my order of figures and tables, they are all floating objects and there is no easy way to change this from within the document.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.