Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Fast-track publishing using knitr is a short series on how I use knitr to get my articles published faster. This is part II, where I show how you can tweak RStudio into producing seamless MS Word integration by using the .Rprofile together with CSS, cover a few basics about HTML that are good to know, and lastly list some special characters that can be useful. In the previous post, part I, I explained some of the more general concepts behind fast-track publishing and why I try to get my manuscript into MS Word instead of using LaTeX or other alternatives.
RStudio is in my opinion currently the best tool for using knitr. It allows code folding, navigating through chunks, direct knitr integration, spell checking, and is actively being developed. It is therefore a little odd that the default markdown document generated in knitr looks… terrible:
As you can see there are no margins, allowing no white space that would enhance the reading. As nicely put by Carrie Cousins:
“Don’t forget about the margins. Remember to leave some white space around the entire text frame, creating an almost invisible halo. This margin will help set text apart from other “noise,” easing the reader into the copy.”
This becomes even more difficult to read if we change the window width:
The solution to this is to attach your own CSS file. RStudio has a basic help page that you can find here about changing the CSS. It is important to remember that the CSS rendering must be changed before knitting the document.
Inspired by LaTeX's wide margins, I usually submit my manuscript with wide margins (2 inches/5.08 cm left and right) in order to keep the character count at the optimal 65 to 75 characters per line. This makes the document easier to read and hints at how the paragraphs (more guidelines) will feel in the published article.
An RStudio/knitr .Rprofile
The .Rprofile is a file that allows you to execute code at startup. All you need to do is create a file called .Rprofile in your home directory (note the capitalization: R looks for .Rprofile, and the name is case-sensitive on Linux and other case-sensitive file systems). If you are uncertain where your home directory is, start RStudio (close any open project) and write getwd(); the result is your home directory. On OS X/Unix/Linux systems the home directory is located at “~/”, in Windows 8 this is the “Documents” or “My Documents” folder, and in Windows 7 it is your user folder (the one with your username).
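If you would rather not rely on getwd(), a minimal sketch for locating the home directory and checking for an existing startup file could look like the one below. It assumes that path.expand("~") resolves to the directory R searches at startup, which is the usual case but can be changed through environment variables such as R_PROFILE_USER.

```r
# Resolve the home directory as R sees it
home_dir <- path.expand("~")

# Build the full path to the startup file (note the lowercase "p")
rprofile_path <- file.path(home_dir, ".Rprofile")

# Report where R will look and whether the file already exists
cat("Home directory:", home_dir, "\n")
cat(".Rprofile found:", file.exists(rprofile_path), "\n")
```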
My .Rprofile has a few tweaks in it:
- Use custom.css if exists: If there is a file called custom.css at the same location as the knitr .Rmd document, it automatically switches to this alternative. As this runs at startup I don’t need to worry about running any code before knitting.
- Skip embedded png: Libre Office can’t handle embedded png-images; it hangs as it tries to process them. You can still use embedded png-images by specifying options(base64_images = "inline").
- Fix headers: Libre Office “forgets” the margins for the heading elements if they are specified in the CSS. I therefore have a crude gsub() fix for this; to skip it simply set options(LibreOffice_adapt = "skip").
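Both switches are ordinary R options, so they can be set and inspected from the console at any time before knitting. The sketch below shows how the defaults are read; the fallback strings "No" and "Yes" are the ones used in the .Rprofile code of this post, while the variable names embed_png and skip_adapt are only there for illustration.

```r
# Remove both options so that the defaults apply
options(base64_images = NULL, LibreOffice_adapt = NULL)

# getOption() falls back to its second argument when the option is unset
getOption("base64_images", "No")       # "No": png-images are not embedded
getOption("LibreOffice_adapt", "Yes")  # "Yes": heading margins are patched

# Opt in to embedded images and opt out of the heading fix
options(base64_images = "inline", LibreOffice_adapt = "skip")
embed_png  <- getOption("base64_images", "No") == "inline"
skip_adapt <- getOption("LibreOffice_adapt", "Yes") == "skip"
```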
```r
cat("\n ** Starting .Rprofile **")
options(rstudio.markdownToHTML =
  function(inputFile, outputFile) {
    require(markdown)
    htmlOptions <- markdownHTMLOptions(defaults=TRUE)
    # LibreOffice hangs when the png is included in the html file
    # I have therefore this option where you actively
    # have to choose inline if you want the png to be inline
    if (getOption("base64_images", "No") != "inline")
      htmlOptions <- htmlOptions[htmlOptions != "base64_images"]

    # Now in this section we skip writing to the outputfile
    # and keep the generated HTML text in the md_txt variable
    md_txt <- markdownToHTML(inputFile, options = htmlOptions,
                             stylesheet=ifelse(file.exists('custom.css'),
                                               'custom.css',
                                               getOption("markdown.HTML.stylesheet")))

    if (getOption("LibreOffice_adapt", "Yes") == "skip"){
      writeLines(md_txt, con=outputFile)
    }else{
      # Annoyingly it seems that Libre Office currently
      # 'forgets' the margin properties of the headers,
      # we therefore substitute these with an element specific
      # style option that works. Perhaps not that pretty but
      # it works and can be tweaked for most things.
      writeLines(
        gsub("<h([0-9]+)>",
             "<h\\1 style='margin: 10pt 0pt 0pt 0pt;'>",
             gsub("<h1>",
                  "<h1 style='margin: 24pt 0pt 0pt 0pt;'>",
                  md_txt)),
        con=outputFile)
    }
  }
)

# I've added some automated comments just as a reminder, remove
# the cat() if you want the .Rprofile to be quiet (note, the output does
# not affect the knitr document)
cat("\n * If you want knitr markdown png-files to be inside the document",
    " then set the options(base64_images = 'inline') for it to work.")
cat("\n * If you don't want the Libre Office adaptations then set",
    " options(LibreOffice_adapt = 'skip')")
cat("\n * If you want knitr markdown to use a custom css then",
    " just input a 'custom.css' file in the Rmd file's directory.")
cat("\n ** End .Rprofile **\n")
```
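To see what the nested gsub() calls do, it can help to run them on a small piece of HTML outside of RStudio's knit pipeline. The fragment below is made up for illustration; only the two substitution patterns are taken from the code above.

```r
# A tiny HTML fragment standing in for the markdownToHTML() output
md_txt <- "<h1>Title</h1><p>Text</p><h2>Section</h2>"

# Inner gsub(): give <h1> its larger top margin first.
# Outer gsub(): patch the remaining bare headings (<h2>, <h3>, ...).
# The already patched <h1 ...> no longer matches "<h([0-9]+)>"
# and is therefore left alone by the outer call.
fixed <- gsub("<h([0-9]+)>",
              "<h\\1 style='margin: 10pt 0pt 0pt 0pt;'>",
              gsub("<h1>",
                   "<h1 style='margin: 24pt 0pt 0pt 0pt;'>",
                   md_txt))
cat(fixed, "\n")
```

The order matters: if the general pattern ran first, the h1 element would receive the 10pt margin and the specific h1 substitution would no longer find anything to replace.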
The custom.css file
CSS is extremely flexible, although it is important to keep in mind that if you aim at Libre Office or MS Word import these are rather limited in their CSS abilities. I use the one below, which is optimized to be as similar as possible to the Word template and imports nicely (copy the text into a file that you name custom.css):
```css
/* Set the main font to Calibri, same as MS Word 2010 uses.
   Also set the default size to 11pt.
   The maximum width of 35em enhances readability
   through optimal line length.
   Note: this setting is ignored by Word/Libre Office */
body {
  font-family: Calibri;
  font-size: 11pt;
  background-color: white;
  padding-top: 1em;
  margin: auto;
  max-width: 35em;
}

/* Set the paragraph margin and padding to 0 except for the bottom */
p {
  padding: 0;
  margin: 0;
  margin-bottom: 10pt;
}

/* Center the table and add top/bottom margins */
table {
  margin: auto;
  margin-top: 1em;
  margin-bottom: 1em;
  border: none;
}

/* The tr padding/margin 0 is important for table import,
   while the font needs to be specified as font and not
   font-family/font-size due to limitations in Libre Office */
td, tr {
  font: 10pt Arial;
  padding: 0px;
  margin: 0px;
}

/* The cell should have a little space to ease reading,
   although this section is mostly ignored by the
   Libre Office import */
td {
  padding: 4px;
  padding-bottom: 2px;
}

/* Set the headings to correspond to Word-style */
h1, h2, h3, h4, h5, h6 {
  margin: 10pt 0pt 0pt 0pt;
  font-family: Cambria;
  font-weight: bold;
}

/* h1 has a slightly larger top margin so we re-set that from the others */
h1 {
  margin: 24pt 0pt 0pt 0pt;
  font-size: 14pt;
  color: #365F91;
}

h2 {
  font-size: 13pt;
  color: #4F81BD;
}

h3 {
  font-size: 11pt;
  color: #4F81BD;
}

h4 {
  font-size: 11pt;
  font-weight: bold;
  font-style: italic;
  color: #4F81BD;
}

h5 {
  font-size: 11pt;
  font-weight: normal;
  color: #243F5D;
}

h6 {
  font-size: 11pt;
  font-weight: normal;
  font-style: italic;
  color: #243F5D;
}

/* The following sections are mostly unrelated to Word/Libre Office imports */
tt, code, pre {
  font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace;
}

a:visited {
  color: rgb(50%, 0%, 50%);
}

pre {
  margin-top: 0;
  max-width: 95%;
  border: 1px solid #ccc;
  white-space: pre-wrap;
}

pre code {
  display: block;
  padding: 0.5em;
}

code.r, code.cpp {
  background-color: #F8F8F8;
}

blockquote {
  color: #666666;
  margin: 0;
  padding-left: 1em;
  border-left: 0.5em #EEE solid;
}

hr {
  height: 0px;
  border-bottom: none;
  border-top-width: thin;
  border-top-style: dotted;
  border-top-color: #999999;
}

@media print {
  * {
    background: transparent !important;
    color: black !important;
    filter: none !important;
    -ms-filter: none !important;
  }

  body {
    font-size: 11pt;
    max-width: 100%;
  }

  a, a:visited {
    text-decoration: underline;
  }

  hr {
    visibility: hidden;
    page-break-before: always;
  }

  pre, blockquote {
    padding-right: 1em;
    page-break-inside: avoid;
  }

  tr, img {
    page-break-inside: avoid;
  }

  img {
    max-width: 100% !important;
  }

  @page {
    margin-top: 2cm;
    margin-bottom: 1.5cm;
    margin-left: 3cm;
    margin-right: 3cm;
  }

  p, h2, h3 {
    orphans: 3;
    widows: 3;
  }

  h2, h3 {
    page-break-after: avoid;
  }
}
```
If you want to generate your own custom CSS I suggest you start by tweaking the original CSS that you can find here. While I thought the heading colors were a little silly at the beginning, I now like how they softly integrate into the text. Microsoft probably put top designers to work on the default style for Word and I think it is sensible to trust their judgment; their settings are probably a pretty safe starting point.
A few HTML basics
HTML (HyperText Markup Language) was developed around 1990 and has remained the main way to communicate documents on the web. Although it has been refined over the years, the basic structure is mostly the same. The document markup consists of <tag> </tag> pairs, where the text within <> contains the element type and the closing tag is marked by a slash. The basic structure of the document is:
Everything is wrapped within the main document, the <html> corresponds to the grey area. Subelements to the <html> are the <head> and <body> elements. The <head> contains meta-data not shown in the document and the style sheet should be defined within this area. The <body> contains the actual text with all the paragraphs, tables, and images.
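As a rough illustration of this structure, the sketch below writes a minimal HTML document from R. The title, heading, and paragraph text are placeholders invented for the example; only the nesting of <html>, <head>, and <body> follows the description above.

```r
# Build a minimal HTML document as a character vector, one line per element
html_doc <- c(
  "<html>",
  "  <head>",
  "    <title>My manuscript</title>",
  "    <style type=\"text/css\">p { margin: 0pt 0pt 10pt 0pt; }</style>",
  "  </head>",
  "  <body>",
  "    <h1>Introduction</h1>",
  "    <p>The first paragraph of the manuscript.</p>",
  "  </body>",
  "</html>"
)

# Write it to a temporary file that can be opened in a browser or in Word
out_file <- tempfile(fileext = ".html")
writeLines(html_doc, con = out_file)
```

The style sheet sits inside <head> and is never displayed itself, while everything the reader sees lives inside <body>.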
CSS and HTML
As you may have noticed, the <body> element was also present in the CSS above. With CSS you can set the properties of each element inside <body>; you can for instance see that the paragraph element, <p>, has the properties:
```css
p {
  padding: 0;
  margin: 0pt 0pt 10pt 0pt;
}
```
The above states that the padding should be 0 on all sides while the margin should be 10 points below. The 4-in-1 description of the different sides can be confusing although all you need to remember is TRouBLe (top, right, bottom, left). If you still feel a little queasy you can go with the specific parameter by expanding the above into:
```css
p {
  padding: 0;
  margin: 0pt;
  margin-bottom: 10pt;
}
```
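The TRouBLe rule, together with the shorter one-, two-, and three-value forms that CSS also accepts, can be spelled out in a few lines of R. The function expand_shorthand() is a hypothetical helper written only for this illustration; it is not part of knitr or the markdown package.

```r
# Expand a CSS margin/padding shorthand into its four sides.
# expand_shorthand() is a hypothetical helper for illustration only.
expand_shorthand <- function(value) {
  parts <- strsplit(trimws(value), "\\s+")[[1]]
  sides <- switch(as.character(length(parts)),
    "1" = parts[c(1, 1, 1, 1)],   # one value: all four sides
    "2" = parts[c(1, 2, 1, 2)],   # top/bottom, then right/left
    "3" = parts[c(1, 2, 3, 2)],   # top, right/left, bottom
    "4" = parts,                  # top, right, bottom, left (TRouBLe)
    stop("expected one to four values"))
  setNames(sides, c("top", "right", "bottom", "left"))
}

# The paragraph margin from the CSS above
expand_shorthand("0pt 0pt 10pt 0pt")
```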
You can also find the headings <h1>, <h2>, <h3>, … (the number corresponds to the heading level), first with the common attributes:
```css
h1, h2, h3, h4, h5, h6 {
  margin: 10pt 0pt 0pt 0pt;
  font-family: Cambria;
  font-weight: bold;
}
```
And then with specific attributes for each heading later on (although note that the margin setting is also overridden in the .Rprofile due to the Libre Office incompatibility):
```css
h1 {
  margin: 24pt 0pt 0pt 0pt;
  font-size: 14pt;
  color: #365F91;
}
```
Using this knowledge you should be able to tailor your document layout to your needs. Remember though that Word and Libre Office have not prioritized handling HTML, and you may need to try some different alternatives before you get it to work.
Useful HTML-features
I’ve found that <sup> </sup> for superscript is very convenient, although markdown has a shorthand for this, ^, where you write 10^6 to get a 10 followed by a superscript 6. Perhaps more useful is subscripting with <sub> </sub>, since the markdown shorthand for it currently doesn’t work as intended in default RStudio markdown (H~2~O is not rendered with a subscript 2 while H<sub>2</sub>O is; note that H~2~O works with Pandoc).
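When the text is generated from R, for instance in an inline chunk or a table label, the tags can be pasted in with small helper functions. The names superscript() and subscript() below are hypothetical and defined only for this sketch.

```r
# Hypothetical helpers that wrap a value in HTML tags
superscript <- function(x) paste0("<sup>", x, "</sup>")
subscript   <- function(x) paste0("<sub>", x, "</sub>")

# Build the two examples from the text
water   <- paste0("H", subscript(2), "O")
million <- paste0("10", superscript(6))
cat(water, million, "\n")
```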
Special characters
Another thing that is very useful is special characters. Special characters are basically any characters outside the English alphabet. Some that are very useful for tables are, for instance, the daggers and similar:
Code† | Glyph | Description |
---|---|---|
`&dagger;` | † | Dagger |
`&Dagger;` | ‡ | Double dagger |
`&sect;` | § | Section sign |
`&bull;` | • | Bullet |
`&#729;` | ˙ | Dot accent |
`&curren;` | ¤ | General currency sign |
`&deg;` | ° | Degree sign |
`&permil;` | ‰ | Per mill sign (10<sup>-3</sup>) |
`&asymp;` | ≈ | Approximate sign |
`&plusmn;` | ± | Plus minus |
† Just enter the code and it should work; don’t forget the & and the ending ; and leave no intervening space |
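
If you build table labels in R, the same codes can be generated programmatically. The sketch below lists the named entities together with their Unicode code points; add_mark() is a hypothetical helper made up for this example.

```r
# Named character entities and the Unicode code points they stand for
entities <- c(dagger = 0x2020, Dagger = 0x2021, sect = 0x00A7,
              bull = 0x2022, curren = 0x00A4, deg = 0x00B0,
              permil = 0x2030, asymp = 0x2248, plusmn = 0x00B1)

# Build the "&name;" code and the matching glyph for each entry
codes  <- paste0("&", names(entities), ";")
glyphs <- vapply(entities, intToUtf8, character(1))

# Hypothetical helper: append a footnote symbol to a table label
add_mark <- function(label, entity = "dagger") {
  paste0(label, "&", entity, ";")
}
add_mark("Age")
```

Keeping the entity as text, rather than pasting the glyph itself, avoids encoding problems when the HTML file is later opened in Word or Libre Office.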
Well that’s it for this part, I hope you enjoyed it.