Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Prologue
It felt really nice to achieve custom code highlighting on this site with highlight.js (see this post). After that, I found myself working with pkgdown, one of Hadley’s many great packages. It is “designed to make it quick and easy to build a website for your package”. It converts all package documentation into appropriate HTML pages. Naturally, I had the same question as before: is there an easy way to highlight the pipe operator %>% separately? This time the best answer I was able to come up with was “Yes, if you don’t mind some hacking.”
This post is about adding custom code highlighting rules to a pkgdown site, taking the string %>% as an example.
Overview
After looking into the HTML code of a site built with pkgdown, I noticed the following key features of code highlighting:
- Text is already parsed, with appropriate strings wrapped in <span></span>. This is done while building the site with pkgdown::build_site(). The class attribute of <span> is used to customize highlighting.
- Code from reference pages is processed differently. For example, the function mean is wrapped as <span class="kw">mean</span> on the Home page but as <span class='fu'>mean</span> in Reference.
- The most valuable feature of code preprocessing is creating links to the appropriate help pages for R functions. This is done by adding an <a> tag inside the <span> for a given function name.
So the default method of customising code highlighting in pkgdown is to define CSS styles for the existing classes (which differ between parts of the site).
To highlight certain strings, such as %>%, one should parse the HTML for certain <span> tags inside a <pre> node (the tag for preformatted text, used for separate code blocks) and add an appropriate class for further CSS customisation. This path is described in With adding tag class.
Although this method solves the problem of highlighting %>%, it is somewhat constrained: one can’t customize the parsing rules. For example, there is no easy way to highlight <- differently because it is not wrapped in <span>. I thought it would be better to reuse the existing solution with highlight.js, but I didn’t consider this path for some time because of the preformatted nature of the code (unlike in my previous experience) and concerns that function links would disappear. However, after manually adding the necessary JavaScript code, it worked! Well, kind of: reference pages were not highlighted. The good news was that links stayed in place. How to add the appropriate JavaScript code to a pkgdown site and deal with reference pages is described in With highlight.js.
All the code and a short version of how to use it are in my highdown package.
With adding tag class
The plan is pretty straightforward:
- Find all HTML pages to add tag classes to.
- On each page find the appropriate tags, i.e. <span> inside <pre> with text satisfying the desired condition.
- Add a certain class to those tags.
- Modify the CSS file.
Add class
The following functions do the job of adding a class to the appropriate tags. The xml2 package should be installed.
The main function's arguments are:
- xpath – String containing an XPath (1.0) expression (use "//pre//span" for code highlighting tags).
- pattern – Regular expression for the tags’ text of interest.
- new_class – String for the class to add.
- path – Path to the folder with HTML files (defaults to “docs”).
xml_add_class_pattern <- function(xpath, pattern, new_class, path = "docs") {
  # Find HTML pages under `path`
  html_files <- list.files(
    path = path,
    pattern = "\\.html$",
    recursive = TRUE,
    full.names = TRUE
  )

  lapply(html_files, function(file) {
    page <- xml2::read_html(file, encoding = "UTF-8")
    matched_nodes <- xml_find_all_patterns(page, xpath, pattern)
    # Leave pages without matching tags untouched
    if (length(matched_nodes) == 0) {
      return(NA)
    }
    xml_add_class(matched_nodes, new_class)
    xml2::write_html(page, file, format = FALSE)
  })

  invisible(html_files)
}

# Add class `new_class` to nodes
xml_add_class <- function(x, new_class) {
  old_class <- xml2::xml_attr(x, "class")
  # Nodes without a class attribute give NA; avoid writing "NA new_class"
  output_class <- ifelse(is.na(old_class), new_class, paste(old_class, new_class))
  mapply(xml2::xml_set_attr, x, output_class, MoreArgs = list(attr = "class"))
  invisible(x)
}

# Find appropriate tags
# To find <span> inside <pre> use `xpath = "//pre//span"`.
xml_find_all_patterns <- function(x, xpath, pattern, ns = xml2::xml_ns(x)) {
  res <- xml2::xml_find_all(x, xpath, ns)
  is_matched <- grepl(pattern, xml2::xml_text(res))
  res[is_matched]
}
For convenience one can define a function high_pipe() for adding the class pp to all <span> inside <pre> with text containing %>%:
high_pipe <- function(path = "docs", new_class = "pp") {
  xml_add_class_pattern("//pre//span", "%>%", new_class, path)
}
So typical usage is as follows:
- Run pkgdown::build_site().
- Run highdown::high_pipe() (with working directory being package root).
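To see what the class-adding step does without building a whole site, here is a minimal sketch on an in-memory HTML snippet. It assumes xml2 is installed; the snippet and variable names are made up for illustration and are not part of highdown.

```r
library(xml2)

# Hypothetical fragment resembling pkgdown output: three <span> inside <pre>
html <- paste0(
  '<html><body><pre>',
  '<span class="kw">x</span> ',
  '<span class="op">%&gt;%</span> ',
  '<span>%&gt;%</span>',
  '</pre></body></html>'
)
page <- read_html(html)

# Step 1: find candidate tags and keep those whose text contains the pipe
spans <- xml_find_all(page, "//pre//span")
pipes <- spans[grepl("%>%", xml_text(spans), fixed = TRUE)]

# Step 2: append the new class, handling tags that have no class yet
old_class <- xml_attr(pipes, "class")
xml_attr(pipes, "class") <- ifelse(is.na(old_class), "pp", paste(old_class, "pp"))

# Inspect the class attributes of all spans after the modification
new_classes <- xml_attr(xml_find_all(page, "//pre//span"), "class")
print(new_classes)
```

The first span keeps its class, while both pipe spans gain pp, which is the class the CSS rule in the next section targets.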
Add custom CSS rules
To add custom CSS rules to a pkgdown site, create the file pkgdown/extra.css in the package root and edit it. For example, to make %>% bold write the following:
.pp {font-weight: bold;}
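If you would rather create this file from R than by hand, the following is a small sketch. The helper name add_pipe_css() is hypothetical (it is not part of pkgdown or highdown), and the demo writes into a temporary folder standing in for the package root.

```r
# Hypothetical helper: append a bold rule for `css_class` to pkgdown/extra.css
add_pipe_css <- function(pkg_root = ".", css_class = "pp") {
  css_dir <- file.path(pkg_root, "pkgdown")
  dir.create(css_dir, showWarnings = FALSE, recursive = TRUE)
  css_file <- file.path(css_dir, "extra.css")
  rule <- sprintf(".%s {font-weight: bold;}", css_class)
  # Append so that any rules already in the file are kept
  cat(rule, "\n", file = css_file, sep = "", append = TRUE)
  invisible(css_file)
}

# Demo in a temporary directory instead of a real package root
demo_root <- tempfile("pkg")
css_path <- add_pipe_css(demo_root)
css_lines <- readLines(css_path)
print(css_lines)
```

In a real package one would call add_pipe_css() from the package root and then rebuild the site so the file is picked up.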
With highlight.js
Highlight.js enables more flexible code highlighting. For its overview and customization see my previous post.
Add custom JavaScript
To add custom JavaScript code to a pkgdown site, one should create and modify the file pkgdown/extra.js in the package root. Go here for code that initializes highlight.js and registers default R language parsing rules.
Tweak reference page
For highlight.js to work, code should be wrapped in <pre><code class="r"> tags. However, reference pages use only <pre>. To tweak these pages use the following function (with working directory being package root):
tweak_ref_pages <- function() {
  # Find all reference pages
  ref_files <- list.files(
    path = "docs/reference/",
    pattern = "\\.html$",
    recursive = TRUE,
    full.names = TRUE
  )
  lapply(ref_files, add_code_node)

  invisible(ref_files)
}

add_code_node <- function(x) {
  # Read the whole page as a single string
  page <- paste0(readLines(x), collapse = "\n")
  # Regular expression magic for adding <code class = "r"></code>
  page <- gsub('(<pre.*?>)', '\\1<code class = "r">', page)
  page <- gsub('<\\/pre>', '<\\/code><\\/pre>', page)

  invisible(writeLines(page, x))
}
Note that as of 2017-10-27 this can still cause incorrect highlighting if some actual code is placed just after a comment.
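To make the effect of the two substitutions concrete, here is a sketch that applies them to a single string. The <pre> block is a made-up stand-in for a reference page; only base R is used.

```r
# Hypothetical one-line stand-in for a code block on a reference page
block <- '<pre class="examples">x %>% mean()</pre>'

# Open a <code class = "r"> node right after the opening <pre ...> tag;
# the lazy `.*?` stops at the first ">", i.e. the end of the opening tag
tweaked <- gsub('(<pre.*?>)', '\\1<code class = "r">', block)

# Close the <code> node right before the closing </pre> tag
tweaked <- gsub('</pre>', '</code></pre>', tweaked)

print(tweaked)
```

Note that the ">" inside %>% is left alone here because the opening tag ends before it, so the lazy match never reaches the code itself.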
Conclusions
- Asking questions about a seemingly simple task can lead to a long journey of code exploration and hacking.
- First try to find a way to reuse existing solutions, if they satisfy your needs. It can save a considerable amount of time in the future.
- With highdown it is straightforward to customise code highlighting of pkgdown sites.
sessionInfo()
## R version 3.4.2 (2017-09-28)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
##
## locale:
##  [1] LC_CTYPE=ru_UA.UTF-8       LC_NUMERIC=C
##  [3] LC_TIME=ru_UA.UTF-8        LC_COLLATE=ru_UA.UTF-8
##  [5] LC_MONETARY=ru_UA.UTF-8    LC_MESSAGES=ru_UA.UTF-8
##  [7] LC_PAPER=ru_UA.UTF-8       LC_NAME=C
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
## [11] LC_MEASUREMENT=ru_UA.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] methods   stats     graphics  grDevices utils     datasets  base
##
## loaded via a namespace (and not attached):
##  [1] compiler_3.4.2  backports_1.1.1 bookdown_0.5    magrittr_1.5
##  [5] rprojroot_1.2   tools_3.4.2     htmltools_0.3.6 yaml_2.1.14
##  [9] Rcpp_0.12.13    stringi_1.1.5   rmarkdown_1.7   blogdown_0.2
## [13] knitr_1.17      stringr_1.2.0   digest_0.6.12   evaluate_0.10.1