Self-documenting plots in ggplot2

[This article was first published on Higher Order Functions, and kindly contributed to R-bloggers.]

When I am showing off a plotting technique in ggplot2, I sometimes like to include the R code that produced the plot as part of the plot. Here is an example I made to demonstrate the debug parameter in element_text():

library(ggplot2)

self_document(
  ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(bins = 20, color = "white") +
    labs(title = "A basic histogram") +
    theme(axis.title = element_text(debug = TRUE))
)

Let’s call these “self-documenting plots”. If we’re feeling nerdy, we might also call them “quines”, although they are not true quines.

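In this post, we will build up a self_document() function from scratch. There are three problems we need to sort out: how to place the plotting code above the plot as an annotation, how to capture plotting code as a string, and how to reformat the captured code so that it reads like the source we typed.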

Creating the code annotation

As a first step, let’s just treat our plotting code as a string that is ready to use for annotation.

p_text <- 'ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20, color = "white") +
  labs(title = "A basic histogram")'

p_plot <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20, color = "white") +
  labs(title = "A basic histogram")

In order to have a titled plot along with this annotation, we need some way to combine these two graphical objects together (the code and the plot produced by ggplot2). I like the patchwork package for this job. Here we use wrap_elements() to capture the plot into a “patch” that patchwork can annotate.

library(patchwork)
wrap_elements(p_plot) + 
  plot_annotation(title = p_text)

Let’s style this title to use a monospaced font. I use Windows and like Consolas, so I will use that font.

# Use default mono font if "Consolas" is not available
extrafont::loadfonts(device = "win", quiet = TRUE)
mono <- ifelse(
  extrafont::choose_font("Consolas") == "", 
  "mono", 
  "Consolas"
)

title_theme <- theme(
  plot.title = element_text(
    family = mono, hjust = 0, size = rel(.9), 
    margin = margin(0, 0, 5.5, 0, unit = "pt")
  )
)

wrap_elements(p_plot) + 
  plot_annotation(title = p_text, theme = title_theme)  

One problem with this setup is that the plotting code has to be edited in two places: the plot p_plot and the title p_text. As a result, it’s easy for these two pieces of code to fall out of sync with each other, turning our self-documenting plot into a lying liar plot.

The solution is pretty easy: Tell R that p_text is code with parse() and evaluate the code with eval():

wrap_elements(eval(parse(text = p_text))) + 
  plot_annotation(title = p_text, theme = title_theme)  

This works. It gets the job done. But we find ourselves in a clumsy workflow, either having to edit R code inside of quotes or editing the plot interactively and then having to wrap it in quotes. Let’s do better.

Capturing plotting code as a string

Time for some nonstandard evaluation. I will use the rlang package, although in principle we could use functions in base R to accomplish these goals.

First, we are going to use rlang::expr() to capture/quote/defuse the R code as an expression. We can print the code as code, print it as text, and use eval() to show the plot.

p_code <- rlang::expr(
  ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(bins = 20, color = "white") +
    labs(title = "A basic histogram")
)

# print the expressions
p_code
#> ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 20, color = "white") + 
#>     labs(title = "A basic histogram")

# expression => text
rlang::expr_text(p_code)
#> [1] "ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 20, color = \"white\") + \n    labs(title = \"A basic histogram\")"

eval(p_code)
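For comparison, the base R route mentioned earlier might look something like the sketch below. It uses quote(), deparse(), and eval() in place of the rlang functions. A small arithmetic expression stands in for the plotting code so that the example runs without ggplot2, and the names b_code and b_text are made up for this illustration.

```r
# Base R counterparts of the rlang calls above (illustrative sketch).
# quote() captures the code without evaluating it.
b_code <- quote(
  sum(1:10) +
    mean(c(2, 4))
)

# expression => text; deparse() returns a character vector of lines,
# so we collapse it into a single string
b_text <- paste(deparse(b_code), collapse = "\n")
b_text

# evaluate the captured code
eval(b_code)
```

As with rlang::expr_text(), the original line breaks are not preserved when the captured code is turned back into text.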

Then, it should be straightforward to make the self-documenting plot, right?

p_code <- rlang::expr(
  ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(bins = 20, color = "white") +
    labs(title = "A basic histogram")
)

wrap_elements(eval(p_code)) + 
  plot_annotation(title = rlang::expr_text(p_code), theme = title_theme)  

Hey, it reformatted the title! Indeed, in the process of capturing the code, the code formatting was lost. To get something closer to the source code we provided, we have to reformat the captured code before we print it.

The styler package provides a suite of functions for reformatting code. We can define our own coding styles/formatting rules to customize how styler works. I like the styler rules used by Garrick Aden-Buie in his grkstyle package, so I will use grkstyle::grk_style_text() to reformat the code.

p_code <- rlang::expr(
  ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(bins = 20, color = "white") +
    labs(title = "A basic histogram")
)

wrap_elements(eval(p_code)) + 
  plot_annotation(
    title = rlang::expr_text(p_code) |> 
      grkstyle::grk_style_text() |> 
      # reformatting returns a vector of lines,
      # so we have to combine them
      paste0(collapse = "\n"), 
    theme = title_theme
  ) 
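If grkstyle is not installed (it is distributed on GitHub rather than CRAN), styler's default style can be used in the same place. The sketch below is only meant to show the shape of that step: styler::style_text() takes a character vector of code and returns the restyled lines, which we then collapse into one string. The line breaks it produces will likely differ from the grkstyle output. The variable names are made up for the example.

```r
# Deliberately cramped code, as a string
code_text <- "ggplot(mtcars,aes(x=mpg))+geom_histogram(bins=20)"

# style_text() returns a character vector of restyled lines
styled <- styler::style_text(code_text)

# Combine the lines into a single string, ready to use as a title
styled_title <- paste0(styled, collapse = "\n")
styled_title
```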

Putting it all together

When we write our self_document() function, the only change we have to make is using rlang::enexpr() instead of rlang::expr(). The en-variant is used when we want to en-quote exactly what the user provided. Aside from that change, our self_document() function just bundles together all of the code we developed above:

self_document <- function(expr) {
  mono <- ifelse(
    extrafont::choose_font("Consolas") == "", 
    "mono", 
    "Consolas"
  )
  
  p <- rlang::enexpr(expr)
  title <- rlang::expr_text(p) |> 
    grkstyle::grk_style_text() |> 
    paste0(collapse = "\n")
  
  patchwork::wrap_elements(eval(p)) + 
    patchwork::plot_annotation(
      title = title, 
      theme = theme(
        plot.title = element_text(
          family = mono, hjust = 0, size = rel(.9), 
          margin = margin(0, 0, 5.5, 0, unit = "pt")
        )
      )
    )
}
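To see why the switch from expr() to enexpr() matters inside a function, here is a small sketch. The two helper functions are hypothetical and exist only for this comparison. Inside a function, expr() quotes the code that the function's author wrote (the argument name), while enexpr() quotes the code that the caller supplied.

```r
# Hypothetical helpers for comparing the two capture functions
capture_with_expr <- function(x) {
  # quotes the literal symbol `x` written here
  rlang::expr_text(rlang::expr(x))
}

capture_with_enexpr <- function(x) {
  # quotes whatever code the caller passed as `x`
  rlang::expr_text(rlang::enexpr(x))
}

capture_with_expr(1 + 2)
#> [1] "x"

capture_with_enexpr(1 + 2)
#> [1] "1 + 2"
```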

And let’s confirm that it works.

library(ggplot2)
self_document(
  ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(bins = 20, color = "white") +
    labs(title = "A basic histogram")
)

Because we developed this function on top of rlang, we can do some tricks like injecting a variable’s value when capturing the code. For example, here I use !! color to replace the color variable with the actual value.

color <- "white"
self_document(
  ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(bins = 20, color = !! color) +
    labs(title = "A basic histogram")
)
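The injection can be inspected without drawing anything. In this sketch we capture a single layer with rlang::expr() and convert it to text. Because the code is only quoted and never evaluated, it runs even without ggplot2 loaded. The name injected is made up for the example.

```r
color <- "white"

# !! replaces the symbol `color` with its current value at capture time
injected <- rlang::expr(geom_histogram(bins = 20, color = !!color))

rlang::expr_text(injected)
#> [1] "geom_histogram(bins = 20, color = \"white\")"
```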

And if you are wondering, yes, we can self_document() a self_document() plot.

self_document(
  self_document(
    ggplot(mtcars, aes(x = mpg)) +
      geom_histogram(bins = 20, color = "white") +
      labs(title = "A basic histogram")
  )
)

Alas, comments are lost

One downside of this approach is that helpful comments are lost.

self_document(
  ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(bins = 20, color = !! color) +
    # get rid of that grey
    theme_minimal() +
    labs(title = "A basic histogram")
)

I am not sure how to include comments. One place where comments are stored and printed is in function bodies:

f <- function() {
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20, color = !! color) +
  # get rid of that grey
  theme_minimal() +
  labs(title = "A basic histogram")
}

print(f, useSource = TRUE)
#> function() {
#> ggplot(mtcars, aes(x = mpg)) +
#>   geom_histogram(bins = 20, color = !! color) +
#>   # get rid of that grey
#>   theme_minimal() +
#>   labs(title = "A basic histogram")
#> }
#> <environment: 0x000001746d339b68>

I have no idea how to go about exploiting this feature for self-documenting plots, however.
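As a possible starting point only, and not a worked-out solution, the sketch below shows where those comments live. When a function is created with source references kept, its "srcref" attribute can be converted to the original lines of text, comments included. The toy function and the names f_demo, f_lines and body_text are assumptions made for this illustration. Wiring this into self_document() is left open.

```r
# keep.source = TRUE asks the parser to retain the original text
src <- "function() {\n  # a helpful comment\n  1 + 1\n}"
f_demo <- eval(parse(text = src, keep.source = TRUE))

# The srcref attribute points back at the stored source lines
f_lines <- as.character(attr(f_demo, "srcref"))
f_lines

# Drop the opening and closing lines to keep just the function body
body_text <- paste0(f_lines[-c(1, length(f_lines))], collapse = "\n")
cat(body_text)
```

One catch with this route is that the code would have to be wrapped in a function, which changes how the user calls the helper.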


Last knitted on 2022-03-10. Source code on GitHub.1

  1. sessioninfo::session_info()
    #> ─ Session info ───────────────────────────────────────────────────────────────
    #>  setting  value
    #>  version  R Under development (unstable) (2022-03-02 r81842 ucrt)
    #>  os       Windows 10 x64 (build 22000)
    #>  system   x86_64, mingw32
    #>  ui       RTerm
    #>  language (EN)
    #>  collate  English_United States.utf8
    #>  ctype    English_United States.utf8
    #>  tz       America/Chicago
    #>  date     2022-03-10
    #>  pandoc   NA
    #> 
    #> ─ Packages ───────────────────────────────────────────────────────────────────
    #>  package     * version date (UTC) lib source
    #>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.2.0)
    #>  backports     1.4.1   2021-12-13 [1] CRAN (R 4.2.0)
    #>  cachem        1.0.6   2021-08-19 [1] CRAN (R 4.2.0)
    #>  cli           3.2.0   2022-02-14 [1] CRAN (R 4.2.0)
    #>  colorspace    2.0-3   2022-02-21 [1] CRAN (R 4.2.0)
    #>  crayon        1.5.0   2022-02-14 [1] CRAN (R 4.2.0)
    #>  DBI           1.1.2   2021-12-20 [1] CRAN (R 4.2.0)
    #>  digest        0.6.29  2021-12-01 [1] CRAN (R 4.2.0)
    #>  downlit       0.4.0   2021-10-29 [1] CRAN (R 4.2.0)
    #>  dplyr         1.0.8   2022-02-08 [1] CRAN (R 4.2.0)
    #>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.2.0)
    #>  evaluate      0.15    2022-02-18 [1] CRAN (R 4.2.0)
    #>  extrafont     0.17    2014-12-08 [1] CRAN (R 4.2.0)
    #>  extrafontdb   1.0     2012-06-11 [1] CRAN (R 4.2.0)
    #>  fansi         1.0.2   2022-01-14 [1] CRAN (R 4.2.0)
    #>  farver        2.1.0   2021-02-28 [1] CRAN (R 4.2.0)
    #>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
    #>  generics      0.1.2   2022-01-31 [1] CRAN (R 4.2.0)
    #>  ggplot2     * 3.3.5   2021-06-25 [1] CRAN (R 4.2.0)
    #>  git2r         0.29.0  2021-11-22 [1] CRAN (R 4.2.0)
    #>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
    #>  grkstyle      0.0.3   2022-03-10 [1] Github (gadenbuie/grkstyle@6a7011c)
    #>  gtable        0.3.0   2019-03-25 [1] CRAN (R 4.2.0)
    #>  here          1.0.1   2020-12-13 [1] CRAN (R 4.2.0)
    #>  highr         0.9     2021-04-16 [1] CRAN (R 4.2.0)
    #>  knitr       * 1.37    2021-12-16 [1] CRAN (R 4.2.0)
    #>  labeling      0.4.2   2020-10-20 [1] CRAN (R 4.2.0)
    #>  lifecycle     1.0.1   2021-09-24 [1] CRAN (R 4.2.0)
    #>  magrittr      2.0.2   2022-01-26 [1] CRAN (R 4.2.0)
    #>  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.2.0)
    #>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.2.0)
    #>  patchwork   * 1.1.1   2020-12-17 [1] CRAN (R 4.2.0)
    #>  pillar        1.7.0   2022-02-01 [1] CRAN (R 4.2.0)
    #>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
    #>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.2.0)
    #>  R.cache       0.15.0  2021-04-30 [1] CRAN (R 4.2.0)
    #>  R.methodsS3   1.8.1   2020-08-26 [1] CRAN (R 4.2.0)
    #>  R.oo          1.24.0  2020-08-26 [1] CRAN (R 4.2.0)
    #>  R.utils       2.11.0  2021-09-26 [1] CRAN (R 4.2.0)
    #>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
    #>  ragg          1.2.2   2022-02-21 [1] CRAN (R 4.2.0)
    #>  rlang         1.0.2   2022-03-04 [1] CRAN (R 4.2.0)
    #>  rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.2.0)
    #>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.2.0)
    #>  Rttf2pt1      1.3.8   2020-01-10 [1] CRAN (R 4.2.0)
    #>  scales        1.1.1   2020-05-11 [1] CRAN (R 4.2.0)
    #>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
    #>  stringi       1.7.6   2021-11-29 [1] CRAN (R 4.2.0)
    #>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.2.0)
    #>  styler        1.6.2   2021-09-23 [1] CRAN (R 4.2.0)
    #>  systemfonts   1.0.4   2022-02-11 [1] CRAN (R 4.2.0)
    #>  textshaping   0.3.6   2021-10-13 [1] CRAN (R 4.2.0)
    #>  tibble        3.1.6   2021-11-07 [1] CRAN (R 4.2.0)
    #>  tidyselect    1.1.2   2022-02-21 [1] CRAN (R 4.2.0)
    #>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.2.0)
    #>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.2.0)
    #>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
    #>  xfun          0.30    2022-03-02 [1] CRAN (R 4.2.0)
    #>  yaml          2.3.5   2022-02-21 [1] CRAN (R 4.2.0)
    #> 
    #>  [1] C:/Users/trist/AppData/Local/R/win-library/4.2
    #>  [2] C:/Program Files/R/R-devel/library
    #> 
    #> ──────────────────────────────────────────────────────────────────────────────
    
