If not Notebooks, then what? Look to Literate Programming

[This article was first published on Revolutions, and kindly contributed to R-bloggers].

Author and research engineer Joel Grus kicked off an important conversation about Jupyter Notebooks in his recent presentation at JupyterCon.

There's no video of Joel's talk available yet, but you can guess the theme from his opening slide, and I think walking through the slides conveys the message well. Yihui Xie, author and creator of the rmarkdown package, provides a detailed summary of and response to Joel's talk, in which he lists Joel's main critiques of Notebooks:

  1. Hidden state and out-of-order execution
  2. Notebooks are difficult for beginners
  3. Notebooks encourage bad habits
  4. Notebooks discourage modularity and testing
  5. Jupyter’s autocomplete, linting, and way of looking up the help are awkward
  6. Notebooks encourage bad processes
  7. Notebooks hinder reproducible + extensible science
  8. Notebooks make it hard to copy and paste into Slack/GitHub issues
  9. Errors will always halt execution
  10. Notebooks make it easy to teach poorly
  11. Notebooks make it hard to teach well 
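
The first critique, hidden state and out-of-order execution, is easiest to see with a small simulation. The sketch below is a hypothetical illustration in Python, not code from Joel's talk or Yihui's post: the `cells` dictionary and the `run_cells` helper are made up for this example. It models notebook cells as snippets run in one shared namespace, and shows that the same cells can give different results depending on the order in which they are run.

```python
# Toy model of a notebook: each "cell" is a snippet of code, and all cells
# share one namespace, much as notebook cells share one kernel session.
# (The cell names and contents are invented for this illustration.)
cells = {
    "A": "x = 1",
    "B": "x = x + 10",
    "C": "result = x * 2",
}


def run_cells(order):
    """Execute the named cells in the given order and return `result`."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["result"]


top_to_bottom = run_cells(["A", "B", "C"])    # x goes 1 -> 11, result is 22
out_of_order = run_cells(["A", "C", "B"])     # C runs before B, result is 2
rerun_cell = run_cells(["A", "B", "B", "C"])  # B runs twice, result is 42

print(top_to_bottom, out_of_order, rerun_cell)
```

Only the first ordering matches what a reader would expect from reading the cells top to bottom; the other two leave the notebook showing a `result` that the visible code does not explain.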

Yihui suggests that many of these shortcomings of Notebooks could be addressed through literate programming systems, where the document you edit is plain text (and so easy to edit, manage, and track), and computations are always processed in order from the beginning of the document to the end. I use the R Markdown system myself, and find it a delightful way of combining code, output, and graphics in a single document, which can in turn be rendered in a variety of formats including HTML, PDF, Word, and even PowerPoint.
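
To make the idea of processing a plain-text document strictly from beginning to end concrete, here is a toy sketch in Python. It is not how knitr or the rmarkdown package is implemented: the `render` function, the `{python}` chunk marker handling, the `#> ` output prefix, and the sample document are all assumptions made for illustration. Each call to `render` starts from an empty namespace, runs every chunk once in document order, and places any printed output after the chunk that produced it.

```python
import contextlib
import io

FENCE = "`" * 3  # chunk delimiter, built this way to avoid nesting fences here


def render(source: str) -> str:
    """Run every code chunk once, top to bottom, in one fresh namespace.

    Returns the document text with each chunk's printed output added after it.
    """
    namespace = {}   # created anew on every render, so no state is left over
    out_lines = []
    chunk = None     # None while reading prose, a list of lines inside a chunk
    for line in source.splitlines():
        if chunk is None:
            out_lines.append(line)
            if line.startswith(FENCE + "{python}"):
                chunk = []
        elif line.strip() == FENCE:
            # End of chunk: execute it and capture anything it prints.
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec("\n".join(chunk), namespace)
            out_lines.append(line)
            out_lines.extend("#> " + text for text in buffer.getvalue().splitlines())
            chunk = None
        else:
            out_lines.append(line)
            chunk.append(line)
    return "\n".join(out_lines)


# A tiny plain-text document with two chunks (hypothetical content).
document = "\n".join([
    "Set a value:",
    FENCE + "{python}",
    "x = 1",
    FENCE,
    "Use it:",
    FENCE + "{python}",
    "print(x + 10)",
    FENCE,
])

rendered = render(document)
print(rendered)
```

Because the namespace is rebuilt on every call, rendering the same source twice gives the same result, which is the property that out-of-order notebook execution gives up.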

Yihui expands on these themes in greater detail in his excellent book (with JJ Allaire and Garrett Grolemund), R Markdown: The Definitive Guide, published by CRC Press. Incidentally, the book itself is a fine example of literate programming; you can find the R Markdown source here, and you can read the book in its entirety here. As Joel mentions in his talk, an automatically-generated document of that length and complexity simply wouldn't be possible with Notebooks.

All that being said, R Markdown is (for now) a strictly R-based system. Are there equivalent literate programming systems for Python? That's a genuine question, as I don't know the Python ecosystem well enough to answer, but if you have suggestions please leave them in the comments.

Yihui Xie: The First Notebook War

To leave a comment for the author, please follow the link and comment on their blog: Revolutions.
