Updates to the Big Book of R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The Big Book of R provides a comprehensive and ever-growing overview of a broad selection of R programming books. It was created and is maintained by Oscar Baruffa. The collection began with approximately 100 books and, with the help of contributions from the R community, has subsequently expanded to over 400. The books are grouped into topics such as geospatial, machine learning, statistics, text analysis, and many more. The Big Book of R is an excellent resource for anyone learning R programming, whether they are a beginner or advanced user.
What we set out to do
Fathom Data uses R extensively in our consulting work. We took on the migration of the Big Book of R from Bookdown to Quarto as a way of giving back to the R community, to which we owe so much. This migration offers several advantages such as:
- Consistency: Quarto merges the functions of R Markdown, Bookdown, Distill, and Xaringan into one system. Instead of multiple YAML configuration files (_bookdown.yml and _output.yml) there’s just a single _quarto.yml file.
- Support for Many Languages and Tools: Quarto supports many programming languages (R, Python, JavaScript, and Julia, as well as new languages that might be added in the future) and tools (like knitr, Jupyter, and Observable).
- Works Well with Existing Content: Quarto can use most R Markdown documents and Jupyter notebooks without needing changes.
- Elegant Outputs: Quarto, built on Pandoc, is an open-source system that lets you combine text and code to make beautifully formatted documents, web pages, blog posts, books, and more.
- Managing Bibliographies: Quarto uses Pandoc's citation processing to generate formatted citations and a bibliography from a reference file such as BibTeX.
Where we started
We began with a GitHub repository that contained the R Markdown project for the book. This repository had three main parts: a preface, a page about various R communities, and the main script. The main script pulls book data from Google Sheets, sorts it into chapters, and creates a well-organised report. This report includes detailed information about the authors and books, which is used to render each chapter in the book.
Approach
Our approach involved setting up a new Quarto book project, to which we gradually transferred files. The first two R Markdown (.Rmd) files were converted into Quarto markdown format (.qmd). We didn’t need the index file because Quarto operates with a _quarto.yml file that defines the book structure. All configuration options were transferred from the index to the _quarto.yml file and the rest of the content was incorporated into the index.qmd file.
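To make the role of _quarto.yml concrete, here is a minimal sketch of what a book configuration can look like. It is not the project's actual file: the chapter file names preface.qmd and communities.qmd and the theme are assumptions chosen for illustration.

```yaml
# _quarto.yml (illustrative sketch; file names and theme are placeholders)
project:
  type: book          # tells Quarto to render the project as a book

book:
  title: "Big Book of R"
  author: "Oscar Baruffa"
  chapters:           # every chapter file must be listed here, in order
    - index.qmd
    - preface.qmd
    - communities.qmd

format:
  html:
    theme: cosmo
```

Everything that Bookdown split across _bookdown.yml and _output.yml (book structure and output options) sits in this one file.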
Challenges
Unlike the R Markdown version, where a single script could render the whole book by fetching data from Google Sheets, Quarto required a different approach.
Quarto needs each chapter to be its own .qmd file within the project folder, and these files must be listed in _quarto.yml. We achieved this by developing a script that retrieves information from Google Sheets for each chapter, saves each chapter as an individual .qmd file, and creates a list of all the .qmd files as a text file. This made it simpler to update the _quarto.yml file with new or modified chapter titles.
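The idea behind that script can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions, not the script used for the book: the column names, the helper functions chapter_file() and write_chapters(), and the output file chapters.txt are hypothetical, and a small placeholder data frame stands in for the data that the real script reads from Google Sheets.

```r
# Placeholder data; in the real project this comes from Google Sheets
# (for example via googlesheets4::read_sheet()). Column names are assumed.
books <- data.frame(
  chapter = c("Geospatial", "Geospatial", "Machine Learning"),
  title   = c("Example Book A", "Example Book B", "Example Book C"),
  authors = c("Author One", "Author Two", "Author Three"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn a chapter title into a safe .qmd file name.
chapter_file <- function(chapter) {
  slug <- gsub("[^a-z0-9]+", "-", tolower(chapter))
  slug <- gsub("^-|-$", "", slug)
  paste0(slug, ".qmd")
}

# Hypothetical helper: write one .qmd file per chapter, plus a text file
# listing all chapter files, and return the file names.
write_chapters <- function(books, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  chapters <- split(books, books$chapter)  # one data frame per chapter
  files <- character(0)
  for (name in names(chapters)) {
    entries <- chapters[[name]]
    book_lines <- unlist(lapply(seq_len(nrow(entries)), function(i) {
      c(paste("##", entries$title[i]), "", paste("by", entries$authors[i]), "")
    }))
    file <- chapter_file(name)
    writeLines(c(paste("#", name), "", book_lines), file.path(out_dir, file))
    files <- c(files, file)
  }
  # The list of chapter files, ready to be copied into _quarto.yml.
  writeLines(files, file.path(out_dir, "chapters.txt"))
  files
}

out_dir <- file.path(tempdir(), "big-book-sketch")
chapter_files <- write_chapters(books, out_dir)
print(chapter_files)
```

Writing to a temporary directory keeps the sketch free of side effects; in a real project out_dir would be the Quarto project folder.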
Lessons learned
- The {googlesheets4} package worked well to generate .qmd files from Google Sheets.
- Projects can successfully be migrated from R Markdown or Bookdown to Quarto with the proper configuration steps.
- CSS could easily be integrated with Quarto to enhance the book’s presentation.
- There is a built-in search bar for Quarto books in the _quarto.yml template.
Finished product
The finished product retains the structural essence of the original Bookdown format, with several enhancements. It features a book cover and CSS styling that improves the visual appeal of the book, including an easy-to-use light and dark mode toggle. Additionally, each chapter is equipped with its own table of contents, making it much easier for readers to navigate.
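The features described here map onto a handful of standard Quarto options. The fragment below is a sketch of how they could be switched on, not the book's actual configuration: the file names cover.png and styles.css and the two theme names are assumptions.

```yaml
# Fragment of _quarto.yml (illustrative; file and theme names are placeholders)
book:
  cover-image: cover.png   # book cover shown on the landing page
  search: true             # built-in search bar

format:
  html:
    theme:
      light: flatly        # supplying both a light and a dark theme
      dark: darkly         # adds the light/dark mode toggle
    css: styles.css        # custom CSS on top of the theme
    toc: true              # per-page table of contents
```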
Why this is an improvement
Quarto facilitates easier collaboration with its multilingual and multi-engine support. This is ideal for projects that involve multiple programming languages or the integration of various data sources and analytical tools.
Using a script to organise content into separate .qmd files for each chapter allows for more efficient document management. This modular structure enables independent editing, version control, and reuse of individual chapters without impacting the overall document, making it suitable for large projects with numerous contributors.
Furthermore, Quarto’s support for a variety of output formats and customisations, such as websites, PDFs, and slides, improves the functionality and aesthetics of the outputs.
Future steps
The configuration of the book simplifies its maintenance. Changes to the _quarto.yml file are required only for significant updates, such as the introduction of a new chapter within Google Sheets. This setup ensures that maintenance of the book remains straightforward and manageable over time.
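When a new chapter does appear, the chapter list in _quarto.yml has to be brought up to date. One possible way to reduce that to a copy-and-paste step is sketched below. The helper chapters_yaml() and the file names are hypothetical, and the sketch assumes that the chapter file names are available as a plain text file with one name per line.

```r
# Hypothetical helper: format chapter file names as the `chapters:` entry
# that sits under `book:` in _quarto.yml.
chapters_yaml <- function(files, index = "index.qmd") {
  c("  chapters:", paste0("    - ", c(index, files)))
}

# Stand-in for the text file listing the chapter files (one per line).
list_file <- tempfile(fileext = ".txt")
writeLines(c("geospatial.qmd", "machine-learning.qmd"), list_file)

fragment <- chapters_yaml(readLines(list_file))
cat(fragment, sep = "\n")
```

The printed lines can be pasted under the book: key, which keeps the manual edit limited to one block of the configuration file.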
We’re confident that this change to the underlying infrastructure of the Big Book of R will place the project on a solid foundation, which will ensure that it continues to be one of the best resources for information about R.