Last week after my useR! talk, someone I had met at the R-Ladies dinner asked me for a list of all the links in my slides. I said I’d prepare it, not because I’m a nice person, but because I knew it’d be a use case where the great tinkr package would shine! 😈
What is tinkr?
tinkr is an R package I created, and that its current maintainer Zhian Kamvar has taken much further than I ever would have. tinkr can transform Markdown into XML and back.
Under the hood, tinkr uses
- commonmark for the Markdown-to-XML conversion. CommonMark, in the form of its cmark implementation, is the C library that GitHub for instance uses to display your Markdown comments as HTML. The commonmark package is also what powers Markdown support in roxygen2.
- xslt for the XML-to-Markdown conversion. XSLT is a templating language for XML.
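To make the first half of that pipeline concrete, here is a minimal sketch of the Markdown-to-XML step using commonmark directly, without tinkr. The Markdown string and its URL are made up for illustration; they are not from my slides.

```r
# Toy Markdown input; the URL is a placeholder, not from the slides.
md <- "Read the [docs](https://example.com) *carefully*."

# commonmark parses the Markdown and returns the tree as one XML string.
xml_string <- commonmark::markdown_xml(md)
cat(xml_string)

# xml2 can then parse that string into a document that can be queried.
doc <- xml2::read_xml(xml_string)
xml2::xml_name(xml2::xml_root(doc))
#> [1] "document"
```

The link ends up as a `link` element whose `destination` attribute holds the URL, which is what the rest of this post relies on.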
Anyway, enough said, let’s go back to today’s use case.
Extract and format links from index.qmd
With tinkr I can use XPath, the query language for XML or HTML, to extract links from my slidedeck source. Then I will format them as a list.
First, I create a yarn object from my slidedeck source.
```r
talk_yarn <- tinkr::yarn$new("/home/maelle/Documents/conferences/user2024/index.qmd")
talk_yarn
#> <yarn>
#>   Public:
#>     add_md: function (md, where = 0L)
#>     body: xml_document, xml_node
#>     clone: function (deep = FALSE)
#>     get_protected: function (type = NULL)
#>     head: function (n = 6L, stylesheet_path = stylesheet())
#>     initialize: function (path = NULL, encoding = "UTF-8", sourcepos = FALSE,
#>     md_vec: function (xpath = NULL, stylesheet_path = stylesheet())
#>     ns: http://commonmark.org/xml/1.0
#>     path: /home/maelle/Documents/conferences/user2024/index.qmd
#>     protect_curly: function ()
#>     protect_math: function ()
#>     protect_unescaped: function ()
#>     reset: function ()
#>     show: function (lines = TRUE, stylesheet_path = stylesheet())
#>     tail: function (n = 6L, stylesheet_path = stylesheet())
#>     write: function (path = NULL, stylesheet_path = stylesheet())
#>     yaml: --- format: revealjs: highlight-style: a11y ...
#>   Private:
#>     encoding: UTF-8
#>     md_lines: function (path = NULL, stylesheet = NULL)
#>     sourcepos: FALSE
```
Then I extract all links.
```r
links <- xml2::xml_find_all(
  talk_yarn$body,
  xpath = ".//md:link",
  ns = talk_yarn$ns
)
head(links)
#> {xml_nodeset (6)}
#> [1] <link destination="https://user-maelle.netlify.app" title="">\n <text xm ...
#> [2] <link destination="https://www.pexels.com/photo/old-cargo-ship-on-sea-207 ...
#> [3] <link destination="https://www.pexels.com/photo/the-word-louise-is-spelle ...
#> [4] <link destination="https://www.pexels.com/photo/gray-rotary-telephone-on- ...
#> [5] <link destination="https://www.pexels.com/photo/close-up-photography-of-y ...
#> [6] <link destination="https://www.r-consortium.org/all-projects/call-for-pro ...
```
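In case the `md:` prefix in that XPath query looks mysterious: the XML that commonmark produces lives in the CommonMark namespace (the `ns` field printed above), so an unprefixed query finds nothing. Below is a small sketch on a made-up document. The `md` prefix is bound by hand with `xml2::xml_ns_rename()`, which I assume mirrors what `talk_yarn$ns` provides; the URLs are placeholders.

```r
# Toy document with two links; URLs are placeholders.
doc <- xml2::read_xml(commonmark::markdown_xml(
  "A [first](https://example.com/a) and a [second](https://example.com/b) link."
))

# xml2 registers the default namespace under an automatic prefix.
xml2::xml_ns(doc)

# Without a prefix, the query looks for un-namespaced elements and matches nothing.
length(xml2::xml_find_all(doc, ".//link"))
#> [1] 0

# With a prefix bound to the CommonMark namespace, both links are found.
ns <- xml2::xml_ns_rename(xml2::xml_ns(doc), d1 = "md")
found <- xml2::xml_find_all(doc, ".//md:link", ns = ns)
xml2::xml_attr(found, "destination")
#> [1] "https://example.com/a" "https://example.com/b"
```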
I then throw away the links to the great website Pexels, because these are image credits rather than information useful to do R stuff.
```r
links <- purrr::discard(
  links,
  \(x) startsWith(xml2::xml_attr(x, "destination"), "https://www.pexels")
)
head(links)
#> {xml_nodeset (6)}
#> [1] <link destination="https://user-maelle.netlify.app" title="">\n <text xm ...
#> [2] <link destination="https://www.r-consortium.org/all-projects/call-for-pro ...
#> [3] <link destination="https://www.r-consortium.org/all-projects/call-for-pro ...
#> [4] <link destination="https://www.heltweg.org/posts/who-wrote-this-shit/" ti ...
#> [5] <link destination="https://fosstodon.org/@hadleywickham/11202130903588421 ...
#> [6] <link destination="https://nostarch.com/kill-it-fire" title="">\n <text ...
```
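As an aside, the same filtering could probably be done in the XPath query itself, since XPath 1.0 has a `starts-with()` function. This is a sketch of that alternative on a made-up document, not what I ran for the post; the URLs are placeholders.

```r
# Toy document: one image-credit link and one "useful" link (placeholder URLs).
doc <- xml2::read_xml(commonmark::markdown_xml(paste(
  "Photo by [someone](https://www.pexels.com/photo/placeholder).",
  "Read [this post](https://example.com/post).",
  sep = "\n\n"
)))
ns <- xml2::xml_ns_rename(xml2::xml_ns(doc), d1 = "md")

# The predicate keeps only links whose destination does not start with the Pexels URL.
kept <- xml2::xml_find_all(
  doc,
  ".//md:link[not(starts-with(@destination, 'https://www.pexels'))]",
  ns = ns
)
xml2::xml_attr(kept, "destination")
#> [1] "https://example.com/post"
```

The purrr version keeps the XPath query simple, at the cost of a second step; which one reads better is a matter of taste.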
After that I can format the links and display them here using an “asis” chunk. Yes, my slidedeck uses Quarto but this blog is still powered by R Markdown, hugodown to be precise.
I’m using the formatting as an opportunity to only keep distinct links: sometimes I had very similar slides in a row, with repeated information.
```r
format_link <- function(link) {
  url <- xml2::xml_attr(link, "destination")
  text <- xml2::xml_text(link)
  sprintf("* [%s](%s)", text, url)
}
formatted_links <- purrr::map_chr(links, format_link)
formatted_links <- unique(formatted_links)
formatted_links |>
  paste(collapse = "\n") |>
  cat()
```
- https://user-maelle.netlify.app
- R Consortium ISC
- https://www.heltweg.org/posts/who-wrote-this-shit/
- https://fosstodon.org/@hadleywickham/112021309035884210
- https://nostarch.com/kill-it-fire
- “Refactoring Pro-Tip: Easiest Nearest Owwie First”
- https://styler.r-lib.org/
- https://masalmon.eu/2024/05/23/refactoring-tests/
- {lintr} itself
- reference index
- continuous integration
- https://masalmon.eu/2024/05/15/refactoring-xml/
- Tidyteam code review principles
- The Code Review Anxiety Workbook
- General science lifecycle
- Statistical software
- now
- then
- Happy Git and GitHub for the useR
- “Oh shit, Git!”
- “How Git works”
- Why you need small, informative Git commits
- The two phases of commits in a Git branch
- Hack your way to a good Git history
- {saperlipopette}
- Oh shit, Git!
- No Maintenance Intended
- What Does It Mean to Maintain a Package?
- Three currencies of payment for our work
- Package maintainer cheatsheet
- dev guide
- blog
- Package Development Corner
- paths of participation
- Monthly newsletter
- Blog
- R-universe
- https://ropensci.org/help-wanted
- https://ropensci.org/news
- https://devguide.ropensci.org/maintenance_evolution.html#archivalguidance
- 2021 community call
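To see the formatting and deduplication steps in isolation, here is a sketch on a made-up Markdown snippet with a repeated link and a bare URL. It is a vectorized variant of the code above rather than the exact code I ran: `xml2::xml_text()` and `xml2::xml_attr()` accept a whole nodeset, so purrr is not needed here. The URLs are placeholders.

```r
# Toy document: the same link twice, plus an autolink (placeholder URLs).
doc <- xml2::read_xml(commonmark::markdown_xml(paste(
  "See [styler](https://example.com/styler).",
  "Again, see [styler](https://example.com/styler).",
  "Bare link: <https://example.com/bare>",
  sep = "\n\n"
)))
ns <- xml2::xml_ns_rename(xml2::xml_ns(doc), d1 = "md")
toy_links <- xml2::xml_find_all(doc, ".//md:link", ns = ns)

# Format all links at once, then drop exact duplicates.
formatted <- unique(sprintf(
  "* [%s](%s)",
  xml2::xml_text(toy_links),
  xml2::xml_attr(toy_links, "destination")
))
cat(formatted, sep = "\n")
#> * [styler](https://example.com/styler)
#> * [https://example.com/bare](https://example.com/bare)
```

Note that `unique()` only removes entries where both the text and the URL match; the same URL with two different link texts would appear twice.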
Conclusion
Using tinkr, XPath and sprintf(), I was able to create a list of all the links shared in my useR! slidedeck. Some of them have no text of their own, so the URL serves as the link text; some have text that only makes sense in the context of the paragraph they were part of; others are slightly more informative. But at least none of them is a “click here” link. 😅