drat Tutorial: Publishing a package
Introduction
The drat package was released earlier this month, and described in a first blog post. I received some helpful feedback about what works and what doesn’t. For example, Jenny Bryan pointed out that I was not making a clear enough distinction between the role of using drat to publish code, and using drat to receive/install code. Very fair point, and somewhat tricky as R aims to blur the line between being a user and developer of statistical analyses, and hence packages. Many of us are both. But the main point is well taken, and this note aims to clarify this issue a little by focusing on the former.
Another point made by Jenny concerns the double use of repository. And indeed, I conflated repository (in the sense of a GitHub code repository) with repository in the sense of a package store used by a package manager. The former, a GitHub repository, is something we use to implement a personal drat with: a GitHub repository happens to be uniquely identifiable just by its account name, and given an (optional) gh-pages branch also offers a stable and performant webserver we use to deliver packages for R. A (personal) code repository on the other hand is something we implement somewhere: possibly via drat which supports local directories, possibly on a network share, as well as anywhere web-accessible, e.g. via a GitHub repository. It is a little confusing, but I will aim to make the distinction clearer.
Just once: Setting up a drat repository
So let us for the remainder of this post assume the role of a code publisher. Assume you have a package you would like to make available, which may not be on CRAN and for which you would like to make installation by others easier via drat. The example below will use an interim version of drat which I pushed out yesterday (after fixing a bug noticed when pushing the very new RcppAPT package).
For the following, all we assume (apart from having a package to publish) is that you have a drat directory set up within your git / GitHub repository. This is not an onerous restriction. First off, you don’t have to use git or GitHub to publish via drat: local file stores and other web servers work just as well (and are documented). GitHub simply makes it easiest. Second, bootstrapping one is trivial: just fork my drat GitHub repository and then create a local clone of the fork.
There is one additional requirement: you need a gh-pages branch. Using the fork-and-clone approach ensures this. Otherwise, if you know your way around git you already know how to create a gh-pages branch.
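For readers less sure of their way around git, one way to get onto a gh-pages branch might look like the sketch below. The helper name ensure_gh_pages is made up for illustration and is not part of drat, and the sketch assumes the GitHub remote is called origin.

```shell
# Hypothetical helper (not part of drat): switch to the gh-pages branch.
# Assumes it is run inside the clone and that the remote is named "origin".
ensure_gh_pages() {
    if git show-ref --verify --quiet refs/heads/gh-pages ||
       git show-ref --verify --quiet refs/remotes/origin/gh-pages; then
        # The branch exists locally or on the remote: just check it out.
        git checkout gh-pages
    else
        # No such branch anywhere yet: create it from the current HEAD.
        git checkout -b gh-pages
    fi
}
```

Run ensure_gh_pages from inside the clone. If the branch was newly created, a one-time git push -u origin gh-pages publishes it so that GitHub can start serving it.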
Enough of the prerequisites, and on to the real fun. Let’s ensure we are in the gh-pages branch:
    edd@max:~/git/drat(master)$ git checkout gh-pages
    Switched to branch 'gh-pages'
    Your branch is up-to-date with 'origin/gh-pages'.
    edd@max:~/git/drat(gh-pages)$
Publish: Run one drat command to insert a package
Now, let us assume you have a package to publish. In my case this was version 0.0.1.2 of drat itself as it contains a fix for the very command I am showing here. So if you want to run this, ensure you have this version of drat as the CRAN version is currently behind at release 0.0.1 (though I plan to correct that in the next few days).
To publish an R package into a code repository created via drat running on a drat GitHub repository, just run insertPackage(packagefile), which we show here with the optional commit=TRUE. The path to the package can be absolute or relative; the easiest is often to go up one directory from the sources to where R CMD build ... has created the package file.
    edd@max:~/git$ Rscript -e 'library(drat); insertPackage("drat_0.0.1.2.tar.gz", commit=TRUE)'
    [gh-pages 0d2093a] adding drat_0.0.1.2.tar.gz to drat
     3 files changed, 2 insertions(+), 2 deletions(-)
     create mode 100644 src/contrib/drat_0.0.1.2.tar.gz
    Counting objects: 7, done.
    Delta compression using up to 8 threads.
    Compressing objects: 100% (7/7), done.
    Writing objects: 100% (7/7), 7.37 KiB | 0 bytes/s, done.
    Total 7 (delta 1), reused 0 (delta 0)
    To git@github.com:eddelbuettel/drat.git
       206d2fa..0d2093a  gh-pages -> gh-pages
    edd@max:~/git$
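The build step and the insert step can be chained. The following is only a sketch under a few assumptions: build_and_insert is a made-up helper name (not part of drat), R and drat are installed, the DESCRIPTION file has plain Package: and Version: lines, and the drat repository sits where insertPackage expects it, as in the session above.

```shell
# Hypothetical helper (not part of drat): build a source tarball from a
# package directory, then insert it into the drat repository.
# Usage: build_and_insert path/to/package-sources
build_and_insert() {
    src="$1"

    # Read name and version from DESCRIPTION to predict the tarball name.
    pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$src/DESCRIPTION")
    ver=$(sed -n 's/^Version:[[:space:]]*//p' "$src/DESCRIPTION")
    tarball="${pkg}_${ver}.tar.gz"

    # R CMD build writes the tarball into the current directory.
    R CMD build "$src" || return 1
    [ -f "$tarball" ] || { echo "expected $tarball not found" >&2; return 1; }

    # Same drat call as in the session above. The file name is handed over
    # as a trailing argument, so it needs no quoting inside the R code.
    Rscript -e 'library(drat); insertPackage(commandArgs(trailingOnly=TRUE)[1], commit=TRUE)' "$tarball"
}
```

From the directory above the sources this would be called as build_and_insert drat, which mirrors the "go up one directory" advice.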
You can equally well run this as insertPackage("drat_0.0.1.2.tar.gz"), then inspect the repo and only then run the git commands add, commit and push. Also note that future versions of drat will most likely support git operations directly by relying on the very promising git2r package. But this just affects package internals; the user-facing call of e.g. insertPackage("drat_0.0.1.2.tar.gz", commit=TRUE) will remain unchanged.
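The manual route, after an insertPackage() call without commit=TRUE, could be sketched as below. Both helper names are hypothetical, the repository path is whatever directory drat wrote into (~/git/drat in the session above), and the remote is assumed to be called origin. Staging src/contrib follows from the commit output shown earlier, where the changed files live under that directory.

```shell
# Hypothetical helpers (not part of drat) for the inspect-then-publish route.

# Show what insertPackage() changed in the drat repository at path $1.
review_changes() {
    git -C "$1" status --short
}

# Stage, commit and push those changes by hand.
# $1: path to the drat repository, $2: commit message.
commit_and_push() {
    repo="$1"; msg="$2"
    git -C "$repo" add src/contrib &&
    git -C "$repo" commit -m "$msg" &&
    git -C "$repo" push origin gh-pages
}
```

A typical sequence would be review_changes ~/git/drat, a look at the listed files, and then commit_and_push ~/git/drat "adding drat_0.0.1.2.tar.gz to drat".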
And in a nutshell that really is all there is to it. With the newly drat-ed package pushed to your GitHub repository (with a single function call), it is available via the automatically-provided gh-pages webserver to anyone in the world. All they need to do is to point R’s package management code (which is built into R itself and used for, e.g., the CRAN and BioConductor R package repositories) to the new repo, and that is also just a single drat command. We showed this in the first blog post and may expand on it again in a follow-up.
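For completeness, the receiving side might look like the following sketch. install_from_drat is a made-up helper name. It relies on drat::addRepo(), which adds the drat repository of a GitHub account to R's repos option, and it sets a CRAN mirror explicitly (an assumption made here so that a non-interactive Rscript run does not stop for lack of one).

```shell
# Hypothetical helper (not part of drat): install a package from the drat
# repository of a GitHub account. Requires R with drat already installed.
# Usage: install_from_drat ACCOUNT PACKAGE
install_from_drat() {
    account="$1"; pkg="$2"
    Rscript \
        -e 'args <- commandArgs(trailingOnly = TRUE)' \
        -e 'options(repos = c(CRAN = "https://cloud.r-project.org"))' \
        -e 'drat::addRepo(args[1])' \
        -e 'install.packages(args[2])' \
        "$account" "$pkg"
}
```

For the package published in this post, the call would be install_from_drat eddelbuettel drat.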
So in summary, that really is all there is to it. After a one-time setup / ensuring you are on the gh-pages branch, all it takes is a single function call from the drat package to publish your package to your drat GitHub repository.
This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.