Building an R package (under Windows) without C, C++ or FORTRAN code
Why build an R package? It basically boils down to being able to brag at your local pub that a new version of YOUR package is on CRAN as of 7 p.m. CET. But seriously, if you’ve produced some functions that other people, like your boss, co-workers or students, might benefit from using (or have ordered from you), consider building a package. The lower chance of broken dependencies and the ease of installing everything outweigh the effort of learning how to build one. If you feel your functions (that may be new in some respect) could benefit an even wider audience, consider submitting the package to CRAN (I will not discuss how to do that here, but do read the Ripley reference I mention later).
I have set out to build a test package to prepare myself for when the time comes and I really need to build one of my own. This is my attempt to document the steps I took when building a dummy package (called texasranger (yes, THE Texas Ranger!)) with one single function. I have attempted to build documentation and all the other ladeeda things that are mandatory for the package to check out O.K. when building it.
Before you dig into the actual preparation and building itself, you will need a bunch of tools. These come in a bag with a Linux distribution, but you will have to add them yourself if you’re on Windows. This is basically the only thing that differs between building a package on Windows and on Linux. I will not go into details regarding these tools (Perl, the MS HTML Help compiler, a TeX distro and, if you have C/C++/FORTRAN code, the GNU compiler set). I will, however, advise you to check out Making R package under Windows (P. Rossi). There, you will find a detailed description (see pages 2-6) of how to get all the correct tools and how to set them up. When you have done so, you are invited to come back here. Feel free to follow the just-mentioned tutorial, as it goes a bit more in depth in explaining various aspects. The author warns that MiKTeX will not work (see the date of the document), but things seem to have changed since then and it now works, at least for me.
I have followed the aforementioned Making R package under Windows (by P. Rossi), the slides Making an R package by R.M. Ripley and, of course, the now famous Writing R Extensions (WRE) by the R Development Core Team (you are referred to this document everywhere). I would advise everyone to read them in the listed order, or at least to read WRE last. The first two can be read from cover to cover in a few minutes; the last one is a good reference document for those pesky “details”. In my experience, I started to appreciate WRE only after I had read the first two documents.
Enough chit-chat, let’s get cracking!
1. These are the paths I entered (see the document by Rossi for what this is all about) to make all the tools accessible from the Command prompt (it can be found under Accessories; on other OSs it may be called Terminal or Console):
c:/rtools/bin;c:/program files/miktex 2.8/miktex/bin;c:/program files/ghostgum/gsview;C:/strawberry/perl/bin;c:/program files/r/r-2.11.0/bin;c:/program files/help workshop;%SystemRoot%\system32;%SystemRoot%;%SystemRoot%\System32\Wbem;%SYSTEMROOT%\System32\WindowsPowerShell\v1.0\;C:\strawberry\c\bin;C:\strawberry\perl\site\bin
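A quick way to see whether the PATH took effect is to ask R where it finds the tools. This is only a sketch: the executable names in `tools_needed` are my assumption about what the tools are called, so adjust them to your own installation.

```r
# Executables assumed to be needed; the names are illustrative guesses,
# adjust them to what your installation actually provides.
tools_needed <- c("R", "perl", "pdflatex", "tar")

# Sys.which() returns the full path of each executable found on the PATH,
# or an empty string when it cannot be found.
found <- Sys.which(tools_needed)

missing_tools <- names(found)[found == ""]
if (length(missing_tools) > 0) {
  message("Not found on the PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("All listed tools were found.")
}
```

Run it in a fresh R session, because a session started before you edited the PATH will still see the old value.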
2. Use the R function
package.skeleton()
to create directory structure and some files (DESCRIPTION and NAMESPACE). I used the following arguments:
package.skeleton(name = "texasranger", list = c("bancaMancaHuman"), namespace = TRUE) # I only have one function, but you can list more
See argument code_files for an alternative way of telling the function where to read your functions. I suspect this may be very handy if you have each function in a separate file.
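Here is a self-contained sketch of this step that you can run without touching your working directory. The body of bancaMancaHuman and the temporary location are invented for illustration. I left out the namespace argument on the assumption that you are on a recent R version, where every package gets a NAMESPACE anyway.

```r
# A stand-in for the real function; its body is invented for illustration.
bancaMancaHuman <- function(x) {
  x + 1
}

# Create the skeleton in a fresh temporary directory so nothing in the
# working directory is overwritten.
pkg_parent <- tempfile("pkgdemo")
dir.create(pkg_parent)

package.skeleton(
  name = "texasranger",
  list = "bancaMancaHuman",      # names of the objects to include
  environment = environment(),   # look the objects up where this code runs
  path = pkg_parent
)

# Inspect what was generated (DESCRIPTION, NAMESPACE, R/ and man/ files).
list.files(file.path(pkg_parent, "texasranger"), recursive = TRUE)
```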
3. Fill out DESCRIPTION and NAMESPACE (if you decide to have a name space, read more in the WRE document). Pay special attention to export, import, useDynLib… All of the above-mentioned documentation will help guide you through the process with minimal effort.
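To get a feel for what a filled-out DESCRIPTION looks like, here is a sketch that writes one from R and reads it back. Every field value is a placeholder I made up; check WRE for which fields are required and what they may contain.

```r
# Field values below are placeholders invented for illustration.
desc <- data.frame(
  Package = "texasranger",
  Type = "Package",
  Title = "A Dummy Package With One Function",
  Version = "1.0-1",
  Author = "Your Name",
  Maintainer = "Your Name <you@example.com>",
  Description = "Contains a single function, used to practise package building.",
  License = "GPL-2",
  stringsAsFactors = FALSE
)

# DESCRIPTION uses the Debian control file (DCF) format: one "Field: value"
# entry per field.
desc_file <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = desc_file)

# Read it back to confirm the fields parse as expected.
read.dcf(desc_file)
```

In practice you would simply edit the DESCRIPTION file that package.skeleton() produced; writing it from R just makes the format easy to inspect.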
A side (but important) note. You should write your functions without them calling
require()
or
source()
to dig up other functions and packages. Read more about NAMESPACE and how to specify which functions and packages to “export” (or “import”).
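As a rough sketch of what this means in practice: the function below calls another package's function with `::` instead of attaching the package with require(), and the NAMESPACE lines declare what is exported and imported. The function body and the choice of stats::median are my own illustration, not something from the actual texasranger package.

```r
# Inside a package function, call another package's function with `::`
# (or import it in NAMESPACE) rather than attaching it with require().
# The function body is invented for illustration.
bancaMancaHuman <- function(x) {
  stats::median(x, na.rm = TRUE)
}

# A matching NAMESPACE, written from R so the directives are visible.
namespace_lines <- c(
  "export(bancaMancaHuman)",     # make the function visible to users
  "importFrom(stats, median)"    # declare the dependency on stats::median
)
ns_file <- file.path(tempdir(), "NAMESPACE")
writeLines(namespace_lines, ns_file)

bancaMancaHuman(c(3, 1, NA, 2))  # returns 2
```

A package you import from would normally also be listed in the Imports field of DESCRIPTION; see WRE for the details.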
4. Create documentation files. This is said to be the trickiest part. I still don’t have much experience with this so I can’t judge how tricky it can be – but I can tell you that it may be time consuming. Make sure you take time to document your functions well. If you were smart, you wrote all this down while you were writing (or preparing to write) a function and this should be more or less a session of copy-paste. Use
prompt(bancaMancaHuman, filename = "bancaMancaHuman.Rd")
to create template Rd files ready for editing. They are more or less self-explanatory (with plenty of instructions). It helps if you know LaTeX, but it is not necessary. Note that prompt() writes the file to the current working directory unless filename says otherwise, so move it into the man/ directory yourself if needed (package.skeleton() already places templates there for the objects it was given). Perhaps worth mentioning is that if you want to link to functions outside your package, use (notice the optional square brackets [])
\code{\link[package:function]{function}}
, e.g.
\code{\link[raster:polygonsToRaster]{polygonsToRaster}}
or
\code{\link[utils:remove.packages]{remove.packages}}
To refer to an “internal” package function (one visible to the user), use
\code{\link{function_name}}
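If you want to see what such a template looks like before committing to anything, this sketch generates one in a temporary directory. The function body is a stand-in I invented; only the name comes from my dummy package.

```r
# Stand-in function; the body is invented for illustration.
bancaMancaHuman <- function(x) x + 1

# Write the template to a temporary location instead of the working directory.
rd_file <- file.path(tempdir(), "bancaMancaHuman.Rd")
prompt(bancaMancaHuman, filename = rd_file)

# Show the first lines of the generated template.
head(readLines(rd_file), 10)
```

The template contains placeholder text for every section; R CMD check will complain about sections you leave unedited, so go through all of them.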
4a. If you have datasets you wish to include in your package (assuming those in library(help = "datasets") are not sufficient), you will need to do two things. First, prepare your object (list, data.frame, matrix…) and save it; the saved .rda file goes into the data/ directory. Second, prepare its documentation; the documentation file goes into the same directory (man/) as the other .Rd files. If your dataset is not bigger than 1 MB you shouldn’t worry, otherwise consult the Manual on how to prepare a
save(my.dataset, file = "my.dataset.rda") # move to data/ folder
promptData(my.dataset, filename = "my.dataset.Rd") # move to man/ folder and edit
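The same two calls can write straight into the right folders, which saves the moving around. A sketch, assuming a hypothetical package root in a temporary directory and a made-up data frame:

```r
# Invented example data.
my.dataset <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# Hypothetical package root in a temporary directory.
pkg_dir <- file.path(tempdir(), "texasranger")
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "man"), recursive = TRUE, showWarnings = FALSE)

# Save the object into data/ and its documentation template into man/.
rda_file <- file.path(pkg_dir, "data", "my.dataset.rda")
save(my.dataset, file = rda_file)
promptData(my.dataset, filename = file.path(pkg_dir, "man", "my.dataset.Rd"))

# Size of the saved file in bytes, to compare against the 1 MB mark.
file.size(rda_file)
```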
4b. You should also build a vignette, where you can explain at greater length what your package is about and maybe give a detailed (or more detailed) workflow with the accompanying functions. You can use Sweave or knitr, and the folder to place your .Rnw file is vignettes/.
5. To check the documentation for errors, use
R CMD Rd2txt filename.Rd
and/or
R CMD Rdconv --type=html --output=filename.html filename.Rd
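You can do much the same from inside an R session with functions from the tools package. The sketch below writes a minimal, hand-made Rd file (all its wording is invented) and then checks and converts it.

```r
# A minimal Rd file; the text of each section is invented for illustration.
rd_lines <- c(
  "\\name{bancaMancaHuman}",
  "\\alias{bancaMancaHuman}",
  "\\title{A Dummy Function}",
  "\\description{Does very little, but is documented.}",
  "\\usage{bancaMancaHuman(x)}",
  "\\arguments{\\item{x}{A numeric vector.}}"
)
rd_file <- file.path(tempdir(), "bancaMancaHuman.Rd")
writeLines(rd_lines, rd_file)

# checkRd() parses the file and reports any problems it finds.
print(tools::checkRd(rd_file))

# Convert to plain text and to HTML.
txt_file <- file.path(tempdir(), "bancaMancaHuman.txt")
html_file <- file.path(tempdir(), "bancaMancaHuman.html")
tools::Rd2txt(rd_file, out = txt_file)
tools::Rd2HTML(rd_file, out = html_file)
```

Open the resulting .html file in a browser to see roughly what users will get in the help system.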
6. Next, you should run a check on your package before you build it. You should run it from the directory where the package directory is located. I’ve dumped my package contents to d:/workspace/texasranger/ and executed the commands from d:/workspace/
R CMD check texasranger
If you get any errors, you will be referred to the output file. READ and UNDERSTAND it.
7. Build the package with the command
R CMD build package_name
This will create a file with the version added to its name (as specified in the DESCRIPTION file), e.g. package_name_1.0-1.tar.gz; see WRE for specifics on package version directives.
package_name is actually the name of the directory (which should be the name of your package as well).
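To make the naming concrete, this little sketch reads the Package and Version fields from a DESCRIPTION file and puts together the file name you should expect. The two-line DESCRIPTION is a stripped-down stand-in written only for this example.

```r
# A stripped-down stand-in for a real DESCRIPTION file.
desc_file <- tempfile("DESCRIPTION")
writeLines(c("Package: texasranger", "Version: 1.0-1"), desc_file)

# Read the two fields that determine the name of the built file.
desc <- read.dcf(desc_file, fields = c("Package", "Version"))
tarball <- sprintf("%s_%s.tar.gz", desc[1, "Package"], desc[1, "Version"])
tarball  # "texasranger_1.0-1.tar.gz"
```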
If you use Windows, you can build a binary .zip file AND install the package at the same time. Use the command
R CMD INSTALL --build package_name_1.0-1.tar.gz
8. Rejoice.