Sweave Tutorial 1: Using Sweave, R, and Make to Generate a PDF of Multiple Choice Questions
This post shows how to use make, Sweave, and R to generate a PDF of multiple choice questions.
Overview
The repository with all source files is available at:
The repository allows you to download all files as an archive or view the files individually on the web. A copy of the final PDF generated from the process is available here.
I ran the code on Windows with the following programs installed.
- R: For the R code
- Rtools: For make and the sh commands used in the makefile, and for Sweave to run on the command line
- MiKTeX: For compilation of the PDF using texify and the default downloading of the exam document class
It should run on Mac and Linux with appropriate R, make, and LaTeX tools installed.
Assuming you have the above installed, to run the code:
- Download the repository from github
- Unzip it to a directory
- Open the shell in that directory
- Type: make
The remainder of this post explains the code in each of the main files in the repository.
The Makefile
The makefile is used to build the PDF from the Rnw source. It also performs other useful tasks. A copy of the makefile is shown below:
output = .output
rnwfile = Sweave_MCQ
backup = .backup

all:
	R CMD Sweave $(rnwfile).Rnw
	-mkdir $(output)
	-cp *.sty $(output)
	-mv *.tex *.pdf *.eps $(output)
	cd $(output); texify --run-viewer --pdf $(rnwfile).tex

tex:
	cd $(output); texify --run-viewer --pdf $(rnwfile).tex

clean:
	-rm $(output)/*

backup:
	-mkdir $(backup)
	cp $(output)/$(rnwfile).pdf $(backup)/$(rnwfile).pdf
I recently posted on the benefits of makefiles when developing Sweave documents.
The makefile starts with three variables.
- output stores the name of the folder where all derivative files are placed.
- rnwfile stores the name of the Rnw source file without the .Rnw extension. This is also the base of the resulting .tex and .pdf files.
- backup stores the name of the folder where a copy of the pdf is placed if this is desired.
The file then has four goals.
The default goal is called all:. If make is called without an argument from the command line in the project directory, the recipe immediately below all: is run. Note that all apparent indentations are tab indentations (a set of spaces would cause an error).
- The first line runs Sweave on the Rnw file from the command line. On Windows, R CMD Sweave requires installation of Rtools. Note how $(rnwfile).Rnw will actually be Sweave_MCQ.Rnw after variable substitution.
- The second line creates a directory that corresponds to the value in the variable output. The hyphen at the start of the line ensures that any errors, such as the folder already existing, do not stop make from running. Note that commands such as mkdir, cp, mv, cd, and rm are based on the sh shell. These commands are supported on Windows if you have Rtools installed.
- The third line copies Sweave.sty (and any other .sty files) into the output folder.
- The fourth line moves tex, pdf and eps files (i.e., those generated by the Sweave command) into the output folder. This is done to ensure that the root directory only includes source files. This has several benefits: (a) it makes version control easier; (b) it makes it easy to see the source files; and (c) it reduces the risk of accidentally deleting source files when deleting derived files.
- The fifth line changes the working directory to the output directory and then runs texify on the tex file generated by Sweave. The flags ensure that a pdf is generated and that the default viewer is opened. This command could be modified to something like pdflatex or some other latex program. By changing the directory, all the derived latex files are kept in the output folder.
The tex: goal can be called by running make tex at the command line. I use it when there is an error in compiling the pdf from the tex file. Sometimes it's easier to work out where the bug is by manipulating the intervening tex file. Of course, once the problem has been identified, the fix needs to be incorporated into the Rnw source.
The clean: goal removes all files in the output directory (i.e., all the derived files).
The backup: goal copies the resulting pdf into the backup folder. I figured this might be useful in order to include a copy of the final product in the repository.
.gitignore
/.output
.project
The .gitignore file prevents all files in the /.output directory (i.e., the derived files) and the file .project from being placed under version control in git.
I’m preparing a post on version control, git, and github which will be posted shortly.
Sweave_MCQ.Rnw
Sweave_MCQ.Rnw is the R noweb file that contains chunks of LaTeX and R code. When Sweave is run on this file, the R code chunks are converted into tex and, potentially, image files are generated.
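To make the conversion step concrete, here is a minimal sketch that writes a tiny Rnw file and runs Sweave on it from within R. The file name demo.Rnw and its contents are made up for illustration and are not part of the repository; the sketch only assumes a standard R installation (no LaTeX is needed just to produce the tex file).

```r
# Work in a temporary directory so no project files are touched
old <- setwd(tempdir())

# A tiny, hypothetical Rnw file: LaTeX text plus one R code chunk
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<demo, echo=FALSE, results=tex>>=",
  'cat("\\\\textbf{Generated by R}\\n")',
  "@",
  "\\end{document}"
), "demo.Rnw")

# Sweave runs the chunk and writes demo.tex, with the chunk
# replaced by whatever the R code printed
utils::Sweave("demo.Rnw", quiet = TRUE)
tex <- readLines("demo.tex")

setwd(old)
print(tex)
```

The printed lines should show the LaTeX from the Rnw file with the chunk replaced by the text that cat produced.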
LaTeX Preamble
\documentclass[12pt, a4paper]{exam}
\usepackage[OT1]{fontenc}
\usepackage{Sweave}
\SweaveOpts{echo=FALSE}
\usepackage{hyperref}
\hypersetup{pdfpagelayout=SinglePage} % http://www.tug.org/applications/hyperref/ftp/doc/manual.html
\setkeys{Gin}{width=0.8\textwidth}
\pagestyle{headandfoot} % every page has a header and footer
\header{}{Sample Multiple Choice Questions}{}
\footer{}{Page \thepage\ of \numpages}{}
The latex preamble is mostly general code that ensures proper display of the resulting document.
- The exam document class is great for writing a variety of exam style documents in LaTeX. See CTAN (exam) for documentation.
- hyperref is used to display hyperlinks and allows the resulting pdf to open in SinglePage format.
- \setkeys... controls the width of Sweave figures relative to the text width. There are no figures in this document, so it is not really required.
- \pagestyle{headandfoot}... These three lines ensure the display of header and footer information on each page.
First R Code Chunk
<<prepare_data>>=
items <- read.csv("data/items.csv", stringsAsFactors = FALSE)
writeQuestion <- function(x){
    c("\\filbreak",
      paste("\\question\n", x["itemText"]),
      "\\begin{choices}",
      paste("\\choice", x["optionA"]),
      paste("\\choice", x["optionB"]),
      paste("\\choice", x["optionC"]),
      paste("\\choice", x["optionD"]),
      "\\vspace{10 mm}",
      "\\end{choices}\n\n")
}
itemText <- apply(items, 1, function(X) writeQuestion(x = X))
answers <- paste(items$item, "=", LETTERS[as.numeric(items$correctAnswer)], sep ="")
answersText <- paste(answers, collapse = "; ")
@
- R code chunks in Sweave are commenced by <<>>= and ended by @. These need to appear in the first column of the text file.
- <<prepare_data>>=: The first non-keyword placed in the opening tags provides a name for the R code chunk. A short descriptive title is useful both when reading the source and when debugging Sweave compilation.
- items...: This line reads a csv file into a data frame with 40 cases. Each case is a multiple choice question with fields such as the question text, the text for the four response options, and the correct answer.
- writeQuestion...: This is a function which is designed to take a row of data from the items data frame and return a latex formatted character vector, where each element is ultimately printed on its own line of the tex file. Note how, in order to produce one backslash in LaTeX, two backslashes need to be written in the R string. The \question and \choice commands and the choices environment are part of the exam document class and are used for formatting multiple choice questions.
- apply... takes the items data frame and for each row (1=rows) runs the function writeQuestion on the row.
- answers... and answersText create a formatted string that shows item numbers and letters for correct answers, all drawn from the items data frame.
Remaining code
\begin{questions}
<<print_items, results=tex>>=
cat(itemText, sep = "\n")
@
\newpage
\section*{Answers}
<<print_answers, results=tex>>=
cat(answersText)
@
\end{questions}
- The second R code chunk has a descriptive name print_items. It uses the key-value pair results=tex. This ensures that Sweave interprets the text output using cat as raw tex.
- cat... prints the long character vector itemText containing all the latex for the questions. sep="\n" means that each element is printed on a new line, which makes the resulting tex file easier to read.
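To see what these chunks produce, the sketch below runs the same writeQuestion function on two made-up items. The data frame here is a hypothetical stand-in for data/items.csv (the real file has 40 rows and may contain other columns); only the column names are taken from the code above.

```r
# Hypothetical stand-in for data/items.csv; column names follow the post's code
items <- data.frame(
  item = 1:2,
  itemText = c("What is 2 + 2?", "Which letter comes first?"),
  optionA = c("3", "A"),
  optionB = c("4", "B"),
  optionC = c("5", "C"),
  optionD = c("6", "D"),
  correctAnswer = c(2, 1),
  stringsAsFactors = FALSE
)

# Same function as in the post: one row in, a character vector of LaTeX lines out
writeQuestion <- function(x) {
  c("\\filbreak",
    paste("\\question\n", x["itemText"]),
    "\\begin{choices}",
    paste("\\choice", x["optionA"]),
    paste("\\choice", x["optionB"]),
    paste("\\choice", x["optionC"]),
    paste("\\choice", x["optionD"]),
    "\\vspace{10 mm}",
    "\\end{choices}\n\n")
}

# apply() returns a character matrix with one column of LaTeX lines per item;
# cat() walks it column by column, so the questions stay in order
itemText <- apply(items, 1, function(X) writeQuestion(x = X))

# Map the numeric correct answers onto letters to build the answer key
answers <- paste(items$item, "=", LETTERS[as.numeric(items$correctAnswer)], sep = "")
answersText <- paste(answers, collapse = "; ")

cat(itemText, sep = "\n")   # each "\\" in the R source prints as one backslash
cat(answersText, "\n")      # prints: 1=B; 2=A
```

Printing with cat, rather than print, is what turns the doubled backslashes of the R strings into the single backslashes LaTeX expects.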
Summary and Related Resources
The combination of make, R, Sweave, and LaTeX is tremendously powerful. Hopefully, this post encourages a few more people to have a play. To learn more, check out some of the following posts and pages:
- Getting Started with Sweave
- makefiles for Sweave, R and LaTeX using Eclipse on Windows
- Question on Stats.SE: 'Complete substantive examples of reproducible research using R'
UPDATE: After posting this article on my blog, Christophe Lalanne let me know about a paper by Bettina Grün and Achim Zeileis called 'Automatic Generation of Exams in R' in JSS. It also uses Sweave, LaTeX, and the exams document class.