A basic reproducible data analysis workflow – principles

[This article was first published on Joris Muller's blog - Posts about R, and kindly contributed to R-bloggers.]

The goal of reproducible data analysis is, according to the CRAN task view, to tie specific instructions to data analysis and experimental data so that scholarship can be recreated, better understood and verified.

To prepare a short talk on this subject for the next Strasbourg R user group meeting, I’m starting a series of blog posts. Today I will focus on the goals and constraints of my personal workflow.

Goals of my workflow

Besides some side projects, I mainly use R in an academic context. For this, I must be able to:

Constraints

Furthermore, I have some constraints.

Summary of the implementation

With these goals and constraints, R is a natural choice:

After several years of R practice, I developed a simple workflow based on

This workflow is basic and doesn’t cover every reproducibility problem (e.g. there is no software archive), but for me it strikes a good balance.

I will describe this workflow in a later post.

If you have your own reproducible data analysis workflow, please feel free to describe it in the comments or send me a link!
