Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Genesis
I started to use continuous integration with gitlab a few weeks ago and, until a few days ago, was really happy with the rocker image (basically docker + R). I then became ambitious and started to write a markdown document comparing the speed of R and Python on simple operations. It worked fine on my laptop, where anaconda is installed. However, because anaconda is not available in the rocker image, compilation of the markdown naturally failed. I thus started a project to create a docker image that would do the job, i.e. one that would integrate Python and R together. The container I propose is not well suited to Python-only repositories; its goal is to ease the pass-through between Python and R.
Since I am a beginner in the docker ecosystem, it has not been an easy path. When I thought the solution would be trivial to implement, I was planning to keep the repository private. However, I now think that the solution can help people, so I decided to make it public. To make the project as reproducible as possible, I ended up with this rather complex workflow:
- github connected to dockerhub to build the base image from DockerFile
- gitlab with continuous integration, using the /gitlabCI/.simple_configuration.yml example file as a reproducible workflow
- dockerhub, which automatically builds the docker image from the github repository
Complex workflow, simple image
This is not the most natural workflow. If you go into the project history, you will see that I did not adopt this workflow initially. I adopted it after merging branches from two separate projects that were pursuing the same goal. This complex setup has one advantage for reproducibility: each time project updates are pushed, both the code used to build the pocker image and the example of its use in continuous integration are updated.
I should warn people who are used to creating docker images that I might not have created the most parsimonious image needed to run R and Python together. I would welcome pull requests to improve the pocker repository.
Some explanations
DockerFile is used to build the image. The main steps are the following:
- Start from the rocker/verse container, which avoids re-installing tidyverse each time a CI/CD job is run.
- Install python 3 and anaconda
- Add the conda binary directory to the path
- Install the reticulate package
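Once these steps have run, R, python 3 and conda should all be reachable from the path. Here is a minimal sketch in Python of how one might check this from inside the container. The executable names (R, Rscript, python3, conda) and the helper missing_tools are illustrative assumptions, not something taken from the pocker repository, and the sketch does not check that the reticulate package is installed.

```python
import shutil
import sys


def missing_tools(tools, path=None):
    """Return the names in `tools` that cannot be found as executables.

    `path` is a search path string; None means the current PATH variable.
    """
    return [name for name in tools if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them to what the image really provides.
    required = ["R", "Rscript", "python3", "conda"]
    missing = missing_tools(required)
    if missing:
        print("not found on PATH:", ", ".join(missing), file=sys.stderr)
    else:
        print("all tools found on PATH")
```

If conda is reported as missing while anaconda is installed, the likely culprit is the third step: the conda binary directory has not been added to the path.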
In the gitlabCI directory, you will find scripts useful for continuous integration related to the docker project:
- complete_configuration.yml: the gitlab CI/CD configuration file I was using before building my own docker image. It starts from rocker/verse and follows the same steps as the Dockerfile presented above
- simple_configuration.yml: the gitlab CI/CD configuration I use now that the pocker container is built
The other scripts (build.R and scripts/*) are there to provide tests for the configuration obtained from gitlab CI/CD.