
Transform a folder as git project synchronized on Github or Gitlab

[This article was first published on Rtask, and kindly contributed to R-bloggers.]

You can read the original post in its original format on Rtask website by ThinkR here: Transform a folder as git project synchronized on Github or Gitlab

You have been working for years on your R scripts, and saved all versions as “script_v1.R”, “script_v2.R”, “script_v2_best-of-the-world.R”, … One day, you heard about git, a versioning system that allows you to make your files travel through time. But how do you transform a directory of falsely versioned files into a git repository and synchronize it online?

Originally, I wanted to write a blog post about our package {gitdown}, but I realized I needed a reproducible example of a git repository on Github and Gitlab, and I started to document the steps. This then became too long to stay in the same article. So here it is: the preamble of my article “Download Gitlab or Github issues and make a summary report of your commits”. These are the very first steps in transforming a local folder into a git repository, synced remotely with a git provider. Also, we’ll see how to synchronize the same project with two different git providers, Github and Gitlab, at the same time.

If you want to do a quick read, you can read titles and text in bold to get the main points of the sections.

git, for personal projects

When speaking about git, you may have heard that it is a great tool for collaborative work in software development. It helps manage multiple developers working in parallel on the same project while reducing the fear of losing parts of people’s work. But you do not need to work in a group to use git with your project. As soon as you have some scripts, you may want to track the changes you make over time. git is here to help.

To install git, you can follow this guide in “Happy Git and GitHub for the useR”.

Transform my directory into git project

You should probably work in an RStudio project: File > New project > Existing directory.
If you do not want to use RStudio, you will need to set your folder as your current working directory.
Then run usethis::use_git() and answer the questions (I would say “Yes” twice).

git requires that you say who you are, so that each modification of your code is assigned to you. With this line of code, you can define your configuration globally. Do it once on this computer, and you won’t have to do it again.

usethis::use_git_config(user.name = "Jane", user.email = "jane@example.org")

That’s it. You’re now working in a git repository.
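For readers who prefer the Terminal, the two R calls above correspond roughly to the plain git commands below. This is a minimal sketch, not what {usethis} runs internally: it works in a throwaway directory created with mktemp (the demo_dir name is only for illustration), and it sets the identity for that repository only. Adding --global to the two git config lines would apply it to every repository on the computer.

```shell
# Work in a throwaway directory so the sketch is safe to run anywhere
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Turn the directory into a git repository (creates the hidden .git/ folder)
git init --quiet

# Tell git who you are; without --global this applies to this repository only
git config user.name "Jane"
git config user.email "jane@example.org"

# Check what git recorded
git config user.name
git config user.email
```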

git beginners command lines

Your project is now a local git repository. If you are a beginner and do not want to bother now with all of git’s wonderful possibilities, that’s OK. There are two actions to perform daily: add and commit.

Any reason to commit is good. To commit does not mean you can’t continue to work on this file afterward. Indeed, if you make new modifications, you can add them for the next commit. Commit messages will be the equivalent of your v1, v2, final_version, best_final_version, …

Use add and commit with RStudio graphical interface

In RStudio, as soon as you restart a project with a git repository inside, you will have access to the git pane.

You’ll be able to see the history of your commits by clicking on the clock icon.

Use add and commit in command line

If you are still in RStudio, you can use its terminal to commit. You can also open a Terminal directly, making sure you are in the correct folder.

In the Terminal

# move to correct folder (named repo.rtask here)
cd repo.rtask
# add some files
git add example.txt
git add NEWS.md
# commit with message
git commit -m "My first command line commit"
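To see how commits replace the “script_v1.R”, “script_v2.R” habit, here is a small self-contained sketch. It assumes a hypothetical file script.R in a throwaway repository; the file name, its content and the commit messages are made up for the illustration.

```shell
# Throwaway repository for the illustration
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Jane"
git config user.email "jane@example.org"

# First version of a hypothetical script
echo 'x <- 1' > script.R
git add script.R
git commit --quiet -m "First version of script"

# Modify the same file and commit again: no need for a script_v2.R
echo 'x <- 2' > script.R
git add script.R
git commit --quiet -m "Change value of x"

# History of commits, most recent first
git log --oneline

# Content of script.R as it was one commit ago
git show HEAD~1:script.R
```

The working copy always holds the latest version, while every committed version stays reachable through the history.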

Create an example git project

For tests and examples in the {gitdown} package, function fake_repo() builds a fake project with some files and commits. This is already a folder with a git repository. I will use this repository as the reproducible example of this blog post.

# remotes::install_github("ThinkR-open/gitdown")
library(gitdown)
# Create the repo here with `R/` and `vignettes/` directories
repo <- fake_repo(path = "repo.rtask", as.package = TRUE)
# Content of the repo
fs::dir_tree(repo)
## repo.rtask
## ├── NEWS.md
## ├── R
## │   └── my_mean.R
## ├── example.txt
## └── vignettes

I can also see that we are indeed in a git repository using {git2r}, with some files tracked and others not.

library(git2r)
ls_tree(repo = repo)
##     mode type                                      sha path        name len
## 1 100644 blob 9c66eff9a1f6f34b6d9108ef07d76f8ce4c4e47f          NEWS.md  98
## 2 100644 blob c36b681bb31b80cbd090f07c95f09788c88629a6      example.txt 200
status(repo = repo)
## Untracked files:
##  Untracked:  .gitignore
##  Untracked:  R/

Synchronize a local git repository with Github

Now that I have a local git repository, I will synchronize it on Github, then on Gitlab.

Creation of a Github repository

Send the content of your local repository to Github

Information on synchronization with your local repository is presented by Github in your project home page.
Choose the part “…or push an existing repository from the command line” and execute these lines in a Terminal.

In the Terminal

# Change directory if needed
cd repo.rtask
# synchronize with github
git remote add origin git@github.com:statnmap/repo.rtask.git
git push -u origin master

Go back to your Github project page to see the two files uploaded: https://github.com/statnmap/repo.rtask

Note that the same kind of operations would be used for a synchronization with a Gitlab server.
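If you want to rehearse the remote add and push commands without touching a real account, a bare repository on your disk can stand in for the server. This is only a sketch under that assumption: remote_dir plays the role of git@github.com:statnmap/repo.rtask.git, and HEAD is pushed instead of master so that it works whatever your default branch is called.

```shell
# A bare repository on disk stands in for the remote server
remote_dir=$(mktemp -d)
git init --quiet --bare "$remote_dir"

# A local repository with one commit
local_dir=$(mktemp -d)
cd "$local_dir"
git init --quiet
git config user.name "Jane"
git config user.email "jane@example.org"
echo "Some text" > example.txt
git add example.txt
git commit --quiet -m "Add example"

# Declare the remote under the name "origin", then push the current branch
# -u records origin as the default target for later git push / git pull
git remote add origin "$remote_dir"
git push --quiet -u origin HEAD

# List the remotes known to this repository
git remote -v
```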

Add some issues on Github

I open two issues on my Github. Indeed, in the commits made by fake_repo(), messages mention different issues from #1 to #145. For instance, you can look at this commit message (here on Github).

summary(last_commit(repo))
## Commit:  1ef6c084213854942de71e0c00921ca9f08c6b9d
## Author:  Alice <alice@example.org>
## When:    2020-08-10 14:49:27 GMT
## 
##      Add NEWS
##      
##      issue #32.
##      issue #1.
##      issue#12
##      ticket6789.
##      ticket1234
##       Creation of the NEWS file for version 0.1.
## 1 file changed, 3 insertions, 0 deletions
## NEWS.md | -0 +3  in 1 hunk
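Issue references such as #32 can be pulled out of a commit message with standard tools. The sketch below only illustrates the idea and is not how {gitdown} does it: it recreates a commit with the same message in a throwaway repository and keeps every “#” followed by digits, so the “ticket” mentions are ignored.

```shell
# Throwaway repository with one commit carrying the message shown above
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Alice"
git config user.email "alice@example.org"
echo "# News" > NEWS.md
git add NEWS.md

# The first -m is the title, the second -m is the body of the message
git commit --quiet -m "Add NEWS" -m "issue #32.
issue #1.
issue#12
ticket6789.
ticket1234
 Creation of the NEWS file for version 0.1."

# Print the full message of the last commit and keep only the "#<digits>" parts
git log -1 --format=%B | grep -oE '#[0-9]+'
```

With this message, the last command prints #32, #1 and #12, one per line.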

I will not create 145 issues! Two are enough for the purpose of the following blog post. So let me create two issues with a title and a small description.

Communicate with a distant git server

When you synchronize your local git repository with a distant server, you need to learn two more actions: pull and push.

If you work alone, chances are you do not need to pull, because modifications are first created on your computer and there is no reason for the server to have more recent modifications than your computer. Instead, you want to push your commits to the server regularly. You can push multiple commits at the same time. We usually say “Commit early and often, push from time to time”. Later you’ll learn that you can modify the content of your commits as long as they have not been pushed yet.

In the RStudio git pane, use the “↓” (blue down arrow) to pull and the “↑” (green up arrow) to push to the server. In the command line, use git pull and git push.
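From the Terminal, git can also tell you how many commits are waiting to be pushed. The sketch below uses a bare repository on disk as a stand-in for the server (an assumption for the demo, as are the file names and commit messages). It also shows git commit --amend, one way to modify a commit that has not been pushed yet.

```shell
# Stand-in server and local repository
remote_dir=$(mktemp -d)
git init --quiet --bare "$remote_dir"
local_dir=$(mktemp -d)
cd "$local_dir"
git init --quiet
git config user.name "Jane"
git config user.email "jane@example.org"
git remote add origin "$remote_dir"

# First commit, pushed right away
echo "Some text" > example.txt
git add example.txt
git commit --quiet -m "Add example"
git push --quiet -u origin HEAD

# Two more commits, kept local for now
echo "More text" >> example.txt
git commit --quiet -am "Extend example"
echo "# News" > NEWS.md
git add NEWS.md
git commit --quiet -m "Add NEWZ"

# Fix the typo in the last message: fine because it is not pushed yet
git commit --quiet --amend -m "Add NEWS"

# Number of local commits the server does not have yet
git rev-list --count '@{u}..HEAD'

# Send them all at once, then count again
git push --quiet
git rev-list --count '@{u}..HEAD'
```

The first count is 2 (the two local commits) and the second is 0 once everything is pushed.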

Synchronize a local git repository on Gitlab

We could follow exactly the same steps on Gitlab as we did on Github. If you plan to have a single remote repository, which is what almost all developers do, follow the Github guide above on your Gitlab server (a “Repository” is named a “Project” there). You can then go directly to the last section of this blog post.
What are you going to do after that? Why not read this to transform your project into a package, with the appropriate documentation that will allow you to maintain it without headaches?

Here, this is a good opportunity to show how to migrate from Github to Gitlab or to synchronize one local repository to two different remote providers.

Move a Github repository to Gitlab

I first tried to use the git transfer protocol proposed by Gitlab to retrieve my repository and my issues on my Gitlab.com account.

…or create a copy of your Github on Gitlab

The Gitlab importer works in theory. In my case, not all my Github repositories were listed in the Gitlab importer. And, of course, the repository of interest was not there. This paragraph shows what I did to copy the git content to another remote provider.

In the Terminal

# Get all last modifications
git pull
# rename the local link dedicated to github
git remote rename origin origin_github
# Create the local link to Gitlab by default
git remote add origin https://gitlab.com/statnmap/repo.rtask.git
# Push the content
git push -u origin

Note that you would use similar commands if you want to fork a project, change its name and give it a new direction. As I did with {gitlabr}, but we’ll speak about that in the next blog post.
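Once both remotes are declared, the same local repository can be kept in sync with the two providers by pushing to each of them in turn. A minimal sketch, in which two bare repositories on disk stand in for Github and Gitlab (an assumption so that the example runs without any account); the remote names are those used above, and HEAD is pushed so that the branch name does not matter.

```shell
# Two bare repositories stand in for the Github and Gitlab servers
github_dir=$(mktemp -d)
gitlab_dir=$(mktemp -d)
git init --quiet --bare "$github_dir"
git init --quiet --bare "$gitlab_dir"

# Local repository with one commit
local_dir=$(mktemp -d)
cd "$local_dir"
git init --quiet
git config user.name "Jane"
git config user.email "jane@example.org"
echo "Some text" > example.txt
git add example.txt
git commit --quiet -m "Add example"

# Initial situation: one remote named origin (the Github stand-in)
git remote add origin "$github_dir"
git push --quiet -u origin HEAD

# Same steps as above: rename the first remote, add the new default one
git remote rename origin origin_github
git remote add origin "$gitlab_dir"
git push --quiet -u origin HEAD

# Later on, a new commit is sent to both remotes
echo "More text" >> example.txt
git commit --quiet -am "Extend example"
git push --quiet origin HEAD
git push --quiet origin_github HEAD

# Both remotes are listed with their addresses
git remote -v
```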

Let’s use git in all your projects!

Now that you have the basics of git, go transform all your directories into git projects. If you do not think it is important because you already manage all your script versions well, just give it a try for one project. Set up a reminder and, in 6 months, come back and tell me if you changed your mind.

If you went through the section of this blog post about the transfer from Github to Gitlab, you read that I had to manually recreate my open issues to get an exact copy. What if you want to download all your issues locally to keep a trace of discussions? You’ll see it in the next article: “Download Gitlab or Github issues and make a summary report of your commits”

If you want to install git, know more and go further than this article, I recommend “Happy Git and GitHub for the useR”.
