Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
You can read the original post in its original format on Rtask website by ThinkR here: Transform a folder as git project synchronized on Github or Gitlab
You have been working for years on your R scripts, and saved all versions as “script_v1.R”, “script_v2.R”, “script_v2_best-of-the-world.R”, … One day, you heard about git, a versioning system that allows you to make your files travel through time. But, how to transform a directory of falsely versioned files into a git repository and synchronize it online?
Originally, I wanted to write a blog post about our package {gitdown}, but I realized I needed a reproducible example of a git repository on Github and Gitlab and I started to document the steps. But then, this was too long to stay in the same article. So here it is: the preamble of my article “Download Gitlab or Github issues and make a summary report of your commits”. These are the very first steps in transforming a local folder into a git repository, synced remotely with a git provider. Also, we’ll see how to synchronize the same project with two different git providers, Github and Gitlab, at the same time.
If you want to do a quick read, you can read titles and text in bold to get the main points of the sections.
git, for personal projects
When speaking about git, you may have heard this is a great tool for collaborative work in software development. It helps manage multiple developers working in parallel on the same project while reducing the fear of losing parts of people’s work. But you do not need to work in a group to use git with your project. As soon as you have some scripts, you may want to track the changes you make over time. git is here to help.
To install git, you can follow this guide in “Happy Git and GitHub for the useR”.
Transform my directory into git project
You should probably work in an RStudio project: File > New project > Existing directory.
If you do not want Rstudio, you will need to set your folder as your current working directory.
Then run usethis::use_git() and answer the questions (I would say “Yes” twice).
git requires that you say who you are, so that each modification of your code is assigned to you. With this line of code, you can define your configuration globally. Do it once on this computer, and you won’t have to do it again.
    usethis::use_git_config(user.name = "Jane", user.email = "jane@example.org")
That’s it. You’re now working in a git repository.
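If you prefer the terminal, the same setup can be sketched with plain git commands. This is only a rough equivalent, not what {usethis} runs internally. The temporary folder, the name and the email are placeholders, and the identity is stored for this repository only so that the example does not touch your global configuration. Add `--global` to the `git config` lines to get the "once per computer" behaviour described above.

```shell
# Work in a throwaway folder so the example is safe to run (placeholder path)
project_dir="$(mktemp -d)"
cd "$project_dir"

# Turn the folder into a git repository (creates the hidden .git/ directory)
git init -q

# Tell git who you are; add --global to set it once for the whole computer
git config user.name "Jane"
git config user.email "jane@example.org"

# Check what git recorded
git config user.name
git config user.email
```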
git beginners command lines
Your project is now local. If you are a beginner and do not want to bother now with all git wonderful possibilities, that’s OK. There are two actions to perform daily: add and commit.
- add: this adds a file to the list of files that will be frozen after commit
- commit: this freezes the state of all files added into the git repository. This goes with a message where you explain why you froze those files at this state.
Any reason to commit is good. To commit does not mean you can’t continue to work on this file afterward. Indeed, if you make new modifications, you can add them for the next commit. Commit messages will be the equivalent of your v1, v2, final_version, best_final_version, …
Use add and commit with RStudio graphical interface
In Rstudio, as soon as you restart a project with a git repository inside, you will have access to the git pane.
- To add a file for the next commit, check the box in column “Staged”.
- To commit all checked files, click “Commit”. Add a commit message, and commit!
You’ll be able to see the history of your commits by clicking on the clock icon.
Use add and commit in command line
If you are still in Rstudio, you can use the terminal to commit. You can also directly open a Terminal and make sure you are in the correct folder.
In the Terminal
    # move to correct folder (named repo.rtask here)
    cd repo.rtask
    # add some files
    git add example.txt
    git add NEWS.md
    # commit with message
    git commit -m "My first command line commit"
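To see the add and commit cycle replace the “v1”, “v2” file names, here is a small sketch you can run in any terminal. It builds its own throwaway repository, so the folder, the file content and the commit messages are placeholders rather than part of the repo.rtask example.

```shell
# Throwaway repository so the example can be run anywhere (placeholder path)
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
git config user.name "Jane"
git config user.email "jane@example.org"

# First version of the file: add, then commit
echo "first draft" > example.txt
git add example.txt
git commit -q -m "Add example file"

# Keep working on the same file, then freeze the new state
echo "second draft" >> example.txt
git add example.txt
git commit -q -m "Extend example file"

# One line per commit, most recent first
git log --oneline
```

The file keeps a single name, and the two commit messages play the role of the version suffixes.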
Create an example git project
For tests and examples in the {gitdown} package, function fake_repo() builds a fake project with some files and commits. This is already a folder with a git repository. I will use this repository as the reproducible example of this blog post.
    # remotes::install_github("ThinkR-open/gitdown")
    library(gitdown)
    # Create the repo here with `R/` and `vignettes/` directories
    repo <- fake_repo(path = "repo.rtask", as.package = TRUE)
    # Content of the repo
    fs::dir_tree(repo)
    ## repo.rtask
    ## ├── NEWS.md
    ## ├── R
    ## │   └── my_mean.R
    ## ├── example.txt
    ## └── vignettes
I can also see that we are indeed in a git repository using {git2r}, with some files tracked and others not.
    library(git2r)
    ls_tree(repo = repo)
    ##     mode type                                      sha path        name len
    ## 1 100644 blob 9c66eff9a1f6f34b6d9108ef07d76f8ce4c4e47f          NEWS.md  98
    ## 2 100644 blob c36b681bb31b80cbd090f07c95f09788c88629a6      example.txt 200
    status(repo = repo)
    ## Untracked files:
    ##  Untracked: .gitignore
    ##  Untracked: R/
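The same check can be done without {git2r}. The sketch below is a command-line approximation on a throwaway repository, not the output of fake_repo(): `git ls-tree` lists the files stored in the last commit and `git status --short` marks untracked files with `??`. File names and content are placeholders.

```shell
# Throwaway repository standing in for the fake repo (placeholder content)
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
git config user.name "Jane"
git config user.email "jane@example.org"

# Two committed files
echo "news" > NEWS.md
echo "example" > example.txt
git add NEWS.md example.txt
git commit -q -m "Add NEWS and example"

# One folder git has never been told about
mkdir R
echo "my_mean <- function(x) mean(x)" > R/my_mean.R

# Files stored in the last commit (similar to ls_tree())
git ls-tree --name-only HEAD

# Short status: lines starting with ?? are untracked (similar to status())
git status --short
```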
Synchronize a local git repository with Github
Now that I have a local git repository, I will synchronize it on Github, then on Gitlab.
Creation of a Github repository
- Open your github account on https://github.com
- Go to “Repositories”
- Click on button “New”
- Give it the name of your local repository (repo.rtask here)
- Do not check “Initialize this repository with a README”
- Click “Create repository”
Send the content of your local repository to Github
Information on synchronization with your local repository is presented by Github in your project home page.
Choose the part “…or push an existing repository from the command line” and execute these lines in a Terminal.
In the Terminal
    # Change directory if needed
    cd repo.rtask
    # synchronize with github
    git remote add origin git@github.com:statnmap/repo.rtask.git
    git push -u origin master
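If you want to rehearse these two commands before touching a real account, a local “bare” repository can play the role of the empty Github project. This is only a sketch under that assumption: the paths are placeholders, and the branch name is read from git instead of being hard-coded, because it may be master or main depending on your git setup.

```shell
work="$(mktemp -d)"

# A bare repository plays the role of the empty remote project
git init -q --bare "$work/server.git"

# The local project, with one commit
git init -q "$work/repo.rtask"
cd "$work/repo.rtask"
git config user.name "Jane"
git config user.email "jane@example.org"
echo "example" > example.txt
git add example.txt
git commit -q -m "First commit"

# Name of the current branch (master or main, depending on your git setup)
branch="$(git rev-parse --abbrev-ref HEAD)"

# Link the local repository to the remote, then send the commits
git remote add origin "$work/server.git"
git push -q -u origin "$branch"

# Show where "origin" points
git remote -v
```

The `-u` option records origin as the default destination of this branch, so that later a plain `git push` is enough.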
Go back to your Github project page to see the two files uploaded: https://github.com/statnmap/repo.rtask
Note that the same kind of operations would be used for a synchronization with a Gitlab server.
Add some issues on Github
I open two issues on my Github. Indeed, in the commits made by fake_repo(), messages mention different issues from #1 to #145. For instance, you can look at this commit message (here on Github).
    summary(last_commit(repo))
    ## Commit: 1ef6c084213854942de71e0c00921ca9f08c6b9d
    ## Author: Alice <alice@example.org>
    ## When: 2020-08-10 14:49:27 GMT
    ##
    ## Add NEWS
    ##
    ## issue #32.
    ## issue #1.
    ## issue#12
    ## ticket6789.
    ## ticket1234
    ## Creation of the NEWS file for version 0.1.
    ## 1 file changed, 3 insertions, 0 deletions
    ## NEWS.md | -0 +3 in 1 hunk
I will not create 145 issues! Two are enough for the purpose of the following blog post. So let me create two issues with a title and a small description.
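To get a quick idea of which issue numbers the commit messages mention, one rough approach is to filter the log with grep. The sketch below works on a throwaway repository whose single commit copies the message shown above. The pattern `#[0-9]+` is an assumption about how issues are written, and it deliberately ignores the “ticket1234” style.

```shell
# Throwaway repository with one commit that mimics the message above
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
git config user.name "Alice"
git config user.email "alice@example.org"
echo "news" > NEWS.md
git add NEWS.md
git commit -q -m "Add NEWS" -m "issue #32.
issue #1.
issue#12
ticket6789.
ticket1234
Creation of the NEWS file for version 0.1."

# Keep every "#" followed by digits found in the commit messages, without duplicates
issues="$(git log --format=%B | grep -oE '#[0-9]+' | sort -u)"
echo "$issues"
```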
Communicate with a distant git server
When you synchronize your local git repository with a distant server, you need to learn two more actions: pull and push.
- pull is to download the content of the server to your local repository.
- push is to upload your local commits to the server.
If you work alone, chances are you do not need to pull because modifications were first created on your computer, and there is no reason for the server to have more recent modifications than your computer… Instead, you want to push your commits to the server regularly. You can push multiple commits at the same time. We usually say “Commit early and often, Push from time to time”. Later you’ll learn that you can modify the content of your commits as long as they have not been pushed yet.
In the RStudio git pane, use the “↓” (blue down arrow) to pull and the “↑” (green up arrow) to push to the server. In the command line, use git pull and git push.
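The “commit often, push from time to time” rhythm can be tried out safely with a local bare repository standing in for the server. This is a sketch under that assumption, with placeholder paths and file names. It counts the commits that exist locally but not on the server before and after a push.

```shell
work="$(mktemp -d)"

# Stand-in for the distant server, plus a local project already linked to it
git init -q --bare "$work/server.git"
git init -q "$work/project"
cd "$work/project"
git config user.name "Jane"
git config user.email "jane@example.org"
echo "v1" > script.R
git add script.R
git commit -q -m "First version"
branch="$(git rev-parse --abbrev-ref HEAD)"
git remote add origin "$work/server.git"
git push -q -u origin "$branch"

# Commit early and often: two more local commits
echo "v2" >> script.R
git commit -q -a -m "Second version"
echo "v3" >> script.R
git commit -q -a -m "Third version"

# Commits that exist locally but not yet on the server
git log --oneline "origin/$branch..HEAD"
ahead_before="$(git rev-list --count "origin/$branch..HEAD")"

# Push from time to time: both commits travel together
git push -q
ahead_after="$(git rev-list --count "origin/$branch..HEAD")"
echo "ahead before push: $ahead_before, after push: $ahead_after"

# Nothing new on the server, so pull has nothing to download
git pull -q
```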
Synchronize a local git repository on Gitlab
We could do the exact same steps on Gitlab as we did on Github. If you plan to have a single remote repository, which is what almost all developers do, follow the Github guide above on your Gitlab server (“Repository” is named “Project”). You can go directly to the last section of this blog post.
What are you going to do after that? Why not read this to transform your project as a package, with the appropriate documentation that will allow you to maintain it without headaches?
Here, this is a good opportunity to show how to migrate from Github to Gitlab or to synchronize one local repository to two different remote providers.
Move a Github repository to Gitlab
I first tried to use the git transfer protocol proposed by Gitlab to retrieve my repository and my issues on my Gitlab.com account.
- Open your Gitlab account
- On gitlab.com or on your own Gitlab server
- Click on “New project”
- Choose the tab named “Import project”
- Click on the appropriate icon. Github in this case.
- You may need to connect to your Github account
- Choose the repository you would like to import
- Import
…or create a copy of your Github on Gitlab
The Gitlab importer works in theory. In my case, not all my Github repositories were listed in the Gitlab importer. And, of course, the repository of interest was not there. This paragraph shows what I did to copy the git content to another remote provider.
- Create a new “Blank” project on Gitlab without “Initialization”: https://gitlab.com/statnmap/repo.rtask
- Send the content of your local repository to Gitlab. But here, as we already sent it to Github, we need some modifications of the command lines.
In the Terminal
    # Get all last modifications
    git pull
    # rename the local link dedicated to github
    git remote rename origin origin_github
    # Create the local link to Gitlab by default
    git remote add origin https://gitlab.com/statnmap/repo.rtask.git
    # Push the content
    git push -u origin
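Once both links exist, keeping the two providers in sync means pushing to each remote by name. The sketch below rehearses the whole sequence with two local bare repositories standing in for Github and Gitlab. The paths are placeholders, and the branch name is read from git because it may be master or main on your machine.

```shell
work="$(mktemp -d)"

# Two bare repositories play the roles of Github and Gitlab
git init -q --bare "$work/github.git"
git init -q --bare "$work/gitlab.git"

# Local project, first synchronized with the "Github" stand-in
git init -q "$work/repo.rtask"
cd "$work/repo.rtask"
git config user.name "Jane"
git config user.email "jane@example.org"
echo "example" > example.txt
git add example.txt
git commit -q -m "First commit"
branch="$(git rev-parse --abbrev-ref HEAD)"
git remote add origin "$work/github.git"
git push -q -u origin "$branch"

# Rename the first link, then make the "Gitlab" stand-in the default one
git remote rename origin origin_github
git remote add origin "$work/gitlab.git"
git push -q -u origin "$branch"

# Later on, send each new commit to both providers
echo "more" >> example.txt
git commit -q -a -m "Second commit"
git push -q origin "$branch"
git push -q origin_github "$branch"

# Both links are listed with their names
git remote -v
```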
- Create the two issues manually as we did on Github.
Note that you would use similar commands if you want to fork a project, change its name and give it a new direction. As I did with {gitlabr}, but we’ll speak about that in the next blog post.
Let’s use git in all your projects!
Now that you have the basics of git, go transform all your directories into git projects. If you do not think it is important because you have a good management of all your script versions, just give it a try for one project. Set up a reminder, in 6 months, come back and tell me if you changed your mind.
If you went through the section on the transfer from Github to Gitlab, you read that I had to manually recreate my open issues to get an exact copy. What if you want to download all your issues locally to keep a trace of discussions? You’ll see it in the next article: “Download Gitlab or Github issues and make a summary report of your commits”
If you want to install git, learn more and go further than this article, I recommend “Happy Git and GitHub for the useR”.