When I create a new Git repository, my first commit message tends to be “1st commit”. I’ve been wondering what other people use as their initial commit message. Today I used the gh package to get the first commits of all repositories of the ropensci and ropenscilabs organizations.
The sample might seem a bit small, but I just wanted to start exploring my question. I agree that it means my answer won’t be very conclusive.
Getting all repos for an organization
I’ve come up with a quite inelegant solution to paging: I just keep querying the API until it returns nothing.

```r
library("gh")
library("dplyr")
library("purrr")

get_repos <- function(org){
  ropensci_repos_names <- NULL
  page <- 1
  geht <- TRUE
  while(geht){
    ropensci_repos <- try(gh("/orgs/:org/repos", org = org, page = page))
    # keep going only if the request succeeded and the page is not empty
    # (an empty page may come back as an empty list or as "")
    geht <- !inherits(ropensci_repos, "try-error") &&
      length(ropensci_repos) > 0 &&
      !identical(ropensci_repos[[1]], "")
    if(geht){
      ropensci_repos_names <- c(ropensci_repos_names,
                                vapply(ropensci_repos, "[[", "", "name"))
      page <- page + 1
    }
  }
  return(ropensci_repos_names)
}

head(get_repos(org = "ropenscilabs"))
```

```
## [1] "webmockr"      "vcr"           "seasl"         "plater"
## [5] "rnaturalearth" "convertr"
```
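The "query until nothing comes back" idea can be tried without touching the GitHub API. The sketch below is only an illustration: `fake_api()` and `fetch_all()` are hypothetical helpers that are not part of gh, and the fake API simply serves seven letters in pages of three.

```r
# Hypothetical stand-in for a paginated API: returns page `page` of `items`,
# or an empty list once the pages run out.
fake_api <- function(page, items = as.list(letters[1:7]), per_page = 3){
  start <- (page - 1) * per_page + 1
  if(start > length(items)) return(list())
  items[start:min(start + per_page - 1, length(items))]
}

# Keep requesting pages until one comes back empty.
fetch_all <- function(fetch_page){
  results <- list()
  page <- 1
  repeat{
    current <- fetch_page(page)
    if(length(current) == 0) break
    results <- c(results, current)
    page <- page + 1
  }
  results
}

all_items <- fetch_all(fake_api)
length(all_items)
```

Recent versions of gh also accept a `.limit` argument that asks the package to follow pagination itself, which may make a hand-written loop like this unnecessary.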
Get first commit for a repository
Here I’m doing something quite inefficient. Since the API returns the most recent commits first, I page through all commits and keep the last one. I could instead have used the creation date of the repository to query only the commits made shortly after it.

```r
first_commit <- function(repo, org){
  messages <- NULL
  page <- 1
  geht <- TRUE
  while(geht){
    commits <- try(gh("/repos/:owner/:repo/commits",
                      owner = org, repo = repo, page = page))
    # keep going only if the request succeeded and the page is not empty
    geht <- !inherits(commits, "try-error") &&
      length(commits) > 0 &&
      !identical(commits[[1]], "")
    if(geht){
      now <- lapply(commits, "[[", "commit")
      now <- lapply(now, "[[", "message")
      messages <- c(messages, unlist(now))
      page <- page + 1
    }
  }
  # commits arrive newest first, so the last message belongs to the first commit
  messages[length(messages)]
}

first_commit("ropenaq", "ropensci")
```

```
## [1] "Everything"
```
I’m a bit surprised I chose “Everything” as the first commit message for my ropenaq package, actually. Not because I expect my commit history to be particularly smart, just because it’s not “1st commit”.
Get all the first commits
```r
first_commits <- get_repos("ropenscilabs") %>%
  map(first_commit, org = "ropenscilabs")
save(first_commits, file = "data/2017-02-21_ropenscilabs_first_commits.RData")

first_commits <- get_repos("ropensci") %>%
  map(first_commit, org = "ropensci")
save(first_commits, file = "data/2017-02-21_ropensci_first_commits.RData")
```
What are the most frequent first commits?
```r
load("data/2017-02-21_ropenscilabs_first_commits.RData")
ropenscilabs <- first_commits
load("data/2017-02-21_ropensci_first_commits.RData")
ropensci <- first_commits

all <- c(unlist(ropenscilabs), unlist(ropensci))
firstc <- tibble::tibble(commit = all)
firstc <- mutate(firstc, commit = tolower(commit))

firstc %>%
  group_by(commit) %>%
  summarize(n = n()) %>%
  arrange(desc(n)) %>%
  head(n = 15) %>%
  knitr::kable()
```
commit | n |
---|---|
first commit | 117 |
initial commit | 76 |
added readme | 19 |
added files | 9 |
1st commit | 3 |
create readme.md | 3 |
init | 3 |
added readme file | 2 |
code extracted from mikabr/devtools | 2 |
first comit | 2 |
first commit, added files | 2 |
initial | 2 |
initial import | 2 |
package infrastructure | 2 |
rstudio new package project | 2 |
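As a quick check, the share taken by the two most common messages can be recomputed from the counts in the table above and the total of 362 repositories. The variable names here are only illustrative.

```r
# Counts taken from the table above; 362 is the total number of repositories
n_repos <- 362
n_first <- 117    # "first commit"
n_initial <- 76   # "initial commit"

share <- (n_first + n_initial) / n_repos
round(share, 2)
```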
Out of the 362 repositories, 117 used “first commit” as their first commit message and 76 used “initial commit”. Together these two messages account for a proportion of 0.53 of all repos, which isn’t as much as I expected. But maybe rOpenSci repositories are unusually original in their first commits? And you, what is your favourite initial commit message, if you have one?