Rosetta language popularity

[This article was first published on factbased, and kindly contributed to R-bloggers.]

Rosetta Code is a community wiki that presents solutions to various programming tasks in different programming languages. Thus, it serves as a dictionary between programming languages, but also as a cookbook of programming recipes for a specific language.

One programming task that was unsolved for R until today was to rank languages by popularity. I worked on it using the RJSONIO package from Omegahat and the MediaWiki API. Here I explain the code step by step.

First, let us look up the languages which are defined at Rosetta Code. The wiki has a category for solutions by programming languages, which we will use.

> library(RJSONIO)
> langUrl <- "http://rosettacode.org/mw/api.php?action=query&format=json&cmtitle=Category:Solutions_by_Programming_Language&list=categorymembers&cmlimit=500"
> languages <- fromJSON(langUrl)$query$categorymembers
> languages <- sapply(languages, function(x) sub("Category:", "", x$title))
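
To see what the last line does without querying the wiki, the parsed response can be imitated with a hand-built list. This is only a sketch: the three entries are sample values, and it assumes that each category member is parsed as a list with a `title` element, as the `x$title` access above implies.

```r
# Hand-built stand-in for fromJSON(langUrl)$query$categorymembers:
# one list per category member, each carrying a "title" element.
# (Sample entries only; the real response has more members and fields.)
sampleMembers <- list(
  list(ns = 14, title = "Category:C"),
  list(ns = 14, title = "Category:R"),
  list(ns = 14, title = "Category:UNIX Shell")
)

# Same transformation as in the post: drop the "Category:" prefix.
sampleLanguages <- sapply(sampleMembers, function(x) sub("Category:", "", x$title))
print(sampleLanguages)
```

The result is a plain character vector of language names, which is what the next step iterates over.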

For each programming language, there is a category listing the users of that language, named after the language followed by " User". We iterate over all languages and count the members of each category.

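> user <- function (lang) {
+   userBaseUrl <- "http://rosettacode.org/mw/api.php?action=query&format=json&list=categorymembers&cmlimit=500&cmtitle=Category:"
+   # reserved=TRUE also percent-encodes "+", which a server may otherwise read as a space (e.g. in "C++ User")
+   userUrl <- paste(userBaseUrl, URLencode(paste(lang, " User", sep=""), reserved=TRUE), sep="")
+   # number of members in the language's user category
+   length(fromJSON(userUrl)$query$categorymembers)
+ }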
> users <- sapply(languages, user)
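
The effect of the `reserved` argument of `URLencode` on a category title can be checked offline. The following sketch only compares the two encodings of one title. How a given server treats a literal "+" in a query string (often as a space) is an assumption about the server side and is not tested here.

```r
# "C++ User" is the category title looked up for the language "C++".
title <- "C++ User"

# Default (reserved = FALSE): the space is encoded, the "+" signs are kept.
encodedDefault <- URLencode(title)

# reserved = TRUE: the "+" signs are percent-encoded as well.
encodedReserved <- URLencode(title, reserved = TRUE)

print(encodedDefault)
print(encodedReserved)

# Decoding the fully encoded form gives back the original title.
print(identical(URLdecode(encodedReserved), title))
```

If a language name contains characters such as "+" or "&", the fully encoded form is the safer choice for a query parameter value.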

Now we can print out the top 15 languages:

> head(sort(users, decreasing=TRUE),15)
         C        C++       Java     Python JavaScript       Perl UNIX Shell 
        55         55         37         32         27         27         22 
    Pascal      BASIC        PHP        SQL    Haskell        AWK    C sharp 
        20         19         19         18         17         16         16 
      Ruby 
        14  

It is very straightforward to work with the MediaWiki API, and it offers many other features. It would be nice to have an S3 class that does all the URL encoding. There is already a project, wikirobot, on R-forge, but I have not looked into it yet.
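
As a rough idea of what such an S3 class could look like, here is a minimal sketch. The class name `mwQuery` and its constructor are hypothetical, invented for this illustration; they are not part of RJSONIO, wikirobot, or the MediaWiki API. The object stores the endpoint and the query parameters, and the URL is assembled and encoded only when it is converted to a character string.

```r
# Hypothetical S3 class "mwQuery": an API endpoint plus a named list of
# query parameters. Nothing is encoded until the URL is requested.
mwQuery <- function(endpoint, ...) {
  structure(list(endpoint = endpoint, params = list(...)), class = "mwQuery")
}

# as.character method: percent-encode every parameter value, then join
# the parameters as key=value pairs separated by "&".
as.character.mwQuery <- function(x, ...) {
  values <- vapply(x$params,
                   function(v) URLencode(as.character(v), reserved = TRUE),
                   character(1))
  query <- paste(names(x$params), values, sep = "=", collapse = "&")
  paste(x$endpoint, query, sep = "?")
}

# print method: show the assembled URL.
print.mwQuery <- function(x, ...) {
  cat("<mwQuery>", as.character(x), "\n")
  invisible(x)
}

q <- mwQuery("http://rosettacode.org/mw/api.php",
             action = "query", format = "json", list = "categorymembers",
             cmlimit = 500, cmtitle = "Category:C++ User")
print(q)
```

The resulting string could then be passed to `fromJSON` in the same way as the hand-built URLs above, with the encoding handled in one place.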
