When I started with R, I soon discovered that, more often than not, a package name has a particular meaning. For example, the first package I ever installed was foreign. The name corresponds to its ability to read and write data from other foreign sources to R. While this and many other names are rather straightforward, others are much less intuitive. The name of a package often conveys a story, which is inspired by a general property of its functions. And sometimes I just don’t get the deeper meaning, because English is not my native language.
In this blog post, I will shed light on the wonderful world of package names. After this journey, you will not only admire the creativity of R package creators; you’ll also be king at your next class reunion! Or at least at the next R-Meetup.
Before we start, and I know that you are eager to continue, I have two remarks about this article. First: Sometimes, I refer to official explanations from the authors or other sources; other times, it’s just my personal explanation of why a package is called that way. So if you know better or otherwise, do not hesitate to contact me. Second: There are currently 15,341 packages on CRAN, and I am sure there are a lot more naming mysteries and ingenuities to discover than any curious blog reader would like to digest in one sitting. Therefore, I focussed on the most famous packages and added some of my other preferences.
But enough of the talking now, let’s start!
dplyr (diːˈplaɪə)
dbplyr, implyr, dtplyr, and so on: all of these names end in plyr. This homophone of pliers refers to the way these packages refine the base R apply-functions as part of the “split-apply-combine” strategy. Instead of doing all steps of data analysis and manipulation at once, you split the problem into manageable pieces, apply your function to each piece, and combine everything afterward. We see this approach in perfection when we use the pipe operator.
The first part of each package name just refers to the object it is applied to. So the d stands for data frames, db for databases, im for Apache Impala, dt for data tables, and so on…
Sources: Hadley Wickham
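The split-apply-combine idea described above can be sketched in a few lines of base R. This is only an illustration under my own assumptions: the data frame scores and its columns are made up, and it deliberately uses base functions rather than dplyr itself.

```r
# Made-up example data: two groups with a few values each
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Split: one vector of values per group
pieces <- split(scores$value, scores$group)

# Apply: compute the mean of each piece
means <- lapply(pieces, mean)

# Combine: collapse the list back into one named vector
result <- unlist(means)
print(result)
```

Each step stays small and readable, which is the part of the strategy that dplyr and its siblings make more convenient.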
lubridate (ˈluːbrɪdeɪt)
“It works as lubricated”
ggplot2 (ʤiːʤiːplɒt tuː)
data.table (ˈdeɪtə ˈteɪbl) – logo
Occasionally, I take the liberty of exploiting data.table functions to improve the performance of my code (hello fread() and rbindlist()).
Anyway, the name itself is pretty straightforward, but did you notice how cool the logo is?! Well, there is obviously the name “data.table” and the square brackets that are fundamental in data.table syntax. Likewise, there is the assignment by reference operator, a.k.a. the walrus operator. “Wait, stop,” your inner marine mammal researcher says, “isn’t this a sea lion on top there?!” Yes indeed! The sea lion is used to highlight that it is an R package since, of course, it shouts R! R!
Source: Rdatatable
tibble (tɪbl)
Sources: Wikipedia; All About Birds; Oddity Central
Indicator: Hadley Wickham’s birth country
purrr (pɜːɜː)
This take on the apply-functions has been one of my favorites lately. The concise usage of purrr enables powerful functional programming that, in turn, makes your code faster, more readable, and more stable. Or, as Mr. Wickham states, it makes “your pure R functions purr”. Also, note its parallelized sibling furrr.
Sources: Hadley Wickham
Amelia (əˈmiːlɪə)
Amelia is a package for imputing missing data. It turned out that it is named after an impressive woman: Amelia Earhart. Living in the early 20th century, she was an aviation pioneer and feminist. She was the first woman to fly solo across the Atlantic, a remarkable achievement and an inspiration for women to start a technical career. Unfortunately, she disappeared during a flight over the central Pacific at age 39 and is thus… missing. ba dum-tss
Source: Gary King – Co-Author
magrittr (maɡʁitə)
Using dplyr or its siblings is not imaginable without the pipe operator %>%. This allows you to write and read code from top to bottom and from left to right, just like regular text. Pipes are no special feature of R, yet I am sure René Magritte had nothing else in mind when he painted The Treachery of Images in 1929 with its slogan: “Ceci n’est pas une pipe”. The logo designers just made a slight adjustment to his painting. Or should I say: unearthed the meaning that has always been behind it?!
Sources: Vignette; [revolutionanalytics.com](https://blog.revolutionanalytics.com/2014/07/magrittr-simplifying-r-code-with-pipes.html)
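To see why piped code reads like regular text, here is a small sketch comparing a nested call with a piped one. To keep it self-contained, it uses base R’s native pipe |> (an assumption: this requires R 4.1 or newer) instead of magrittr’s %>%; for a simple chain like this one, both read the same way.

```r
# Nested call: has to be read from the inside out
nested <- round(mean(sqrt(c(1, 4, 9, 16))), 1)

# Piped call: reads from left to right, one step after the other
piped <- c(1, 4, 9, 16) |>
  sqrt() |>
  mean() |>
  round(1)

# Both spellings compute the same value
print(identical(nested, piped))
```

With magrittr loaded, replacing |> with %>% in the piped version should give the same result.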
batman (ˈbætmən)
R expects logical values as TRUE/FALSE, but who would write this in a questionnaire? In fact, when we try to convert such data to logical values by calling as.logical(), almost every string becomes NA. Lost and doomed? NO! Because who is better at spotting actual NAs than nananananana… batman!
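A quick sketch of the problem in base R: as.logical() only recognises a handful of spellings, so typical questionnaire answers end up as NA. The answers vector is made up, and to_logical_sketch() is a hypothetical helper written for this illustration. It is not the batman API, just a hand-rolled version of the idea.

```r
# Made-up questionnaire answers
answers <- c("TRUE", "true", "T", "yes", "no", "Y", "nope")

# Base R: anything beyond a few spellings of TRUE/FALSE becomes NA
converted <- as.logical(answers)
print(converted)

# Hypothetical helper (not the batman API): map a few extra spellings by hand
to_logical_sketch <- function(x) {
  x <- tolower(trimws(x))
  out <- rep(NA, length(x))
  out[x %in% c("true", "t", "yes", "y")] <- TRUE
  out[x %in% c("false", "f", "no", "n")] <- FALSE
  out
}

# Only answers that match neither list stay NA
recovered <- to_logical_sketch(answers)
print(recovered)
```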
Homeric (həʊˈmɛrɪk)
Homeric::PlotDoughnut(1, col = "magenta")
Source: Homeric Documentation
fcuk (fʌk)
Error in view(my_data): could not find function "view"
Are you sick and tired of this or similar error messages? Do you regularly employ your ample stock of swear words to describe the stupidity of inconsistent usage of camel or snake case function names across packages? Or do you just type faster than your shadow, causing minor typos in your otherwise excellent code? There is help! Just go and install the amazing fcuk package and useless error messages are a thing of the past.
hellno (hɛl nəʊ)
as.data.frame(): stringsAsFactors = TRUE. But I do not only change it to FALSE! Also, I create my own FALSE value and name it HELLNO.
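What the stringsAsFactors argument actually changes can be sketched in base R. The list raw is made up, the argument is set explicitly in both calls so the result does not depend on the default of your R version, and HELLNO is defined locally here as a stand-in for the constant the package provides.

```r
# Made-up input data
raw <- list(name = c("anna", "ben"), n = 1:2)

# With stringsAsFactors = TRUE, character columns are converted to factors
with_factors <- as.data.frame(raw, stringsAsFactors = TRUE)

# Local stand-in for the package's constant: just another name for FALSE
HELLNO <- FALSE
without_factors <- as.data.frame(raw, stringsAsFactors = HELLNO)

print(class(with_factors$name))
print(class(without_factors$name))
```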
Honorable mentions
gremlin: package for mixed-effects model REML incorporating Generalized Inverses.
harrietr: named after Charles Darwin’s pet giant tortoise. A package for phylogenetic and evolutionary biology data manipulations.
beginr: it helps where we’ve all been, searching for ages until setting pch = 16.
charlatan: worse than creating dubious medicine, this one makes fake data.
fauxpas: explains what specific HTTP errors mean.
fishualize: give your plots a fishy look.
greybox: why think only in black or white? This is a package for time series analysis.
vroom: it reads data into R so fast, you can almost hear it going vroom vroom.
helfRlein: some little helper functions, inspired by the German word Helferlein = little helper.
About the author
Matthias Nistler
The post Why Is It Called That Way?! – Origin and Meaning of R Package Names first appeared on STATWORX.