Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I usually write around here in french and mainly report on French Hospitals data managment and the statistical tasks they imply. As today’s post is about a new package I have created, I’ll be writing in english. The package is called stringfix
because it uses infix operators to manipulate character strings.
Introduction
In Python, the operator +
is used to paste two character strings together. For example: 'Hello ' + 'world'
gives 'Hello World'
. For that matter, building sentences with words and arithmetic symbols seems a very nice way to write. In R, the paste function requires parenthesis in order to be computed. Therefore the use of consecutive functions can make it hard to understand.
+
is a nice operator, and we can use it in R almost as it is used in Python by creating an infix operator.
`%+%` <- function(x,y){ paste0(x, y) }
While a ggplot function has already the same name, it is used to override data in a ggplot
call and not for pasting character strings, see here. When loading tidyverse, the same ggplot function is called, preventing us from using paste0’
s %+%
. Otherwise, you can find a hint of character string pasting in the Advanced R book.
In order to create a toolbox around paste0
’s %+%
, I started collecting some other infix functions for character strings manipulation. The main question was: which functions with a right to left call that I use really often could be reordered in a %>%
code. Here is the little family I have since build on : paste, grepl, substring, count, padding. The goal of this package is to use stringr or base functions in backend as a start for an alternative character string manipulation in R.
This package is still at its early begining (kind of a draft for me!) but I thought some other people would enjoy it and may even wish to contribute.
Presentation
"In a manner of coding, I just want to say..." % % "Nothing." #>[1] "In a manner of coding, I just want to say... Nothing."
Examples
paste
'Hello ' %+% 'world' #> [1] "Hello world" 'Your pastas taste like ' %+% '%>%' #> [1] "Your pastas taste like %>%" 'coco' %+% 'bolo' #> [1] "cocobolo" 'Hello' % % 'world' #> [1] "Hello world" 'Your pastas taste like' % % '%>%' #> [1] "Your pastas taste like %>%" 'Hello' %,% 'world...' #> [1] "Hello, world..." 'Your pastas taste like ' %+% '%>%...' %,% 'or %>>%...' #> [1] "Your pastas taste like %>%..., or %>>%..."
grepl
Case sensitive
'pig' %g% 'The pig is in the cornfield' #> [1] TRUE 'Pig' %g% 'The pig is in the cornfield' #> [1] FALSE
Case insensitive (ignore.case)
'pig' %gic% 'The pig is in the cornfield' #> [1] TRUE 'PIG' %gic% 'The PiG is in the cornfield' #> [1] TRUE
substring
'NFKA008' %s% '1.4' #> [1] "NFKA" 'NFKA008' %s% .4 #> [1] "NFKA" 'where is' % % ('the pig is in the cornfield' %s% '1.7') %+% '?' #> [1] "where is the pig?"
count
fruit <- c("apple", "banana", "pear", "pineapple") "a" %count% fruit #> [1] 1 3 1 1 c("a", "b", "p", "p") %count% fruit #> [1] 1 1 1 3
pad, lpad and rpad
5 %lpad% '0.5' #> [1] "00005" 5 %lpad% .5 #> [1] "00005" 5 %lpad% '.5' #> [1] " 5" 5 %lpad% '2.5' #> [1] "22225" 'é' %lpad% 'é.5' #> [1] "ééééé"
names of tibbles : tolower and toupper
I have added two functions that I use really often.
library(magrittr) iris %>% toupper_names %>% head #> SEPAL.LENGTH SEPAL.WIDTH PETAL.LENGTH PETAL.WIDTH SPECIES #> 1 5.1 3.5 1.4 0.2 setosa #> 2 4.9 3.0 1.4 0.2 setosa #> 3 4.7 3.2 1.3 0.2 setosa #> 4 4.6 3.1 1.5 0.2 setosa #> 5 5.0 3.6 1.4 0.2 setosa #> 6 5.4 3.9 1.7 0.4 setosa
Finally, I also wanted to outline that the function from the rmngb package : %out%
: negation of %in%
can be very useful to avoid typing ! x %in% y
(you can just type x %out% y
instead). This is why I have included it in this package !
More information here : https://github.com/GuillaumePressiat/stringfix
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.