
Writing Functions in R

[This article was first published on R-post on Cosima Meyer, and kindly contributed to R-bloggers.]

The beauty of R is its versatility and, of course, its community 💜 You can use R for almost anything (I use blogdown to set up and maintain my website, xaringan to create slide decks, Shiny to build web applications, ...). All these great tools build upon one “little” (or not so little) thing: functions!

💡 What are functions?

A function is a self-contained block of code that performs a specific task, such as calculating a sum. And that’s exactly what we will build now 😊

👩🏼‍💻 How to write them?

In R, functions can be as simple as this:

name_of_the_function <- function(arguments) {
  function_content
}

You give your function a name (name_of_the_function), define some arguments (arguments), and put some content in the function. Here you define how the function should proceed with the input (function_content).

Let’s use a simple example – a function that calculates the sum:

make_sum <- function(a, b) {
  c <- a + b
  return(c)
}

You have the name of your function (make_sum), two arguments (a and b), and the operation inside the function (calculating the sum, storing it in c, and returning c). Strictly speaking, you don’t need the return statement here because an R function implicitly returns the value of the last expression it evaluates, but I prefer to be explicit and to have more control (and understanding) of what my function does 🤓
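To make the point about the return statement concrete, here is a small sketch. `make_sum_implicit` is a hypothetical variant added only for illustration; it relies on the implicit return instead of calling return().

```r
# Explicit return, as in the example above
make_sum <- function(a, b) {
  c <- a + b
  return(c)
}

# Hypothetical variant that relies on the implicit return:
# the value of the last evaluated expression is handed back
make_sum_implicit <- function(a, b) {
  a + b
}

make_sum(2, 3)           # 5
make_sum_implicit(2, 3)  # 5

# Both versions give the same result
identical(make_sum(2, 3), make_sum_implicit(2, 3))  # TRUE
```

Both styles work; which one you pick is mostly a matter of readability.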

When I write functions, I usually have more or less working code in my head or in a script, copy-paste it into the function body, and run it (which, of course, comes with a lot of debugging and problem-solving time).

Writing more complex functions

Writing functions is like a flower that blooms – you start simple and add more and more parts to it (like petals) 🌸 To explain what I mean, I will use the function overview_na from the {overviewR} package. The function allows you to plot the share of missing values in your data set.

When writing a function, I usually first set up a simple architecture of the function. The code snippet shows such an example: The function takes the data object, 1) uses an apply function to get the number of NAs by column, 2) converts the result to a data frame object and 3) plots it with {ggplot2}.

# How to plot NAs in your data 🕵
# Based on `overview_na` from {overviewR}:
# https://github.com/cosimameyer/overviewR/blob/master/R/overview_na.R
overview_na <-
  function(dat) {
    # Generate necessary variables ----------------------------------------
    # Calculate the number of NAs per column
    na_count <-
      sapply(dat, function(y)
        sum(length(which(is.na(
          y
        )))))
    # Convert it to a data.frame
    dat_frame <- data.frame(na_count)
    # Move the row names into a column called "variable"
    dat_frame <-
      tibble::rownames_to_column(dat_frame, var = "variable")
    # Plot your visualization ---------------------------------------------
    # Create a ggplot2 plot the way you normally would
    plot <- ggplot2::ggplot(data = dat_frame) +
      ggplot2::geom_col(ggplot2::aes(y = reorder(variable, -na_count),
                                     x = na_count))
    # Return the plot
    return(plot)
  }
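
The first two steps of the function (counting NAs and building the data frame) can also be tried on their own. The following is a minimal base-R sketch on a made-up data frame `dat`. It uses `sum(is.na(y))`, which gives the same count as the longer expression in the function, and base R instead of `tibble::rownames_to_column()`, so the column order differs slightly from what the function produces.

```r
# Made-up example data with a few missing values
dat <- data.frame(
  id    = c(1, 2, 3, 4),
  year  = c(1990, NA, 1992, NA),
  value = c(NA, NA, NA, 10)
)

# Step 1: count the NAs per column (returns a named vector)
na_count <- sapply(dat, function(y) sum(is.na(y)))

# Step 2: turn the named vector into a data frame. The column names
# end up as row names, which is why the function then moves them
# into a regular column called "variable"
dat_frame <- data.frame(na_count)
dat_frame$variable <- rownames(dat_frame)
rownames(dat_frame) <- NULL

dat_frame
```

Here `id` has no missing values, `year` has two, and `value` has three, and this is the data that the plotting step then visualizes.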

The function already works, but you can tweak it further (and that’s what I mean by the blooming and flower petal part 🌸 – it’s like adding another piece of beauty to it). You can now, for instance, allow the user to manually define the label of the x axis by adding an “xlabel” argument to your function (you are generally free to choose any argument name you want). The new parts are in between the sparkles ✨

# How to plot NAs in your data 🕵
# Based on `overview_na` from {overviewR}:
# https://github.com/cosimameyer/overviewR/blob/master/R/overview_na.R
overview_na <-
  function(dat,
           # ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨
           # Add a manual xlabel for your plot ✨
           # The default will be "Showing your NAs" but
           # you can change it and also add a different label
           xlabel = "Showing your NAs"
           # ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨
  ) {
    # Generate necessary variables ----------------------------------------
    # Calculate the number of NAs per column
    na_count <-
      sapply(dat, function(y)
        sum(length(which(is.na(
          y
        )))))
    # Convert it to a data.frame
    dat_frame <- data.frame(na_count)
    # Move the row names into a column called "variable"
    dat_frame <-
      tibble::rownames_to_column(dat_frame, var = "variable")
    # Plot your visualization ---------------------------------------------
    # Create a ggplot2 plot the way you normally would
    plot <- ggplot2::ggplot(data = dat_frame) +
      ggplot2::geom_col(ggplot2::aes(y = reorder(variable, -na_count),
                                     x = na_count)) +
      # ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨
      ggplot2::xlab(xlabel)
      # ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨
    # Return the plot
    return(plot)
  }
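
How a default argument behaves is easier to see in a function that does not plot anything. The sketch below uses a hypothetical helper, `make_label`, that is not part of {overviewR}; it only borrows the `xlabel` default from the function above.

```r
# Hypothetical helper, used only to illustrate default arguments
make_label <- function(n_missing, xlabel = "Showing your NAs") {
  paste0(xlabel, " (", n_missing, " missing)")
}

# Uses the default label
make_label(5)
# "Showing your NAs (5 missing)"

# Overrides the default by naming the argument
make_label(5, xlabel = "Number of NAs")
# "Number of NAs (5 missing)"
```

Users who are happy with the default never have to think about the argument, while everyone else can change it in the call.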

Or use a pre-defined theme 💅 You can add the theme directly to your function, but you can also put it in an extra function as I did. This makes debugging so much easier and your code cleaner 👍 (the theme that we use in {overviewR} is here).

# How to plot NAs in your data 🕵
# Based on `overview_na` from {overviewR}:
# https://github.com/cosimameyer/overviewR/blob/master/R/overview_na.R
overview_na <-
  function(dat,
           xlabel = "Showing your NAs") {
    # ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨
    # Set theme -----------------------------------------------------------
    # Create a theme for the plot
    # The theme is created here:
    # https://bit.ly/theme_na_plot
    # It is a basic ggplot2::theme
    # (theme_na_plot() has to be defined before you call overview_na())
    theme_plot <- theme_na_plot()
    # ✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨✨
    # Generate necessary variables ----------------------------------------
    # Calculate the number of NAs per column
    na_count <-
      sapply(dat, function(y)
        sum(length(which(is.na(
          y
        )))))
    # Convert it to a data.frame
    dat_frame <- data.frame(na_count)
    # Move the row names into a column called "variable"
    dat_frame <-
      tibble::rownames_to_column(dat_frame, var = "variable")
    # Plot your visualization ---------------------------------------------
    # Create a ggplot2 plot the way you normally would
    plot <- ggplot2::ggplot(data = dat_frame) +
      ggplot2::geom_col(ggplot2::aes(y = reorder(variable, -na_count),
                                     x = na_count)) +
      ggplot2::xlab(xlabel) +
      # Apply the theme created above
      theme_plot
    # Return the plot
    return(plot)
  }
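
If you want to try the idea of a separate theme function yourself, a sketch could look like the one below. The body of `theme_na_plot` here is a hypothetical stand-in built from standard {ggplot2} theme elements; the actual theme used in {overviewR} may look different.

```r
# Hypothetical stand-in for the theme helper; the real theme used in
# {overviewR} may use different settings
theme_na_plot <- function() {
  ggplot2::theme_minimal() +
    ggplot2::theme(
      axis.title = ggplot2::element_text(size = 12),
      panel.grid.minor = ggplot2::element_blank()
    )
}

# Because the theme lives in its own function, it can be reused
# in any plot, here with the built-in mtcars data
p <- ggplot2::ggplot(mtcars, ggplot2::aes(x = wt, y = mpg)) +
  ggplot2::geom_point() +
  theme_na_plot()
p
```

Keeping the theme separate means you only have to change it in one place if you want a different look later.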

✨ Best practices when writing functions

Let’s dig into best practices when it comes to writing functions. This list contains a loose collection of tips and tricks that are not ranked in any particular order:


Good practice vs. not-so-good practice

## Good practice
make_sum <- function(a, b) {
  c <- a + b
  return(c)
}
## Not so good practice
make_sum <- function(a, b) a + b

For more tips and tricks, also have a look at Hadley Wickham’s and Garrett Grolemund’s excellent book “R for Data Science”.

If you want to quickly look up what this blog post tells you about writing functions, here’s a summary (also as 📄PDF for you to download here):


Image showing what a general function in R looks like (a function has arguments, a function statement, and usually a return statement). Good practices when writing functions are:
