Useful functions in R!

[This article was first published on R Language – the data science blog, and kindly contributed to R-bloggers.]

I have listed some useful functions below:

with()

The with() function evaluates an expression inside a dataset, so its columns can be referred to directly by name. It is similar to DATA= in SAS.

# with(data, expression)
# example applying a t-test to a data frame mydata 
with(mydata, t.test(y ~ group))
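
For a version that runs without supplying your own data frame, here is a minimal sketch using the built-in mtcars dataset. Choosing mtcars, with mpg in the role of y and am in the role of group, is an assumption made for illustration and not part of the original example. It also checks that with() gives the same result as spelling out the columns with $.

```r
# Assumption: mtcars (shipped with R) stands in for mydata
res_with <- with(mtcars, t.test(mpg ~ am))

# Equivalent call without with(): columns are referenced explicitly
res_plain <- t.test(mtcars$mpg ~ mtcars$am)

# Both calls run the same two-sample t-test
all.equal(unname(res_with$statistic), unname(res_plain$statistic))

# with() works for any expression, not only model calls
mean_mpg_4cyl <- with(mtcars, mean(mpg[cyl == 4]))
mean_mpg_4cyl
```

The main benefit is readability: inside with(), mpg and cyl are looked up in the data frame first, so the mtcars$ prefix does not have to be repeated.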

Please look at other examples here and here.

by()

The by() function applies a function to the subset of a data frame defined by each level of a factor (or combination of factors). It is similar to BY processing in SAS.

# by(data, factorlist, function)
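# example: obtain the means of the numeric variables separately for
# each level of byvar in data frame mydata
# (mean() does not work on a whole data frame, so use colMeans())
by(mydata, mydata$byvar, function(x) colMeans(x[sapply(x, is.numeric)]))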
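
As a runnable illustration of the same pattern, the sketch below uses the built-in iris dataset. Treating iris as mydata and Species as byvar is an assumption made for the example, not something from the original post.

```r
# Assumption: iris (shipped with R) stands in for mydata,
# and Species plays the role of byvar
means_by_species <- by(iris[, 1:4], iris$Species, colMeans)

# Print the means for every level of Species
means_by_species

# Pull out the result for a single level by name
setosa_means <- means_by_species[["setosa"]]
setosa_means
```

Each group's result is the named vector returned by colMeans(), so individual values can be extracted with the usual indexing, for example setosa_means["Sepal.Length"].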

Please look here for more details.

do.call()

do.call() calls a function once, passing it a list of arguments. lapply(), by contrast, applies a function to each element of a list or vector and returns a list of the results.

do.call(sum, list(c(1,2,4,1,2), na.rm = TRUE))
# [1] 10
lapply(c(1,2,4,1,2), function(x) x + 1)
# returns a list with elements 2, 3, 5, 2, 3
do.call("+", list(4,5))
# [1] 9
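
A common practical use of do.call() is binding together a list of data frames whose length is not known in advance. The sketch below uses small made-up data frames (the names pieces, id, and value are hypothetical) and also shows how lapply() and do.call() combine.

```r
# Hypothetical example data: three small data frames held in a list
pieces <- list(
  data.frame(id = 1:2, value = c(10, 20)),
  data.frame(id = 3:4, value = c(30, 40)),
  data.frame(id = 5L,  value = 50)
)

# Equivalent to rbind(pieces[[1]], pieces[[2]], pieces[[3]]),
# but works for a list of any length
combined <- do.call(rbind, pieces)
nrow(combined)       # 5

# lapply() transforms each piece; do.call() then binds the results
doubled <- do.call(rbind, lapply(pieces, function(d) transform(d, value = value * 2)))
sum(doubled$value)   # 300
```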

More examples here.

more()

more() is a user-defined function that pages through the printed output of a large object a fixed number of lines at a time. Taken from here.

#to print out an object such as data.frame mydf 20 lines at a time, use:
more(mydf)

#where more() is defined as

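more <- function(expr, lines = 20) {
  out <- capture.output(expr)   # printed output, one element per line
  n <- length(out)
  i <- 1
  while (i <= n) {              # <= so that a final single line is not dropped
    j <- 0
    # print the next page of up to `lines` lines
    while (j < lines && i <= n) {
      cat(out[i], "\n", sep = "")
      j <- j + 1
      i <- i + 1
    }
    if (i <= n) {
      # Enter = next page, q = quit, t = jump to the last page,
      # a number 0-9 = jump to that tenth of the output
      rl <- readline()
      if (grepl('^ *q', rl, ignore.case = TRUE)) i <- n + 1
      if (grepl('^ *t', rl, ignore.case = TRUE)) i <- n - lines + 1
      if (grepl('^ *[0-9]+ *$', rl)) i <- floor(as.numeric(rl) / 10 * n) + 1
    }
  }
  invisible(out)
}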
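
The key building block in more() is capture.output(), which turns printed output into a character vector with one element per console line. The short sketch below, using the built-in mtcars dataset as an assumed stand-in for mydf, shows that step in isolation and how a single page is just a slice of that vector.

```r
# capture.output() returns the printed representation as a character vector,
# one element per console line
out <- capture.output(print(mtcars))
length(out)      # number of printed lines (depends on the console width)
head(out, 3)     # the first three lines, as strings

# A page is then a consecutive slice of that vector
lines <- 20
first_page <- out[seq_len(min(lines, length(out)))]
cat(first_page, sep = "\n")
```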

options()

options() can be used to raise max.print, the limit on how many entries R prints to the console before truncating the output. More info here.

options(max.print=1000000)
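
Because options() changes a global setting for the rest of the session, it can be useful to restore the previous value afterwards. A minimal sketch, where the variable name old is just an illustrative choice:

```r
# options() invisibly returns the previous values of the options it changes
old <- options(max.print = 1000000)
getOption("max.print")   # 1e+06

# ... print the large object here ...

# Restore the earlier setting when done
options(old)
restored <- getOption("max.print")
```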

To check which columns in the data frame df have missing values:

colnames(df)[colSums(is.na(df)) > 0]
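
To see how this one-liner works, it helps to look at the intermediate step. The sketch below uses a small made-up data frame (the column names a, b, and c are hypothetical).

```r
# Hypothetical data frame for illustration
df <- data.frame(
  a = c(1, NA, 3),
  b = c("x", "y", "z"),
  c = c(NA, NA, 1)
)

# is.na(df) is a logical matrix; colSums() counts the TRUEs in each column
na_counts <- colSums(is.na(df))
na_counts                              # a = 1, b = 0, c = 2

# Keep only the names of columns with at least one NA
cols_with_na <- colnames(df)[na_counts > 0]
cols_with_na                           # "a" "c"
```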

The cover photo of this blog post is taken from https://visualstudiomagazine.com/Articles/2016/04/01/Program-Defined-Functions-in-R.aspx?Page=1

