
Maintaining the data frame format when indexing

[This article was first published on gacatag, and kindly contributed to R-bloggers].

 

When indexing a data frame, the result is sometimes silently converted to another type, which can have confusing consequences. For instance, selecting a single column with df[, j] returns a plain vector (of class ‘integer’, ‘factor’ or ‘logical’ in the example below) rather than a data frame. The following demonstrates this:



df<- data.frame(num=1:10, al=letters[1:10], bool=c(rep(TRUE,5), rep(FALSE,5)) )

rownames(df)<- df$al

df
#   num al  bool
# a   1  a  TRUE
# b   2  b  TRUE
# c   3  c  TRUE
# d   4  d  TRUE
# e   5  e  TRUE
# f   6  f FALSE
# g   7  g FALSE
# h   8  h FALSE
# i   9  i FALSE
# j  10  j FALSE

class(df[,1])
#[1] "integer"

class(df[,2])
#[1] "factor"
# (in R >= 4.0.0 this is "character", because stringsAsFactors now defaults to FALSE)

class(df[,3])
#[1] "logical"

df[,1]
#[1]  1  2  3  4  5  6  7  8  9 10

# Note that the following raises an error, because df[,1] is a vector!

rowSums(df[,1])
#Error in base::rowSums(x, na.rm = na.rm, dims = dims, …) :
#  ‘x’ must be an array of at least two dimensions
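
The error comes from the fact that the selected column is no longer a two-dimensional object. As a minimal sketch (the names x, two_cols and msg are introduced here for illustration only and are not part of the original post), the following checks what df[, 1] actually is and captures the error message instead of stopping:

```r
# Same data frame as in the example above
df <- data.frame(num = 1:10, al = letters[1:10],
                 bool = c(rep(TRUE, 5), rep(FALSE, 5)))
rownames(df) <- df$al

x <- df[, 1]            # a single selected column is dropped to a vector

is.data.frame(x)        # FALSE
dim(x)                  # NULL: a plain vector has no dimensions

# Selecting two or more columns is not affected by the drop
two_cols <- df[, 1:2]
is.data.frame(two_cols) # TRUE

# rowSums() needs an object with at least two dimensions, so the call fails;
# tryCatch() is used here only to capture the message as a string
msg <- tryCatch(rowSums(x), error = function(e) conditionMessage(e))
```

The conversion only happens when the selection ends up with exactly one column, which is why code that works on several columns can break unexpectedly when a filter leaves just one.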

Setting the argument drop=FALSE when indexing keeps the result as a data frame, even when only a single column is selected.


class(df[,1, drop=FALSE])
#[1] "data.frame"

df[,1, drop=FALSE] 

#  num
#a   1
#b   2
#c   3
#d   4
#e   5
#f   6
#g   7
#h   8
#i   9
#j  10

# No error raised by the following command!

rowSums(df[,1, drop=FALSE])
# The row names are carried over as names of the result
#  a  b  c  d  e  f  g  h  i  j
#  1  2  3  4  5  6  7  8  9 10
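
As a complement to drop=FALSE, list-style indexing (a single index and no comma) also keeps the data frame format. The sketch below is an illustration rather than part of the original post; the names by_position, by_name, col_vec and rs are hypothetical:

```r
# Same data frame as in the example above
df <- data.frame(num = 1:10, al = letters[1:10],
                 bool = c(rep(TRUE, 5), rep(FALSE, 5)))
rownames(df) <- df$al

# One index and no comma: the data frame is treated as a list of columns,
# and the result stays a data frame
by_position <- df[1]
by_name     <- df["num"]

class(by_position)      # "data.frame"
class(by_name)          # "data.frame"

# Double brackets (and $) extract the column itself, i.e. a vector
col_vec <- df[["num"]]
class(col_vec)          # "integer"

# rowSums() works on the one-column data frame
rs <- rowSums(by_name)
```

Which form to use is mostly a matter of style: drop=FALSE is handy when rows are subset in the same call, while df["num"] is shorter when only columns are selected.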

