Maintaining the data frame format when indexing
Indexing a data frame occasionally changes the format of the result, with confusing consequences. For instance, selecting a single column with df[, j] returns a plain vector (an 'integer' vector, a 'factor', a 'logical' vector, and so on) rather than a one-column data frame. The following demonstrates this:
df <- data.frame(num = 1:10, al = letters[1:10], bool = c(rep(TRUE, 5), rep(FALSE, 5)))
rownames(df) <- df$al
df
#   num al  bool
# a   1  a  TRUE
# b   2  b  TRUE
# c   3  c  TRUE
# d   4  d  TRUE
# e   5  e  TRUE
# f   6  f FALSE
# g   7  g FALSE
# h   8  h FALSE
# i   9  i FALSE
# j  10  j FALSE
class(df[, 1])
#[1] "integer"
class(df[, 2])
#[1] "factor"   (in R < 4.0.0; "character" from R 4.0.0 on, where stringsAsFactors defaults to FALSE)
class(df[, 3])
#[1] "logical"
df[, 1]
# [1]  1  2  3  4  5  6  7  8  9 10
# Note that the following raises an error, because df[, 1] is a vector with no dimensions!
rowSums(df[, 1])
#Error in base::rowSums(x, na.rm = na.rm, dims = dims, ...) :
# ‘x’ must be an array of at least two dimensions
# drop = FALSE keeps the data frame format
class(df[, 1, drop = FALSE])
#[1] "data.frame"
df[, 1, drop = FALSE]
#   num
# a   1
# b   2
# c   3
# d   4
# e   5
# f   6
# g   7
# h   8
# i   9
# j  10
# No error is raised by the following command!
rowSums(df[, 1, drop = FALSE])
#  a  b  c  d  e  f  g  h  i  j
#  1  2  3  4  5  6  7  8  9 10
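For comparison, here is a minimal sketch of some other base R indexing forms and the classes they return. It rebuilds the same df so that it runs on its own. The names by_name and by_pos are only illustrative and do not appear in the example above.

```r
# Rebuild the example data frame from above
df <- data.frame(num = 1:10, al = letters[1:10],
                 bool = c(rep(TRUE, 5), rep(FALSE, 5)))
rownames(df) <- df$al

# List-style indexing (a single index, no comma) returns a data frame
by_name <- df["num"]
by_pos  <- df[1]
class(by_name)   # "data.frame"
class(by_pos)    # "data.frame"
dim(by_name)     # 10 rows, 1 column

# [[ ]] and $ extract the column itself, so the data frame format is lost
class(df[["num"]])   # "integer"
class(df$num)        # "integer"

# Selecting a single row keeps the data frame format,
# since the columns may hold different types
class(df[1, ])       # "data.frame"
```

In scripts where the selected columns are computed at run time and may turn out to be a single column, writing drop = FALSE (or using the list-style form) is probably the safer habit.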