Get Some Class

[This article was first published on Mango Solutions Shop, and kindly contributed to R-bloggers.]

By Chris Campbell – Senior Consultant, UK.

R is a classy language. Look:

x <- matrix(1:9, nrow = 3)
x
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
class(x)
## [1] "matrix"
# note: R 4.0.0 and later return "matrix" "array" here

A class is a label that identifies what an object is: how it is structured and what elements should be there. Knowing an object’s class is very useful because it means that we can make assumptions when writing code. When we’re using functions to write new functions we can take advantage of this. For example, if we need a function to look at the columns of a data frame, we can check the class of the input object to ensure that it actually has columns!

typicalColVals <- function(x) {
    # accept matrices and data frames only
    if (!(is.matrix(x) || is.data.frame(x))) {
        stop("x must be a matrix or data.frame")
    }
    # output object
    out <- character(ncol(x))
    for (i in seq_len(ncol(x))) {
        tab <- table(x[, i])
        # in event of tie choose first occurring
        out[i] <- names(sort(tab, decreasing = TRUE))[1]
    }
    return(out)
}
# get information from a matrix
typicalColVals(x = x)
## [1] "1" "4" "7"
# syntax would not make sense with a vector, 
# so throw error
typicalColVals(x = c("a", "a", "b"))
## Error: x must be a matrix or data.frame
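
The same kind of guard can also be written with inherits(), which tests the class of an object rather than its underlying structure. This is only a minimal sketch: the helper name hasColumns is hypothetical and does not appear in the original code.

```r
# hypothetical helper: TRUE if the object is a matrix or a data frame
hasColumns <- function(x) {
    is.matrix(x) || inherits(x, "data.frame")
}
hasColumns(matrix(1:9, nrow = 3))
## [1] TRUE
hasColumns(data.frame(a = 1:3))
## [1] TRUE
hasColumns(c("a", "a", "b"))
## [1] FALSE
```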

Object orientation is a programming term for languages in which objects are acted upon by methods chosen according to their class. For example, the instruction “open”, when demanded of a person, will be interpreted differently depending on whether it should be performed on a sliding window, a plastic packet, or a jam jar.

R is not strictly object orientated, but allows similar behaviours to be used.

In this tutorial we will just look at “S3” classes, as they are very simple to create, and are great for scripting and ad hoc utilities. For S3 classes, a method is created simply by appending the name of the class to the name of the generic function, separated by a dot. This is why R coding standards typically dissuade the use of dots in object names, and lowerCamelCase is preferred.

# generic method
summary(x)
##        V1            V2            V3     
##  Min.   :1.0   Min.   :4.0   Min.   :7.0  
##  1st Qu.:1.5   1st Qu.:4.5   1st Qu.:7.5  
##  Median :2.0   Median :5.0   Median :8.0  
##  Mean   :2.0   Mean   :5.0   Mean   :8.0  
##  3rd Qu.:2.5   3rd Qu.:5.5   3rd Qu.:8.5  
##  Max.   :3.0   Max.   :6.0   Max.   :9.0
# available methods
methods(generic.function = "summary")
##  [1] summary.aov             summary.aovlist        
##  [3] summary.aspell*         summary.connection     
##  [5] summary.data.frame      summary.Date           
##  [7] summary.default         summary.ecdf*          
##  [9] summary.factor          summary.glm            
## [11] summary.infl            summary.lm             
## [13] summary.loess*          summary.manova         
## [15] summary.matrix          summary.mlm            
## [17] summary.nls*            summary.packageStatus* 
## [19] summary.PDF_Dictionary* summary.PDF_Stream*    
## [21] summary.POSIXct         summary.POSIXlt        
## [23] summary.ppr*            summary.prcomp*        
## [25] summary.princomp*       summary.proc_time      
## [27] summary.srcfile         summary.srcref         
## [29] summary.stepfun         summary.stl*           
## [31] summary.table           summary.tukeysmooth*   
## 
##    Non-visible functions are asterisked
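
To see the naming convention at work, a generic and its methods can be written by hand. The sketch below assumes a hypothetical generic called describe (it is not part of base R or of the original post). UseMethod() picks the method whose suffix matches the class of the first argument, and falls back to describe.default when nothing matches.

```r
# hypothetical generic: dispatches on the class of x
describe <- function(x, ...) {
    UseMethod("describe")
}
# fallback when no class-specific method exists
describe.default <- function(x, ...) {
    "some other object"
}
# method for data frames
describe.data.frame <- function(x, ...) {
    paste("data frame with", ncol(x), "columns")
}
# method for factors
describe.factor <- function(x, ...) {
    paste("factor with", nlevels(x), "levels")
}
describe(data.frame(a = 1:3, b = 4:6))
## [1] "data frame with 2 columns"
describe(factor(c("a", "a", "b")))
## [1] "factor with 2 levels"
describe(1:3)
## [1] "some other object"
```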

We can choose to call visible methods directly. But we should be aware that the assumptions of structure implied by the corresponding class may mean that it is not appropriate to use certain methods on certain objects. It is for this reason that some methods may be hidden.

summary.matrix(x)
##        V1            V2            V3     
##  Min.   :1.0   Min.   :4.0   Min.   :7.0  
##  1st Qu.:1.5   1st Qu.:4.5   1st Qu.:7.5  
##  Median :2.0   Median :5.0   Median :8.0  
##  Mean   :2.0   Mean   :5.0   Mean   :8.0  
##  3rd Qu.:2.5   3rd Qu.:5.5   3rd Qu.:8.5  
##  Max.   :3.0   Max.   :6.0   Max.   :9.0
summary.factor(x)
## 1 2 3 4 5 6 7 8 9 
## 1 1 1 1 1 1 1 1 1
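
Non-visible methods can still be inspected when needed. A brief sketch using getS3method() from the utils package, which returns the method registered for a given generic and class. summary.loess is one of the asterisked entries in the listing above, although the exact set of hidden methods depends on the R version and on which packages are loaded.

```r
# retrieve a method that is registered but not exported
f <- getS3method("summary", "loess")
is.function(f)
## [1] TRUE
# with optional = TRUE, a missing method gives NULL rather than an error
getS3method("summary", "noSuchClass", optional = TRUE)
## NULL
```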

“S3” classes are relaxed because they can be manually updated.

class(x) <- "square"
class(x)
## [1] "square"
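
Because class() can be assigned freely, nothing stops the “square” label from being attached to an object that does not fit it. One common safeguard is a small constructor that validates the input before setting the class. This is a minimal sketch: the function name newSquare is hypothetical, and it assumes that “square” means a matrix with equal numbers of rows and columns.

```r
# hypothetical constructor: validate first, then label
newSquare <- function(m) {
    # assumption: a "square" object has as many rows as columns
    if (!is.matrix(m) || nrow(m) != ncol(m)) {
        stop("m must be a matrix with equal numbers of rows and columns")
    }
    class(m) <- "square"
    return(m)
}
sq <- newSquare(matrix(1:9, nrow = 3))
class(sq)
## [1] "square"
# a 2 x 3 matrix is rejected
res <- try(newSquare(matrix(1:6, nrow = 2)), silent = TRUE)
inherits(res, "try-error")
## [1] TRUE
```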

We can define a new “S3” method to work on square objects. This function will only be dispatched on square objects, so we can assume that the rowSums and colSums functions can be used.

# new methods for square class
summary.square <- function(x) {
    out <- list(rowsums = rowSums(x),
        colsums = colSums(x))
    cat("row sums:", out$rowsums, 
        "\ncolumn sums:", out$colsums, "\n")
    return(invisible(out))
}
summary(x)
## row sums: 12 15 18 
## column sums: 6 15 24

We can check what methods are available with the class argument of methods.

methods(class = "square")
## [1] summary.square

If there is no applicable method for a class, the default method will be used.

print(x)
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## attr(,"class")
## [1] "square"
# equivalent
print.default(x)
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## attr(,"class")
## [1] "square"
# by default, the first two columns are plotted
plot(x)

We can create multiple methods for a class.

plot.square <- function(x, ...) {
    x <- t(x)[, rev(seq_len(nrow(x))), drop = FALSE]
    image(x, ...)
}
plot(x)

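An object can have more than one class. Methods are looked up for each class in turn, so by adding the “by” class after “square” we can have print of our object fall through to print.by, since no print.square method exists.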

class(x) <- c("square", "by")
print(x)
## [1] 1
## -------------------------------------------------------- 
## [1] 2
## -------------------------------------------------------- 
## [1] 3
## -------------------------------------------------------- 
## [1] 4
## -------------------------------------------------------- 
## [1] 5
## -------------------------------------------------------- 
## [1] 6
## -------------------------------------------------------- 
## [1] 7
## -------------------------------------------------------- 
## [1] 8
## -------------------------------------------------------- 
## [1] 9
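
Dispatch walks the class vector from left to right, and a method can hand over to the next class in line with NextMethod(). The following sketch uses a made-up generic, speak, and two made-up classes, “child” and “parent”, to show the order. None of these names come from the original post.

```r
# hypothetical generic with methods for two made-up classes
speak <- function(x, ...) {
    UseMethod("speak")
}
speak.parent <- function(x, ...) {
    "parent method"
}
speak.child <- function(x, ...) {
    # add this method's result, then pass on to the next class
    c("child method", NextMethod())
}
obj <- structure(list(), class = c("child", "parent"))
speak(obj)
## [1] "child method"  "parent method"
# inherits() checks every entry in the class vector
inherits(obj, "parent")
## [1] TRUE
```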

The “S3” class system allows specialized objects relating to your own work to be operated on using intuitively named commands. This will make it easier to reuse your code and easier for others to follow your work.
