By Chris Campbell – Senior Consultant, UK.
R is a classy language. Look:
x <- matrix(1:9, nrow = 3)
x
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
class(x)
## [1] "matrix"
A class is a label that identifies what an object is: how it’s structured and what elements it should contain. Knowing an object’s class is very useful because it means that we can make assumptions when writing code. When we’re using function to write new functions we can take advantage of this. For example, if we need a function to look at columns of a data frame, we can check the class of the input object to ensure that it actually has columns!
typicalColVals <- function(x) {
    # stop unless x is a matrix or a data frame
    if (!(is.matrix(x) || is.data.frame(x))) {
        stop("x must be a matrix or data.frame")
    }
    # output object
    out <- character(ncol(x))
    for (i in seq_len(ncol(x))) {
        tab <- table(x[, i])
        # in event of tie choose first occurring
        out[i] <- names(sort(tab, decreasing = TRUE))[1]
    }
    return(out)
}
# get information from a matrix
typicalColVals(x = x)
## [1] "1" "4" "7"
# syntax would not make sense with a vector,
# so throw error
typicalColVals(x = c("a", "a", "b"))
## Error: x must be a matrix or data.frame
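The same kind of check can also be written with inherits(), which tests whether an object carries a given class. The following is only a sketch: checkHasColumns is a hypothetical helper name and not part of the function above.

```r
# hypothetical helper: stop unless the input is a matrix or a data frame
checkHasColumns <- function(x) {
    # inherits() is TRUE if x has any of the listed classes
    if (!inherits(x, c("matrix", "data.frame"))) {
        stop("x must be a matrix or data.frame")
    }
    invisible(TRUE)
}

# both of these pass silently
checkHasColumns(matrix(1:9, nrow = 3))
checkHasColumns(data.frame(a = 1:3))

# a vector has no columns, so the check fails;
# try() captures the error instead of halting the script
res <- try(checkHasColumns(c("a", "a", "b")), silent = TRUE)
inherits(res, "try-error")
```

Wrapping the check in its own function keeps the error message in one place if several functions need the same guarantee.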
Object orientation is a programming term to describe languages where objects are acted upon by methods according to their class. For example, the instruction “open” when demanded of a person will be interpreted differently depending on whether it should be performed on a sliding window, a plastic packet, or a jam jar.
R is not strictly object orientated, but allows similar behaviours to be used through three class systems:
- Relaxed “S3” classes
- Formal “S4” classes
- Reference classes
In this tutorial we will just look at “S3” classes, as they are very simple to create, and are great for scripting and ad hoc utilities. For these classes, a method is created simply by appending the name of the class to the name of the generic function, separated by a dot. This is why R coding standards typically dissuade the use of dots in object names, and lowerCamelCase is preferred.
# generic method
summary(x)
##        V1            V2            V3
##  Min.   :1.0   Min.   :4.0   Min.   :7.0
##  1st Qu.:1.5   1st Qu.:4.5   1st Qu.:7.5
##  Median :2.0   Median :5.0   Median :8.0
##  Mean   :2.0   Mean   :5.0   Mean   :8.0
##  3rd Qu.:2.5   3rd Qu.:5.5   3rd Qu.:8.5
##  Max.   :3.0   Max.   :6.0   Max.   :9.0
# available methods
methods(generic.function = "summary")
##  [1] summary.aov             summary.aovlist
##  [3] summary.aspell*         summary.connection
##  [5] summary.data.frame      summary.Date
##  [7] summary.default         summary.ecdf*
##  [9] summary.factor          summary.glm
## [11] summary.infl            summary.lm
## [13] summary.loess*          summary.manova
## [15] summary.matrix          summary.mlm
## [17] summary.nls*            summary.packageStatus*
## [19] summary.PDF_Dictionary* summary.PDF_Stream*
## [21] summary.POSIXct         summary.POSIXlt
## [23] summary.ppr*            summary.prcomp*
## [25] summary.princomp*       summary.proc_time
## [27] summary.srcfile         summary.srcref
## [29] summary.stepfun         summary.stl*
## [31] summary.table           summary.tukeysmooth*
##
##    Non-visible functions are asterisked
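The naming convention can be tried out with a generic of our own, borrowing the “open” analogy from earlier. This is a sketch only: openIt and the classes "window" and "jar" are hypothetical names invented for illustration, and the generic is built with UseMethod, the base R function that dispatches on the class of the first argument.

```r
# hypothetical generic: UseMethod() picks a method from the class of x
openIt <- function(x, ...) {
    UseMethod("openIt")
}

# methods are named <generic>.<class>
openIt.window <- function(x, ...) "slide it sideways"
openIt.jar <- function(x, ...) "twist the lid"

# fallback used when no class-specific method exists
openIt.default <- function(x, ...) "no idea how to open this"

# toy objects that carry nothing but a class label
window <- structure(list(), class = "window")
jar <- structure(list(), class = "jar")

openIt(window)
openIt(jar)
# a plain number has no openIt method, so the default is used
openIt(42)
```

The same instruction, openIt, is interpreted differently depending on the class of the object it is given.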
We can choose to call visible methods directly. But we should be aware that the assumptions of structure implied by the corresponding class may mean that it is not appropriate to use certain methods on certain objects. It is for this reason that some methods may be hidden.
summary.matrix(x)
##        V1            V2            V3
##  Min.   :1.0   Min.   :4.0   Min.   :7.0
##  1st Qu.:1.5   1st Qu.:4.5   1st Qu.:7.5
##  Median :2.0   Median :5.0   Median :8.0
##  Mean   :2.0   Mean   :5.0   Mean   :8.0
##  3rd Qu.:2.5   3rd Qu.:5.5   3rd Qu.:8.5
##  Max.   :3.0   Max.   :6.0   Max.   :9.0
summary.factor(x)
## 1 2 3 4 5 6 7 8 9
## 1 1 1 1 1 1 1 1 1
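Hidden methods cannot be called by name, but they can still be inspected with getS3method from the utils package. A brief sketch, using summary.ecdf (one of the asterisked entries in the listing above) as the example:

```r
# retrieve a non-visible method by generic name and class name
hidden <- getS3method("summary", "ecdf")
is.function(hidden)

# ordinary dispatch finds the hidden method without any special handling
e <- ecdf(c(1, 2, 2, 3))
class(e)
s <- summary(e)
```

In everyday use, calling the generic and letting R choose the method is usually the safer option.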
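“S3” classes are relaxed because they can be manually updated: the class attribute can be reassigned at any time, and R does not check that the object matches its new label.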
class(x) <- "square"
class(x)
## [1] "square"
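Because nothing stops a class label from being attached to an unsuitable object, it can help to assign the class through a small constructor that validates its input first. The sketch below assumes a hypothetical function name, newSquare, and assumes that a “square” object should be a matrix with as many rows as columns.

```r
# hypothetical constructor: only square matrices receive the "square" class
newSquare <- function(x) {
    if (!is.matrix(x) || nrow(x) != ncol(x)) {
        stop("x must be a matrix with equal numbers of rows and columns")
    }
    class(x) <- "square"
    x
}

sq <- newSquare(matrix(1:9, nrow = 3))
class(sq)

# a 2 x 3 matrix is rejected rather than silently mislabelled
bad <- try(newSquare(matrix(1:6, nrow = 2)), silent = TRUE)
inherits(bad, "try-error")
```

Methods written for the class can then rely on the structure that the constructor has checked.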
We can define a new “S3” method to work on square objects. This function will only be dispatched on square objects, so we can assume that the rowSums and colSums functions can be used.
# new methods for square class
summary.square <- function(x, ...) {
    out <- list(rowsums = rowSums(x), colsums = colSums(x))
    # "\n" starts a new line between and after the two sets of sums
    cat("row sums:", out$rowsums, "\ncolumn sums:", out$colsums, "\n")
    return(invisible(out))
}
summary(x)
## row sums: 12 15 18
## column sums: 6 15 24
We can check what methods are available with the class argument of methods.
methods(class = "square")
## [1] summary.square
If there is no applicable method for a class, the default method will be used.
print(x)
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## attr(,"class")
## [1] "square"
# equivalent
print.default(x)
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## attr(,"class")
## [1] "square"
# by default, the first two columns are plotted
plot(x)
We can create multiple methods for a class.
plot.square <- function(x, ...) {
    x <- t(x)[, rev(seq_len(nrow(x))), drop = FALSE]
    image(x, ...)
}
plot(x)
An object can have more than one class. We can request that print of our object calls print.by by adding the “by” class. Classes are tried in order, and since there is no print.square method, print.by is used.
class(x) <- c("square", "by")
print(x)
## [1] 1
## --------------------------------------------------------
## [1] 2
## --------------------------------------------------------
## [1] 3
## --------------------------------------------------------
## [1] 4
## --------------------------------------------------------
## [1] 5
## --------------------------------------------------------
## [1] 6
## --------------------------------------------------------
## [1] 7
## --------------------------------------------------------
## [1] 8
## --------------------------------------------------------
## [1] 9
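When an object has several classes, the order of the class vector decides which method is tried first, and NextMethod passes control on to the method for the next class. The following sketch uses a hypothetical generic, describe, and a hypothetical second class, "grid", purely to make the dispatch order visible.

```r
# hypothetical generic and methods, used only to show dispatch order
describe <- function(x, ...) UseMethod("describe")

describe.square <- function(x, ...) {
    # NextMethod() hands over to the method for the next class in class(x)
    c("square", NextMethod())
}
describe.grid <- function(x, ...) c("grid", NextMethod())
describe.default <- function(x, ...) "default"

obj <- structure(matrix(1:4, nrow = 2), class = c("square", "grid"))
first <- describe(obj)
first

# reversing the class vector reverses the order in which methods are tried
class(obj) <- c("grid", "square")
second <- describe(obj)
second
```

Each result lists the methods in the order they were called, ending with the default method once the class vector is exhausted.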
The “S3” class system allows specialized objects relating to your own work to be operated on using intuitively named commands. This will make it easier to reuse your code and easier for others to follow your work.