It’s often the case that I want to write an R script that loops over multiple datasets, or different subsets of a large dataset, running the same procedure over them: generating plots, or fitting a model, perhaps. I set the script running and turn to another task, only to come back later and find the loop has crashed partway through, on an unanticipated error. Here’s a toy example:
> inputs = list(1, 2, 4, -5, 'oops', 0, 10)
> for(input in inputs) {
+ print(paste("log of", input, "=", log(input)))
+ }
[1] "log of 1 = 0"
[1] "log of 2 = 0.693147180559945"
[1] "log of 4 = 1.38629436111989"
[1] "log of -5 = NaN"
Error in log(input) : Non-numeric argument to mathematical function
In addition: Warning message:
In log(input) : NaNs produced
The loop handled the negative argument more or less gracefully (depending on how you feel about NaN), but crashed on the non-numeric argument, and didn’t finish the list of inputs.
How are we going to handle this?
The try block
The most straightforward way is to wrap our problematic call in a try block:
> for(input in inputs) {
+ try(print(paste("log of", input, "=", log(input))))
+ }
[1] "log of 1 = 0"
[1] "log of 2 = 0.693147180559945"
[1] "log of 4 = 1.38629436111989"
[1] "log of -5 = NaN"
Error in log(input) : Non-numeric argument to mathematical function
In addition: Warning message:
In log(input) : NaNs produced
[1] "log of 0 = -Inf"
[1] "log of 10 = 2.30258509299405"
This skips over the error-causing non-numeric input with an error message (you can suppress the error message with the silent=TRUE argument to try), and continues on with the rest of the input. Generally, this is what you would like.
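If you also need to know afterwards which inputs failed, one option is to keep the value that try returns: on failure it is an object of class "try-error" rather than the value of the expression. A minimal sketch, where the names results and failed are ours and purely illustrative:

```r
inputs <- list(1, 2, 4, -5, 'oops', 0, 10)

# Keep whatever try() returns for each input. silent = TRUE suppresses the
# printed error message; the NaN warning from log(-5) is silenced here only
# to keep the output short.
results <- lapply(inputs, function(input) {
  suppressWarnings(try(log(input), silent = TRUE))
})

# A failed call leaves an object of class "try-error" in its slot
failed <- sapply(results, function(r) inherits(r, "try-error"))
print(which(failed))
# [1] 5
```

The fifth input ('oops') is flagged, while the other six slots hold the ordinary results of log, including NaN and -Inf.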
The tryCatch block
Sometimes, however, you might want to substitute your own return value when errors (or warnings) are signaled. We can do this with tryCatch, which allows you to write your own error and warning handlers. Let’s set our loop to return log(-x) when x is negative (negative arguments throw a warning) and return a NaN for non-numeric arguments (which throw an error). We’ll print out an advisory message, too.
> for(input in inputs) {
+   tryCatch(print(paste("log of", input, "=", log(input))),
+            warning = function(w) {print(paste("negative argument", input));
+                                   log(-input)},
+            error = function(e) {print(paste("non-numeric argument", input));
+                                 NaN})
+ }
[1] "log of 1 = 0"
[1] "log of 2 = 0.693147180559945"
[1] "log of 4 = 1.38629436111989"
[1] "negative argument -5"
[1] "non-numeric argument oops"
[1] "log of 0 = -Inf"
[1] "log of 10 = 2.30258509299405"
Whoops, not quite! We are correctly catching and messaging warnings and errors, but we are not printing out our desired corrected value. This is because the print call sits inside the tryCatch: when log signals a warning or an error, control jumps out of the print statement to the handler, and the handler’s return value becomes the value of the whole tryCatch, which the loop then discards. If we want to return and print out the appropriate value when warnings and errors are thrown, the tryCatch has to wrap only the call to log; the tidiest way to do that is to wrap our tryCatch into a function. We’ll leave the advisory message in.
> robustLog = function(x) {
+   tryCatch(log(x),
+            warning = function(w) {print(paste("negative argument", x));
+                                   log(-x)},
+            error = function(e) {print(paste("non-numeric argument", x));
+                                 NaN})
+ }
>
> for(input in inputs) {
+ print(paste("robust log of", input, "=", robustLog(input)))
+ }
[1] "robust log of 1 = 0"
[1] "robust log of 2 = 0.693147180559945"
[1] "robust log of 4 = 1.38629436111989"
[1] "negative argument -5"
[1] "robust log of -5 = 1.6094379124341"
[1] "non-numeric argument oops"
[1] "robust log of oops = NaN"
[1] "robust log of 0 = -Inf"
[1] "robust log of 10 = 2.30258509299405"
Now we return and print out a valid numeric value for numeric inputs to robustLog, and a NaN only for non-numeric input. Notice also that log(0) still returns -Inf, with no warning or error.
Of course, now that we are writing a new function, it would make more sense to check the arguments before calling log, to avoid the recalculation. This example is only to demonstrate tryCatch, which is useful for defending against unexpected errors.
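For comparison, here is a rough sketch of the argument-checking version. The name checkedLog is hypothetical, and the function assumes a single (scalar) input, as in the loop above; it is one possible way to write the check, not the only one.

```r
# checkedLog is a hypothetical name. For the inputs used in this post it
# behaves like robustLog, but it tests the argument up front instead of
# catching conditions, so log is only ever called once.
checkedLog <- function(x) {
  if (!is.numeric(x)) {
    print(paste("non-numeric argument", x))
    return(NaN)
  }
  if (x < 0) {
    print(paste("negative argument", x))
    return(log(-x))
  }
  log(x)
}

for (input in list(1, -5, 'oops', 0)) {
  print(paste("checked log of", input, "=", checkedLog(input)))
}
```

The trade-off is that this version only defends against the problems we thought to test for, whereas tryCatch also covers the ones we did not anticipate.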
Advanced Exception Handling
The above is about as much about exception and error handling in R as you will usually need to know, but there are a few more nuances. The documentation for tryCatch claims that it works like Java or C++ exceptions: this would mean that when the interpreter generates an exceptional condition and throws, execution then returns to the level of the catch block and all state below the try block is forgotten. In practice, tryCatch is a bit more powerful than that, because you have the ability to insert custom warning and exception handlers. There is another exception handling routine called withCallingHandlers that similarly allows you to insert custom warning and exception handlers. The two differ in where the handler runs: a tryCatch handler runs after the call stack has been unwound to the tryCatch call, and its value becomes the value of the tryCatch, whereas a withCallingHandlers handler is called at the point where the condition was signaled, with the call stack still intact. When such a calling handler returns, the condition is passed on to any remaining handlers, unless the handler has transferred control elsewhere, for example by invoking a restart.
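A small experiment makes the behaviour of the two constructs easy to compare. The sketch below signals a warning in the middle of a function and handles it both ways. The names warnThenFinish and seen are ours and hypothetical; the only built-in machinery it relies on is the standard "muffleWarning" restart that warning() establishes.

```r
# A toy function that warns part-way through and then carries on.
# warnThenFinish is a hypothetical name used only for this sketch.
warnThenFinish <- function() {
  warning("something looks odd")
  "finished"
}

# tryCatch: the handler's value replaces the value of the whole expression;
# the rest of warnThenFinish never runs.
a <- tryCatch(warnThenFinish(),
              warning = function(w) "handler value")

# withCallingHandlers: the handler runs at the point of the warning. After it
# invokes the "muffleWarning" restart, warnThenFinish resumes and completes.
seen <- character(0)
b <- withCallingHandlers(warnThenFinish(),
                         warning = function(w) {
                           seen <<- c(seen, conditionMessage(w))
                           invokeRestart("muffleWarning")
                         })

print(a)     # [1] "handler value"
print(b)     # [1] "finished"
print(seen)  # [1] "something looks odd"
```

With tryCatch the interrupted computation is abandoned, while with withCallingHandlers it can be resumed, which is what makes the latter a natural fit for logging warnings without losing the result.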
The final concept in R’s error handling is withRestarts, which is not really an error handling mechanism but rather a general control flow structure. The withRestarts structure can return to a saved execution state, rather like a co-routine or long-jump. It can be used with withCallingHandlers or with tryCatch to design either interactive or automated “retry on failure” mechanisms, where the retry logic is outside of the failing function. That said, a function that checks for potential errors and alters its behavior before signaling a failure is much easier to maintain.
Here’s as simple an example of using restarts as we could come up with. The idea is that there is some big expensive computation that you want to do with the function input before you get to the potentially error-causing code. You want the exception handlers to mitigate the failure and continue running the code without having to redo the expensive calculation. Imagine this function as being part of a library of routines that you wish to call regularly.
By default, our example routine will enter R’s debugging environment upon exception. The user then has to select the appropriate restart function to continue the operation.
> # argument x: item to take logarithm of
> # argument warning: warning handler
> # argument error: error handler
> # invokeRestart("flipArg"): re-runs function on -x
> #     (appropriate fix for negative numeric arguments)
> # invokeRestart("zapOutArg"): re-runs function on x=1
> #     (appropriate fix for non-numeric arguments)
> expensiveBigLibraryFunction <- function(x,
+                warning=function(w) {
+                   print(paste('warning:',w));
+                   browser()},
+                error=function(e) {
+                   print(paste('e:',e));
+                   browser()}
+                )
+ {
+   print(paste("big expensive step we don't want to repeat for x:",x))
+   z <- x  # the "expensive operation"
+           # (not really, just standing in for computation)
+   repeat
+     withRestarts(
+       withRestarts(
+         tryCatch( # you could call withCallingHandlers
+                   # with identical arguments here, too
+           {
+             print(paste("attempt cheap operation for z:",z))
+             return(log(z))
+           },
+           warning = warning,
+           error = error ),
+         flipArg = function() {z <<- -z} ),
+       zapOutArg = function() {z <<- 1} )
+ }
Here's the code working on valid input.
> # normal operation
> expensiveBigLibraryFunction(2)
[1] "big expensive step we don't want to repeat for x: 2"
[1] "attempt cheap operation for z: 2"
[1] 0.6931472
Here's what happens when you call the code with a negative argument, and then invoke the correct restart.
> # bad numeric argument (negative)
> # user must restart with flipArg
> expensiveBigLibraryFunction(-2)
[1] "big expensive step we don't want to repeat for x: -2"
[1] "attempt cheap operation for z: -2"
[1] "warning: simpleWarning in log(z): NaNs produced\n"
Called from: function (w)
{
print(paste("warning:", w))
browser()
}(list(message = "NaNs produced", call = log(z)))
Browse[1]> invokeRestart("flipArg")
[1] "attempt cheap operation for z: 2"
[1] 0.6931472
Here's what happens when you call the code with a non-numeric argument, and then invoke the inappropriate restart.
> # bad non-numeric argument
> # flipArg is the wrong restart function
> expensiveBigLibraryFunction('a')
[1] "big expensive step we don't want to repeat for x: a"
[1] "attempt cheap operation for z: a"
[1] "e: Error in log(z): Non-numeric argument to mathematical function\n"
Called from: h(simpleError(msg, call))
Browse[1]> invokeRestart("flipArg")
Error in -z : invalid argument to unary operator
Here's what happens when you call the code with a non-numeric argument, and then invoke the correct restart.
> # bad non-numeric argument
> # zapOutArg is the right restart function
> expensiveBigLibraryFunction('a')
[1] "big expensive step we don't want to repeat for x: a"
[1] "attempt cheap operation for z: a"
[1] "e: Error in log(z): Non-numeric argument to mathematical function\n"
Called from: h(simpleError(msg, call))
Browse[1]> invokeRestart("zapOutArg")
[1] "attempt cheap operation for z: 1"
[1] 0
Of course, you probably don't want to have to invoke the restart manually, so we will rewrite the exception handlers to invoke the appropriate restart automatically.
> autoBigLibraryFunction = function(x) {
+ expensiveBigLibraryFunction(x,
+ warning=function(w) {invokeRestart("flipArg")},
+ error=function(e) {invokeRestart("zapOutArg")})
+ }
> autoBigLibraryFunction(2)
[1] "big expensive step we don't want to repeat for x: 2"
[1] "attempt cheap operation for z: 2"
[1] 0.6931472
> autoBigLibraryFunction(-2)
[1] "big expensive step we don't want to repeat for x: -2"
[1] "attempt cheap operation for z: -2"
[1] "attempt cheap operation for z: 2"
[1] 0.6931472
> autoBigLibraryFunction('a')
[1] "big expensive step we don't want to repeat for x: a"
[1] "attempt cheap operation for z: a"
[1] "attempt cheap operation for z: 1"
[1] 0
Using withRestarts is a bit complex, as you can see. Fortunately try and tryCatch will most likely be good enough for the vast majority of your exception handling needs.