A primer in using Java from R – part 1
Introduction
This primer consists of two parts. Its goal is to provide a walk-through of using resources developed in Java from R. It is structured more as a “note-to-future-self” than as a proper educational article; I hope, however, that some readers may still find it useful. It also lists a set of references that I found very helpful, for which I thank the respective authors.
The primer is split into 2 posts:
- In this first one we look at calling Java code from R directly by executing shell commands, and then at using the rJava package to create objects, work with arrays and call Java methods in various ways.
- In the second one we discuss creating and using custom .jar archives within our R scripts and packages and handling Java exceptions from R, and we take a quick look at a performance comparison between the low-level and high-level interfaces provided by rJava.
Calling Java from R directly
Calling Java resources from R directly can be achieved using R’s system() function, which invokes the specified OS command. We can either use an already compiled Java class, or invoke the compilation via a system() call from R as well. Of course, for any real-world practical use, we will probably do the Java coding, compilation and packaging into .jar files in a Java IDE and provide R with just the final .jar file(s). I do, however, find it helpful to have a small example of the simplest complete case, for which even the following is sufficient. Integrating pre-prepared .jars into an R package will be covered in detail in the second part of this primer.
Let us show that by writing a very silly dummy class with just 2 methods:
- main, which prints “Hello, World. ” followed by a suffix that can optionally be provided as an argument
- SayMyName, which returns a string constructed from “My name is ” and getClass().getName()
This HelloWorldDummy.java file can look as follows:
package DummyJavaClassJustForFun;

public class HelloWorldDummy {

  public String SayMyName() {
    return("My name is " + getClass().getName());
  }

  public static void main(String[] args) {
    String stringArg = "And that is it.";
    if (args.length > 0) {
      stringArg = args[0];
    }
    System.out.println("Hello, World. " + stringArg);
  }
}
Compilation and execution via bash commands
Now that we have our dummy class ready, we can put together the commands and test them by just executing via a shell, or for RStudio fans, we can test the commands via RStudio’s cool Terminal feature. First, the compilation command, which may look something like the following, assuming that we are in the correct working directory:
$ javac DummyJavaClassJustForFun/HelloWorldDummy.java
Now that we have the class compiled, we can execute the main method, with and without the argument provided:
$ java DummyJavaClassJustForFun/HelloWorldDummy
$ java DummyJavaClassJustForFun/HelloWorldDummy "I like winter"
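To make the expected output of these two commands concrete, here is a minimal sketch that pulls the suffix logic of main into a helper so it can be looked at in isolation. The names HelloWorldMessage and buildMessage are hypothetical and introduced only for illustration; they are not part of the class above.

```java
// Sketch only: HelloWorldMessage and buildMessage are hypothetical names that
// mirror the logic of HelloWorldDummy.main.
final class HelloWorldMessage {

    // Returns the line that main would print for the given command-line arguments.
    static String buildMessage(String[] args) {
        String stringArg = "And that is it.";   // default suffix when no argument is given
        if (args.length > 0) {
            stringArg = args[0];                // only the first argument is used
        }
        return "Hello, World. " + stringArg;
    }

    public static void main(String[] args) {
        System.out.println(buildMessage(new String[0]));
        System.out.println(buildMessage(new String[] {"I like winter"}));
    }
}
```

Since the shell passes the quoted string as a single argument, the second command should print the greeting followed by "I like winter", while the first falls back to the default suffix.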
In case we need to compile and run with more .jars that are in the folder jars/, we specify the folder using -cp (class path):
$ javac -cp "jars/*" DummyJavaClassJustForFun/HelloWorldDummy.java
$ java -cp "jars/*:compile/src" DummyJavaClassJustForFun/HelloWorldDummy
Compilation and execution of Java code from R
Now that we have tested our commands, we can use R to do the compilation via the system function. Do not forget to cd into the correct directory within a single system call if needed:
system('cd data/; javac DummyJavaClassJustForFun/HelloWorldDummy.java')
After that we can also execute the main method, and the main method with one argument specified, just like we did outside of R, once again using cd to enter the proper working directory if needed:
system('cd data/; java DummyJavaClassJustForFun/HelloWorldDummy')
system('cd data/; java DummyJavaClassJustForFun/HelloWorldDummy "Also I like winter"')
The rJava package – an R to Java interface
The rJava package provides a low-level interface to the Java virtual machine. It allows creating objects, calling methods and accessing fields of the objects. It also provides functionality to include our Java resources in R packages easily.
We can install it with the classic:
install.packages("rJava")
Note the system requirements: Java JDK 1.2 or higher, and JDK 1.4 or higher for JRI/REngine. After attaching the package, we also need to initialize a Java Virtual Machine (JVM):
## Attach rJava and Init a JVM
library(rJava)
.jinit()
In case of issues with attaching the package using library, one can refer to this helpful StackOverflow thread.
Creating Java objects with rJava
We will now very quickly go through the basic uses of the package. The .jnew function is used to create a new Java object. Note that the class argument requires a fully qualified class name in Java Native Interface notation.
# Creating a new object of java.lang class String
sHello <- .jnew(class = "java/lang/String", "Hello World!")
# Creating a new object of java.lang class Integer
iOne <- .jnew(class = "java/lang/Integer", "1")
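For orientation, the plain-Java counterparts of the two .jnew calls can be sketched as follows, together with a small helper that turns a class into the slash-separated name used in the class argument. JnewEquivalents and jniClassName are hypothetical names used only for this illustration.

```java
// Sketch only: JnewEquivalents and jniClassName are hypothetical names.
final class JnewEquivalents {

    // Converts a Java class to its slash-separated name,
    // e.g. java.lang.String becomes java/lang/String.
    static String jniClassName(Class<?> cls) {
        return cls.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        // Plain-Java counterparts of the two .jnew calls
        String sHello = new String("Hello World!");
        Integer iOne = Integer.valueOf("1");   // parses the string "1" into an Integer

        System.out.println(jniClassName(sHello.getClass()));
        System.out.println(jniClassName(iOne.getClass()));
    }
}
```

Note that the second .jnew call passes the string "1", so on the Java side the Integer is built from a String rather than from an int; Integer.valueOf("1") is used here as a stand-in for that.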
Working with arrays via rJava
# Creating new arrays
iArray <- .jarray(1L:2L)
.jevalArray(iArray)
## [1] 1 2

# Using a list of 2 and lapply
# Integer Matrix int[2][2]
iMatrix <- .jarray(list(iArray, iArray), contents.class = "[I")
lapply(iMatrix, .jevalArray)
## [[1]]
## [1] 1 2
##
## [[2]]
## [1] 1 2

# Integer Matrix int[2][2]
square <- array(1:4, dim = c(2, 2))
square
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4

# Using dispatch = TRUE to create the array
# Using simplify = TRUE to return a nice R array
dSquare <- .jarray(square, dispatch = TRUE)
.jevalArray(dSquare, simplify = TRUE)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4

# Integer Tesseract int[2][2][2][2]
tesseract <- array(1L:16L, dim = c(2, 2, 2, 2))
tesseract
## , , 1, 1
##
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
##
## , , 2, 1
##
##      [,1] [,2]
## [1,]    5    7
## [2,]    6    8
##
## , , 1, 2
##
##      [,1] [,2]
## [1,]    9   11
## [2,]   10   12
##
## , , 2, 2
##
##      [,1] [,2]
## [1,]   13   15
## [2,]   14   16

# Use dispatch = TRUE to create the array
# Use simplify = TRUE to return a nice R array
# Interestingly, this seems weird
dTesseract <- .jarray(tesseract, dispatch = TRUE)
.jevalArray(dTesseract, simplify = TRUE)
## , , 1, 1
##
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    0
##
## , , 2, 1
##
##      [,1] [,2]
## [1,]    0    0
## [2,]    0    8
##
## , , 1, 2
##
##      [,1] [,2]
## [1,]    9    0
## [2,]    0    0
##
## , , 2, 2
##
##      [,1] [,2]
## [1,]    0    0
## [2,]    0   16
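One thing worth keeping in mind when reading the square example: R stores a matrix column by column, whereas a Java int[][] is an array of row arrays. The sketch below shows the index arithmetic that relates the two. It assumes the Java side holds one int[] per R row, which matches the 2 x 2 output above but is an assumption about the layout, not something taken from the rJava sources. ColumnMajor and toRows are hypothetical names.

```java
import java.util.Arrays;

// Sketch only: ColumnMajor and toRows are hypothetical names, and the
// "one int[] per R row" layout is an assumption.
final class ColumnMajor {

    // Rebuilds rows from a column-major vector: element [i][j] sits at index i + j * nrow.
    static int[][] toRows(int[] columnMajor, int nrow, int ncol) {
        if (columnMajor.length != nrow * ncol) {
            throw new IllegalArgumentException("length must equal nrow * ncol");
        }
        int[][] rows = new int[nrow][ncol];
        for (int i = 0; i < nrow; i++) {
            for (int j = 0; j < ncol; j++) {
                rows[i][j] = columnMajor[i + j * nrow];
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // array(1:4, dim = c(2, 2)) is stored by R as the vector 1, 2, 3, 4
        int[][] square = toRows(new int[] {1, 2, 3, 4}, 2, 2);
        System.out.println(Arrays.deepToString(square));   // [[1, 3], [2, 4]]
    }
}
```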
Calling Java methods using the rJava package
rJava provides two levels of API:
- fast, but inflexible low-level JNI-API in the form of the .jcall function
- convenient (at the cost of performance) high-level reflection API based on the $ operator.
In practice, there are three ways available to us from the rJava package enabling us to call Java methods, each of them with their positives and negatives.
The low-level way - .jcall()
.jcall(obj, returnSig = "V", method, ...) calls a Java method with the supplied arguments the “low-level” way. A few important notes regarding the usage, for more refer to the R help on .jcall:
- requires exact match of argument and return types, doesn’t perform any lookup in the reflection tables
- passing sub-classes of the classes present in the method definition requires explicit casting using .jcast
- passing null arguments needs a proper class specification with .jnull
- vector of length 1 corresponding to a native Java type is considered a scalar, use .jarray to pass a vector as array for safety
# Calling a Java method length on the object low-level way
.jcall(sHello, returnSig = "I", "length")
## [1] 12

# Also we must be careful with the data types:
# This works
.jcall(sHello, returnSig = "C", "charAt", 5L)
## [1] 32

# This does not
.jcall(sHello, returnSig = "C", "charAt", 5)
## Error in .jcall(sHello, returnSig = "C", "charAt", 5): method charAt with signature (D)C not found
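The two charAt calls differ only in the argument type: 5L is an R integer (Java int, signature (I)C), while 5 is an R double (signature (D)C), and String has no charAt(double). A rough Java analogue of such an exact-signature lookup is sketched below using java.lang.reflect. It only illustrates exact matching; .jcall itself goes through JNI rather than reflection. ExactSignature and hasExactMethod are hypothetical names.

```java
import java.lang.reflect.Method;

// Sketch only: ExactSignature and hasExactMethod are hypothetical names.
final class ExactSignature {

    // True if String has a public method with exactly this name and these parameter types.
    static boolean hasExactMethod(String name, Class<?>... parameterTypes) {
        try {
            Method m = String.class.getMethod(name, parameterTypes);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasExactMethod("charAt", int.class));     // true, signature (I)C
        System.out.println(hasExactMethod("charAt", double.class));  // false, signature (D)C
        // The character at index 5 of "Hello World!" is a space, whose code is 32
        System.out.println((int) "Hello World!".charAt(5));          // 32
    }
}
```

This also explains the 32 in the R output: the char returned by charAt comes back to R as its integer code.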
The high-level way - J()
J(class, method, ...) is the high-level API for accessing Java. It is slower than .jnew or .jcall since it has to use reflection to find the most suitable method.
- to call a method, the method argument must be present as a character vector of length 1
- if method is missing, J creates a class name reference
# Calling a Java method length on the object high-level way
J(sHello, "length")
## [1] 12

# Also, the high-level will not help here this way
J(sHello, "charAt", 5L)
## Error in .jcall(o, "I", "intValue"): method intValue with signature ()I not found

J(sHello, "charAt", 5)
## Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : java.lang.NoSuchMethodException: No suitable method for the given parameters
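To give a feel for why a reflection-based call costs more than an exact-signature call, the following is a deliberately simplified sketch of looking a method up by name and runtime argument types before invoking it. It is not rJava's actual lookup code and does not try to reproduce the errors shown above; ReflectiveCall, find and call are hypothetical names.

```java
import java.lang.reflect.Method;

// Sketch only: ReflectiveCall, find and call are hypothetical names, and this is
// not rJava's actual lookup code.
final class ReflectiveCall {

    // Returns the first public method with this name whose parameters accept the
    // arguments, or null if there is none. "First" is not necessarily "most specific".
    static Method find(Object target, String name, Object... args) {
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(name) && accepts(m.getParameterTypes(), args)) {
                return m;
            }
        }
        return null;
    }

    // Searches the method table on every call, which is the price of the convenience.
    static Object call(Object target, String name, Object... args) {
        Method m = find(target, name, args);
        if (m == null) {
            throw new IllegalArgumentException("No suitable method for the given parameters");
        }
        try {
            return m.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static boolean accepts(Class<?>[] types, Object[] args) {
        if (types.length != args.length) {
            return false;
        }
        for (int i = 0; i < types.length; i++) {
            if (!wrap(types[i]).isInstance(args[i])) {
                return false;
            }
        }
        return true;
    }

    // Maps the few primitive types handled by this sketch to their wrapper classes.
    private static Class<?> wrap(Class<?> t) {
        if (t == int.class) return Integer.class;
        if (t == double.class) return Double.class;
        if (t == char.class) return Character.class;
        if (t == boolean.class) return Boolean.class;
        if (t == long.class) return Long.class;
        return t;
    }

    public static void main(String[] args) {
        System.out.println(call("Hello World!", "length"));               // 12
        System.out.println(find("Hello World!", "charAt", 5.0) == null);  // true
    }
}
```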
The high-level way with convenience - $
Closely connected to the J function, the $ operator for jobjRef Java object references provides convenient access to object attributes and Java methods, and implements the relevant methods for R's completion generator.
- $ returns either the value of the attribute or calls a method, depending on which name matches first
- $<- assigns a value to the corresponding Java attribute
# And via the $ operator
sHello$length()
## [1] 12

# But these still do not work
sHello$charAt(5L)
## Error in .jcall(o, "I", "intValue"): method intValue with signature ()I not found

sHello$charAt(5)
## Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : java.lang.NoSuchMethodException: No suitable method for the given parameters
Examining methods and fields
.DollarNames returns all fields and methods associated with the object. Method names are followed by ( or () depending on arity:
# vector of all fields and methods associated with sHello
.DollarNames(sHello)
##  [1] "CASE_INSENSITIVE_ORDER" "equals("
##  [3] "toString()"             "hashCode()"
##  [5] "compareTo("             "compareTo("
##  [7] "indexOf("               "indexOf("
##  [9] "indexOf("               "indexOf("
## [11] "valueOf("               "valueOf("
## [13] "valueOf("               "valueOf("
## [15] "valueOf("               "valueOf("
## [17] "valueOf("               "valueOf("
## [19] "valueOf("               "length()"
## [21] "isEmpty()"              "charAt("
## [23] "codePointAt("           "codePointBefore("
## [25] "codePointCount("        "offsetByCodePoints("
## [27] "getChars("              "getBytes()"
## [29] "getBytes("              "getBytes("
## [31] "getBytes("              "contentEquals("
## [33] "contentEquals("         "equalsIgnoreCase("
## [35] "compareToIgnoreCase("   "regionMatches("
## [37] "regionMatches("         "startsWith("
## [39] "startsWith("            "endsWith("
## [41] "lastIndexOf("           "lastIndexOf("
## [43] "lastIndexOf("           "lastIndexOf("
## [45] "substring("             "substring("
## [47] "subSequence("           "concat("
## [49] "replace("               "replace("
## [51] "matches("               "contains("
## [53] "replaceFirst("          "replaceAll("
## [55] "split("                 "split("
## [57] "join("                  "join("
## [59] "toLowerCase("           "toLowerCase()"
## [61] "toUpperCase()"          "toUpperCase("
## [63] "trim()"                 "toCharArray()"
## [65] "format("                "format("
## [67] "copyValueOf("           "copyValueOf("
## [69] "intern()"               "wait("
## [71] "wait("                  "wait()"
## [73] "getClass()"             "notify()"
## [75] "notifyAll()"            "chars()"
## [77] "codePoints()"
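A comparable listing can be produced on the Java side with reflection. The sketch below follows the same naming convention as the output above (a trailing () for methods without parameters, a trailing ( otherwise); it is an illustration of that convention, not the code behind .DollarNames, and the exact entries will vary with the JDK version. MemberNames is a hypothetical name.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Sketch only: MemberNames is a hypothetical name.
final class MemberNames {

    // Lists public fields first, then public methods; methods without parameters
    // end in "()", all other methods end in "(".
    static List<String> of(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Field f : cls.getFields()) {
            names.add(f.getName());
        }
        for (Method m : cls.getMethods()) {
            names.add(m.getName() + (m.getParameterCount() == 0 ? "()" : "("));
        }
        return names;
    }

    public static void main(String[] args) {
        of(String.class).forEach(System.out::println);
    }
}
```

Overloaded methods appear once per overload, which is why names such as valueOf( repeat in the R output as well.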
Signatures in JNI notation
Java Type | Signature |
---|---|
boolean | Z |
byte | B |
char | C |
short | S |
int | I |
long | J |
float | F |
double | D |
type[] | [ type |
method type | ( arg-types ) ret-type |
fully-qualified-class | Lfully-qualified-class ; |
In the fully-qualified-class row of the table above, note the L prefix and the ; suffix.
For example
- the Java method: long f (int n, String s, int[] arr);
- has type signature: (ILjava/lang/String;[I)J
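When in doubt about a signature, the JDK can generate it. JVM method descriptors use the same notation as the table above, so a small sketch based on java.lang.invoke.MethodType can be used to check a hand-written signature. JniSignature is a hypothetical helper name used only here.

```java
import java.lang.invoke.MethodType;

// Sketch only: JniSignature is a hypothetical helper name.
final class JniSignature {

    // Builds the signature string from a return type followed by the parameter types.
    static String of(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // long f (int n, String s, int[] arr);
        System.out.println(of(long.class, int.class, String.class, int[].class));
        // prints (ILjava/lang/String;[I)J
    }
}
```

The part after the closing parenthesis is what goes into the returnSig argument of .jcall, for example I for int or C for char.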
References
- rJava basic crashcourse - at the rJava site on rforge, scroll down to the Documentation section
- The JNI Type Signatures - at Oracle JNI specs
- rJava documentation on CRAN
- Calling Java code from R by prof. Darren Wilkinson
- Mapping of types between Java (JNI) and native code
- Fixing issues with loading rJava