The FeatureTest Java class is designed to showcase the main aspects of
the R6 Generator Maven Plugin, and serves as a quick guide for Java
programmers wishing to use the plugin. The source of the FeatureTest
class is shown below, where the Java annotations @RClass and @RMethod
tag a class, and specific methods in that class, for use in R. The code
structure, parameters and return values of the tagged classes and
methods are used to create an equivalent R6 class structure in an R
library. In general, Javadoc comments and tags are used to document the
library. Where there are no applicable tags, specific fields in the
@RClass and @RMethod annotations can be used to specify required
imports, suggested R package dependencies, and example code.
/**
* A test of the R6 generator templating
*
* The feature test should allow mathjax in javadoc
*
* $$e = mc^2$$
*
*
* this is a details comment
* @author Rob Challen rob@terminological.co.uk 0000-0002-5504-7768
*
*/
@RClass(
imports = {"ggplot2","readr","dplyr","tibble"},
suggests = {"roxygen2","devtools","here","tidyverse"},
exampleSetup = {
"J = JavaApi$get()"
},
testSetup = {
"J = JavaApi$get()"
}
)
public class FeatureTest {
String message;
static Logger log = LoggerFactory.getLogger(FeatureTest.class);
/**
* A maximum of one constructor of any signature can be used. <br>
*
* If different constructors are needed then they may be used but not
* included in the R Api (i.e. not annotated with @RMethod.) <br>
*
* Static factory methods can be used instead.
* @param logMessage - a message which will be logged
*/
@RMethod(examples = {
"minExample = J$FeatureTest$new('Hello from Java constructor!')"
})
public FeatureTest(String logMessage) {
log.info(logMessage);
this.message = logMessage;
}
...
@RFinalize
public void close() {
log.info("The FeatureTest finalizer is called when the R6 object goes out of scope");
throw new RuntimeException("Errors from the finalizer are ignored");
}
@RMethod
public static RCharacter collider(RCharacter message1, RCharacter message2) {
return RConverter.convert("feature test: "+message1.toString()+message2.toString());
}
}
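The annotation-driven approach above can be illustrated with a small stand-alone sketch. It does not show how the plugin itself discovers annotations, which is not covered here. The `ExposeToR` annotation and the `Greeter` and `AnnotationScan` classes are hypothetical names invented for this illustration; they only show how a marker annotation separates the methods that belong to a generated API from those that do not.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for @RMethod; not the plugin's real annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD})
@interface ExposeToR {
    String[] examples() default {};
}

class Greeter {
    @ExposeToR(examples = {"g$hello()"})
    public String hello() { return "hello"; }

    // Not annotated, so it would be left out of a generated API.
    public String internalOnly() { return "hidden"; }
}

class AnnotationScan {
    // Returns the sorted names of the methods carrying the marker annotation.
    static List<String> exposedMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(ExposeToR.class)) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(exposedMethods(Greeter.class));
    }
}
```

Only `hello` is reported, which mirrors the rule that un-annotated constructors and methods stay out of the R api.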
The packaging of this class into an R library is described elsewhere.
The package name (in this case testRapi), the directory of the library
(in this example ~/Git/r6-generator-maven-plugin-test/r-library/) and
other metadata such as author and license details are defined in the
Maven plugin configuration (in a file named pom.xml). This
configuration is described in detail elsewhere. For the purposes of
this guide we assume the Java code has been compiled, generating the
testRapi R package which is ready for installation.
The generated R package can be installed into R in more or less the
same way as any other R library, depending on how it is deployed.
Typical scenarios would be pushing the whole Java project to Github and
installing to R from Github using
devtools::install_github()
, installing directly from the
local filesystem, with devtools::install()
, or submitting
the R library sub-directory as a project to CRAN and installing from
there, using install.packages()
.
# not run
# remove installed versions
try(detach("package:testRapi", unload = TRUE),silent = TRUE)
remove.packages("testRapi")
rm(list = ls())
Restarting R may also be required if there was a running Java VM.
# locally compiled
devtools::install("~/Git/r6-generator-docs", upgrade = "never")
# pushed to github
# devtools::install_github("terminological/r6-generator-docs", upgrade = "never")
# submitted to CRAN
# install.packages("testRapi")
The R6 api to the Java classes requires a running instance of the
Java Virtual Machine and JNI bridge provided by rJava. It also
requires Java classpath dependencies to be loaded and application
logging to be initialised. This is all managed by a specific generated
R6 class called JavaApi
and creating a singleton instance
of this is the first step to using the library in R. In these examples
the singleton instance J
is referred to as the “root” of
the api, as all the functions of the API stem from it.
J = testRapi::JavaApi$get(logLevel = "WARN")
J$changeLogLevel("DEBUG")
J$.log$debug("prove the logger is working and outputting debug statements...")
J$printMessages()
## prove the logger is working and outputting debug statements...
Using the FeatureTest class above requires creating a new instance of
the class. This is done through the root of the api as follows, and the
FeatureTest constructor simply logs the logMessage parameter’s value.
## Hello world. Creating a new object
/**
* A hello world function
*
* More detailed description
*
* @return this java method returns a String
*/
@RMethod(examples = {
"minExample = J$FeatureTest$new('Hello, R World!')",
"minExample$doHelloWorld()"
})
public RCharacter doHelloWorld() {
return RConverter.convert("Hello world from Java!");
}
/**
* Add some numbers (1).
*
* The doSum function description = it adds two numerics
* @param a the A parameter, can be NA
* @param b the B parameter
* @return A+B of course, NAs in inputs are converted to null in Java. This catches the resulting NPE in java idiom and returns an explicit NA.
* This only matters if you care about the difference between NA_real_ and NaN in R.
*/
@RMethod(tests = {
"minExample = J$FeatureTest$new('Hello from Java constructor!')",
"result = minExample$doSum(2,7.5)",
"testthat::expect_equal(result,9.5)"
})
public RNumeric doSum(RNumeric a, RNumeric b) {
try {
return RConverter.convert(a.get()+b.get());
} catch (NullPointerException e) {
log.info("Java threw a NPE - could have had a NA input?");
return RNumeric.NA;
}
}
/**
* Adds some numbers
*
* Do sum 2 uses native ints rather than RNumerics
* It should throw an error if given something that cannot be coerced to an integer.
* This also demonstrates the use of the `@RDefault` annotation
* @param a the A parameter
* @param b the B parameter
* @return A+B of course
*/
@RMethod
public int doSum2(int a, @RDefault(rCode = "10") int b) {
return a+b;
}
/**
* Static methods are also supported.
*
* These are accessed through the
* root of the R api, and as a functional interface
*
* @param message a message
*/
@RMethod(examples = {
"J = JavaApi$get()",
"J$FeatureTest$demoStatic('Ola, el mundo')",
"demo_static('Bonjour, le monde')"
})
public static void demoStatic(String message) {
log.info(message);
}
The FeatureTest.doHelloWorld() method takes no
arguments and returns a value to R. A detailed discussion of R and Java
data types is to be found elsewhere but our approach has involved
developing a specific set of Java datatypes that have close
relationships to the native R datatypes. This enables loss-less round
tripping of data from R to Java and back again, but requires the mapping
of Java data types to R. This is handled by the
uk.co.terminological.rjava.RConverter
class which provides
a range of datatype transformers, and the
uk.co.terminological.rjava.types.*
classes which specify
Java equivalents to R data types. These are needed as R’s dynamic
datatypes contain concepts which are not readily represented in the
primitive Java datatypes that are transferred across the JNI. Thus some
marshaling is required on both sides to ensure translation is 100%
accurate, including for example, conversion of R logical vectors
containing NA values, to Java List<Boolean>
via JNI
primitive arrays, or support for typed NA values
(e.g. NA_integer_ versus the logical NA).
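To make the marshalling problem concrete, here is a minimal sketch of carrying a logical vector with missing values through a primitive array. It is not the RConverter implementation. The class name, method names and the choice of sentinel are assumptions for illustration: R's own integer NA is the minimum 32-bit integer, but whether the plugin's bridge uses the same encoding is not stated here.

```java
import java.util.ArrayList;
import java.util.List;

class LogicalVectorSketch {
    // Assumption: NA is encoded as Integer.MIN_VALUE in the primitive array.
    static final int NA_SENTINEL = Integer.MIN_VALUE;

    // Primitive array to boxed list: the sentinel becomes null, 0 is false, anything else true.
    static List<Boolean> fromRaw(int[] raw) {
        List<Boolean> out = new ArrayList<>(raw.length);
        for (int v : raw) {
            out.add(v == NA_SENTINEL ? null : Boolean.valueOf(v != 0));
        }
        return out;
    }

    // Boxed list back to a primitive array: null becomes the sentinel again.
    static int[] toRaw(List<Boolean> values) {
        int[] out = new int[values.size()];
        for (int i = 0; i < out.length; i++) {
            Boolean b = values.get(i);
            out[i] = (b == null) ? NA_SENTINEL : (b ? 1 : 0);
        }
        return out;
    }
}
```

A primitive `boolean[]` has no room for a third state, which is why a wider primitive type plus a boxed Java-side representation is needed for a loss-less round trip.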
The doHelloWorld() function returns a character vector. The doSum()
function expects 2 R numeric values and seamlessly handles both
datatype coercion and NA values.
## [1] "Hello world from Java!"
## [1] "character"
## [1] 7.1
## [1] "numeric"
## Java threw a NPE - could have had a NA input?
## [1] NA
## Java threw a NPE - could have had a NA input?
## [1] "numeric"
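The NA handling seen in this output follows the idiom used in doSum(): an NA arrives as a Java null, unboxing it throws a NullPointerException, and the method returns an explicit NA. The sketch below reproduces that idiom with plain `java.lang.Double` instead of RNumeric, so `null` stands in for NA; the class name is hypothetical.

```java
class NullableSum {
    // null stands in for R's NA; unboxing a null Double throws NullPointerException.
    static Double sum(Double a, Double b) {
        try {
            return a + b;
        } catch (NullPointerException e) {
            return null; // propagate the missing value explicitly instead of failing
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(2.0, 7.5));
        System.out.println(sum(null, 7.5));
    }
}
```

An explicit null check would work just as well; catching the exception simply keeps the arithmetic line uncluttered.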
Wrapping and unwrapping every datatype is inconvenient for the Java
programmer so some single valued primitive types are supported as
parameters and return types of Java functions, particularly
int
, char
, double
,
boolean
, and java.lang.String
, but these come
with constraints on use, particularly around NA values in R, and use in
asynchronous code.
## [1] 7
## [1] "integer"
## [1] 7
## [1] "integer"
## Error in self$.api$.toJava$int(b) : not an integer
## Error in self$.api$.toJava$int(b) : cant use NA as input to java int
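The two errors above come from the generated R code, which refuses to pass a value to a Java `int` when it is not a whole number or is NA. The same constraint can be sketched on the Java side. This is a hypothetical helper, not part of the plugin, and it uses a nullable `Double` to represent a value that may be missing.

```java
class StrictIntCoercion {
    // null (or NaN) stands in for an R NA; a Java int cannot represent it.
    static int toJavaInt(Double value) {
        if (value == null || value.isNaN()) {
            throw new IllegalArgumentException("cannot use NA as input to a Java int");
        }
        // Reject fractional values and values outside the int range.
        if (value != Math.rint(value) || Math.abs(value) > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("not an integer: " + value);
        }
        return value.intValue();
    }
}
```

Failing loudly here is the price of using primitives: the wrapped types can carry NA through the call, whereas a primitive parameter can only reject it.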
Default values in R are demonstrated here with the @RDefault
annotation, whose rCode field holds a string of valid R code producing
the value to use as the default when this method is called from R. Any
valid R code that produces an input that can be coerced to the correct
type is allowed here, but string values must be double quoted and
double escaped if need be. (I.e. the R string
hello...<newline>...world would be "hello...\n...world" in R so must be
given as @RDefault(rCode="\"hello...\\n...world\"") here in an
annotation).
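The two layers of escaping are easier to see when the Java string is printed. The sketch below uses a plain constant rather than the annotation, so it runs without the plugin; the class name is hypothetical.

```java
class EscapeLayers {
    // Java source form: the inner quotes and the backslash are each escaped once.
    static final String R_CODE = "\"hello...\\n...world\"";

    public static void main(String[] args) {
        // Prints the text that would be handed to R as code: the quotes are part
        // of the text, and backslash-n is still two characters for R to interpret.
        System.out.println(R_CODE);
    }
}
```

The Java compiler consumes one level of escaping and the R parser consumes the second, which is why a single `\n` in the annotation would reach R as a literal line break inside the string instead.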
Static Java methods are also supported. R6 does not have a concept of
static methods, so to get the same look and feel as the object interface
in Java, we use the root of the JavaApi as a place to hold the static
methods. This enables auto-completion for static methods. In this
example the static method demoStatic returns nothing (an R NULL is
invisibly returned), but logs its input.
## Hello, static world, in a Java-like interface.
As static methods are stateless they can also be exposed as regular package functions, which provide exactly the same functionality as the object-style call above. For this to work all the static functions declared in the API must have different names. At the moment it is up to the developer to ensure this, although at some point I will add a check for it. To differentiate the object style of call above from the function style more common in R packages, static Java method names are converted from camel case to snake case. The same call as above in the functional style is therefore as follows. Both functional and object oriented interfaces are generated for all static methods:
## Hello, static world, in a more regular R-like interface.
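The naming rule and the uniqueness requirement can be sketched as follows. The conversion shown is one plausible way to turn camel case into snake case and is an assumption, not the plugin's actual implementation; the `clashes` helper is likewise hypothetical and illustrates the kind of check that is currently left to the developer.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class SnakeCase {
    // Insert an underscore at each lower-to-upper boundary, then lower-case the result.
    static String toSnakeCase(String camel) {
        return camel.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase(Locale.ROOT);
    }

    // Returns the snake_case names that would be generated more than once.
    static Set<String> clashes(List<String> javaNames) {
        Set<String> seen = new LinkedHashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String name : javaNames) {
            String snake = toSnakeCase(name);
            if (!seen.add(snake)) {
                duplicates.add(snake);
            }
        }
        return duplicates;
    }
}
```

For example, two classes that each declare a static `create` method would both map to a package function called `create`, so one of them would need renaming.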
The generated API has support for the loss-less bi-directional transfer of a range of R data types into Java and back to R. Extensive tests are available elsewhere but in general support for vectors, dataframes and lists is mostly complete, including factors, but matrices and arrays are not yet implemented. Dataframes with named rows are also not yet supported. Dataframes as well as other objects can be serialised in Java and de-serialised. This serialisation has been done for the ggplot2::diamonds data set, and the resulting de-serialisation shown here. Factor levels and ordering are preserved when the factor is part of a vector or dataframe.
/**
* Consumes a data frame and logs its length
* @param dataframe a dataframe
*/
@RMethod
public void doSomethingWithDataFrame(RDataframe dataframe) {
log.info("dataframe length: "+dataframe.nrow());
}
/**
* Creates a basic dataframe and returns it
* @return a dataframe
*/
@RMethod
public RDataframe generateDataFrame() {
RDataframe out = new RDataframe();
for (int i=0; i<10; i++) {
Map<String,Object> tmp = new LinkedHashMap<String,Object>();
tmp.put("index", i);
tmp.put("value", 10-i);
out.addRow(tmp);
}
return out;
}
/**
* The ggplot2::diamonds dataframe
*
* A copy serialised into java, using
* RObject.writeRDS, saved within the jar file of the package, and exposed here
* using RObject.readRDS.
* @return the ggplot2::diamonds dataframe
* @throws IOException if the serialised data file could not be found
*/
@RMethod(
examples = {
"dplyr::glimpse( diamonds() )"
},
tests = {
"testthat::expect_equal(diamonds(), ggplot2::diamonds)"
}
)
public static RDataframe diamonds() throws IOException {
InputStream is = FeatureTest.class.getResourceAsStream("/diamonds.ser");
if(is==null) throw new IOException("Could not locate /diamonds.ser");
return RObject.readRDS(RDataframe.class, is);
}
The basic smoke tests of this are as follows:
## dataframe length: 53940
## Rows: 10
## Columns: 2
## $ index <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
## $ value <int> 10, 9, 8, 7, 6, 5, 4, 3, 2, 1
## Rows: 53,940
## Columns: 10
## $ carat <dbl> 0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24, 0.26, 0.22, 0.23, 0.…
## $ cut <ord> Ideal, Premium, Good, Premium, Good, Very Good, Very Good, Ver…
## $ color <ord> E, E, E, I, J, J, I, H, E, H, J, J, F, J, E, E, I, J, J, J, I,…
## $ clarity <ord> SI2, SI1, VS1, VS2, SI2, VVS2, VVS1, SI1, VS2, VS1, SI1, VS1, …
## $ depth <dbl> 61.5, 59.8, 56.9, 62.4, 63.3, 62.8, 62.3, 61.9, 65.1, 59.4, 64…
## $ table <dbl> 55, 61, 65, 58, 58, 57, 57, 55, 61, 61, 55, 56, 61, 54, 62, 58…
## $ price <int> 326, 326, 327, 334, 335, 336, 336, 337, 337, 338, 339, 340, 34…
## $ x <dbl> 3.95, 3.89, 4.05, 4.20, 4.34, 3.94, 3.95, 4.07, 3.87, 4.00, 4.…
## $ y <dbl> 3.98, 3.84, 4.07, 4.23, 4.35, 3.96, 3.98, 4.11, 3.78, 4.05, 4.…
## $ z <dbl> 2.43, 2.31, 2.31, 2.63, 2.75, 2.48, 2.47, 2.53, 2.49, 2.39, 2.…
if (identical(J$FeatureTest$diamonds(), ggplot2::diamonds)) {
message("PASS: round tripping ggplot2::diamonds including java serialisation and deserialisation works")
} else {
stop("FAIL: serialised diamonds from Java should be identical to the ggplot source")
}
## PASS: round tripping ggplot2::diamonds including java serialisation and deserialisation works
The generated R6 code can handle return of Java objects to R, as long
as they are a part of the api and annotated with @RClass. A common use
case for this is fluent APIs, where the Java object is manipulated by a
method and returns itself.
/**
* Get the message
*
* message description
* @return The message (previously set by the constructor)
*/
@RMethod
public RCharacter getMessage() {
return RConverter.convert(message);
}
/**
* Set a message in a fluent way
*
* A fluent method which updates the message in this object, returning the
* same object. This is differentiated from factory methods which produce a new
* instance of the same class by checking to see if the returned Java object is equal
* to the calling Java object.
* @param message the message is a string
* @return this should return exactly the same R6 object.
*/
@RMethod
public FeatureTest fluentSetMessage(@RDefault(rCode = "\"hello\nworld\"") RCharacter message) {
this.message = message.toString();
return this;
}
The JavaApi
root manages R’s perspective on the identity
of objects in Java. This allows for fluent api methods, and method
chaining. This is not flawless but should work for most common
scenarios. It is possible that complex edge cases may appear equal in
Java but not identical in R, so true equality should rely on the Java
equals()
method.
## [1] "Hello world. Creating a new object"
## [1] "Hello world. updating message."
if(identical(feat1,feat2)) {
message("PASS: the return value of a fluent setter returns the same object as the original")
} else {
print(feat1$.jobj)
print(feat2$.jobj)
print(feat1$.jobj$equals(feat2$.jobj))
stop("FAIL: these should have been identical")
}
## PASS: the return value of a fluent setter returns the same object as the original
if (feat1$equals(feat2)) {
message("PASS: java based equality detection is supported")
} else {
stop("FAIL: these should have been equal")
}
## PASS: java based equality detection is supported
## [1] "Hello world. updating message."
# Operations on feat2 are occurring on feat1 as they are the same underlying object
feat2$fluentSetMessage("Hello world. updating message again.")
feat1$getMessage()
## [1] "Hello world. updating message again."
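The distinction the generated code has to draw, between a fluent method that returns the calling object and a factory method that returns a new one, comes down to reference identity on the Java side. The following stand-alone sketch shows that distinction with a hypothetical class that does not depend on the plugin.

```java
class FluentMessage {
    private String message;

    FluentMessage(String message) { this.message = message; }

    // Fluent setter: mutates this object and returns the same reference.
    FluentMessage setMessage(String message) {
        this.message = message;
        return this;
    }

    // Factory-style method: returns a new, distinct instance.
    FluentMessage copyWith(String message) {
        return new FluentMessage(message);
    }

    String getMessage() { return message; }

    public static void main(String[] args) {
        FluentMessage a = new FluentMessage("first");
        FluentMessage b = a.setMessage("second");
        System.out.println(a == b);                     // same underlying object
        System.out.println(a.getMessage());             // the update is visible through a
        System.out.println(a == a.copyWith("third"));   // a factory result is a different object
    }
}
```

This is the same behaviour as feat1 and feat2 above: two names, one underlying object.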
Factory methods allow Java methods to create and return Java objects.
This is supported as long as the objects are a part of the api and
annotated with @RClass. Arbitrary Java objects are not supported as
return types and Java code that tries to return such objects will throw
an exception during the Maven packaging phase. This is by design to
enforce formal contracts between the Java code and the R api. If you
want dynamic manipulation of the Java objects then the jsr223 plugin is
more appropriate for you.
/**
* A factory or builder method which constructs an object of another class from some parameters
* @param a the first parameter
* @param b the second parameter
* @return A MoreFeatureTest R6 reference
*/
@RMethod
public MoreFeatureTest factoryMethod(RCharacter a, @RDefault(rCode = "as.character(Sys.Date())") RCharacter b) {
return new MoreFeatureTest(a,b);
}
@RMethod
public String objectAsParameter(MoreFeatureTest otherObj) {
return otherObj.toString();
}
This Java code refers to another class, MoreFeatureTest, which has the
following basic structure:
/**
* This has no documentation
*/
@RClass
public class MoreFeatureTest {
String message1;
String message2;
static Logger log = LoggerFactory.getLogger(MoreFeatureTest.class);
/**
* the first constructor is used if there are none annotated
* @param message1 - the message to be printed
* @param message2 - will be used for toString
*/
public MoreFeatureTest(RCharacter message1, RCharacter message2) {
this.message1 = message1.toString();
this.message2 = message2.toString();
log.info("constuctor: {}, {}",this.message1, this.message2);
}
/** A static object constructor
* @param message1 - the message to be printed
* @param message2 - will be used for toString
* @return A MoreFeatureTest R6 object
*/
@RMethod(examples = {
"J = JavaApi$get()",
"J$MoreFeatureTest$create('Hello,',' World')"
})
public static MoreFeatureTest create(RCharacter message1, RCharacter message2) {
return new MoreFeatureTest(message1,message2);
}
public String toString() {
return "toString: "+message2;
}
...
}
The FeatureTest.factoryMethod(a,b)
method allows us to
construct instances of another class. This enables builder patterns in
the R api. The MoreFeatureTest.create(message1,message2)
method demonstrates static factory methods, which return instances of
the same class. Static methods are implemented as methods in the
JavaApi
root, as demonstrated here, and accessed through
the root object J
:
## constuctor: Hello, World
# static factory method accessed through the root of the API
moreFeat2 = J$MoreFeatureTest$create("Ola","El Mundo")
## constuctor: Ola, El Mundo
## [1] "toString: World"
The logging sub-system is based on slf4j
with a
log4j2
implementation. These are specified in the
r6-generator-runtime
dependency pom.xml
, so
anything that imports that will have them as a transitive dependency.
These are needed as dynamic alteration of the logging level from R is
dependent on implementation details of log4j. It may be possible to
remove this dependency in the future.
Exceptions thrown from Java are handled in the same way as in rJava,
and printed messages are seen on the R console as expected. However
rJava does something strange to messages from System.out that means
they do not appear in knitr output. To resolve this an unsightly
workaround (hack) has been adopted that collects messages from
System.out and prints them after the Java method has completed. This
has the potential to cause all sorts of issues, which I think I have
mostly resolved, but it is best described as a work in progress.
The logging level can be controlled at runtime by a function in the
JavaApi
root. Logging can be configured dynamically with a
log4j
properties file (not shown) to enable file based
logging, for example.
@RMethod
public void testLogging() {
log.error("An error");
log.warn("A warning");
log.info("A info");
log.debug("A debug");
log.trace("A trace");
}
@RMethod
public RCharacter throwCatchable() throws Exception {
throw new Exception("A catchable exception has been thrown");
}
@RMethod
public void printMessage() {
System.out.println("Printed in java: "+message1+" "+message2);
}
@RMethod
public RCharacter throwRuntime() {
throw new RuntimeException("A runtime exception has been thrown");
}
## Printed in java: Hello World
## An error
## A warning
## A info
## A debug
## A trace
# Suppressing errors
try(moreFeat1$throwCatchable(),silent = TRUE)
# Handling errors
tryCatch(
{
moreFeat1$throwRuntime()
},
error = function(e) {
message("the error object has a set of classes: ",paste0(class(e),collapse=";"))
warning("error: ",e$message)
# the e$jobj entry gives native access to the throwable java object thanks to rJava.
e$jobj$printStackTrace()
},
finally = print("finally")
)
## the error object has a set of classes: RuntimeException;Exception;Throwable;Object;error;condition
## Warning in value[[3L]](cond): error: java.lang.RuntimeException: A runtime
## exception has been thrown
## [1] "finally"
## An error
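The Java part of the class vector printed above (RuntimeException;Exception;Throwable;Object) matches the superclass chain of the thrown exception. As an illustration of where such a list can come from, the sketch below walks that chain in plain Java. Whether rJava builds the condition classes in exactly this way is an assumption; the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ExceptionClasses {
    // Walks the superclass chain of a throwable, most specific class first.
    static List<String> classChain(Throwable t) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = t.getClass(); c != null; c = c.getSuperclass()) {
            names.add(c.getSimpleName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(classChain(new RuntimeException("A runtime exception has been thrown")));
    }
}
```

Because the specific class comes first, a handler in R can react to a particular Java exception type rather than to any error.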
The Java objects bound to R instances will stay in memory whilst they
are needed. When they go out of scope they should automatically be
garbage collected as a native feature of rJava
. R6 object
finalizers are also generated when specified by the code and these are
triggered during release of the Java objects, and may call any closing
code needed in the Java library (e.g. closing input streams etc.).
## [1] "Hello world from Java!"
When an object goes out of scope the finalizer will be called. This can happen much later, and any errors thrown by the finalizer code could cause issues. Code run in these finalizers can throw unchecked exceptions which are ignored and converted to logged errors.
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 1252694 67.0 1992255 106.4 1992255 106.4
## Vcells 2662904 20.4 8388608 64.0 7221368 55.1
The finalizer should also be called implicitly when the R6 object goes out of scope in R.
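The policy described above, where unchecked exceptions from a finalizer are ignored and turned into logged errors, can be sketched in a few lines. This is an illustration of the policy and not the generated code: `SafeFinalizer`, `runQuietly` and the in-memory `LOGGED` list (standing in for a real logger) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SafeFinalizer {
    // Collected messages stand in for a real logger in this sketch.
    static final List<String> LOGGED = new ArrayList<>();

    // Runs the cleanup action; unchecked exceptions are recorded and never rethrown.
    static boolean runQuietly(Runnable cleanup) {
        try {
            cleanup.run();
            return true;
        } catch (RuntimeException e) {
            LOGGED.add("finalizer error ignored: " + e.getMessage());
            return false;
        }
    }
}
```

Swallowing the exception matters because a finalizer runs at an unpredictable time, so there is no sensible caller to report the failure to.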
Debugging compiled Java code running in the context of R is not for
the faint-hearted. It definitely makes sense to test and debug the Java
code in Java first. To make this possible it is useful to be able to
serialise some test data in the exact format in which it will arrive in
Java from R. To that end all the Java structures supported can be
serialised, and de-serialised for testing purposes. The
testRapi
library presented here has a set of functions that
facilitate this as static methods of J$Serialiser
.
@RMethod
public static void serialiseDataframe(RDataframe dataframe, String filename) throws IOException {
// try-with-resources ensures the stream is closed even if writing fails
try (FileOutputStream fos = new FileOutputStream(filename)) {
dataframe.writeRDS(fos);
}
log.info("dataframe written to: "+filename);
}
@RMethod
public static RDataframe deserialiseDataframe(String filename) throws IOException {
InputStream is = Files.newInputStream(Paths.get(filename));
if(is==null) throw new IOException("Could not locate "+filename);
return RObject.readRDS(RDataframe.class, is);
}
s = tempfile(pattern = "diamonds", fileext = ".ser")
J$Serialiser$serialiseDataframe(dataframe = ggplot2::diamonds, filename = s)
J$Serialiser$deserialiseDataframe(filename=s) %>% glimpse()
## Rows: 53,940
## Columns: 10
## $ carat <dbl> 0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24, 0.26, 0.22, 0.23, 0.…
## $ cut <ord> Ideal, Premium, Good, Premium, Good, Very Good, Very Good, Ver…
## $ color <ord> E, E, E, I, J, J, I, H, E, H, J, J, F, J, E, E, I, J, J, J, I,…
## $ clarity <ord> SI2, SI1, VS1, VS2, SI2, VVS2, VVS1, SI1, VS2, VS1, SI1, VS1, …
## $ depth <dbl> 61.5, 59.8, 56.9, 62.4, 63.3, 62.8, 62.3, 61.9, 65.1, 59.4, 64…
## $ table <dbl> 55, 61, 65, 58, 58, 57, 57, 55, 61, 61, 55, 56, 61, 54, 62, 58…
## $ price <int> 326, 326, 327, 334, 335, 336, 336, 337, 337, 338, 339, 340, 34…
## $ x <dbl> 3.95, 3.89, 4.05, 4.20, 4.34, 3.94, 3.95, 4.07, 3.87, 4.00, 4.…
## $ y <dbl> 3.98, 3.84, 4.07, 4.23, 4.35, 3.96, 3.98, 4.11, 3.78, 4.05, 4.…
## $ z <dbl> 2.43, 2.31, 2.31, 2.63, 2.75, 2.48, 2.47, 2.53, 2.49, 2.39, 2.…
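For readers who want to try the write-then-read pattern without the plugin on the classpath, here is a minimal sketch using standard Java object serialisation. It is an assumption that this resembles what writeRDS and readRDS do internally; the `RoundTrip` class and the column-oriented map used as a stand-in for a dataframe are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

class RoundTrip {
    // Column-oriented stand-in for a small dataframe (not an RDataframe).
    static LinkedHashMap<String, List<Integer>> sample() {
        LinkedHashMap<String, List<Integer>> cols = new LinkedHashMap<>();
        List<Integer> index = new ArrayList<>();
        List<Integer> value = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            index.add(i);
            value.add(10 - i);
        }
        cols.put("index", index);
        cols.put("value", value);
        return cols;
    }

    // Serialises the data to a temporary file, reads it back, and removes the file.
    static Object roundTrip(Object data) {
        try {
            Path file = Files.createTempFile("roundtrip", ".ser");
            try {
                try (OutputStream os = Files.newOutputStream(file);
                     ObjectOutputStream oos = new ObjectOutputStream(os)) {
                    oos.writeObject(data);
                }
                try (InputStream is = Files.newInputStream(file);
                     ObjectInputStream ois = new ObjectInputStream(is)) {
                    return ois.readObject();
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a unit test the serialised file would be kept as a fixture, so that the Java code can be exercised with realistic input without starting R.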
With serialised test data, as dataframes, lists or named lists, Java
functions and unit tests can be developed that output values of the
correct RObject datatype. Correct packaging and integration with R is a
question of running mvn install to compile the Java into a jar file and
generate R library code, then using devtools::install to install the
generated R library.
As you iterate development I have found it necessary to install the
package and restart the session for R to pick up new changes in the
compiled Java files. There is probably a cleaner way to do this but I
haven’t found it yet.
# compile Java code and package R library using `mvn install` command
cd ~/Git/r6-generator-docs
mvn install
setwd("~/Git/r6-generator-docs")
# remove previously installed versions
try(detach("package:testRapi", unload = TRUE),silent = TRUE)
remove.packages("testRapi")
# rm(list = ls()) may be required to clear old versions of the library code
# Restarting R may also be required if there was a running Java VM, otherwise changes to the jars on the classpath are not picked up.
# install locally compiled R library:
devtools::install("~/Git/r6-generator-docs", upgrade = "never")
# N.B. devtools::load_all() does not appear to always successfully pick up changes in the compiled java code
For initial integration testing there is a debug flag in the Maven
pom.xml that enables remote Java debugging to be
initialized when the library is first loaded in R. When set to true a
Java debugging session on port 8998 is opened which can be connected to
as a remote Java application. This allows breakpoints to be set on Java
code and the state of the JVM to be inspected when Java code is executed
from R, however Java code changes cannot be hot-swapped into the running
JVM, and so debugging integration issues is relatively limited. For more
details see the Maven
configuration vignette.
There are other limitations with enabling Java debugging, not least being the potential for port conflicts with multiple running instances of the development library, and caching issues between running and loaded versions of the Java code. Whilst not too painful (compared to the alternatives) this is very definitely not a REPL experience and reserved for the last stage of debugging. Part of the purpose of strongly enforcing a datatype conversion contract between Java and R, and extensive use of code generation, is to decouple Java and R development as much as possible (N.B. do as I say - not as I do).
Java code that takes a long time to complete or requires interaction
from the user creates a problem for rJava
as the program
control is passed completely to Java during the code execution. This can
lock the R session until the Java code is finished. The fact that the R
session is blocked pending the result from Java means there is no
obvious way to terminate a running Java process from within R, and if a
Java process goes rogue then the R session hangs.
We have approached this by creating a RFuture
class
which is bundled in any R package built with
r6-generator-maven-plugin
, and some Java infrastructure to
allow a Java method call, initiated by R, to be run in its own thread.
The thread is monitored using the R6
RFuture
class. This allows instantaneous return from the Java call which
executes asynchronously in the background, freeing up the R process to
continue. The RFuture
class has functions to
cancel()
a thread, or check whether it is complete
(isDone()
), cancelled (isCanceled()
), or to
wait for the result and get()
it.
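The operations listed above have direct counterparts in the standard `java.util.concurrent.Future` interface. The sketch below uses only the JDK to show the pattern of an immediate return, a status check and a cancellation. It is not the RFuture implementation, and the `BackgroundCall` class is a hypothetical name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundCall {
    // Returns {done before cancel, cancel accepted, isCancelled, done after cancel}.
    static boolean[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        try {
            // submit() returns a handle immediately; the work runs on another thread.
            Future<String> handle = pool.submit(() -> {
                started.countDown();
                Thread.sleep(10_000); // stands in for long-running Java work
                return "completed";
            });
            started.await(); // make sure the task is running before inspecting it
            boolean doneEarly = handle.isDone();
            // cancel(true) interrupts the worker thread, which ends the sleep early.
            boolean cancelAccepted = handle.cancel(true);
            return new boolean[] {doneEarly, cancelAccepted, handle.isCancelled(), handle.isDone()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("done early: " + r[0] + ", cancelled: " + r[2]);
    }
}
```

Cancellation only works if the Java code responds to interruption, for example by calling a blocking method such as `Thread.sleep`, as the countdown examples in this guide do.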
The RFuture
thread wrapper is used for Java methods
annotated with @RAsync
instead of
@RMethod
.
int invocation = 0;
int timer = 10;
@RAsync(synchronise = true)
public RCharacter asyncCountdown() throws InterruptedException {
invocation = invocation + 1;
timer = 10;
String label = "Async and run thread safe "+invocation;
// This example deliberately uses a not thread
// safe design. However the synchronise=true annotation
// forces it to be synchronised on the feature test class.
// Progress in this thread can be recorded and displayed in R
// when `get()` is called on a result in progress. The total is
// not actually required
RProgressMonitor.setTotal(timer);
while (timer > 0) {
System.out.println(label+" ... "+timer);
Thread.sleep(1000);
timer--;
// This static method is keyed off the thread id so can be placed
// anywhere in code.
RProgressMonitor.increment();
}
RProgressMonitor.complete();
return RCharacter.from(label+" completed.");
}
@RAsync
public RCharacter asyncRaceCountdown() throws InterruptedException {
invocation = invocation + 1;
timer = 10;
String label = "Async and not thread safe "+invocation;
// This example deliberately uses a not thread
// safe design to demonstrate race conditions. These are the
// responsibility of the Java programmer to avoid.
RProgressMonitor.setTotal(timer);
while (timer > 0) {
System.out.println(label+" ... "+timer);
Thread.sleep(1000);
timer--;
RProgressMonitor.increment();
}
RProgressMonitor.complete();
return RCharacter.from(label+" completed.");
}
A basic test of this follows, which starts the execution of a 10 second countdown in Java:
# J = testRapi::JavaApi$get(logLevel = "WARN")
featAsyn = J$FeatureTest$new("Async testing")
# The asyncCountdown resets a timer in the FeatureTest class
tmp = featAsyn$asyncCountdown()
message("Control returned immediately.")
## Control returned immediately.
Sys.sleep(4)
# The countdown is not finished
if (tmp$isDone()){
stop("FAIL: Too soon for the countdown to have finished..!")
} else {
message("PASS: 4 seconds later the countdown is still running.")
}
## PASS: 4 seconds later the countdown is still running.
Sys.sleep(8)
if (!tmp$isDone()) {
stop("FAIL: It should have been finished by now!")
} else {
message("PASS: the countdown is finished.")
# getting the result with tmp$get() returns the value from Java
# and triggers printing of the cached Java output.
}
## PASS: the countdown is finished.
System output from asynchronous code can be very confusing if it
appears out of sequence to other code. The system output of Java code
running asynchronously is cached and only displayed when the result is
retrieved via get():
## Async and run thread safe 1 ... 10
## Async and run thread safe 1 ... 9
## Async and run thread safe 1 ... 8
## Async and run thread safe 1 ... 7
## Async and run thread safe 1 ... 6
## Async and run thread safe 1 ... 5
## Async and run thread safe 1 ... 4
## Async and run thread safe 1 ... 3
## Async and run thread safe 1 ... 2
## Async and run thread safe 1 ... 1
## [1] "Async and run thread safe 1 completed."
RFuture does not ensure thread safety, which in general is up to the
Java programmer. However, where a non thread safe class is used in an
@RAsync annotated method, there is a basic locking mechanism that
prevents multiple simultaneous calls of the same method in the same
object.
# Potential for race condition is prevented by the synchronise=true annotation
tmp = featAsyn$asyncCountdown()
tmp2 = featAsyn$asyncCountdown()
Sys.sleep(5)
if (tmp$cancel()) print("First counter cancelled.")
## [1] "First counter cancelled."
Although both counters were triggered at the same time the second one is waiting to obtain a lock. In this example we cancel the first call after 5 seconds:
## Async and run thread safe 2 ... 10
## Async and run thread safe 2 ... 9
## Async and run thread safe 2 ... 8
## Async and run thread safe 2 ... 7
## Async and run thread safe 2 ... 6
## Async and run thread safe 2 ... 5
## Error in tmp$get() :
## background call to asyncCountdown(...) was cancelled.
## user system elapsed
## 0.001 0.000 0.001
After which the second call starts processing. If you are running this interactively you will notice a progress indicator appears.
## Async and run thread safe 3 ... 10
## Async and run thread safe 3 ... 9
## Async and run thread safe 3 ... 8
## Async and run thread safe 3 ... 7
## Async and run thread safe 3 ... 6
## Async and run thread safe 3 ... 5
## Async and run thread safe 3 ... 4
## Async and run thread safe 3 ... 3
## Async and run thread safe 3 ... 2
## Async and run thread safe 3 ... 1
## user system elapsed
## 0.709 0.037 9.811
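The effect of the lock can be reproduced in plain Java. In this sketch two threads call the same countdown on a shared object, and a `synchronized` block makes the second call wait until the first has finished. The class and its fields are hypothetical, and the assumption that the plugin's synchronise option behaves like a per-object lock is based only on the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LockedCountdown {
    private final Object lock = new Object();
    private final List<String> events = Collections.synchronizedList(new ArrayList<>());
    private int timer; // shared state, only safe because every access holds the lock

    // Each call holds the lock for its whole countdown, so calls run one after another.
    void countdown(String label) {
        synchronized (lock) {
            timer = 3;
            while (timer > 0) {
                events.add(label + ":" + timer);
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    return;
                }
                timer--;
            }
        }
    }

    // Starts two countdowns at once on one object and returns the recorded events.
    static List<String> runTwo() {
        LockedCountdown shared = new LockedCountdown();
        Thread a = new Thread(() -> shared.countdown("a"));
        Thread b = new Thread(() -> shared.countdown("b"));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(shared.events);
    }
}
```

Whichever thread wins the lock, its three events are recorded together before the other thread's events begin, so each call sees a complete 3, 2, 1 sequence.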
If the default @RAsync(synchronise=false) is used then race conditions
may occur if the Java method changes the state of other objects. This
is demonstrated here, where both calls alter the same underlying
counter in alternation. As before, the output is only displayed when
the result is requested:
# Race conditions are possible here because the default synchronise=false is used
system.time({
tmp = featAsyn$asyncRaceCountdown()
tmp2 = featAsyn$asyncRaceCountdown()
tmp$get()
})
## Async and not thread safe 4 ... 10
## Async and not thread safe 4 ... 9
## Async and not thread safe 4 ... 7
## Async and not thread safe 4 ... 5
## Async and not thread safe 4 ... 3
## Async and not thread safe 4 ... 1
## user system elapsed
## 0.458 0.011 6.005
In this case the execution takes far less than 10 seconds, as both countdowns are running in parallel and using the same timer. The output from the second call is:
## Async and not thread safe 5 ... 10
## Async and not thread safe 5 ... 8
## Async and not thread safe 5 ... 6
## Async and not thread safe 5 ... 4
## Async and not thread safe 5 ... 2
## user system elapsed
## 0.003 0.000 0.003
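The interleaved output above can be reproduced deterministically. The sketch below is a single-threaded replay of one possible interleaving, not real concurrent code: it assumes the countdown method sets a shared timer field to 10 and then loops, printing, sleeping and decrementing, with the two calls taking turns at each sleep. The class SharedTimerReplay and its members are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class SharedTimerReplay {
    private int timer; // shared by every call on this object, as in the unsynchronised case

    // One in-flight call; step() runs the code between two sleeps.
    final class Call {
        final List<Integer> printed = new ArrayList<>();
        private boolean started = false;
        private boolean done = false;

        // Returns true while the call still has work to do.
        boolean step() {
            if (done) return false;
            if (!started) {
                timer = 10;      // first step: reset the shared timer
                started = true;
            } else {
                timer--;         // later steps: wake from sleep and decrement
            }
            if (timer > 0) {
                printed.add(timer);
            } else {
                done = true;     // loop condition fails, the call returns
            }
            return !done;
        }
    }

    // Alternates the two calls step by step and returns what each one printed.
    static List<List<Integer>> replay() {
        SharedTimerReplay shared = new SharedTimerReplay();
        Call first = shared.new Call();
        Call second = shared.new Call();
        boolean active = true;
        while (active) {
            boolean a = first.step();
            boolean b = second.step();
            active = a || b;
        }
        return List.of(first.printed, second.printed);
    }

    public static void main(String[] args) {
        List<List<Integer>> out = replay();
        System.out.println("first call:  " + out.get(0)); // [10, 9, 7, 5, 3, 1]
        System.out.println("second call: " + out.get(1)); // [10, 8, 6, 4, 2]
    }
}
```

Because both calls decrement the same field, each sees only every other value, and together they finish in roughly half the time one call would take on its own.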
The RFuture class is also useful to prevent lock-ups due to Java code
entering an infinite loop or waiting on external input that never
arrives. Sometimes blocking the R process is useful, as long as the Java
process can be terminated at the same time as the R process, so that we
can be sure that a Java process is finished. This is supported by the
@RBlocking annotation, which places the Java method call in a thread
that can be cleanly interrupted from R, but otherwise makes R wait for
Java to finish.
@RBlocking
public RCharacter blockingCountdown() throws InterruptedException {
    invocation = invocation + 1;
    timer = 10;
    String label = "Blocking "+invocation;
    RProgressMonitor.setTotal(timer);
    while (timer > 0) {
        System.out.println(label+" ... "+timer);
        Thread.sleep(1000);
        timer--;
        RProgressMonitor.increment();
    }
    RProgressMonitor.complete();
    return RCharacter.from(label+" completed.");
}
## Blocking 6 ... 10
## Blocking 6 ... 9
## Blocking 6 ... 8
## Blocking 6 ... 7
## Blocking 6 ... 6
## Blocking 6 ... 5
## Blocking 6 ... 4
## Blocking 6 ... 3
## Blocking 6 ... 2
## Blocking 6 ... 1
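The general pattern of running work in a separate thread while the caller waits, and passing an interruption on to that thread, can be sketched with the standard java.util.concurrent classes. This is an illustration of the idea only and makes no claim about how @RBlocking is implemented; the class InterruptibleBlockingCall and the method callBlocking are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class InterruptibleBlockingCall {

    // Runs the work on a worker thread and blocks the caller until it finishes.
    // If the caller is interrupted while waiting, the worker is interrupted too.
    static String callBlocking(Callable<String> work) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<String> future = worker.submit(work);
        try {
            return future.get();                    // the caller waits here
        } catch (InterruptedException e) {
            future.cancel(true);                    // interrupt the worker as well
            Thread.currentThread().interrupt();     // keep the caller's interrupt flag set
            throw new CancellationException("blocking call was interrupted");
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            worker.shutdownNow();                   // never leave the worker thread behind
        }
    }

    public static void main(String[] args) {
        String result = callBlocking(() -> {
            for (int timer = 3; timer > 0; timer--) {
                System.out.println("Blocking ... " + timer);
                Thread.sleep(100);
            }
            return "Blocking completed.";
        });
        System.out.println(result);
    }
}
```

The caller never runs the long task itself, so an interrupt can always be delivered to it and forwarded, which is what allows both sides to stop together.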
Static methods are more likely to be thread safe. Async methods can be static, in which case there is no potential for race conditions and we don’t need to check for them.
@RAsync
public static RCharacter asyncStaticCountdown(RCharacter label, @RDefault(rCode = "10") RInteger rtimer) throws InterruptedException {
    // N.B. inputs to Async methods cannot be Java primitives
    int timer = rtimer.javaPrimitive();
    RProgressMonitor.setTotal(timer);
    while (timer > 0) {
        System.out.println(label.get()+" ... "+timer);
        Thread.sleep(1000);
        timer--;
        RProgressMonitor.increment();
    }
    RProgressMonitor.complete();
    return RCharacter.from(label+" completed.");
}
@RAsync
public static FactoryTest asyncFactory() throws InterruptedException {
    Thread.sleep(5000);
    return new FactoryTest();
}
# debug(J$FeatureTest$asyncStaticCountdown)
tmp = J$FeatureTest$asyncStaticCountdown("hello 1",4)
tmp2 = J$FeatureTest$asyncStaticCountdown("hello 2",4)
Sys.sleep(5)
tmp$get()
## hello 1 ... 4
## hello 1 ... 3
## hello 1 ... 2
## hello 1 ... 1
## [1] "hello 1 completed."
## hello 2 ... 4
## hello 2 ... 3
## hello 2 ... 2
## hello 2 ... 1
## [1] "hello 2 completed."
Async and blocking methods are handled slightly differently internally.
When writing such a Java method you cannot use inputs that are
primitives. All parameters must be subtypes of RObject, such as
RInteger, rather than the primitive equivalent int. This is a result of
dynamic type checking using reflection when calling the Java method and
may be dealt with in the future. Async methods can happily return Java
objects annotated with @RClass, which will be appropriately passed to R
wrapped in an R6 class.
## [1] ONE THREE <NA> TWO
## Levels: ONE < TWO < THREE
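The restriction on primitive parameters mentioned above follows from a general property of Java reflection: a method lookup by parameter type must match exactly, and the runtime class of a boxed value is Integer, not int. The sketch below demonstrates this with the JDK only. Whether the plugin resolves methods in this way is an assumption; the class ReflectionLookupDemo and its methods are hypothetical.

```java
import java.lang.reflect.Method;

final class ReflectionLookupDemo {
    // Two hypothetical targets: one takes a primitive, one takes an object type.
    static String primitiveArg(int n) { return "int " + n; }
    static String objectArg(Integer n) { return "Integer " + n; }

    // Looks a method up using the runtime classes of the supplied arguments,
    // the way a simple dynamic dispatcher might.
    static boolean resolvable(String name, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass(); // an int argument arrives boxed as Integer
        }
        try {
            Method found = ReflectionLookupDemo.class.getDeclaredMethod(name, types);
            return found != null;
        } catch (NoSuchMethodException e) {
            return false; // exact parameter types did not match
        }
    }

    public static void main(String[] args) {
        System.out.println(resolvable("objectArg", 10));    // true
        System.out.println(resolvable("primitiveArg", 10)); // false
    }
}
```

Declaring parameters as object types sidesteps the mismatch, since the runtime class of each argument is then the declared parameter type or a subtype that can be checked dynamically.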
As long-running jobs run in the background, the status of all of them
may need to be queried. The status may be “cancelled”, “in progress” or
“result ready”, or, if the result has already been retrieved by get(),
“processed”.
## id status
## 1 24 asyncCountdown(...) [result processed]
## 2 25 asyncCountdown(...) [cancelled]
## 3 26 asyncCountdown(...) [result processed]
## 4 27 asyncRaceCountdown(...) [result processed]
## 5 28 asyncRaceCountdown(...) [result processed]
## 6 29 blockingCountdown(...) [result processed]
## 7 30 asyncStaticCountdown(...) [result processed]
## 8 31 asyncStaticCountdown(...) [result processed]
## 9 32 asyncFactory(...) [result processed]
## 10 33 asyncCountdown(...) [0/10 (0%)]
Previous results can be retrieved from this list using the id.
## [1] "Async and run thread safe 1 completed."
Releasing old results may be necessary if memory is an issue. Tidying up clears all processed and cancelled background tasks and frees the associated JVM memory.
## id status
## 1 33 asyncCountdown(...) [0/10 (0%)]
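The bookkeeping behind such a status list can be sketched as a small registry of futures keyed by id. This is an illustrative model only, assuming the four states described above; the class JobRegistry and its methods are hypothetical and do not correspond to the generated R or Java API. It is intended for use from a single calling thread.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

final class JobRegistry {
    private final Map<Integer, CompletableFuture<String>> jobs = new LinkedHashMap<>();
    private final Set<Integer> retrieved = new HashSet<>(); // ids whose result was fetched
    private int nextId = 1;

    // Records a background job and returns the id it can be looked up by.
    int register(CompletableFuture<String> job) {
        int id = nextId++;
        jobs.put(id, job);
        return id;
    }

    String status(int id) {
        CompletableFuture<String> job = jobs.get(id);
        if (job == null) return "unknown";
        if (job.isCancelled()) return "cancelled";
        if (!job.isDone()) return "in progress";
        return retrieved.contains(id) ? "result processed" : "result ready";
    }

    // Waits for the result if necessary and marks the job as processed.
    String get(int id) {
        CompletableFuture<String> job = jobs.get(id);
        if (job == null) throw new IllegalArgumentException("no job with id " + id);
        String value = job.join();
        retrieved.add(id);
        return value;
    }

    // Drops processed and cancelled jobs so their results can be garbage collected.
    int tidy() {
        int before = jobs.size();
        jobs.entrySet().removeIf(e -> e.getValue().isCancelled() || retrieved.contains(e.getKey()));
        retrieved.retainAll(jobs.keySet());
        return before - jobs.size();
    }

    public static void main(String[] args) {
        JobRegistry registry = new JobRegistry();
        int done = registry.register(CompletableFuture.completedFuture("finished"));
        int pending = registry.register(new CompletableFuture<>());
        registry.get(done);
        System.out.println(done + " " + registry.status(done));
        System.out.println(pending + " " + registry.status(pending));
        System.out.println("removed " + registry.tidy());
    }
}
```

Jobs that are still in progress survive the tidy-up, which mirrors the listing above where only the running countdown remains.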
The r6-generator-maven-plugin can be used to generate an R package with
R6 classes that expose selected Java methods to R. Given enough detail
in Java, the resulting generated R package can be quite feature rich and
set up in a format ready to deploy to r-universe. The aim is to make the
process of creating R clients for Java libraries easy and maintainable.