Maven plugin and annotation processor that writes the glue code needed for correctly annotated Java classes to be used within R as a set of R6 classes.
R can use rJava or jsr223 to communicate with Java. R has a class system called R6.
If you want to use a Java library in R, there is potentially a lot of glue code needed, and R-library-specific packaging configuration is required.
However, if you don't mind writing an R-centric API in Java, you can generate all of this glue code using a few Java annotations and the normal Javadoc tags. This plugin aims to provide an annotation processor that writes that glue code and creates a fairly transparent connection between Java code and R code, with a minimum of hard work. The focus is on streamlining the creation of R libraries by Java developers, rather than on allowing access to arbitrary Java code from R.
The ultimate aim of this plugin is to allow Java developers to provide simple APIs for their libraries, package their library using Maven, push it to GitHub, and have it become seamlessly available as an R library with a minimal amount of fuss.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import uk.co.terminological.rjava.RClass;
import uk.co.terminological.rjava.RMethod;
import uk.co.terminological.rjava.types.RDataframe;
/**
* This class is a very basic example of the features of the rJava maven plugin. <br>
* The class is annotated with an @RClass to identify it as part of the R API. <br>
*/
@RClass(
exampleSetup = {
"J = JavaApi$get()"
},
testSetup = {
"J = JavaApi$get()",
}
)
public class MinimalExample {
static Logger log = LoggerFactory.getLogger(MinimalExample.class);
/**
* Documentation of the method can be done in JavaDoc and these will be present in the R documentation
* @param dataframe - a dataframe with an arbitrary number of columns
* @param message - a message
* @return the dataframe unchanged
*/
@RMethod(examples = {
"minExample = J$MinimalExample$new()",
"minExample$demo(dataframe=tibble::tibble(input=c(1,2,3)), message='Hello world')"
})
public RDataframe demo(RDataframe dataframe, String message) {
log.info("this dataframe has nrow="+dataframe.nrow());
log.info(message);
return dataframe;
}
}
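To make the idea of annotation-driven glue generation more concrete, the following is a minimal sketch of how a tool might discover annotated methods and derive R-style stubs from them. It is not the plugin's implementation: it uses runtime reflection rather than compile-time annotation processing, and the `ExposeToR` annotation, the `Greeter` class and the `rStubs` helper are all hypothetical names invented for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.StringJoiner;

public class GlueSketch {

    // Hypothetical stand-in for the plugin's @RMethod; NOT the real annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ExposeToR {}

    // A toy API class: only the annotated method should be exposed.
    static class Greeter {
        @ExposeToR
        public String greet(String name) { return "Hello " + name; }

        public String internalHelper() { return "not exposed"; }
    }

    // Builds one R-style stub signature per annotated method.
    // Argument names are synthetic (arg0, arg1, ...) because real parameter
    // names are only available reflectively when compiled with -parameters.
    static List<String> rStubs(Class<?> apiClass) {
        List<String> stubs = new ArrayList<>();
        for (Method m : apiClass.getDeclaredMethods()) {
            if (!m.isAnnotationPresent(ExposeToR.class)) continue;
            StringJoiner args = new StringJoiner(", ");
            for (int i = 0; i < m.getParameterCount(); i++) args.add("arg" + i);
            stubs.add(m.getName() + " = function(" + args + ")");
        }
        Collections.sort(stubs); // reflection order is unspecified, so sort for stable output
        return stubs;
    }

    public static void main(String[] args) {
        rStubs(Greeter.class).forEach(System.out::println);
    }
}
```

Running `main` lists only the annotated `greet` method; the unannotated helper is skipped, which mirrors the way the annotations in the example above mark which parts of a class belong to the R API.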
Key points:
Required Maven runtime dependency:
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<r6.version>1.1.0</r6.version>
</properties>
<groupId>io.github.terminological</groupId>
<artifactId>r6-generator-docs</artifactId>
<version>${r6.version}</version>
<packaging>jar</packaging>
<name>R6 Generator Maven Plugin Test</name>
<dependencies>
<dependency>
<groupId>io.github.terminological</groupId>
<artifactId>r6-generator-runtime</artifactId>
<version>${r6.version}</version>
</dependency>
...
</dependencies>
Repository configuration if you want to use the unstable main-SNAPSHOT version of the r6-generator:
<!-- Resolve SNAPSHOTS of the runtime library on Github packages
not needed if you are using a stable r6.version of the r6-generator-runtime
and r6-generator-maven-plugin from Maven central rather than xx.xx.xx-SNAPSHOT -->
<repositories>
<repository>
<id>github</id>
<url>https://maven.pkg.github.com/terminological/m2repo</url>
</repository>
</repositories>
<!-- Resolve SNAPSHOTS of the maven plugin on Github packages -->
<pluginRepositories>
<pluginRepository>
<id>github</id>
<url>https://maven.pkg.github.com/terminological/m2repo</url>
</pluginRepository>
</pluginRepositories>
<!-- N.B. for this to work with Github packages you need a personal access token
defined in your ~/.m2/settings.xml file as a server with the id of `github`
to match the above, e.g.
<settings>
<servers>
<server>
<id>github</id>
<username>GITHUB_USERNAME</username>
<password>GITHUB_TOKEN</password>
</server>
</servers>
</settings>
All of which is probably a good reason to only use stable releases from Maven
Central.
-->
Maven plugin example configuration:
<build>
<plugins>
...
<plugin>
<groupId>io.github.terminological</groupId>
<artifactId>r6-generator-maven-plugin</artifactId>
<version>${r6.version}</version>
<configuration>
<packageData>
<!-- R package metadata: -->
<title>A test library</title>
<!-- As this project is documenting the r6-generator-maven-plugin
I am syncing R package version with the r6-generator version.
This is unlikely to be what you want to do. -->
<version>${r6.version}</version>
<!-- Instead you most likely want to sync the R package
version to your Java artifact version in a "normal" project
e.g.: -->
<!-- <version>${project.version}</version> -->
<!-- Note that -SNAPSHOT Java versions will be rolled
back to the previous patch version for the R package,
so 0.1.1-SNAPSHOT (Java) becomes 0.1.0.9000 (R).
This is due to the difference between R and Java versioning
strategies. R tools typically use non-standard semantic
versioning. -->
<!-- Alternatively you can manage the R package version
manually by supplying an R-style version
of the format 0.1.0.9000, e.g. -->
<!-- <version>0.1.0.9000</version> -->
<doi>10.5281/zenodo.6645134</doi>
<packageName>testRapi</packageName>
<githubOrganisation>terminological</githubOrganisation>
<githubRepository>r6-generator-docs</githubRepository>
<!-- often (but not in this case) the repository will be
the same as the package, e.g.: -->
<!-- <githubRepository>${packageName}</githubRepository> -->
<license>MIT</license>
<!-- This is the Description field in the R DESCRIPTION file.
CRAN specifies some standards for this: it should not
start with the package name, it must pass spellchecks,
and any references to other packages must be in
single quotes. -->
<description>
Documents the features of the 'r6-generator-maven-plugin'
by providing an example of an R package automatically
generated from Java code by the plugin. It is not
intended to be useful beyond testing, demonstrating
and documenting the features of the r6 generator plugin.
</description>
<maintainerName>Rob</maintainerName>
<maintainerFamilyName>Challen</maintainerFamilyName>
<maintainerEmail>[email protected]</maintainerEmail>
<maintainerOrganisation>terminological ltd.</maintainerOrganisation>
<!-- Build configuration options: -->
<!-- starts the R library with Java code in remote
debugging mode: -->
<debug>false</debug>
<!-- Roxygen can integrate user supplied and generated R
code, but requires a working R version on the system
that generates the R package. This must be set if, like
this package, you define some additional manual
functions in your own `.R` files in the R directory
beyond those generated by the package. This kind of
hybrid java and R package must use devtools::document
through this option to generate the correct NAMESPACE
file and documentation. -->
<useRoxygen2>true</useRoxygen2>
<!-- Runs R CMD check as part of the Maven build and aborts on failure. -->
<useCmdCheck>true</useCmdCheck>
<!-- Pkgdown will generate a nice looking site. If it fails, the build will abort. -->
<usePkgdown>true</usePkgdown>
<!-- Install the library on the local machine when finished. disable for CI -->
<installLocal>true</installLocal>
<!-- building the javadocs into the documentation is nice but can add
to the size of the package which is not helpful if submitting to CRAN -->
<useJavadoc>false</useJavadoc>
<!-- Pre-compiling the binary is probably the safest option: the compilation is done during the Maven build.
The alternative is to compile the Java from source code on first use of the library from within R.
This requires the user to have a JDK installed, and uses a Maven wrapper script. -->
<preCompileBinary>true</preCompileBinary>
<!-- Packaging all dependencies is the most robust option but
results in a large package size that may not be accepted on CRAN.
However, it is the simplest if the main target is r-universe or deployment via GitHub.
The alternative is to deploy a minimal jar and fetch all dependencies on first library use.
This option only applies if the binary is precompiled in the previous option. -->
<packageAllDependencies>true</packageAllDependencies>
<!-- Maven shade can minimise the size of JAR files by trimming bits that you don't actually use -->
<useShadePlugin>true</useShadePlugin>
<!-- any rJava VM start up options can be added here -->
<rjavaOpts>
<!-- this example sets the maximum heap size -->
<rjavaOpt>-Xmx256M</rjavaOpt>
</rjavaOpts>
</packageData>
<!-- the best place to put the R package is in the directory above the java code
and to have the java code in a `java` subdirectory of a github repo.
i.e. this file would be `java/pom.xml`. This makes R optimally
happy and is the best layout for new projects. -->
<outputDirectory>${project.basedir}/..</outputDirectory>
</configuration>
<executions>
<execution>
<id>clean-r-library</id>
<goals>
<goal>clean-r-library</goal>
</goals>
</execution>
<execution>
<id>flatten-pom</id>
<goals>
<goal>flatten-pom</goal>
</goals>
</execution>
<execution>
<id>generate-r-library</id>
<goals>
<goal>generate-r-library</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
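The configuration comments above state that a -SNAPSHOT Java version is rolled back to the previous patch version for the R package, so 0.1.1-SNAPSHOT becomes 0.1.0.9000. A minimal sketch of that mapping is given here to make the arithmetic explicit. The class and method names are hypothetical and this is not the plugin's own code. Two behaviours are assumptions of this sketch rather than anything the source states: release versions pass through unchanged, and a snapshot with patch version 0 is rejected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RVersionSketch {

    // Matches versions of the form MAJOR.MINOR.PATCH-SNAPSHOT
    private static final Pattern SNAPSHOT =
            Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)-SNAPSHOT");

    // Hypothetical helper: converts a Java artifact version to an R-style version.
    static String toRVersion(String javaVersion) {
        Matcher m = SNAPSHOT.matcher(javaVersion);
        if (!m.matches()) {
            // Assumption: anything that is not a snapshot is used unchanged.
            return javaVersion;
        }
        int patch = Integer.parseInt(m.group(3));
        if (patch == 0) {
            // The source does not say what happens here; this sketch refuses to guess.
            throw new IllegalArgumentException(
                    "No previous patch version for " + javaVersion);
        }
        // Roll back one patch version and add the R development suffix.
        return m.group(1) + "." + m.group(2) + "." + (patch - 1) + ".9000";
    }

    public static void main(String[] args) {
        System.out.println(toRVersion("0.1.1-SNAPSHOT")); // prints 0.1.0.9000
        System.out.println(toRVersion("1.1.0"));          // prints 1.1.0
    }
}
```

The fourth component, 9000, follows the common R convention for marking a development version that sits after the last release.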
With the plugin configured, a call to mvn package or mvn install will create your R library by adding files to your Java source tree in the configured output directory. Optionally, push your Java source tree to GitHub.
# library(devtools)
# if you are using locally:
# devtools::install_local("~/Git/your-project-id")
# devtools::load_all("~/Git/your-project-id")
# OR if you pushed the project to github
# install_github("your-github-name/your-project-id")
# a basic smoke test
# the JavaApi class is the entry point for R to your Java code.
J <- testRapi::JavaApi$get()
# all the API classes and methods are classes attached to the J java api object
eg = J$MinimalExample$new()
df = eg$demo(dataframe = ggplot2::diamonds, message = "The diamonds dataframe")
nrow(df)
## [1] 53940
For basic info about the plugin see: https://github.com/terminological/r6-generator
For a more complete working example and further documentation see: https://github.com/terminological/r6-generator-docs