Bridging C++ to Scala with BridJ
At Curalate we’ve moved towards a microservice architecture, with each service living in its own git repository. For the most part, we’ve standardized the way we build our Scala projects, using Apache Maven to manage dependencies and compilation. This is convenient since any Curalady / Curalad can clone one of our repos and type mvn install at the root with the expectation that everything will compile successfully on the first try. We wanted this same ease of use for our Scala projects that needed access to native libraries, and this post explains how we obtained it.
Seamlessly Interfacing Scala with C++
The JVM is an impressive piece of technology and enables awesome high-level languages like Scala. However, there are times when we need to use native languages like C++, especially when applying computer vision and machine learning. Like any good startup, we racked up technical debt to move quickly. Initially, our native projects were compiled manually and interfaced with Java via JNA. Every change required an error-prone multi-step process that included manually placing a dynamic library in a JAR for deployment. As native development became more important and the size of our team increased, this manual process became cumbersome.
It was clear to us that we needed to overhaul our native development infrastructure. When we approached the task of redesigning our native build system we had several goals in mind:
- Standardizing native builds and providing push button operation (i.e. mvn install is all we need)
- Adding native functionality to a Scala project should be as simple as putting native source files in the right directories
- Minimizing boilerplate and saving developer time
- Including shared libraries in the final JAR should be automatic
Choosing the Interface
There are several options for using Java and C++ together, which in turn allow us to interface with Scala. The classic option is the Java Native Interface (JNI), which is a standard part of the Java platform. If you’ve ever used the JNI you may recall that there is quite a bit of boilerplate. In addition, almost all communication between the native code and Java must go through special native JVM calls, so seemingly simple things require a significant amount of glue code.
A higher-level alternative to JNI is Java Native Access (JNA), which, when paired with JNAerator, can minimize the boilerplate we need to write. JNAerator takes in a C/C++ header file and generates a Java source file with wrappers for each native function. This makes JNA appealing since we only need the header file, which we had to write anyway! The price for these high-level features is that JNA is significantly slower than the JNI. Often the sole reason for crossing the native boundary is speed, so this is problematic.
Fortunately, JNAerator recently added support for yet another way to interface native code with Java, BridJ. BridJ is a relatively young project, but it claims to have speeds comparable to the JNI and it allows direct interfacing with C++. In contrast, the JNI and JNA are designed to interface with C, which requires redundant extern declarations to use C++. BridJ also allows building shared libraries for multiple target operating systems and architectures. As long as the libraries are placed in a specific directory they will be included in the final JAR, and at runtime BridJ extracts the library from the JAR and instructs the class loader to load it.
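To make the extern point concrete, here is a minimal sketch of the kind of C-linkage wrapper that a C-oriented binding calls for when the real code is C++. The Greeter class and the greeting_length function are hypothetical names made up for this illustration; they are not taken from our codebase or from any library.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical C++ class that we would like to reach from the JVM.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}

    // Length of the greeting "Hello, <name>".
    std::size_t greetingLength() const {
        return std::string("Hello, ").size() + name_.size();
    }

private:
    std::string name_;
};

// C-linkage wrapper: the extra declaration a C-style binding needs.
// extern "C" disables C++ name mangling, so the shared library exports
// the plain symbol "greeting_length", and the signature is restricted
// to types that C understands.
extern "C" int greeting_length(const char* name) {
    return static_cast<int>(Greeter(name).greetingLength());
}
```

Every C++ entry point needs a wrapper like this when the binding only understands C symbols, which is the redundancy mentioned above. A binding that can talk to C++ directly lets us skip this layer.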
Automating the Build
To integrate all of this into our existing development infrastructure we wrote a specialized Makefile along with a suite of scripts. We wrote hooks for specific Maven lifecycle phases to make everything seamless. Simply placing C++ header and source files in the right sub-directories is enough to get a working hybrid Scala / C++ project. Our build system takes care of calling JNAerator to generate Java wrappers, building shared libraries, and putting everything in the correct place in the final JAR for deployment.
Using the Interface
Now we’ll work through the obligatory hello world example to show what BridJ looks like in practice.
First, we’ll write our C++ header to define the interface with Java. It’s best to stick to primitive types here like char*, int, etc. since JNAerator’s support for parsing headers is limited. To transfer arbitrary data or objects we found it was easier to serialize everything to a byte array and unpack that on the native side (Java’s ByteBuffer is handy here).
This method of passing serialized data was better captured with two headers instead of just one. For the first header, we would restrict
ourselves to primitive types to define the Java interface that will perform the serialization and call the appropriate native
function. The second header file would be written to accept the unpacked data as more complex native types like objects and to supply the
actual native implementation. This separation of concerns made things a little cleaner to implement.
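The two-layer idea can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the function names sum_packed_ints and sumValues are hypothetical, and the wire format (a flat run of 32-bit integers in big-endian order, which is ByteBuffer's default byte order) is just one simple choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// "Second header" (hypothetical): the actual implementation, written
// against ordinary C++ types rather than raw bytes.
std::int64_t sumValues(const std::vector<std::int32_t>& values) {
    return std::accumulate(values.begin(), values.end(), std::int64_t{0});
}

// Decodes one 32-bit integer stored in big-endian byte order.
static std::int32_t readInt32BE(const unsigned char* p) {
    const std::uint32_t v = (std::uint32_t(p[0]) << 24) |
                            (std::uint32_t(p[1]) << 16) |
                            (std::uint32_t(p[2]) << 8) |
                            std::uint32_t(p[3]);
    return static_cast<std::int32_t>(v);
}

// "First header" (hypothetical): the primitive-only entry point that the
// generated Java wrapper would call with the serialized bytes.
// Assumption for this sketch: malformed input is treated as empty.
long long sum_packed_ints(const char* data, int length) {
    if (data == nullptr || length < 0 || length % 4 != 0) {
        return 0;
    }
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(data);

    // Unpack the byte array into a native container...
    std::vector<std::int32_t> values;
    values.reserve(static_cast<std::size_t>(length) / 4);
    for (int offset = 0; offset < length; offset += 4) {
        values.push_back(readInt32BE(bytes + offset));
    }

    // ...then hand it to the implementation that works on native types.
    return sumValues(values);
}
```

Only sum_packed_ints would appear in the header handed to JNAerator, so the parser sees nothing but char* and int, while sumValues is free to use any C++ types it likes.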
Here’s our C++ header that specifies the Java interface:
and here’s the C++ implementation file:
Here’s the file automatically generated from our header by JNAerator:
We’re ready to call this from Scala now! Let’s fire up the REPL and try it out:
Let’s take a look at what the final JAR looks like when compilation is complete:
For this example, we compiled this for Mac OS X and the final library is stored in the JAR as lib/darwin_universal/libhello-world-native.dylib. If we also built the Linux library binary we could add it to this JAR as lib/linux_x64/libhello-world-native.so. At runtime BridJ would extract the appropriate library for the class loader, allowing the JAR to be used with both Linux and Mac OS X.
Great, now when someone would like to use our code they can simply clone a git repo and type mvn install to get things compiled!
At Curalate, we often need to interface with native code when working with machine learning or computer vision. In this post, we’ve given a brief tour of our custom build system that gives us a consistent, fast, and easy-to-use framework for interfacing Scala with C++. Have you had to tackle a problem like this before? If you have suggestions or another approach let us know. We’re always listening!