Improve Performance With In-Process Integration

In-process integration beats out-of-process integration approaches in several ways: you get higher performance, more reliable integration, and better security.

As architects we are used to looking at the big picture, and as architects of enterprise-level applications we're usually looking at a very large picture indeed. When we hear the term "integration," we think of business applications communicating through message queues, Web services, CORBA, or one of a host of other middleware technologies.

We tend to forget that there are many integration problems for which such middleware-based integration approaches are not only overkill, but downright counterproductive.

Once upon a time, I worked on a large electronic commerce project that consisted of many CORBA servers, most written in C++ but some written in Java. The project had started out in C++ and had a relatively mature C++ infrastructure, consisting of several shared libraries for database access, logging, security, and some other assorted services. The C++ servers simply linked with these libraries and called into them directly, but it was a different picture for the newer Java servers. What were we to do: rewrite the shared C++ libraries in Java, wrap them in an "integration server" to expose the shared services to the Java servers, or use the Java Native Interface (JNI) to call into the libraries directly?

None of these alternatives was appealing, and had it not been for the exclusive availability of a third-party component in Java, we might have ditched Java entirely and kept the server side in C++. Alas, we did not have the luxury of maintaining a homogeneous implementation and had to come up with an integration plan in a hurry. We ruled out a rewrite of the libraries in Java as too expensive in terms of time and money. We also ruled out a CORBA-based "integration server," mostly for performance reasons: Many of these modules were written with the assumption of in-process execution and did not have acceptable performance when a server-to-server call was involved.

The winner on technical points was clearly the JNI-based integration solution. JNI allows a Java application to call a specially declared C function in a shared library, so in theory it would have allowed us to call directly into our libraries. But a proof of concept quickly showed that writing JNI code by hand was going to be almost as expensive as rewriting the libraries in Java.

Not surprisingly, we ended up with a hack, as so often happens when implementation design is an afterthought. Nevertheless, this experience helped me understand that there are certain use cases that lend themselves extremely well to an in-process integration approach, as long as there are tools to support you.

What are the characteristics that can make in-process integration so attractive in comparison to other (out-of-process) integration approaches? It's quite simple: Think of an in-process integration as a virtual function call into a dynamically loaded library. Consider the implications:

  • Extremely high performance. You clearly have a winner if you can call from one solution into another solution at roughly the cost of a virtual function call.
  • Extremely reliable integration. A library might load successfully or not, but there are not many realistic failure modes that are introduced by this type of integration, and there is no partial failure mode. Compare that to any kind of sockets-based communication between applications (network saturation, connection timeouts, and so on).
  • Extremely low security implications. The integrated solution has an attack profile that is just about equal to the sum of its parts. There are no exposed communication channels, no man-in-the-middle attacks, no authentication or encryption requirements, and no denial-of-service attacks because everything happens within the process.
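The "load successfully or not" failure mode can be sketched from the Java side. This is a minimal illustration, not production code: the class NativeLogger and the library name "sharedlogging" are hypothetical names invented for this sketch. What it relies on is standard behavior: System.loadLibrary() throws UnsatisfiedLinkError when the library cannot be found, and so does a call to a native method that was never linked.

```java
// Hypothetical names: NativeLogger and "sharedlogging" are illustrative only.
class NativeLogger {
    // Declared in Java, implemented in C. For this class (in the default
    // package), JNI would look for a C function named Java_NativeLogger_log.
    static native void log(String message);

    /** Tries to load the native library; reports failure instead of throwing. */
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // The library is missing or unloadable: the integration is
            // unavailable as a whole, there is no half-connected state.
            return false;
        }
    }

    public static void main(String[] args) {
        if (!tryLoad("sharedlogging")) {
            System.out.println("native logging unavailable, falling back");
            return;
        }
        log("hello from Java");
    }
}
```

The point of the sketch is that the failure shows up once, at load time, where it is easy to detect and handle.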

This raises the question of why we don't integrate everything using in-process technology. Following that idea to its logical conclusion quickly demonstrates the potential downsides of in-process integration:

  • No process separation between integrated components. We're not taking advantage of application-level security, process isolation, process prioritization, and many other operating system services.
  • No application scaling through distribution. The only way to improve the performance of a single-process application (multithreading left aside for the moment) is to upgrade the hardware or improve the code base.
  • Parallelization requires thread-level programming. Heavily multithreaded applications are notoriously hard to debug and require a higher skill level to write and maintain.

Analyzing the relative strengths and weaknesses gives us an idea of where in-process integration is useful and where it isn't. In-process integration is almost invariably an integration technology at the API level rather than the application level: We have one API that is not easily usable from the technology we're using, but we wish to use it as if it were meant for us. In practice, this problem occurs most often because the API in question was written in the "wrong" language.

Sometimes, though, in-process integration can extend into the area of application integration. Take such APIs as the Java Message Service (JMS) or Enterprise JavaBeans (EJBs) as an example. Providing an in-process integration solution that allows a C++ process to use JMS or EJB directly through JNI might technically be an in-process integration, but conceptually it is the solution of an application integration problem. I could probably write an entire article on such use of in-process integration to facilitate application integration, but for now I will concentrate on the core subject of in-process integration.

Even if a problem seems like a good match for in-process integration, there might not be a good way to perform the integration. In my professional career, I have come across only a handful of technologies that act as good enablers of in-process integration:

  • The JNI can be used to mix Java and anything callable from C in one process.
  • P/Invoke can be used to mix .NET and anything callable from C in one process.
  • P/Invoke and JNI together can then be used to mix .NET and Java in one process.
  • The C escape hatch from Fortran.
  • COM.

Clearly, none of these technologies is a general-purpose integration technology; each targets one particular integration problem, such as Java/C integration. Nevertheless, some of these integration niches are important enough to warrant a closer look. JNI and P/Invoke are often applicable because of the wide adoption of Java, C++, and .NET in the typical enterprise.

In the remainder of the article, I'll focus on the use of JNI for the in-process integration of Java and C/C++ code. I'll also discuss plenty of general concepts and use certain JNI features as examples for more general in-process integration concepts.

As mentioned previously, JNI is not an easy-to-use API; it is low-level, almost comparable to communicating with the JVM in assembly language. It might cause the developer to leak resources that can result in a delayed application crash due to an OutOfMemoryError. It also puts the burden of error checking fully on the programmer and fails spectacularly and catastrophically if you don't handle the error conditions correctly. JNI is not a popular API, and few books are written about it; consequently, you spend a lot of time discovering by trial and error how things work and the best way to achieve your goals.

Let me introduce a few JNI concepts and functions, which I will use to illustrate how in-process integration through JNI differs from out-of-process integration and how it can achieve superior application performance. I will particularly look at these areas:

  • Object representation.
  • Object data access.
  • Function calls.
  • Strings.
  • Error handling.

Different aspects of the JNI API can come to dominate a particular application's performance. Some C applications predominantly call Java methods, so their performance will be dominated by the function call overhead introduced by JNI. Other applications will predominantly access Java data structures, so their performance will be dominated by the data access overhead introduced by JNI. Let's now take a look at some JNI concepts.

Object Representation
In JNI, every Java object (regardless of its type) is represented on the native side by an opaque type called _jobject. You only deal with the Java instances through the typedef'ed pointer type jobject. This means that you use a pointer-sized function parameter to reference a Java object, no matter how big the object might be on the Java side. Think of this as a "pass by reference" rather than a "pass by value" calling convention for method arguments and return values of object type. Pass by reference typically has superior performance when the referenced objects are large and when not all object data is always required on the receiving side.

Consider a Java Hashtable containing a couple hundred entries. You might only be interested in one of the entries on the native side. Passing the entire Hashtable state by value would be a huge waste. If, on the other hand, you're always going to iterate over all entries in the Hashtable, you might be better off converting the entire Hashtable to a native value type right away. The former use case is much more likely than the latter, and pass by reference does not impose any limitations on the object types that can be handled. Pass by value schemes imply that an object can be serialized, and they typically deal poorly with deep reference hierarchies.
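The difference can be made visible in plain Java. The following sketch is only an analogy for what happens at the JNI boundary: lookupByReference() stands in for a native caller that holds a jobject handle and asks for one entry, while copyByValue() stands in for a scheme that ships the whole table across. The class and method names are made up for this illustration, and Java serialization is used here merely as a convenient way to measure the size of the "by value" copy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Hashtable;

class ReferenceVersusValue {
    /** Pass by reference: the callee gets a handle and reads a single entry. */
    static String lookupByReference(Hashtable<String, String> table, String key) {
        return table.get(key);
    }

    /** Pass by value: the complete table state is serialized before any use. */
    static byte[] copyByValue(Hashtable<String, String> table) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(table);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Builds a table with the given number of small string entries. */
    static Hashtable<String, String> sampleTable(int entries) {
        Hashtable<String, String> table = new Hashtable<>();
        for (int i = 0; i < entries; i++) {
            table.put("key" + i, "value" + i);
        }
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, String> table = sampleTable(200);
        System.out.println(lookupByReference(table, "key42"));
        System.out.println(copyByValue(table).length + " bytes copied by value");
    }
}
```

With 200 entries, the by-value copy runs to several thousand bytes even though the caller wanted a single value.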

Object Data Access
In JNI, data members (fields) of an object have to be queried. You can't just "cast" a jobject handle to a pointer to a data structure and then simply and directly dereference the object's data members. This means that accessing a lot of fields can be relatively expensive when compared with accessing the members of a native in-memory object. If performance is critical, you might be better off serializing a Java object into an in-memory buffer and then accessing the buffer from the native side. Just be aware of the additional tasks beyond defining and maintaining serialization code and format that you will take on if you do this:

  • You will have to deal with the byte ordering of primitive types.
  • You will have to deal with data alignment.
  • You will have to deal with string encodings.
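Here is a minimal sketch of what such a serialization might look like on the Java side, assuming a made-up record with an amount, an id, and a name. The layout, the class name OrderRecordCodec, and the field names are all hypothetical. The sketch addresses the three tasks above in the simplest way I know: it uses the platform's native byte order, it places the 8-byte field first so that every fixed-size field sits at an offset that is a multiple of its size, and it fixes the string encoding as UTF-8 with an explicit length prefix.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical record layout, chosen for this sketch only:
//   offset 0:  double amount      (8 bytes)
//   offset 8:  int    id          (4 bytes)
//   offset 12: int    nameLength  (4 bytes)
//   offset 16: name as UTF-8 bytes
class OrderRecordCodec {
    static final int AMOUNT_OFFSET = 0;
    static final int ID_OFFSET = 8;
    static final int NAME_LENGTH_OFFSET = 12;
    static final int NAME_OFFSET = 16;

    /** Writes the record into a direct buffer in native byte order. */
    static ByteBuffer encode(int id, double amount, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocateDirect(NAME_OFFSET + nameBytes.length);
        buffer.order(ByteOrder.nativeOrder()); // byte ordering: match the native side
        buffer.putDouble(amount);              // alignment: largest field first
        buffer.putInt(id);
        buffer.putInt(nameBytes.length);       // string encoding: UTF-8, length-prefixed
        buffer.put(nameBytes);
        buffer.flip();
        return buffer;
    }

    /** Reads the name back, mirroring what native code would do with the bytes. */
    static String decodeName(ByteBuffer buffer) {
        byte[] nameBytes = new byte[buffer.getInt(NAME_LENGTH_OFFSET)];
        ByteBuffer view = buffer.duplicate(); // leaves the caller's position untouched
        view.position(NAME_OFFSET);
        view.get(nameBytes);
        return new String(nameBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer record = encode(7, 19.99, "Alex");
        System.out.println(record.remaining() + " bytes, name=" + decodeName(record));
    }
}
```

A direct buffer is used because native code can obtain its address through the JNI function GetDirectBufferAddress() and read the fields with ordinary pointer arithmetic. The matching C struct and its maintenance are the extra work that the list above warns about.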

You might also want to consider your application's usage of the queried data. If the same field value is used more than once, a caching scheme for the value might have a much larger impact on performance than an improvement in the data access logic.

Function Calls
In JNI, you can call a Java method from C/C++ by using a method identifier of type jmethodID and passing arguments in one of several possible ways. The most efficient way to pass the arguments is to pass them as an array of jvalues. jvalue is a union type that can represent both primitive and object types. JNI function calls involve little marshalling overhead; the passed arguments do not have to be massaged into a particular format to make them suitable for consumption by the virtual machine. Primitive arguments such as jint are ordinary native C types, so the call itself requires no byte order conversion; byte order becomes a concern only when you exchange serialized data, because Java's stream and class file formats use network byte order.

Once you have acquired the jmethodID—a costly lookup operation that you luckily have to perform only once per method—the overhead of the method call is relatively small. In real-life applications, we have often measured negligible performance degradation when compared with a pure Java application.
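The look-up-once, call-many-times pattern is easiest to show with Java reflection, which I use here purely as a stand-in: on the native side, the lookup would be GetMethodID() and the call would be one of the Call<Type>MethodA() functions taking a jvalue array. The class name CachedMethodCall is hypothetical, and reflection has different costs from JNI, so treat this as an illustration of the structure rather than of the numbers.

```java
import java.lang.reflect.Method;

class CachedMethodCall {
    // Resolved once when the class is initialized, much like a jmethodID that
    // native code stores in a static variable after the first lookup.
    private static final Method LENGTH = lookup(String.class, "length");

    /** The expensive step: resolve a no-argument method by name. */
    static Method lookup(Class<?> type, String name) {
        try {
            return type.getMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The cheap step: reuse the cached identifier for every call. */
    static int lengthOf(String text) {
        try {
            return (Integer) LENGTH.invoke(text);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String word : new String[] {"JNI", "integration"}) {
            System.out.println(word + " -> " + lengthOf(word));
        }
    }
}
```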

Strings
Strings are commonly among the trickiest object types to integrate, no matter which technologies you're looking at. In Java, you have a built-in, final String type, meaning that you can't replace it and you can't extend it to modify its behavior. In C++, you have dozens of string types: from simple char* and wchar_t* to custom types, STL string types, third-party library string types, and so on. You are also dealing with different character sets, encodings, and byte orders.

Not surprisingly, JNI puts its own twists on the issue as well. In Java, a String is just another object. The Java compiler simply translates string literals in your source code into String objects. If you recall our discussion of object representation, you will remember that any Java type is represented as a jobject on the native side, and strings are no exception.

To use a Java string in C/C++, you need to extract the string characters from it. There are various ways to do this, each with hugely different performance characteristics. If you know that all string characters are ASCII-compatible or if your C++ application can deal with UTF-8 encoded strings, you can use a fairly high-performing JNI function called GetStringUTFChars(). Otherwise you will have to invoke various methods to convert a string to a byte[] and then gain access to the byte[] elements. This can be several times slower than the first alternative. In either case, you're probably going to be looking at some copying of character data with the implied performance penalty.
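The encoding question can be explored on the Java side before any native code is written. The sketch below, with a hypothetical class name StringBytes, shows the byte[] conversion route for two encodings and a check for whether a string is pure ASCII. One caveat I am sure of: GetStringUTFChars() returns the JVM's modified UTF-8, which matches standard UTF-8 for ordinary text but differs for the NUL character and for characters outside the Basic Multilingual Plane, so the standard UTF-8 bytes shown here are not always identical to what that function returns.

```java
import java.nio.charset.StandardCharsets;

class StringBytes {
    /** Standard UTF-8: one byte per ASCII character, more for everything else. */
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** ISO 8859-1: always one byte per character, but a limited repertoire. */
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** True if every character would survive as a single 7-bit byte. */
    static boolean isAscii(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String city = "Z\u00fcrich"; // "Zurich" with a u-umlaut
        System.out.println("ascii=" + isAscii(city)
                + " utf8=" + toUtf8(city).length
                + " latin1=" + toLatin1(city).length);
    }
}
```

The same six-character string occupies a different number of bytes depending on the encoding, which is exactly the kind of detail the native side has to agree on.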

The String type is the only type that requires marshalling and unmarshalling in our in-process integration design, simply based on our use of JNI as the integration technology (but P/Invoke is similar in this respect). You can gain large performance boosts if your application caches string characters and reuses preconstructed Java String instances wherever possible.
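A caching scheme of this kind does not need to be elaborate. The following sketch assumes that the same strings cross the boundary repeatedly and that their encoded form is needed each time; the class EncodedStringCache and its conversion counter are hypothetical and exist only to make the saving observable. It is not thread-safe and never evicts entries, both of which a real implementation would have to address.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class EncodedStringCache {
    private final Map<String, byte[]> cache = new HashMap<>();
    private int conversions = 0;

    /** Returns the UTF-8 bytes for a string, converting only on first use. */
    byte[] utf8(String text) {
        byte[] cached = cache.get(text);
        if (cached == null) {
            cached = text.getBytes(StandardCharsets.UTF_8);
            cache.put(text, cached);
            conversions++;
        }
        // Callers share this array and must treat it as read-only.
        return cached;
    }

    /** Number of actual conversions performed so far. */
    int conversions() {
        return conversions;
    }

    public static void main(String[] args) {
        EncodedStringCache cache = new EncodedStringCache();
        for (int i = 0; i < 1000; i++) {
            cache.utf8("ORDER_STATUS");
        }
        System.out.println("conversions: " + cache.conversions());
    }
}
```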

Error Handling
Error handling is one of those things that we know we should do, but even if we do it, it is usually not perfect. We tend to focus on making things work and not on the possibility of failure.

JNI is particularly unforgiving in this area. If you ignore an error in the JNI layer and continue operations, JNI will terminate your JVM through a call to the FatalError() function. This has given JNI a bad reputation as a buggy technology, even though it simply magnifies and highlights our tendency to ignore best practices in error handling.

This characteristic of JNI has a huge implication for in-process integration, though. Although your application might be able to recover from an aborted out-of-process request, it is unlikely to be able to recover from a terminated JVM. Using JNI as an in-process integration technology puts the burden of flawless error handling on you. If you're not willing to take on this burden, you're going to have to use one of several available tools that do it for you.

In general, you must carefully analyze the error-handling model used by your in-process integration technology. Not all in-process integration technologies will give you complete error information—using COM wrappers to call into .NET loses a lot of error information, for example—and some, like JNI, will force you to be thorough.
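One way to reduce the number of places where an error can be overlooked is to funnel calls through a small guard on the Java side that never lets an exception escape and reports a status code instead. This is a sketch of one possible approach, not a description of how any particular tool works; BoundaryGuard, its status constants, and lastError() are names invented for this example. The native caller would still have to check the returned status, and in raw JNI it would still use ExceptionCheck() or ExceptionOccurred() after calls that bypass such a guard.

```java
class BoundaryGuard {
    static final int OK = 0;
    static final int FAILED = 1;

    // One error slot per thread, so concurrent callers do not overwrite each other.
    private static final ThreadLocal<String> LAST_ERROR = new ThreadLocal<>();

    /** Runs the operation and converts any failure into a status code. */
    static int run(Runnable operation) {
        try {
            operation.run();
            LAST_ERROR.remove();
            return OK;
        } catch (Throwable t) {
            // Deliberately broad: nothing may propagate across the boundary.
            LAST_ERROR.set(t.getClass().getName() + ": " + t.getMessage());
            return FAILED;
        }
    }

    /** Description of the most recent failure on this thread, or null. */
    static String lastError() {
        return LAST_ERROR.get();
    }

    public static void main(String[] args) {
        int status = run(() -> {
            throw new IllegalStateException("database unavailable");
        });
        System.out.println(status + " " + lastError());
    }
}
```

The design trades detail for safety: the caller gets only a code and a message, which is less than the original exception carried but cannot be ignored silently in the way a pending exception can.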

I have used JNI to illuminate how the integration technology that we're using impacts performance, and to demonstrate that just being able to "use" an object written in another language in the same process does not automatically guarantee higher performance.

From all the potential problems I described, you might get the idea that using JNI for in-process integration is something that only a fool would do. This is far from true: Using JNI judiciously for in-process integration can be a hugely rewarding experience. Unparalleled integration performance, the ability to reuse tested Java code from C++, and the absence of integration-related security holes can make competing integration approaches look bad.

Here are a few words of wisdom about in-process integration:

  • Use tools to help you. Integration libraries or generated code save you a lot of time and money and can make the difference between success and failure.
  • Remember the Law of Leaky Abstractions. No matter how good your in-process abstraction for the other side is, there will be surprising (conceptual, not physical) disconnects.
  • Handle errors. Remember that you have only one process to play in. Corrupting the process's heap or losing the embedded JVM is a catastrophe and can't be solved by reconnecting to a server process!
  • Just because it's in-process doesn't mean you don't have marshalling overhead. Depending on the technology you're using, you might have to marshal data when making a call on another thread (COM) or when certain data types are used (String in JNI).
  • If you always use all of an object's data, consider serialization. Object handles (pass by reference) usually save you a lot of overhead, but in the relatively rare case of large objects with many discrete data members, all of which you actually use, object serialization can perform better.

About the Author
Alexander Krapf is president and cofounder of CodeMesh Inc. He has more than 15 years of experience in software engineering, product development, and project management in the United States and Europe. Alex has also worked for IBM, Thomson Financial Services, Hitachi, Veeder-Root, and Document Directions Inc.