In-Depth

The ascent of EJBs

Enterprise JavaBeans (EJBs) did not appear out of nowhere. Rather, EJBs were the next logical step in the evolution of enterprise software technology. Just as objects gave rise to distributed objects a decade ago, distributed objects have given birth to container-managed components such as EJBs. This article describes the stages of this technological evolution.

We begin our journey with basic object technology. Object-oriented (OO) programming revolutionized the software industry by combining data and behavior into an entity called an object. New methodologies of analysis and design followed, and OO application development began to flourish.

Shortly thereafter, the term distributed objects appeared. This signified that objects of an application reside in potentially different processes, perhaps even on different machines—yet the remote invocation of objects uses the same programming syntax as regular (local) invocations. This was a notable advancement, as OO applications could become distributed applications without breaking the OO paradigm.

To do this, software was needed to translate a local invocation into a distributed invocation. In other words, the software had to broker the request to the remote object, a role that gave rise to the term Object Request Broker (ORB). ORB technologies such as Remote Method Invocation (RMI) and CORBA rely on two key structural elements: an on-the-wire protocol and the proxy design pattern.

Concerning the former, it is not surprising that communication between two processes was not the invention of ORB technologists. Inter-Process Communication (IPC) mechanisms and transport protocols, such as TCP/IP, had been in existence for decades. While an ORB needs these to communicate between client and server processes, it also needs to support another protocol on top of TCP/IP, an "OO protocol" if you will. This protocol transmits packets that declare, for example, "Please invoke the withdraw method on Lloyd Miller's bank account for the amount of $500." This is fundamentally what the CORBA Internet Inter-ORB Protocol (IIOP) does.

An ORB is typically a vendor-provided library with network communications capabilities supporting an object-aware protocol. However, an ORB by itself is not enough. Consider client code. Ideally, the location of the remote object would be hidden from client code, and application code would consist of a normal invocation on a normal object reference. For example, to deposit money into a bank account, you would want to call account.deposit(50) in Java. But how would that translate into a remote invocation?

Enter the proxy design pattern, as described in Design Patterns: Elements of Reusable Object-Oriented Software by E. Gamma, et al. (Addison-Wesley, 1995). The client process needs a proxy (or stub) object that advertises the same interface as the targeted remote object, thereby enabling the client to invoke a deposit() method on it (see Figure 1). When this invocation is made, the proxy object transparently uses the client-side ORB library to send the request to the server process. The server-side ORB library then processes the IIOP message and delivers the invocation to the remote object, which executes it and returns the result to the calling client.

Figure 1: The proxy design pattern
In the proxy design pattern, the client process needs a proxy object that advertises the same interface as the targeted remote object, enabling the client to invoke a deposit() method on it. When an invocation is made, the proxy transparently communicates to the server the client's request.

But where does the proxy instance come from? The proxy is instantiated automatically by the client-side ORB when the client obtains a remote object reference (perhaps via the CORBA Naming Service) to the specific remote object instance. The remote object reference includes the network address of the target object, an object ID and other data, yet remains hidden from all application code. This information is encapsulated within the proxy that the client ORB instantiates when the client obtains the remote reference. After the instantiation, the ORB returns a normal Java reference to the client application code. The proxy class is not written by the developer, but rather generated for each interface type by a development tool supplied by the ORB vendor. The proxy provides the method deposit(), but when that method is invoked, the proxy forwards, or "proxies" (hence the term), the request to the remote object running in the server process.
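The stub mechanism described above can be sketched in plain Java. This is only an illustration under stated assumptions: the names Account, AccountImpl, ServerOrb and ClientOrb are hypothetical, the "network" is an in-process method call rather than IIOP, and the stub is built at run time with java.lang.reflect.Proxy instead of being generated by a vendor tool.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Interface advertised by both the client-side proxy and the remote object.
interface Account {
    void deposit(int amount);
    int getBalance();
}

// The real object; in a distributed system it would live in the server process.
class AccountImpl implements Account {
    private int balance;
    public void deposit(int amount) { balance += amount; }
    public int getBalance() { return balance; }
}

// Hypothetical stand-in for the server-side ORB: maps object IDs to targets.
class ServerOrb {
    private final Map<String, Object> objects = new HashMap<>();

    void register(String objectId, Object target) {
        objects.put(objectId, target);
    }

    // Plays the role of decoding a wire message such as
    // "invoke <methodName> on <objectId> with <args>".
    Object dispatch(String objectId, String methodName,
                    Class<?>[] paramTypes, Object[] args) throws Exception {
        Object target = objects.get(objectId);
        if (target == null) {
            throw new IllegalArgumentException("Unknown object: " + objectId);
        }
        Method method = target.getClass().getMethod(methodName, paramTypes);
        return method.invoke(target, args);
    }
}

// Hypothetical stand-in for the client-side ORB: hands out proxies (stubs).
class ClientOrb {
    static Account stubFor(String objectId, ServerOrb server) {
        // Every call on the proxy is forwarded to the "server" instead of
        // being executed locally. A real ORB would marshal it onto the wire.
        InvocationHandler handler = (proxy, method, args) ->
                server.dispatch(objectId, method.getName(),
                                method.getParameterTypes(), args);
        return (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(),
                new Class<?>[] {Account.class},
                handler);
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        ServerOrb server = new ServerOrb();
        server.register("account-42", new AccountImpl());

        // Client code sees an ordinary Java reference.
        Account account = ClientOrb.stubFor("account-42", server);
        account.deposit(50);
        System.out.println(account.getBalance());
    }
}
```

The client code in main never mentions the object ID or the server after obtaining the reference, which is the transparency the proxy is meant to provide.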

To summarize, the proxy provides the illusion of a typical object invocation, while the ORB library provides the necessary communication facilities. This ORB technology is applicable in environments ranging from the embedded to the enterprise. Concerning the latter, however, it was soon discovered that an ORB alone was inadequate. Serious needs and issues arose, leading to the advent of container-managed components such as EJBs. EJB is built logically on top of the RMI ORB and provides many of the features lacking in ORB technology.

Factories
One troublesome issue in ORB-based applications concerns the creation of remote objects. Consider the famous shopping cart, which is an object that executes within an e-commerce application. Suppose the shopping cart server creates a set of shopping cart objects at start-up and publishes their object references to a naming service. How many shopping carts should be created? How many will be used at any moment? Is this number consistent at all times? Definitely not.

We might decide to create the number of shopping carts used during peak hours. If the guess is too low, the system is incapable of servicing some customers. If the guess is too high, the application is inefficient. In fact, no matter what number is chosen, there will always be resources wasted at 4 a.m. when almost no one is shopping on the site. This is when the shopping cart server should be resource-light so that other applications and tasks can execute efficiently.

The solution is to create shopping carts on demand. An object that creates shopping carts, a shopping cart factory, does just this. Instead of creating shopping carts at server start-up, all clients have control of instantiating shopping carts when needed. When a client begins, it first obtains a reference to the shopping cart factory and then utilizes it to create a shopping cart. An advanced factory gives clients the ability to destroy shopping carts when finished and to find existing shopping carts.

Because of the flexibility afforded by factories, it is evident why the factory design pattern became extremely popular in building distributed object applications. In view of this, the EJB specification built a factory into the component architecture itself, calling it the home interface. Every EJB has a home interface and a remote interface. The remote interface advertises the business methods, such as addItemForPurchase() for a shopping cart. But more interestingly, the home interface facilitates bean creation and destruction by serving as a factory for all clients (see Figure 2). Moreover, in the case of entity beans, the home interface can also be used to find existing entity (persistent) beans.

Figure 2: The factory design pattern
In the factory design pattern, the remote interface advertises the business methods, and the home interface facilitates bean creation and destruction by serving as a factory for all clients.

What is the benefit of the EJB home interface? In CORBA and RMI applications, a factory would have to be developer-defined and implemented completely. But in EJB, the developer defines the home interface but does not have to implement the home class, which is generated by a tool provided by the EJB vendor. Furthermore, since the vendor-supplied code is responsible for creating bean instances for the client, it can potentially use techniques like pooling to increase efficiencies and performance. This is all functionality the bean developer does not have to code.
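As a rough illustration of the home/remote split, the following plain-Java sketch separates a business interface from a factory interface and recycles instances behind the factory. It deliberately does not use the real javax.ejb types (actual home and remote interfaces extend javax.ejb.EJBHome and javax.ejb.EJBObject). The names ShoppingCartHome, SimpleCart and PooledCartHome are hypothetical, and the pooling strategy is only one possibility a vendor might choose.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Business interface, analogous to an EJB remote interface.
interface ShoppingCart {
    void addItemForPurchase(String item);
    List<String> getItems();
}

// Factory interface, analogous to an EJB home interface.
interface ShoppingCartHome {
    ShoppingCart create();
    void remove(ShoppingCart cart);
}

// The only class a bean developer would write in this sketch.
class SimpleCart implements ShoppingCart {
    private final List<String> items = new ArrayList<>();
    public void addItemForPurchase(String item) { items.add(item); }
    public List<String> getItems() { return new ArrayList<>(items); }
    void reset() { items.clear(); }
}

// Stand-in for vendor-generated code: a home implementation that pools
// instances so that create() does not always allocate a new object.
class PooledCartHome implements ShoppingCartHome {
    private final Deque<SimpleCart> idle = new ArrayDeque<>();
    private int instancesCreated = 0;

    public ShoppingCart create() {
        SimpleCart cart = idle.poll();      // reuse an idle instance if any
        if (cart == null) {
            cart = new SimpleCart();
            instancesCreated++;
        }
        return cart;
    }

    public void remove(ShoppingCart cart) {
        SimpleCart released = (SimpleCart) cart;
        released.reset();                   // wipe state before reuse
        idle.push(released);
    }

    int getInstancesCreated() { return instancesCreated; }
}

class FactoryDemo {
    public static void main(String[] args) {
        PooledCartHome home = new PooledCartHome();
        ShoppingCart cart = home.create();
        cart.addItemForPurchase("book");
        home.remove(cart);
        home.create();                      // served from the pool
        System.out.println(home.getInstancesCreated());
    }
}
```

Clients depend only on the two interfaces, so the pooling behind create() and remove() can change without touching client or bean code.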

Passivation and activation
Continuing with the shopping cart example, suppose a customer visited this particular e-commerce site and added items intended for purchase into a shopping cart. But then other activities drew the customer away from the site until the next day. Does this mean the customer's shopping cart will remain in memory and occupy resources for that entire duration? Ideally, no. The server should have the ability to temporarily store the state of the shopping cart to persistent storage if it is not invoked for a significant amount of time. This is called passivation. When the customer returns to their shopping, the subsequent client invocations trigger the server to transparently reload the object into memory to service the invocations, a process called activation. The client is not aware that either passivation or activation has occurred, for it is a hidden server-side implementation detail.

But how are these capabilities implemented? Concerning passivation, the server could monitor all object invocations, passivating infrequently used objects. On the other hand, activation could be implemented by a server-side proxy. The server-side proxy to the remote interface is called EJBObject in the EJB specification. This proxy object would intercept the invocation within the server, thus enabling it to activate the target object if necessary. The proxy would then delegate the client's invocation to the target object.

These capabilities were not provided by CORBA or RMI, nor could an application developer implement them simply. (The CORBA Portable Object Adapter [POA] did provide this capability, but it was somewhat of a latecomer within the CORBA world, not well supported and rather difficult to use. In addition, many CORBA and RMI implementations provided the ability to launch servers on demand.) EJB, on the other hand, has passivation and activation support built-in (see Figure 3). The server-side proxy is provided by the EJB container, which is the most important piece of the EJB architecture. The container intercepts all invocations to the bean, which then allows the container to activate the bean if necessary. But the container is not just a server-side proxy; it is actually an aggregation of many objects. In fact, the implementation of the home class described earlier is also part of the container. Therefore, the container is aptly named because it contains, or manages, the beans. Indeed, the container provides a great amount of functionality that was not provided for or standardized in the ORB world.

Figure 3: Passivation and activation
As the server monitors all object invocations, infrequently used objects can be passivated by temporarily storing the state of the object into storage. To be activated, a server-side proxy would intercept the invocation within the server, enabling it to activate the target object when necessary.
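A minimal sketch of the idea, assuming Java serialization as the storage format and an in-memory byte array standing in for the flat file or other persistent store. CartState and CartServerProxy are hypothetical names, and a real container would decide when to call passivate() based on its own idle-time policy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// The state that must survive passivation.
class CartState implements Serializable {
    private static final long serialVersionUID = 1L;
    final List<String> items = new ArrayList<>();
}

// Server-side proxy: intercepts every invocation and activates on demand.
class CartServerProxy {
    private CartState active = new CartState(); // non-null while in memory
    private byte[] passivated;                  // stand-in for a file on disk

    // Called by the server for carts that have been idle too long.
    void passivate() {
        if (active == null) {
            return;                             // already passivated
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(active);
        } catch (IOException e) {
            throw new IllegalStateException("Passivation failed", e);
        }
        passivated = bytes.toByteArray();
        active = null;                          // release the in-memory state
    }

    // Reloads the state if it was passivated; invisible to the client.
    private CartState activateIfNeeded() {
        if (active == null) {
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(passivated))) {
                active = (CartState) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("Activation failed", e);
            }
            passivated = null;
        }
        return active;
    }

    boolean isInMemory() { return active != null; }

    // Business methods delegate to the target after activation.
    void addItemForPurchase(String item) { activateIfNeeded().items.add(item); }
    int itemCount() { return activateIfNeeded().items.size(); }
}
```

A caller that adds an item, has its cart passivated overnight, and then adds another item sees a cart with two items and never learns that the state left memory in between.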

Enterprise services
Fundamentally, what EJB has added to ORB technology is the integration and automation of enterprise services. In fact, we have already covered two facets of the first enterprise service: the life-cycle service. A life-cycle service controls the birth, death and other events in between for the objects it manages. Notice that the home interface provides life-cycle support from the client's perspective: the ability to create, destroy and find EJBs. The container also provides a life-cycle service within the server in that beans are passivated and activated for the purpose of better resource utilization. This server-side life-cycle support is transparent to all clients.

Persistence service
Notice that the activation and passivation mechanism provided by the container necessitates a persistence mechanism. The persistence mechanism used here, however, is considered temporary and not mission critical; therefore, many containers will simply store the passivated state of a bean to a flat file. (After all, in-memory bean data is lost if the server crashes.) Long-lived business data, on the other hand, should be stored in a more reliable persistent store such as an RDBMS. Do EJBs provide any capabilities for permanently persisting state?

Consider, for instance, a bank account EJB. This EJB has methods to deposit, withdraw and get the current balance. The account balance is maintained as a member variable within the bean instance. But this vital data ultimately needs to be stored in a database or other enterprise data store. The bean will also need to synchronize its state with the database by reading and writing at the appropriate moments.

If the bank account were implemented as a CORBA or RMI object, rather than as an EJB, the persistence logic would become the burden of the developer. This could potentially swell into many lines of Java Database Connectivity (JDBC) code. Also annoying is the fact that this logic would be at least partially co-located with the business logic, thereby diminishing the extensibility, portability and maintainability of the code. For example, it would probably be necessary to read in the balance at the beginning of each deposit and withdraw invocation, and then write it to the database before the method returns.

To improve this situation, EJBs provide a persistence service for entity beans. The fundamental difference between session beans and entity beans is that permanent persistence support is provided for entity beans. To provide this persistence service, the container loads the entity bean's state from the database and stores it in the entity bean's member variables. The bean receives notification when this occurs, but the bean code is not required to perform any task whatsoever. No JDBC code is needed. Then, after one or more business methods are invoked, the container stores the changed state to the database. In creating the bank account entity bean, the developer does not have to assume anything about the database (not its location, vendor, type or schema); the developer simply has to use a member variable. (This is EJB 1.1 container-managed persistence, which is still valid for EJB 2.0, although a more advanced mechanism has been defined.) The container handles all persistence and data synchronization transparently. The EJB specification also allows the developer to implement the persistence mechanism if desired. This is known as Bean-Managed Persistence (BMP).
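The load, invoke, store cycle can be sketched as follows. This is a simplified illustration, not the EJB API: AccountBean and PersistenceContainer are hypothetical names, a HashMap stands in for the database table, and the notifications the container would send to the bean are omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// The bean contains business logic and a plain member variable only.
class AccountBean {
    int balance;                             // container-managed field

    void deposit(int amount) { balance += amount; }
    void withdraw(int amount) { balance -= amount; }
}

// Hypothetical stand-in for the container's persistence service.
class PersistenceContainer {
    // Stand-in for a database table keyed by account ID.
    private final Map<String, Integer> database = new HashMap<>();

    void createAccount(String accountId, int openingBalance) {
        database.put(accountId, openingBalance);
    }

    // Wraps a business method with the load and store steps.
    void invoke(String accountId, Consumer<AccountBean> businessMethod) {
        Integer stored = database.get(accountId);
        if (stored == null) {
            throw new IllegalArgumentException("No such account: " + accountId);
        }
        AccountBean bean = new AccountBean();
        bean.balance = stored;               // load state before the call
        businessMethod.accept(bean);         // run the business logic
        database.put(accountId, bean.balance); // store state after the call
    }

    int storedBalance(String accountId) {
        return database.get(accountId);
    }
}

class PersistenceDemo {
    public static void main(String[] args) {
        PersistenceContainer container = new PersistenceContainer();
        container.createAccount("acct-1", 100);
        container.invoke("acct-1", bean -> bean.deposit(50));
        container.invoke("acct-1", bean -> bean.withdraw(30));
        System.out.println(container.storedBalance("acct-1"));
    }
}
```

Note that AccountBean contains no data-access code at all; swapping the HashMap for a real database would change only the container class.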

Security and transactions
Let us return to the idea of the container providing a server-side proxy. The container intercepts every invocation of a business method. This is a powerful design because of the many useful things the container can do at this point. An example of this, which was described earlier, is the persistence service implemented by a container to load and store an entity bean's persistent state. But there is still more the container can do.

Consider security in a banking application. A withdraw invocation may be subject to an authorization check, which verifies that the caller belongs to the subset of users with permission to make this call. The container can do this. If the client fails the authorization check, the container can throw an exception to the client. If the authorization succeeds, the container will proceed to delegate the invocation to the bean. Note that the bean does not need any logic to implement this because the container provides it.
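A sketch of such an interception point, assuming a simple role-based permission table. SecureContainer and the role names are hypothetical; in EJB the permissions would come from the deployment information rather than from a constructor argument.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the container's authorization check.
class SecureContainer {
    // Method name -> roles allowed to invoke that method.
    private final Map<String, Set<String>> allowedRoles;

    SecureContainer(Map<String, Set<String>> allowedRoles) {
        this.allowedRoles = allowedRoles;
    }

    // Checks the caller's role, then delegates to the bean's method.
    void invoke(String callerRole, String methodName, Runnable businessMethod) {
        Set<String> roles = allowedRoles.getOrDefault(
                methodName, Collections.<String>emptySet());
        if (!roles.contains(callerRole)) {
            // The bean is never reached when authorization fails.
            throw new SecurityException(
                    callerRole + " may not call " + methodName);
        }
        businessMethod.run();
    }
}

class SecurityDemo {
    public static void main(String[] args) {
        Map<String, Set<String>> permissions = new HashMap<>();
        permissions.put("withdraw", new HashSet<>(Collections.singleton("teller")));
        SecureContainer container = new SecureContainer(permissions);

        int[] balance = {500};
        container.invoke("teller", "withdraw", () -> balance[0] -= 100);
        try {
            container.invoke("guest", "withdraw", () -> balance[0] -= 100);
        } catch (SecurityException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        System.out.println(balance[0]);
    }
}
```

The business logic passed in as a Runnable contains no security code; the decision is made entirely at the interception point.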

The same can be said for transactions, which are a powerful capability. In fact, transaction support is arguably the most important enterprise service of all. But what is a transaction?

A transaction is often called a unit of work, giving programmers the ability to combine multiple operations into one "giant" operation. The classic example is from the banking world: a transfer of funds. During a transfer of funds, one account must have funds withdrawn, while another has funds deposited. But what happens if a power failure occurs after the withdraw operation but before the deposit operation? Not only will the bank's accounting books not balance properly, but you will have a very angry customer. This is where a transaction is indispensable. If a transaction is used to scope both operations, then the transaction ensures one of two outcomes: the data changes from both operations are made permanent, or no updates are made permanent at all. If all changes are made permanent, we say the transaction is committed. If the withdraw succeeds and the deposit fails, then the transaction is rolled back. This means that updates made during the withdraw operation will be undone. The system will then return to the state in which it existed before the transaction commenced.

The EJB container can automate the use of transactions. When the container intercepts the invocation from the client, the container can be configured to start a transaction (see Figure 4). This means that all work performed by the bean, whether invoking other EJBs or writing to a database, will be considered to be within the scope of the transaction. Furthermore, if the EJB does invoke other EJBs, the work done by those EJBs may be configured to be part of the same transaction. It is possible that many EJBs and much database-managed data could be affected. If no substantial errors occur, the container will attempt to commit the transaction after the invocation is finished. Committing would make all database updates, and perhaps some in-memory state, permanent. If a serious error occurs, all updates would be rolled back.

Figure 4: Automating the use of transactions
When the container intercepts the invocation from the client, the container can be configured to start a transaction. All work performed by the bean will be considered to be within the scope of the transaction.
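The commit-or-rollback behavior can be illustrated with a deliberately simplified sketch. It assumes a single in-memory data store and implements rollback by discarding a working copy. A real container would instead coordinate with a transaction manager and the database. TransactionalContainer is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for container-managed transaction demarcation.
class TransactionalContainer {
    // Committed account balances; stand-in for durable database state.
    private Map<String, Integer> committed = new HashMap<>();

    void open(String accountId, int balance) {
        committed.put(accountId, balance);
    }

    int balanceOf(String accountId) {
        return committed.get(accountId);
    }

    // Runs the work inside a transaction: all updates or none.
    void invokeInTransaction(Consumer<Map<String, Integer>> work) {
        // Begin: the work only ever sees a private working copy.
        Map<String, Integer> working = new HashMap<>(committed);
        try {
            work.accept(working);
            committed = working;            // commit: publish all updates
        } catch (RuntimeException e) {
            // Rollback: the working copy is discarded, so the committed
            // state is exactly what it was before the transaction began.
            throw e;
        }
    }
}

class TransferDemo {
    public static void main(String[] args) {
        TransactionalContainer container = new TransactionalContainer();
        container.open("A", 500);
        container.open("B", 100);

        try {
            container.invokeInTransaction(accounts -> {
                accounts.put("A", accounts.get("A") - 200); // withdraw
                throw new IllegalStateException("failure before deposit");
            });
        } catch (IllegalStateException e) {
            System.out.println("Rolled back: " + e.getMessage());
        }
        // The withdraw was undone along with everything else.
        System.out.println(container.balanceOf("A"));
    }
}
```

The transfer logic contains no recovery code: the failure between the withdraw and the deposit leaves both balances untouched because nothing was published.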

Note the power of EJB transactions. Many complex errors can occur in a large enterprise system. Without transactions, the programmer would be responsible for detecting all errors and developing the recovery logic. It is quite difficult and expensive to do this and, if the system changes, the recovery logic would be difficult to maintain. Transactions free programmers from this responsibility. Transactions can be completely automated by the container, as in the example above, or they can be demarcated directly by the bean logic provided by the EJB developer.

But were transactions and security support missing from ORB technology? CORBA did support security, transactions and persistence, but there are two major differences compared to the support provided by EJBs in these areas. First, not every CORBA ORB supported these services. In fact, the persistence service was hardly implemented or used at all. More importantly, these services were not tightly integrated into the server architecture as they are in EJB. That is why the container is so crucial—it automates the calling of these services. In CORBA, these services—if they were available—needed to be called by application code. This resulted in more lines of code to develop and maintain, which muddied up the main purpose of the object: to provide a business service. (The Object Management Group, the creators of CORBA, recently released the CORBA Component Model, which is similar to EJB and actually leverages EJB.)

Components
The power of EJB is not that it is a component model, but rather that it is a container-managed component model. As developers code the bean according to the EJB specification, the bean automatically leverages the functionality of the container to automate enterprise services. The container can implement the logic for persistence, transactions, life-cycle management and more.

With all the power and flexibility of EJB, the task of proper configuration becomes essential. EJB enables the creation of business components with minimal code. An EJB developer codes the business logic, and the container provides the enterprise services. But how does the container know what services to provide, and how to provide them? How does the container know when to start a transaction, when to store state to the database, and who has the authority to invoke the EJB?

The answer is configuration data, also known as deployment information. EJB standardized an XML format, the deployment descriptor, from which the container is generated at deployment time. A vendor-supplied tool takes the XML deployment descriptor and the compiled bean classes as input and generates the container. The deployment descriptor is also used to declare and configure any dependencies of the bean's implementation, such as other EJBs, or other resources such as database connections. Furthermore, environment entries can be defined to customize a bean's behavior without changing the code. All of these options available in the deployment descriptor produce benefits, such as code reuse and portability.
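To give a feel for what such deployment information looks like, here is an abbreviated descriptor fragment in the style of EJB 1.1 for the bank account entity bean discussed earlier. The package, class, role and entry names are hypothetical, the DOCTYPE declaration is omitted, and the fragment is meant as an illustration rather than a complete, validated descriptor.

```xml
<ejb-jar>
  <enterprise-beans>
    <entity>
      <ejb-name>AccountBean</ejb-name>
      <home>com.example.bank.AccountHome</home>
      <remote>com.example.bank.Account</remote>
      <ejb-class>com.example.bank.AccountBean</ejb-class>
      <!-- Ask the container to manage persistence for the fields below -->
      <persistence-type>Container</persistence-type>
      <prim-key-class>java.lang.String</prim-key-class>
      <reentrant>False</reentrant>
      <cmp-field><field-name>accountId</field-name></cmp-field>
      <cmp-field><field-name>balance</field-name></cmp-field>
      <primkey-field>accountId</primkey-field>
      <!-- A value the bean can read at run time without a code change -->
      <env-entry>
        <env-entry-name>dailyWithdrawLimit</env-entry-name>
        <env-entry-type>java.lang.Integer</env-entry-type>
        <env-entry-value>500</env-entry-value>
      </env-entry>
    </entity>
  </enterprise-beans>
  <assembly-descriptor>
    <security-role>
      <role-name>teller</role-name>
    </security-role>
    <!-- Who has the authority to invoke withdraw -->
    <method-permission>
      <role-name>teller</role-name>
      <method>
        <ejb-name>AccountBean</ejb-name>
        <method-name>withdraw</method-name>
      </method>
    </method-permission>
    <!-- When to start a transaction: for every method of the bean -->
    <container-transaction>
      <method>
        <ejb-name>AccountBean</ejb-name>
        <method-name>*</method-name>
      </method>
      <trans-attribute>Required</trans-attribute>
    </container-transaction>
  </assembly-descriptor>
</ejb-jar>
```

Each of the three questions raised above (transactions, persistence, authorization) is answered declaratively here, so changing the answer means editing the descriptor and redeploying rather than recompiling the bean.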

There is a trade-off, however. Although the EJB standard minimizes the amount of coding necessary, it adds more work to the task of configuration and deployment. Considering all the powerful enterprise services and portability EJBs provide, it is a worthwhile trade.