Message engine drives Delta data: A case study

In an industry where six hours equals 3,000 miles, information is of prime importance. Ask any airline. Data about passengers, itineraries, flights, planes, crews and gates is generated from every corner of the operation and might be needed anywhere. And when schedules are disrupted -- witness the early January 1999 blizzard that closed Chicago's O'Hare Airport -- the need for accurate, timely information only grows.

Airlines want and need to react rapidly to such events. For example, it would be better to reroute passengers around a troubled hub before canceling a flight. But who has time to sift through thousands of messages describing airplane departures and landings?

Data generated by an airline should therefore be turned into something operationally meaningful. Busy employees should also have relevant information pushed to them, so that adaptive actions can begin immediately. But a question remains: How do you move enormous amounts of events through a complex data model and deliver them proactively to the people who need them?

Delta Air Lines, Atlanta, has solved this problem by developing an information delivery system that collects, adds value to and distributes information in real time throughout its operations. The system -- which substantially improves customer travel experience and airline operations, while still controlling costs -- can perhaps best be described and conceptualized as an "information flow" or "information value chain."

Unlike a database-oriented approach, the information systems in use at Delta "push" data to users instead of simply making it available for inquiry. Clients do no polling, but simply detect incoming messages on the TCP/IP port when they arrive. Filtering is performed at the server, where it is more economical, and is based on specifications that the clients submit throughout the day.

Information is provided in multiple forms and at different levels of abstraction so as to simultaneously serve the needs of diverse organizations. Events are analyzed as they happen and collected for later analysis. True to object-oriented concepts, system design is organized around an object model and defined "services"; the solution is implemented using message-oriented middleware, including IBM's MQSeries. Although the airline's middleware-based approach appears to conflict with database-oriented approaches, careful attention to the data is still required.

First steps

An early, and perhaps pivotal, project at Delta had to do with fuel purchasing or tankering. The project not only generated substantial savings for the airline, but also established a foundation for future applications.

Because the price and tax for jet fuel vary by location, an airline can save millions of dollars per year by carefully planning its fuel purchasing and loading. This requires collecting information on the loading of every aircraft, including the number of passengers and the cargo carried. Two airline groups, Flight Dispatch and Fuel Purchasing, are involved in the process. Flight Dispatch deals with immediate decisions, such as how many pounds of fuel to put on a specific airplane. Fuel Purchasing deals with issues such as futures, current prices and so on.

Previously, Fuel Purchasing would typically fax fuel price information to Flight Dispatch, which would then use the data to make optimizing decisions. Now, purchasing information goes to a Fuel Tankering System application that converts this information into fueling recommendations that dispatchers can act upon directly. Likewise, operational information in the form of actual fuel loading is sent to Fuel Purchasing in order to maintain accurate fuel inventories.
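The article does not describe how the Fuel Tankering System computes its recommendations. As a rough illustration of the kind of calculation involved, the sketch below assumes a deliberately simplified rule: carrying extra fuel is worthwhile only if the origin price, adjusted for the fuel burned just to carry it, is still below the destination price, and the amount is capped by the weight left over after passengers and cargo. Every name, field and number here is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TankerInputs:
    origin_price: float         # $ per lb at the departure station (hypothetical)
    destination_price: float    # $ per lb at the arrival station (hypothetical)
    carry_burn_fraction: float  # share of extra fuel burned just by carrying it
    next_leg_fuel_lb: float     # fuel the following leg will need on arrival
    spare_weight_lb: float      # weight left after passengers and cargo


def tanker_recommendation(x: TankerInputs) -> float:
    """Return extra pounds of fuel to load at the origin (0.0 if not worthwhile)."""
    # Each pound that survives to the destination costs this much at the origin.
    effective_origin_price = x.origin_price / (1.0 - x.carry_burn_fraction)
    if effective_origin_price >= x.destination_price:
        return 0.0
    # Pounds to load so that next_leg_fuel_lb remains on arrival,
    # limited by what the aircraft can still carry.
    needed_at_origin = x.next_leg_fuel_lb / (1.0 - x.carry_burn_fraction)
    return min(needed_at_origin, x.spare_weight_lb)
```

In this toy model, the passenger and cargo figures that the article says must be collected for every aircraft enter only through spare_weight_lb; a real system would weigh many more factors.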

"In order to address the tankering problem you had to know all the relevant events associated with the flight," said Ron Thieme, president of consulting firm SCG Partners Inc., Nashua, N.H., which helped Delta implement its new systems. "We ended up putting together the infrastructure that captures the major events from [airline] operations -- from all the flights and all the ships. We ended up with a real-time model of the current state of the fleet."

The project required the use of relatively new messaging technology. As a result, middleware was built to move data from the airline reservation systems -- based on an IBM operating system called Transaction Processing Facility (TPF) that is widely used by airlines -- into the distributed (Unix) systems that would host the tankering application. While the goal was to simply save money in a "back-room" function, the project also offered relatively low risk with a good upside potential.

The new application embodies some important concepts. From a business perspective, the system links two groups that had been separated not only organizationally but in terms of the level of data abstraction they dealt with. It was also able to take data from one group and turn it into information useful to the other. This involved not only transforming the information, but generating new information. For example, no amount of format transformation changes the price of fuel. Creating a fueling recommendation, however, means collecting and analyzing a great deal of data to create "actionable" information.

Getting the data to the people

Drawing on experience from its Fuel Tankering System, Delta then rolled out the Flight Progress Events System (FPES). FPES provides near-real-time delivery of flight events to "consumers" -- programs that need to receive the events -- and maintains a database of current flight information used in activities such as supporting Delta's Web site.

FPES collects data from all over the airline. For the most part, the information comes out of the airline's TPF-based Operational Support System (OSS); however, gate information may come from some airports through other systems. Passenger information is collected by the reservation system. Most of the planes generate and transmit their own landing time, which gets to FPES via OSS. Delta wrote its own middleware in order to move data out of TPF, while MQSeries is used in various other applications. The FPES server runs on a set of Hewlett-Packard servers running HP-UX and Oracle, with systems duplicated for purposes of high availability.

Consistent, scrubbed information of record is broadcast to consumers through the system's "event-push" capability. Data is kept in memory for quick access, although there is supporting information in databases from which the current in-memory state can be reconstructed.

"The server keeps in memory the exact state of flights and the whole airline from an operational point of view. For example, there is an application that uses the information provided by the server to tell how well Delta is doing in terms of operational reliability," said SCG Partners' Rick Lawhorne, who designed the system. "The software provides what we call a 'service' that maintains all the data needed to support immediate calculation of operational status. It takes only milliseconds to calculate the information that we maintain in real memory."

Demonstrating value

There are two applications that demonstrate the value of FPES: Flight Status Monitor (FSM) and Passenger Rebooking. FSM, like FPES, is based on a server that collects, interprets and redistributes information about the state of operations. It displays the status of each flight, updating the display in real time as new information is received. FSM is deployed at ramp towers at Delta hubs and can be accessed remotely.

While FPES compares each new message with the previous one -- and creates new, individual change event messages before distributing them out to the consumer applications -- FSM continually assesses the significance of incoming events. "As part of the flow of data and events, the server can correlate certain things and detect exceptions," said Lawhorne. "The airline people do not need to see that a plane is running on time. What matters to them is when things do not go right." An example, he said, is when an inbound plane is late and, given the necessary ground service time, will not make its scheduled departure time.
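The two behaviors described above can be sketched in a few lines. The first function compares a new flight message with the previous one and emits one change event per altered field, roughly as FPES is said to do; the second applies the late-inbound test Lawhorne describes for FSM. The message layout, field names and times are assumptions made for illustration, not Delta's actual formats.

```python
from datetime import datetime, timedelta


def change_events(previous: dict, current: dict) -> list:
    """Return one (field, old, new) tuple per field whose value changed."""
    return [
        (field, previous.get(field), value)
        for field, value in current.items()
        if previous.get(field) != value
    ]


def will_miss_departure(estimated_arrival: datetime,
                        scheduled_departure: datetime,
                        ground_service: timedelta) -> bool:
    """True when arrival plus ground service time runs past the scheduled departure."""
    return estimated_arrival + ground_service > scheduled_departure


# Hypothetical messages for one inbound flight.
previous_msg = {"flight": "325", "gate": "A12", "eta": datetime(1999, 1, 4, 14, 0)}
current_msg = {"flight": "325", "gate": "A12", "eta": datetime(1999, 1, 4, 14, 40)}
events = change_events(previous_msg, current_msg)
alert = will_miss_departure(current_msg["eta"],
                            datetime(1999, 1, 4, 15, 15),
                            timedelta(minutes=45))
```

Only the changed field (the estimated arrival) produces an event, and only the exceptional case, a turnaround that no longer fits before the scheduled departure, raises an alert.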

Within the FPES server that supports these applications is a business-event service that filters information before distributing it. All day long each "consumer" or client application keeps the server up-to-date on the data it wants to receive. The server selects the data each client has asked for and sends it to that client.
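A minimal sketch of this kind of server-side filtering follows. It assumes each client's specification can be expressed as a predicate over an event, and it uses an in-memory "outbox" per client as a stand-in for the network push; the class and method names are hypothetical and are not taken from FPES.

```python
from typing import Callable, Dict, List

Event = dict
Predicate = Callable[[Event], bool]


class EventServer:
    def __init__(self) -> None:
        self._specs: Dict[str, Predicate] = {}
        self.outbox: Dict[str, List[Event]] = {}  # stand-in for pushing over the network

    def submit_spec(self, client_id: str, wants: Predicate) -> None:
        """A client states, or restates at any time, which events it wants."""
        self._specs[client_id] = wants
        self.outbox.setdefault(client_id, [])

    def publish(self, event: Event) -> None:
        """Filter at the server: push the event only to clients that asked for it."""
        for client_id, wants in self._specs.items():
            if wants(event):
                self.outbox[client_id].append(event)
```

Because the filter runs once at the server, a client that cares about a single flight never sees traffic for the rest of the fleet, and it can change its specification during the day simply by submitting a new one.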

All of this data will eventually go through MQSeries queues. There are several clients that already get all their data through events in MQSeries queues -- these are clients that want guaranteed delivery of each and every message. In addition, anywhere a given system is the System-of-Record for another system, queues are put in place to enable guaranteed delivery of all messages.

"The goal here is to use exceptions and alerts to keep everyone updated. This brings the focus right in to solving the problem rather than trying to find out what the problem is or creating new problems," Lawhorne explained. "It also enables the airline to make the associations that can tip you off to a problem that's coming down the road so that you can solve it proactively or bypass it altogether."

Passenger Rebooking is the second application. It provides the best possible assistance to passengers when their flight plans are disrupted. The application also combines up-to-date actual operational information (not just schedules or plans) with passenger data from reservations systems to inform employees about passenger requirements. In the future, this application may even have the ability to hand passengers revised tickets (if required) as they step off a late-arriving connecting flight.

However, reacting to problems by rebooking is only part of what the application does. "Even before it is time to rebook someone, the application supports the operational planning process. If a flight dispatcher knows [they are] one plane short and must cancel one flight, with the help of the rebooking application they can look at the impact on the customers. In addition, [they can make] sure the crew and the aircraft are available," said a former project manager.

Previously, judgments were made based on purely operational questions, such as which plane had the fewest passengers. However, this might cause some passengers to be delayed for many hours. There are also some passengers who should not be "seriously inconvenienced," for example, passengers with a medical condition or unaccompanied minors. With the Passenger Rebooking application, this type of passenger information is now available to decision makers.

Object model

The design of FPES and the other applications is based on an object model of an airline. The solution involves:

  • Data represented via the object model;
  • A set of defined application "services";
  • A service that receives a wide variety of events and maintains an "operational state of the airline" that generates alert messages whenever exceptional events or situations occur;
  • A real-time flow of events to represent real events and to communicate information such as changes in airline state across the distributed environment; and
  • A middleware infrastructure to transmit this information.

"The object concept is a good meta language with which to think about data and the connections between data," said SCG Partners' Thieme. "It is much more intuitive to talk about objects -- such as flights, segments, itineraries and gates -- than to talk about relations between rows in a database. However we implement -- even if we end up using relational tables -- we find it best to design in terms of objects."

Once objects are understood individually, noted Thieme, you must think through the relationship between them. This is where the object model must be carefully designed. In the airline industry, he explained, there are complex objects with many connections between them. Thus the data is navigational by nature, and the real value is in the connections between the objects. "We have a passenger connected to a flight, connected to a ship connected to a schedule, and all of those can become important in an operational decision-making process. It would be difficult to design databases and formulate SQL statements that will gather all that you need to know at any time with good performance," said Thieme. "But with an object system, you can easily navigate the links through all that data." This does not necessarily mean that the "object system" is an object database. At Delta (and elsewhere), the object system is maintained in real memory with adequate backup.
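To make the idea of navigating links concrete, here is a small in-memory sketch of the chain Thieme mentions: passenger to flight to ship to schedule. The classes, fields and sample values are assumptions made for illustration; the article does not publish Delta's object model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Schedule:
    departure: str
    arrival: str


@dataclass
class Ship:
    tail_number: str
    schedule: Schedule


@dataclass
class Flight:
    number: str
    ship: Ship
    passengers: List["Passenger"] = field(default_factory=list)


@dataclass
class Passenger:
    name: str
    unaccompanied_minor: bool = False
    # Back-reference; excluded from repr and comparison to avoid cycles.
    flight: Optional[Flight] = field(default=None, repr=False, compare=False)


def board(passenger: Passenger, flight: Flight) -> None:
    """Link passenger and flight in both directions."""
    passenger.flight = flight
    flight.passengers.append(passenger)


# Hypothetical data.
flight_325 = Flight("325", Ship("N000XX", Schedule("14:05", "16:30")))
alice = Passenger("Alice", unaccompanied_minor=True)
board(alice, flight_325)
board(Passenger("Bob"), flight_325)

# Following links replaces what would otherwise be a multi-table join.
alice_arrival = alice.flight.ship.schedule.arrival
minors = [p.name for p in flight_325.passengers if p.unaccompanied_minor]
```

Each question an operational decision might raise (when does this passenger's aircraft arrive, which passengers on this flight need special care) becomes a short walk along object references held in memory.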

Delta's FPES service, for example, maintains the state of the airline in memory and updates the information as new data is received. In addition to maintaining the state, the service generates events and pushes them out to consumers. Of course, a database in the server is also kept up-to-date for internal purposes -- to restart the system if it fails, and to record information for later historical access -- but applications seeking information will not obtain it from the database directly.
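The division of labor described here, with memory serving consumers and the database serving restart and history, might look roughly like the following. This is a sketch under stated assumptions: a plain list of JSON strings stands in for the database, consumers are simple callables, and the event layout is invented. A production service would presumably persist off the delivery path rather than inline as shown.

```python
import json


class FlightStateService:
    def __init__(self, journal=None):
        self.state = {}      # flight number -> latest known fields (in memory)
        self.journal = []    # stand-in for the backing database
        self.consumers = []  # callables that receive pushed events
        # On restart, rebuild the in-memory state from what was recorded.
        for line in journal or []:
            self._apply(json.loads(line))
            self.journal.append(line)

    def _apply(self, event):
        self.state.setdefault(event["flight"], {}).update(event["fields"])

    def handle(self, event):
        self._apply(event)                      # 1. update the in-memory state
        for push in self.consumers:             # 2. push the event to consumers
            push(event)
        self.journal.append(json.dumps(event))  # 3. record for restart and history
```

Consumers only ever receive pushed events; nothing in the sketch lets them read the journal, which mirrors the point that applications do not obtain information from the database directly.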

"To save all the information in a database in a way suitable for another program to retrieve it -- given performance requirements -- does not make sense, especially given our current direction regarding services interfaces," said SCG Partners' Lawhorne. "Only the applications that provide the services will access the database. We build up the relationships between the objects when we first load the information into the memory model and maintain them in memory. We can do a much better job of delivering the information if we are not constrained by having to keep everything up-to-date in the databases."

For example, said Lawhorne, if a passenger indicates that they are boarding a plane by "swiping" their boarding card through an electronic reader at the jetway, this event must immediately show up on the gate agent's counts. "If we had to make trips into the database with each event the same way we do with reservations [updating and committing transactions at the same time we generate events], we would slow everything down," he noted.

Implementing using messaging

While designed according to object concepts, Delta's solution was implemented using messaging due to performance concerns and other factors. The project team -- consisting of employees from Delta and SCG Partners -- provided application programmers with an interface for navigating the objects in the model and for calling methods to obtain needed information.

The object model is also maintained throughout the system using messaging. A master service maintains the entire object model, and clients or consumers are kept up-to-date as required. In general, the master service keeps its own object model up-to-date with incoming events and forwards some of those same events to clients or consumers, depending on the interest they express by subscribing. The latter can subscribe to events pertaining to any object(s); the master distributes to subscribers all events relevant to the objects they subscribe to. Thus, if a gate agent needs to know all of the events pertaining to arriving Flight 325, the agent's application can subscribe to those events. The application also receives a complete copy of the object, plus a subscription to subsequent relevant events. As subsequent events are received, the application can keep its own copy of the object up-to-date.
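The subscription pattern in this paragraph, an initial full copy followed by incremental events, can be sketched as below. Direct method calls stand in for the messaging layer, and all class and method names are hypothetical rather than drawn from Delta's interface.

```python
import copy


class MasterService:
    def __init__(self):
        self.objects = {}      # object id -> attributes (the master model)
        self.subscribers = {}  # object id -> clients interested in it

    def subscribe(self, client, object_id):
        """Register interest and hand the client a complete copy of the object."""
        self.subscribers.setdefault(object_id, []).append(client)
        snapshot = copy.deepcopy(self.objects.get(object_id, {}))
        client.on_snapshot(object_id, snapshot)

    def apply(self, object_id, changes):
        """Update the master model, then forward the event to subscribers only."""
        self.objects.setdefault(object_id, {}).update(changes)
        for client in self.subscribers.get(object_id, []):
            client.on_event(object_id, changes)


class ClientReplica:
    def __init__(self):
        self.copies = {}  # the client's own copies of subscribed objects

    def on_snapshot(self, object_id, snapshot):
        self.copies[object_id] = snapshot

    def on_event(self, object_id, changes):
        self.copies[object_id].update(changes)
```

A gate agent's application that subscribes to Flight 325 would start from the snapshot and then stay current from events alone, without ever seeing events for flights it did not ask about.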

"Structuring your information systems, such that they are dependent upon events, allows you a more convenient way to integrate disparate information systems," said Thieme. "Once you identify the events you are interested in and construct a front end to your legacy information systems -- so that the latter can generate, receive and understand these events -- integration becomes easier to achieve."

Said Thieme, "At Delta, these events correspond to real elements in the environment. Watching the wheels hit the runway and then seeing that event show up on the screen gives you that visceral feedback that everyone immediately gravitates toward and understands."