Message engine drives Delta data: A case study
- By John E. Mann
- June 11, 2001
In an industry where six hours equals 3,000 miles, information
is of prime importance. Ask any airline. Data about passengers, itineraries, flights, planes, crews and gates is
generated from every corner of the operation and might be needed anywhere. And when schedules are disrupted --
witness the early January 1999 blizzard that closed Chicago's O'Hare Airport -- the need for accurate, timely information becomes even more pressing.
Airlines want and need to react rapidly to such events. For example, it would be better to reroute passengers
around a troubled hub before canceling a flight. But who has time to sift through thousands of messages describing
airplane departures and landings?
Data generated by an airline should therefore be turned into something operationally meaningful. Busy employees
should also have relevant information pushed to them, so that adaptive actions can begin immediately. But a question
remains: How do you move enormous amounts of events through a complex data model and deliver them proactively to
the people who need them?
Delta Air Lines, Atlanta, has solved this problem by developing an information delivery system that collects,
adds value to and distributes information in real time throughout its operations. The system -- which substantially
improves customer travel experience and airline operations, while still controlling costs -- can perhaps best be
described and conceptualized as an "information flow" or "information value chain."
Unlike a database-oriented approach, the information systems in use at Delta "push" data to users
instead of simply making it available for inquiry. Clients do no polling, but simply detect incoming messages on
the TCP/IP port when they arrive. Filtering is performed at the server, where it is more economical, based on specifications
submitted throughout the day by the clients.
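The push-and-filter arrangement described above can be sketched in a few lines. This is a minimal illustration, not Delta's implementation: the `PushServer` class, the event fields and the client names are all hypothetical, and plain lists stand in for the TCP/IP connections.

```python
from collections import defaultdict
from typing import Callable

Event = dict  # hypothetical event shape, e.g. {"flight": "DL325", "type": "landed"}
Filter = Callable[[Event], bool]


class PushServer:
    """Holds each client's current filter and pushes only matching events."""

    def __init__(self) -> None:
        self._filters: dict[str, Filter] = {}
        # Stand-in for per-client TCP/IP connections: one inbox list per client.
        self.inboxes: dict[str, list[Event]] = defaultdict(list)

    def set_filter(self, client_id: str, wanted: Filter) -> None:
        # Clients may resubmit this specification at any time during the day.
        self._filters[client_id] = wanted

    def publish(self, event: Event) -> None:
        # Filtering happens here, on the server, so unwanted events never
        # travel over the network to a client.
        for client_id, wanted in self._filters.items():
            if wanted(event):
                self.inboxes[client_id].append(event)


server = PushServer()
server.set_filter("gate_agent", lambda e: e["flight"] == "DL325")
server.set_filter("ops_desk", lambda e: e["type"] == "cancelled")
server.publish({"flight": "DL325", "type": "landed"})
server.publish({"flight": "DL101", "type": "cancelled"})
server.publish({"flight": "DL777", "type": "landed"})
```

After the three `publish` calls, each inbox holds exactly one event, and the third event reaches no one. The clients never ask for data; they only state what they want and wait.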
Information is provided in multiple forms and in different levels of abstraction so as to simultaneously serve
the needs of diverse organizations. Events are analyzed as they happen and collected for later analysis. True to
object-oriented concepts, system design is organized around an object model and defined "services"; the
solution is implemented using message-oriented middleware, including IBM's MQSeries. Although the airline's middleware-based
approach appears to conflict with database-oriented approaches, careful attention to the data is still required.
An early, and perhaps pivotal, project at Delta had to do with fuel purchasing or tankering. The project not
only generated substantial savings for the airline, but also established a foundation for future applications.
Because the price and tax for jet fuel vary by location, an airline can save millions of dollars per year
by carefully planning its fuel purchasing and loading. This requires collecting information on the loading of every
aircraft, including the number of passengers and the cargo carried. Two airline groups, Flight Dispatch and Fuel
Purchasing, are involved in the process. Flight Dispatch deals with immediate decisions, such as how many pounds
of fuel to put on a specific airplane. Fuel Purchasing deals with issues such as futures, current prices and so on.
Previously, Fuel Purchasing would typically fax fuel price information to Flight Dispatch, which would then
use the data to make optimizing decisions. Now, purchasing information goes to a Fuel Tankering System application
that converts this information into fueling recommendations that dispatchers can act upon directly. Likewise, operational
information in the form of actual fuel loading is sent to Fuel Purchasing in order to maintain accurate fuel inventories.
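To make the idea of turning prices into a fueling recommendation concrete, here is a deliberately simplified sketch. It is not the Fuel Tankering System's logic, which the article does not describe. It assumes a single rule: carrying extra fuel burns part of that fuel in flight, so buying at the origin pays off only if the destination price exceeds the origin price after that penalty. The `Leg` fields, the burn fraction and all prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Leg:
    origin_price: float         # price per pound at the departure station, tax included
    destination_price: float    # price per pound at the arrival station, tax included
    carry_burn_fraction: float  # assumed share of extra fuel burned just to carry it


def tanker_saving_per_pound(leg: Leg) -> float:
    """Saving per pound of fuel that arrives still unburned at the destination."""
    # To land with 1 lb of tankered fuel, 1 / (1 - burn) lb must be loaded at the origin.
    cost_at_origin = leg.origin_price / (1.0 - leg.carry_burn_fraction)
    return leg.destination_price - cost_at_origin


def recommend(leg: Leg) -> str:
    """Turn raw price data into an instruction a dispatcher can act on."""
    return "tanker" if tanker_saving_per_pound(leg) > 0 else "buy at destination"


# Illustrative numbers only.
wide_gap = recommend(Leg(origin_price=0.10, destination_price=0.13, carry_burn_fraction=0.10))
narrow_gap = recommend(Leg(origin_price=0.10, destination_price=0.105, carry_burn_fraction=0.10))
```

With these made-up figures, `wide_gap` is "tanker" and `narrow_gap` is "buy at destination": a small price difference is eaten up by the cost of carrying the fuel. A real recommendation would also have to respect aircraft weight limits and the passenger and cargo loads mentioned above.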
"In order to address the tankering problem you had to know all the relevant events associated with the
flight," said Ron Thieme, president of consulting firm SCG Partners Inc., Nashua, N.H., which helped Delta
implement its new systems. "We ended up putting together the infrastructure that captures the major events
from [airline] operations -- from all the flights and all the ships. We ended up with a real-time model of the
current state of the fleet."
The project required the use of relatively new messaging technology. As a result, middleware was built to move
data from the airline reservation systems -- based on an IBM operating system called Transaction Processing Facility
(TPF) that is widely used by airlines -- into the distributed (Unix) systems that would host the tankering application.
While the goal was to simply save money in a "back-room" function, the project also offered relatively
low risk with a good upside potential.
The new application embodies some important concepts. From a business perspective, the system links two groups
that had been separated not only organizationally but in terms of the level of data abstraction they dealt with.
It was also able to take data from one group and turn it into information useful to the other. This involved not
only transforming the information, but generating new information. For example, no amount of format transformation
changes the price of fuel. Creating a fueling recommendation, however, means collecting and analyzing a great deal
of data to create "actionable" information.
Getting the data to the people
Drawing on experience from its Fuel Tankering System, Delta then rolled out the Flight Progress Events System
(FPES). FPES provides near-real-time delivery of flight events to "consumers" -- programs that need to receive
the events -- and a database of current flight information used in activities such as supporting Delta's Web site.
FPES collects data from all over the airline. For the most part, the information comes out of the airline's
TPF-based Operational Support System (OSS); however, gate information may come from some airports through other
systems. Passenger information is collected by the reservation system. Most of the planes generate and transmit
their own landing time, which gets to FPES via OSS. Delta wrote its own middleware in order to move data out of
TPF, while MQSeries is used in various other applications. The FPES server runs on a set of Hewlett-Packard servers
running HP-UX and Oracle, with systems duplicated for purposes of high availability.
Consistent, scrubbed information of record is broadcast to consumers through the system's "event-push"
capability. Data is kept in memory for quick access, although there is supporting information in databases from
which the current in-memory state can be reconstructed.
"The server keeps in memory the exact state of flights and the whole airline from an operational point
of view. For example, there is an application that uses the information provided by the server to tell how well
Delta is doing in terms of operational reliability," said SCG Partners' Rick Lawhorne, who designed the system.
"The software provides what we call a 'service' that maintains all the data needed to support immediate calculation
of operational status. It takes only milliseconds to calculate the information that we maintain in real memory."
There are two applications that demonstrate the value of FPES: Flight Status Monitor (FSM) and Passenger Rebooking.
FSM, like FPES, is based on a server that collects, interprets and redistributes information about the state of
operations. It displays the status of each flight, updating the display in real time as new information is received.
FSM is deployed at ramp towers at Delta hubs and can be accessed remotely.
While FPES compares each new message with the previous one -- and creates new, individual change event messages
before distributing them out to the consumer applications -- FSM continually assesses the significance of incoming
events. "As part of the flow of data and events, the server can correlate certain things and detect exceptions,"
said Lawhorne. "The airline people do not need to see that a plane is running on time. What matters to them
is when things do not go right." An example, he said, is when an inbound plane is late and, given the necessary
ground service time, will not make its scheduled departure time.
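The two behaviors just described, deriving change events from successive messages and flagging a late inbound aircraft, can be sketched as follows. This is an illustration under stated assumptions, not the FPES or FSM code: the function names, the snapshot fields and the 45-minute ground time are hypothetical.

```python
from datetime import datetime, timedelta


def change_events(previous: dict, current: dict) -> list[dict]:
    """Compare two snapshots of one flight and emit one event per changed field."""
    return [
        {"field": key, "old": previous.get(key), "new": value}
        for key, value in current.items()
        if previous.get(key) != value
    ]


def departure_at_risk(inbound_eta: datetime, ground_time: timedelta,
                      scheduled_departure: datetime) -> bool:
    """True when a late inbound aircraft cannot be serviced before its next departure."""
    return inbound_eta + ground_time > scheduled_departure


previous = {"gate": "A12", "eta": datetime(2001, 6, 11, 14, 0)}
current = {"gate": "A12", "eta": datetime(2001, 6, 11, 14, 40)}

events = change_events(previous, current)  # only the ETA changed
alert = departure_at_risk(
    current["eta"],
    ground_time=timedelta(minutes=45),
    scheduled_departure=datetime(2001, 6, 11, 15, 15),
)
```

Here the unchanged gate produces no event, and the new ETA plus 45 minutes of ground service runs past the 15:15 departure, so `alert` is `True`. An on-time arrival would raise nothing, which is the point: staff see only what needs attention.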
Within the FPES server that supports these applications is a business-event service that filters information
before distributing it. All day long each "consumer" or client application keeps the server up-to-date on the data it wants.
The server selects the data each client has asked for and sends it to that client.
All of this data will eventually go through MQSeries queues. There are several clients that already get all
their data through events in MQSeries queues -- these are clients that want guaranteed delivery of each and every
message. In addition, anywhere a given system is the System-of-Record for another system, queues are put in place
to enable guaranteed delivery of all messages.
"The goal here is to use exceptions and alerts to keep everyone updated. This brings the focus right in
to solving the problem rather than trying to find out what the problem is or creating new problems," Lawhorne
explained. "It also enables the airline to make the associations that can tip you off to a problem that's
coming down the road so that you can solve it proactively or bypass it altogether."
Passenger Rebooking is the second application. It provides the best possible assistance to passengers when their
flight plans are disrupted. The application also combines up-to-date actual operational information (not just schedules
or plans) with passenger data from reservations systems to inform employees about passenger requirements. In the
future, this application may even have the ability to hand passengers revised tickets (if required) as they step
off a late-arriving connecting flight.
However, reacting to problems by rebooking is only part of what the application does. "Even before it is
time to rebook someone, the application supports the operational planning process. If a flight dispatcher knows
[they are] one plane short and must cancel one flight, with the help of the rebooking application they can look
at the impact on the customers. In addition, [they can make] sure the crew and the aircraft are available,"
said a former project manager.
Previously, judgments were made based on purely operational questions, such as which plane had the fewest
passengers. However, this might cause some passengers to be delayed for many hours. There are also some passengers
who should not be "seriously inconvenienced," for example, passengers with a medical condition or unaccompanied
minors. With the Passenger Rebooking application, this type of passenger information is now available to decision makers.
The design of FPES and the other applications is based on an object model of an airline. The solution involves:
- Data represented via the object model;
- A set of defined application "services";
- A service that receives a wide variety of events and maintains an "operational state of the airline"
that generates alert messages whenever exceptional events or situations occur;
- A real-time flow of events to represent real events and to communicate information such as changes in airline
state across the distributed environment; and
- A middleware infrastructure to transmit this information.
"The object concept is a good meta language with which to think about data and the connections between
data," said SCG Partners' Thieme. "It is much more intuitive to talk about objects -- such as flights,
segments, itineraries and gates -- than to talk about relations between rows in a database. However we implement
-- even if we end up using relational tables -- we find it best to design in terms of objects."
Once objects are understood individually, noted Thieme, you must think through the relationship between them.
This is where the object model must be carefully designed. In the airline industry, he explained, there are complex
objects with many connections between them. Thus the data is navigational by nature, and the real value is in the
connections between the objects. "We have a passenger connected to a flight, connected to a ship connected
to a schedule, and all of those can become important in an operational decision-making process. It would be difficult
to design databases and formulate SQL statements that will gather all that you need to know at any time with good
performance," said Thieme. "But with an object system, you can easily navigate the links through all
that data." This does not necessarily mean that the "object system" is an object database. At Delta
(and elsewhere), the object system is maintained in real memory with adequate backup.
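A small sketch shows what "navigating the links" looks like when the objects live in memory. The classes below are hypothetical and far simpler than a real airline model; they follow only the chain Thieme names (passenger, flight, ship, schedule), and the tail number and times are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Schedule:
    departure: str
    arrival: str


@dataclass
class Ship:
    tail_number: str
    schedule: Schedule


@dataclass
class Flight:
    number: str
    ship: Ship
    # Back-links are excluded from repr and comparison to avoid walking the cycle.
    passengers: list["Passenger"] = field(default_factory=list, repr=False, compare=False)


@dataclass
class Passenger:
    name: str
    flight: Flight
    unaccompanied_minor: bool = False


def book(name: str, flight: Flight, minor: bool = False) -> Passenger:
    """Create a passenger and link it to the flight in both directions."""
    passenger = Passenger(name, flight, minor)
    flight.passengers.append(passenger)
    return passenger


def protected_passengers(flight: Flight) -> list[str]:
    """Passengers who should not be seriously inconvenienced by a cancellation."""
    return [p.name for p in flight.passengers if p.unaccompanied_minor]


ship = Ship("N000XX", Schedule("ATL 14:05", "ORD 15:10"))
flight = Flight("325", ship)
alex = book("Alex", flight, minor=True)
book("Sam", flight)

# Navigate passenger -> flight -> ship -> schedule by following links.
arrival = alex.flight.ship.schedule.arrival
```

Each hop is a direct reference, so reaching the schedule from a passenger takes no join and no query. The same links run the other way: `protected_passengers(flight)` starts at the flight and walks to its passengers.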
Delta's FPES service, for example, maintains the state of the airline in memory and updates the information
as new data is received. In addition to maintaining the state, the service generates events and pushes them out
to consumers. Of course, a database in the server is also kept up-to-date for internal purposes -- to restart the
system if it fails, and to record information for later historical access -- but applications seeking information
will not obtain it from the database directly.
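The division of labor between memory and database can be illustrated with a short sketch. It assumes one simple recovery scheme, an append-only record of events that is replayed on restart; the article does not say how Delta's database is actually organized. The `FlightState` class and event shape are hypothetical, and a Python list stands in for the database.

```python
class FlightState:
    """In-memory state, with an append-only journal standing in for the database."""

    def __init__(self, journal: list[dict]) -> None:
        self.journal = journal
        self.flights: dict[str, dict] = {}
        for event in journal:       # restart: rebuild memory from the record
            self._apply(event)

    def _apply(self, event: dict) -> None:
        self.flights.setdefault(event["flight"], {}).update(event["changes"])

    def handle(self, event: dict) -> None:
        self._apply(event)          # consumers are served from memory
        self.journal.append(event)  # the record exists for restart and history


journal: list[dict] = []
live = FlightState(journal)
live.handle({"flight": "DL325", "changes": {"status": "boarding", "gate": "A12"}})
live.handle({"flight": "DL325", "changes": {"status": "departed"}})

# Simulate a failure and restart: a fresh service rebuilds its state from the journal.
restarted = FlightState(journal)
```

After the restart, `restarted.flights` matches `live.flights`. No consumer ever reads the journal; it serves only the service that owns it.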
"To save all the information in a database in a way suitable for another program to retrieve it -- given
performance requirements -- does not make sense, especially given our current direction regarding services interfaces,"
said SCG Partners' Lawhorne. "Only the applications that provide the services will access the database. We
build up the relationships between the objects when we first load the information into the memory model and maintain
them in memory. We can do a much better job of delivering the information if we are not constrained by having to
keep everything up-to-date in the databases."
For example, said Lawhorne, if a passenger indicates that they are boarding a plane by "swiping" their
boarding card through an electronic reader at the jetway, this event must immediately show up on the gate agent's
counts. "If we had to make trips into the database with each event the same way we do with reservations [updating
and committing transactions at the same time we generate events], we would slow everything down," he noted.
Implementing using messaging
While designed according to object concepts, Delta's solution was implemented using messaging due to performance
concerns and other factors. The project team -- consisting of employees from Delta and SCG Partners -- provided
application programmers with an interface to navigate around the objects in the model, as well as to call methods
to obtain needed information.
The object model is also maintained throughout the system using messaging. A master service maintains the entire
object model, and clients or consumers are kept up-to-date as required. In general, the master service keeps its
own object model up-to-date with incoming events and forwards some of those same events to clients or consumers,
depending on the interest they express by subscribing. The latter can subscribe to events pertaining to any object(s);
the master distributes to subscribers all events relevant to the objects they subscribe to. Thus, if a gate agent
needs to know all of the events pertaining to arriving Flight 325, the agent's application can subscribe to those
events. The application also receives a complete copy of the object, plus a subscription to subsequent relevant
events. As subsequent events are received, the application can keep its own copy of the object up-to-date.
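The subscription pattern above, a full copy first and incremental events afterward, can be sketched as follows. The `MasterService` and `Client` classes, the object identifiers and the field names are hypothetical, and direct method calls stand in for the messaging layer.

```python
import copy
from collections import defaultdict


class Client:
    """A consumer that keeps its own copy of the objects it subscribes to."""

    def __init__(self) -> None:
        self.replica: dict[str, dict] = {}

    def receive(self, object_id: str, changes: dict) -> None:
        self.replica[object_id].update(changes)


class MasterService:
    """Maintains the full object model and forwards events to subscribers."""

    def __init__(self) -> None:
        self.objects: dict[str, dict] = {}
        self.subscribers: dict[str, list[Client]] = defaultdict(list)

    def subscribe(self, client: Client, object_id: str) -> None:
        # The subscriber first gets a complete copy of the object...
        client.replica[object_id] = copy.deepcopy(self.objects.get(object_id, {}))
        # ...and from then on receives every event relevant to it.
        self.subscribers[object_id].append(client)

    def on_event(self, object_id: str, changes: dict) -> None:
        # Update the master model first, then forward to interested clients.
        self.objects.setdefault(object_id, {}).update(changes)
        for client in self.subscribers.get(object_id, []):
            client.receive(object_id, changes)


master = MasterService()
master.on_event("flight-325", {"status": "en route", "gate": "B7"})

agent = Client()
master.subscribe(agent, "flight-325")                # snapshot arrives here
master.on_event("flight-325", {"status": "landed"})  # forwarded to the agent
master.on_event("flight-101", {"status": "delayed"}) # not subscribed, not forwarded
```

The agent's replica of Flight 325 ends up identical to the master's copy, and it never hears about Flight 101. Sending the snapshot at subscription time means a client that joins late does not need the event history to catch up.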
"Structuring your information systems, such that they are dependent upon events, allows you a more convenient
way to integrate disparate information systems," said Thieme. "Once you identify the events you are interested
in and construct a front end to your legacy information systems -- so that the latter can generate, receive and
understand these events -- integration becomes easier to achieve."
Said Thieme, "At Delta, these events correspond to real elements in the environment. Watching the wheels
hit the runway and then seeing that event show up on the screen gives you that visceral feedback that everyone
immediately gravitates toward and understands."