Web services architectures: Easier said than done
- By Mayank Prakash
- August 30, 2002
Web services can provide an open and interoperable development framework, but
the technology requires an architecture unlike anything built before.
Every few years a new technology bursts upon the scene with great fanfare,
promising to cure world hunger and fill the hole in the ozone layer. Artificial
intelligence, object-oriented programming, data warehousing and data mining, the
Internet, Java and XML have each been touted as a savior. Initially, vendors
introduce what they claim is the "next big thing," seeking to generate as much
excitement as possible, and the publicity engines dutifully join in. Early
adopters flock to the new technology with great expectations, only to become
disillusioned as faith collides with reality. A backlash inevitably ensues and
soon every mention of the would-be panacea technology evokes derision. A lucky
few technologies settle into a suitable niche once their real potential and
limitations are understood, although many inevitably slide into
oblivion.
With all that said, it is fair to say this is the era of Web services hype.
As a marketing phenomenon, the Web services hype is riding on a wave of XML
hype, which in turn is helping to promote Microsoft's .NET hype. From
inter-enterprise integration to software as a service, to whatever else suits
your fancy, you can find someone promoting Web services as the ideal solution.
As you might have guessed, I believe the truth lies somewhere between the hype
and the inevitable backlash. While it is not a panacea, Web services technology
does provide powerful solutions to a class of problems that involve multiple
cooperating agents working together to solve business problems.
Web services defined
If you ask several technologists how they define Web services, you are likely
to get more answers than there are
respondents. But you will find some common elements in their answers, such as a
heavy reliance on XML, and the use of TCP/IP-based standard protocols as the
transport medium to enable distributed computing agents to collaborate with each
other. Another key attribute that is often left out is that each Web service has
a unique and standard address. Together, these features allow any consumer of a
Web service to access it without having to worry about communication protocols,
message formats and the like. The innovation in Web services is not in providing
a framework for building distributed applications (indeed we have many worthy
precedents for that), but in providing an open and highly interoperable
framework. Like the Web and Internet e-mail, Web services enable providers and
consumers of services to be completely independent of one another's
implementation technology and platforms. The major weakness of earlier
frameworks such as DCE, CORBA, DCOM and RMI was that each could operate only in
its own environment.
Of course, this flexibility comes with a price. The important characteristic
of the earlier frameworks that differentiates them from Web services is the
specification of a proprietary transport protocol and a proprietary data
transfer format. Web services ride on a pre-existing and ubiquitous transport
protocol, and utilize the ubiquitous XML as the data transfer format. By using
proprietary protocols and data transfer formats, a distributed application
framework is able to customize and tune itself to achieve high performance and
build in a higher level of services and richer semantics. Features like security
and minimizing network communications can be designed into the protocol itself.
Web services, however, do not have control over these elements and can only
assume a least-common denominator of features from the allowable open protocols.
This implies that the overall structure will be less well-tuned and several
essential services have to be provided as additional higher-level protocols.
At the risk of repeating a well-worn cliche, the good thing about standards
is that there are so many to choose from. Web services technology is no
exception. The three main contenders for the core protocol of Web services are
SOAP, the most familiar; its simpler progenitor, XML-RPC; and REST (a term
that describes the architecture of the Web). REST uses HTTP as the transport
protocol and provides a limited set of verbs (the standard set that comes with
HTTP, such as GET, POST and PUT) that operate on an arbitrary set of nouns.
XML-RPC also uses HTTP as its transport protocol and specifies a procedure
call-value return standard using XML as the data transfer format. SOAP does not
specify
any particular transport protocol, although HTTP is the most common, with SMTP
poised to become another popular option. It also allows for great flexibility in
the types of messages that can be communicated, which are not restricted to a
strictly procedure-call value return model. SOAP is also accompanied by an
alphabet soup of auxiliary protocols (WSDL, UDDI, WS-Security, SAML, ebXML and
WSCI, among others) that provide additional levels of functionality.
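To make the difference between the procedure-call model and SOAP's envelope
more concrete, here is a minimal sketch in Python using only the standard
library. The method name getRoomRate, its arguments and the urn:example:hotel
namespace are hypothetical and chosen purely for illustration; the envelope
namespace is the one defined by SOAP 1.1.

```python
import xml.etree.ElementTree as ET
import xmlrpc.client

# XML-RPC: a procedure name plus positional parameters, serialized as XML.
xmlrpc_call = xmlrpc.client.dumps(("BOS-001", 2), methodname="getRoomRate")

# Decoding recovers the parameters and the method name.
params, method = xmlrpc.client.loads(xmlrpc_call)

# SOAP 1.1: an Envelope wrapping a Body. The body payload is application XML
# and does not have to follow a procedure-call shape.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
APP_NS = "urn:example:hotel"  # hypothetical application namespace


def build_soap_request(hotel_id, nights):
    """Wrap a hypothetical getRoomRate payload in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}getRoomRate")
    ET.SubElement(request, "hotelId").text = hotel_id
    ET.SubElement(request, "nights").text = str(nights)
    return ET.tostring(envelope, encoding="unicode")


soap_call = build_soap_request("BOS-001", 2)
```

Printing xmlrpc_call and soap_call side by side shows how much of each message
is protocol framing rather than application data.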
A detailed description of these protocols and a comparison between them is a
topic for another article. This article focuses mostly on SOAP, because it is
the mind-share leader.
Top 13 architectural guidelines for designing Web services
The technologies that make up Web services entail a paradigm shift. Architects
of distributed apps based on them need to follow a new set of design
guidelines. Web services leverage existing transport protocols and
data transfer standards. For the most part, these standards were not originally
designed to support general distributed applications. This gives rise to some
key characteristics of Web services-based apps:
* Loose coupling between partners;
* High overhead per interaction;
* Standardization of interaction protocol, but not content; and
* Fundamental services, such as security, directory and messaging, are outside
the scope of SOAP and must be specified separately through additional related
standards.
The distributed and multivendor nature of Web services-based applications
requires a foundation based on a well-thought-out architecture. However, the
principles underlying the architecture of Web services-based applications
differ significantly from those of typical n-tier applications. The following
salient
points present some practical guidelines for building applications based on this
new paradigm.
1. Select the appropriate Web services protocol.
While the focus of this article is SOAP, SOAP is not the ideal choice in all
cases. For example, the W3C provides an XSLT transformation service. You simply
construct a URL that addresses the transformation service site, include the
URLs of the source XML document and the XSL style sheet as query parameters,
and the transformation service returns the transformed document (try it at
www.w3.org/2000/06/webdata/xslt?xslfile=<URL of XSLfile>&xmlfile=<URL of XMLfile>&transform=Submit).
This is an example of the REST approach in action. For such a simple service,
you do not need the complexity of SOAP and the associated infrastructure. For
more complex types of interactions, however, you will need the support provided
by XML-RPC or even SOAP. The important point is to match the Web services
protocol to the needs of the application.
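As a rough illustration of how little machinery such a REST-style call needs,
the following Python sketch only assembles the request URL. The service path
and the xslfile, xmlfile and transform parameter names come from the example
above; the http scheme and the two example.org document URLs are assumptions
made for illustration, and the sketch does not check whether the service is
still available.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Service address as given in the article's example (scheme assumed).
XSLT_SERVICE = "http://www.w3.org/2000/06/webdata/xslt"


def transformation_url(xml_url, xsl_url):
    """Build the GET URL that asks the service to apply xsl_url to xml_url."""
    query = urlencode(
        {"xslfile": xsl_url, "xmlfile": xml_url, "transform": "Submit"}
    )
    return f"{XSLT_SERVICE}?{query}"


url = transformation_url(
    "http://example.org/data.xml",   # hypothetical source document
    "http://example.org/style.xsl",  # hypothetical style sheet
)
```

The whole "protocol" is an HTTP GET on that URL: no envelope, no toolkit and no
interface description are involved.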
2. Keep the server stateless.
Despite its name (Simple Object Access Protocol), SOAP is an RPC protocol and
is not object oriented. There is no notion of the identity of the agent
providing the service within the protocol itself. If you want the ability to
send a sequence of messages to the same object on the server, with each message
possibly altering the state of the server object, you have to define a
higher-level protocol and the appropriate facility on each end to track object
identity. This can increase the application's complexity as well as the
potential for error. Performance and flexibility are also at stake: a stateless
server is inherently easier to scale
and make fault-tolerant by adding redundancy.
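One way to picture a stateless server is a handler whose output depends only on
the request it receives. The Python sketch below is a simplified illustration
under that assumption; the request fields (hotel_id, itinerary) are
hypothetical, and the two "replicas" are simply two references to the same
function standing in for redundant server instances.

```python
def add_to_itinerary(request):
    """Stateless handler: everything needed arrives in the request.

    The caller sends the itinerary so far and gets the updated one back,
    so the server keeps no per-client object between calls.
    """
    itinerary = list(request.get("itinerary", []))
    itinerary.append(request["hotel_id"])
    return {"itinerary": itinerary}


# Because no conversation state lives on the server, either replica can
# handle any request in the sequence.
replica_a = replica_b = add_to_itinerary

first = replica_a({"hotel_id": "BOS-001"})
second = replica_b({"hotel_id": "NYC-042", "itinerary": first["itinerary"]})
```

The cost of this design is that the state travels with every message, which
has to be weighed against the messaging overhead of the application.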
This is not to say that applications that involve multiple interactions
between agents should not be developed using Web services, but that doing so
requires careful analysis. Not all applications need to scale to millions of
interactions per day, but some may need to implement more complex interaction
scenarios. For such applications, maintaining the state of the conversation is
required. There are a number of specifications in various stages of development
that aim to address the different aspects of interaction scenarios spanning
multiple interactions, such as ebXML, Web Services Choreography Interface
(WSCI), Business Process Modeling Language (BPML), Web Services Flow Language
(WSFL) and XLANG, but the jury is still out on which of these will become the
standard, or if we are doomed to have to cope with multiple standards in this
space.
3. Keep an eye on messaging overhead.
In general, network communication is the largest bottleneck in a distributed
application. This is even more true with SOAP because of HTTP's
stateless nature and XML's verbosity. The preferred approach is to design the
communication between the consumer and the producer of your services so that all
related information is transmitted in one message, rather than in many
fine-grained messages.
For example, suppose you are designing a Web services-based solution that
allows a travel agency to collect and display room availability from a range of
hotels. The client provides some search criteria and you show them a list of the
matching properties. The user can opt to get more details on an individual
property before making a selection. Once the client makes a selection, a
reservation is made. But should the application be designed so that the first
call returns all details about the matching properties, or should they be
retrieved only when the user demands to see the additional details? The wrong
answer here can cost you a lot in performance.
If a search returns a large number of properties, then returning the details
on all of them could be a drain on performance, especially since the user is
only likely to ask for details on a small subset of them. In this case, a
two-message protocol is likely to be better. But suppose that for each property
there are multiple levels of detail that you display to the user. At the first
level, you display only the description of the available matching rooms. At the
next level, you also display the facilities available at the hotel. A user might
also want to see the recreational facilities and sightseeing venues in the
vicinity. It is preferable to retrieve these details in one shot even if you
display them in tiers and only on demand. In each case, a careful analysis of
the messaging overhead relative to the data transferred should be performed.
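The trade-off can be estimated with simple arithmetic before any code is
written. The Python sketch below uses a deliberately crude cost model: a fixed
overhead per message plus the time needed to move the payload. Every number in
it (0.2 seconds per message, 100,000 bytes per second, the payload sizes, the
200 matching properties and the 3 detail views) is an assumption chosen for
illustration, not a measurement.

```python
def transfer_time(messages, payload_bytes, overhead_s=0.2, bytes_per_s=100_000):
    """Estimated seconds: fixed cost per message plus time to move the payload."""
    return messages * overhead_s + payload_bytes / bytes_per_s


# Case 1: a search matches 200 properties; the user opens details for 3.
SUMMARY, DETAILS = 200, 5_000  # assumed bytes per property
eager = transfer_time(1, 200 * (SUMMARY + DETAILS))       # everything up front
lazy = transfer_time(1 + 3, 200 * SUMMARY + 3 * DETAILS)  # details on demand

# Case 2: one property with three tiers of detail, about 2,000 bytes each.
TIER = 2_000
one_shot = transfer_time(1, 3 * TIER)  # all tiers in a single message
per_tier = transfer_time(3, 3 * TIER)  # one message per tier
```

Under these assumptions the on-demand design wins the first case (roughly 1.35
seconds against 10.6), while the single message wins the second (roughly 0.26
seconds against 0.66). Different overhead and bandwidth figures can reverse
either result, which is why the analysis has to be redone for each application.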
4. Choose the appropriate implementation layer.
The Web services provider can live in many different architectural layers of the
application, so there are several factors to consider. Are you providing an
interface to existing legacy back-end apps? Are you trying to leverage an
existing presentation layer app? (In the J2EE terminology, you can design your
Web services to expose the JSP/servlet layer, the session bean layer or the
entity bean layer. In more generic terms, this corresponds to Web services in
the presentation layer, the business logic layer or the persistence layer.)
In general, the deeper the architectural layer you expose through your Web
service, the more you sacrifice scalability, the greater the security
requirements are, and the deeper the dependency between your Web service and the
consumers becomes. At the same time, you also provide consumers with greater
control over how they use your application's services. This is appropriate if
the consumers are trusted partners and the risks of the greater exposure are
outweighed by the benefits.
5. Choose the right toolkit.
Application server vendor-provided toolkits are easier to integrate with the
corresponding application server infrastructure, but they also tend to lock you
in to that particular vendor. JAX pack has promise, but with many missing
pieces it is not yet a mature specification. Apache SOAP makes the development
of SOAP-based services easier through a vendor-neutral API. The upcoming Apache
Axis API is a
re-architecture of Apache SOAP and a dramatic improvement. It promises to make
developing Web services easier and substantially hides the complexities of the
underlying protocols. Among proprietary toolkits, BEA WebLogic Workshop and
Microsoft's VS .NET raise the ease of building Web services to a new level.
The different toolkits also provide various abstractions of Web services and
even implement different subsets of the SOAP specification. It is therefore
important to ensure that the toolkit you choose implements the features of the
SOAP specification you need at the proper level of abstraction.
6. Do not generate the SOAP interface from implementations (VS .NET syndrome).
VS .NET greatly simplifies the creation of Web services. One simply adds a
WebMethod attribute to a method and VS .NET generates all of the code necessary
to expose it as a Web service. Yet this methodology defeats the whole purpose
of Web services, which is to hide the implementation of a service completely
behind an XML-based interface, because VS .NET generates the interface from the
implementation. While it is tempting to do so, avoid creating Web services as
an afterthought in VS .NET; design the interface first and then implement it.
7. Design for interoperability.
In theory, Web services are about interoperability. But in reality, there are
several reasons
why interoperability might be an issue. Typically, you will use a SOAP toolkit
to create and consume Web services. Your chosen toolkit might only implement a
subset of the specification, making it incompatible with consumers or providers
created with another toolkit. Additionally, the SOAP specification leaves some
details optional and up to the implementation, so different toolkits can
legitimately choose to ignore these features. Where the language of the
specification is ambiguous, toolkit implementers might interpret it differently.
Finally, the toolkit might simply have bugs in its implementation. All of this
can lead to interoperability nightmares.
Of course, life is simple if all producers and consumers can be created using
the same version of the toolkit. You might not be so lucky. In any event, it is
always prudent to ensure that your services work well in a heterogeneous
environment. The best approach is to test, test and test your services against
various toolkits to ensure they work correctly. You can also participate in the
SOAPBuilders online forum (http://groups.yahoo.com/group/soapbuilders)
to discuss these issues and get help from other developers.
8. Plan on implementing security manually.
SOAP does not address security, except for allowing SSL (HTTPS) as a
transmission protocol. There are a number of additional solutions being
proposed and developed, but no universally accepted standards have emerged. SSL
allows for digital certificate-based authentication and encrypted
communications between client and server, which requires a PKI to be in place.
Other means of authentication need to be implemented manually, at least in the
short term.
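As one example of what "manual" authentication can look like, the Python sketch
below signs each message with a secret shared between the two partners and
verifies the signature on receipt. This is a generic HMAC technique, not
something prescribed by SOAP or by any of the proposed standards; the secret,
the message body and the function names are all hypothetical. It authenticates
the sender and detects tampering but does not encrypt anything, so the message
would still travel over SSL.

```python
import hashlib
import hmac


def sign(message: bytes, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 tag that the sender attaches to the message."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(message: bytes, tag: str, secret: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(message, secret), tag)


# Hypothetical secret agreed out of band; never hard-code one in practice.
SECRET = b"shared-secret-agreed-out-of-band"
request_body = b"<reserve><hotelId>BOS-001</hotelId></reserve>"
tag = sign(request_body, SECRET)
```

The receiver calls verify(request_body, tag, SECRET) and rejects the request if
it returns False. Where the tag is carried (an HTTP header or a SOAP header
entry, for instance) is a convention the partners must agree on.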
9. Decide between asynchronous and synchronous services.
Most RPC mechanisms are designed to handle synchronous
interactions. However, it is often necessary to engage in asynchronous
interactions. The communication might not need a response (for example, a
notification event), the processing of the request might take minutes or even
days (for example, process and ship an order), or the response might need to be
sent to a third party (for example, a workflow app). Message-based services
allow for a very loosely coupled and robust design when real-time response is
not required.
Note that SOAP, as typically used over HTTP, is by nature synchronous: every
request generates a response. In a message-based architecture, that response is
merely an acknowledgement that the message has been received and that the real
reply will be forthcoming (or not, as the case may be). Note also that the
reply need not be returned to the original requester; it can go to a third
party instead. Either the request itself can carry the address of the reply
recipient(s), or the sender and receiver might have established the recipient
ahead of time. The reply can also be sent to multiple recipients (for example,
an order fulfillment service and an audit trail service). This can be used to
build fairly complex workflow-type scenarios. But do not get carried away --
this can become a debugger's nightmare.
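The pattern of an immediate acknowledgement followed by a later result
delivered elsewhere can be sketched in a few lines of Python. In this
simplified, in-process illustration a queue stands in for the messaging
infrastructure and a dictionary of inboxes stands in for the recipients; the
field names (id, item, reply_to) and the recipient addresses are hypothetical.

```python
from collections import deque

pending = deque()  # requests accepted but not yet processed
inboxes = {}       # recipient address -> list of delivered results


def submit(request):
    """Accept a request and return only an acknowledgement."""
    pending.append(request)
    return {"status": "accepted", "id": request["id"]}


def process_pending():
    """Do the real work later and deliver the result to every reply-to address."""
    while pending:
        request = pending.popleft()
        result = {"id": request["id"], "outcome": f"shipped {request['item']}"}
        for address in request["reply_to"]:
            inboxes.setdefault(address, []).append(result)


ack = submit({
    "id": "order-1",
    "item": "widget",
    "reply_to": ["fulfillment", "audit"],  # hypothetical recipient addresses
})
process_pending()
```

The caller receives only the acknowledgement; the result carries the request id
so that each recipient can correlate it with the original order.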
10. Incorporate error handling.
SOAP allows the server to return an indication of processing failure through
Fault elements in the response. While not as transparent as proper out-of-band
exception handling, it is still a useful mechanism and should be used to build
robust applications. A sound approach is to translate SOAP Fault responses into
Java exceptions on the client end, and to catch exceptions and return Fault
responses on the server end.
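The same two-way mapping can be sketched in Python, where Python exceptions
play the role that Java exceptions play above. The sketch assumes SOAP 1.1 (its
envelope namespace and the faultcode and faultstring children of the Fault
element); the SoapFault class, the helper names and the reserve_room operation
are hypothetical, and a real toolkit would normally do this work for you.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_NS)


class SoapFault(Exception):
    """Client-side exception carrying the fault code and fault string."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def fault_response(exc):
    """Server side: turn a caught exception into a SOAP Fault response."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    fault = ET.SubElement(body, f"{{{SOAP_NS}}}Fault")
    ET.SubElement(fault, "faultcode").text = "soap:Server"
    ET.SubElement(fault, "faultstring").text = str(exc)
    return ET.tostring(envelope, encoding="unicode")


def raise_for_fault(response_xml):
    """Client side: raise SoapFault if the response body contains a Fault."""
    root = ET.fromstring(response_xml)
    fault = root.find(f"{{{SOAP_NS}}}Body/{{{SOAP_NS}}}Fault")
    if fault is not None:
        raise SoapFault(
            fault.findtext("faultcode", ""), fault.findtext("faultstring", "")
        )
    return root


def reserve_room(hotel_id):
    """Hypothetical service operation that always fails, for demonstration."""
    raise ValueError(f"no rooms left at {hotel_id}")


# Server end: catch the exception and answer with a Fault instead of crashing.
try:
    reserve_room("BOS-001")
    response = "<ok/>"
except Exception as exc:
    response = fault_response(exc)
```

On the client end, passing response to raise_for_fault raises a SoapFault, so
calling code can use ordinary try/except handling rather than inspecting XML.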
11. Compose services together to create more complex interactions.
A single Web service can invoke other services to generate parts of the
response, which it then consolidates into a single response for its client.
While powerful, this technique has the potential to
slow down the responses. If you are composing services provided by several
different providers, debugging can be especially challenging.
There are several different topologies that can be built here. In a pipes and
filters paradigm, each service does some processing and passes the request on to
the next service for further work. As the services in the chain return, they
combine the response from their callee with their own response until the
original service consolidates all information and returns it to the caller.
In a dispatch-based architecture, the original service acts as a broker and,
based on the request, dispatches to one of several real services whose response
it returns. Finally, a coordinator-based service calls upon one or more services
to produce parts of the response and combines the responses into a single
response for the caller. These models can also be combined. To the client, it
all appears as a single service.
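The three topologies can be compared in a compact Python sketch. Plain
functions stand in for remote Web service calls, and the two leaf services and
their data are hypothetical; the point is only the shape of the calls, not a
realistic implementation.

```python
# Hypothetical leaf services, standing in for remote Web service calls.
def room_service(request):
    return {"rooms": ["double", "suite"]}


def facilities_service(request):
    return {"facilities": ["pool", "gym"]}


def dispatcher(request):
    """Dispatch: act as a broker and forward to exactly one real service."""
    routes = {"rooms": room_service, "facilities": facilities_service}
    return routes[request["kind"]](request)


def coordinator(request):
    """Coordinator: call several services and merge the parts into one response."""
    response = {}
    for service in (room_service, facilities_service):
        response.update(service(request))
    return response


def pipeline(request, stages):
    """Pipes and filters: each stage adds its part after its callee returns."""
    if not stages:
        return {}
    head, rest = stages[0], stages[1:]
    response = pipeline(request, rest)  # pass the request down the chain
    response.update(head(request))      # combine the callee's response with own
    return response
```

In each case the client makes a single call, for example
coordinator({"kind": "any"}), and never sees how many services took part.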
12. Design in testability.
By now, you are probably excited and ready to start building Web services that
will conquer the world.
However, it is important to remember that testing a Web services-based app is a
major challenge. Suppose you decided to build an application that invokes Web
services provided by several partners. How do you test it? You cannot test
against live services (as they might place real orders). Volume testing for
scalability becomes even more challenging. It does not work to test against
dummy services either, since then you are not testing the actual service.
For this reason, it is best to create a testing environment alongside the
production environment. Your partners can then test against the test
environment, which does everything exactly like the real service, but with
simulated side-effects. Insist that partners provide such an environment to you
as well. Given the lack of tools available to help you debug a distributed
application based on Web services architecture, it is a good idea to incorporate
extensive logging functionality within your services to help with the
troubleshooting.
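One way to get a test environment that behaves like the real service is to run
the same request-handling code in both and switch only the side effect. The
Python sketch below illustrates that idea together with basic logging; the
ReservationService class, its fields and the unimplemented _book method are
hypothetical placeholders, not a description of any particular product.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("reservation-service")


class ReservationService:
    """Same request handling in both modes; only the side effect differs."""

    def __init__(self, test_mode):
        self.test_mode = test_mode
        self.simulated = []  # side effects recorded instead of performed

    def reserve(self, hotel_id, nights):
        if nights < 1:
            log.warning("rejected request: hotel=%s nights=%s", hotel_id, nights)
            return {"status": "error", "reason": "nights must be at least 1"}
        if self.test_mode:
            self.simulated.append((hotel_id, nights))
        else:
            self._book(hotel_id, nights)
        log.info(
            "reserved hotel=%s nights=%s test_mode=%s",
            hotel_id, nights, self.test_mode,
        )
        return {"status": "confirmed", "hotel_id": hotel_id, "nights": nights}

    def _book(self, hotel_id, nights):
        # Placeholder for the real booking call; hypothetical and not implemented.
        raise NotImplementedError("real booking backend goes here")


sandbox = ReservationService(test_mode=True)
confirmation = sandbox.reserve("BOS-001", 2)
```

A partner testing against the sandbox exercises the same validation and gets
the same response format as in production, while the log records each request
for later troubleshooting.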
13. Choose your application judiciously.
It should be clear by now that the technology is still in its infancy, and it
is prudent
to test the waters before plunging in head first. The most important use of the
technology in the short term is as a framework for the integration of internal
enterprise applications, potentially on multiple platforms. This approach is
prudent for several reasons:
* It gives you control over the entire Web services app. You do not have to
deal with unknown problems cropping up because of bugs in other people's
services.
* Security issues are easier to deal with in an intranet environment.
* The benefits of integrating judiciously chosen internal apps can be great in
terms of operational efficiencies; the resulting visibility can be very helpful
in creating support to fund further projects.
While they do not introduce new concepts, Web services do introduce new
possibilities by exploiting standard and vendor-neutral technologies. However,
the characteristics of the base technologies require a new way of thinking about
applications. Building Web services-based applications places a premium on a
solid architectural foundation. Hopefully, these tips will help you to better
exploit the technology in your environment.
About the Author
Mayank Prakash is CTO at Accelera Software, a Newton, Mass.-based consulting firm.