Does XML give away the keys to the warehouse?
Minimization is an important aspect of security. In the supply chain of a goods
business, one doesn’t lump all the raw materials, financial instruments
and other paraphernalia of commerce in with the final goods in the retail store.
Besides other obvious problems, security would become impractical. Even the
most super superstore limits its contents to material that would normally be
expected to leave the premises in typical commerce.
For some reason, businesses have a hard time applying such a simple, pragmatic
approach to data management.
If you’re a developer or manager on a project channeling data to, say,
a commerce Web page, data security probably
keeps you up nights.
One infamous vulnerability behind such nightmares is SQL injection. The attacker
fills in normal commerce
forms with cleverly constructed strings designed to trick the database into
leaking sensitive information.
Recent analysis reveals how readily attackers can compromise entire databases
in this way. When the Web site
is but a thin layer over a super-sized enterprise database (an all too common
setup), the degree of vulnerability is especially high.
Accumulating data into ever bigger databases might make some aspects of development
and management easier,
but it makes things just as pleasant for the attacker.
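
To make the SQL injection scenario concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, the column names and the sample rows are invented for illustration; the point is only the contrast between pasting form input into the SQL text and passing it as a bound parameter.

```python
import sqlite3

def make_db():
    # Hypothetical in-memory store standing in for the enterprise database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, card TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("alice", "4111-0000"), ("bob", "5500-1111")],
    )
    return conn

def lookup_unsafe(conn, name):
    # Vulnerable: the form field becomes part of the SQL text itself.
    sql = "SELECT name, card FROM customers WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()

def lookup_safe(conn, name):
    # The form field is bound as data and can never change the query's shape.
    sql = "SELECT name, card FROM customers WHERE name = ?"
    return conn.execute(sql, (name,)).fetchall()

conn = make_db()
attack = "nobody' OR '1'='1"           # a "cleverly constructed string"
leaked = lookup_unsafe(conn, attack)   # the WHERE clause is now always true
blocked = lookup_safe(conn, attack)    # no customer has that literal name
```

In this sketch the unsafe lookup returns every row in the table, while the parameterized one returns nothing. Parameter binding closes this particular hole, but it does nothing about how much data sits behind the query in the first place.
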
Recently the specter of malicious input has come to the XML world. The so-called
XPath injection attack is
aimed at corporations that pack loads of information into XML files, which are
then processed for the Web
by XSLT or other XPath-based technologies. Again, clever input fools the processor
into returning more data than intended.
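
A rough sketch of how such an attack rewrites an XPath expression follows. The document layout, the element names and the two helper functions are hypothetical. Python's standard ElementTree implements only a subset of XPath, so the sketch shows the injected expression as text rather than evaluating it in a full XPath engine, then shows one possible defense: never letting user input become expression text at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical user file of the kind a Web front end might query.
DOC = ET.fromstring("""
<users>
  <user><name>alice</name><pin>1234</pin><account>A-100</account></user>
  <user><name>bob</name><pin>9999</pin><account>B-200</account></user>
</users>
""")

def build_unsafe_xpath(name, pin):
    # Vulnerable pattern: form fields are spliced straight into the expression.
    return f"//user[name='{name}' and pin='{pin}']/account"

def find_account_safe(root, name, pin):
    # Input is only ever compared as data; it never becomes XPath syntax.
    for user in root.iter("user"):
        if user.findtext("name") == name and user.findtext("pin") == pin:
            return user.findtext("account")
    return None

attack = "' or '1'='1"
injected = build_unsafe_xpath(attack, attack)
# injected is now:
#   //user[name='' or '1'='1' and pin='' or '1'='1']/account
# In XPath 1.0 "and" binds tighter than "or", so the predicate is always
# true and a full XPath processor would select every account in the file.

safe_result = find_account_safe(DOC, attack, attack)  # no match, returns None
```

Engines that support variable binding offer a similar protection without hand-written loops. Either way, the defense only limits what the expression can say, not what data is within its reach.
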
While I don’t claim to have foreseen XPath injection attacks, it does
strike me that this security problem is made
possible by practices that I and others have always discouraged. One problem
is the phenomenon of production
XML as database dump. Developers love to create titanic XML files, often as monolithic
dumps from databases.
Sometimes they deploy such monsters to servers susceptible to the cleverness of attackers.
If someone does compromise the server, they can pilfer one file and have your
information warehouse at their disposal.
Some vendors suggest encryption as a way to secure XML data, but XPath
injection illustrates how this is
but a cosmetic fix on a foundational problem. If XPath injection fools the processor
into returning sensitive data,
that data will be conveniently decrypted for the attacker. With a clever enough XPath
injection, attackers could end up
with the entire file decrypted for them.
Does this mean that using XML automatically gives black hats keys to your sensitive
data? Of course not.
Paradoxically, the solution is to embrace the fact that XML is open data, and
not to cower behind the false bastion
of obscurity. All data accessible through XPath at any time should be data you
expect any party to be able to
access, including attackers.
From an architecture point of view, this is a strong hint toward pipeline architecture
for your XML applications.
The idea behind XML pipelines is that rather than working monolithic XML data
sets through monolithic
processors, you break down your system into discrete stages, each of which
sees only a small window into
the overall data stream.
That way you can firewall sensitive data across pipeline stages so it’s
impossible for any action within one
stage (even actions as clever as XPath injection) to access more data than designed.
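
One way to picture such a firewall is a two-stage pipeline in which a back-office stage copies only the publishable fields into a fresh document, and the Web-facing stage never sees anything else. The sketch below is an assumption-laden illustration, not a prescribed design: the catalog layout, the PUBLIC_FIELDS list and both stage functions are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical master data, including fields that must never reach the Web.
MASTER = """<catalog>
  <product sku="W-1">
    <title>Widget</title><price>9.99</price>
    <cost>2.10</cost><supplier>Acme Secret Supply</supplier>
  </product>
  <product sku="G-2">
    <title>Gadget</title><price>19.99</price>
    <cost>7.40</cost><supplier>Acme Secret Supply</supplier>
  </product>
</catalog>"""

PUBLIC_FIELDS = ("title", "price")  # assumed list of publishable fields

def publish_stage(master_xml):
    """Back-office stage: build a new document holding only public fields."""
    src = ET.fromstring(master_xml)
    out = ET.Element("catalog")
    for product in src.iter("product"):
        pub = ET.SubElement(out, "product", sku=product.get("sku", ""))
        for field in PUBLIC_FIELDS:
            node = product.find(field)
            if node is not None:
                ET.SubElement(pub, field).text = node.text
    return ET.tostring(out, encoding="unicode")

def web_stage(public_xml, path):
    """Web-facing stage: any path, however hostile, runs against public_xml only."""
    return [el.text for el in ET.fromstring(public_xml).findall(path)]

public_xml = publish_stage(MASTER)
titles = web_stage(public_xml, ".//title")
costs = web_stage(public_xml, ".//cost")   # nothing to find: cost never crossed over
```

Because the sensitive elements are never copied into the second stage's input, no expression evaluated there can return them, whatever the attacker types into the form.
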
Astrophysics has the concept of a light cone, the portion of space-time open
to observation. Outside the light
cone, the observer is blacked out by the heftiest firewall in the universe:
the speed of light. You should design your
pipelines so that as much sensitive data as possible is outside the light cone
of each processing stage.
Uche Ogbuji is a consultant and co-founder
at Fourthought Inc. in Boulder, Colo.
He may be contacted at firstname.lastname@example.org.