Thoughts on the Office XML Reference Schemas
Earlier this week, Microsoft popped out a rather unexpected press release: "Microsoft
Announces Availability of Open and Royalty-Free License for Office 2003 XML
Reference Schemas". Read the press release, and you'll discover that
Microsoft worked with the Danish government to make these schemas publicly
available. The Danes put out their own press release on the subject.
Apparently the initiative was driven by The Danish
Software Strategy. That document makes fascinating reading in its own right.
The Danish government is charting a middle course between using proprietary and
open source software, emphasizing XML and other open standards rather than
political correctness for the software itself. This strikes me as eminently
Anyhow, back to the XML schemas. As you might already know, Word 2003, Excel
2003, and InfoPath 2003 can save documents as XML (InfoPath natively, Word and
Excel if you choose to save as XML). This means that any XML-generating or
XML-consuming tool can, at least theoretically, interoperate with Word and Excel.
Indeed, it's been possible to do this just by saving Word or Excel documents as
XML and then inspecting the results. What the new Microsoft
Office 2003 XML Reference Schemas download adds to the picture is
documentation. Right now the download contains information on WordProcessingML
(which was formerly called WordML in some Microsoft documents); the Excel and
InfoPath schemas are due to follow on December 5.
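
To make the "any XML-consuming tool" point concrete, here is a minimal sketch in Python that pulls the paragraph text out of a WordProcessingML file using nothing but the standard library. The namespace URI and the wordDocument/body/p/r/t element names are my reading of what Word 2003 emits, so treat them as assumptions and check them against the schemas in the download. The sample document and the paragraph_texts helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for Word 2003 XML; verify against the reference schemas.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# A tiny hand-written sample, standing in for a file saved from Word as XML.
SAMPLE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:body>
    <w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>Word</w:t></w:r></w:p>
    <w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>
"""


def paragraph_texts(xml_text):
    """Return one string per w:p element, joining the text of its w:t runs."""
    # Parse from bytes so the encoding declaration in the prolog is honored.
    data = xml_text.encode("utf-8") if isinstance(xml_text, str) else xml_text
    root = ET.fromstring(data)
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        runs = [t.text or "" for t in p.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


if __name__ == "__main__":
    for line in paragraph_texts(SAMPLE):
        print(line)
```

A real document saved from Word carries far more than this (styles, fonts, document properties), but the text still lives in the same run elements, which is why inspecting saved files was enough to get started even before the documentation appeared.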
Download and install the schemas (a process that was slightly annoying here,
since the installer is hard-coded to use drive c: and my development box doesn't
have a drive c:) and you'll get the applicable XML schemas, a help file
that documents everything, and a Word document that explains some common
scenarios. It really is everything you need to build up XML files that Word will
see as perfectly valid Word documents, complete with the schemas that you need
to validate your work.
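
Going the other direction, building a file for Word to open, might look something like the following Python sketch. Again, the namespace URI, the element names, and the mso-application processing instruction are assumptions based on what Word 2003 writes when you save as XML; the build_word_xml function is hypothetical. Python's standard library does not validate against XML Schema, so checking the result against the downloaded .xsd files would need a separate validating parser.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for Word 2003 XML; verify against the reference schemas.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
ET.register_namespace("w", W_NS)


def build_word_xml(paragraphs):
    """Build a bare-bones WordProcessingML document, one w:p per input string."""
    root = ET.Element(f"{{{W_NS}}}wordDocument")
    body = ET.SubElement(root, f"{{{W_NS}}}body")
    for text in paragraphs:
        p = ET.SubElement(body, f"{{{W_NS}}}p")   # paragraph
        r = ET.SubElement(p, f"{{{W_NS}}}r")      # run
        t = ET.SubElement(r, f"{{{W_NS}}}t")      # text
        t.text = text
    xml_body = ET.tostring(root, encoding="unicode")
    # The processing instruction is what tells Windows to hand the file to Word
    # (an assumption taken from files Word itself saves).
    prolog = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
        '<?mso-application progid="Word.Document"?>\n'
    )
    return prolog + xml_body


if __name__ == "__main__":
    print(build_word_xml(["Hello from Python", "A second paragraph"]))
```

Write the returned string to a file with an .xml extension, encoded as UTF-8, and Word 2003 should open it as a two-paragraph document, assuming the names above match the schema.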
Yet all may not be quite rosy in the land of openness. There are two
potentially troubling things in the legalese that comes with the schemas (and
please bear in mind that I'm a developer, not a lawyer; I would welcome
corrections or clarifications from anyone who knows better). First there's the
matter of the associated Office
2003 XML Reference Schema Patent License. The schema download itself
contains language that lets you copy and distribute the schema, subject to
certain limitations (mostly that you need to properly credit it and link to a
particular page at Microsoft). But the download doesn't grant you the right to
implement a program that can use the specifications. That's the purpose
of the patent license.
This whole Patent License business is a bit troubling to me, as it starts off
by saying "Microsoft may have patents and/or patent applications that are
necessary for you to license in order to make, sell, or distribute software
programs that read or write files that comply with the Microsoft specifications
for the Office Schemas." It then goes on to say "Except as provided below,
Microsoft hereby grants you a royalty-free license under Microsoft's Necessary
Claims to make, use, sell, offer to sell, import, and otherwise distribute
Licensed Implementations solely for the purpose of reading and writing files
that comply with the Microsoft specifications for the Office Schemas." You need
to display a license notice, you can't sublicense, and "You are not licensed to
distribute a Licensed Implementation under license terms and conditions that
prohibit the terms and conditions of this license."
Also, within the schema license itself you'll find this language: "No right
to create modifications or derivatives of this Specification is granted
So where's the problem? Well, first off, it seems possible that the Patent
License clause about not being licensed to distribute under other license terms
is designed to prevent applications that use the GNU General Public License
(GPL) from implementing Office XML compatibility. To be fair,
Eben Moglen, the counsel to the Free Software Foundation (which keeps an eye on
the GPL) says he
doesn't think there's an incompatibility. But if I were writing
open source software, I'd think twice before using these schemas.
Second, what the heck can you patent in an XML schema? XML Schema itself is, of course, an
approved recommendation of the World Wide Web Consortium. It's all right out in
the open and understood by thousands of applications. Patents are supposed to be
for non-obvious innovations (though many recent abuses of the system make it all
too clear that the US Patent and Trademark Office is simply incompetent to judge
the merits of software patents), so what's non-obvious about using an XML schema
to describe an XML document? I don't get it. A search at the Patent Office for
"Schema AND Microsoft" turns up ten patents. None of them look especially
applicable to me, but reading patents is a minefield for the layman so I could
easily be wrong.
Finally -- and most troubling to me -- is this whole business of the license
only granting you permission to "read and writes files that are fully compliant"
with the specification, and not being able to create modifications or
derivatives. Correct me if I'm wrong, but doesn't the X in XML stand for
"Extensible" (we'll talk about the inability of developers to spell another
day)? It seems like you have to implement the whole schema and nothing but the
schema to avoid falling afoul of the license. That sure doesn't seem extensible
Maybe I've just spent too much time hanging around with open source folks,
and it's made me paranoid. Maybe the Office team always intended to make
everything open, and just forgot until the Danish government reminded them.
Maybe the developers at Microsoft want us to use the schema to build
interoperable applications, and understand that we might extend or only use part
of the schema. Maybe everything was fine until the lawyers got involved and
marked everything up. But given Microsoft's past behavior, I certainly think
these are questions worth pondering. Hopefully Microsoft will move forward on
openness and interoperability, and the lawyers will get out of the way, and we
can all write happy smiling software that plays well together.
Mike Gunderloy has been developing software for a quarter-century now, and writing about it for nearly as long. He walked away from a .NET development career in 2006 and has been a happy Rails user ever since. Mike blogs at A Fresh Cup.