Lately some controversy has arisen in the XML community over which tool is most appropriate to supersede
the dreaded DTDs: XML Schema or RELAX NG? The IETF is creating an
RFC on the use of XML,
and the topic of specifying schemas is one point of contention.
XML Schema was put forward by the W3C in 2001 to fix the most obvious limitations of DTDs:
The syntax of a DTD is different from XML, requiring the document writer to learn yet another notation, and
the software to include yet another parser.
There is no way to specify datatypes and data formats that could be used to map data automatically to and from
programming languages.
There is no set of well-known basic elements to choose from.
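To make the first two points concrete, here is a minimal sketch of one declaration in both notations. The element name shipDate is made up for illustration. The DTD form, shown in the comment, uses its own non-XML syntax and can only say that the element holds character data; the XML Schema form is itself XML and can attach a built-in datatype such as xs:date.

```xml
<!-- DTD equivalent (non-XML syntax, no datatype): <!ELEMENT shipDate (#PCDATA)> -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- hypothetical element; xs:date is a built-in XML Schema datatype -->
  <xs:element name="shipDate" type="xs:date"/>
</xs:schema>
```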
XML Schema successfully tackled these problems, but it also went much further, taking on
data type definitions, infoset modification, and schema expressiveness far beyond that of DTDs.
James Clark, leader of the technical committee at OASIS for RELAX NG, and author of one of the first XML parsers,
recently described the problems of the XML Schema language in a newsgroup posting:
XML Schema definitions require considerable expertise to understand and can contain quite a few surprises.
As an example, if you derive a complex type by
restriction you have to specify the new restricted content model
explicitly. However, attributes are treated in the opposite way: by
default you get all the attributes and you have to explicitly rule out
the ones you don't want. A similar inconsistency exists in that if you
merge two attribute definitions you get the union of the concrete attribute definitions, but the intersection
of attribute wildcards (specified by the anyAttribute element).
While these might be convenient choices for the specification, they easily create confusion for the human
reader of XML schemas.
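The asymmetry in derivation by restriction can be sketched as follows. The type, element, and attribute names (Address, DomesticAddress, street, and so on) are invented for illustration; only the xs: constructs come from XML Schema itself.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- hypothetical base type: three child elements, two optional attributes -->
  <xs:complexType name="Address">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="country" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID"/>
    <xs:attribute name="lang" type="xs:language"/>
  </xs:complexType>

  <xs:complexType name="DomesticAddress">
    <xs:complexContent>
      <xs:restriction base="Address">
        <!-- content model: must be restated in full; country is dropped by omission -->
        <xs:sequence>
          <xs:element name="street" type="xs:string"/>
          <xs:element name="city" type="xs:string"/>
        </xs:sequence>
        <!-- attributes: id is inherited without being mentioned;
             lang goes away only because it is explicitly prohibited -->
        <xs:attribute name="lang" use="prohibited"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

In this sketch the derived type has to repeat street and city to keep them, whereas id carries over silently and lang has to be ruled out by hand.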
The XML Schema Recommendation is hard to read and understand.
To avoid the misinterpretations mentioned above, you may have to consult the specification to fully understand
a specific schema definition. I have to agree with Clark that the W3C's XML Schema Recommendation is
exceptionally hard to read and understand, which makes it even more difficult to make sense of a particular
schema presented to you.
W3C XML Schema's support for attributes provides no advance over DTDs.
As with DTDs, W3C XML Schema only
allows the specification of whether attributes are required or
optional. There is no way to specify more complex constraints between
attributes or between attributes and elements, for instance
that either attribute X or attribute Y is allowed or that either
attribute X or element Y is allowed. The mechanism that is used to constrain the co-occurrence of child elements
should be extended to attributes and the combinations of attributes and child elements.
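For comparison, RELAX NG treats attributes and elements uniformly inside its patterns, so constraints of this kind can be written directly. A minimal sketch, with made-up names (link, href, idref, title):

```xml
<element name="link" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- either attribute href or attribute idref, but not both -->
  <choice>
    <attribute name="href"/>
    <attribute name="idref"/>
  </choice>
  <!-- either a title attribute or a title child element -->
  <choice>
    <attribute name="title"/>
    <element name="title"><text/></element>
  </choice>
</element>
```

The same choice pattern that constrains child elements is applied here to attributes and to a mix of the two, which is the extension the paragraph above asks for.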
W3C XML Schema provides very weak support for unordered content.
When the designer of an XML vocabulary does not wish to force child
elements to occur in a particular order, it can be impractical to
describe the XML vocabulary using XML Schema, because its only
construct for unordered content, xs:all, is so tightly restricted.
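A minimal sketch of unordered content in XML Schema 1.0, using hypothetical element names (person, name, email, phone). The comment lists the restrictions on xs:all that tend to make it impractical for real vocabularies.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <!-- name, email and phone may appear in any order -->
      <xs:all>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string" minOccurs="0"/>
        <xs:element name="phone" type="xs:string" minOccurs="0"/>
        <!-- Not permitted here in XML Schema 1.0:
             maxOccurs greater than 1 on any of these elements,
             a nested xs:sequence or xs:choice,
             or placing this xs:all inside another model group -->
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

So a vocabulary that wants, say, any number of phone elements freely interleaved with the others cannot be described this way.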