A number of specifications use wildcards for extensibility and evolution. This is a good thing. However, one prevalent design choice has significant limitations: placing wildcards at the same level as element content precludes many of the common validation features that authors need for loosely coupled systems. The specific problem is that extensions cannot be validated correctly, so documents that should be rejected pass validation.
To illustrate this problem, let us start with a person description that contains a name. It contains two extensibility points, each using the common wildcard with namespace="##other":
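A minimal sketch of such a schema follows. The namespace URI, the first and last children, and the type names are assumptions made for illustration; only the two ##other wildcards come from the description above.

```xml
<!-- Sketch only: the URI, the child elements and the type names are assumed. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:p="http://www.example.org/person"
           targetNamespace="http://www.example.org/person"
           elementFormDefault="qualified">

  <xs:element name="person" type="p:PersonType"/>

  <xs:complexType name="PersonType">
    <xs:sequence>
      <xs:element name="name" type="p:NameType"/>
      <!-- Extensibility point 1: wildcard at the same level as name -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="NameType">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <!-- Extensibility point 2: wildcard at the same level as first/last -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```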
The author of the person namespace decides that they want to extend their type by adding a middle name to the name and a city to the person type. Hence they create 2 new schema types, for middle and city, which look like:
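One way the two new declarations could look, assuming both elements live in a single hypothetical extension namespace and are plain strings (they must sit outside the person namespace, because an ##other wildcard only admits elements from other namespaces):

```xml
<!-- Sketch only: the extension namespace URI is an assumption. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.org/person/ext"
           elementFormDefault="qualified">

  <!-- Intended to appear inside name -->
  <xs:element name="middle" type="xs:string"/>

  <!-- Intended to appear inside person, after name -->
  <xs:element name="city" type="xs:string"/>
</xs:schema>
```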
They can create documents that have a middle name and a city in them. Now what does the author do about the schema for the extended person?
The problem is that the person schema author, having extended the name with middle, has no way to say where middle is allowed and where it is not. The person schema author cannot validate the location of middle or city elements. Using the previous processing model, the following document is valid:
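As a hedged illustration, assume a person schema with an ##other wildcard at the end of both name and person, and middle and city declared in a hypothetical extension namespace. A document with the two extensions swapped would still be accepted, because each wildcard matches any element from another namespace:

```xml
<!-- Sketch only: namespace URIs and the first/last children are assumed. -->
<p:person xmlns:p="http://www.example.org/person"
          xmlns:e="http://www.example.org/person/ext">
  <p:name>
    <p:first>Jane</p:first>
    <p:last>Doe</p:last>
    <!-- city was meant for person, yet the wildcard in name accepts it -->
    <e:city>Vancouver</e:city>
  </p:name>
  <!-- middle was meant for name, yet the wildcard in person accepts it -->
  <e:middle>Q</e:middle>
</p:person>
```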
Another option is to update the schema to specify the middle name as an optional element inside the name. This would prevent the city from occurring in the name, but it would also prevent any other extension inside the name, because the wildcard has to go. The reason is the pesky Unique Particle Attribution constraint, which means that an optional element in a given namespace can't occur before a wildcard that might match that namespace. The following schema is illegal:
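A sketch of the kind of content model that trips the constraint under XML Schema 1.0, assuming middle is a global element in a hypothetical extension namespace. When a processor sees a middle element after last, it cannot tell whether to attribute it to the element reference or to the wildcard:

```xml
<!-- Sketch only: URIs and names are assumed. A conformant XML Schema 1.0
     processor is expected to reject this schema. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:e="http://www.example.org/person/ext"
           targetNamespace="http://www.example.org/person"
           elementFormDefault="qualified">

  <xs:import namespace="http://www.example.org/person/ext"/>

  <xs:complexType name="NameType">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <!-- Optional element from another namespace ... -->
      <xs:element ref="e:middle" minOccurs="0"/>
      <!-- ... followed by a wildcard that also matches that namespace -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```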
XML Schema, mostly because of the Unique Particle Attribution rule, does not have a mechanism for refining wildcards to say which optional elements are allowed and which are not allowed in a particular wildcard while retaining extensibility.
The author has really only 3 options when adding elements in this design:
None of these options is desirable. The options exist because of the fundamental problem with using wildcards outside of an Extension element for versioning: a schema cannot correctly validate extensions and retain compatibility. This is because XML Schema lacks the ability to validate optional wildcard extensions. So when a namespace owner tries to use wildcards at the same level as elements for versioning, they run into this problem.
In contrast, the solution that I advised in my article on versioning XML languages does allow an author to add optional elements (such as middle and city), constrain where these elements can occur, and retain backwards and forwards compatibility. The design is a schema that has Extension elements to contain any extensions. In any subsequent version, the schema author overlays a new type on the older type by replacing a wildcard with the new elements. This is the linchpin of the design: to allow forwards compatibility, a wildcard is used inside an Extension element. In a subsequent revision of the specification, the wildcard where extension occurs is replaced with an optional element (optional elements preserve backwards compatibility), and a new Extension element is placed after the optional element.
There are two options for the namespace of the wildcard inside the Extension element: the target namespace (##targetNamespace) or other namespaces (##other). Given that the wildcard is going to be replaced with an element if the namespace owner makes a change, they can use ##targetNamespace and effectively "promise" how they will make changes. If they use ##other, as many solutions do, any revision of the schema that retains forwards and backwards compatibility (replacing the wildcard with an optional element) will end up invalidating everybody else's extensions. Thus it is more desirable to use ##targetNamespace, as this promises that the namespace owner can only invalidate their own extensions!
To illustrate the benefits of an Extension element and to compare ##targetNamespace with ##other, we make a schema for person that uses Extension elements:
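A minimal sketch of such a schema. It assumes, so that both wildcard styles can be compared, that the Extension inside name uses ##targetNamespace and the Extension inside person uses ##other; the URI, the first and last children, and the type names are likewise assumptions.

```xml
<!-- Original ("version 1") schema. Sketch only: URIs and names are assumed. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:p="http://www.example.org/person"
           targetNamespace="http://www.example.org/person"
           elementFormDefault="qualified">

  <xs:element name="person" type="p:PersonType"/>

  <xs:complexType name="PersonType">
    <xs:sequence>
      <xs:element name="name" type="p:NameType"/>
      <xs:element name="Extension" type="p:OtherExtensionType" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="NameType">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="Extension" type="p:TargetExtensionType" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- The wildcard is now the only particle at its level -->
  <xs:complexType name="TargetExtensionType">
    <xs:sequence>
      <xs:any namespace="##targetNamespace" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="OtherExtensionType">
    <xs:sequence>
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Because each wildcard sits alone inside its Extension element, no element declaration competes with it at the same level, which is what leaves room for a later revision to replace it.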
Using this model, the namespace owner extends the person schema with the middle name in the person namespace and a city in the address namespace, which is:
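The revised schema might be sketched as follows. It assumes the original schema has an Extension element in both name and person, that city is a global element declared in a hypothetical address schema, and that the type names are free to choose. Each wildcard is replaced by an optional element followed by a new, nested Extension:

```xml
<!-- Revised ("version 2") schema. Sketch only: URIs and names are assumed. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:p="http://www.example.org/person"
           xmlns:a="http://www.example.org/address"
           targetNamespace="http://www.example.org/person"
           elementFormDefault="qualified">

  <!-- a:city is assumed to be declared in the address schema -->
  <xs:import namespace="http://www.example.org/address"/>

  <xs:element name="person" type="p:PersonType"/>

  <xs:complexType name="PersonType">
    <xs:sequence>
      <xs:element name="name" type="p:NameType"/>
      <xs:element name="Extension" type="p:CityExtensionType" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="NameType">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="Extension" type="p:MiddleExtensionType" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Overlay for name: optional middle, then a fresh extension point -->
  <xs:complexType name="MiddleExtensionType">
    <xs:sequence>
      <xs:element name="middle" type="xs:string" minOccurs="0"/>
      <xs:element name="Extension" type="p:TargetExtensionType" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Overlay for person: optional city, then a fresh extension point.
       The nested Extension is in the person namespace, so an ##other
       wildcard in the older schema would not match it. -->
  <xs:complexType name="CityExtensionType">
    <xs:sequence>
      <xs:element ref="a:city" minOccurs="0"/>
      <xs:element name="Extension" type="p:OtherExtensionType" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="TargetExtensionType">
    <xs:sequence>
      <xs:any namespace="##targetNamespace" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="OtherExtensionType">
    <xs:sequence>
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```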
The following document is correctly invalid against the newer schema:
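As an illustration, assume the newer schema allows only an optional middle (person namespace) followed by a nested Extension inside the name's Extension, and only an optional city (hypothetical address namespace) followed by a nested Extension inside the person's Extension. A document that swaps the two would then be rejected:

```xml
<!-- Sketch only: namespace URIs and the first/last children are assumed. -->
<p:person xmlns:p="http://www.example.org/person"
          xmlns:a="http://www.example.org/address">
  <p:name>
    <p:first>Jane</p:first>
    <p:last>Doe</p:last>
    <p:Extension>
      <!-- Not allowed here: only middle or a nested Extension may appear -->
      <a:city>Vancouver</a:city>
    </p:Extension>
  </p:name>
  <p:Extension>
    <!-- Not allowed here: only city or a nested Extension may appear -->
    <p:middle>Q</p:middle>
  </p:Extension>
</p:person>
```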
And the following document is correctly valid against both the original and the new schema:
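A sketch of such a document, assuming the original schema puts a lax ##targetNamespace wildcard in the name's Extension and a lax ##other wildcard in the person's Extension, and the newer schema replaces them with an optional middle and an optional city (hypothetical address namespace) respectively:

```xml
<!-- Sketch only: namespace URIs and the first/last children are assumed. -->
<p:person xmlns:p="http://www.example.org/person"
          xmlns:a="http://www.example.org/address">
  <p:name>
    <p:first>Jane</p:first>
    <p:last>Doe</p:last>
    <p:Extension>
      <!-- Old schema: matched by the ##targetNamespace wildcard.
           New schema: matched by the middle declaration. -->
      <p:middle>Q</p:middle>
    </p:Extension>
  </p:name>
  <p:Extension>
    <!-- Old schema: matched by the ##other wildcard.
         New schema: matched by the city reference. -->
    <a:city>Vancouver</a:city>
  </p:Extension>
</p:person>
```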
The technique of using an Extension element with a ##targetNamespace wildcard has an advantage over an ##other wildcard: a single namespace and its related schema can be updated with the new type information while others retain their ability to extend the instance. There is a "master" schema that a namespace owner can consult and that controls their documents, though there obviously may be schema modularization. Using a single namespace is preferable to multiple namespaces.
A final observation: the SOAP specification follows this model of containing wildcards within extensibility elements (specifically Header and Body), so this technique should not be new to Web services developers. The WSDL specification, in turn, has the difficult task of determining how to create schemas that constrain those wildcard elements. This is so difficult that WSDL 1.1 does not express optional header blocks, and WSDL 2.0 is on the same path.
This article shows that a wildcard placed at the same level as elements does not allow correct validation with updated schemas while retaining compatible evolution. The advocated technique, using an Extension element, allows the correct validation of new and old documents under newer and older schemas. And using ##targetNamespace in the wildcard allows for a simpler mechanism than ##other. There are some drawbacks to the Extension element technique because of further limitations of wildcards (they are still more lenient than we need them to be), and these are addressed separately.