The W3C XML Schema WG is now talking about how to do "versioning" in XML Schema 1.1, yeah! A lot of different approaches are possible, and many are better than the status quo. Roughly, the requirements are to allow additional content in mixed-namespace documents, with schemas that are forwards and backwards compatible.
One approach that I think meets the 80/20 point is the "allow anything not declared in the Schema" extensibility model. I first published this way back in December 2003 and I like it more and more. The XML Schema WG has collected use cases, and I contributed a bunch of Web services versioning use cases to help the discussion.
I always like using "names" as examples. The first version of a name structure has given and family. But then we want to add a "middle" name. Because we want to combine with extension (which happens at the end of content models), we'll show the "new" content at the end. This is shown in more detail in the "appended" case.
We want to allow
<name>
<given>David</given>
<family>Orchard</family>
<middle>Bryce</middle>
</name>
and
<name>
<given>David</given>
<family>Orchard</family>
<newns:middle>Bryce</newns:middle>
</name>
But we want to preclude things that are already in the content model, so disallow:
<name>
<given>David</given>
<family>Orchard</family>
<given>Davido</given>
</name>
And if the schema had already defined phone numbers, area codes, addresses, etc., then they should be disallowed as well. It seems pretty rare that a V2 schema will be created because somebody "forgot" to add an element that they already knew about. Disallow if phone is already defined in the schema:
<name>
<given>David</given>
<family>Orchard</family>
<phone>555-555-5555</phone>
</name>
A V1 schema that supports this using a new Wildcard would look something like:
<xsd:complexType name="name">
<xsd:sequence>
<xsd:element name="given" type="nameString"/>
<xsd:element name="family" type="nameString"/>
<xsd:any allow="AnythingNotInSchema" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
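The intended behaviour of such a wildcard can be sketched in Python. This is only a rough model under stated assumptions, not a schema validator: `V1_SEQUENCE`, `V1_DECLARED` and `is_valid_name` are hypothetical names, the schema is assumed to have no target namespace, `phone` is assumed to be declared elsewhere in the schema, and the `newns` prefix is assumed to be bound to some namespace URI in the instance.

```python
import xml.etree.ElementTree as ET

# Hypothetical model of the V1 "name" type: declared children in order,
# followed by a wildcard that accepts anything NOT declared in the schema.
V1_SEQUENCE = ["given", "family"]
# Assumption: every element name the imagined V1 schema declares anywhere.
V1_DECLARED = {"name", "given", "family", "phone"}


def local_and_ns(tag):
    """Split ElementTree's '{ns}local' tag into (namespace, local name)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag


def is_valid_name(xml_text, sequence=V1_SEQUENCE, declared=V1_DECLARED):
    children = list(ET.fromstring(xml_text))
    # The declared part of the sequence must come first, in order.
    head = children[: len(sequence)]
    if [c.tag for c in head] != sequence:
        return False
    # Everything after it is handed to the wildcard, which rejects
    # any no-namespace element the schema already declares.
    for child in children[len(sequence):]:
        ns, local = local_and_ns(child.tag)
        if ns == "" and local in declared:
            return False
    return True
```

Under these assumptions the two "allow" instances above pass and the repeated `given` and the `phone` instances fail.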
A V2 schema that supports this could be:
<xsd:complexType name="name">
<xsd:sequence>
<xsd:element name="given" type="nameString"/>
<xsd:element name="family" type="nameString"/>
<xsd:element name="middle" type="nameString"/>
<xsd:any allow="AnythingNotInSchema" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
Or preserving the name type and defining a new newName type:
<xsd:complexType name="newName">
<xsd:complexContent>
<xsd:extension base="name">
<xsd:sequence>
<xsd:element name="middle" type="xsd:string"/>
<xsd:any allow="AnythingNotInSchema" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
One "trick" is that the Schema extension mechanism has to "eat" the wildcard. There shouldn't be a wildcard between given and middle in the v2 schema.
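A minimal sketch of what "eating" the wildcard could mean, with a content model reduced to a flat list of particle names. The `WILDCARD` marker and the `extend` helper are hypothetical and are not part of any XML Schema API.

```python
# Hypothetical marker standing in for the <xsd:any .../> particle.
WILDCARD = "*"


def extend(base, extension):
    """Return the effective content model of an extension.

    The base type's trailing wildcard is dropped ("eaten") so that the
    new particles follow the base's declared elements directly.
    """
    trimmed = base[:-1] if base and base[-1] == WILDCARD else list(base)
    return trimmed + list(extension)


v1_name = ["given", "family", WILDCARD]
new_name = extend(v1_name, ["middle", WILDCARD])
# new_name == ["given", "family", "middle", "*"]
```

The only wildcard left in the result is the one the extension itself declares, at the end.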
An alternative is to keep the targetNamespace and add a qualifier for which elements are allowed, i.e.
<xsd:any targetNamespace="##any" element="AnythingNotInSchema"/>
This would allow fine grained control over particular namespaces and elements.
This also handles the case where more "structure" is added. For example, say the first version of name is a string and the second version of name is a structure. The one aspect that isn't totally ideal is that the name must be sent in both forms, for the V1 receiver and the V2 receiver. Obviously it would be best if V1 had all the structure, but there is a solution that treats the new structure as data, talked about at http://www.pacificspirit.com/blog/2004/06/14/why_putting_extra_structure_in_v10_is_good.
V1:
<contactinfo>
<unstructuredname>David Orchard</unstructuredname>
</contactinfo>
V2:
<contactinfo>
<unstructuredname>David Orchard</unstructuredname>
<name>
<given>David</given>
<family>Orchard</family>
</name>
</contactinfo>
A V1 receiver can throw away the "name" that it knows nothing about. A V2 receiver can do whatever it wants with both structured and unstructured names.
One of the arguments against this approach is the scenario where the element really was just "forgotten" and we want to add it in. There is a workaround, which is that the forgotten type can always be used with a new qname, either a new namespace or a new local name.
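As a rough illustration of the two receivers, here is a small Python sketch using the standard library's ElementTree. The function names `read_contact_v1` and `read_contact_v2` and the dictionary layout are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the only child of contactinfo a V1 receiver understands.
V1_KNOWN = {"unstructuredname"}


def read_contact_v1(xml_text):
    """V1 receiver: keep known children, silently drop everything else."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root if child.tag in V1_KNOWN}


def read_contact_v2(xml_text):
    """V2 receiver: read the unstructured name and, if present, the structure."""
    root = ET.fromstring(xml_text)
    contact = {"unstructuredname": root.findtext("unstructuredname")}
    name = root.find("name")
    if name is not None:
        contact["given"] = name.findtext("given")
        contact["family"] = name.findtext("family")
    return contact


V2_MESSAGE = (
    "<contactinfo>"
    "<unstructuredname>David Orchard</unstructuredname>"
    "<name><given>David</given><family>Orchard</family></name>"
    "</contactinfo>"
)
```

Both receivers accept `V2_MESSAGE`. The V1 receiver simply never sees the structured name.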
Comments (1)
Blocking any element that has been mentioned in a schema anywhere (it's not clear if it's just global elements, or all elements actually) seems way too broad.
I have often seen the case where information items used in one place subsequently have to be used in another place. This is not necessarily because someone forgot about it, but because new use-cases have been discovered and similar bits of data need to be made available in different contexts.
It also seems inconsistent with the rest of XSD practice. XSD allows there to be multiple complex type elements that contain elements, each with their own application-defined meaning. Defining allow="AnythingNotInSchema" would mean that an element could be added in any subsequent version (where it was used). You'd quickly end up with silly scenarios like !!!
What's needed is for the wildcard to not match the element names mentioned in the local scope, i.e. the wildcard should not match the names of any of the children of the parent of the wildcard (and any of the children's substitution group members), e.g. allow="AnythingNotLocalScope".
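The difference between the two rules can be sketched roughly as follows. The `SCHEMA` dictionary and both function names are hypothetical, the schema contents are made up for the example, and substitution groups are left out to keep it short.

```python
# Hypothetical schema model: complex type name -> names of its declared children.
SCHEMA = {
    "name": ["given", "family"],
    "contactinfo": ["unstructuredname", "phone"],
}


def allowed_not_in_schema(element, parent_type, schema=SCHEMA):
    """Wildcard rejects any name declared anywhere in the schema."""
    declared_anywhere = set(schema) | {c for kids in schema.values() for c in kids}
    return element not in declared_anywhere


def allowed_not_local_scope(element, parent_type, schema=SCHEMA):
    """Wildcard rejects only the names declared as siblings in the parent type."""
    return element not in schema[parent_type]
```

With this model, adding `phone` inside `name` is rejected by the first rule but accepted by the second, while a repeated `given` is rejected by both.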
Posted by Pete Cordell | March 21, 2007 8:44 AM