Pssst. When is an XML element part of an extension to a language, and when is it part of a version of a language? I first wrote about this a long time ago in Extensibility vs Versioning.
Say I have a name element with a first name in a namespace with prefix ns1. Given that middle was not defined in version 1.0 or the ns1 namespace, which of the following middles are extensions and which are versions? Let's assume that the "1.1" version attribute means that the element is optional, and "2.0" means that it is required.
1.
<ns1:name>
<ns1:first>Dave</ns1:first>
<ns1:middle>Bryce</ns1:middle>
</ns1:name>
2.
<ns1:name version="1.1">
<ns1:first>Dave</ns1:first>
<ns1:middle>Bryce</ns1:middle>
</ns1:name>
3.
<ns1:name version="2.0">
<ns1:first>Dave</ns1:first>
<ns1:middle>Bryce</ns1:middle>
</ns1:name>
4.
<ns1:name>
<ns1:first>Dave</ns1:first>
<ns2:middle>Bryce</ns2:middle>
</ns1:name>
5.
<ns1:name version="1.1">
<ns1:first>Dave</ns1:first>
<ns2:middle>Bryce</ns2:middle>
</ns1:name>
6.
<ns2:name>
<ns2:first>Dave</ns2:first>
<ns2:middle>Bryce</ns2:middle>
</ns2:name>
7.
<ns1:name>
<ns1:first>Dave</ns1:first>
<ns2:middle ns1:mustUnderstand="true">Bryce</ns2:middle>
</ns1:name>
Interesting question... Does it make a difference whether ns2 is "owned" by ns1 or not? Does it make a difference whether a formal schema is changed or not?
My guess is that folks think that 1, 2, 3, 5, and 6 are versions and 4, 5, and 7 are extensions. Notice that 5 is both an extension and a version. Some tricky problems show up in the differences between these.
If a namespace author uses multiple namespaces for their language, then there's no difference between 4 and 5 when ns2 is controlled by the ns1 author. The ns1 author might want to reference a 3rd party's extension, and so a new version could exist because of a reference to a 3rd party's extension. And if a namespace author uses a single namespace for each major version of a language, then there's no difference between 1 and 2, and 3 and 6 are roughly the same.
So how is a version different from an extension when using multiple namespaces, and how does this difference show up in the language and instances? Can an end user even tell the difference?
I argue that an element may be in BOTH an extension and a version. Specifically, a new version of a language may consist entirely of extensions. There is nothing in the use of a namespace that differentiates between an extension and a version. An element is part of a version if the language designer uses the extension namespace in its definition of the language. That may or may not show up in the instance or the schema. And a version change may or may not involve a namespace name change.
The rationale is thus: In ye olde good times of design, we did not have namespaces. So we had to use version identifiers to distinguish between the languages. Namespaces fundamentally change the playing field. We can move from a "top versioning" system, where the version is at the top of the document, to a "bottom versioning" system, where the version is the embodiment of all the elements in the document.
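To make the "bottom versioning" idea a bit more concrete, here is a minimal Python sketch that treats the set of element qnames in an instance as its effective version. The namespace URI http://example.org/ns1 and the helper name qname_set are made up for illustration; nothing here is prescribed by any spec.

```python
import xml.etree.ElementTree as ET

NS1 = "http://example.org/ns1"  # hypothetical URI bound to the ns1 prefix


def qname_set(xml_text):
    """Return every element name in the instance, in {namespace}local form."""
    return {elem.tag for elem in ET.fromstring(xml_text).iter()}


original = f'<ns1:name xmlns:ns1="{NS1}"><ns1:first>Dave</ns1:first></ns1:name>'
with_middle = (
    f'<ns1:name xmlns:ns1="{NS1}">'
    "<ns1:first>Dave</ns1:first><ns1:middle>Bryce</ns1:middle>"
    "</ns1:name>"
)

# The names present only in the newer instance are what distinguish it;
# no version attribute at the top of the document is consulted.
added = qname_set(with_middle) - qname_set(original)
print(added)
```

In this sketch the only difference between the two instances is the ns1 middle element, which is exactly the sense in which the elements themselves carry the version.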
When a language designer is creating an instance with the new and optional middle name, the choices are really:
1. Put the middle in the same namespace or new namespace, and
2. Keep the name in the same namespace or a new namespace.
Their answers to these depend upon what they have decided for their versioning/extensibility strategy. If they keep the name in the same namespace, they retain compatibility, whereas a new namespace will break all the existing software.
If they put the middle element in a new namespace (and potentially all new optional elements in new namespaces), they may have an explosion of new namespace names. The main reason for introducing namespace names was to allow differentiation of the names in a document. In this particular case, the namespace owner can guarantee that the new middle name will not clash with an existing middle in their namespace, because they own the namespace!
Let's say that again: It seems strange for a namespace owner to use multiple namespaces for differentiating names when they can guarantee that there aren't clashes within a single namespace. Namespaces were originally intended so that different authorities could merge their content. Authors are now often using namespaces to modularize their language, typically for conformance. Some examples are XHTML, XML Schema, and WSDL.
Thus the use of single versus multiple namespaces can be guided by the expected coherence of the language - if it is a highly coherent language then probably 1 namespace is a good way to go.
On the issue of version numbers versus new namespaces for incompatible changes: this is the choice between #3 and #6 when middle is required. Notice that a client can look at either the version attribute (#3) or the root qname (#6) to determine whether it understands the type or not. I argue that #6, changing the root qname to indicate an incompatible change, is more effective than using a version attribute. In case #7, a 3rd party has made the change, and they can't change the version attribute to indicate that their extension is mandatory. Thus the version number can't be used by 3rd parties. Oh, and what "version" of a document is #7 anyway? It's not really a version.
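Here is a rough Python sketch of how a version 1.0 consumer might apply those two signals: reject an unknown root qname (case #6) and reject an unknown element flagged mustUnderstand (case #7). The namespace URIs, the UNDERSTOOD set, and the can_process function are assumptions for illustration only, and the mustUnderstand attribute is the hypothetical one from example 7, not a standard attribute.

```python
import xml.etree.ElementTree as ET

NS1 = "http://example.org/ns1"  # hypothetical URI for the ns1 prefix
NS2 = "http://example.org/ns2"  # hypothetical URI for the ns2 prefix

# Element names a version 1.0 consumer understands (assumed).
UNDERSTOOD = {f"{{{NS1}}}name", f"{{{NS1}}}first"}
MUST_UNDERSTAND = f"{{{NS1}}}mustUnderstand"


def can_process(xml_text):
    """Return True if a 1.0 consumer may safely process the instance."""
    root = ET.fromstring(xml_text)
    if root.tag not in UNDERSTOOD:
        return False  # unknown root qname: incompatible change (case 6)
    for elem in root.iter():
        if elem.tag not in UNDERSTOOD and elem.get(MUST_UNDERSTAND) == "true":
            return False  # mandatory 3rd party extension (case 7)
    return True  # any other unknown elements can simply be ignored


DECLS = f'xmlns:ns1="{NS1}" xmlns:ns2="{NS2}"'
case_4 = (f"<ns1:name {DECLS}><ns1:first>Dave</ns1:first>"
          "<ns2:middle>Bryce</ns2:middle></ns1:name>")
case_6 = (f"<ns2:name {DECLS}><ns2:first>Dave</ns2:first>"
          "<ns2:middle>Bryce</ns2:middle></ns2:name>")
case_7 = (f"<ns1:name {DECLS}><ns1:first>Dave</ns1:first>"
          '<ns2:middle ns1:mustUnderstand="true">Bryce</ns2:middle></ns1:name>')

for label, doc in (("4", case_4), ("6", case_6), ("7", case_7)):
    print(label, can_process(doc))
```

Note that the sketch never reads a version attribute: both the namespace owner and the 3rd party can signal an incompatible change using only names they control.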
Regardless of namespace management practice, when allowing extensibility (in either the same or a new namespace), the compatibility requirement is that new instances can be transformed into old instances. This is often done with a "must ignore unknowns" rule. Given that the "known" items may be in multiple namespaces, it is no more complex to ignore unknown elements in the main namespace.
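A minimal sketch of a "must ignore unknowns" transformation in Python, assuming a consumer that knows only the ns1 name and first elements. The URIs, the KNOWN_V1 set, and the strip_unknown helper are hypothetical. The point is that the same filter handles an unknown middle whether it arrives in the main namespace or in another one.

```python
import xml.etree.ElementTree as ET

NS1 = "http://example.org/ns1"  # hypothetical URI for the ns1 prefix
NS2 = "http://example.org/ns2"  # hypothetical URI for the ns2 prefix

# Element names a version 1.0 consumer knows about (assumed).
KNOWN_V1 = {f"{{{NS1}}}name", f"{{{NS1}}}first"}


def strip_unknown(elem, known):
    """Remove, in place, every descendant whose qualified name is not in known."""
    for child in list(elem):  # copy the child list so removal is safe
        if child.tag in known:
            strip_unknown(child, known)
        else:
            elem.remove(child)
    return elem


same_ns = (f'<n:name xmlns:n="{NS1}"><n:first>Dave</n:first>'
           "<n:middle>Bryce</n:middle></n:name>")
other_ns = (f'<n:name xmlns:n="{NS1}" xmlns:e="{NS2}"><n:first>Dave</n:first>'
            "<e:middle>Bryce</e:middle></n:name>")

old_from_same = strip_unknown(ET.fromstring(same_ns), KNOWN_V1)
old_from_other = strip_unknown(ET.fromstring(other_ns), KNOWN_V1)

# Both new instances reduce to the same old instance content.
print([child.tag for child in old_from_same])
print([child.tag for child in old_from_other])
```

Because the filter keys on the full qualified name rather than on the namespace alone, it does not care which namespace the unknown element lives in.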
So, with a "must ignore unknowns" rule applying to all namespaces, the namespace owner can insert the middle into the existing namespace. The choice appears to be whether multiple namespaces are desirable, perhaps for identification purposes.
I believe that a combination of re-using namespace names for conceptually similar and related items with a version identifier for the compatible extensions (basically option #1 or #2) is a highly desirable style for namespace name authors to handle versioning and extensibility. A new qname (option #6) can be used by the namespace owner to indicate incompatible changes, and a mustUnderstand flag (option #7) can be used by non-namespace owners to indicate an incompatible change.