
More on XML 1.1 version identification

Norm takes umbrage at my post arguing that XML 1.1 blew it with version identifiers, and particularly at my expression "blew it". He makes a key point, that XML 1.1 processors can trivially read XML 1.0 documents, and then he asks a good question: would the Must Ignore rule have been good for XML name characters?

The first point is that I think the issue of forwards compatibility is as big as backwards compatibility. Norm focuses mostly on backwards compatibility and mostly ignores the implications of forwards compatibility when it comes to version identification.

I believe that most people look at version identifiers and have a rough feeling that moving from "1.0" to "1.1" is a compatible change, both forwards and backwards. Forwards compatible changes are the hardest to deal with because they mean allowing for unknown extensions. And it is the lack of forwards compatibility that bothers me the most about the version identification.

The second point is that I don't think the question is whether somebody could write an XML 1.1 processor that would do certain things with XML 1.0 documents; the question is really whether XML 1.1 mandates what happens with XML 1.0 documents. In specese, this could be thought of as a "MUST" versus a "MAY" in 1.1. So Norm's statement that "any XML 1.1 processor can trivially read XML 1.0 content" is a statement about software possibility rather than a compatibility guarantee. The XML 1.1 spec itself says: "The minor sacrifice of backward compatibility is considered not significant." If XML 1.1 had required that all XML 1.0 documents be conformant XML 1.1 documents (backwards compatible), then maybe there would be more of a leg to stand on for calling it a "minor" version change.

In a comment, John Cowan says: "And that's why I wanted to call it 'XML 1.0.1'. To convey the smallness, the utter triviality, of the changes needed to XML parsers." John and Norm are, roughly, using the size of the change to XML parsers as the criterion for version identifiers: trivial change = minor-minor version change, small change = minor version change, big change = major version change. At least they don't advocate using the amount of elapsed time to determine the version identifier change, like we all do with our software products.

Thus we have two positions: 1) version identifiers convey compatibility guarantees, or 2) version identifiers convey the scope of change in software. I advocate #1, as #2 feels like marketing to me. My argument is that XML 1.1 is neither forwards nor backwards compatible with XML 1.0, and as such it should not be identified with a "minor" version change.
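To make position #1 concrete, here is a minimal sketch of the decision a consumer would like to make from the identifier alone. The function names and the "same major number means compatible" policy are my assumptions for illustration; nothing here comes from the XML specs.

```python
from typing import Tuple


def parse_version(identifier: str) -> Tuple[int, int]:
    """Split an identifier such as "1.1" into (major, minor)."""
    major, minor = identifier.split(".")
    return int(major), int(minor)


def can_process(processor_version: str, document_version: str) -> bool:
    """Hypothetical policy for position #1: a shared major number is a
    promise of both forwards and backwards compatibility, so the minor
    number never blocks processing."""
    processor_major, _ = parse_version(processor_version)
    document_major, _ = parse_version(document_version)
    return processor_major == document_major


# An older processor meeting a newer minor version: the policy says go ahead.
print(can_process("1.0", "1.1"))
# A major bump is the signal that no such promise is made.
print(can_process("1.0", "2.0"))
```

Under this policy a "1.0" processor would accept a "1.1" document, which is exactly the promise that XML 1.1 does not keep.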

Norm asks me to answer the question of whether Must Ignore unknowns would have been a good rule for XML 1.0, which isn't my main point. My point is that because XML 1.0 provided neither extensibility nor a substitution model for extensions (like new characters in names), it is almost impossible to version XML 1.0 in a forwards compatible manner. XML 1.0 decided on draconian error handling, which fundamentally meant there couldn't be a compatible change. There is one major extension that was allowed, which was the ":" character that came to be used for XML Namespaces. In this case, forwards compatibility was enabled because XML 1.0 processors can read XML documents with Namespaces. So XML 1.0 + Namespaces could have been XML 1.1 because of the forwards/backwards compatibility guarantees.
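The Namespaces case can be illustrated with a small sketch. It uses Python's expat binding in its default mode, with namespace processing switched off, as a stand-in for a namespace-unaware XML 1.0 processor; the `element_names` helper and the example.org URI are made up for the example.

```python
import xml.parsers.expat


def element_names(document: str) -> list:
    """Collect element names exactly as a namespace-unaware parser sees them."""
    names = []
    # No namespace_separator argument, so expat does no namespace processing.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(document, True)
    return names


doc = '<p:order xmlns:p="http://example.org/po"><p:item/></p:order>'
print(element_names(doc))
```

The parser reports no error: the prefixed names arrive as ordinary names that happen to contain a ":", and the xmlns:p declaration is just another attribute. The older processor can read the newer document without knowing anything about Namespaces.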

Maybe Must Ignore would have been a bad rule for XML 1.0, and perhaps a better substitution rule would have worked. Something like "If you find an illegal character in the control character range or Unicode, don't error but escape it instead." OTOH, the XML 1.0 spec writers really wanted to return errors.
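The difference between the two rules can be sketched on names alone. Everything below is an assumption for illustration: `KNOWN_NAME_CHARS` is a deliberately tiny toy set, not the real XML 1.0 name character production, and the `_xHHHH_` escape form is invented for the example.

```python
import string

# Toy stand-in for "the name characters this processor knows about".
KNOWN_NAME_CHARS = set(string.ascii_letters + string.digits + "_-.:")


class IllegalNameCharacter(ValueError):
    """Raised by the draconian reader on the first unknown character."""


def read_name_draconian(name: str) -> str:
    """Draconian rule: any unknown character is a fatal error."""
    for ch in name:
        if ch not in KNOWN_NAME_CHARS:
            raise IllegalNameCharacter(f"illegal name character U+{ord(ch):04X}")
    return name


def read_name_substituting(name: str) -> str:
    """Substitution rule: escape unknown characters and carry on."""
    return "".join(
        ch if ch in KNOWN_NAME_CHARS else f"_x{ord(ch):04X}_" for ch in name
    )


# A name using a character outside the toy set still yields a usable name.
print(read_name_substituting("na\u1200me"))
```

With the substitution rule an old processor still gets a name it can work with when a later version of the language admits new characters. With the draconian rule the same document is simply rejected.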

A spec author chooses where on the spectrum of extensibility, and hence of possible compatibility, their language sits: at one extreme is no extensibility and no forwards compatibility; at the other is something like an open content model with substitution rules and a high chance of forwards compatibility.

I do think there is a case to be made that the XML Core working group had some pretty hard constraints from a compatibility perspective. They could easily argue that, from a compatibility point of view, it was XML 1.0 that blew it by almost guaranteeing that a compatible evolution of XML was impossible. But for the identification of version 1.1, that's neither here nor there. They could have called it "XML 2.0" and said that they've got a much better forwards compatibility story with Unicode characters.

Norm and Chris Ferris both point out that the choice of version identifiers has various stakeholders - Chris points out the J2SE version identifiers - and various criteria - Norm points out the cost of implementation. I agree with all of that, but I'm pointing out that people usually think of version identifiers as being associated with compatibility.

How far do they think this should go? Should we allow namespace names to change or not change for marketing reasons? At some point, software has to be written, and it's the intersection of the software and the version identifier that I'm focusing on. Once XML is called "1.1" and that identifier is put in all XML 1.1 documents, the identifier is in the software realm rather than the product identifier realm.

I distinguish between a product version identifier that is only intended for humans and a software/document component identifier. Maybe Microsoft has it right when it calls its products by the year rather than a version ID. XML 1.1 could have been XML 2004 - maybe excepting that there is a conference by that name - and have the version identifier in the documents be "2.0" or even "2004". Then there isn't the same expectation of compatibility that a "1.1" implies.

About

This page contains a single entry from the blog posted on December 16, 2004 3:12 PM.
