When it comes to versioning and extending software, we've been talking a lot about versioning the described contract. We can examine the contract languages - XML Schema, WSDL, RELAX NG, etc. - for the richness or paucity of their abilities to describe extensions and versions. But there are two important additional aspects that I've been thinking about lately: the described interface versus the actual interface, and an actual interface being a composite interface. This entry is about described versus actual interfaces, and another entry will examine composite interfaces.
Motivation for high fidelity interface descriptions
The major reason for providing a description of an interface is to reduce effort. A described interface enables software to be written without having to repeatedly invoke the interface to discover what it accepts. Instead of learning the interface through numerous runtime errors, a software component uses its knowledge of the description to avoid most of those errors and the checking that goes with them.
It should be obvious that the higher the fidelity of the description, the fewer runtime errors will be generated.
Actual software vs description
The actual interface that the software operates under is always - I say again always - different from the described interface, and the reason is the impedance mismatch between the software and the interface description. The only true documentation of the actual behaviour of the software is, well, the software. Any description language attempts to describe a subset of the overall software functionality.
Now a lot of software, maybe even most, will try to offload functionality into the description language and do less in code. The designer will often try to re-use the description language's features when building the software - why should the designer implement similar functionality in both the description and the actual software? For example, the type descriptions might be used as the source for some generated Java code.
This leads directly into the impedance mismatch between actual and described software. The actual software can either re-use the description or not.
Re-using descriptions
Software that re-uses a description, e.g. Java code generated from schema types, will often extend the description with additional functionality. That's the whole point of software... The raison d'être of programming languages, the if-then-else, will be applied. Software that acts on these types will probably generate runtime errors - probably in the "else" part.
In other words, the software has added additional functionality, probably constraints, to the described types. This means that the interface is actually the described interface + more.
I'll use my favourite Name example to illustrate. Imagine a "Name" processor that has a described Name type which consists of a First and a Last, each of which is a string. Now the software that operates on the name probably won't be happy with just strings. Say the designer decides that First and Last names need more than one character and that the first character must be capitalized. For whatever reason, the description does not contain these constraints, and the software does.
It is rare that software that re-uses an interface description will have fewer constraints than the description. The software would have to either validate messages against the description and then ignore some errors, or rewrite the messages before validation and then undo the rewrite after validation.
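The gap in the Name example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Name` dataclass stands in for the described type, and `conforms_to_description` and `check_actual_constraints` are hypothetical names for the described check and for the extra rules the software applies.

```python
from dataclasses import dataclass


@dataclass
class Name:
    """Stand-in for the described type: First and Last are just strings."""
    first: str
    last: str


def conforms_to_description(name: Name) -> bool:
    """The described interface only says that both parts are strings."""
    return isinstance(name.first, str) and isinstance(name.last, str)


def check_actual_constraints(name: Name) -> list:
    """Extra rules that live only in the software, not in the description."""
    errors = []
    for label, value in (("First", name.first), ("Last", name.last)):
        if len(value) < 2:
            errors.append(f"{label} must have more than one character")
        if not value[:1].isupper():
            errors.append(f"{label} must start with a capital letter")
    return errors


candidate = Name(first="a", last="smith")
print(conforms_to_description(candidate))   # passes the described interface
print(check_actual_constraints(candidate))  # still rejected by the software
```

A client that only reads the description would consider `candidate` valid and would learn about the extra rules only from the errors returned at runtime.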
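The first of those two options, validating and then ignoring some errors, might look roughly like the following. Everything here is assumed for illustration: the dictionary message shape, the error codes, and the decision to tolerate a missing Last name are not taken from any real description language or toolkit.

```python
def validate_against_description(message: dict) -> list:
    """Hypothetical described constraints: First and Last are required strings."""
    errors = []
    for field in ("First", "Last"):
        if field not in message:
            errors.append(("missing", field))
        elif not isinstance(message[field], str):
            errors.append(("not-a-string", field))
    return errors


# Assumption: this particular software is happy without a Last name.
IGNORED_ERRORS = {("missing", "Last")}


def accepts(message: dict) -> bool:
    """Re-use the described validation, then drop the errors we tolerate."""
    remaining = [
        error
        for error in validate_against_description(message)
        if error not in IGNORED_ERRORS
    ]
    return not remaining


print(validate_against_description({"First": "Ann"}))  # description reports an error
print(accepts({"First": "Ann"}))                       # the software accepts anyway
```

The extra filtering step is exactly the kind of work that makes this arrangement rare: the software has to know the description's error vocabulary well enough to discard parts of it.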
The scenario of the described interface having less fidelity than the actual interface is well known but still troublesome to software designers.
No re-use of descriptions
For a variety of reasons, such as performance, the software might not use the described interface. A great example of this is Web browser treatment of HTML. Most browsers do not validate every HTML document against the doctype that is declared in the HTML document. They have a high performance representation of the doctype in their software, and they validate against that internal representation.
Further, it is usually considered desirable for a Web browser to use an interface that is less strict than the described HTML interface. Browsers will take badly formed HTML and render it as best they can. In interface terminology, the interface the software supports has fewer constraints than the described interface. It is also clearly possible that the actual software interface may have more constraints than the described interface; it depends upon the way the software is written.
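The same leniency can be seen outside browsers. As a rough sketch, Python's standard `html.parser.HTMLParser` accepts a fragment with unclosed tags, while the strict XML parser in `xml.etree.ElementTree` rejects it. The XML parser is used here only as a stand-in for a strict reading of a description; it is not a doctype validator, and the `TextCollector` class is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Lenient consumer: records the tags and text it sees, never complains."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.chunks.append(data)


badly_formed = "<p>Hello <b>world"  # neither tag is ever closed

lenient = TextCollector()
lenient.feed(badly_formed)
lenient.close()
print(lenient.tags, "".join(lenient.chunks))

try:
    ET.fromstring(badly_formed)
    strict_accepts = True
except ET.ParseError:
    strict_accepts = False
print("strict parser accepts:", strict_accepts)
```

The lenient consumer has fewer constraints than the strict one, and a sender cannot tell which kind it is talking to from the description alone.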
Summary
Whenever the interface description is not used in the software, there is the chance that the actual software will have either more or fewer constraints than the described interface. If the interface description is used in the software, there is the chance that the actual software will have more constraints than the described interface.
Client software will only find out at runtime whether the software it is invoking has re-used the interface description or not, and whether there are fewer or more constraints than the description provides.
What's the point of this? It is to clearly identify the reality of software development: that the interface into software is different from the interface that is described, and to observe some of the trade-offs that are made when the software designer decides whether to re-use the description and whether to add constraints.