Compatibility and Evolution in an Asynch world: some answers?

Carlos comments on the one-sided nature of substitution and asks some interesting questions, as well as proposing a response. Sean partially answers as well. But I think I differ with both of them.

I believe that the notion of substitutability is mostly about types, and I've argued that doing method substitutability is really hard and darned near impossible. We can examine the issue of synch or asynch compatibility, Liskov's substitution rules, and even Meyer's design by contract, mostly from the perspective of type substitution. The constraints on methods are much simpler than those on types, typically just "keep the methods you had before".

The critical point is that compatibility guarantees are at the individual message level. I argued this in the TAG finding and, more stridently, in a versioning article review. I'll explain what I mean in a slightly different way.

In a request/response scenario, there are 2 messages: the request and the response. In a function call, there are 2 messages: the input and the return of the function. Each of these messages must be examined for evolvability and versioning.

For function y = foo(x), I need to look at x, y and foo for compatibility. For example, if I make subclass x' that extends x, and I want foo to work with x', I have to make sure that x' is substitutable for x. What I actually have is y = foo'(x'). To ensure that the client that only knows about x still works, I have to ensure that foo'(x') and foo'(x) work the same. That is, I want x' to be backwards compatible with x, so that foo' will work with both the old input and the new. The converse is that if I want to make sure that the old foo works with x', that is foo(x) and foo(x') work the same, I'm really saying I want x to be forwards compatible with x'.
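A minimal sketch of the input side may help here. The class and field names (`X`, `XPrime`, `name`, `nickname`) are made up for illustration and are not from any real service; the point is only that foo' accepts the old input unchanged, and that the old foo can still consume the new input by using only what it knows.

```python
from dataclasses import dataclass


@dataclass
class X:
    name: str


@dataclass
class XPrime(X):
    # Hypothetical extension added in the new version; optional so old data still fits.
    nickname: str = ""


def foo(x: X) -> str:
    """Old foo: knows nothing about XPrime and reads only the base field."""
    return f"hello {x.name}"


def foo_prime(x: X) -> str:
    """New foo': works with both the old X and the new XPrime."""
    # Use the extension only when it is present, so a plain X behaves as before.
    nickname = getattr(x, "nickname", "")
    return f"hello {nickname or x.name}"
```

Here `foo_prime(X("ann"))` gives the same answer as the old `foo(X("ann"))`, which is the backwards compatible case, and `foo(XPrime("ann", "annie"))` still works because the old code never looks at the extension, which is the forwards compatible case.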

If I change foo and make subclass y' that extends y, and I want the client code to work, we need to ensure that y' is substitutable for y. In this case, I want y to be "forwards compatible" with y', so the client will treat a y' as a y if it doesn't know about y'.
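The return side can be sketched the same way. Again the names (`Y`, `YPrime`, `total`, `currency`, `old_client`) are assumptions for illustration only. The old client is written purely against `Y`, and the new foo' hands it a `YPrime`.

```python
from dataclasses import dataclass


@dataclass
class Y:
    total: int


@dataclass
class YPrime(Y):
    # Hypothetical field the old client has never heard of.
    currency: str = "USD"


def foo_prime() -> Y:
    """New foo': declared to return Y, but actually returns the extended YPrime."""
    return YPrime(total=42, currency="CAD")


def old_client(y: Y) -> int:
    """Old client code: written against Y, so it touches only Y's fields."""
    return y.total * 2
```

Calling `old_client(foo_prime())` works because the client treats the `YPrime` as a plain `Y`. Note that this only works here because the `YPrime` class is loaded in the same process as the client.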

The backwards/forwards compatibility questions apply to each message. Backwards compatible means you keep working with the old stuff; forwards compatible means you work with new stuff even though you have no idea what it is.

Now we do need to make this object oriented, so we'll say it is actually y = z.foo(x).

Liskov's substitution rule says "methods that use base classes must be able to use derived classes without knowing about it". Assuming z is a base class, a derived class would be y' = z'.foo(x'). Liskov's rule means 3 things actually. Firstly, that x' must be substitutable for x (because a method using foo might only send in x). Secondly, that y' must be substitutable for y (because a client may only understand y even though foo is returning y'). Thirdly, each method in z must be in z'. For the derived class to be used as if it was the original, that is z' can be used as z, then z' must ensure that it still retains the foo function. As I said earlier, trying to figure out how to map foo(x) into foo'(x) is really really tough, so most object systems don't try. They simply say "keep the same methods".

To list the 3 constraints upon z and z' that must be met to ensure that Liskov's substitution rule is met:
- All the methods in z also exist in z',
- the parameters to each method in z' must be backwards compatible with the parameters to the same method in z, i.e. each x' must be backwards compatible with x,
- the return from each method in z must be forwards compatible with the return from the same method in z', i.e. each y must be forwards compatible with y'.
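The three constraints above can be sketched as a small checker. This is only one possible model, under loud assumptions: a service is described as a dict of method name to a hypothetical `Method` record, "x' is backwards compatible with x" is approximated as "z' requires no input fields that z did not require", and "y is forwards compatible with y'" is approximated as "z' still returns every field z returned". Real type systems and schema languages are richer than field sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    # Hypothetical description of one method: field names only, no types.
    required_inputs: frozenset
    outputs: frozenset


def substitution_problems(z: dict, z_prime: dict) -> list:
    """Return a list of reasons why z' cannot stand in for z (empty if none)."""
    problems = []
    for name, old in z.items():
        new = z_prime.get(name)
        # Constraint 1: every method in z must also exist in z'.
        if new is None:
            problems.append(f"{name}: method missing in z'")
            continue
        # Constraint 2: z' must not demand inputs an old caller never sends.
        extra = new.required_inputs - old.required_inputs
        if extra:
            problems.append(f"{name}: z' requires new inputs {sorted(extra)}")
        # Constraint 3: z' must still return what an old caller reads.
        lost = old.outputs - new.outputs
        if lost:
            problems.append(f"{name}: z' no longer returns {sorted(lost)}")
    return problems
```

Under this model z' is free to add methods, add optional inputs, and add outputs; it is only removal or new requirements that get reported.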

Object systems can directly help with these 3 assurances. Compile-time checking can ensure that each of the 3 conditions is met. Additionally, the run-time can do the substitutions, that is, converting each y' into y for the client. Polymorphism is simply the ability to transform the types without the client knowing about it.

But there is a really really big catch in object systems. They invariably assume that z', x' and y' are available to the client, and so the y' return can be cast to y. It's this assumption that simply does not hold in distributed systems. What is REALLY necessary is that when y' is returned to the client, it can create a y out of y' WITHOUT having z' available. I described a few techniques for substituting y' with y in Whither Substitution Rules. This is the essence of distributed evolvability, where one piece of software can change without requiring the other to change. Developers can often make sure that x' is backwards compatible with x; it's making sure that y is forwards compatible with y' that people forget about. And this is why I harp on the "provide extensibility, provide a substitution rule in V1.0 (like Must Ignore unknown extensions), re-use namespace names when making compatible evolutions" set of rules: they ensure that people can make a y that is forwards compatible with y'. And because of XML Schema's limitations, it's hard work to make a y that is forwards compatible with y'.
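As a rough sketch of a Must Ignore style substitution rule, here is a V1 client that builds a y out of a y' message without having any y' class available. The element names (`name`, `first`, `middle`, `last`) and the sample message are invented for illustration, and namespaces are left out to keep it short, which a real design would not do.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Y:
    # Everything the V1 client knows about the return message.
    first: str
    last: str


def read_y(xml_text: str) -> Y:
    """Build a Y from the wire message, ignoring any children V1 does not know."""
    root = ET.fromstring(xml_text)
    # Pick out only the known elements; unknown extensions are simply skipped.
    return Y(
        first=root.findtext("first", default=""),
        last=root.findtext("last", default=""),
    )


# Hypothetical y' message: a newer service has added a <middle> element.
Y_PRIME_MESSAGE = "<name><first>Ada</first><middle>K</middle><last>Lee</last></name>"
```

`read_y(Y_PRIME_MESSAGE)` yields `Y(first='Ada', last='Lee')`. The client never needed a definition of the extended type, only the rule that unknown content is ignored, and that rule had to be in place in V1.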

You have to admit, it's a real shame that we don't have a Web Service description language that can ensure compatible evolution by making sure that derived services don't lose any required input or output methods and we don't have a schema language that enables us to do the type substitutions of y for y' in the absence of y'.

At any rate, synchronous communication forces us to think about message flow in 2 directions, and to examine the compatibility guarantees on each of the messages. Asynchronous communication has the exact same constraints, it's just that the message exchange pattern isn't limited to request/response. You could easily create a synchronous pattern out of 2 asynchronous messages (say using callbacks via WS-Addressing ReplyTo) and the compatibility guarantees for the input and output still have to be planned.
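To make the "synchronous out of 2 asynchronous messages" idea concrete, here is a toy, in-process sketch. It uses Python's `queue.Queue` as a stand-in for one-way message delivery, and a `reply_to` entry in the message as a stand-in for a ReplyTo address; it is not WS-Addressing, and the function and key names are assumptions.

```python
import queue


def service(inbox: queue.Queue) -> None:
    """Handle one one-way request and send a separate one-way reply."""
    message = inbox.get()
    reply_queue = message["reply_to"]  # stand-in for a ReplyTo address
    reply_queue.put({"y": message["x"].upper()})


def call_synchronously(inbox: queue.Queue, x: str) -> str:
    """Look like y = foo(x) to the caller, using two one-way messages underneath."""
    reply_queue = queue.Queue()
    inbox.put({"x": x, "reply_to": reply_queue})  # message 1: the request
    service(inbox)  # in a real system the service runs elsewhere
    return reply_queue.get(timeout=1)["y"]  # message 2: the reply
```

The request and the reply are two distinct messages, so each one carries its own compatibility question, exactly as with the input and the return of a function call.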

Moving to asynch actually opens up more compatibility trouble spots. In synchronous communication, there is always an implied "return message". This return message does not have an operation or method name. In asynch, there may be many different operations/methods that are invoked in response to a request. Therefore the compatibility guarantees are harder to provide.

If we move from y=z.foo(x) to an asynch z that takes in a foo and produces bar, as in z.foo(x); out z.bar(y), then the substitutability of z' for z has added another constraint: each output of z must also be an output of z'. If z' drops bar and the old client is expecting it, then the old client will fail. I did spend some time on protocol evolvability and I've linked to it earlier.
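A small sketch of that extra constraint, with the services modelled as plain functions that call back into the client. The names `OldClient`, `z_prime_compatible`, `z_prime_breaking` and the extra `baz` output are hypothetical.

```python
class OldClient:
    """Old client: sends foo and expects the service to call bar back."""

    def __init__(self):
        self.received = []

    def bar(self, y):
        self.received.append(y)


def z(client, x):
    """Original service: foo in, bar out."""
    client.bar(x + 1)


def z_prime_compatible(client, x):
    """Derived service that keeps bar and adds a new output, baz."""
    client.bar(x + 1)
    handler = getattr(client, "baz", None)  # only newer clients have baz
    if handler is not None:
        handler(x)


def z_prime_breaking(client, x):
    """Derived service that dropped bar in favour of baz."""
    handler = getattr(client, "baz", None)
    if handler is not None:
        handler(x)
```

With `z_prime_compatible` the old client still gets its `bar` callback. With `z_prime_breaking` nothing ever arrives, so the old client is left waiting even though the `foo` input side did not change at all.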

To summarize, synchronous compatibility is simply the combination of the compatibility guarantees of 2 asynchronous message exchanges. Compatibility for asynchronous and synchronous messaging is best thought of at the message level, not the operation level. And you need to do some work in designing your Schemas and using WSDL to enable compatible evolution, as the tools don't provide this out of the box.

1 Comment

"You have to admit, it's a real shame that we don't have a Web Service description language that can ensure compatible evolution by making sure that derived services don't lose any required input or output methods and we don't have a schema language that enables us to do the type substitutions of y for y' in the absence of y'."

Well, one way to look at it is that the problem isn't the lack of a decent description language, it's the fact that a description language is needed at all!

I'd love to see a new version of this post done with the assumption that all services have the same interface.

About this Entry

This page contains a single entry by Dave Orchard published on June 18, 2004 7:13 PM.
