It was great to work with Thomas, Umit, Hugo, Anish, Priscilla, and the other authors on the book.
Results matching “soa” from Dave Orchard's Blog
These specifications (WS-Transfer, WS-ResourceTransfer, WS-MetadataExchange, WS-Eventing and others) are not guided by the web architecture nor by the W3C TAG. I'm not supportive of any work happening at the W3C that is not governed by the W3C TAG, including the Web Architecture document. The harm is that many new resources are being created and used in a separate information space from the Web, reducing their utility and the utility of the web.
The WS-* specifications, particularly WS-Transfer, are fundamentally separate because they effectively make no use of URIs and they re-invent HTTP, HTTP cookies, and even fragment identifiers (part of WS-ResourceTransfer).
As a leading member of the WS-* community, I completely understand the motivations for multi-protocols (hence reinventing HTTP), QName based dispatch (SOAP headers and the WS-* additions), formal description languages for scaling #s of operations (WSDL, WS-Policy), discovery, etc. Much of the WS-* functionality will be recreated in Web land. For example, creating many resources that have varying security requirements around encryption and signing, that can be discovered and used without a human requires a description language, a security description language, description discovery, and description matching. This works now in WS-* toolkits.
It's just that they aren't part of the web, and they basically flipped the web architecture and the TAG the bird.
SOAP was the start. The TAG wanted a clean integration between the Web and SOAP services but XMLP didn't deliver it. Sam Ruby's proposal that an XML response could be interpreted as a SOAP body without any headers never got traction. My proposal for mapping SOAP-RPC into HTTP GET went nowhere. What we got was the SOAP-Response MEP, and nobody does it. It was the usual "too late in the process" argument, and fundamentally the major players weren't interested in a real solution.
Then WSDL came along and finished the job. The HTTP Binding in WSDL 1.1 is completely broken and not implemented. WSDL 2.0 has a great HTTP binding. WSDL 2 WG really took the integration to heart, but again the major vendors weren't interested. MSFT killed WSDL 2.0 by walking away from deploying it.
WS-Addressing came next and re-invented HTTP Cookies with the EndpointReference's Reference Parameters/Properties. Here the TAG pointed this out right at the outset, and the issue of URIs for identification was even in the WG's charter. I proposed a bunch of QName to URI mappings that would allow binding the Reference Properties into the URI for HTTP GET. The WG, over my concerns, decided to simply drop Reference Properties and the related EPR comparison section. That just hosed anybody that wanted to use them for identifiers, like WS-ReliableMessaging anonymous endpoints. As a result, people came up with even worse hacks like adding attributes to Reference Parameters to indicate they really are identifiers!
Now we're at the final stages of the WS-* standardization. Given that any WS-Resource stuff WG at the W3C won't be guided by the Web architecture and they won't listen to the TAG, there is no technical reason why the work should ever happen at the W3C.
The obvious reason why the work would happen at the W3C is if the W3C Membership wants it. If enough people and companies want to work on this at the W3C, the W3C ought to listen to that.
Perhaps the W3C needs a separate section for vendors that want to do whatever they want with a W3C URI for their specifications and the awesome W3C tools (I <3 Zakim and RRSagent).
But the work shouldn't be able to claim that it has gone through the full W3C process including architectural oversight. And the TAG should not claim that it has had any oversight or review of the products.
I am against XRIs because I believe the benefits, which can be achieved in other designs, do not come close to warranting the costs, including complexity of software, user experience and harm to the Web. The W3C TAG, chaired by Sir Tim Berners-Lee and Stuart Williams, recommends against XRIs. Henry Thompson and I have gone into much more detail previously in the draft TAG finding http://www.w3.org/2001/tag/doc/URNsAndRegistries-50
My concerns can be categorized into 3 major categories:
- Replacement of ICANN/DNS
- Replacement of URIs
- Replacement of HTTP
These will be examined in order. Then I will go through a few of the claims that are made by XRI.
1. Replacement of ICANN and DNS.
XRI and XRI resolution are a replacement for ICANN and DNS. I believe that wanting to replace the core naming system on the internet had better have some huge benefits to the users and developers, certainly more than XRI owners receiving a few dollars per name instead of domain registrars. There is the obvious duplication of issues around persistence of names and naming conflicts. The costs appear to be similar, that is =orchard and orchard.name cost roughly the same. There are many claims as to the benefits of using iNames instead of domains, yet the comparisons against ICANN appear to be unfounded or marginal. For example, there are claims that domain names will be "lost", yet many domain registrars provide 10 year or more terms and numerous renewal messages. There are claims that XRIs help prevent "phishing" but that smacks of fear-mongering.
Dan Brickley did a great job of extracting the business model and IPR issues around XRIs, even back to the earlier incarnation as XNS in http://danbri.org/words/2008/01/29/266
2. Replacement of URIs by XRIs
There are a number of concerns related to the replacement of URIs by XRIs.
2.1 Unregistered URI scheme requiring escaping to be legal URI characters
XRI syntax specifies a new URI protocol/scheme but has not registered it. (from http://www.oasis-open.org/committees/download.php/15376/xri-syntax-V2.0-cs.html)
"However because XRI syntax includes syntactic elements other than those defined in [IRI] and [URI], this specification defines a new protocol element, "XRI", along with rules for transforming XRI references into generic IRI or URI references for applications that expect them."
The crucial problem is the requirement that any document containing XRIs MUST be processed by an XRI processor before any URI resolution is done, without any clue from the media type or even a URI used to retrieve the document. This requirement for pre-processing adds considerable complexity. It is effectively inserting a new processing step at the beginning of the XML Processing Pipeline.
Additionally, any new protocol elements should be registered as URI schemes, though I expect it would be rejected because of the escaping required to become valid URI characters.
I would expect that registering a new URI scheme, especially for a specification that has been fairly done for 3 years, would be done before finalization. The W3C does this with registering media types and the same process is very reasonable. Leaving registration until after standardization is leaving it far too late.
2.2 Replacing relative URIs
XRI introduces base XRIs that mirror base URIs. In many situations, relative URIs may be mistakenly used as relative XRIs, and relative XRIs may be mistakenly used as relative URIs. This could be incredibly harmful and seems similar to mistaking pounds for kilograms. A key part of this is that setting base XRI is unclear.
There is a claim that "=orchard" will only be recognized as a relative XRI in the presence of a base XRI. This claim is unsupported in the specifications. There is no text that specifies that a URI starting with XRI special characters is a relative URI when a base XRI is available. There is no specification of what behaviour is expected when a base URI is available, a base XRI is available, and a URI contains a relative XRI. Harmfully, it is not clear what behaviour should occur when a base URI and a base XRI are available and the URI string contains a relative URI, i.e. "orchard".
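To make the ambiguity concrete, here is a minimal sketch of what an ordinary URI processor does with these strings. It uses Python's standard `urljoin`; the base URI `http://example.org/people/` is a made-up placeholder, and nothing here comes from the XRI specifications. The point is only that generic URI software has no reason to treat "=orchard" as anything other than a relative path.

```python
from urllib.parse import urljoin

# Hypothetical base URI of a document that mentions "=orchard".
BASE_URI = "http://example.org/people/"

def resolve_as_uri(reference: str, base: str = BASE_URI) -> str:
    """Resolve a reference the way a generic (non-XRI) URI processor would."""
    return urljoin(base, reference)

if __name__ == "__main__":
    # Both strings are treated as relative paths against the base URI.
    for ref in ("orchard", "=orchard"):
        print(ref, "->", resolve_as_uri(ref))
```

An XRI-aware processor would presumably read "=orchard" quite differently, which is exactly where the two interpretations can collide.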
The issue of short names grounded in URI (or even XRI) space continues to surface. Namespaces, QNames, CURIEs, Microformats, RDFa, and many technologies are looking into this area. With all the diversity and potential issues around interpretation of short names in strings or URI types, probably the last thing that the community needs is a short name grounded in non-URI and even non-HTTP space.
2.3 Lack of dereferencability by HTTP software
The XRI documents show the use of an XRI in a namespace name. In such a scenario, a user does not accrue the advantages of http: URIs, i.e. namespace documents being retrieved from dereferencing the namespace URI. The benefits of using URIs are described in Arch Web, Namespace Document (http://www.w3.org/2001/tag/doc/nsDocuments/), and The Self-Describing Web (http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html).
2.4 Extra Effort and Confusion
The choice of XRIs versus URIs causes extra effort on the part of implementors and users. For example, OpenID allows XRIs for user IDs as well as namespace declarations. There are a number of special rules added for XRIs, adding significantly to the development cost.
2.5 Lack of Interoperability
Any time a specification, such as OpenID, has optional features, implementations may and often will ignore them. It is a continual source of problems for users and implementors. A factor in whether a specification is testable and interoperable is the number of such optional features, lower being better.
In the case of OpenID, when providers and consumers do not support XRIs, of which there are many, it will invariably lead to confusion and problems.
2.6 Version Confusion
The TC is publishing the XRI Syntax V2 Committee Specification, 14 November 2005 and XRI Resolution V2 Committee Specification 1, 12 April 2008. The XRI Resolution V2 specification specifies that it is related to Extensible Resource Identifier Syntax V2, Committee Specification, December 2005, which I cannot resolve and I assume is the November 2005 specification.
OpenID uses the XRI Syntax Committee Specification, 14 November 2005 and XRI Resolution Working Draft 10, 18 March 2006. I cannot find the differences between the Resolution March 2006 draft and the April 2008 specification, or whether those changes are applicable to OpenID.
Presumably this means that OpenID 2.0 would need to be updated to refer to the latest version of XRI Resolution. Such an update cycle would be more usefully spent removing XRI from OpenID.
3. HTTP Replacement
Instead of using HTTP-based content negotiation, the requested media-type is encoded in the URI. For example,
Request URI:
http://xri.example.biz/=example*book?_xrd_m=application/pdf;sep=true
The "sep=true" media type subparameter indicates the proxy resolver should
perform service endpoint selection using the media type requested; the
absence of an _xrd_r parameter means it must return a redirect as specified
in section 11.7.
Again, they've re-invented something from HTTP.
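For comparison, here is a rough sketch of the same intent expressed with plain HTTP content negotiation. It pulls the media type back out of the `_xrd_m` parameter in the URI quoted above and puts it in an `Accept` header instead. The target URL `http://example.org/book` and the function names are my own placeholders, and no request is actually sent.

```python
from urllib.parse import urlsplit
from urllib.request import Request

# The proxy-resolver URI quoted above, with the media type packed into the query string.
XRI_PROXY_URI = "http://xri.example.biz/=example*book?_xrd_m=application/pdf;sep=true"

def media_type_from_query(uri: str) -> str:
    """Pull the requested media type out of the _xrd_m query parameter."""
    name, _, value = urlsplit(uri).query.partition("=")
    if name != "_xrd_m":
        raise ValueError("no _xrd_m parameter found")
    # Drop subparameters such as ";sep=true".
    return value.split(";", 1)[0]

def negotiated_request(url: str, media_type: str) -> Request:
    """Plain HTTP content negotiation: the media type travels in the Accept header."""
    return Request(url, headers={"Accept": media_type})

if __name__ == "__main__":
    wanted = media_type_from_query(XRI_PROXY_URI)
    request = negotiated_request("http://example.org/book", wanted)  # hypothetical URL
    print(request.get_method(), request.full_url)
    print("Accept:", request.get_header("Accept"))
```

With the header approach the URI identifies the resource and nothing else; the preferred representation is stated separately, which is what HTTP already provides.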
Specific XRI Claims
I posted a response to XRI solves Real Problems in http://www.pacificspirit.com/blog/2008/05/28/xri_solves_what_real_problems. In general, I didn't find real user problems in the document.
There are a few detailed claims that I have found that can be examined.
1. HTTP URIs are bound to a specific network protocol. XRIs are by definition protocol independent.
Technically, no. The TAG, in http://www.w3.org/2001/tag/doc/SchemeProtocols.html, and Roy Fielding have all regularly disputed this.
If the desire is to be protocol independent, then I suggest that there are specifications like SOAP and WSDL that are designed for protocol independence that are more suitable for achieving that goal. Interesting that they aren't that trendy, in large part because they are protocol independent. Protocol independence appears to be a bug on the web, not a feature.
2. HTTP URIs do not provide any standard way to determine if they are reassignable or persistent. XRIs provide unambiguous syntax to distinguish between persistent and reassignable identifiers.
The XRI community could easily have provided a URI template, of the kind that Mark Nottingham, Joe Gregorio, others and I have worked on (http://bitworking.org/projects/URI-Templates/draft-gregorio-uritemplate-00.html), that used http: URIs. The approved TAG Finding on Metadata in URIs (http://www.w3.org/2001/tag/doc/metaDataInURI-31) explicitly licenses the relevant authorities to assign metadata in URIs.
3. HTTP URIs do not have a standard service discovery format or protocol.
XRIs are discoverable using the XRDS format created by the XRI TC (and now widely adopted by OpenID, OAuth, Higgins, and other identity frameworks).
Dereferencing an HTTP URI can easily retrieve a representation. That is a fine and widely adopted protocol. There is a lot of work in various communities to standardize how to get metadata given an identifier, such as Semantic Web, Web Services (WSDL and WS-MetadataExchange), Microformats, RDFa, Link Header (http://tools.ietf.org/id/draft-nottingham-http-link-header-01.txt) and more.
I think the use of XRDS and XRDS-Simple is interesting as there does need to be a metadata specification for services. I've worked on WSDL and WS-MetadataExchange and I'm intimately familiar with many of the issues.
But XRDS is completely separable from XRI. The XRDS format and protocol could easily build on http: URIs.
There are a couple more claims, but I desperately need to get this article published so I'll discuss them slightly later.
Summary
I believe that XRIs as a new naming scheme and identifier format do not nearly justify the associated deployment costs. The Web is built on URIs, HTTP and DNS. For our users' sake, we should use deployed technologies where possible. In the case of XRIs, it is very possible, proven by the fact that most OpenIDs are http: URIs.
In all the hysteria about Bisphenol-A, it's been hard to find any real facts. Simple things like:
- What's the real danger using our baby bottles?
- What are the effects of washing bottles? In particular, if we already have been washing our bottles in our dishwasher, does washing them by hand in cooler water with softer soap help?
- What are the exact bottles that are the problem?
- What about the lids, the inserts, and all the other plastic things that are used and washed?
1. Real Danger
I spent some time and found a few solid answers. One interesting report is the "Baby's Toxic Bottle". This has a good list of the issues, results and recommendations.
2. Effects of heating and washing
The key determinant of the amount of BPA released is temperature and there is no "residual effect" - Food Addit Contam Mar 2008. This supersedes a study that also found the amount of BPA is related to temperature but additionally found a residual effect of repeated exposure - Food Addit Contam 2003.
This science supports the recommendation in BTB of
- "If you continue to use polycarbonate bottles, do not use harsh detergents or put bottles in the dishwasher. Instead, clean them with warm soapy water and a sponge. Scouring brushes can scratch the surface of the bottles and increase leaching rates"
- "Avoid heating foods in polycarbonate containers, as bisphenol A tends to leach faster with higher temperatures. Use glass or ceramic containers instead"
3. Exact Bottles
From BTB,
- "Use glass, or polypropylene bottles (the #5 plastic) instead of polycarbonate (hard, shiny, clear or tinted plastic, usually with a number 7 or "PC" on the bottom/underside) bottles"
It looks like our playtex bottles were the least of the problems, but not by much.
But.... The Z report on BPA in infant care pointed out a few things differently. They said it was #3 and #6 plastics that were also a problem, and not all #7s are polycarbonate. It looks like all the playtex bottles we have are #5s, but are not listed as being BPA-Free. Maybe that means the bottle is ok but the lid and/or insert are polycarbonate?
4. Other materials
Looks like it's #7 items. But I've also heard that it could be #8 though I couldn't find any evidence. The lids have a number on them, but it's not in the recycling style, so I don't know.
I've been reviewing the Access Control for Cross-site Requests document. One interesting aspect of the document is that it specifies how a web site can authorize other web sites to do non-GET operations such as PUT or DELETE. The client makes an authorization request by creating an HTTP GET with the HTTP header Method-Check. The server then responds with an HTTP Response containing Access-Control HTTP Headers or even an XML document with Processing Instructions.
Now the part that I found very interesting is that it seems that the client's authorization request isn't really for the resource identified by the URI, because the goal is to actually get the authorization information. Thus, an HTTP GET has been over-ridden to be a GET of metadata about a resource. Also interestingly, if the URI for some reason doesn't know about the Method-Check header, then it will return the "wrong" representation, that is the actual representation. There is no way of requiring that the server knows about the Method-Check request.
Over in WS-* land, WS-ResourceTransfer is a specification that uses a SOAP header wsrt:ResourceTransfer to indicate that there may be RT specific extensions to the WS-Transfer operation, such as GET. Because it uses a SOAP header, it can use the soap:mustUnderstand attribute to require that the server understand it.
Seems to me like this is an interesting case of where SOAP solves a problem that the Access Control for Cross-site requests has, that is the ability to mark a header as mustUnderstand. This isn't surprising, given that SOAP was exactly created to solve problems with HTTP headers.
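As a rough illustration of the difference, here is a sketch of the check a SOAP 1.1 receiver is expected to make before acting on a message. The envelope is invented for the example, and the `wsrt` namespace URI is a placeholder rather than the real WS-ResourceTransfer namespace; only the SOAP 1.1 envelope namespace and the `mustUnderstand="1"` convention are taken from SOAP itself.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
WSRT_NS = "http://example.org/wsrt"  # placeholder, NOT the real WS-ResourceTransfer namespace

# Invented example message carrying one mandatory header.
ENVELOPE = f"""
<soap:Envelope xmlns:soap="{SOAP_NS}" xmlns:wsrt="{WSRT_NS}">
  <soap:Header>
    <wsrt:ResourceTransfer soap:mustUnderstand="1"/>
  </soap:Header>
  <soap:Body/>
</soap:Envelope>
""".strip()

def unsupported_mandatory_headers(envelope_xml: str, understood: set) -> list:
    """Return the mustUnderstand headers that this receiver does not recognise."""
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_NS}}}Header")
    if header is None:
        return []
    flag = f"{{{SOAP_NS}}}mustUnderstand"
    return [h.tag for h in header if h.get(flag) == "1" and h.tag not in understood]

if __name__ == "__main__":
    # A receiver that knows nothing about the header must refuse the message.
    problems = unsupported_mandatory_headers(ENVELOPE, understood=set())
    print("fault, not understood:" if problems else "ok to process", problems)
```

A plain HTTP server has no equivalent step: a request header it doesn't recognise, such as Method-Check, is simply ignored and the normal representation comes back.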
Last week I attended and presented the BEA position at the W3C Web of Services for Enterprise Computing. I don't want to go through all the various discussion as I think Paul, Jonathan, and Eric do a great job.
I found much of it interesting and useful. Real-world discussion of how much traction SOAP and WSDL are getting in B2B and enterprise scenarios was great. I liked the WS-Core WG idea. A number of folks commented positively on a couple of the W3C TAG findings I'm working on: versioning and state. However, I had a couple of disappointments.
There was almost no discussion about how enterprise computing is different from "non-enterprise computing". I argued it might be something like higher trust in well-behaved clients, so state, security, extensibility, etc. might be different. But almost no follow through, though maybe just because I presented midway through the 2nd day.
Given all the angst about Web architecture vs Web services architecture, I was also surprised that there was no support for technical reconciliation. I suggested WADL (Web application description language), to help with enterprises and the desperate perl/python hacker building more strongly typed REST services. And for the flipside, I suggested improved SOAP to URI/XML bindings so SOAP/WSDL services would be more easily consumable by REST clients. There were 2 votes (including mine) for doing WADL, and 2 votes against doing WADL. I'm still surprised that there wasn't more support for technical ways of bringing the two architectures together. Perhaps this is because of the way the voting was structured, which was pick 2 items out of about 15.
Reposting:
What is SOA?
This document provides a technical description of SOA and what it means for architects and developers. Readers should have some familiarity with current technologies such as Web, XML, and Web services. This document provides a view of SOA as a broad set of architecture, design principles and choices used in building distributed systems. In essence, this document views SOA as a few core principles and a set of design options or “knobs” that are set differently for each particular application depending upon the features required.
There are many reasons to follow SOA design principles and options. The usual main goal is to build software that provides components that are usable by a variety of other components in a distributed environment, aka re-usable software components. Other goals include optimizing functionality, costs and non-functional requirements like scalability, performance, extensibility and security.
There are as many definitions of Service Oriented Architecture as there are people in the technology industry. Despite the diversity of opinions, the community does have some commonly held beliefs about what SOA is and isn't. The "word on the street" is that SOA is all about building software components that can be easily created, accessed, discovered, and re-used using a variety of tooling and deployment environments and platforms. The environmental diversity requires that the distributed application has reliability, scalability, availability, extensibility, versioning, and configurability. These "-ilities" are often called non-functional requirements, and are formally called the architectural properties of key interest in Dr. Fielding's REST thesis (http://www.ics.uci.edu/~fielding/pubs/dissertation/net_app_arch.htm#sec_2_3). Successful distributed systems embody an appropriate mix of these properties and have been built for over 30 years. We merely stand on the shoulders of giants when we successfully deploy SOA based systems.
If there is one phrase that is the essence of SOA, it is “how to build composite distributed systems”. In order to be reusable, reliable, available, scalable, etc. - the properties of interest - the distributed components must reach an appropriate level of "loose-coupling". The realization of loose coupling, which is the ability to add, modify and delete components with minimal impact on other components, is done through the definition and description of interaction patterns between components, including the appropriate selection of technologies. There is no way to build useful distributed systems that are completely decoupled, rather there are a variety of choices that can be made in the system design that affect various aspects of the systems. Each of the properties of a system is a trade-off compared to other properties. For example, a synchronous service is usually more coupled in time with its client than an asynchronous service. On the other hand, an asynchronous service usually requires the client open up its address space for a "callback" which increases the coupling in shared distributed state, including address space. More detail is at http://www.webservices.org/weblog/dave_orchard/async. There are numerous design choices made in a distributed system, and the collection of these choices results in the behavior and properties of the system. Effectively, loose coupling and SOA are phrases that represent the possible properties of a system.
SOA includes the technology and design choices for the system, but also includes the organizational processes around SOA. Much of the information available on SOA today focuses on the organizational and governance processes. This document is complementary to those aspects, and focuses solely on measurable technical aspects.
A difficulty emerges in describing Service-oriented architecture because there are many different views on what architecture is. My view of architecture is that it is a set of principles, constraints or decisions that yield the desired functional and architectural properties. The W3C Architecture of the World-Wide Web follows this architectural model (http://www.w3.org/TR/webarch/) to describe the World-Wide Web. This is unlike other definitions of architecture which are intentionally vague and do not provide any constraints, principles or decisions.
Another difficulty that emerges is defining what a service is, as compared to other computation units such as component or object. How does one distinguish between these things and say “this is a service” and “that is an object”? The answer is difficult because there are few hard and fast rules to differentiate between them. Can a CORBA component that uses XML, HTTP with a WSDL description be called a Service? Can a SOAP/WSDL/WS-Addressing endpoint that requires a Factory Operation to get an Endpoint Reference with Reference Parameters (eerily similar to CORBA architecture) and Metadata that indicates an X.509 certificate is required for authentication be called a service?
We must be clear about what we mean by a Service. I adopt the convention that a Service is software that follows SOA design principles, which roughly means that it is intended for use across varied and distributed software platforms. This highlights the need for interoperability, platform independence and planning for widely distributed access.
SOA Trade-off Principle: SOA deployment is a trade-off between properties of interest.
The properties of interest, such as Reusability, Reliability, Availability, Serviceability, Extensibility, Versioning, cannot all be achieved. There is a trade-off between them. An analogy is that there is a dial for each of the properties. The dials can be adjusted, potentially changing other dials. Each decision made, such as asynchronous services, affects the dials. Any successful deployment of an internet scale distributed system must account for the property dial settings. This paper will describe further details on the trade-offs between these “ilities” or properties.
SOA defined message format Principle: Services have well defined message formats.
Well defined message formats decouple the execution environment of systems because the message parsing software can be decoupled from the application and the platform. XML is the obvious technology as the basis for well-defined message formats. Using XML enables software components, like editors, parsers and toolkits, to be re-used across platforms.
When XML is used to communicate with distributed software, we call these Web services. We attempt to avoid the difficulties in relating SOA, Web services and technology choices such as WSDL and SOAP. Instead, we offer the view that Web services are services with an XML interface. Web Service specifications, such as SOAP, WSDL, WS-Addressing, etc. are technology choices but not the embodiment of deployed Web services.
SOA Interface Principle: Services have described interfaces.
SOA requires that the services have described interfaces. Without a described interface, all the software components are coupled to each other and it is difficult to scale to many different potential users and platforms. A described interface reduces the coupling between two or more systems by making the software components depend upon the explicit interface description, rather than the software with implicit assumptions. The more precisely the software's behavior is described in an interface description, the fewer implicit assumptions are made and thus the looser the coupling is between the software components. SOA's use of described interfaces provides a measure of loose coupling because the service dependency isn't a software to software dependency; rather it is N software to interface dependencies. This is predicated on the fidelity of the individual interface contracts. If an interface contract is basically anything allowed, then there is still tight coupling between components.
Again, interface description languages for XML are the obvious choice. XML has description languages like DTDs, XML Schema, and RelaxNG for defining the contract. When used in the Web services context, an important description language is WSDL, which provides specific contract details. Interfaces, operations, schemas, etc. can all provide loose coupling if they are designed properly.
This is not to say that SOA requires XML Schema or WSDL. There are downsides to these technologies, such as complexity. Many services can be described in textual documentation, templates, RelaxNG, Schematron, etc. However, as widely deployed standards, XML Schema and WSDL provide a number of advantages for widespread deployment of services. The benefits for tooling, automated processing, standardized discovery are clear.
SOA Interface Fidelity Principle: The richness of the interface description relates directly to the amount of coupling.
An interface can have a wide range of fidelity or richness, from lightly described to very detailed. The simplest interface is that a message with any parameter is received and a response is produced. This lack of information means that the consumer of the interface will have to find the information about the message contents from some other source, perhaps a phone call or trial and error. The more that interface specifies the types of inputs and outputs, the relationships between the messages and the semantics of the messages, the less that a human or software must do to interact and find the "real" information. System designs should endeavor to provide the richest or highest fidelity interface description that is useful to interact with the service.
Compliance with the interface, including extension, affects coupling. A service that does not fully comply with the interface description will probably not perform as expected by the client. To interact with the service, the client has to determine the amount of non-compliance, which will invariably be very time consuming. This will result in the most tightly coupled implementation because the client is coupled to an undocumented and unsupported aspect of the service, which might change at the service provider’s discretion without any warning or notification.
The interface description may include information about the “qualities” of the service, such as the availability, reliability, etc. These are often called Qualities of Service (QOS) and comprise a Service Level Contract. The more that the quality of service is described, the less the service consumer has to make assumptions about the service.
Richness or fidelity of the description does not mean that unnecessary or inappropriate information should be included in the description. The goal of an interface description is to provide the highest possible chance of continued interoperability with minimal human intervention. Care should be taken to keep inappropriate information or “noise” out of the interface description. An oft-quoted expression is, “as much information as necessary and no more”.
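A small sketch may help show the two ends of the fidelity range. The quote service, its field names and the fixed unit price below are all invented for illustration; the only point is how much a consumer can learn from the interface itself in each case.

```python
from dataclasses import dataclass

UNIT_PRICE = 2.5  # made-up price used by both variants

def quote_low_fidelity(message: dict) -> dict:
    """Accepts 'anything'; the caller must learn the expected keys from some other source."""
    return {"price": message["quantity"] * UNIT_PRICE}

@dataclass(frozen=True)
class QuoteRequest:
    sku: str
    quantity: int  # number of units, must be positive

@dataclass(frozen=True)
class QuoteResponse:
    sku: str
    total_price: float  # quantity times the unit price

def quote_high_fidelity(request: QuoteRequest) -> QuoteResponse:
    """Inputs, outputs and one constraint are stated in the interface itself."""
    if request.quantity <= 0:
        raise ValueError("quantity must be positive")
    return QuoteResponse(sku=request.sku, total_price=request.quantity * UNIT_PRICE)

if __name__ == "__main__":
    print(quote_low_fidelity({"quantity": 4}))
    print(quote_high_fidelity(QuoteRequest(sku="A1", quantity=4)))
```

Both functions compute the same thing, but only the second tells a consumer what to send and what will come back without a phone call or trial and error.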
SOA Composite Behavior Principle: The actual behavior is the sum of all the behaviors in the software
An interface description can never fully and completely describe a system’s behavior. A single component or node in a software system invariably consists of multiple components, like a servlet engine using a J2EE component in front of a relational database. Each of these components has their own interface descriptions, constraints and behavior. The actual interface to the component will be the sum of each of the components. For example, if the interface allows extensibility to the servlet, but the J2EE component does not, then the system as a whole is not completely extensible.
The effect of this principle is that each of the components within the system should be designed with the overall systems properties in mind. If extensibility is a goal for the system, then each of the components should be designed for extensibility.
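The servlet and J2EE example can be reduced to a very small sketch. The `Component` type and its single `allows_extensions` flag are assumptions made for illustration; real components obviously have much richer behavior than one boolean.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str
    allows_extensions: bool  # does this layer accept content it was not designed for?

def system_is_extensible(components) -> bool:
    """The system is only extensible if every component in the chain is."""
    return all(c.allows_extensions for c in components)

if __name__ == "__main__":
    stack = [
        Component("servlet", allows_extensions=True),
        Component("J2EE component", allows_extensions=False),
        Component("relational database", allows_extensions=True),
    ]
    print("extensible end to end:", system_is_extensible(stack))
```

One rigid component in the middle is enough to make the whole node rigid, whatever the outermost interface promises.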
SOA Interface Coupling Principle: Software should be as loosely coupled as possible to the interface
The interface definition is one aspect of loose coupling between systems, and the software that implements the interface is another. For a given interface, implementations can vary widely. The variations can range from languages to platforms to hardware and more. Each implementation will be coupled to the interface differently. One implementation might remain unchanged for a particular interface change, whereas another implementation would need to be changed. And looking the other way, some implementation changes may require an interface change and other implementation changes do not.
The two types of coupling to an interface are: what interface changes require software changes, and what software changes require interface changes? The design of the software that implements the interface affects the resiliency to changes in software or interface. For example, XML Beans enables the Java software to handle some changes in the XML data (interface) without breaking the software. However, this is predicated upon the interface allowing these changes. Hence for systems to be loosely coupled, it is necessary that both the interface AND the software permit the changes. To a degree, the interface and software must align.
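A minimal sketch of software that tolerates some interface changes follows. It does not use XML Beans; Python's standard xml.etree.ElementTree stands in, and the name document, its fields, and the read_name function are all hypothetical. The reader picks out only the elements it knows about, so a later version of the message that adds an element does not break it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fields this version of the software understands.
KNOWN_FIELDS = ("first", "last")


def read_name(xml_text):
    """Extract the known fields and silently ignore any other child elements."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root if child.tag in KNOWN_FIELDS}


# Version 1 of the (made-up) interface, and a version 2 that adds an element.
V1 = "<name><first>Ada</first><last>Lovelace</last></name>"
V2 = "<name><first>Ada</first><last>Lovelace</last><title>Countess</title></name>"

print(read_name(V1))
print(read_name(V2))
```

This only works because the made-up interface permits additional elements; a schema that forbade them would reject V2 before the software ever saw it, which is the alignment the paragraph above describes.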
SOA Distributed Principle: Services should be designed for existence in a widely distributed and heterogeneous computing environment.
We will adopt the view that SOA is about delivering services to widely distributed clients. The assumption is that services can be deployed at a global, or internet scale.
There are many SOA communities that are not intended for delivery to a global or internet scale. Some SOA communities exist in non-distributed and homogeneous environments. Other SOA communities are global in distribution, heterogeneous and internet scale but are closed, such as an intranet. Many of these communities may not need to pay the costs for designing for distributed environments. However, it is often the case that local services are deployed for a wider and wider set of clients and environments. Planning for distributed interactions up front can result in dramatic savings compared to redeveloping a service. From here on, we will associate SOA with distributed services, rather than closed communities. Closed communities will often benefit from designs that are structured as distributed services.
As we've been building distributed systems for many decades now, there are a number of canonical "lessons learned" that we choose to acknowledge and embrace. Peter Deutsch wrote up 8 seminal fallacies in building distributed systems (http://today.java.net/jag/Fallacies.html). They are:
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn’t change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous
Earlier attempts at distributed computing, such as distributed objects, made some erroneous assumptions about the environment the services operated in, such as reliable networks and no latency. A SOA does not make the assumptions that Peter Deutsch listed. The essence of the fallacies is that software components should make as few assumptions as possible. That isn’t to say that each component has to deal with the fallacies itself, rather the components combined with their environment should not suffer any of the fallacies. For example, a software component might be built assuming a reliable network with the commensurate knowledge that the reliability would be provided, such as by JMS or WS-ReliableMessaging. In general, the more fallacies that are followed, the more the systems are coupled and frail.
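The idea that a component may assume reliability when its environment supplies it can be sketched as follows. This is not JMS or WS-ReliableMessaging (there are no acknowledgements, sequence numbers or duplicate detection); it is a simple retry wrapper, and FlakyTransport, RetryingTransport and place_order are hypothetical names used only for illustration.

```python
class FlakyTransport:
    """Hypothetical transport that loses the first `failures` messages."""

    def __init__(self, failures):
        self.failures = failures
        self.delivered = []

    def send(self, message):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("message lost")
        self.delivered.append(message)


class RetryingTransport:
    """Environment layer: retries sends so callers may assume delivery."""

    def __init__(self, inner, max_attempts=5):
        self.inner = inner
        self.max_attempts = max_attempts

    def send(self, message):
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.inner.send(message)
                return
            except ConnectionError:
                # Give up only after the final attempt has failed.
                if attempt == self.max_attempts:
                    raise


def place_order(transport, order_id):
    # The component is written as if the network were reliable.
    transport.send(f"order:{order_id}")


flaky = FlakyTransport(failures=2)
place_order(RetryingTransport(flaky), 42)
print(flaky.delivered)
```

The component itself still commits the first fallacy; it is the combination of component and environment that does not.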
The existence of Services in a distributed environment requires that the service providers and consumers construct their software and systems with specific constraints to avoid the fallacies and hence be more loosely coupled. There are variable technical and business costs to sending messages across security, network, and computing environments. The assumptions of each environment must be accounted for in the design of the Services. In this view, the Web is a SOA and CORBA "could" be a SOA though most implementations aren't. SOA is not new or anything radical, and in fact does not actually preclude many existing technologies. The same way that CORBA services may be built in a SOA fashion, Web services applications may be built in a non-SOA and tightly coupled manner.
SOA Decentralization Principle: Services should be designed and planned for decentralized administration
Services can span fiefdoms of administrative control. This can be between companies, between departments within a company, between individuals and companies, and further. The administration of the components that comprise and use the service is by many people. The web is a brilliant example of SOA with respect to decentralized administration. Browsers, Servers, and applications can all evolve on different time scales. This principle avoids the pitfalls of Myth #6, which states "there is one administrator". The interface and software design should account for this separate and distributed evolution of software. This does not mean that there is some federated management system that handles the disparate administration; rather, each Service within its administrative domain plans to use and be used by Services that are separately managed.
One aspect of this is planning for extensibility and versioning of services. The more that each Service can evolve without affecting other Services, the more loosely coupled they are. Some useful reading materials on extending and versioning languages are:
• How to use XML Schema 1.0 for extensibility and versioning (http://www.xml.com/pub/a/2004/10/27/extend.html)
• A set based model for how to design any language for compatible versioning (http://www.xml.com/pub/a/2006/12/20/a-theory-of-compatible-versions.html)
• W3C TAG finding on Versioning (http://www.w3.org/2001/tag/doc/versioning) and XML Versioning (http://www.w3.org/2001/tag/doc/versioning-xml)
• Guide to versioning using the upcoming Schema 1.1 (http://www.w3.org/TR/xmlschema-guide2versioning/)
• A personal collection of links (http://www.pacificspirit.com/Authoring/Compatibility/)
As service components and clients can be deployed independently, it also regularly occurs that the topology of the services evolves. This can be through service evolution or the introduction of new layered services, aka composite services, aggregators or portals. An interesting example of a topology change is the rapid deployment of search engines. The web topology is flexible enough to account for search engines as a new service that consumes other services. If the web had allowed only web browsers to interact with web sites, then search engines wouldn’t have been able to index the site and provide their own service. Thus the original topology of human user interacting with web site was extended to service interacting with web site and human user interacting with the new service THEN the web site. The topology change is highlighted in myth #5, which states "Topology doesn't change".
Decentralized administration implies that some clients cannot be stopped from joining the network topology, so potentially malicious components may exist on the network. A service must be designed with this knowledge, and it must be secured appropriately. The appropriate security stance is a design choice and is listed as myth #4, which states "the network is secure".
We have listed a number of principles that underlie SOA. There are also design choices that are made in the context of SOA. We call these decisions “techniques” and detail a number of them.
SOA WS-* Technique: Services should make appropriate use of Web services specifications such as SOAP.
Web services specifications can enable loose coupling between applications, though they increase coupling between toolkits. An application that uses a Web service specification does not have to implement that functionality itself, such as implementing asynchronous callbacks. This means that the n applications do not each have to implement the functionality, rather they can rely upon the underlying web service stack to provide the functionality. The rough argument is to re-use Web services specifications rather than recreate their functionality. The application is tightly coupled to the Web service specification, but hopefully the web service specification will be well developed with thorough interoperability testing, saving the application developer the process of testing and debugging. Any change in a Web service specification will probably affect the implementation of a service - say the change from SOAP 1.1 to SOAP 1.2 - but these specifications typically evolve in slower cycles than the service. Thus the service can evolve faster than the specifications and implementations.
Web services can provide increased loose coupling with SOAP, because SOAP provides an extensible format and processing model that can allow an application to be decoupled from other software such as infrastructure components. A SOAP header block can be used for a security token, which means the application language does not have to change. Without the SOAP header construct, the application must provide for or be coupled to all particular extensions. Non-SOAP XML services may be loosely coupled but this only happens when they never need to have a well-defined and layered programming model such as SOAP handler-chains.
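To make the header-block point concrete, here is a small sketch that wraps the same application payload in a SOAP 1.2 envelope, once with and once without a security token header. The envelope namespace is the SOAP 1.2 one; the token element, its namespace (http://example.com/security) and the getQuote payload are made up for illustration and are not WS-Security.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
SEC_NS = "http://example.com/security"  # hypothetical namespace for the token


def wrap(body_element, header_blocks=()):
    """Build an Envelope around an application payload plus optional header blocks."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_blocks:
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(header_blocks)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(body_element)
    return envelope


def application_payload(envelope):
    """What the application sees: the first child of the Body, headers ignored."""
    return envelope.find(f"{{{SOAP_NS}}}Body")[0]


quote = ET.Element("getQuote")  # hypothetical application message
quote.text = "XYZ"
token = ET.Element(f"{{{SEC_NS}}}Token")
token.text = "opaque-token-value"

plain = wrap(quote)
secured = wrap(quote, [token])

# The application language is identical whether or not the token is present.
print(ET.tostring(application_payload(plain)) == ET.tostring(application_payload(secured)))
```

Infrastructure such as a handler chain would process the Header; the application code only ever looks at the Body.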
The interface has other factors that contribute to loose coupling, such as synchronous or asynchronous, the granularity of the messages, the evolvability of parts of the contract (such as separating the address from the interface to evolve the address differently than the interface), and the constraints on the interface. An arbitrarily open interface typically increases coupling because there is no opportunity for re-use across the interfaces. An example of a closed interface is HTTP GET. This restriction of the interface enables many software components to understand the interface and be deployed against different services without changes, i.e. they are decoupled.
SOA Asynchronous Technique: use Asynchrony.
There are very well-known trade-offs between synchronous and asynchronous messaging. In broad terms, these come to a trade-off between coupling in time versus coupling in space. In synchronous messaging, resources are dedicated in one party while the other party and/or network are processing the message. The classic is obviously request-response where the client keeps a connection open while waiting for the response. These two systems are coupled in time.
Asynchrony relaxes the coupling in time but adds a coupling in space. In asynchronous messaging, the message sender spreads out the resources devoted to the message exchange pattern over time, potentially increasing efficiency of resources. The simplest is a one-way message, where the sender puts the message into the transport and continues. Request-response is often layered onto asynchronous messaging by use of a "callback", sometimes called asynchronous request-response. A request contains an address for the receiver to send a message to.
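The callback pattern can be sketched with in-process queues standing in for a transport. Everything here is an assumption for illustration: the mailboxes registry, the address strings, and the message fields id, payload and reply_to. The request is sent one-way and carries the address to which the receiver sends its own one-way reply.

```python
import queue
import threading

# Hypothetical address -> queue registry standing in for a real transport.
mailboxes = {
    "service": queue.Queue(),
    "client-callback": queue.Queue(),
}


def send_one_way(address, message):
    """Put the message into the transport and return immediately."""
    mailboxes[address].put(message)


def service_loop(address):
    """Receive requests and answer each one at the address it carries."""
    while True:
        message = mailboxes[address].get()
        if message is None:  # shutdown signal for this sketch
            break
        reply = {"correlation_id": message["id"], "result": message["payload"].upper()}
        send_one_way(message["reply_to"], reply)


worker = threading.Thread(target=service_loop, args=("service",))
worker.start()

send_one_way("service", {"id": 1, "payload": "hello", "reply_to": "client-callback"})
# The sender is free to do other work here; no connection is held open.

reply = mailboxes["client-callback"].get(timeout=5)
send_one_way("service", None)
worker.join()
print(reply)
```

The correlation id is what lets the sender match a reply to its request once the two are no longer tied together by a single open connection.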
The trade-off between coupling in space and time is that the resources that are used for synchronous communications, such as socket connections, program and CPU threads, may be inefficiently allocated. Asynchrony allows the sender to free up these resources for other tasks. While there are resources that must be allocated for waiting for the callback, these are invariably less than the synchronous resources. In situations where the resources are not effectively used, asynchrony increases the scalability and performance of applications. Additionally, it is usually easier to add reliable message delivery to asynchronous messaging than to synchronous messaging systems. This makes it easier to avoid the first fallacy of assuming the network is reliable.
When Asynchrony
The next question is when to use Asynchronous versus Synchronous communications. As with all trade-offs, there are no hard and fast rules. Clearly communications that will take hours or days to answer are candidates for asynchrony. But how many minutes or seconds would suggest using asynchrony? Most web sites have a guideline that a web page must respond within 30 seconds. Perhaps this is a good guideline for the line between synchronous and asynchronous.
If you believe that Asynchrony is a vital component in building scalable distributed systems, then SOAP and a few other WS-* specs provide suitable infrastructure. There are 2 constraints that are emerging in the WS-* stack that enable a more asynchronous messaging world than we have with the current web. These constraints are support for Asynchrony and protocol messages as headers, aka flattened network layers. It has happened before that the architectural constraints of a new system have been described after the technology has been deployed. Perhaps in the same way that the REST thesis came after the first versions of HTTP and URIs, we will have an "Asynchronous" thesis after the first versions of SOAP, WS-Addressing and WS-ReliableMessaging are widely deployed.
SOA State Technique: State location and management
The design and location of state, particularly application and connection state, in a distributed system is another trade-off. A detailed description of the trade-offs in a Web context is at http://www.w3.org/2001/tag/doc/state
SOA Coarse Grained Technique: Coarse grained interfaces
Coarse grained interfaces are those interfaces that have comparatively few operations with larger amounts of information in each message. The goal is to have a model where each operation receives a sufficient amount of information to do a useful amount of work. This is contrasted with distributed-object style interfaces that have fine-grained interfaces where there are many operations with smaller amounts of information in each message. A canonical example of a coarse grained interface is an operation that handles Purchase Orders. The main operation is a submitPO operation with a Purchase Order parameter. Contrasting with this is a fine grained interface that would have operations for manipulating parts of the Purchase Order such as setting or getting Billing Information, ShipTo Address, Line Items, Purchaser Name, etc.
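The Purchase Order contrast can be sketched by counting network interactions. Only submitPO comes from the text; CountingChannel, the fine-grained operation names and the sample order are hypothetical, and the channel merely counts calls rather than sending anything.

```python
class CountingChannel:
    """Stand-in for a network channel that counts remote operations."""

    def __init__(self):
        self.calls = 0

    def call(self, operation, payload):
        self.calls += 1
        return "ok"


def submit_fine_grained(channel, po):
    """Distributed-object style: one remote operation per part of the order."""
    channel.call("createPO", None)
    channel.call("setBillingInformation", po["billing"])
    channel.call("setShipToAddress", po["ship_to"])
    for item in po["line_items"]:
        channel.call("addLineItem", item)
    channel.call("setPurchaserName", po["purchaser"])


def submit_coarse_grained(channel, po):
    """Coarse grained style: the whole Purchase Order in a single message."""
    channel.call("submitPO", po)


po = {
    "billing": "Accounts Payable",
    "ship_to": "1 Example Street",
    "line_items": ["widget", "gadget", "gizmo"],
    "purchaser": "A. Buyer",
}

fine, coarse = CountingChannel(), CountingChannel()
submit_fine_grained(fine, po)
submit_coarse_grained(coarse, po)
print(fine.calls, coarse.calls)
```

Each of the fine-grained calls is also a separate operation to secure, administer and keep consistent if a call in the middle fails.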
The advantages of coarse grained are many-fold. They include increased network performance and scalability by having fewer operations, increased security by needing to secure fewer operations, simpler ease of design, and simpler administration. There are potential downsides if the types of operations are themselves fine grained, i.e. control type operations, and they are inappropriately combined to achieve coarse grained interfaces.
Summary
SOA is a broad set of architecture and design principles and choices used in building distributed systems. The very fundamental part of building distributed systems is requiring that there is a described interface, or contract, between components and this contract is one step towards loose coupling. There are a variety of interface technology selections that provide further loose coupling, such as XML, WSDL and SOAP and other Web services specifications. The implementation of software, and the extent to which its internal contracts affect the published interface, has significant impacts on coupling, arguably as much as the interface technologies.
Acknowledgements
Thanks to Ed Cobb, Jeff Davies, Zulah Eckert, Yaron Goland, Gilbert Pilz, Michael Rowley, and Ken Tam for their reviews.
It's rare that I disagree with Nick Carr but I think he and a few others are making up a controversy that doesn't exist. Nick wrote up the web services schism.
The thesis that he and a few others are using is roughly that SOA = corporate dev/bus types and Web 2.0 = hacker barbarians. Now this is just plain fooey. What's really happened is that they are all using SOA. There's no real definition of SOA, but really it's just about building successful distributed apps, which the Web 2.0 folks are definitely doing! The quote SOA unquote folks just like to build database apps and are marketing it like crazy.
It's not "SOA vs Web 2.0", the question really ought to be "why aren't more corporate environments using web 2.0 technologies". To which the obvious answer is "It's the apps silly".
The reason why a lot of the corporate apps aren't very google mappy or flickr apiy or bloggy or wikiey is frankly the corps have *LOTS* of stuff in enterprise apps and databases. Some trading exchange company that's doing data enrichment of trades with risk information etc. just doesn't need maps/flickr/blog/wiki.
Now why isn't the trading data enrichment app built using Ajax, etc.? Well, some of them are, but mostly it's just way smarter to move the processing closer to the data, i.e. the server, i.e. the "corporate" environment. When you can offer 2x perf over a competitor by doing the data integration on your server and enforce SLAs on your partners, it's a no-brainer.
SOA vs Web 2.0 debate is smoke no fire.
For many years, SOAP has promised that it is a "one-way" protocol. This statement somehow magically seems to want to differentiate SOAP from HTTP's request-response. To date, we haven't fully realized this vision of one-way messaging. We've had WSDL one-way messages, but then we fall off the map when it comes to SOAP because we don't actually have a fully standardized SOAP one-way binding. Interestingly, lots of people use the WS-I Basic Profile which allows an empty HTTP Response, but let's talk in the context of the W3C and the WS-Addressing, SOAP, and WSDL standards.
To do one-way, we need both SOAP 1.1 and SOAP 1.2. SOAP 1.1 should be fairly straightforward - allow a 202 with an empty http response body. I've written up such a beast that is straight up one-way and a more popular variant that allows for a response.
SOAP 1.2 is a much more complex case because SOAP 1.2 introduced the notion of SOAP message exchange patterns. Now ironically and frustratingly, the "one-way" SOAP 1.2 never defined a one-way MEP. We are now *finally* doing that work in XMLP and WS-A, mostly because a number of us got really agitated in WS-Addressing land and said "one-way" is in scope for WS-A, so do it somewhere. The SOAP 1.2 MEP notation style specifies a detailed state machine for each MEP, particularly the request-response.
This has led us into some interesting issues: What are the various flavours of One-way particularly Fire and Forget, and how many MEPs do we need? In fact, what does F-A-F really mean? Does F-A-F mean that the sender can close the connection as soon as it has placed the message on the connection? Does it mean that the receiver can close the connection as soon as it has received the message? That is, whose "forget" is it? And how does this all relate to HTTP? How does all this get described in WSDL?
There seem to be two camps emerging, what I call the "strongly-typed" mep and the "weakly-typed" mep. The strongly typed camp says that each mep ideally has no optionality. Roughly, for each combination of messages introduce a new MEP. In addition to the request-response MEP, introduce a F-A-F MEP, introduce a request-optional-protocol response mep, etc. Then each binding is constrained by what it can do. I wrote up a one-way MEP with a binding in December 2004.
One thing that pops out of this analysis is that an HTTP binding cannot implement a true client F-A-F. HTTP is a request-response protocol as the client MUST wait for a response. Therefore a F-A-F MEP couldn't be implemented using HTTP. HTTP could implement any kind of MEP that allowed or required a protocol response, such as a request-response, request-optional-soap-response, or request-nosoap-response-but-protocol-response.... From the binding perspective, it's easy to write bindings for the MEPs because there is no variability in the aspects of the message cardinality.
This gets to the core of my disagreement with the "strongly-typed" meps. I proposed what I call a protocol mep, aka weakly typed mep, which is basically coming up with a single mep that covers all variants. Incidentally, I propose removing all the state machine gorp from the mep. The problem that I see is that I think looking from the application down is very important. When a WSDL author decides they want a "one-way" message, I don't think they want to have to look to each binding to see which flavour of MEP it supports. If I define a one-way message, why do I want to peek into each binding to see which MEPs it supports? If I offer my one-way message over UDP and over HTTP, why do I want 1 version that is f-a-f, and another that is request-optional-protocol response?
WSDL has cleanly separated the "abstract" operation out from the binding. There is a "SOAP binding" section in both wsdl 1.1 and wsdl 2.0. WSDL 2.0 allows me the choice of which MEP is associated with an operation. With strongly typed MEPs, I have to do multiple SOAP bindings per different MEP.
The core of this problem is that SOAP meps tried to abstract things from underlying protocols - that's a big reason why SOAP is a protocol and not a format BTW, the addition of the MEPs is a key part of this. My opinion is that this abstraction hasn't provided any utility. Using the previous example, we have a SOAP binding that is supposed to be protocol independent, but now I'd need a SOAP binding for each type of underlying protocol. Yuck!
Effectively, I think that we should get rid of the SOAP MEPs and the way to do that is to come up with one MEP that provides an abstraction of all underlying protocols for giving us properties for the request, response, results, destination, etc. I still think some kind of abstraction is necessary to insulate the WSDL from the particular binding. With a layer of abstraction, you can write a binding (like my sample SOAP 1.1 HTTP binding using protocol MEP) independent of WSDL, and you can write WSDL and extensions independent of any particular binding (like my proposal for WS-Addressing usingAddressing extension and WSDL 2.0's binding WSDL 2.0 MEPs to SOAP 1.2 MEPs).
So I like having some kind of abstraction above the protocol, which is why I favour some kind of MEP abstraction. The key question is, given a "weakly typed" mep, how does a sender and a receiver know when they can close the connection? In the strongly typed mep, it is specified in the MEP. In the weakly typed mep, I believe that this is controlled by a combination of the application and the binding. Imagine that a receiver is using HTTP, SOAP and WS-Addressing. The receiver is going to have to look at the message to see what all the *To properties are. If there's a ReplyTo and FaultTo that is non-anonymous, the receiver will know that it can close the connection before the application creates a response.
This is the crux of my issue with the question about receivers "knowing": it asks the question the wrong way around. The data needed for a receiver to "know" can be in the message. Therefore trying to say the receiver will "know" ahead of time based upon the MEP is just plain wrong. It leads down bizarre paths like "If the ReplyTo is anonymous then the soap mep is request response else if the replyTo is non-anonymous then the soap mep is f-a-f if the protocol is not http else the mep is request-optional-response if the protocol is http. If the mep is f-a-f then close the connection, etc. ". The whole point of doing things "strongly" or "statically" is that there's something known up front, but with data driven protocol interactions supported by WS-A et al, this is just plain broken. Instead of the previous, I propose "The mep is request-optional-response. If protocol is HTTP and replyTo is anonymous, wait for response from application. Else don't wait.".
This makes the WSDL author's life easier, and it correctly places the burden on the receiver to look at the message for how to know when to close the connection, without some arbitrary and complex strongly typed mep in the middle.
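The proposed rule ("If protocol is HTTP and replyTo is anonymous, wait for response from application. Else don't wait.") is small enough to sketch directly. The function name and the way the protocol is passed in are assumptions for illustration, and the anonymous address used is the one from the WS-Addressing 1.0 Recommendation, which may differ from the draft namespaces in use when this was written. A message with no ReplyTo at all is not handled here.

```python
# Anonymous address URI as defined in WS-Addressing 1.0 (an assumption about
# which version applies; earlier drafts used other namespaces).
ANONYMOUS = "http://www.w3.org/2005/08/addressing/anonymous"


def receiver_should_wait(protocol, reply_to):
    """Decide from the message data, not from a statically chosen MEP.

    Returns True when the receiver should keep the connection open until the
    application produces a response, and False when it may close right away.
    """
    return protocol.lower() == "http" and reply_to == ANONYMOUS


print(receiver_should_wait("HTTP", ANONYMOUS))
print(receiver_should_wait("HTTP", "http://example.com/callback"))
print(receiver_should_wait("UDP", ANONYMOUS))
```

Compare this with the nested "if anonymous then ... else if not http ..." chain quoted above: the data-driven version has one condition and no MEP selection step.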
I wonder how much technical innovation is simply change so that the developers don't have to talk with the admin/security type folks. There seems to be this regular tension between the devs changing the playing field and the admin folks catching up. I've lately been working on the TAG on the EPR issue and related discussions around State. One of the themes that has come up is the "niceness" of putting identifying information in http cookies and into EPR Reference parameters, rather than in the HTTP URI. This discussion harkens back to the discussions about SOAP tunneling through HTTP POST.
The Dev vs Admin cycle
I think there's a cycle that goes roughly:
1. Developers want easier deployment of applications than they currently have
2. Some solution emerges that vendors/devs/etc. tout as being simpler
3. Technology gains partial adoption in large part because "it's the new thing" but it's not "enterprise ready"
4. Vendors develop and sell technology to admin/secure "the new thing"
5. It gets more complicated to deploy "the new thing"
6. Repeat
Cookies
One of the reasons for using cookies is that it's so easy for a developer to stuff some value, like a session id, into a cookie associated with a web page. In comparison, it's harder to rewrite all the URLs in the web page. It's certainly possible to do URL rewriting, and lots of sites do that. But the alternative is so much easier from a tool and governance pov. There's no need for the developer to ask the admin for a portion of the URI to be available for any of the cookie information, like customer ids or bank accounts or …
SOAP
SOAP over HTTP is another great example. By stuffing SOAP requests through port 80, the envelope has mostly been invisible to the firewall folks. They set up "myService:80", messages flow, and subsequent revisions can be deployed without bringing in the URL/port police. 90% of the time, if you go to somebody and say "Can you deploy this service or callback to port 8080 or add a new URL" you will get an "Ack! No, the firewall folks won't let me". So SOAP over port 80 conveniently gets inside the firewall.
After marketing to the devs how much easier it is to build SOAP apps, we can now build SOAP security related products. WS-Security and related specifications provide a wealth of security related technology for SOAP. Ironically, we're now almost back to the point where a network administrator can secure a discrete application for particular users. Tongue firmly in cheek, I guess it's somehow magically easier to say "secure the StockQuote Service defined by this WSDL Endpoint QName" than "secure port 8080 which is running the StockQuote service". And people are saying that it's too hard to deploy WS-* apps because there's all this "extra stuff" - which brings us fully back to step 1.
Ajax
Ajax is being deployed and one of the reasons is it allows the developer to tunnel an application through an existing "simple" web/html application. Because the client-side firewall isn't easy to open up, Ajax allows full programs to be downloaded and run on client machines. Instead of saying "allow the google maps App to run" or "open port 8081 for the IANA registered Google App", we say "go to google maps uri". Tim Berners-Lee has wisely been talking about the problem that this is completely mixing all sorts of content models together, and soon admin types are going to start questioning even allowing so-called Vanilla HTML through firewalls. Starting from "pop-up blockers", we're sure to see "Ajax blockers" sometime soon.
WS-Addressing EPRs
WS-Addressing EPRs are another example of tunneling information, as the EPR minter that's creating an EPR with a Ref Parameter doesn't have to ask the firewall admin for access to the SOAP header block space.
I boldly predict a new security product, the "EPR Security product" which makes sure that an EPR minter isn't putting "illegal" Ref Params into the EPR. All Ref Params will have to be vetted by an administrator. And there will be a client-side product too, that checks to make sure that the EPR Ref Params that are going to be echoed aren't too long, don't conflict with existing soap headers, etc.
Binary XML
One of the major pushbacks against any kind of "binary XML" has been the tunneling aspect. The argument is that if we allowed binary xml, then some big vendor could stuff a bunch of proprietary and non-readable protocol or data elements in a set of XML documents and call it "a standards based xml language". Because XML is textual, it is viewable and more prone to public commentary. Though I note with considerable dismay the efforts that Apple has gone to with ITMS to hide their catalog and data store via encryption, etc., even though they use XML.
It will be interesting to see whether the W3C's binary xml - sorry Efficient XML Interchange - will lead to this kind of tunneling, and what further tunneling is possible.
URIs and XRIs
The folks on the XRI committee have created a format that looks like URIs but allows for "location independent identifiers". I wrote about why http: uris are better than urns and id: uris. As part of the TAG, we pushed back on this technology pretty strongly with little response. The XRI solution is yet another example of where the developer wants to create something without having the admin person be able to approve/disapprove of their choice. Considering that http: uris can do all the things that xris can do IF the http admin allows for persistent identifiers, redirects, etc., it seems the only advantage to xri: is that they allow the developer to create the identifiers without talking to the admin. The XRI folks don't really talk about who's going to approve or populate the mapping of xri: identifiers to locations, as this is at the "wave the magic wand and don't see that admin over there" stage (stage #2).
Types of Tunneling
It's interesting to look at the different types of tunneling. Ajax is about adding more and more functionality into an existing app - the browser - to the point that apps can be run inside the app. We can call this "application tunneling". The container format is HTML in this case. SOAP is about tunneling using a container format, but it is primarily about allowing dispatch to a "hidden" application. This seems to be at a lower level, perhaps we can call this "protocol tunneling". Tunneling a protocol through XML could be considered "format tunneling". And finally, EPRs are about tunneling data that is often location or identifying information. Depending upon the Ref Param type, this could be "address" or "data" tunneling.
Next cycle
After the tunneling-through-port-80 security hole has been closed by WS-Security products, and WS-A EPRs have been closed off by the "EPR Security product", I wonder what's next. Let's see, we need to tunnel through some existing service's "hole". Looking for the same qualities that made HTTP tunnelable (tm), where's a widely deployed app that's nicely extensible and admins are allowing in and out of firewalls and has a lot of buzz so the admins couldn't conceivably lock it out? Let me think… I know, how about blogging?
Blogging, perhaps via standardization of Atom or RSS, has a good chance of being the next app to be tunneled through. It's got all the same things that HTTP had: widely deployed, extensible, lots of hype. No need to register with the security police, just put up a new blog feed. I'm off to start my "blog security" company.
I have a lot of sympathy for Rasmus' comment about the Yahoo search service, where he says
"But don't even try to read the SOAP spec. If you managed to fight your way through that spec already, try the new WSDL 2.0 Draft Spec. This is the sort of stuff that makes my brain hurt"
As one of the advocates of the view that spec readability and spec completeness are not orthogonal, I am disappointed in WSDL 2.0's lack of readability. I continue to believe that specs have to be "marketed" to communities and part of that is making them readable/usable without the need for primers.
But he should take a look at what he can do with WSDL 2.0. Interestingly, I specifically used WSDL 2.0 to describe Yahoo's REST api. I think it's darn slick.
I've come to a conclusion that the Web services community needs a simpler and yet more powerful way of describing bindings and message exchange patterns. I've been working for a while now on asynchronous Web services, and it turns out to be quite difficult to describe What's Going On.
As part of the Async Task force at W3C, I've written an async scenarios document [1] and 2 meps and bindings using the binding framework [2], [3]. The current SOAP 1.2 Convention for describing Features and Bindings [4] does not seem to add significant value, rather it makes the process significantly more difficult.
My call is for simplified binding and MEP conventions and a new MEP and binding [5]. This could be standalone or could be an erratum to SOAP 1.2. I think it could be a SOAP 1.2 erratum because it doesn't change any of the bytes on the wire and I don't think that there are any bindings that would be broken by changing the conventions.
For context, the binding framework was created by the SOAP 1.2 working group to describe protocol aspects of SOAP, particularly a clear description of the on-the-wire expectations for binding to protocols like HTTP. The Message Exchange Patterns introduced are request-response and soap-response, and then these are specified for HTTP GET and HTTP POST. The MEPs contain a state machine for senders and receivers as well as specific properties (like request message, response message, status). The state machine and properties are used and augmented by the binding to fully specify the behaviour for a given protocol, such as HTTP.
It is roughly a goal that there will be "fewer" bindings rather than more, that is there won't be a large number of bindings for HTTP.
Two points of note: 1) with all irony, SOAP is about single one-way messages but it doesn't provide a binding for one-way messaging over HTTP. 2) There is partial support for HTTP, which is accomplished by the "soap-response" MEP and use of HTTP GET.
Some of the advantages that the binding framework ought to offer, but doesn't:
1. Re-use of the MEP state machines
The state machines defined in the MEP ought to be re-usable in different bindings. However, the reality is that there have been very few bindings formally written, and I have not seen one that actually re-uses the state machine. Further, the state machines as designed were not re-usable. To describe HTTP GET requests returning a SOAP body, the soap-response MEP was created. SOAP had a request-response MEP and HTTP POST described, but the solution for HTTP GET didn't re-use the request-response MEP. This obviously affected the binding. This soap-response MEP has not been re-used ever as far as I know, and introduced further complexity into the SOAP HTTP Binding because it requires that the HTTP binding deal with 2 different MEPs.
2. Re-use of bindings
Adding a one-way MEP without changing the HTTP Binding is not possible, in a strict interpretation of the wording of the binding. Therefore the SOAP HTTP Binding re-use was never achieved.
3. Re-use and extensibility of threading model
The state machines are roughly written as a single threaded process. From the receiver perspective, there are roughly 2 separate threads that are important: receiving the request and sending the response. The MEP design combines these states into states for each combination, ie receiving or sending/receiving. Then the binding has to combine these states together. There is no benefit to separating then combining these states together.
The matter is further complicated when WS-Addressing ReplyTo or FaultTo are introduced. It may very well be possible for a receiver to have a new thread, which is the sending to the ReplyTo or FaultTo address, even while it is still in the receiving or sending/receiving state with the initial sender. Using a SOAP request-response MEP over 2 HTTP connections actually may involve 3 separate "threads". This is even more difficult to describe in the MEP given the parallel nature of the threads. For example, at what point in the receiving state machine is the receiver allowed to open a new connection and start sending a message? In reality, it could happen during the receiving, the sending/receiving, or after the terminate state.
Attempting to describe such an asynchronous callback is thus quite difficult in the binding framework. We have agreement on how the messages appear on the wire and how they are described in WSDL, so it is the binding framework that "gets in the way".
4. Implementation Reality
I believe that, in reality, implementations have implemented a request-optional-response MEP. There is no proof that any software will actually fault if a response is not received. The tightly specified nature of the request-response MEP has not been implemented or proven useful.
5. Web architecture
The SOAP binding conventions made it difficult to add support for HTTP GET. The SOAP-response MEP solution has not been adopted to any serious degree. Further, adding support for other HTTP methods, like HTTP PUT, is not possible using the current MEP and binding. A solution that does not constrain the MEP to exactly which message directions are required and which are SOAP messages is more extensible.
WS-Addressing ReplyTo
WS-Addressing ReplyTo provides an illustration of the problem. We know what the WSDL is and what the HTTP messages look like. Does a WSDL request-response which has a ReplyTo value (and so uses 2 HTTP connections) map to: a) one SOAP request-response MEP and then a new binding that supports 2 HTTP connections OR b) two SOAP one-way MEPs and 2 instances of a new one-way HTTP Binding, one for each connection? To a certain extent, the question is whether the "soap" mep layer is closer to the WSDL (1 SOAP request-response) or closer to the protocol (2 soap meps for 2 protocol connections). There are pros and cons to both approaches, but remember: we know what the WSDL and the HTTP messages look like so we're just shuffling bits in the abstract specifications.
This dissertation then begs the question: what is useful in a binding framework?
Properties
The properties that contain the request, response, status code, and availability of response are useful for bindings to use. Otherwise, each binding would have its own property for these. When the WSDL request-response is mapped to "something", it shouldn't map to a specific protocol, like "HTTP request" and "HTTP Response". Defining and re-using these properties is a useful layer between WSDL and the protocol. Describing these properties without coupling them to SOAP also makes it possible to mix SOAP messages with protocol messages, such as HTTP GET, in a simple manner.
Simple state machine
A simple state machine that says that a request is followed by a response, with an indication of when the response is available for processing by the sender or receiver provides sufficient utility to create bindings. This would have enabled a "request-response" MEP to be bound to HTTP GET as well.
This simplified approach to describing bindings, as well as a simpler MEP and Binding, is described in the Simplified MEP and Binding proposal [5].
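To make the "properties plus simple state machine" idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the class name `Exchange`, the state names, and the method names are hypothetical and do not come from the proposal [5] or from SOAP 1.2. The point is that the request need not be a SOAP message, so the same machine covers HTTP GET.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    # Hypothetical state names, chosen for this sketch only.
    INIT = auto()
    REQUEST_SENT = auto()
    RESPONSE_AVAILABLE = auto()


@dataclass
class Exchange:
    """Protocol-neutral properties shared by sender and receiver."""
    request: Optional[str] = None   # may be a SOAP envelope or a plain HTTP GET
    response: Optional[str] = None
    status: Optional[int] = None
    state: State = State.INIT

    def send_request(self, message: str) -> None:
        if self.state is not State.INIT:
            raise RuntimeError("a request has already been sent")
        self.request = message
        self.state = State.REQUEST_SENT

    def receive_response(self, status: int, body: Optional[str] = None) -> None:
        # A response is only legal after a request: request, then response.
        if self.state is not State.REQUEST_SENT:
            raise RuntimeError("no outstanding request")
        self.status = status
        self.response = body
        self.state = State.RESPONSE_AVAILABLE

    @property
    def response_available(self) -> bool:
        return self.state is State.RESPONSE_AVAILABLE
```

A binding would fill in how `send_request` and `receive_response` map onto a protocol; nothing in the sketch requires the request itself to be SOAP.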
[1] http://www.pacificspirit.com/Authoring/async/async-scenarios.html
Jeffrey Schlimmer gives some thoughts and a mnemonic on MEPs in MEP: Metadata Explains (Message) Properties. I have 2 problems with what Jeff describes.
Firstly, I agree with him that an MEP is independent of underlying protocols - he describes this as a transport but HTTP isn't always a transport protocol. But he misses the 4th case of the simple/complex mep and simple/complex underlying protocol: in the case of complex protocol/complex mep, a rich protocol is typically dumbed down to provide the minimum one-way mep. I wrote about this in depth in Underlying Protocol is a completely leaky abstraction. For example, some folks really want to design applications so that a WSDL one-way operation deployed over HTTP can't ever report an error over the HTTP response, despite the fact that WSDL robust-in-only and SOAP itself are designed for this.
Secondly, he misses a crucial point about intermediaries and MEPs. How does a protocol intermediary know whether a request message is part of a one-way or request-response mep? How long should an HTTP intermediary keep the connection open waiting for a response? If it's request-response over HTTP, then it has to keep it open. If it's one-way over HTTP, then it can close it after sending the message onwards. If you say "look at the ReplyTo" or "look at the Action" then you mean an HTTP intermediary has to be WS-Addressing aware to perform properly. But we actually know that WS-A ReplyTo doesn't fully describe the MEP. And we probably don't want every HTTP intermediary to have the mep metadata for each distinct Action/URI combination for a domain.
Then there are composability problems. WS-ReliableMessaging allows an acksTo=anonymous, which means use the HTTP connection. This can be combined with a one-way MEP. So the HTTP intermediary has to know about ws-a AND ws-reliablemessaging AND the next spec.
Now it can be argued that a ws-rm "reply" is different than an app level "response", but the http intermediary still would like to know whether something is: a) never coming back; b) possibly coming back; c) always coming back.
As it stands, intermediaries such as load balancers and dispatchers can have a tough time without messages being self-describing wrt meps and the http response.
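As a rough sketch of what "self-describing" could mean for an intermediary, the Python below reads a single hint and decides whether to hold the HTTP connection open. The header name `X-Response-Expectation` and its three values are purely hypothetical; no spec defines them. They simply encode the never / possibly / always cases above.

```python
from enum import Enum


class ResponseExpectation(Enum):
    NEVER = "never"        # a) never coming back
    POSSIBLY = "possibly"  # b) possibly coming back
    ALWAYS = "always"      # c) always coming back


# Hypothetical header name, invented for this illustration.
HINT_HEADER = "X-Response-Expectation"


def should_hold_connection(headers: dict) -> bool:
    """Decide whether an HTTP intermediary keeps the connection open.

    With no hint, assume a response is always coming, which is the
    conservative choice for request-response over HTTP.
    """
    hint = headers.get(HINT_HEADER, ResponseExpectation.ALWAYS.value)
    expectation = ResponseExpectation(hint.lower())
    return expectation is not ResponseExpectation.NEVER
```

With something like this the intermediary would not need to understand WS-Addressing, WS-ReliableMessaging, or the next spec to behave properly.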
Norm's been working on describing a where in the world application using WSDL 1.1 and SOAP. He says WSDL sucks, but I think what he really meant to say is that WSDL 1.1 sucks. Lots of us have spent time on WSDL 2.0 to try to clean up lots of the things he found problems with. A while ago I sent him a WSDL 2.0 version of WITW, but he's gotten busy so I thought I'd post the WITW WSDL 2.0 and is-request.xsd.
One of the biggies is improving the HTTP/REST binding for WSDL. WSDL 2.0 can describe a variety of REST services. In fact, I've provided a list of WSDL 2.0 REST descriptions that includes Atom 0.3, Yahoo Search, Music search, WSDL 2.0 Primer Travel Reservations (GET Reservation, PUT Reservation, GET ReservationList), and this is the WITW WSDL 2.0 version. That people keep saying that WSDL can't describe REST seems grossly unfair to me.
I also wish that people that bash WSDL 1.1 would take a look at WSDL 2.0. I am completely sympathetic to the arcane-ness of the wsdl 2.0 spec as I'm one of the people that lost the vote where WSD WG decided that the specs were for toolkit authors not wsdl document authors. But at least take a look at the primer.
If there's a core concept to examine in REST description languages wrt WSDL, it's whether WSDL's aggregation of operations into interfaces and then binding to http makes sense. Web services tends to have lots of operations and few URIs, whereas REST is obviously the opposite. And so an "interface" construct for operations + binding seems to me to be the parts of WSDL that are most problematic for REST and this would be the core of any simplification over WSDL 2.0. OTOH, it's quite clear that WSDL 2.0 can take interfaces for multiple bindings and deploy to SOAP and non-SOAP based services.
I'm finally back in the saddle working on extensibility and versioning. Looking back, I think that we've done a decent job over the past few years of raising the awareness of "must Ignore unknown" and "must understand" models. In parallel, we're also getting better understanding of how to describe interfaces to applications.
It's time to take the next step in looking at understanding in applications. A good case study is Atom. The Atom working group decided against adding a "mustUnderstand" marker to the Atom language. I think the main reason is that there are various types of Atom applications, and it was too hard to figure out how to "target" the must understand to the right type of application. For example, if an entry has an extension marked mU, does a feed aggregator have to fail if it doesn't understand it? A feed aggregator partially understands entries as it looks at some of the content (particularly the author child), but it doesn't understand all of it. The Atom group also wisely decided it didn't want to formally define processor classes, ie. "aggregator", "entry handler", etc. FWIW, SOAP provides some hooks for this through the "role" attribute, which can be used to target headers. But very few applications seem to use soap header blocks for application data extensions, let alone the role attribute for targeting particular nodes.
Many applications deal with partial processing as the document is transferred from one piece of software to another. I regularly use the "Name" example, so imagine that the Name exists in a Medical Record. There may be many pieces of software that operate on the medical record, and even many different pieces that work on the same subset. There might be the "patient info validation" component that uses the Name, and then there is the "Patient info display" that uses the Name, and then "patient query" that also uses the Name. Each of these "Name" processors could potentially be targeted for extensions, in the same way that it's difficult to target different Atom Entry processors.
What we have is a couple of cases showing the difficulty in partial understanding of xml. This difficulty is by no means limited to Atom. It pervades XML applications, and I think is one of our next big problems to address for achieving distributed extensibility and versioning. Which brings us to the usual questions: what's the problem, is it worth solving, how can it be solved, what's the best way for it to be solved. My intuition is that Schema 1.1 and the PSVI could be part of the solution because I think we'll need reporting on partial validation in order to get partial understanding.
Problem
A strawman problem statement is "How can a language be designed and evolved in the context of a variety of processor types". This problem statement leads us square to the hard parts of the problem, which is how are processor types identified and how is the language subset identified.
Identifying Processor Types
XML processors come in many sizes, shapes and colo(u)rs, ranging from editors to parsers to full blown b2b applications. Trying to figure out at language design time what all the different processors for a language are seems very hard, probably unnecessary, and probably harmful. Imagine that an application decides that there are n different processor types. What happens when innovation of type n+1 comes along? In the case of Atom, what if the original RSS community had decided that there were only "entry processors" and "text viewing processors". They might have accidentally precluded feed aggregators.
I have a feeling that there are 2 extremes for identifying classes of processors. At one extreme is that the classes of processors are identified in the language. Interestingly, XML itself does this by specifying "well-formed" vs "validation" processors. A little more to the middle is where the class of processor can be identified using a token (like soap:role) but the meaning of the token is undefined in the language. The other extreme is that the class of processor isn't identifiable as an entity at all but rather by the QNames it understands.
The targeting could be done based upon the language itself. Something like "If you understand entry/content QNames, then you must understand entry/DaveOsExtension QNames". This might be usable for the Atom folks to allow text editors and feed aggregators to ignore the DaveOsExtension.
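Here is a small Python sketch of that kind of rule. It is hypothetical throughout: the `RULES` table, the function name, and the use of simple path strings in place of real QNames are my own simplifications, not anything Atom defines.

```python
# Rule: "if you understand <key>, you must understand every name in <value>".
# Path strings stand in for QNames in this sketch.
RULES = {
    "entry/content": {"entry/DaveOsExtension"},
}


def unmet_requirements(understood: set, present: set) -> set:
    """Return extensions that are present, mandatory for this processor,
    and not understood by it. An empty set means processing can continue."""
    required = set()
    for trigger, extensions in RULES.items():
        if trigger in understood:
            required |= extensions
    return (required & present) - understood


# A feed aggregator that only looks at the author may ignore the extension;
# an entry handler that understands entry/content may not.
document = {"entry/author", "entry/content", "entry/DaveOsExtension"}
aggregator = {"entry/author"}
entry_handler = {"entry/author", "entry/content"}
```

The processor class is never named; it is identified only by the names it understands, which is the far extreme described above.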
Language subsets
The scenario of expressing understanding based upon QNames is a simple solution to how to express the language subset that the extension is related to. In the sample scenario, the entry/content QName suffices for a general class of processors. But what if that QName isn't sufficient, say it's entry/content where content has attribute foo or bar but not baz AND something else?
The two problems - identifying the processor type and identifying the language subset for a processor type - seem intricately coupled. The processor type is probably defined by the language subset it operates on, and the language subset is determined by what the processor is doing.
A further complication is that an extension could be mandatory for multiple processor types, each with a separate subset of the language that they operate on. There's a lot of potential yuckiness in either repeating the extension for each processor type/language subset or coming up with a framework for targeting multiple types/subsets for an extension.
Validation and Understanding
Admittedly this is a potentially hard problem, but what building blocks do we have? Let's assume that we are using a schema validator at run-time - I know this is a *big* assumption, but bear with me. Many people have lambasted the PSVI but it provides some *very* interesting pieces of information. In particular, it has the validation attempted on a given XML component, and the results of that validation. So you can find out whether a component was validated successfully or not.
I first encountered this when I proposed that WSDL 2.0 could use this feature of Schema to enable relaxing of the non-determinism constraint. The idea is WSDL 2.0 could use schema in a way that removed any extra xml components that failed schema validation because they had violated the non-determinism content rules, particularly to allow wsdl 2.0 to incorporate the "Must Ignore Unknowns" rule. (W3C Member only link)
If we did want to make any kind of expression of understanding, we could start with validation. We could specify that validating Entry/Content elements is the same as understanding Entry/Content elements.
There's obviously a lot of area to explore as to how to use the PSVI. How does one refer to the PSVI validation attempted/results information items? This could be as simple as using XPath referring to the additions (Entry/Content[@validationAttempted='true' and @validationSucceeded='true']), or as complicated as a new set of Schema-specific XPath functions. I'd think that the PSVI should take into account this usage pattern.
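A minimal sketch of the "simple XPath" end of that range, in Python. It assumes, purely for illustration, that a validator has already written its results onto the elements as two attributes named validationAttempted and validationSucceeded; real PSVI contributions are infoset properties, not attributes, so this is a stand-in. The standard library's ElementTree supports only a limited XPath subset, so the two conditions are written as chained predicates.

```python
import xml.etree.ElementTree as ET

# Hypothetical post-validation document: the two attributes are stand-ins
# for PSVI information, added here by hand.
entry = ET.fromstring(
    "<Entry>"
    "<Content validationAttempted='true' validationSucceeded='true'>ok</Content>"
    "<Content validationAttempted='true' validationSucceeded='false'>bad</Content>"
    "<Extension validationAttempted='false'>unknown</Extension>"
    "</Entry>"
)

# "Understood" is taken to mean: validation was attempted and it succeeded.
understood = entry.findall(
    "Content[@validationAttempted='true'][@validationSucceeded='true']"
)
understood_text = [element.text for element in understood]
```

Only the first Content element is selected; the invalid one and the unvalidated extension are left for some other processor, or ignored.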
Partial Understanding = Partial Schema?
To associate understanding with validation, it probably also means that implementations need Schema subsets. The Atom specification provides a complete Schema for the Entry. But a feed aggregator only wants to parse a subset of the Entry. It needs a partial schema for doing the validation.
Right now, we generally build up the Schema of a message to be the composite of all the processors that work on the message. This monolithic view of an interface is both helpful - as it's arguably simpler than expressing multiple schemas/message - and harmful - because it doesn't handle the nuanced kinds of processing that we are talking about.
If we could provide multiple schemas (gee, kind of like XHTML does) for constructs, then we could also provide a processing pipeline where each component gradually adds in the validation information as it is needed.
The breakup and consolidation continues
I've argued for a long time now that XML Namespaces will result in languages that are smaller and smaller because it's easier to version smaller languages than larger languages. The farthest degree of this is where each element or attribute has its own namespace. We're clearly not there, but some of the WS-* specs are getting pretty close. As the namespaces get smaller, it's obvious that the schemas will get smaller.
And as the namespaces get "stitched" together into composite languages, we need support from Schemas to be stitched together.
Conclusion
I started talking about the difficulties of "must understand" in applications that have partial understanding. This led to thinking about how to identify a subset of a language and how to identify a processor for that subset. We thought about tying the validation logic to the understanding logic, which further led to the need to express the schema for the processor rather than the composite.
I think that expressing partial understanding will cause a need for multiple schemas per name, stronger PSVI support for validation using a particular schema, and targeting language extensions based upon the partial validation results.
There are well-known trade-offs between synchronous and asynchronous messaging. The trade-offs roughly mean a choice between coupling in time versus coupling in space. In synchronous messaging, resources are dedicated in one party while the other party and/or network are processing the message. The classic is obviously request-response where the client keeps a connection open, and probably a thread for the connection, while waiting for the response. These two systems are coupled in time.
Asynchrony relaxes the coupling in time but adds a coupling in space. In asynchronous messaging, the message sender devotes fewer resources to the message exchange pattern. The simplest is a one-way message, where the sender puts the message into the transport and continues. Request-response is often layered onto asynchronous messaging by use of a "callback", sometimes called asynchronous request-response. A request contains an address for the receiver to send a message to.
There is further coupling in space because the state of the system is spread across two machines. To re-use the resources on the sender, the "state" of the sender must somehow be passivated, so that it can be reified when the response comes back. The correct instance of state on the server, often called correlation, must also be identifiable from the response message. The protocol between the two systems must necessarily be more complicated for asynchrony, because it requires the correlation between messages and/or instances to be in the messages. FWIW, these correlations are why WS-Addressing has MessageIds, RelatesTo and Reference Parameters.
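The passivate/reify step can be sketched in a few lines of Python. This is an illustration only: the class and method names are hypothetical, and an in-memory dictionary stands in for whatever durable store a real sender would use. The key idea is that the MessageID sent with the request comes back as RelatesTo on the callback.

```python
import uuid


class PendingRequests:
    """Holds passivated sender state, keyed by the request's MessageID."""

    def __init__(self):
        self._store = {}

    def passivate(self, state) -> str:
        # Generate a MessageID for the outgoing request and park the state
        # so the thread and connection can be released.
        message_id = f"urn:uuid:{uuid.uuid4()}"
        self._store[message_id] = state
        return message_id

    def reify(self, relates_to: str):
        # The callback's RelatesTo value identifies which state to restore.
        # Raises KeyError for an unknown or already-consumed correlation.
        return self._store.pop(relates_to)
```

Between `passivate` and `reify` the sender holds no socket and no thread for the exchange, only the stored state.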
The trade-off between coupling in space and time is that the resources that are used for synchronous communications, such as socket connections and program and cpu threads, may be inefficiently allocated. Asynch allows the sender to free up these resources for other tasks. While there are resources that must be allocated for waiting for the callback (such as waiting for an inbound connection) and there is an increase in the complexity of the protocol (adding in correlation) and sender machinery (passivating and reifying state), these are often less than the synchronous resources. In situations where the resources are not effectively used, asynchrony increases the scalability and performance of applications.
When Asynch
The next question is when to use Asynchronous versus Synchronous communications. As with all trade-offs, there are no hard and fast rules. Clearly communications that will take hours or days to answer are strong candidates for asynchrony. But how many minutes or seconds would suggest using asynch? Most web sites have a guideline that a web page must respond within 30 seconds. Perhaps this is a good guideline for the line between synchronous and asynchronous.
Callback specs
If you believe that Asynchrony is a vital component in building scalable distributable systems and you are using Web services, then SOAP and a few other WS-* specs provide the prerequisite infrastructure. WS-Addressing is probably the core specification for asynchrony, and it's likely to be very widely deployed. BEA pioneered and shipped a couple of predecessors in SOAP-Conversation and WS-Callback, and WS-I specified a basic Callback scenario. Additional specifications that layer on top are WS-Eventing (PDF) and WS-Notification. I don't want to get into the obviousness of the political nature of why there isn't a single notification spec that has publish, subscribe, renew, terminate and pause/continue messages.
Polling
Another style of asynch uses "polling" rather than "callbacks". This avoids the opening up of the sender's address space, but causes increased complexity on both sender and receiver, inefficient message flow (as polls are often wasted) and potentially tardy responses. We have few WS-* specs in this area. I think a very intriguing possibility for polling is the use of blogging or RSS/Atom. Imagine using blogging for publish and subscribe.
Who specifies
A final aspect of Asynch worth speaking about is who decides whether a request is synchronous or asynchronous. There are only 2 choices: either the sender decides or the receiver decides. There is no clear answer. In cases where the sender can't do asynch, it would obviously prefer to do synchronous comm. And in the same manner, if the receiver knows that the communication will take a long time, it will prefer async. WS-Addressing uses a model where the receiver allows the sender to specify. The sender specifies a callback by the presence of a non-anonymous value in the ReplyTo field.
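A minimal Python sketch of that receiver-side check follows. The function name is hypothetical. The anonymous URI shown is the one from the WS-Addressing 1.0 Recommendation; earlier drafts used a different URI, so treat the constant as an assumption about which version is in play. Treating a missing ReplyTo as anonymous is likewise an assumption of this sketch.

```python
from typing import Optional

# Anonymous address from WS-Addressing 1.0; older drafts differ.
WSA_ANONYMOUS = "http://www.w3.org/2005/08/addressing/anonymous"


def sender_requested_callback(reply_to: Optional[str]) -> bool:
    """True if the reply should go to a separate address (asynch callback),
    False if it should come back on the same HTTP connection."""
    if reply_to is None:
        # Assumption: an absent ReplyTo behaves like the anonymous address.
        return False
    return reply_to != WSA_ANONYMOUS
```

The receiver still has to decide what to do if it wanted async and the sender asked for anonymous, which is exactly the "who specifies" tension above.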
Conclusion
Asynchrony is a powerful tool in building distributed systems. This article has provided some technical explanation of the tradeoffs between synchronous and asynchronous, when async can be applied, comparison of the callback and polling styles, and the specifications like WS-Addressing that can be used for asynch.
I believe that all abstractions eventually leak, it's just a question of how much. Web service Faults are the catalyst for some serious cascading leaks in asynchronous messaging, and WS-Addressing compounds matters. Updated 4/15/2005 based on comments from Marc Hadley and Jonathan Marsh.
With HTTP, SOAP HTTP Binding, SOAP MEPs, WSDL MEPs and WS-Addressing, the underlying protocol's ability to deliver Faults leaks through to every level. There ends up being no worthwhile abstraction. The fault is all Faults. Normal application messages can easily be modeled as one-way or request-response, but it's support for Faults that complicates the message exchange patterns and bindings enormously. Faults, and the delivery of Faults over HTTP, result in the need for extra SOAP and WSDL MEPs, and this abstraction leaks ever upward into application design.
It's even worse, because we have to retain SOAP meps even though we have WSDL meps.
Let's assume that you can deploy your application on 2 different protocols: HTTP and some marvy true one-way (SOAP over JMS or SOAP over UDP perhaps, or …)
Scenario 1: One-way
We start at the "top", that is the interface level. You write your application to send a one-way message. In WSDL terms this is a "one-way" MEP. SOAP provides a "request-response" MEP, and we probably will end up having a "one-way" SOAP MEP. Let's assume for now that we can use a "one-way" SOAP MEP. The logical SOAP mep choice for a WSDL one-way would be a SOAP one-way.
It's at the binding layer that the abstraction now leaks. You deploy on both HTTP and a real one-way. Things are great on the real one-way as the WSDL MEP, SOAP MEP and the protocol line up. But HTTP is a connection oriented protocol and gives a return code and optionally a response body. What happens with the HTTP return code and a response body? Should the application wait for the HTTP return code? And what should it do with it?
Imagine there's a failure, and a Fault is generated. An HTTP 4xx or 5xx is returned, and the Fault could be reported. Should the binding layer throw the return code + fault on the floor and say nothing to the application? Throwing away an error at the underlying protocol seems very counter-intuitive. We'd probably like the underlying protocol to report back the fault. Which means we picked the wrong WSDL MEP, as we'd like to be able to report the underlying protocol error to the application.
Scenario 2: WSDL Robust in-only
Let's pick a better WSDL MEP in scenario 2. WSDL 2.0 has a "robust In-only" for *exactly* this pattern of in + optional fault out. This is the obvious WSDL MEP to use if a Fault can be reported from the underlying protocol up to the application. Now we don't actually have the right SOAP MEP for this. Amazing, but true. The WSDL MEP correspondingly needs a SOAP in-optional-out MEP. Another option is a SOAP in-optional-fault MEP which, when bound to HTTP, specifies that an HTTP 2xx must contain no body. Let's call this one-way mep mythical because we haven't decided to standardize one yet though I wrote up one many months ago.
The choice of the WSDL MEP at the application layer is guided by which underlying protocol(s) are expected to be used. This is the main thesis of this article, that the application layer modeling is affected by the underlying protocol. An application using HTTP would probably never want to deploy a true in-only as the HTTP response code will be thrown away. The underlying protocol's ability to report faults has leaked into the SOAP mep which has then leaked into WSDL mep. The abstraction between the application MEP and the underlying protocol (via the SOAP MEP) has leaked.
Point #1: The myth of protocol independence
This comparison of in vs robust in-only has shown that the application level WSDL MEP will be determined by the underlying protocol, assuming that the underlying protocol's information is fully utilized.
Another possibility is to throw out anything "extra" from the underlying protocol, that is effectively dumbing HTTP down to UDP. In general, Web services using SOAP and WSDL 1.1 have already done that by ignoring the HTTP Operation. The Web services "architecture" has further done that by ignoring the protocol capabilities, such as security, encoding, caching. We could have utilized the capabilities of HTTP and other protocols if we'd agreed how to describe the capabilities, but the "features and properties" work of SOAP 1.2 and WSDL 2.0 looks pretty much DOA.
Corollary to point #1: True protocol independence means dumbing down every protocol to UDP OR a framework for expressing protocol capabilities
I've shown how we are achieving true protocol independence by throwing away everything that makes up HTTP: the operations, status code, response, encodings, security, and all that.
What's the point of SOAP meps
As I've been exploring this complexity of wsdl meps, soap meps, and bindings, I've been wondering if we could simplify or even not use SOAP meps. Why not just have WSDL MEPs and Bindings and skip the SOAP mep abstraction? Also, Jonathan Marsh made an intriguing proposal that maybe we could have an "uber MEP" of in optional-out rather than an in-out and an in-only SOAP MEP.
It turns out that we need both a new one-way AND an in-optional-fault mep, because the SOAP sender has to know what it can expect, and a receiver has to know what it can do. Argh!
Imagine a world without soap meps…
If a WSDL in-only MEP is specified, then there is no response allowed, period. So the SOAP MEP should be a one-way. This tells the SOAP sender to ignore any response, and if communicated in WSDL it tells the SOAP responder to not send any response. If a WSDL robust in-only is specified, then a response is allowed. An in-optional-out SOAP mep is needed. A 200 indicates no fault, and a 4/5xx indicates a Fault.
We need the 2 new SOAP meps (one-way and in-optional-fault) to tell the SOAP sender what to expect and the SOAP receiver what it can do. If we only have the in-optional-fault without the in-only, then how does a SOAP sender know whether it could get a Fault, and how does a SOAP receiver know that it can send a Fault? There has to be something somewhere that says "SOAP receiver, you can return a Fault on the HTTP connection". This is what the SOAP meps do.
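The sender's side of this can be sketched as a small lookup in Python. Everything here is illustrative: the table, the function name, and the returned labels are hypothetical, and the one-way and in-optional-fault SOAP meps are the proposed ones, not standardized ones. For robust in-only the sketch uses the in-optional-fault mep.

```python
# Default mapping from WSDL MEP to the SOAP mep that tells the SOAP
# software what it can expect. Names are illustrative only.
SOAP_MEP_FOR_WSDL_MEP = {
    "in-only": "one-way",
    "robust-in-only": "in-optional-fault",
    "in-out": "request-response",
}


def sender_expectation(wsdl_mep: str, http_status: int) -> str:
    """Say what a SOAP sender should make of the HTTP response."""
    soap_mep = SOAP_MEP_FOR_WSDL_MEP[wsdl_mep]
    if soap_mep == "one-way":
        # No response allowed, period: whatever HTTP returns is ignored.
        return "ignore-response"
    if 400 <= http_status < 600:
        # A 4/5xx indicates a Fault.
        return "fault"
    if 200 <= http_status < 300:
        # For in-optional-fault a 2xx just means "no fault".
        return "no-fault" if soap_mep == "in-optional-fault" else "response"
    return "unspecified"
```

The first branch is the leak in miniature: with a WSDL in-only over HTTP, a 500 carrying a Fault is thrown on the floor.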
Why in-optional-out is bad
If we used only an in-optional-out, that is ignoring request-response and in-only, then the receiver doesn't know what it can do with status codes and responses including faults. A sender wouldn't know whether it will get a response (request-response), may get a response (in-optional-out) or won't get a response (in-only). Imagine these two get out of synch, say the soap sender always waits for a response and the soap receiver lets the application specify whether a response is to be sent or not. How does the soap software "know" to keep the connection open? Imagine at the WSDL level it is one-ways. The receiver has dispatched the message, but the application won't send a response. The application *must* tell the SOAP layer that it's not going to send a response. How does it do that? It probably says "this is a one-way message".
Voila. We've just invented the main point of SOAP meps. Perhaps the SOAP meps could be simplified or rejigged. But at the end of the day, you still have to say things like can/may/must there be a response, and is the response a fault or a soap response or either. Maybe we could have a "SOAP Request-response profile" that says "assume all the stuff in the request-response mep, except there can't be a response."
The MEP return optionality naturally gets passed down to the binding as "can a response come with an http 200, can a response come with any http code, can only a 200 without a body or a 4/5xx with a body, etc.". It's worth pointing out that the only public SOAP HTTP binding spec says that an HTTP 200 requires that the SOAP sender start making an abstraction of the response message AND it says nothing about any other 2xx return codes. For example, it does not say that a 204 indicates the message is finished.
Point #2: SOAP MEP functionality needed
This exercise has shown that the needs of the SOAP sender and SOAP receiver dictate that SOAP meps are required. The SOAP meps are used to tell the SOAP software what it can do. Coincidentally, we need a SOAP HTTP Binding that supports an in-only and an in-optional-fault MEP, not coincidentally matching up WSDL meps with underlying protocol capabilities.
Presumably WSDL 2.0 will need an update to support the new MEPs and HTTP bindings, and I'd suggest that the SOAP MEP defaults to the WSDL meps. One could bind the meps in a different manner, and some extension could over-ride the default. For example, a WSDL robust in-only could use a SOAP one-way over a SOAP one-way underlying protocol. And a WSDL in-out could be mapped to 2 different SOAP in-optional-faults, each to a separate HTTP connection for the application request/response using asynchronous underlying protocol (aka asynch request response).
WSDL mep and SOAP mep needs
1. A SOAP one-way MEP,
2. A SOAP in-optional-fault MEP,
3. One or more SOAP HTTP binding(s) that supports one-way and in-optional-fault MEPs.
4. Updated WSDL 2.0 to relate WSDL meps to SOAP meps.
Enter sandman: WS-Addressing
Now you decide that you want to use WS-Addressing. You are again going to deploy your app on both HTTP and some one-way protocol. What should you do about faults and meps? Let's say you pick the robust in-only. You will probably specify a FaultTo. FaultTo has 2 options: an explicit address and "anonymous". The anonymous is for the same connection, ie an HTTP connection. Anonymous seems pretty ideal for the HTTP 4/5xx + SOAP Fault scenario. But which address to pick, explicit or anonymous?
Those pesky underlying protocols are different and this leaks upwards into WS-Addressing and ReplyTo. In the true one-way case, an explicit FaultTo address is the obvious choice. There are many different places where a Fault could be generated and all of them can be sent to the FaultTo address.
Using HTTP, a fault can be generated 1) during reception of the SOAP message (ie at the protocol level) or 2) after the HTTP connection has closed. If you put an anonymous FaultTo address, you lose the ability to transmit the 2nd type of Fault. If you put an explicit address, what happens to a fault generated while the HTTP connection is open? It seems extremely inefficient to report errors of type #1 by closing the HTTP connection, and then opening up another HTTP connection to send the Fault. Especially if a Fault is generated and the FaultTo can't be read, eg the envelope is encrypted and no appropriate key is available.
What we need is the ability for both anonymous and explicit FaultTo addresses in WS-Addressing when HTTP is used. The intent is that any faults generated during HTTP connection time can be returned in the HTTP response, and any faults generated after the connection is torn down have an explicit address. I think my favored design is an "AllowAnonymous" on FaultTo, with a default of True. The default is that a fault generated during the HTTP processing could be returned over the HTTP connection, but this can be turned off in the FaultTo. Some other ideas are to allow 2 EPRs for FaultTo and require one to be anonymous if 2 are present, or to have an "allowAnonymous" with a default of False so that an explicit action is required.
Returning to the leaky abstraction theme, notice how the protocol abstraction is broken. Using HTTP with a FaultTo means that a Fault could come back on the HTTP connection OR to the FaultTo address. The application will need to be prepared for a Fault when it does the send. This does not occur if a true asynch protocol is used. An application that can be bound to true asynch as well as HTTP must be prepared for Faults in 2 places. The design of the WS-Addressing FaultTo mechanism itself reflects the underlying protocol's capabilities.
There are 3 scenarios for FaultTo with no ReplyTo:
1. No FaultTo or anonymous: must have at least a WSDL robust in-only and SOAP in+optional-out.
2. FaultTo and allowAnonymous="true": equivalent to 2 WSDL operations and 2 SOAP meps, first is robust-in variety, the second is one-way or robust-in variety.
3. FaultTo and allowAnonymous="false": equivalent to 2 WSDL operations and 2 SOAP meps, first is in-only and second is one-way or robust-in variety.
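The routing implied by these three scenarios can be sketched as a small decision function. This is only an illustration of the proposed "allowAnonymous" idea: the function name, the `ANONYMOUS` placeholder, and the return values are all hypothetical, and nothing here is defined by WS-Addressing itself.

```python
from typing import Optional

ANONYMOUS = "anonymous"  # placeholder standing in for the WS-Addressing anonymous address


def fault_destination(fault_to: Optional[str],
                      allow_anonymous: bool,
                      connection_open: bool) -> Optional[str]:
    """Return 'http-response', an explicit address, or None if the fault is lost."""
    if fault_to is None or fault_to == ANONYMOUS:
        # Scenario 1: only the HTTP connection can carry the fault.
        return "http-response" if connection_open else None
    if connection_open and allow_anonymous:
        # Scenario 2: faults raised while the connection is open go back on it.
        return "http-response"
    # Scenario 3, or the connection has already closed: use the explicit address.
    return fault_to
```

The `None` case is the one the anonymous-only design cannot handle: a fault generated after the HTTP connection has closed has nowhere to go.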
Point #3: WS-Addressing Faults must reflect underlying protocol capabilities
The previous section has shown that WS-Addressing Faults should support the capabilities of HTTP by allowing faults over the HTTP connection as well as Faults being sent to the FaultTo address. A particular design is proposed that cleanly adds this into the FaultTo.
Mix-in some ReplyTo
In the case of Asynchronous request response, where the application sees request/response but there are 2 separate connections, WS-Addressing provides a ReplyTo address. But which SOAP meps should be used for a WS-A with a ReplyTo and a FaultTo?
In the case of asynch underlying protocol, the request/response/fault messages each use SOAP one-ways and WSDL one-way messages. In the case of HTTP, things can be more efficient. I'd previously mentioned that WS-Addressing ought to support a Fault over the HTTP Connection. This implies a SOAP in-optional-fault aka robust-in MEP for the request message. The application will know to wait for a SOAP fault if the HTTP response code is 4/5xx.
It's conceivable and possible that there may be faults in delivering the Reply or the Fault on the separate HTTP connection. Presumably, the application sending the Reply or Fault will want to know if they can't be delivered. This implies a SOAP in-optional-fault MEP for the response and the fault messages.
Returning yet again to leaky abstractions, using WS-Addressing has done nothing to insulate or abstract the SOAP meps from the protocol binding. It's simple: if you use HTTP, then the SOAP meps and WSDL meps should be in-optional-fault (aka robust in-only). If you use a true asynch protocol, then the meps can be in-only. The safest option is to use in-optional-fault for the SOAP and WSDL meps. I'd say that the default for a WS-Addressing aware service ought to be in-optional-fault. If the underlying protocol doesn't support returning faults, then the optional fault part will never be used.
Point #4: WS-Addressing ReplyTo or FaultTo needs new SOAP MEP(s)
The existence of non-anonymous ReplyTo means that it is not synchronous request response at some layer, and invariably it means not synchronous at the protocol layer. If you wanted the HTTP response for the application response, you'd use request-response with anonymous ReplyTo. Two connections means using two bindings and two SOAP meps. I've already mentioned the need for a SOAP one-way and an in-optional-fault MEP.
WS-Addressing's ReplyTo, with non-anonymous values, calls for a breakup of the 1:1 mapping between WSDL and SOAP meps for a natural programming environment.
I think it's pretty natural to want to express a request-response using ReplyTo as a single operation. The whole point of doing "WSDL" operations is to associate one or more messages into an operation. My pitch is that we should be able to take a WSDL in-out that has WS-A engaged and express it as 2 SOAP in-optional-fault meps if there is a ReplyTo and 3 SOAP in-optional-fault meps if there is a ReplyTo and non-anonymous FaultTo. Some folks will argue that there should be a 1:1 correspondence, and so ReplyTo should use 2 WSDL meps and then some kind of correlation aka BPEL service links. I proposed this to WSDL 2.0 in June 2004, but WS-A wasn't at the W3C at the time.
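As a rough illustration of that pitch, the expansion of a single WSDL in-out into SOAP meps could look like the sketch below. The function name and the `ANONYMOUS` placeholder are hypothetical; the counts follow the paragraph above.

```python
from typing import List, Optional

ANONYMOUS = "anonymous"  # placeholder standing in for the WS-Addressing anonymous address


def soap_meps_for_wsdl_in_out(reply_to: Optional[str],
                              fault_to: Optional[str]) -> List[str]:
    """List the SOAP meps one WSDL in-out expands to when WS-A is engaged."""
    if reply_to is None or reply_to == ANONYMOUS:
        # The reply comes back on the same HTTP connection.
        return ["request-response"]
    meps = [
        "in-optional-fault",  # the request
        "in-optional-fault",  # the reply, sent to ReplyTo
    ]
    if fault_to is not None and fault_to != ANONYMOUS:
        meps.append("in-optional-fault")  # the fault, sent to FaultTo
    return meps
```

The application still sees one operation; only the layer underneath knows it became two or three message exchanges.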
Point #5: WS-Addressing asynch request-response should allow 2 soap meps for wsdl request-response
The argument comes down to whether there is a leaky abstraction or not. Ironically, I want to hide the SOAP meps from the WSDL meps, so a wsdl request-response can map to more than 1 soap mep. Those that want a 1:1 relationship are arguing that the application protocol should mirror the underlying protocol.
My argument is that the constructs in WS-A - ReplyTo, FaultTo, messageId, and relatesTo - are sufficient information to do the correlation and manage the connections and URI-space. The key is to make sure the programming model knows that request-response can be deployed asynchronously and doesn't make synchronous assumptions.
Diversion: WSDL 1.1
What to do about WSDL 1.1 MEPs, SOAP and HTTP Bindings? The situation is quite simply dreadful. If you want to use an asynch protocol, then you can only use the WSDL 1.1 one-way. Which means if you also use HTTP, then any Faults are simply thrown away. If you use HTTP and you want faults over the HTTP connection, you are forced to use in-out. You have 2 bad choices: Use one-way and throw-away HTTP Faults, or get HTTP faults but be forced into request-response.
WS-Callback spent considerable time examining all the protocol possibilities for callbacks. One decision was to not allow an "acknowledgement" message (aka out or response) on the HTTP response. For an acknowledgement header block to be returned for a one-way mep, it would have required a "not a fault" SOAP fault, which seemed far too complicated.
We clearly need to move to WSDL 2.0 and its MEPs to enable more efficient error handling.
Intermediaries and MEPs
I'm still struggling with intermediaries and SOAP MEPs. I think there is a problem in intermediaries. I don't think a SOAP intermediary will perform as expected with a SOAP one-way MEP. The problem is that the SOAP message is not self-describing from an MEP perspective. In the scenarios above, the sender and the receiver have the SOAP MEP, so they know whether the mep is one-way or request-response. But the intermediary probably won't be configured with that information.
A reasonable design for an intermediary is to assume the MEP is request-response. If it assumes the MEP is one-way, then it will erroneously throw away any responses. So request-response is the thing. But if the MEP really is one-way, the behaviour will be undesirable. The intermediary will keep the inbound HTTP connection open while it waits for a response from the node it is forwarding the message to. It will effectively have to wait for the ultimate receiver to determine whether to send a SOAP response or not. This SOAP response will then flow back through the intermediary chain.
This is very undesirable from an intermediary's and sender's perspective if the MEP is a one-way. They would rather have each intermediary send an HTTP 202 and close the connection. The whole point of store-and-forward intermediaries is to decouple the message passing from the sender.
It seems to me the solution must involve making the message self-describing wrt the SOAP MEP. ReplyTo doesn't help us, because ReplyTo can be present for an in-only MEP if the reply is part of some long running pan MEP choreography. Maybe a <wsa:soapmep>One-way</wsa:soapmep> flag. Another option is changing the definition of non-anonymous ReplyTo to mean in-only. I'm not sure what the right solution is, but I think something has to be done.
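If messages did carry such a flag, the intermediary's choice would become trivial. A minimal sketch, assuming the MEP has already been read out of the hypothetical soapmep header (or is `None` when the header is absent); the function name and return values are made up for illustration:

```python
from typing import Optional


def intermediary_action(declared_mep: Optional[str]) -> str:
    """Decide what a store-and-forward intermediary does with the inbound HTTP connection."""
    if declared_mep == "one-way":
        # The message says no response will follow: acknowledge and release the sender.
        return "send-202-and-close"
    # Unknown or request-response: the safe assumption is that a response may come back.
    return "hold-connection-for-response"
```

Without the flag, every message falls into the second branch, which is exactly the undesirable behaviour described above.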
What's needed, and why
What do we really need from all of this for faults and asynch request-response, and why:
1. A SOAP one-way MEP, so that a SOAP sender will know not to wait for any optional (ie Fault) or required response over a one-way protocol
2. A SOAP in-optional-out MEP or a SOAP in-optional-fault MEP, so that a SOAP sender can wait for an optional response, such as a FAULT over HTTP.
3. One or more SOAP HTTP binding(s) that supports one-way and in-optional-out MEPs.
4. Updated WSDL 2.0 to relate WSDL meps to SOAP meps.
5. A fix to WS-Addressing FaultTo so that a Fault can be returned upon an initial request and not just to the FaultTo, and I propose an "AllowAnonymous" flag.
6. Support for asynchronous request-response to override the WSDL 1.1/2.0 mep defaults and specify different soap meps than the default. I'd proposed this in WSDL 2.0 but it could be done in Addressing as well.
7. Ideally, a simplified SOAP MEP structure to say simple things like in-only, in-optional-fault.
8. Some solution to make messages self-describing wrt the SOAP MEPs.
Returning to the leaks
We need these new SOAP meps and SOAP HTTP bindings to support asynch and HTTP Fault reporting. The HTTP faults have caused leakage because:
1. the specific binding leaks into the SOAP mep selected, because HTTP bindings match up well with a to-be-standardized (TBS) SOAP in-optional-out to support faults on the HTTP connection
2. the specific binding leaks into the WSDL mep selected because the WSDL mep will logically match up with the SOAP mep
3. the specific binding leaks into the WS-Addressing Fault, because an HTTP binding will suggest Faults can be returned on the inbound HTTP connection AND a separate connection for the FaultTo.
This paper has shown how the underlying protocol leaks into the SOAP MEP (#1), the WSDL MEP (#2), and the WS-Addressing Fault (#3).
These conclusions are definitely not what I had wanted, but I can't see any other solutions that enable asynch request/response, binding to multiple protocols, and proper reporting of HTTP faults to the application.
Every few months or so, another 100K or so of uninformed and sloppy writing is generated comparing Web services to REST. I've now found the analogy. It's the "war in Iraq". Let's compare and contrast.
Iraq
At this moment, cnn.com has one article in the World section about Prisoner counts doubling in Iraq.
Web services/REST
No blog entries in the past couple weeks.
Iraq:
Last few articles talked about "Increasing insurgent activity"
WS/REST
Last serious article on Tim Bray's Blog talked about "Web Services rumble getting remarkably loud"
Iraq:
Precious little knowledge about what's actually going on. Is it safer? More unsafe? Are the people happier, unhappier?
WS/REST
Precious little knowledge about what's actually going on. Are companies deploying Web services/REST? What is the actual WS/REST traffic on Web apps?
Iraq:
Everybody's agenda influences what they say. Those that say going to war was a good thing only quote good things, those that are opposed only quote bad things.
WS/REST
Same as Iraq. My favourite example is people talking about Amazon's "REST" API being 85% of the traffic, when it's not a REST api at all and the 85% traffic has been disavowed.
Iraq:
The aggressors/liberators motives distrusted. For oil, democracy?
WS/REST
The WS-* folks motives distrusted. REST folks think they want to do a stack swap and replace HTTP, TCP, URIs with SOAP+WS-RM+WS-Addressing+SOAP/UDP+binaryXML.
Iraq:
The nay-sayers never acknowledge any positives of invading/liberating Iraq.
WS/REST
The nay-sayers never acknowledge any positives of WS. Tim Bray links to Carlos Perez's articles on REST being better than WS-* but fails to mention Chris Ferris' awesome debunking of Carlos in "1+1=0".
Iraq:
Sometime soon, there will be some more poor boys and/or gals from Iowa, Ohio, etc. that will be killed. Some press will call it "increasing insurgency".
WS/REST
Sometime soon, some new blogo/wiki/email/ flame war will break out. People will talk about "Death of Web services".
What's it mean?
In the Web services vs REST debate, the sad part is that the communities are not coming closer together. There are things that could be done for Web services to integrate with REST but few people from either camp are jumping up and down. And there certainly is no progress in getting to agreement on any kind of principled approach towards the discourse, like having an actual framework or architecture for evaluating the various arguments.
And we sure are not having the really important debate about how best to architect asynchronous applications and stacks, which I think is the key innovation behind WS-*. What are the technical trade-offs between HTTP + TCP + "content-location" vs SOAP + WS-RM + WS-Addressing + SOAP/UDP? You'd think the REST folks could *at least* use the architectural properties of key interest from the REST thesis when they are attacking Web services. Funny how the REST advocates won't use the formal model of REST to evaluate WS-*. At least I did the hard work and wrote up a comparison of EPRs to URIs using the REST properties.
We're in the same situation in WS/REST as we are in modern press reporting. People are regularly offering biased and inaccurate opinions, referring to and using biased and inaccurate opinions they've been fed, and there is a general malaise about doing the actual hard work of background research.
The truth is always the first casualty, and the WS/REST debate has shown that the technical community is just as bad as the rest of politics.
What I don't quite get is why so many in the technical community have this smug superiority complex about the "truth" that they offer, almost invariably bash W and what goes on in the administration, and yet can't see the similarities. What's good for the goose is good for the gander, and if technologists don't like the way some organizations don't do the hard work and bias everything, the last thing that a technologist should do is follow the same pattern yet remain self-righteous.
To execute on "act local think global", I call on technical people to engage more in deeply technical debates and less in "marketing" campaigns. Do the work, present the facts, and propose actual technical analysis.
Yahoo has released the Yahoo search api. It is RESTful and does not use SOAP or WSDL. WSDL 2.0 has tried very hard to model RESTful Web services. For some reason Yahoo provided their documentation in text form as well as a schema for the response, but they didn't provide a WSDL 1.1 or WSDL 2.0.
I quickly wrote up the WSDL 2.0 for their service.
Here are my attempts at the Query type and the Yahoo Search in WSDL 2.0.
It could be simpler, but it certainly seems to do the job.
The Web services and HTTP world are still way separate. Without getting into the politics, I wonder what it would take to get to a mixed world: where SOAP clients can "see" Web resources, and Web clients can "see" SOAP resources. We're getting awfully far down the WS-Addressing path, which is doing nothing to contribute WS-Addressing"able" resources to the Web, and the SOAP 1.2 "Web-Method feature" didn't do this, so what else might be needed?
Strawman
I think there are 3 things that would need to be done:
- Binding SOAP to XML over HTTP, taking the soap body child and setting as HTTP body
- Binding of WS-A properties to HTTP properties
- Revamped WSDL to use SOAP for all operations and use Constrained verbs as operations
XML over HTTP "is" SOAP
The first binding is to make it look to "SOAP" like all Web resources are SOAP resources, using the "raw XML is a SOAP body" trick that Sam Ruby talked about. Let's take that a bit further into WS-A land and say that various HTTP properties, like request-method and request-uri, are "automatically" WS-A properties.
HTTP Binding(s)
This means there needs to be at least 1 and maybe 2 new bindings. There needs to be a binding to HTTP that allows headers and body to be serialized into various places in HTTP. Let's call this the "SOAP HTTP Transfer Binding". The current SOAP HTTP Binding already exposes the Request URI and Web Method as features, so the SOAP HTTP Transfer binding could use that with the addition that all SOAP headers are serialized as HTTP Headers and the SOAP body is serialized as XML in the HTTP Body. I lean towards a mime type of application/xml rather than soap+xml.
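A minimal sketch of what such a serialization might do, using only Python's standard library. The function name and the `X-SOAP-` header prefix are assumptions, since nothing above says how header blocks would be named in HTTP, and flattening a header block to its text content throws away its namespace and any nested structure.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace


def to_http(envelope_xml: str):
    """Split a SOAP 1.2 envelope into (HTTP headers, HTTP body)."""
    envelope = ET.fromstring(envelope_xml)
    headers = {"Content-Type": "application/xml"}
    soap_header = envelope.find(f"{{{SOAP_NS}}}Header")
    if soap_header is not None:
        for block in soap_header:
            # Assumption: the HTTP header name is derived from the block's local name.
            local_name = block.tag.split("}")[-1]
            headers[f"X-SOAP-{local_name}"] = (block.text or "").strip()
    # The first child of the SOAP body becomes the raw XML payload.
    payload = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    return headers, ET.tostring(payload, encoding="unicode")


envelope = (
    '<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">'
    '<env:Header><h:Priority xmlns:h="http://example.org/h">high</h:Priority></env:Header>'
    '<env:Body><order xmlns="http://example.org/po"><id>42</id></order></env:Body>'
    '</env:Envelope>'
)
http_headers, http_body = to_http(envelope)
```

Going the other way (wrapping a plain XML response in an envelope) is the Sam Ruby trick from the previous section.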
WS-A binding to HTTP
The next piece is to provide WS-A specific bindings to this new SOAP HTTP Transfer binding. The obvious things would happen - the wsa:action becomes the web-method property and the wsa:To becomes the ImmediateDestination property. I've mentioned the other tricky parts before in ruminations on WS-Addressing and transfer protocols.
Transfer Verbs
Somewhere the Transfer verbs need to be "identified". My guess would be that the SOAP adjuncts should define the HTTP operations of GET|POST|PUT|DELETE, probably just as simple as defining qnames for the HTTP operations.
Another obvious option would be to use WS-Transfer - or even WS-Get - operations. This makes a lot of sense because the generic operations aren't limited to just HTTP. But... WS-Transfer doesn't solve the problem of how to specify the types that are related to the transfer operations. And WSDL n.n doesn't give a way for a client to take the transfer verbs and then "extend" them with the input and/or output types. Maybe that's a problem with WSDL.... So it doesn't really add the necessary typing to WSDL.
WSDL
WSDL then needs a way of using the new binding(s) and the HTTP properties.
WSDL transfer operations
To really use HTTP within WSDL, we could define an operation that uses the HTTP verbs. How about going as far as baking in HTTP GET right into a special operation, like:
<wsdl:getOperation>
<wsdl:input headers="" />
<wsdl:output element=""/>
</wsdl:getOperation>
Operations without Interfaces
Native HTTP, aka REST, is fundamentally different than Web services because HTTP defines operations and Web services allows arbitrary operation definitions. One result is that Web services typically have many operations in an interface at a small number of URIs, whereas Web apps typically have a large number of URIs. Web services = few URIs and Web apps = many URIs. As a result, WSDL's interface centric model is naturally difficult to use with web apps. It would be simpler for Web application developers to define operations *without* collecting them into an interface. They just don't need to collect operations into interfaces because there are very few operations. WSDL that allowed standalone operations that could be directly deployed at locations would dramatically simplify describing web apps.
Assume SOAP + maybe WS-Addressing in WSDL
The final step for WSDL is to assume that SOAP is the format and protocol for all wsdl defined messages. WSDL can be simpler for SOAP messages on the wire (no soap binding!!!!) and the HTTP Native binding can be used to bridge to the HTTP world.
A service could just reference the interface or operations without a binding step (it's just SOAP), or have a flag indicating that the HTTP Native binding is being used.
Other "Simplified" WSDL
I'm certainly not the only one that wants a simplified WSDL.
Some of the other work around simplifying WSDL is at SSDL and Rich Salz's Really Simple Web Service Descriptions.
Politics
Obviously lots of politics would have to happen, not the least of which is that some companies want to focus solely on SOAP envelopes being passed around and no use of GET. As well, the WSDL WG has passed its Last Call period. OTOH, WSDL 2.0 doesn't seem to have a lot of pent up demand, so helping both SOAP users and XML over HTTP users better than it's done currently could increase demand.
I've been evaluating XML schema, RDF/OWL, and RelaxNG for extensibility and versioning capabilities. Herein, I provide a draft version of a small set of scenarios that I think are good for a first pass evaluation.
The scenarios start simply, with the first scenario being compatible schema evolution where the type is extended at the end and the new type is created by simply rewriting the type with the additional extension. This proceeds through extension not at the end, to by-reference extension where the original type is left untouched and the new type refers to the original type + the extension, and concludes with by reference extension of another type that is an incompatible change with the original type.
Compatible Schema evolution by-value extension at end
A language is extended in a backwards and forwards compatible way, and a V2 Schema can be written by simply adding in the new information. My canonical example is a name type in V1 that requires first followed by last and allows extensions, ie name=first,last or name=first,last, middle. A subsequent version, V2, is a name that adds a middle. Both schemas should allow names with or without middles.
Compatible Schema evolution by value extension anywhere
Similar to the previous scenario with the addition that the content model allows for extension anywhere in the type. A middle could be added in between first and last, such as name=first,middle,last
Compatible Schema evolution by-value extension with duplicate prevention variant
This variant refines scenarios #1 and #2 by requiring that an element that is already known cannot occur in an extensibility point. In the name example, a first name cannot also occur after a last name, ie name=first,last,first is precluded for extensibility at the end, and name=first,first,last is precluded with extensibility anywhere.
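As a toy illustration of the first scenario and its duplicate prevention variant, here is a sketch that treats a name as a plain list of child element names. The function and parameter names are hypothetical, and it covers only extension at the end, not extension anywhere.

```python
from typing import List

KNOWN_V1 = ["first", "last"]  # the elements the V1 name type already defines


def valid_name(children: List[str], allow_known_in_extension: bool = True) -> bool:
    """Check name = first, last, followed by zero or more extension elements."""
    if children[:2] != KNOWN_V1:
        return False
    extension = children[2:]
    if not allow_known_in_extension:
        # Duplicate prevention variant: known elements may not reappear as extensions.
        return not any(child in KNOWN_V1 for child in extension)
    return True
```

With this check, name=first,last and name=first,last,middle are both accepted, while name=first,last,first is accepted only when the duplicate prevention variant is switched off.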
Compatible Schema evolution by-reference extension
This scenario is where the V2 schema refers to the V1 schema rather than augmenting the V1 schema. This is roughly nameV2 schema = name schema + optional middle
Incompatible Schema evolution by-reference extension by extension author
A v2 schema is not compatible with V1 and V2 is not authored by the V1 author. This is a common scenario for container languages, like SOAP and WSDL. These languages are specifically designed for extensibility and for extension authors to make incompatible extensions. An example is a Web service that requires that a SOAP body contain a particular element and the SOAP header MUST or MAY contain other elements. This is roughly where nameV2 schema = name schema + mandatory middle
Multiple Namespace names
Languages may, and often do, use multiple namespace names. Each of the scenarios has a variant where the V1 language consists of multiple namespace names. The first scenario with the multi-ns variant is roughly that V1Name = firstns:first, lastns:last.
Instance vs Schema Type vs Schema Language constructs
Schema languages provide and enable various constructs for extensibility and versioning. Each of the scenarios can be evaluated on what constructs are required to enable the extensibility and versioning. At one end of the range, the schema language itself provides all the constructs necessary, and schema instances and document instances contain no constructs. At the other end, the schema language provides constructs that must be used in schema instances or even document instances. An example of the latter is XML Schema, which provides a wildcard that must be combined with document instance constructs - such as an Extension element or Sentry/Marker element - to fulfill scenario #1.
Evaluation Criteria
This provides a small set of scenarios plus a couple of variants that can be used for examining extensibility and versioning in a schema language.
I've been thinking a fair bit about how WS-Addressing *could* make use of HTTP as a transfer protocol. There would be 2 main different scenarios: 1) to bring all the HTTP services into the WS-Addressing realm; 2) to enable WS-Addressing software to utilize Web based software like intermediaries.
I've been saying for a while now that I think it's a shame that SOAP 1.2 didn't define a general SOAP to HTTP binding that used HTTP as a transfer protocol, for the previous 2 reasons.
This is the crux of the first problem facing WS-Addressing making use of HTTP as a transfer protocol. If SOAP 1.2 didn't do it, why should WS-A? If SOAP 1.2 didn't provide the hypothetical "SOAP HTTP Transfer binding", why should WS-A create such a thing?
The argument for why WS-A should do it is that it should have been done, and WS-A does define a simple binding of abstract properties to SOAP Headers, so why not define a binding of abstract properties to HTTP?
This relationship between WS-A and HTTP would show up in pretty much all the WS-A properties: Action, To, From, ReplyTo, FaultTo, Message-Id, RelatesTo. I wonder what a strawman would look like. Note, I had proposed a WS-REST binding to the WS-RF last April.
Action
The design for Action would have to be something like:
1. WS-Addressing defines/refers to 4 Transfer operations (ooh, kind of like HTTP or WS-Transfer). I had proposed that WSDL 2.0 define 4 different transport independent operations, but they didn't want that so WS-Transfer is better.
2. When these operations are the Action, then a sender can be configured to map directly to the Web Method Property (ie the transfer verb)
3. The receiver of an HTTP Operation will create an Action property from the Protocol operation, or the Action Header. Probably the Action header over-rides the protocol to avoid the "POST" Action problem.
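The three steps above can be sketched as a pair of lookups. The action URIs are invented for the example, and sending any non-transfer Action over POST is an assumption on my part; it is also the case where the Action header has to travel with the request so that the receiver doesn't end up with a "POST" Action.

```python
from typing import Dict, Optional

# Hypothetical action URIs for the 4 transfer operations.
TRANSFER_ACTIONS = {
    "http://example.org/transfer/Get": "GET",
    "http://example.org/transfer/Put": "PUT",
    "http://example.org/transfer/Post": "POST",
    "http://example.org/transfer/Delete": "DELETE",
}
METHOD_TO_ACTION = {method: action for action, method in TRANSFER_ACTIONS.items()}


def sender_method(action: str) -> str:
    """Sender side: a transfer Action maps to its verb; anything else goes over POST."""
    return TRANSFER_ACTIONS.get(action, "POST")


def receiver_action(method: str, headers: Dict[str, str]) -> Optional[str]:
    """Receiver side: an explicit Action header overrides the verb-derived Action."""
    return headers.get("Action") or METHOD_TO_ACTION.get(method)
```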
The problem is what to do with the rest of the properties, so let's look at them.
MessageId
Message Ids are very useful for correlation and comparison for duplicates. The duplicate detection is probably best handled outside the scope of WS-A, such as WS-ReliableMessaging. OTOH, correlation for an asynchronous callback is definitely useful for WS-A. HTTP has a built in correlation mechanism that is the HTTP Connection and the request-response. Some possible design alternatives are:
1. If the WS-Addressing MEP is request-response AND it is bound to a single HTTP Connection, then the MessageId and RelatesTo are anonymous values.
2. The MessageId/RelatesTo properties are serialized as HTTP headers.
ReplyTo
It gets harder to use HTTP as a transfer with WS-Addressing with ReplyTo. The common ReplyTo scenario is an asynchronous callback. Now what would it mean to use an HTTP GET with an asynch callback? What about PUT or DELETE? These don't seem very callback-ish. Would GET be more useful for Polling solutions instead of callbacks?
If we squint hard enough, we could see that an asynchronous GET operation could look like:
1. If Action is GET and there is a ReplyTo, then serialize the request as an HTTP GET with a x-ReplyTo header.
2. The service responds with a 200 OK response
3. The service then responds with the callback. Presumably this isn't another GET but would more likely be a POST.
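Squinting in code, the exchange might look like the sketch below, with HTTP messages modelled as plain dictionaries rather than real network calls. Only the x-ReplyTo header comes from the steps above; the x-MessageID and x-RelatesTo headers and the function names are assumptions added so the callback can be correlated with the request.

```python
def serialize_async_get(to: str, reply_to: str, message_id: str) -> dict:
    """Step 1: serialize the request as an HTTP GET carrying an x-ReplyTo header."""
    return {
        "method": "GET",
        "uri": to,
        "headers": {"x-ReplyTo": reply_to, "x-MessageID": message_id},
    }


def build_callback(request: dict, result_xml: str) -> dict:
    """Step 3: the service later delivers the result as a POST to the ReplyTo address."""
    return {
        "method": "POST",
        "uri": request["headers"]["x-ReplyTo"],
        "headers": {
            "x-RelatesTo": request["headers"]["x-MessageID"],
            "Content-Type": "application/xml",
        },
        "body": result_xml,
    }
```

Step 2, the immediate 200 OK, carries no useful content here, which is part of why the pattern feels odd for a GET.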
This just seems strange. HTTP doesn't really work well with ReplyTo kinds of operations. It does work well for asynch access to the HTTP Resource. The HTTP PUT/POST can return a Content-Location HTTP header that would be useful for the client to call to.
So HTTP is designed for origin server-side asynchronous access, but not client side asynch. Which is absolutely no surprise, but makes it harder to do server-server asynch.
Reference Properties and Parameters
And then we really throw ourselves off the cliff when we come to mapping WS-Addressing RefPs to HTTP. We can either map the RefP to the callback URI or to an HTTP header like Cookies. The best that I can think of for this would be to layer WS-Addressing to HTTP Cookies, and any RefPs are echoed back in the HTTP Cookie header. Then the receiver casts the Cookie header into SOAP header blocks (which is very similar to what we had to do for the WSDL 2.0 Abstract Data Feature when binding Application data to either SOAP or HTTP).
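A rough sketch of the cookie layering, assuming each RefP can be reduced to a simple name and a token-like value. Real RefPs are namespace-qualified XML elements, so this flattening is a big assumption, and the function names are made up. Parsing uses `SimpleCookie` from Python's standard library.

```python
from http.cookies import SimpleCookie
from typing import Dict


def refps_to_cookie_header(refps: Dict[str, str]) -> str:
    """Echo the RefPs back as the value of an HTTP Cookie header."""
    return "; ".join(f"{name}={value}" for name, value in refps.items())


def cookie_header_to_refps(header: str) -> Dict[str, str]:
    """Receiver side: recover the RefPs so they can be cast into SOAP header blocks."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {name: morsel.value for name, morsel in cookie.items()}
```

Anything beyond simple tokens (nested XML, namespaces, characters that are illegal in cookies) would need an encoding rule on top of this.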
I've thought a lot about how RefPs could be mapped to URIs, and there are lots of ways to do it. The idea would be that a client would get a RefP, then bind the RefP to the URI it uses for the callback. Maybe if RefPs were constrained in a bunch of ways - only 1 RefP and a particular QName to URI mapping is defaulted - then it could work. It certainly requires some rules.
NetNet
I've looked at 4 areas of WS-Addressing and how they could be designed to work with HTTP as a Transfer protocol. It's not pretty, it's hard, it's possible, but I'm not sure of the value of it. WS-Addressing focuses a lot on asynchronous communication, which seems to require some heavy squinting to get working with HTTP. If the WS-A group undertook to do the "WS-A HTTP Transfer Binding", it would have to do all the previous work.
Would WS-Addressing software be able to access Web resources? Would WS-Addressing software use HTTP as a Transfer protocol and thus integrate better into the Web? I'm not so sure.
Here's a test case: Would the Atom protocol switch to using WS-Addressing and then use the HTTP as Transport binding(s) and HTTP as Transfer binding? Seems to me not likely. The Atom folks that want to use HTTP as Transfer have baked the verbs into their protocol, and they won't want to switch away from being HTTP-centric. Likewise, I don't see the SOAP-centric folks wanting to "pollute" their operations and bindings with HTTP-isms.
I would love it if there was a reasonable way to bridge the SOAP/WS-Addressing world and the HTTP Transfer protocol world, but I just don't see that each side really wants the features of the other side. The SOAP/WSA folks want the SOAP processing model for asynch, and don't care about the underlying protocol. The Web folks want their constrained verbs and URIs and don't care about the SOAP processing model.
At any rate, this is a strawman analysis of the costs and benefits of WS-Addressing using HTTP as a transfer protocol, and will hopefully help spark some discussion.
Chris Ferris blogged about loose coupling and WSDL versioning, particularly that deploying a new and compatible version of a service at the same URI as the previous version of the service is a good thing. I totally agree. The example that he uses is when the data type of an operation is extended, and I had blogged about ways that a service can evolve in compatible or incompatible ways.
But what can a service provider do if they haven't written in an extensibility point in their schema and they don't have a processing model for unknown extensions? They can make use of the SOAP header block extensibility and the WSDL 2.0 application data feature and soap module. Also, the HTTP Protocol makes use of Headers for application data outside of the HTTP Body, such as HTTP Cookies and Content-Location.
WSDL 2.0 Application Data Feature
It is often the case that service data can evolve in a compatible manner, but the Schema wasn't designed for extensibility. One solution is to do the extensions as a SOAP header block. BEA and Sonic convinced the WSDL 2.0 working group that there needed to be a way for WSDL 2.0 services to deal with SOAP header extensibility, which we called the Application Data Feature.
The purpose of this feature is that the operation with the unextensible schema can be augmented by additional schema information. The new operation uses the Application Data Feature to specify the schema for the extension, and then this is serialized into SOAP headers.
All we do is define the extension type (which we have to do no matter what..) use the Application Data property to refer to that type, and then turn on the Application Data SOAP Module.
WSDL 2.0 with AD
I show below Chris's example with the Application Data feature and soap module with an updated response without the wildcard, sans some of the xml grungery like namespaces, and some of the irrelevant wsdl constructs.
<description>
  <types>
    <xsd:schema targetNamespace="http://example.com/tns"
                xmlns:tns="http://example.com/tns">
      <xsd:complexType name="StockQuoteResponse">
        <xsd:sequence>
          <xsd:element name="Symbol" type="xsd:string"/>
          <xsd:element name="Value" type="xsd:float"/>
        </xsd:sequence>
      </xsd:complexType>
      <xsd:complexType name="ExtensionDataType">
        <xsd:sequence>
          <xsd:element name="StockQuoteResponseExtension" type="tns:StockQuoteResponseExtension"/>
        </xsd:sequence>
      </xsd:complexType>
      <xsd:complexType name="StockQuoteResponseExtension">
        <xsd:sequence>
          <xsd:element name="Hi" type="xsd:float"/>
          <xsd:element name="Low" type="xsd:float"/>
          <xsd:element name="NetChange" type="xsd:float"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
  </types>
  <interface name="StockQuote">
    <operation name="GetQuote">
      <input element="StockQuoteRequest"/>
      <output element="StockQuoteResponse">
        <property uri="http://www.w3.org/2004/08/wsdl/feature/AD/data">
          <constraint xmlns:tns="http://example.com/tns">
            tns:ExtensionDataType
          </constraint>
        </property>
      </output>
    </operation>
  </interface>
  <soap:binding name="StockQuoteSOAP" interface="StockQuote">
    <soap:module uri="http://www.w3.org/2004/08/wsdl/module/AD" required="true"/>
  </soap:binding>
  <service interface="StockQuote">
    <endpoint binding="StockQuoteSOAP" address="foo"/>
  </service>
</description>
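For a feel of what ends up on the wire, here is a small Python sketch that builds a response for the WSDL above, with the original StockQuoteResponse in the SOAP body and the extension data in a SOAP header block. The exact serialization rules belong to the Application Data SOAP module; the namespace-qualified child elements, the function name, and the sample values are my assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
TNS = "http://example.com/tns"  # target namespace from the schema above


def build_response(symbol, value, hi, low, net_change):
    """Return a SOAP envelope string: V1 data in the Body, extension data in the Header."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # Extension data travels as a header block, so the V1 body schema is untouched.
    ext = ET.SubElement(header, f"{{{TNS}}}StockQuoteResponseExtension")
    for name, number in (("Hi", hi), ("Low", low), ("NetChange", net_change)):
        ET.SubElement(ext, f"{{{TNS}}}{name}").text = str(number)

    # The body is exactly what a V1 client already understands.
    quote = ET.SubElement(body, f"{{{TNS}}}StockQuoteResponse")
    ET.SubElement(quote, f"{{{TNS}}}Symbol").text = symbol
    ET.SubElement(quote, f"{{{TNS}}}Value").text = str(value)
    return ET.tostring(envelope, encoding="unicode")
```

A V1 client that only reads the Body never sees Hi, Low, or NetChange, which is the whole point of putting the extension outside the body.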
HTTP Headers
HTTP Headers are another example of application extensibility outside the body. HTTP Cookies are often used by applications to contain state or state-identifying information. And in RESTful systems, the HTTP Content-Location header provides a URI for where the Web resource exists. Applications that use HTTP as an application protocol make use of the Content-Location header, such as following an HTTP POST request. The application data feature brings the ability to examine HTTP headers to the HTTP Binding of WSDL 2.0.
Potential problems
There are 2 significant potential problems with this feature. The first is that it is done with Features and Properties, which some large vendors may not implement and which has a minority opinion against it. We might need to change this to a WSDL extension. The second is that it is optional. Some folks, notably IBM, believe that application data should never ever be in SOAP header blocks and couldn't live with this being mandatory. I can't figure it out, but that's the way life is. As it is optional, it is likely that not all implementations will provide this facility. I propose that customers should ask for and demand implementation of this feature in their WSDL 2.0 toolkits. That way they can get data extensibility in the likely scenario that their operations are not all perfectly designed for all time in the first version.
Tim's a little bothered by WS-Addressing introducing instances of stateful services into Web services, and correctly asks what the difference between stateful Web services and distributed objects is. IMO, the real answer is both not much and yet enough.
Here are a few major technical factors to consider when evaluating Web services versus distributed objects:
1) Extensibility/Versioning,
2) State,
3) Re-usable verbs,
4) Remote invocation style.
There are also political factors, namely who's at the table.
Extensibility/Versioning
Web services, by the use of XML and extensibility mechanisms, can be more loosely coupled than distributed objects because of the evolvability of the interface. HTML is the poster child for this decentralized ability. XML with namespaces gives a lot of potential for evolvability, but we've half-blown it with Web services because of the difficulty of extensibility - particularly lack of must ignore rules, the Schema UPA rule, and the lack of a default extensibility model in Schema. We probably have enough extensibility to make Web services != distributed object extensibility, but Web service implementations by and large are as brittle as distributed objects. :-( And we're still doing it, because the WSDL 2.0 group won't commit to a versioning story for services - the techy issue is that WSDL 2.0 doesn't provide a service with a way of indicating whether a revision is compatible or incompatible with the earlier service.
State
There are 2 Webs that are out there wrt state:
1) Web resources that have a URI, are stateless, and work with HTTP GET - this is called "on the web";
2) Web resources that require some state or data - through HTTP Cookies or POST data - that are not "on the web". Not a lot of people understand that an HTML FORM POST result is not "on the web" because there's no URI for the result. But that's ok!
They both work and scale just fine. Let me say that again: Stateful services scale fine. Tim couples stateful with pinning to a server, "If a service implementation requires pinning to a particular server to work, then it isn't going to work in a real enterprise environment." Firstly, pinning to a particular server actually does work in an enterprise environment. Seriously. Secondly, there are lots of ways to migrate state from one node to another without the client knowing. BTW, this is one reason why BEA has lobbied hard for mutable EPRs. Point being, stateful services can scale just fine.
There is a difference between stateful and stateless when it comes to system properties other than scalability. They do have dramatically different properties for intermediaries as the "on the web" resources can be secured, cached, inspected, etc. more efficiently. I think it's pretty safe to say that the cost of doing security on the web, with SSL + HTTP Authentication or Cookies, is simpler than security of web services, with WS-Security + WS-Trust + WS-SecureConversation + WS-Policy + WS-SecurityPolicy + WS-I Basic Security Profile + ? WS-I Trust Profile? The property that Web services give you is way more flexibility, but it comes at a significant cost. Stateful services can hinder some of the properties other than scalability, but that seems to be ok in the scenarios that the flexibility calls for.
The reality is that there's a big chunk of the web that is not "on the web" and is stateful, and so we can't just say the web is about stateless and any introduction of stateful web services means breaking the web. The horse left the barn going after the HTTP cookie.
Re-usable Verbs
The really great stuff on the web happens with 1 verb: GET. POST is basically a no-op, and PUT and DELETE usage is almost non-existent. That one verb does all the magic that we think of for the Web. But Web services can't really use that verb. I think that Web services really blew it by not providing a default SOAP binding to HTTP GET. Nobody uses the soap-response MEP (which is HTTP GET + SOAP response) because it takes SOAP people out of the SOAP data model on the sending side. The worlds of SOAP and Web are pretty much separate because SOAP can't really use the Web resources.
It also turns out that we don't have any reasonable way of bringing the XML world into the URI world (I've bashed at this problem for sooo long it's painful), so the re-usable HTTP GET verb doesn't really "see" xml in nearly the same way that SOAP does. Maybe we'll be able to get the benefits of HTTP GET by the adoption of WS-Transfer (or my proposed WS-GET subset) but most Web services folks just go on making new verbs for read and write operations without thinking about re-usable GET. And besides, I don't hear the thunder of WS-Transfer GET intermediaries.
We do remember that distributed objects were all about custom operations right?
Network knowledge
Speaking of RPC, one of the "knocks" on distributed objects is that they are RPCs. This is allegedly bad because the client and server are coupled. When I say RPC, I mean that a client makes a synchronous and custom method invoke on a remote machine. I don't mean that SOAP RPC or encoding are used, those aren't really the issue. Yet Web services are almost all RPC style invokes. It doesn't matter if Doc/literal is used if the soap body contains a custom method and a synchronous invoke is done. The Web is fairly RPC-ish too, it's just a standardized set of verbs and some other constraints. An HTTP GET on uri foo.com/whatever is certainly a synchronous remote method invoke.
But people have forgotten what I think is the big reason that RPC got into trouble, which was the "R" part. Distributed objects tried to make the remote procedure call look as if it was local. The idea is that you take a local service and just "remote" it. And this breaks because you have to know about the network, particularly latency and reliability. The Web came along and showed that the application MUST know that it is making a remote invoke and that the network can't be abstracted away from the application.
Ironically, most of our tools do the same kind of thing that we did with RPC, by autogenerating SOAP/WSDL wrappers for services. Even more ironically, we decided that "RPC/Encoded" was bad for interop, that we should move to Doc/Literal, and yet customers are screaming about the lack of interop of Schema. Ooosh.
In spite of that, we do seem to be finally moving towards an interface centric design philosophy. The WSDL, Schema, SOAP, etc. are front and centre in people's minds. And this is a far better place to be than we were with distributed objects.
Another aspect of RPC is the synchronicity. Web services are finally getting standards around Asynchrony, particularly WS-Addressing. For the most part, Web services are synchronous interactions. Effectively, Web services is remote method invokes but with knowledge of the remoteness.
Where are we?
I've shown some pretty important technical facets - extensibility, versioning, state, verb re-usability, synchronicity - where Web services aren't that different from distributed objects. There is an issue that the bulk of Web services can't take advantage of Web infrastructure. Sure, Web services use XML with Namespaces and that buys a lot for interoperability. The knowledge of the network is an important differentiator between Web services and distributed objects.
The challenge for anybody to prove that Web services = or != distributed objects is to quantify the differences or similarities in actual architecture terms - like identity, state, lifecycle, verbs, synch/asynch, message exchange patterns. Show how Web services are more or less brittle than distributed object technology at a technical level. Not just "Oh, Web services are SOA and distributed objects are objects and we all know services are better than objects." That's yucky thinking.
Web services are pretty close to distributed objects at a technical level but Web services != distributed objects at political level because we roughly have all the big vendors working together. It would be nice if the distributed object folks wanted to try some new approaches (hey, URIs!) but we'll get Web services to work technically and politically because the technical differences are the important ones (remote knowledge) and the politics are better.
Issue #1 afore the WS-Addressing Working group is the relationship of EPRs to URIs. I was an obvious target for owning this issue as I have been on the W3C TAG and have been a frequent TAG/WS group liaison.
The issue is that the Web Architecture is really built upon using URIs for identifiers. But there are lots of deployed components using web protocols (notice that I don't say "on the web") that use URIs+ for identifiers. Like Cookies. QNames. EPR Reference Properties. I wrote earlier this year about RefProperties, Cookies and Frag-ids. DonB responded about EPRs and Queries, but he's since moved his blog and I can't find the entry.
Akin to a legal opening argument, I intend to show that an XML-based identifier system in addition to URIs has potential upsides. These upsides will be defined in terms of the REST thesis properties.
I will keep the issue to the identifier issue, and not let it bleed into whether stateful or stateless Web services are good or bad. If you're interested, some of us are nattering away on www-ws@w3.org about it, such as "costs of stateful/stateless", and I even disagree with a part of Roy's thesis on stateless benefits.
I always appreciate it when Mark mentions me, but I'm not sure what warnings I'm heeding. I've always been a believer in pragmatic software and network architectures, and using formalisms to enable better discussion. Thinking of who I've heeded, a variety of people have helped my understanding of distributed systems architecture, like Roy Fielding, Don Box, Adam Bosworth, Tim Bray, TimBL and even long ago Rohit Khare. Thanks all.
I do wish that the WS-Arch WG had formally defined some of the constraints and properties of the Web service architecture similar to the REST thesis. :-( It would have been handy. For now, I'll have to make up the SOA/Web service properties as I go along.
At the WS-Addressing f2f, we've been talking about lifecycle of resources and identifiers. This has motivated me to finally start writing about time, lifecycle and garbage collection in distributed computing. Almost every higher level ws- spec has some solution or design for this, ie WS-Eventing, WS-Notification, WS-ReliableMessaging, WS-Transactions, WS-ResourceFramework, SOAP-Conversation, etc. There seems to be a taxonomy of decisions that can be made wrt gcol. I'll provide this taxonomy, and an analysis of some WS- specs' lifecycle decisions.
Design decisions
The main decisions that a resource owner can offer for garbage collection are:
1. Timeout of resources.
2. Terminate messages.
3. Start messages.
4. Renew/refresh messages.
Timeouts
Timeouts are a mechanism to declare that a resource will not be available, probably deleted or garbage collected, at a certain point in time. The most common timeout is to use absolute times. Another variation is the use of duration. The main purpose of a timeout is that all users of a resource will know that the resource is no longer available after certain points of time.
A main problem of time-based garbage collection is that the user(s) of a resource and the resource may have differences of opinion on time, commonly called skew. This breaks down into a false positive and false negative on time opinion. If the user is ahead in time, then it may prematurely think that a resource has expired and prematurely terminate its use of the resource. If the user is behind in time, then it will incorrectly believe that a resource is still available for use and retain its reference and perhaps even emit messages to the resource.
There are protocols available to synchronize time in distributed systems, NTP being a good example. This helps mitigate the problem of clock skew.
Another problem with using time is that a user may not have a clock, perhaps a cheap embedded device. It seems like most connected devices have a clock.
An advantage to timeouts is that it can enable distributed garbage collection without the use of explicit protocol messages. Without timeouts, a protocol to explicitly terminate resources is required.
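The two skew failure modes can be shown with a few lines of Python. This is only an illustrative sketch: the function name and the idea of modelling the user's clock as the resource's clock plus a fixed offset are my assumptions.

```python
from datetime import datetime, timedelta, timezone


def user_view(resource_now, user_offset, expires_at):
    """Classify what a user with a skewed clock concludes about an absolute timeout.

    resource_now: the time according to the resource owner
    user_offset:  how far the user's clock is ahead (positive) or behind (negative)
    expires_at:   the absolute expiry time advertised by the resource
    """
    user_now = resource_now + user_offset
    user_thinks_expired = user_now >= expires_at
    actually_expired = resource_now >= expires_at
    if user_thinks_expired and not actually_expired:
        return "premature expiry"  # user is ahead: drops a resource that is still live
    if not user_thinks_expired and actually_expired:
        return "stale reference"   # user is behind: keeps using a collected resource
    return "agree"
```

For example, with an expiry at noon, a user whose clock runs 30 seconds fast gives up on the resource at 11:59:50 resource time, while one running 30 seconds slow is still sending messages at 12:00:10.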
Terminate Messages
Speaking of termination protocols, this is where the user or the resource can inform the other side that the resource is no longer available or of interest.
The advantage of a user terminate message is that it allows a resource to be garbage collected earlier than might otherwise be possible. Without it, a timeout is required or resources might never be garbage collected. The downside is that user terminate messages may be lost in transit and the resource is gcolled later than it ought to be. This is the classic problem of memory leaks.
The advantage of a resource terminate message is that it allows a user to delete its reference to the resource, saving it some costs and preventing potential extraneous messages. The downside is that a resource-originated terminate requires that the user has implemented and deployed the terminate message.
Create Message
Invariably, resources that require garbage collection must be created. This can be done implicitly or explicitly. An explicit start message from a client will result in a resource being created. An alternative is to use an implicit start message, which is where another message type is overloaded to include the start message.
The advantage of an implicit start message is that network traffic is reduced.
The advantage of an explicit start message is that a resource may have the time needed for its purposes, such as buffer creation. It also enables the client and service to negotiate the resource. A client may not want to accept the timeout on a resource because it is too short, so overloading with a higher level message would be bad behaviour.
Renew Message
When resources have timeouts, the resource may want to offer clients the opportunity to renew their interest. This enables a service to offer shorter timeouts than might be otherwise possible.
There is a trade-off between shorter timeouts and more renew messages that the resource has to make.
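Here is a minimal sketch of a resource lease that pulls together the timeout, terminate, and renew decisions, plus a back-of-envelope count of the renew traffic a given timeout implies. Time is plain seconds, and the class and function names are invented for illustration; no WS- spec defines them.

```python
import math


class Lease:
    """A resource lifetime with a fixed timeout, explicit renew, and explicit terminate."""

    def __init__(self, now, timeout):
        self.timeout = timeout
        self.expires_at = now + timeout
        self.terminated = False

    def is_live(self, now):
        return not self.terminated and now < self.expires_at

    def renew(self, now):
        """Renew message: push the expiry out by one more timeout period."""
        if not self.is_live(now):
            raise RuntimeError("lease already expired or terminated")
        self.expires_at = now + self.timeout

    def terminate(self):
        """Terminate message: allow collection before the timeout fires."""
        self.terminated = True


def renewals_needed(lifetime, timeout):
    """Renew messages needed to keep a resource alive for `lifetime`,
    assuming each renew is sent just before the current lease runs out."""
    return max(0, math.ceil(lifetime / timeout) - 1)
```

Under these assumptions, keeping a resource for 60 seconds costs 5 renews with a 10 second timeout and none with a 60 second timeout, but the short timeout reclaims an abandoned resource up to 50 seconds sooner.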
Specifications life-cycles
Examining some specific specifications:
- SOAP-Conversation: Implicit conversational beans. No timeout, implicit terminate, start, and no refresh messages.
- WS-Addressing: EPRs. No timeout, terminate, or refresh messages. There is an implicit start with a ReplyTo or FaultTo header.
- WS-Eventing: Subscription containing an EPR. Absolute and duration based timeout, explicit user terminate (Unsubscribe), explicit resource terminate (Subscription End), explicit start (Subscribe), refresh messages (Renew).
- WS-ReliableMessaging: Sequence containing EPR. Timeout (absolute), explicit user terminate (TerminateSequence), explicit resource terminate (Fault), explicit start (Start Sequence), no refresh message.
- WS-Coordination: CoordinationContext containing EPR. Timeouts, explicit start (CreateContext), extension defined terminate or refresh messages.
- WS-Security: Security header block. Timeout, no messages.
- WS-SecureConversation: SecureConversationContext. Leveraging WS-Security for timeout, explicit start (RequestSecurityToken), no terminate or renew messages.
- WS-Trust: Leveraging WS-SecureConversation for timeouts and RequestSecurityToken, and a renew (Renew) message.
Conclusions
It's interesting to observe the similarities and differences between these different resource lifecycles. The main commonality seems to be the use of timeouts, but the start, refresh and terminate messages are different. An obvious question is whether it would be useful for EPRs to have a timeout/expiry value as a refactoring from all the other specs. Hopefully this will help in any such discussions.
Sam talks about WS-HTTP, and where we could be with Web services and HTTP. Thanks for the mention of WS-Get Sam! I'm stoked that some folks were talking about this at sells-con. I shoulda been there...
Sam mentions the use of WSDL Binding to handle SOAP messages that are XML sans SOAP envelopes. Remember I want WS-Get to have a binding to HTTP Get...
I tried really really hard in WSDL 2.0 to make it easy for the developer to describe an operation and then bind it to SOAP/HTTP or XML/HTTP. The idea is that a WSDL operation is what the contract is. While this is better in WSDL 2.0, it's still arguably too broken. :-( There's a number of central problems that we can't/couldn't resolve.
1. SOAP 1.2 shoulda never done the soap-response mep. It should have allowed an HTTP GET to be a legal binding of a soap envelope, remember it's "all about the infoset". I'm not bitter... However, if we can say that WS-GET has a binding to HTTP GET (as I proposed), then we can describe all the HTTP GET services as WS GET services. So I don't think it's too late but we would need WS-Get which includes a binding to HTTP Get.
2. WS-Addressing EPRs cannot be bound to just URI or URI+http header or URI + cookie. Which means that all the WS- toolkits that publish CORBA IORs whoops I mean WS EPRs can't use HTTP GET or SOAP/HTTP interchangeably. Maybe the W3C WS-A group will solve this problem AND maybe the WS-A toolkits will use the solution.... One could argue that using EPRs that have no ref properties and just an Address is a solution, but I think most of the WSA EPR users will use ref properties. Our software does.... I did try to get the WS-RF folks to provide something on this (what I called WS-REST), but they said no way.
3. WS-Addressing EPRs cannot be constructed at the client, unlike URIs that are HTML form encoded, as we don't have a WSDL/Policy annotation. HTML does this using <form>. We did this in WSDL 2.0 for URIs for HTTP and SOAP-response MEP. I still think it's a bad subset of xml (like missing namespaces and attributes!!) but it's at least there. WS-RF has a bit of a leg up because it has a WSDL annotation that provides a binding between properties and EPR construction. But of course there's no integration/commonality between WSDL 2.0 URI construction, WS-RF Property EPR construction, and completely missing WS-Transfer/WS-Get EPR annotation.
4. Creating a WSDL operation that can be bound to both HTTP and SOAP/HTTP (say using WS-Transfer) is hard, as I found out in my Atom SOAP and HTTP WSDL 2.0. Like:
- How is an application HTTP header and a SOAP header expressed in the interface operation? (And damn, I've had to argue a lot in WSDL 2.0 for the "application data feature".) An example in Atom is the use of the HTTP Content-Location for the POST result.
- Supporting "Wrapped" elements, as Atom does, results in duplicating all the data structures
- Large numbers of URIs (as occurs with HTTP) result roughly in a binding element for each endpoint (URI), which is too complicated compared with the slickness of the SOAP binding. For example, GetEntry takes Service + Endpoint + Binding + Operation + Input elements. One would think it could be *a little* simpler.
- Different infrastructure policies (aka security) that are binding-specific are complicated.
WSDL 2.0 + WS-Get + WS-Addressing comes close to a good enough job of mixing the two bindings from an abstract interface to actually deliver WS-HTTP, but there's a couple of things missing. Oh, did I mention WSDL 2.0 still might accept Last Call comments, WS-Get doesn't yet exist, and WS-Addressing *just* got started? There's still some time if we want to make WS-REST (my term) or WS-HTTP (Sam's term) a reality.
We need another WS- spec, WS-Get. It's the Get subset of WS-Transfer that just has Get. In the same way that HTTP Put/Delete are used less than Get, so I think that WS-Transfer verbs other than Get will be used less than Get.
WS-MetadataExchange is a perfect example of a spec that could use WS-Get. WS-Mex can't really refer to WS-Transfer because there's too many extra verbs. With WS-Get, WS-MEX could define GetMetadata and refer to WS-Get, and WS-Transfer could define Create/Update/Delete and refer to WS-Get. Cool.
Where we can still get to is a place where applications can choose:
- generic reads, generic writes: WS-Get + refactored WS-Transfer
- generic reads, specific writes: WS-Get + application specific verbs
- specific reads, specific writes: application specific verbs
I've argued the middle choice is a sweet spot that we should enable, ie Web services needs transfer protocols and specific protocols.
There's one additional piece that I would put into WS-Get, which is an optional binding to HTTP Get. As Chris said, SOAP 1.2 should have done a binding between SOAP infoset and HTTP GET + URIs. I'd proposed a SOAP RPC URI binding a long time ago. This and other proposals were rejected in favour of the soap-response MEP, which I think was a terrible solution. But the story isn't over, as WS-Get + binding to HTTP GET can give the advantages of SOAP + re-use of HTTP. I think there's more that WSDL 2.0 should do, but that's a separate issue.
WS-Get can enable HTTP GET for Web services read operations (including GET) and HTTP POST for all other Web service operations. Then the SOAP folks get what they want (the soap and wsdl extensibility models) and the REST folks get what they want (the use of URIs and use of HTTP GET).
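The dispatch rule in that last paragraph is simple enough to sketch in a few lines of Python. Since WS-Get doesn't exist yet, the action URI below is purely a placeholder, and the function name and the returned dictionary shape are assumptions as well.

```python
# Placeholder action URI: WS-Get is only a proposal, so no real URI exists for it.
READ_ACTIONS = {"urn:example:ws-get:Get"}


def bind_to_http(action, address, body=None):
    """Pick the HTTP method for a Web service operation.

    Generic reads go out as HTTP GET on the resource URI with no body;
    every other operation is sent as an HTTP POST carrying the SOAP message.
    """
    if action in READ_ACTIONS:
        return {"method": "GET", "url": address, "body": None}
    return {"method": "POST", "url": address, "body": body}
```

This is the "generic reads, specific writes" choice: the read is visible to Web infrastructure as a plain GET, while application specific verbs keep the SOAP processing model.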
A language designer makes a number of significant choices when designing or extending a language. I thought I'd list some of the major extensibility and versioning choices that a designer makes, either implicitly or explicitly.
I haven't provided schemas for the name constructs, nor definitions of backwards/forwards compatibility, as those are provided frequently in other entries that can be seen from my compatibility page.
I thought I'd write up and share a fairly quick comparison of WS-Addressing and WS-MessageDelivery. I wrote this up to help some of our folks do a comparison, and it might be useful to other folks. I'm sure I've made some errors and people will let me know, so count on another revision or 2 :-)
The SOAP Headers are very similar, matching up almost exactly.
The WS-Addressing specification relies on the contents of run time messages to implicitly define message exchange patterns when just WS-Addressing is used and WS-Policy for interactions that are WS-Addressing dependent. For example, WS-ReliableMessaging uses WS-Addressing, and WS-RM is attached to a Web service by WS-Policy statements. WS-Addressing uses Endpoint References for referring to service instances.
WS-MD provides WSDL extensions and features and properties for a number of its functions. To a certain extent, WS-Addressing as part of the WS-* stack uses WS-Policy for extensibility and WS-MD uses features and properties for extensibility. WS-MD uses a WS-Ref structure to identify service instances.
Dare Obasanjo has published an article on designing extensible, versionable xml formats, and it heavily references an article that I published on Versioning XML vocabularies.
I've had a few people ask me about the differences between the articles, and what are the reasons for the differences. I thought I would compare and contrast the articles. Firstly, let me say that I'm really pleased to see Dare's article and I was glad to be a reviewer. In general, I think there is a lot of overlap between the articles. That is goodness: we agree, and hopefully the industry can adopt common guidelines that will help everybody. And where we disagree, I think that the differences aren't that significant compared to the areas where we do agree.
Where we agree is roughly that vocabularies should/must:
- be designed for backwards and forwards compatibility,
- provide for extensibility of attributes and elements,
- Use Must Ignore unknown extensions rule as the simplest way for forwards compatibility,
- Use or provide a must Understand rule to over-ride Must Ignore,
- Use a Schema design technique if it is desirable to write compatible V2 Schemas.
We both state that using a new namespace for new constructs suffers from a significant problem that forwards compatible schemas cannot be written if all new constructs are done in a new namespace. This is because of the limitations of the schema wildcard, in that it can't differentiate between anything finer than "##other" namespace. If a namespace owner uses another namespace for an extension, and they want to retain extensibility, then they can't write a new schema because of non-determinism. I offered an Extension Element technique, and Dare offers a delimiter technique, for writing schemas that are forwards and backwards compatible.
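The Must Ignore and Must Understand rules from the list above boil down to a very small processing model. Here is a hedged sketch in Python; representing each element as a (name, value, must_understand) tuple and the set of known names are simplifications of mine, not anything from either article.

```python
KNOWN_NAMES = {"Symbol", "Value"}  # what this V1 receiver understands (illustrative)


def process(elements, known=KNOWN_NAMES):
    """Apply Must Ignore with a Must Understand override.

    elements: iterable of (name, value, must_understand) tuples.
    Returns the content this receiver understood.
    """
    accepted = {}
    for name, value, must_understand in elements:
        if name in known:
            accepted[name] = value
        elif must_understand:
            # The sender marked this extension as required: fail instead of ignoring it.
            raise ValueError(f"cannot process required extension {name!r}")
        # Otherwise Must Ignore applies: skip the unknown extension silently.
    return accepted
```

A V2 sender can add NetChange as an ignorable extension and a V1 receiver keeps working (forwards compatibility); flagging it must-understand turns the same extension into an incompatible change.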
I've written up a WSDL 2.0 specification for Atom 0.3. It's based upon Randy's work on WSDL 1.1.
My proposal for WSDL 2.0 Atom is at http://www.pacificspirit.com/Authoring/wsdl/atom3.1.wsdl2
I have a number of questions and comments about the WSDL 1.1 and WSDL 2.0 as a result:
I've argued for a while now that extensibility and versioning are important topics, and it's incredibly important for data formats to plan for evolvability. I've already argued that you must have substitution mechanisms in place for V1, otherwise it's impossible to evolve formats. I've refined my argument that compatibility should be thought of at the message level, and how to think about synch/asynch compatibility as combinations of compatibility of the messages that make up the message exchange pattern.
But what about protocols? Can we cast the same logic that applies to data formats to protocols? I think we can apply some of the rules for format extensibility to protocols but not as many and the effect isn't as useful as we'd like.
It turns out to be harder to plan for evolvability in protocols for a few reasons: the substitution rules for one protocol to another are much harder than format substitution, and we want earlier indication of whether a new protocol or protocol extension is understood than we can typically live with for formats. I will show how the advocated format rules of: Provide Extensibility, Must Ignore unknown extensions, Re-use namespace names in compatible evolution, and Provide Must Understand could be adapted for protocol evolution but then show the deficiencies in this approach.
The WSDL WG is currently working on the description of bindings for Web services. We've started working on some interesting issues and potential solutions around the relationship between application operations in either a generic or specific method interface, and the underlying protocol. I'm going to expand on the solution space I first proposed in http://lists.w3.org/Archives/Public/www-ws-desc/2004Apr/0004.html.
Let's take it for now that there is market pressure to fully utilize HTTP as a transfer protocol in Web services. That's an important debate (!) but I only want to look at the solution space in this entry. What I mean by a transfer protocol is, well, what HTTP means. It's the protocol used to transfer application state between nodes. So to fully utilize HTTP as a transfer protocol means to fully utilize the features of HTTP, specifically the various methods it has.
Contrast this with the predominant state of Web services, where most services send SOAP messages through the HTTP POST method, with the operation name encoded in a variety of places in the message. Whether POST is used primarily for the "feature" of firewall tunnelling, or for some other reason, is another interesting yet tangential debate.
These two uses of HTTP will be the end points of our spectrum of solutions for web services and HTTP. On one side is "strict" HTTP. There are NO application methods allowed, just HTTP. There is no "binding" per se, because the application operations ARE HTTP. On the other side is the "HTTP as transport". The application methods are all custom and when bound to HTTP, use HTTP POST.
Many people believe that these are the only 2 positions. It turns out there are a few other important combinations of application operations and protocol operations. We can see the combinations by examining the 2 different axes that can be mixed. The first axis is the style of application operations, and the second is the usage of the underlying protocol operations. There are 2 styles of application operations: generic or custom. By generic, we mean that they are constrained to a small set of operations, typically the CRUD methods (Create/Read/Update/Delete). The Web has been built upon an architectural style with generic operations, as well as other constraints. This style is called REST - Representational State Transfer. The converse then is custom operations, where the application can create arbitrary operation names.
It should be noted that the use of generic versus custom operations is not mutually exclusive. One could easily see a scenario where an application wanted a variety of "GET" methods, but also needs to indicate that they all follow the "GET" semantics of REST. One could argue that the "binding" should take care of this, but having the 2 operation names on a given operation means that they can be re-used for either multiple bindings to a given protocol or for use with protocols that do not have generic interfaces.
The usage of the underlying protocol is straightforward. The WSDL 1.1 SOAP HTTP binding only uses the POST method. It is obvious that an application may want to use other HTTP methods than POST.
Given these 2 axes, the solution space is roughly:
1. App protocol (custom) + HTTP as Transport.
This is the current WSDL 1.1 SOAP HTTP binding, which uses HTTP POST. The operation name is encoded either in the message content (RPC), an HTTP header (SOAPAction) or a SOAP header (WSA:Action).
2. App protocol (custom) + HTTP as Transfer
This allows the specification of an HTTP method for each application operation. This is in the editor's draft of the WSDL 2.0 Part 3: bindings document.
There may be some additional trickiness in encoding the operation name in the request. The client will probably have to follow an algorithm for ensuring that the request is self-describing, that is it contains the custom operation name. It may very well be that the URI used by HTTP will contain the operation name. For example,
<interface>
<operation name="getFoo"/>
</interface>
may only be allowed on URIs that are of the form
http://foo.org/Foo
and thus the "type" of the operation is actually embedded in the identifier for the resource. I proposed one algorithm for this in my proposal on SOAP HTTP GET binding a few years ago.
3. App protocol (generic) + HTTP as Transfer.
ATOM uses this style as its basis. If WSDL provided a definition of the generic operations for applications, then WSDL could probably also provide a direct mapping to HTTP for the generic operations. One solution is a "REST" interface with the GET/PUT/DELETE/POST operations.
4. App protocol (generic) + HTTP as Transport.
On the face of it, this seems like a strange scenario. This uses HTTP POST to contain HTTP methods. However, the ATOM specification has to do exactly this because sometimes DELETE and PUT cannot go through HTTP firewalls. Further, HTTP does not have a "GET" operation that can contain a body, such as an XML Query. Therefore the XQuery must be encoded in an HTTP POST request somehow, perhaps with a special SOAP header as I suggested in XQuery: Meet the Web.
5. App protocol (generic + custom) + HTTP as Transfer.
This would probably be a direct mapping to HTTP for the generic operations. The same issues raised earlier about encoding operation name(s) in the request apply.
6. App protocol (generic + custom) + HTTP as Transport.
This is a combination of scenarios 4 and 5, with the additional complexity of determining how the HTTP POST request should contain the operation name(s).
We see that there are 6 different possibilities for describing application operation name(s) and binding them to HTTP. I believe all 6 are useful, but methinks that the last 2 (or 3 or 4) will take some time and a separate blog entry or 2 to more fully describe.
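To make scenarios 2 and 4 a little more concrete, here is a minimal sketch in Python. It is not the algorithm from my SOAP HTTP GET proposal, and it is not what ATOM specifies. It assumes two made-up conventions: a verb prefix on the custom operation name (get/create/update/delete) selects the HTTP method and the rest of the name becomes the last path segment, and a hypothetical X-Tunnelled-Method HTTP header carries the intended method when only POST can get through. The names bind_operation, tunnel and untunnel are invented for this illustration.

```python
# Hypothetical conventions, for illustration only.
PREFIX_TO_METHOD = {
    "get": "GET",
    "create": "POST",
    "update": "PUT",
    "delete": "DELETE",
}
TUNNEL_HEADER = "X-Tunnelled-Method"  # made-up header name, not from any spec


def bind_operation(operation_name, base_uri):
    """Scenario 2: map a custom operation name to (HTTP method, URI)."""
    base = base_uri.rstrip("/")
    for prefix, method in PREFIX_TO_METHOD.items():
        if operation_name.startswith(prefix) and len(operation_name) > len(prefix):
            # The "type" of the operation ends up in the resource identifier.
            return method, base + "/" + operation_name[len(prefix):]
    # No generic semantics can be inferred: fall back to HTTP as transport.
    return "POST", base + "/" + operation_name


def tunnel(method, uri, body=b""):
    """Scenario 4: carry a generic operation inside an HTTP POST."""
    if method in ("GET", "POST"):
        return {"method": method, "uri": uri, "headers": {}, "body": body}
    return {
        "method": "POST",
        "uri": uri,
        "headers": {TUNNEL_HEADER: method},
        "body": body,
    }


def untunnel(request):
    """Receiver side: recover the generic operation the sender intended."""
    intended = request["headers"].get(TUNNEL_HEADER, request["method"])
    return intended, request["uri"], request["body"]


print(bind_operation("getFoo", "http://foo.org"))
print(untunnel(tunnel("DELETE", "http://foo.org/Foo")))
```

The fallback branch in bind_operation is where scenario 1 reappears: an operation name with no recognizable generic semantics simply goes over POST.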
Jim Webber and Savas Parastatidis wrote a nice article about why WSDL is not IDL. It's been interesting to watch the fall-out from that, such as Jim Webber's followup. Stefan Tilkov echoes my complaint about Schema 1.0 limitations, though WSDL isn't bound to Schema 1.0.
I think there are 2 separate points that can be articulated: what the IDL describes and what the IDL can express.
WSDL does not constrain the relationship between the interface description and the implementation of the described interface. WSDL and distributed object IDLs are all IDLs, that is, Interface Definition Languages. However, the key difference is that they describe the interface to different things. In dist-obj, IDL describes an interface to an object. That's the point, distributed objects. All the operations in a given interface are to the same object.
In Web services, WSDL describes an interface to a Service. What's a Service? That's the beauty: By leaving undefined what exactly the interface is for, web services achieves looser coupling than distributed objects. In particular, the coupling between the interface and the component(s) that implement the interface is relaxed for Web services (and I guess SOA). The important point is that WSDL relaxes the contract between the interface and the "service" implementation. One could argue that a definition of "SOA" is those architectures that loosely couple the component implementation to the interface, which WSDL does a good job of.
This relaxing of the coupling between the interface and the component naturally bleeds into the client view of the component, though not nearly as much as one might think. Given a stateless distributed object IDL and comparing it to a WSDL, there's little difference in the client's coupling to the interface.
I think there's a big win in WSDL not describing exactly what it is an interface for. Arguably, that's a big part of the "Web" part of Web services, where resources on the Web can only be reasoned about based upon the data they send back and the "resource" is completely opaque and unknown. To paraphrase, "there is no service neo".
Given that WSDL interfaces say nothing about the implementation of the interface and only the contract for what the interface can exchange, this leads to the second point on interfaces. I believe that WSDL's use of Schema - yes, I know WSDL allows other schema languages, but let's deal with current reality of 99.9% of implementations and deployments - for the most part keeps us locked into the tightly coupled client to interface model. I wrote about this a while ago in Web services = or != Distributed Objects. The difficulty in expressing the Web's level of extensibility and versioning in Schema often keeps us building applications that tightly couple the client to the interface and prevent interface changes.
There are some interesting possibilities in WSDL 2.0 that we've started to explore about relaxing some of the constraints that Schema imposes, but that work is early on. Those of you going to XML Europe should go to Henry Thompson's talk about 2 pass Schema validation to relax extensibility constraints, listed as a late breaking session.
In summary, interfaces can be used to reduce coupling between systems in two regards: the coupling between the client and the interface, and the coupling between the interface and the implementation. WSDL helps us a lot with the 2nd and somewhat with the first but it's also limited by Schema.
Saying things like "WSDL isn't an IDL" doesn't really help with the primary differentiation that I'm making. A more appropriate expression would be "WSDL isn't an object IDL", and even better "IDLs that do not constrain the IDL implementation provide for more loosely coupled systems".
Don Box has been talking about relaxing the constraints of things the clients must know about the implementation to facilitate interoperability, and the relaxing of the component/interface implementation is one very significant example of that.</blatant DB troll for a blog posting>
Regardless of what the interface describes and can express, there's always implementation realities. We, the vendors, haven't provided as consistent an experience for using XML Schema within WSDL and our programming languages as we'd all like. I think that's something that will gradually improve over time, but we probably could have made it easier by having a smaller Schema language.
Speaking of schema languages, I've also been doing some looking into RDF and OWL for modelling extensible components, and I'll be posting results of my investigations soon.
As I've thought through the awesome deployment of web software and the relationship to dealing with extensibility and versioning, I keep on coming up with more instances of extensibility in Web specs. I've been involved in discussions in a bunch of groups about what's the right way to do extensibility/versioning. I hope that by referring to a fairly exhaustive list of the areas that the Web arch has extensibility + Ignore rule, it will provide a canonical reference point for ongoing analysis.
We listed a few of these in the Web Architecture document section on general principles on Extensibility, so I'll start with that and expand.
HTML
HTML 2.0, rfc 1866, contains the following text in 4.2.1 Undeclared Markup Handling
To facilitate experimentation and interoperability between
implementations of various versions of HTML, the installed base of
HTML user agents supports a superset of the HTML 2.0 language by
reducing it to HTML 2.0: markup in the form of a start-tag or end-
tag, whose generic identifier is not declared is mapped to nothing
during tokenization. Undeclared attributes are treated similarly. The
entire attribute specification of an unknown attribute (i.e., the
unknown attribute and its value, if any) should be ignored.
Thus HTML Elements follow a Must Ignore, and attributes follow the Should Ignore.
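A small Python sketch of this "reduce to the known language" behaviour follows. It is not how any real browser is implemented. KNOWN_TAGS, the known_attrs parameter and the reduce_markup function are made up for the illustration, and the set of known tags is deliberately tiny.

```python
from html.parser import HTMLParser

KNOWN_TAGS = {"p", "em", "a"}  # stand-in for "the tags this user agent knows"


class IgnoringRenderer(HTMLParser):
    """Maps undeclared tags and attributes to nothing, keeping text content."""

    def __init__(self, known_attrs):
        super().__init__()
        self.known_attrs = known_attrs
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in KNOWN_TAGS:
            kept = [(k, v or "") for k, v in attrs if k in self.known_attrs]
            attr_text = "".join(f' {k}="{v}"' for k, v in kept)
            self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in KNOWN_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text inside an unknown element is still passed through.
        self.out.append(data)


def reduce_markup(html, known_attrs=("href",)):
    parser = IgnoringRenderer(set(known_attrs))
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(reduce_markup('<p>Hi <blink speed="9">there</blink> <a href="x" glow="1">link</a></p>'))
```

The unknown blink element and the unknown glow attribute disappear, while the text they wrapped or decorated survives.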
HTTP
HTTP, RFC 2616, has a number of extensibility points that follow the ignore rule.
HTTP Response and Request headers
Section 5.3 says:
However, new or
experimental header fields MAY be given the semantics of request-
header fields if all parties in the communication recognize them to
be request-header fields. Unrecognized header fields are treated as
entity-header fields.
and Section 6.2 says:
However, new or
experimental header fields MAY be given the semantics of response-
header fields if all parties in the communication recognize them to
be response-header fields. Unrecognized header fields are treated as
entity-header fields.
Here we see the Must Ignore rule, as unrecognized header fields are treated as entity-header fields, which we shall shortly see follows Should Ignore.
HTTP Entity headers
Section 7.1 says:
The extension-header mechanism allows additional entity-header fields
to be defined without changing the protocol, but these fields cannot
be assumed to be recognizable by the recipient. Unrecognized header
fields SHOULD be ignored by the recipient and MUST be forwarded by
transparent proxies.
Here we see the Should Ignore rule, plus additional constraints on intermediaries.
HTTP Error codes
Section 6.1.1 says:
HTTP status codes are extensible. HTTP applications are not required
to understand the meaning of all registered status codes, though such
understanding is obviously desirable. However, applications MUST
understand the class of any status code, as indicated by the first
digit, and treat any unrecognized response as being equivalent to the
x00 status code of that class, with the exception that an
unrecognized response MUST NOT be cached.
HTTP error codes have a Must Understand on the class of the status code. By casting unknown subcodes to 00, this is effectively the Must Ignore rule for subcodes.
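As a sketch of the rule, assuming a client that recognizes only a small illustrative subset of codes (KNOWN_STATUS and effective_status are names invented here):

```python
KNOWN_STATUS = {200, 201, 204, 301, 304, 400, 404, 500}  # illustrative subset


def effective_status(code, known=KNOWN_STATUS):
    """Return (status to act on, whether the original code was recognized).

    The class (first digit) must be understood; an unrecognized code is
    treated as the x00 code of its class and must not be cached.
    """
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    if code in known:
        return code, True
    return (code // 100) * 100, False


print(effective_status(404))
print(effective_status(299))
```

A caller would use the second element of the result to decide that an unrecognized response is never stored in a cache.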
HTTP Chunked encoding
Section 3.6.1 says
All HTTP/1.1 applications MUST be able to receive and decode the
"chunked" transfer-coding, and MUST ignore chunk-extension extensions
they do not understand.
Cache Control Extensions
Section 14.9.6 says
This extension mechanism depends on an HTTP cache obeying all of the
cache-control directives defined for its native HTTP-version, obeying
certain extensions, and ignoring all directives that it does not
understand.
and
Unrecognized cache-directives MUST be ignored;
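A rough sketch of a cache that obeys this, in Python. KNOWN_DIRECTIVES is an illustrative subset and parse_cache_control is a made-up helper. The parser is simplified: it splits on commas and so would mishandle a quoted value that itself contains a comma.

```python
KNOWN_DIRECTIVES = {
    "no-cache", "no-store", "max-age", "private", "public", "must-revalidate",
}


def parse_cache_control(value):
    """Keep the directives this cache understands and ignore the rest."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        name = name.strip().lower()
        if name in KNOWN_DIRECTIVES:
            directives[name] = arg.strip().strip('"') or None
        # Unrecognized directives fall through here and are ignored.
    return directives


print(parse_cache_control('max-age=60, community="UCI", no-store'))
```

The community directive is the extension example used in Section 14.9.6; a cache that has never heard of it still behaves correctly on the directives it does know.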
Finishing HTTP
It's pretty darned obvious where SOAP headers, mustUnderstand, and Actor/Role came from if you think about HTTP extensibility.
URIs
The URI specification is designed for extensibility. It has a number of major sections that are extensible: schemes, domain, path, query, fragment identifiers. However, the "ignore" rule doesn't really apply. For all the sections excluding frag-ids the mustUnderstand rule is in effect. A browser or software that doesn't understand a scheme or domain of a URI will generate an error. And a server or other software that doesn't understand a path or query will generate an error.
However, frag-ids, which are interpreted by the client and not sent to the origin server, have typically been implemented with the Ignore rule. A browser that has retrieved a resource that does not understand the frag-id will ignore the frag-id. A representation that doesn't have an element with an attribute that matches the frag-id (ie name attribute in HTML) is still rendered.
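A minimal Python sketch of both halves of that behaviour, using the standard library's urldefrag. The locate function and its anchors mapping (anchor name to scroll offset) are hypothetical stand-ins for what a browser does once it has the representation.

```python
from urllib.parse import urldefrag


def split_for_request(uri):
    """Separate what goes to the origin server from what stays on the client."""
    request_uri, frag_id = urldefrag(uri)
    return request_uri, frag_id


def locate(anchors, frag_id):
    """Client side: find the frag-id in the retrieved representation.

    An unknown frag-id is ignored and the representation is shown from
    the top (offset 0) rather than raising an error.
    """
    return anchors.get(frag_id, 0)


print(split_for_request("http://example.org/doc.html#section2"))
print(locate({"intro": 10, "section2": 250}, "no-such-anchor"))
```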
CSS
CSS planned for forwards compatibility and applied the Must Ignore rule at each of its extensibility points. CSS Level 1 Forward Compatibility applies the ignore rule to a number of extensibility points:
Properties: a declaration with an unknown property is ignored
Illegal values: illegal values, or values with illegal parts, are treated as if the declaration weren't there at all.
At-keywords: an invalid at-keyword is ignored together with everything following it, up to and including the next semicolon (;) or brace pair ({...}), whichever comes first.
There are more explanations and cases in the section, all of which apply the must Ignore rule.
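The first two cases can be sketched in a few lines of Python. This is a toy, not a CSS parser: KNOWN_PROPERTIES holds two invented validators, and apply_declarations assumes a flat run of declarations separated by semicolons.

```python
# Hypothetical "CSS1 user agent": two properties, each with a legality check.
KNOWN_PROPERTIES = {
    "color": lambda v: v in {"red", "green", "blue", "black"},
    "font-size": lambda v: v.endswith("pt") and v[:-2].isdigit(),
}


def apply_declarations(block):
    """Apply known, legal declarations; treat all others as if absent."""
    applied = {}
    for declaration in block.split(";"):
        prop, sep, value = declaration.partition(":")
        prop, value = prop.strip().lower(), value.strip()
        is_legal = KNOWN_PROPERTIES.get(prop)
        if not sep or is_legal is None or not is_legal(value):
            continue  # unknown property or illegal value: ignore
        applied[prop] = value
    return applied


print(apply_declarations("color: red; text-glow: 3px; font-size: huge; font-size: 12pt"))
```

The unknown text-glow property and the illegal font-size value are both dropped without affecting the declarations around them.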
Others
Many other examples exist, such as XSLT and SOAP.
Conclusion
The Ignore rule, in the two variants of SHOULD and MUST, is the key rule that has enabled forwards compatibility for older software to deal with new extensions it doesn't know about.
I've spent a lot of time over the past few years thinking about the Web, Web services and distributed object technology, how they differ, and what the critical success factors are. Our industry talks a lot about the various reasons, with the (I guess) expected amount of hype: "Web services are great because they're new and improved. For you, they can even be red!".
But how are the Web, Web services and Distributed objects really different? What are the specific architectural differences? And I'm not talking about the "Web services are about coarse-grained components and distributed objects are about fine grained objects" kind of ambiguity.
One way that I think Web and Web services are different from distributed objects is that they make the data format on the wire (html, xml,..), the object references (URIs), the protocol messages (HTTP) and the description human readable. Well, maybe wsdl isn't the most human readable. Distributed objects on the other hand make the description (idl) human readable and the wire format, object references, and protocol messages binary. Now that might be a sufficient reason. There are plenty of others, such as the late binding of the information content to the user, the ability to have high performance due to ease of inspecting the HTTP message method and URI to determine equivalence, etc.
But I think there's another big and untouted reason: Extensibility. If you take a look at comparing the Web to distributed objects, the invocation mechanics are quite different. In distributed objects, you might say something like:
public interface PO {
getMyPo( in int poId, out PO purchaseOrder); }
whereas HTTP might be modeled somewhat like
public interface HTTP {
get (in URI address, out MIME body ); }
Now let's focus just a minute on the HTML side of the house, where much of the innovation happened, so we'll slightly modify the interface
public interface HTTP {
get (in URI address, out HTML body ); }
There are 2 very different aspects to each of the interfaces. The PO interface can be extended to new and arbitrary methods, and the PO class can also be extended. These are both arbitrarily extensible - that is, specific to the interface - as opposed to generic or uniform. HTTP, SQL, etc. are called uniform because the methods cannot be varied by the application programmer. HTTP verbs can only be extended by an extraordinary amount of work as this is via a standards process. And the same with official HTML.
You would think that given the arbitrary extensibility of a dist-obj style interface, this would have taken off. But there is something that is not expressed in the interface which is incredibly important. It's perhaps one of the biggest reasons why systems that focus on contracts rather than APIs tend to win. Whereas the HTTP interface is constrained to a specific number of verbs, the content is extensible. You can put XML, MIME, etc. in the content. Looking at the HTML case, there's a critical piece of information that is in the specification.
This is the "Must Ignore" rule. HTML, and HTTP headers and even much of the URI spec, have a rule that any unknown content must be ignored. So if any content appears, in any place, and the receiver doesn't know about it, it can validate as if the unknown content was "projected" out of the instance. This rule is specified in the HTML specification but is not expressed in the schema/dtd. In fact, I believe that if HTML had not been able to express the "must ignore" rule outside of the schema, then HTML probably wouldn't have allowed nearly as much extensibility.
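Here is a minimal Python sketch of that "projection", using a purchase order rather than HTML. The three-element V1 vocabulary in KNOWN_CHILDREN and the two function names are invented for the illustration; the point is only the difference between validating the instance as received and validating it after unknown content has been projected out.

```python
import xml.etree.ElementTree as ET

KNOWN_CHILDREN = ("id", "customer", "total")  # hypothetical V1 PO vocabulary


def accepts_strict(po_xml):
    """V1 receiver without Must Ignore: any unknown child is an error."""
    root = ET.fromstring(po_xml)
    return [child.tag for child in root] == list(KNOWN_CHILDREN)


def accepts_with_must_ignore(po_xml):
    """V1 receiver with Must Ignore: validate as if unknown children were absent."""
    root = ET.fromstring(po_xml)
    projected = [child.tag for child in root if child.tag in KNOWN_CHILDREN]
    return projected == list(KNOWN_CHILDREN)


v2_po = "<po><id>7</id><customer>Ann</customer><giftWrap>yes</giftWrap><total>9.50</total></po>"
print(accepts_strict(v2_po))
print(accepts_with_must_ignore(v2_po))
```

A V2 sender that adds giftWrap breaks the strict receiver but not the one that ignores what it doesn't know, and neither receiver needed a new schema.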
This allowed a huge evolution in HTML, and did not affect the HTTP API. They were orthogonal. The formats and verbs are separately evolvable.
Distributed object systems made a critical decision that any kind of extension required that both sides understand the extended interface. This is the fallacy of "single administrator". Much has been made about the fragility of distributed object systems, and I'm convinced that this lack of "touchless" extensibility was a key contributor to their lack of uptake and the triumph of the web.
Now imagine distributed object and web world in which the reverse rules were applied. In distributed object systems, you could insert any kind of content you wanted in the PO class, and if it wasn't known by the receiver, it would simply ignore it. Distributed object systems would be far more resilient. Fewer of the exploding interfaces with a gajillion different methods. I think that distributed object technology would have had a wider appeal if parameter extensibility, and perhaps even method extensibility, were supported in a manner that didn't require both sides to simultaneously change.
And it's hard to imagine that HTML would have evolved so quickly and successfully if the Must Ignore rule was not built in, so HTML could not be evolved outside of the standards committee. Image, forms, css, etc. were all innovated after the first version of html.
How about Web services? It appears that most authors effectively make the same distributed object decision when they design their interfaces. They recreate the "getPO" distributed object method in a SOAP message without allowing extensibility in the PO. Any time they need to extend the PO, they have to extend the Schema and roll out a new version. Now they could extend the PO using XML Schema's extensibility mechanism.
XML Schema made the decision that any extensions that were to be validated would require the updated schema to be on both sides. Further, that all notions of extensibility compliance are expressible in the schema language. This is awfully close to distributed objects decisions.
However, they aren't quite the same. XML Schema provides a wildcard element, <xs:any>, which allows elements in constrained namespaces to appear in an instance. If the PO Schema provides wildcards, then PO authors can use wildcards for extensibility. They won't get the arbitrary extensibility that HTML had - and I think this is quite a problem. There's a common pattern of allowing elements from namespaces other than the target namespace. This provides at least some extensibility, but it has some issues I detail in Examining wildcards for versioning. I argue that they can use <xs:any> in particular ways to get touchless extensibility with full validation in an article on xml.com called Versioning XML Languages, but this still requires the developer actively specify extensibility points.
There is another significant difference between the Web and XML schema extensibility technology, that of active versus passive specification of extensibility. XML Schema requires the author actively insert something in the schema document to predict where extensibility needs to occur for touchless evolution, whereas much of the web technology has extensibility passively built in to the system as the default. What would be perhaps the ideal solution is if we could unify the "must ignore" rule with our validation logic in a passive manner. Then we could give the full must ignore functionality into the infrastructure rather than requiring each data format and corresponding code to specify where ignorable elements are allowed and not allowed. And learning the lesson of HTML, the mechanism for specifying "must ignore" may need to be expressed outside of the schema language.
Now some folks argue that the very fact that there aren't constrained verbs (the RESTafarians) means Web services are doomed to the dustbin of history. I think that the difficulty in providing touchless extensibility is harming the ability to deploy loosely coupled applications, but there are sufficient techniques that enough extensibility can be provided to enable loosely coupled Web services. This does require explicit actions on the part of the interface designer. I do think that the community should provide an easier model for creating and validating extensible xml languages.
I had some interesting thoughts on comparing reference properties in EndpointReferences to URI fragment identifiers. EndpointReferences (EPRs) are defined in the WS-Addressing specification.
Frag-ids identify a secondary resource that is related to a primary resource. In fact, they aren't even sent from the client to the server when the resource is requested, they are strictly for client-side usage. The interpretation of the frag-id is governed by the media-type metadata returned with the representation.
Now a lot of web sites use HTTP Cookies for doing secondary resource identification. This can be done through session ids, ip addresses, etc. They are sent as HTTP headers. Of course, there's often some amazing breaking of orthogonality when the application then does a getCookie to retrieve part of its state from the header. But that's a separate article. What is interesting is that one of the main reasons that cookies are used is to identify secondary resources on a server, versus frag-ids which are under client control.
WS-Addressing gives us an incredibly useful mechanism for doing secondary resource identification. An EndpointReference contains reference properties, which are required to be echoed as SOAP headers by an EPR holder whenever it communicates. Many of us call this "SOAP cookies".
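A rough Python sketch of the echoing behaviour follows. It is not a WS-Addressing implementation: the EndpointReference dataclass, build_message, and the http://example.org/po namespace are made up, and the WS-Addressing headers themselves (such as the destination address) are left out so that only the "SOAP cookie" part is shown. The only real identifier used is the SOAP 1.1 envelope namespace.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


@dataclass
class EndpointReference:
    """Simplified stand-in for an EPR: an address plus reference properties."""
    address: str
    # (qualified element name, text) pairs; opaque to whoever holds the EPR.
    reference_properties: list = field(default_factory=list)


def build_message(epr, body_text):
    """Build a SOAP 1.1 envelope, echoing the EPR's reference properties."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    for qname, text in epr.reference_properties:
        # Copied verbatim as headers: the sender does not interpret them.
        ET.SubElement(header, qname).text = text
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, "{http://example.org/po}note").text = body_text
    return envelope


epr = EndpointReference(
    "http://example.org/orders",
    [("{http://example.org/po}sessionId", "abc123")],
)
print(ET.tostring(build_message(epr, "status?"), encoding="unicode"))
```

The holder of the EPR never looks inside sessionId; it just sends it back, which is what makes the comparison with cookies natural.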
The neat thing that I was thinking about is that EPRs give us bidirectional secondary resource identification. Now the obvious comparison is with cookies - and I've been working on that - but a comparison can be made with URI frag-ids. Identifiers for secondary resources can either flow with a message (cookies, EPRs) or not (frag-ids). Given that a secondary resource identifier will flow, it's going to be in a message header or a body. I think bi-directional use of SOAP headers is a really elegant solution and enables us to bring both the power of frag-ids and cookies to both sides as well as fully utilizing XML infrastructure. But I'm leaping to the conclusion and I still need to do a comparison of frag-ids with ref properties.
In the same way that a frag-id is opaque to a server, an EPR reference is opaque to a sender. It is meant for consumption by the "interpreter" of the reference. In an example web context, a web page links to a section of a document using a frag-id. The client invokes a GET operation and the server returns the entire representation. The referring web page also has a pretty good idea of what the media type of the retrieved representation will be. Woe to any server that doesn't ensure that the frag-ids are uniform across the media-types.
HTTP is fundamentally a request response protocol. So if we do any kind of asynchrony, such as a callback, how does the "receiver" know which secondary resource is being used? This is exactly what ref properties provide. As a Web browser will be able to "keep" the frag-id for interpretation in a response, an asynchronous message needs to "keep" the secondary resource identifier. As soon as the "request/response" exchange pattern is broken, the secondary resource identifier must be sent in a message.
Now what about media-types and reference properties? For a web browser to correctly interpret formats, it has to "know" about the various media-types, or be able to hand-off to an application that does. The browser author looks up the media type information in the IANA registry. XML changes the landscape completely. Instead of having a small number of types that are registered through a centralized authority, authors can create arbitrary vocabularies and even application protocols through XML and Schema. In the same way a client has to be programmed for media types, a client must be programmed for xml types and wsdl operations.
So how does a web service tell a client that it ought to be prepared for various formats and protocols, and which are the particulars of a given service? Through WSDL and EPRs. EPRs have the ability to provide information about the binding and the interface that are related to an address and reference properties. The creator of an EPR has knowledge of any secondary resource identification that may be needed relating to the particular endpoint and its interface and address.
In an interesting way, we've taken advantage of the decentralized ability to describe languages and protocols, and then provided the ability to specify which types are related to a specific resource and secondary resource. A Web browser doesn't need the ability to combine the frag-id for a given resource with a media-type. But this doesn't seem to be the case for what we're going to be doing with distributed application design, specifically the movement towards bi-directional asynchrony.
I think it's pretty interesting to think of frag-ids and reference properties as the same architectural concepts, observe the parallels between media-types/iana and xml schema/wsdl, compare the process of minting and interpreting URIs with frag-ids compared to EPRs, and understand the necessity of different mechanisms because of Asynchrony.
Yesterday BEA, Microsoft and Tibco published WS-Eventing. This is another spec that I'm pretty stoked about. This follows the classic model of subscription start, publication of messages, and renewal or termination of subscriptions.
It's got lots of the right stuff architecturally too: EndpointReferences, URIs for subscription identifiers, SOAP 1.1 and 1.2 support, WSDL, timeouts, simplicity, composability with other specs and re-use of other specs, particularly WS-Addressing.
I continue to like our model of publishing specs that solve a common and pressing problem and solving it well.
I'll use Adam Bosworth's keynote at XML 2003 to kick off some thoughts around Xquery and the web. It seems like an interesting world where a service provider could describe their data model and then allow arbitrary queries against it. The current model of both the Web and Web services is that a service provider needs to provide pre-defined operations for each type of query. In fact, this separation of the private and more general query mechanism from the public facing constrained operations is the essence of the movement we made years ago to 3 tier architectures. SQL didn't allow us to constrain the queries (subset of the data model, subset of the data, authorization) so we had to create another tier to do this.
What would it take to bring the generic functionality of the first tier (database) into the 2nd tier, let's call this "WebXQuery" for now. Or will XQuery be hidden behind Web and WSDL endpoints?
I first tried to re-use the XQuery functionality rather than providing specific operations in the SAML spec. My idea was that instead of SAML defining a bunch of operations (getAuthorizationAssertionBySubjectAssertion, getAuthorizationAssertionListBySubjectSubset, ..), SAML would define a Schema data model which could be queried against. A provider would offer a generic operation (evaluateQuery) which took in the query against that data model. That's why I worked to create a formal domain model in SAML, so it could be queried against.
Now, this was too early in XQuery's life to work. One of the necessary things was to be able to subset XQuery so only some of the complexity was offered. The security model was handled outside the scope of the individual query, but that would need to be worked in.
Obviously the choice of using a generic XQuery interface versus a specific operational interface depends upon the application, and they probably need to be matched. Specific interfaces are useful in many different conditions, but they don't work very well if the client really needs a generic interface. The idea is that currently there is an impedance mismatch in some applications, particularly where a client needs a generic interface but a specific interface is all that is available. The client ends up invoking large numbers of operations and then transforming the retrieved data models into their data model. This leads to brittle and complex clients and providers that can't scale to client demand in functionality and performance.
If this is an interesting idea, of providing generic and specific query interfaces to applications, what technology is necessary? I've listed a number of areas that I think need examination before we can get to XQuery married to the Web and to make a generic second tier.
1. How to express that a particular schema is queryable and the related bindings and endpoint references to send and receive the queries. Some WSDL extensions would probably do the trick.
2. Limit the data set returned in a query. There's simply no way a large provider of data is going to let users retrieve the whole data set from a query. Amazon is just not going to let "select * from *" happen. Perhaps formal support in XQuery for ResultSets to be layered on any query result would do the trick. A client would then need to iterate over the result set to get all the results, and so a provider could more easily limit the # of iterations. Another mechanism is to constrain the Return portion of XQuery. Amazon might specify that only book descriptions with reviews are returnable.
3. Subset the XQuery functionality. XQuery is a very large and complicated specification. There's no need for all that functionality in every application. This would make implementation of XQuery more widespread as well. Probably the biggest subset will be Read versus Update.
4. Data model subsets. Particular user subsets will only be granted access to a subset of the data model. For example, Amazon may want to say that book publishers can query all the reviews and sales statistics for their books but users can only query the reviews. Maybe completely separate schemas for each subset. The current approach seems to be to do an extract of the data subset according to each user subset, so there's a data model for publishers and a data model for users. Maybe this will do for WebXQuery.
5. Security. How to express in the service description (WSDL or policy?) that a given class of users can perform some subset of the functionality, whether of the query, the data model or the data set. Some way of specifying the relationship between the set of data model, query functionality, data set and authorization.
6. Performance. The Web has a great ability to increase performance because resources are cachable. The design of URIs and HTTP specifically optimizes for this. The ability to compare URIs is crucial for caching, hence why so much work went into specifying how they are absolutized and canonically compared. But clearly XQuery inputs are not going to be sent in URIs, so how do we have cachable XQueries given that the query will be in a SOAP header? There is a well-defined place in URIs for the query, but there isn't such a thing in SOAP. There needs to be some way of canonicalizing an XQuery and knowing which portions of the message contain the query. Canonicalizing a query through c14n might do the trick, though I wonder about performance. And then there's the figuring out of which header has the query. There are 2 obvious solutions: provide a description annotation or an inline marker. I don't think that requiring any "XQuery cache" engine to parse the WSDL for all the possible services is really going to scale, so I'm figuring a well-defined SOAP header is the way to go.
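To make item 2 a little more concrete, here is a minimal Python sketch of a result set that a provider could layer over any query result so that it controls how many iterations a client gets. The class name `ResultSet` and the parameters `page_size` and `max_pages` are assumptions made up for illustration; nothing like this is defined by XQuery itself.

```python
from itertools import islice


class ResultSet:
    """Hypothetical paged wrapper around the rows produced by a query."""

    def __init__(self, rows, page_size, max_pages):
        self._rows = iter(rows)        # underlying query result
        self._page_size = page_size    # rows handed out per iteration
        self._pages_left = max_pages   # provider-imposed iteration limit

    def next_page(self):
        """Return the next page of rows, or refuse once the limit is hit."""
        if self._pages_left <= 0:
            raise PermissionError("page limit reached for this query")
        page = list(islice(self._rows, self._page_size))
        if page:
            # Only non-empty pages count against the client's allowance.
            self._pages_left -= 1
        return page
```

A client that asks for everything simply keeps calling `next_page()` until the provider says no, which is the point: the limit lives with the provider, not in the query.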
Your thoughts? Is WebXQuery an interesting idea and what are the hurdles to overcome?
One of the things I've been involved in lately has been working on extensibility and versioning models for the Web and Web services. The bulk of the work shows up in the TAG finding http://www.w3.org/2001/tag/doc/versioning and in the xml.com article at http://www.xml.com/pub/a/2003/12/03/versioning.html. This thinking has also shown up in the Web Arch document's sections General principles 1.2.2 Extensibility (http://www.w3.org/TR/webarch/#general), 4.2 Extensibility and Versioning (http://www.w3.org/TR/webarch/#ext-version) and 4.5.3 Namespaces (http://www.w3.org/TR/webarch/#xml-formats).
I was really glad to see many folks including the TAG really endorse the work on extensibility and compatibility, and even more so the observation that I made that compatibility and extensibility are not just for XML data formats. That in reality, all 3 legs of the Web architecture (formats, identifiers, protocols) are formats that have defined rules for compatibility and extensibility. And whether or not the Web was designed this way, the compatibility guarantees of things like HTTP headers, URI fragment identifiers/path expressions/query strings, and document formats have the same underlying models and they are all necessary.
So what is left to do?
First up is to more formally define what is meant by compatibility and extensibility. Right now we're a little loosey-goosey on these. David Bau, who I've been working with a lot over the past few years, has posted some really excellent stuff on his blog (http://www.davidbau.com). Now I'm going to take a bit of a different tack than DaveB on this, but you can certainly tell me if you like his approach better. It wouldn't be the first time that the DB material was better than the DO material.
What I would like to do, but probably won't have time to do:
The relationship between format set theory and protocol set theory needs to be refined. All the work that we've been doing so far has been around defining format set theory, but we're missing protocol set theory. I have this intuition that they are actually the same. My theory goes that a protocol can be expressed in terms of set theory and compatibility and extensibility theories can be applied. For example, a request response protocol can be expressed as A, B. In formats, we'd say that A, B is extensible if a C could be introduced, say A,C,B. Does the same hold true for protocols? Given the web architecture bent, I'll take a look at the HTTP protocol for this. This would require an analysis of a subset of a protocol, specifically the time-based sequence of messages, excluding the other stuff like security considerations and specific timing constraints (back-off algorithms). I think that one of the big problems is that interfaces/protocol constructs in languages like Java aren't designed very well for distributed compatibility (such as adding in a new method) but that requires a bit of explanation. BPEL?
I want to examine the relationship of protocol set theory to WSDL, BPEL, and WS-Policy. I provided one way of allowing for compatible WSDL files: that is, by designing the schema for individual operations for extensibility. But the issue of whether an entire application protocol can be designed and described in WSDL/BPEL in a way that formally supports compatibility still needs to be examined.
Finally, the relationship between "headers" and WS-* specifications in SOAP/WSDL/WS-Policy/BPEL and compatibility could be examined. For example, what compatibility guarantees can be assumed between an application with protocol X and using spec Y and Z to pass headers? Is there an interesting set intersection that might preclude an application from being forward compatible? Are these specifications really orthogonal from a message exchange pattern perspective?
And a final thing that I won't have time to do, though maybe the final item will be a forcing function, is to work on formal models of compatibility on mixed namespace documents.
Formal models of compatibility and extensibility.
We provided formal definitions for language, instance, sender, receiver, component, vocabulary, and terms in the finding. We also provided a definition of compatibility, but it's just not very adequate.
To reprise:
A language has a vocabulary that may be drawn from one or more XML Namespaces (or none). [Definition: A vocabulary is a set of terms]. The syntactic structure of the language is constrained by the use of DTDs, XML Schema, other schema languages or narrative constraints expressed in the relevant language specification.
In general, the intended meaning of a vocabulary term is scoped by the language in which the term is found. However, there is some expectation that terms drawn from an XML Namespace have a consistent meaning across all languages in which they are used.
For our purposes, [Definition: a language is an identifiable set of vocabulary terms that has defined constraints.] For example, the elements and attributes of XHTML 1.0 or the names of built-in functions in XPath 2.0. Languages may or may not be defined by a schema in any particular schema language. By language, we just mean the set of elements and attributes, or components, used by a particular application.
Now there's a really important point that's hiding in here, which is that a language must have defined constraints. This is the "contract" of the language. If a contract is missing, then it is very difficult to design a language with any kind of compatibility guarantees. If all you've got is running code, and the developer wants to introduce a new feature, how does (s)he know what messages must still be understood? I would argue that a language that does not have a contract is really just an arbitrary set of terms. And there's no way of knowing which terms are valid and which are not. In the absence of a contract, is
There's a quote: "Service Oriented Architecture is about shared language understanding, not about reverse engineering terms". OK, it's not that great, but it gives a flavour of the rationale.
And where language understanding gets super important is when the system changes. In the absence of a language, it's exceedingly difficult to version the software to figure out new allowable words and sentences. The reason is that without a written definition of the contract, aka the language, the designer has to guess and hack at what works.
Given that contracts will be provided, we can move to examination of set theory of languages and contracts.
The Web architecture document provides a starting point.
Language subset: one language is a subset (or, "profile") of a second language if any document in the first language is also a valid document in the second language and has the same interpretation in the second language. Taken another way, the set of allowable documents in the first language is a subset of the set of allowable documents in the second.
Language superset: one language is a superset of a second language if the second is a language subset of the first.
Language extension: one language is an extension of a second language if the first is a superset of the second. This follows our intuition, that adding to a language (extending) results in a bigger set than the first set.
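As a small illustration of these three definitions, here is a Python sketch that treats a language as nothing more than a finite set of valid documents. This is a simplifying assumption: the definitions above also require that a document keeps the same interpretation in both languages, which the sketch does not model. The function names and the sample documents are made up for the example.

```python
# A "language" is modeled as a finite set of valid documents (plain strings).
profile = frozenset({"doc-a"})
full = frozenset({"doc-a", "doc-b"})


def is_subset_language(first, second):
    """Every document valid in `first` is also valid in `second`."""
    return first <= second


def is_superset_language(first, second):
    """`first` is a superset if `second` is a language subset of it."""
    return is_subset_language(second, first)


def is_extension(first, second):
    """`first` extends `second` if it is a superset of `second`."""
    return is_superset_language(first, second)
```

With these sample sets, `is_subset_language(profile, full)` and `is_extension(full, profile)` both hold, and neither holds with the arguments swapped.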
Now let's take a look at XML languages. Imagine I define a schema that defines a really simple set of allowable documents. The "foo" schema allows only 1 element, a foo. A valid document is <foo/>.
To allow for compatibility, we advocate that the contract for foo allow extensibility. One simple extensibility model is to say foo can have any number of siblings after it. So
The set of allowable documents that are valid under foo we will call V0.
Now what is a "compatible" change to V0? According to set theory, V1 must be a subset of V0. But wait! We just said that extension was the addition of terms, and compatibility theory says we can only subset. How can we subset and superset at the same time?
Well it turns out there is a trick that gets played. There is a difference between the known combinations of terms and the allowable combinations of terms. In the case of V0, the only known combination was <foo/>.
Let us introduce a new term and create V1. To create V1, we expand the set of "known" terms. In this case, we will add a bar element.
In defining bar, we expand the set of known terms by some amount of the unknown terms. So when we "extend" our language, we increase the set of known terms. But we also reduce the set of allowable combinations of terms.
One way of looking at this is that extensibility in XML languages is actually the process of creating successive subsets of the allowable combinations of terms. What, you say? Extensibility is about subsetting? Allow me to explain.
V0 allows any terms after the foo element and has only one known term. V1 allows only 1 term (bar) after the foo element, allows any terms after the bar element, and has two known terms. V1 has a larger set of known combinations of terms but a subset of the allowable combinations of terms.
What we have discovered is that a language extension is a superset of the known combinations of terms but is also a subset of the allowable combinations of terms. Isn't this deliciously ironic? Extensibility allows us to subset in the future.
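The known-versus-allowable distinction can be checked mechanically on a toy model. The Python sketch below represents a document as a tuple of element names and enumerates every document up to three elements long over a tiny alphabet. The reading of V1 used here (foo alone, or foo then bar then anything) is an assumption about the intended content model, and `baz` is a made-up stand-in for any unknown term.

```python
from itertools import product

TERMS = ("foo", "bar", "baz")  # baz stands in for any unknown term


def allowed_v0(doc):
    """V0: a foo, followed by any number of arbitrary siblings."""
    return len(doc) >= 1 and doc[0] == "foo"


def allowed_v1(doc):
    """V1 (one reading): foo alone, or foo then bar then arbitrary siblings."""
    return doc == ("foo",) or doc[:2] == ("foo", "bar")


def documents(max_len=3):
    """Enumerate every sequence of terms from length 1 up to max_len."""
    for n in range(1, max_len + 1):
        yield from product(TERMS, repeat=n)


A0 = {d for d in documents() if allowed_v0(d)}  # allowable under V0
A1 = {d for d in documents() if allowed_v1(d)}  # allowable under V1
K0 = {"foo"}                                    # known terms in V0
K1 = {"foo", "bar"}                             # known terms in V1
```

On this model `K0 < K1` (the known terms grew) while `A1 < A0` (the allowable documents shrank); for instance `("foo", "baz")` is allowable in V0 but no longer in V1.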
We have a problem though: We can't describe compatibility in terms of just V0 and V1. Backwards compatibility is where V0 can be interpreted as V1, and forwards compatibility is where V1 can be interpreted as V0. How do we define the set theory for this? Our intuition says that backwards compatibility is where V0 is a subset of V1. But this is based upon the "closed" set model, not the "open" set model that we've been dealing with. We know that in the example of V, V0 is a superset of the allowable combination of terms in V1 yet is a subset of the known terms.
But we know that there is difference between the known combination of terms and the allowable combination of terms. A piece of software that knows about V0 will only send
We will refine our definition of V by splitting it into A, the set of allowable terms, and K, the set of known terms. So K is a subset of A. K0 = foo, and K1 = foo, bar. In K1, remember that bar is optional. We need to introduce a function on our sets to determine the minimal set of terms. We will call this req() for required elements.
We use these sets and the function to determine compatibility. In the case of backwards compatibility, we can allow any of the optional known elements to be omitted. Omitting all the optional aspects of K1 is called req(K1).
V1 is backwards compatible with V0 if req(K1) = req(K0), K1 is a superset of K0, and A1 is a subset of A0.
In the case of forwards compatibility, an instance of V1 can be treated as an instance of K0.
V0 is forwards compatible with V1 if req(K1) = K0 and A1 is a subset of A0.
This follows our intuition: Forwards compatibility requires accepting and ignoring unknown content, so A0 must be larger than K0 and we have to be able to map the A0 set down to K0. If A1 is not a subset of A0, then the portion of A1 that is outside A0 can't be mapped to K0.
The mapping function we talk about is described as the "MustIgnore" rule, and is essential to enable the mapping of V1 to the K0 portion of V0.
We now have two definitions of compatibility based upon our sets of known and allowable terms.
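The two definitions, along with req() and the MustIgnore mapping, can be written down as a rough Python sketch. Everything named here (`Version`, `optional`, `must_ignore`, the sample sets) is an assumption for illustration: the allowable sets are tiny hand-picked samples rather than the real infinite sets, req() is taken to mean "known terms minus optional ones", and req is applied to K1 in both definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    known: frozenset      # K: the known terms
    optional: frozenset   # the optional subset of K
    allowable: frozenset  # A: allowable documents, as tuples of terms


def req(v):
    """Required terms: the known terms with every optional one omitted."""
    return v.known - v.optional


def backwards_compatible(v1, v0):
    """req(K1) = req(K0), K1 is a superset of K0, A1 is a subset of A0."""
    return (req(v1) == req(v0)
            and v1.known >= v0.known
            and v1.allowable <= v0.allowable)


def forwards_compatible(v0, v1):
    """req(K1) = K0 and A1 is a subset of A0."""
    return req(v1) == v0.known and v1.allowable <= v0.allowable


def must_ignore(doc, known):
    """MustIgnore: drop every term the receiver does not know."""
    return tuple(term for term in doc if term in known)


V0 = Version(
    known=frozenset({"foo"}),
    optional=frozenset(),
    allowable=frozenset({("foo",), ("foo", "bar"), ("foo", "baz"),
                         ("foo", "bar", "baz")}),
)
V1 = Version(
    known=frozenset({"foo", "bar"}),
    optional=frozenset({"bar"}),
    allowable=frozenset({("foo",), ("foo", "bar"), ("foo", "bar", "baz")}),
)
```

With these samples both predicates hold for V0 and V1, and `must_ignore(("foo", "bar"), V0.known)` maps a V1 instance down to the K0 portion that a V0 receiver understands.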
