Model Driven Architecture thoughts: It's hard

Model Driven Architecture has gained momentum over the past few years. This is a natural response to the growing complexity of software development, particularly the increasing heterogeneity of languages for development. A given web site might use three, four, five or more languages and data formats (JavaScript, HTML, JSP pages, Java servlets, EJBs, Perl, Python, SQL, LDAP).

The promise of "MDA" is that a model will be used to model the application data, and then the logical model is bound/converted/mapped to the various underlying languages and data structures. By using the model as the source of metadata for the application, the model won't ever get out of date either. Seems pretty compelling.
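To make the promise concrete, here is a minimal sketch of one logical model being bound to two "physical" languages. It is an illustration only, not the system described in this entry: the `MODEL` dictionary, the type tables, and the `to_sql` / `to_java` helpers are all hypothetical names invented for the example.

```python
# Hypothetical logical model: an entity plus (attribute, logical type) pairs.
MODEL = {
    "entity": "Address",
    "attributes": [("street", "string"), ("city", "string"), ("zip_code", "string")],
}

# Assumed type bindings from logical types to each target language.
SQL_TYPES = {"string": "VARCHAR(255)", "int": "INTEGER"}
JAVA_TYPES = {"string": "String", "int": "int"}


def camel(name):
    """Convert snake_case to camelCase for Java field names."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_sql(model):
    """Generate a CREATE TABLE statement from the logical model."""
    cols = ",\n".join(f"  {name} {SQL_TYPES[t]}" for name, t in model["attributes"])
    return f"CREATE TABLE {model['entity']} (\n{cols}\n);"


def to_java(model):
    """Generate Java class source text from the same logical model."""
    fields = "\n".join(
        f"    private {JAVA_TYPES[t]} {camel(name)};" for name, t in model["attributes"]
    )
    return f"public class {model['entity']} {{\n{fields}\n}}"


if __name__ == "__main__":
    print(to_sql(MODEL))
    print(to_java(MODEL))
```

Adding an attribute to `MODEL` changes both generated artifacts at once, which is the appeal; everything after this point is about what that convenience costs.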

But it's not that simple. A long time ago - actually in 1999 - whilst I was at IBM, I was the architect of an MDA system. I thought that we could provide customers exactly what MDA promises. What we did was create an architecture for creating a logical model in Rational Rose, exporting the XMI model, and then binding the logical model to XML, Java, and SQL. Each of the arcs between the "physical" languages could provide a custom mapping between the languages.

This custom mapping was incredibly important and necessary for performance reasons. We had done some work in an earlier project that mapped between Java and SQL (pre-EJBs) and we had found that in complex web pages - such as a Student Registration Result page - the default mappings between Java and SQL resulted in way, way too many SQL select statements. 189 SQL selects resulted in about a 3-minute rendering time. Therefore, we created a way of doing the optimization in the database, and then parsing the denormalized and rectangular result into a Java graph. We got to the point that the 189 SQL select statements became about 4 select statements, and the performance improvement was huge.
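The shape of that optimization can be sketched in a few lines. This is not the original IBM code, and it uses Python with an in-memory SQLite database rather than Java; the `student` and `registration` tables and both loader functions are assumptions made up for the illustration. The first loader issues one select per student, the way a default mapping tends to; the second issues a single join and folds the rectangular result back into a graph.

```python
import sqlite3


def load_naive(conn):
    """Default-mapping style: one select for students, then one more per student."""
    queries = 1
    graph = {}
    students = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()
    for sid, name in students:
        rows = conn.execute(
            "SELECT course FROM registration WHERE student_id = ? ORDER BY course",
            (sid,),
        ).fetchall()
        queries += 1
        graph[sid] = {"name": name, "courses": [course for (course,) in rows]}
    return graph, queries


def load_joined(conn):
    """One denormalized select; the flat rows are folded back into a graph."""
    rows = conn.execute(
        "SELECT s.id, s.name, r.course FROM student s "
        "LEFT JOIN registration r ON r.student_id = s.id "
        "ORDER BY s.id, r.course"
    ).fetchall()
    graph = {}
    for sid, name, course in rows:
        node = graph.setdefault(sid, {"name": name, "courses": []})
        if course is not None:  # LEFT JOIN yields NULL for students with no courses
            node["courses"].append(course)
    return graph, 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE registration (student_id INTEGER, course TEXT);
        INSERT INTO student VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
        INSERT INTO registration VALUES (1, 'Math'), (1, 'Physics'), (2, 'Math');
        """
    )
    return load_naive(conn), load_joined(conn)
```

Both loaders build the same graph. With three students the naive version already needs four selects against one for the join, and the gap grows with every extra row and every extra level of nesting.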

The first lesson to learn for MDA is that "abstractions", like a logical model, typically worsen performance and there is a need for selectively optimizing performance. There are two old saws in computer science: any problem can be solved by adding a layer of abstraction, and any performance problem can be solved by removing a layer of abstraction. MDA solves the problem of how to centralize the data model of systems by creating the model, and it will typically result in a performance problem in the binding between the languages. Our partial solution was to make sure the abstraction was a "leaky" abstraction, so that we could control and override the mapping.
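A deliberately leaky binding might look something like the sketch below. The `Binding` class and its method names are hypothetical, chosen only to show the idea: the generated default is used unless someone has registered a hand-tuned replacement for that entity.

```python
class Binding:
    """Maps a logical entity to the SQL used to load it (hypothetical API)."""

    def __init__(self):
        self._overrides = {}

    def override(self, entity, sql):
        """Register hand-written SQL that replaces the generated default."""
        self._overrides[entity] = sql

    def select_for(self, entity, columns):
        """Return the override if one exists, else the default generated select."""
        if entity in self._overrides:
            return self._overrides[entity]
        return f"SELECT {', '.join(columns)} FROM {entity}"


if __name__ == "__main__":
    binding = Binding()
    print(binding.select_for("student", ["id", "name"]))
    binding.override(
        "student",
        "SELECT s.id, s.name, r.course FROM student s "
        "LEFT JOIN registration r ON r.student_id = s.id",
    )
    print(binding.select_for("student", ["id", "name"]))
```

The trade-off is that every override is a place where the model no longer tells the whole truth about the system.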

We had a bigger problem though: the developers hated it. Somebody would create the logical model, they'd push the "generate code" button, and then run the software. But guess what, they always got the model wrong. Maybe they forgot about the zip code in the address, or the middle name in the name structure. So the model needed to be updated. Making a simple change in the model and then regenerating took way too long. It could take up to half an hour before the system could be retested. So the devs would simply make the change they needed in the place they needed it. For example, the SQL already had the zip code, so they only needed to add the zip code in the Java and in the SQL select.

Astute readers will see this as the "round-trip changes" problem between design tools and code, along with the resultant tendency of designs, either in documents or models, to become out of date and unused over time.

I assert that MDA systems almost invariably suffer from the "design documents collect dust on bookshelves" problem, despite the best attempts of the tools and the organization to stop the natural entropy. The technology simply introduces an artificial layer of abstraction that is too difficult to modify and that makes it too hard to build high-performance systems.

A final problem with MDA systems is that they don't really solve the hard problem. The hard problem is understanding the meaning of the Java, XML or SQL tables. What does the "StudentName" mean, and how to use it? You can guess. What you then do is figure out what the data is supposed to be used for, such as searching, using as an identifier or key, sorting, etc. All of these uses simply cannot be automated.

In the systems we built, we would *always* have to find out some metadata about the semantics of the data element. Usually this involved talking to somebody, because the tools never had the metadata in an automated manner. Now maybe the Semantic Web will solve this metadata problem, but I'm not holding my breath as to whether developers will start creating metadata in a usable manner.

The problem is that MDA systems only solve about 5% of the problem. 95% of the effort of a software development project is understanding customer requirements, creating the architectures and designs to solve the requirements, and understanding/creating the semantics of the components in new and legacy systems. Automating the binding between a model and various software systems is just a really small part of the overall problem.

I won't even get into the problem that much of software is "infrastructure", like loops and conditionals, which are incredibly difficult to model.

In conclusion, I believe that MDA systems solve a small portion of systems development and will typically suffer from the "stale design" and performance problems. I think the path forward for software development is perhaps to use MDA as a prototyping exercise, but the real productivity gains will come from ever increasing productivity tools (like better GUIs, APIs and programming models) and increasing metadata.

8 Comments

I would disagree with this statement:


The first lesson to learn for MDA is that all "abstractions", like a logical model, worsen performance


I wrote a rebuttal to this general idea here. Not that it's not sometimes true, but the opposite can also be true. It's easy to optimize the local case and forget about the optimization of the larger application, or to underappreciate the scale of different performance problems.

For instance, I've seen people write unindexable queries over large tables, because the indexable query would require multiple queries to produce the same effect. Three indexed queries beat one unindexed query any day! Anyway, there's no one path to performance.

On another topic, in the modelling software I've created, I try to avoid code generation. Without code generation you avoid the round-trip problem, because you never offer anything lasting that would need to be folded back into the software. This doesn't address the customization and extension problem of a model, but those are just hard problems.

Fair enough point that abstractions don't "always" worsen performance. I think they typically do, given the mismatches in various type systems and optimizations that can occur. I've changed the statement to "typically" worsen performance.

With all due respect, I believe you don't really know what you are talking about.

If somebody had used J2EE, esp. EJB, once, in a single project, and it didn't work out - and he then wrote a blog entry saying "EJB doesn't work" - what would you think?

MDA clearly is no silver bullet. It can work very well though, and in fact I have seen it work in a number of very different projects. It explicitly shines during development of J2EE applications, where the amount of redundant information you have to maintain turns maintenance into a nightmare.

As to developers hating the approach: It's my experience this very much depends on the toolset and on the way the methodology is introduced. Some time ago, I wrote an article targeted at developers; I have since talked to a lot of different development teams, and usually the key is to just get people started. They'll never go back afterwards.

Oh yes, I forgot to mention: I'll bet a bottle of fine champagne that with a decent MDA toolset, the turn-around cycle for something as simple as introducing a new attribute to an existing entity takes less than a tenth of the time it does with the manual approach.

Ah, but I didn't write "EJB doesn't work", I wrote "MDA is hard". And then I said "MDA solves problem X and X isn't the really hard problem".

As to the toolsets issue, I focused more on the issue of the architecture of MDA. You haven't actually refuted any of the points I made. You said I shouldn't have said it, that you've seen it work, and that developers like it. If you could address the issues I raised, rather than "darn it, you're wrong and folks like it", that would be great. You allude to a refutation in your bet, but it's not really a clear refutation and it doesn't cover the stale design doc and other issues I raised.

Fair enough; I have addressed your points in more detail here (it seems you don't do trackbacks - right?). Feel free to respond either there or here.

And of course I know you didn't write "EJBs don't work". :-) That was not my point. I just objected to the over-generalization that seemed to be based on experience from a single project, and took EJB as an analogy I hoped you could relate to ;-)

A model-driven architecture implicitly defines an envelope - an envelope in which things can be done relatively easily *if* your application fits nicely within the envelope. There is a kind of 80/20 rule implicit here, with the idea that if your application falls partially outside the envelope, there should still be some way to create it, though it may be a lot more work. What is really important is the "size" of the envelope. The bigger the envelope, the harder it is to create a model-driven architecture. The smaller the envelope is, the easier it becomes. As the envelope becomes "too large" (don't ask me to define this!), risks of project failure increase rapidly, as the whole MDA approach may collapse.
A lot of very important judgement calls are needed here, and reasonable architects may come to very different conclusions about the nature of what is doable in the envelope and its size.

What I would really be interested in is how we apply divide and conquer to this kind of project. That is, is it possible to partition an overly large envelope into multiple loosely coupled envelopes?
No firm ideas how this might work in practice, but it just seems right to me.

Bottom line is: I enjoyed reading your views.

MDA tools do PIM and PSM, as we all know, and their code generation part is always based on some framework or philosophy. The problems I encountered with the tools are:
1. Editing auto-generated code is risky, which means I hold back from regenerating the code, because my manually written code would be wiped out.
2. As time passes my code becomes stale from a technology standpoint; I cannot even use an upgraded modeling toolset for that code.
3. Abstraction causes a huge performance problem in Java/J2EE, at least in my application.
4. It introduces a debugging nightmare.
5. It takes a lot of investment in training and time, which is very expensive.

What I learned is that MDA is not for every application. Currently I am working on what is perhaps a very large enterprise project, and one of the concerns top management has is the impact of MDA; that tells you everything.
To me it looks like either it is not mature or it will never be a tool for end-to-end modeling of big enterprise projects.
The analogy I can give for a modeling tool is an automated hair-cutting salon. It may work for some and not for others, though it does solve the bottom-line problem of cutting hair. But just imagine how everybody would look after an automated haircut. Funny, isn't it?
So the same applies to MDA. Beware, I am not talking here about projects like student registration, library management, resume management, etc. I think MDA may work perfectly for those projects.

So, I completely agree that MDA is hard, and I would add on top of that that it does not solve the hardest problem from the architect's point of view.

shoot your idea.

-VS

About this Entry

This page contains a single entry by Dave Orchard published on March 3, 2004 4:01 AM.
