 Thursday, October 13, 2005
CORBA did what?

Long-time blog reader Dilip Ranganathan pointed me to this discussion over on Steve Vinoski's blog about the history of CORBA, and in particular the discussion that ensued in the comments section on the entry. I found it interesting from two perspectives:

  1. The idea that two people could look at the history of CORBA (having presumably lived through it) and come away with entirely different ideas of what that history was, and
  2. The discussion over CORBA's role and influence on the current XML services environment.

For starters, Steve Vinoski was a bit miffed at the idea posited by Mark Baker that CORBA failed. Sorry, Steve, I have to say it, but I agree with Mark--CORBA never fulfilled its intended promise of seamless middleware interoperability and integration capabilities, and certainly not over the Internet in any meaningful way. By the time CORBA began to address some of those issues--firewalls being a big one--the world had already pretty much abandoned both of the "distributed object brokers" (the other being COM/DCOM) and was starting to explore HTTP as the be-all, end-all transport protocol.

But the discussion that comes out of Steve's challenge that CORBA didn't fail is to me the far more interesting point--the discussion of whether the WS-* stack is loosely coupled or not. See, if CORBA's failure was that it was too tightly coupling a technology to allow for good integration between companies (as Mark Baker asserts in the discussion), then we have to be careful regarding how tightly we couple endpoints and interfaces in the WSDL world, as well. And this is where I wholly agree with Mr. Baker: I look at the current crop of WSDL-based implementations, and their IDL-cum-WSDL interface descriptions (usually generated from, *shudder*, a language interface), and I see the same mistakes being made.

The discussion continues, but rather than try to summarize it (and probably get it wrong, given my current state of exhaustion), I suggest you head over and have a look. If you're into the XML services space at all, you owe it to yourself... and your clients... to do so.


Friday, October 14, 2005 7:26:16 AM (Pacific Daylight Time, UTC-07:00)

I think the loose-coupling points were made by Michi Henning, not Mark Baker (although Mr. Baker probably holds a similar view).
Dilip
Friday, October 14, 2005 1:48:52 PM (Pacific Daylight Time, UTC-07:00)
Ted, if a technology is deemed a failure, it usually doesn't have a 13-year successful industry behind it or billions of customer dollars successfully invested in it. Did you know that pretty much anytime you make a phone call, you're using CORBA in some fashion? Did you know that billions and billions of dollars are traded through CORBA-based financial systems every single day? There are many, many examples of CORBA successes out there, in manufacturing, transportation, finance, telecom, etc., etc. If CORBA actually had failed, it wouldn't be as widely used as it is today.

Now, if you want to get specific, like you kinda did in your posting, as in "CORBA failed on the Internet," or "CORBA failed to become the only integration technology used anywhere on the planet," then I could agree with you. But simply saying "CORBA failed" like Mark did is just plain nonsense.
Friday, October 14, 2005 4:59:37 PM (Pacific Daylight Time, UTC-07:00)
I agree with Steve here. To say that CORBA has failed without qualification is clearly nonsense. Thousands of successful systems have been built with it, and the technology was in widespread use in the late nineties. Steve's and my book has sold over 33,000 copies so far, and is still selling (albeit in quite small numbers now), so there definitely was substantial interest in CORBA.

But CORBA *has* failed to become what was envisaged, namely, to be the B2B integration substrate for the internet. We are simply not seeing CORBA being used by unrelated parties to integrate their business processes across the internet. And, going out on a limb as usual, I predict that WS won't get there either. In terms of technology, WS/SOAP is far inferior to CORBA. I still cannot see anything I could do with WS that I couldn't do better with CORBA, at least not given some experience on the part of the designers of the system. The WS complexity, lack of API standardization, technical naivety, and atrocious performance will be its downfall.

And here is another prediction: once people get over their current fixation with loose coupling, they will finally realize that, to get loose coupling, I don't need loose type systems that throw away compile-time type safety, and I don't need support at the protocol level at horrendous cost in performance. All I need is intelligent system design, a middleware that offers a workable implementation of multiple interfaces (check out Ice facets), and domain-specific standardization. With that, I get type safety, flexibility, and performance.

Cheers,

Michi.
Friday, October 14, 2005 7:05:36 PM (Pacific Daylight Time, UTC-07:00)
"All I need is intelligent system design, a middleware that offers a workable implementation of multiple interfaces (check out Ice facets), and domain-specific standardization. With that, I get type safety, flexibility, and performance."

No, you need interface constraints; look around, can you name a single successful system on the Internet which doesn't use them? Nope, because there are none. That's no coincidence.

If you choose to realize this constraint as IDLs over IIOP or WSDLs over SOAP, or some other *DL over some other layer 6 protocol, so be it, it just won't perform as well as HTTP, contrary to your unsubstantiated claims about a performance cost. That's because layer 6 protocols aren't optimized for any particular application, while layer 7 protocols are optimized for one application; coarse grained data transfer, in the case of HTTP. See RFC 817 for a discussion of some of the performance problems of layering (and not layering) at different places in the stack; http://www.ietf.org/rfc/rfc817.txt
Sunday, October 16, 2005 11:34:29 PM (Pacific Daylight Time, UTC-07:00)
Mark, as far as I can see, the interface constraints you are talking about are, for example, the restricted verbs in HTTP. So I can say "GET" to any endpoint and get something back. Fair enough.

But I don't see what that buys me. The operation name has moved into the request payload, instead of being passed at the protocol level. What does that achieve? Nothing, as far as I can see. Sure, I can get the data back but, once I have it, I'm just as bound by the interface contract as I was before: to make sense of the data, I must have an a priori agreement with the sender of the data, otherwise I simply won't know what to do with it.

I agree that, with the "GET" approach, the interface can change. But so what? To make sense of the changed interface, client and server need to establish a new a priori agreement and need to be changed anyway. So where is the gain?

> contrary to your unsubstantiated claims about performance cost.

Are we missing each other here? The performance penalty of using XML as an encoding is well established. There are many papers on that topic.

Or are you talking about the performance of the underlying HTTP protocol? If so, there is no difference between Ice and HTTP. Ice sends byte sequences around at the bandwidth of the underlying link. (>500 Mbps over the backplane.)

Regardless, what matters is the overall end-to-end performance of getting a message from sender to receiver. With SOAP over HTTP, that cost is typically around 100 times greater in both bandwidth and CPU cycles than with Ice.
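
As a rough illustration of the encoding-size side of this argument only, here is a minimal sketch that encodes the same record once as a fixed-layout binary message and once as XML. The record, its field names, and the XML tag names are assumptions made up for this example; it says nothing about Ice or SOAP specifically, and it does not measure CPU cost or reproduce the 100x figure.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical sample record: (sensor id, Unix timestamp, reading).
RECORD = (42, 1129593600, 21.5)


def encode_binary(record):
    """Pack the record as two unsigned 32-bit ints and one double, network byte order."""
    sensor_id, timestamp, reading = record
    return struct.pack("!IId", sensor_id, timestamp, reading)


def encode_xml(record):
    """Encode the same record as a small XML document (tag names are made up)."""
    sensor_id, timestamp, reading = record
    root = ET.Element("reading")
    ET.SubElement(root, "sensorId").text = str(sensor_id)
    ET.SubElement(root, "timestamp").text = str(timestamp)
    ET.SubElement(root, "value").text = repr(reading)
    return ET.tostring(root, encoding="utf-8")


binary_msg = encode_binary(RECORD)
xml_msg = encode_xml(RECORD)

# Print both sizes in bytes so they can be compared directly.
print(len(binary_msg), len(xml_msg))
```

The gap for real messages depends heavily on the schema, the tag names, and any envelope around the payload, so this should be read as a toy comparison rather than a benchmark.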

Cheers,

Michi.
Monday, October 17, 2005 7:09:53 AM (Pacific Daylight Time, UTC-07:00)
"But I don't see what that buys me. The operation name has moved into the request payload, instead of being passed at the protocol level."

But GET, POST, PUT, etc.. *are* the operation names. You don't need other ones, because those are very general. Everything else is just identifiers, and data.

That's the difference between application protocols and sub-application-layer protocols like IIOP, TCP, BEEP, Ice, etc..

Regarding performance, yes I meant HTTP since that's what I thought you were talking about when you said, about loose coupling, "and I don't need support at the protocol level at horrendous cost in performance". If you were talking about XML, then we're in agreement.
Monday, October 17, 2005 7:49:02 AM (Pacific Daylight Time, UTC-07:00)
How do you define success and failure? Therein lies the answer to CORBA's rating.

In terms of reaching the technology masses, CORBA did not make it there, but it is used in some enterprise applications with heavy usage. Plus, I think it gave other distributed technologies the momentum to succeed.

Monday, October 17, 2005 3:50:49 PM (Pacific Daylight Time, UTC-07:00)
> But GET, POST, PUT, etc. *are* the operation names.

Sure, I understand. But there still is the data that the sender puts there, and that the receiver interprets. That data can be interpreted only if there is an a priori agreement. The ability to get arbitrary data with a GET then really doesn't change the overall picture: sure, I can get the data, but I can't make sense of it unless I have an agreement with the sender. So, I'm just as tightly coupled as always.

And I keep coming back to the performance issue: what matters is the end-to-end performance of an invocation. And that is very much worse with XML than with a binary protocol.

Cheers,

Michi.
Monday, October 17, 2005 7:31:57 PM (Pacific Daylight Time, UTC-07:00)
I'm sure I'm sounding like a broken record at this point, Michi, but I don't know how you can claim "The ability to get arbitrary data with a GET then really doesn't change the overall picture". It does, in a *huge* way, as any client can now get data from (or submit data to (POST), or store data with (PUT), etc...) any server. Neither CORBA, nor DCOM, nor RMI, nor Ice provides this degree of loose coupling.

Yes, once you've got the data you've got its coupling problems to address. But the issue is identical for all these systems, including the Web.

"And I keep coming back to the performance issue: what matter is the end-to-end performance of an invocation. And that is very much worse with XML than a binary protocol. "

You're preaching to the choir. But neither HTTP nor REST requires the use of XML.
Tuesday, October 18, 2005 1:52:03 PM (Pacific Daylight Time, UTC-07:00)
Mark, I understand that any client can get data with GET, or store it with PUT. What I don't understand is what that gains me. Exactly what advantage does this offer? In particular, if I have a deployed application that uses some operations with some parameters, and then I need to change the application so the interface and parameters change in some way, where and exactly how does this give me loose coupling? What do I not have to change with WS in the case that I would have to change with CORBA, and why would it matter? (I'm not being facetious here--these are genuine questions.)

Cheers,

Michi.
Tuesday, October 18, 2005 6:04:35 PM (Pacific Daylight Time, UTC-07:00)
Michi, Mark's point is that with the approach he's talking about, the interface is already known up front, and only agreements on data are required. With the approach you're talking about, you need agreements on *both* interface and data. Mark's approach therefore offers the potential for much lower coupling and far greater scalability than your approach.
Tuesday, October 18, 2005 6:24:27 PM (Pacific Daylight Time, UTC-07:00)
Hi Steve,

I hear what you are saying. But does this really provide an advantage? Consider the following two IDL definitions, which I think capture the core of the two approaches:

interface Specific {
    SomeData someOperationName();
    SomeOtherData someOtherOperationName();
};

That would be the traditional CORBA way to get at the data. I need to know the name of the operation, otherwise I can't fetch the data.

Now, with this approach, the server is told the operation name plus the object identity, so it can use these two things to decide what data to return. Now, with the generic approach, the equivalent interface would look like:

// Assume type Blob is an XML-encoded document, or something else that the client can parse and understand.

interface Generic {
    Blob GET(); // Returns SomeData or SomeOtherData inside Blob, depending on the URL
};

With this approach, the server is told the object identity (which is implicit in the target of the request, that is, its URL), but the server isn't given an operation name. Not a problem, of course, because we can encode the operation name in the URL. Something like:

http://host.domain/Generic/someOperationName

and

http://host.domain/Generic/someOtherOperationName

So, as far as I can see, the only difference between the two approaches is that the operation name has become part of the object identity (or, if we use in-parameters, it could also become part of the in-parameters).

Now, what I don't understand is how that is less tightly coupled or more scalable than the other approach. Semantically, each approach sends the same information and returns the same information, and I can't see any inherent advantage in one over the other.

Specifically, suppose I have a deployed system and I want to make a change to one of the operations, so it returns different data. To evolve the system to work with that new operation, I have to do much the same things in either case, as far as I can see: both client and server need to be updated (in practice, recompiled and redeployed, except for rare cases where everything uses dynamic invocation/dispatch and is table/configuration driven). So, where is the loose coupling? And I cannot see any scalability advantage either--as far as I can see, one approach is just as scalable as the other.
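
The comparison above can be sketched in ordinary code. This is only an illustration of the argument as stated, not CORBA or HTTP code: the class names mirror the IDL, while the payload dictionaries and the path strings are assumptions invented for the example.

```python
# Hypothetical payloads standing in for SomeData and SomeOtherData.
SOME_DATA = {"kind": "SomeData", "value": 1}
SOME_OTHER_DATA = {"kind": "SomeOtherData", "value": 2}


class Specific:
    """Traditional style: one named operation per kind of data."""

    def some_operation_name(self):
        return SOME_DATA

    def some_other_operation_name(self):
        return SOME_OTHER_DATA


class Generic:
    """Uniform style: a single get(); the path selects what comes back."""

    def __init__(self):
        self._resources = {
            "/Generic/someOperationName": SOME_DATA,
            "/Generic/someOtherOperationName": SOME_OTHER_DATA,
        }

    def get(self, path):
        return self._resources[path]


specific = Specific()
generic = Generic()

# Both styles hand back the same payload; the caller still has to know
# what a "SomeData" means before it can do anything useful with it.
print(specific.some_operation_name() == generic.get("/Generic/someOperationName"))
```

In both versions the caller needs out-of-band knowledge: a method name in one case, a path in the other, and the meaning of the payload in both.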

Cheers,

Michi.
Tuesday, October 18, 2005 8:22:00 PM (Pacific Daylight Time, UTC-07:00)
Hi Michi, I think the thing to keep in mind is that in a REST system like Mark describes, interfaces are uniform and fixed. In effect, the operations are built in. Anyone wanting to participate in the system simply implements the uniform interface.

You said:

With this approach, the server is told the object identity (which is implicit in the target of the request, that is, its URL), but the server isn't given an operation name.

Actually, it IS given an operation name -- the operation is GET. Scalability comes from the fact that the entire interface is known up front, thus eliminating a huge source of variability and thus a huge source of potential disagreement. No operation requests for operations other than those four are ever going to arrive.

The World Wide Web is kind of a big distributed object system where all the "objects" have exactly the same interface with the HTTP verbs as operations.
Tuesday, October 18, 2005 10:00:37 PM (Pacific Daylight Time, UTC-07:00)
> No operation requests for operations other than those four are ever going to arrive.

I agree. But I don't see how scalability follows from that. And I don't see *why* knowing the interface up front would give me scalability either.

The complexity doesn't come from the interface variability. That is a fifth-order issue at best. The complexity comes from the nature of the interactions between different distributed application components. (You called this a "choreography" I believe.)

Whether I have a fixed set of verbs at the protocol level or not is quite irrelevant to scalability, and I don't see how that would fundamentally allow me to build better distributed systems. And I definitely can't see where this would contribute to loose coupling. The semantic ties between client and server are just as strong as ever, and they are just as tightly coupled as ever--agreeing on a fixed set of verbs doesn't change that, as far as I can see.

Cheers,

Michi.
Wednesday, October 19, 2005 9:22:27 AM (Pacific Daylight Time, UTC-07:00)
Geez, I take a break for a few hours and Web heresy breaks out! 8-)

Michi - just as a simple example, with interface constraints search engines can exist, while without them they can't ... or at least you'd have to upgrade them to recognize all the different get* methods out there in order to have something that's as functional as a constrained-interface supporting search engine.
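
A minimal sketch of the search-engine example, under stated assumptions: the Resource class, the example.org URLs, and the in-memory WEB dictionary are all hypothetical stand-ins for real servers. The point it tries to show is that the indexer calls one operation on every resource and never needs a per-site method name.

```python
class Resource:
    """Uniform interface: every resource answers get() with (media_type, body)."""

    def __init__(self, media_type, body):
        self.media_type = media_type
        self.body = body

    def get(self):
        return self.media_type, self.body


# Hypothetical in-memory "web": URL -> resource.
WEB = {
    "http://example.org/a": Resource("text/plain", "CORBA history"),
    "http://example.org/b": Resource("text/plain", "REST and CORBA"),
    "http://example.org/c": Resource("application/octet-stream", "opaque bytes"),
}


def build_index(web):
    """Index every text resource by word, using only the uniform get()."""
    index = {}
    for url, resource in web.items():
        media_type, body = resource.get()
        if not media_type.startswith("text/"):
            continue  # fetched fine, but the indexer does not understand it
        for word in body.lower().split():
            index.setdefault(word, set()).add(url)
    return index


index = build_index(WEB)
print(sorted(index["corba"]))
```

Note that the indexer still skips the resource whose media type it does not understand, which is the data-agreement issue raised elsewhere in this thread.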

From an architectural POV, what the interface constraint buys you has been documented by Roy Fielding, here;

http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_1_5
http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_2

.. with other little tidbits mentioned elsewhere in chapter 5 too. You might want to read the whole chapter.
Wednesday, October 19, 2005 6:45:09 PM (Pacific Daylight Time, UTC-07:00)
> Michi - just as a simple example, with interface constraints search engines
> can exist, while without them they can't ... or at least you'd have to upgrade them
> to recognize all the different get* methods out there in order to have something
> that's as functional as a constrained-interface supporting search engine.

Right, I understand that. It's worthwhile though to think about *why* this is an advantage. I can see two:

- The Web is mostly a read-only medium: the vast majority of web pages are meant to be read, but not interacted with. So, given the URL of a web page, a simple GET is sufficient to make this work because all I need to get the data is the identity of the page.

- The GET approach works for the web only because the data is encoded in a universally-understood format, namely, HTML. By "universally-understood", I don't just mean the syntax of HTML, but its semantics: web browsers *know* the meaning of each HTML tag, and therefore can render the HTML into something meaningful for a human reader.

To me, a core question is whether this idea can translate into more general-purpose distributed systems. To stick with the web example, some web pages are interactive, such as online shopping forms. For those pages, a simple GET is no longer sufficient, and I have to use a POST. But, with that, I don't just need the URL, I need to supply data, and I need to supply the correct kind of data. For that to work, there has to be an a priori agreement between client and server, just as for WS or CORBA.

But, as soon as the client-server interaction isn't just a "fetch me this page" interaction anymore and gets more complex (such as for doing a POST), then I cannot see much value at all in having a fixed set of verbs. I don't see how they would buy me either more loose coupling or better scalability.

And many distributed systems have very complex interactions (or "choreographies", as Steve calls them). Those interactions cannot easily be modeled as an exchange of documents, and they are definitely not easily modeled as just a bunch of GETs to various endpoints. In other words, once the interactions become more interesting than just "get me this data" or "put this lump of data there", the standardized verbs are without value, as far as I can see.

And, even for simple GET scenarios, I don't think there is a lot of value. For example, I might have a bunch of sensors for weather data. If I use the GET model, I can get the data from those sensors. And, if we assume that the data is XML-encoded, I can even parse that data. But, without a priori agreement, there is no way that I can *understand* the data. For example, if one of the sensors delivers XML with the tag "Durchschnittlicher_Regenfall", I have no idea what to do with it (unless I use my human intelligence, and happen to speak German...).
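
The sensor example can be sketched as follows. The KNOWN_TAGS vocabulary, the "average_rainfall" tag, and the sample documents are assumptions made up for illustration; only the German tag name comes from the text above. The sketch shows a client that can parse any well-formed document but can interpret only the tags it was written to understand.

```python
import xml.etree.ElementTree as ET

# Hypothetical a priori agreement: the tags this client knows, with their units.
KNOWN_TAGS = {"average_rainfall": "mm"}


def interpret(document):
    """Split a parsed sensor document into understood values and unknown tags."""
    understood = {}
    unknown = []
    for element in ET.fromstring(document):
        if element.tag in KNOWN_TAGS:
            understood[element.tag] = float(element.text)
        else:
            unknown.append(element.tag)  # parsed successfully, meaning unknown
    return understood, unknown


english = "<sensor><average_rainfall>3.2</average_rainfall></sensor>"
german = "<sensor><Durchschnittlicher_Regenfall>3.2</Durchschnittlicher_Regenfall></sensor>"

print(interpret(english))
print(interpret(german))
```

Both documents parse without error; the difference lies entirely in whether the client's vocabulary covers the tag.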

So, while the GET/POST approach works well for the web, it works well only because the web is so limited in its interactions, and I don't see this translating to general distributed systems with lots of active components.

And, I'm still beating the same drum: I have yet to see a single example that demonstrates a convincing advantage in terms of loose coupling because of the standardized verbs. I do not see a single use case where, if the a priori agreement between client and server changes, the standardized verbs would make it easier for me as a developer, either at development time, or at deployment time.

Cheers,

Michi.
Thursday, October 20, 2005 9:11:46 AM (Pacific Daylight Time, UTC-07:00)
Hi,

Unfortunately, I am not able to arrive at the same conclusions as Mark and Steve (on scalability and coupling). Could somebody help explain, taking Michi's POV into account?

-JC
John C
Thursday, October 20, 2005 6:30:56 PM (Pacific Daylight Time, UTC-07:00)
Michi,

Let's ignore the data issue. I think any distributed computing solution will (and does) have that problem, and we both agree that it is a problem. I think we can also agree that any solution will probably be largely independent of the architectural style in use, since it's just data. So let's focus on your non-data concerns, because those are, from my POV, the important ones ...

"The GET approach works for the web only because the data is [...] HTML"

That's not the case. GET works because it's a good way for one party to request data from another. I could hand you many URIs that if you invoked GET upon them, wouldn't return HTML, but would instead return XML, CSV, RDF/XML, Excel files, and other machine processable data formats. GET works independently of the data format.
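
To make the format-independence claim concrete, here is a small sketch in which one get() operation returns the same data as CSV or XML depending on what the caller says it accepts. The READINGS data, the column and attribute names, and the simplified Accept handling (first match wins, q-values ignored) are all assumptions for this example and not a faithful implementation of HTTP content negotiation.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical data behind a single resource.
READINGS = [("station-1", 3.2), ("station-2", 0.0)]


def as_csv(rows):
    """Render the readings as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["station", "rainfall_mm"])
    writer.writerows(rows)
    return buf.getvalue()


def as_xml(rows):
    """Render the readings as a small XML document."""
    root = ET.Element("readings")
    for station, rainfall in rows:
        ET.SubElement(root, "reading", station=station, rainfall_mm=str(rainfall))
    return ET.tostring(root, encoding="unicode")


RENDERERS = {"text/csv": as_csv, "application/xml": as_xml}


def get(accept):
    """Return (media_type, body) for the first listed type we can produce."""
    for part in accept.split(","):
        media_type = part.split(";")[0].strip()  # drop parameters such as q=0.9
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](READINGS)
    raise LookupError("no acceptable representation")


print(get("text/csv")[1])
```

The operation stays the same in every call; only the representation varies, and the caller still has to understand whichever format it asked for.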

"To stick with the web example, some web pages are interactive, such as online shopping forms. For those pages, a simple GET is no longer sufficient, and I have to use a POST. But, with that, I don't just need the URL, I need to supply data, and I need to supply the correct kind of data. For that to work, there has to be an a priori agreement between client and server, just as for WS or CORBA. "

That's exactly right, and an awesome question, but it's really two steps ahead of what we're talking about as solving it requires solving the data problem first. The very simple answer is that although there is additional agreement required, what you're agreeing on in the two cases is different.

If you're interested though, RDF Forms (http://www.markbaker.ca/2003/05/RDF-Forms/) is one answer to your question, as it builds atop one solution to the data problem, RDF.

"But, as soon as the client-server interaction gets isn't just a "fetch me this page" interaction anymore and gets more complex (such as for doing a POST), the I cannot see much value at all in having a fixed set of verbs. I don't see how they would buy me either more loose coupling or better scalability."

Hmm. It improves loose coupling because the common interface separates a concern that isn't separated without the common interface; the separation of interface from type, or in other words, the separation of connector semantics and component implementation. What that buys you then, is component substitution in any architectural configuration; the ability to substitute any component for any other.

"And many distributed systems have very complex interactions (or "choreographies", as Steve calls them)."

I've got a lot of experience building them with the Web's hypermedia application model, and I've never had any trouble. I suppose more complex ones could exist than I've built, but I'm confident it could be handled with hypermedia.

"I do not see a single use case where, if the a priori agreement between client and server changes, the standardized verbs would make it easier for me as a developer, either at development time, or at deployment time."

I don't think it would help you out there much either. The value of the uniform interface is in providing a general infrastructure such that the information shared a priori between two components *doesn't need to change*.
Wednesday, November 02, 2005 8:22:04 AM (Pacific Daylight Time, UTC-07:00)
I don't see any separation between "interface" and "type". The "interface" just seems like a constituent or additional aspect of "type". If we are talking about pure message passing and not RPC, there is no "interface" anyway - that's a completely artificial fabrication for OO purposes. "Type" is a semantic contract between parties, and whether the technology makes it easier or harder to implement, you /still/ have to define these semantics. Now - there *may* be some value in distilling and elevating some common and ubiquitous interactions to a higher level (say, the "fixed" verbs of HTTP, which actually aren't all that fixed with extensions such as WebDAV, etc.), but that seems like a technical issue entirely orthogonal to the semantic contract. Being able to more easily obtain arbitrary garbage return value from a service doesn't seem valuable to me. I don't want just *any* data, I want some *specific* data. (of course you can implicitly encode type information in the formulation of the locator for the service, but that is still knowledge the client requires: e.g. you can define http://blah/getSomeSpecificInformationInThisType, which implements one and only one behavior, which is what I think the REST argument is; in this way, since there is only one behavior, there is only one semantic that needs to be defined for GET - but this only works IF you mandate only one behavior, otherwise the idea of a single reusable "GET" verb loses its value)

I'll agree that XML and HTTP possibly make this technically easier for humans, but it's quantitative, not qualitative.
Aaron
Thursday, November 03, 2005 7:25:32 PM (Pacific Daylight Time, UTC-07:00)
Michi,

You are correct in saying that HTML as well as the standard HTTP verbs are responsible for the web's ability to have search engines. You are incorrect in suggesting that "there has to be an a priori agreement between client and server, just as for WS or CORBA". This is patently incorrect as shopping web sites of today do not require such an agreement. Mozilla does not know how to interface to Amazon. Mozilla knows how to interface to a web server.

There is a required agreement on the data submitted, but ingeniously this is done by supplying a web form to the client as part of the HTML content. Fielding's REST does not just specify the need for standard verbs, but also standard content types. Neither set should vary from web site to web site, or industry to industry. These generic content types are agreed by everyone and are capable of everything. This is how REST achieves true loose coupling.
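
A rough sketch of the form-driven idea: a generic client reads the method, target, and field names from the HTML form the server sent, rather than having them compiled in. The ORDER_FORM markup, its field names, and the /orders path are invented for this example, and the parser handles only the simplest case of input elements inside a single form.

```python
from html.parser import HTMLParser


class FormReader(HTMLParser):
    """Collect the method, action, and named inputs of a single HTML form."""

    def __init__(self):
        super().__init__()
        self.method = None
        self.action = None
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
            self.method = attrs.get("method", "get").upper()
        elif tag == "input" and "name" in attrs:
            self.fields.append(attrs["name"])


def describe_form(html):
    """Return (method, action, field names) as described by the form itself."""
    reader = FormReader()
    reader.feed(html)
    return reader.method, reader.action, reader.fields


# Hypothetical form a shopping site might send to any browser.
ORDER_FORM = (
    '<form action="/orders" method="post">'
    '<input name="isbn"><input name="quantity">'
    '<input type="submit" value="Buy">'
    "</form>"
)

print(describe_form(ORDER_FORM))
```

The client code contains nothing specific to the shop; what it still cannot do without a human (or further agreement) is decide what values belong in "isbn" and "quantity".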

The approach has its disadvantages. It is necessarily skewed towards machine-to-human comms rather than machine-to-machine comms. This is where the view of REST that we typically see today springs from, the one that typically employs application/xml being sent to clients. Because application/xml can't be understood universally, it falls outside of any formal definition of REST. It is instead a simplified SOAP or CORBA where variation between web sites or industries is defined in an XML schema instead of a WSDL or IDL schema.

Keeping close to the truly decoupled REST world view yields some advantages. You can negotiate content types, so a single resource can easily accept input of varying forms. REST prohibits stateful interaction, so discourages choreography. A client should submit a whole request that should be processed by a server by appropriate means. Instead of having various components of the system moving about in intricate collaborations we tend to see the world from the view of a single client simply executing a program.

Personally, I see the ideal as a middle ground between the nearly-RESTful and truly-RESTful worlds. I think that standard media types should always be returned to clients, but that the standard media types can express contracts that return arbitrary data back to servers. Whether this is done as XForms, JavaScript, or embedded applets is really irrelevant. So long as all your clients can understand the data you send them without having to be written especially for your web site or industry you're going to be living in a loosely-coupled utopia.

Benjamin.
Tuesday, November 29, 2005 5:43:40 AM (Pacific Standard Time, UTC-08:00)
Some points in your article are debatable. I do not entirely agree with the author.
Sunday, July 02, 2006 12:28:46 PM (Pacific Daylight Time, UTC-07:00)
interesting information, thanks.