The O/R-M Smackdown

So... the .NET Rocks! discussion between Ayende and me is now live on the Web, and I echo Ayende's blog post: although I have yet to hear the edited version, the real discussion was very interesting. A couple of commenters left some questions and comments on Ayende's blog, and Roy Osherove suggested that he'd like to see my responses, so...

Congratulations on the discussion. I've listened to it once and intend to listen again, mainly because I had trouble figuring out exactly what Ted was arguing for or against!

In general, I'm against dogma of any form. In this particular debate, I'm against the idea that O/R-M can solve all your problems for you without asking, a viewpoint that's particularly widespread in the Java community and a growing one in the .NET community. O/R-M can get you some of the way there, but it's not capable of "closing the loop", per se, to completely solve all of the issues involved. Anyone who suggests that it can is either lying or trying to sell you something.

Ted really lost the plot, however, when he started advocating db4o as a good solution. I've spent a good few (too many) years in the object database space and they're just a nightmare when it comes to querying and reporting on your data. Implementing this sort of solution just moves the dual schema problem out to your databases.

I wasn't aware there was a "plot" that we were trying to follow. :-) Moreover, I *do* see the OODBMS approach (I am most familiar with db4o and, to a lesser degree, Versant; no endorsement implied) as a good solution for storing objects, particularly when compared against an O/R-M solution, for those scenarios where the stored form of the data is not a visible concern. In other words, if your application sees persistent storage as an implementation issue, and the actual format of the stored data is not intended to be visible outside of your application boundaries, then the OODBMS is a very viable solution. As to querying and reporting, the querying story is much more approachable now given db4o's "Native Queries" approach (which is supposedly being proposed as the basis for a future OODBMS-wide standard), but reporting is still something of a mess, if you ask me, largely not because of anything intrinsic in the OODBMS space itself, but because there is no standard OODBMS-based reporting tool. (Like it or hate it, Crystal Reports did a lot to solidify the presence of the relational database in the IT world.)

I recall being on the comp.databases.object newsgroup when db4o was first being designed. Its initial purpose was to provide something that would perform better than relational databases, and the guy developing it spent quite some time developing his benchmarks to prove his case. I'm suspicious of any solution designed primarily to improve on performance.

I don't know what the original intent was, but the db4o team has definitely tried to build an OODBMS that was aimed specifically at the "lighter weight" persistence scenarios, such as small devices.

I feel like Ted’s points were only applicable to the edge cases which are very few when it comes to a whole application. It also seems that Ted just thinks writing SQL is less work than good old expressive OOP.

I'm not sure which points were only applicable to the edge cases, or what those edge cases would be, but Ted certainly doesn't feel like his points are only applicable to edge cases. And honestly, Ted does sometimes think that writing SQL is less work than "good old expressive OOP", particularly in those cases where the relational database imposed on the project is not intrinsically OOP-ish, and because Ted has seen "good old expressive OOP" models that were just as badly designed as the relational schema that Ayende references in the discussion.

The other key point that was not mentioned during the vendor neutrality and portability section was that NHibernate IS optimized to a particular vendor's database. Ted assumes that because NHibernate is database-agnostic you lose all the powerful vendor-specific features of the RDBMS. Not True!! Look at how paging is handled; the SQL 2005 Dialect uses very vendor-specific code that is highly optimized. Code that the average MORT DBA would probably not implement out of the box!

This is a point that's going to go back and forth indefinitely, so I'll simply say that a library will never be able to optimize a query as well as a DBA can, simply because the library or application cannot know which of two tables being joined has 10 rows in it, and which has 10,000,000 rows in it, in order to optimize the query accordingly. Ted doesn't assume that NHibernate doesn't try to tune to the particular database; Ted has just seen said attempts at optimization micro-optimize without yielding significant gains. Ted is happy to see benchmarks that prove him wrong on this score.

While it was never really discussed Ted’s arguments seemed that they were a little performance related as well. Don’t we all know that performance is the thing you tweak last? And thank god NHibernate is so flexible that you could easily write a sproc to handle a specific operation if you had to.

Well, if you don't think about perf until the very end of the project, you usually find yourself having to either just shrug your shoulders and say, "Well, faster hardware will make it run fast", or backtrack and refactor significant chunks of the application in order to cut out round trips from the system as a whole. I'm not suggesting that you optimize prematurely, just that "right before you ship" is not the mature time to optimize. (Usually I want to start serious perf testing and optimization at the same time that a build gets into QA's hands.)
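To make the round-trip point concrete, here is a minimal sketch, not taken from the debate itself. It uses Python's built-in sqlite3 module purely for illustration; the `customers` and `orders` tables, the `CountingConnection` wrapper, and both query functions are hypothetical. SQLite runs in-process, so each counted call merely stands in for what would be a network round trip against a real database server.

```python
import sqlite3


class CountingConnection:
    """Wraps a sqlite3 connection and counts the queries sent through it."""

    def __init__(self, conn):
        self.conn = conn
        self.round_trips = 0

    def query(self, sql, params=()):
        # Each call stands in for one client/server round trip.
        self.round_trips += 1
        return self.conn.execute(sql, params).fetchall()


def make_db(customer_count=50):
    """Builds a hypothetical schema with one order per customer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL
        );
        """
    )
    ids = range(1, customer_count + 1)
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in ids],
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i, 10.0 * i) for i in ids],
    )
    conn.commit()
    return conn


def totals_one_by_one(db):
    """One query for the customers, then one more per customer (N + 1)."""
    result = {}
    for customer_id, name in db.query("SELECT id, name FROM customers"):
        rows = db.query(
            "SELECT SUM(total) FROM orders WHERE customer_id = ?",
            (customer_id,),
        )
        result[name] = rows[0][0]
    return result


def totals_joined(db):
    """The same answer from a single joined query."""
    rows = db.query(
        """
        SELECT c.name, SUM(o.total)
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        """
    )
    return dict(rows)


lazy_db = CountingConnection(make_db())
joined_db = CountingConnection(make_db())
lazy = totals_one_by_one(lazy_db)
joined = totals_joined(joined_db)
print(lazy_db.round_trips, joined_db.round_trips)  # prints "51 1"
```

Both functions return the same totals; the difference is 51 trips versus 1. If that access pattern is baked into the application's structure, discovering it right before you ship is exactly the kind of late refactoring described above.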

As for exposing a sproc as a service…. Ludicrous!! The whole point of the service is loose coupling, and how can a database-specific sproc be considered loose coupling? But wait…. It's too much work to build a message object.

Hey, the sproc-as-a-service contention was Rocky's, not mine. Take that argument up with him, not me. But given the call escape syntax that ODBC defines and JDBC adopted, calling a sproc in a database-neutral way is as simple as "{? = call add_customer(?, ?, ?)}", with the parameters substituted (using either JDBC or ADO.NET APIs) as necessary.
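As a rough illustration of that call form, the sketch below builds the escape string for a given procedure. In the JDBC/ODBC escape form the return-value placeholder sits inside the braces. The helper `call_escape` is hypothetical and written in Python only for brevity; it produces the string and nothing more, and in practice the result would be handed to something like JDBC's `Connection.prepareCall`, with the driver translating it to the vendor's native call.

```python
import re

# Plain identifiers only, so a procedure name cannot smuggle in extra SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def call_escape(proc_name, param_count, returns_value=False):
    """Builds a JDBC/ODBC-style call escape string with '?' placeholders."""
    if not _IDENTIFIER.fullmatch(proc_name):
        raise ValueError(f"not a plain identifier: {proc_name!r}")
    if param_count < 0:
        raise ValueError("param_count must not be negative")

    placeholders = ", ".join("?" * param_count)
    args = f"({placeholders})" if param_count else ""
    prefix = "? = " if returns_value else ""
    return "{" + prefix + "call " + proc_name + args + "}"


print(call_escape("add_customer", 3, returns_value=True))
# prints "{? = call add_customer(?, ?, ?)}"
```

The values themselves never appear in the string; they are bound to the placeholders through the driver's parameter API, which is what keeps the call portable across databases.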

However, it seemed that Ted was arguing that something like UpdateAddress would be fine to implement as a sproc. I don't really see the logic in this, as these kinds of processes would seem to be better implemented and maintained in an OO language than in a relational language like T-SQL.

Frankly, the problem with selectively supporting sprocs in places and not in others is that if clients use T-SQL to talk to parts of the system, and sprocs to talk to others, then you lose the encapsulation benefits of always going through a sproc. (This is true of any encapsulation layer, not just sprocs.) As to certain code being more easily implemented in an OO language, I buy that, and would tentatively suggest that because several major RDBMS implementations now host the JVM or CLR internally (Oracle and DB2 with Java, SQL Server 2005 with the CLR), this is an area for further research and exploration.

Congratulations on maintaining your composure in the face of "you're wrong, you're wrong, you're wrong..."

Ted used a fair few cheap debating tricks, which I would have called him out on very quickly...

I'd love to know which tricks those were, so I could use them again.... And you're right, he did a great job maintaining his composure, despite all the baiting I threw at him. I wanted SOOO badly for him to stand up and shout, "Ted, you ignorant slut!", but he just wouldn't rise to the bait. Sigh....

However ... I have to disagree with Ted strongly on one point.

There should *never* be business logic within a stored procedure.

Data logic perhaps (reforming your data, concatenation of data, etc) ... but *never* a business rule (like calculating a sales tax total).

Never is a strong word. Dogma is a dangerous thing, and saying "always" or "never" is a form of dogma. Case in point: If you have a system that needs to be used by both Java and .NET applications, where do you implement your business rules? You could build an XML service and implement them in there, or you could build a J2EE-server-hosted, interop-technology-accessed component and host them in there, or you could put the rules inside a sproc and implement them in there. Which one is the least complex and most future-proof solution? (Another commenter also pointed out that reporting tools will often need to invoke/utilize such logic as part of their reports, which again makes the sproc a useful place in which to put said logic.)

He used a few circular arguments to try to stop you poking holes in his arguments, i.e. he railroaded you by getting you to admit you agreed with him about something that was not what you were actually trying to highlight, and hence closed off the avenue of discussion before you got to highlight your point, I got quite annoyed with him listening to it actually.

Actually, you're not alone: lots of people get annoyed with me. I tried to circle back around to points Ayende wanted to make, but I know that there were other points I wanted to follow up on that I didn't, simply because I didn't want this to go on for four more hours. And it could have. Easily.

In this particular debate, many of the arguments were intrinsically circular, because in many cases the discussion rests on value judgments made by the developer or on perceptions held long before the discussion begins. (In other words, whether you agree with me or with Ayende is basically predetermined based on your pre-existing judgment of O/R-M tools and libraries.) I'll also be the first to admit that I wasn't entirely coherent in my arguments, and that this debate could have gone on for four more hours, but as Ayende and I had just done a panel on open-source software for two-and-a-half hours (he sitting on it, me moderating it), I know I was just flat-out exhausted and I had an early-morning flight out of Montreal the next day.

And I am sure at one point he just said something like “you’re wrong, you’re wrong, you’re wrong” i.e. like a politician talking over someone to stop them making a valid point.

Valid point? Then I wouldn't have said "You're wrong". I generally draw a pretty clear distinction between value judgments and stated facts, and I will only tell somebody they're "wrong" when my understanding of the facts is different from what they are presenting. I haven't listened to the recorded show and don't remember what part of the debate is being referred to, so I can't clarify the point under question.

"He used a few circular arguments to try to stop you poking holes in his arguments, i.e. he railroaded you by getting you to admit you agreed with him about something that was not what you were actually trying to highlight, and hence closed off the avenue of discussion before you got to highlight your point, I got quite annoyed with him listening to it actually."
Exactly! He also hopped from one argument to the next even though the discussion about the previous argument wasn't finished yet, creating a situation where Oren had to defend, defend, defend and come up with proof why Ted was wrong, which is precisely what you shouldn't do in these debates: If the claimer doesn't have proof, the claimer should shut up :)

Interesting--so now, should I claim that the commenter should shut up, since he claims and offers no proof? :-)

The discussion was also pulled into the area of dogmas rather quickly: stored procedures vs. dyn. sql. Ted made the mistake to use the wrong arguments in favor of procs by using myth after myth and presenting them as evidence while Oren had to defend, defend, defend instead of explaining the real state of affairs in reality. What was also a low thing from Ted was that he took advantage of the fact he's a native English speaker so he could with a bit of more volume overtake any argument easily, he didn't let Oren finish a lot of remarks.

Actually, I'd love to hear what those myths were, since as far as I know, we were discussing the "real state of affairs in reality". If anybody wants to offer up hard evidence as to the "myths" of stored procedures, I'd love to hear or see them. Hard evidence, not value judgments. I think that this whole area is wrapped in value judgments as a general rule, however, and I'll be the first to admit my own value judgments may not be what the listeners or commenters agree with, and that's OK.

What I will NOT stand for, however, is the suggestion that I was trying to take advantage of Oren's linguistic difficulties (of which, I think, he had very few). I may have interrupted Oren, but he interrupted right back in places. As to volume issues, I think this may have had more to do with (what I expect would be) my experience with speaking into a microphone more than anything else. During the panel in Montreal right before this discussion, Oren at one point had to be reminded that if he's going to gesture with a hand, don't use the hand holding the microphone, since it's harder for the mic to work if it's several feet from the mouth.

One thing which struck me was Ted's answer to what to change: the table or the code, i.e. what drives what. He couldn't come up with the sole right answer: the abstract entity model is what drives it, as that model defines what a 'customer' is, what relations it has etc. so you can define code or relational model from there, not in a low level table editor or class editor. People who think that writing a class first is different: it's not. The knowledge you use to write the entity class is what's embedded in and described by an abstract entity model.

There is no "sole right answer"! There never is a "one right answer", to any particular problem. There are solutions to problems that yield particular consequences in particular contexts, and sometimes those consequences are ones you can live with, and sometimes they aren't. Looking for "absolute right answers" is like looking for "absolute truth"--interesting in the abstract, mostly useless in practice.

So i.o.w.: Ted showed he doesn't know much about what a relational model is all about as he kept venting on about tables, procs and other details which are all driven by the abstract entity model which is the foundation of every relational model anyway. (Otherwise how on earth can you decide which fields have to be in table X? You can't: the reason field A, B and C are in table X is because they're defined as attributes for a given entity X (P. Chen/Codd entity).)

Well, readers/listeners are certainly welcome to take away whatever conclusions they like, but the fact is that I've done the Codd reading (including the 8th Edition of Intro to Database Systems, as well as the various Date essays for the last fifteen years, a la "Date on Database: Writings 2000-2006", from Apress), I've studied the predicate calculus and relational algebra, and I'd be happy to have a long drawn-out discussion over, for example, Codd's assertions that objects are basically just extensions of the relational model (which I think fails to recognize that an object is intrinsically a combination of state and behavior, given Codd's position on stored procedures in general). Generally speaking, I've found that nobody cares about Codd's assertions or views on things, so I'd welcome another viewpoint on the whole thing.

Overall, it was a discussion which blurred the whole topic, as it was more about what Ted thought about procs and object databases than anything else. I would have respected his opinion if he would have used solid arguments which are founded on knowledge, not myths. I have to give him credit that he didn't use the precompiled myth, although he did go for the performance argument. The performance argument is a slippery slope: if performance is key, why not put everything in the DB (so also UI), and where do you stop making sacrifices for performance? Because that's what putting BL in the DB for performance reasons really is: a sacrifice. You give up programmability and maintainability for performance, but there's no end to that road.

I'm not sure, again, which myths we're referring to here, and obviously I think I do bring solid arguments to the table, but I can't argue against opinion, so that will have to just fly by. Performance arguments are always a slippery slope, but so are elegance and purity. In fact, it's arguable that any dogmatic approach--be it a perf-based one, or otherwise--is a slippery slope to which there is no end.

All in all, I had fun with the debate; my hope is that Ayende did as well, and if he wants a rematch, I'm happy to take him up on it. :-) Thanks to Richard and Carl for recording it, and thanks to the dozen or so folks in the audience who were willing to sacrifice a late night drinking to sit in the audience and provide a cheering section.