 Monday, January 15, 2007
The Root of All Evil

At a No Fluff Just Stuff conference not that long ago, Brian Goetz and I were hosting a BOF on "Java Internals" (I think it was), and he tossed off a one-liner that just floored me; I forget the exact phraseology, but it went something like:

Remember that part about premature optimization being the root of all evil? He was referring to programmer career lifecycle, not software development lifecycle.

... and the more I think about it, the more I think Brian was absolutely right. There are some projects, no matter how mature or immature, that I simply don't want any developer on the team to "optimize", because I know what their optimizations will be like: trying to avoid method calls because "they're expensive", trying to avoid allocating objects because "it's more work for the GC", and completely ignoring network traversals because they just don't realize the cost of going across the wire (or else they think it really can't be all that bad). And then there are those programmers I've met who are "optimizing" from the very get-go, because they work to avoid network round-trips, or write SQL statements that don't need later optimization, simply because they got it right the first time (where "right" means "correct" and "fast").
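To put a rough shape on the round-trip point, here is a minimal sketch. Everything in it is hypothetical: FakeRemoteCatalog is a stand-in that only counts simulated network calls, and the names are invented for illustration. It shows nothing about real latency, only how quickly a call-per-item loop multiplies trips across the wire compared with a single batched request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a remote service; it does no I/O and only
// counts how many simulated round trips the caller has made.
class FakeRemoteCatalog {
    private int roundTrips = 0;

    // One simulated round trip per call.
    String fetchName(int id) {
        roundTrips++;
        return "item-" + id;
    }

    // One simulated round trip for the whole batch.
    List<String> fetchNames(List<Integer> ids) {
        roundTrips++;
        List<String> names = new ArrayList<>();
        for (int id : ids) {
            names.add("item-" + id);
        }
        return names;
    }

    int roundTrips() {
        return roundTrips;
    }
}

class RoundTripDemo {
    // Chatty style: one remote call per id.
    static int chattyRoundTrips(List<Integer> ids) {
        FakeRemoteCatalog catalog = new FakeRemoteCatalog();
        for (int id : ids) {
            catalog.fetchName(id);
        }
        return catalog.roundTrips();
    }

    // Batched style: all ids travel in a single remote call.
    static int batchedRoundTrips(List<Integer> ids) {
        FakeRemoteCatalog catalog = new FakeRemoteCatalog();
        catalog.fetchNames(ids);
        return catalog.roundTrips();
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println("chatty round trips:  " + chattyRoundTrips(ids));
        System.out.println("batched round trips: " + batchedRoundTrips(ids));
    }
}
```

The chatty version makes one trip per item and the batched version makes one in total, which is the kind of "optimization" that is really just getting the design right the first time.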

It made me wish there was a "Developer Skill" setting I could throw on the compiler/IDE, something that would pick up the following keystrokes...

for (int x = 10; x > 0; x--)

... and immediately pop Clippy up (yes, the annoying paperclip from Office) who then says, "It looks like you're doing a decrementing loop count as a premature optimization--would you like me to help you out?" and promptly rewrites the code as...

// QUIT BEING STUPID, STUPID!

for (int x = 0; x < 10; x++)

... because the JVM and CLR actually understand clear code better than "hand-optimized" code, and therefore JIT it into better machine code.
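For the skeptical, a minimal sketch of the two loop styles side by side. The LoopDemo class and its method names are made up for illustration, and the sketch makes no timing claims; it only shows that the countdown form buys no difference in behavior, so there is no reason to pay for it in readability.

```java
class LoopDemo {
    // The "hand-optimized" form: walk the array from the end toward zero.
    static int sumCountingDown(int[] values) {
        int total = 0;
        for (int i = values.length - 1; i >= 0; i--) {
            total += values[i];
        }
        return total;
    }

    // The plain form: walk the array from zero to the end.
    static int sumCountingUp(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        // Both forms compute the same sum.
        System.out.println(sumCountingDown(values) == sumCountingUp(values));
    }
}
```

If you want to know which one is actually faster on your JVM, measure it with a real benchmark harness instead of guessing.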

And before any of those thirty-year crusty old curmudgeons start to stand up and shout "See? I told you young whippersnappers to start listening to me, we should have wrote it all in COBOL and we would have liked it!", let me be very quick to point out that years of experience in a developer are very subjective things--I've met developers with less than two years' experience that I would qualify as "senior", and I've met developers with more than thirty whom I wouldn't trust to code "Hello World".

Which, naturally, then brings up the logical question, "How do I know if I'm ready to start optimizing?" For our answer, we turn to that ancient Master, Yoda:

YODA: Yes, a Jedi's strength flows from the Force. But beware of the dark side. Anger, fear, aggression; the dark side of the Force are they. Easily they flow, quick to join you in a fight. If once you start down the dark path, forever will it dominate your destiny, consume you it will, as it did Obi-Wan's apprentice.
LUKE: Vader... Is the dark side stronger?
YODA: No, no, no. Quicker, easier, more seductive.
LUKE: But how am I to know the good side from the bad?
YODA: You will know... when you are calm, at peace, passive. A Jedi uses the Force for knowledge and defense, never for attack.

What he refers to, of course, is that most ancient of all powers, the Source. When you feel calm, at peace, while you look through the Source, and aren't scrambling through it looking for a quick and easy answer to your performance problem, then you know you are channelling the Light Side of the Source. Remember, a Master uses the Source for knowledge and defense, never for a hack.

(Few people realize that Yoda, in addition to being a great Jedi Master, was also a great Master of the Source. Go back and read your Empire Strikes Back if you don't believe me--most of his teaching to Luke applies to programming just as much as it does to righting evils in the galaxy.)

All humor bits aside, the time to learn about performance and JIT compilation is not the eleventh hour; spend some time cruising the Hotspot FAQ and the various performance-tuning books, and most importantly, if you see a result that doesn't jibe with your experience, ask yourself "why".


.NET | C++ | Conferences | Development Processes | Java/J2EE | Reading | Ruby | Windows | XML Services

Monday, January 15, 2007 2:26:29 PM (Pacific Standard Time, UTC-08:00)
 Wednesday, January 10, 2007
The Five Things Meme

Simon tagged me, so I suppose I have to do this or else be on the bad end of Bad Luck For Seven-and-a-half-Years or something like that. Here we go, five things you may not have known about me before now:

  1. Je parle francais, un peu. (I'm not sure how to get the French characters on my keyboard or in the blog, so those who speak French will have to pardon the lack of the appropriate accented characters.) Ein bisschen Deutsch, aussi.
  2. My degree is in International Relations, from the University of California at Davis. I took several Comp Sci classes while there, but stopped when I realized that my self-driven study of programming (thanks to Stroustrup's The C++ Programming Language and Coplien's Advanced C++ Patterns and Idioms) put me actually well ahead of most of the CS undergrad community there. I thought briefly about grad school, but when the Chair of the CS department at UCD told me he'd turn me down due to my B- in ECS 140A: Programming Languages (I had a really hard time trying to get the hang of Lisp), I decided not to bother.
  3. I'm an avid video-game gamer, dating back to the very early games in the 80's. My most prized accomplishment of that era? Flipping Galaga. (For those who don't know the term, it means gaining a score high enough--in this case, a million points--such that the display "flips" back to zero.) And these were in the days when it was one-quarter-one-game, none of this "play 'til you run out of money" approach first introduced by Gauntlet....
  4. I didn't grow my hair out until after I'd graduated high school. No, it wasn't a "rebellion thing", it was the plain realization that if I ever wanted it long, college was my last chance to do it, because clearly long hair wasn't acceptable in the big bad working world....
  5. Speaking of high school, back then everybody thought my first published book would be a Sci-Fi/Fantasy work. I was one of the founding members of the school's Young Author's Club, and had a series of short stories about an assassin for hire--really terribly written, as I look back at them now, modeled after Edward D. Hoch's Nick Velvet mystery stories from Ellery Queen's Mystery Magazine but without any of his style or panache. That said, however, writing has clearly been at the core of my career for some time, as my life has been (quite positively) affected by various technical authors:
    • The two technical authors I most wanted to meet (and consciously modeled my writing style after) were Don Box and Jeffrey Richter. I grew up on Advanced Windows NT and Windows 3.1: A Developer's Guide, and I was fascinated by Essential COM and Effective COM.
    • The one technical author I never thought I'd ever come close to, much less write a book for and meet in person, was Scott Meyers; Effective C++ and More Effective C++ were amazing, literally life-changing experiences. Had somebody told me, ten years ago, that I would not only have met Scott, but written an Effective book of my own, and be privileged enough to call him friend, I'd have told them they were out-of-their-minds nuts.
    • The book that most influenced my technical career had to be Paul DiLascia's Windows++, since his was the first book I'd come across that walked through the nitty-gritty of building a real C++ framework, and that in turn led me down the ultimately futile path of building my own cross-platform GUI framework (which in turn, in its half-baked form, proved to several employers that my C++ skills were for real, despite not having a degree in Computer Science).
    • But by far and away, the author who's had the most profound effect on my life was none other than Bjarne Stroustrup, who, when emailed by this fledgling author thinking about writing his first book, offered a cogent, three-page email response filled with advice and wisdom about embarking on the path of the technical author, all of which turned out to be spot-on accurate.
    Thanks, to all of you.

So, I'm in turn supposed to tag five others, but I'm going to hold off for now, until I get a better idea of who's been tagged and who hasn't. :-)




Wednesday, January 10, 2007 2:15:31 AM (Pacific Standard Time, UTC-08:00)
 Saturday, January 06, 2007
The First Major Patch/Feature/Change/Whatever to Javac7...

It's a new brand of property support, submitted by Remi Forax. Have a look, and let the huge language debates begin...

Personally, I like what he's done, but then again, I'm a fan of properties-as-first-class-citizens support, a la C#. I'm not so wild about introducing the keyword (I like the C# syntax), but I can understand where the C# syntax is deemed a bit cryptic to Java developers. Besides, Remi's done the Right Thing by not making property (or abstract property) an actual keyword, so we don't have accidental backwards incompatibility issues to worry about.
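For contrast, here is the accessor-method convention Java code has to fall back on without language-level properties. This is only a sketch: the Person and PropertyDemo classes are hypothetical, and Remi's proposed syntax is deliberately not reproduced here (see his post for that).

```java
// Hypothetical class using the JavaBeans getter/setter naming convention,
// which stands in for properties in today's Java.
class Person {
    private String name;

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }
}

class PropertyDemo {
    // Callers go through explicit method calls; in C#, the equivalent
    // would read as plain field-style access on a property.
    static String rename(Person person, String newName) {
        person.setName(newName);
        return person.getName();
    }

    public static void main(String[] args) {
        System.out.println(rename(new Person(), "Remi"));
    }
}
```

Multiply the getter/setter pair by every field on every class and the appeal of first-class property support is easy to see.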

Mind you, I sincerely doubt this is the final form it'll take in Java7, but this is encouraging--people are hacking on the compiler and producing concrete examples of ideas, not just ideas in limbo.

Hats off to you, Remi!


Java/J2EE

Saturday, January 06, 2007 4:53:15 AM (Pacific Standard Time, UTC-08:00)
 Friday, January 05, 2007
Interop Briefs: Check your politics at the door

(Originally appeared on TheServerSide, November 2006; I've made some edits to it since then.)

As we prepare to enter the holiday season here in the US, I think it’s time that we called for Peace on Earth. Or, at least, Peace in Computer Science.  

In 2000, when Microsoft first announced the .NET Framework (then called by various alternative names, such as the “Universal RunTime (URT)” or “COM3” or the “Component Object Runtime (COR)”), it was immediately hailed as the formal declaration of war on Sun and Java, if not an actual pre-emptive attack.

Within the industry, a schism already present was made deeper—developers were routinely asked “which side” they were on, whether they were supporters of “open” standards and “community-driven” development, or whether they were trying to support the evil corporate conglomerates. (I’ve since lost track of who’s supposed to be good or evil—Sun because they refused to release Java to an international standards body, IBM because they are trying to subvert Sun’s control over Java, Microsoft because they routinely “embrace and extend” open standards, or Oracle, because… well, just because.) I’m personally regarded as some kind of heretic and looney because not only do I routinely write code for both the Java and .NET platforms, but because I refuse to say, when asked, which one I like “better”.

You know what? I’m damn tired of these arguments. Can’t we all just get along and write software?

It’s not like these arguments really do much for our customers and clients. Truth be told, few of the people who use our software can even tell which platform the silly thing was written in, much less how it being written in Java will somehow make the world a more free (as in speech, as in beer, as in sex, whatever) place. Or that .NET somehow allows for multiple languages—generally speaking, the only language they care about is the one they speak and read and interact in. Most of the time, they’re just happy if they can *use* the software—remember, according to statistics routinely cited at conferences and presentations, half the time our customers never see software they’ve asked for, and when they do, it’s likely to be twice the budget costs originally anticipated, with half the features they originally asked for, in a user interface they don’t quite understand, even though it’s supposed to be “the latest greatest thing”.

This is progress?

Over the last five years, there’s been a quiet revolution under way, and it’s not the dynamic language revolution, nor the REST-HTTP-SOAP revolution, nor the agile revolution, nor AJAX. It’s not about containers or dependency injection or inversion of control or mock objects or unit testing or patterns or services or objects or aspects or meta-object protocols or domain-specific languages or model-driven architecture or any other fancy acronym and accompanying hype and marketing drivel. It’s a revolution of pragmatism, of customers and clients and others turning to developers and saying, “Enough is enough. I want software that works.”

“Works” here is a nebulous term, but before the Marketing goons start spinning the term to their best advantage, let’s clarify: “Works” is a simple term, as defined by our customers, not us. “Works” means runs in a manner that’s genuinely useful to our clients and customers. “Works” means it’s delivered close to on time and preferably under budget. (Nothing will ever make that utopian dream come true completely, so let’s be more realistic about the process—besides, *close* to on time and budget is a pretty good goal to shoot for right now, anyway.) “Works” means software that attaches itself to the existing mess we’ve made over the years, without having to rip out a whole bunch of servers and replace them with a whole bunch more. “Works” means taking what a customer has, in place, that already meets that definition, and tying the new stuff we’re building into that existing mess.

“Works” means, practically speaking, that we take the languages and tools that are available to us, and use them each to their advantage, regardless of political affiliation or perceived moral stance. That means taking Microsoft’s tools and technologies and tying them into Java’s, and vice versa. That means dropping the shrill rhetoric about how each is trying to “leverage” the other out of existence, and figuring out how to use them all together in a meaningful and technologically powerful way. That means recognizing that we are all one community, not little villages out in the countryside trying to beat each other into submission even as we try to scrape a living off the land.

Recently, I've picked up two books that I think typify my approach to programming in 2007, both by Larry Winget: "Shut Up, Stop Whining, & Get a Life", and his more recent follow-up, "It's Called Work For a Reason". In both, he points out that there is no "secret sauce", no "secret recipe" to success, and that for most of us, we already know what the Right Thing To Do is... we just don't want to accept it or admit it. I think that in a lot of ways, the debates over which platform to use and whose language is better are ways that we technologists avoid the much harder problem of dealing with customers. I think it's high time that we face that in the mirror, stop talking so much, and start listening more.

Abraham Lincoln, the man who had the unfortunate luck to preside over the United States during its most divisive era, once said, “A house divided against itself cannot stand.” Neither will ours, I fear, if we keep this up. Please check your politics at the door; here, we care only about how tools can be used to solve problems.


.NET | C++ | Java/J2EE | Ruby | Windows | XML Services

Friday, January 05, 2007 10:32:10 PM (Pacific Standard Time, UTC-08:00)
A Time for a Change

I've had The Blog Ride up for almost two years now, and it seems the latest fad is to change your blog title to match whatever your particular focus is at the moment. Given my tech predictions for 2007, and how I believe that interoperability is going to become a Big Deal (well, I guess in one sense it was already, but now I think it's going to become a Bigger Deal), and that hey, this is my schtick anyway, I've decided to rename the blog from "The Blog Ride" (which was kinda a lame name to begin with) to ...

Truth be told, I thought about squatting on Jason Whittington's old blog title ("Managed Space"), given that a lot of where my focus centers these days is around managed environments (Java and .NET, principally), but I didn't like that idea because (a) it was his idea first, and I don't like "me-too" kinds of faked creativity, and (b) I do a lot more than just managed code, so...

Welcome to "Interoperability Happens".

One of the things I've set as a resolution for the new year is to post some concrete interoperability tips (very similar to the ones I'd been posting to TechTarget's "tssblog" site) ranging across all sorts of interop topics, from XML services, to using the proprietary communication toolkits, to using IKVM, to some concrete examples (authenticating from Java against a Microsoft Active Directory or ADAM service, hosting Workflow inside of Spring, or writing Office Action Panes that talk to Java back-end servers, and so on) of interoperability "in the field". I won't promise that I'll have a new one up every other week or so, but that's the goal. And the interop hopefully won't be limited to just Java and .NET; I plan to start exploring the Java/Ruby and .NET/Ruby interop space, as well as other pairings (Python, Tcl, maybe a few other languages or environments, like perhaps Parrot) that appeal to me. (That said, I've got a list of about 20 or 30 or so topics on just Java/.NET, so any delays or significant pauses aren't for lack of material or ideas.)

And if there's any particular interoperability topic or question you've got, you know how to reach me.

Catch ya around in 2007.


.NET | C++ | Development Processes | Java/J2EE | Ruby | Windows | XML Services

Friday, January 05, 2007 2:04:53 PM (Pacific Standard Time, UTC-08:00)
 Thursday, January 04, 2007
Warning: XSS attack in PDF URLs

Just heard this through the OWASP mailing list, and it's a dandy:

I wanted to give everyone all a heads-up on a very serious new application security vulnerability that probably affects you. Basically, any application that serves PDF files is likely to be vulnerable to XSS attacks.

Attackers simply have to add an anchor containing a script, e.g. add #blah=javascript:alert(document.cookie); to ANY URL that ends in .pdf (or streams a PDF). The browser hands off the anchor to the Adobe reader plugin, and the script then runs in the victim’s browser.

You can find more information here: http://www.gnucitizen.org/blog/universal-pdf-xss-after-party/

You can protect yourself by upgrading your browser and Adobe Reader. There are many vulnerable browser/plugin combinations in use, including Firefox. However, IE7 and IE6 SP2 do not appear vulnerable.

Protecting the users of your application from attack is more difficult. This problem is entirely in the browser and the Adobe reader. The anchor is not even passed from the browser to the web application, so there’s really not much you can do in your code to detect an attack. You could stop serving PDF documents or move them to a different server, but that’s not realistic for many organizations.

Jeff Williams, Chair, The OWASP Foundation

Now, a couple of thoughts come to mind:
  1. First and foremost, if your application serves PDFs, make sure your clients know to upgrade to the latest Acrobat version, since that seems (based on how I read the above) to be protected against the XSS attack; if it's not, though, Adobe will fix it soon (I would hope, anyway), and thus you'll be back to making sure your clients know to upgrade to the latest Acrobat version.
  2. Secondly, this is technology-agnostic, so regardless of your platform (Java, .NET or Rails), you're vulnerable. (Such is always the case with XSS attacks.)
  3. How many developers will actually take steps to try and prevent it (such as, for example, ensuring that PDF URLs received aren't trailing any fragments before sending the URL request on for Adobe to process)?
  4. How long before somebody figures out a way to make this all Microsoft's fault? Will this gather any press coverage, and if it does, will they note that IE 6 SP2 and IE 7 don't seem to be affected by the attack? Will Slashdot even bother with a footnote? (My best guess would be, 1 week, yes, no, and no, respectively.)
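As a rough illustration of the fragment check mentioned in the third point, here is a minimal sketch using the JDK's java.net.URI. The PdfLinkCheck class and its method names are hypothetical, and it assumes your code actually gets to see the full URL string (when generating or rewriting links, say); as the advisory points out, the browser never sends the fragment to the web application, so this is no server-side cure.

```java
import java.net.URI;

class PdfLinkCheck {
    // True when the URL's path ends in .pdf and the URL carries a fragment.
    // Note: URI.create throws IllegalArgumentException on a malformed URL.
    static boolean isPdfWithFragment(String url) {
        URI uri = URI.create(url);
        String path = uri.getPath();
        return path != null
                && path.toLowerCase().endsWith(".pdf")
                && uri.getRawFragment() != null;
    }

    // Drops everything from the first '#' onward, leaving the rest untouched.
    static String stripFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    public static void main(String[] args) {
        // example.com is a placeholder host, not one named in the advisory.
        String suspect = "http://example.com/docs/report.pdf"
                + "#blah=javascript:alert(document.cookie);";
        if (isPdfWithFragment(suspect)) {
            System.out.println("cleaned: " + stripFragment(suspect));
        }
    }
}
```

This only catches URLs whose path ends in .pdf; streamed PDFs served from other paths would need a different test, such as checking the response content type.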

.NET | C++ | Java/J2EE | Ruby | Windows | XML Services

Thursday, January 04, 2007 2:43:17 PM (Pacific Standard Time, UTC-08:00)
 Wednesday, January 03, 2007
2006 Tech Predictions: A Year in Hindsight

OK, time to face the music and look back at my predictions from last year:

  1. The hype surrounding Ajax will slowly fade, as people come to realize that there's really nothing new here, just that DHTML is cool again. As Dion points out, Ajax will become a toolbox that you use in web development without thinking that "I am doing Ajax". Just as we don't think about "doing HTML" vs "doing DOM". Well, much as I might have wanted this to take place, it doesn't seem to have happened--Ajax is as much a buzzword as it was in 2005, if not more so. In fact, it now seems to have grown to the same buzzwordy status as "Web 2.0", in that we're starting to lose sight of it as its acronym originally defined it to be: Asynchronous Javascript And XML. Now people are talking about using JSON, about using it synchronously, and... hey, it's just a matter of time before somebody points out the flaws in Javascript and starts suggesting other dynamic languages for the browser....
  2. The release of EJB 3 may actually start people thinking about EJB again, but hopefully this time in a more pragmatic and less hype-driven fashion. (Yes, EJB does have its place in the world, folks--it's just a much smaller place than most of the EJB vendors and book authors wanted it to be.) Hah. Fat chance. Though the EJB-bashing wave has slipped to an all-time low, it seems, it's still ready to rear its ugly head any time somebody suggests that there might be something about EJB that doesn't suck. Still, the luster is starting to wear off on Spring, which means that (a) people are starting to look at it critically, rather than taking it for granted as a media darling, and (b) people will start to re-evaluate EJB as a viable technology rather than just demonize it. Maybe.
  3. Vista will be slipped to 2007, despite Microsoft's best efforts. In the meantime, however, WinFX (which is effectively .NET 3.0) will ship, and people will discover that Workflow (WWF) is by far the more interesting of the WPF/WCF/WWF triplet. Notice that I don't say "powerful" or "important", but "interesting". Here we go: did Vista ship, or not? Officially, Vista was released to manufacturing (RTM'ed), but it's not available to consumers yet, and won't be until later this month or next. WinFX... er, I mean .NET 3.0... er, I mean NetFX3... whatever... shipped at the same time Vista did, though, and developers in the .NET space are beginning to hear more about this thing called "Workflow". It's still a mystery to most, I think, but then so is WCF.
  4. Scripting languages will hit their peak interest period in 2006; Ruby conversions will be at their apogee, and it's likely that somewhere in the latter half of 2006 we'll hear about the first major Ruby project failure, most likely from a large consulting firm that tries to duplicate the success of Ruby's evangelists (Dave Thomas, David Geary, and the other Rubyists I know of from the NFJS tour) by throwing Ruby at a project without really understanding it. In other words, same story, different technology, same result. By 2007 the Ruby Backlash will have begun. Has the Ruby backlash begun? Hard to say--certainly there are those who've been rolling out Rails apps that have found problems with deploying Rails, but for now Rails--and thus Ruby--remain the media darling. Maybe by 2008.
  5. Interest in building languages that somehow bridge the gap between static and dynamic languages will start to grow, most likely beginning with E4X, the variant of ECMAScript (Javascript to those of you unfamiliar with the standards) that integrates XML into the language. Bah--this was an easy one to call. E4X hasn't yet really gained a lot of traction, but that may be because nobody's really talking about it or writing about it. That part might just require more time, or it may never happen--depends on how badly developers want an easier way to work with XML. Suffice it to say, we'll see lots of E4X-like features show up in other languages as we go; some have already shown up in other languages, such as Flex's ActionScript, for example.
  6. Java developers will start gaining interest in building rich Java apps again. (Freely admit, this is a long shot, but the work being done by the Swing researchers at Sun, not least of which is Romain Guy, will by the middle of 2006 probably be ready for prime-time consumption, and there's some seriously interesting sh*t in there.) Well, you can ask Scott Delap if you're not convinced, but certainly there's been a growing interest in building Eclipse RIAs. Swing (justifiably or not) still remains in the doghouse, however.
  7. Somebody at Microsoft starts seriously hammering on the CLR team to support continuations. Talk emerges about supporting it in the 4.0 (post-WinFX) release. I have no empirical or anecdotal proof, but the rumors abound...
  8. Effective Java (2nd Edition) will ship. (Hardly a difficult prediction to make--Josh said as much in the Javapolis interview I did with him and Neal Gafter.) Whoops. Apparently Josh is busy.
  9. Effective .NET will ship. Pragmatic XML Services will ship. Whoops. Apparently I was busy, too.
  10. JDK 6 will ship, and a good chunk of the Java community's self-proclaimed experts and cognoscenti will claim it sucks. It did ship, and many did claim it sucks. The coolness of JSR 223 (the scripting support) definitely worked to offset a lot of the cries-of-suckiness, though the last-second dropping of the data-mapping capabilities specified in JDBC 4.0 (WTF, Sun?!?) caught a lot of us by (unhappy) surprise. It also raises the question as to the efficacy of the JCP documents when Sun feels completely comfortable changing them at the Very Last Second....
  11. Java developers will seriously begin to talk about what changes we want/need to Java for JDK 7 ("Dolphin"). Lots of ideas will be put forth. Hopefully most will be shot down. With any luck, Joshua Bloch and Neal Gafter will still be involved in the process, and will keep tight rein on the more... aggressive... ideas and turn them into useful things that won't break the spirit of the platform. Well, witness the closures debate between Josh on the one hand, and Neal on the other, and you can clearly see that they're still involved in the process, though not in the manner I'd envisioned. That said, though, the JDK 7 discussions are already ramping up; look for an interview I did with Neal Gafter at Javapolis this year to show up on Parleys.com in the very near future, in which we talked about this exact subject. Some interesting ideas will emerge out of this debate, both for JDK 7 and releases beyond...
  12. My long-shot hope, rather than prediction, for 2006: Sun comes to realize that the Java platform isn't about the language, but the platform, and begin to give serious credence and hope behind a multi-linguistic JVM ecosystem. Wow. Witness the acquisition of the JRuby pair by Sun, and the scripting support in JDK 6, and maybe, just maybe, I can claim a point on this one.
  13. My long-shot dream: JBoss goes out of business, the JBoss source code goes back to being maintained by developers whose principal interest is in maintaining open-source projects rather than making money, and it all gets folded together with what the Geronimo folks are doing. In other words, the open-source community stops the infighting and starts pulling oars in the same direction at the same time. For once. Well, you can't win them all.
Not sure how that leaves the score, but there you go....


.NET | C++ | Java/J2EE | Ruby | Windows | XML Services

Wednesday, January 03, 2007 1:43:01 AM (Pacific Standard Time, UTC-08:00)
 Sunday, December 31, 2006
Lack of power makes it really hard to work, even on a laptop...

Originally, I was going to post this the weekend just before Christmas, but the power outage struck back, and I was forced to hang on to it for a while longer, until I finally had a chance to post (which is now). Thanks to all those who expressed concern and support through the outage; the worst that happened to us, overall, was the loss of recharging ability, which is a killer when you live on laptops and GameBoys...

For those who've been following the news of the storm that just hammered the Seattle and Eastside area this weekend, yes, I was one of those million-or-so without power, and as I write this, we're still without power. (Not sure if it's because the damage is so widespread or because the power company is being a big corporation--I'm sure the politicos will weigh in on that soon enough.) For those who've been wondering why I'm so slow on email this weekend, now you can probably guess why... And to the rest, yes, I and the family are fine, just really missing working electrical outlets to recharge laptops and GameBoys...




Sunday, December 31, 2006 9:20:53 PM (Pacific Standard Time, UTC-08:00)
Tech Predictions: 2007 Edition

So, in what's become an ongoing tradition, this is the time of year when I peer into the patented Ted Neward Crystal Ball (TM) (operators are standing by!), see what it tells me about technology trends and ideas for the coming year, and report them to you. The usual disclaimers apply, meaning I'm not getting any sort of endorsement deals to mention anybody's technology here, I'm not speaking for anybody but myself in this, and so on. And, in order to prove that I'm not an analyst group like Forrester or Burton or any of those other yahoos, in a separate post, I'll look over my predictions for 2006 and see how they panned out, thus proving that the patented Ted Neward Crystal Ball (TM) is just as capable of mistakes as any other crystal ball... er, I mean, of course, right all the time. :-)

2006 was an interesting year, in that a lot of interesting things happened this year for developers. For the .NET crowd, Visual Studio 2005 and SQL Server 2005 finally became widely available to them (yes, they shipped in 2005 but it took a bit for them to percolate through the community), and NetFX 3 (aka .NET 3.0, aka Indigo/Avalon/Workflow) shipped in Q4, not to mention Vista itself, meaning there was all kinds of new stuff to play with. For the Java crowd, Spring 2.0 shipped, Geronimo 1.0 shipped, and Sun decided to finally open the doors on the JDK (apparently not realizing that a lot of us had already slipped in the back way through the doors marked "SCSL license" and "JRL license" since JDK 1.2...). Meanwhile, Ruby continued to amaze those who'd never seen a dynamic/scripting language before, and Rails continued to amaze developers who'd never seen a VB demo before. More WS-* specs shipped, people started talking about JavaScript Object Notation (JSON), RSS/Atom continued to draw attention in droves, and marketing guys looked for all kinds of places they could hang the Tim O'Reilly-inspired "Web 2.0" meme. And yet, through it all, developers somehow ignored the noise and kept working.

Without further ado...

  • General: Analysts will call 2007 the Year of the {Something}, where I bet that {Something} will be either "ESB" or "SOA". They will predict that companies adopting {Something} will save millions, if not billions, if only they rush to implement it now. They will tag this with a probability of .8 in order to CYA in case {Something} doesn't pan out. (Yes, I've read far too many of these reports--I'm personally convinced that each of the analyst companies has a template buried away in their basement that they pull out each time they need a new one, and they just do a global search-and-replace of "{Something}" with whatever the technology du jour happens to be.)
  • .NET: Thousands of developers will horribly abuse WPF in ways that can only be called nightmarish, thus once again proving the old adage that "just because you can doesn't mean you should" still holds. WPF's capabilities with video will prove, in many ways, to be the modern equivalent to the "blink" tag in HTML. This will provide some author with a golden opportunity: "WPF Applications That Suck". Alan Cooper will re-release "About Face", updated to include WPF UI elements.
  • .NET: Thousands of developers will look to Redmond for an answer to the question, "Which should I use? BizTalk, Windows Workflow, or SQL Server Service Broker?", and get no clear answer.
  • Windows: Microsoft will try, once again, to kill off the abomination that was called the Windows 95/98/Me line of operating systems, and will once again have to back off as industry outcries of protest (on behalf of little old ladies who are the only ones left running Windows 95/98/Me and probably haven't turned their machine on in months, at least not since the grandkids last visited) go ballistic.
  • Windows: Ditto for Visual Basic 6.0, except now the outcry will be on behalf of developers who aren't capable of learning anything new. Sun will use the resulting PR to announce Project YAVKRWMITT (Yet Another VB Killer Really We Mean It This Time, pronounced "YAV-kermit") on java.net. Meanwhile, efforts to make CLASSPATH into something a VB 6 guy actually has a prayer of understanding will go quietly ignored.
  • Java: JSR 277 will continue to churn along, and once the next draft ships, publicly nobody will like what we produce, though quietly everybody will admit it's a far cry better than what we have now, and when it ships in JDK 7 will be adopted widely and quietly.
  • Java: Thousands of new ideas and proposals to extend the Java language in various ways will flood into the community, now that developers can start hacking on it for themselves thanks to the OpenJDK. Only a small fraction of these will ever get beyond the concept stage, and maybe one or two will actually be finished and released to the Web for consideration by the community and the JCP. Thousands more Java developers craving Alpha-Geek status will stick a "Hello, world" message into the compiler's startup sequence, then claim "experienced with modifying the OpenJDK Java compiler" on their resume and roundly criticize Java in one way or another by saying, "Well, I've looked at the code, and let me tell you....".
  • .NET: Somewhere, a developer will realize that SQL Server 2005 can be a SOAP/WSDL XML service endpoint, and open it up as a private back-channel for his application to communicate with the database through the firewall "for performance reasons" (meaning, "So I can avoid having to talk to the app server in between my web server and my database"). With any luck, the DBA will kill him and hide the body before anybody can find and exploit it.
  • General: Yet Another Virus That's Microsoft's Fault will rip through the Internet, and nobody will notice that the machines affected are the ones that aren't routinely administered or receive updates/patches. Companies will threaten Microsoft with million-dollar lawsuits, yet will fire none of their system administrators who lovingly lavish whole days tuning their Linux IRC servers yet leave the Windows Exchange Server still running Windows NT 4.0.
  • General: Interest in JSON will escalate wildly, hyped as the "natural replacement for XML" in building browser-to-server connections, owing to its incredible simplicity in expressing "object" data. Folks, JSON is a useful format, but it's not a replacement for XML (nor is XML a replacement for it, either). What made XML so popular was not its hierarchical format (Lord above, that's probably the worst part of it, from where we as developers sit), nor its HTML-like simplified-SGML syntax. What made XML interesting was the fact that everybody lined up behind it--Microsoft, Sun, BEA, Oracle, IBM, there's not a big vendor that didn't express its undying love and devotion to XML. I sincerely doubt JSON will get that kind of rallying effect. (And if you're going to stand there and suggest that JSON is better because it's simpler and therefore more approachable for developers to build support for themselves, quite honestly, I thought we were trying to get out of developers building all this communications infrastructure--isn't that what the app servers and such taught us?)
  • General: Interest in Java/.NET interoperability will rise as companies start to realize that (a) the WS-* "silver bullet" isn't, (b) ESB, XML, and SOA are just acronyms and won't, in and of themselves, solve all the integration problems, and (c) we have lots of code in both Java and .NET that needs to talk to each other. This may be a self-serving prediction, but I got a LOT of interest towards the end of this year in the subject, so I'm guessing that this is going to only get bigger as the WS-* hype continues to lose its shine in the coming years.
  • Ruby: Interest in Java/Ruby and .NET/Ruby interoperability is going to start quietly making its presence felt, as people start trying to wire up their quick-to-write "stovepipe" Rails apps against other systems in their production data center, and find that Ruby really is a platform of its own. RubyCLR or JRuby may be part of the answer here, but there are likely some hidden mines there we haven't seen yet.
  • Languages: A new meme will get started: "JavaScript was that thing, that little toy language, that you used to do stuff in the HTML browser. ECMAScript, on the other hand, is a powerful and flexible dynamic programming language suitable for use in all sorts of situations." Pass it on. If you get it, don't tell anybody else. (Don't laugh--it worked for "The Crying Game".) It's the only way JavaScript, er, ECMAScript will gain widespread acceptance and shed the "toy" label that JavaScript has.
  • Languages: Interest in functional-object hybrid languages will grow. Scala, Jaskell, F#, and others not-yet-invented will start to capture developers' attention, particularly when they hear the part about functional languages being easier to use in multi-core systems because they encourage immutable objects and discourage side effects (meaning we don't have to worry nearly so much about writing thread-safe code).
  • Languages: Interest in Domain-specific languages will reach a peak this year, but a small backlash will begin next year. Meanwhile, more and more developers will realize that one man's "DSL" is another man's "little language", something UNIX has been doing since the early 70's. This will immediately take the shine off of DSLs, since anything that we did in the 70's must be bad, somehow. (Remember disco?)
  • General: Rails will continue to draw developers who want quick-fix solutions/technologies, and largely that community will ignore the underlying power of Ruby itself. The draw will start to die down once Rails-esque feature ideas get folded into Java toolkits. (Rails will largely be a non-issue with the .NET community, owing to the high-productivity nature of the drag-and-drop interface in Visual Studio.)
  • Java: Interface21 is going to start looking like a "big vendor" alongside BEA and IBM. I was talking with some of the I21 folks in Aarhus, Denmark at JAOO, and one of them casually mentioned that they were looking at a Spring 2.1 release somewhere in mid-2008. Clearly Spring is settling into eighteen-month major-version release cycles like all the big (meaning popular), established software systems have a tendency to do. This is both a good thing and a bad thing--it's good in that it means that Spring is now becoming an established part of the Java landscape and thus more acceptable to use in production environments, but it's bad in that Spring is now going to face the inevitable problem all big vendors face: trying to be all things to all people. This is dangerous, both for Interface21 and the people relying on Spring, largely because it means that Spring faces a very real future of greater complexity (and there are those, myself included, who believe that Spring is too complex already, easily on par with the complexity seen in EJB, POJOs notwithstanding).
  • General: Marc Fleury will get a golden parachute from Red Hat (at their request and to their immense relief), and hopefully will retire to his own small island (might I suggest Elba, le petit caporal?) to quietly enjoy his millions. A shame that the people who did most of the real work on JBoss won't see a commensurate reward, but that's the way the business world works, I guess.
  • General: Some company will get millions to build an enterprise product on the backs of RSS and/or Atom, thus proving that VCs are just as stupid and just as vulnerable to hype now as they were back in the DotCom era.
  • General: Somebody will attempt to use the phrase "Web 2.0" in a serious discussion, and I will be forced to kill them for attempting to use a vague term in a vain effort to sound intelligent.
  • Web clients: Ajax will start to lose its luster when developers realize the power of Google Maps isn't in Ajax, but in the fact that it's got some seriously cool graphics and maps. (Or, put another way, when developers realize that Ajax alone won't make their apps as cool as Google Maps, that it's the same old DHTML from 1998, the hype will start to die down.)
  • XML: Somebody, somewhere, will realize that REST != HTTP. He will be roundly criticized by hordes of HTTP zealots, and quietly crawl away to go build simpler and more robust systems that use transports other than HTTP.
  • XML: Somebody, somewhere, will read the SOAP 1.2 specification. H.P. Lovecraft once suggested, loosely paraphrased, that the day Man understands the nature of the universe, he will either be driven into gibbering insanity, or flee back into ignorance in self-preservation. Ditto for the day Man reads the SOAP 1.2 spec and realizes that SOAP is, in fact, RESTful.
  • Security: The US Government will continue its unbelievable quest to waste money on "security" by engaging in yet more perimeter security around airports and other indefensible locations, thus proving that none of them have bothered to read Schneier and learn that real security is a three-part tuple: prevention, detection, and response.
  • Security: Thousands of companies will follow in the US Government's footsteps by doing exactly the same thing. (Folks, you can't solve all your problems with cryptography, no matter how big the key size--you just end up with the basic problem of where to store the keys, and no, burying them inside the code isn't going to hide them effectively.)
  • Security: More and more rootkits-shipping-with-a-product will be discovered. We used to call it "getting close to the metal", now it's a "rootkit". With great power comes great responsibility... and, as many consumers have already discovered, with great power also comes a tendency to create greater instability...
  • General: Parrot will ship a 1.0 release. Oh, wait, hang on, sorry, I bumped into the crystal ball and accidentally set it to 2017.
  • .NET: Microsoft will ship Orcas (NetFX 3.5). (Sorry, crystal ball's still set on 2017. Trying to fix it...)
  • .NET: Vista will surpass Windows XP in market penetration. (Let's see, almost got it set back to 2007, bear with me... There. Got it.)
  • General: I will blog more than I did this year. (Hell, I couldn't blog less, even if I tried.)
  • General: Pragmatic XML Services, Pragmatic .NET Project Automation and Effective .NET will ship. (Wait, is the crystal ball still on 2017...?)
Same time, next year....


.NET | C++ | Java/J2EE | Ruby | Windows | XML Services

Sunday, December 31, 2006 9:14:58 PM (Pacific Standard Time, UTC-08:00)
Comments [1]
 Friday, December 01, 2006
Follow-up on the Java Generics post

A number of folks emailed me with comments and ideas following the post on Java5's generics model. In no particular order...

John Spurlock wrote,

Interesting scenario, I wasn't able to come up with a warning-free solution either - but had some fun trying. I wonder if your compiler of choice makes a difference? I seem to remember Eclipse's JDT compiler having subtle differences from Sun's in regards to edge-case generics/casting scenarios (Sun's being more strict and giving more warnings).

The c# analogue is trivial, although the client code seems unnecessarily verbose (does not/will not infer the "item-type" afaik) In general, the c# compiler seems overly conservative in regards to type inference, forcing explicit type parameters far too often (anonymous parameterized delegates are the biggest offender). Also the fact that Type is not parameterized makes it impossible to pass standard arguments a la the java example.

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

namespace ConsoleApplication1
{
  class Program
  {
    static void Main(string[] args)
    {
      List<DateTime> listOfDates = GetSomethingOf<List<DateTime>, DateTime>();
      Console.WriteLine(listOfDates.Count);

      Collection<DateTime> collectionOfDates = 
        GetSomethingOf<Collection<DateTime>, DateTime>();
      Console.WriteLine(collectionOfDates.Count);

      LinkedList<DateTime> linkedListOfDates = 
        GetSomethingOf<LinkedList<DateTime>, DateTime>();
      Console.WriteLine(linkedListOfDates.Count);

      Dictionary<DateTime, DateTime> dictionaryOfDatePairs = 
        GetSomethingOf<Dictionary<DateTime, DateTime>, KeyValuePair<DateTime,DateTime>>();
      Console.WriteLine(dictionaryOfDatePairs.Count);

      List<List<String>> listofListsOfDates = 
        GetSomethingOf<List<List<String>>, List<String>>();
      Console.WriteLine(listofListsOfDates.Count);

      Console.ReadLine();
    }

    static C GetSomethingOf<C, T>()
      where C : ICollection<T>, new()
      where T : new()
    {
      C rt =  new C();
      rt.Add(new T());
      return rt;
    }
  }
}

Thanks, John.

Adam Vanderburg wrote:

In C# you'd add the "new()" constraint to the generic types (both the collection and item), then you can just "new T()" them. In fact, one of the frustrating things about C# 2.0 is that you can require a parameterless constructor, but you can't require Constructors with specifically typed parameters. (Rumor has it that the underlying IL supports it, just not the C# 2 compiler.)
Yep, Adam, that's exactly what John demonstrates above (in case any non-C# programmers were wondering what that "where T : new()" syntax was coming from). As to whether the IL supports constructors with specifically typed parameters, I have to admit I don't know the answer to that one, and don't have time at the moment to find out--maybe Serge Lidin will read this blog entry and email me with an answer that I can post in a future blog entry (or comment, once I get comments re-enabled after doing a dasBlog upgrade to prevent all the crap comment and pingback/trackback spam I've been getting on here).

Next, Matt Tucker wrote:

In regard to your article on Java5 generics, I had a couple comments for you:

First of all, I'm not disputing the contention that Java generics leave something to be desired. Cool as they are, there are clearly some bits missing from the implementation.

No arguments there, Matt, but mostly my argument is that Java's generics model leaves something to be desired entirely because it supports a model of type erasure, rather than persisting the parameterized type directly into the JVM bytecode level. Doing that would have complicated the JVM a fair bit, but would have (a) allowed other languages to take advantage of generics, (b) preserved type-safety even in the face of Reflection, and (c) allowed for JIT compiler optimizations given the known parameterized type. None of these are possible in a type-erasure-based model. Why did Sun choose this approach? I won't speak authoritatively here, but my guess is because it represented too drastic a change for the language/platform at this point in Java's lifetime. (Whether that's true or not, or a good decision to have made, is entirely up to you to judge for yourself.)
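
To make point (b) a little more concrete, here is a minimal sketch of how erasure lets Reflection slip an Integer into a List<String>. The class name ErasureDemo and its helper methods are made up for illustration, and the multi-catch on ReflectiveOperationException assumes Java 7 or later. Nothing complains at the point of insertion; the failure only shows up later, when the element is read back as a String.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both parameterizations share one runtime class: the type argument is erased.
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    // Hypothetical helper: uses Reflection to add an Integer to a List<String>.
    static List<String> pollute() {
        List<String> strings = new ArrayList<String>();
        try {
            // After erasure, add(E) is simply add(Object), so any argument is accepted.
            Method add = List.class.getMethod("add", Object.class);
            add.invoke(strings, Integer.valueOf(42));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return strings;
    }

    // Reading the element back as a String trips the cast the compiler inserted.
    static boolean readFails(List<String> strings) {
        try {
            String first = strings.get(0);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(readFails(pollute()));
    }
}
```

A reified model would have had the information needed to reject the reflective add() itself, rather than leaving the problem to surface at some distant read.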
Secondly, the code you had posted in your blog (at 4:00p at least) wouldn't compile. The issue was in the "external" class, and I ended up changing it to:
public static class external { 
  public static <C, T extends Collection<C>> T getSomethingOf(Class<T> type, Class<C> contentType)
    throws Exception { 
    T result = type.newInstance(); 
    result.add(contentType.newInstance());
    return result;
  } 

  public static Set<Date> getSetOfDate() 
    throws Exception { 
    return getSomethingOf(HashSet.class, Date.class); // warning 
  } 
}
This still shows a warning on the line in question, but dispenses with the cast of the HashSet class, and rearranges the generic specification a bit.

My next point, which isn't a major one, is that in all of the examples you're doing collection.addAll(Arrays.asList(<single genericized item>)), which causes Java to complain that it's trying to dynamically create an array for an unknown (generic) type. Since that array has one item in it, and the array itself is going to be thrown away, why not do collection.add(contentType.newInstance()) and dispense with the arrays entirely? If you're worried about doing it in a loop, does it really make sense to create and fill an entire array in a loop, add all its contents to the collection in one operation, and then throw it away?

As for the issue itself, I'd say that while none of the implementations are clean, the last one ("internal") is probably the best from the standpoint that it's at least providing a clean API. Sure, there's some casting and warnings going on inside there, but at least users of the code (ie, getSetOfDate) don't have to mess with it. And the warnings are there to remind you that you need to be careful about what you're doing. Since you have a collection that's only supposed to hold C's, and since you're only creating C's and putting them in the collection, it theoretically *should* be fine. Of course, what's the point of having to deal with a statically typed language if all you can get out of it is "should"?

All are viable points, Matt, and unfortunately I can't answer any questions regarding the intent of the code or why the choice for arrays; as I mentioned, this was code presented to me by an attendee at an NFJS conference, looking for an answer to get rid of the warnings generated by the various options. As to why the code wouldn't compile, that's likely a typo on my end--while we were working with it, we were in Eclipse at the time, and no compiler errors were reported at the time, so I have to assume I fat-fingered the code somewhere along the way.

Then, Bob Lee a.k.a. crazybob wrote to say:

The problem is you're trying to create an instance of a generic type from a Class. Class instances can only represent raw types. For example, List<?>, List<String> and List all share the same Class instance.

Your options are A) use a callback instead of a class instance:

interface CollectionFactory<C extends Collection<?>> {
  C newInstance();
}

static <T, C extends Collection<T>> C 
  getSomethingOf(CollectionFactory<C> collectionFactory, Class<T> elementType)
  throws Exception {
  C c = collectionFactory.newInstance();
  c.addAll(Arrays.asList(elementType.newInstance ()));
  return c;
}

public static Set<Date> getSetOfDate() throws Exception {
  return getSomethingOf(new CollectionFactory<Set<Date>>() {
    public Set<Date> newInstance() { 
      return new HashSet<Date>();
    }
  }, Date.class);
}
Or B) suppress the warning.

In the example above, we eliminated the warnings when we got rid of the Class representing the collection type, but we left the Class representing the element type in. This isn't a problem in the example above, but if we called getSomethingOf() with a generic element type, the code won't compile. Again, our only options would be to live with warnings or use a callback.
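
Bob's last point can be sketched as follows. This is an illustrative variant, not Bob's code: it assumes Java 8 or later so that java.util.function.Supplier can stand in for a hand-written callback interface, and the names CallbackDemo and listOfLists are invented here. With a callback for the element as well as for the collection, a generic element type such as List<String> goes through without casts.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

class CallbackDemo {
    // Both the collection and its element come from callbacks, so no Class
    // object (and therefore no raw type) is involved.
    static <T, C extends Collection<T>> C getSomethingOf(
            Supplier<C> collectionFactory, Supplier<T> elementFactory) {
        C c = collectionFactory.get();
        c.add(elementFactory.get());
        return c;
    }

    static Set<Date> getSetOfDate() {
        return getSomethingOf(() -> new HashSet<Date>(), () -> new Date());
    }

    // A generic element type, which a Class<T> parameter cannot express.
    static List<List<String>> listOfLists() {
        return getSomethingOf(
                () -> new ArrayList<List<String>>(),
                () -> new ArrayList<String>());
    }
}
```

The price is the same one Bob's version pays: every caller has to spell out how to build the thing, instead of just naming a class.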

Thanks, Bob, though I'm not sure I like the solution. Bob also pointed out (over IM) that Angelika Langer has a great FAQ on generics off of her website; were I not such a lazy person, I'd link to it directly from here (and will later, when I'm online), but for now, Google on "Langer generics FAQ" and you're feeling lucky....

Finally, Rafael de F. Ferreira wrote:

Hello. I came up with the following ugly hack:
import java.util.*;
import org.junit.Test;

public class WithNewClass {
  public static <T, C extends Collection<? super T>> 
    C getSomethingOf(Class<C> type, Class<T> contentType) 
    throws Exception
  {
    C res = type.newInstance();
    res.add(contentType.newInstance());
    return res;
  }

  public static Set<Date> getSetOfDate() 
    throws Exception 
  {
    class HSD extends HashSet<Date> {};
    Class<? extends Set<Date>> cls = HSD.class;
    return getSomethingOf(cls, Date.class);
  }

  @Test public void printSetOfDate() throws Exception {
    Set<Date> newset = getSetOfDate();
    System.out.println(newset);
  }

}
It compiles without warnings in Eclipse, but I hope someone knows a better solution. Creating a class just to capture type arguments seems like a kludge.
It is, Rafael, but it's an interesting and useful kludge, nonetheless.
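
One reason the kludge works: a HashSet<Date> instance forgets its type argument at runtime, but a class declared as "extends HashSet<Date>" records that argument in its class metadata, where Reflection can read it back. A minimal sketch, in which the names TypeCaptureDemo and capturedElementType are illustrative and HSD mirrors Rafael's local class:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Date;
import java.util.HashSet;

class TypeCaptureDemo {
    // Same trick as above: a named subclass that pins the type argument.
    static class HSD extends HashSet<Date> {}

    // Reads the first type argument back out of the class's generic superclass.
    // Assumes cls directly extends a parameterized type, as HSD does.
    static Type capturedElementType(Class<?> cls) {
        ParameterizedType superType = (ParameterizedType) cls.getGenericSuperclass();
        return superType.getActualTypeArguments()[0];
    }

    public static void main(String[] args) {
        System.out.println(capturedElementType(HSD.class));
    }
}
```

Passing a class that does not directly extend a parameterized type would make the cast to ParameterizedType fail, so treat this as a demonstration rather than a general-purpose utility.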

Thanks to all five of you for your comments; much as I dislike the generics system that Java ended up with, the sooner we learn to work with it and account for its... quirks, shall we say... the better.


Java/J2EE

Friday, December 01, 2006 7:49:11 AM (Pacific Standard Time, UTC-08:00)
Comments [0]
 Tuesday, November 28, 2006
Java5, generics, and "just not quite there"

So an attendee comes up to me at one of the past NFJS shows, with this challenge:

The implementation does not know what parametrized Iterable class will be used. The Iterable class will need to know what class it contains. Interfaces are passed to the factory and it calls a lookup to identify (or create) the implementing class. Can this be done without causing a compile warning?
// usage:
        Seq<Item> items = factory.createBean(null, Seq.class, Item.class);

// interface:
    public abstract <T> T getBean(String localName, Class<T> javaClass,
            Type... typeArguments);

// impl:
    public <T> T createBean(String localName, Class<T> javaClass, Type... typeArguments) {
        Resource resource = createResource(localName);
        Collection<String> rdfTypes = findRdfTypes(javaClass);
        for (String rdfType : rdfTypes) {
            addStatement(resource, RDF.TYPE, createResource(rdfType));
        }
        T bean = rdfBeanFactory.createBean(this, resource, rdfTypes, javaClass);
        if (typeArguments != null && bean instanceof RdfParameterizedBean)
            ((RdfParameterizedBean)bean).setActualTypeArguments(typeArguments);
        return bean;
    }


-- Some ideas I have tried.

import java.util.Arrays;
import java.util.Collection;
import java.util.Date;
import java.util.HashSet;
import java.util.Set;

public class test {

	public static class plain {
		public static Collection getSomethingOf(Class type, Class contentType) throws Exception {
			Collection result = (Collection) type.newInstance(); // cast
			result.addAll(Arrays.asList(contentType.newInstance())); // warning
			return result;
		}

		public static Set<Date> getSetOfDate() throws Exception {
			return (Set<Date>) getSomethingOf(HashSet.class, Date.class); // warning
		}
	}

	public static class fixed {
		public static <C> Collection<C> getSomethingOf(Class<? extends Collection> type, Class<C> contentType) throws Exception {
			Collection<C> result = type.newInstance(); // warning
			result.addAll(Arrays.asList(contentType.newInstance()));
			return result;
		}

		public static Set<Date> getSetOfDate() throws Exception {
			return (Set<Date>) getSomethingOf(HashSet.class, Date.class); // cast
		}
	}

	public static class external {
		public static <C, T extends Collection<C>> T getSomethingOf2(Class<T> type, Class<C> contentType) throws Exception {
			T result = type.newInstance();
			result.addAll(Arrays.asList(contentType.newInstance()));
			return result;
		}

		public static Set<Date> getSetOfDate2() throws Exception {
			Class<HashSet<Date>> type = (Class<HashSet<Date>>) (Class<?>) HashSet.class; // warning (unchecked cast)
			return getSomethingOf2(type, Date.class);
		}
	}

	public static class internal {
		public static <C, T extends Collection, R extends Collection<C>> R getSomethingOf(Class<T> type, Class<C> contentType)
				throws Exception {
			R result = (R) type.newInstance(); // warning
			result.addAll(Arrays.asList(contentType.newInstance()));
			return result;
		}

		public static Set<Date> getSetOfDate() throws Exception {
			return getSomethingOf(HashSet.class, Date.class);
		}
	}
}
The goal here, I think, is to be able to construct instances of T without compiler warnings or errors (or old-style casts). Needless to say, neither Venkat nor I could manage to cruft up something that could work, and so I thought to throw this out to the blogosphere to see what others could come up with.

If I'm feeling bored one day I'll try coding it in C#, just to (hopefully) exemplify the differences in the generics model between the two.

UPDATE: Hopefully I got the formatting right this time; have I mentioned how much I hate the fact that Java, C# and C++ all use the left-pointy-bracket/right-pointy-bracket syntax when posting code snippets like this...?


Java/J2EE

Tuesday, November 28, 2006 2:06:02 PM (Pacific Standard Time, UTC-08:00)
Comments [0]
 Monday, November 20, 2006
"What is Java Software?" You'd think they know by now...

While looking to download the Java5 JDK from Sun, I ran across this on the home page of java.com:

What is Java Software?
Java software allows you to run applications called "applets" that are written in the Java programming language. These applets allow you to play online games, chat with people around the world, calculate your mortgage interest, and view images in 3D. Corporations also use applets for intranet applications and e-business solutions.
Applets!? After almost a decade of Java's success on the server through J2EE and lightweight containers, the marketing idiots at Sun choose to explain what Java is by citing applets?!? Folks, if ever there was a single-sentence hint as to how Sun doesn't quite "get it", this is it.


Java/J2EE

Monday, November 20, 2006 5:51:45 PM (Pacific Standard Time, UTC-08:00)
Comments [2]
Blog changes

Because of all the referrer and Trackback/Pingback spam, I've had to disable Trackback and Pingback (hopefully just temporarily, at least until I can get my dasBlog upgraded). Dunno if that makes anybody else sad, but I'm bummed at not being able to see people's comments and reactions to my posts.

Thus, for the time being, if you respond (positively or negatively) to something I say, and would really like a reaction (again, positive or negative), please either drop me an email or just post a comment here.




Monday, November 20, 2006 5:26:21 PM (Pacific Standard Time, UTC-08:00)
Comments [1]
 Saturday, November 18, 2006
Windows Vista has RTM'ed

... which, normally, would be a source of much excitement. So I pull down the Vista bits, fire up VMWare (not that I don't trust it yet, it's just that... well.. you know... it is a 1.0 release and all, and besides, I do all my work now in VMWare images, and...), and sort through the whole "Vista doesn't like the VMWare CD emulation problem" (by mounting the ISO on the host using Daemon-Tools, so that to VMWare it looks like a real DVD). Voila. Installation proceeds.

And then, Vista prompts me for a license key. This should be the easiest step in the whole process: Being an MVP, we get license keys to everything Microsoft makes. So I cruise on up to the MSDN site, ask for a Vista Ultimate key, and...

"Error while requesting Product Keys. Please try again later or contact customer support. Please try again later. Thank you for your patience."

I try again.

"Error while requesting Product Keys. Please try again later or contact customer support. Please try again later. Thank you for your patience."

One more time--Microsoft software has been known to work the third time (or not at all).

"Error while requesting Product Keys. Please try again later or contact customer support. Please try again later. Thank you for your patience."

Now, fortunately, Vista will allow you to install without the product key up front. But you've got to wonder what the folks in Microsoft's MSDN support department were thinking when they didn't check to make sure requesting product keys would work before posting Vista to the Subscriber Downloads section: "Well, you know, it's not like the MVPs, the folks that we've rewarded for loyalty and external product support, it's not like they would want to download Vista right away and start playing with it or anything... and besides, it's not like they'd want the fullest-featured version of Vista, all they want to do is install the Home/Basic/StrippedDownToNothing version, right?"

Get it fixed, MSDN. And preferably before I have to reinstall Vista in a VMWare image again because I don't get a product key registered in time.

Oh, and for the future? You might want to check these things before you put the silly thing online. And that error message... Oy! "Thank you for your patience"?!? That has GOT to be the most overused phrase in all of customer service. So much so that I'm considering a new crusade to eliminate it from the vocabulary of any and all customer service representatives and management. (If I had any patience, I doubt I would be spending it waiting for somebody to get their act together on this. Now, waiting for my son to make his next move in Catan, THAT's a worthwhile exercise in patience...)

So sorry, Microsoft, but this earns you the highest mark of disrespect I can offer in the blog: "Duh..."

Update: So I went back in to MSDN Subscriber Downloads and got the Product Key without a hitch this time around, but it still doesn't change (a) the inexcusable fact that MSDN couldn't handle the load of its MSDN Subscribers downloading Vista, or (b) the fact that it couldn't even handle the load of people downloading product keys. Possible solutions for future releases: how about handing out product keys *before* the release? Just about a week or two ahead of the actual release, post a notice telling subscribers that "RTM keys are available", and that'd reduce at least a little bit of the load. I think subscribers can understand the difficulties of providing enough server bandwidth to download a 2.5 GB ISO image (!), but not having the product keys ready to go, that's just really hard to understand....


.NET

Saturday, November 18, 2006 2:43:14 AM (Pacific Standard Time, UTC-08:00)
Comments [4]
 Friday, November 17, 2006
Java/.NET Interop discussions..

... are currently under way at The ServerSide Interoperability Blog, and at the InfoQ Java/.NET portal. I'll try to post more on the subject here, but for now, enjoy.


.NET | Java/J2EE | Ruby | XML Services

Friday, November 17, 2006 9:16:00 PM (Pacific Standard Time, UTC-08:00)
Comments [5]
 Thursday, November 16, 2006
Welcome to Borders' Microsoft Days...

If you're a Microsoftie and you're in the Redmond area this week, swing by the Borders in the Redmond Town Center, where they're having their "Microsoft Days" experience--everything a Microsoftie buys (whether for themselves or for their significant other, hint hint, guys) is 15% off.

Why the advertisement? Two reasons: one, because I love supporting the local causes, and two, because I'm going to be there Friday night on a panel discussion with several .NET notables, including Bill Vaughn (the original SQL Server curmudgeon), Harry "I Got Your Architecture Right Here, Baby" Pierson, contributor to the "VB6 Migration Guide" book Keith Pleas, and possibly (if we can drag them out of the p & p "war room") agile aficionados Peter Provost and Brad Wilson. We have no real idea what we're going to talk about, but given the fact that we all like to express opinions regardless of whether we have any real working knowledge on the subject, I expect it'll be an interesting discussion....

See your local Borders for details, and while you're there, drop into the cafe and grab an espresso from the cheerful cafe staff... caffeine makes everything better.


Reading

Thursday, November 16, 2006 5:13:54 AM (Pacific Standard Time, UTC-08:00)
Comments [0]  | 
 Thursday, November 02, 2006
Kudos to APress...

So I'm in Borders tonight, looking around, and I happen to see one of APress's latest titles, "Practical OCaml". Several things go through my mind at once:

  1. WOW. OCaml.
  2. A book on OCaml. Not even a "Programming Languages 101" textbook, but a practical one, even.
  3. Like, a book, copyrighted this year, on OCaml.
  4. Gotta buy it--not just because it's another of those Dead Languages I like to explore, but because F# is a dead-ringer for OCaml, and I'm really interested in seeing where we can go with F# these days.
  5. Gotta buy it--not only for the F# tie-in, but because Scala comes from that same family of languages, so there's probably some goodness on the Scala thought experiment, too.
  6. You know, come to think of it, this is the third or fourth book on the "Non-Mainstream" languages that APress has done recently. I thought maybe "Practical Common Lisp" was a one-shot, and hey, "Programming Sudoku" isn't a language but definitely a fun title nevertheless, but with "Practical OCaml", maybe APress is quickly becoming like Morgan Kaufmann, in that they're going after territories that aren't already flooded with ten thousand "Me Too Ruby" books.
  7. And it's not just limited to languages either, come to think of it: they just published a db4o book, and even before then they had the only Lego Mindstorms books for years.
  8. Nice going, Gary.
  9. Hmm.... Wonder if Gary already has "Practical Scala" under contract...?
Well done, APress. You had me worried there for a while, when you bought up all those Wrox titles (most of which were unadulterated crap, IMHO), but you've restored my faith in you once again. In fact, in my book, you have graduated to an entirely new level of coolness.


Reading

Thursday, November 02, 2006 11:22:41 PM (Pacific Daylight Time, UTC-07:00)
Comments [4]  | 
 Tuesday, October 24, 2006
New column goes live

The folks over at MSDN asked me to author a series of articles based around the theme of the "Pragmatic Architecture" talk I've given in a couple of locales recently, and the first article ("Layering") has gone up, along with the introduction to the series. Feedback is, of course, welcome, through either blog comments or through more traditional channels.

By the way, here's an interesting challenge for those of you who think you're up for it--who are the two members of "the group" spotted by the author during the intro? (Yes, they are, in fact, real people. None of this "Any similarities to persons real or historical is strictly accidental" bull-pucky for me.)


.NET | C++ | Java/J2EE | Ruby | XML Services

Tuesday, October 24, 2006 12:05:42 PM (Pacific Daylight Time, UTC-07:00)
Comments [11]  | 
 Monday, October 16, 2006
There, but for the grace of God (and the experiences of Java) go I

At the patterns&practices Summit in Redmond, I was on a webcast panel, "Open Source in the Enterprise", moderated by Scott Hanselman, with Rocky Lhotka, Chris Sells, and me as panelists. Part of the discussion came around to building abstraction layers, though, and one thing that deeply worried and disappointed me was the reaction of the other panelists when I tried to warn them of the dangers of over-abstracting APIs.

You see, we got onto this subject because Scott had mentioned that Corillian (his company) had built an abstraction layer on top of the open-source logging package, log4net. This reminded me so strongly of Commons Logging that I made a comment to that effect, warning that the Java community got itself into trouble (and continues to do so to this day, IMHO) by building abstraction layers on top of abstraction layers on top of abstraction layers, all in the name of "we might want or need to change something... someday". It was this very tendency that drove many developers to embrace YAGNI (You Ain't Gonna Need It) from the agile/XP space, and it remains a fiercely-debated subject. But what concerned me was the response of the other panelists, which, paraphrased, came off to me as, "We won't make that mistake--we're smarter than those Java guys."

Sorry, folks. That doesn't cut it.

Certainly, .NET has learned from the five years' lead time the Java community has had: the power of a runtime and bytecode, the usefulness of a large and well-built library upon which to build further, the power of compiled-on-demand Web pages, the usefulness of an openly-extensible build tool, even the "one language" vs. "many languages" debate, all could be said to have been influenced strongly by decisions and experience in the Java community. But Java still has much more it can teach the .NET community: mocking, unit-testing, lightweight containers, dependency-injection, and the perils of O/R-M are just part of the list of things that the Java community has close to a half-decade's experience in, compared to .NET's none.

To stand there and suggest that .NET will somehow avoid the mistakes of the Java community just because "we're smarter than them" is more than sheerest folly; it blatantly ignores the famous quote:

"Those who cannot remember the past are condemned to repeat it." --George Santayana


.NET | Java/J2EE | Ruby

Monday, October 16, 2006 6:58:46 PM (Pacific Daylight Time, UTC-07:00)
Comments [6]  | 
 Tuesday, October 10, 2006
Watching a friend's career die a short, horrific, painful death

Normally, I don't go for the chain-email thing, but recently someone who claims to be a friend of mine sent me this email:

The first episode of my Millahseconds weekly geek comedy podcast has been published. Details are here And you can download/subscribe here. Best Regards, Mark Miller
Now, as I say, I normally don't go in for this sort of shameless self-promotion (at least, on the part of other people, anyway), but his email contained one segment that made me rethink my position:
IMPORTANT: To help promote this, I've employed the services of a crazy old voodoo gypsy woman named Moombassa. To avoid the Millahseconds Curse (which manifests itself as a rather itchy rash in areas you don't even want to know about), it is essential that you tell absolutely everyone you know about Millahseconds. In doing so, Moombassa says the curse will be lifted from you and passed onto your friends (awesome, eh?). And don't worry, that itching should go away in a few days.
Not that I'm suffering from any itchy rash in areas I don't... er, didn't... want to know about. No, sirreee, not me. This is just a... general rethinking of my position on forwarding selected emails. That's all. Really.

(Good luck, Mark, and for those of you who've never heard Mr. Miller on a comedic rant, you owe it to yourself to have a listen, both to this, and to Mondays. Oh, and be sure to have handy a spare pair of underwear--Mark's been known to make people laugh so hard I soiled mine... er, I mean, they soil theirs. It's some brutally wicked geek comedy.)


Conferences | .NET | C++ | Java/J2EE

Tuesday, October 10, 2006 11:17:33 PM (Pacific Daylight Time, UTC-07:00)
Comments [0]  | 
 Friday, October 06, 2006
A little knowledge is a dangerous thing

Five easy steps to thinking you understand a subject well enough to write on it:

  1. Read an article that poorly describes the subject, such as the article at http://java.sys-con.com/read/37613.htm, particularly when it subscribes to a few of the popular myths (such as "Why not tell the garbage collector what and when to collect", or the advice that calling System.gc() is anything but a waste of your time or an unnecessary hindrance to the GC itself).
  2. Follow the directions given there, which ask you to create a benchmark with so much noise underneath it (in this case, by running on top of the WebLogic Server... or any J2EE server, for that matter) that you could never be precisely sure of the effect of any change to the code.
  3. Read an unrelated specification, such as one that's unrelated to the "normal" JVM and its GC behavior, like the Real-Time Specification for Java (JSR 1), and pretend that it will offer insights into how the J2SE/JSE JVM works.
  4. Don't bother reading the established literature from the source, such as that from the Sun Hotspot team (for example, the docs available online at "Tuning Garbage Collection with the 1.4.2. VM", in which it says, "Another way applications can interact with garbage collection is by invoking full garbage collections explicitly, such as through the System.gc() call. These calls force major collection, and inhibit scalability on large systems. The performance impact of explicit garbage collections can be measured by disabling explicit garbage collections using the flag -XX:+DisableExplicitGC." and the Hotspot FAQ, in which it says, "14. What type of collection does a System.gc() do? An explicit request to do a garbage collection does a full collection (both young generation and tenured generation). A full collection is always done with the application paused for the duration of the collection." and, most of all, "31. Should I pool objects to help GC? Should I call System.gc() periodically? The answer to these is No! Pooling objects will cause them to live longer than necessary. We strongly advise against object pools. Don't call System.gc(). The system will make the determination of when it's appropriate to do garbage collection and generally has the information necessary to do a much better job of initiating a garbage collection. If you are having problems with the garbage collection (pause times or frequency), consider adjusting the size of the generations.") Ignore that literature in favor of what your cousin's brother's wife's former roommate said about how to make Java GC run better.
  5. Publish your own variation thereof, and repeat.
Anybody still wondering why Java performance myths continue to be perpetuated?

(In truth, it's really a shame--the author of the article really seems, on the surface of it, to be quite knowledgeable about the JVM and GC behavior, but as I went through it, I just got this jarring and sick feeling that either she was working with an entirely different JVM than the one I've been using for years now, or else everything I've been told and seen about the JVM was somehow a huge lie in and of itself--and if that's the case, boy, are a LOT of the Java experts I know and respect equally fooled. If her benchmark weren't on top of WLS, I'd be tempted to follow it, but any benchmark on top of a J2EE server is going to be skewed, and thus, in my mind, not even worth the bother. Run it on top of a naked JVM, then let's see what's going on and compare notes. Normally I really try to give authors the benefit of the doubt, but this time... Sorry, Ms. Andres, but you've got a really steep uphill battle to fight yet if you're going to get any respect whatsoever on this one.)
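
For anybody who does want to compare notes on a naked JVM, here's roughly the shape of what I have in mind. Treat it as a minimal sketch and not a real benchmark: the class name GcObserver and its two helper methods are made up for illustration, and the only platform API it leans on is the standard java.lang.management one. It churns through short-lived objects, never calls System.gc(), and simply asks the JVM how many collections it decided to run on its own.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical illustration class; not taken from the article under discussion.
public class GcObserver {

    // Sums the collection counts reported by every collector in this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Allocates many short-lived 1 KB arrays; the checksum keeps the work observable.
    static long churn(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] scratch = new byte[1024];
            scratch[0] = (byte) i;
            checksum += scratch.length;
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        long checksum = churn(1000000);
        long after = totalCollections();
        // No System.gc() anywhere: the JVM chose when (and whether) to collect.
        System.out.println("checksum=" + checksum
                + ", collections during run=" + (after - before));
    }
}
```

Run it once as-is, then again with an explicit System.gc() call dropped into the loop, with and without the -XX:+DisableExplicitGC flag quoted above, and watch the wall-clock time. The numbers will vary by JVM and heap settings, which is exactly why the noise of a J2EE server underneath makes the exercise pointless.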


Java/J2EE

Friday, October 06, 2006 6:00:02 AM (Pacific Daylight Time, UTC-07:00)
Comments [3]  | 
JAOO? Ja, I O-O too!

For two years now, I've been trying to come up with a good English pun on the name of the JAOO conference (apparently, officially pronounced "[DJA-OU]"), and that's the best I could come up with. Fortunately, the quality of the show isn't dependent on my puny punability.

Once again, the JAOO folks deserve a peerage. The venue was great (not often do I get to perform on a concert hall stage), the speaker selection was diverse and entertaining (not often do I get to see two people I deeply respect, in this case Glenn Vanderburg and Ian Griffiths, go at each other--respectfully--over the benefits and/or drawbacks of a technology, in this case, the Seaside web framework), and the opportunity to "hang" with those speakers (which is always the principal draw for me) was first-rate. I always love it when a conference dares to bridge the technology gap by bringing Java, .NET and "other" folks, such as the Rubyists, together, and JAOO does that magnificently. What was once a Java-centered conference is clearly no longer; now it's a Java/.NET/Agile/Enterprise/Client/Academic/Pragmatic conference.

Hail, JAOOers!


Conferences

Friday, October 06, 2006 5:08:44 AM (Pacific Daylight Time, UTC-07:00)
Comments [0]  | 
 Wednesday, September 27, 2006
Where've you been, Ted?

Some of the blog readers have emailed me asking about the long silence; a few have even asked if I was injured by one of the flying rotten tomatoes that came with the Vietnam post. No, I've just been traveling a lot, doing a bunch of conferences, with more coming up, like JAOO and DevReach (a new show that's opening in Sofia, Bulgaria, and one that I'm really looking forward to). In fact, for any of those of you who are in the Bulgaria area in a couple of weeks, DevReach is offering a pretty interesting raffle gift, a trip to visit Microsoft Research in Redmond; even if you don't win the prize, though, the Microsoft Research site is still pretty cool to visit.

In other news, I have new digs for my .NET training; yes, some of you had already read this elsewhere, but I'll say it here: I'm very glad to now be a part of the crew at Pluralsight, and I'm looking forward to doing Workflow, WCF, and Architecture classes for them, among others. It's a privilege and honor to be among guys this bright and this articulate, and once again I'm just happy at being a part of a group that will continue to keep me on my toes for a long time to come.

Meanwhile, I do plan on blogging again soon, but probably not until I'm done with my current travel set (eight cities, four countries, two continents, six weeks) and have some time to breathe again.


.NET | C++ | Java/J2EE | Ruby | Windows | XML Services

Wednesday, September 27, 2006 10:24:00 PM (Pacific Daylight Time, UTC-07:00)
Comments [2]  | 
 Tuesday, June 27, 2006
Thoughts on Vietnam commentary

Numerous folks have taken me to task (some here in comments, some through private email, some through still other channels) over the last blog post; rather than try to respond to all individually, I figured it makes more sense to address the more salient points here:

  1. "How dare you use the Vietnam War as an analogy for something so trivial as object/relational mapping?" First of all, let's make a few facts clear. My father served in Vietnam. I have friends in Iraq right now. My best friend from high school served in the Navy during the first Iraq war. I studied Vietnam--along with numerous other wars and conflicts--for several years as an International Relations major in college, focused specifically on military history. I have nothing but deep respect for all soldiers, of all nations, who go off to risk their lives in service to their country. I am appalled at how quickly governments (ours and others) chuck troops into a situation without thinking of the long-term strategy. I've spent more time studying war and its effects on the soldiers, the governments and the people than most people have spent watching TV. I am very aware of the ghosts I'm treading upon when I use the word "Vietnam", and quite frankly, folks, we as a nation have yet to come to terms with what happened there. Rambo films don't exorcise ghosts, much as we might want them to. POW-MIA flags don't, either. Please don't bring your ghosts in with you when approaching this subject, and I'll leave mine behind as well.
  2. "The Vietnam War is a bad analogy for O/R-M." Vietnam remains, for most Americans, the quintessential symbol for "bloody, ugly, unresolvable quagmire". And, as some have pointed out in comments on the blog post already, all analogies break down eventually, and this one is no different--as one commenter put it, nobody ever died from a bad O/R-M tool. (Though the day is not far off when such could occur, given the incredible spread of technology into all corners of our lives--it's not too hard to imagine a day when a patient dies because a doctor received incorrect information about a medical allergy from the enterprise system he/she uses to call up patient records.) That said, however, I assert that the analogy is appropriate, and relevant, for a variety of reasons. One, because just as development teams frequently believe that the object/relational problem is "solvable", so too did the US government believe that the Communist insurgency (which was more of an independence movement than a Communist movement, we've since realized) was "solvable" in South Indochina. Two, development teams frequently believe that with "just a little bit more work, we're almost there..." (wherever "there" is, in the minds of the architect or team lead), just as the US government frequently predicted that the Viet Cong were on the verge of defeat, just a few more troops and the war is over... Three, the analogy holds because even team leads and architects who have been burned before still attempt solutions to the problem, just as the US still went into Vietnam even though many of its administrations' advisors believed the effort was an ill-fated dead end.
  3. "You aren't being fair--after all, {insert-name-of-favorite-O/R-M-tool-here} doesn't suffer from that problem." Not yet, it doesn't. Or it does, but you just haven't run into it yet. Either answer is possible. And in the early years of the Vietnam conflict, we didn't suffer the problems that we commonly associate with the War--the poor morale, the rampant drug use among the military, the widespread unpopularity of the conflict back home, and so on. The danger here is on the far end of the Slippery Slope, not the near end.
  4. "You aren't being fair--when you balance the pros and cons..." Perhaps not. But as someone who's built three O/R-Ms in his lifetime, and refuses to build another one because they all faced the same end, despite very different beginnings, I worry more about the Slippery Slope and where it leaves us in the end. If your team can stay perched on the side of the Slope that yields the most benefits, then more power to you; but I worry about the day when the new college intern says to himself, "You know, with a bit more investment, I bet we could add inheritance...."
  5. "Some languages do allow for varying numbers of fields." Actually, no, most of the languages cited as examples, including Ruby, don't allow for varying fields. Ruby has a feature called "open classes", in which you can change the definition of the class at any time, but it's still (very loosely) a class-based language. (The implementation of Ruby, from what I can see, seems to back this point--each object holds a pointer back to the class object it stems from, which means, at least to me, it's loosely class-based.) We can debate the semantics of this point for days, and frankly I welcome the discussion, but not in the context of this one. We can save that for another post/thread at another time.
  6. "OK, but where can I go to get more info about O/R-M so I don't fall into the quagmire?" Excellent question. Roy Osherove has started a community site about O/R-Ms, which I think holds promise for discussion on the topic. The JDO crowd had several resources available at JDOCentral, and there's lots of discussion about O/R-M (stretching back several years) on TheServerSide. BEA, with its acquisition of Solarmetric, now owns one of the better O/R-M tools on the market, Kodo, and they're likely to still have numerous white papers and such on the subject.
  7. "OK, but where can I go to get more info about object persistence tools?" Right now, the only one I have any faith in is the db4o project; in fact, I'm speaking at their first user/developer conference in London in a few weeks. I've used others (such as Versant) in the past, and frankly, wasn't incredibly impressed.
  8. "OK, but where can I go to get more info about these other languages/approaches?" Keep your eye on LINQ, for starters, as that's one of the first mainstream attempts to bring some of these ideas into traditional statically-typed object platforms. Scala and F# I already mentioned. Ruby is another place to spend some time, as there are a lot of features Ruby has that are trying to make their way into other languages. And, although I will likely gather some serious heat for saying this, Visual FoxPro may have some of the most interesting "best of both worlds" mojo in the entire language space on this subject.
  9. "Great post!" Thanks.

Make no mistake about it: I am deeply sympathetic to anyone who lost somebody--figuratively or literally--to the Vietnam conflict. I feel equally sympathetic to anyone who lost somebody in the Korean War (as my family did), World War Two, or even World War One before that. In fact, my sympathies go out to anyone lost in any of the conflicts across history and the globe in which men and women die for an ideal or symbol. It is an unfortunate statement about human affairs that we see war as the ultimate arbiter over power disputes between nations, but this is the world we live in now. If you don't care for that, then I encourage you to actively work to change it, regardless of your politics. I have far more respect for someone who virulently disagrees with my political viewpoints and actively promotes their own, than I do for those who agree with my politics and do nothing but complain.

Perhaps history will record Vietnam as America's greatest military failure, perhaps not. There is ample evidence to suggest that Vietnam will forever act as a check on American territorial expansionism (remember, Hawaii and Alaska gained statehood after World War Two), and more importantly, as a checkpoint to hold flagrant use of American military muscle in place. But be that as it may, the fact remains that Vietnam had an incalculable effect on American foreign policy and domestic agenda, and will continue to do so for the next several generations. And, as numerous examples from my own experience and others can attest, the use of O/R-M can have the same effect (relatively speaking) on a development team's efforts.


.NET | C++ | Java/J2EE | Ruby

Tuesday, June 27, 2006 4:32:07 PM (Pacific Daylight Time, UTC-07:00)
Comments [7]  | 
 Monday, June 26, 2006
The Vietnam of Computer Science

(Two years ago, at Microsoft's TechEd in San Diego, I was involved in a conversation at an after-conference event with Harry Pierson and Clemens Vasters, and as is typical when the three of us get together, architectural topics were at the forefront of our discussions. A crowd gathered around us, and it turned into an impromptu birds-of-a-feather session. The subject of object/relational mapping technologies came up, and it was there and then that I first coined the phrase, "Object/relational mapping is the Vietnam of Computer Science". In the intervening time, I've received numerous requests to flesh out the discussion behind that statement, and given Microsoft's recent announcement regarding "entity support" in ADO.NET 3.0 and the acceptance of the Java Persistence API as a replacement for both EJB Entity Beans and JDO, it seemed time to do exactly that.)

No armed conflict in US history haunts the American military more than Vietnam. So many divergent elements coalesced to create the most decisive turning point in modern American history that it defies any layman's attempt to tease them apart. And yet, the story of Vietnam is fundamentally a simple one: The United States began a military project with simple yet unclear and conflicting goals, and quickly became enmeshed in a quagmire that not only brought down two governments (one legally, one through force of arms), but also deeply scarred American military doctrine for the next four decades (at least).

Although it may seem trite to say it, Object/Relational Mapping is the Vietnam of Computer Science. It represents a quagmire which starts well, gets more complicated as time passes, and before long entraps its users in a commitment that has no clear demarcation point, no clear win conditions, and no clear exit strategy.

History

PBS has a good synopsis of the war, but for those who are more interested in Computer Science than Political/Military History, the short version goes like this:

South Indochina, now known as Vietnam, Thailand, Laos and Cambodia, has a long history of struggle for autonomy. Before French colonial rule (which began in the mid-1800s), South Indochina wrestled for regional independence from China. During World War Two, the Japanese conquered the area, only to be later "liberated" by the Allies, leading France to resume their colonial rule (as did the British in their colonial territories elsewhere in Asia and India). Following WWII, however, the people of South Indochina, having thrown off one oppressor, extended their anti-occupation efforts to fight the French instead of the Japanese, and in 1954 the French capitulated, signing the Geneva Peace Accords to formally grant Vietnam its independence. Unfortunately, global pressures perverted the efforts somewhat, and instead of a lasting peace agreement a temporary solution was created, dividing the nation at the 17th parallel, creating two nations where formerly no such division existed. Elections were to be held in 1956 to reunify the country, but the US feared that too much power would be given to the Communist Party of Vietnam through these elections, and instead backed a counter-Communist state south of the 17th parallel and formed a series of multilateral agreements around it, such as SEATO. The new nation of South Vietnam was born, and its first (dubiously) elected leader was Ngo Dinh Diem, a staunch anti-Communist who almost immediately declared his country under Communist attack. The Eisenhower Administration remained supportive of the Diem government, but Diem's support among the people was almost nonexistent from the beginning.

By the time the US Democratic Party's John F Kennedy came to the White House, things were coming to a head in South Vietnam. Kennedy sent a team to Vietnam to research the conditions there and help formulate his strategy on the issue. In what's now known as the "December 1961 White Paper", an argument for an increase in military, technical and economic aid was presented, along with large numbers of American "advisers" to help stabilize the Diem government and eliminate the National Liberation Front, dubbed the Viet Cong by the US. What's not as widely known, however, is that a number of Kennedy's advisers argued against that buildup, calling Vietnam a "dead-end alley".

Faced with two diametrically opposite paths, Kennedy, as was typical for his administration, chose a middle path: instead of either a massive commitment or a complete withdrawal, Kennedy instead chose to seek a limited settlement, sending aid but not large numbers of troops, a path that was almost doomed from the beginning. Through a series of strategic blunders, including the forced relocation of rural villagers (known as the Strategic Hamlet Program), Diem's support was so deeply eroded that Kennedy hesitatingly and haltingly supported a coup, during which Diem was killed. Three weeks later, Kennedy was also assassinated, throwing the domestic US political scene into turmoil as well. Ironically, the conflict begun by Kennedy would in fact later be associated most closely with his replacement.

Johnson's War

At the time of the Kennedy assassination, Vietnam had 16,000 American advisers in place, most of whom weren't involved in daily combat operations. Kennedy's Vice President and successor, Lyndon Baines Johnson, however, was not convinced that this path was leading to success, and came to believe that more aggressive action was needed. Seizing on a dubious incident in which Vietnamese patrol boats attacked American destroyers [1] in the Gulf of Tonkin, Johnson used pro-war sentiment in Congress to pass a resolution that gave him powers to conduct military action without an explicit declaration of war. To put it simply, Johnson wanted to fight this war "in cold blood": "This meant that America would go to war in Vietnam with the precision of a surgeon with little noticeable impact on domestic culture. A limited war called for limited mobilization of resources, material and human, and caused little disruption in everyday life in America." (source) In essence, it would be a war whose only impact would be felt by the Vietnamese--American life and society would go on without any notice of the events in Vietnam, thus leaving Johnson to pursue his first great love, his "Great Society", a domestic agenda designed to fix many of US society's ills, such as poverty [2]. History, of course, knows better, and--perhaps cruelly--calls the Vietnam conflict "Johnson's War".

Initially, it must be noted that Vietnam-as-disaster is a more recent perception; Americans polled as late as 1967 were convinced that the war was a good thing, that Communism needed to be stopped and that Vietnam, should it fall, would be the first of a series of nations to succumb to Communist subversion. This "Domino Theory" was a common refrain for American politics in the latter half of the 20th century. Concerns of this sort plagued American foreign policy ever since the Communists successfully or nearly-successfully subverted several European governments during the latter half of the 1940's, and then China in the 50's. (It must be noted that Eisenhower and John Foster Dulles, formulators of the theory, never included Vietnam in their ring of dominos that must be preserved, and in fact Eisenhower was surprisingly apathetic about Vietnam during some of his meetings with Kennedy during the White House transition.)

In 1968, however, the Vietnam experience turned significantly, as the North Vietnamese and Viet Cong launched the Tet Offensive, a campaign that gave the lie to all of the reassurances of the American government that it was winning the war in Vietnam. Ironically, as had been the case for much of the war, the NVA/VC forces lost a substantial number of troops, far more than their American opponents, yet the Tet Offensive is widely considered by historians to be the breaking point of American will in the war. Following that, popular opinion turned on Johnson, and in a dramatic news conference, he announced that he would not seek re-election. Furthermore, he announced that he would seek a negotiated settlement with the Vietnamese.

Nixon's Promise

Unfortunately, the American negotiating position was seriously weakened by the very protests that had brought the Americans to the negotiating table in the first place; NVA/VC leadership recognized that the NVA/VC forces, despite staggering military losses that nearly broke them (several times), could simply continue to do as they were doing, and wring concessions from the Americans without offering any in return. Running on a platform that consisted mostly of a promise to "Get America out of Vietnam", Johnson's successor, Republican Richard Nixon, tried several tactics to bring pressure on the NVA/VC forces to bargain, including increased air-combat presence (such as the Christmas bombings and Operation Menu) and regular violations of nearby Laos and Cambodia, pursuing the line of supplies from North Vietnam to cells in South Vietnam. Nothing worked, however, and in 1973 Nixon's administration signed the Paris Peace Agreement, ending American involvement in that conflict. Two years later, South Vietnam had been overrun, and on April 30, 1975, Communist forces captured Saigon, the capital of South Vietnam, forcing the evacuation of the American embassy and the most memorable image of the war, that of streams of fleeing people seeking space on the Huey helicopter perched on the roof of the embassy.

War's End

The Second South Indochina War was over, America had experienced its most profound defeat ever in its history, and Vietnam became synonymous with "quagmire". Its impact on American culture was immeasurable, as it taught an entire generation of Americans to fear and mistrust their government, it taught American leaders to fear any amount of US military casualties, and brought the phrase "clear exit strategy" directly into the American political lexicon. Not until $g(Ronald Reagan) used the American military to "liberate" the small island nation of $g(Grenada) would American military intervention be considered a possible tool of diplomacy by American presidents, and even then only with great sensitivity to domestic concern, as $g(Bill Clinton) would find out during his peacekeeping missions to $g(Somalia) and $g(Kosovo). In quantifiable terms, too, Vietnam's effects clearly fell short of Johnson's goal of a war in "cold blood". Final tally: 3 million Americans served in the war, 150,000 seriously wounded, 58,000 dead, and over 1,000 MIA, not to mention nearly a million NVA/Viet Cong troop casualties, 250,000 South Vietnamese casualties, and hundreds of thousands--if not millions, as some historians advocated--of civilian casualties.

Lessons of Vietnam

Vietnam presents an interesting problem to the student of military and political history--exactly what went wrong, when, and where? Obviously, the US government's unwillingness to admit its failures during the war makes for an easy scapegoat, but no government in the history of modern society has ever been entirely truthful with its population about its fortunes of war; one such example includes (but is not limited to) the same US government's careful censorship of activities during World War Two, fifty years earlier, known in American history as "the last 'good' war". It's also tempting to point to the lack of a military objective as the crucial failing point of Vietnam, but other non-military objectives have been successfully executed by the US and other governments without the kind of colossal failure accompanying Vietnam's story. Moreover, it's important to note that the US did, in fact, have a clear objective in what it wanted out of the conflict in South Indochina: to stop the fall of the South Vietnam government, and, barring that, the cessation of the "spread" of Communism. Was it the reluctance of the US government to unleash the military to its fullest capabilities, as $g(General William Westmoreland) always claimed? Certainly the failure in Vietnam was not a military one; the casualty figures make it clear that the US, by any other measure, was clearly winning.

So what were the principal failures in Vietnam? And, more importantly, what does all this have to do with O/R Mapping?

Vietnam and O/R mapping

In the case of Vietnam, the United States political and military apparatus was faced with a deadly form of the $g(Law of Diminishing Returns). In the case of automated Object/Relational Mapping, it's the same concern--that early successes yield a commitment to use O/R-M in places where success becomes more elusive, and over time, isn't a success at all due to the overhead of time and energy required to support it through all possible use-cases. In essence, the biggest lesson of Vietnam--for any group, political or otherwise--is to know when to "cut bait and run", as fishermen say. Too often, as was the case in Vietnam, it is easy to justify further investment in a particular course of action by suggesting that abandoning that course somehow invalidates all the work--or, in Vietnam's case, the lives of American soldiers--that has already been invested. Phrases like "We've gone this far, surely we can see this thing through" and "To back out now is to throw away everything we've sacrificed up until this point" become commonplace. At least during the later, deeply bitter years of the second half of Vietnam, patriotism itself came into question: if you didn't support the war, you were clearly a traitor, a Communist, obviously "unAmerican", disrespectful of all American veterans of any war fought on any soil for whatever reason, and you probably kicked your dog to boot. (It didn't help the protestors' cause that they blamed the soldiers for the war, holding them accountable--sometimes personally--for the decisions made by military and political leaders, most of whom neither the soldiers nor the protestors had ever met.)

Recognizing that all analogies fail eventually, and that the subject of Vietnam is deeper than this essay can examine, there are still lessons to be learned here in an entirely different arena. One of the key lessons of Vietnam was the danger of what's colloquially called "the Slippery Slope": that a given course of action might yield some early success, yet further investment into that action yields decreasingly commensurate results and increasingly dangerous obstacles whose only solution appears to be greater and greater commitment of resources and/or action. Some have called this "the Drug Trap", after the way pharmaceuticals (legal or illegal) can have diminished effect after prolonged use, requiring upped dosage in order to yield the same results. Others call this "the Last Mile Problem": that as one nears the end of a problem, it becomes increasingly difficult in cost terms (both monetary and abstract) to find a 100% complete solution. All are basically speaking of the same thing--the difficulty of finding an answer that allows our hero to "finish off" the problem in question, completely and satisfactorily.

We begin the analysis of Object/Relational Mapping--and its relationship to the Second South Indochina War--by examining the reasons for it in the first place. What drives developers away from using traditional relational tools to access a relational database, and to prefer instead tools such as O/R-M's?

The Object-Relational Impedance Mismatch

To say that objects and relational data sets are somehow constructed differently is typically not a surprise to any developer who's ever used both; except in extremely simplistic situations, it becomes fairly obvious to recognize that the way in which a relational data store is designed is subtly--and yet profoundly--different than how an object system is designed.

Object systems are typically characterized by four basic components: identity, state, behavior and encapsulation. Identity is an implicit concept in most O-O languages, in that a given object has a unique identity that is distinct from its state (the value of its internal fields)--two objects with the same state are still separate and distinct objects, despite being bit-for-bit mirrors of one another. This is the "identity vs. equivalence" discussion that occurs in languages like C++, C# or Java, where developers must distinguish between "a == b" and "a.equals(b)". The behavior of an object is fairly easy to see: a collection of operations clients can invoke to manipulate, examine, or interact with objects in some fashion. (This is what distinguishes objects from passive data structures in a procedural language like C.) Encapsulation is a key detail, preventing outside parties from manipulating internal object details, thus providing evolutionary capabilities to the object's interface to clients. [3] From this we can derive more interesting concepts, such as type, a formal declaration of object state and behavior, association, allowing types to reference one another through a lightweight reference rather than complete by-value ownership (sometimes called composition), inheritance, the ability to relate one type to another such that the relating type incorporates all of the related type's state and behavior as part of its own, and polymorphism, the ability to substitute an object in where a different type is expected.
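
The "identity vs. equivalence" distinction is easier to see in running code. What follows is only a minimal sketch: the PersonName class and the IdentityDemo helper are names invented for this illustration, not part of any library.

```java
import java.util.Objects;

// Hypothetical class, invented purely to illustrate identity vs. equivalence.
final class PersonName {
    private final String first;
    private final String last;

    PersonName(String first, String last) {
        this.first = first;
        this.last = last;
    }

    // Equivalence: two PersonName objects are "equal" when their state matches.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof PersonName)) return false;
        PersonName that = (PersonName) other;
        return first.equals(that.first) && last.equals(that.last);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, last);
    }
}

final class IdentityDemo {
    // Identity: true only when both references point at the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        PersonName a = new PersonName("Catherine", "Kennedy");
        PersonName b = new PersonName("Catherine", "Kennedy");
        System.out.println(sameObject(a, b)); // false: two distinct objects
        System.out.println(a.equals(b));      // true: identical state
    }
}
```

Two objects with the same state remain two objects; only the hand-written equals() makes them interchangeable, and only where the caller remembers to ask.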

Relational systems describe a form of knowledge storage and retrieval based on predicate logic and truth statements. In essence, each row within a table is a declaration about a fact in the world, and SQL allows for operator-efficient data retrieval of those facts using predicate logic to create inferences from those facts. [Date04] and [Fussell] define the relational model as characterized by relation, attribute, tuple, relation value and relation variable. A relation is, at its heart, a truth predicate about the world, a statement of facts (attributes) that provide meaning to the predicate. For example, we may define the relation "PERSON" as {SSN, Name, City}, which states that "there exists a PERSON with a Social Security Number SSN who lives in City and is called Name". Note that in a relation, attribute ordering is entirely unspecified. A tuple is a truth statement within the context of a relation, a set of attribute values that match the required set of attributes in the relation, such as "{PERSON SSN='123-45-6789' Name='Catherine Kennedy' City='Seattle'}". Note that two tuples are considered identical if their relation and attribute values are also identical. A relation value, then, is a combination of a relation and a set of tuples that match that relation, and a relation variable is, like most variables, a placeholder for a given relation, but can change value over time. Thus, a relation variable People can be written to hold the relation {PERSON}, and consist of the relation value

{ {PERSON SSN='123-45-6789' Name='Catherine Kennedy' City='Seattle'},
  {PERSON SSN='321-54-9876' Name='Charlotte Neward' City='Redmond'},
  {PERSON SSN='213-45-6978' Name='Cathi Gero' City='Redmond'} }

These are commonly referred to as tables (relation variable), rows (tuples), columns (attributes), and a collection of relation variables as a database. These basic element types can be combined against one another using a set of operators (described in some detail in Chapter 7 of [Date04]): restrict, project, product, join, divide, union, intersection and difference, and these form the basis of the format and approach to SQL, the universally accepted language for interacting with a relational system from operator consoles or programming languages. The use of these operators allows for the creation of derived relation values, relations that are calculated from other relation values in the database--for example, we can create a relation value that demonstrates the number of people living in individual cities by making use of the project and restrict operators across the People relation variable defined above.
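
To make two of those operators concrete, here is a rough in-memory sketch of restrict and project over the People relation value. It assumes a tuple can be modeled as a map from attribute name to value and a relation value as a set of such maps; RelationSketch and its method names are invented for the illustration and say nothing about how a real RDBMS implements them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: tuple = map of attribute name to value (so attribute
// order is irrelevant), relation value = set of tuples (so duplicates vanish).
final class RelationSketch {

    // Builds one tuple from alternating attribute names and values.
    static Map<String, String> tuple(String... namesAndValues) {
        Map<String, String> t = new HashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            t.put(namesAndValues[i], namesAndValues[i + 1]);
        }
        return t;
    }

    // The People relation value from the text.
    static Set<Map<String, String>> people() {
        Set<Map<String, String>> r = new HashSet<>();
        r.add(tuple("SSN", "123-45-6789", "Name", "Catherine Kennedy", "City", "Seattle"));
        r.add(tuple("SSN", "321-54-9876", "Name", "Charlotte Neward", "City", "Redmond"));
        r.add(tuple("SSN", "213-45-6978", "Name", "Cathi Gero", "City", "Redmond"));
        return r;
    }

    // restrict: keep only the tuples that satisfy the condition.
    static Set<Map<String, String>> restrict(Set<Map<String, String>> relation,
                                             Predicate<Map<String, String>> condition) {
        Set<Map<String, String>> result = new HashSet<>();
        for (Map<String, String> t : relation) {
            if (condition.test(t)) result.add(t);
        }
        return result;
    }

    // project: keep only the named attributes; identical results collapse.
    static Set<Map<String, String>> project(Set<Map<String, String>> relation,
                                            String... attributes) {
        List<String> keep = Arrays.asList(attributes);
        Set<Map<String, String>> result = new HashSet<>();
        for (Map<String, String> t : relation) {
            Map<String, String> projected = new HashMap<>(t);
            projected.keySet().retainAll(keep);
            result.add(projected);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Map<String, String>> redmond =
            restrict(people(), t -> "Redmond".equals(t.get("City")));
        System.out.println(redmond.size());                   // 2
        System.out.println(project(people(), "City").size()); // 2: Seattle, Redmond
    }
}
```

Note that projecting onto City yields two tuples, not three: the two Redmond tuples are the same fact, and a relation holds a fact only once.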

Already, it's fairly clear to see that there are distinct differences between how the relational world and object world view the "proper" design of a system, and more will become apparent as time progresses. It's important to note, however, that so long as programmers prefer to use object-oriented programming languages to access relational data stores, there will always be some kind of object-relational mapping taking place--the two models are simply too different to bridge silently. (Arguably, the same is true of object-oriented and procedural programming, but that's another argument for another day.) O/R mappings can take place in a variety of forms, the easiest of which to recognize is the automated O/R mapping tool, such as $g(TopLink), $g(Hibernate) / $g(NHibernate), or $g(Gentle.NET). Another form of mapping is the hand-coded one, in which programmers use relational-oriented tools, such as JDBC or ADO.NET, to access relational data and extract it into a form more pleasing to object-minded developers "by hand". A third is to simply accept the shape of the relational data as "the" model from which to operate, and slave the objects around it to this approach; this is also known in the patterns lexicon as Table Data Gateway [PEAA, 144] or Row Data Gateway [PEAA, 152]; many data-access layers in both Java and .NET use this approach and combine it with code-generation to simplify the development of that layer. Sometimes we build objects around the relational/table model, put some additional behavior around it, and call it Active Record [PEAA, 160].

In truth, this basic approach--to slave one model into the terms and approach of the other--has been the traditional answer to the impedance mismatch, effectively "solving" the problem by ignoring one half of it. Unfortunately, most development efforts, like the Kennedy Administration, aren't willing to see this through to its logical conclusion with a wholesale commitment to one approach over the other. For example, while most development teams would be happy to adopt an "objects-only" approach, doing so at the storage level implies the use of an Object Oriented DataBase Management System (OODBMS), a topic that frequently has no traction within upper management or the corporate data management team. The opposite approach--a "relational-only" approach--is almost nonsensical to consider, given the technology of the day at the time this was written. [4]

Given that it's impossible, then, to "unleash the objects to their fullest capabilities", as General Westmoreland might call it, we're left with some kind of hybrid object-to-relational mapping approach, preferably one that's automated as much as possible, so that developers can focus on their Domain Model, rather than on the details of the object-to-table(s) mapping. And here, unfortunately, is where the potential quagmire begins.

The Object-to-Table Mapping Problem

One of the first and most easily-recognizable problems in using objects as a front-end to a relational data store is that of how to map classes to tables. At first, it seems a fairly straightforward exercise--tables map to types, columns to fields. Even the field types appear to line up directly against the relational column types, at least to a fairly isomorphic degree: VARCHARs to Strings, INTEGERs to ints, and so on. So it makes sense that for any given class defined in the system, a corresponding table--likely to be of the same or closely related name--is defined to go with it. Or, perhaps, if the object code is being written to an already existing schema, then the class maps to the table.

But as time progresses, it's only natural that a well-trained object-oriented developer will seek to leverage inheritance in the object system, and seek ways to do the same in the relational model. Unfortunately, the relational model does not support any sort of polymorphism or IS-A kind of relation, and so developers eventually find themselves adopting one of three possible options to map inheritance into the relational world: table-per-class, table-per-concrete-class, or table-per-class-family. Each of them carries potentially significant drawbacks.

The table-per-class approach is perhaps the most easily understood, for it seeks to minimize the "distance" between the object model and the relational model; each class in the inheritance hierarchy gets its own relational table, and objects of derived types are stitched together from relational JOINs across the various inheritance-based tables. So, for example, if an object model has the base class Person, with Student derived from Person and GraduateStudent derived from Student, then there will be three tables required to hold this model, PERSON, STUDENT, and GRADUATESTUDENT, each holding the fields corresponding to the class of the same name. Relating these tables together, however, requires each to have an independent primary key (one whose value is not actually stored in the object entity) so that each derived class can have a foreign key relation to its superclass's table. The reason for this is clear: a GraduateStudent object, by virtue of its IS-A relationship to Student and Person, is a collection of all three sets of state, and the distinction between the classes is largely removed by the time an object of this type is created--in both Java and .NET, for example, the object itself is a chunk of memory that holds the instance fields defined in all of its classes and superclasses, along with a pointer to the table of methods defined by that same hierarchy. This means that when querying for a particular instance at the relational level, JOINs across at least three tables must be made in order to bring all of the object's state into the object program's working memory.

Actually, it gets worse than that--if the object hierarchy continues to grow, say to include Professor, Staff, Undergrad (inherits from Student), and a whole hierarchy of AdjunctEmployees (inheriting from Staff), and the program wants to find all Persons whose last name is Smith, then JOINs must be done for every derived class in the system, since the semantics of "find all Persons" means that the query must seek data on the PERSON table, but then do an expensive set of JOINs to bring in the rest of the data from across the rest of the database, pulling in the PROFESSOR table to fetch the rest of the data, not to mention the UNDERGRAD, ADJUNCTEMPLOYEE, STAFF, and other tables. Considering that JOINs are among the most expensive expressions in RDBMS queries, this is clearly not something to undertake lightly.
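
To see how the JOINs pile up under table-per-class, consider this rough sketch, which walks a class's superclass chain and emits one JOIN per inheritance step. Everything in it is an assumption made for illustration: the TablePerClassSketch class, the convention that a table is named after its class in upper case, and the convention that each derived table carries a <PARENT>_ID foreign key pointing at the parent table's ID column.

```java
import java.util.Locale;

// Hypothetical sketch of the SQL a table-per-class mapping has to produce.
final class TablePerClassSketch {
    static class Person { String name; }
    static class Student extends Person { String major; }
    static class GraduateStudent extends Student { String thesisTitle; }

    // Assumed naming convention: class GraduateStudent maps to GRADUATESTUDENT.
    static String tableFor(Class<?> type) {
        return type.getSimpleName().toUpperCase(Locale.ROOT);
    }

    // Builds a SELECT that gathers the full state of one mapped class by
    // joining its table to the table of every superclass below Object.
    static String selectFor(Class<?> type) {
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(tableFor(type));
        Class<?> child = type;
        Class<?> parent = type.getSuperclass();
        while (parent != null && parent != Object.class) {
            String childTable = tableFor(child);
            String parentTable = tableFor(parent);
            // Assumed key convention: CHILD.<PARENT>_ID references PARENT.ID.
            sql.append(" JOIN ").append(parentTable)
               .append(" ON ").append(childTable).append('.').append(parentTable).append("_ID")
               .append(" = ").append(parentTable).append(".ID");
            child = parent;
            parent = parent.getSuperclass();
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(selectFor(Person.class));
        System.out.println(selectFor(GraduateStudent.class));
    }
}
```

The deeper the hierarchy, the longer the statement; and this sketch only handles the easy direction, loading one known derived type, not the "find all Persons" case that has to reach down into every subclass table.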

As a result, developers typically adopt one of the other two approaches, more complex in outlook but more efficient when dealing with relational storage: they either create a table per concrete (most-derived) class, preferring to adopt denormalization and its costs, or else they create a single table for the entire hierarchy, often in either case creating a discriminator column to indicate to which class each row in the table belongs. (Various hybrids of these schemes are also possible, but typically don't create results that are significantly different from these two.) Unfortunately, the denormalization costs are often significant for a large volume of data, and/or the table(s) will contain significant amounts of empty columns, which will need NULLability constraints on all columns, eliminating the powerful integrity constraints offered by an RDBMS.
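
A small sketch of the single-table variant may help show where the discriminator column and the empty columns come from. It assumes a hypothetical PERSON table whose rows arrive as a map of column name to value, with a KIND column acting as the discriminator; the class names and column names are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one table holds the entire Person hierarchy.
final class SingleTableSketch {
    static class Person { String name; }
    static class Student extends Person { String major; }
    static class Professor extends Person { String department; }

    // Rebuilds an object from one row. The KIND column (the discriminator)
    // decides which class to create; columns belonging to other classes in
    // the hierarchy are simply NULL for this row.
    static Person fromRow(Map<String, Object> row) {
        String kind = (String) row.get("KIND");
        Person result;
        if ("STUDENT".equals(kind)) {
            Student s = new Student();
            s.major = (String) row.get("MAJOR");
            result = s;
        } else if ("PROFESSOR".equals(kind)) {
            Professor p = new Professor();
            p.department = (String) row.get("DEPARTMENT");
            result = p;
        } else if ("PERSON".equals(kind)) {
            result = new Person();
        } else {
            throw new IllegalArgumentException("Unknown discriminator: " + kind);
        }
        result.name = (String) row.get("NAME");
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("KIND", "STUDENT");
        row.put("NAME", "Cathi Gero");
        row.put("MAJOR", "History");
        row.put("DEPARTMENT", null); // must be nullable: meaningless for a Student
        System.out.println(fromRow(row).getClass().getSimpleName()); // Student
    }
}
```

Every subclass-specific column has to accept NULL, because most rows will not belong to that subclass; the database can no longer insist that a Student has a major.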

Inheritance mapping isn't the end of it; associations between objects, the typical 1:n or m:n cardinality associations so commonly used in both SQL and UML, are handled entirely differently: in object systems, association is unidirectional, from the associator to the associatee (meaning the associated object(s) have no idea they are in fact associated unless an explicit bidirectional association is established), whereas in relational systems the association is actually reversed, from the associatee to the associator (via foreign key columns). This turns out to be surprisingly important, as it means that for m:n associations, a third table must be used to store the actual relationship between associator and associatee, and even for the simpler 1:n relationships the associator has no inherent knowledge of the relations to which it associates--discovering that data requires a JOIN against any or all associated tables at some point. (When to actually retrieve that data is a subject of some debate--see the Load-Time Paradox, below).

The Schema-Ownership Conflict

Discussions of inheritance-to-table and association mapping schemes also reveal a basic flaw: at heart, many object-relational mapping tools assume that the schema is something that can be defined according to schemes that help optimize the O/R-M's queries against the relational data. But this glosses over a basic problem: often the database schema itself is not under the direct control of developers, but instead is owned by another group within the company, typically the database administration (DBA) group. To whom does responsibility for designing the database--and deciding when schema changes are permissible--belong?

In many cases, developers begin a new project with a "clean slate", an empty relational database whose schema is theirs to define as they see fit. But, soon after the project has shipped (sometimes even earlier than that, due to political and/or "turf war" issues), it becomes apparent that the developers' ownership of the schema is temporary at best--various departments begin clamoring for reports against the database, DBAs are held accountable for the performance of the database thereby giving them cause to call for "refactoring" and denormalization of the data, and other development teams may start inquiring about how they might make use of the data stored therein. Before too long, the schema must be "frozen", thereby potentially creating a barrier to object model refactoring (see The Coupling Concern, below). In addition, these other teams will expect to see a relational model defined in relational terms, not one which supports an entirely orthogonal form of persistence--for example, the "discriminator" column from the Object-to-Table Mapping Problem will represent difficulties, and arguably be all but unusable, to relational report generators such as Crystal Reports. Unless developers are willing to write all reports (and their UIs, and their printing code, and their ad-hoc capabilities...) by hand, this is usually going to be an unacceptable state of affairs.

(To be fair, this is not so much a technical problem as it is a political problem, but it still represents a serious problem regardless of its source--or solution. And as such, it still represents an impediment to an object/relational mapping solution.)

The Dual-Schema Problem

A related issue to the question of schema ownership is that in an O/R-M solution, the metadata to the system is held fundamentally in two different places: once in the database schema, and once in the object model (another schema, if you will, expressed in Java or C# instead of DDL). Updates or refactorings to one will likely require similar updates or refactorings to the other. Refactoring code to match database schema changes is widely considered to be the easier of the two--refactoring the database frequently requires some kind of migration and/or adaptation of data already within the database, where code has no such requirement. (Objects, at least in this discussion, are ephemeral in-memory instances that will disappear once the process holding them terminates. If the objects are stored in some kind of object form that can persist across process execution--such as serialized object instances stored to disk--then refactoring objects becomes equally problematic.)

More importantly, while it's not uncommon for code to be deployed specifically to a single application, frequently database instances are used by more than one application, and it's frequently unacceptable to business to trigger a company-wide refactoring of code simply because a refactoring on one application requires a similar database-driven refactoring. As a result, as the system grows over time, there will be increasing pressure on the developers to "tie off" the object model from the database schema, such that schema changes won't require similar object model refactorings, and vice versa. In some cases, where the O/R-M doesn't permit such disconnection, an entirely private database instance may have to be deployed, with the exact schema the O/R-M-based solution was built against, creating yet another silo of data in an IT environment where pressure is building to reduce such silos.

Entity Identity Issues

As if these problems weren't enough, we then walk into another problem, that of identity of objects and relations. As noted above, object systems use an implicit sense of identity, typically based on the object's location in memory (the ubiquitous this pointer); alternatively, this is sometimes referred to as an OID (Object IDentifier), usually in systems which don't directly expose memory locations, such as the object database (where an in-memory pointer is pretty useless as an identifier outside of the database process). In a relational model, however, identity is implicit in the state itself--two rows with the exact same state are typically considered a relational data corruption, as the same fact asserted twice is redundant and counterproductive. To be fair, we should be a bit more explicit here; a relational system can, in fact, permit duplicate tuples (as described above), but this is often explicitly disallowed by relational constraints, such as PRIMARY KEY constraints. In those situations where duplicate values are allowed, there is no way for a relational system to determine which of the two duplicate rows is being retrieved--there is no implicit sense of identity to the relation except that offered by its attributes. The same is not true of object systems, where two objects that contain precisely identical bit patterns in two different locations of memory are in fact separate objects. (This is the reason for the distinction between "==" and ".equals()" in Java or C#.) The implication here is simple: if the two systems are going to agree on the sense of identity, the relational system must offer some kind of unique identity concept (usually an auto-incrementing integer column) to match that of the notion of object identity.

This causes some serious concerns regarding automated O/R systems, because the sense of identity is entirely different--if two separate user sessions interact with the same relation in storage, the relational database system's concurrency systems kick in and ensure some form of concurrent access, typically via the transactional metaphor (ACID). If an O/R system retrieves a relation out of storage (essentially forming a "view" over the data), we now have a second source of data identity, one in the database (protected by the aforementioned transactional scheme), and one in the in-memory object representation of that data, which has no consistent transactional support aside from that built into the language (such as the monitors concept in Java and .NET) or libraries (such as System.Transactions in .NET 2.0), either of which can be--and unfortunately frequently are--easily ignored by developers. Managing isolation and concurrency is not an easy problem to solve, and unfortunately the languages and platforms commonly available to developers aren't yet as consistent or flexible as the database transaction metaphor.

What complicates this problem further is that many O/R systems introduce significant caching support into the O/R layer (usually in an attempt to improve performance and avoid round-trips to the database), and this in turn presents some problems, particularly if the caching system is not a write-through cache: when does the actual "flush" to the database take place, and what does this say about transactional integrity if the application code believes the write to have occurred when in fact it hasn't? This problem in turn only compounds when the O/R system runs in multiple processes in front of the database engine, commonly found in clustered or farmed application server scenarios. Now the data identity is spread across n+1 locations, n being the number of application server nodes, and 1 being the database itself. Each node must somehow signal its intent to do an update to the other nodes in order to obtain some kind of concurrency construct to prevent simultaneous access (by another instance of the same session, or by an instance of a different session accessing the same data), which takes time, killing performance. Even in the case of a read-only cache, updates to the data store must somehow be signaled to the caches running in the application server nodes, requiring server-to-client communication originating from the database; support for this is not well-understood or documented in the current crop of modern relational databases.
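
The "when does the flush actually happen?" question can be seen in a deliberately tiny sketch of a cache that is not write-through. This is an assumption-laden toy, not how any particular O/R tool caches: WriteBehindCacheSketch is an invented name, and a plain map stands in for the database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a write-behind cache sitting in front of a data store.
final class WriteBehindCacheSketch {
    private final Map<String, String> database;            // stand-in for the real store
    private final Map<String, String> pending = new HashMap<>();

    WriteBehindCacheSketch(Map<String, String> database) {
        this.database = database;
    }

    // The caller believes this is a write; the store has not seen it yet.
    void put(String key, String value) {
        pending.put(key, value);
    }

    // Reads through this cache see pending writes; other readers do not.
    String get(String key) {
        return pending.containsKey(key) ? pending.get(key) : database.get(key);
    }

    // Only now does the store learn about the accumulated writes.
    void flush() {
        database.putAll(pending);
        pending.clear();
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        WriteBehindCacheSketch cache = new WriteBehindCacheSketch(db);
        cache.put("PERSON:1", "Smith");
        System.out.println(cache.get("PERSON:1")); // Smith
        System.out.println(db.get("PERSON:1"));    // null: not yet written
        cache.flush();
        System.out.println(db.get("PERSON:1"));    // Smith
    }
}
```

Between put() and flush(), the application and the database disagree about the truth, and a second node with its own cache would hold yet a third opinion.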

The Data Retrieval Mechanism Concern

So once the entity is stored within the database, how exactly do we retrieve it? In all honesty, a purely object-oriented approach would make use of object approaches for retrieval, ideally using constructor-style syntax identifying the object(s) desired, but unfortunately constructor syntax isn't generic enough to allow for something that flexible; in particular, it lacks the ability to initialize a collection of objects, and queries frequently need to return a collection, rather than just a single entity. (Multiple trips to the database to fetch entities individually is generally considered too wasteful, in both latency and bandwidth, to consider credibly as an alternative--see the Load-Time Paradox, below, for more.) As a result, we typically end up with one of Query-By-Example (QBE), Query-By-API (QBA), or Query-By-Language (QBL) approaches.

A QBE approach states that you fill out an object template of the type of object you're looking for, with fields in the object set to a particular value to use as part of the query-filtration process. So, for example, if you're querying the Person object/table for people with the last name of Smith, you set up the query like so:

Person p = new Person(); // assumes all fields are set to null by default
p.LastName = "Smith";
ObjectCollection oc = QueryExecutor.execute(p);

The problem with the QBE approach is obvious: while it's perfectly sufficient for simple queries, it's not nearly expressive enough to support the more complex style of query that frequently we need to execute--"find all Persons named Smith or Cromwell" and "find all Persons NOT named Smith" are two examples. While it's not impossible to build QBE approaches that handle this (and more complex scenarios), it definitely complicates the API significantly. More importantly, it also forces the domain objects into an uncomfortable position--they must support nullable fields/properties, which may be a violation of the domain rules the object would otherwise seek to support--a Person without a name isn't a very useful object, in many scenarios, yet this is exactly what a QBE approach will demand of domain objects stored within it. (Practitioners of QBE will often argue that it's not unreasonable for an object's implementation to take this into account, but again this is neither easy nor frequently done.)
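
For the curious, here is roughly what a QBE executor has to do under the hood, sketched against an in-memory list instead of a database. The QbeSketch class and its queryByExample method are invented for this illustration (they are not the QueryExecutor shown above), and the sketch assumes the template and the candidates are all of the same class.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of Query-By-Example matching via reflection.
final class QbeSketch {
    static class Person {
        String firstName; // must be nullable so a template can leave it unset
        String lastName;

        Person() {}

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Returns every candidate whose fields equal all non-null template fields.
    static <T> List<T> queryByExample(T template, List<T> candidates) {
        List<T> matches = new ArrayList<>();
        for (T candidate : candidates) {
            if (matchesTemplate(template, candidate)) matches.add(candidate);
        }
        return matches;
    }

    private static boolean matchesTemplate(Object template, Object candidate) {
        try {
            for (Field field : template.getClass().getDeclaredFields()) {
                if (field.isSynthetic()) continue;
                field.setAccessible(true);
                Object wanted = field.get(template);
                // A null template field means "don't care".
                if (wanted != null && !wanted.equals(field.get(candidate))) return false;
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Cannot read field for QBE match", e);
        }
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person("John", "Smith"),
            new Person("Oliver", "Cromwell"),
            new Person("Jane", "Smith"));
        Person template = new Person();
        template.lastName = "Smith";
        System.out.println(queryByExample(template, people).size()); // 2
    }
}
```

There is simply no field value that means "Smith or Cromwell", or "anything but Smith"; a template can only say "equal to this" or "don't care".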

As a result, usually the second step is to have the object system support a "Query-By-API" approach, in which queries are constructed by query objects, usually something of the form:

Query q = new Query();
q.From("PERSON").Where(
  new EqualsCriteria("PERSON.LAST_NAME", "Smith"));
ObjectCollection oc = QueryExecutor.execute(q);

Here, the query is not based on an empty "template" of the object to be retrieved, but off of a set of "query objects" that are used together to define a Command-style object for executing against the database. Multiple criteria are connected using some kind of binary construct, usually "And" and "Or" objects, each of which contains unique Criteria objects to test against. Additional filtration/manipulation objects can be tagged onto the end, usually by appending calls such as "OrderBy(field-name)" or "GroupBy(field-name)". In some cases, these method calls are actually objects constructed by the programmer and strung together explicitly.

Developers quickly note that the above approach is (generally) much more verbose than the traditional SQL approach, and certain styles of queries (particularly the more unconventional joins, such as outer joins) are much more difficult--if not impossible--to represent in the QBA approach.
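
To get a feel for that verbosity, here is a bare-bones sketch of criteria objects that compose with "Or", "And" and "Not" and render themselves as a WHERE clause. All of the class names are invented for the example (they only echo the EqualsCriteria shown above), and the values are inlined into the SQL text purely for readability; real code would bind them as statement parameters.

```java
// Hypothetical sketch of Query-By-API criteria objects.
final class CriteriaSketch {
    interface Criteria {
        String toSql();
    }

    static final class EqualsCriteria implements Criteria {
        private final String column;
        private final String value;

        EqualsCriteria(String column, String value) {
            this.column = column;
            this.value = value;
        }

        // Inlines the value for readability only; doubles any single quotes.
        public String toSql() {
            return column + " = '" + value.replace("'", "''") + "'";
        }
    }

    static final class Or implements Criteria {
        private final Criteria left;
        private final Criteria right;

        Or(Criteria left, Criteria right) {
            this.left = left;
            this.right = right;
        }

        public String toSql() {
            return "(" + left.toSql() + " OR " + right.toSql() + ")";
        }
    }

    static final class And implements Criteria {
        private final Criteria left;
        private final Criteria right;

        And(Criteria left, Criteria right) {
            this.left = left;
            this.right = right;
        }

        public String toSql() {
            return "(" + left.toSql() + " AND " + right.toSql() + ")";
        }
    }

    static final class Not implements Criteria {
        private final Criteria inner;

        Not(Criteria inner) {
            this.inner = inner;
        }

        public String toSql() {
            return "NOT (" + inner.toSql() + ")";
        }
    }

    static String select(String table, Criteria where) {
        return "SELECT * FROM " + table + " WHERE " + where.toSql();
    }

    public static void main(String[] args) {
        Criteria smithOrCromwell = new Or(
            new EqualsCriteria("PERSON.LAST_NAME", "Smith"),
            new EqualsCriteria("PERSON.LAST_NAME", "Cromwell"));
        System.out.println(select("PERSON", smithOrCromwell));
    }
}
```

Three object constructions and a method call to say what SQL says in a single line; and this sketch has not even attempted a join.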

On top of this, we have a more subtle problem, that of the reliance on developers' discipline: both the table name ("PERSON") and the column name in the criteria ("PERSON.LAST_NAME") are standard strings, taken as-is and fed to the system at runtime with no sort of validity-checking until then. This presents a classic problem in programming, that of the "fat-finger" error, where a developer doesn't actually query the "PERSON" table, but the "PRESON" table instead. While a quick unit-test against a live database instance will reveal the error during unit-testing, this presumes two facts--that the developers are religious about adopting unit-testing, and that the unit-tests are run against database instances. While the former is slowly becoming more of a guarantee as more and more developers become "test-infected" (borrowing Gamma's and Beck's choice of terminology), the latter is still entirely open to discussion and interpretation, owing to the fact that setting-up and tearing-down the database instance appropriately for unit tests is still difficult to do in a database. (While there are a variety of ways to circumvent this problem, few of them seem to be in use.)

We're also faced with the basic problem that greater awareness of the logical--or physical--data representation is required on the part of the developer--instead of simply focusing on how the objects are related to one another (through simple associations such as arrays or collection instances), the developer must now have greater awareness of the form in which the objects are stored, leaving the system somewhat vulnerable to database schema changes. This is sometimes obviated by a hybrid approach between the two, whereby the system will take responsibility for interpreting the associations, leaving the developer to write something like this:

Query q = new Query();
Field lastNameFieldFromPerson = Person.class.getDeclaredField("lastName");
q.From(Person.class).Where(new EqualsCriteria(lastNameFieldFromPerson, "Smith"));
ObjectCollection oc = QueryExecutor.execute(q);
Which solves part of the schema-awareness problem and the "fat-fingering" problem but still leaves the developer vulnerable to the concerns over verbosity and still doesn't address the complexity of putting together a more complex query, such as a multi-table (or multi-class, if you will) query joined on several criteria in a variety of ways.
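One caveat on the "fat-fingering" half of that claim: the reflective lookup still takes the field name as a string, so the compiler won't catch a misspelling. What changes is when the failure surfaces: getDeclaredField throws NoSuchFieldException at the lookup itself, before any query reaches the database. A minimal sketch, assuming a stand-in Person class with a lastName field (FieldLookupDemo and hasField are names made up for this example):

```java
import java.lang.reflect.Field;

// Stand-in domain class for the illustration; only the field names matter here.
class Person {
    private String firstName;
    private String lastName;
}

class FieldLookupDemo {
    // Returns true if Person declares a field with the given name.
    // A misspelled name is reported by reflection itself, not by the database.
    static boolean hasField(String name) {
        try {
            Field field = Person.class.getDeclaredField(name);
            return field != null;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("lastName found: " + hasField("lastName"));
        System.out.println("lastNmae found: " + hasField("lastNmae"));
    }
}
```

So the error moves earlier and no longer depends on a live database instance, but it remains a runtime error rather than a compile-time one.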

So, then, the next task is to create a "Query-By-Language" approach, in which a new language, similar to SQL but "better" somehow, is written to support the kind of complex and powerful queries normally supported by SQL; OQL and HQL are two examples of this. The problem here is that frequently these languages are a subset of SQL and thus don't offer the full power of SQL. More importantly, the O/R layer has now lost an important "selling point", that of the "objects and only objects" mantra that begat it in the first place; using a SQL-like language is almost just like using SQL itself, so how can it be more "objectish"? While developers may not need to be aware of the physical schema of the data model (the query language interpreter/executor can do the mapping discussed earlier), developers will need to be aware of how object associations and properties are represented within the language, and the subset of the object's capabilities within the query language--for example, is it possible to write something like this?

SELECT Person p1, Person p2 
FROM Person 
WHERE p1.getSpouse() == null 
  AND p2.getSpouse() == null 
  AND p1.isThisAnAcceptableSpouse(p2) 
  AND p2.isThisAnAcceptableSpouse(p1);
In other words, scan through the database and find all single people who find each other acceptable. While the "isThisAnAcceptableSpouse" method is clearly a method that belongs on the Person class (each Person instance may have its own criteria by which to judge the acceptability of another single--are they blonde, brunette, or redhead, are they making more than $100,000 a year, and so on), it's not clear if executing this method is possible in the query language, nor is it clear if it should be. Even for the most trivial implementations, a serious performance hit will be likely, particularly if the O/R layer must turn the relational column data into objects in order to execute the query. In addition, we have no guarantees that the developer wrote this method to be at all efficient, and no ways to enforce any sort of performance-aware implementation.
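To see where the cost comes from, consider what the O/R layer would have to do if it could only answer this query by materializing Person objects and calling the method on them. The sketch below is an assumption-laden stand-in: the Person fields, the income-based acceptability rule, and the MatchQuery class are all invented for illustration, and the "database" is just an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Person; the income threshold stands in for per-person criteria.
final class Person {
    final String name;
    final int income;
    final int minimumAcceptableIncome;
    Person spouse; // null means single

    Person(String name, int income, int minimumAcceptableIncome) {
        this.name = name;
        this.income = income;
        this.minimumAcceptableIncome = minimumAcceptableIncome;
    }

    boolean isThisAnAcceptableSpouse(Person other) {
        return other.income >= minimumAcceptableIncome;
    }
}

final class MatchQuery {
    int methodCalls = 0;

    // Every pair of singles must be turned into objects and tested in code;
    // the database cannot evaluate the method for us.
    List<String> findMutualMatches(List<Person> everyone) {
        List<String> matches = new ArrayList<>();
        for (int a = 0; a < everyone.size(); a++) {
            for (int b = a + 1; b < everyone.size(); b++) {
                Person p1 = everyone.get(a);
                Person p2 = everyone.get(b);
                if (p1.spouse != null || p2.spouse != null) continue;
                methodCalls++;
                if (!p1.isThisAnAcceptableSpouse(p2)) continue;
                methodCalls++;
                if (p2.isThisAnAcceptableSpouse(p1)) {
                    matches.add(p1.name + "+" + p2.name);
                }
            }
        }
        return matches;
    }
}

class MatchDemo {
    static String summary() {
        Person ann = new Person("Ann", 120, 100);
        Person bob = new Person("Bob", 110, 100);
        Person cal = new Person("Cal", 50, 0);
        Person dee = new Person("Dee", 200, 150);
        dee.spouse = new Person("Eve", 90, 0); // Dee is married, so excluded

        List<Person> everyone = new ArrayList<>();
        everyone.add(ann);
        everyone.add(bob);
        everyone.add(cal);
        everyone.add(dee);

        MatchQuery query = new MatchQuery();
        List<String> matches = query.findMutualMatches(everyone);
        return String.join(",", matches) + " (" + query.methodCalls + " method calls)";
    }

    public static void main(String[] args) {
        System.out.println(summary()); // Ann+Bob (4 method calls)
    }
}
```

The number of pairs grows with the square of the number of single people, and each pair costs up to two calls into code whose efficiency the O/R layer has no way to inspect.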

(Critics will argue that this is a workable problem, proposing two possible solutions. One is to encode the preference data in a separate table and make that part of the query; this will result in a hideously complicated query that will take several pages in length and likely require a SQL expert to untangle later when new preferential criteria want to be added. The other is to encode this "acceptability" implementation in a stored procedure within the database, which now removes code entirely from the object model and leaves us without an "object"-based solution whatsoever--acceptable, but only if you accept the premise that not all implementation can rest inside the object model itself, which rejects the "objects and nothing but objects" premise with which many O/R advocates open their arguments.)

The Partial-Object Problem and the Load-Time Paradox

It has long been known that network traversal, such as that done when making a traditional SQL request, takes a significant amount of time to process. (Rough benchmarks have placed this cost at anywhere from three to five orders of magnitude greater than a simple method call on either the Java or .NET platform (see endnote 5); by rough analogy, if it takes you twenty minutes to drive to work in the morning, and we call that the time required to execute a local method call, four orders of magnitude on top of that is 200,000 minutes, or roughly four and a half months of non-stop driving, one way.) This cost is clearly non-trivial, so as a result, developers look for ways to minimize it by optimizing the number of round trips and the amount of data retrieved.

In SQL, this optimization is achieved by carefully structuring the SQL request, making sure to retrieve only the columns and/or tables desired, rather than entire tables or sets of tables. For example, when constructing a traditional drill-down user interface, the developer presents a summary display of all the records from which the user can select one, and once selected, the developer then displays the complete set of data for that particular record. Given that we wish to do a drill-down of the Persons relational type described earlier, for example, the two queries to do so would be, in order (assuming the first one is selected):

SELECT id, first_name, last_name FROM person;

SELECT * FROM person WHERE id = 1;
In particular, take notice that only the data desired at each stage of the process is retrieved--in the first query, the necessary summary information and identifier (for the subsequent query, in case first and last name wouldn't be sufficient to identify the person directly), and in the second, the remainder of the data to display. In fact, most SQL experts will eschew the "*" wildcard column syntax, preferring instead to name each column in the query, both for performance and maintenance reasons--performance, since the database will better optimize the query, and maintenance, because there will be less chance of unnecessary columns being returned as DBAs or developers evolve and/or refactor the database table(s) involved. This notion of being able to return a part of a table (though still in relational form, which is important for reasons of closure, described above) is fundamental to the ability to optimize these queries this way--most queries will, in fact, only require a portion of the complete relation.

This presents a problem for most, if not all, object/relational mapping layers: the goal of any O/R is to enable the developer to see "nothing but objects", and yet the O/R layer cannot tell, from one request to another, how the objects returned by the query will be used. For example, it is entirely feasible that most developers will want to write something along the lines of:

Person[] all = QueryManager.execute(...);
Person selected = DisplayPersonsForSelection(all);
DisplayPersonData(selected);
Meaning, in other words, that once the Person to be displayed has been chosen from the array of Persons, no further retrieval action is necessary--after all, you have your object, what more should be necessary?

The problem here is that the data to be displayed in the first Display...() call is not the complete Person, but a subset of that data; here we face our first problem, in that an object-oriented system like C# or Java cannot return just "parts" of an object--an object is an object, and if the Person object consists of 12 fields, then all 12 fields will be present in every Person returned. This means that the system faces one of three uncomfortable choices: one, require that Person objects must be able to accommodate "nullable" fields, regardless of the domain restrictions against that; two, return the Person completely filled out with all the data comprising a Person object; or three, provide some kind of on-demand load that will obtain those fields if and when the developer accesses those fields, even indirectly, perhaps through a method call.

(Note that some object-based languages, such as ECMAScript, view objects differently than class-based languages, such as Java or C# or C++, and as a result, it is entirely possible to return objects which contain varying numbers of fields. That said, however, few languages possess such an approach, not even everybody's favorite dynamic-language poster child, Ruby, and until such languages become widespread, such discussion remains outside the realm of this essay.)

For most O/R layers, this means that objects and/or fields of objects must be retrieved in a lazy-loaded manner, obtaining the field data on demand, because retrieving all of the fields of all of the Person objects/relations would "clearly" be a huge waste of bandwidth for this particular scenario. Typically, the object's entire set of fields will be retrieved when any field not-yet-returned is accessed. (This approach is preferred to a field-by-field approach because there's less chance of the "N+1 query problem", in which retrieving all the data from an object requires 1 query to retrieve the primary key + N queries to retrieve each field from the table as necessary. The field-by-field approach does minimize the bandwidth consumed to retrieve data--no unaccessed field will have its data retrieved--but it clearly fails to minimize network round trips.)
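The trade-off between the two lazy-loading styles is easy to see by counting round trips. The sketch below uses a fake in-memory "database" that simply increments a counter per request; FakeDatabase, LazyPerson, and the city column are all assumptions made up for this illustration, not part of any real O/R layer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for a remote database; counts round trips instead of doing I/O.
final class FakeDatabase {
    private final Map<String, String> row = new HashMap<>();
    int roundTrips = 0;

    FakeDatabase() {
        row.put("id", "1");
        row.put("first_name", "John");
        row.put("last_name", "Smith");
        row.put("city", "Seattle");
    }

    // One round trip that returns only the requested columns.
    Map<String, String> fetch(List<String> columns) {
        roundTrips++;
        Map<String, String> result = new HashMap<>();
        for (String column : columns) {
            result.put(column, row.get(column));
        }
        return result;
    }
}

// Hypothetical lazy-loading proxy for a Person row.
final class LazyPerson {
    static final List<String> ALL_COLUMNS =
        List.of("id", "first_name", "last_name", "city");

    private final FakeDatabase db;
    private final boolean loadWholeRowOnMiss;
    private final Map<String, String> loaded = new HashMap<>();

    LazyPerson(FakeDatabase db, boolean loadWholeRowOnMiss) {
        this.db = db;
        this.loadWholeRowOnMiss = loadWholeRowOnMiss;
        // The initial query: primary key only.
        loaded.putAll(db.fetch(List.of("id")));
    }

    String get(String column) {
        if (!loaded.containsKey(column)) {
            List<String> toFetch = loadWholeRowOnMiss ? ALL_COLUMNS : List.of(column);
            loaded.putAll(db.fetch(toFetch));
        }
        return loaded.get(column);
    }
}

class LazyLoadDemo {
    // Touches three non-key fields and reports how many round trips that cost.
    static int roundTripsFor(boolean loadWholeRowOnMiss) {
        FakeDatabase db = new FakeDatabase();
        LazyPerson person = new LazyPerson(db, loadWholeRowOnMiss);
        person.get("first_name");
        person.get("last_name");
        person.get("city");
        return db.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("whole row on first miss: " + roundTripsFor(true));  // 2
        System.out.println("field by field:          " + roundTripsFor(false)); // 4
    }
}
```

Loading the whole row on the first miss costs two round trips no matter how many fields are touched; loading field by field costs one per field on top of the initial key query, which is the 1 + N pattern described above.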

Unfortunately, fields within the object are only part of the problem--the other problem we face is that objects are frequently associated with other objects, in various cardinalities (one-to-one, one-to-many, many-to-one, many-to-many), and an O/R mapping has to make some up-front decisions about when to retrieve these associated objects. Despite the best efforts of the O/R-M's developers, there will always be common use-cases where the decision made will be exactly the wrong thing to do. Most O/R-M's offer some kind of developer-driven decision-making support, usually some kind of configuration or mapping file, to identify exactly what the retrieval policy will be, but this setting is global to the class, and as such can't be changed on a situational basis.

Summary

Given, then, that objects-to-relational mapping is a necessity in a modern enterprise system, how can anyone proclaim it a quagmire from which there is no escape? Again, Vietnam serves as a useful analogy here--while the situation in South Indochina required a response from the Americans, there were a variety of responses available to the Kennedy and Johnson Administrations, including the same kind of response that the recent fall of Suharto in Indonesia generated from the US, which is to say, none at all. (Remember, Eisenhower and Dulles didn't consider South Indochina to be a part of the Domino Theory in the first place; they were far more concerned about Japan and Europe.)

Several possible solutions present themselves to the O/R-M problem, some requiring some kind of "global" action by the community as a whole, some more approachable to development teams "in the trenches":

  1. Abandonment. Developers simply give up on objects entirely, and return to a programming model that doesn't create the object/relational impedance mismatch. While distasteful, in certain scenarios an object-oriented approach creates more overhead than it saves, and the ROI simply isn't there to justify the cost of creating a rich domain model. ([Fowler] talks about this to some depth.) This eliminates the problem quite neatly, because if there are no objects, there is no impedance mismatch.
  2. Wholehearted acceptance. Developers simply give up on relational storage entirely, and use a storage model that fits the way their languages of choice look at the world. Object-storage systems, such as the db4o project, solve the problem neatly by storing objects directly to disk, eliminating many (but not all) of the aforementioned issues; there is no "second schema", for example, because the only schema used is that of the object definitions themselves. While many DBAs will faint dead away at the thought, in an increasingly service-oriented world, which eschews the idea of direct data access but instead requires all access go through the service gateway thus encapsulating the storage mechanism away from prying eyes, it becomes entirely feasible to imagine developers storing data in a form that's much easier for them to use, rather than DBAs.
  3. Manual mapping. Developers simply accept that it's not such a hard problem to solve manually after all, and write straight relational-access code to return relations to the language, access the tuples, and populate objects as necessary. In many cases, this code might even be automatically generated by a tool examining database metadata, eliminating some of the principal criticism of this approach (that being, "It's too much code to write and maintain").
  4. Acceptance of O/R-M limitations. Developers simply accept that there is no way to efficiently and easily close the loop on the O/R mismatch, and use an O/R-M to solve 80% (or 50% or 95%, or whatever percentage seems appropriate) of the problem and make use of SQL and relational-based access (such as "raw" JDBC or ADO.NET) to carry them past those areas where an O/R-M would create problems. Doing so carries its own fair share of risks, however, as developers using an O/R-M must be aware of any caching the O/R-M solution does within it, because the "raw" relational access will clearly not be able to take advantage of that caching layer.
  5. Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework. For the last decade or more, the emphasis on solutions to the O/R problem has focused on trying to bring objects closer to the database, so that developers can focus exclusively on programming in a single paradigm (that paradigm being, of course, objects). Over the last several years, however, interest in "scripting" languages with far stronger set and list support, like Ruby, has sparked the idea that perhaps another solution is appropriate: bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects". Work in this space has thus far been limited, constrained mostly to research projects and/or "fringe" languages, but several interesting efforts are gaining visibility within the community, such as functional/object hybrid languages like Scala or F#, as well as direct integration into traditional O-O languages, such as the LINQ project from Microsoft for C# and Visual Basic. One such effort that failed, unfortunately, was the SQL/J strategy; even there, the approach was limited, not seeking to incorporate sets into Java, but simply allowing embedded SQL calls to be preprocessed and translated into JDBC code by a translator.
  6. Integration of relational concepts into frameworks. Developers simply accept that this problem is solvable, but only with a change of perspective. Instead of relying on language or library designers to solve this problem, developers take a different view of "objects" that is more relational in nature, building domain frameworks that are more directly built around relational constructs. For example, instead of creating a Person class that holds its instance data directly in fields inside the object, developers create a Person class that holds its instance data in a RowSet (Java) or DataSet (C#) instance, which can be assembled with other RowSets/DataSets into an easy-to-ship block of data for update against the database, or unpacked from the database into the individual objects.
Note that this list is not presented in any particular order; while some are more attractive than others, which are "better" is a value judgment that every developer and development team must make for themselves.
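Of the options listed, manual mapping (option 3) is probably the easiest to sketch in a few lines. The example below is a rough illustration under stated assumptions: the tuple is modeled as a plain Map from column name to value so the code runs without a database, and Person, PersonMapper, and ManualMappingDemo are names invented here. Against a real database, the same shape would read from a JDBC java.sql.ResultSet using getInt and getString instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain domain object; no O/R layer involved.
final class Person {
    final int id;
    final String firstName;
    final String lastName;

    Person(int id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// Hand-written mapping code of the kind a generator could emit from metadata.
final class PersonMapper {
    // Converts one tuple (column name -> value) into a Person.
    static Person fromRow(Map<String, Object> row) {
        return new Person(
            (Integer) row.get("id"),
            (String) row.get("first_name"),
            (String) row.get("last_name"));
    }

    static List<Person> fromRows(List<Map<String, Object>> rows) {
        List<Person> people = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            people.add(fromRow(row));
        }
        return people;
    }
}

class ManualMappingDemo {
    private static Map<String, Object> row(int id, String first, String last) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("id", id);
        tuple.put("first_name", first);
        tuple.put("last_name", last);
        return tuple;
    }

    // Stand-in for the result of "SELECT id, first_name, last_name FROM person".
    static List<Person> load() {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row(1, "John", "Smith"));
        rows.add(row(2, "Jane", "Doe"));
        return PersonMapper.fromRows(rows);
    }

    public static void main(String[] args) {
        for (Person p : load()) {
            System.out.println(p.id + ": " + p.firstName + " " + p.lastName);
        }
    }
}
```

The mapping is explicit and dull, which is rather the point: the developer chooses exactly which columns come back and exactly when, at the cost of writing (or generating) this code for every type.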

Just as it's conceivable that the US could have achieved some measure of "success" in Vietnam had it kept to a clear strategy and understood a more clear relationship between commitment and results (ROI, if you will), it's conceivable that the object/relational problem can be "won" through careful and judicious application of a strategy that is clearly aware of its own limitations. Developers must be willing to take the "wins" where they can get them, and not fall into the trap of the Slippery Slope by looking to create solutions that increasingly cost more and yield less. Unfortunately, as the history of the Vietnam War shows, even an awareness of the dangers of the Slippery Slope is often not enough to avoid getting bogged down in a quagmire. Worse, it is a quagmire that is simply too attractive to pass up, a Siren song that continues to draw development teams from all sizes of corporations (including those at Microsoft, IBM, Oracle, and Sun, to name a few) against the rocks, with spectacular results. Lash yourself to the mast if you wish to hear the song, but let the sailors row.

Endnotes

1 Later analysis by the principals involved--including then-Secretary of Defense Robert McNamara--concluded that half of the attack never even took place.

2 It is perhaps the greatest irony of the war, that the man Fate selected to lead during America's largest foreign entanglement was a leader whose principal focus was entirely aimed within his own shores. Had circumstances not conspired otherwise, the hippies chanting "Hey, hey LBJ, how many boys did you kill today" outside the Oval Office could very well have been Johnson's staunchest supporters.

3 Ironically, encapsulation, for purposes of maintenance simplicity, turns out to be a major motivation for almost all of the major innovations in Linguistic Computer Science--procedural, functional, object, aspect, even relational technologies ([Date02]) and other languages all cite "encapsulation" as major driving factors.

4 We could, perhaps, consider stored procedure languages like T-SQL or PL/SQL to be "relational" programming languages, but even then, it's extremely difficult to build a UI in PL/SQL.

5 In this case, I was measuring Java RMI method calls against local method calls. Similar results are pretty easily obtainable for SQL-based data access by measuring out-of-process calls against in-process calls using a database product that supports both, such as Cloudscape/Derby or HSQL (Hypersonic SQL).

References

[Fussell]: Foundations of Object Relational Mapping, by Mark L. Fussell, v0.2 (mlf-970703)

[Fowler]: Patterns of Enterprise Application Architecture, by Martin Fowler

[Date04]: Introduction to Database Systems, 8th Edition, by Chris Date.

[Neward04]: Effective Enterprise Java


.NET | C++ | Java/J2EE | Ruby

Monday, June 26, 2006 10:59:14 AM (Pacific Daylight Time, UTC-07:00)
 Sunday, June 18, 2006
"Pragmatic Architecture" TechEd Webcast now up

Cathi Gero's and my session from TechEd, "Pragmatic Architecture", is now available as a webcast for your viewing and listening pleasure. We had a few issues with the audio, which got us started late, but overall the general feedback was positive. Enjoy...


.NET | C++ | Conferences | Java/J2EE | Windows | XML Services

Sunday, June 18, 2006 10:01:47 AM (Pacific Daylight Time, UTC-07:00)
 Thursday, June 15, 2006
To whatever moron pulled the fire alarm at the speaker hotel at TechEd....

You are a dead man if I ever find you.

That said, it was rather amusing to see which of the speakers were big enough geeks to bring their laptops outside when we were all evacuated.... :-)


Conferences

Thursday, June 15, 2006 1:09:20 AM (Pacific Daylight Time, UTC-07:00)
 Saturday, May 06, 2006
Can the CLR "go dynamic"? Absolutely... and arguably, already is

Larry O'Brien asks

Are you confident that continuations can be even semi-efficiently implemented on the CLR? I'm not.
and in turn references his blog, where he points out a quote from Patrick Logan that says "If Microsoft really looks at Ruby as competition then Microsoft has already lost the war" and offers this:
  1. If Microsoft thinks Ruby is important, they're ignoring the threat to them posed by X (where, I suspect, X = LISP), or
  2. If Microsoft thinks Ruby is competition, they will not implement it and therefore be doomed
Not long ago, Microsoft posted a job opening for a developer "first task will be to drive the exploration of other dynamic languages such as Ruby and JavaScript on the CLR", so my feeling is that if Microsoft could get a Ruby on the CLR, they'd be thrilled.
First of all, said job has already been filled--Jim Hugunin, of Jython fame, joined Microsoft some months ago on the CLR team and has since pushed a 1.0 beta of his IronPython implementation (which, according to Python benchmarks, is already faster than the corresponding native C Python implementation), available from Microsoft. Second of all, I won't suggest that I know what Mr. Logan was thinking when he made his comment, but I suspect he's thinking more about development process than technological issues. What's more, I don't agree with the comment at all: I think Microsoft needs to pursue high-level "scripting" languages on the CLR, if only because they ARE more productive than statically-typed languages like C#/Java/C++; this is the lesson we forgot, and inadvertently abandoned, from VB. Which leads me to suggest that Ruby is the VB of the next decade. Or, if not Ruby, then something like it.

Larry goes on to say:

Ruby is not easy to implement on the CLR, at least in part because a complete Ruby implementation requires continuations, which are not modeled within the CLR. This isn't just laziness on the part of language implementors. The CLR presents a machine architecture different than the wide-open architecture in which most compiler experience has been gathered. The CLR architecture is safer, but more restrictive, when it comes to manipulating the stack, which is central to continuations.
Ruby actually requires more in the way of support than just continuations, but it's not necessarily impossible to implement on the CLR; it's just hard to implement on the CLR in a high-performing manner using today's CLR. That's part of what Jim is there to do, evolve the CLR to better support languages with Ruby's interesting featureset (like open classes and the "method_missing" method) in such a way that it doesn't tear down perf.

Continuations are not impossible to support in principle; they are, however, more or less impossible to support today, given the lack of access to the underlying stack frames in the managed environment--you'd need some support from the runtimes (either the JVM or the CLR) to make it work. Such runtime support would not be too difficult to add, however, as both environments already have rich and powerful stack-walking mechanisms (because both environments use the thread stack as bookkeeping tools, among other things, and need to be able to crawl through those stack markers for a variety of reasons, such as security checks), and it would not be hard to create a runtime-level mechanism that allowed code to "take a snapshot" of the stack--and its related object graph--from a certain point to a certain point, and save off that state to some arbitrary location. In many respects, it would be similar to serializing an object, I believe. In fact, we could imagine something along the lines of:

// All this is totally C#-like pseudocode. Imagine 
// something similar for Java if you like. 
// 
public int ContinuationedMethod() { 
  SnapshotMarker sm = new SnapshotMarker(); 
    // in other words, the StackSnapshot will only crawl 
    // back to the SnapshotMarker referenced when we 
    // take the thread's snapshot; this way we don't crawl 
    // all the way back to the Thread's starting point (unless 
    // you really wanted to). 
  int x = 1; 
  for (int i=1; i<10; i++) {   // start at 1 so x isn't multiplied by zero
    x = x * i; 
    if (i == 5) { 
      StackSnapshot ss = Thread.Current.TakeSnapshot(sm); 
        // At this point, the managed stack is walked, heap-referenced 
        // objects are captured, the instruction label that we're on is 
        // saved, and a StackSnapshot is allocated and returned. 
        // However, when ss is later rehydrated--using, say, ss.Resume(), 
        // we need some way of knowing that. So, following the lead of the 
        // old Unix fork() call, I presume that a "null" return value from 
        // TakeSnapshot is our way of knowing that we are resuming. 
        // 
      if (ss == null) 
        continue; 
      else 
      {
        // store ss someplace for later retrieval and return, either by 
        // throwing an exception if you like or just plain-old-"return 0"
        // or something
        //
      }
    } 
  } 
  return x; 
}
This is the API that I cooked up in all of thirty seconds, but hopefully you get the idea--it would be difficult to do from outside the runtime, as the many exception-trace stack-frame approaches suggest.
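Lacking that runtime support, the closest thing that runs today is to capture the "stack" by hand. The sketch below is only a stand-in for the idea, written in Java: LoopSnapshot and ResumableLoop are invented names, the snapshot holds just the two locals the loop needs (the running product and the next index), and the loop starts at 1 so the product is non-zero. A real continuation would capture this state automatically and for arbitrary code; here the programmer does it explicitly.

```java
// Hypothetical, hand-rolled stand-in for the StackSnapshot idea: the captured
// "stack" is just the locals the loop needs in order to resume.
final class LoopSnapshot {
    final int x;     // running product so far
    final int next;  // next loop index to execute on resume

    LoopSnapshot(int x, int next) {
        this.x = x;
        this.next = next;
    }
}

final class ResumableLoop {
    static final int LIMIT = 10;

    // Runs the loop from the given state and pauses after iteration pauseAt,
    // returning the state needed to pick up where it left off.
    static LoopSnapshot runUntil(LoopSnapshot from, int pauseAt) {
        int x = from.x;
        for (int i = from.next; i < LIMIT; i++) {
            x = x * i;
            if (i == pauseAt) {
                return new LoopSnapshot(x, i + 1);
            }
        }
        return new LoopSnapshot(x, LIMIT); // ran to completion
    }

    // Resumes from a snapshot and runs the loop to the end.
    static int finish(LoopSnapshot from) {
        int x = from.x;
        for (int i = from.next; i < LIMIT; i++) {
            x = x * i;
        }
        return x;
    }
}

class ContinuationDemo {
    static int pausedThenResumed() {
        LoopSnapshot start = new LoopSnapshot(1, 1);
        LoopSnapshot paused = ResumableLoop.runUntil(start, 5); // x = 120, next = 6
        // The snapshot could be stored anywhere at this point and resumed later.
        return ResumableLoop.finish(paused);
    }

    static int straightThrough() {
        return ResumableLoop.finish(new LoopSnapshot(1, 1));
    }

    public static void main(String[] args) {
        System.out.println(pausedThenResumed()); // 362880
        System.out.println(straightThrough());   // 362880
    }
}
```

The tedium of doing this by hand for anything beyond a toy loop is exactly why the capture belongs inside the runtime.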

In the end, continuations are not, I believe, nearly as hard to implement--on either the JVM or the CLR--as some might suggest. Had I the money, I would go off and build the necessary Ruby-esque features we'd want into the CLR (through Rotor) or the JVM (through... uh... the JCL source, I guess, though the licensing there bothers me) for use. Anybody got some cash laying around to cover my mortgage while I do this? :-)


.NET | C++ | Java/J2EE | Ruby | XML Services

Saturday, May 06, 2006 11:19:09 PM (Pacific Daylight Time, UTC-07:00)
Another podcast with me goes live...

The guys over at Software Engineering Radio asked me to do a podcast a few months back, and it's now live on their website. They were particularly interested in language and new language development, so we spent a fair amount of time talking about Scala, F#, LINQ, and other interesting language developments in the world of the JVM and CLR. Have a listen, if you like...


.NET | Java/J2EE | Ruby | XML Services

Saturday, May 06, 2006 10:21:04 PM (Pacific Daylight Time, UTC-07:00)
 Friday, May 05, 2006
More on "Monad vs Ruby"... which really wasn't supposed to be a "vs" at all...

A while back, I blogged about how Monad (now Windows PowerShell) can be used to do a lot of the things the Ruby advocates say are among Ruby's biggest strengths, namely "scripting" and driving things from the REPL environment. Glenn Vanderburg jumped all over me, believing I was suggesting that this was some kind of contest by which Ruby was supposed to come out in the Negative Points Zone. Had that been my intent, I would heartily agree with his critique; unfortunately, that wasn't the point.

For a while now, people have been holding up Ruby as this "incredibly productive tool", with the implication that such productivity cannot be achieved on the platforms that are currently the standards among the industry--that is, the CLR and Java virtual machine. Dynamically-, weak- or non-typed languages, for example, are all the rage now because they mean we don't need to try and "fool" the compiler on a regular basis with typecasts, we can have things like closures and continuations, and so on. My point simply was to point out that such argument is FUD--we can have closures, continuations, and so on, in platforms like the CLR and JVM--it's just that the major languages of the day don't provide those features (yet).

The Ruby advocate may snicker at my splitting of hairs here--"OK, so your platform may support it, but your languages don't. I'd rather use a language and platform that does support these features, instead of waiting for somebody else to come along and give them to me." It's a fair question--why is this an important distinction? Because asking existing infrastructure and applications to switch to a new platform is almost impossible. Asking them to integrate with a new platform is painful. Asking developers--particularly those on the team who don't get the beauty of dynamic/weak/non-typed languages--to switch to an entirely new way of thinking is going to mean you're going to spend at least five years learning the "new way".

Getting the features we want from languages like Ruby onto a platform that we've already standardized on gives us a best-of-both-worlds result. Those developers who are comfortable with statically-typed objects can stay with statically-typed objects. Those who want the more dynamic features of "scripting" languages can have them. We can then blend the two together, to form an interesting and seamless whole: "Give me my Rails, but let me call into a J2EE Connector implementation to talk to the mainframe while we're at it." Getting Ruby to run on the JVM or CLR would be a Very Good Thing. It would answer one of the principal criticisms of Rails, for example, that of the idea that it doesn't do well when dealing with Large Enterprise Things--if Rails could begin a javax.transaction.UserTransaction before it kicks into ActiveRecord code, for example, suddenly we have a two-phase commit possibility that includes not just databases but mainframes and messaging systems. And yes, people DO need those kinds of Large Enterprise Things in the real world sometimes. Would it not be a win for Ruby to be able to hook into those without having to write all that code themselves? :-)

As a particular footnote to the discussion, Lee Holmes (whose blog entry I quoted to start all this) points out that there is a more concise version of the Monad (now Windows PowerShell) script that compares more favorably (in terms of the "lines of code" metric, which, as Glenn notes, I don't really put much stock in) to the Ruby version. But that's not the point--the point is, there *is* a way to do Ruby-esque things in the Windows world, even if you choose not to learn the Ruby syntax. If that makes your life easier, hoo-hah, Sergeant! If you instead want to use JScript.NET and compile into IL, hoo-hah, Sergeant! If you choose to do Ruby, hoo-hah, Sergeant! Whatever makes life easier for you to Get The Work Done.

But don't underestimate the costs of integration, and certainly don't sacrifice integration on the altar of productivity--the time that you saved up front writing the thing will be more than spent when you try to make integration with the rest of your company's IT systems (or your new partner's IT systems, or your new consortium's standards, or....) work. Integration has been, is now, and will remain the Number One challenge to companies today, and that's not going to change in the future, Ruby or no Ruby. :-)


.NET | Java/J2EE | Ruby | XML Services

Friday, May 05, 2006 11:03:12 AM (Pacific Daylight Time, UTC-07:00)
 Thursday, April 27, 2006
TheServerSide releases a TechTalk with yours truly

Best link to it is from their TechTalk page; the link to the actual stream itself redirects several times....

Some of you've been asking, where the heck have I been lately? Short story--I've been insanely busy. Long story--I've been insanely busy with a bunch of conferences and on-the-road consulting, not to mention the various writing projects I have afloat. Fear not, the blogging will resume shortly....


Java/J2EE

Thursday, April 27, 2006 11:29:19 PM (Pacific Daylight Time, UTC-07:00)
 Monday, April 03, 2006
Nice to know I'm missed

My son misses me terribly... or, apparently, at least he misses certain parts of me more than others, according to Charlotte's blog... (Warning, only for people who want personal life details. ;-) )




Monday, April 03, 2006 12:32:39 AM (Pacific Daylight Time, UTC-07:00)
 Sunday, April 02, 2006
Javapolis interview with me is now up and available

The Javapolis folks were kind (deluded? crazy? you pick the word) enough to put the interview of me that Dion did at the show online. Have a listen....


.NET | C++ | Java/J2EE | Ruby | XML Services

Sunday, April 02, 2006 10:32:00 PM (Pacific Daylight Time, UTC-07:00)
Comments [1]  | 
 Friday, March 31, 2006
Need Ruby? Nah--get Monad instead

I happened across this blog entry while doing some research on Monad, Microsoft's new command shell (meaning, think "bash" or "csh", not "Explorer"), and found it so similar in many ways to what guys in the Ruby space have been hyping for a year or two now, that I just had to pass it on. Reproducing directly from the site:

One of the scripts I like the most in my toolbox is the one that gives me answers to questions from the command line.

For the past 2 years or so, Encarta has offered an extremely useful Instant Answers feature. It's since been integrated into MSN Search, as well as a wildly popular Chat Bot. MoW showed how to use that feature through a Monad IM interface (via the Conversagent bot), but we can do a great job with good ol' screen scraping.

[C:\temp]
MSH:70 > get-answer "What is the population of China?"

Answer: China: Population, total: 1,313,973,700
2006 estimate
United States Census International Programs Center


Answer: China : Population:
More than 20 percent of the world's population lives in China. Of the co
untry's inhabitants, 92 percent are ethnic Han Chinese. The Han are desc
endants...

[C:\temp]
MSH:71 > get-answer "5^(e^(x^2))=50"

Answer: 5^(e ^( x^2))=50 = x=0.942428  x=-0.942428

[C:\temp]
MSH:72 > get-answer "define: canadian bacon"

Definition: Canadian bacon lean bacon

[C:\temp]
MSH:73 > get-answer "How many calories in an Apple?"

Answer: Apples: calories:
1.0 cup, quartered or chopped has 65 calories 1.0 NLEA serving has 80 calo
ries 1.0 small (2-1/2" dia) (approx 4 per lb) has 55 calories 1.0 medium (
2-3/4" dia) (approx 3 per lb) has 72 calories 1.0 large (3-1/4" dia) (appr
ox 2 per lb) has 110 calories 1.0 cup slices has 57 calories
USDA

[C:\temp]
MSH:74 > get-answer "How many inches in a light year?"

Answer: 1 lightyear = 372,461,748,226,857,000 inches
Here is the script, should you require your own command-line oracle:
## Get-Answer.msh 
## Use Encarta's Instant Answers to answer your question. 

param([string] $question = $(throw "Please ask a question.")) 

function Main 
{ 
   # Load the System.Web.HttpUtility DLL, to let us URLEncode 
   [void] [System.Reflection.Assembly]::LoadWithPartialName("System.Web") 

   ## Get the web page into a single string 
   $encoded = [System.Web.HttpUtility]::UrlEncode($question) 
   $text = get-webpage "http://search.msn.com/encarta/results.aspx?q=$encoded" 

   ## Get the answer with annotations 
   $startIndex = $text.IndexOf('<div id="results">') 
   $endIndex = $text.IndexOf('</div></div><h2>Results</h2>') 

   ## If we found a result, then filter the result 
   if(($startIndex -ge 0) -and ($endIndex -ge 0)) 
   { 
      $partialText = $text.Substring($startIndex, $endIndex - $startIndex) 
    
      ## Very fragile, voodoo screen scraping here 
      $regex = "<\s*a\s*[^>]*?href\s*=\s*[`"']*[^`"'>]+[^>]*>.*?</a>" 
      $partialText = [Regex]::Replace("$partialText", $regex, "") 
      $partialText = $partialText -replace "</div>", "`n" 
      $partialText = $partialText -replace "</span>", "`n" 
      $partialText = clean-html $partialText 
      $partialText = $partialText -replace "`n`n", "`n" 
     
      "" 
      $partialText.TrimEnd() 
   } 
   else 
   { 
      "" 
      "No answer found." 
   } 
} 

## Get a web page
function Get-WebPage ($url=$(throw "need to specify the URL to fetch")) 
{ 
    # canonicalize the url 
    if ($url -notmatch "^[a-z]+://") { $url = "http://$url" } 
     
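    $wc = new-object System.Net.WebClient  
    # $userAgent is never set in this listing; send the header only if the caller defined it 
    if ($userAgent) { $wc.Headers.Add("user-agent", $userAgent) } 
    $wc.DownloadString($url) 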
} 

## Clean HTML from a text chunk 
function Clean-Html ($htmlInput) 
{ 
    [Regex]::Replace($htmlInput, "<[^>]*>", "") 
} 

. Main
That's nifty stuff, if you ask me. And, best of all, this is a loosely-typed, dynamic language every bit as interesting and powerful as Ruby, though admittedly without some of the metaprogramming capabilities that Ruby has. But notice how the script leans on the vast power of the .NET framework underneath while staying pretty straightforward, in a way that's entirely dynamic and loosely-typed, including the assumed return value from the if/else block, and so on. It's going to be a whole new world for automating projects (among other things) once Monad ships, and the savvy .NET developer (and even the savvy Java developer who builds on Windows) will already be looking at Monad for ways to streamline the things they need to do on Windows.
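For readers who don't read Monad syntax, here is a rough Java sketch of the text-scrubbing steps the script above walks through: find the result block, drop the links, turn closing div/span tags into newlines, and strip whatever tags remain. It is only a sketch under some assumptions: the class name AnswerScrubber, the sample HTML, and the simplified anchor regex are all mine, not Lee's, and it works on an in-memory string rather than fetching anything from the network.

```java
import java.util.regex.Pattern;

final class AnswerScrubber {
    // Same idea as the script's Clean-Html: drop anything that looks like a tag.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    // Whole anchor elements; a simplified stand-in for the script's "voodoo" regex.
    private static final Pattern ANCHOR =
            Pattern.compile("<\\s*a\\s[^>]*>.*?</a>", Pattern.DOTALL);

    static String extractAnswer(String html, String startMarker, String endMarker) {
        int start = html.indexOf(startMarker);
        int end = html.indexOf(endMarker);
        if (start < 0 || end < 0 || end < start) {
            return "No answer found.";
        }
        String partial = html.substring(start, end);
        partial = ANCHOR.matcher(partial).replaceAll("");       // remove links entirely
        partial = partial.replace("</div>", "\n")                // block ends become newlines
                         .replace("</span>", "\n");
        partial = TAG.matcher(partial).replaceAll("");           // strip remaining tags
        partial = partial.replaceAll("\n{2,}", "\n");            // collapse blank lines
        return partial.trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample page, made up for illustration.
        String sample = "<html><div id=\"results\"><span>Answer: 42</span>"
                + "<a href=\"x\">more</a></div><h2>Results</h2></html>";
        // prints: Answer: 42
        System.out.println(extractAnswer(sample, "<div id=\"results\">", "<h2>Results</h2>"));
    }
}
```

Note how much more ceremony the statically-typed version needs for the same handful of string operations, which is rather the point of the post.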

Added to my list of developer interview questions: "What is Monad, and why do I care?"


.NET | Ruby

Friday, March 31, 2006 7:20:34 PM (Pacific Daylight Time, UTC-07:00)
Comments [7]  | 
 Friday, March 24, 2006
Why programmers shouldn't fear offshoring

Recently, while engaging in my other passion (international relations), I was reading the latest issue of Foreign Affairs, and ran across an interesting essay regarding the increasing outsourcing--or, the term they introduce which I prefer in this case, "offshoring"--of technical work. Its analysis solidifies why I think programmers shouldn't fear offshoring, but instead embrace it and ride the wave to a better life for both us and consumers. Permit me to explain.

The essay, entitled "Offshoring: The Next Industrial Revolution?" (by Alan S. Blinder), opens with an interesting point, made subtly, that offshoring (or "offshore outsourcing") is really a natural economic consequence:

In February 2004, when N. Gregory Mankiw, a Harvard professor then serving as chairman of the White House Council of Economic Advisers, caused a national uproar with a "textbook" statement about trade, economists rushed to his defense. Mankiw was commenting on the phenomenon that has been clumsily dubbed "offshoring" (or "offshore outsourcing")--the migration of jobs, but not the people who perform them, from rich countries to poor ones. Offshoring, Mankiw said, is only "the latest manifestation of the gains from trade that economists have talked about at least since Adam Smith. ... More things are tradable than were tradable in the past, and that's a good thing." Although Democratic and Republican politicians alike excoriated Mankiw for his callous attitude toward American jobs, economists lined up to support his claim that offshoring is simply international business as usual.

Their economics were basically sound: the well-known principle of comparative advantage implies that trade in new kinds of products will bring overall improvements in productivity and well-being. But Mankiw and his defenders underestimated both the importance of offshoring and its disruptive effect on wealthy countries. Sometimes a quantitative change is so large that it brings qualitative changes, as offshoring likely will. We have so far barely seen the tip of the offshoring iceberg, the eventual dimensions of which may be staggering.

So far, you're not likely convinced that this is a good thing, and Blinder's article doesn't really offer much reassurance as you go on:

To be sure, the furor over Mankiw's remark was grotesquely out of proportion to the current importance of offshoring, which is still largely a prospective phenomenon. Although there are no reliable national data, fragmentary studies indicate that well under a million service-sector jobs have been lost to offshoring to date. (A million seems impressive, but in the gigantic and rapidly churning U.S. labor market, a million jobs is less than two weeks' worth of normal gross job losses.) [1] However, constant improvements in technology and global communications will bring much more offshoring of "impersonal services"--that is, services that can be delivered electronically over long distances, with little or no degradation in quality.

That said, we should not view the coming wave of offshoring as an impending catastrophe. Nor should we try to stop it. The normal gains from trade mean that the world as a whole cannot lose from increases in productivity, and the United States and other industrial countries have not only weathered but also benefited from comparable changes in the past. But in order to do so again, the governments and societies of the developed world must face up to the massive, complex, and multifaceted challenges that offshoring will bring. National data systems, trade policies, educational systems, social welfare programs, and politics must all adapt to new realities. Unfortunately, none of this is happening now.

Phrases like "the world cannot lose from increases in productivity" are hardly comforting to programmers who are concerned about their jobs, and "nor should we try to stop" the impending wave of offshoring is not what most programmers want to hear. But there's an interesting analytical point that I think Blinder misses about the software industry, and to make it I have to walk through his argument a bit. I'm not going to quote the entirety of the article to you, don't worry. Bear with me, it's worth the ride, I think.

Why Offshoring

Blinder first describes the basics of "comparative advantage" and why it's important in this context:

Countries trade with one another for the same reasons that individuals, businesses and regions do: to exploit their comparative advantages. Some advantages are "natural": Texas and Saudi Arabia sit atop massive deposits of oil that are entirely lacking in New York and Japan, and nature has conspired to make Hawaii a more attractive tourist destination than Greenland. There is not much anyone can do about such natural advantages.

But in modern economics, nature's whimsy is far less important than it was in the past. Today, much comparative advantage derives from human effort rather than natural conditions. The concentration of computer companies around Silicon Valley, for example, has nothing to do with bountiful natural deposits of silicon; it has to do with Xerox's fabled Palo Alto Research Center, the proximity of Stanford University, and the arrival of two young men named Hewlett and Packard. Silicon Valley could have sprouted up anywhere.

One important aspect of this modern reality is that patterns of man-made comparative advantage can and do change over time. The economist Jagdish Bhagwati has labeled this phenomenon "kaleidoscopic comparative advantage", and it is critical to understanding offshoring. Once upon a time, the United Kingdom had a comparative advantage in textile manufacturing. Then that advantage shifted to New England, and so jobs were moved from the United Kingdom to the United States. [2] Then the comparative advantage in textile manufacturing shifted once again--this time to the Carolinas--and jobs migrated south within the United States. [3] Now the comparative advantage in textile manufacturing resides in China and other low-wage countries, and what many are wont to call "American jobs" have been moved there as a result.

Of course, not everything can be traded across long distances. At any point in time, the available technology--especially in transportation and communications [4]--largely determines what can be traded internationally and what cannot. Economic theorists accordingly divide the world's goods and services into two bins: tradable and non-tradable. Traditionally, any item that could be put in a box and shipped (roughly, manufactured goods) was considered tradable, and anything that could not be put into a box (such as services) or was too heavy to ship (such as houses) was thought of as nontradable. But because technology is always improving and transportation is becoming cheaper and easier, the boundary between what is tradable and what is not is constantly shifting. And unlike comparative advantage, this change is not kaleidoscopic; it moves in only one direction, with more and more items becoming tradable.

The old assumption that if you cannot put it in a box, you cannot trade it is thus hopelessly obsolete. Because packets of digitized information play the role that boxes used to play, many more services are now tradable and many more will surely become so. In the future, and to a great extent already, the key distinction will no longer be between things that can be put in a box and things that cannot. Rather, it will be between services that can be delivered electronically and those that cannot.

Blinder goes on to describe the three industrial revolutions, the first being the one we all learned in school, coming at the end of the 18th century and marked by Adam Smith's The Wealth of Nations in 1776. It was a massive shift in the economic system, as workers in industrializing countries migrated from farm to factory. "It has been estimated that in 1810, 84 percent of the U.S. work force was engaged in agriculture, compared to a paltry 3 percent in manufacturing. By 1960, manufacturing's share had risen to almost 25 percent and agriculture's had dwindled to just 8 percent. (Today, agriculture's share is under 2 percent.)" (This statistic is important, by the way--keep it in mind as we go.) He goes on to point out the second Revolution, the shift from manufacturing to services:

Then came the second Industrial Revolution, and jobs shifted once again--this time away from manufacturing and toward services. The shift to services is still viewed with alarm in the United States and other rich countries, where people bemoan rather than welcome the resulting loss of manufacturing jobs [5]. But in reality, new service-sector jobs have been created far more rapidly than old manufacturing jobs have disappeared. In 1960, about 35 percent of nonagricultural workers in the United States produced goods and 65 percent produced services. By 2004, only about one-sixth of the United States' nonagricultural jobs were in goods-producing industries, while five-sixths produced services. This trend is worldwide and continuing.

It's also important to point out that the years from 1960 to 2004 saw a meteoric rise in the average standard of living for the United States, on a scale that's basically unheard of in history. In fact, it was SUCH a huge rise that it became an expectation that your children would live better than you did, and it is the inability to keep that basic expectation in place (which has become a core part of the so-called "American Dream") that creates major societal angst in the United States today.

We are now in the early stages of a third Industrial Revolution--the information age. The cheap and easy flow of information around the globe has vastly expanded the scope of tradable services, and there is much more to come. Industrial revolutions are big deals. And just like the previous two, the third Industrial Revolution will require vast and unsettling adjustments in the way Americans and residents of other developed countries work, live, and educate their children.

Wow, nothing unsettles people more than statements like "the world you know will cease to exist" and the like. But think about this for a second: despite the basic "growing pains" that accompanied the transitions themselves, on just about every quantifiable scale imaginable, we live a much better life today than our forebears did just two hundred years ago, and orders of magnitude better than our forebears did three hundred or more years ago (before the first Industrial Revolution). And if you still hearken back to the days of the "American farmer" with some kind of nostalgia, you never worked on a farm. Trust me on this.

So what does this mean?

But now we start to come to the interesting part of the article.

But a bit of historical perspective should help temper fears of offshoring. The first Industrial Revolution did not spell the end of agriculture, or even the end of food production, in the United States. It just meant that a much smaller percentage of Americans had to work on farms to feed the population. (By charming historical coincidence, the actual number of Americans working on farms today--around 2 million--is about what it was in 1810.) The main reason for this shift was not foreign trade, but soaring farm productivity. And most important, the massive movement of labor off the farms did not result in mass unemployment. Rather, it led to a large-scale reallocation of labor to factories.
Here's where we get to the "hole" in the argument. Most readers will read that paragraph, do the simple per-capita math, and conclude that thanks to soaring productivity gains in the programming industry (cite whatever technology you want here--Ruby, objects, hardware gains, it really doesn't matter what), the percentage of programmers in the country is about to fall into a black hole. After all, if we can go from 84 percent of the population involved in agriculture to less than 2% or so, thanks to that soaring productivity, why wouldn't it happen here again?

Therein lies the flaw in the argument: the amount of productivity required to achieve the desired ends is constant in the agriculture industry, yet a constantly changing value in software. This is what I will posit as the Grove-Gates Maxim: "What Andy Grove giveth, Bill Gates taketh away."

The Grove-Gates Maxim

The argument here is simple. The process of growing food is a pretty constant one: put seed in ground, wait until it comes up, then harvest the results and wait until next year to start again. Although we might have numerous tools that make it easier to put seeds into the ground, or harvest the results, or even increase the yield of the crop when it comes up, the basic amount of productivity required is pretty much constant. (My cousin, the FFA Farmer of the Year from some years back and a seed hybrid researcher in Iowa, might disagree with me, mind you.) Compare this with the software industry: the difference between what our users consider an acceptable application today and what they accepted even ten years ago is an order of magnitude. Gains in productivity have not yielded the ability to build applications faster and faster, but instead have created a situation where users and managers ask more of us with each successive application.

The Grove-Gates Maxim is an example of that: every time Intel (where Andy Grove was CEO) releases new hardware that accelerates the power and potential of what the "average" computer (meaning, priced at somewhere between $1500 and $2000) is capable of, it seems that Microsoft (Mr. Gates' little firm) releases a new version of Windows that sucks up that power by providing a spiffier user interface and "eye-candy" features, be they useful/important or not. In other words, the more the hardware creates possibilities, the more software is created to exploit and explore those possibilities. The additional productivity is spent not in reducing the time required to produce the thing desired (food in the case of agriculture, an operating system or other non-trivial program in the case of software), but in the expansion of the functionality of the product.

This basic fact, the Grove-Gates Maxim, is what saves us from the bloody axe of forced migration. Because what's expected of software is constantly on the same meteoric rise as what productivity gains provide us, the need for programmer time remains pretty close to constant. Now, once the desire for exponentially complicated features starts to level off, the exponentially increasing gains in productivity will have the same effect as they did in the agricultural industry, and we will start seeing a migration of programmers into other, "personal service" industries (which are hard to offshore, as opposed to "impersonal service" industries which can be easily shipped overseas).

Implications

What does this mean for programmers? For starters, as Dave Thomas has already frequently pointed out on NFJS panels, programmers need to start finding ways to make their service a "personal service" position rather than an "impersonal service" one. Blinder points out that the services industry is facing a split down the middle along this distinction, and it's not necessarily a high-paying vs low-paying divide:

Many people blithely assume that the critical labor-market distinction is, and will remain, between highly educated (or highly-skilled) people and less-educated (or less-skilled) people--doctors versus call-center operators, for example. The supposed remedy for the rich countries, accordingly, is more education and a general "upskilling" of the work force. But this view may be mistaken. Other things being equal, education and skills are, of course, good things; education yields higher returns in advanced societies, and more schooling probably makes workers more flexible and more adaptable to change. But the problem with relying on education as the remedy for potential job losses is that "other things" are not remotely close to equal. The critical divide in the future may instead be between those types of work that are easily deliverable through a wire (or via wireless connections) with little or no diminution in quality and those that are not. And this unconventional divide does not correspond well to traditional distinctions between jobs that require high levels of education and jobs that do not.

A few disparate examples will illustrate just how complex--or, rather, how untraditional--the new divide is. It is unlikely that the services of either taxi drivers or airline pilots will ever be delivered electronically over long distances. The first is a "bad job" with negligible educational requirements; the second is quite the reverse. On the other hand, typing services (a low-skill job) and security analysis (a high-skill job) are already being delivered electronically from India--albeit on a small scale so far. Most physicians need not fear that their jobs will be moved offshore, but radiologists are beginning to see this happening already. Police officers will not be replaced by electronic monitoring, but some security guards will be. Janitors and crane operators are probably immune to foreign competition; accountants and computer programmers are not. In short, the dividing line between the jobs that produce services that are suitable for electronic delivery (and are thus threatened by offshoring) and those that do not does not correspond to traditional distinctions between high-end and low-end work.

What are the implications here for somebody deep in our industry? Pay close attention to Blinder's conclusion, that computer programmers are highly vulnerable to foreign competition, based on the assumption that the product we deliver is easily transferable across electronic media. But there is hope:

There is currently not even a vocabulary, much less any systematic data, to help society come to grips with the coming labor-market reality. So here is some suggested nomenclature. Services that cannot be delivered electronically, or that are notably inferior when so delivered, have one essential characteristic: personal, face-to-face contact is either imperative or highly desirable. Think of the waiter who serves you dinner, the doctor who gives you your annual physical, or the cop on the beat. Now think of any of those tasks being performed by robots controlled from India--not quite the same. But such face-to-face human contact is not necessary in the relationship you have with the telephone operator who arranges your conference call or the clerk who takes your airline reservation over the phone. He or she may be in India already.

The first group of tasks can be called personally-delivered services, or simply personal services, and the second group impersonally delivered services, or impersonal services. In the brave new world of globalized electronic commerce, impersonal services have more in common with manufactured goods that can be put in boxes than they do with personal services. Thus, many impersonal services are destined to become tradable and therefore vulnerable to offshoring. By contrast, most personal services have attributes that cannot be transmitted through a wire. Some require face-to-face contact (child care), some are inherently "high-risk" (nursing), some involve high levels of personal trust (psychotherapy), and some depend on location-specific attributes (lobbying).

In other words, programmers who want to remain less vulnerable to foreign competition need to find ways to stress the personal, face-to-face contact between themselves and their clients, whether they are full-time employees of a company, contractors, or consultants (or part of a team of consultants) working on a project for a client. Look for ways to maximize the four attributes Blinder points out:
  • Face-to-face contact. Agile methodologies demand that customers be easily accessible in order to answer questions regarding implementation decisions or to resolve lack of understanding of the requirements. Instead of demanding customers be present at your site, you may find yourself in a better position if you put yourself close to your customers.
  • "High-risk". This is a bit harder to do with software projects--either the project is inherently high-risk in its makeup (perhaps this is a mission-critical app that the company depends on, such as the e-commerce portal for an online e-tailer), or it's not. There's not much you can do to change this, unless you are politically savvy enough to "sell" your project to a group that would make it mission-critical.
  • High levels of personal trust. This is actually easier than you might think--trust in this case refers not to the privileged nature of therapist-patient communication, but to the confidence the organization has in you to carry out the work required. One way to build this trust is to understand the business domain of the client, rather than remaining aloof and "staying focused on the technology". This trust-based approach is already present in a variety of forms outside our industry--regardless of the statistical ratings that might be available, most people find that they have a favorite auto repair mechanic or shop not for any quantitatively-measurable reason, but because the mechanic "understands" them somehow. The best customer-service shops understand this, and have done so for years. The restaurant that recognizes me as a regular after just a few visits and has my Diet Coke ready for me at my favorite table is far likelier to get my business on a regular basis than the one that never does. Learn your customers, learn their concerns, learn their business model and business plan, and get yourself into the habit of trying to predict what they might need next--not so you can build it already, but so that you can demonstrate to them that you understand them, and by extension, their needs.
  • Location-specific attributes. Sometimes, the software being built is localized to a particular geographic area, and simply being in that same area can yield significant benefits, particularly when heroic efforts are called for. (It's very hard to flip the reset switch on a server in Indiana from a console in India, for example.)
In general, what you're looking to do is demonstrate that your value to the company arises from more than just your technical skill: it also comes from other qualities that you can provide in better and more valuable form than somebody in India (or China, or Brazil, or across the country for that matter, wherever the offshoring goes). It's no guarantee that you won't still be offshored--some management folks will just see bottom-line dollars and not recognize the intangible value-add that high levels of personal trust or locality provide--but it'll minimize the risk on the whole.

But even if this analysis doesn't make you feel a little more comfortable, consider this: there are 1 billion people in China alone, and close to the same in India. Instead of seeing them as potential competition, imagine what happens when the wages from the offshored jobs start to create a demand for goods and services in those countries--if you think the software market in the U.S. was hot a decade ago, when only a half-billion people (across both the U.S. and Europe) were demanding software, now think about it when four times that many start looking for it.


Footnotes

[1] Which in and of itself is an interesting statistic--it implies that offshoring is far less prevalent than some of the people worried about it believe it to be, including me.

[2] Interesting bit of trivia--part of the reason that advantage shifted was that the US stole (yes, stole, as in industrial espionage, one of the first recorded cases of modern industrial espionage) the plans for modern textile machinery from the UK. Remember that, next time you get upset at China's rather loose grip of intellectual property law....

[3] Which, by the way, was a large part of the reason we fought the Civil War (the "War Between the States" to some, or the "War of Northern Aggression" to others)--the Carolinas depended on slave labor to pick their cotton cheaply, and couldn't acquire Northern-made machinery cheaply to replace the slaves. Hence, for that (and a lot of other reasons), war followed.

[4] An interesting argument--is there any real difference between transportation and communications? One ships "stuff", the other "data"; beyond that, is there any difference?

[5] And, I'd like to point out, the shrinking environmental damage that can arise from a manufacturing-based economy. Services rarely generate pollution, which is part of the clash between the industrialized "Western" nations and the developing "Southern" ones over environmental issues.

Resources

"Offshoring: The Next Industrial Revolution?", by Alan S. Blinder, Foreign Affairs (March/April 2006), pp. 113-128.


.NET | C++ | Development Processes | Java/J2EE | Reading | Ruby | XML Services

Friday, March 24, 2006 2:43:00 AM (Pacific Daylight Time, UTC-07:00)
Comments [6]  | 
 Monday, March 20, 2006
Take a non-technical moment and support the fight against diabetes

Scott Hanselman has diabetes. So does a good friend of mine, who discovered it during our third year of college. Take a moment and spare US$20 to support Scott in his fight against it. Do it because you probably know somebody who has it, or will before long. If nothing else, do it because it's a cheap way to support somebody who's indirectly responsible for this blog (Scott maintains dasBlog, which is the blogging engine I use now).

Out.




Monday, March 20, 2006 11:10:22 PM (Pacific Daylight Time, UTC-07:00)
Comments [0]  | 
 Sunday, March 19, 2006
Another annoying nit in Java, fixed

CLASSPATH now supports wildcards to pick up multiple .jar files. Finally. But given that the AppClassLoader is a derivative of the standard URLClassLoader, I still don't see why I can't put full URLs on the CLASSPATH and expect them to be resolved correctly....

Oh, and while we're at it, you shouldn't be using CLASSPATH-the-environment-variable anymore, anyway. There are far too many ways to manage .jar file resolution to be falling back to that old hack. Prefer instead to put your .jar file dependencies inside the manifest of your application's .jar file, or at the very least, specify the classpath to the java launcher when you kick it off. If you're still doing "set CLASSPATH=..." at the command-line, you're about ten years behind the times.
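To make the manifest option concrete, here's a minimal sketch that builds the manifest text with the JDK's own java.util.jar.Manifest class. The main class name and the two jar names are made up for illustration, and the ManifestClassPath helper is mine; in a real build your jar tool or build script would normally write this file for you.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class ManifestClassPath {
    // Returns the text of a MANIFEST.MF that names its own jar dependencies.
    static String buildManifest(String mainClass, String... jars) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, or write() emits no main attributes at all.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MAIN_CLASS, mainClass);
        // Class-Path entries are space-separated and resolved relative to the jar itself.
        main.put(Attributes.Name.CLASS_PATH, String.join(" ", jars));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical application layout: app.jar next to a lib/ directory.
        System.out.print(buildManifest("com.example.Main", "lib/util.jar", "lib/driver.jar"));
    }
}
```

With a manifest like that inside app.jar, "java -jar app.jar" picks up the dependencies with no environment variable involved. The launcher-flag alternative would look something like java -cp "app.jar;lib/*" com.example.Main on Windows (again, hypothetical names), using the new wildcard support.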


Java/J2EE

Sunday, March 19, 2006 2:11:35 AM (Pacific Daylight Time, UTC-07:00)
Comments [2]  | 
At last, a minor but annoying nit in Java, fixed

Mustang, the latest JDK (to be called Java6 on its release), fixes a minor but very annoying nit that's bugged Java developers for years: "if a JDBC driver is packaged as a service, you can simply (leave out the call to Class.forName() to bootstrap the driver class into the JVM). ... The DriverManager code will look for implementations of java.sql.Driver in classpath and do Class.forName() implicitly." It's never been a huge deal in Java, to have to explicitly bootstrap the JDBC driver into the JVM before being able to obtain a Connection from it, but it's always been annoying, and inexplicable, given the Service Provider mechanism that's been there for a couple of releases now.
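If you want to see which drivers would be picked up this way, the Service Provider lookup can be sketched directly with java.util.ServiceLoader (a public class as of Java 6). This is only an illustration of the mechanism the quote describes, not DriverManager's actual code; the DriverDiscovery class is my own name, and on a classpath with no JDBC driver jars the list will simply be empty.

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class DriverDiscovery {
    // Lists the class names of every java.sql.Driver advertised as a service,
    // i.e. named in a META-INF/services/java.sql.Driver file on the classpath.
    static List<String> discoveredDrivers() {
        List<String> names = new ArrayList<String>();
        for (Driver driver : ServiceLoader.load(Driver.class)) {
            names.add(driver.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = discoveredDrivers();
        if (names.isEmpty()) {
            System.out.println("No JDBC drivers packaged as services on the classpath.");
        }
        for (String name : names) {
            System.out.println("Found driver: " + name);
        }
    }
}
```

Any driver that shows up in that list is one for which the Class.forName() line can go away; a driver jar that doesn't ship the services file still needs the old explicit bootstrap.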

Next up: do the same for JNDI and XML parsers (and do away with the extensions directory while we're at it!), and let's start making all of these factories a bit more practical to use on a larger basis. It would be nice if this mechanism had percolated through other tools and areas, such as servlets, but it's probably too late to correct for that now.


Java/J2EE

Sunday, March 19, 2006 2:07:44 AM (Pacific Daylight Time, UTC-07:00)
Comments [0]  |