The opinions expressed herein are my own personal opinions and do not represent
my employer's view in any way.
© Copyright 2013, Ted Neward
 Tuesday, March 27, 2007
Consider the effect of your words before you post or comment
Kathy Sierra, author of the Head-First books and a well-spoken writer on human-computer interface topics in general, has withdrawn from the blogosphere because of death threats posted to her through the blogosphere. (Be warned, that post has some pretty graphic material in it, definitely not for children.) The result? Kathy has not only decided to stop posting to her blog (for now, hopefully not a permanent state of affairs), but she is in fact in fear for her life:
"As I type this, I am supposed to be in San Diego, delivering a workshop at the ETech conference. But I'm not. I'm at home, with the doors locked, terrified."
How incredibly sad for the industry, when one person can effectively douse a bright light like Kathy's. Of course, Kathy has my full support and sympathy--as the author of some outspoken pieces, I've been targeted by some heated voices, but never by anything like what she's now suffering. I really can't imagine what she's feeling right now, and I really hope I never do.
But the death threats to one side, the anonymous nature of the blogosphere (and the Internet as a whole) is creating a very real danger of shutting down this incredible social environment we call home. Kathy's experience is only the most extreme end of the spectrum; every blogger has seen their share of "virtual hecklers", people whose comments consist of nothing more intellectual than "you're an idiot" or "your mother should be ashamed of having not had an abortion before you were born" (which is an actual comment I received once). I recognize that when one posts to the blogosphere, one is putting oneself into the public crosshairs, and a certain amount of abuse is to be expected. Hell, sometimes that kind of reaction is what a blogger is gunning for--nothing provokes a good discussion around an idea better than an outrageous opinionated statement!
I've never questioned the right of people to comment on my blog and call me names (or, at least, what they think is a name--the guy who tries to insult me by calling me "the next Microsoft employee" just really doesn't get it), partly because that's part of the Free Speech idea, and partly because if I can't handle the pressure I shouldn't be running with the big dogs. But folks, let's be honest: if I were to say to you that I get warm fuzzy feelings when somebody posts a personal attack on my character, I'd be lying. Here's the great admission: It does hurt. Of course it hurts. How could it not? Nobody likes to be insulted. Nobody likes to have their intelligence called into question. You wouldn't like it if somebody said the same about you, would you?
I'm not suggesting that people who disagree with a blogger's opinions should just roll over and shut up--hardly. You have every right to disagree and offer up your reasons for disagreement. But never lose sight of the fact that behind the blog is a real person, with feelings and a family and the same emotional range as yourself. Or else we may all find the blogosphere reduced to people screaming shrilly at each other while the smart ones quietly slip away to find a better way to hold their discussions. And that doesn't help anybody.
Tuesday, March 27, 2007 9:00:25 AM (Pacific Daylight Time, UTC-07:00)
 Thursday, March 22, 2007
RedHat, Inc: The Next Microsoft?
Think that RedHat is still the open source capital of the Internet, all happy-happy-joy-joy with its supporters and liberal-minded in its goals? Take a look at this and tell me if your mind isn't changed a little:
Enclosed is a copy of the form letter they sent out to many companies that offer Hibernate consulting and training.
Dear Sir or Madam:
Red Hat, Inc. has become aware that your company is offering Hibernate training courses. Red Hat does not allow the use of its trademarks without a written agreement.
Red Hat is the owner of numerous trademarks, including but not limited to, its Hibernate mark, U.S. Federal Registration Number 3135582. RedHat has made extensive use of its Hibernate marks in interstate and international commerce in connection with the advertising, promotion, and sale of its goods and services. Due widespread use, advertising and extensive marketing, the RedHat marks have become famous.
Red Hat requests that you immediately cease offering Hibernate branded training, as well as any other training that may contain Red Hat marks or marks that are confusingly similar. Although you may offer object oriented relational database mapping training, you may not use the Hibernate name to promote and advertise your products and services.
We trust you will understand Red Hat's interest in protecting its valuable intellectual property and ensuring that consumers are not misled as to the source and sponsorship of goods and services sold and/or distributed under the RED HAT marks. We trust this matter can be resolved promptly and amicably and appreciate your attention to this matter.
We look forward to your reply and request a response no later than {WITHHELD}.
Sincerely,
Meredith K. Robertson
Legal Specialist
Red Hat, Inc.
Folks, RedHat has officially moved into the "Big Corporate Entity Seeking Profit At Any Expense" category. So much for the Open-Source-Can-Really-Make-Money-Too-We-Swear poster child, if you ask me...
UPDATE: Apparently, people at eWeek and Yahoo! News posted articles referencing this entry, so let me post some responses to the comments sent in.
First, I don't think this issue is about copyright law whatsoever or IP issues; it's a deeper, more fundamental issue than that. We can certainly argue whether "Hibernate" is a trademarked name or a generic name (such as the discussion over "Kleenex" or the act of copying a paper known as "Xeroxing" it), but that's not the interesting point here either--the point is that RedHat somehow feels that the use of the term "Hibernate" in Bill Dudney's training curriculum is going to imply that Bill has received special blessing from RedHat to do so. Does that mean, then, that I need special blessing from Sun in order to offer "Java" training, or special blessing from Microsoft to offer ".NET" training? If that's the case, then there are a lot of training companies who'd better pull their training courses off the shelf and rethink offering training at all, because there are some serious copyright violations going on out there.
Besides, I thought OSS was a reaction against copyright law.
There's the deeper issue, too, of RedHat's heavy-handedness in this: why is it that companies continually feel that the best way to start these discussions is with cease-and-desist letters? It's pathetic when a corporation like Sun does this (as I went through with my small riff with them over "javageeks.com"), but even more so when an open-source company--who for years has proudly proclaimed their allegiance to "the community" and paraded it around as a compelling reason over commercial "evil corporation" solutions like Solaris or Windows or HP-UX--takes the same path.
I like the OSS stack, and when I write something that's worth putting into play, I will do so. (Arguably, I've already done so--the Java attributes facility I wrote years ago before JSR 175 and JDK 5 shipped was finished by Mark Pollack and used in several OSS projects, but I call that more Mark's work than my own.) But it's time that we start making the critical realization that an industry cannot rest on the backs of volunteer work. And I, for one, do not want this industry to surrender its commercial aspects; I cannot pay for my house with "community spirit", and frankly, I don't want to give up doing what I love (writing software, and teaching others how to do the same) just because of an idea proposed by a guy who now makes his living from delivering keynotes and ranting about the evils of closed-source. I submit that Stallman would sing a different tune were he in fact still a working programmer with a mortgage and a family to feed.
If RedHat continues with this, they will simply demonstrate that they are, in fact, no better than any of the other "evil corporations", that they are in fact first and foremost concerned with turning a profit. And maybe that's not a bad thing in the long run. I'm certain the employees at RedHat are no more evil than anybody who works at Microsoft or Sun or Oracle. I'm certain RedHat is just as concerned with their image and their standing in the community as those other companies. I'm also certain that, at the end of the day, the people who work at RedHat want to make money doing what they love, just as I and thousands--if not millions--of other programmers do. Why do we think it's wrong for them to do so?
RedHat, you are under no obligation to retract your C-and-D letters. You are perfectly justified in defending your copyright and trademark. But it definitely puts a crimp on the socialistic tendencies that come out of the mouths of the most virulent OSS evangelist for you to do so, and almost puts the whole open-source argument into a strange discussion where now we're just arguing over the quality of the code and the costs... which is maybe where the argument should have been from the beginning, not over "free as in speech" or "free as in beer".
 Friday, February 23, 2007
Avoiding Ruby/Rails Grief
Scott Hanselman (the Zen master himself) posted an interesting piece about coming through the five stages of programming language grief, while wrestling with a .NET project written in Boo (a .NET language based on Python). That Scott should fall prey to the temptation of "doing things in the old way" (meaning he tried to port the project to C# because C# is, of course, the GOPL: God's Original Programming Language) is a touch surprising, because I tend to think more highly of Scott than that, but I have to admit having fallen into the same trap myself, so of course his sins are forgivable. Amazing, isn't it, how we can forgive or excuse people's actions when we find ourselves doing the same thing?
Anyway, Scott's post highlights the importance of understanding the "Zen" of a particular programming language--its idioms, its approaches, and its strengths/weaknesses/quirks--when you move into it. For most of us, it's always easier to move into new territory with an experienced guide to show the way.
For Java developers, a guide--or, rather, a pair of them--has just become available. Stu Halloway and Justin Gehtland (both ex-DevelopMentor instructors and, I'm privileged to say, friends of mine) have published "Rails for Java Programmers", a Java-centric guide to using Rails and Ruby, and a book I highly recommend, on the grounds that it helps "make Rails make sense" for developers used to the traditional Model/View/Controller approach in Java web apps. Weighing in at a pretty reasonable 300+ pages, it's probably one of the most gentle introductions to Rails that I've yet seen, and it's minus all the distractions of Why the Lucky Stiff's intro to Ruby. Have a look--there's a sample chapter on InfoQ.
Friday, February 23, 2007 5:21:30 PM (Pacific Standard Time, UTC-08:00)
 Tuesday, January 30, 2007
Important/Not-so-important
Frank Kelly posted some good ideas on his entry, "Java: Are we worrying about the wrong things?", but more interestingly, he suggested (implicitly) a new format for weighing in on trends and such, his "Important/Not-so-important" style. For example,
NOT SO IMPORTANT: Web 2.0
IMPORTANT: Giving users a good, solid user experience.
Web 2.0 doesn't make sites better by itself - it provides powerful technologies but it's no silver bullet. There are so many terrible web sites out there with issues such as
- Too much content / too cluttered http://jdj.sys-con.com/
- Too heavy for the many folks still on dial-up
- Inconsistent labeling
- etc.
(See Jakob Nielsen's site for some great articles) Sometimes you have to wonder if some web site designers actually care about their intended audience?
I love this format--it helps cut through the B/S and get to the point. Frank, I freely admit that I'm going to steal this idea from you, so I hope you're watching Trackbacks or blog links or whatever. :)
 Friday, January 26, 2007
More on Ethics
While traveling not too long ago, I saw a great piece on ethics, and wished I'd kept the silly magazine (I couldn't remember which one) because it was just a really good summation of how to live the ethical life. While wandering around the Web with Google tonight, I found it (scroll down a bit, to after the bits on Prohibition and Laughable Laws); in summary, the author advocates a life around five basic points:
- Do no harm
- Make things better
- Respect others
- Be fair
- Be loving
Seems pretty simple, no? The problems occur, of course, in the interpretation and execution. For example, how exactly do we define "better", when we seek to make things better? Had I the power, I would create a world where all people are free to practice whatever religious beliefs they hold, but clearly if those religious beliefs involve human sacrifice, then it's doubtful that my actions made the world "better". (Of course, said practitioners would probably disagree.)
It's also pretty hard to actually follow through on these on a daily basis. The author, Bruce Weinstein, makes this pretty clear in this example:
For example, how often do we really keep “do no harm” in mind during our daily interactions with people? If a clerk at the grocery store is nasty to us, don’t we return the nastiness and tell ourselves, “Serves them right?” We may, but if we do, we harm the other person. In so doing, we harm our own soul—and this is one of the reasons why we shouldn’t return nastiness with more of the same.
Ouch. Guilty as charged.
There's a quiz attached to the article, and I highly suggest anyone who cares about their own ethical behavior take it; some of the questions are pretty clear-cut (at least to me), but some of them fall into that category of "Well, I know what I *should* say I would do, but...", and some of them are just downright surprising.
Personally, I think these five points are points that every developer should also advocate and live their life by, since, quite honestly, I think we as an industry do a pretty poor job on all five points. Clearly we violate #1 when we're not careful with security measures in the code; too many programmers (and projects) fail to realize that "better" in #2 is from the customers' perspective, not our own; too many programmers look down on anyone who's not technical in some way, or even those who disagree with them, thus violating #3; too many consultants I've met (thankfully none I can call "friends") will take any excuse to overbill a client (#4); and so on, and so on, and so on.
Maybe I'm getting negative in my old age, but it just seems to me that there's too much shouting and posturing going on (*cough* Fleury *cough*) and not enough focus on the people to whom we are ultimately beholden: our customers. Do what's right for them, even if it's not the easy thing to do, even when they don't think they need it (such as the incapacitated friend in the quiz), and you can never go wrong.
Programming Promises (or, the Professional Programmer's Hippocratic Oath)
Michael.NET, apparently inspired by my "Check Your Politics At The Door" post, and equally peeved at another post on blogs.msdn.com, hit a note of pure inspiration when he created his list of "Programming Promises", which I repeat below:
- I promise to get the job done.
- I promise to use whatever tools I need to, regardless of politics.
- I promise to listen to the Closed Source and Open Source zealots equally, and then dismiss them.
- I promise to support, as long as I am able, any closed source applications I may release.
- I promise to release open source any applications I can not, or will not, support.
- I promise to learn as many languages and libraries as possible, regardless of politics.
- I promise to engage with as many other programmers as possible, both in person and online, in order to learn from them; regardless of politics.
- I promise to not bash Microsoft nor GNU, nor others like them, everyone has a place in our industry.
- I promise to use both Windows and Linux, both have their uses.
- I promise to ask questions when I don't know the answer, and answer questions when I do.
- I promise to learn from my mistakes, and to try to the first time.
- I promise to listen to any idea, however crazy it may sound.
In many ways, this strikes me as fundamentally similar to the Hippocratic Oath that all doctors must take as part of their acceptance into the ranks of the medical profession. For most, this isn't just a bunch of words they recite as entry criteria, this is something they firmly believe and adhere to, almost religiously. It seems to me that our discipline could use something similar. Thus, do I swear by, and encourage others to similarly adopt, the Oath of the Conscientious Programmer:
I swear to fulfill, to the best of my ability and judgment, this covenant:
I will respect the hard-won scientific gains of those programmers and researchers in whose steps I walk, and gladly share such knowledge as is mine with those who are to follow. That includes respect for both those who prefer to keep their work to themselves, as well as those who seek improvement through the open community.
I will apply, for the benefit of the customer, all measures [that] are required, avoiding those twin traps of gold-plating and computing nihilism.
I will remember that there is humanity to programming as well as science, and that warmth, sympathy, and understanding will far outweigh the programmer's editor or the vendor's tool.
I will not be ashamed to say "I know not," nor will I fail to call in my colleagues when the skills of another are needed for a system's development, nor will I hold in lower estimation those colleagues who ask of my opinions or skills.
I will respect the privacy of my customers, for their problems are not disclosed to me that the world may know. Most especially must I tread with care in matters of life and death, or of customers' perceptions of the same. If it is given me to save a project or a company, all thanks. But it may also be within my power to kill a project, for the company's greater good; this awesome responsibility must be faced with great humbleness and awareness of my own frailty. Above all, I must not play at God, and remain open to others' ideas or opinions.
I will remember that I do not create a report, or a data entry screen, but tools for human beings, whose problems may affect the person's family and economic stability. My responsibility includes these related problems, if I am to care adequately for those who are technologically impaired.
I will actively seek to avoid problems that are time-locked, for I know that software written today will still be running long after I was told it would be replaced.
I will remember that I remain a member of society, both our own and of the one surrounding all of us, with special obligations to all my fellow human beings, those sound of mind and body as well as the clueless.
If I do not violate this oath, may I enjoy life and art, respected while I live and remembered with affection thereafter. May I always act so as to preserve the finest traditions of my calling and may I long experience the joy of the thanks and praise from those who seek my help.
I, Ted Neward, so solemnly swear.
Two more interviews...
Two more of the interviews I did at JavaPolis 2006 in Belgium are now online... first, Eric Evans (of "Domain-Driven Design" fame), talking about, quite naturally, domain-driven design, and the second, the pair that brought Ruby to the JVM, Charles Nutter and Thomas Enebo. (Charles was just recently added to the No Fluff Just Stuff tour, so I'm looking forward to hanging out with him and playing more with JRuby.)
 Sunday, January 21, 2007
Interop Briefs: Out-of-proc interop using Intrinsyc's J-Integra
(This piece originally appeared on TheServerSide under the title "Interop Across the Wire" on 16 November 2006. I've fixed the--again--horrendous formatting problems and touched it up slightly. Changes are in italics.)
Welcome to the next installment of "As the Interop World Turns". In this particular bit, we're examining interop across the wire, but before we do, let's acknowledge the major news in the interoperability arena: the announcement of the formation of the Interoperability Alliance, bringing together Microsoft, BEA, Sun, and another dozen or so vendors, all focused on making it easier to play nicely between the platforms. Practically speaking, however, at this point the Interop Alliance hasn't significantly changed the interop landscape, so while it's important to note that they exist, there's nothing more to report. Whether this will turn into Something Big, or just another meaningless consortium of vendors, remains to be seen--for now, it remains a "potential" industry-affecting move.
On to more practical matters. In recent years, most of the focus on interoperability between Java and .NET has been directly on the WS-* stack, AKA "Web Services". For almost a decade now, the various vendors involved in the various WS-* standardization efforts (and even those who don't participate directly but graft on to the edges somehow) have promised that as soon as the standards are here, and the implementations all implement the standards, seamless and ubiquitous interoperability across all platforms will be ours. We're waiting...
In the meantime, however, it turns out--according to those incredibly insightful people at Gartner and other "analysis agencies"--that most of the time, the only two platforms that principally draw interop interest are the JVM and the CLR. Hardly a surprise, for those of us who actually work for a living.
And, as it turns out, if you're looking to limit your interoperability to those two platforms, numerous toolkits abound already to make this happen. While open-source toolkits also exist, in general they aren't quite "up to speed" against the commercial toolkits, so in this entry we'll focus on those, mainly the tools offered by Intrinsyc. Other commercial tools include JNBridgePro and Borland's Janeva. (In the time since this article's publication--which is a pretty short window, making me wonder if this wasn't the case before publication--attempting to download a trial of Janeva results in an error on Borland's site. More interestingly, Borland's latest release of VisiBroker claims .NET support, so it's possible that Janeva is being discontinued in favor of slipping .NET support into VisiBroker.)
Each effectively provides a binary RPC-based interop approach, in which you follow a development process that's (deliberately) similar to what's done when working with the native ORPC stack (CORBA or RMI for Java, .NET Remoting for .NET). In several cases, the toolkits use the wire syntax and format of one of the two platforms (IIOP, JRMP or the .NET Remoting format), meaning that for one of the two platforms, the experience is seamless. (Which platform gets to be the seamless experience is up to you, of course, but practical considerations--and a desire to continue to do business with your clients--generally dictate that your clients have the better experience. Choose wisely.)
In the case of Janeva (or any other CORBA tool, for that matter), the definitions are done in CORBA IDL, a language strikingly and deliberately similar to Java/C++/C# interface declarations. Developers familiar with CORBA will know what to do with these definitions on a .NET platform: simply run the ORB's code-gen tool over the IDL file, which will generate stubs (client-side proxies) or skeletons (server-side proxies) as necessary.
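To make the stub/skeleton split concrete, here is a minimal hand-written sketch in Java. It is not what any ORB's code generator actually emits: the `Greeter` interface, the `GreeterStub` and `GreeterSkeleton` classes, and the pipe-delimited "wire format" are all hypothetical names invented for illustration, and the "network" is just an in-memory function call.

```java
import java.util.function.Function;

public class StubSkeletonSketch {
    // The shared contract, playing the role of an IDL interface (hypothetical).
    public interface Greeter {
        String greet(String name);
    }

    // The real server-side object that does the work.
    static class GreeterImpl implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Skeleton: unmarshals a request, calls the real object, returns the reply.
    static class GreeterSkeleton {
        private final Greeter target;

        GreeterSkeleton(Greeter target) {
            this.target = target;
        }

        String handle(String request) {
            // Invented wire format: "<method>|<argument>"
            String[] parts = request.split("\\|", 2);
            if (parts.length == 2 && parts[0].equals("greet")) {
                return target.greet(parts[1]);
            }
            throw new IllegalArgumentException("Unknown request: " + request);
        }
    }

    // Stub: implements the same interface, but marshals each call onto the "wire".
    static class GreeterStub implements Greeter {
        private final Function<String, String> wire;

        GreeterStub(Function<String, String> wire) {
            this.wire = wire;
        }

        @Override
        public String greet(String name) {
            return wire.apply("greet|" + name);
        }
    }

    // Wires a stub directly to a skeleton; a real system would put a socket in between.
    public static String demo() {
        GreeterSkeleton skeleton = new GreeterSkeleton(new GreeterImpl());
        Greeter client = new GreeterStub(skeleton::handle);
        return client.greet("Fred");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The client only ever sees the `Greeter` interface, which is why the calling code looks like an ordinary local call; everything platform-specific is hidden in the generated halves.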
For existing CORBA systems, this is likely to be the easiest thing for a .NET client to do to hook in, but remember that CORBA IDL is an entire language and type system in and of itself, and CORBA itself represents a fairly sizable stack to get used to - easily dwarfing what's in the .NET Remoting stack in both size and complexity. For simpler scenarios, it's generally easier to use something a little less intimidating (and, correspondingly, less powerful), such as the JaNET or JNBridge tools. Each is equally useful in my opinion, so I'm picking one at random here to use as a demo.
JNBridge lost the toss (seriously!), so I'm going to use the J-Integra tool for this demo. This is actually taken from one of the demos shipping with their product, so if you feel like following along, grab the eval demo off their website, install, and look for the HelloWorld demo in the examples directory.
J-Integra takes a ".NET-friendly" perspective, meaning that the development experience is a bit easier on the .NET developer than the Java developer. (JNBridgePro takes the opposite tack, for what that's worth.) Thus, for the C# developer, developing an interoperable scenario is as simple as writing a typical .NET Remoting component--build a class that extends System.MarshalByRefObject:
    // Copyright 2001-2003 Intrinsyc Software Inc.
    // All rights reserved.
    using System;

    namespace HelloWorld
    {
        public class HelloWorldClass : System.MarshalByRefObject
        {
            private String name;

            public HelloWorldClass(String name)
            {
                this.name = name;
            }

            public String getMessage()
            {
                return "Hello World, from " + name;
            }
        }
    }
From a .NET Remoting perspective, there's absolutely nothing interesting about this class, which is exactly the point--any existing .NET Remoting servers can be flipped to be interoperable by doing exactly nothing.
(In this particular demo, Intrinsyc has the HelloWorldClass instances being hosted by ASP.NET, but obviously we could just as easily self-host it if desired--see Ingo Rammer's "Advanced .NET Remoting" from APress for details if you're not "up" on your .NET Remoting.)
To get Java to call this guy, we need to run J-Integra's "GenJava" tool to create Java client proxies and compile them. Once those proxies are generated and compiled (and unfortunately I don't see any custom Ant tasks to do this, so you'll likely have to write an "exec" task to do it), drop them into your client .jar file, and call the proxies by name:
    // Copyright 2001-2003 Intrinsyc Software Inc.
    // All rights reserved.
    import HelloWorld.*;

    public class HelloWorldMain {
        public static void main(String[] args) throws Exception {
            HelloWorldClass helloWorldClass = new HelloWorldClass("Fred");
            System.out.println(helloWorldClass.getMessage());
        }
    }
Again, nothing special, which is the point--the "magic" takes place inside the generated proxy, which (based on the settings in the GenJava tool) knows how to call over HTTP to the ASP.NET server hosting the HelloWorld instance, execute the call, and send back the returned String to the client. (Before the JNBridgePro folks get peeved at me, let me quickly point out that the development experience there is going to be much the same: point their code-gen tool at the Java RMI server objects you want .NET to communicate to, and use those proxies as-is from C#.)
While tempting, there are some caveats to this approach. First, be careful when considering binary RPC-based approaches to interop, because the interface-based code-gen approach carries with it a nasty side effect: once published, a given interop endpoint can never be modified again without requiring all of its clients to also change with it.
While this isn't a major consideration during development of the project initially, it can be devastating when attempting to refactor code later, after the system has been initially released. This kind of tight coupling works against many agile projects, so choose your interfaces (whether Java or .NET based) with care. (And before the comments start flying, let's be very clear about this: the tight coupling descends from the proxy-based code-generation approach, and not anything to do with the tools themselves. WSDL-based code-generated proxies fall into the same trap.)
Secondly, using either of these tools assumes that you will never need to branch beyond the Java and .NET platforms; should you have to incorporate Ruby or "legacy" C++ into the mix, for example, you're out of luck. This is where the "open-ended" interoperability of the WS-* stack (or its conceptual predecessor, CORBA) holds its own, and if there's any reason to suspect that you'll need to reach beyond the JVM and CLR, you should consider an IIOP-based or WS-*-based solution. (Be careful, however, since as of this writing I'm not aware of any Ruby-CORBA packages, so even CORBA could be a dead end if you need to plug Ruby into the mix.)
Thirdly, remember that out of the box, these tools generally focus on cross-process communication, and so that means that each method call across the boundary is not only a platform shift, but also a network traversal. Loosely translated, that means "hideously expensive" in performance terms. Even those toolkits that offer support across shared memory channels still go through the marshaling/unmarshaling process, so it's still not as cheap as an in-proc method call. As with most interoperability scenarios, try to minimize the amount of back-and-forth between the two platforms.
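The advice to minimize back-and-forth can be illustrated with a small Java sketch. This does not use J-Integra or JNBridgePro; it stands in for a generated proxy with a plain `java.lang.reflect.Proxy` that counts each call as one simulated round trip. The `CustomerService` interface and its methods are hypothetical, invented only to contrast a chatty contract with a coarse-grained one.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundTripSketch {
    // Hypothetical remote contract with both chatty and coarse-grained operations.
    public interface CustomerService {
        String getFirstName(int id);
        String getLastName(int id);
        String getEmail(int id);
        // Coarse-grained alternative: all three values in a single call.
        String[] getSummary(int id);
    }

    // Stand-in for the object living on the other platform.
    static class LocalCustomerService implements CustomerService {
        @Override public String getFirstName(int id) { return "Fred"; }
        @Override public String getLastName(int id) { return "Flintstone"; }
        @Override public String getEmail(int id) { return "fred@example.com"; }
        @Override public String[] getSummary(int id) {
            return new String[] { getFirstName(id), getLastName(id), getEmail(id) };
        }
    }

    // Wraps the target in a dynamic proxy; every call through it counts as one round trip.
    static CustomerService remote(CustomerService target, AtomicInteger roundTrips) {
        InvocationHandler handler = (proxy, method, args) -> {
            roundTrips.incrementAndGet(); // a real proxy would marshal and hit the network here
            return method.invoke(target, args);
        };
        return (CustomerService) Proxy.newProxyInstance(
                CustomerService.class.getClassLoader(),
                new Class<?>[] { CustomerService.class },
                handler);
    }

    // Returns { round trips for the chatty style, round trips for the coarse-grained style }.
    public static int[] demo() {
        AtomicInteger chattyCount = new AtomicInteger();
        CustomerService chatty = remote(new LocalCustomerService(), chattyCount);
        chatty.getFirstName(1);
        chatty.getLastName(1);
        chatty.getEmail(1);

        AtomicInteger coarseCount = new AtomicInteger();
        CustomerService coarse = remote(new LocalCustomerService(), coarseCount);
        coarse.getSummary(1);

        return new int[] { chattyCount.get(), coarseCount.get() };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("chatty=" + counts[0] + " coarse=" + counts[1]);
    }
}
```

Fetching the same data costs three boundary crossings in the chatty style and one in the coarse-grained style; with a real cross-process bridge, each crossing would also pay for marshaling and a network hop.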
(That having been said, however, the JNBridge blog shows how to embed Swing components inside of a WinForms form, which represents a powerful idea and one that shouldn't be discarded out-of-hand. Just be sure to perf-test.)
The tight-coupling concern is a biggie, however, so in future installments we'll look at ways to avoid it by using messaging tactics, instead of RPC-based ones. Until then, remember, Java and .NET are like your kids: you love them both... "the same".
(Shortly after publication, Wayne Citrin of JNBridgePro posted a comment that I felt merited reprinting here. Unedited, it appears below.)
Ted – Thanks for writing about JNBridgePro. I do have a couple of comments on the interop post.
- You mention that J-Integra is ".NET-friendly" and we (JNBridgePro) are the opposite. (Presumably, "Java-friendly".) I don't think that's the case, since (1) our proxy tool is written in .NET, and J-Integra's is written in Java, and (2) J-Integra exposes the low-level details of .NET Remoting to the user, while we hide those details (which is friendly no matter where your development experience lies). In any case, we try to make the product user-friendly whether your background is in Java or in .NET.
- While both JNBridgePro and J-Integra both use .NET Remoting, there are more differences between the two products than might be evident from your post. Enumerating them is beyond the scope of this comment, but one that jumps out from your J-Integra example is that for Java-calling-.NET scenarios, J-Integra requires that the accessed objects be .NET-Remotable components: either MarshalByRefObject (if they're being accessed by reference) or ISerializable (if they're being accessed by value). JNBridgePro allows the Java code to call _any_ .NET object (MarshalByRefObject, ISerializable, or anything else).
- Similarly, for .NET-to-Java directions, with JNBridgePro the .NET code can access _any_ Java object or class; it doesn't need to be RMI-remotable.
- The "in-process" vs. "RPC-interop"/"across-the-wire" dichotomy doesn't entirely hold up. (You do touch upon this at the end of your post, but I wanted to elaborate.) While JNBridgePro does support "across-the-wire" (what we call "socket-based") interop, as you mention we also offer a shared-memory communications mechanism that allows the CLR and the JVM to run in process. This mechanism is very popular with our users, and is much faster than the socket-based approach. (You're correct that it still needs to go through the marshalling/unmarshalling mechanism, so there's overhead, but you avoid the overhead of traversing the socket stack and doing process switching.) Unlike IKVM, our shared-memory approach is still a bridging solution, and still uses .NET Remoting, which is evidence of the power and flexibility of the .NET Remoting mechanism.
- Thanks for pointing out the GUI-embedding blog entry. The support code mentioned in that blog entry has now been integrated into our new version 3.1, which makes this type of embedding a lot simpler. We've updated the blog entry to reflect that.
Wayne
(Readers are, of course, encouraged to download both and form their own opinions.)
Sunday, January 21, 2007 5:43:39 PM (Pacific Standard Time, UTC-08:00)
|
|
|
Interop Briefs: In-proc interop with IKVM
|
|
(This originally appeared on 8 November 2006 as an entry on TheServerSide's blog. The title given there, "A look at out-of-proc or RPC interop", was erroneous and completely nonsensical, since this entry had nothing at all to do with out-of-proc or RPC. I've since corrected the title, and fixed the horrendous formatting problems that appeared there, as well.)
For years, the concept of “Java-.NET interoperability” has been wrapped up in discussions of Web services and the like, but in truth there are a bunch of different ways to make Java and .NET code work together. One such approach is to host the JVM and the CLR inside the same process, using a variety of tools, such as the open-source project IKVM (a part of the Mono project).
IKVM isn’t a “bridge” tool, like other interop technologies—instead, IKVM takes a different path entirely, doing bytecode translation, transforming Java bytecode into CIL instructions, and feeding them through the traditional CLR as such.
This means that Java classes basically become .NET assemblies and are executed using the CLR’s execution engine. The JVM itself, technically, is never loaded; instead, the CLR essentially becomes a JVM, capable of executing Java classes. This also means, then, that the various features that accompany the JVM, such as Hotspot execution of Java bytecode, the JVM garbage collectors, and the various JMX-related monitoring tools that are part of Java5 and later, will not be present, either.
IKVM comes in two basic flavors: a runtime component that’s used to load and execute Java classes from .class binaries, and a precompiler/translator tool, ikvmc, that can be used to translate (or cross-compile, if you will) Java binaries into .NET assemblies. While the second option generally yields faster execution, the first is the more flexible of the two options, as it doesn’t require any preparation on the part of the Java code itself.
Using IKVM to load arbitrary Java code and execute it via Java Reflection turns out to be fairly easy to do; so easy, in fact, that you can use it from Visual Basic code. After adding the IKVM assembly to a VB.NET project, write:
Imports IKVM.Runtime
Imports java.lang
Imports java.lang.reflect
Imports jlClass = java.lang.Class
Imports jlrMethod = java.lang.reflect.Method
The first line just brings the IKVM.Runtime namespace into use, necessary to make use of the “Startup” class without having to fully-qualify it. The next two lines bring in parts of the Java runtime library that ship with IKVM (the GNU Classpath project, precompiled to CIL using ikvmc and tweaked as necessary to fit the CLR’s internals). Similarly, the last two lines create an “alias”, such that the types “jlClass” and “jlrMethod” are now synonyms for “java.lang.Class” and “java.lang.reflect.Method”, respectively. We want this because otherwise we’ll run into name clashes with the CLR Reflection APIs, and because it helps cut confusion about which Reflection we’re working with.
Module Module1
    Sub Main()
        Dim properties As Hashtable = New Hashtable
        properties("java.class.path") = "."
        Startup.SetProperties(properties)
Next, we create a Hashtable object to hold a set of name-value pairs that will be passed to IKVM in the same manner that we pass “-D” properties to the Java Virtual Machine on the command-line. In this particular case, I’m (redundantly) setting the CLASSPATH to be the current directory, causing the JVM to look for code there along with the usual places (rt.jar and the Extensions directory inside the JRE). “Startup” is a static class, meaning there’s no instance thereof.
Startup.EnterMainThread()
To quote the vernacular, we’re off and running. With the call to “EnterMainThread”, IKVM is ready to start taking on Java code. Our next task is to find the code we want to execute via the standard Java ClassLoader mechanism, find the “main” method exposed thereon, create the String array of parameters we want to pass, and call it, all via traditional Java Reflection APIs, but called through IKVM instead of through Java code itself.
Dim sysClassLoader = ClassLoader.getSystemClassLoader
Dim cl1 As jlClass = jlClass.forName("App", True, sysClassLoader)
Dim paramTypes As jlClass() = { _
    jlClass.forName("[Ljava.lang.String;", True, sysClassLoader) _
}
' java.lang.Class has an implicit conversion operator to/from Type
'Dim paramTypes As jlClass() = { _
'    GetType(String()) _
'}
Dim main As jlrMethod = cl1.getDeclaredMethod("main", paramTypes)
In the lookup for the “main” method, notice how there are two different ways to specify the method parameters: one, using the JVM syntax to specify an array of Strings (“[Ljava.lang.String;” as given in the Java Virtual Machine Specification), and the other using IKVM’s ability to translate types from .NET to Java, which allows us to specify it as a “String()” in VB (or “String[]” in C#).
Dim parms As Object() = { _
    New String() {"From", "IKVM"} _
}
Dim result = main.invoke(Nothing, parms)
We create the array of Strings to pass, then call invoke(), passing “Nothing” (the VB synonym for C#'s null) for the object instance, as per the usual Java Reflection rules. At this point, the “App.main()” method is invoked, and when it returns, the Java code has completed execution. All that is left is to harvest the results and display them, and shut IKVM down appropriately.
If result IsNot Nothing Then
    Console.WriteLine(result)
Else
    Console.WriteLine("No result")
End If
Startup.ExitMainThread()
End Sub
End Module
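For reference, here is a minimal sketch of what the Java side of this walkthrough might look like. The original piece never shows the "App" class, so the body below (including the greet helper) is purely an assumption for illustration. The ReflectiveLauncher class is likewise hypothetical; it repeats the same lookup-and-invoke steps in plain Java, which makes it easier to see that "[Ljava.lang.String;" and String[] name the same type.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the "App" class that the VB code loads by name.
public class App {
    // Joins the arguments so there is something observable to print.
    static String greet(String[] args) {
        return "Hello " + String.join(" ", args);
    }

    public static void main(String[] args) {
        System.out.println(greet(args));
    }
}

// The same steps as the VB code, written against plain Java Reflection.
class ReflectiveLauncher {
    static Object launch(String className, String... args) throws Exception {
        // The VB code passes the system class loader explicitly; the
        // one-argument forName uses the caller's loader instead.
        Class<?> target = Class.forName(className);

        // JVM descriptor syntax for "array of java.lang.String".
        Class<?> stringArray = Class.forName("[Ljava.lang.String;");
        Method main = target.getDeclaredMethod("main", stringArray);

        // Static method, so no receiver; the cast keeps the String[] from
        // being spread across invoke's varargs parameter.
        return main.invoke(null, (Object) args);
    }
}
```

Since main is declared void, invoke hands back null, which is why the VB code above would fall into its "No result" branch for a class shaped like this one.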
Using IKVM is not a silver bullet, but it does offer some powerful in-proc interoperability options to the development team looking to leverage both .NET and Java simultaneously, such as calling out to Java EJB servers from within Excel or Word documents, or loading Spring into Outlook in order to evaluate incoming mail messages and process them for local execution.
.NET | Java/J2EE | Windows
Sunday, January 21, 2007 12:38:25 AM (Pacific Standard Time, UTC-08:00)
|
|
 Saturday, January 20, 2007
 Thursday, January 18, 2007
|
Interop Briefs: Begin at the Beginning...
|
|
(Originally appeared as a DevelopMentor article on TheServerSide.com; updates and changes to the piece have been made in accordance with the time difference, roughly two years since its original publication, and some changing beliefs on my part, which I will elucidate further in a future piece.)
In the halcyon days of my youth, a major candy manufacturer ran an advertising campaign. It was destined to be a timeless classic, one that sticks with you for the rest of your life: the scene would be some typical meeting place (like a park or a street), with one actor walking in from one end and another emerging from the other side. One would be carrying a jar of peanut butter, the other a chocolate bar, and of course, they would bump into each other, the chocolate falling into the peanut butter, and thus a new candy bar was "born". The tagline, of course, was "two great tastes taste great together".
So it goes with Java and .NET. Individually, each is a great platform, but together, a practical necessity in the modern enterprise. With enterprise development market share of 35-40% each, and the Microsoft/Sun settlement agreement of earlier this month, it's fairly obvious that neither of these two platforms is "going away" any time soon. Analysts estimate that by 2005, up to 90% of all IT shops will have both platforms running side by side. It's clear to even the most zealous Java or .NET devotee that working with both platforms is going to become the norm.
This means that developers will be facing an interesting task in the coming years: "make our .NET inventory management system talk to our J2EE customer relationship system", or "our sales staff wants to access our CRM system (written in J2EE) from Outlook", and so on.
And while it would be nice to be able to hold up the Web services software stack, point to it and say, "Here's the answer to all your interoperability concerns", the practical and brutal reality of the situation is that it's nowhere near that simple an answer. Making two platforms interact is at once a simple and difficult problem. Simple, in that it's a fairly closed-requirements solution: if I can work out a few technical details, interaction is achieved. It's also fairly easy to achieve success--if they can talk, you did it, if not, there's still work to go. In fact, once you've worked out low-level issues like byte order, file/data format, and hardware compatibility, basic interaction between any two platforms is pretty straightforward. (As a matter of fact, on this basis was the Internet built.) But the problem of integration still presents hardships, owing to the rich complexity of systems that build on these basic low-level concepts. Merely because I can exchange TCP/IP packets between a Windows machine and a Solaris server does not mean that getting .NET code to exchange data with a J2EE server will be easy--far from it. Numerous hazards lie in wait for the budding integration developer. Before we get too deep into the interoperability hazards, let's take a moment to revisit Enterprise Application Architecture in general, so as to get our bearings in the discussion to follow. Historically, we've preferred "n"-tier systems to client/server 2-tier ones, because of the increased scalability intrinsic to "n"-tier systems. An "n"-tier system can scale to much higher user counts, due (among other things) to the shared database connections from a central middle tier to which clients connect. 
We also prefer the "n"-tier approach because it tends to allow for better separation of responsibilities in code: presentation-layer code goes on the client tier, business logic goes on the middle tier, and data-access code (largely represented by SQL) goes on the resource tiers either on or behind the middle tier machines. 3 tiers, 3 layers, but not always mapped one-to-one. Drawing a distinction between the tiers and the layers is necessary for good interoperability between the platforms, because interoperability across layers is going to necessitate very different decisions than interoperability within layers. For example, if you have a Windows Forms .NET application that wants to display a Swing application as a child window, very different decisions are in order than if you want that same Windows Forms application to talk to a J2EE back end. Web services might suffice for the second requirement; it'll be an unmitigated disaster if you use Web services for the first. (See Fowler's "Patterns of Enterprise Application Architecture" or my "Effective Enterprise Java" for more about the layers-vs-tiers discussion.)
Interoperability can take one of three possible shapes: in-proc, out-of-proc, and resource sharing. Of the three, resource sharing is perhaps the easiest to understand and recognize: using JDBC, for example, a Java application can write data to a relational database that the .NET program can access via ADO.NET. The database access layers each deal with the necessary details to make the data comprehensible to the appropriate platform, leaving the programmer free to focus on working with the data itself. Unfortunately, a great deal of time is consumed in doing so, and trying to implement a short round-trip request/response cycle using an RDBMS is going to be difficult. Still, as an interoperability mechanism, it's by far the simplest and easiest approach to take, and as a result it accounts for the vast majority of the interoperability in production.
In essence, we're exchanging data through some well-understood format and easily-accessible medium, in this case the relational database format described by SQL and a central database. We can extend this raw data exchange in other ways, however, by leveraging the world's most popular interoperable data format, XML, as the format for data exchange and a variety of different data stores as the medium through which the exchange takes place. For example, it's relatively easy to take an XML document and store it to the filesystem from a servlet application for a .NET application to come around and pick up via a filesystem watcher thread. Or, thanks to the growing XML-in/XML-out capabilities of recent database releases, we can use the database as the exchange medium. Or anything else that's handy, for that matter; the key here is that the exchange format is XML data.
This presents its own problems, by the way--XML doesn't deal well with certain aspects of the object-oriented nature of Java and/or C#/.NET programs, such as the ability of objects to create cyclical relationships and bidirectional associations. Imagine, for a moment, a Person class like such:
public class Person
{
private Person mother;
private Person father;
private ArrayList siblings;
private Person spouse;
private ArrayList children;
}
While this looks relatively simple to serialize to XML in the simple cases, remember that objects have identity, which means that "Jed's" wife could also be "Jed's" sister (in certain parts of the world). Serializing her twice in the XML document breaks object identity and creates further chaos.
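To make the identity problem concrete, here is a small sketch of one way around it: hand each object an id the first time it is written, and emit a reference on every later visit. Everything in it is an assumption for illustration, including the trimmed-down Person class, the "id"/"ref" attribute names, and Jed's wife being called Ellie. It is loosely in the spirit of the multi-reference idea in SOAP Section 5 encoding, not an implementation of it.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class PersonGraphXml {
    // Trimmed-down, hypothetical version of the Person class above.
    static class Person {
        final String name;
        Person spouse;
        final List<Person> siblings = new ArrayList<>();
        Person(String name) { this.name = name; }
    }

    // Keyed on object identity rather than equals(), which is the point.
    private final Map<Person, Integer> ids = new IdentityHashMap<>();
    private final StringBuilder out = new StringBuilder();

    static String toXml(Person root) {
        PersonGraphXml writer = new PersonGraphXml();
        writer.write("person", root);
        return writer.out.toString();
    }

    private void write(String tag, Person p) {
        Integer seen = ids.get(p);
        if (seen != null) {
            // Already serialized once: point back at it instead of copying.
            out.append("<").append(tag).append(" ref=\"").append(seen).append("\"/>");
            return;
        }
        int id = ids.size() + 1;
        ids.put(p, id);
        // Note: names are not XML-escaped here; a real writer would need to.
        out.append("<").append(tag).append(" id=\"").append(id)
           .append("\" name=\"").append(p.name).append("\">");
        if (p.spouse != null) {
            write("spouse", p.spouse);
        }
        for (Person sibling : p.siblings) {
            write("sibling", sibling);
        }
        out.append("</").append(tag).append(">");
    }
}
```

The catch is that the consumer has to understand what a "ref" means, which is easy for an object platform and meaningless for one with no notion of object references at all.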
SOAP Section 5 of the 1.1 specification sought to correct for this lack, but forgot that not all platforms that can consume XML are object platforms--a good number of them lack any concept of object references whatsoever, in fact. For this reason, SOAP Section 5 encoding (the "encoded" in "rpc/encoded") is deprecated in later Web services specifications, in favor of XML Schema Definitions (XSD) as a descriptive language for XML data (and "document/literal" as the means by which the services operate). When building systems that will need to interoperate against any and all possible platforms, the standing recommendation is to always start from schema definitions first, and build object definitions to match against them. (We'll visit this topic again in a future Interop Briefs piece fairly soon. --Ted) Both .NET and Java have libraries and/or specifications to deal with this: the XSD.exe utility and XmlSerializer in .NET (and now WCF and contracts in NetFX 3.0), and the Java Architecture for XML Binding (JAXB) and Java API for XML Web Services (JAX-WS) in Java.
This may all seem redundant--after all, who hasn't heard of XML as the suggested data exchange format? Take careful note, however, that we're talking about data exchange, not the use of XML as a communications stack, which of course brings us to the discussion of interoperability through Web services.
Web Services, as a technology, were born of the basic desire to "replicate a call stack in XML" (Don Box, private communication). At the time, it seemed natural enough: take the marshaled parameters from a remote call, and represent them using XML rather than a proprietary binary protocol, thus making it (theoretically) possible for other languages to consume that call without having to write a huge pile of parsing code. Initially, SOAP was available from DevelopMentor and Microsoft in Perl, Java, and COM-based formats.
But as time has progressed, so has the sophistication and scope of Web services. SOAP was rewritten to be a framing and extensibility specification, deferring all questions of how to represent data to XSD. Description of Web services endpoints fell to WSDL. Discovery of services was left to UDDI. But within the last two years, things exploded in complexity; by latest count there are well over 30 specifications from a variety of author entities (Microsoft, IBM and BEA being three of the largest sponsors) covering everything from binary attachments (the recently-approved MTOM being the winner) to business process flow. More are coming.
More importantly, when Web services began to sweep the industry as the Next Big Thing, vendor toolkits began to offer extensions that allowed developers to start from the language interface and generate WSDL definitions to make it easy to reuse your existing technology investment by slapping angle brackets around the data traveling across the wire. Unfortunately, any out-of-process remoting API that suggests starting from a language-based interface (whether that be Java or C#) should be taken very skeptically (or, perhaps more accurately, carefully). Consider, for a moment, the following Java interface:
import java.math.BigDecimal;
public interface Calculator
{
public BigDecimal add(BigDecimal lhs, BigDecimal rhs);
}
What, precisely, should an XML-marshaled BigDecimal instance unmarshal to in .NET?
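One way to see why the question is awkward is to look at what actually has to cross the wire. The sketch below is only an illustration, with hypothetical class and method names: it writes a BigDecimal in the plain lexical form that xsd:decimal allows (no exponent), and adds a conservative check against the shape of .NET's System.Decimal, which is a 96-bit integer scaled by a power of ten between 0 and 28. BigDecimal has no such ceiling, so some perfectly legal Java values simply have nowhere to land.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class DecimalOnTheWire {
    // Largest magnitude a 96-bit unscaled integer can hold: 2^96 - 1.
    private static final BigInteger MAX_UNSCALED =
            BigInteger.ONE.shiftLeft(96).subtract(BigInteger.ONE);

    // xsd:decimal has no exponent notation, so avoid toString(),
    // which may produce forms like "1E+3".
    static String toXsdDecimal(BigDecimal value) {
        return value.toPlainString();
    }

    static BigDecimal fromXsdDecimal(String lexical) {
        return new BigDecimal(lexical.trim());
    }

    // Conservative check: does the value, exactly as scaled, fit the
    // 96-bit-integer-plus-scale layout of System.Decimal? It does not try
    // to strip trailing zeros to rescue values that would fit after rescaling.
    static boolean fitsSystemDecimal(BigDecimal value) {
        BigDecimal normalized = value.scale() < 0 ? value.setScale(0) : value;
        return normalized.scale() <= 28
                && normalized.unscaledValue().abs().compareTo(MAX_UNSCALED) <= 0;
    }
}
```

A check like this belongs in the contract discussion, not buried in generated proxy code; the range restriction is something both sides need to agree on up front.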
So often, demos done on the expo show floor clearly prove that the product knows how to talk to the same vendor's product on the other side of the wire, but rarely if ever demonstrate working with another vendor's product. An ASMX Web Methods Web service, for example, can easily declare itself as returning a Hashtable, but once marshaled and sent across the wire, what format should it resemble in the J2EE space? While Java certainly has its own implementation of Hashtable, the two share nothing in implementation details. As a result, it's a fair bet that (barring special code to the contrary) the .NET Hashtable will get rendered into a custom data format that has little to no bearing on the Java Hashtable in widespread use.
For these reasons, just as with data exchange using XSD, when writing WSDL-based services, always start with the "parts in the middle": in this case, the WSDL.
What's worse, few if any of these Web service specifications have a concrete implementation to work with, and fewer still have any sort of "work history" (that is to say, beta-testing and/or field-use) behind them. WS-Routing, for example, exists as an add-on to the Microsoft .NET Framework (Web Services Enhancements 1.0), but as of yet no Java Web services software package has support for it. None of Microsoft, IBM, or BEA has an implementation of WS-AtomicTransaction, and so on. (Note: at the time this was published, that was the case; both Microsoft and BEA have WS-AT implementations now, and WS-Routing has been dropped in favor of WS-Addressing. That said, though, a good number of the remaining specifications have shockingly little field use behind them, it seems.)
More importantly though, even a perfect Web services picture still leaves the story incomplete. Web services intrinsically imply remote process communication--no WS-* specification currently describes a way for two platforms to coexist in the same process space, for example. (In point of fact, there's really very little that Web services could say; how do you mandate an in-proc API for Python running in the same process as the CLR, or the JVM?) Interoperability within a given layer (presentation, business logic or data access) sometimes requires the ability to share data in the same process, yet all of the Web services space is focused on simply slinging angle brackets between endpoints across process boundaries. Making things even more interesting, interoperability at the resource tier--the database or the filesystem--is often "just enough" interoperability to make a system work without requiring a major investment.
The most intrusive approach to interoperability is the in-process approach, where a single process hosts both the JVM and the CLR simultaneously. At heart, both managed environments consist of a set of DLLs (JVM.dll in the case of Java, mscoree.dll in the case of .NET), making it fairly trivial to create a single process hosting both. One simple way to do this in Java, for example, is to write a Java Native Interface-defined method using Microsoft's Managed C++: write the Java native method in the usual manner, generate the C header using javah, then implement and compile the C++ code using the "/clr" switch. Because the JNI DLL is a managed code assembly, the .NET framework will automatically be loaded into the process, as usual, thus bringing both managed environments into the same process. Hosting Java from the CLR is a bit trickier, since it requires the explicit use of the JNI Invocation API, but again it's not rocket science.
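The Java half of that JNI recipe might look something like the sketch below. The class name, the native method, and the library name "ClrBridge" are all hypothetical, and the Managed C++ implementation compiled with "/clr" is not shown. The class is written so that it still loads and degrades gracefully on a machine where the native DLL is absent, which is handy when the same code has to run on a build box without the .NET side installed.

```java
public class ClrBridge {
    // True only if the (hypothetical) native library was found and loaded.
    private static final boolean LOADED = tryLoad("ClrBridge");

    private static boolean tryLoad(String libraryName) {
        try {
            // On Windows this resolves to ClrBridge.dll on java.library.path.
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    static boolean isAvailable() {
        return LOADED;
    }

    // Declared in Java, implemented in the native DLL. Because that DLL
    // would be a managed assembly, loading it pulls the CLR into the process.
    public static native String echoThroughClr(String message);

    // Guarded entry point so callers never hit UnsatisfiedLinkError.
    static String callOrFallback(String message) {
        if (!LOADED) {
            return "CLR bridge not loaded: " + message;
        }
        return echoThroughClr(message);
    }
}
```

On current JDKs the C header would come from "javac -h" rather than javah, which was removed from the JDK some years after this piece was written.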
There's obviously more to the Web services story than what's been covered here; discussions of particular RPC toolkits and ORBs, messaging frameworks/utilities, SOAP, and more, all even before getting into the Web services discussion. And, what's more, variations and hybrid approaches are quite feasible: a JMS consumer pulling TextMessages out of a Queue that contain a SOAP packet which is delivered via MSMQ by doing JNI methods to Managed C++.... and so on. Just bear in mind when considering your needs and options for your next Java/.NET integration project that multiple options are available to you beyond the traditional "Just set up a WSDL endpoint...."
Thursday, January 18, 2007 12:13:13 AM (Pacific Standard Time, UTC-08:00)
|
|
 Monday, January 15, 2007
|
The Root of All Evil
|
|
At a No Fluff Just Stuff conference not that long ago, Brian Goetz and I were hosting a BOF on "Java Internals" (I think it was), and he tossed off a one-liner that just floored me; I forget the exact phraseology, but it went something like:
Remember that part about premature optimization being the root of all evil? He was referring to programmer career lifecycle, not software development lifecycle.
... and the more I thought about it, the more I think Brian was absolutely right. There are some projects, no matter how mature or immature, that I simply don't want any developer on the team to "optimize", because I know what their optimizations will be like: trying to avoid method calls because "they're expensive", trying to avoid allocating objects because "it's more work for the GC", and completely ignoring network traversals because they just don't realize the cost of going across the wire (or else they think it really can't be all that bad). And then there are those programmers I've met who are "optimizing" from the very get-go, because they work to avoid network round-trips, or write SQL statements that don't need later optimization, simply because they got it right the first time (where "right" means "correct" and "fast").
It made me wish there was a "Developer Skill" setting I could throw on the compiler/IDE, something that would pick up the following keystrokes...
for (int x = 10; x > 0; x--)
... and immediately pop Clippy up (yes, the annoying paperclip from Office) who then says, "It looks like you're doing a decrementing loop count as a premature optimization--would you like me to help you out?" and promptly rewrites the code as...
// QUIT BEING STUPID, STUPID!
for (int x = 0; x < 10; x++)
... because the JVM and CLR actually better understand, and therefore JIT better code, when your code is clear rather than "hand-optimized".
And before any of those thirty-year crusty old curmudgeons start to stand up and shout "See? I told you young whippersnappers to start listening to me, we should have wrote it all in COBOL and we would have liked it!", let me be very quick to point out that years of experience in a developer are a very subjective thing--I've met developers with less than two years' experience that I would qualify as "senior", and I've met developers with more than thirty that I wouldn't trust to code "Hello World".
Which, naturally, then brings up the logical question, "How do I know if I'm ready to start optimizing?" For our answer, we turn to that ancient Master, Yoda:
YODA: Yes, a Jedi's strength flows from the Force. But beware of the dark side. Anger, fear, aggression; the dark side of the Force are they. Easily they flow, quick to join you in a fight. If once you start down the dark path, forever will it dominate your destiny, consume you it will, as it did Obi-Wan's apprentice.
LUKE: Vader... Is the dark side stronger?
YODA: No, no, no. Quicker, easier, more seductive.
LUKE: But how am I to know the good side from the bad?
YODA: You will know... when you are calm, at peace, passive. A Jedi uses the Force for knowledge and defense, never for attack.
What he refers to, of course, is that most ancient of all powers, the Source. When you feel calm, at peace, while you look through the Source, and aren't scrambling through it looking for a quick and easy answer to your performance problem, then you know you are channelling the Light Side of the Source. Remember, a Master uses the Source for knowledge and defense, never for a hack. (Few people realize that Yoda, in addition to being a great Jedi Master, was also a great Master of the Source. Go back and read your Empire Strikes Back if you don't believe me--most of his teaching to Luke applies to programming just as much as it does to righting evils in the galaxy.)
All humor bits aside, the time to learn about performance and JIT compilation is not the eleventh hour; spend some time cruising the Hotspot FAQ and the various performance-tuning books, and most importantly, if you see a result that doesn't jibe with your experience, ask yourself "why".
|
 Wednesday, January 10, 2007
|
The Five Things Meme
|
|
Simon tagged me, so I suppose I have to do this or else be on the bad end of Bad Luck For Seven-and-a-half-Years or something like that. Here we go, five things you may not have known about me before now:
- Je parle francais, un peu. (I'm not sure how to get the French characters on my keyboard or in the blog, so those who speak French will have to pardon the lack of the appropriate accented characters.) Ein bisschen Deutsch, aussi.
- My degree is in International Relations, from the University of California at Davis. I took several Comp Sci classes while there, but stopped when I realized that my self-driven study of programming (thanks to Stroustrup's The C++ Programming Language and Coplien's Advanced C++ Patterns and Idioms) put me actually well ahead of most of the CS undergrad community there. I thought briefly about grad school, but when the Chair of the CS department at UCD told me he'd turn me down due to my B- in ECS 140A: Programming Languages (I had a really hard time trying to get the hang of Lisp), I decided not to bother.
- I'm an avid video-game gamer, dating back to the very early games in the 80's. My most prized accomplishment of that era? Flipping Galaga. (For those who don't know the term, it means gaining a score high enough--in this case, a million points--such that the display "flips" back to zero.) And these were in the days when it was one-quarter-one-game, none of this "play 'til you run out of money" approach first introduced by Gauntlet....
- I didn't grow my hair out until after I'd graduated high school. No, it wasn't a "rebellion thing", it was the plain realization that if I ever wanted it long, college was my last chance to do it, because clearly long hair wasn't acceptable in the big bad working world....
- Speaking of high school, back then everybody thought my first published book would be a Sci-Fi/Fantasy work. I was one of the founding members of the school's Young Author's Club, and had a series of short stories about an assassin for hire--really terribly written, as I look back at them now, modeled after Edward D. Hoch's Nick Velvet mystery stories from Ellery Queen's Mystery Magazine but without any of his style or panache. That said, however, writing has clearly been at the core of my career for some time, as my life has been (quite positively) affected by various technical authors:
- The two technical authors I most wanted to meet (and consciously modeled my writing style after) were Don Box and Jeffrey Richter. I grew up on Advanced Windows NT and Windows 3.1: A Developer's Guide, and I was fascinated by Essential COM and Effective COM.
- The one technical author I never thought I'd ever come close to, much less write a book for and meet in person, was Scott Meyers; Effective C++ and More Effective C++ were amazing, literally life-changing experiences. Had somebody told me, ten years ago, that I would not only have met Scott, but written an Effective book of my own, and be privileged enough to call him friend, I'd have told them they were out-of-their-minds nuts.
- The book that most influenced my technical career had to be Paul DiLascia's Windows++, since his was the first book I'd come across that walked through the nitty-gritty of building a real C++ framework, and that in turn led me down the ultimately futile path of building my own cross-platform GUI framework (which in turn, in its half-baked form, proved to several employers that my C++ skills were for real, despite not having a degree in Computer Science).
- But by far and away, the author who's had the most profound effect on my life was none other than Bjarne Stroustrup, who, when emailed by this fledgling author thinking about writing his first book, offered a cogent, three-page email response filled with advice and wisdom about embarking on the path of the technical author, all of which turned out to be spot-on accurate.
Thanks, to all of you. So, I'm in turn supposed to tag five others, but I'm going to hold off for now, until I get a better idea of who's been tagged and who hasn't. 
Wednesday, January 10, 2007 2:15:31 AM (Pacific Standard Time, UTC-08:00)
|
|
 Saturday, January 06, 2007
|
The First Major Patch/Feature/Change/Whatever to Javac7...
|
|
It's a new brand of property support, submitted by Remi Forax. Have a look, and let the huge language debates begin... Personally, I like what he's done, but then again, I'm a fan of properties-as-first-class-citizens support, a la C#. I'm not so wild about introducing the keyword (I like the C# syntax), but I can understand where the C# syntax is deemed a bit cryptic to Java developers. Besides, Remi's done the Right Thing by not making property (or abstract property) an actual keyword, so we don't have accidental backwards incompatibility issues to worry about. Mind you, I sincerely doubt this is the final form it'll take in Java7, but this is encouraging--people are hacking on the compiler and producing concrete examples of ideas, not just ideas in limbo. Hats off to you, Remi!
Java/J2EE
Saturday, January 06, 2007 4:53:15 AM (Pacific Standard Time, UTC-08:00)
|
|
 Friday, January 05, 2007
|
Interop Briefs: Check your politics at the door
|
|
(Originally appeared on TheServerSide, November 2006; I've made some edits to it since then.)
As we prepare to enter the holiday season here in the US, I think it’s time that we called for Peace on Earth. Or, at least, Peace in Computer Science.
In 2000, when Microsoft first announced the .NET Framework (then called by various alternative names, such as the “Universal RunTime (URT)” or “COM3” or the “Component Object Runtime (COR)”), it was immediately hailed as the formal declaration of war on Sun and Java, if not an actual pre-emptive attack.
Within the industry, a schism already present was made deeper: developers were routinely asked “which side” they were on, whether they were supporters of “open” standards and “community-driven” development, or whether they were trying to support the evil corporate conglomerates. (I’ve since lost track of who’s supposed to be good or evil: Sun because they refused to release Java to an international standards body, IBM because they are trying to subvert Sun’s control over Java, Microsoft because they routinely “embrace and extend” open standards, or Oracle, because… well, just because.) I’m personally regarded as some kind of heretic and looney, not only because I routinely write code for both the Java and .NET platforms, but because I refuse to say, when asked, which one I like “better”.
You know what? I’m damn tired of these arguments. Can’t we all just get along and write software?
It’s not like these arguments really do much for our customers and clients. Truth be told, few of the people who use our software can even tell which platform the silly thing was written in, much less how it being written in Java will somehow make the world a more free (as in speech, as in beer, as in sex, whatever) place. Or that .NET somehow allows for multiple languages—generally speaking, the only language they care about is the one they speak and read and interact in. Most of the time, they’re just happy if they can *use* the software—remember, according to statistics routinely cited at conferences and presentations, half the time our customers never see software they’ve asked for, and when they do, it’s likely to be twice the budget costs originally anticipated, with half the features they originally asked for, in a user interface they don’t quite understand, even though it’s supposed to be “the latest greatest thing”.
This is progress?
Over the last five years, there’s been a quiet revolution under way, and it’s not the dynamic language revolution, nor the REST-HTTP-SOAP revolution, nor the agile revolution, nor AJAX. It’s not about containers or dependency injection or inversion of control or mock objects or unit testing or patterns or services or objects or aspects or meta-object protocols or domain-specific languages or model-driven architecture or any other fancy acronym and accompanying hype and marketing drivel. It’s a revolution of pragmatism, of customers and clients and others turning to developers and saying, “Enough is enough. I want software that works.”
“Works” here is a nebulous term, but before the Marketing goons start spinning the term to their best advantage, let’s clarify: “Works” is a simple term, as defined by our customers, not us. “Works” means runs in a manner that’s genuinely useful to our clients and customers. “Works” means it’s delivered close to on time and preferably under budget. (Nothing will ever make that utopian dream come true completely, so let’s be more realistic about the process—besides, *close* to on time and budget is a pretty good goal to shoot for right now, anyway.) “Works” means software that attaches itself to the existing mess we’ve made over the years, without having to rip out a whole bunch of servers and replace them with a whole bunch more. “Works” means taking what a customer has, in place, that already meets that definition, and tying the new stuff we’re building into that existing mess.
“Works” means, practically speaking, that we take the languages and tools that are available to us, and use them each to their advantage, regardless of political affiliation or perceived moral stance. That means taking Microsoft’s tools and technologies and tying them into Java’s, and vice versa. That means dropping the shrill rhetoric about how each is trying to “leverage” the other out of existence, and figuring out how to use them all together in a meaningful and technologically powerful way. That means recognizing that we are all one community, not little villages out in the countryside trying to beat each other into submission even as we try to scrape a living off the land.
Recently, I've picked up two books that I think typify my approach to programming in 2007, both by Larry Winget: "Shut Up, Stop Whining, & Get a Life", and his more recent follow-up, "It's Called Work For a Reason". In both, he points out that there is no "secret sauce", no "secret recipe" to success, and that for most of us, we already know what the Right Thing To Do is... we just don't want to accept it or admit it. I think that in a lot of ways, the debates over which platform to use and whose language is better are ways that we technologists avoid the much harder problem of dealing with customers. I think it's high time that we face that in the mirror, stop talking so much, and start listening more.
Abraham Lincoln, the man who had the unfortunate luck to preside over the United States during its most divisive era, once said, “A house divided against itself cannot stand.” Neither will ours, I fear, if we keep this up. Please check your politics at the door. Here, we care only about how tools can be used to solve problems.
|
|
A Time for a Change
|
|
I've had The Blog Ride up for almost two years now, and it seems the latest fad is to change your blog title to match whatever your particular focus is at the moment. Given my tech predictions for 2007, and how I believe that interoperability is going to become a Big Deal (well, I guess in one sense it was already, but now I think it's going to become a Bigger Deal), and that hey, this is my schtick anyway, I've decided to rename the blog from "The Blog Ride" (which was kinda a lame name to begin with) to ...
Truth be told, I thought about squatting on Jason Whittington's old blog title ("Managed Space"), given that a lot of my focus these days centers on managed environments (Java and .NET, principally), but I didn't like that idea because (a) it was his idea first, and I don't like "me-too" kinds of faked creativity, and (b) I do a lot more than just managed code, so...
Welcome to "Interoperability Happens".
One of the things I've set as a resolution for the new year is to post some concrete interoperability tips (very similar to the ones I'd been posting to TechTarget's "tssblog" site) ranging across all sorts of interop topics, from XML services, to using the proprietary communication toolkits, to using IKVM, to some concrete examples (authenticating from Java against a Microsoft Active Directory or ADAM service, hosting Workflow inside of Spring, or writing Office Action Panes that talk to Java back-end servers, and so on) of interoperability "in the field". I won't promise that I'll have a new one up every other week or so, but that's the goal. And the interop hopefully won't be limited to just Java and .NET; I plan to start exploring the Java/Ruby and .NET/Ruby interop space, as well as other pairings (Python, Tcl, maybe a few other languages or environments, like perhaps Parrot) that appeal to me. (That said, I've got a list of about 20 or 30 or so topics on just Java/.NET, so any delays or significant pauses aren't for lack of material or ideas.)
And if there's any particular interoperability topic or question you've got, you know how to reach me. Catch ya around in 2007.
|
 Thursday, January 04, 2007
|
Warning: XSS attack in PDF URLs
|
|
Just heard this through the OWASP mailing list, and it's a dandy:
I wanted to give everyone all a heads-up on a very serious new application security vulnerability that probably affects you. Basically, any application that serves PDF files is likely to be vulnerable to XSS attacks.
Attackers simply have to add an anchor containing a script, e.g. add #blah=javascript:alert(document.cookie); to ANY URL that ends in .pdf (or streams a PDF). The browser hands off the anchor to the Adobe reader plugin, and the script then runs in the victim’s browser.
You can find more information here: http://www.gnucitizen.org/blog/universal-pdf-xss-after-party/
You can protect yourself by upgrading your browser and Adobe Reader. There are many vulnerable browser/plugin combinations in use, including Firefox. However, IE7 and IE6 SP2 do not appear vulnerable.
Protecting the users of your application from attack is more difficult. This problem is entirely in the browser and the Adobe reader. The anchor is not even passed from the browser to the web application, so there’s really not much you can do in your code to detect an attack. You could stop serving PDF documents or move them to a different server, but that’s not realistic for many organizations.
Jeff Williams, Chair, The OWASP Foundation
Now, a couple of thoughts come to mind:
- First and foremost, if your application serves PDFs, make sure your clients know to upgrade to the latest Acrobat version, since that seems (based on how I read the above) to be protected against the XSS attack; if it's not, though, Adobe will fix it soon (I would hope, anyway), and thus you'll be back to making sure your clients know to upgrade to the latest Acrobat version.
- Secondly, this is technology-agnostic, so regardless of your platform (Java, .NET or Rails), you're vulnerable. (Such is always the case with XSS attacks.)
- How many developers will actually take steps to try and prevent it (such as, for example, ensuring that PDF URLs received aren't trailing any fragments before sending the URL request on for Adobe to process)?
- How long before somebody figures out a way to make this all Microsoft's fault? Will this gather any press coverage, and if it does, will they note that IE 6 SP2 and IE 7 don't seem to be affected by the attack? Will Slashdot even bother with a footnote? (My best guess would be, 1 week, yes, no, and no, respectively.)
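To make the "anchor is not even passed ... to the web application" point from the advisory concrete, here's a minimal sketch using the standard `java.net.URI` class. It splits a URL the same way a browser does: the fragment (everything after the `#`) stays on the client, and only the path and query go over the wire. The host `example.com`, the path, and the class and method names are placeholders of my own; the fragment is the one quoted in the advisory.

```java
import java.net.URI;

// Sketch: the "#..." part of a URL is a fragment. Browsers keep it on the
// client side and do not include it in the HTTP request they send.
class PdfFragmentDemo {

    // Returns the fragment of the given URL, or null if there is none.
    static String fragmentOf(String url) {
        return URI.create(url).getFragment();
    }

    // Returns what the server is actually asked for: path plus optional
    // query string, never the fragment.
    static String requestTarget(String url) {
        URI uri = URI.create(url);
        String query = uri.getRawQuery();
        return query == null ? uri.getRawPath() : uri.getRawPath() + "?" + query;
    }

    public static void main(String[] args) {
        // Placeholder host and path; fragment taken from the advisory text.
        String url = "http://example.com/docs/report.pdf"
                + "#blah=javascript:alert(document.cookie);";
        System.out.println("fragment (client-side only): " + fragmentOf(url));
        System.out.println("request target (server sees): " + requestTarget(url));
    }
}
```

Which is exactly why a server-side filter has nothing to inspect here: by the time the request arrives, the script-carrying part of the URL has already been left behind in the browser.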
|
 Wednesday, January 03, 2007
|
2006 Tech Predictions: A Year in Hindsight
|
|
OK, time to face the music and look back at my predictions from last year:
- The hype surrounding Ajax will slowly fade, as people come to realize that there's really nothing new here, just that DHTML is cool again. As Dion points out, Ajax will become a toolbox that you use in web development without thinking that "I am doing Ajax". Just as we don't think about "doing HTML" vs "doing DOM". Well, much as I might have wanted this to take place, it doesn't seem to have happened--Ajax is as much a buzzword as (if not more so than) it was in 2005. In fact, it now seems to have grown to the same buzzwordy status as "Web 2.0", in that we're starting to lose sight of it as its acronym originally defined it to be: Asynchronous Javascript And XML. Now people are talking about using JSON, about using it synchronously, and... hey, it's just a matter of time before somebody points out the flaws in Javascript and starts suggesting other dynamic languages for the browser....
- The release of EJB 3 may actually start people thinking about EJB again, but hopefully this time in a more pragmatic and less hype-driven fashion. (Yes, EJB does have its place in the world, folks--it's just a much smaller place than most of the EJB vendors and book authors wanted it to be.) Hah. Fat chance. Though the EJB-bashing wave has slipped to an all-time low, it seems, it's still ready to rear its ugly head any time somebody suggests that there might be something about EJB that doesn't suck. Still, the luster is starting to wear off on Spring, which means that (a) people are starting to look at it critically, rather than taking it for granted as a media darling, and (b) people will start to re-evaluate EJB as a viable technology rather than just demonize it. Maybe.
- Vista will be slipped to 2007, despite Microsoft's best efforts. In the meantime, however, WinFX (which is effectively .NET 3.0) will ship, and people will discover that Workflow (WWF) is by far the more interesting of the WPF/WCF/WWF triplet. Notice that I don't say "powerful" or "important", but "interesting". Here we go: did Vista ship, or not? Officially, Vista was released to manufacturing (RTM'ed), but it's not available to consumers yet, and won't be until later this month or next. WinFX... er, I mean .NET 3.0... er, I mean NetFX3... whatever... shipped at the same time Vista did, though, and developers in the .NET space are beginning to hear more about this thing called "Workflow". It's still a mystery to most, I think, but then so is WCF.
- Scripting languages will hit their peak interest period in 2006; Ruby conversions will be at their apogee, and it's likely that somewhere in the latter half of 2006 we'll hear about the first major Ruby project failure, most likely from a large consulting firm that tries to duplicate the success of Ruby's evangelists (Dave Thomas, David Geary, and the other Rubyists I know of from the NFJS tour) by throwing Ruby at a project without really understanding it. In other words, same story, different technology, same result. By 2007 the Ruby Backlash will have begun. Has the Ruby backlash begun? Hard to say--certainly there are those who've been rolling out Rails apps that have found problems with deploying Rails, but for now Rails--and thus Ruby--remain the media darling. Maybe by 2008.
- Interest in building languages that somehow bridge the gap between static and dynamic languages will start to grow, most likely beginning with E4X, the variant of ECMAScript (Javascript to those of you unfamiliar with the standards) that integrates XML into the language. Bah--this was an easy one to call. E4X hasn't yet really gained a lot of traction, but that may be because nobody's really talking about it or writing about it. That part might just require more time, or it may never happen--depends on how badly developers want an easier way to work with XML. Suffice it to say, we'll see lots of E4X-like features show up in other languages as we go; some have already shown up in other languages, such as Flex's ActionScript, for example.
- Java developers will start gaining interest in building rich Java apps again. (Freely admit, this is a long shot, but the work being done by the Swing researchers at Sun, not least of which is Romain Guy, will by the middle of 2006 probably be ready for prime-time consumption, and there's some seriously interesting sh*t in there.) Well, you can ask Scott Delap if you're not convinced, but certainly there's been a growing interest in building Eclipse RIAs. Swing (justifiably or not) still remains in the doghouse, however.
- Somebody at Microsoft starts seriously hammering on the CLR team to support continuations. Talk emerges about supporting it in the 4.0 (post-WinFX) release. I have no empirical or anecdotal proof, but the rumors abound...
- Effective Java (2nd Edition) will ship. (Hardly a difficult prediction to make--Josh said as much in the Javapolis interview I did with him and Neal Gafter.) Whoops. Apparently Josh is busy.
- Effective .NET will ship. Pragmatic XML Services will ship. Whoops. Apparently I was busy, too.
- JDK 6 will ship, and a good chunk of the Java community's self-proclaimed experts and cognoscenti will claim it sucks. It did ship, and many did claim it sucks. The coolness of JSR 223 (the scripting support) definitely worked to offset a lot of the cries-of-suckiness, though the last-second dropping of the data-mapping capabilities specified in JDBC 4.0 (WTF, Sun?!?) caught a lot of us by (unhappy) surprise. It also raises the question as to the efficacy of the JCP documents when Sun feels completely comfortable changing them at the Very Last Second....
- Java developers will seriously begin to talk about what changes we want/need to Java for JDK 7 ("Dolphin"). Lots of ideas will be put forth. Hopefully most will be shot down. With any luck, Joshua Bloch and Neal Gafter will still be involved in the process, and will keep tight rein on the more... aggressive... ideas and turn them into useful things that won't break the spirit of the platform. Well, witness the closures debate between Josh on the one hand, and Neal on the other, and you can clearly see that they're still involved in the process, though not in the manner I'd envisioned. That said, though, the JDK 7 discussions are already ramping up; look for an interview I did with Neal Gafter at Javapolis this year to show up on Parleys.com in the very near future, in which we talked about this exact subject. Some interesting ideas will emerge out of this debate, both for JDK 7 and releases beyond...
- My long-shot hope, rather than prediction, for 2006: Sun comes to realize that the Java platform isn't about the language, but the platform, and begin to give serious credence and hope behind a multi-linguistic JVM ecosystem. Wow. Witness the acquisition of the JRuby pair by Sun, and the scripting support in JDK 6, and maybe, just maybe, I can claim a point on this one.
- My long-shot dream: JBoss goes out of business, the JBoss source code goes back to being maintained by developers whose principal interest is in maintaining open-source projects rather than making money, and it all gets folded together with what the Geronimo folks are doing. In other words, the open-source community stops the infighting and starts pulling oars in the same direction at the same time. For once. Well, you can't win them all.
Not sure how that leaves the score, but there you go....
|
 Sunday, December 31, 2006
|
Lack of power makes it really hard to work, even on a laptop...
|
|
Originally, I was going to post this the weekend just before Christmas, but the power outage struck back, and I was forced to hang on to it for a while longer, until I finally had a chance to post (which is now.) Thanks to all those who expressed concern and support through the outage; the worst that happened to us, overall, was the loss of recharging ability, which is a killer when you live on laptops and GameBoys...
For those who've been following the news of the storm that just hammered the Seattle and Eastside area this weekend, yes, I was one of those million-or-so without power, and as I write this, we're still without power. (Not sure if it's because the damage is so widespread or because the power company is being a big corporation--I'm sure the politicos will weigh in on that soon enough.) For those who've been wondering why I'm so slow on email this weekend, now you can probably guess why... And to the rest, yes, I and the family are fine, just really missing working electrical outlets to recharge laptops and GameBoys...
Sunday, December 31, 2006 9:20:53 PM (Pacific Standard Time, UTC-08:00)
|
|
|
Tech Predictions: 2007 Edition
|
|
So, in what's become an ongoing tradition, this is the time of year when I peer into the patented Ted Neward Crystal Ball (TM) (operators are standing by!), see what it tells me about technology trends and ideas for the coming year, and report them to you. The usual disclaimers apply, meaning I'm not getting any sort of endorsement deals to mention anybody's technology here, I'm not speaking for anybody but myself in this, and so on. And, in order to prove that I'm not an analyst group like Forrester or Burton or any of those other yahoos, in a separate post, I'll look over my predictions for 2006 and see how they panned out, thus proving that the patented Ted Neward Crystal Ball (TM) is just as capable of mistakes as any other crystal ball... er, I mean, of course, right all the time.
2006 was an interesting year, in that a lot of interesting things happened this year for developers. For the .NET crowd, Visual Studio 2005 and SQL Server 2005 finally became widely available to them (yes, they shipped in 2005 but it took a bit for them to percolate through the community), and NetFX 3 (aka .NET 3.0, aka Indigo/Avalon/Workflow) shipped in Q4, not to mention Vista itself, meaning there was all kinds of new stuff to play with. For the Java crowd, Spring 2.0 shipped, Geronimo 1.0 shipped, and Sun decided to finally open the doors on the JDK (apparently not realizing that a lot of us had already slipped in the back way through the doors marked "SCSL license" and "JRL license" since JDK 1.2...). Meanwhile, Ruby continued to amaze those who'd never seen a dynamic/scripting language before, and Rails continued to amaze developers who'd never seen a VB demo before. More WS-* specs shipped, people started talking about JavaScript Object Notation (JSON), RSS/Atom continued to draw attention in droves, and marketing guys looked for all kinds of places to hang the Tim O'Reilly-inspired "Web 2.0" meme. And yet, through it all, developers somehow ignored the noise and kept working.
Without further ado...
- General: Analysts will call 2007 the Year of the {Something}, where I bet that {Something} will be either "ESB" or "SOA". They will predict that companies adopting {Something} will save millions, if not billions, if only they rush to implement it now. They will tag this with a probability of .8 in order to CYA in case {Something} doesn't pan out. (Yes, I've read far too many of these reports--I'm personally convinced that each of the analyst companies has a template buried away in their basement that they pull out each time they need a new one, and they just do a global search-and-replace of "{Something}" with whatever the technology du jour happens to be.)
- .NET: Thousands of developers will horribly abuse WPF in ways that can only be called nightmarish, thus once again proving the old adage that "just because you can doesn't mean you should" still holds. WPF's capabilities with video will prove, in many ways, to be the modern equivalent to the "blink" tag in HTML. This will provide some author with a golden opportunity: "WPF Applications That Suck". Alan Cooper will re-release "About Face", updated to include WPF UI elements.
- .NET: Thousands of developers will look to Redmond for an answer to the question, "Which should I use? BizTalk, Windows Workflow, or SQL Server Service Broker?", and get no clear answer.
- Windows: Microsoft will try, once again, to kill off the abomination that was called the Windows 95/98/Me line of operating systems, and will once again have to back off as industry outcries of protest (on behalf of little old ladies who are the only ones left running Windows 95/98/Me and probably haven't turned their machine on in months, at least not since the grandkids last visited) go ballistic.
- Windows: Ditto for Visual Basic 6.0, except now the outcry will be on behalf of developers who aren't capable of learning anything new. Sun will use the resulting PR to announce Project YAVKRWMITT (Yet Another VB Killer Really We Mean It This Time, pronounced "YAV-kermit") on java.net. Meanwhile, efforts to make CLASSPATH into something a VB 6 guy actually has a prayer of understanding will go quietly ignored.
- Java: JSR 277 will continue to churn along, and once the next draft ships, publicly nobody will like what we produce, though quietly everybody will admit it's a far cry better than what we have now, and when it ships in JDK 7 will be adopted widely and quietly.
- Java: Thousands of new ideas and proposals to extend the Java language in various ways will flood into the community, now that developers can start hacking on it for themselves thanks to the OpenJDK. Only a small fraction of these will ever get beyond the concept stage, and maybe one or two will actually be finished and released to the Web for consideration by the community and the JCP. Thousands more Java developers craving Alpha-Geek status will stick a "Hello, world" message into the compiler's startup sequence, then claim "experienced with modifying the OpenJDK Java compiler" on their resume and roundly criticize Java in one way or another by saying, "Well, I've looked at the code, and let me tell you....".
- .NET: Somewhere, a developer will realize that SQL Server 2005 can be a SOAP/WSDL XML service endpoint, and open it up as a private back-channel for his application to communicate with the database through the firewall "for performance reasons" (meaning, "So I can avoid having to talk to the app server in between my web server and my database"). With any luck, the DBA will kill him and hide the body before anybody can find and exploit it.
- General: Yet Another Virus That's Microsoft's Fault will rip through the Internet, and nobody will notice that the machines affected are the ones that aren't routinely administered or receive updates/patches. Companies will threaten Microsoft with million-dollar lawsuits, yet will fire none of their system administrators who lovingly lavish whole days tuning their Linux IRC servers yet leave the Windows Exchange Server still running Windows NT 4.0.
- General: Interest in JSON will escalate wildly, hyped as the "natural replacement for XML" in building browser-to-server connections, owing to its incredible simplicity in expressing "object" data. Folks, JSON is a useful format, but it's not a replacement for XML (nor is XML a replacement for it, either). What made XML so popular was not its hierarchical format (Lord above, that's probably the worst part of it, from where we as developers sit), nor its HTML-like simplified-SGML syntax. What made XML interesting was the fact that everybody lined up behind it--Microsoft, Sun, BEA, Oracle, IBM, there's not a big vendor that didn't express its undying love and devotion to XML. I sincerely doubt JSON will get that kind of rallying effect. (And if you're going to stand there and suggest that JSON is better because it's simpler and therefore more approachable for developers to build support for themselves, quite honestly, I thought we were trying to get out of developers building all this communications infrastructure--isn't that what the app servers and such taught us?)
- General: Interest in Java/.NET interoperability will rise as companies start to realize that (a) the WS-* "silver bullet" isn't, (b) ESB, XML, and SOA are just acronyms and won't, in and of themselves, solve all the integration problems, and (c) we have lots of code in both Java and .NET that needs to talk to each other. This may be a self-serving prediction, but I got a LOT of interest towards the end of this year in the subject, so I'm guessing that this is going to only get bigger as the WS-* hype continues to lose its shine in the coming years.
- Ruby: Interest in Java/Ruby and .NET/Ruby interoperability is going to start quietly making its presence felt, as people start trying to wire up their quick-to-write "stovepipe" Rails apps against other systems in their production data center, and find that Ruby really is a platform of its own. RubyCLR or JRuby may be part of the answer here, but there are likely some hidden mines there we haven't seen yet.
- Languages: A new meme will get started: "JavaScript was that thing, that little toy language, that you used to do stuff in the HTML browser. ECMAScript, on the other hand, is a powerful and flexible dynamic programming language suitable for use in all sorts of situations." Pass it on. If you get it, don't tell anybody else. (Don't laugh--it worked for "The Crying Game".) It's the only way JavaScript... er, ECMAScript will gain widespread acceptance and shed the "toy" label that JavaScript has.
- Languages: Interest in functional-object hybrid languages will grow. Scala, Jaskell, F#, and others not-yet-invented will start to capture developers' attention, particularly when they hear the part about functional languages being easier to use in multi-core systems because it encourages immutable objects and discourages side effects (meaning we don't have to worry nearly so much about writing thread-safe code).
- Languages: Interest in Domain-specific languages will reach a peak this year, but a small backlash will begin next year. Meanwhile, more and more developers will realize that one man's "DSL" is another man's "little language", something UNIX has been doing since the early 70's. This will immediately take the shine off of DSLs, since anything that we did in the 70's must be bad, somehow. (Remember disco?)
- General: Rails will continue to draw developers who want quick-fix solutions/technologies, and largely that community will ignore the underlying power of Ruby itself. The draw will start to die down once Rails-esque feature ideas get folded into Java toolkits. (Rails will largely be a non-issue with the .NET community, owing to the high-productivity nature of the drag-and-drop interface in Visual Studio.)
- Java: Interface21 is going to start looking like a "big vendor" alongside BEA and IBM. I was talking with some of the I21 folks in Aarhus, Denmark at JAOO, and one of them casually mentioned that they were looking at a Spring 2.1 release somewhere in mid-2008. Clearly Spring is settling into eighteen-month major-version release cycles like all the big (meaning popular), established software systems have a tendency to do. This is both a good thing and a bad thing--it's good in that it means that Spring is now becoming an established part of the Java landscape and thus more acceptable to use in production environments, but it's bad in that Spring is now going to face the inevitable problem all big vendors face: trying to be all things to all people. This is dangerous, both for Interface21 and the people relying on Spring, largely because it means that Spring faces a very real future of greater complexity (and there are those, myself included, who believe that Spring is too complex already, easily on par with the complexity seen in EJB, POJOs notwithstanding).
- General: Marc Fleury will get a golden parachute from Red Hat (at their request and to their immense relief), and hopefully will retire to his own small island (might I suggest Elba, le petit caporal?) to quietly enjoy his millions. A shame that the people who did most of the real work on JBoss won't see a commensurate reward, but that's the way the business world works, I guess.
- General: Some company will get millions to build an enterprise product on the backs of RSS and/or Atom, thus proving that VCs are just as stupid and just as vulnerable to hype now as they were back in the DotCom era.
- General: Somebody will attempt to use the phrase "Web 2.0" in a serious discussion, and I will be forced to kill them for attempting to use a vague term in a vain effort to sound intelligent.
- Web clients: Ajax will start to lose its luster when developers realize the power of Google Maps isn't in Ajax, but in the fact that it's got some seriously cool graphics and maps. (Or, put another way, when developers realize that Ajax alone won't make their apps as cool as Google Maps, that it's the same old DHTML from 1998, the hype will start to die down.)
- XML: Somebody, somewhere, will realize that REST != HTTP. He will be roundly criticized by hordes of HTTP zealots, and quietly crawl away to go build simpler and more robust systems that use transports other than HTTP.
- XML: Somebody, somewhere, will read the SOAP 1.2 specification. H.P. Lovecraft once suggested, loosely paraphrased, that the day Man understands the nature of the universe, he will either be driven into gibbering insanity, or flee back into ignorance in self-preservation. Ditto for the day Man reads the SOAP 1.2 spec and realizes that SOAP is, in fact, RESTful.
- Security: The US Government will continue its unbelievable quest to waste money on "security" by engaging in yet more perimeter security around airports and other indefensible locations, thus proving that none of them have bothered to read Schneier and learn that real security is a three-part tuple: prevention, detection, and response.
- Security: Thousands of companies will follow in the US Government's footsteps by doing exactly the same thing. (Folks, you can't solve all your problems with cryptography, no matter how big the key size--you just end up with the basic problem of where to store the keys, and no, burying them inside the code isn't going to hide them effectively.)
- Security: More and more rootkits-shipping-with-a-product will be discovered. We used to call it "getting close to the metal", now it's a "rootkit". With great power comes great responsibility... and, as many consumers have already discovered, with great power also comes a tendency to create greater instability...
- General: Parrot will ship a 1.0 release. Oh, wait, hang on, sorry, I bumped into the crystal ball and accidentally set it to 2017.
- .NET: Microsoft will ship Orcas (NetFX 3.5). (Sorry, crystal ball's still set on 2017. Trying to fix it...)
- .NET: Vista will surpass Windows XP in market penetration. (Let's see, almost got it set back to 2007, bear with me... There. Got it.)
- General: I will blog more than I did this year. (Hell, I couldn't blog less, even if I tried.)
- General: Pragmatic XML Services, Pragmatic .NET Project Automation and Effective .NET will ship. (Wait, is the crystal ball still on 2017...?)
Same time, next year....
|
 Friday, December 01, 2006
|
Follow-up on the Java Generics post
|
|
A number of folks emailed me with comments and ideas following the post on Java5's generics model. In no particular order...
John Spurlock wrote,
Interesting scenario, I wasn't able to come up with a warning-free solution either - but had some fun trying. I wonder if your compiler of choice makes a difference? I seem to remember Eclipse's JDT compiler having subtle differences from Sun's in regards to edge-case generics/casting scenarios (Sun's being more strict and giving more warnings).
The c# analogue is trivial, although the client code seems unnecessarily verbose (does not/will not infer the "item-type" afaik) In general, the c# compiler seems overly conservative in regards to type inference, forcing explicit type parameters far too often (anonymous parameterized delegates are the biggest offender). Also the fact that Type is not parameterized makes it impossible to pass standard arguments a la the java example.
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
List<DateTime> listOfDates = GetSomethingOf<List<DateTime>, DateTime>();
Console.WriteLine(listOfDates.Count);
Collection<DateTime> collectionOfDates =
GetSomethingOf<Collection<DateTime>, DateTime>();
Console.WriteLine(collectionOfDates.Count);
LinkedList<DateTime> linkedListOfDates =
GetSomethingOf<LinkedList<DateTime>, DateTime>();
Console.WriteLine(linkedListOfDates.Count);
Dictionary<DateTime, DateTime> dictionaryOfDatePairs =
GetSomethingOf<Dictionary<DateTime, DateTime>, KeyValuePair<DateTime,DateTime>>();
Console.WriteLine(dictionaryOfDatePairs.Count);
List<List<String>> listofListsOfDates =
GetSomethingOf<List<List<String>>, List<String>>();
Console.WriteLine(listofListsOfDates.Count);
Console.ReadLine();
}
static C GetSomethingOf<C, T>()
where C : ICollection<T>, new()
where T : new()
{
C rt = new C();
rt.Add(new T());
return rt;
}
}
}
Thanks, John.
Adam Vanderburg wrote:
In C# you'd add the "new()" constraint to the generic types (both the collection and item), then you can just "new T()" them. In fact, one of the frustrating things about C# 2.0 is that you can require a parameterless constructor, but you can't require Constructors with specifically typed parameters. (Rumor has it that the underlying IL supports it, just not the C# 2 compiler.)
Yep, Adam, that's exactly what John demonstrates above (in case any non-C# programmers were wondering what that "where T : new()" syntax was coming from). As to whether the IL supports constructors with specifically typed parameters, I have to admit I don't know the answer to that one, and don't have time at the moment to find out--maybe Serge Lidin will read this blog entry and email me with an answer that I can post in a future blog entry (or comment, once I get comments re-enabled after doing a dasBlog upgrade to prevent all the crap comment and pingback/trackback spam I've been getting on here).
Next, Matt Tucker wrote:
In regard to your article on Java5 generics, I had a couple comments for you:
First of all, I'm not disputing the contention that Java generics leave something to be desired. Cool as they are, there are clearly some bits missing from the implementation.
No arguments there, Matt, but mostly my argument is that Java's generics model leaves something to be desired entirely because it is built on type erasure, rather than persisting the parameterized type directly into the JVM bytecode level. Doing that would have complicated the JVM a fair bit, but would have (a) allowed other languages to take advantage of generics, (b) preserved type-safety even in the face of Reflection, and (c) allowed for JIT compiler optimizations given the known parameterized type. None of these are possible in a type-erasure-based model. Why did Sun choose this approach? I won't speak authoritatively here, but my guess is because it represented too drastic a change for the language/platform at this point in Java's lifetime. (Whether that's true or not, or a good decision to have made, is entirely up to you to judge for yourself.)
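For anyone who wants to see point (b) in action, here is a minimal sketch. The class and method names are made up for illustration and come from nobody's email. It shows that two differently parameterized lists share one runtime class, and that Reflection can slip a Date into a List<String> with the mistake only surfacing later, at the cast the compiler inserts on the way out.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class ErasureDemo {

    // After erasure, both parameterizations share a single runtime Class object.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Date> dates = new ArrayList<Date>();
        return strings.getClass() == dates.getClass();
    }

    // Adds a Date to a List<String> through the erased add(Object) method.
    // Returns true if the bad element is only caught on the way out, when the
    // compiler-inserted cast to String fails.
    static boolean reflectionBypassesTypeCheck() {
        List<String> strings = new ArrayList<String>();
        try {
            List.class.getMethod("add", Object.class).invoke(strings, new Date());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        try {
            String first = strings.get(0); // implicit cast to String happens here
            return first == null;          // not reached
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("same runtime class: " + sameRuntimeClass());
        System.out.println("reflection bypassed the check: " + reflectionBypassesTypeCheck());
    }
}
```

The add itself succeeds without complaint; nothing in the JVM knows the list was supposed to hold Strings.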
Secondly, the code you had posted in your blog (at 4:00p at least) wouldn't compile. The issue was in the "external" class, and I ended up changing it to:
public static class external {
public static <C, T extends Collection<C>> T getSomethingOf(Class<T> type, Class<C> contentType)
throws Exception {
T result = type.newInstance();
result.add(contentType.newInstance());
return result;
}
public static Set<Date> getSetOfDate()
throws Exception {
return getSomethingOf(HashSet.class, Date.class); // warning
}
}
This still shows a warning on the line in question, but dispenses with the cast of the HashSet class, and rearranges the generic specification a bit.
My next point, which isn't a major one, is that in all of the examples you're doing collection.addAll(Arrays.asList(<single genericized item>)), which causes Java to complain that it's trying to dynamically create an array for an unknown (generic) type. Since that array has one item in it, and the array itself is going to be thrown away, why not do collection.add(contentType.newInstance()) and dispense messing with arrays entirely? If you're worried about doing it in a loop, does it really make sense to create and fill an entire array in a loop, add all its contents to the collection in one operation, and then throw it away?
As for the issue itself, I'd say that while none of the implementations are clean, the last one ("internal") is probably the best from the standpoint that it's at least providing a clean API. Sure, there's some casting and warnings going on inside there, but at least users of the code (ie, getSetOfDate) don't have to mess with it. And the warnings are there to remind you that you need to be careful about what you're doing. Since you have a collection that's only supposed to hold C's, and since you're only creating C's and putting them in the collection, it theoretically *should* be fine. Of course, what's the point of having to deal with a statically typed language if all you can get out of it is "should"?
All are viable points, Matt, and unfortunately I can't answer any questions regarding the intent of the code or why the choice for arrays; as I mentioned, this was code presented to me by an attendee at an NFJS conference, looking for an answer to get rid of the warnings generated by the various options. As to why the code wouldn't compile, that's likely a typo on my end--while we were working with it, we were in Eclipse at the time, and no compiler errors were reported at the time, so I have to assume I fat-fingered the code somewhere along the way.
Then, Bob Lee a.k.a. crazybob wrote to say:
The problem is you're trying to create an instance of a generic type from a Class. Class instances can only represent raw types. For example, List<?>, List<String> and List all share the same Class instance.
Your options are A) use a callback instead of a class instance:
interface CollectionFactory<C> {
C newInstance();
}
static <T, C extends Collection<T>> C
getSomethingOf(CollectionFactory<C> collectionFactory, Class<T> elementType)
throws Exception {
C c = collectionFactory.newInstance();
c.addAll(Arrays.asList(elementType.newInstance()));
return c;
}
public static Set<Date> getSetOfDate() throws Exception {
return getSomethingOf(new CollectionFactory<Set<Date>>() {
public Set<Date> newInstance() {
return new HashSet<Date>();
}
}, Date.class);
}
Or B) suppress the warning.
In the example above, we eliminated the warnings when we got rid of the Class representing the collection type, but we left the Class representing the element type in. This isn't a problem in the example above, but if we called getSomethingOf() with a generic element type, the code won't compile. Again, our only options would be to live with warnings or use a callback.
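To make that last point concrete, here is a rough sketch of the callback approach taken all the way, with a callback for the element as well as the collection. The Factory interface and the FactoryDemo class are hypothetical names invented for this illustration, not Bob's code. With no Class tokens left, the element type can itself be generic.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical single-method callback, invented for this sketch.
interface Factory<T> {
    T create();
}

final class FactoryDemo {

    // No Class tokens, so no raw types, no casts, no unchecked warnings.
    static <T, C extends Collection<T>> C getSomethingOf(
            Factory<C> collectionFactory, Factory<T> elementFactory) {
        C result = collectionFactory.create();
        result.add(elementFactory.create());
        return result;
    }

    // The element type here is List<String>, which a Class token cannot express.
    static List<List<String>> getListOfLists() {
        return getSomethingOf(
            new Factory<List<List<String>>>() {
                public List<List<String>> create() {
                    return new ArrayList<List<String>>();
                }
            },
            new Factory<List<String>>() {
                public List<String> create() {
                    return new ArrayList<String>();
                }
            });
    }

    public static void main(String[] args) {
        System.out.println(getListOfLists());
    }
}
```

The price is the verbosity of the anonymous classes at every call site, which is the trade being made for a warning-free compile.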
Thanks, Bob, though I'm not sure I like the solution. Bob also pointed out (over IM) that Angelika Langer has a great FAQ on generics off of her website; were I not such a lazy person, I'd link to it directly from here (and will later, when I'm online), but for now, Google on "Langer generics FAQ" and you're feeling lucky....
Finally, Rafael de F. Ferreira wrote:
Hello. I came up with the following ugly hack:
import java.util.*;
import org.junit.Test;
public class WithNewClass {
public static <T, C extends Collection<? super T>>
C getSomethingOf(Class<C> type, Class<T> contentType)
throws Exception
{
C res = type.newInstance();
res.add(contentType.newInstance());
return res;
}
public static Set<Date> getSetOfDate()
throws Exception
{
class HSD extends HashSet<Date> {};
Class<? extends Set<Date>> cls = HSD.class;
return getSomethingOf(cls, Date.class);
}
@Test public void printSetOfDate() throws Exception {
Set<Date> newset = getSetOfDate();
System.out.println(newset);
}
}
It compiles without warnings in Eclipse, but I hope someone knows a better solution. Creating a class just to capture type arguments seems like a kludge.
It is, Rafael, but it's an interesting and useful kludge, nonetheless.
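Why does the kludge work? A named subclass records its generic superclass in its class file, so the type argument survives erasure and can be read back reflectively. The sketch below illustrates that mechanism; CapturedTypeDemo, HashSetOfDate and elementTypeOf are names invented here, and this is an explanation of the idea rather than anything Rafael sent.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Date;
import java.util.HashSet;

// Hypothetical demo class; the names are illustrative only.
final class CapturedTypeDemo {

    // Same trick as HSD above: a named subclass that pins down the type argument.
    static class HashSetOfDate extends HashSet<Date> {
        private static final long serialVersionUID = 1L;
    }

    // Reads back the first type argument of the subclass's generic superclass.
    // Assumes the class passed in directly extends a parameterized type.
    static Type elementTypeOf(Class<?> subclass) {
        ParameterizedType parent = (ParameterizedType) subclass.getGenericSuperclass();
        return parent.getActualTypeArguments()[0];
    }

    public static void main(String[] args) {
        System.out.println(elementTypeOf(HashSetOfDate.class)); // class java.util.Date
    }
}
```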
Thanks to all five of you for your comments; much as I dislike the generics system that Java ended up with, the sooner we learn to work with it and account for its... quirks, shall we say... the better.
Java/J2EE
Friday, December 01, 2006 7:49:11 AM (Pacific Standard Time, UTC-08:00)
|
|
 Tuesday, November 28, 2006
|
Java5, generics, and "just not quite there"
|
|
So an attendee comes up to me at one of the past NFJS shows, with this challenge:
The implementation does not know what parametrized Iterable class will be used. The Iterable class will need to know what class it contains. Interfaces are passed to the factory and it calls a lookup to identify (or create) the implementing class. Can this be done without causing a compile warning?
// usage:
Seq<Item> items = factory.createBean(null, Seq.class, Item.class);
// interface:
public abstract <T> T getBean(String localName, Class<T> javaClass,
Type... typeArguments);
// impl:
public <T> T createBean(String localName, Class<T> javaClass, Type... typeArguments) {
Resource resource = createResource(localName);
Collection<String> rdfTypes = findRdfTypes(javaClass);
for (String rdfType : rdfTypes) {
addStatement(resource, RDF.TYPE, createResource(rdfType));
}
T bean = rdfBeanFactory.createBean(this, resource, rdfTypes, javaClass);
if (typeArguments != null && bean instanceof RdfParameterizedBean)
((RdfParameterizedBean)bean).setActualTypeArguments(typeArguments);
return bean;
}
-- Some ideas I have tried.
import java.util.Arrays;
import java.util.Collection;
import java.util.Date;
import java.util.HashSet;
import java.util.Set;
public class test {
public static class plain {
public static Collection getSomethingOf(Class type, Class contentType) throws Exception {
Collection result = (Collection) type.newInstance(); // cast
result.addAll(Arrays.asList(contentType.newInstance())); // warning
return result;
}
public static Set<Date> getSetOfDate() throws Exception {
return (Set<Date>) getSomethingOf(HashSet.class, Date.class); // warning
}
}
public static class fixed {
public static <C> Collection<C> getSomethingOf(Class<? extends Collection> type, Class<C> contentType) throws Exception {
Collection<C> result = type.newInstance(); // warning
result.addAll(Arrays.asList(contentType.newInstance()));
return result;
}
public static Set<Date> getSetOfDate() throws Exception {
return (Set<Date>) getSomethingOf(HashSet.class, Date.class); // cast
}
}
public static class external {
public static <C, Collection<C extends T>> T getSomethingOf2(Class<T> type, Class<C> contentType) throws Exception {
T result = type.newInstance();
result.addAll(Arrays.asList(contentType.newInstance()));
return result;
}
public static Set<Date> getSetOfDate2() throws Exception {
Class<HashSet<Date>> type = (Class<HahSet>) HashSet.class; // warning
return getSomethingOf2(type, Date.class);
}
}
public static class internal {
public static <C, T extends Collection, R extends Collection<C>> R getSomethingOf(Class<T> type, Class<C> contentType)
throws Exception {
R result = (R) type.newInstance(); // warning
result.addAll(Arrays.asList(contentType.newInstance()));
return result;
}
public static Set<Date> getSetOfDate() throws Exception {
return getSomethingOf(HashSet.class, Date.class);
}
}
}
The goal here, I think, is to be able to construct instances of T without compiler warnings or errors (or old-style casts). Needless to say, neither Venkat nor I could manage to cruft up something that could work, and so I thought to throw this out to the blogosphere to see what others could come up with.
If I'm feeling bored one day I'll try coding it in C#, just to (hopefully) exemplify the differences in the generics model between the two.
UPDATE: Hopefully I got the formatting right this time; have I mentioned how much I hate the fact that Java, C# and C++ all use the left-pointy-bracket/right-pointy-bracket syntax when posting code snippets like this...?
Java/J2EE
Tuesday, November 28, 2006 2:06:02 PM (Pacific Standard Time, UTC-08:00)
|
|
 Monday, November 20, 2006
|
"What is Java Software?" You'd think they know by now...
|
|
While looking to download the Java5 JDK from Sun, I ran across this on the home page of java.com:
What is Java Software?
Java software allows you to run applications called "applets" that are written in the Java programming language. These applets allow you to play online games, chat with people around the world, calculate your mortgage interest, and view images in 3D. Corporations also use applets for intranet applications and e-business solutions.
Applets!? After almost a decade of Java's success on the server through J2EE and lightweight containers, the marketing idiots at Sun choose to explain what Java is by citing applets?!? Folks, if ever there was a single-sentence hint as to how Sun doesn't quite "get it", this is it.
Java/J2EE
Monday, November 20, 2006 5:51:45 PM (Pacific Standard Time, UTC-08:00)
|
|
|
Blog changes
|
|
Because of all the referrer and Trackback/Pingback spam, I've had to disable Trackback and Pingback (hopefully just temporarily, at least until I can get my dasBlog upgraded). Dunno if that makes anybody else sad, but I'm bummed at not being able to see peoples' comments and reactions to my posts.
Thus, for the time being, if you respond (positively or negatively) to something I say, and would really like a reaction (again, positive or negative), please either drop me an email or just post a comment here.
Monday, November 20, 2006 5:26:21 PM (Pacific Standard Time, UTC-08:00)
|
|
 Saturday, November 18, 2006
|
Windows Vista has RTM'ed
|
|
... which, normally, would be a source of much excitement. So I pull down the Vista bits, fire up VMWare (not that I don't trust it yet, it's just that... well.. you know... it is a 1.0 release and all, and besides, I do all my work now in VMWare images, and...), and sort through the whole "Vista doesn't like the VMWare CD emulation problem" (by mounting the ISO on the host using Daemon-Tools, so that to VMWare it looks like a real DVD). Voila. Installation proceeds.
And then, Vista prompts me for a license key. This should be the easiest step in the whole process: Being an MVP, we get license keys to everything Microsoft makes. So I cruise on up to the MSDN site, ask for a Vista Ultimate key, and...
"Error while requesting Product Keys. Please try again later or contact customer support. Please try again later. Thank you for your patience."
I try again.
"Error while requesting Product Keys. Please try again later or contact customer support. Please try again later. Thank you for your patience."
One more time--Microsoft software has been known to work the third time (or not at all).
"Error while requesting Product Keys. Please try again later or contact customer support. Please try again later. Thank you for your patience."
Now, fortunately, Vista will allow you to install without the product key up front. But you've got to wonder what the folks in Microsoft's MSDN support department were thinking when they didn't check to make sure requesting product keys would work before posting Vista to the Subscriber Downloads section: "Well, you know, it's not like the MVPs, the folks that we've rewarded for loyalty and external product support, it's not like they would want to download Vista right away and start playing with it or anything... and besides, it's not like they'd want the fullest-featured version of Vista, all they want to do is install the Home/Basic/StrippedDownToNothing version, right?"
Get it fixed, MSDN. And preferably before I have to reinstall Vista in a VMWare image again because I don't get a product key registered in time.
Oh, and for the future? You might want to check these things before you put the silly thing online. And that error message... Oy! "Thank you for your patience"?!? That has GOT to be the most overused phrase in all of customer service. So much so that I'm considering a new crusade to eliminate it from the vocabulary of any and all customer service representatives and management. (If I had any patience, I doubt I would be spending it waiting for somebody to get their act together on this. Now, waiting for my son to make his next move in Catan, THAT's a worthwhile exercise in patience...)
So sorry, Microsoft, but this earns you the highest mark of disrespect I can offer in the blog: "Duh..."
Update: So I went back in to MSDN Subscriber Downloads and got the Product Key without a hitch this time around, but it still doesn't change (a) the inexcusable fact that MSDN couldn't handle the load of its MSDN Subscribers downloading Vista, or (b) the fact that it couldn't even handle the load of people downloading product keys. Possible solutions for future releases: how about handing out product keys *before* the release? Just about a week or two ahead of the actual release, post a notice telling subscribers that "RTM keys are available", and that'd reduce at least a little bit of the load. I think subscribers can understand the difficulties of providing enough server bandwidth to download a 2.5 GB ISO image (!), but not having the product keys ready to go, that's just really hard to understand....
.NET
Saturday, November 18, 2006 2:43:14 AM (Pacific Standard Time, UTC-08:00)
|
|
 Friday, November 17, 2006
 Thursday, November 16, 2006
|
Welcome to Borders' Microsoft Days...
|
|
If you're a Microsoftie and you're in the Redmond area this week, swing by the Borders in the Redmond Town Center, where they're having their "Microsoft Days" experience--everything a Microsoftie buys (whether for themselves or for their significant other, hint hint, guys) is 15% off.
Why the advertisement? Two reasons: one, because I love supporting the local causes, and two, because I'm going to be there Friday night on a panel discussion with several .NET notables, including Bill Vaughn (the original SQL Server curmudgeon), Harry "I Got Your Architecture Right Here, Baby" Pierson, contributor to the "VB6 Migration Guide" book Keith Pleas, and possibly (if we can drag them out of the p & p "war room") agile aficionados Peter Provost and Brad Wilson. We have no real idea what we're going to talk about, but given the fact that we all like to express opinions regardless of whether we have any real working knowledge on the subject, I expect it'll be an interesting discussion....
See your local Borders for details, and while you're there, drop into the cafe and grab an espresso from the cheerful cafe staff... caffeine makes everything better.
Reading
Thursday, November 16, 2006 5:13:54 AM (Pacific Standard Time, UTC-08:00)
|
|
 Thursday, November 02, 2006
|
Kudos to APress...
|
|
So I'm in Borders tonight, looking around, and I happen to see one of APress's latest titles, "Practical OCaml". Several things go through my mind at once:
- WOW. OCaml.
- A book on OCaml. Not even a "Programming Languages 101" textbook, but a practical one, even.
- Like, a book, copywrit this year, on OCaml.
- Gotta buy it--not just because it's another of those Dead Languages I like to explore, but because F# is a dead-ringer for OCaml, and I'm really interested in seeing where we can go with F# these days.
- Gotta buy it--not only for the F# tie-in, but because Scala comes from that same family of languages, so there's probably some goodness on the Scala thought experiment, too.
- You know, come to think of it, this is the third or fourth book on the "Non-Mainstream" languages that APress has done recently. I thought maybe "Practical Common Lisp" was a one-shot, and hey, "Programming Sudoku" isn't a language but definitely a fun title nevertheless, but with "Practical OCaml", maybe APress is quickly becoming like Morgan Kaufmann, in that they're going after territories that aren't already flooded with ten thousand "Me Too Ruby" books.
- And it's not just limited to languages either, come to think of it: they just published a db4o book, and even before then they had the only Lego Mindstorms books for years.
- Nice going, Gary.
- Hmm.... Wonder if Gary already has "Practical Scala" under contract...?
Well done, APress. You had me worried there for a while, when you bought up all those Wrox titles (most of which were unadulterated crap, IMHO), but you've restored my faith in you once again. In fact, in my book, you have graduated to an entirely new level of coolness.
Reading
Thursday, November 02, 2006 11:22:41 PM (Pacific Daylight Time, UTC-07:00)
|
|
 Tuesday, October 24, 2006
|
New column goes live
|
|
The folks over at MSDN asked me to author a series of articles based around the theme of the "Pragmatic Architecture" talk I've given in a couple of locales recently, and the first article ("Layering") has gone up, along with the introduction to the series. Feedback is, of course, welcome, through either blog comments or through more traditional channels.
By the way, here's an interesting challenge for those of you who think you're up for it--who are the two members of "the group" spotted by the author during the intro? (Yes, they are, in fact, real people. None of this "Any similarities to persons real or historical is strictly accidental" bull-pucky for me.)
|
 Monday, October 16, 2006
|
There, but for the grace of God (and the experiences of Java) go I
|
|
At the patterns&practices Summit in Redmond, I was on a webcast panel, "Open Source in the Enterprise", moderated by Scott Hanselman, with Rocky Lhotka, Chris Sells, and me as panelists. Part of the discussion came around to building abstraction layers, though, and one thing that deeply worried and disappointed me was the reaction of the other panelists when I tried to warn them of the dangers of over-abstracting APIs.
You see, we got onto this subject because Scott had mentioned that Corillian (his company) had built an abstraction layer on top of the open-source logging package, log4net. This reminded me so strongly of Commons Logging that I made a comment to that effect, warning that the Java community got itself into trouble (and continues to do so to this day, IMHO) by building abstraction layers on top of abstraction layers on top of abstraction layers, all in the name of "we might want or need to change something... someday". It was this very tendency that drove many developers to embrace YAGNI (You Ain't Gonna Need It) from the agile/XP space, and it remains a fiercely debated subject. But what concerned me was the reaction of the other panelists, which, paraphrased, came off to me as, "We won't make that mistake--we're smarter than those Java guys."
Sorry, folks. That doesn't cut it.
Certainly, .NET has learned from the five years' lead time the Java community has had: the power of a runtime and bytecode, the usefulness of a large and well-built library upon which to build further, the power of compiled-on-demand Web pages, the usefulness of an openly-extensible build tool, even the "one language" vs. "many languages" debate, all could be said to have been influenced strongly by decisions and experience in the Java community. But Java still has much more it can teach the .NET community: mocking, unit-testing, lightweight containers, dependency-injection, and the perils of O/R-M are just part of the list of things that the Java community has close to a half-decade's experience in, compared to .NET's none.
To stand there and suggest that .NET will somehow avoid the mistakes of the Java community just because "we're smarter than them" is more than sheerest folly; it's a blatant ignorance of the well-known and famous quote:
"Those who cannot remember the past are condemned to repeat it." --George Santayana
.NET | Java/J2EE | Ruby
Monday, October 16, 2006 6:58:46 PM (Pacific Daylight Time, UTC-07:00)
|
|
 Tuesday, October 10, 2006
|
Watching a friend's career die a short, horrific, painful death
|
|
Normally, I don't go for the chain-email thing, but recently someone who claims to be a friend of mine sent me this email:
The first episode of my Millahseconds weekly geek comedy podcast has been published. Details are here And you can download/subscribe here. Best Regards, Mark Miller
Now, as I say, I normally don't go in for this sort of shameless self-promotion (at least, on the part of other people, anyway), but his email contained one segment that made me rethink my position:
IMPORTANT: To help promote this, I've employed the services of a crazy old voodoo gypsy woman named Moombassa. To avoid the Millahseconds Curse (which manifests itself as a rather itchy rash in areas you don't even want to know about), it is essential that you tell absolutely everyone you know about Millahseconds. In doing so, Moombassa says the curse will be lifted from you and passed onto your friends (awesome, eh?). And don't worry, that itching should go away in a few days.
Not that I'm suffering from any itchy rash in areas I don't... er, didn't... want to know about. No, sirreee, not me. This is just a... general rethinking of my position on forwarding selected emails. That's all. Really.
(Good luck, Mark, and for those of you who've never heard Mr. Miller on a comedic rant, you owe it to yourself to have a listen, both to this, and to Mondays. Oh, and be sure to have handy a spare pair of underwear--Mark's been known to make people laugh so hard I soiled mine... er, I mean, they soil theirs. It's some brutally wicked geek comedy.)
|
 Friday, October 06, 2006
|
A little knowledge is a dangerous thing
|
|
Five easy steps to thinking you understand a subject well enough to write on it:
- Read an article that poorly describes the subject, such as the article at http://java.sys-con.com/read/37613.htm, particularly when it subscribes to a few of the popular myths (such as "Why not tell the garbage collector what and when to collect", or the advice that calling System.gc() is anything but a waste of your time or an unnecessary hindrance to the GC itself).
- Follow the directions given there, which ask to create a benchmark with so much noise underneath it (in this case, by running on top of the WebLogic Server... or any J2EE server, for that matter) that you could never be precisely sure of the effect of any change to the code.
- Read an unrelated specification, such as one that's unrelated to the "normal" JVM and its GC behavior, like the Real-Time Specification for Java (JSR 1), and pretend that it will offer insights into how the J2SE/JSE JVM works.
- Don't bother reading the established literature from the source, such as that from the Sun Hotspot team (for example, the docs available online at "Tuning Garbage Collection with the 1.4.2. VM", in which it says, "Another way applications can interact with garbage collection is by invoking full garbage collections explicitly, such as through the System.gc() call. These calls force major collection, and inhibit scalability on large systems. The performance impact of explicit garbage collections can be measured by disabling explicit garbage collections using the flag -XX:+DisableExplicitGC." and the Hotspot FAQ, in which it says, "14. What type of collection does a System.gc() do? An explicit request to do a garbage collection does a full collection (both young generation and tenured generation). A full collection is always done with the application paused for the duration of the collection." and, most of all, "31. Should I pool objects to help GC? Should I call System.gc() periodically? The answer to these is No! Pooling objects will cause them to live longer than necessary. We strongly advise against object pools. Don't call System.gc(). The system will make the determination of when it's appropriate to do garbage collection and generally has the information necessary to do a much better job of initiating a garbage collection. If you are having problems with the garbage collection (pause times or frequency), consider adjusting the size of the generations.") Ignore that literature in favor of what your cousin's brother's wife's former roommate said about how to make Java GC run better.
- Publish your own variation thereof, and repeat.
Anybody still wondering why Java performance myths continue to be perpetuated?
(In truth, it's really a shame--the author of the article really seems, on the surface of it, to be quite knowledgeable about the JVM and GC behavior, but as I went through it, I just got this jarring and sick feeling that either she was working with an entirely different JVM than the one I've been using for years now, or else everything I've been told and seen about the JVM was somehow a huge lie in and of itself--and if that's the case, boy, are a LOT of the Java experts I know and respect equally fooled. If her benchmark weren't on top of WLS, I'd be tempted to follow it, but any benchmark on top of a J2EE server is going to be skewed, and thus, in my mind, not even worth the bother. Run it on top of a naked JVM, then let's see what's going on and compare notes. Normally I really try to give authors the benefit of the doubt, but this time... Sorry, Ms. Andres, but you've got a really steep uphill battle to fight yet if you're going to get any respect whatsoever on this one.)
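For the curious, here is roughly what a "naked JVM" probe might look like. It is only a sketch under assumptions: the ExplicitGcProbe class and its method names are invented for this example, and the numbers it reports will vary with the JVM and the collector in use. It counts the collections the JVM reports on either side of a System.gc() call, using the standard java.lang.management beans, so the same class can be run with and without the -XX:+DisableExplicitGC flag quoted above and the two runs compared.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe class; the names are illustrative only.
final class ExplicitGcProbe {

    // Sums the collection counts of every collector the JVM reports.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Number of collections observed across a single System.gc() call.
    static long collectionsAcrossExplicitGc() {
        long before = totalCollections();
        System.gc();
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        System.out.println("collections across System.gc(): " + collectionsAcrossExplicitGc());
    }
}
```

A single count is not a benchmark, of course; it only confirms whether the explicit request was honored, which is the starting point for measuring what those requests cost.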
Java/J2EE
Friday, October 06, 2006 6:00:02 AM (Pacific Daylight Time, UTC-07:00)
|
|
|
JAOO? Ja, I O-O too!
|
|
For two years now, I've been trying to come up with a good English pun on the name of the JAOO (apparently, officially pronounced "[DJA-OU]"), and that's the best I could come up with. Fortunately, the quality of the show isn't dependent on my puny punability.
Once again, the JAOO folks deserve a peerage. The venue was great (not often do I get to perform on a concert hall stage), the speaker selection was diverse and entertaining (not often do I get to see two people I deeply respect, in this case Glenn Vanderburg and Ian Griffiths, go at each other--respectfully--over the benefits and/or drawbacks of a technology, in this case, the Seaside web framework), and the opportunity to "hang" with those speakers (which is always the principal draw for me) was first-rate. I always love it when a conference dares to bridge the technology gap by bringing Java, .NET and "other" folks, such as the Rubyists, together, and JAOO does that magnificently. What was once a Java-centered conference is clearly no longer; now it's a Java/.NET/Agile/Enterprise/Client/Academic/Pragmatic conference.
Hail, JAOOers!
Conferences
Friday, October 06, 2006 5:08:44 AM (Pacific Daylight Time, UTC-07:00)
 Wednesday, September 27, 2006
Where've you been, Ted?
Some of the blog readers have emailed me asking about the long silence; a few have even asked if I was injured by one of the flying rotten tomatoes that came with the Vietnam post. No, I've just been traveling a lot, doing a bunch of conferences, with more coming up, like JAOO and DevReach (a new show that's opening in Sofia, Bulgaria, and one that I'm really looking forward to). In fact, for any of those of you who are in the Bulgaria area in a couple of weeks, DevReach is offering a pretty interesting raffle gift, a trip to visit Microsoft Research in Redmond; even if you don't win the prize, though, the Microsoft Research site is still pretty cool to visit.
In other news, I have new digs for my .NET training; yes, some of you had already read this elsewhere, but I'll say it here: I'm very glad to now be a part of the crew at Pluralsight, and I'm looking forward to doing Workflow, WCF, and Architecture classes for them, among others. It's a privilege and honor to be among guys this bright and this articulate, and once again I'm just happy at being a part of a group that will continue to keep me on my toes for a long time to come.
Meanwhile, I do plan on blogging again soon, but probably not until I'm done with my current travel set (eight cities, four countries, two continents, six weeks) and have some time to breathe again.
 Tuesday, June 27, 2006
Thoughts on Vietnam commentary
Numerous folks have taken me to task (some here in comments, some through private email, some through still other channels) over the last blog post; rather than try to respond to all individually, I figured it makes more sense to address the more salient points here:
- "How dare you use the Vietnam War as an analogy for something so trivial as object/relational mapping?" First of all, let's make a few facts clear. My father served in Vietnam. I have friends in Iraq right now. My best friend from high school served in the Navy during the first Iraq. I studied Vietnam--along with numerous other wars and conflicts--for several years as an International Relations major in college, focused specifically on military history. I have nothing but deep respect for all soldiers, of all nations, who go off to risk their lives in service to their country. I am appalled at how quickly governments (ours and others) chuck troops into a situation without thinking of the long-term strategy. I've spent more time studying war and its effects on the soldiers, the governments and the people than most people have spent watching TV. I am very aware of the ghosts I'm treading upon when I use the word "Vietnam", and quite frankly, folks, we as a nation have yet to come to terms with what happened there. Rambo films don't exorcise ghosts, much as we might want them to. POW-MIA flags don't, either. Please don't bring your ghosts in with you when approaching this subject, and I'll leave mine behind as well.
- "The Vietnam War is a bad analogy for O/R-M." Vietnam remains, for most Americans, the quintessential symbol for "bloody, ugly, unresolvable quagmire". And, as some have pointed out in comments on the blog post already, all analogies break down eventually, and this one is no different--as one commenter put it, nobody ever died from a bad O/R-M tool. (Though the day is not far off when such could occur, given the incredible spread of technology into all corners of our lives--it's not too hard to imagine a day when a patient dies because a doctor received incorrect information about a medical allergy from the enterprise system he/she uses to call up patient records.) That said, however, I assert that the analogy is appropriate, and relevant, for a variety of reasons. One, because just as development teams frequently believe that the object/relational problem is "solvable", so too did the US government believe that the Communist insurgency (which was more of an independence movement than a Communist movement, we've since realized) was "solvable" in South Indochina. Two, development teams frequently believe that with "just a little bit more work, we're almost there..." (wherever "there" is, in the minds of the architect or team lead), just as the US government frequently predicted that the Viet Cong were on the verge of defeat, just a few more troops and the war is over... Three, the analogy holds because even as team leads and architects approach this problem having been burned before, they still attempt solutions to the problem, just as the US still went into Vietnam even though many of its administrations' advisors believed the effort was a dead end and ill-fated.
- "You aren't being fair--after all, {insert-name-of-favorite-O/R-M-tool-here} doesn't suffer from that problem." Not yet, it doesn't. Or it does, but you just haven't run into it yet. Either answer is possible. And in the early years of the Vietnam conflict, we didn't suffer the problems that we commonly associate with the War--the poor morale, the rampant drug use among the military, the widespread unpopularity of the conflict back home, and so on. The danger here is on the far end of the Slippery Slope, not the near end.
- "You aren't being fair--when you balance the pros and cons..." Perhaps not. But as someone who's built three O/R-M's in his lifetime, and refuses to build another one because they all faced the same end, despite very different beginnings, I worry more about the Slippery Slope and where it leaves us in the end. If your team can stay perched on the side of the Slope that yields the most benefits, then more power to you; but I worry about the day when the new college intern says to himself, "You know, with a bit more investment, I bet we could add inheritance...."
- "Some languages do allow for varying numbers of fields." Actually, no, most of the languages cited as examples, including Ruby, don't allow for varying fields. Ruby has a feature called "open classes", in which you can change the definition of the class at any time, but it's still (very loosely) a class-based language. (The implementation of Ruby, from what I can see, seems to back this point--each object holds a pointer back to the class object it stems from, which means, at least to me, it's loosely class-based.) We can debate the semantics of this point for days, and frankly I welcome the discussion, but not in the context of this one. We can save that for another post/thread at another time.
- "OK, but where can I go to get more info about O/R-M so I don't fall into the quagmire?" Excellent question. Roy Osherove has started a community site about O/R-Ms, which I think holds promise for discussion on the topic. The JDO crowd had several resources available at JDOCentral, and there's lots of discussion about O/R-M (stretching back several years) on TheServerSide. BEA, with its acquisition of Solarmetric, now owns one of the better O/R-M tools on the market, Kodo, and they're likely to still have numerous white papers and such on the subject.
- "OK, but where can I go to get more info about object persistence tools?" Right now, the only one I have any faith in is the db4o project; in fact, I'm speaking at their first user/developer conference in London in a few weeks. I've used others (such as Versant) in the past, and frankly, wasn't incredibly impressed.
- "OK, but where can I go to get more info about these other languages/approaches?" Keep your eye on LINQ, for starters, as that's one of the first mainstream attempts to bring some of these ideas into traditional statically-typed object platforms. Scala and F# I already mentioned. Ruby is another place to spend some time, as there's a lot of features Ruby has that are trying to make their way into other languages. And, although I will likely gather some serious heat for saying this, Visual FoxPro may have some of the most interesting "best of both worlds" mojo in the entire language space on this subject.
- "Great post!" Thanks.
Make no mistake about it: I am deeply sympathetic to anyone who lost somebody--figuratively or literally--to the Vietnam conflict. I feel equally sympathetic to anyone who lost somebody in the Korean War (as my family did), World War Two, or even World War One before that. In fact, my sympathies go out to anyone lost in any of the conflicts across history and the globe in which men and women die for an ideal or symbol. It is an unfortunate statement about human affairs that we see war as the ultimate arbiter over power disputes between nations, but this is the world we live in now. If you don't care for that, then I encourage you to actively work to change it, regardless of your politics. I have far more respect for someone who virulently disagrees with my political viewpoints and actively promotes their own, than I do for those who agree with my politics and do nothing but complain.
Perhaps history will record Vietnam as America's greatest military failure, perhaps not. There is ample evidence to suggest that Vietnam will forever act as a check on American territorial expansionism (remember, Hawaii and Alaska gained statehood after World War Two), and more importantly, as a checkpoint to hold flagrant use of American military muscle in place. But be that as it may, the fact remains that Vietnam had an incalculable effect on American foreign policy and domestic agenda, and will continue to do so for the next several generations. And, as numerous examples from my own experience and others can attest, the use of O/R-M can have the same effect (relativistically speaking) on a development team's efforts.
.NET | C++ | Java/J2EE | Ruby
Tuesday, June 27, 2006 4:32:07 PM (Pacific Daylight Time, UTC-07:00)
 Monday, June 26, 2006
The Vietnam of Computer Science
(Two years ago, at Microsoft's TechEd in San Diego, I was involved in a conversation at an after-conference event with Harry Pierson and Clemens Vasters, and as is typical when the three of us get together, architectural topics were at the forefront of our discussions. A crowd gathered around us, and it turned into an impromptu birds-of-a-feather session. The subject of object/relational mapping technologies came up, and it was there and then that I first coined the phrase, "Object/relational mapping is the Vietnam of Computer Science". In the intervening time, I've received numerous requests to flesh out the discussion behind that statement, and given Microsoft's recent announcement regarding "entity support" in ADO.NET 3.0 and the acceptance of the Java Persistence API as a replacement for both EJB Entity Beans and JDO, it seemed time to do exactly that.)
No armed conflict in US history haunts the American military more than Vietnam. So many divergent elements coalesced to create the most decisive turning point in modern American history that it defies any layman's attempt to tease them apart. And yet, the story of Vietnam is fundamentally a simple one: The United States began a military project with simple yet unclear and conflicting goals, and quickly became enmeshed in a quagmire that not only brought down two governments (one legally, one through force of arms), but also deeply scarred American military doctrine for the next four decades (at least).
Although it may seem trite to say it, Object/Relational Mapping is the Vietnam of Computer Science. It represents a quagmire which starts well, gets more complicated as time passes, and before long entraps its users in a commitment that has no clear demarcation point, no clear win conditions, and no clear exit strategy.
History
PBS has a good synopsis of the war, but for those who are more interested in Computer Science than Political/Military History, the short version goes like this:
South Indochina, now known as Vietnam, Thailand, Laos and Cambodia, has a long history of struggle for autonomy. Before French colonial rule (which began in the mid-1800s), South Indochina wrestled for regional independence from China. During World War Two, the Japanese conquered the area, only to be later "liberated" by the Allies, leading France to resume its colonial rule (as did the British in their colonial territories elsewhere in Asia and India). Following WWII, however, the people of South Indochina, having thrown off one oppressor, extended their anti-occupation efforts to fight the French instead of the Japanese, and in 1954 the French capitulated, signing the Geneva Peace Accords to formally grant Vietnam its independence. Unfortunately, global pressures perverted the efforts somewhat, and instead of a lasting peace agreement a temporary solution was created, dividing the nation at the 17th parallel, creating two nations where formerly no such division existed. Elections were to be held in 1956 to reunify the country, but the US feared that too much power would be given to the Communist Party of Vietnam through these elections, and instead backed a counter-Communist state south of the 17th parallel and formed a series of multilateral agreements around it, such as SEATO. The new nation of South Vietnam was born, and its first (dubiously) elected leader was Ngo Dinh Diem, a staunch anti-Communist who almost immediately declared his country under Communist attack. The Eisenhower Administration remained supportive of the Diem government, but Diem's support among the people was almost nonexistent from the beginning.
By the time the US Democratic Party's John F Kennedy came to the White House, things were coming to a head in South Vietnam. Kennedy sent a team to Vietnam to research the conditions there and help formulate his strategy on the issue. In what's now known as the "December 1961 White Paper", an argument for an increase in military, technical and economic aid was presented, along with large-scale American "advisers" to help stabilize the Diem government and eliminate the National Liberation Front, dubbed the Viet Cong by the US. What's not as widely known, however, is that a number of Kennedy's advisers argued against that buildup, calling Vietnam a "dead-end alley".
Faced with two diametrically opposite paths, Kennedy, as was typical for his administration, chose a middle path: instead of either a massive commitment or a complete withdrawal, Kennedy instead chose to seek a limited settlement, sending aid but not large numbers of troops, a path that was almost doomed from the beginning. Through a series of strategic blunders, including the forced relocation of rural villagers (known as the Strategic Hamlet Program), Diem's support was so deeply eroded that Kennedy hesitatingly and haltingly supported a coup, during which Diem was killed. Three weeks later, Kennedy was also assassinated, throwing the domestic US political scene into turmoil as well. Ironically, the conflict begun by Kennedy would in fact later be associated most closely with his replacement.
Johnson's War
At the time of the Kennedy assassination, Vietnam had 16,000 American advisers in place, most of whom weren't involved in daily combat operations. Kennedy's Vice President and new replacement, however, Lyndon Baines Johnson, was not convinced that this path was leading to success, and came to believe that more aggressive action was needed. Seizing on a dubious incident in which Vietnamese patrol boats attacked American destroyers[1] in the Gulf of Tonkin, Johnson used pro-war sentiment in Congress to pass a resolution that gave him powers to conduct military action without an explicit declaration of war. To put it simply, Johnson wanted to fight this war "in cold blood": "This meant that America would go to war in Vietnam with the precision of a surgeon with little noticeable impact on domestic culture. A limited war called for limited mobilization of resources, material and human, and caused little disruption in everyday life in America." (source) In essence, it would be a war whose only impact would be felt by the Vietnamese--American life and society would go on without any notice of the events in Vietnam, thus leaving Johnson to pursue his first great love, his "Great Society", a domestic agenda designed to fix many of US society's ills, such as poverty[2]. History, of course, knows better, and--perhaps cruelly--calls the Vietnam conflict "Johnson's War".
Initially, it must be noted that Vietnam-as-disaster is a more recent perception; Americans polled as late as 1967 were convinced that the war was a good thing, that Communism needed to be stopped and that Vietnam, should it fall, would be the first of a series of nations to succumb to Communist subversion. This "Domino Theory" was a common refrain for American politics in the latter half of the 20th century. Concerns of this sort plagued American foreign policy ever since the Communists successfully or nearly-successfully subverted several European governments during the latter half of the 1940's, and then China in the 50's. (It must be noted that Eisenhower and John Foster Dulles, formulators of the theory, never included Vietnam in their ring of dominos that must be preserved, and in fact Eisenhower was surprisingly apathetic about Vietnam during some of his meetings with Kennedy during the White House transition.)
In 1968, however, the Vietnam experience turned significantly, as the North Vietnamese and Viet Cong launched the Tet Offensive, a campaign that put the lie to all of the reassurances of the American government that it was winning the war in Vietnam. Ironically, as had been the case for much of the war, the NVA/VC forces lost a substantial number of troops, far more than their American opponents, yet the Tet Offensive is widely considered by historians to be the breaking point of American will in the war. Following that, popular opinion turned on Johnson, and in a dramatic news conference, he announced that he would not seek re-election. Furthermore, he announced that he would seek a negotiated settlement with the Vietnamese.
Nixon's Promise
Unfortunately, the American negotiating position was seriously weakened by the very protests that had brought the Americans to the negotiating table in the first place; NVA/VC leadership recognized that the NVA/VC forces, despite staggering military losses that nearly broke them (several times), could simply continue to do as they were doing, and wring concessions from the Americans without offering any in return. Running on a platform that consisted mostly of a promise to "Get America out of Vietnam", Johnson's successor, Republican Richard Nixon, tried several tactics to bring pressure to the NVA/VC forces to bargain, including increased air-combat presence (such as the Christmas bombings and Operation Menu) and regular violations of nearby Laos and Cambodia, pursuing the line of supplies from North Vietnam to cells in South Vietnam. Nothing worked, however, and in 1973 Nixon's administration signed the Paris Peace Agreement, ending American involvement in that conflict. Two years later, South Vietnam had been overrun, and on April 30, 1975, Communist forces captured Saigon, the capital of South Vietnam, forcing the evacuation of the American embassy and the most memorable image of the war, that of streams of fleeing people seeking space on the Huey helicopter perched on the roof of the embassy.
War's End
The Second South Indochina War was over, America had experienced its most profound defeat ever in its history, and Vietnam became synonymous with "quagmire". Its impact on American culture was immeasurable, as it taught an entire generation of Americans to fear and mistrust their government, it taught American leaders to fear any amount of US military casualties, and brought the phrase "clear exit strategy" directly into the American political lexicon. Not until Ronald Reagan used the American military to "liberate" the small island nation of Grenada would American military intervention be considered a possible tool of diplomacy by American presidents, and even then only with great sensitivity to domestic concern, as Bill Clinton would find out during his peacekeeping missions to Somalia and Kosovo. In quantifiable terms, too, Vietnam's effects clearly fell short of Johnson's goal of a war in "cold blood". Final tally: 3 million Americans served in the war, 150,000 seriously wounded, 58,000 dead, and over 1,000 MIA, not to mention nearly a million NVA/Viet Cong troop casualties, 250,000 South Vietnamese casualties, and hundreds of thousands--if not millions, as some historians advocated--of civilian casualties.
Lessons of Vietnam
Vietnam presents an interesting problem to the student of military and political history--exactly what went wrong, when, and where? Obviously, the US government's unwillingness to admit its failures during the war makes for an easy scapegoat, but no government in the history of modern society has ever been entirely truthful with its population about its fortunes of war; one such example includes (but is not limited to) the same US government's careful censorship of activities during World War Two, fifty years earlier, known in American history as "the last 'good' war". It's also tempting to point to the lack of a military objective as the crucial failing point of Vietnam, but other non-military objectives have been successfully executed by the US and other governments without the kind of colossal failure accompanying Vietnam's story. Moreover, it's important to note that the US did, in fact, have a clear objective in what it wanted out of the conflict in South Indochina: to stop the fall of the South Vietnam government, and, barring that, the cessation of the "spread" of Communism. Was it the reluctance of the US government to unleash the military to its fullest capabilities, as General William Westmoreland always claimed? Certainly the failure in Vietnam was not a military one; the casualty figures make it clear that the US, by any other measure, was clearly winning.
So what were the principal failures in Vietnam? And, more importantly, what does all this have to do with O/R Mapping?
Vietnam and O/R mapping
In the case of Vietnam, the United States political and military apparatus was faced with a deadly form of the Law of Diminishing Returns. In the case of automated Object/Relational Mapping, it's the same concern--that early successes yield a commitment to use O/R-M in places where success becomes more elusive, and over time, isn't a success at all due to the overhead of time and energy required to support it through all possible use-cases. In essence, the biggest lesson of Vietnam--for any group, political or otherwise--is to know when to "cut bait and run", as fishermen say. Too often, as was the case in Vietnam, it is easy to justify further investment in a particular course of action by suggesting that abandoning that course somehow invalidates all the work--or, in Vietnam's case, the lives of American soldiers--that have already been paid. Phrases like "We've gone this far, surely we can see this thing through" and "To back out now is to throw away everything we've sacrificed up until this point" become commonplace. At least during the later, deeply bitter years of the second half of Vietnam, patriotism itself came into question: if you didn't support the war, you were clearly a traitor, a Communist, obviously "unAmerican", disrespectful of all American veterans of any war fought on any soil for whatever reason, and you probably kicked your dog to boot. (It didn't help the protestors' cause that they blamed the soldiers for the war, holding them accountable--sometimes personally--for the decisions made by military and political leaders, most of whom neither the soldiers nor the protestors had ever met.)
Recognizing that all analogies fail eventually, and that the subject of Vietnam is deeper than this essay can examine, there are still lessons to be learned here in an entirely different arena. One of the key lessons of Vietnam was the danger of what's colloquially called "the Slippery Slope": that a given course of action might yield some early success, yet further investment into that action yields decreasingly commensurate results and increasingly dangerous obstacles whose only solution appears to be greater and greater commitment of resources and/or action. Some have called this "the Drug Trap", after the way pharmaceuticals (legal or illegal) can have diminished effect after prolonged use, requiring upped dosage in order to yield the same results. Others call this "the Last Mile Problem": that as one nears the end of a problem, it becomes increasingly difficult in cost terms (both monetary and abstract) to find a 100% complete solution. All are basically speaking of the same thing--the difficulty of finding an answer that allows our hero to "finish off" the problem in question, completely and satisfactorily.
We begin the analysis of Object/Relational Mapping--and its relationship to the Second South Indochina War--by examining the reasons for it in the first place. What drives developers away from using traditional relational tools to access a relational database, and to prefer instead tools such as O/R-M's?
The Object-Relational Impedance Mismatch
To say that objects and relational data sets are somehow constructed differently is typically not a surprise to any developer who's ever used both; except in extremely simplistic situations, it becomes fairly obvious to recognize that the way in which a relational data store is designed is subtly--and yet profoundly--different than how an object system is designed.
Object systems are typically characterized by four basic components: identity, state, behavior and encapsulation. Identity is an implicit concept in most O-O languages, in that a given object has a unique identity that is distinct from its state (the value of its internal fields)--two objects with the same state are still separate and distinct objects, despite being bit-for-bit mirrors of one another. This is the "identity vs. equivalence" discussion that occurs in languages like C++, C# or Java, where developers must distinguish between "a == b" and "a.equals(b)". The behavior of an object is fairly easy to see, a collection of operations clients can invoke to manipulate, examine, or interact with objects in some fashion. (This is what distinguishes objects from passive data structures in a procedural language like C.) Encapsulation is a key detail, preventing outside parties from manipulating internal object details, thus providing evolutionary capabilities to the object's interface to clients[3]. From this we can derive more interesting concepts, such as type, a formal declaration of object state and behavior, association, allowing types to reference one another through a lightweight reference rather than complete by-value ownership (sometimes called composition), inheritance, the ability to relate one type to another such that the relating type incorporates all of the related type's state and behavior as part of its own, and polymorphism, the ability to substitute an object in where a different type is expected.
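To make the "identity vs. equivalence" point concrete, here is a minimal Java sketch. The Point class and the helper names are hypothetical, invented purely for illustration; the only thing taken from the discussion above is the difference between "a == b" and "a.equals(b)".

```java
import java.util.Objects;

class IdentityDemo {
    // Hypothetical value-like class, used only to illustrate the distinction.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Equivalence: two Points are "equal" when their state matches.
        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Point)) return false;
            Point p = (Point) other;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // Identity: do both references point at the very same object?
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    // Equivalence: do the two objects carry the same state?
    static boolean sameState(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(sameObject(a, b)); // false: two distinct objects
        System.out.println(sameState(a, b));  // true: identical state
        System.out.println(sameObject(a, a)); // true: one object, one identity
    }
}
```

Keep this in mind for what follows: the relational side has no counterpart to the first comparison, only to the second.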
Relational systems describe a form of knowledge storage and retrieval based on predicate logic and truth statements. In essence, each row within a table is a declaration about a fact in the world, and SQL allows for operator-efficient data retrieval of those facts using predicate logic to create inferences from those facts. [Date04] and [Fussell] define the relational model as characterized by relation, attribute, tuple, relation value and relation variable. A relation is, at its heart, a truth predicate about the world, a statement of facts (attributes) that provide meaning to the predicate. For example, we may define the relation "PERSON" as {SSN, Name, City}, which states that "there exists a PERSON with a Social Security Number SSN who lives in City and is called Name". Note that in a relation, attribute ordering is entirely unspecified. A tuple is a truth statement within the context of a relation, a set of attribute values that match the required set of attributes in the relation, such as "{PERSON SSN='123-45-6789' Name='Catherine Kennedy' City='Seattle'}". Note that two tuples are considered identical if their relation and attribute values are also identical. A relation value, then, is a combination of a relation and a set of tuples that match that relation, and a relation variable is, like most variables, a placeholder for a given relation, but can change value over time. Thus, a relation variable People can be written to hold the relation {PERSON}, and consist of the relation value { {PERSON SSN='123-45-6789' Name='Catherine Kennedy' City='Seattle'},
{PERSON SSN='321-54-9876' Name='Charlotte Neward' City='Redmond'},
{PERSON SSN='213-45-6978' Name='Cathi Gero' City='Redmond'} }
These are commonly referred to as tables (relation variable), rows (tuples), columns (attributes), and a collection of relation variables as a database. These basic element types can be combined against one another using a set of operators (described in some detail in Chapter 7 of [Date04]): restrict, project, product, join, divide, union, intersection and difference, and these form the basis of the format and approach to SQL, the universally accepted language for interacting with a relational system from operator consoles or programming languages. The use of these operators allows for the creation of derived relation values, relations that are calculated from other relation values in the database--for example, we can create a relation value that demonstrates the number of people living in individual cities by making use of the project and restrict operators across the People relation variable defined above.
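For readers who think better in code than in predicate logic, the following is a rough Java sketch of the People relation value and two of the operators named above. It is an approximation under stated assumptions, not how any database implements them: tuples are modeled as unordered attribute-to-value maps, a relation value as a set of such maps, and the restrict and project helpers are names I made up for the example.

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class RelationDemo {
    // The People relation value: a set of tuples, each an unordered
    // attribute-name -> value map (attribute ordering carries no meaning).
    static final Set<Map<String, String>> PEOPLE = Set.of(
        Map.of("SSN", "123-45-6789", "Name", "Catherine Kennedy", "City", "Seattle"),
        Map.of("SSN", "321-54-9876", "Name", "Charlotte Neward", "City", "Redmond"),
        Map.of("SSN", "213-45-6978", "Name", "Cathi Gero", "City", "Redmond"));

    // restrict: keep only the tuples whose attribute has the given value.
    static Set<Map<String, String>> restrict(
            Set<Map<String, String>> relation, String attribute, String value) {
        return relation.stream()
            .filter(tuple -> value.equals(tuple.get(attribute)))
            .collect(Collectors.toSet());
    }

    // project: keep a single attribute; identical results collapse into one
    // tuple because tuples have no identity beyond their values.
    static Set<Map<String, String>> project(
            Set<Map<String, String>> relation, String attribute) {
        return relation.stream()
            .map(tuple -> Map.of(attribute, tuple.get(attribute)))
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Two people live in Redmond.
        System.out.println(restrict(PEOPLE, "City", "Redmond").size()); // 2
        // Three people, but only two distinct cities survive the projection.
        System.out.println(project(PEOPLE, "City").size()); // 2
        // Two tuples with the same attribute values are the same tuple,
        // regardless of the order in which the attributes were written.
        System.out.println(Map.of("City", "Seattle", "Name", "X")
            .equals(Map.of("Name", "X", "City", "Seattle"))); // true
    }
}
```

Note that the projection silently drops the duplicate Redmond tuple. Nothing in the object world behaves that way by default, which is a first hint of the mismatch.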
Already, it's fairly clear to see that there are distinct differences between how the relational world and object world view the "proper" design of a system, and more will become apparent as time progresses. It's important to note, however, that so long as programmers prefer to use object-oriented programming languages to access relational data stores, there will always be some kind of object-relational mapping taking place--the two models are simply too different to bridge silently. (Arguably, the same is true of object-oriented and procedural programming, but that's another argument for another day.) O/R mappings can take place in a variety of forms, the easiest of which to recognize is the automated O/R mapping tool, such as TopLink, Hibernate / NHibernate, or Gentle.NET. Another form of mapping is the hand-coded one, in which programmers use relational-oriented tools, such as JDBC or ADO.NET, to access relational data and extract it into a form more pleasing to object-minded developers "by hand". A third is to simply accept the shape of the relational data as "the" model from which to operate, and slave the objects around it to this approach; this is also known in the patterns lexicon as Table Data Gateway [PEAA, 144] or Row Data Gateway [PEAA, 152]; many data-access layers in both Java and .NET use this approach and combine it with code-generation to simplify the development of that layer. Sometimes we build objects around the relational/table model, put some additional behavior around it, and call it Active Record [PEAA, 160].
In truth, this basic approach--to slave one model into the terms and approach of the other--has been the traditional answer to the impedance mismatch, effectively "solving" the problem by ignoring one half of it. Unfortunately, most development efforts, like the Kennedy Administration, aren't willing to see this through to its logical conclusion with a wholesale commitment to one approach over the other. For example, while most development teams would be happy to adopt an "objects-only" approach, doing so at the storage level implies the use of an Object Oriented DataBase Management System (OODBMS), a topic that frequently has no traction within upper management or the corporate data management team. The opposite approach--a "relational-only" approach--is almost nonsensical to consider, given the technology of the day at the time this was written[4].
Given that it's impossible, then, to "unleash the objects to their fullest capabilities", as General Westmoreland might call it, we're left with some kind of hybrid object-to-relational mapping approach, preferably one that's automated as much as possible, so that developers can focus on their Domain Model, rather than on the details of the object-to-table(s) mapping. And here, unfortunately, is where the potential quagmire begins.
The Object-to-Table Mapping Problem
One of the first and most easily-recognizable problems in using objects as a front-end to a relational data store is that of how to map classes to tables. At first, it seems a fairly straightforward exercise--tables map to types, columns to fields. Even the field types appear to line up directly against the relational column types, at least to a fairly isomorphic degree: VARCHARs to Strings, INTEGERs to ints, and so on. So it makes sense that for any given class defined in the system, a corresponding table--likely to be of the same or closely related name--is defined to go with it. Or, perhaps, if the object code is being written to an already existing schema, then the class maps to the table.
But as time progresses, it's only natural that a well-trained object-oriented developer will seek to leverage inheritance in the object system, and seek ways to do the same in the relational model. Unfortunately, the relational model does not support any sort of polymorphism or IS-A kind of relation, and so developers eventually find themselves adopting one of three possible options to map inheritance into the relational world: table-per-class, table-per-concrete-class, or table-per-class-family. Each of them carries potentially significant drawbacks.
The table-per-class approach is perhaps the most easily understood, for it seeks to minimize the "distance" between the object model and the relational model; each class in the inheritance hierarchy gets its own relational table, and objects of derived types are stitched together from relational JOINs across the various inheritance-based tables. So, for example, if an object model has the base class Person, with Student derived from Person and GraduateStudent derived from Student, then there will be three tables required to hold this model, PERSON, STUDENT, and GRADUATESTUDENT, each holding the fields corresponding to the class of the same name. Relating these tables together, however, requires each to have an independent primary key (one whose value is not actually stored in the object entity) so that each derived class can have a foreign key relation to its superclass's table. The reason for this is clear: a GraduateStudent object, by virtue of its IS-A relationship to Student and Person, is a collection of all three sets of state, and the distinction between the classes is largely removed by the time an object of this type is created--in both Java and .NET, for example, the object itself is a chunk of memory that holds the instance fields defined in all of its classes and superclasses, along with a pointer to the table of methods defined by that same hierarchy. This means that when querying for a particular instance at the relational level, JOINs across all three tables must be made in order to bring all of the object's state into the object program's working memory.
Actually, it gets worse than that--if the object hierarchy continues to grow, say to include Professor, Staff, Undergrad (inherits from Student), and a whole hierarchy of AdjunctEmployees (inheriting from Staff), and the program wants to find all Persons whose last name is Smith, then JOINs must be done for every derived class in the system, since the semantics of "find all Persons" means that the query must seek data on the PERSON table, but then do an expensive set of JOINs to bring in the rest of the data from across the rest of the database, pulling in the PROFESSOR table to fetch the rest of the data, not to mention the UNDERGRAD, ADJUNCTEMPLOYEE, STAFF, and other tables. Considering that JOINs are among the most expensive expressions in RDBMS queries, this is clearly not something to undertake lightly.
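To make the cost of the table-per-class scheme concrete, here is a minimal sketch of the query an O/R layer might generate to load a single GraduateStudent. The shared key column name ("ID") and the helper method are assumptions made purely for illustration, not the output of any particular mapping tool; the point is only that each level of the hierarchy adds another table to the JOIN chain.

```java
import java.util.Arrays;
import java.util.List;

public class TablePerClassQuery {
    // Builds the SELECT needed to reassemble one object from its per-class tables.
    // tablesBaseFirst is ordered from the base class's table to the most-derived one.
    // The key column shared by all tables is a hypothetical naming choice.
    static String selectFor(List<String> tablesBaseFirst, String keyColumn) {
        StringBuilder sql = new StringBuilder("SELECT * FROM " + tablesBaseFirst.get(0));
        for (int i = 1; i < tablesBaseFirst.size(); i++) {
            String derived = tablesBaseFirst.get(i);
            String parent = tablesBaseFirst.get(i - 1);
            // Each derived table is joined to its superclass's table on the shared key.
            sql.append(" JOIN ").append(derived)
               .append(" ON ").append(derived).append('.').append(keyColumn)
               .append(" = ").append(parent).append('.').append(keyColumn);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            selectFor(Arrays.asList("PERSON", "STUDENT", "GRADUATESTUDENT"), "ID"));
    }
}
```

A polymorphic "find all Persons" query would be worse still, since it would need outer joins against every table in the hierarchy rather than one chain.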
As a result, developers typically adopt one of the other two approaches, more complex in outlook but more efficient when dealing with relational storage: they either create a table per concrete (most-derived) class, preferring to adopt denormalization and its costs, or else they create a single table for the entire hierarchy, often in either case creating a discriminator column to indicate to which class each row in the table belongs. (Various hybrids of these schemes are also possible, but typically don't create results that are significantly different from these two.) Unfortunately, the denormalization costs are often significant for a large volume of data, and/or the table(s) will contain significant amounts of empty columns, which will need NULLability constraints on all columns, eliminating the powerful integrity constraints offered by an RDBMS.
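As a rough illustration of the single-table scheme, the sketch below materializes an object from one row of a table holding the whole hierarchy. The column names (TYPE, LAST_NAME, MAJOR, DEPARTMENT) and the use of a Map as a stand-in for a result-set row are assumptions for the example only.

```java
import java.util.HashMap;
import java.util.Map;

public class SingleTableHierarchy {
    static class Person { String lastName; }
    static class Student extends Person { String major; }
    static class Professor extends Person { String department; }

    // Turns one row of a hypothetical hierarchy-wide PERSON table into an object.
    // The discriminator column decides which class the row belongs to.
    static Person fromRow(Map<String, String> row) {
        String type = row.get("TYPE"); // discriminator column (name is an assumption)
        Person p;
        if ("STUDENT".equals(type)) {
            Student s = new Student();
            s.major = row.get("MAJOR");
            p = s;
        } else if ("PROFESSOR".equals(type)) {
            Professor prof = new Professor();
            prof.department = row.get("DEPARTMENT");
            p = prof;
        } else {
            p = new Person();
        }
        p.lastName = row.get("LAST_NAME");
        return p;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("TYPE", "STUDENT");
        row.put("LAST_NAME", "Smith");
        row.put("MAJOR", "History");
        row.put("DEPARTMENT", null); // meaningless for a Student, so the column must allow NULL
        Person p = fromRow(row);
        System.out.println(p.getClass().getSimpleName() + " " + p.lastName);
    }
}
```

Note that DEPARTMENT cannot be declared NOT NULL even if every Professor must have one, which is exactly the loss of integrity constraints described above.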
Inheritance mapping isn't the end of it; associations between objects, the typical 1:n or m:n cardinality associations so commonly used in both SQL and UML, are handled entirely differently: in object systems, association is unidirectional, from the associator to the associatee (meaning the associated object(s) have no idea they are in fact associated unless an explicit bidirectional association is established), whereas in relational systems the association is actually reversed, from the associatee to the associator (via foreign key columns). This turns out to be surprisingly important, as it means that for m:n associations, a third table must be used to store the actual relationship between associator and associatee, and even for the simpler 1:n relationships the associator has no inherent knowledge of the relations to which it associates--discovering that data requires a JOIN against any or all associated tables at some point. (When to actually retrieve that data is a subject of some debate--see the Load-Time Paradox, below).
The Schema-Ownership Conflict
Discussions of inheritance-to-table and association mapping schemes also reveal a basic flaw: At heart, many object-relational mapping tools assume that the schema is something that can be defined according to schemes that help optimize the O/R-M's queries against the relational data. But this overlooks a basic problem, that often the database schema itself is not under the direct control of developers, but instead is owned by another group within the company, typically the database administration (DBA) group. To whom does responsibility for designing the database--and deciding when schema changes are permissible--belong?
In many cases, developers begin a new project with a "clean slate", an empty relational database whose schema is theirs to define as they see fit. But, soon after the project has shipped (sometimes even earlier than that, due to political and/or "turf war" issues), it becomes apparent that the developers' ownership of the schema is temporary at best--various departments begin clamoring for reports against the database, DBAs are held accountable for the performance of the database thereby giving them cause to call for "refactoring" and denormalization of the data, and other development teams may start inquiring about how they might make use of the data stored therein. Before too long, the schema must be "frozen", thereby potentially creating a barrier to object model refactoring (see The Coupling Concern, below). In addition, these other teams will expect to see a relational model defined in relational terms, not one which supports an entirely orthogonal form of persistence--for example, the "discriminator" column from the Object-to-Table Mapping Problem will represent difficulties, and arguably be all but unusable, to relational report generators such as Crystal Reports. Unless developers are willing to write all reports (and their UIs, and their printing code, and their ad-hoc capabilities...) by hand, this is usually going to be an unacceptable state of affairs.
(To be fair, this is not so much a technical problem as it is a political problem, but it still represents a serious problem regardless of its source--or solution. And as such, it still represents an impediment to an object/relational mapping solution.)
The Dual-Schema Problem
A related issue to the question of schema ownership is that in an O/R-M solution, the metadata to the system is held fundamentally in two different places: once in the database schema, and once in the object model (another schema, if you will, expressed in Java or C# instead of DDL). Updates or refactorings to one will likely require similar updates or refactorings to the other. Refactoring code to match database schema changes is widely considered to be the easier of the two--refactoring the database frequently requires some kind of migration and/or adaptation of data already within the database, where code has no such requirement. (Objects, at least in this discussion, are ephemeral in-memory instances that will disappear once the process holding them terminates. If the objects are stored in some kind of object form that can persist across process execution--such as serialized object instances stored to disk--then refactoring objects becomes equally problematic.)
More importantly, while it's not uncommon for code to be deployed specifically to a single application, frequently database instances are used by more than one application, and it's frequently unacceptable to business to trigger a company-wide refactoring of code simply because a refactoring on one application requires a similar database-driven refactoring. As a result, as the system grows over time, there will be increasing pressure on the developers to "tie off" the object model from the database schema, such that schema changes won't require similar object model refactorings, and vice versa. In some cases, where the O/R-M doesn't permit such disconnection, an entirely private database instance may have to be deployed, with the exact schema the O/R-M-based solution was built against, creating yet another silo of data in an IT environment where pressure is building to reduce such silos.
Entity Identity Issues
As if these problems weren't enough, we then walk into another problem, that of identity of objects and relations. As noted above, object systems use an implicit sense of identity, typically based on the object's location in memory (the ubiquitous this pointer); alternatively, this is sometimes referred to as an OID (Object IDentifier), usually in systems which don't directly expose memory locations, such as the object database (where an in-memory pointer is pretty useless as an identifier outside of the database process). In a relational model, however, identity is implicit in the state itself--two rows with the exact same state are typically considered a relational data corruption, as the same fact asserted twice is redundant and counterproductive. To be fair, we should be a bit more explicit here; a relational system can, in fact, permit duplicate tuples (as described above), but this is often disallowed by explicit relational constraints, such as PRIMARY KEY constraints. In those situations where duplicate values are allowed, there is no way for a relational system to determine which of the two duplicate rows is being retrieved--there is no implicit sense of identity to the relation except that offered by its attributes. The same is not true of object systems, where two objects that contain precisely identical bit patterns in two different locations of memory are in fact separate objects. (This is the reason for the distinction between "==" and ".equals()" in Java, or "==" and ".Equals()" in C#.) The implication here is simple: if the two systems are going to agree on the sense of identity, the relational system must offer some kind of unique identity concept (usually an auto-incrementing integer column) to match that of the notion of object identity.
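The difference between object identity and state-based equality is easy to see in a few lines of Java. The Person class below is a hypothetical stand-in written for this sketch; two instances with identical state compare equal by value yet remain distinct objects, which is precisely the notion a relation has no counterpart for.

```java
import java.util.Objects;

public class IdentityVsEquality {
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        // Value equality: two Persons are "equal" when their state matches,
        // which is how a relation would compare two tuples.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(firstName, other.firstName)
                && Objects.equals(lastName, other.lastName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(firstName, lastName);
        }
    }

    public static void main(String[] args) {
        Person a = new Person("John", "Smith");
        Person b = new Person("John", "Smith");
        System.out.println(a == b);      // false: two separate objects in memory
        System.out.println(a.equals(b)); // true: identical state
    }
}
```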
This causes some serious concerns regarding automated O/R systems, because the sense of identity is entirely different--if two separate user sessions interact with the same relation in storage, the relational database system's concurrency systems kick in and ensure some form of concurrent access, typically via the transactional metaphor (ACID). If an O/R system retrieves a relation out of storage (essentially forming a "view" over the data), we now have a second source of data identity, one in the database (protected by the aforementioned transactional scheme), and one in the in-memory object representation of that data, which has no consistent transactional support aside from that built into the language (such as the monitors concept in Java and .NET) or libraries (such as System.Transactions in .NET 2.0), either of which can be--and unfortunately frequently are--easily ignored by developers. Managing isolation and concurrency is not an easy problem to solve, and unfortunately the languages and platforms commonly available to developers aren't yet as consistent or flexible as the database transaction metaphor.
What complicates this problem further is that many O/R systems introduce significant caching support into the O/R layer (usually in an attempt to improve performance and avoid round-trips to the database), and this in turn presents some problems, particularly if the caching system is not a write-through cache: when does the actual "flush" to the database take place, and what does this say about transactional integrity if the application code believes the write to have occurred when in fact it hasn't? This problem in turn only compounds when the O/R system runs in multiple processes in front of the database engine, commonly found in clustered or farmed application server scenarios. Now the data identity is spread across n+1 locations, n being the number of application server nodes, and 1 being the database itself. Each node must somehow signal its intent to do an update to the other nodes in order to obtain some kind of concurrency construct to prevent simultaneous access (by another instance of the same session, or by an instance of a different session accessing the same data), which takes time, killing performance. Even in the case of a read-only cache, updates to the data store must somehow be signaled to the caches running in the application server nodes, requiring server-to-client communication originating from the database; support for this is not well-understood or documented in the current crop of modern relational databases.
The Data Retrieval Mechanism Concern
So once the entity is stored within the database, how exactly do we retrieve it? In all honesty, a purely object-oriented approach would make use of object approaches for retrieval, ideally using constructor-style syntax identifying the object(s) desired, but unfortunately constructor syntax isn't generic enough to allow for something that flexible; in particular, it lacks the ability to initialize a collection of objects, and queries frequently need to return a collection, rather than just a single entity. (Multiple trips to the database to fetch entities individually is generally considered too wasteful, in both latency and bandwidth, to consider credibly as an alternative--see the Load-Time Paradox, below, for more.) As a result, we typically end up with one of Query-By-Example (QBE), Query-By-API (QBA), or Query-By-Language (QBL) approaches.
A QBE approach states that you fill out an object template of the type of object you're looking for, with fields in the object set to a particular value to use as part of the query-filtration process. So, for example, if you're querying the Person object/table for people with the last name of Smith, you set up the query like so:
Person p = new Person(); // assumes all fields are set to null by default
p.LastName = "Smith";
ObjectCollection oc = QueryExecutor.execute(p);
The problem with the QBE approach is obvious: while it's perfectly sufficient for simple queries, it's not nearly expressive enough to support the more complex style of query that frequently we need to execute--"find all Persons named Smith or Cromwell" and "find all Persons NOT named Smith" are two examples. While it's not impossible to build QBE approaches that handle this (and more complex scenarios), it definitely complicates the API significantly. More importantly, it also forces the domain objects into an uncomfortable position--they must support nullable fields/properties, which may be a violation of the domain rules the object would otherwise seek to support--a Person without a name isn't a very useful object, in many scenarios, yet this is exactly what a QBE approach will demand of domain objects stored within it. (Practitioners of QBE will often argue that it's not unreasonable for an object's implementation to take this into account, but again this is neither easy nor frequently done.)
As a result, usually the second step is to have the object system support a "Query-By-API" approach, in which queries are constructed by query objects, usually something of the form:
Query q = new Query();
q.From("PERSON").Where(
new EqualsCriteria("PERSON.LAST_NAME", "Smith"));
ObjectCollection oc = QueryExecutor.execute(q);
Here, the query is not based on an empty "template" of the object to be retrieved, but off of a set of "query objects" that are used together to define a Command-style object for executing against the database. Multiple criteria are connected using some kind of binary construct, usually "And" and "Or" objects, each of which contains unique Criteria objects to test against. Additional filtration/manipulation objects can be tagged onto the end, usually by appending calls such as "OrderBy(field-name)" or "GroupBy(field-name)". In some cases, these method calls are actually objects constructed by the programmer and strung together explicitly.
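As a minimal sketch of how such query objects compose, the following builds the "Smith or Cromwell" query that QBE struggles with. The Criteria interface and the eq/and/or/not helpers are hypothetical names invented for this example, not the API of any real O/R tool, and a production layer would bind parameters rather than inline values into the SQL text.

```java
public class QueryByApiSketch {
    // A criterion knows how to render itself as a fragment of a WHERE clause.
    interface Criteria {
        String toSql();
    }

    static Criteria eq(String column, String value) {
        // Inlining the (quote-escaped) value keeps the sketch short;
        // real code should use bound parameters instead.
        return () -> column + " = '" + value.replace("'", "''") + "'";
    }

    static Criteria and(Criteria left, Criteria right) {
        return () -> "(" + left.toSql() + " AND " + right.toSql() + ")";
    }

    static Criteria or(Criteria left, Criteria right) {
        return () -> "(" + left.toSql() + " OR " + right.toSql() + ")";
    }

    static Criteria not(Criteria inner) {
        return () -> "NOT (" + inner.toSql() + ")";
    }

    static String select(String table, Criteria where, String orderBy) {
        return "SELECT * FROM " + table + " WHERE " + where.toSql() + " ORDER BY " + orderBy;
    }

    public static void main(String[] args) {
        Criteria smithOrCromwell = or(
            eq("PERSON.LAST_NAME", "Smith"),
            eq("PERSON.LAST_NAME", "Cromwell"));
        System.out.println(select("PERSON", smithOrCromwell, "PERSON.LAST_NAME"));
    }
}
```

Even in this toy form, the object construction is noticeably longer than the SQL string it produces.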
Developers quickly note that the above approach is (generally) much more verbose than the traditional SQL approach, and certain styles of queries (particularly the more unconventional joins, such as outer joins) are much more difficult--if not impossible--to represent in the QBA approach.
On top of this, we have a more subtle problem, that of the reliance on developers' discipline: both the table name ("PERSON") and the column name in the criteria ("PERSON.LAST_NAME") are standard strings, taken as-is and fed to the system at runtime with no sort of validity-checking until then. This presents a classic problem in programming, that of the "fat-finger" error, where a developer doesn't actually query the "PERSON" table, but the "PRESON" table instead. While a quick unit-test against a live database instance will reveal the error during unit-testing, this presumes two facts--that the developers are religious about adopting unit-testing, and that the unit-tests are run against database instances. While the former is slowly becoming more of a guarantee as more and more developers become "test-infected" (borrowing Gamma's and Beck's choice of terminology), the latter is still entirely open to discussion and interpretation, owing to the fact that setting-up and tearing-down the database instance appropriately for unit tests is still difficult to do in a database. (While there are a variety of ways to circumvent this problem, few of them seem to be in use.)
We're also faced with the basic problem that greater awareness of the logical--or physical--data representation is required on the part of the developer--instead of simply focusing on how the objects are related to one another (through simple associations such as arrays or collection instances), the developer must now have greater awareness of the form in which the objects are stored, leaving the system somewhat vulnerable to database schema changes. This is sometimes mitigated by a hybrid approach between the two, whereby the system will take responsibility for interpreting the associations, leaving the developer to write something like this:
Query q = new Query();
Field lastNameFieldFromPerson = Person.class.getDeclaredField("lastName");
q.From(Person.class).Where(new EqualsCriteria(lastNameFieldFromPerson, "Smith"));
ObjectCollection oc = QueryExecutor.execute(q);
This solves part of the schema-awareness problem and the "fat-fingering" problem, but it still leaves the developer vulnerable to the concerns over verbosity and still doesn't address the complexity of putting together a more complex query, such as a multi-table (or multi-class, if you will) query joined on several criteria in a variety of ways.
So, then, the next task is to create a "Query-By-Language" approach, in which a new language, similar to SQL but "better" somehow, is written to support the kind of complex and powerful queries normally supported by SQL; OQL and HQL are two examples of this. The problem here is that frequently these languages are a subset of SQL and thus don't offer the full power of SQL. More importantly, the O/R layer has now lost an important "selling point", that of the "objects and only objects" mantra that begat it in the first place; using a SQL-like language is almost just like using SQL itself, so how can it be more "objectish"? While developers may not need to be aware of the physical schema of the data model (the query language interpreter/executor can do the mapping discussed earlier), developers will need to be aware of how object associations and properties are represented within the language, and the subset of the object's capabilities within the query language--for example, is it possible to write something like this?
SELECT Person p1, Person p2
FROM Person
WHERE p1.getSpouse() == null
AND p2.getSpouse() == null
AND p1.isThisAnAcceptableSpouse(p2)
AND p2.isThisAnAcceptableSpouse(p1);
In other words, scan through the database and find all single people who find each other acceptable. While the "isThisAnAcceptableSpouse" method is clearly a method that belongs on the Person class (each Person instance may have its own criteria by which to judge the acceptability of another single--are they blonde, brunette, or redhead, are they making more than $100,000 a year, and so on), it's not clear if executing this method is possible in the query language, nor is it clear if it should be. Even for the most trivial implementations, a serious performance hit will be likely, particularly if the O/R layer must turn the relational column data into objects in order to execute the query. In addition, we have no guarantees that the developer wrote this method to be at all efficient, and no ways to enforce any sort of performance-aware implementation.
(Critics will argue that this is a workable problem, proposing two possible solutions. One is to encode the preference data in a separate table and make that part of the query; this will result in a hideously complicated query that will take several pages in length and likely require a SQL expert to untangle later when new preferential criteria want to be added. The other is to encode this "acceptability" implementation in a stored procedure within the database, which now removes code entirely from the object model and leaves us without an "object"-based solution whatsoever--acceptable, but only if you accept the premise that not all implementation can rest inside the object model itself, which rejects the "objects and nothing but objects" premise with which many O/R advocates open their arguments.)
The Partial-Object Problem and the Load-Time Paradox
It has long been known that network traversal, such as that done when making a traditional SQL request, takes a significant amount of time to process. (Rough benchmarks have placed this value at anywhere from three to five orders of magnitude, compared against a simple method call on either the Java or .NET platform[5]; as a rough analogy, if it takes you twenty minutes to drive to work in the morning, and we call that the time required to execute a local method call, four orders of magnitude beyond that is 200,000 minutes, or roughly 139 days, one way.) This cost is clearly non-trivial, so as a result, developers look for ways to minimize this cost by optimizing the number of round trips and data retrieved.
In SQL, this optimization is achieved by carefully structuring the SQL request, making sure to retrieve only the columns and/or tables desired, rather than entire tables or sets of tables. For example, when constructing a traditional drill-down user interface, the developer presents a summary display of all the records from which the user can select one, and once selected, the developer then displays the complete set of data for that particular record. Given that we wish to do a drill-down of the Persons relational type described earlier, for example, the two queries to do so would be, in order (assuming the first one is selected):
SELECT id, first_name, last_name FROM person;
SELECT * FROM person WHERE id = 1;
In particular, take notice that only the data desired at each stage of the process is retrieved--in the first query, the necessary summary information and identifier (for the subsequent query, in case first and last name wouldn't be sufficient to identify the person directly), and in the second, the remainder of the data to display. In fact, most SQL experts will eschew the "*" wildcard column syntax, preferring instead to name each column in the query, both for performance and maintenance reasons--performance, since the database will better optimize the query, and maintenance, because there will be less chance of unnecessary columns being returned as DBAs or developers evolve and/or refactor the database table(s) involved. This notion of being able to return a part of a table (though still in relational form, which is important for reasons of closure, described above) is fundamental to the ability to optimize these queries this way--most queries will, in fact, only require a portion of the complete relation.
This presents a problem for most, if not all, object/relational mapping layers: the goal of any O/R is to enable the developer to see "nothing but objects", and yet the O/R layer cannot tell, from one request to another, how the objects returned by the query will be used. For example, it is entirely feasible that most developers will want to write something along the lines of:
Person[] all = QueryManager.execute(...);
Person selected = DisplayPersonsForSelection(all);
DisplayPersonData(selected);
Meaning, in other words, that once the Person to be displayed has been chosen from the array of Persons, no further retrieval action is necessary--after all, you have your object, what more should be necessary?
The problem here is that the data to be displayed in the first Display...() call is not the complete Person, but a subset of that data; here we face our first problem, in that an object-oriented system like C# or Java cannot return just "parts" of an object--an object is an object, and if the Person object consists of 12 fields, then all 12 fields will be present in every Person returned. This means that the system faces one of three uncomfortable choices: one, require that Person objects must be able to accommodate "nullable" fields, regardless of the domain restrictions against that; two, return the Person completely filled out with all the data comprising a Person object; or three, provide some kind of on-demand load that will obtain those fields if and when the developer accesses those fields, even indirectly, perhaps through a method call.
(Note that some object-based languages, such as ECMAScript, view objects differently than class-based languages, such as Java or C# or C++, and as a result, it is entirely possible to return objects which contain varying numbers of fields. That said, however, few languages possess such an approach, not even everybody's favorite dynamic-language poster child, Ruby, and until such languages become widespread, such discussion remains outside the realm of this essay.)
For most O/R layers, this means that objects and/or fields of objects must be retrieved in a lazy-loaded manner, obtaining the field data on demand, because retrieving all of the fields of all of the Person objects/relations would "clearly" be a huge waste of bandwidth for this particular scenario. Typically, the object's entire set of fields will be retrieved when any field not-yet-returned is accessed. (This approach is preferred to a field-by-field approach because there's less chance of the "N+1 query problem", in which retrieving all the data from an object requires 1 query to retrieve the primary key + N queries to retrieve each field from the table as necessary. The field-by-field approach minimizes the bandwidth consumed to retrieve data--no unaccessed field will have its data retrieved--but clearly fails to minimize network round trips.)
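A minimal sketch of the "load the whole object on first access" policy follows. The FakeDatabase class is an invented stand-in that merely counts round trips, and the field layout is assumed for illustration; the point is that touching several fields of one object costs a single extra trip, while walking a list of objects still costs one trip per object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LazyLoadSketch {
    // Stand-in for the database: counts round trips instead of making them.
    static class FakeDatabase {
        int roundTrips = 0;

        List<Integer> selectAllIds() {
            roundTrips++;
            return Arrays.asList(1, 2, 3);
        }

        String[] selectRemainingFields(int id) {
            roundTrips++;
            return new String[] {"First" + id, "Last" + id};
        }
    }

    static class Person {
        private final int id;
        private final FakeDatabase db;
        private String[] fields; // null until any field is first accessed

        Person(int id, FakeDatabase db) {
            this.id = id;
            this.db = db;
        }

        private void ensureLoaded() {
            if (fields == null) {
                fields = db.selectRemainingFields(id); // one trip loads every field
            }
        }

        String getFirstName() { ensureLoaded(); return fields[0]; }
        String getLastName()  { ensureLoaded(); return fields[1]; }
    }

    static List<Person> loadAll(FakeDatabase db) {
        List<Person> people = new ArrayList<>();
        for (int id : db.selectAllIds()) {
            people.add(new Person(id, db)); // only the identifier is known so far
        }
        return people;
    }

    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        List<Person> people = loadAll(db);
        for (Person p : people) {
            p.getFirstName();
            p.getLastName(); // already loaded, no additional trip
        }
        System.out.println(db.roundTrips); // 1 for the ids + 1 per Person touched
    }
}
```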
Unfortunately, fields within the object are only part of the problem--the other problem we face is that objects are frequently associated with other objects, in various cardinalities (one-to-one, one-to-many, many-to-one, many-to-many), and an O/R mapping has to make some up-front decisions about when to retrieve these associated objects, and despite the best efforts of the O/R-M's developers, there will always be common use-cases where the decision made will be exactly the wrong thing to do. Most O/R-M's offer some kind of developer-driven decision-making support, usually some kind of configuration or mapping file, to identify exactly what the retrieval policy will be, but this setting is global to the class, and as such can't be changed on a situational basis.
Summary
Given, then, that objects-to-relational mapping is a necessity in a modern enterprise system, how can anyone proclaim it a quagmire from which there is no escape? Again, Vietnam serves as a useful analogy here--while the situation in South Indochina required a response from the Americans, there were a variety of responses available to the Kennedy and Johnson Administrations, including the same kind of response that the recent fall of Suharto in Indonesia generated from the US, which is to say, none at all. (Remember, Eisenhower and Dulles didn't consider South Indochina to be a part of the Domino Theory in the first place; they were far more concerned about Japan and Europe.)
Several possible solutions present themselves to the O/R-M problem, some requiring some kind of "global" action by the community as a whole, some more approachable to development teams "in the trenches":
- Abandonment. Developers simply give up on objects entirely, and return to a programming model that doesn't create the object/relational impedance mismatch. While distasteful, in certain scenarios an object-oriented approach creates more overhead than it saves, and the ROI simply isn't there to justify the cost of creating a rich domain model. ([Fowler] talks about this to some depth.) This eliminates the problem quite neatly, because if there are no objects, there is no impedance mismatch.
- Wholehearted acceptance. Developers simply give up on relational storage entirely, and use a storage model that fits the way their languages of choice look at the world. Object-storage systems, such as the db4o project, solve the problem neatly by storing objects directly to disk, eliminating many (but not all) of the aforementioned issues; there is no "second schema", for example, because the only schema used is that of the object definitions themselves. While many DBAs will faint dead away at the thought, in an increasingly service-oriented world, which eschews the idea of direct data access but instead requires all access go through the service gateway thus encapsulating the storage mechanism away from prying eyes, it becomes entirely feasible to imagine developers storing data in a form that's much easier for them to use, rather than DBAs.
- Manual mapping. Developers simply accept that it's not such a hard problem to solve manually after all, and write straight relational-access code to return relations to the language, access the tuples, and populate objects as necessary. In many cases, this code might even be automatically generated by a tool examining database metadata, eliminating some of the principal criticism of this approach (that being, "It's too much code to write and maintain").
- Acceptance of O/R-M limitations. Developers simply accept that there is no way to efficiently and easily close the loop on the O/R mismatch, and use an O/R-M to solve 80% (or 50% or 95%, or whatever percentage seems appropriate) of the problem and make use of SQL and relational-based access (such as "raw" JDBC or ADO.NET) to carry them past those areas where an O/R-M would create problems. Doing so carries its own fair share of risks, however, as developers using an O/R-M must be aware of any caching the O/R-M solution does within it, because the "raw" relational access will clearly not be able to take advantage of that caching layer.
- Integration of relational concepts into the languages. Developers simply accept that this is a problem that should be solved by the language, not by a library or framework. For the last decade or more, the emphasis on solutions to the O/R problem has focused on trying to bring objects closer to the database, so that developers can focus exclusively on programming in a single paradigm (that paradigm being, of course, objects). Over the last several years, however, interest in "scripting" languages with far stronger set and list support, like Ruby, has sparked the idea that perhaps another solution is appropriate: bring relational concepts (which, at heart, are set-based) into mainstream programming languages, making it easier to bridge the gap between "sets" and "objects". Work in this space has thus far been limited, constrained mostly to research projects and/or "fringe" languages, but several interesting efforts are gaining visibility within the community, such as functional/object hybrid languages like Scala or F#, as well as direct integration into traditional O-O languages, such as the LINQ project from Microsoft for C# and Visual Basic. One such effort that failed, unfortunately, was the SQL/J strategy; even there, the approach was limited, not seeking to incorporate sets into Java, but simply to allow embedded SQL calls to be preprocessed and translated into JDBC code by a translator.
- Integration of relational concepts into frameworks. Developers simply accept that this problem is solvable, but only with a change of perspective. Instead of relying on language or library designers to solve this problem, developers take a different view of "objects" that is more relational in nature, building domain frameworks that are more directly built around relational constructs. For example, instead of creating a Person class that holds its instance data directly in fields inside the object, developers create a Person class that holds its instance data in a RowSet (Java) or DataSet (C#) instance, which can be assembled with other RowSets/DataSets into an easy-to-ship block of data for update against the database, or unpacked from the database into the individual objects.
Note that this list is not presented in any particular order; while some are more attractive than others, which are "better" is a value judgment that every developer and development team must make for themselves.
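To make two of these options concrete, here is a minimal sketch contrasting manual mapping with a row-backed domain object. It is illustrative only: it uses Python's built-in sqlite3 module with an in-memory database so that it runs on its own, standing in for the "raw" JDBC or ADO.NET access mentioned above. The `person` table and the names `Person`, `PersonRow`, `load_people`, and `make_demo_db` are all hypothetical, and `PersonRow` is only loosely analogous to a class built over a RowSet or DataSet, not a reproduction of either API.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Person:
    """Plain domain object: instance data lives in ordinary fields."""
    id: int
    first_name: str
    last_name: str


def load_people(conn):
    """Manual mapping: run SQL, walk the tuples, populate objects by hand."""
    rows = conn.execute(
        "SELECT id, first_name, last_name FROM person ORDER BY id"
    )
    return [Person(id=r[0], first_name=r[1], last_name=r[2]) for r in rows]


class PersonRow:
    """Row-backed object: the relational tuple itself is the object's state."""

    def __init__(self, row):
        self._row = dict(row)  # column name -> value, copied from the tuple
        self._dirty = set()    # columns changed since the last save

    @property
    def last_name(self):
        return self._row["last_name"]

    @last_name.setter
    def last_name(self, value):
        self._row["last_name"] = value
        self._dirty.add("last_name")

    def save(self, conn):
        """Write only the changed columns back to the database."""
        # Column names come solely from this class's own setters,
        # never from user input, so interpolating them is safe here.
        for column in sorted(self._dirty):
            conn.execute(
                f"UPDATE person SET {column} = ? WHERE id = ?",
                (self._row[column], self._row["id"]),
            )
        self._dirty.clear()


def make_demo_db():
    """Hypothetical two-row schema, used only for this illustration."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows readable by index or column name
    conn.execute(
        "CREATE TABLE person ("
        "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [(1, "Ada", "Lovelace"), (2, "Edgar", "Codd")],
    )
    return conn
```

Calling `load_people(make_demo_db())` yields ordinary `Person` objects whose fields are copies of the column values, whereas a `PersonRow` wrapped around a fetched row keeps the relational shape and pushes changes back through `save`. The trade-off is the one described in the list: the first keeps the domain model "pure" at the cost of hand-written mapping code, while the second gives up some of that purity in exchange for a much thinner gap between object and tuple.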
Just as it's conceivable that the US could have achieved some measure of "success" in Vietnam had it kept to a clear strategy and understood a clearer relationship between commitment and results (ROI, if you will), it's conceivable that the object/relational problem can be "won" through careful and judicious application of a strategy that is clearly aware of its own limitations. Developers must be willing to take the "wins" where they can get them, and not fall into the trap of the Slippery Slope by looking to create solutions that increasingly cost more and yield less. Unfortunately, as the history of the Vietnam War shows, even an awareness of the dangers of the Slippery Slope is often not enough to avoid getting bogged down in a quagmire. Worse, it is a quagmire that is simply too attractive to pass up, a Siren song that continues to draw development teams from all sizes of corporations (including those at Microsoft, IBM, Oracle, and Sun, to name a few) against the rocks, with spectacular results. Lash yourself to the mast if you wish to hear the song, but let the sailors row.
Endnotes
1 Later analysis by the principals involved--including then-Secretary of Defense Robert McNamara--concluded that half of the attack never even took place.
2 It is perhaps the greatest irony of the war, that the man Fate selected to lead during America's largest foreign entanglement was a leader whose principal focus was entirely aimed within his own shores. Had circumstances not conspired otherwise, the hippies chanting "Hey, hey LBJ, how many boys did you kill today" outside the Oval Office could very well have been Johnson's staunchest supporters.
3 Ironically, encapsulation, for purposes of maintenance simplicity, turns out to be a major motivation for almost all of the major innovations in Linguistic Computer Science--procedural, functional, object, aspect, even relational technologies ([Date04]) and other languages all cite "encapsulation" as a major driving factor.
4 We could, perhaps, consider stored procedure languages like T-SQL or PL/SQL to be "relational" programming languages, but even then, it's extremely difficult to build a UI in PL/SQL.
5 In this case, I was measuring Java RMI method calls against local method calls. Similar results are pretty easily obtainable for SQL-based data access by measuring out-of-process calls against in-process calls using a database product that supports both, such as Cloudscape/Derby or HSQL (Hypersonic SQL).
Update (8 November 2012): A Bulgarian translation of this essay is here; thanks to Dimitar Teykiyski for the work.
References
[Fussell]: Foundations of Object Relational Mapping, by Mark L. Fussell, v0.2 (mlf-970703)
[Fowler]: Patterns of Enterprise Application Architecture, by Martin Fowler
[Date04]: Introduction to Database Systems, 8th Edition, by Chris Date
[Neward04]: Effective Enterprise Java, by Ted Neward
.NET | C++ | Java/J2EE | Ruby
Monday, June 26, 2006 10:59:14 AM (Pacific Daylight Time, UTC-07:00)