 Sunday, January 01, 2006
Annotation let-down: A response

In a recent thread on TheServerSide.com, Rick Hightower, a fellow NFJS speaker, commented on the JSR-175/annotations specification, and I felt a little obligated to respond, since this is a common critique/criticism:

Why don't you like the implementation? I hate the fact that your code has to import the annotations and then your code is tied to the annotation. It does not seem that different than depending on a interface (i.e., a marker interface). I'd like to see a soft import for annotations that does not impact compilation. (from TheServerSide.com)
Rick, no part of JSR 175 was more hotly debated, or contested, than the requirement we made that annotations be present not only at compilation-time, but at run-time. I could go back and show you the weeks of emails that went flying back and forth between the EG members, trying VERY hard to come up with a solution to the problem, but none could really be created, given Java's basic platform requirements, one of which was that the language was strongly-typed.

The basic problem was this: if the compiler runs across an annotation, and it doesn't match an annotation type defined anywhere in the compilation classpath of imported symbols, what is the proper behavior? In any other scenario, such as:

public class App
{
  public static void main(String[] args)
  {
    Systme.out.println("Hello, world!");
  }
}
the behavior is extremely clear: this is an error, and compilation needs to fail to signal as much. But the proposal for "soft imports" of annotations would lead to a much grayer--and potentially disastrous--scenario, where the compiler SHOULD flag an error during compilation, but doesn't:
public class App extends Object
{
  @Overide public String toString(Object obj) { return "App"; }
}
Here, the human eye can clearly see that the class means to take advantage of the @Override compiler annotation to ensure that the toString() method is defined similarly in a base class, but because of a typo, the compiler now, under a soft-import rule, will simply ignore the annotation. This is the worst of all violations of the Principle of Least Surprise--the programmer believes the annotation is present, and that the override is acceptable, where in reality the annotation is ignored, the override isn't checked, and the code will fail to operate as expected.
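
To make the failure mode concrete, here is a minimal sketch of what the compiler would be letting through. The names `OverrideDemo`, `App`, and `describe` are made up for illustration. Without a working `@Override` on it, `toString(Object)` quietly becomes an overload rather than an override, and anybody calling plain `toString()` still gets the version inherited from `Object`:

```java
// Hypothetical demo: class and method names are illustrative only.
class OverrideDemo {
    static class App {
        // Intended as an override, but the extra parameter makes it an overload.
        // A correctly spelled @Override here would make javac reject the method.
        public String toString(Object obj) { return "App"; }
    }

    // Calls the no-argument toString(), which App never actually overrode.
    static String describe(Object o) {
        return o.toString();
    }

    public static void main(String[] args) {
        App app = new App();
        System.out.println(app.toString(null)); // the accidental overload
        System.out.println(describe(app));      // Object's default implementation
    }
}
```

The second line printed is the default class-name-plus-hash string rather than "App", which is exactly the silent misbehavior a soft import would allow.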

The big IDE vendors were particularly upset at this idea, leading one to claim, "If we cannot solve this problem we will consider JSR 175 to have been a failure." Unfortunately, then, we failed--there is no good way to solve this problem without breaking the fundamental vision of the Java platform. We tried a variety of ideas, including a few centered around the JDK 1.4 assertion idea (some kind of runtime flag indicating which annotations were safe to ignore), but couldn't work out the basic semantics of such without requiring a definition of the annotation to be present on the compilation path. And frankly, in the grander scheme of things, it makes sense to me that annotations ARE required at compilation time--just as interfaces, helper classes, member field types, and other types are required to be present at compilation time, as well.

Rod Johnson continued his critique of annotations by citing the following two reasons:

  1. No proper mechanism for overriding annotations at runtime, despite the fact that just about any framework that uses annotations is going to need to consider doing that.
  2. Inability for an annotation to extend an existing interface (even if that interface is simple enough to sit within annotations). Of course there are implementation issues around this one, I guess. But it means that it's hard to avoid code duplication when working with annotations and alternative metadata sources--something that's going to be particularly important until everyone and their dog uses Java 5, and anyway will remain important to work with existing code that may not have the right annotation.
Rod, the suggestion for overriding annotations at runtime was made, and we almost unanimously shot it down, because there are no facilities for changing any other of a type's static type information at runtime: I cannot change methods, fields, inheritance, or interfaces at runtime, either. Such behavior belongs in the world of MOPs, perhaps, and hence your interest in such, but in a statically-typed world such behavior is not part of the landscape. Like it or hate it, such is the world that Java is a part of. (By the way, annotations are intended for much more than just frameworks--witness the annotations the javac compiler already recognizes, and other systems beyond frameworks are going to pick up on this in spades in the coming years. Just wait until the design-by-contract folks start talking to compiler folks again.) And you already answered your second criticism, that annotations extending existing interfaces would be difficult to implement. In truth, annotations and interfaces aren't really the same thing, so expecting one to be able to inherit the other wasn't something I'd consider good design. I didn't even like the "@interface" keyword--I preferred something like "annotation" or "attribute" instead, but Josh (rightly) pointed out that introducing new keywords into a language ten years old was going to be a Bad Thing. (And yes, the same was true of "assert", and they did it anyway, and look how well that turned out--they broke JUnit, of all things!)
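
A minimal sketch of what "static type information" means in practice, assuming a made-up `@Owner` annotation and `Invoice` class (neither comes from the JSR). Reflection hands annotation values back through accessor methods, just as it hands back a class's methods and fields, and the standard API offers no corresponding way to write a new value in:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class AnnotationDemo {
    // Hypothetical annotation, invented for this example.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Owner {
        String value();
    }

    @Owner("billing")
    static class Invoice { }

    // Reads the value compiled into the class file; returns null if the
    // annotation is absent from the given type.
    static String ownerOf(Class<?> type) {
        Owner owner = type.getAnnotation(Owner.class);
        return owner == null ? null : owner.value();
    }

    public static void main(String[] args) {
        System.out.println(ownerOf(Invoice.class)); // billing
        System.out.println(ownerOf(String.class));  // null
    }
}
```

A framework that wants different metadata at runtime has to layer its own lookup (an XML file, a registry) on top of `ownerOf`-style reads, which is the duplication Rod is complaining about.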

The nutshell version of all this: the JSR 175 EG did, in fact, think long and hard about "soft imports" and "runtime annotation modification", and both ideas were shot down for what we felt were good reasons.


Java/J2EE

Sunday, January 01, 2006 5:10:11 AM (Pacific Standard Time, UTC-08:00)
2006 Tech Predictions

In keeping with the tradition, I'm suggesting the following will take place for 2006:

  1. The hype surrounding Ajax will slowly fade, as people come to realize that there's really nothing new here, just that DHTML is cool again. As Dion points out, Ajax will become a toolbox that you use in web development without thinking that "I am doing Ajax". Just as we don't think about "doing HTML" vs "doing DOM".
  2. The release of EJB 3 may actually start people thinking about EJB again, but hopefully this time in a more pragmatic and less hype-driven fashion. (Yes, EJB does have its place in the world, folks--it's just a much smaller place than most of the EJB vendors and book authors wanted it to be.)
  3. Vista will be slipped to 2007, despite Microsoft's best efforts. In the meantime, however, WinFX (which is effectively .NET 3.0) will ship, and people will discover that Workflow (WWF) is by far the more interesting of the WPF/WCF/WWF triplet. Notice that I don't say "powerful" or "important", but "interesting".
  4. Scripting languages will hit their peak interest period in 2006; Ruby conversions will be at their apogee, and it's likely that somewhere in the latter half of 2006 we'll hear about the first major Ruby project failure, most likely from a large consulting firm that tries to duplicate the success of Ruby's evangelists (Dave Thomas, David Geary, and the other Rubyists I know of from the NFJS tour) by throwing Ruby at a project without really understanding it. In other words, same story, different technology, same result. By 2007 the Ruby Backlash will have begun.
  5. Interest in building languages that somehow bridge the gap between static and dynamic languages will start to grow, most likely beginning with E4X, the variant of ECMAScript (Javascript to those of you unfamiliar with the standards) that integrates XML into the language.
  6. Java developers will start gaining interest in building rich Java apps again. (I freely admit this is a long shot, but the work being done by the Swing researchers at Sun, not least of which is Romain Guy, will by the middle of 2006 probably be ready for prime-time consumption, and there's some seriously interesting sh*t in there.)
  7. Somebody at Microsoft starts seriously hammering on the CLR team to support continuations. Talk emerges about supporting it in the 4.0 (post-WinFX) release.
  8. Effective Java (2nd Edition) will ship. (Hardly a difficult prediction to make--Josh said as much in the Javapolis interview I did with him and Neal Gafter.)
  9. Effective .NET will ship.
  10. Pragmatic XML Services will ship.
  11. JDK 6 will ship, and a good chunk of the Java community's self-proclaimed experts and cognoscenti will claim it sucks.
  12. Java developers will seriously begin to talk about what changes we want/need to Java for JDK 7 ("Dolphin"). Lots of ideas will be put forth. Hopefully most will be shot down. With any luck, Joshua Bloch and Neal Gafter will still be involved in the process, and will keep tight rein on the more... aggressive... ideas and turn them into useful things that won't break the spirit of the platform.
  13. My long-shot hope, rather than prediction, for 2006: Sun comes to realize that the Java platform isn't about the language, but the platform, and begins to give serious credence to, and put real hope behind, a multi-linguistic JVM ecosystem.
  14. My long-shot dream: JBoss goes out of business, the JBoss source code goes back to being maintained by developers whose principal interest is in maintaining open-source projects rather than making money, and it all gets folded together with what the Geronimo folks are doing. In other words, the open-source community stops the infighting and starts pulling oars in the same direction at the same time. For once.
Flame away....


.NET | C++ | Conferences | Development Processes | Java/J2EE | Reading | Ruby | XML Services

Sunday, January 01, 2006 12:25:56 AM (Pacific Standard Time, UTC-08:00)
 Thursday, December 29, 2005
Prebuilt VMWare images

Whilst perusing the latest VMWare Workstation offering from their website, I noticed that not only does VMWare offer a free VMWare player (in other words, take a VMWare disk image created by somebody else and use it), but the VMWare site also has links to various pre-built VMWare disk images, including one for BEA's complete WebLogic 8.1 environment.... Whoever thought this idea up deserves to be knighted--what a great way to make it trivially simple for somebody to get started with a rather intimidating task (be that either installing a new O/S or a new app server).

Are you listening, Microsofties? VPCs of Vista, Visual Studio Team System and, heck, even just a base Visual Studio Express (pick a language, C# and VB sound like good starters) image are definitely something to consider if you want to make it easy for devs to play with your tools.... Particularly people who DON'T want to install Windows just to play with Microsoft's implementation of .NET....


Java/J2EE

Thursday, December 29, 2005 10:34:20 PM (Pacific Standard Time, UTC-08:00)
 Thursday, December 08, 2005
THE book to read for 2006

If you read no other book this coming year, you must read "Blink", by Malcolm Gladwell, the same author who wrote "The Tipping Point" (which is about why certain trends seem to just "take off" with no prior warning--case in point, the incredible rise of certain fashion trends, such as "Hush Puppies").

I won't tell you what it's about except to quote the inside jacket; to say more would ruin the book's effect, to be blunt. The inside jacket reads,

In his landmark bestseller The Tipping Point, Malcolm Gladwell redefined how we understand the world around us. Now, in Blink, he revolutionizes the way we understand the world within. Blink is a book about how we think without thinking, about choices that seem to be made in an instant--in a blink of an eye--that actually aren't as simple as they seem. Why are some people brilliant decision-makers, while others are consistently inept? Why do some people follow their instincts and win, while others end up stumbling into error? How do our brains really work--in the office, in the classroom, in the kitchen and in the bedroom? And why are the best decisions often the ones that are impossible to explain to others?

In Blink we meet the psychologist who has learned to predict whether a marriage will last, based on a few minutes of observing a couple; the tennis coach who knows when a player will double-fault before the racket even makes contact with the ball; the antiquities experts who recognize a fake at a glance. Here, too, are great failures of "blink": the election of Warren Harding; New Coke; and the shooting of Amadou Diallo by police. Blink reveals that great decision makers aren't those who possess the most information or spend the most time deliberating, but those who have perfected the art of "thin-slicing"--filtering the very few factors that matter from an overwhelming number of variables.

Drawing on cutting-edge neuroscience and psychology and displaying all of the brilliance that made The Tipping Point a classic, Blink changes the way you understand every decision you make. Never again will you think about thinking the same way.

Don't let the hyperbole in the above inside jacket prose throw you--how I think about thinking will never be the same again. I knew, intuitively, that intuition (the best word I can use to describe that "blink" effect) is a powerful force, but I couldn't describe why. Gladwell articulates that point. Read it.




Thursday, December 08, 2005 3:41:48 AM (Pacific Standard Time, UTC-08:00)
 Wednesday, November 30, 2005
World's dumbest spammer

You make the call on this one... cut & pasted directly out of the email (after the horizontal rule):


Subject: Better degree-better pay!

You have 2 options here,

Option 1 - You can put ANY text you want in here.

Option 2 - We will fill it in with the text only portion of the

html message if you put the macro UNIVERSITY DIPLOMAS

 

OBTAIN A PROSPEROUS FUTURE, MONEY-EARNING POWER, AND THE PRESTIGE THAT COMES WITH HAVING THE CAREER POSITION YOU'VE ALWAYS DREAMED OF. DIPLOMAS FROM PRESTIGIOUS NON-ACCREDITED UNIVERSITIES BASED ON YOUR PRESENT KNOWLEDGE AND LIFE EXPERIENCE

 

If you qualify, no required tests, classes, books or examinations.

 

Bachelors', Masters', MBA's, Doctorate & Ph.D. degrees available in your field.

 

CONFIDENTIALITY ASSURED

CALL NOW TO RECEIVE YOUR DIPLOMA WITHIN 2 WEEKS

1-206-279-9144

CALL 24HRS, 7 DAYS A WEEK, INCLUDING SUNDAYS & HOLIDAYS

in here.

NOTE: Some email clients don't disply html data. In that case what you

put here will be seen by the recipient. If the email client does

display html data then this will NOT be seen by the recipient.

Based on this you may wish to put a text version of your add here;

however, you can also put some macros here to make the message

more random.




Wednesday, November 30, 2005 3:02:26 AM (Pacific Standard Time, UTC-08:00)
 Monday, November 21, 2005
The immutable string

Mark Michaelis posted a challenge: modify a string such that the following would print "Smile":

class Program
{
  static void Main()
  {
      string text;
      // ...
      // Place code here
      // ...
      text = "S5280ft";
      System.Console.WriteLine(text);
  }
}

His solution?

class Program
{
  static void Main()
  {
      string text;
      unsafe {
          fixed (char* pText = "S5280ft") {  // pin the interned literal; the still-unassigned local 'text' would not compile here
              pText[1] = 'm';
              pText[2] = 'i';
              pText[3] = 'l';
              pText[4] = 'e';
          }
      }
      text = "S5280ft";
      System.Console.WriteLine(text);
  }
}

My answer; note that I believe mine to be cleaner, more elegant, and far far more dangerous, since it never uses any sort of unsafe code:

using System;
using System.Reflection;

class Program
{
  static void Main()
  {
      string text;

      string internedText = "S5280ft";
      String.Intern(internedText);
      MethodInfo mi = typeof(string).GetMethod("InsertInPlace", 
        BindingFlags.NonPublic | BindingFlags.Instance, null,
        new Type[] { typeof(Int32), typeof(string), typeof(Int32), typeof(Int32), typeof(Int32) }, null);
      mi.Invoke(internedText, new object[] {0, "Smile", 1, 7, 5});      

      text = "S5280ft";
      System.Console.WriteLine(text);
  }
}

The point? Playing with Reflection can be dangerous... oh, and it helps to know that strings are only as immutable as the platform forces them to be. In this case, my little hack would only be possible because under the covers, .NET doesn't really have immutable strings--it just doesn't let YOU modify them. :-)

(By the way, same trick is available in Java, using the same approach. Or you could write JNI code to sort of duplicate Mark's trick, but who'd want to do that? Brrr.)
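
The half of the trick that makes it so dangerous is interning, and that part is easy to see safely on the Java side. A minimal sketch, with `InternDemo` and `sameInstance` as made-up names; it deliberately leaves out the reflective mutation itself. Every occurrence of the same literal refers to one shared object, so scribbling on that object changes what every later use of the literal sees:

```java
class InternDemo {
    // True only when both references point at the very same String object.
    static boolean sameInstance(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String first = "S5280ft";
        String second = "S5280ft";            // same literal, same pooled object
        String copy = new String("S5280ft");  // distinct object, equal contents

        System.out.println(sameInstance(first, second));        // true
        System.out.println(sameInstance(first, copy));          // false
        System.out.println(sameInstance(first, copy.intern())); // true
    }
}
```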


.NET | Java/J2EE

Monday, November 21, 2005 3:01:11 AM (Pacific Standard Time, UTC-08:00)
 Friday, November 18, 2005
Academic .NET radio show debuts

Matt Cassell is putting on an Academic .NET radio show (something in the vein of .NET Rocks! but aimed at students), and asked me to be the opening episode. It's up online now, so have a listen and see if I managed to steer the kids straight....


.NET | Conferences | Java/J2EE | XML Services

Friday, November 18, 2005 2:13:33 PM (Pacific Standard Time, UTC-08:00)
 Tuesday, November 08, 2005
Anonymous generic methods making things "just work"

A good friend of mine and I are looking at taking on a new project together, and as part of the discussion we were exploring some of the differences of taking a relational perspective against an object perspective, and one of the comments she made was that in a relational model, you can always "filter" the data you want based on some predicate. "Ha!", I said, "If that's what you want, I can give you that over objects, too!" What's more, thanks to generics, I can do this for any collection type in the system without having to introduce it on some kind of base class:

    using System;
    using System.Collections.Generic;

    static class SetUtils
    {
        public static List<T> Project<T>(List<T> list, Predicate<T> pred)
        {
            List<T> results = new List<T>();

            foreach (T p in list)
                if (pred(p))
                    results.Add(p);

            return results;
        }

        // Not too hard to imagine the other relational operators here, too
    }

    // Usage:
    class Person
    {
        private string firstName;
        private string lastName;
        private int age;

        public Person(string fn, string ln, int age) {
            this.firstName = fn;
            this.lastName = ln;
            this.age = age;  // previously accepted but silently dropped
        }

        public string FirstName {
            get { return firstName; }
            set { firstName = value; }
        }
        public string LastName {
            get { return lastName; }
            set { lastName = value; }
        }
        public override string ToString() {
            return "[Person [" + firstName + "]" + " " + "[" + lastName + "]" + "]";
        }
    }

    class Program {
        static void Main(string[] args) {
            Person cg = new Person("Cathi", "Gero", 35);
            Person tn = new Person("Ted", "Neward", 35);
            Person sg = new Person("Stephanie", "Gero", 12);
            Person mn = new Person("Michael", "Neward", 12);

            List<Person> list = new List<Person>();
            list.Add(cg);
            list.Add(tn);
            list.Add(sg);
            list.Add(mn);

            List<Person> newards = 
                SetUtils.Project<Person>(list, 
                    delegate (Person p) { return p.LastName == "Neward"; } );
            foreach (Person p in newards)
                Console.WriteLine(p);
        }
    }
Any more questions? (This is why having (1) a system that supports managed function pointers directly and (2) a generics system that doesn't rely on type erasure is so powerful. Hint, Hint, Sun guys....)

Now if I could just figure out how C# 3.0 manages to differentiate/overload between delegate instances and Expression objects in LINQ/DLinq, I might be able to backport that to C# 2.0, too, and be able to pass these Predicate instances across the wire for execution on other machines.

In a lot of ways, the Predicate delegate type is an example of using C#'s anonymous methods as a form of closure or lambda expression. (It's been argued that anonymous methods-as-delegates aren't "true" closures, since the local variables referenced in a closure will only be references to the objects, not complete copies, but to my mind that's exactly as it should be, as any time you pass a reference to an object, you're passing just that--a reference to an object, not a complete copy of the object. To do otherwise in anonymous methods would violate the Principle of Least Surprise, IMHO.) The Ruby syntax arguably isn't any more elegant or terse, and I suspect similar things could be done in C++ using templates; probably something along these lines already exists in Boost. But alas, I see no way to do this in Java given the current state of the JVM, namely the aforementioned lack of "managed functors" and type-preserving generics. If any out there in Java-land know otherwise, please holler, because I would really love to know how to do this as elegantly.
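
For comparison, here is a rough sketch of the nearest Java 5 approximation: a hand-rolled interface standing in for the delegate type, and an anonymous inner class standing in for the anonymous method. `Matcher`, `filter`, and `newards` are names invented for this example, and plain strings stand in for the Person class to keep it short. It works, but the ceremony around the one interesting line is a fair picture of the elegance gap:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FilterDemo {
    // Hypothetical stand-in for C#'s Predicate<T> delegate type.
    interface Matcher<T> {
        boolean matches(T item);
    }

    // Returns a new list holding only the items the matcher accepts.
    static <T> List<T> filter(List<T> list, Matcher<T> matcher) {
        List<T> results = new ArrayList<T>();
        for (T item : list)
            if (matcher.matches(item))
                results.add(item);
        return results;
    }

    // The anonymous inner class plays the role of the anonymous method.
    static List<String> newards(List<String> names) {
        return filter(names, new Matcher<String>() {
            public boolean matches(String name) {
                return name.endsWith(" Neward");
            }
        });
    }

    public static void main(String[] args) {
        List<String> people = Arrays.asList(
            "Cathi Gero", "Ted Neward", "Stephanie Gero", "Michael Neward");
        System.out.println(newards(people)); // [Ted Neward, Michael Neward]
    }
}
```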


.NET | C++ | Java/J2EE | Ruby

Tuesday, November 08, 2005 7:02:22 PM (Pacific Standard Time, UTC-08:00)
Nullable Type correction/bugfix

This is a bit of old news, but the discussion came up during the Seattle Code Camp, so I thought I'd go through the problem, and use it as an example of the issues that can come up when trying to map language concepts on top of a platform that doesn't support the idea natively. Hopefully, this will cause developers looking to build DSLs or other languages on top of the .NET (or JVM) platform to see some of the edge cases a bit more clearly and a bit sooner. :-)

To lay down the background first: dealing with NULLs has always been somewhat problematic; the most obvious example of this is the mapping between relational databases and programming languages, where even an INTEGER column can either have a value, or be empty, or be NULL, each of those being separate and distinct states. Trying to map NULL integer column values to integer values in the language has always been difficult in Java, C++, and C#, since primitive types / value types generally don't support null values, and Anders (among others) decided that it was time to try and integrate nullability more deeply into the language. The .NET team saw an opportunity to support nullability by creating a generic/templatized type to represent the possibility of nullability, and the C# language team took it further to try and make nullability feel "more at home" within the language. It was a bold, if at first seemingly-trivial, step.

Initially, the Nullable<T> type was pretty simple: it captured an instance of T internally, and if T was null it tripped an internal flag such that the IsNull property would return true. So, using a nullable int would work something like this:

Nullable<int> ni = new Nullable<int>(null);
if (ni.IsNull)
  Console.WriteLine("It's null!");
else
  Console.WriteLine(ni.Value);
This seemed fairly straightforward, and then the C# team took it one step further and decided to integrate it more deeply into the language itself, by creating a native syntax for nullability:
int? ni = null;
if (ni == null)
  Console.WriteLine("It's null!");
else
  Console.WriteLine((int)ni);
In other words, any type? designation was an alias for Nullable<type>, and appropriate properties would be consulted when looking to evaluate the nullable type instance. Conversion rules (from the nullable type into the type) had to be written, because it's not necessarily a silent and unambiguous conversion to its original type; for example, in the case where you wrote:
int? ni = null;
int i = (int)ni;
what should the expected behavior of the conversion of ni to i be? Some would argue that it should silently seek to "best" convert the null value of ni to an acceptable integer value of i, but that gets us back to the original problem, figuring out what that mapping is. (Ask any C++ programmer versed in the lore, and they'll be the first to tell you that "0 is NOT the same thing as NULL".) So here, asking to make that conversion will trigger an InvalidOperationException.
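
Java's wrapper types take the same stance, for what it's worth. A minimal sketch, with `UnboxDemo` and `unbox` as made-up names: auto-unboxing a null `Integer` throws rather than quietly inventing a 0.

```java
class UnboxDemo {
    // Attempts the null-to-primitive conversion and reports what happened.
    static String unbox(Integer boxed) {
        try {
            int i = boxed;  // auto-unboxing calls boxed.intValue()
            return "value " + i;
        } catch (NullPointerException e) {
            return "refused: no int corresponds to null";
        }
    }

    public static void main(String[] args) {
        System.out.println(unbox(42));   // value 42
        System.out.println(unbox(null)); // refused: no int corresponds to null
    }
}
```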

OK, so far, so good. The problem is, however, that people were going to ask these nullable types to do things that were subtly different from what they'd ask of Nullable<T> instances. For example, the following snippet of code wouldn't behave as expected:

int? ni = null;
object o = ni; // What should this conversion be?
if (o == null) {
  // Should we be in this block?
}
What the conversion from int? to object should be was the subject of some debate, but what the C# team ended up with was the idea that the conversion followed basic CLR rules: that because int? was, internally, an instance of the type Nullable<int>, the conversion was to obtain an object reference to the Nullable<int> instance. In other words, a boxing operation took place, and since the Nullable<int> instance was always present (it's never null, even though its value might be null), the "if" condition above would never evaluate to true.
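
The same trap is easy to reproduce in any language where nullability lives purely in a library wrapper. A minimal Java sketch: the `Nullable<T>` class here is a hand-written, hypothetical stand-in loosely modeled on the early design described above, not the .NET type, and `looksNull` is an invented name for the untyped check.

```java
class WrapperDemo {
    // Hypothetical library-only nullable wrapper.
    static final class Nullable<T> {
        private final T value;
        Nullable(T value) { this.value = value; }
        boolean isNull() { return value == null; }
    }

    // The "untyped" check from the snippet above: is the reference itself null?
    static boolean looksNull(Object o) {
        return o == null;
    }

    public static void main(String[] args) {
        Nullable<Integer> ni = new Nullable<Integer>(null);
        System.out.println(ni.isNull());   // true: the wrapper knows it is empty
        System.out.println(looksNull(ni)); // false: the wrapper object itself exists
    }
}
```

Code that only sees an `Object` has no idea it should call `isNull()`, which is why no amount of library or compiler cleverness could close the gap.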

Somasegar's weblog describes what happened next in some detail:

Clearly this had to change. We had a solution in Visual Studio 2005 Beta2 that gave users static methods that could determine the correct null-ness for nullable types in these more or less untyped scenarios. However, these methods were costly to call and difficult to remember to use. The feedback you gave us was that you expected it to simply work right by default.

So we went back to the drawing board. After looking at several different workarounds and options, it became clear to all that no amount of tweaking of the languages or framework code was ever going to get this type to work as expected.

The only viable solution was one that needed the runtime to change.
In other words, the runtime had to take a special interest in the Nullable type, treating it with special-cased logic to handle those conversions between Nullable instances and their non-Nullable equivalents. As Soma points out, "A Nullable int now boxes to become not a boxed Nullable int but a boxed int (or a null reference as the null state may indicate)." More importantly, this permeates throughout the entire runtime, so that
int? x = 10;
object y = x;
int? z = (int?)y; // unbox into a Nullable<int>
works as intended, where under the old rules it would have failed conversion because the boxed Nullable reference wouldn't be the same type as the Nullable type it was being converted into. (In other words, boxed(Nullable(T)) != T.)

The lessons here? When building languages to run on top of another platform or runtime, the decisions that runtime makes often put some serious constraints around what you can do within your language. For example, looking to support first-class functors on a JVM or CLR will run into the fact that functions aren't first-class in the runtime, but instead have to be handled with object wrappers around the functions. Hiding those differences in language semantics can only get you so far, and sometimes you need to involve the runtime team a bit more deeply if you want to close all those edge cases. (Hint to Sun: you really need to start thinking about revising and extending the JVM, instead of this current policy that essentially describes the JVM as perfect as-is. The changes made to support annotations were minor, but a good first step; it's time to open that Pandora's box wider if you want to keep up with the CLR, to be blunt about it.)


.NET | C++ | Java/J2EE | Ruby

Tuesday, November 08, 2005 2:28:25 AM (Pacific Standard Time, UTC-08:00)
HTML is not statically typed... but so what?

Dion Almaer made an interesting point recently:

A friend ... talked about how it is interesting that HTML is not statically typed, yet it has scaled pretty well. The internet architecture has made this happen. We are loosely coupled and modules (pages/site) are separated out.
Except that HTML itself really had nothing to do with the architecture of the Web, Dion--it is just a presentation format. We could have been "just" as successful in growing the Web (from a scalability perspective) had the presentation format been PDF, Flash, or you name it. It was the Architecture of the World Wide Web that led to the organic and anarchic scalability of the Web, not HTML itself. The fact that HTML is dynamically typed (and I take issue with that, as well: HTML isn't typed in the traditional sense of the term, nor is XML for that matter) is a red herring.

Ruby has its merits, Dion--you don't need to make spurious comparisons to try and justify it. Let programmers discover the beauty that is dynamically-typed programming on their own.




Tuesday, November 08, 2005 2:25:16 AM (Pacific Standard Time, UTC-08:00)
 Sunday, October 30, 2005
Porting legacy code

Matt Davey poses an interesting question:

The problem:
  • C++ Corba legacy codebase (5+ years old, 1 million lines)
  • No unit tests
  • Little test data
  • Limited knowledge transfer from the original development team.
  • A flaky environment to run the application in.
The requirement:
  • Port the C++ result accumulation and session management code to Java
Do you:
  1. Write C++ unit tests to understand the current system, then write Java equivalent code using TDD
  2. Write Java tests using TDD based on your understanding of the C++ code
  3. Hope you understand the C++ code, and JFDI in Java
  4. Give up and go home
  5. Get the original development team to do the work
Ah, I love the smell of legacy code in the morning. :-)

My answer: depends. (Typical.) Here's what I mean:

  • Option 1 is clearly the "best" answer if the goal is to produce code that will most accurately match what the current C++ code is doing, but also represents the greatest time and energy commitment, as well as making the fundamental assumption that what the C++ code does today is correct in the first place.
  • Option 2 is the approach to take if the time crunch is a bit tighter and/or if the C++ unit tests can't be sold to management ("You're just going to throw them away anyway!"), particularly if the team working on the port has many or all of the original C++ devs. It also allows for the inevitable "You know, we always wanted to change how that code worked, so why don't we...." requirements changes.
  • Option 3 is probably appropriate in those shops where WHISKEY (Why the Hell Isn't Somebody Koding Everything Yet) is considered an acceptable development methodology, but the lack of unit tests for the Java port will catch up to you someday (as it always does).
  • Option 4 is probably best if the company you work for is seriously considering Option 3. :-)
  • Option 5 is only viable if the original development team is available (not going to happen if you outsourced it, by the way), able to work on it (meaning they've flipped the switch to Java at both a syntactic and semantic level), and isn't otherwise engaged on another project (which is probably the dealbreaker).
Matt also left out a few options:
  6. Let management believe in the whizzy-bang code conversion wizard that such-and-such company is trying to sell them on that "guarantees" 99% code translation and compatibility
  7. Let management outsource the port, and let them worry about it
  8. Give it all up and start from scratch--who needs that system anyway? It's not like anybody ever really used it, right?

Porting legacy code is one of the least-favorite projects of any software developer, but what few developers seem to realize is that it's also one of the least-favorite of management: it's a project that has no discernible ROI beyond that of "getting us out of the Stone Age". You might argue that the code becomes more maintainable if it's written in whatever-the-latest-technology-flavor-is-today, but the truth of the matter is, today's hot language is tomorrow's legacy language, subject to being rewritten in tomorrow's hot language. (Any programmer who's been writing code for more than five years probably already knows this, and any programmer who's been writing code for more than 10 years almost certainly knows this.)

Companies have been on this hamster wheel for far too long. They've gone through several transitions, particularly the C++-to-COM/CORBA-to-Java/EJB transitions over the last decade, and they're starting to resist if not outright reject the idea. Instead, they're preferring to find ways to create interoperable solutions rather than ported solutions--hence the huge interest in Web services when they first came out (and the interest in CORBA when it first came out, and the interest in middleware products in general like Tuxedo when they first came out, and so on). Integration still remains the "hard problem" of our industry, one that none of the new languages or platforms seem to want to address until they have to. Witness, for example, Sun's reluctance to really adopt any sort of external-facing technology into Java until they had to (meaning the Java Connector Architecture; their adoption of CORBA was half-hearted at best and a PR move at worst). .NET suffers the same problem, though fortunately Microsoft was wise enough to realize that shipping .NET without a good Win32/COM interop story was going to kill it before it left the gate. C++ at least had the advantage of being call-compatible with C (if you declared the prototypes correctly), and so could automatically interop against the operating system's libraries pretty easily. In fact, it could be argued that C has long been the de-facto call-level interoperability standard (Python has C bindings, Ruby has C bindings, Java reluctantly, it seems, supports C bindings through JNI, and so on), but of course that only works within a given platform/OS, since C offers so little by way of standardization and the operating systems have never been able to create a portable OS layer beyond the simple stuff; POSIX was arguably the closest they came, and many's the POSIX programmer who will tell you just how successful THAT was.

My point? I hereby declare a rule that any new language developed should think first about its interoperability bindings, and developers contemplating the adoption of a new language must flesh out, in concrete form, how they will integrate the hot new language into their existing architecture, or else they can't use it. (Yes, this applies equally to Ruby, Java, .NET, C++, and all the rest, even FORTRAN--no exceptions.) If you can't describe how it'll integrate into your current stuff, then you're just fascinated with the bright shiny new toy and need to grow up. It doesn't really matter to me how it integrates--through a database, through files on a filesystem, through a message-passing interface like JMS, or through a call-level interface, just have SOME kind of plan for hooking your new <technology X> project into the rest of the enterprise. (And yes, those answers are there for each of those languages/platforms; the test is not whether such answers exist, but how they map into your existing infrastructure.)

What's more, I hereby rededicate this blog to finding interoperability solutions across the technology spectrum--got an interop problem you're not sure how to solve? Email me and (with your permission) I'll post the response--sort of an "Ann Landers" for interop geeks. :-)

By the way, this conundrum can be genericized pretty easily using generics/templates:

enum Q
{
  No, Bad, Little, Flakey, Untouchable
};
enum technology
{
  C, C++, Java, C#, C++/CLI, VB 6, VB 7, VB 8, FORTRAN, COBOL, Smalltalk, Lisp, ...
};

Problem<technology X, technology Y, type T extends AbstractTest, enum Q>:
{
  • <X> legacy codebase (<int N where N > 1> years old, <int L where L > 1000> lines)
  • No <type T> tests
  • <Q> test data
  • <Q> knowledge transfer from the original development team
  • <Q> environment to run the application in.
} returning requirement:
  • Port the <X> project to <Y>
(I thought about doing it in Schema, but this seemed geekier... and easier, given all the angle-brackets XSD would require. ;-) )


.NET | C++ | Development Processes | Java/J2EE | Ruby | XML Services

Sunday, October 30, 2005 1:17:33 PM (Pacific Daylight Time, UTC-07:00)
Comments [19]  | 
 Friday, October 28, 2005
Concurrent languages

At the Seattle Code Camp, where I hosted a discussion (can hardly call it a lecture--I didn't do most of the talking this time, as it turned out) on language innovations, one of the topics that came up was the notion of concurrency, and of course Herb Sutter's "The Free Lunch Is Over" article from DDJ from some months ago. That put a bug in my ear that's been there ever since: what sort of languages out there support concurrency in some form, baked into the language? I've started to compile a list, but any other suggestions/references would be welcome; I'd like to keep it to "active" languages (as opposed to languages no longer under active development), but if there's a particular concurrent language that had some kind of major influence on a branch of thinking, I'd love to see it listed. And by "language" here I'm willing to be flexible--extensions to preexisting languages (a la OpenMP) are interesting in their own right. But I'd like to keep it to language-level constructs, not library-level constructs--so C-with-POSIX, C++-with-BOOST or Java-with-java.util.concurrent aren't going to make the list, since they mostly support concurrency through the low-level mechanism of "start yer own thread". I'm interested in languages that do more than that. :-)

So far, what I've come up with includes:

  • Cw (aka C-omega): a combination of X#/Xen and Polyphonic C#, Cw provides an interesting concept called "chords" that suggests that methods of classes "work together" in pairs to handle concurrent access.
  • OpenMP: an extension to FORTRAN, C and C++, OpenMP uses #pragmas (in C and C++) to declare regions of code where an OpenMP compiler can spawn off threads and provide concurrent execution. What makes this interesting is its intersection with the mainstream: Visual Studio 2005 ships an OpenMP-capable compiler, and it works for both unmanaged and C++/CLI code, meaning that this may be an interesting approach to handling concurrency inside of .NET apps.

I know there's more out there--fire away! Regardless of whether they compile for .NET, JVM, or unmanaged code, I'm interested in seeing what others have been exploring and/or playing around with. Academic links particularly wanted--they have a tendency to push the edge of the envelope (and some would say sanity) when it comes to areas like this.
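For contrast, here's a rough sketch of the library-level style I'm leaving off the list, using java.util.concurrent. The class and method names (LibraryLevelSum, parallelSum) are made up for illustration. Notice that the language knows nothing about the parallelism: the programmer carves up the range, hands the pieces to a thread pool, and stitches the results back together by hand, which is roughly the bookkeeping a language-level construct is supposed to take off your plate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: library-level concurrency, managed by hand.
class LibraryLevelSum {
    // Sums 1..n by splitting the range across a fixed pool of worker threads.
    static long parallelSum(int n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (n + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int lo = w * chunk + 1;
                final int hi = Math.min(n, (w + 1) * chunk);
                // Each task sums its own slice; an empty slice yields 0.
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = lo; i <= hi; i++) {
                        sum += i;
                    }
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            // Joining the partial results is also the programmer's job.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(100, 4)); // prints 5050
    }
}
```

Nothing here is wrong, exactly; it's just that every line of the partitioning and joining is mine to get right, and the compiler can't help.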


.NET | C++ | Java/J2EE | Ruby | XML Services

Friday, October 28, 2005 6:08:36 PM (Pacific Daylight Time, UTC-07:00)
Comments [9]  | 
 Tuesday, October 25, 2005
Rotor patch for XP SP2, 2003, FreeBSD 5.2, and Mac OSX 10.3

I asked Jan Kotas about a patch he'd made for Rotor (SSCLI) to run on XP SP2, Windows 2003, FreeBSD 5.2 and Mac OS X, since the location Joel had blogged about is no longer available--the www.sscli.net server has been shut down--and he was gracious enough to send it to me. Figuring that others would like to find the same patch, I'm posting it here (which hopefully isn't in violation of the Shared Source license; email me if you're Microsoft and want me to cease-and-desist). This patch, I believe, is to the last official release of the SSCLI tarball (which you can get from microsoft.com).

ssclipatch_20040514.diff.gz (104.21 KB)

By the way, guys, we're all eagerly looking forward to Rotor Whidbey! :-)


.NET

Tuesday, October 25, 2005 3:10:29 PM (Pacific Daylight Time, UTC-07:00)
Comments [1]  | 
WS-* support on the Java platform

Christian Weyer has created a pretty comprehensive chart of WS-* specs and how they map to .NET technologies (which specs are supported in which product), and I realized that I've not seen a similar chart in the Java space detailing WS-* spec to JCP spec, nor how the WS-* specs and/or JCP specs map to various XML service providers (Axis 1.x, 2.x, WebLogic, and so on). So I thought I'd draft one up, but before I do, does anybody know of a similar writeup already existing in the Java space?


.NET | Java/J2EE | XML Services

Tuesday, October 25, 2005 12:21:51 PM (Pacific Daylight Time, UTC-07:00)
Comments [1]  | 
 Wednesday, October 19, 2005
Sorry, Lispers--no offense intended

I noticed a referrer URL in my logs from a Lisp chat channel, where apparently a collection of Lisp programmers found my dynamic languages blog entry and were a little less than impressed at my Lisp knowledge. Let's make something REALLY clear right now:

I know almost nothing about Lisp. :-)

Seriously, my proposal for giving a talk on Lisp was to offer the take of a guy who's been a statically-typed guy for a decade, who's coming to see Lisp and trying to explain its concepts to other statically-typed guys, not that of a Lisp expert talking to other Lisp experts. In fact, I'd love it if those who were on the chat emailed me privately so I can try to understand it better.

In the meantime, though, I do know what I've begun to pick up out of books (my current tome being Practical Common Lisp, from APress) and the various Lispers I've talked to in the past, and I do know (until somebody can prove otherwise) that Lisp has a small set of core primitives from which the remainder of the language is built. If that's not the case, show me otherwise. :-)




Wednesday, October 19, 2005 4:44:41 PM (Pacific Daylight Time, UTC-07:00)
Comments [2]  | 
 Tuesday, October 18, 2005
Dynamic languages, type systems and self-modifying systems

Stu Halloway has responded to my earlier post about dynamic languages, and Stu refines his argument. Still wrong, but at least now it's refined. :-)

Stu writes that we're "talking past one another", and in particular notes that

The critical point is that these abstractions are implemented in the language itself. Developers can (and do!) modify these core abstractions to work in different ways.
where "these abstractions" are referring to "inheritance, encapsulation, delegation", etc, from my post.

Where Stu, I think, is being fallacious with this is that he presumes a bit much with respect to at least a few of these languages; in particular Ruby has some facility for self-modification and language evolution, but still relies on a core set of principles that are implemented in native code inside the Ruby interpreter. Ditto for Smalltalk, ditto for Python, and even for Lisp, the poster child for dynamic languages. (In all fairness, Stu does admit this--in a backhanded sort of way--when he notes that "The rules for adding new methods to existing classes aren’t (for the most part) in the core of ruby — they are implemented in Ruby source code.")

What Stu's point does raise, however, is still the valid point that languages offer a continuum of self-modification and/or evolution, and that languages like Ruby, Smalltalk, Python or Lisp clearly come in on the "more" end of that continuum as opposed to languages like C# or Java or C++. And this plays into his later comment when he states, "It’s all about control. With a vendor-oriented language like C#, core abstractions are much more firmly controlled by the language vendor. Conversely, developer-oriented languages like Python leave more of these choices to the developer (although they tend to provide reasonable defaults). So, again, who do you trust?"

There are two points I want to raise here. One is technical, the other political/cultural.

First, the technical: dynamic languages may choose to expose more meta-control over the language, but there's nothing inherent in the dynamic language that requires it, nor is there anything in a static language that prevents it. Languages/tools like Shigeru Chiba's OpenC++ or Javassist, or Michiaki Tatsubori's OpenJava clearly demonstrate that we can have a great deal of flexibility in how the language looks without losing the benefits of statically-typed environments. So to attribute this meta-linguistic capability exclusively to dynamic languages is a fallacy.
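To make that a little more concrete, here's a minimal sketch. To be clear, it uses none of the tools just named; it's the JDK's own java.lang.reflect.Proxy, swapped in because it needs no third-party library, and it's a far weaker facility than a real meta-object protocol. The Greeter interface and the traced helper are invented for the illustration. What it shows is a statically-typed program intercepting every call on an object at run time, while the compiler still checks the caller against the interface.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: a small dose of meta-control in a static language.
class StaticMetaSketch {
    // Invented interface; callers are still type-checked against it.
    interface Greeter {
        String greet(String name);
    }

    // Wraps a Greeter so each call is recorded in the log, then delegated.
    static Greeter traced(Greeter target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());          // the "meta" step
            return method.invoke(target, args); // the original behavior
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Greeter greeter = traced(name -> "Hello, " + name, log);
        System.out.println(greeter.greet("Stu")); // prints Hello, Stu
        System.out.println(log);                  // prints [greet]
    }
}
```

The interception point lives in ordinary source code rather than in the compiler or the VM, which is the (small) sense in which the developer, not the vendor, is deciding what a method call means here.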

Second, the cultural: is the idea of granting meta-linguistic power (known as a meta-object protocol, or MOP) to a language a good thing? Stu asserts that it is: "My concern is who controls the abstractions. Developer-oriented languages (like Scheme) give a lot of control (and responsibility) to developers. Vendor-oriented languages (like Java) leave that control more firmly in the hands of the vendor." So in whose hands are these abilities to change the language best placed?

*deep breath* I don't trust developers. There, I've said it.

I say this not because I think developers are all 5-year-olds who need to be carefully watched and monitored and chastised gently when they actually run with scissors, but because in some cases, we don't necessarily know what we're doing when we start adopting certain features or ideas. Here's an example of what I mean: about eight years ago, when servlets were new and Reflection was still a Brand New Topic amongst developers, I read an article on building a servlet-based system that was touted as "dynamic" and "powerful": in essence, the servlet would look for a query parameter in the request URL and Reflect for that method name on the servlet and/or alternate class, and execute it.

This is a Good Thing?!? Incredibly dynamic, granted, but given the overhead and performance implications (not to mention security concerns), I can't see this as a great way to build scalable, dynamic systems.
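I don't have the original article handy, so what follows is only my reconstruction of the idea, stripped of the servlet API so it stands alone; Handler, dispatch and the method names are all hypothetical. The "action" string stands in for the query parameter pulled off the request URL.

```java
import java.lang.reflect.Method;

// Hypothetical example class: dispatching on an untrusted method name.
class ReflectiveDispatchSketch {
    // Stand-in for the servlet (or its "alternate class").
    static class Handler {
        public String list() { return "listing"; }
        // Never meant to be reachable from a URL, but it is public.
        public String deleteEverything() { return "deleted"; }
    }

    // Finds a public no-arg method with the requested name and invokes it.
    static Object dispatch(Object target, String action) {
        try {
            Method method = target.getClass().getMethod(action);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot invoke action: " + action, e);
        }
    }

    public static void main(String[] args) {
        Handler handler = new Handler();
        System.out.println(dispatch(handler, "list"));             // prints listing
        System.out.println(dispatch(handler, "deleteEverything")); // prints deleted
    }
}
```

Every request pays for a reflective lookup, and anybody who can edit a URL can call any public no-arg method on the class, including the ones inherited from Object. An explicit table of allowed action names would close that hole, but at that point most of the "dynamic" is gone.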

Gregor Kiczales, the inventor of AspectJ and long-time CLOS wonk--so you know he has experience on both sides of this fence--told me once that one of the greatest flaws of CLOS (I don't know if he used the word "flaw", per se, but that was my takeaway) was that it allowed developers too much power. Developers writing CLOS systems apparently had this tendency to do too many wild-and-crazy things that ultimately (in his view) led to a number of write-only CLOS codebases. AspectJ was deliberately constrained to prevent these sorts of things, and whether or not he's succeeded in that remains to be seen--many long-time O-O advocates still see AspectJ as "an evil hacking language", despite those constraints.

I see the same concern every time a developer starts talking about doing bytecode manipulation at load-time--just because you can doesn't mean you should. In this respect, I trust the guys who've been down this road before much more so than developers who are just coming to this and are starting to flex their new-found freedom and will (undoubtedly) start building systems that exercise this power.

In the end, Stu's right, in that he and I share a lot of common ground--working together for four years has a tendency to do that to you. And I won't even suggest that he's "wrong" so much as that he and I simply disagree on how much meta-control should be baked into a language, dynamic or otherwise.


.NET | C++ | Java/J2EE | Ruby

Tuesday, October 18, 2005 10:07:18 AM (Pacific Daylight Time, UTC-07:00)
Comments [5]  | 
 Thursday, October 13, 2005
CORBA did what?

Long-time blog reader Dilip Ranganathan pointed me to this discussion over on Steve Vinoski's blog about the history of CORBA, and in particular the discussion that ensued in the comments section on the entry. I found it interesting from two perspectives:

  1. The idea that two people could look at the history of CORBA (having presumably lived through it) and come away with entirely different ideas of what that history was, and
  2. The discussion over CORBA's role and influence on the current XML services environment.

For starters, Steve Vinoski was a bit miffed at the idea posited by Mark Baker that CORBA failed. Sorry, Steve, I have to say it, but I agree with Mark--CORBA never fulfilled its intended promise of seamless middleware interoperability and integration capabilities, and certainly not over the Internet in any meaningful way. By the time CORBA began to address some of those issues--firewalls being a big one--the world had already pretty much abandoned both the "distributed object brokers" (the