|
JOB REFERRALS
|
|
|
|
ON THIS PAGE
|
|
|
|
|
ARCHIVES
|
| February, 2010 (1) |
| January, 2010 (4) |
| December, 2009 (1) |
| November, 2009 (3) |
| October, 2009 (3) |
| August, 2009 (2) |
| July, 2009 (4) |
| June, 2009 (3) |
| May, 2009 (6) |
| April, 2009 (4) |
| March, 2009 (4) |
| February, 2009 (5) |
| January, 2009 (11) |
| December, 2008 (3) |
| November, 2008 (9) |
| October, 2008 (1) |
| September, 2008 (2) |
| August, 2008 (4) |
| July, 2008 (10) |
| June, 2008 (5) |
| May, 2008 (10) |
| April, 2008 (13) |
| March, 2008 (11) |
| February, 2008 (18) |
| January, 2008 (17) |
| December, 2007 (12) |
| November, 2007 (2) |
| October, 2007 (6) |
| September, 2007 (1) |
| August, 2007 (2) |
| July, 2007 (7) |
| June, 2007 (1) |
| May, 2007 (1) |
| April, 2007 (2) |
| March, 2007 (2) |
| February, 2007 (1) |
| January, 2007 (16) |
| December, 2006 (3) |
| November, 2006 (7) |
| October, 2006 (5) |
| September, 2006 (1) |
| June, 2006 (4) |
| May, 2006 (3) |
| April, 2006 (3) |
| March, 2006 (17) |
| February, 2006 (5) |
| January, 2006 (13) |
| December, 2005 (2) |
| November, 2005 (6) |
| October, 2005 (15) |
| September, 2005 (16) |
| August, 2005 (17) |
|
|
|
CATEGORIES
|
|
|
|
|
BLOGROLL
|
|
|
|
|
LINKS
|
|
|
|
|
SEARCH
|
|
|
|
|
MY BOOKS
|
|
|
|
|
DISCLAIMER
|
Powered by:
newtelligence dasBlog 1.9.7067.0
The opinions expressed herein are my own personal opinions and do not represent
my employer's view in any way.
© Copyright
2010
,
Ted Neward
E-mail
|
|
|
|
|
 Monday, May 19, 2008
|
Life at Microsoft
|
|
For those of you who aren't from that side of the world, you might find it... inspirational... to see what life at Microsoft is really like. (And I can vouch for some of these myself, having spent some time inside those buildings....)
Monday, May 19, 2008 3:33:55 AM (Pacific Daylight Time, UTC-07:00)
|
|
 Sunday, May 18, 2008
|
Guide you, the Force should
|
|
Steve Yegge posted the transcript from a talk on dynamic languages that he gave at Stanford. Cedric Beust posted a response to Steve's talk, espousing statically-typed languages. Numerous comments and flamewars erupted, not to mention a Star Wars analogy (which always makes things more fun). This is my feeble attempt to play galactic peacemaker. Or at least galactic color commentary and play-by-play. I have no doubts about its efficacy, and that it will only fan the flames, for that's how these things work. Still, I feel a certain perverse pleasure in pretending, so.... Enjoy the carnage that results. First of all, let me be very honest: I like Steve's talk. I think he does a pretty good job of representing the negatives and positives of dynamic languages, though there are obviously places where I'm going to disagree: - "Because we all know that C++ has some very serious problems, that organizations, you know, put hundreds of staff years into fixing. Portability across compiler upgrades, across platforms, I mean the list goes on and on and on. C++ is like an evolutionary sort of dead-end. But, you know, it's fast, right?" Funny, I doubt Bjarne Stroustrup or Herb Sutter would agree with the "evolutionary dead-end" statement, but they're biased, so let's put that aside for a moment. Have organizations put hundreds of staff years into fixing the problems of C++? Possibly--it would be good to know what Steve considers the "very serious problems" of C++, because that list he does give (compiler/platform/language upgrades and portability across platforms) seems problematic regardless of the langauge or platform you choose--Lord knows we saw that with Java, and Lord knows we see it with ECMAScript in the browser, too. The larger question should be, can, and does, the language evolve? Clearly, based on the work in the Boost libraries and the C++0X standards work, the answer is yes, every bit as much as Java or C#/.NET is, and arguably much more so than what we're seeing in some of the dynamic languages. C++ is getting a standardized memory model, which will make a portable threading package possible, as well as lambda expressions, which is a far cry from the language that I grew up with. That seems evolutionary to me. What's more, Bjarne has said, point-blank, that he prefers taking a slow approach to adopting new features or ideas, so that it can be "done right", and I think that's every bit a fair position to take, regardless of whether I agree with it or not. (I'd probably wish for a faster adoption curve, but that's me.) Oh, and if you're thinking that C++'s problems stem from its memory management approach, you've never written C++ with a garbage collector library.
- "And so you ask them, why not use, like, D? Or Objective-C. And they say, "well, what if there's a garbage collection pause?" " Ah, yes, the "we fear garbage collection" argument. I would hope that Java and C#/.NET have put that particular debate to rest by now, but in the event that said dragon's not yet slain, let's do so now: GC does soak up some cycles, but for the most part, for most applications, the cost is lost in the noise of everything else. As with all things performance related, however, profile.
- "And so, you know, their whole argument is based on these fallacious, you know, sort of almost pseudo-religious... and often it's the case that they're actually based on things that used to be true, but they're not really true anymore, and we're gonna get to some of the interesting ones here." Steve, almost all of these discussions are pseudo-religious in nature. For some reason, programmers like to identify themselves in terms of the language they use, and that just sets up the religious nature of the debate from the get-go.
- "You know how there's Moore's Law, and there are all these conjectures in our industry that involve, you know, how things work. And one of them is that languages get replaced every ten years. ... Because that's what was happening up until like 1995. But the barriers to adoption are really high." I can't tell from the transcript of Steve's talk if this is his opinion, or that this is a conjecture/belief of the industry; in either case, I thoroughly disagree with this sentiment--the barriers to entry to create your own language have never been lower than today, and various elements of research work and available projects just keep making it easier and easier to do, particularly if you target one of the available execution engines. Now, granted, if you want your language to look different from the other languages out there, or if you want to do some seriously cool stuff, yes, there's a fair amount of work you still have to do... but that's always going to be the case. As we find ways to make it easier to build what's "cool" today, the definition of what's "cool" rises in result. (Nowhere is this more clear than in the game industry, for example.) Moore's Law begets Ballmer's Corollary: User expectations double every eighteen months, requiring us to use up all that power trying to meet those expectations with fancier ways of doing things.
- It's a section that's too long to quote directly here, but Steve goes on to talk about how programmers aren't using these alternative languages, and that if you even suggest trying to use D or Scala or [fill in the blank], you're going to get "lynched for trying to use a language that the other engineers don't know. ... And [my intern] is, like, "well I understand the argument" and I'm like "No, no, no! You've never been in a company where there's an engineer with a Computer Science degree and ten years of experience, an architect, who's in your face screaming at you, with spittle flying on you, because you suggested using, you know... D. Or Haskell. Or Lisp, or Erlang, or take your pick." " Steve, with all due respect to your experience, I know plenty of engineers and companies who are using some of these "alternative" languages, and they're having some good success. But frankly, if you work in a company where an architect is "in your face screaming at you, with spittle flying on you", frankly, it's time to move on, because that company is never going to try anything new. Period. I don't care if we're talking about languages, Spring, agile approaches, or trying a new place for lunch today. Companies get into a rut just as much as individuals do, and if the company doesn't challenge that rut every so often, they're going to get bypassed. Period, end of story. That doesn't mean trying every new thing under the sun on your next "mission-critical" project, but for God's sake, Mr. CTO, do you really want to wait until your competition has bypassed you before adopting something new? There's a lot of project work that goes on that has room for some experimentation and experience-gathering before utilizing something on the next big project.
- "I made the famously, horribly, career-shatteringly bad mistake of trying to use Ruby at Google, for this project. ... And I became, very quickly, I mean almost overnight, the Most Hated Person At Google. And, uh, and I'd have arguments with people about it, and they'd be like Nooooooo, WHAT IF... And ultimately, you know, ultimately they actually convinced me that they were right, in the sense that there actually were a few things. There were some taxes that I was imposing on the systems people, where they were gonna have to have some maintenance issues that they wouldn't have [otherwise had]. Those reasons I thought were good ones." Recognizing the cost of deploying a new platform into the IT sphere is a huge deal that programmers frequently try to ignore in their zeal to adopt something new, and as a result, IT departments frequently swing the other way, resisting all change until it becomes inevitable. This is where running on top of one of the existing execution environments (the JVM or the CLR in particular) becomes so powerful--the actual deployment platform doesn't change, and the IT guys remain more or less disconnected from the whole scenario. This is the principal advantage JRuby and IronPython and Jython and IronRuby will have over their native-interpreted counterparts. As for maintenance issues, aside from the "somebody's gonna have to learn this language" tax (which is a real tax but far less costly, I believe, than most people think it to be), I'm not sure what issues would crop up--the IT guys don't usually change your Java or C# or Visual Basic code in production, do they?
- Steve then gets into the discussion about tools around dynamic languages, and I heartily agree with him: the tool vendors have a much deeper toolchest than we (non-tool vendor programmers) give them credit for, and they're proving it left and right as IDEs get better and better for dynamic languages like Groovy and Ruby. In some areas, though, I think we as developers lean too strongly against our tools, expecting them to be able to do the thinking for us, and getting all grumpy when they can't or don't. Granted, I don't want to give up my IntelliJ any time soon, but let's think about this for a second: if I can't program Java today without IntelliJ, then is that my fault, the language's fault, the industry's fault, or some combination thereof? Or is it maybe just a fact of progress? (Would anybody consider building assembly language in Notepad today? Does that make assembly language wrong? Or just the wrong tool for the job?)
- Steve's point about how Java IDE's miss the Reflective case is a good one, and one that every Java programmer should consider. How much of your Java (or C# or C++) code actually isn't capturable directly in the IDE?
- Steve then goes into the ubiquitous Java-generics rant, and I'll have to admit, he's got some good points here--why didn't we (Java, though this applies just as equally to C#) just let the runtime throw the exception when the cast fails, and otherwise just let things go? My guess is that there's probably some good rationale that presumes you already accept the necessity of more verbose syntax in exchange for knowing where the cast might potentially fail, even though there's plenty of other places in the language where exceptions can be thrown without that verbose syntax warning you of that fact, array indexers being a big one. One thing I will point out, however, in what I believe is a refutation of what Steve's suggesting in this discussion: from my research in the area and my memory about the subject from way back when, the javac compiler really doesn't do much in the way of optimizations, and hasn't tried since about JDK 1.1, for the precise reason he points out: the JITter's going to optimize all this stuff anyway, so it's easier to just relax and let the JITter do the heavy lifting.
- The discussion about optimizations is interesting, and while I think he glosses over some issues and hyper-focuses on others, two points stand out, in my mind: performance hits often come from places you don't expect, and that micro-benchmarks generally don't prove much of anything. Sometimes that hit will come from the language, and sometimes that hit will come from something entirely differently. Profile first. Don't let your intuition get in the way, because your intuition sucks. Mine does, too, by the way--there's just too many moving parts to be able to keep it all straight in your head.
Steve then launches into a series of Q&A with the audience, but we'll let the light dim on that stage, and turn our attention over to Cedric's response. - "... the overall idea is that dynamically typed languages are on the rise and statically typed languages are on their way out." Actually, the transcript I read seemed to imply that Steve thought that dynamically typed languages are cool but that nobody will use them for a variety of reasons, some of which he agreed with. I thoroughly disagree with Steve's conclusion there, by the way, but so be it ...
- "I'm happy to be the Luke Skywalker to his Darth Vader. ... Evil shall not prevail." Yes, let's not let this debate fall into the pseudo-religious category, shall we? Fully religious debates have such a better track record of success, so let's just make it "good vs evil", in order to ensure emotions get all neatly wrapped throughout. Just remember, Cedric, even Satan can quote the Bible... and it was Jesus telling us that, so if you disagree with anything I say below you must be some kind of Al-Qaeda terrorist. Or something.
- [Editor's note: Oh, shit, he did NOT just call Cedric a terrorist and a Satanist and invoke the name of Christ in all this. Time to roll out the disclaimer... "Ladies and gentlemen, the views and opinions expressed in this blog entry...."]
- [Author's note: For the humor-challenged in the crowd, no I do not think Cedric is a terrorist. I like Cedric, and hopefully he still likes me, too. Of course, I have also been accused of being the Antichrist, so what that says about Cedric I'm not sure.]
- Cedric on Scala:
- "Have you taken a look at implicits? Seriously? Just when I thought we were not just done realizing that global variables are bad, but we have actually come up with better ways to leverage the concept with DI frameworks such as Guice, Scala knocks the wind out of us with implicits and all our hardly earned knowledge about side effects is going down the drain again." Umm.... Cedric? One reaction comes to mind here, and it's best expressed as.... WTF?!? Implicits are not global variables or DI, they're more a way of doing conversions, a la autoboxing but more flexible. I agree that casual use of implicits can get you in trouble, but I'd have thought Scala's "there are no operators just methods with funny names" would be the more disconcerting of the two.
- "As for pattern matching, it makes me feel as if all the careful data abstraction that I have built inside my objects in order to isolate them from the unforgiving world are, again, thrown out of the window because I am now forced to write deconstructors to expose all this state just so my classes can be put in a statement that doesn't even have the courtesy to dress up as something that doesn't smell like a switch/case..." I suppose if you looked at pattern-matching and saw nothing more than a switch/case, then I'd agree with you, but it turns out that pattern-matching is a lot more powerful than just being a switch/case. I think what Cedric's opposing is the fact that pattern-matching can actually bind to variables expressed in the individual match clauses, which might look like deconstructors exposing state... but that's not the way they get used, from what I've seen thus far. But, hey, just because the language offers it, people will use it wrongly, right? So God forbid a language's library should allow me to, say, execute private methods or access private fields....
- Cedric on the difficulty to impose a non-mainstream language in the industry: "Let me turn the table on you and imagine that one of your coworkers comes to you and tells you that he really wants to implement his part of the project in this awesome language called Draco. How would you react? Well, you're a pragmatic kind of guy and even though the idea seems wacky, I'm sure you would start by doing some homework (which would show you that Draco was an awesome language used back in the days on the Amiga). Reading up on Draco, you realize that it's indeed a very cool language that has some features that are a good match for the problem at hand. But even as you realize this, you already know what you need to tell that guy, right? Probably something like "You're out of your mind, go back to Eclipse and get cranking". And suddenly, you've become *that* guy. Just because you showed some common sense." If, I suppose, we equate "common sense" with "thinking the way Cedric does", sure, that makes sense. But you know, if it turned out that I was writing something that targeted the Amiga, and Draco did, in fact, give us a huge boost on the competition, and the drawbacks of using Draco seemed to pale next to the advantages of using it, then... Well, gawrsh, boss, it jus' might make sense to use 'dis har Draco thang, even tho it ain't Java. This is called risk mitigation, and frankly, it's something too few companies go through because they've "standardized" on a language and API set across the company that's hardly applicable to the problem at hand. Don't get me wrong--you don't want the opposite extreme, which is total anarchy in the operations center as people use any and all languages/platforms available to them on a willy-nilly basis, but the funny thing is, this is a continuum, not a binary switch. This is where languages-on-execution-engines (like the JVM or CLR) gets such a great win-win condition: IT can just think in terms of supporting the JVM or CLR, and developers can then think in whatever language they want, so long it compiles/runs on those platforms.
- Cedric on building tools for dynamic languages: "I still strongly disagree with that. It is different *and* harder (and in some cases, impossible). Your point regarding the fact that static refactoring doesn't cover 100% of the cases is well taken, but it's 1) darn close to 100% and 2) getting closer to it much faster than any dynamic tool ever could. By the way, Java refactorings correcting comments, XML and property files are getting pretty common these days, but good luck trying to perform a reliable method renaming in 100 Ruby files." I'm not going to weigh in here, since I don't write tools for either dynamic or static languages, but watching what the IntelliJ IDEA guys are doing with Groovy, and what the NetBeans guys are doing with Ruby, I'm more inclined to believe in what Steve thinks than what Cedric does. As for the "reliable method renaming in 100 Ruby files", I don't know this for a fact, but I'll be willing to be that we're a lot closer to that than Cedric thinks we are. (I'd love to hear comments from somebody neck-deep in the Ruby crowd who's done this and their experience doing so.)
- Cedric on generics: "I no longer bother trying to understand why complex Generic examples are so... well, darn complex. Yes, it's pretty darn hard to follow sometimes, but here are a few points for you to ponder:
- 90% of the Java programmers (including myself) only ever use Generics for Collections.
- These same programmers never go as far as nesting two Generic declarations.
- For API developers and users alike, Generics are a huge progress.
- Scala still requires you to understand covariance and contravariance (but with different rules. People seem to say that Scala's rules are simpler, I'm not so sure, but not interested in finding out for the aforementioned reasons)."
Honestly, Cedric, the fact that 90% of the Java programmers are only using generics for collections doesn't sway me in the slightest. 90% of the world's population doesn't use Calculus, either, but that doesn't mean that it's not useful, or that we shouldn't be trying to improve our understanding of it and how to do useful things with it. After looking at what the C++ community has done with templates (the Boost libraries) and what .NET is doing with its generic system (LINQ and F# to cite two examples), I think Java missed a huge opportunity with generics. Type erasure may have made sense in a world where Java was the only practical language on top of the JVM, but in a world that's coming to accept Groovy and JRuby and Scala as potential equals on the JVM, it makes no sense whatsoever. Meanwhile, when thinking about Scala, let's take careful note that a Scala programmer can go a long way with the langauge before having to think about covariance, contravariance, upper and lower type bounds, simpler or not. (For what it's worth, I agree with you, I'm not sure if they're simpler, either.) - Cedric on dynamic language performance: "What will keep preventing dynamically typed languages from displacing statically typed ones in large scale software is not performance, it's the simple fact that it's impossible to make sense of a giant ball of typeless source files, which causes automatic refactorings to be unreliable, hence hardly applicable, which in turn makes developers scared of refactoring. And it's all downhill from there. Hello bit rot." There's a certain circular logic here--if we presume that IDEs can't make sense of "typeless source files" (I wasn't aware that any source file was statically typed, honestly--this must be something Google teaches), then it follows that refactoring will be impossible or at least unreliable, and thus a giant ball of them will be unmanageable. I disagree with Cedric's premise--that IDEs can't make sense of dynamic language code--so therefore I disagree with the entire logical chain as a result. What I don't disagree with is the implicit presumption that the larger the dynamic language source base, the harder it is to keep straight in your head. In fact, I'll even amend that statement further: the larger the source base (dynamic or otherwise), the harder it is to keep straight in your head. Abstractions are key to the long-term success of any project, so the language I work with had best be able to help me create those abstractions, or I'm in trouble once I cross a certain threshold. That's true regardless of the language: C++, Java, C#, Ruby, or whatever. That's one of the reasons I'm spending time trying to get my head around Lisp and Scheme, because those languages were all about building abstractions upon abstractions upon abstractions, but in libraries, rather than in the language itself, so they could be swapped out and replaced with something else when the abstractions failed or needed evolution.
- Cedric on program unmaintainability: "I hate giving anecdotal evidence to support my points, but that won't stop me from telling a short story that happened to me just two weeks ago: I found myself in this very predicament when trying to improve a Ruby program that 1) I just wrote a few days before and 2) is 200 lines long. I was staring at an object, trying to remember what it does, failing, searching manually in emacs where it was declared, found it as a "Hash", and then realized I still had no idea what the darn thing is. You see my point..." Ain't nothing wrong with anecdotal evidence, Cedric. We all have it, and if we all examine it en masse, some interesting patterns can emerge. Funny thing is, I've had exactly the same experience with C++ code, Java code, and C# code. What does that tell you? It tells me that I probably should have cooked up some better abstractions for those particular snippets, and that's what I ended up doing. As a matter of fact, I just helped a buddy of mine untangle some Ruby code to turn it into C#, and despite the fact that he's never written (or read) a Ruby program in his life, we managed to flip it over to C# in a couple of hours, including the execution of Ruby code blocks (I love anonymous methods) stored in a string-keyed hash within an array. And this was Ruby code that neither of us had ever seen before, much less written it a few days prior.
- Cedric (and Steve) on error messages: "[Steve said] And the weird thing is, I realized early in my career that I would actually rather have a runtime error than a compile error. [Cedric responded] You probably already know this, but you drew the wrong conclusion. You didn't want a runtime error, you wanted a clear error. One that doesn't lie to you, like CFront (and a lot of C++ compilers even today, I hear) used to spit in our faces. And once I have a clear error message, I much prefer to have it as early as possible, thank you very much." Honestly, I agree with Cedric here: I would much prefer errors before execution, as early as possible, so that there's less chance of my users finding the errors I haven't found yet. And I agree that some of the error messages we sometimes get are pretty incomprehensible, particularly from the C++ compiler during template expansion. But how is that different from the ubiquitous Java "ClassCastException: Cannot cast Person to Person" that arises from time to time? Once you know what the message is telling you, it's easy to know how to fix it, but getting to the point of knowing what the error message is telling you requires a good working understanding of Java ClassLoaders. Do we really expect that any tool--static or dynamic, compiler or runtime, is going to be able to produce error messages that somehow precludes the need to have the necessary background to understand it? All errors are relative to the context from which they are born. If you lack that context, the error message, no matter how well-written or phrased, is useless.
- Cedric on "The dynamic nuclear winter": "[Steve said] And everybody else went and chased static. And they've been doing it like crazy. And they've, in my opinion, reached the theoretical bounds of what they can deliver, and it has FAILED. [Cedric responded] Quite honestly, words fail me here." Wow. Just... wow. I can't agree with Steve at all, that static(ically typed languages) has FAILED, or that they've reached the theoretical bounds of what they can deliver, but neither can I say with complete confidence that statically-typed languages are The Way Forward, either. I think, for the time, chasing statically-typed languages was the right thing to do, because for a long time we were in a position where programmer time was cheaper than computer time; now, I believe that this particular metric has flipped, and that it's time we started thinking about what the costs of programmer time really are. (Frankly, I'd love to see a double-blind study on this, but I've no idea how one would carry that out in a scientific manner.)
So.... what's left? Oh, right: if Steve/Vader is Cedric/Luke's father, then who is Cedric/Luke's sister, and why is she wearing a copper-wire bikini while strangling the Haskell/ML crowd/Jabba the Hutt? Maybe this whole Star Wars analogy thing was a bad idea. Look, at the end of the day, the whole static-vs-dynamic thing is a red herring. It doesn't matter. The crucial question is whether or not the language being used does two things, and how well it does them: - Provide the ability to express the concept in your head, and
- Provide the ability to evolve as the concepts in your head evolve
There are certain things that are just impossible to do in C++, for example. I cannot represent the C++ AST inside the program itself. (Before you jump all over me, C++ers of the world, take careful note: I'm not saying that C++ cannot represent an AST, but an AST of itself, at the time it is executing.) This is something dynamic languages--most notably Lisp, but also other languages, including Ruby--do pretty well, because they're building the AST at runtime anyway, in order to execute the code in the first place. Could C++ do this? Perhaps, but the larger question is, would any self-respecting C++ programmer want to? Look at your average Ruby program--80% to 90% (the number may vary, but most of the Rubyists I talk to agree its somewhere in this range) of the program isn't really using the meta-object capabilities of the language, and is just a "simpler/easier/scarier/unchecked" object language. Most of the weird-*ss Rubyisms don't show up in your average Ruby program, but are buried away in some library someplace, and away from the view of the average Ruby programmer. Keep the simple things simple, and make the hard things possible. That' should be the overriding goal of any language, library, or platform. Erik Meijer coined this idea first, and I like it a lot: Why can't we operate on a basic principle of "static when we can (or should), dynamic otherwise"? (Reverse that if it makes you feel better: "dynamic when we can, static otherwise", because the difference is really only one of gradation. It's also an interesting point for discussion, just how much of each is necessary/desirable.) Doing this means we get the best of both worlds, and we can stop this Galactic Civil War before anybody's planet gets blown up. 'Cuz that would suck.
|
 Saturday, May 17, 2008
|
Clearly Thinking... whether in Language, or otherwise
|
|
Steve Vinoski thinks to deflate my arguments with suppositions and presumptions, which I cannot simply let stand. (Sorry, Steve-O, but I think you're out in left field on this one. I'm happy to argue it further with you over beer, but if you want the last word, have at it, and we'll compare scores when we run into each other at the next conference.) Steve first takes aim at my comparison of the Erlang process model to the *nix process model: First, Ted says: Erlang’s reliability model–that is, the spawn-a-thousand-processes model–is not unique to Erlang. In fact, it’s been the model for Unix programs and servers, most notably the Apache web server, for decades. When building a robust system under Unix, a master-slave model, in which a master process spawns (and monitors) n number of child processes to do the actual work, offers that same kind of reliability and robustness. If one of these processes fail (due to corrupted memory access, operating system fault, or what-have-you), the process can simply die and be replaced by a new child process. There’s really no comparison between the UNIX process model (which BTW I hold in very high regard) and Erlang’s approach to achieving high reliability. They are simply not at all the same, and there’s no way you can claim that UNIX “offers that same kind of reliability and robustness” as Erlang can. If it could, wouldn’t virtually every UNIX process be consistently yielding reliability of five nines or better? What Steve misses here is that just because something can, doesn't mean it does. Processes in *nix are just as vulnerable to bad coding practices as are processes in Windows or Mac OS X, and let's be very clear: the robustness and reliability of a system is entirely held hostage to the skill and care of the worst programmer on the system. There is a large difference between theory and practice, Steve, and whether somebody takes *nix up on that offer depends a great deal on how much they're interested in building robust and reliable software. This is where a system's architecture becomes so important--architecture leads developers down a particular path, enabling them to fall into what Rico Mariani once described as "the pit of success", or what I like to call "correctness by default". Windows leads developers down a single-process/multi-thread-based model, and UNIX leads developers down a multi-process-based model. Which one seems more robust and reliable by default to you? (By the way, Erlang's model is apparently neither processes nor threads; according to Wikipedia's entry on Erlang, Erlang processes are neither operating system processes nor operating system threads, but lightweight processes somewhat similar to Java's original “green threads”. Like operating system processes (and unlike green threads and operating system threads) they have no shared state between them. Now, I grant you, Wikipedia is about as accurate as graffiti scrawled on the wall, but if that's an incorrect summation, please point me to the documentation that contradicts this and let's fix the Wikipedia entry while we're at it. But in the meantime, assuming this is correct, it means that Erlang's model is similar to the CLR's AppDomain construct, which has been around since .NET 1.0, and Java's proposed "Isolate" feature which has yet to be implemented.) (Oh, and if the argument here is that Erlang's reliability comes from its lack of shared state between threads, hell, man, that's hardly a difficult architecture to cook up. Most transactional systems get there pretty easily, including EJB, though then programmers then go through dozens of hoops to try and break it.) Next, Steve makes some interesting fundamental assumptions about "high reliability": Obviously, achieving high reliability requires at least two computers. On those systems, what part of the UNIX process model allows a process on one system to seamlessly fork child processes on another and monitor them over there? Yes, there are ways to do it, but would anyone claim they are as reliable and robust as Erlang’s approach? I sure wouldn’t. Also, UNIX pipes provide IPC for processes on the same host, but what about communicating with processes on other hosts? Yes, there are many, many ways to achieve that as well — after all, I’ve spent most of my career working on distributed computing systems, so I’m well aware of the myriad choices here — but that’s actually a problem in this case: too many choices, too many trade-offs, and far too many ways to get it wrong. Erlang can achieve high reliability in part because it solves these issues, and a whole bunch of other related issues such as live code upgrade/downgrade, extremely well. I think you're making some serious assumptions about the definition of "high reliability" here, Steve--building reliable software no more depends on having at least two computers as it does having at least two power sources, two network infrastructures, or two continents. Obviously, the more reliable you want to get with your software, the more redundancy you want to have, but that's a continuum, and one man's "high" reliability is another man's "low" reliability. Often, having just two processes running is redundant enough to get the job done. As for UNIX's process model making it seamless to fork child processes "over there" and monitor them "over here", I know other languages that have supported precisely that since 1995, SR (Synchronizing Resources) being one of them. (SR later was adapted to the JVM to become JR, and the reason I'm aware of them is because I took a couple of classes from both langauges' creator, Ron Olsson, from UC Davis.) Frankly, I think Steve's reaching with this argument--there's no reasoning here, just "I sure wouldn't [claim that they are as reliable and robust as Erlang's approach]" as persuasion. You like Erlang's ability to spin off child processes, Steve, and that's fine, but let's not pretend that Erlang's doing anything incredibly novel here--it's just taking the UNIX model and incorporating it directly into the language. And that's part of the problem--any time we incorporate something directly as part of the language, there's all kinds of versioning and revision issues that come with it. This, to my mind, is one of Scala's (and F#'s and Lisp's and Clojure's and Scheme's and other composite languages') greatest strengths, the ability to create constructs that look like they're part of the language, but in fact come from libraries. Libraries are much much easier to revise and adapt right now, largely because we know how to do it, at least compared against what we know about how to do it with languages. Next, Steve hollers at me for being apparently inconsistent: Ted continues: There is no reason a VM (JVM, CLR, Parrot, etc) could not do this. In fact, here’s the kicker: it would be easier for a VM environment to do this, because VM’s, by their nature, seek to abstract away the details of the underlying platform that muddy up the picture. In your original posting, Ted, you criticized Erlang for having its own VM, yet here you say that a VM approach can yield the best solution for this problem. Aren’t you contradicting yourself? I do criticize Erlang for having its own VM (though I think it's not a VM, it's an interpreter, which is a far cry from an actual VM), and yet I do believe the VMs can yield the best solution for the problem. The key here is simple: how much energy has gone into making the VM fast, robust, scalable, manageable, monitorable, and so on? The JVM and the CLR have (literally) thousands of man-months sunk into them to reach high levels in all those areas. Can Erlang claim the same? How do I tune Erlang's GC? How do I find out if Erlang's GC is holding objects longer than it should? How do I discover if an Erlang process is about to die due to a low-memory condition? Can I hold objects inside of a weak reference in Erlang? All of these were things developed in the JVM and CLR in response to real customer problems, and once done, were available to any and all languages that run on top of that platform. In order to keep up, Erlang must sink a significant amount of effort into these same scenarios, and I'm willing to bet that Erlang didn't get feature "X" unless Ericsson ran into the need for it themselves. By the way, the same argument applies to Ruby, at least until you start talking about JRuby or IronRuby. Ditto for Python up until IronPython or Jython (both of which, I understand, now run faster than the native C Python interpreter). Steve continues attacking my VM-based arguments: It would be relatively simple to take an Actors-based Java application, such as that currently being built in Scala, and move it away from a threads-based model and over to a process-based model (with the JVM constuction[sic]/teardown being handled entirely by underlying infrastructure) with little to no impact on the programming model. Would it really be “relatively simple?” Even if what you describe really were relatively simple, which I strongly doubt, there’s still no guarantee that the result would help applications get anywhere near the levels of reliability they can achieve using Erlang. Actually, yes, it would, because Scala's already done it. Actors is a library, not part of the language, and as such, is extensible in ways we haven't anticipated yet. As for creating and tearing down JVMs automatically, again, JR has done that already. Combining the two is probably not much more than talking to Prof. Olsson and Prof. Odersky for a while, then writing the code for the host; if I get some time, I'll take a stab at it, if nobody's done it before now. More importantly, the lift web framework is seeing some pretty impressive scalability and performance numbers using Actors, thanks to the inherent nature of an Actors model and the intrinsic perf and scale capabilities of the JVM, though I don't know how much anybody's measured its reliability or robustness yet. (How should we measure it, come to think of it?) Best part is, the IT department doesn't have to do anything different to their existing Java-based network topology to start taking advantage of this. Can you say the same for Erlang? Continuing... As to Steve’s comment that the Erlang interpreter isn’t monitorable, I never said that–I said that Erlang was not monitorable using current IT operations monitoring tools. The JVM and CLR both have gone to great lengths to build infrastructure hooks that make it easy to keep an eye not only on what’s going on at the process level (”Is it up? Is it down?”) but also what’s going on inside the system (”How many requests have we processed in the last hour? How many of those were successful? How many database connections have been created?” and so on). Nothing says that Erlang–or any other system–can’t do that, but it requires the Erlang developer build that infrastructure him-or-herself, which usually means it’s either not going to get done, making life harder for the IT support staff, or else it gets done to a minimalist level, making life harder for the IT support staff. I know what you meant in your original posting, Ted, and my objection still stands. Are you saying here that all Java and .NET applications are by default network-monitoring-friendly, whereas Erlang applications are not? I seem to recall quite a bit of effort spent by various teams at my previous employer to make sure our distributed computing products, including the Java-based products and .NET-based products, played reasonably well with network monitoring systems, and I sure don’t recall any of it being automatic. Yes, it’s nice that the Java and CLR guys have made their infrastructure monitorable, but that doesn’t relieve developers of the need to put actual effort into tying their applications into the monitoring system in a way that provides useful information that makes sense. There is no magic here, and in my experience, even with all this support, it still doesn’t guarantee that monitoring support will be done to the degree that the IT support staff would like to see. And do you honestly believe Erlang — conceived, designed, implemented, and maintained by a large well-established telecommunications company for use in highly-reliable telecommunications systems — would offer nothing in the way of tying into network monitoring systems? I guess SNMP, for example, doesn’t count anymore? (Coincidentally, I recently had to tie some of the Erlang stuff I’m currently working on into a monitoring system which isn’t written in Erlang, and it took me maybe a quarter of a workday to integrate them. I’m absolutely certain it would have taken longer in Java.) Every JVM and CLR process are, by default, network-monitoring-friendly. Java5 introduced JMX, and the CLR has had PerfMon and WMI hooks in it since Day One. Can't make that any clearer. Dunno what kind of efforts your previous employer was going through, but perhaps those efforts were back in the Java 1.4 and earlier days, when JMX wasn't a part of the platform. Frankly, whether the application you're monitoring hooks into the monitoring infrastructure is not really part of the argument, since Erlang doesn't offer that, either. I'm more concerned with whether the infrastructure is monitoring-friendly. Considering that most IT departments are happy if you give them any monitoring capabilities, having the infrastructure monitoring-friendly is a big deal. And Steve, if it takes you more than a quarter of a workday to create an MBean-friendly interface, write an implementation of that interface and register the object under an ObjectName with the MBeanServer, then you're woefully out of practice in Java--you should be able to do that in an hour or so. More to the point, though, if Erlang ties into SNMP out of the box with no work required by the programmer, please tell me where that's doc'ed and how it works! I won't claim to be the smartest Erlang programmer on the block, so anywhere you can point to facts about Erlang that I'm missing, please point 'em out! Finally, Steve approaches the coup de grace: But here’s the part of Ted’s response that I really don’t understand: So given that an execution engine could easily adopt the model that gives Erlang its reliability, and that using Erlang means a lot more work to get the monitorability and manageability (which is a necessary side-effect requirement of accepting that failure happens), hopefully my reasons for saying that Erlang (or Ruby’s or any other native-implemented language) is a non-starter for me becomes more clear. Ted, first you state that an execution engine could (emphasis mine) “easily adopt the model that gives Erlang its reliability,” and then you say that it’s “a lot more work” for anyone to write an Erlang application that can be monitored and managed? Aren’t you getting those backwards? It should be obvious that in reality, writing a monitorable Erlang app is not hard at all, whereas building Erlang-level reliability into another VM would be a considerably complicated and time-consuming undertaking. Yes, the JVM could easily adopt the multi-process model if it chose to. (Said work is being done via the Java Isolates JSR.) The CLR already does (via AppDomains). I mean what I say when I say it. If you have facts that disagree, please cite them. You've already stated that Erlang hooks into SNMP, so please, if you want to get the same degree of monitoring and management that the JVM and CLR have, write the full set of monitoring hooks to keep track of all the same things the JVM and CLR track inside their respective VMs and expose them via SNMP. If you're looking for the full set, look either in the javax.management package JavaDoc and take every interface that ends in "MBean", or walk up to any Windows machine with the .NET framework installed and look at the set of PerfMon counters exposed there. If you really want to prove your point, let's have a bake-off: on the sound of the gun firing, you start adding the monitoring and management hooks to the Erlang interpreter, and I'll add the spin-a-process-off hooks to the CLR or the JVM, and we'll see who's done first. Then we'll release the whole thing to the world and both camps will have been made better for the experience. Or maybe you could just port Erlang to the JVM or CLR, and then we'll both be happy.
|
|
l33t and prior art
|
|
"OMG, my BFF is so l33t." "ROFLOL." There's a generation that looks at the above and rolls their eyes at this, but as it turns out, this is hardly new; in fact, according to Rick Beyer, author of The Greatest Presidential Stories Never Told, we get the phrase "OK" from exactly the same process: People all over the world know what "O.K." means. But few of them realize it was born fro a wordplay craze and a presidential election. In all started in Boston in 1838. People there started using humorous initials, sometimes combined with purposeful misspellings, just for fun. Gosh, this sounds familiar. Newspapers picked up the fad, and writers had a high old time throwing around all sorts of acronyms. For example: g.t.d.h.d = "give the devil his due" n.g. = "no go" s. p. = "small potatoes" O. W. = "Oll Wright (all right)" G. T. = "Gone to Texas" And there was another expression that started gaining some currency: "Oll Korrect", or O.K. So that's what it's supposed to mean. The fad spread quickly to New York, but the phrase "O.K." didn't come into national use until the presidential campaign of 1840. Democrats trying to reelect Martin Van Buren were casting around for political slogans. Van Buren was from Kinderhook, New York, and was sometimes called "Old Kinderhook". O.K. Political operatives seized on the coincidence. Democrats started forming O.K. clubs and staging O.K. balls. The campaign catapulted the expression into national circulation. Van Buren lost his bid for reelection. But "O.K." won in a landslide, and is used billions of times a day in all corners of the globe. l33t.  (BTW, there's 99 more of those, and they're all equally fascinating.)
Languages
Saturday, May 17, 2008 10:22:53 PM (Pacific Daylight Time, UTC-07:00)
|
|
 Friday, May 16, 2008
|
Blogs I'm currently reading
|
|
Recently, a former student asked me, I was in a .NET web services training class that you gave probably 4 or so years ago on-site at a [company name] office in [city], north of Atlanta. At that time I asked you for a list of the technical blogs that you read, and I am curious which blogs you are reading now. I am now with a small company where I have to be a jack of all trades, in the last year I have worked in C++ and Perl backend type projects and web frontend projects with Java, C#, and RoR, so I find your perspective interesting since you also work with various technologies and aren't a zealot for a specific one. Any way, please either respond by email or in your blog, because I think that others may be interested in the list also. As one might expect, my blog list is a bit eclectic, but I suppose that's part of the charm of somebody looking to study Java, .NET, C++, Smalltalk, Ruby, Parrot, LLVM, and other languages and environments. So, without further ado, I've pasted in the contents of my OPML file for cut&paste and easy import. Having said that, though, I would strongly suggest not just blindly importing the whole set of feeds into your nearest RSS reader, but take a moment and go visit each one before you add it. It takes longer, granted, but the time spent is a worthy investment--you don't want to have to declare "blog bankruptcy". Editor's note: We pause here as readers look at each other and go... "WTF?!?" "Blog bankruptcy" is a condition similar to "email bankruptcy", when otherwise perfectly high-functioning people give up on trying to catch up to the flood of messages in their email client's Inbox and delete the whole mess (usually with some kind of public apology explaining why and asking those who've emailed them in the past to resend something if it was really important), effectively trying to "start over" with their email in much the same way that Chapter Seven or Chapter Eleven allows companies to "start over" with their creditors, or declaring bankruptcy allows private citizens to do the same with theirs. "Blog bankruptcy" is a similar kind of condition: your RSS reader becomes so full of stuff that you can't keep up, and you can't even remember which blogs were the interesting ones, so you nuke the whole thing and get away from the blog-reading thing for a while. This happened to me, in fact: a few years ago, when I became the editor-in-chief of TheServerSide.NET, I asked a few folks for their OPML lists, so that I could quickly and easily build a list of blogs that would "tune me in" to the software industry around me, and many of them quite agreeably complied. I took my RSS reader (Newsgator, at the time) and dutifully imported all of them, and ended up with a collection of blogs that was easily into the hundreds of feeds long. And, over time, I found myself reading fewer and fewer blogs, mostly because the whole set was so... intimidating. I mean, I would pick at the list of blogs and their entries in the same way that I picked at vegetables on my plate as a child--half-heartedly, with no real enthusiasm, as if this was something my parents were forcing me to do. That just ruined the experience of blog-reading for me, and eventually (after I left TSS.NET for other pastures), I nuked the whole thing--even going so far as to uninstall my copy of Newsgator--and gave up. Naturally, I missed it, and slowly over time began to rebuild the list, this time, taking each feed one at a time, carefully weighing what value the feed was to me and selecting only those that I thought had a high signal-to-noise ratio. (This is partly why I don't include much "personal" info in this blog--I found myself routinely stripping away those blogs that had more personal content and less technical content, and I figured if I didn't want to read it, others probably felt the same way.) Over the last year or two, I've rebuilt the list to the point where I probably need to prune a bit and close a few of them back down, but for now, I'm happy with the list I've got. And speaking of which.... 1: <?xml version="1.0"?> 2: <opml version="1.0"> 3: <head> 4: <title>OPML exported from Outlook</title> 5: <dateCreated>Thu, 15 May 2008 20:55:19 -0700</dateCreated> 6: <dateModified>Thu, 15 May 2008 20:55:19 -0700</dateModified> 7: </head> 8: <body> 9: <outline text="If broken it is, fix it you should" type="rss" 10: xmlUrl="http://blogs.msdn.com/tess/rss.xml"/> 11: <outline text="Artima Developer Buzz" type="rss" 12: xmlUrl="http://www.artima.com/news/feeds/news.rss"/> 13: <outline text="Artima Weblogs" type="rss" 14: xmlUrl="http://www.artima.com/weblogs/feeds/weblogs.rss"/> 15: <outline text="Artima Chapters Library" type="rss" 16: xmlUrl="http://www.artima.com/chapters/feeds/chapters.rss"/> 17: <outline text="Neal Gafter's blog" type="rss" 18: xmlUrl="http://gafter.blogspot.com/feeds/posts/default"/> 19: <outline text="Room 101" type="rss" 20: xmlUrl="http://gbracha.blogspot.com/feeds/posts/default"/> 21: <outline text="Kelly O'Hair's Blog" type="rss" 22: xmlUrl="http://weblogs.java.net/blog/kellyohair/index.rdf"/> 23: <outline text="John Rose @ Sun" type="rss" 24: xmlUrl="http://blogs.sun.com/jrose/feed/entries/atom"/> 25: <outline text="The Daily WTF" type="rss" 26: xmlUrl="http://syndication.thedailywtf.com/TheDailyWtf"/> 27: <outline text="Brad Wilson" type="rss" 28: xmlUrl="http://feeds.feedburner.com/BradWilson"/> 29: <outline text="Mike Stall's .NET Debugging Blog" type="rss" 30: xmlUrl="http://blogs.msdn.com/jmstall/rss.xml"/> 31: <outline text="Stevey's Blog Rants" type="rss" 32: xmlUrl="http://steve-yegge.blogspot.com/atom.xml"/> 33: <outline text="Brendan's Roadmap Updates" type="rss" 34: xmlUrl="http://weblogs.mozillazine.org/roadmap/index.rdf"/> 35: <outline text="pl patterns" type="rss" 36: xmlUrl="http://plpatterns.blogspot.com/feeds/posts/default"/> 37: <outline text="Joel Pobar's weblog" type="rss" 38: xmlUrl="http://feeds.feedburner.com/callvirt"/> 39: <outline text="Let&#39;s Kill Dave!" type="rss" 40: xmlUrl="http://letskilldave.com/rss.aspx"/> 41: <outline text="Why does everything suck?" type="rss" 42: xmlUrl="http://whydoeseverythingsuck.com/feeds/posts/default"/> 43: <outline text="cdiggins.com" type="rss" xmlUrl="http://cdiggins.com/feed"/> 44: <outline text="LukeH's WebLog" type="rss" 45: xmlUrl="http://blogs.msdn.com/lukeh/rss.xml"/> 46: <outline text="Jomo Fisher -- Sharp Things" type="rss" 47: xmlUrl="http://blogs.msdn.com/jomo_fisher/rss.xml"/> 48: <outline text="Chance Coble" type="rss" 49: xmlUrl="http://leibnizdream.wordpress.com/feed/"/> 50: <outline text="Don Syme's WebLog on F# and Other Research Projects" type="rss" 51: xmlUrl="http://blogs.msdn.com/dsyme/rss.xml"/> 52: <outline text="David Broman's CLR Profiling API Blog" type="rss" 53: xmlUrl="http://blogs.msdn.com/davbr/rss.xml"/> 54: <outline text="JScript Blog" type="rss" 55: xmlUrl="http://blogs.msdn.com/jscript/rss.xml"/> 56: <outline text="Yet Another Language Geek" type="rss" 57: xmlUrl="http://blogs.msdn.com/wesdyer/rss.xml"/> 58: <outline text=".NET Languages Weblog" type="rss" 59: xmlUrl="http://www.dotnetlanguages.net/DNL/Rss.aspx"/> 60: <outline text="DevHawk" type="rss" 61: xmlUrl="http://feeds.feedburner.com/Devhawk"/> 62: <outline text="The Cobra Programming Language" type="rss" 63: xmlUrl="http://cobralang.blogspot.com/feeds/posts/default"/> 64: <outline text="Code Miscellany" type="rss" 65: xmlUrl="http://codemiscellany.blogspot.com/feeds/posts/default"/> 66: <outline text="Fred, Let it go!" type="rss" 67: xmlUrl="http://freddy33.blogspot.com/feeds/posts/default"/> 68: <outline text="Codedependent" type="rss" 69: xmlUrl="http://graphics-geek.blogspot.com/feeds/posts/default"/> 70: <outline text="Presentation Zen" type="rss" 71: xmlUrl="http://www.presentationzen.com/presentationzen/index.rdf"/> 72: <outline text="The Extreme Presentation(tm) Method" type="rss" 73: xmlUrl="http://extremepresentation.typepad.com/blog/index.rdf"/> 74: <outline text="ZapThink" type="rss" 75: xmlUrl="http://feeds.feedburner.com/zapthink"/> 76: <outline text="Chris Smith's completely unique view" type="rss" 77: xmlUrl="http://feeds.feedburner.com/ChrisSmithsCompletelyUniqueView"/> 78: <outline text="Code Commit" type="rss" 79: xmlUrl="http://feeds.codecommit.com/codecommit"/> 80: <outline 81: text="Comments on Ola Bini: Programming Language Synchronicity: A New Hope: Polyglotism" 82: type="rss" 83: xmlUrl="http://ola-bini.blogspot.com/feeds/5778383724683099288/comments/default"/> 84: </body> 85: </opml>
Happy reading.....
|
 Saturday, May 10, 2008
|
I'm Pro-Choice... Pro Programmer Choice, that is
|
|
Not too long ago, Don wrote: The three most “personal” choices a developer makes are language, tool, and OS. No. That may be true for somebody who works for a large commercial or open source vendor, whose team is building something that fits into one of those three categories and wants to see that language/tool/OS succeed. That is not where most of us live. If you do, certainly, you are welcome to your opinion, but please accept with good grace that your agenda is not the same as my own. Most of us in the practitioner space are using languages, tools and OSes to solve customer problems, and making the decision to use a particular language, tool or OS a personal one generally gets us into trouble--how many developers do you know that identify themselves so closely with that decision that they include it in their personal metadata? "Hi, I'm Joe, and I'm a Java programmer." Or, "Oh, good God, you're running Windows? What are you, some kind of Micro$oft lover or something?" Or, "Linux? You really are a geek, aren't you? Recompiled your kernel lately (snicker, snicker)?" Sorry, but all of those make me want to hurl. Of these kinds of statements are technical zealotry and flame wars built. When programmers embed their choice so deeply into their psyche that it becomes the tagline by which they identify themselves, it becomes an "ego" thing instead of a "tool" thing. What's more, it involves customers and people outside the field in an argument that has nothing to do with them. Think about it for a second; the last time you hired a contractor to add a deck to your house, what's your reaction when they introduce themselves as, "Hi, I'm Kim, and I'm a Craftsman contractor." Or, overheard at the job site, "Oh, good God, you're using a Skil? What are you, some kind of nut or something?" Or, as you look at the tools on their belt, "Nokita? You really are a geek, aren't you? Rebuilt your tools from scratch lately (snicker, snicker)?" Do you, the customer, really care what kind of tools they use? Or do you care more for the quality of solution they build for you? It's hard to imagine how the discussion can even come up, it's so ludicrous. Try this one on, instead: "Hi, I'm Ted, and I'm a programmer." I use a variety of languages, tools, and OSes, and my choice of which to use are all geared around a single end goal: not to promote my own social or political agenda, but to make my customer happy. Sometimes that means using C# on Windows. Sometimes that means using Java on Linux. Sometimes that means Ruby on Mac OS X. Sometimes that means creating a DSL. Sometimes that means using EJB, or Spring, or F#, or Scala, or FXCop, or FindBugs, or log4j, or ... ad infinitum. Don't get me wrong, I have my opinions, just as contractors (and truck drivers, it turns out) do. And, like most professionals in their field, I'm happy to share those opinions with others in my field, and also with my customers when they ask: I think C# provides a good answer in certain contexts, and that Java provides an equally good answer, but in different contexts. I will be happy to explain my recommendation on which languages, tools and OSes to use, because unlike the contractor, the languages, tools, and OSes I use will be visible to the customer when the software goes into Production, at a variety of levels, and thus, the customer should be involved in that decision. (Sometimes the situation is really one where the customer won't see it, in which case the developer can have full confidence in whatever language/tool/OS they choose... but that's far more often the exception than the rule, and will generally only be true in cases where the developer is providing a complete customer "hands-off" hosting solution.) I choose to be pro-choice.
|
 Thursday, May 08, 2008
|
Thinking in Language
|
|
A couple of folks have taken me to task over some of the things I said... or didn't say... in my last blog piece. So, in no particular order, let's discuss. A few commented on how I left out commentary on language X, Y or Z. That wasn't an accidental slip or surge of forgetfulness, but I didn't want to rattle off a laundry list of every language I've run across or am exploring, since that list would be much, much longer and arguably of little to no additional benefit. Having said that, though, a more comprehensive list (and more comprehensive explanation and thought process) is probably deserved, so expect to see that from me before long, maybe in the next week or two. Steve Vinoski wrote: In a recent post, Ted Neward gives a brief description of a variety of programming languages. It’s a useful post; I’ve known Ted for awhile now, and he’s quite knowledgeable about such things. Still, I have to comment on what he says about Erlang.... I might have said it like this: Erlang. Joe Armstrong’s baby was built to solve a specific set of problems at Ericsson, and from it we can learn a phenomenal amount about building highly reliable systems that can also support massive concurrency. The fact that it runs on its own interpreter, good; otherwise, the reliability wouldn’t be there and it would be just another curious but useless concurrency-oriented language experiment. Far too many blog posts and articles that touch on Erlang completely miss the point that reliability is an extremely important aspect of the language. To achieve reliability, you have to accept the fact that failure will occur, Once you accept that, then other things fall into place: you need to be able to restart things quickly, and to do that, processes need to be cheap. If something fails, you don’t want it taking everything else with it, so you need to at least minimize, if not eliminate, sharing, which leads you to message passing. You also need monitoring capabilities that can detect failed processes and restart them (BTW in the same posting Ted seems to claim that Erlang has no monitoring capabilities, which baffles me). Massive concurrency capabilities become far easier with an architecture that provides lightweight processes that share nothing, but that doesn’t mean that once you design it, the rest is just a simple matter of programming. Rather, actually implementing all this in a way that delivers what’s needed and performs more than adequately for production-quality systems is an incredibly enormous challenge, one that the Erlang development team has quite admirably met, and that’s an understatement if there ever was one. They come for the concurrency but they stay for the reliability. Do any other “Erlang-like” languages have real, live, production systems in the field that have been running non-stop for years? (That’s not a rhetorical question; if you know of any such languages, please let me know.) Next time you see yet another posting about Erlang and concurrency, especially those of the form “Erlang-like concurrency in language X!” just ask the author: where’s the reliability? As he says, Steve and I have known each other for a while now, so I'm fairly comfortable in saying, Mr. Vinoski, you conflate two ideas together in your assessment of Erlang, and teasing those two things apart reveals a great deal about Erlang, reliability, and the greater world at large. Erlang's reliability model--that is, the spawn-a-thousand-processes model--is not unique to Erlang. In fact, it's been the model for Unix programs and servers, most notably the Apache web server, for decades. When building a robust system under Unix, a master-slave model, in which a master process spawns (and monitors) n number of child processes to do the actual work, offers that same kind of reliability and robustness. If one of these processes fail (due to corrupted memory access, operating system fault, or what-have-you), the process can simply die and be replaced by a new child process. Under the Windows model, which stresses threads rather than processes, corrupted memory access tearing down the process brings down the entire system; this is partly why .NET chose to create the AppDomain model, which looks and feels remarkably like the lightweight process model. (It still can't stop a random rogue pointer access from tearing down the entire process, but if we assume that future servers will be written all in managed code, it offers the same kind of reliability that the process model does so long as your kernel drivers don't crash.) There is no reason a VM (JVM, CLR, Parrot, etc) could not do this. In fact, here's the kicker: it would be easier for a VM environment to do this, because VM's, by their nature, seek to abstract away the details of the underlying platform that muddy up the picture. It would be relatively simple to take an Actors-based Java application, such as that currently being built in Scala, and move it away from a threads-based model and over to a process-based model (with the JVM constuction/teardown being handled entirely by underlying infrastructure) with little to no impact on the programming model. As to Steve's comment that the Erlang interpreter isn't monitorable, I never said that--I said that Erlang was not monitorable using current IT operations monitoring tools. The JVM and CLR both have gone to great lengths to build infrastructure hooks that make it easy to keep an eye not only on what's going on at the process level ("Is it up? Is it down?") but also what's going on inside the system ("How many requests have we processed in the last hour? How many of those were successful? How many database connections have been created?" and so on). Nothing says that Erlang--or any other system--can't do that, but it requires the Erlang developer build that infrastructure him-or-herself, which usually means it's either not going to get done, making life harder for the IT support staff, or else it gets done to a minimalist level, making life harder for the IT support staff. So given that an execution engine could easily adopt the model that gives Erlang its reliability, and that using Erlang means a lot more work to get the monitorability and manageability (which is a necessary side-effect requirement of accepting that failure happens), hopefully my reasons for saying that Erlang (or Ruby's or any other native-implemented language) is a non-starter for me becomes more clear. Meanwhile, Patrick Logan offers up some sharp words about my preference for VMs: What is this obsession with some virtual machine being the one, true byte code? The Java Virtual Machine, the CLR, Parrot, whatever. Give it up. I agree with Steve Vinoski... The fact that it runs on its own interpreter, good; otherwise, the reliability wouldn’t be there. We need to get over our thinking about "One VM to bring them all and in the darkness bind them". Instead we should be focused on improving interprocess communication among various languages. This can be done with HTTP and XMPP. And we should expecially be focused on reliability, deployment, starting and stopping locally or remotely, etc. XMPP's "presence" provides Erlang-process-like linking of a sort as well. With Erlang's JInterface for Java then a Java process can look like an Erlang process (distributed or remote). Two or more Java processes can use JInterface to communicate and "link" reliably and Erlang virtual machines and libraries, save this one single .jar, do not have to be anywhere in sight. To obsess about a single VM is to remain stuck at about 1980 and UCSD Pascal's p-code. It just should not matter today, and certainly not tomorrow. The forest is now much more important than any given tree. Pay attention to the new JVM from IBM in support of their lightweight, fast-start, single-purpose process philosophy embodied in Project Zero. It's not intended to be a big honkin' run everything forever virtual machine. It will support JVM languages and the more the merrier in the sense that such a JVM will enable lightweight pieces to be stiched together dynamically. However the intention is to perform some interprocess communication and then get out of the way. Exactly the right approach for any virtual machine. Jini clearly is *the* most important thing about Java, ever. But it's lost. Gone. Buh-bye. Pity. "We need to get over our thinking about "One VM to bring them all and in the darkness bind them". " Huh? How did we go from "I like virtual machine/execution environments because of the support they give my code for free" to "One VM to bring them all and in the darkness bind them"? I truly fail to see the logical connection there. My love for both the JVM and the CLR has hopefully made itself clear, but maybe Patrick's only subscribed to the Java/J2EE category bits of my RSS feed. Fact is, I'm coming to like any virtual machine/execution environment that offers a layer of abstraction over the details of the underlying platform itself, because developers do not want to deal with those details. They want to be able to get at them when it becomes necessary, granted, but the actual details should remain hidden (as best they can, anyway) until that time. "Instead we should be focused on improving interprocess communication among various languages. This can be done with HTTP and XMPP." I'm sorry, but I'm getting very very tired of this "HTTP is the best way to communicate" meme that surrounds the Internet. Yes, HTTP was successful. Nobody is arguing with this. So is FTP. So is SMTP and POP3. So, for that matter, is XMPP. Each serves a useful purpose, solving a particular problem. Let's not try to force everything down a single pipe, shall we? I would hate to be so focused on the tree of HTTP that we lose sight of the forest of communication protocols. "And we should expecially [sic] be focused on reliability, deployment, starting and stopping locally or remotely, etc. XMPP's "presence" provides Erlang-process-like linking of a sort as well." Yes! XMPP's "presence" aspect is a powerful one, and heavily underutilized. "Presence", however, is really just a specific form of "discovery", and quite frankly our enterprise systems need to explore more "discovery"-based approaches, particularly for resource acquisition and monitoring. I've talked about this for years. "To obsess about a single VM is to remain stuck at about 1980 and UCSD Pascal's p-code." Great one-liner... with no supporting logic, granted, but I'm sure it drew a cheer from the faithful. "It just should not matter today, and certainly not tomorrow." For what reason? Based on what concepts? Look, as much as we want to try and abstract ourselves away from everything, at some point rubber must meet road, and the semantic details of the platform you're using--virtual or otherwise--make a huge difference about how you build systems. For example, Erlang's many-child-processes model works well on Unix, but not as well on Windows, owing to the heavier startup costs of creating a process under Windows. For applications that will involve spinning up thousands of processes, Windows is probably not a good platform to use. Disclaimer: This "it's heavier to spin up processes on Windows than Unix" belief is one I've not verified personally; I'm trusting what I've heard from other sources I know and trust. Under later Windows releases, this may have changed, but my understanding is that it is still much much faster to spin up a thread on Windows than a separate process, and that it is only marginally faster to spin up a thread on Unix than a process, because many Unixes use the process model to "fake" threads, the so-called LightWeightProcess model. "The forest is now much more important than any given tree." Yes! And that means you have to keep an eye on the forest as a whole, which underscores the need for monitoring and managing capabilities in your programs. Do you want to build this by hand? "Pay attention to the new JVM from IBM in support of their lightweight, fast-start, single-purpose process philosophy embodied in Project Zero. It's not intended to be a big honkin' run everything forever virtual machine. It will support JVM languages and the more the merrier in the sense that such a JVM will enable lightweight pieces to be stiched together dynamically. However the intention is to perform some interprocess communication and then get out of the way. Exactly the right approach for any virtual machine." Yes! You make my point for me--the point of the virtual machine/execution environment is to reduce the noise a developer must face, and if IBM's new VM gains us additional reliability by silently moving work and data between processes, great! But the only way you take advantage of this is by writing to the JVM. (Or CLR, or Parrot, or whatever.) If you don't, and instead choose to write to something that doesn't abstract away from the OS, you have to write all of this supporting infrastructure code yourself. That sounds like fun, right? Not to mention highly business-ROI-focused? "Jini clearly is *the* most important thing about Java, ever. But it's lost. Gone. Buh-bye. Pity." Jini was cool. I liked Jini. Jini got nowhere because Sun all but abandoned it in its zeal to push the client-server EJB model of life. sigh I wish they had sought to incorporate more of the discovery elements of Jini into the J2EE stack (see the previous paragraph). But they didn't, and as a result, Jini is all but dead. Disclaimer: I know, I know, Jini isn't really dead. The bits are still there, you can still download them and run them, and there is a rabidly zealous community of supporters out there, but as a tool in widespread use and a good bet for an IT department, it's a non-starter. Oh, and if you're one of those rabidly zealous supporters, don't bother emailing me to tell me how wrong I am, I won't respond. Don't forget that FoxPro and OS/2 still have a rabidly zealous community of supporters out there, too. Frankly, a comment on Patrick's blog entry really captures my point precisely, so (hopefully with permission) I will repeat it here: The only argument you made that I can find against sharing VMs is that people should be focusing on other things. But the main reason for sharing VMs is to allow people to focus on other things, instead of focusing on creating yet another VM. You write as if you think creating an entirely new VM from scratch would be easier than targeting a common VM. Is that really what you think? Couldn't have said it better... though that never stops me from trying. 
|
 Thursday, May 01, 2008
|
Yet Another Muddled Message
|
|
This just recently crossed my Inbox, this time from Redmond Developer News, and once again I'm simply amazed at the audacity of the message and rather far-fetched conclusion: FEEDBACK: THE MOVE FROM J2EE On Tuesday, I wrote about BMC's new Application Problem Resolution System 7.0 tooling, which provides "black box" monitoring and analysis of application behavior to help improve troubleshooting. http://reddevnews.com/blogs/weblog.aspx?blog=2146 In talking to BMC Director Ran Gishri, I ran across some interesting perspectives that he was able to offer on the enterprise development space. Among them, the fact that large orgs seem to be moving away from J2EE and toward a mix of .NET and sundry lightweight frameworks. Richard Eaton, an RDN reader who's a manager of database systems for Georgia System Operations Corp., confirms Gishri's insights. He wrote: "In 2003, we made a decision to build our Web application using Java and a third-party RAD tool for Java development that was locally supported at that time. Since then, the company that developed and supported that RAD tool has gone out of business and left us with virtually no support for the product. The application development that was done was very integrated into the tool, which meant we would virtually have to rewrite the entire app. So we analyzed our experience with using Apache, Linux, Java and Eclipse for our platform and realized the effort was very management-intensive for our small team, and so we looked to .NET. "Considering the advances in the .NET framework and CLR libraries and the integration it offered to our other third-party tools, as well as our prolific Excel spreadsheet environment, the decision was easy to go to .NET. We are also moving away from Sybase databases to SQL Server and looking into the use of SharePoint for various internal collaboration and project functions. The one-stop shop of Microsoft technology and support and ease of development and integration, I think, is the overwhelming weight in deciding between J2EE and .NET." First of all, I'm a little shocked that based on a conversation with one individual, we can safely infer that "large orgs" are "moving away from J2EE and toward a mix of .NET and sundry lightweight frameworks". This is fair and unbiased reporting? That's like going to Houston (home of our current sitting President), or Arizona (home of the Republican candidate), and discovering that a majority of the voters there will vote Republican in the next Presidential election. Amazing! Investigative journalism at its finest. Of course, no report like this could be taken seriously without some kind of personal anecdotal evidence as backup, so next we have a heart-rendering tear-jerker of a story in which a poor company was taken for a ride by those big bad J2EE vendors.... "We made a decision to build our Web applciation using java and a third-party RAD tool for Java... Since then, the company that [built] that RAD tool has gone out of business and left us [screwed]." Uh... wait a minute. Is this a story about moving away from J2EE, or about moving away from third-party proprietary tools that build code that's "very integrated into the tool"? Look, this story doesn't read any better... or any more inaccurately... if we simply reverse the locations of "J2EE" and ".NET" in it. The problem here is that the company in question made a questionable decision: to base their application development on a third-party tool that couldn't be easily supported or replaced in the event the vendor went south. So when the vendor did tank, they found themselves in a thorny situation. That's not J2EE's fault. That's the company's fault. This vetting of the third-party tool (or framework, or library, or ...) is a necessary precaution regardless of whether you're talking about the J2EE, .NET or native platforms, and whether that vendor is a commercial vendor or an open-source vendor. Some will take umbrage at the idea of treating an open-source project as a vendor, but ask yourself the hard question that few open-source advocates really talk much about: in the event the project committers abandon the project, are you really prepared to take up support for it yourself? At Java shows, I frequently ask how many people in the audience have used Tomcat, and almost 100% of the room raises their hand. I ask how many have actually looked at the Tomcat source code, and that number goes down dramatically. Swap in any project you care to name: Hibernate, Ant, Spring, you name it, lots of Java devs have used them, but few are prepared to support them. Open source projects have to be seen in the same light as vendors: some will disappear, some won't, and it's hard to tell which ones are which at the time you're looking to adopt them, so you're best off assuming the worst and figuring out your strategy. It's called risk management assessment, and I wish more software development projects did it. Does .NET offer integration to "other third-party tools"? Sure, depending on the tools, which can even include Java/J2EE, if you manage it right. (I should know.) Am I trying to advocate using J2EE over .NET or vice versa? Hell no--every company has to make the decision for itself, and every company's context is different. Some will even find that neither stack works well for them, and choose to go with something else, a la C++ or Ruby or Perl or... or whatever. Just make sure you know what you are banking on, and how central (or not) those pieces are to your strategy.
|
 Wednesday, April 30, 2008
|
Why, Apple, Why?
|
|
So I see, via the blogosphere, that a Java 6 update is available for the Mac, so I run off to the Apple website to download the package. Click on the link, and I'm happy. Wait.... It's for 64-bit Intel Macs only?!? Apple, why do you tease me this way? Why is it that you can build it for 64-bit machines, but not 32-bit? This just seems entirely spurious and artificial. Somebody please tell me that it's otherwise, and why, because until then, I'm going to just assume that Apple doesn't give a whit about Java.
Java/J2EE | Mac OS
Wednesday, April 30, 2008 12:32:09 AM (Pacific Daylight Time, UTC-07:00)
|
|
 Tuesday, April 29, 2008
|
Groovy or JRuby?
|
|
Recently, it has become the fad to weigh in on the Groovy vs JRuby debate, usually along the lines of "Which is X?", where X is one of "better", "faster", "more powerful", "more acceptable", "easier", and so on. (Everybody seems to have their own adjective/adverb to slide in there, so I won't even begin to try to list them all.) Rick Hightower, in a blog post from January, weighs in on this and comes down harshly on both Scala and JRuby. Frankly, I found the whole post to ooze bitterness and maybe a touch of jealousy. Some of the highlights: - "In short: Scala seems like the next over-hyped language." Rick, they're all over-hyped, including your own nominee for the Presidential race, Groovy. I mean, if we're going to weigh this on the grounds of syntax or familiarity, let's throw {ECMA/Java}script into the ring, since it's:
- ... been around a lot longer than Groovy and therefore a lot more familiar and comfortable to the programmers that might use either or both,
- ... always going to be around, thanks to its inclusion in HTML browsers, and therefore a good investment in your knowledge portfolio regardless of where you end up using it, client- or server- or wherever-side,
- ... has many, if not all, of the same features that Groovy (or JRuby) supports,
- ... runs on top of the JVM (several ways, including Rhino, which ships with JDK 6 now, and FESI),
- ... and has Steve Yegge's vote of confidence, so you know is has to be good, right?
- "Sun please drop JRuby support. It is a waste of time. Take that money and spend it on Groovy which has a compatible syntax to Java. ... Does Ruby and Rails have good ideas? Yes. Borrow them and move on." This seems like a questionable decision to me--why cherry-pick features from one language and port them over to a different language, for no other reason than to say you did? Why not just use said original language in the first place, assuming it can run on your particular platform? Down this path lies the madness that C# and VB have become, as the C# and VB teams seek to create "feature parity" between the two languages, just so that you as a developer can either have your curly-braces-and-semicolons or not. Stupid. Talk about a waste of time and energy. Ruby's syntax is (mostly) vetted, the test cases written, and the featureset understood. Do something different if you're going to create a new language, don't just take the existing features of a language and put new tokens around it. In the South, that's called putting lipstick on the pig: it may be prettier than it was, but underneath, it's still just a pig. (Note: sometimes the new language is designed specifically to be a subset of the feature set of the source language, and I'm completely supportive of that--sometimes it's necessary to scale back just as much as it is to innovate.)
- "After reading the Scala docs, my thought is: while the language features sound great, the syntax makes me want to hurl. Why do things differently just for the sake of it?" Strangely enough, they didn't. (This is frequently the complaint of those who don't understand something. "The designers couldn't have had a good reason for doing it that way, so it must have been just because they wanted to do it differently".) Scala's syntax is actually quite consistent in many ways, particularly if you came from the functional language world, and the underlying rationale is pretty easy to grok... if you bother spending enough time to find out. Scala drops static, for example, because it turns out that Java developers spend a fair amount of time trying to resolve the "should this be a static method or should it be an instance method on a Singleton object" way too often, for example. See Gilad Bracha's arguments against static if you want to find out more of the rationale here. The "def" syntax for method definition is strikingly similar to Groovy, for the same reason: it makes it clear where a definition is taking place. The name-colon-type syntax is deliberate because then it's much easier to leave off the type signatures and let the compiler do the type inferencing for you (a feature, notably, that Rick says he likes). For what it's worth, Rick, here's a lesson I continue to learn the hard way: Spend some time learning the why of something before you take aim and let fly.
- "Final words: I declare the "Ruby will rule the world" fallacy officially over. Remember: Quit pimple pimping Ruby on JRoller! Scala devotees (both of you). Don't even start!" Well, frankly, the "Ruby will rule the world" meme was that over-hyped thing you mentioned earlier, and before anybody starts the next one, let me nip it in the bud: nothing will take over the world. Nothing has taken over the world: not C++, not C, not Java, not C#, not Visual Basic, nothing. The best a language can hope for is to cross what Simon Peyton-Jones calls the "Threshold of Immortality", and lots of languages have done that, too many to list all of them here, though you could probably do so yourself. Some of those include Java, C++, C#, C, Pascal, FORTRAN, COBOL, Perl, Python, Ruby, SQL, maybe even Smalltalk and Lisp and Scheme and the others we normally don't think about.
And we haven't even bothered to go into some kind of feature shootout or performance shootout between any of these guys. Don't get me wrong--like me, Rick is entirely entitled to his own opinions and he doesn't owe me (or anybody else) a lick of rationale to defend them. But when he comes out and suggests that Sun should drop JRuby entirely in favor of Groovy instead, I feel compelled to point out that there's some logic missing from the reasoning behind that suggestion. Cynics of this blog will suggest that I'm speaking out of both sides of my mouth: that I get to say Perl sucks, and get away with it because it's just one man's opinion but Rick can't say JRuby sucks in turn. Fact is, I'm not suggesting that Larry Wall and chromatic and the others should drop Perl and go work on something more meaningful--quite the opposite, in fact: so long as there are people who continue to use Perl, they have a responsibility to continue to develop and update that language. And Parrot is quite the interesting VM to explore in its own right. But don't expect me any time soon to be writing a bunch of Perl code except under strongly-worded protest to the United Nations. At the end of the day, the way I think about these languages loosely falls like this: - C++. For me, programing started here, so I will always have a special place in my heart for it. Templates were vastly more powerful than most people realized until the STL was released, and even to this day, C++ is usually blamed for the complexities of memory management even when garbage collector libraries (like the Boehm collector) were available and could have reduced that complexity significantly. The Boost libraries just blow my mind, and there's some new stuff coming in C++0X that brings C++ to a degree of parity with Java and C#. I wish I could get back to this for a project in the same way that guys fantasize about running into an old high school girlfriend on a business trip.
- C++/CLI. C++ adapted for the CLR. Interesting idea, but it's hard to see why I'd use this, given its syntactical and semantical similarities to C#. Frankly, C++/CLI seems destined to be forever the "glue" language to write managed wrappers on top of unmanaged C/C++ libraries, and that's hardly a compelling reason to pick this guy up for anything beyond that niche area.
- Java. The language I want to feature-freeze, though I do see a value in adding closures, if only to permit closures to enter the design and implementation of the Java libraries, thus making them widely available across all JVM languages. However, if I really got my way, we'd drop the closures-in-Java debate entirely and throw our weight behind John Rose's proposal for method handles in the JVM, since that would enable the same kind of facilities for libraries and without having to rev the Java language significantly. (Lesson to the Java community from the CLR community: not all features of the virtual machine have to be exposed in one language. Not even C# or VB do this.) The JVM I want to continue to enhance and revise and improve.
- Scala. Functional-object hybrid language for the JVM. Pure goodness. Hey, I'm bullish here, I admit it. Scala's type inference makes for lower ceremony, the static type system provides a degree of confidence in code that dynamic languages don't have without programmer-authored unit tests, and the functional nature offers a new design dimension that we haven't been able to easily express before. I won't say that I'm "thinking in Scala", but I'm thinking a lot about Scala these days, and F# too.
- Groovy. "Ruby meets Java in a bar and has a love child." Groovy's syntax is easy and based on Java, and that's both a good and a bad thing. Good if you're a Java programmer who doesn't want to have to reach very far to get some dynamic goodness; bad if you're trying to avoid some of the stranger or syntactically inconsistent aspects of the Java language, or looking to do some entirely new ways of doing things. Personally, I don't find Groovy all that intellectually stimulating, which is both a blessing and a curse.
- JRuby, IronRuby. Ruby on the JVM. 'nuff said. Ditto for IronRuby on to CLR. All the linguistic power (and flaws) of Ruby, on top of the JVM/CLR, which now means it's a far easier sell to the IT boys who run the datacenter.
- C#. The language is great, so long as it retains its original vision and scope. Memo to the C# team: Please let's not try to make C# into a scripting language. Scripting languages have a purpose, and that purpose is generally different from what general-purpose languages do. C# really doesn't need a REPL--don't fall into the trap of trying to make it into Lisp.
- Visual Basic. The language is great (!), so long as it retains its original vision and scope. Yes, I think the language is a good one--you don't really believe how much of a PITA case-sensitivity is until you start programming without it, and suddenly you realize that it's mostly a holdover from the C days. What right-thinking programmer overloads a symbolic name by case? Programmers have died for less than that. So why does case sensitivity matter? More importantly, VB has always been the dynamic language of choice for millions of programmers, it's time for those of us from the C++ community to just own up, admit that there was a place for VB after all, apologize, and let VB go back to being a powerful dynamic language on top of the CLR. Give it a REPL loop, make it the default choice for building top-of-the-stack code, and let VB guys build UIs that call into middle-tier components built by C# and F# guys. Everybody comes out a winner.
- F#. Functional-object hybrid language for the CLR. Pure goodness. The syntax again will seem quirky and strange to people unused to it, but it makes a lot of sense, and compositional construction using higher-order functions is a vastly underestimated and underused design technique. When functions are values, lots of things become possible, as people working in dynamic languages already know.
- Ruby. "Smalltalk meets Perl in a bar and has a love child." I like parts of the Ruby syntax, but there's too many Perl-isms in there for my taste. The fact that Ruby runs on top of its own interpreter (which is neither monitorable nor manageable using IT-datacenter-established tools) is a significant drawback. RoR may be great for vertical silo apps that don't need to integrate with the rest of the datacenter, but that's a pretty scary place to put yourself.
- Python. Dynamic language (goodness) with some functional concepts (goodness) on its own interpreter (badness) with a radical innovation in syntax called significant whitespace to do away with brackets to denote code blocks. Significant whitespace makes it incredibly awkward to embed Python code anywhere but in .py files, meaning Python's suitability for DSLs is reduced significantly. If I could get Python without significant whitespace, I'd be a lot happier camper.
- Jython/IronPython. Python on the JVM/CLR. 'nuff said.
- Perl. Parrot good. Perl syntax and philosophy not one I care for. Use as a shell scripting tool good. Use as a general-purpose programming language not one I recommend. Perl 6's incredibly delayed departure, very bad, unless you're one of those who wants to see Perl become extinct.
- {ECMA/Java}Script. Can we please finally just accept that ES is much more than just a browser extensibility tool? For most developers, this is their first exposure to a classless prototype-based object-oriented language, and unfortunately, most developers don't ever bother exploring it beyond "How do I make my web page do that floating image thing...?" Gah.
- Rhino/FESI/JScript.NET. {ECMA/Java}Script on the JVM/JVM/CLR. 'nuff said, though I wish the JScript guys would incorporate the E4X bits. JScript on the CLR makes for an interesting case study, and maybe (hopefully) they'll use it as another sanity-check for the DLR.
- PowerShell. Scripting language that finally brings much of the power of bash and tcsh and other shells to the Windows world and unifies a ton of different things together into one space: WMI, .NET, COM, and more. Highly necessary for IT admins who've suffered with batch files for decades. Language syntax isn't too bad, and I could even consider using it in an application/system as an extension language to give to power users so I can turn them loose to create emergent behavior without having to keep coming back to me with their feature requests.
- Lisp. With all apologies to Paul Graham, Lisp's window of opportunity (the "woo" factor, as Jay Zimmerman likes to call it) is essentially gone. We will always be looking back at it for ideas, I think, but it's very hard to imagine doing a project that's even remotely near an IT data center in it, for the same reason that Ruby or Erlang are hard to imagine here: running on top of an execution environment that doesn't have managability and monitorability baked in is a non-starter for me. Despite all that, however, programmers owe it to themselves to learn it, because until somebody points it out, you never realize you're color-blind. There's so many interesting ideas in here that you don't even realize what you're missing until you explore it.
- Scheme. See Lisp.
- Haskell. Love it or leave it, but you have to learn it. Functional languages are becoming big, and Haskell is a major influence on them.
- ML. Ditto to Haskell. If you want to see another functional/object hybrid language based on ML, check out OCaml. Note that OCaml is the direct predecessor to F# and the two are frequently (deliberately) syntax-compatible.
- Erlang. Joe Armstrong's baby was built to solve a specific set of problems at Ericsson, and from it we can learn a phenomenal amount about building massively parallel concurrent programs. The fact that it runs on its own interpreter, bad.
And there's still so many more to learn..... but that's the subject of another blog post, coming soon. Update: Naturally, people complained about the languages that were left off the list. No slight is intended--there's a lot more that I could have included here, and I will go into each of these in more detail (I hope), but there's only so much time in the day, and shipping (or posting, in this case) is always a feature. 
|
 Friday, April 25, 2008
|
Everything that's wrong with "ESB"
|
|
This email recently crossed my Inbox, and it just completely typifies everything I find wrong with the ESB: Title: "Architect Complex Integration with ESB's" Body: F1000 Keys and Barriers to ESB Success [Event] Event: [Name struck to protect the guilty] Date: [struck] Times: [struck] Place: Online No-Charge Conference Gartner reports these 2 facts: 1) F1000 firms increasingly see the Enterprise Service Bus as a “core component” in their multi-million-dollar Service-Oriented infrastructure investments. 2) ESB software is by far the fastest growing application integration middleware market segment in all of IT – with growth rates exceeding 100% year-over-year! At [EVENT], you will hear top ESB and business integration experts from IBM, Software AG, and Progress Software show you how F1000 customers design, build and deploy powerful and flexible ESBs—and cut risks and boost rewards. 12noon ET (9AM PT): Case Studies Roundtable (Keynote Panel) In a fast-paced ESB Case Studies Roundtable, you’ll hear 6 F1000 Case Studies. Our ESB experts will share details on how ESBs improved business value and ROI for their F1000 customers. [Name struck] IBM WW Marketing Content Lead for SOA Reuse and Connectivity [Name struck] Software AG Dir. of Product Marketing for ESB, Enterprise Integration & B2B Tools [Name struck] Progress Software Senior Product Marketing Manager 1:00pm ET: Presentation Introduction to the webMethods ESB Platform [Name struck] Dir. of Product Marketing for ESB, Enterprise Integration & B2B Tools 2:00pm ET: Presentation How ESBs can power your business [Name struck] WW Marketing Content Lead for SOA Reuse and Connectivity 3:00pm ET: Presentation Progress Sonic and the SOA Portfolio: It’s your world it’s your SOA [Name struck] Sr. Director, Progress Product Marketing [Event] is an [Media Company] production series designed to provide executives, business managers, software developers and architects with valuable "hands-on" information about improving techniques and results for design time and runtime SOA projects. So what do I find so offensive about this email? Just about everything. Consider these little tidbits: - Start with the title: "Architect Complex Integration with ESB's". The desire to create complexity is a misguided one; enterprise data centers already have too much complexity within them, and creating more just adds to the mess. Neil Ford hits this point perfectly when he talks about the difference between "essential complexity" and "accidental complexity" in his NFJS keynote. Essential complexity is that complexity required to solve the problem--the business workflow that comes as part of a business process, for example, or the rules that go into calculating shipping & handling for an order. "Accidental complexity", however, is all that complexity that occurs in software that has nothing to do with the problem at hand. When accidental complexity gets too large, the desire to "rewrite the whole thing from scratch" kicks in, something that might be OK for a single application, but clearly won't hold for systems already set in Production and working. Naturally, this means that we have to find ways to integrate all this stuff, and here comes the battalions of consultants and ideas, ranging from "data feeds" to "flat files" to "CORBA" to "Web Services" to "ESBs" to whatever-comes-next to create an n-squared problem as each system hooks up to other systems in a point-to-point fashion (regardless of what the technology was originally supposed to do). Unfortunately, as it turns out, we as an industry are much better at building accidental complexity than we are at building lean, mean, efficient systems, and holding complexity up as a goal of a system is not helping matters.
- Next, we have two "facts" reported by Gartner, with no citation of the report or the source of the facts to enable any kind of follow-up fact-checking. Don't get me wrong, I have no doubt that Gartner said these things, and in fact, I won't even dispute the facts themselves--ESBs were pretty unheard of a few years ago, so the fact that it's the "fastest growing application integration middleware" wouldn't even surprise me--anything to help solve this problem of complexity in their data center is welcome, and hey, it's a lot easier to purchase something than to face the problem. After all, it's only money. And the fact that "F1000 firms increasingly see the Enterprise Service Bus as a “core component” in their multi-million-dollar Service-Oriented infrastructure investments"? Sure, I buy that too: if you have a multi-million-dollar infrastructure investment, and everybody seems to be doing ESBs, then by Golly, you're going to do an ESB too, because "Nobody ever got fired for buying IBM", which, loosely translated, means, "If I'm a CTO, and I spend a million dollars on something that everybody else is spending a million dollars on, you can't really fire me when--not if--it doesn't work. After all, it's only money, and it's not my money, to boot, compared with losing my job if I spend that million on a serious in-depth look at our IT infrastructure and find out exactly what it would take to get some of that accidental complexity under control."
- Next, we have the fact that this event is an event "designed to provide executives, business managers, software developers and architects with valuable 'hands-on' information about improving technologies and results for design time and runtime SOA projects." How can you create an event that will be appealing and useful to that diverse of a crowd? The concerns that executives face are nowhere near the concerns that face software developers, and business managers' concerns are definitely different from those that architects face. To be useful to the executive, it must be high-level; to be useful to the software developer, it must be low-level. You cannot serve two masters well, much less four.
- Following up on said quote, let's examine the fact that this is a webcast (an online event), yet it claims to provide 'hands-on' information. Last time I checked, 'hands-on' meant I can get my hands on it myself and, presumably, drive it for myself to see how well it works. This is remarkably difficult to do in a webcast event, particularly one that's only three hours long and consists of a panel. So "hands-on" here must mean you can touch your keyboard, which, of course, is connected to the computer that's displaying the web browser in which this information is being displayed. That must be the new definition of 'hands-on' in this "Web 2.0" world. I guess.
- Oh, and did anybody else notice that all of this "information about improving techniques and results for design time and runtime SOA projects" is being presented by not one but three managers? Of Marketing? No offense to the intelligence of these three gentlemen intended, but last time I looked, Marketing guys don't spend much time implementing SOA projects of any time, design, run- or otherwise. They spend time trying to cook up good campaigns to reach developers and their managers in order to sell product. Not exactly a huge source of technical information, at least, not in my experience.
- Last but not least, does anyone find it at all suspicious that this seminar is being sponsored by three companies with a vested interest in selling you on ESB so that they can sell you on their ESB product? I dunno about y'all, but when the salesman comes around to my door, I don't take much of anything he says too literally or too naively. Why we take Marketing and Sales executives from technology companies without the same degree of cynicism has never failed to amaze me.
At the end of the day, sensationalist presentations, claims of amazing productivity, or wild-eyed testimonials without sufficient history to prove or disprove them do not only hurt the developers and companies seeking to use these tools, they hurt the credibility of the companies promoting these tools. Claiming that your product is the answer to all the IT world's ills is a recipe for disaster unless you can deliver the goods, and frankly, I have yet to see a tool--any tool--answer that promise. A message to the ESB vendors: Don't make it harder on us... or yourselves. Give us hard technical info, not sensationalist claims.
Friday, April 25, 2008 11:33:36 AM (Pacific Daylight Time, UTC-07:00)
|
|
 Wednesday, April 16, 2008
|
Do you fall prey to technical folk etymology?
|
|
From Wikipedia (itself a source of conceptual folk etymology, but that's another rant): - A commonly held misunderstanding of the origin of a particular word, a false etymology
- "The popular perversion of the form of words in order to render it apparently significant"; "the process by which a word or phrase, usually one of seemingly opaque formation, is arbitrarily reshaped so as to yield a form which is considered to be more transparent"
What do I mean by "technical folk etymology"? - If you're a Java developer, consider the term EJB. No, I mean, seriously think about it for a moment. What images are conjured before your eyes when you do so? Horrendous APIs, hideous deployment requirements, complete untestability, and something that takes forty-five minutes to start?
- If you're a Ruby developer, consider the phrase static typing. Same sort of reaction?
- If you're a .NET developer, try on COM or COM+ or even ATL. Mystifying collections of code repeated by rote, cut-and-pasted from that very first project you built by hand from Inside OLE 2, and once you got it to work, served as your basic template for every other COM/MTS/COM+ entity you ever built?
- Or, if you're any of these, how about Visual Basic?
- Or maybe Web services, specifically all those WS-* specifications?
As the MVP Summit Product Team dinners wound down here in Redmond tonight, I found myself at a table with (not surprisingly) Neal Ford and Venkat Subramaniam, two of my close friends from the NFJS tour, and who should join us but first Amanda Silver (lately of the Office team, but with her heart still firmly rooted in her Visual Basic dev days), then Don Box (lately of the Oslo team) and Chris Sells (also lately of the Oslo team), and a rousing discussion around the concept of DSLs--domain specific languages--arose, largely because Don wanted to sound out Venkat and Neal on the subject. Listening to the conversation (Don was mostly interested in Neal and Venkat's opinions, so I just relaxed and listened for the most part), I realized that the discussion was entirely rooted in this concept of context, that ephemeral structure surrounding a concept that gives it shape and color and taste and the other aesthetic qualities that lead us to "like" or "dislike" or "accept" or "reject" certain concepts. Don held the position--either for arguments' sake or because he believes it, I'm not sure which--that domain-specific languages lose context too easily once stored on the file system, in ways that data does not. His test was to suggest "What if a random piece of text drops into my email, how do I know what consumes this text?" The answer, of course, is that you don't, unless you somehow have a context by which to understand a piece of text, in many cases based solely on nothing more than filesystem extension, or MIME type, or the "#!/bin/..." line that precedes many shell scripts, and so on. Interestingly enough, as I drove home after the dinner, I realized that the conversation echoed an exchange Neal and Venkat and I were having in the car on the way over, about how Microsoft (I think) is making a huge mistake by looking to make C# more dynamic in nature[1]. My position was (and is) that Microsoft needs to differentiate the two key languages they offer--C# and Visual Basic--and an obvious way to do so would be to designate VB as the official "dynamic language" for the CLR, and C# as the official "static language" for the CLR, and encourage developers to use C# to build infrastructure (libraries and business types and so on) and VB to build "top of the stack" kinds of code (WinForms, ASP.NET, and so on). Neal put me squarely back on my heels with this (paraphrased) comment: Microsoft will never do this, because Visual Basic will never be able to shed the image it has gained, that of being the programming language for idiots[2]. Wow. Sad thing is, he's right. Go back to the terms I suggested you think about at the top of this blog post. If you're like most Java developers, you heard the term "EJB" and immediately got a note of distaste in your mouth. You know that if you suggest EJB on your next Java project, you will be ridiculed and shamed and made to stand in the corner with the Dunce Cap on, even if it makes complete sense from a technical perspective. Companies are choosing instead to build their own transactional-oriented client/server middleware infrastructure, just to avoid the "shame" of using EJB. Because, as we all know, you just can't test EJB. Which, by the way, is a fallacy, and always has been. Oh, I know, you meant you can't unit-test EJB, but that's a fallacy too. It's always been testable, to the same degree that any servlet application has been testable, it's just that nobody wanted to take the time to figure out how to test it effectively, particularly not once Rod Johnson had unleashed Spring upon the world and Made Everything Better(TM) (or, at least, XML configurable, which is better... right?). Static typing suffers the same kind of negative prejudice today. Suggest that C++ has a place in the world, and you will be kicked to the curb by any Right-Thinking Technical Leader. Suggest that C++ has a place on your next project, and you're likely to get sternly reprimanded, possibly even cut loose from the project. Suggest anything that doesn't fit with the Way We Build Software Today, and you're swimming upstream, either with management or with your fellow developers. All because they fall prey to technical folk etymology. They bend the context around the phrases in question to mean something entirely different than what the words actually mean, and as a result, the words take on an aura of snarling, bitter distaste, or, worse, angelic euphoric enlightenment. Domain-specific languages are the new phrase of the moment, and its emotional context is being built as we speak. Functional languages will be there sometime next year or the year after. For both, the euphoria is growing, and for each, in some period of n (three, maybe four) years will be crashing just as hard as they were built up, just as Ruby's and Visual Basic's and COM's and EJB's and WS-*'s and other technologies have done before it. It's as predictable as the flow of alcohol at an MVP Summit, or the consumption of either caffeine at an all-night code frenzy. Other industries have varying relationships with this notion of context: the medical field seems to be almost as susceptible to it as we are, particularly the area of weight management and holistic health (remember the water diet? the South Beach diet? the no-sodium diet? the low-cholesterol diet?), whereas traditional engineering disciplines, such as electrical and construction disciplines, seem far less vulnerable to "the hip new thing of the day". I'm not sure why this is, quite honestly, except that software and medicine share the similar characteristics of a rapid influx of new information on a regular, even daily, basis. People often call me a contrarian, a technical fuddy-duddy who refuses to embrace anything new or anything "bleeding-edge". In many respects, I welcome and accept that label, but frankly, I bristle at the implicit "you just don't want to learn anything new" accusation, because it's a gross misunderstanding and hideous misinterpretation of what I'm really trying to do: Distance myself from the emotional context surrounding a technology, and examine it through the lens of dispassionate observation. In short, I actively seek to defeat technical folk etymology, if only in the small area I personally can affect. Do you? [1] That particular discussion will have to wait for a different blog post on a different day. [2] I should point out, before the hate mail comes flooding in, that this isn't Neal's own opinion, nor mine--witness my post on "Mort means productivity". What he--and I--refer to here is the reputation Visual Basic has garnered, not the fact surrounding it. And if you care to argue that point, then you're not paying attention to the relative average salary numbers between C# and Visual Basic developers. The laws of economics do not lie.
|
 Saturday, April 12, 2008
|
JRuby 1.1 released
|
|
From the "Where the hell was I that day?" Department.... The JRuby community is pleased to announce the release of JRuby 1.1! Homepage: http://www.jruby.org/ Download: http://dist.codehaus.org/jruby/ JRuby 1.1 is the second major release for our project. The main goal for 1.1 has been improving performance. We have made great strides in performance during the last nine months. There have been more and more reports of applications exceeding Ruby 1.8.6 performance; we are even beating Ruby 1.9 in some microbenchmarks. ... (Source: http://docs.codehaus.org/display/JRUBY/2008/04/05/JRuby+1.1+Released) Congratulations to Thomas and Charlie and the rest of the JRuby team; I'm looking forward to playing around with JRuby, specifically in AOT/compiled mode, and for using it as Ruby was originally intended, as a scripting language to make working with systems (in this case, the JVM) easier, a la JMX client scripts to stop-and-start a web application during development/deployment.
|
 Thursday, April 10, 2008
|
Is "Performance" Subjective or Objective in nature?
|
|
(Editor's note: This post is likely to open a huge can of whoop-*ss on this blog, so unless you want to get caught up in the huge bar fight that's about to break out, you're advised to take your whiskey or beer and head outside for a smoke until the cops come.) As a fellow Scala writer, I've been following Daniel Spiewak's blog with no small amount of interest, as he discovers little tidbits inside the Scala language (like the Option type). Then I ran across this entry, about benchmarks and comparing the performance of Java, Groovy and Scala: I’ve seen these results dozens of times (looking back at the post), but they never cease to startle me. How could Groovy be that much slower than everything else? Granted it is very much a dynamic language, compared to Java and Scala which are peers in static-land. But still, this is a ray tracer we’re talking about! There’s no meta-programming involved to muddle the scene, so a halfway-decent optimizer should be able to at least squeeze that gradient down to maybe 5x as slow, rather than a factor of 830. That's a huge discrepancy, and like Daniel, I'm not sure where the perf hit comes from, particularly when we consider that JRuby, another language with equally powerful metaobject protocol (MOP) capabilities, is turning in performance times that are equal to those we see with the original Ruby interpreter (according to Daniel's blog entry, though I note that the comparison of JRuby to Java isn't given). And if the disbelievers in the crowd are starting to tune this out based on the fact that "Ah, it must be an edge case, after all, there's always one benchmark that any language will fail compared to another one; maybe Groovy's just not cut out to do ray-tracing. Yeah, that must be it. Besides, how often do I really do ray-tracing when I'm writing code at work?", take heed, for Daniel notes this and starts to cite other evidence that seem to establish a disturbing pattern: If this were an isolated incident, I would probably just blow it off as bad benchmarking, or perhaps an odd corner case that trips badness in the Groovy runtime. Then a week later, I read this post by Pete Knego (which shows Groovy's performance as equally disappointing, on the order of 7.6x to 56x worse than equivalent Java code --TKN). All of this is old news, so the question is: Why am I bringing this up now? Well, I recently saw a post on Groovy Zone by none-other-than Rick Ross, talking about this very subject. Rick’s post was in response to two posts (here and here), discussing ways to improve Groovy code performance by obfuscating code. Uh, oh. I don't know about y'all, but anytime somebody is suggesting improving performance by obfuscating code, I'm nervous--almost by definition, code obfuscation makes code run more slowly, not more quickly, because now the bytecode is pulled out of familiar patterns recognizable by the JITter and therefore more aggressively turned into optimized native code. I'm not saying Rick is wrong, but if his experiments are leading him to understand that obfuscated code is somehow running faster than non-obfuscated code, then something deeply strange is afoot. (Editor's note: Better hurry and head outside folks, the Groovyists in the corner are starting to grumble amongst themselves, working up the courage to toss that first beer in the piano player's face.) Daniel's not done here, though, and goes on: Final result? This text is being written as I was changing and trying things, I gained 20s from minor changes of which I lost track. " src="http://www.codecommit.com/blog/wp-includes/images/smilies/icon_smile.gif"> I am currently at 1m30s (down from the original 4m and comparing with Java’s 4s). I’m sorry, this is acceptable performance? This is someone who’s spent time trying to optimize Groovy, and by his own admission, Groovy is 23x slower than the equivalent Java code. Certainly this is a far cry from the 830x slower in the ray tracer benchmark, but in this case it’s simple string manipulation, rather than a mathematically intensive test. Coming back to Rick’s entry, he looks at the conclusion and has this to say about it: Language performance is highly overrated Much is often made of the theoretical “performance” of a language based on benchmarks and arcane tests. There have even been cases where vendors have built cheats into their products specifically so they would score well on benchmarks. In the end, runtime execution speed is not as important a factor as a lot of people would think it is if they only read about performance comparisons. Other factors such as maintainability, interoperability, developer productivity and tool and library support are all very significant, too. Wait a minute, that sounds a lot like something else I’ve read recently! Maybe something like this: Is picking out the few performance weaknesses the right way to judge the overall speed of Groovy? To me the Groovy performance is absolutely sufficient because of the easy integration with Java. If something’s too slow, I do it in Java. And Java compared to Python is in most cases much faster. I appreciate the efforts of the Groovy team to improve the performance, but if they wouldn’t, this would be no real problem to me. Groovy is the grooviest language with a development team always having the simplicity and elegance of the language usage in mind - and that counts to me. " src="http://www.codecommit.com/blog/wp-includes/images/smilies/icon_smile.gif"> This is almost a mantra for the Groovy proponents: performance is irrelevant. What’s worse, is that the few times where they’ve been pinned down on a particular performance issue that’s obviously a problem, the response seems to be along the lines of: this test doesn’t really show anything, since micro-benchmarks are useless. I’m sorry, but that’s a cop-out. Face up to it, Groovy’s performance is terrible. Anyone who claims otherwise is simply not looking at the evidence. Oh, and if you’re going to claim that this is just a function of shoe-horning a dynamic language onto the JVM, check out a direct comparison between JRuby and and Groovy. Groovy comes out ahead in only four of the tests. Uh, oh. (Editor's note: Head for the doors, folks--those guys in the corner wearing the black leather jackets sporting the "Grails Rulez" logos on the back have started to head for the center of the room, and they're looking drunk, mean, and angry.) Here comes the coup de grace What really bothers me about the Groovy performance debates is that most “Groovyists” seem to believe that performance is in the eye of the beholder. The thought is that it’s all just a subjective issue and so should be discounted almost completely from the language selection process. People who say this have obviously forgotten what it means to try to write a scalable non-trivial application which performs decently under load. When you start getting hundreds of thousands of hits an hour, you’ll be willing to sell your soul for every last millisecond. The only answer I can think of is that the Groovy core team just doesn’t value performance. Why else would they consistently bury their heads in the sand, ignoring the issues even when the evidence is right in front of them? It’s as if they have repeated their own “performance is irrelevant” mantra so many times that they are actually starting to believe it. It’s unfortunate, because Groovy really is an interesting effort. I may not see any value for my needs, but I can understand how a lot of people would. It fills a nice syntactic niche that other languages (such as Ruby) just miss. But all of its benefits are for naught if it can’t deliver when it counts. That did it. (Editor's note: Shiiiiiiiit! I didn't say nothing, HE did, why're you swinging that beer stein at *WHACK*) ^^^^^^^^^^^^^^^^^^^^^ OK, now that we've gotten that out of our system, let's sit back and examine this issue more carefully, shall we? The fact is, Groovy is slower than it should be. The Groovy guys can mumble about how performance isn't that important and that developer productivity is what really matters and similar kinds of rationale, but at the end of the day, the basic fact remains that Groovy is, by measurement of several different tests, at least an order of magnitude slower than compiled Java code. Or is it? Funny thing is, looking at some of these tests, they don't say whether the Groovy code was compiled first, or run through the Groovy interpreter. Theoretically this shouldn't matter, since the Groovy architecture essentially compiles the classes generated once read, it might make a difference in practice. Although Daniel's post doesn't mention it, I went back and double-checked. Peter Knego's benchmark says that "Groovy code was inside a Groovy script, compiled with groovyc.", which leads me to believe that it was compiled code rather than run through the Groovy shell interpreter, but it would be nice if the actual code and batch/command scripts that ran the benchmarks would be available. (He also notes that each time, the benchmark was "warmed up" by running the bechmark five or six times in a loop, presumably to allow the JIT to work its magic, but most notably, doesn't point out which JVM was used, -client or -server.) Meanwhile, Derek Young's ray-tracing example explicitly uses the Groovy interpreter, but defends that decision in comments: "The only reason I didn’t use groovyc was because the difference was so great, and the compilation overhead at the beginning of the run only takes a couple seconds. I decided it wasn’t worth waiting another two and a half hours to time the compiled output. Running with groovy first compiles the code just like groovyc does, then executes that code. It doesn't interpret the source code or run any differently." So, apparently, it doesn't make much difference. That's not a good development for the Groovy language. But let's put the hard numbers out of the way for a moment, and concentrate on the much bigger question: does the core Groovy team just not value performance? And, as a corollary to this, does the performance of a language really matter in a day and age when CPUs are still doubling in size and number of cores? Rick's (and others', including myself) positions on this seem fairly clear, that we long ago passed the threshold where programmer time became more expensive than CPU time, and therefore we should optimize based on programmer productivity, not CPU efficiency, and that's important to recognize: a language should enable the programmer to express the core idea without a great deal of "noise" or additional work, what Stu Halloway has coined as "ceremony", and certainly Groovy takes the Java programmer a step closer towards that place of lower ceremony. But... But I can't help it, folks. Ted's First Law of Computer Science states that "Dogma is the Root of All Evil", and holding scripting languages up as the last language you'll ever have to use is dogma, plain and simple. Ted's Second Law of Computer Science states, "Context matters", and in this case, the context includes the performance cost of using a language or tool. Taking a performance hit that weighs in at the orders of magnitude mark is just too big to ignore--the ray-tracing example, at its close-to-four-orders-of-magnitude hit, almost suggests that it would have been just as expensive to offload all those calculations through a distributed RPC call to another machine, rather than calculate it locally in Groovy, and when it becomes faster to go off-CPU to do a calculation than to do it locally, something is wrong. And Daniel's point is good to hear clearly through the noise: "When you start getting hundreds of thousands of hits an hour, you’ll be willing to sell your soul for every last millisecond." Forget getting hundreds or thousands of hits an hour--the real test will be when the system gets hundreds or thousands of hits per second, that's when developers will be scrambling to find ways to eke out those last bits of performance from the system, even if it means selling their last Mountain Dew (which for some is pretty much synonymous to "soul") to whatever entity can give it to them. So what exactly is my point with this particular entry, besides stirring the pot up a little? In order: - Measure for yourself. As with all things performance and scalability related, abstract benchmarks aren't a good measure of how well it works for you and your system. Build a prototype, measure, and then compare that against your performance and scalability goals. You did establish performance and scalability goals as part of your project's runup, right? (If you didn't, then you probably assume your users don't care about performance, and I suspect you'll be rudely surprised on the veracity of that statement before long...)
- Benchmarks are tricky things. Programmers could learn something from politicians, and that is the imprecise nature of poll results. Thanks to the nature of statistical analysis and the sample size and source used to produce the poll, polls are always cited with a "plus or minus 3 percent" (or 5 percent, or 10 percent) to indicate the imprecision assumed in the poll. Benchmarks, both across languages and across other products, should be assumed to have similar kinds of imprecision. As people have already noted, benchmarks very quickly get "gamed" in order to produce results that are unfairly biased one way or another if they're not explicitly written and administered to be fair and equal to all sides involved. This isn't to say that benchmarks aren't useful, they're just not useful to a point more precise than rounding to the nearest 10% figure.
- Groovy IS slower than it could or should be. As much as I like the people involved in the Groovy space, and as much as I like the language itself, I can't help but be very very worried that Groovy's performance numbers aren't anywhere close to where they should be. Yes, productivity will get you a long way in the technology-adoption market, but once people have adopted your language or tool, if your system proves to be unresponsive and performance-challenged, the doors letting people in will get blocked by the people trying to get out, and that's not good for Groovy.
- Groovy's performance may not be a reason not to use Groovy. Let's be honest, again: productivity matters. This is Mort's principal goal, remember, and there's nothing wrong with it. Groovy fits into that most natural of places, a scripting language gluing together pieces written in a system language. Perl and Python serve the same purpose for native/C/C++ code, PowerShell does the same (I believe) for .NET code, and it's high time we see the value in doing that for Java code.
- I believe that the core Groovy team holds performance as a value. I know Graeme and Guillaume, and I believe they believe in the value of performance. I believe that Groovy will get faster over time, as they discover new and better ways to compile Groovy code into bytecode. That doesn't mean users of Groovy should walk into this exercise with their eyes shut, mind you, but take the whole of this discussion into context as you figure out where Groovy can be used to make your life, as a developer, more productive and powerful. Certain parts of your system are perf-sensitive, and certain parts aren't. Identify which of those parts are which, and apply Groovy (and other tools) judiciously.
- These discussions are always good, so long as they're held without rancor. Groovy doesn't suck. It has warts, but so does everything. Hushing them up or pooh-poohing them just leads to arguments--I encourage the Groovy team to take the criticism of Groovy's performance the way I intend it in this blog entry: a challenge to be faced and overcome, and not as an indictment of any and all Groovy code everywhere. Because, and I will say this outright, if Groovy's backers seriously mean Groovy as "a better Java 7", then they have a large gap to fill.
Oh, and if some of you wouldn't mind sticking around to clean up the mess...? Getting beer off the ceiling can be tricky.
|
 Wednesday, April 09, 2008
 Tuesday, April 08, 2008
|
IE 8 Beta
|
|
This email crossed my desk yesterday, courtesy of the MVP program: Microsoft has recently released a public beta of IE8. Standards and security are of top importance in this release. To that end, the IE team is planning on releasing IE8 in full standards mode. Releasing in Full Standards Mode offers many benefits in the long term, but short term, could cause some end-user and developer issues. We would love to understand your thoughts around the impact of this specific issue and invite your suggestions on how we can best communicate it. If you have thoughts and feedback on IE 8 releasing in full standards mode, please respond to the questions below and send your reply to jasontil@microsoft.com with “[IE8 Community Feedback]” in the subject line by this Friday, April 11th at Noon, PDT. 1) IE8 releasing in expected to release in “standards mode”. (a) What do people in your communities space think about this decision? (b) What do you predict the impact to be on the customer and/or Developer experience? (c) Do you have a recommendations on how best to share this information? 2) Our current plan is to communicate this heavily with web site owners and developers. We will be contacting top sites directly, distributing developer FAQs, and writing Knowledge Base articles on authoring to these standards. (a) Do you think that will be effective at improving the customer experience? (b) Are there other suggestions do you could offer to transition web sites to be standards-based or to improve the experience for users? 3) Is there anything else you or those in your communities wish to tell us about this issue to improve how we react and respond as Internet Explorer advances to release? If you're a web developer, regardless of what your server-side language of choice is, you probably want to sign up for this, if only to get an early look at IE 8 and how it will, once again, change your life as a web developer. (Ditto for Firefox 3 and Safari, while we're at it.)
|
 Sunday, April 06, 2008
|
More on Paradise
|
|
A couple of people have commented on the previous entry, citing, essentially, that Google needs to do this to be "the best". I understand the argument completely: Google wants to attract the top talent, or retain the top talent, or at least entice the top talent, not to mention give them every reason to be horribly productive, so all of that extravagance is a justifiable--and some might argue necessary--expense. Thing is, I don't buy into that argument for a second. Talent wants to be rewarded, granted, but think about this for a moment: what kind of hours are these employees buying into by working there? There's an implicit tradeoff here, one that says, "If you are insanely productive, then the cost of this office is justified", meaning the pressure is on. Having an off day? Better pull the all-nighter to make up for it. Got stuck on something you didn't anticipate? Better pull the all-weekender to compensate. You're not in the bush leagues any more, sonny--you're at Google, and we paid a lot of money to make this office your home away from home, so snap to it! I'm not suggesting that Goole is explicitly demanding this of their employees... but neither did Microsoft, back in the day. See, all of this--including the justification arguments--is eerily reminiscent of Microsoft in their heyday, with the best example being the original Windows NT team. The hours they pulled over the last few months (some say years) of that project were nothing short of marathon sprints, and Microsoft laid everything they could at the feet of these developers (though nothing like what Google has built in Zurich, mind you) to help them focus on shipping the project. The Wall Street Journal ran an article about the whole thing, and one quote from that article stuck with me: that the pressure to work the insanely long hours didn't come from upper management, but from the other developers on the team. "Are you signed up for this thing or not?" was a euphemism for "Why the hell are you leaving at 9PM? And you're not back until 8AM? What are you, some kind of slacker?" (I felt like screaming back, "Just say no!", and I wasn't even there.) The peer pressure was insane, and drove several members of the team to get outta Dodge as the first opportunity. Or some took off for bike rides across the country to recharge. Or some just... broke. Microsoft doesn't do this anymore. Nobody is expected to put in 60 hour work weeks as a matter of course; now, the average is around 45, which I believe (though I have no factual evidence to support this) is about average for the industry as a whole. (C'mon, admit it, even if you're a strict 9-to-5'er, you still do a little reading at home or stick around after hours to help with the big rollout. It needs to be done, and you're professional enough to want to see it done right.) In college, I learned a lot about startups and established companies. Like most of the folks I hung out with in college, I used to stay up way too late with friends hanging out at Woodstock's (the local pizzeria) arguing politics/sex/religion/operating systems or playing role-playing games*, then come home and bang out a 5-page paper in a few hours before cracking open my notes to study for the final that morning. I could do this without any real penalty, but I usually ended up taking the final, then coming back to my apartment and passing out for a few hours, only to awake to my roommates chanting "Piz-za! Piz-za! Piz-za!" and starting the whole thing over again. I was young, I had energy, I was fundamentally stupid, and I can't do that anymore. I can't sprint like that and still be able to function coherently over time, and as you get older, you realize that while college can be managed as a series of sprints, life requires you to have a more marathon attitude, particularly because you can't know when the sprints are coming, like they do in college. As companies grow larger, their initial lifecycle is a series of sprints: roll the first release out, take a breather while the sales guys gather in the customer(s) and figure out what the next iteration will be, then do the whole thing over again. This effect is even more pronounced if the company has that one Really Big Customer, the one that represents some significant (over 50%) of the company's revenue; it's that company that drives the feature set and its delivery date. Meeting their needs and challenges becomes the source of the sprints to come. As time goes on, however, and assuming the company has somehow managed to find success, they find that the Really Big Customer is actually now just one of several, and the features and the timing of the releases need to be balanced across the entire customer set. In other words, while some sprints are still necessary, the frequency and intensity begins to smooth out and the focus shifts to structure, pace, and consistency. That is, if the company has successfully transitioned from "startup" to "established". Some startups never do, and try to sprint themselves from one scenario to the another, and eventually run themselves into the ground. Managing this transition isn't easy, and is something that generally only comes from having lived through it once or twice... or three or four times... Ah, good times. Remember that whole "work/life balance" thing? We're discovering, over and over again, that having a good work/life balance is a key part of maintaining a sound outlook on life, much less your basic sanity. Creating a "home away from home" where employees can put in insane amount of hours is not healthy "work/life balance"... unless you presume your company will be staffed with fresh-from-college twenty-something singles who have nobody to go home to and all their friends in the office. And that, folks, is not a sustainable model. Point is, Google's extravagance here smacks of startups, sprints, and fevered intensity. What's worse, I hear little bits and pieces of rumors that Google reveres the eleventh-hour developer-god who swoops in, pulls the all-weeker to get the release out the door, to high praise from management and his/her peers. That's not sustainable, and the sooner Google--or any other company, for that matter--realizes that, the better they will be in the long term. * - Does this really surprise you? Yes, I am a huge geek.
|
|
The Complexities of Black Boxes
|
|
Kohsuke Kawagachi has posted a blog entry describing how to watch the assembly code get generated by the JVM during execution, using a non-product (debug or fastdebug) build of Hotspot and the -XX:+PrintOptoAssembly flag, a trick he says he learned while at TheServerSide Java Symposium a few weeks ago in Vegas. He goes on to do some analysis of the generated assembly instructions, offering up some interesting insights into the JVM's inner workings. There's only one problem with this: the flag doesn't exist. Looking at the source for the most recent builds of the JVM (b24, plus whatever new changesets have been applied to the Mercurial repositories since then), and in particular at the "globals.hpp" file (jdk7/hotspot/src/share/vm/runtime/globals.hpp), where all the -XX flags are described, no such flag exists. It obviously must have at one point, since he's obviously been able to use it to get an assembly dump (as must whomever taught him how to do it), but it's not there anymore. OK, OK, I lied. It never was there for the client build (near as I can tell), but it is there if you squint hard enough (jdk7/hotspot/src/share/vm/opto/c2_globals.hpp), but as the pathname to the source file implies, it's only there for the server build, which is why Kohsuke has to specify the "-server" flag on the command line; if you leave that off, you get an error message from the JVM saying the flag is unrecognized, leading you to believe Kohsuke (and whomever taught him this trick) is clearly a few megs shy in their mental heap. So when you try this trick, make sure to use "-server", and make sure to run methods enough to force JIT to take place (or set the JIT threshold using -XX:CompileThreshold=1) in order to see the assembly actually get generated. Oh, and make sure to swing the dead chicken--fresh, not frozen--by the neck over your head three times, counterclockwise, while facing the moon and chanting, "Ohwah... Tanidd... Eyah... Tiam...". If you don't, the whole thing won't work. Seriously. ... Ever feel like that's how we tune the JVM? Me too. The whole thing is this huge black box, and it's nearly impossible to extract any kind of useful information without wandering into the scores of mysterious "-XX" flags, each of which is barely documented, not guaranteed to do anything visibly useful, and barely understood by anybody outside of Sun. Hey, at least we have those flags in the JVM; the CLR developers have to take whatever the Microsoft guys give them. ("And they'll like it, too! Why, when I was their age, I had to program using nothing but pebbles by the side of the road on my way to school! Uphill! Both ways! In the raging blizzards of Arizona!") Interestingly enough, this conversation got me into an argument with a friend of mine who works for Sun. During the conversation, I mentioned that I was annoyed at the difficulty a Java developer has in trying to see how the Java code he/she writes turns into assembly, making it hard to understand what's really happening inside the black box. After all, the CLR makes this pretty trivial--when you set a breakpoint in Visual Studio, if you have the right flags turned on, your C# or VB source is displayed alongside the actual native instructions, making it fairly easy to see that the JITted code. This was always of great help when trying to prove to skeptical C++ developers that the CLR wasn't entirely brain-dead, and did a lot of the optimizations their favorite C++ compiler did, in some cases even better than the C++ compiler might have done. "Why don't we have some kind of double-X-show-me-the-code flag, so I can do the same with the JVM?", I lamented. His contention was that this lack of a flag is a good thing. Convinced I was misunderstanding his position, I asked him what he meant by that, and he said, roughly paraphrasing, that there are only about 20 or so people in the world who could look at that assembly dump and not draw incredibly misguided impressions of how the JVM operates internally; more importantly, because so few people could do anything useful with that output, it was to our collective benefit that this information was so hard to obtain. To quote one of my favorite comedians, "Well excuuuuuuuuuuse ME." I was a bit... taken aback, shall we say. I understand his point--that sometimes knowledge without the right context around it can lead to misinterpretation and misunderstanding. I'll agree totally with the assertion that the JVM is an incredibly complex piece of software that does some very sophisticated runtime analysis to get Java code to run faster and faster. I'll even grant you that the timing involved in displaying the assembly dump is critical, since Hotspot re-JITs methods that get used repeatedly, something the CLR has talked about ("code pitching") but thus far hasn't executed on. But this idea that only a certain select group of people are smart enough and understand the JVM well enough to interpret the results correctly? That's dangerous, on several levels. First, it's potentially an elitist attitude to take, essentially presenting a "We look down on you poor peasants who just don't get it" persona, and if word gets out that this is how Sun views Java developers as a whole, then it's a black mark on Sun's PR image and causes them some major pain and credibility loss. Now, let me brutally frank here: For the record, I don't think this is the case--everybody I've met at Sun thus far is helpful and down-to-earth, and scary-smart. I have a hard time believing that they're secretly thumbing their nose at me. I suppose it's possible, but it's also possible that Bill Gates and Scott McNealy were in cahoots the whole time, too. Second, and more importantly, there will never be any more than those 20 people we have now, unless Sun works to open the deep dark internals of the JVM to more people. I know I'm not alone in the world in wanting to know how the JVM works at the same level as I know how the CLR works, and now that the OpenJDK is up and running, if Sun wants to see any patches or feature enhancements from the community, then they need to invest in more educational infrastructure to get those of us who are interested in this stuff more up to speed on the topic. Third, and most important of all, so long as the JVM remains a black box, the "myths, legends and lore" will haunt us forever. Remember when all the Java performance articles went on and on about how method marked "final" were better-performing and so therefore should be how you write your Java code? Now, close to ten years later, we can look back at that and laugh, seeing it for the micro-optimization it is, but if challenged on this idea, we have no proof. There is no way to create demonstrable evidence to prove or disprove this point. Which means, then, that Java developers can argue this back and forth based on nothing more than our mental model of the JVM and what "logically makes sense". Some will suggest that we can use micro-benchmarks to compare the two options and see how, after a million iterations, the total elapsed time compares. Brian Goetz has spent a lot of time and energy refuting this myth, but to put it in some degree of perspective, a micro-benchmark to prove or disprove the performance benefits of "final" methods is like changing the motor oil in your car and then driving across the country over and over again, measuring how long until the engine explodes. You can do it, but there's going to be so much noise from everything else around the experiment--weather, your habits as a driver, the speeds at which you're driving, and so on--that the results will be essentially meaningless unless there is a huge disparity, capable of shining through the noise. This is a position born out across history--we've never been able to understand a system until we can observe it from outside the system; examples abound, such as the early medical understanding of Aristotle's theories weighed against the medical experiments performed by the Renaissance thinkers. One story says a skeptic, looking at the body in front of him disproving one of Aristotle's theories, shook his head and said, "I would believe you except that it was Aristotle who said it." When mental models are built on faith, rather than on fact and evidence, progress cannot reasonably occur. Don't think the analogy holds? How long did we as Java developers hold faith with the idea that object pools were a good idea, and that objects are expensive to create, despite the fact that the Hotspot FAQ has explicitly told us otherwise since JDK 1.3? I still run into Java developers who insist that object pools are a good idea all across the board. I show them the Hotspot FAQ page, and they shake their head and say, "I would believe you except that it was (so-and-so article author) who said it." Oh, and don't get me started on a near-total opacity of the Parrot and Ruby environments, among others--this isn't a "static vs dynamic" thing, this is something everybody running on a managed platform needs to be able to do. I'm tired of arguing from a position of faith. I want evidence to either prove or disprove my assertions, and more importantly, I want my mental model of how the JVM operates to improve until it's more reflective of and closer to the reality. I can't do that until the JVM offers up some mechanisms for gathering that evidence, or at least for gathering it more easily and comprehensively. You shouldn't have to be a JVM expert to get some verification that your understanding of how the JVM works is correct or incorrect.
|
 Thursday, April 03, 2008
|
Developer paradise?
|
|
Check out this video. No, go on, watch it. The rest of this won't make much sense until you do. Now that you've seen it, take a moment, do the "WOW" thing in your head, imagine how cool it would be to work there, all of it. Go on, I know you want to, I did too when I first saw it. Go ahead, take a moment; you'll be distracted until you do, and you'll miss the rest of the point of this blog entry, and then I'll be sad. Go on, now. Here, I'll do it with you, even. Mmmmmm. Slide to lunch. Ahhhh. Massage chair in front of the fish tank. Wow, just think of how cool it must be to work at Google. I mean, they work hard and all, but still... now there's a company that knows how to take care of its engineers, right? OK, daydreaming done? Let's think about this for a moment. First, how can anybody get anything done with all that noise surrounding them? Oh, I don't mean actual audio noise, I know they've created quiet zones and all that, I mean the myriad distractions that float around that office building. I'll be honest--I find myself getting work done better in an environment without that additional stimulus and excitement (legacy of my ADD, I'm sure). Knowing that I could just nip on over to the video game room to spend some "thinking time" in front of an all-you-can-play Galaga machine would drive me batty. Maybe that's just me, and others are just begging to be given the chance to prove me wrong, and if that's the case, then by all means, please feel free. But I've heard this same experience from lots of people doing the work-at-home thing, and I don't think the anecdotal evidence here is widely skewed. Sometimes you want work to be... just work. Vanilla, boring, and predictable. Don't get me wrong--I don't exactly look forward to my next engagement that plops me down in the middle of the cube farm--there's a continuum here, and Google is clearly far on the opposite end of that spectrum from the Dilbert-esque cubicle prairie as anyone can get. But had I my personal preference here, it would be a desk, fairly plain, comfortable, yet focused more on the functional than the "fun". But second, there's a deeper concern that I have, one which I worry a lot more about than just peoples' preference in work space. When's the last time you saw this kind of extravagance being lavished on developers? For me, it was at a number of different Silicon Valley firms during the dot-com boom of the late 90's... and all of those firms are dessicated remains of what they once were, or else dried up completely into dust and have long since blown away with the coastal breeze. This was classic startup behavior: drop a ton of money I'll call it: If Google sees nothing wrong with this kind of extravagance in setting up an office, then they have just done their first evil. Pause for a moment to think about the costs involved in setting up that office. I submit to you, dear reader, that Google is being financially irresponsible with that office, all nice perks aside. Google's money machine isn't going to last forever--nobody's ever does--and the company (desperately, IMHO) needs to find something else to prove to Wall Street and Developer Street that they're still a company that knows how to write cool software and make money. (Plenty of companies write cool software, and close their doors a few years later, and plenty of companies know how to make money, but having a company who can do both is a real rarity.) Look at Google's habits right now: they're pouring money out left and right in an effort to maintain or improve the Google "image"; tons of giveaways at conferences, tons of offices all across the world, incredible office spaces like the one in the video, and a ton of projects created by Google engineers just because said engineers think it's cool. While that's a developer's dream, it doesn't pay the rent. I want to work for a company that offers me a creative, productive work environment, true, but more than that, I want to work for a company that knows how to make sure my checks still cash. (Yes, I remember the late 90's well, and the collapse that followed.) I'm worried about Google--they appear to be on a dangerous arc, spending in what would seem to be far greater excess of what they're taking in, and that's not even considering some of the companies they would be well-advised to consider buying in order to flush out more of their corporate profile (which is its own interesting discussion, one for a later day). What is Google's principal source of income right now? Near as I can tell, it's AdWords, and I just can't believe that the AdWords gravy train will run any longer than... ... well, than the DOS gravy train. Granted, that train ran for a long time, but eventually it ran out, and the company had to find alternative sources of income. Microsoft did, and now it's Google's turn to prove they can put money back into their corporate coffers. The parallels between Google and Microsoft are staggering, IMHO.
|
 Wednesday, April 02, 2008
|
Is Microsoft serious?
|
|
Recently I received a press announcement from Waggener-Edstrom, Microsoft's PR company, about their latest move in the interoperability space; I reproduce it here in its entirety for your perusal: Hi Ted, Microsoft is announcing another action to promote greater interoperability, opportunity and choice across the IT industry of developers, partners, customers and competitors. Today Microsoft is posting additional documentation of the XAML (eXtensible Application Markup Language) formats for advanced user experiences, enabling third parties to access and implement the XAML formats in their own client, server and tool products. This documentation is publicly available, for no charge, at http://go.microsoft.com/fwlink/?LinkId=113699 . It will assist developers building non-Microsoft clients and servers to read and write XAML to process advanced user experiences – with lots of animation, rich 2D and 3D graphic and video. Specifically, non-Microsoft servers can more easily generate XAML files to be handled, for example, by applications running on Windows client machines. In addition, non-Microsoft clients can be written more easily to interpret XAML files. This action will assist ISVs in creating design tools and file format converters to read and write XAML to create advanced user experiences. Microsoft is making this documentation available under the Microsoft Open Specification Promise (OSP), which will allow developers of all types anywhere in the world to access and implement the XAML formats in their own client, server or tool products without having to take a license or pay a fee to Microsoft. The following quote is attributable to Tom Robertson, general manager, Interoperability and Standards, Microsoft. “Microsoft’s posting of the expanded set of XAML format documentation to assist third parties to access and implement the XAML formats in their own client, server and tool products will help promote interoperability, opportunity and choice across the IT community. Use of the Open Specification Promise assures developers that they can use any Microsoft patents needed to implement all or part of the XAML formats for free, anywhere in the world, now and in the future.” Please let me know if you have any questions or if I can provide you with any additional information. Best, N-- This marks the most recent in a slew of efforts by the Borg of the Pacific Northwest to "promote greater interoperability, opportunity and choice", and I know it's left a lot of people feeling decidedly skeptical and... well, let's just call it what it is, paranoid, about the company's plans and ulterior motive behind all these efforts. After all, this is the company that tried to co-opt Java, put Stacker out of business, used their monopoly operating system power to crush Novell, used their monopoly office suite power to crush the Mac, bribe an entire country to vote their way on the new office-file specifications, and I don't know what all else. I know, I know, all my blog-readers who work at Microsoft are going nuts right now, protesting, claiming that this isn't the same company that they work for now, and so on. Fact is, folks, if you work at Microsoft, you work for a company whose name is not well-received in many quarters, and while some of it is undeserved... some of it is. Microsoft has done some pretty stupid things in its history, and if that reputation doesn't sit well with you now, I can't help but wonder if somewhere in that great Corporate Heaven, Stac Electronics isn't just jumping up and down, foaming at the mouth and screaming, "Ha! Serves you right!" I don't want to use this blog as a chance for everybody who ever got burned by Microsoft (or thought they got burned by Microsoft, which is much more widespread and just as much more likely to be in their own minds) to trot out "reap and sow" cliches. Instead, I want to revisit one of my favorite topics, that of interoperability, and see exactly what this new shift in Microsoft's attitude towards interoperability really means. Let's take these one at a time. Note that I have no "Deep Throat" at Microsoft feeding me "the Redmondagon Papers"; this is all based on my own conjecture and perspective. What does releasing the XAML spec really mean? Honestly, it means that now non-Microsoft platforms can try to create competitors to Aero and Windows Presentation Foundation, and have the same kind of rich client experiences that Windows users can enjoy. Honestly, I expect this to go pretty much nowhere. Realistically speaking, if a non-Microsoft app server wanted to generate XAML, it was a simple matter of generating the appropriate XML, tagging it with an appropriate MIME type in the HTTP header, and serving it up over an HTTP request; I've been giving this demo at conferences for three or four years now, pretty much since the first betas of WPF were stable enough to use. This really isn't rocket science. But more importantly, XAMl has always been misunderstood: it's not a presentation format, it's an object graph format. XAML simply "wires up" a collection of objects into a tree, and it's the underlying object model that provides the functionality or power or presentation or whatever. It's an easier way of writing "Button b = new Button(...);", nothing more, nothing less. Sure, it would be nice to have some kind of equivalent for the Swing space, but doing so would tie the corresponding XML (XSML?) to the Swing APIs, just as WPF XAML is tied to the WPF API. Does releasing the XAML specs mean that now Linux and Mac OS will get WPF features? They've had them for years, in the guise of the OpenGL APIs, and nobody knew what to do with them, except maybe for a sliver of folks building games and interesting "effects". Unless somebody really feels the desire to try and create an adapter layer to map the WPF Button over to an OpenGL button, I really don't see much point. This is one of the most dangerous points in the discussion: attempting to build an adapter to another platform's API is almost always a failed experiment from the day it's begun, and Microsoft's own attempt to port the MFC APIs over to the Mac OS (back in the pre-OS X days, circa 1995) were just a miserable, abject failure. Not because of any lack of intelligence on Microsoft's part, mind you, but because the two operating systems are just too different. Want to see what I mean? Bring a Mac guy and a Windows guy into the same room, and ask them each where God intended the menu bar to live. Then creep, quietly, out of the room, before you get caught in the blood frenzy. Why does Microsoft suddenly care about interoperability? This is the crown jewel of the lot: why should this company, so famous for going it alone on so many issues, suddenly decide that it's important for them to embrace the other kids on the playground and make nice? Is this back to the "embrace" part of the "embrace and extend and extinguish" cycle that they're so famous for? Partly. To understand the point I am about to make, let's set some context. (In other words, gather 'round, children, it's story time.) Truth is, there was a time back there in the '90s when I think Microsoft really thought they could take over the world. COM was on the ascendancy, and it was a better platform for building software than anything else out there (at the time), particularly in the area of building rich media applications (remember when embedding a sound clip into your email message embedded inside your spreadsheet was all the rage?). The CORBA initiative was going strong, true, but its great claim to fame was to allow two remote processes to talk to one another--the rest of the CORBA "push" was in standards that either never materialized, or else materialized but turned out to be really hard to build, or use, or deploy, or all of the above. IBM's great competitor--SOM--wasn't even in beta on anything other than OS/2 (another great IBM product). Then, when DCOM shipped, it was seen by some as the final nail in the CORBA coffin; Microsoft clearly was going to "win". Along came Java. Java literally took the rug out from underneath the COM platform, almost overnight. It provided a platform with most of the same benefits as the COM/DCOM platform, but without having to memorize the QueryInterface rules or knowing what IUnknown was or how IDispatch was required to work or how static_cast<> and dynamic_cast<> and QueryInterface were all related. ("Would you, should you, static_cast? Not if you want your code to last..." Ah, those were heady days.) Suddenly, "mere mortals" could program on this platform, and feel a strong sense of confidence that their code would work, over time, regardless of whether they remembered to set references to null when they were done with them. At first, Microsoft was "down with it", because in Java they saw a great marriage: the Java language as the "sweet spot" between C++'s expressive power and VB's layers of abstraction, running on top of the JVM as a "sweet spot" intermingled with the COM platform to provide the easiest, most powerful Windows programming environment yet. Visual J++ was clearly the favored child of the litter. And then the lawyers got involved, and Sun saw their chance to steal a march on Microsoft, and maybe break the feared operating system monopolist, and maybe even get a few more percentage points for Solaris (because, after all, "Write Once, Run Anywhere" meant that you wouldn't have to run sucky operating systems like Windows and instead could trade up to real operating systems like Solaris, right? Hey, where'd that penguin come from, anyway, and why is he eating all our fish?). Sun refused to let Microsoft's marriage of the JVM (technically the MSVM) and COM take place, and Microsoft, rather than seek to fight it out, instead decided to cede the battle, and look for a battleground of their own choosing, instead. Thus was the thing that would become called ".NET" born. But this "master plan" would take four or so years to develop, and in the meantime... ... in the meantime, EJB and Servlets and later J2EE and "app servers" and Spring and all those wonderful things that came with them, they were eating Microsoft's lunch. Comparing J2EE (even with EJB in the mix!) with the complexities of writing unmanaged COM code on top of COM+ is simply no comparison--again, the power of the managed platform simply proved to be too hard to turn away without compelling reason, and the COM/DCOM/COM+ story simply didn't have that compelling reason. Microsoft watched their "inevitable victory" sail into the sunset without them, just as the Department of Justice came up to them and shackled them with the first of many, many papers about "anti-competitive practices". In many respects, the positions got reversed--Sun inherited a huge share (an unhealthy dose, in fact) of Microsoft's arrogance, and for a long time there, thought they were suddenly destiny's child, that Java (meaning Sun, of course) would be the one to "win", and thus would Sun's assurance of world dominance thus be assured. Except it didn't play out that way. Sun found that by embracing standards over implementations, they spent long hours thrashing out specifications, only to provide instant credibility to other vendors' products while their own languished. Weblogic stole the EJB early adopter window. A number of small vendors provided servlet implementations before Tomcat was born... which, although written by Sun employees, was an open-source project and yielded no financial benefit. JMS... well, JMS was always the redheaded stepchild of the J2EE family, at least until vendors like Sonic and Fiorano rescued it for the common Java programmer. (Those who'd been using IBM MQSeries all the while never really could see why you'd want to program against JMS APIs instead of IBM's own.) In each and every case, Sun found their product to be the third or fourth entry into the race, usually years after the others had started, and as a result.... Meanwhile, back in Redmond.... Microsoft comes to the game with .NET in 2003. (The early betas don't count because many people openly wonder if Microsoft is really serious about this ".NET" thing in the first place. After all, remember Microsoft Bob?) And despite .NET's obvious advantage of being formulated nearly a decade after Java's initial release, thus able to apply hindsight to fix or improve the obvious blemishes in the Java environment, Microsoft finds that they're playing catch-up in the all-too-important enterprise space. Microsoft's tools and products have always been seen as "second-class citizens" to the "big boys" in the enterprise space, particularly at the ends of the "high scale" continuum, and the lack of an obvious "app server" in the .NET arena only serves to underscore and reinforce that opinion among many large firms. More importantly, Microsoft doesn't ever want to get blindsided by the Java experience again. They want to make sure that they are never in a position where it looks like their tools are vastly out-of-date, underfeatured, underpowered, and underused. They need to remain somewhere near the bleeding edge, but not so close that their customers are the ones doing the bleeding. (We pause for the inevitable Vista joke.) To Microsoft, Java is that near-death experience that pulls many adrenaline (and other) junkies back from the brink they so callously teetered on before. They need some kind of forward progress, some kind of advancement in the game, so that their customers and their would-be customers feel like Microsoft is on top of it at all times. Result: Somewhere in the 2000-2003 timeframe, Microsoft looks around, sees the landscape, and realizes it needs to make itself relevant to a largely J2EE-based universe, and fast. At first, Microsoft sees a play through the establishment of some standards between the big vendors, around this new "XML" thing, a largely portable data format, and so they throw themselves heart and soul into that space. Doing so will allow them to show existing J2EE-based shops that the power of the .NET platform lies in complementing the existing infrastructure, not replacing it. (Microsoft is smart enough to realize that preaching the software equivalent of hellfire-and-brimstone, known as "rip-and-replace", will not cater well to this congregation.) (Rubyists could have learned a valuable lesson here, but either weren't paying attention, didn't realize the value of the lesson, or else just chose not to.) But this play doesn't turn out the way they expect: the WS-* standards become top-heavy, and start to resemble the very thing Microsoft sought to smash fifteen years earlier: CORBA. The number of WS- specifications available through the W3C (and OASIS, and WS-I and whatever other industry consortiums are formed) is exceeded only by the number of Cos- specifications available from the OMG. The complexities therein leave many Java--and .NET--programmers confused, bewildered, and hopelessly lost when trying to get all but the most simple services to work. Thus does the community turn to alternatives--JSON, simple sockets, REST, whatever--to try and find something that works, even if it only addresses a subset of the problems they will eventually face. Meanwhile... Open source grows ever more important, and Microsoft-the-company realizes they have to either kill it or join it. It's hard to kill something that has no body (unlike their previous competitors), so joining it is the only viable option. Unlike many other software product companies, however, Microsoft has too large an established software base to just "flip the switch", and has far too deeply entrenched a corporate community to take any kind of radical action without a well-thought plan. (Wall Street, a place few programmers ever bother to consider, much less visit, would not take kindly to Microsoft essentially giving away their core product without something in its place to generate revenue, and regardless of how many programmers would like to imagine a world with a bankrupt Microsoft, this would be bad for business for everybody.) And thus do we come to the present. Microsoft needs a play that is Wall Street friendly, programmer friendly, and corporate friendly. They are slowly flirting more and more deeply with open source, yet still firmly committed to turning a profit (something a few of these other open-source-based companies should probably learn to do at some point--just maneuvering to the point of being bought out by a larger fish, like Oracle, is not really a long-term competitive strategy, just so you know). Microsoft wants--arguably, needs--to keep Office relevant in a world where software isn't always paid for, so they need a play that keeps Office ubiquitous and out in the forefront of developer mindshare. If they can't get you to buy Office, then at least let's get you to use tools that keep the Office file formats ubiquitous. If (and this is a big "if") the Office formats turn out to be technically superior to their competition, then Microsoft succeeds. If not, they find a new play. In the short, Microsoft needs an interoperability story, and they need a real interoperability play, because their reputation is damaged from the many "embrace, extend, extinguish" plays they've made in the past. The era of a large vendor "winning" is clearly well behind us (if it was ever, in fact, more than just a marketing VP's wet dream), and if Microsoft is going to make sure that they're never in a vulnerable come-from-behind position again, they need to make sure that they can work well with all the other new technologies out there, whether up-and-coming or well-established or even fading-fast. They need to have an interoperability story that developers can believe in, which means some kind of open-source-friendly play, and one that carries serious "street cred" for actually working. What's the lesson that I, a developer, take away from this? If you are a Java developer, get past your old prejudices and accept that .NET is a viable platform. The Java developer who refuses to learn how to write C# code on the grounds that "Micro$oft is a company that just puts out crap" or that "M$FT sux" is going to be a Java developer whose value to the business is reduced compared to those with less virulent politics. Thanks to tools like VMWare and Virtual PC, you don't have to give up your Mac or your Linux environment to write .NET code and prove that you can offer value to those projects that need to talk to .NET. Look into more than just the WS-* or REST stacks for communication, as well; explore some of the interoperability options I've been ranting about for four years, a la IKVM, Jace, Hessian, even CORBA. If you are a Ruby developer, get over yourself and your "we're more agile and more powerful" meme. Ruby is a tool, nothing more, and one whose shine is fast coming off. IT organizations are discovering the myriad problems with the original Ruby runtime, and are unwilling to risk enterprise apps on a runtime that has zero monitoring and zero manageability play. Yes, you can certainly do lots of things yourself to make your Ruby apps more manageable and more monitorable--but that's all time you have to spend building it, or figuring out how to hook it into the existing IT infrastructure, and when all that time gets added up, it's not going to look all that different from a Java or .NET app's timecycle arc. If you don't have an answer to the question, "How will we make this work with the existing infrastructure we've got?", then you have a problem, and no amount of chanting "Obi-Dave Thomas-Kenobi, you and dynamic typing are my only hope" will save you. If you are a .NET developer, it's high time you accepted that the Java folks are about five years ahead of you on this "managed code" arc, and that they suffered through a lot of hard lessons before arriving at the decisions they came to. Don't be stupid, learn from their mistakes. Why do Java programmers chant "dependency injection" with holy fervor? Why do Java programmers put so much stress on unit testing? What has Microsoft not given you with the latest release of Visual Studio that Java developers think you're an idiot for not demanding in the next release? Yes, C# has some interesting new features in it that Java-the-language doesn't have... but why are the Java guys getting all misty-eyed over Groovy? What do they know that you don't? If you are a developer outside of these areas, you're swimming in dangerous waters, because while I'm sure you're not having any problems finding a job, chances are your next job is going to require you to talk to one of those three environments. Better have your integration/interoperability story worked out, whether its Phalanger for the PHP developer who needs to talk to .NET (and damn if PHP script driving a WinForms app isn't an interesting idea in of itself... and a useful way to bridge yourself into an entirely new area of employment), or its figuring out how to apply your mad Haskell skillz to F# or Scala, you need to have a good idea of what those languages are (and aren't) and how your knowledge of functional concepts can catapult you to the head of the class the next time a massively-scalable system needs to be built. If you are a Microsoft employee, don't blow this. Don't make this into another "embrace, extend, extinguish" cycle. Accept that your company made some bone-headed maneuvers in the past, and rather than try to defend them, accept that your reputation outside of the Redmond Reality-Distortion Bubble is not what it looks like from the inside. As hard as this will be to do sometimes, just stop and listen to what others are saying about the company and the paranoia that creeps up every time Microsoft moves into an area of interest. Take the extra moment to hear the concerns, not just the words. And if you are a Google employee, tatoo this on your forehead: Reputation Matters. The first time anybody at your company does something even remotely "evil", you will be branded as "the next Microsoft" and all of these problems will be yours to share and enjoy, as well.
|
 Tuesday, April 01, 2008
|
MSDN "F# Primer" Article Feedback
|
|
Since the publication of the F# article in the MSDN Launch magazine, I've gotten some feedback from readers (for which I heartily thank you all, by the way), but in particular I've gotten two emails from "tms" that I thought deserved more widespread notice and commentary. I'm happy to give full credit to "tms" for his comments, but thus far I haven't heard back from him saying it was OK to do so; that said, his points are valid, and I think important for the rest of the world to hear, so I'm posting this under a pseudonym until he gives permission to offer up his real name. In his first note, tms says.... I appreciated the (F#) article. I would like to point out one error. You wrote: Like many functional languages, F# permits currying ...
Your example does not demonstrate currying well, as it could be written in any non-currying language such as C#. (It is indeed an "idiom" that one uses in C# to manually do the equivalent of currying, where desired.)
Here are two statements, either of which would demonstrate currying:
1: let add5 = add 5 2: 3: 5 |> (add 5)
Neither of these two statements have any direct equivalent in C#, because C# lacks the concept of currying. What is significant about these statements, is "add 5" -- the use of add with only one of its two parameters. This is the essence of currying. It takes a function that requires n parameters, and directly turns it into a function that requires n-1 parameters, with no need to name or otherwise talk about the "missing" parameter.
Agreed, but even there, it's possible to do in C# with the use of (multiples of) anonymous methods. For example, the "add5" example you use can be seen as something akin to this:
1: // Note this has not been compiled with anything except the 2: // Neward & Associates Blog Compiler (i.e., my eyes) 3: public class Container 4: { 5: public void add(int a, int b) { return a + b; } 6: 7: // This is the simple, hard-coded version 8: public void add5(int b) { return add(5, b); } 9: 10: // This is the more complex approach that arguably is closer to F# 11: public delegate int AddMethod(int, int); 12: public AddMethod Add = new AddMethod(add); 13: 14: public delegate int Add5Method(int); 15: public AddMethod Add5 = new Add5Method((b)=> return Add(5, b)); 16: }
Your second example, using the pipeline operator can, in fact, also be done using C# and a well-established set of delegate types arranged into a pipeline, a la how PowerShell passes objects (or lists of objects) from one Cmdlet to another....
... but your point is still well taken; there's much better examples of currying in the world; Don Syme (who tech-reviewed the article) openly questioned whether or not currying was a good thing to bring up in this introduction, and I argued that I thought it was necessary to at least open the subject in order to explain some of the inherent power of functional programming (and, by extension, some of the motivation for learning F#).
Net result: there is some smoothing of the story on F# yet to be done. You only find this out from presenting a story to an audience, hearing their feedback, and iterating on it further.
In his second note, tms points out
Your evolution of
let results = [ for i in 0 .. 100 -> (i, i*i) ]
into:
let compute2 x = (x, x*x) let compute3 x = (x, x*x, x*x*x) let results2 = [ for i in 0 .. 100 -> compute2 i ] let results3 = [ for i in 0 .. 100 -> compute3 i ]
I think this could use a better explanation about what is being shown.
When I first read it, my reaction was:
'I can do the same thing in C# -- you just replaced an an expression in the language, "(i, i*i)" with a function that returns the value of that expression, "compute2 i".'
It wasn't until I sat down to write the C# equivalent that I saw what the benefit is: in F# it is easy to define functions anywhere. In C#, the code would have occurred somewhere in a method of some class, so if "compute2" were a static method on the same class, it would be just as easy to use -- it would simply be "compute2(i)". But in C# I can't embed it as is done in F#. Somewhere else in the class I have to add the function:
1: // This is C# 2: class MyClass { 3: 4: method SomeMethod() { 5: ... 6: result.Push( Pair(i, i*i) ) 7: ... 8: }
== can be turned into ==>
1: class MyClass { 2: ... 3: static method compute2 (int a) { return Pair(i, i*i); } 4: ... 5: 6: method SomeMethod() { 7: ... 8: result.Push( compute2(i) ) 9: ... 10: } 11: } 12:
It would be really cool if C# let you define a function locally, something like:
1: class MyClass { 2: 3: method SomeMethod() { 4: ... 5: function compute2 (int a) { return Pair(i, i*i); } 6: result.Push( compute2(i) ) 7: ... 8: } 9: } 10:
Is that the benefit you were describing?
Weeeelll..... I'd like to say that was the case, but in truth, I don't think I had that in mind when I was writing the article. In fact, it's a bit hard, looking back, exactly what I had in mind during that particular section of the article, except perhaps to try and explain a bit more of the F# syntax. I think what I was trying to do was show how functions could be used in a higher-order manner, but with a simple (arguably trivial) manner, which, in retrospect, doesn't really do the concept of higher-order functions much justice. I'd like to use as my excuse the technical writer's traditional escape, which is to say, "Hey, you try explaining a complex concept in 5000 words, along with introducing basic syntax and still make it relevant to the audience", but in truth, that's just an excuse, and I admit it. *sigh* Fortunately, folks like you are around to point out the flaws in my prose, and (hopefully) make it stronger the next time around. 
The other thing to remember, too, that as with most language comparisons, it isn't so much a matter of what I can or can't do in a particular language vis-a-vis a different language (F# vis-a-vis C#, in this case), but more a question of "What does this language allow me to express as a first-class concept that the other one forces me to express via much lower-level constructs?" Just about everything that F# offers can be replicated in C#--thanks in no small part to anonymous methods/lambdas, to be frank--but forces the C# developer into writing much of the scaffolding that has to be in place. (If you think about it, this has to be true, at least at some level, because both F# and C# run on top of the CLR, which means they each have to 'boil down' to CIL at some level, and given the relatively high level of fidelity between C# and CIL, almost any construct expressed in CIL can be 'redrawn' in C#, if we're willing to.)
Case in point: consider the snippet tms calls out above:
let compute2 x = (x, x*x) let compute3 x = (x, x*x, x*x*x) let results2 = [ for i in 0 .. 100 -> compute2 i ] let results3 = [ for i in 0 .. 100 -> compute3 i ]
If we take this snippet and run it through the F# compiler grinder, then look at the results in ILDasm, we get an interesting comparison of how F#'s first-class support for functions maps into C#'s view of the world.
First, ILDasm:
(You'll note I spared you the huge text dump of "ildasm /out:example.il example.exe", since that would have more noise than signal. Feel free to perform the experiment on your own, if you'd like to see the raw output.)
As you can see here, the F# "top-level" code gets stored into a static method _main stored in the class "<StartupCode$example>" in the namespace "<StartupCode$example>", and yes, _main() is marked with the CIL ".entrypoint" directive, telling the CLR that this is where life begins for this particular assembly. Notice as well how the filename becomes the class "container" for the functions defined therein (the class "Example"), and the functions in particular--compute2() and compute3()--are exported as public static methods. You can see, however, that their parameter types are definitely more complex than the form we would use in traditional idiomatic C#, tuples instead of a list of individual parameters, which tms tries to keep fidelity to in his pseudo-C# translation. The "results2" and "results3" identifiers are in turn kept as properties, exposed on the Example class, and to top it off, are actually defined (not once, but twice) as nested classes of the Example class, because these are, in fact, lists of results, not a single result.
I could go on, but frankly, the noise would begin to swamp the signal. I leave the exercise of opening example.exe in Reflector up to the interested reader. (If you're even remotely interested in F#, I highly recommend doing so once or twice, just to get an idea of how much scaffolding and infrastructure F# is putting into place for you. It's also incredibly useful for when you're trying to figure out C#-calling-F# interop issues.) It's particularly interesting to walk the path of how results2 gets generated, and how wildly different that is from the traditional C# "for" loop. It turns out that everything I'm doing in the code snippet above can be done in C#, but wow, why would you want to? Particularly if you want to get exactly the same kind of fidelity to side effects (that is to say, none at all) that the F# approach gives you?
Both are excellent points, tms, and thanks for taking the time to offer feedback.
|
 Sunday, March 30, 2008
|
Leopard broke my MacBook Pro's wireless!
|
|
So I took the plunge and installed Leopard onto my MacBook Pro tonight, and as of right now, I'm not a happy camper. The installation started off well enough--pop in the DVD, bring up the installer, double-click, answer a few form fields, then wait as it verifies the DVD, reboots into the CD-launched installer again, answer a few form fields, then sit and read my latest copy of Ellery Queen Mystery Magazine while the installation completes. Roughly an hour or so later, it's done, and I have a bright and shiny new Leopard installation on my Mac. Yay. Software Update tells me there are a few things that need updating--sure, that makes sense, since I think the latest version of Leopard is actually now 10.5.2, so go ahead. Bad move. Ever since that update, any attempt to join my home wireless network fails miserably. AirPort can clearly see the network--it discovers the SSID without a problem--but joining it yields no love. The error that shows up in the console log is always this pair: airportd Error: Apple80211Associated() failed -6 _emUIServer Error: airport MIG failed = -6 ((null) port = 60027) I've tried several things suggested in the Apple forums, from changing the order of connected systems to put the Airport on the top, to clearing out my list of remembered SSIDs, to turning the AirPort off and back on again, to downloading the TimeMachine upgrade and installing it, even to blowing out the PRAM on boot. Nothing doing. Tomorrow we make a trip to the Apple Genius Bar to see what those geniuses have to say, but I'm not optimistic. I will update this blog and apologize profusely if I'm wrong, of course, but given the number of unsuccessful support calls that people are lamenting, I'm guessing this will be one of those "Well, if you want to ship it back to the factory, sir, ...." responses, which is NOT an option. Well... OK, it is an option, given that I do most everything in VMWare images, sure, but the thought of going back to my T42p (with only 1.5GB of RAM on it, compared to the full 4GB on my MBP) is not endearing to me, particularly because Vista has a problem with releasing the USB hard drives that I store most of the VMWare images on.... Somebody please tell me they have an easy fix for this, one which Googling has not yet revealed.... Update: So I took my MBP into the Apple Store... and, naturally, the wireless on the MBP picks up the Apple Store's network just fine. Grrr. Regardless, I had them do an "archive and install" of 10.5.1 onto the machine, and when I got home... perfect! Sees and connects to my home wireless without a hitch. So I'd suggest for those who recently moved up to 10.5.2, try dropping back to 10.5.1 and see if that solves the problem. Meanwhile, I'll be holding off of doing the 10.5.2 update for a while, I think. Of course, that also means I can't do the iPhone SDK, I think, so I may try the update once more just to see if it'll take this time, and if it doesn't, then off to the Apple Store again for the 10.5.1 re-install again. But at least this time, I'd know what the viable solution is. (I hope....)
Mac OS | VMWare
Sunday, March 30, 2008 5:15:52 AM (Pacific Daylight Time, UTC-07:00)
|
|
 Saturday, March 29, 2008
|
The torrent has begun...
|
|
Not the BitTorrent of some particular movie or game, but the torrent of changes to the JDK that were held up pending a final blessing on the OpenJDK Mercurial transition. How do I, a non-Sun employee know this? Because I'm subscribed to the build-dev mailing list (which seems to be getting the Mercurial changeset notification emails), and on Wednesday (March 26th), one such email contained 72 new changesets, ranging from extensions to the query API for JMX 2.0: 6602310: Extensions to Query API for JMX 2.0 6604768: IN queries require their arguments to be constants Summary: New JMX query language and support for dotted attributes in queries. to bug fixes in javaw.exe for the Windows JRE: 6596475: (launcher) javaw should call InitCommonControls Summary: javaw does not show error window after manifest changes. to some changes to the Process class to better allow for IO redirection: 4960438: (process) Need IO redirection API for subprocesses and more beyond that. I have to say, I'm getting a little giddy watching all these things flow into the JDK--it's been a while since I just sat and watched the build notification messages on a large project like this, and it always gives me this weird sense of accomplishment, even though it's not work that I'm doing or arguably even care about. But it should stand as a clear sign to anybody who think Java-the-platform is "done"--the guys at Sun certainly don't think so, and more importantly, are putting in the effort to improve it. Except now, we can see the work being done, which makes all the difference in the world. Some of you may remember that on several speaker panels I was on, I was a bit bullish (on the surface of things) about the OpenJDK process. I think my exact comments were, "I think for the majority of Java developers, this is a 'No Big Deal, Move Along, Nothing to See Here' kind of step." I still believe that, in fact: I believe that to the vast majority of Java developers, the fact that anybody can now see the sausage being made yields no real advantage, and therefore is of no real interest. But to the handful of Java developers who refuse to see the JVM or the Java libraries (or even the Java compiler) as a black box, this is huge. We can now not only post the bugs that we run across during development, and more importantly, subscribe to the mailing lists, watch for the bug fix notification, apply the Mercurial changeset that patches the bug, and if the patch doesn't work, notify Sun. But if the patch does work, not only can we confirm the bug's elimination, but we can move beyond it, even before the production release of the next Java build. It may not be something you do on a regular basis, but when you're completely blocked waiting for a bug fix from Sun... ... that's huge.
Java/J2EE | Languages
Saturday, March 29, 2008 2:17:58 AM (Pacific Daylight Time, UTC-07:00)
|
|
 Friday, March 28, 2008
|
Hangin' in Vegas
|
|
I hate Las Vegas. I'm here for TheServerSide Java Symposium 2008, which has been held here in Vegas for the past (umm... three? four? five?) years, and every time I come here I'm reminded why I really don't like Vegas. It's loud, both in auditory volume and visual noise, it's boisterous bordering on raunchy, the locals are almost always soured by their near-constant exposure to tourists, the tourists are... well, they're American tourists and that says a lot right there, and there's no way to escape it. Ugh. Fortunately for me, the hotels have conveniently painted a nice blue sky on the roof (in the Venetian, where the conference is held) so I don't have to go outside to see if it's sunny, they provided a nice winding river of bright neon blue water/Windex to have our leisurely cafe lunch next to, and no fake recreation of Venice would be complete without fake gondolas poled by fake gondoleers singing to tourists on the fake Windex river that's all of about two minutes in ride length before they have to do a U-turn and pole back the other way. Wow, it's all so magical. About the only thing that makes Vegas palatable is some of the shows you can catch here, like one of Cirque du Soleil's six (!) different presentations going on here. But, of course, you must be careful when you buy tickets, or the guy at the concierge desk will start finding tickets for you, only to discover later that he thought you said "Tah", meaning "Tom Jones", when you said, "Ka", the Cirque du Soleil show, because my California accent is too thick to be understood. I hate Las Vegas. The upshot is that when I'm here for this show, I get to hang out with some cool people, NFJS speaker alum and otherwise. Brian Sletten and I did a tag-team talk on SOAP and REST that was billed to be controversial but probably disappointed the crowd in that we didn't (a) throw any punches at one another, (b) didn't really proclaim a "victor" between the two, and (c) laid down some basic rules for when to look to a RESTful approach and when to take advantage of the existing SOAP-based infrastructure that is currently SOAP's greatest strength. Note to those who didn't attend the session: you didn't hear me say it, so I'll repeat it: I hate WSDL almost as much as I hate Las Vegas. Ask me why sometime, or if I get enough of a critical mass of questions, I'll blog it. If you've seen me do talks on Web Services, though, you've probably heard the rant: WSDL creates tightly-coupled endpoints precisely where loose coupling is necessary, WSDL encourages schema definitions that are inflexible and unevolvable, and WSDL intrinsically assumes a synchronous client-server invocation model that doesn't really meet the scalability or feature needs of the modern enterprise. And that's just for starters. I hate WSDL. I still hate Vegas more, though. Meanwhile, Glenn Vanderberg, NFJS speaker alum and current Chief Scientist over at Relevance, pulled me aside for a few minutes to show me how to build apps for the iPhone using the newly-released iPhone SDK (something I'd asked about once before and that's been exploring recently). We basked in the glory that is Objective-C (now there's a language that should have gotten more traction than it did, IMHO), and then in the glory that is the iPhone (OpenGL, OpenSA, which I didn't know but Glenn tells me is basically like an audio-equivalent library for OpenGL), and then we swapped some ideas about what people might do with the iPhone now that the SDK is available. I've always been pretty bullish on the mobile device market, and I still am, but the iPhone might be the turning point in that space. I'll reserve judgment for now, and just enjoy hacking on my own for the time being.  Neal Ford did the Wednesday morning keynote, and I got the chance to present "Why the Next Five Years Will Be About Languages" after lunch today, which seemed to go over well, at least based on what the attendees who came up to me afterwards were saying. (Of course, that's always a biased assessment, since the ones who hate it are hardly likely to come up and tell me that, so I always take that statistic with a grain of salt.) They videoed it, so I imagine it'll be online before long. Of course, TSS wouldn't be TSS without speaker panels saying really controversial things... but I wouldn't know about them this year, I wasn't on any. (Perhaps the conference organizers finally took everybody's advice...) Tomorrow (well, actually, today as I write this, since I'm up way too late as usual) I'll be doing a talk on Scala, having dinner with a few friends, then off to McCarron airport and home. I don't think the Scala talk will be taped, but you can catch me doing much the same stuff (well, as much as it ever is the same stuff when I speak, since I mostly make everything up on the fly anyway) at the NFJS symposium near you, so you don't have to come to Vegas to hear about it in between ducking packs of drunk twenty-something guys chasing packs of drunk twenty-something girls all the while dodging the attentions of finger-snapping sidewalk vultures handing out glossy business cards saying "Girls Direct to You". I so hate Vegas. Update: Hah, that'll teach me to blog that before the conference is over--Eugene and Joe drafted me into the final panel session of the conference, on "Cross-Cutting Concerns, a Multi-Disciplinary Approach to Java", which none of us--including our emcees--had any idea was supposed to be included in such a discussion. Glenn Vanderberg and Patrick Linskey then sought to take a vote and change the topic of the panel to "Shearing off Ted's Ponytail". Fortunately a kind attendee asked a question and we moved on, ponytail intact. Of course, given that this was Vegas, I probably could have gotten Carrot Top to do it and made some money on the deal.
|
|
Rules for Review
|
|
Apparently, I'm drawing enough of an audience through this blog that various folks have started to send me press releases and notifications and requests for... well, I dunno exactly, but I'm assuming some blogging love of some kind. I'm always a little leery about that particular subject, because it always has this dangerous potential to turn the blog into a less-credible marketing device, but people at conferences have suggested that they really are interested in what I think about various products and tools, so perhaps it's time to amend my stance on this. With that in mind, if you are a vendor and have a product that you'd like me to take a look at and (possibly) offer up a review here, here's the basic rules: - No guarantees. Sending me something will in no way guarantee that I will review your product, for several reasons, two of which being (a) I get really busy sometimes, and (b) I may have no interest whatsoever in your product and I refuse to pretend to do so. (Readers can usually tell when the reviewer isn't all that excited about the subject, I've found.)
- If you're not going to send me a "real" version (meaning not the time-locked or feature-crippled demo), don't bother. I have no idea when I will get around to a review, and I have no desire to review something that isn't "the real deal". I will in turn promise that the licensed version you send me (if necessary) will not be used for any purpose other than my own research and exploration (signing contract if necessary to give you that "fresh-from-the-lawyer's-office" warm and fuzzy feeling).
- I say what I think, pro and con. I will not edit my review to suit your marketing purpose, and if you ask me to do so I will simply note in the review that you have asked me to do so. I retain full editorial control over what I say about your product.
- Having established #1, I will try to be as fair as I can about your product, and point out things that I liked and things that I didn't. (Of course, if I hated it from top to bottom, I may end up with the only positive thing being "It didn't set the atmosphere on fire when I started the app", but hey, that's something positive, right?)
- Also in the spirit of #1, if you send me mail answering questions or complaints in my review, I will of course amend the review with your comments. You are always welcome to post comments to the blog entry itself, too. Unless you insult my grandmother, then I will have to get all DELETE-key on you.
The reason I'm posting this here is twofold: one, so my faithful audience of four blog readers will know the rules under which I'm looking at these products and (hopefully) realize that I'm not financially vested in any of these products, and two, so the various vendor folks can read this and know what the rules are up front before even asking. I know it sounds a little cheeky to lay this out. The image I get in my head is that of the kid at Christmas declaring to his grandparents as they walk through the door, presents in hand, "Make sure it's not a scratchy sweater, I hate scratchy sweaters. And G.I. Joe was only popular when my Dad was a kid. And if you give me another lunchbox I will scream until you buy me something cool, like a new GameBoy." Ugh. But I value the trust that people seem to have in me, and so I risk the perception of cheekiness for this tiny window in time in order to (hopefully) establish full disclosure over the reviews that come to pass (which, by the way, will always have the category "review" applied to them, so you know which is an official review and which is just me exploring, like the LLVM and Parrot posts of recent time). We now return you to the regularly-scheduled blog.
|
 Saturday, March 22, 2008
|
Reminder
|
|
A couple of people have asked me over the last few weeks, so it's probably worth saying out loud: No, I don't work for a large company, so yes, I'm available for consulting and research projects. If you've got one of those burning questions like, "How would our company/project/department/whatever make use of JRuby-and-Rails, and what would the impact to the rest of the system be", or "Could using F# help us write applications faster", or "How would we best integrate Groovy into our application", or "How does the new Adobe Flex/AIR move help us build richer client apps", or "How do we improve the performance of our Java/.NET app", or other questions along those lines, drop me a line and let's talk. Not only will I cook up a prototype describing the answer, but I'll meet with your management and explain the consequences of the research, both pro and con, for them to evaluate. Shameless call for consulting complete, now back to the regularly-scheduled programming.
|
 Thursday, March 20, 2008
|
Eclipse gets some help... building Windows apps... from Microsoft?
|
|
This delicious little tidbit just crossed my desk, and for those of you too scared to click the link, check this out: Microsoft will begin collaborating with the Eclipse Foundation to improve native Windows application development on Java. Sam Ramji, the director of Microsoft's open-source software lab, announced at the EclipseCon conference in Santa Clara, Calif., on Wednesday that the lab will work with Eclipse . The goal of the joint work, which will include contributions from Microsoft engineers, is to make it easier to use Java to write applications that take full advantage of the look and feel of Windows Vista. Ramji wrote about the planned collaboration on Microsoft's Port25 blog. "Among a range of other opportunities (which we're still working on), we discovered that Steve Northover (the SWT team lead) had gotten requests to make it easy for Java developers to write applications that look and feel like native Windows Vista. He and a small group of developers built out a prototype that enables SWT to use Windows Presentation Foundation (WPF). We're committing to improve this technology with direct support from our engineering teams and the Open Source Software Lab, with the goal of a first-class authoring experience for Java developers," he wrote. The move builds on several initiatives coming from Microsoft's open-source software labs to ensure that open-source products work well on Windows and other Microsoft products. My first reaction has to be characterized as... WTF?!? My second reaction has to be characterized as... WTF?!? There's some serious credulity issues here. Not credibility, mind you, because I believe the reporter is entirely accurate in this story, but credulity. As in, "That's incredulous!", which is another way of saying... WTF?!? First, it's not been that long ago since Microsoft and Java were actively trying to beat one another into something vaguely resembling... well, resembling either a strawberry after that oh-so-ill-advised blind date with a blind steamroller, or else the end-product of the local butcher's sausage grinder. I seriously doubt anybody's memory has lapsed so deeply that they forget the rather nasty shooting war that erupted over J++ and Microsoft's Application Foundation Classes. (For those of you who weren't writing Java code at the time, AFC was Microsoft's variation on AWT designed to make it easier to write native Windows apps, making heavy use of a language/library construct that was an extension to Java, known as "delegates". Yes, those same delegates as what appeared in C# a few years later, and those same delegates that became the core implementation behind C# 2.0's asynchronous methods and C# 3.0's lambda expressions, and arguably the same delegates that everybody is looking to incorporate into the Java language today. Funny how things turn out, no?) Second, Microsoft partnering with IBM (yes, I know, the news piece says Eclipse, but who runs most of the Eclipse projects? IBM is to Eclipse what Sun is to the JCP, folks) to do this is just not going to make the whole IBM-Sun rift any smoother, or calm the turbulent waters in the Java ecosystem any further. Granted, SWT, is the logical place for Microsoft to go when trying to make it easier for Java devs to write Windows apps (which, by the way, was always a core principle behind the design and implementation of the CLR, which is why the CLR has such a powerful and simple P/Invoke and COMInterop story), but the last thing Microsoft wants at this point, it would seem to me, is more controversy around it and Java. After all, how hard would it be for Sun to haul them into court again, claiming that this somehow violate's the Microsoft/Sun peace agreement of a few years ago? And while I applaud the fact that Microsoft is looking for ways to contribute to the open-source space, it just seems to me that there were a lot of other places they could have gone to start doing so without incurring this kind of reaction. Go write a standard Perl implementation, for example, or, even better, do a "Visual Lisp" and integrate it on top of the CLR, if you want to make a mark in the open-source world. There's thousands of places the gathering-steam Microsoft open-source direction could have gone, with far greater success for both the open-source community and Microsoft. The skin here is just too sensitive and the past wounds just too raw for this company to go rubbing elbows up against this space again. Oh, just as a footnote, in case you're looking for more reasons to dislike the JBoss guys.... "It just makes sense to enable Java on Windows. We started a collaborative effort with JBoss two years ago that continues to this day. At the end of the day, it's all about the developer," Ramji said. See? They sold out a long time ago! *grin*
|
|
Cirque du Soleil for geeks
|
|
Watch this guy beat calculators, doing two-, three- and then four-digit squares in his head. Have a look if you ever thought you were good at doing numbers in your head. Have a look even if you're of the opposite extreme. (I'm sure there's some other tricks in his head he's using to be able to do this, but the net effect is still impressive, regardless.)
Thursday, March 20, 2008 12:10:12 AM (Pacific Daylight Time, UTC-07:00)
|
|
 Saturday, March 15, 2008
|
The reason for conferences
|
|
People have sometimes asked me if it's really worth it to go to a conference these days, given that so much material is appearing online via blogs, webcasts, online publications and Google. I think the answer is an unqualified "yes" (what else would you expect from a guy who spends a significant part of his life speaking at conferences?), but not necessarily for the reasons you might think. A long time ago, Billy Hollis said something very profound to me: "Newbies go to conferences for the technical sessions. Seasoned veterans go to conferences for the people." At the time, I thought this was Billy's way of saying that the sessions really weren't "all that" at most conferences (JavaOne and TechEd come to mind, for example--whatever scheduling gods that think project managers on a particular project make good technical speakers on that subject really needs to be taken out back and shot), and that you're far better off spending the time networking to improve your social network. Now I think it's for a different reason. By way of explanation, allow me to recount a brief travel anecdote. I spend a lot of time on airplanes, as you might expect. Periodically, while staring out the window (trying to rearrange words in my head in order to make them sound coherent for the current email, blog entry, book chapter or article), I will see another commercial aircraft traveling in the same air traffic control lane going the other way. Every time I see it, I'm simply floored at how fast they appear to be going--they usually don't stay within my visibility for more than a few seconds. "Whoosh" is the first thought that goes through my easily-amused consciousness, and then, "Damn, they're really moving." Then I realize, "Wow--somebody on that plane over there is probably looking at this plane I'm on, and thinking the exact same thing." This is why you go to conferences. In the agile communities, they talk about velocity, the amount of work a team can get done in a particular iteration. But I think teams need to have a sense of their velocity relative to the rest of the industry, too. It helps put things into perspective. All too often, I find teams that look at me in meetings and conference calls and say, "Surely the rest of the industry isn't this bad, right?" or "Surely, somebody else has found a solution to this problem by now, right?" or "Please, dear God, tell me this kind of WTF-style of project management is unique to my company". While I am certainly happy to answer those questions, the fact of the matter is, at the end of the day they're still left taking my word for it, and let's be blunt: my answer can really only cover those companies and/or project teams I've had direct contact with. I can certainly tell you what I've heard from others (usually at conferences), but even that's now getting into secondhand information, which to you will be third-hand. (And that, of course, assumes I'm getting it from the source in the first place.) This isn't just about project management styles--agile or waterfall or WHISCEY (Why the Hell Isn't Somebody Coding Everything Yet) or what-have-you--but also about technical trends. Is Ruby taking off? Is Scala becoming more mainstream? Is JRuby worth exploring? Is C++ making a comeback? What are peoples' experiences with Spring 2.5? Has Grails reached a solid level of performance and/or stability? Sure, I'm happy to come to your company, meet with your team, talk about what I've seen and heard and done--but sending your developers (and managers, though *ahem* preferably to conferences that aren't in Las Vegas) to a conference like No Fluff Just Stuff or JavaOne or TechEd or SD West can give them some opportunities to swap stories and gain some important insights about your team's (and company's) velocity relative to the rest of the industry. All of which is to say, yes, Billy was right: it's about the people. Which means, boss, it's OK to let the developers go to the parties and maybe sleep in and miss a session or two the next morning.
|
|
Mort means productivity
|
|
Recently, a number of folks in the Java space have taken to openly ridiculing Microsoft's use of the "Mort" persona, latching on to the idea that Mort is somehow equivalent to "Visual Basic programmer", which is itself somehow equivalent to "stupid idiot programmer who doesn't understand what's going on and just clicks through the wizards". This would be a mischaracterization, one which I think Nikhilik's definition helps to clear up: Mort, the opportunistic developer, likes to create quick-working solutions for immediate problems and focuses on productivity and learn as needed. Elvis, the pragmatic programmer, likes to create long-lasting solutions addressing the problem domain, and learn while working on the solution. Einstein, the paranoid programmer, likes to create the most efficient solution to a given problem, and typically learn in advance before working on the solution. .... The description above is only rough summarization of several characteristics collected and documented by our usability folks. During the meeting a program manager on our team applied these personas in the context of server controls rather well (I think), and thought I should share it. Mort would be a developer most comfortable and satisfied if the control could be used as-is and it just worked. Elvis would like to able to customize the control to get the desired behavior through properties and code, or be willing to wire up multiple controls together. Einstein would love to be able to deeply understand the control implementation, and want to be able to extend it to give it different behavior, or go so far as to re-implement it. Phrased this way, it's fairly easy to recognize that it's possible that these are more roles than actual categorizations for programmers as individuals--sometimes you want to know how to create the most efficient solution, and sometimes you just want the damn thing (whatever it is) to get out of your way and let you move on. Don Box called this latter approach (which is a tendency on the part of all developers, not just the VB guys) the selective ignorance bit: none of us have the time or energy or intellectual capacity to know how everything works, so for certain topics, a programmer flips the selective ignorance bit and just learns enough to make it work. For myself, a lot of the hardware stuff sees that selective ignorance bit flipped on, as well as a lot of the frou-frou UI stuff like graphics and animation and what-not. Sure, I'm curious, and I'd love to know how it works, but frankly, that's way down the priority list. If trying to find a quick-working solution to a particular problem means labeling yourself as a Mort, then call me Mort. After all, raise your hand if you didn't watch a team of C++ guys argue for months over the most efficient way to create a reusable framework for an application that they ended up not shipping because they couldn't get the framework done in time to actually start work on the application as a whole....
|
 Tuesday, February 26, 2008
|
Lang.NET videos are now online
|
|
If you read the three days of Lang.NET posts I did last month and wondered "Man, I wish I could've seen...", fret no more. My personal favorites: Of course the other presentations are good, but each of these had a moment in them when I said, "Hmm...."
Tuesday, February 26, 2008 7:49:40 PM (Pacific Standard Time, UTC-08:00)
|
|
 Sunday, February 24, 2008
|
Apropos of nothing: Job trends
|
|
While tracking some of the links relating to the Groovy/Ruby war, I found this website, which purportedly tracks job trends based on a whole mess of different job sites. So, naturally, I had to plug in to get a graph of C#, C++, Java, Ruby, and VB: Interesting. I don't think it proves anything one way or another, mind you, but interesting nonetheless. Having said that, a few things stand out to me after looking at this for all of thirty seconds: - Wow, what the hell happened in 1Q and 2Q of 2005? Java takes a huge drop in 2005, and all of them take a small drop of some form around the same time in 2006. What is it with summertime? Did the HR supervisor suddenly take a look at the company's job board and mutter, "Damn, I thought we closed all those listings already..."? (Or maybe, "Thank God for cheap college interns..."?)
- C++ jobs still outnumber C# jobs, even in 4Q 2007?
- C++ jobs remain essentially flat from 1Q 2005 to 4Q 2007; apparently, there's a lot more C++ going on than most companies are willing to admit to.... (Can't you picture it? The nervous candidate, sitting at the table, as the interviewer shuffles the paper and says, "So, you're here for a programming job?" The candidate sort of squirms in his chair as he replies, "Well, actually, I was hoping for a... a... C++ job." The interviewer quickly looks around to see who might be listening as he says loudly, "C++? What ever gave you the idea that we do C++ here at BigCorp?" Meanwhile, he surreptitiously scribbles on the back of a business card and slides it across the table to the candidate, then stands up and says loudly, "I'm afraid you've come to the wrong place, sir. You can see yourself out, I take it?" The candidate palms the card, and only once has he left the building does he look at the back, which reads, "8PM, corner of Mission and Vine, password is 'Lippman, Stroustrup, Sutter, and Meyers!' Viva C++!"...)
- VB jobs fall to below C#? So much for those vast hordes of VB programmers that supposedly form the "long tail" of the .NET community....
- Java jobs remain essentially flat from 1Q 2005 to 4Q 2007, despite numerous ups and downs. So much for the idea that Java is somehow going away....
- Ruby's penetration into the job market is much smaller than what I would have guessed.
- I couldn't help myself, I did another query with "cobol" added in, but I'll leave it to you to run your own query to see what that looks like. It's surprising....
Of course, statistics without any sort of understanding of how they were gathered or from what sources are essentially meaningless, but ooooh, it's in color....
|
|
Some interesting tidbits about LLVM
|
|
LLVM definitely does some interesting things as part of its toolchain. Consider the humble HelloWorld: 1: #include <stdio.h> 2: 3: int main() { 4: printf("hello world\n"); 5: return 0; 6: }
Assuming you have a functioning llvm and llvm-gcc working on your system, you can compile it into LLVM bitcode. This bitcode is directly executable using the lli.exe from llvm:
$ lli < hello.bc hello world
Meh. Not so interesting. Let's look at the LLVM bitcode for the code, though--that's interesting as a first peek at what LLVM bitcode might look like:
1: ; ModuleID = '<stdin>' 2: target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64" 3: target triple = "mingw32" 4: @.str = internal constant [12 x i8] c"hello world\00" ; <[12 x i8]*> [#uses=1] 5: 6: define i32 @main() { 7: entry: 8: %tmp2 = tail call i32 @puts( i8* getelementptr ([12 x i8]* @.str, i32 0, i32 0) ) ; <i32> [#uses=0] 9: ret i32 0 10: } 11: 12: declare i32 @puts(i8*)
Hmm. Now of course, LLVM also has to be able to get down to actual machine instructions, and in point of fact there is a tool in the LLVM toolchain, called llc, that can do this transformation ahead-of-time, like so:
$ llc hello.bc -o hello.bc.s -march x86
And, looking at the results, we see...
1: .text 2: .align 16 3: .globl _main 4: .def _main; .scl 2; .type 32; .endef 5: n: 6: pushl %ebp 7: movl %esp, %ebp 8: subl $8, %esp 9: andl $4294967280, %esp 10: movl $16, %eax 11: call __alloca 12: call ___main 13: movl $_.str, (%esp) 14: call _puts 15: xorl %eax, %eax 16: movl %ebp, %esp 17: popl %ebp 18: ret 19: .data 20: r: # .str 21: .asciz "hello world" 22: .def _puts; .scl 2; .type 32; .endef
Bleah. Assembly language, and in NASM format, to boot. (What did you expect, anyway?)
Of course, assembly language and C were always considered fairly close together in terms of their abstraction layer (C was designed as a replacement for assembly language when porting Unix, remember), so it might not be too hard to...
$ llc hello.bc -o hello.bc.c -march c
And get...
1: /* Provide Declarations */ 2: #include <stdarg.h> 3: #include <setjmp.h> 4: /* get a declaration for alloca */ 5: #if defined(__CYGWIN__) || defined(__MINGW32__) 6: #define alloca(x) __builtin_alloca((x)) 7: #define _alloca(x) __builtin_alloca((x)) 8: #elif defined(__APPLE__) 9: extern void *__builtin_alloca(unsigned long); 10: #define alloca(x) __builtin_alloca(x) 11: #define longjmp _longjmp 12: #define setjmp _setjmp 13: #elif defined(__sun__) 14: #if defined(__sparcv9) 15: extern void *__builtin_alloca(unsigned long); 16: #else 17: extern void *__builtin_alloca(unsigned int); 18: #endif 19: #define alloca(x) __builtin_alloca(x) 20: #elif defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) 21: #define alloca(x) __builtin_alloca(x) 22: #elif defined(_MSC_VER) 23: #define inline _inline 24: #define alloca(x) _alloca(x) 25: #else 26: #include <alloca.h> 27: #endif 28: 29: #ifndef __GNUC__ /* Can only support "linkonce" vars with GCC */ 30: #define __attribute__(X) 31: #endif 32: 33: #if defined(__GNUC__) && defined(__APPLE_CC__) 34: #define __EXTERNAL_WEAK__ __attribute__((weak_import)) 35: #elif defined(__GNUC__) 36: #define __EXTERNAL_WEAK__ __attribute__((weak)) 37: #else 38: #define __EXTERNAL_WEAK__ 39: #endif 40: 41: #if defined(__GNUC__) && defined(__APPLE_CC__) 42: #define __ATTRIBUTE_WEAK__ 43: #elif defined(__GNUC__) 44: #define __ATTRIBUTE_WEAK__ __attribute__((weak)) 45: #else 46: #define __ATTRIBUTE_WEAK__ 47: #endif 48: 49: #if defined(__GNUC__) 50: #define __HIDDEN__ __attribute__((visibility("hidden"))) 51: #endif 52: 53: #ifdef __GNUC__ 54: #define LLVM_NAN(NanStr) __builtin_nan(NanStr) /* Double */ 55: #define LLVM_NANF(NanStr) __builtin_nanf(NanStr) /* Float */ 56: #define LLVM_NANS(NanStr) __builtin_nans(NanStr) /* Double */ 57: #define LLVM_NANSF(NanStr) __builtin_nansf(NanStr) /* Float */ 58: #define LLVM_INF __builtin_inf() /* Double */ 59: #define LLVM_INFF __builtin_inff() /* Float */ 60: #define LLVM_PREFETCH(addr,rw,locality) __builtin_prefetch(addr,rw,locality) 61: #define __ATTRIBUTE_CTOR__ __attribute__((constructor)) 62: #define __ATTRIBUTE_DTOR__ __attribute__((destructor)) 63: #define LLVM_ASM __asm__ 64: #else 65: #define LLVM_NAN(NanStr) ((double)0.0) /* Double */ 66: #define LLVM_NANF(NanStr) 0.0F /* Float */ 67: #define LLVM_NANS(NanStr) ((double)0.0) /* Double */ 68: #define LLVM_NANSF(NanStr) 0.0F /* Float */ 69: #define LLVM_INF ((double)0.0) /* Double */ 70: #define LLVM_INFF 0.0F /* Float */ 71: #define LLVM_PREFETCH(addr,rw,locality) /* PREFETCH */ 72: #define __ATTRIBUTE_CTOR__ 73: #define __ATTRIBUTE_DTOR__ 74: #define LLVM_ASM(X) 75: #endif 76: 77: #if __GNUC__ < 4 /* Old GCC's, or compilers not GCC */ 78: #define __builtin_stack_save() 0 /* not implemented */ 79: #define __builtin_stack_restore(X) /* noop */ 80: #endif 81: 82: #define CODE_FOR_MAIN() /* Any target-specific code for main()*/ 83: 84: #ifndef __cplusplus 85: typedef unsigned char bool; 86: #endif 87: 88: 89: /* Support for floating point constants */ 90: typedef unsigned long long ConstantDoubleTy; 91: typedef unsigned int ConstantFloatTy; 92: typedef struct { unsigned long long f1; unsigned short f2; unsigned short pad[3]; } ConstantFP80Ty; 93: typedef struct { unsigned long long f1; unsigned long long f2; } ConstantFP128Ty; 94: 95: 96: /* Global Declarations */ 97: /* Helper union for bitcasts */ 98: typedef union { 99: unsigned int Int32; 100: unsigned long long Int64; 101: float Float; 102: double Double; 103: } llvmBitCastUnion; 104: 105: /* External Global Variable Declarations */ 106: 107: /* Function Declarations */ 108: double fmod(double, double); 109: float fmodf(float, float); 110: long double fmodl(long double, long double); 111: unsigned int main(void); 112: unsigned int puts(unsigned char *); 113: unsigned char *malloc(); 114: void free(unsigned char *); 115: void abort(void); 116: 117: 118: /* Global Variable Declarations */ 119: static unsigned char _2E_str[12]; 120: 121: 122: /* Global Variable Definitions and Initialization */ 123: static unsigned char _2E_str[12] = "hello world"; 124: 125: 126: /* Function Bodies */ 127: static inline int llvm_fcmp_ord(double X, double Y) { return X == X && Y == Y; } 128: static inline int llvm_fcmp_uno(double X, double Y) { return X != X || Y != Y; } 129: static inline int llvm_fcmp_ueq(double X, double Y) { return X == Y || llvm_fcmp_uno(X, Y); } 130: static inline int llvm_fcmp_une(double X, double Y) { return X != Y; } 131: static inline int llvm_fcmp_ult(double X, double Y) { return X < Y || llvm_fcmp_uno(X, Y); } 132: static inline int llvm_fcmp_ugt(double X, double Y) { return X > Y || llvm_fcmp_uno(X, Y); } 133: static inline int llvm_fcmp_ule(double X, double Y) { return X <= Y || llvm_fcmp_uno(X, Y); } 134: static inline int llvm_fcmp_uge(double X, double Y) { return X >= Y || llvm_fcmp_uno(X, Y); } 135: static inline int llvm_fcmp_oeq(double X, double Y) { return X == Y ; } 136: static inline int llvm_fcmp_one(double X, double Y) { return X != Y && llvm_fcmp_ord(X, Y); } 137: static inline int llvm_fcmp_olt(double X, double Y) { return X < Y ; } 138: static inline int llvm_fcmp_ogt(double X, double Y) { return X > Y ; } 139: static inline int llvm_fcmp_ole(double X, double Y) { return X <= Y ; } 140: static inline int llvm_fcmp_oge(double X, double Y) { return X >= Y ; } 141: 142: unsigned int main(void) { 143: unsigned int llvm_cbe_tmp2; 144: 145: CODE_FOR_MAIN(); 146: llvm_cbe_tmp2 = /*tail*/ puts((&(_2E_str[((signed int )((unsigned int )0))]))); 147: return ((unsigned int )0); 148: }
Granted, it's some ugly-looking C code, with all those preprocessor fragments floating around in there, but if you take a few moments and go down to the main() definition, it's C to bitcode to C. We've come full circle.
Looking back at that first disassembly dump, I'm struck by how LLVM bitcode looks a lot like any other high-level assembly or low-level virtual machine language, even reminiscent of MSIL. In fact, there's probably a pretty close correlation between LLVM bitcode and MSIL.
In point of fact, LLVM knows this, too:
$ llc hello.bc -o hello.bc.il -march msil
And check out what it generates:
1: .assembly extern mscorlib {} 2: .assembly MSIL {} 3: 4: // External 5: .method static hidebysig pinvokeimpl("MSVCRT.DLL") 6: unsigned int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'puts'(void* ) preservesig {} 7: 8: .method static hidebysig pinvokeimpl("MSVCRT.DLL") 9: vararg void* modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'malloc'() preservesig {} 10: 11: .method static hidebysig pinvokeimpl("MSVCRT.DLL") 12: void modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'free'(void* ) preservesig {} 13: 14: .method public hidebysig static pinvokeimpl("KERNEL32.DLL" ansi winapi) native int LoadLibrary(string) preservesig {} 15: .method public hidebysig static pinvokeimpl("KERNEL32.DLL" ansi winapi) native int GetProcAddress(native int, string) preservesig {} 16: .method private static void* $MSIL_Import(string lib,string sym) 17: managed cil 18: { 19: ldarg lib 20: call native int LoadLibrary(string) 21: ldarg sym 22: call native int GetProcAddress(native int,string) 23: dup 24: brtrue L_01 25: ldstr "Can no import variable" 26: newobj instance void [mscorlib]System.Exception::.ctor(string) 27: throw 28: L_01: 29: ret 30: } 31: 32: .method static private void $MSIL_Init() managed cil 33: { 34: ret 35: } 36: 37: // Declarations 38: .class value explicit ansi sealed 'unsigned int8 [12]' { .pack 1 .size 12 } 39: 40: // Definitions 41: .field static private valuetype 'unsigned int8 [12]' '.str' at '.str$data' 42: .data '.str$data' = { 43: int8 (104), 44: int8 (101), 45: int8 (108), 46: int8 (108), 47: int8 (111), 48: int8 (32), 49: int8 (119), 50: int8 (111), 51: int8 (114), 52: int8 (108), 53: int8 (100), 54: int8 (0) [1] 55: } 56: 57: // Startup code 58: .method static public int32 $MSIL_Startup() { 59: .entrypoint 60: .locals (native int i) 61: .locals (native int argc) 62: .locals (native int ptr) 63: .locals (void* argv) 64: .locals (string[] args) 65: call string[] [mscorlib]System.Environment::GetCommandLineArgs() 66: dup 67: stloc args 68: ldlen 69: conv.i4 70: dup 71: stloc argc 72: ldc.i4 4 73: mul 74: localloc 75: stloc argv 76: ldc.i4.0 77: stloc i 78: L_01: 79: ldloc i 80: ldloc argc 81: ceq 82: brtrue L_02 83: ldloc args 84: ldloc i 85: ldelem.ref 86: call native int [mscorlib]System.Runtime.InteropServices.Marshal::StringToHGlobalAnsi(string) 87: stloc ptr 88: ldloc argv 89: ldloc i 90: ldc.i4 4 91: mul 92: add 93: ldloc ptr 94: stind.i 95: ldloc i 96: ldc.i4.1 97: add 98: stloc i 99: br L_01 100: L_02: 101: call void $MSIL_Init() 102: call unsigned int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) main() 103: conv.i4 104: ret 105: } 106: 107: .method static public unsigned int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'main' 108: () cil managed 109: { 110: .locals (unsigned int32 'ltmp_0_1') 111: .maxstack 16 112: ltmp_1_2: 113: 114: // %tmp2 = tail call i32 @puts( i8* getelementptr ([12 x i8]* @.str, i32 0, i32 0) ) ; <i32> [#uses=0] 115: 116: ldsflda valuetype 'unsigned int8 [12]' '.str' 117: conv.u4 118: call unsigned int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'puts'(void* ) 119: stloc 'ltmp_0_1' 120: 121: // ret i32 0 122: 123: ldc.i4 0 124: ret 125: }
Holy frickin' crap. I think I'm in love.
.NET | C++ | Languages | LLVM
Sunday, February 24, 2008 5:00:17 AM (Pacific Standard Time, UTC-08:00)
|
|
|
Quotables
|
|
Some quotes I've found to be thought-provoking over the last week or so: "Some programming languages manage to absorb change, but withstand progress." "In a 5 year period we get one superb programming language. Only we can't control when the 5 year period will begin." "Every program has (at least) two purposes: the one for which it was written and another for which it wasn't." "If a listener nods his head when you're explaining your program, wake him up." "A language that doesn't affect the way you think about programming, is not worth knowing." "Wherever there is modularity there is the potential for misunderstanding: Hiding information implies a need to check communication." (All of the above, Alan Perlis) "Program testing can be used to show the presence of bugs, but never to show their absence!" "The competent programmer is fully aware of the limited size of his own skull. He therefore approaches his task with full humility, and avoids clever tricks like the plague." "How do we convince people that in programming simplicity and clarity —in short: what mathematicians call "elegance"— are not a dispensable luxury, but a crucial matter that decides between success and failure?" "Are you quite sure that all those bells and whistles, all those wonderful facilities of your so called powerful programming languages, belong to the solution set rather than the problem set?" "Object-oriented programming is an exceptionally bad idea which could only have originated in California." "The prisoner falls in love with his chains." "Write a paper promising salvation, make it a 'structured' something or a 'virtual' something, or 'abstract', 'distributed' or 'higher-order' or 'applicative' and you can almost be certain of having started a new cult." "I remember from those days two design principles that have served me well ever since, viz. - before really embarking on a sizable project, in particular before starting the large investment of coding, try to kill the project first, and
- start with the most difficult, most risky parts first."
(All of the above, Edsgar Dijkstra) Make of them what you will....
Languages | Reading
Sunday, February 24, 2008 3:16:52 AM (Pacific Standard Time, UTC-08:00)
|
|
 Saturday, February 23, 2008
|
Building LLVM on Windows using MinGW32
|
|
As I've mentioned in passing, one of the things I'm playing with in my spare time (or will play with, now that I've got everything working, I think) is the LLVM toolchain. In essence, it looks to be a parallel to Microsoft's Phoenix, except that it's out, it's been in use in production environments (Apple is a major contributor to the project and uses it pretty extensively, it seems), and it supports not only C/C++ and Objective-C, but also Ada and Fortran. It's also a useful back-end for people writing languages, hence my interest. One of the things that appeals about LLVM is that it uses an "intermediate representation" that in many ways reminds me of Phoenix's Low IR, though I'm sure there are significant differences that I'm not well-practiced enough to spot. Consider this bit of Fibonacci code, for example: 1: define i32 @fib(i32 %AnArg) { 2: EntryBlock: 3: %cond = icmp sle i32 %AnArg, 2 ; <i1> [#uses=1] 4: br i1 %cond, label %return, label %recurse 5: 6: return: ; preds = %EntryBlock 7: ret i32 1 8: 9: recurse: ; preds = %EntryBlock 10: %arg = sub i32 %AnArg, 1 ; <i32> [#uses=1] 11: %fibx1 = tail call i32 @fib( i32 %arg ) ; <i32> [#uses=1] 12: %arg1 = sub i32 %AnArg, 2 ; <i32> [#uses=1] 13: %fibx2 = tail call i32 @fib( i32 %arg1 ) ; <i32> [#uses=1] 14: %addresult = add i32 %fibx1, %fibx2 ; <i32> [#uses=1] 15: ret i32 %addresult 16: } 17: 18: declare void @abort()
It's rather interesting to imagine this as a direct by-product of that first pass off of the hypothetical Universal AST....
Getting this thing to build has been an exercise of patience, however.
The documentation on the website, while extensive, isn't very Windows-friendly. For example, there's a page that describes how to build it with Visual Studio, but it's a touch out-of-date. On top of that, it turns out that the VS/LLVM tools can't compile to LLVM bitcode, only execute it once it's in that format; you need "llvm-gcc" to compile to bitcode, which means you're left with a two-machine solution: a *nix box using llvm-gcc to compile the code, and then your Windows box to run it. Ugh.
Fortunately, Windows users have two choices for dealing with *nix solutions: Cygwin and MinGW32. The first tries to lay down a *nix-like layer on top of the Win32 APIs (meaning everything depends on cygwin1.dll once built), the second tries to provide an adapter layer such that when a *nix tool is done building, it has no dependencies beyond what you'd see from any other Win32 app. Debates rage about the validity of each, and rather than seem like I'm coming down in favor of one or the other, I'll simply note that I have both installed in my Languages VMWare image now, and leave it at that.
Building LLVM with MinGW was a bit more painful than I expected, however, so for a long time I just didn't bother. Last night that changed, thanks to Anton Korobeynikov, who spent the better part of three or four hours in back-and-forth email conversation with me, walking me patiently through the step-by-step of getting MinGW and msys up and running on my machine long enough to build the LLVM 2.2+ (meaning the tip beyond the current 2.2 release) code base. I can't thank him enough--both for the direct help in getting the MinGW bits up and in the right places as well as for the casual conversation about MinGW along the way--so I thought I'd replicate what we did on my box to the 'Net in an attempt to spare others the effort.
First, there's a pile of tarballs from the MinGW download page that require downloading and extracting:
- gcc-g++-3.4.5-20060117-1.tar.gz
- binutils-2.18.50-20080109.tar.gz
- mingw-runtime-3.14.tar.gz
- gcc-core-3.4.5-20060117-1.tar.gz
- w32api-3.11.tar.gz
Note that I also pulled down the other gcc- tarballs (gcj, objc and so on), just because I wanted to play with the MinGW versions of these tools. Extract all of these into a directory; on my system, that's C:/Prg/MinGW.
(There is a .exe installer on the Sourceforge page that supposedly manages all this for you, but it installed the binutils-2.17 package instead of 2.18, and I couldn't figure out how to get it to grab 2.18. All it does is download these packages and extract them, so going without it isn't a huge ordeal.)
By the way, if you're curious about experimenting with gcj as well (hey, it's a Java compiler that compiles to native code--that's interesting in its own right, if you ask me), take careful note that as it stands right now in the installation process, you can run gcj but can't compile Hello.java with it--it complains about a missing library, "iconv". This is a known bug, it seems, and the solution is to install libiconv from the GnuWin32 project--just extract the "bin" and "lib" packages into C:/Prg/MinGW.
At this point, you're done with C:/Prg/MinGW32.
Next, there's a couple of installers and additional tarballs that need downloading and extracting:
- MSYS-1.0.10.exe
- msysDTK-1.0.1.exe
- bash-3.1-MSYS-1.0.11-1.tar.bz2
- bison-2.3-MSYS-1.0.11.tar.bz2
- flex-2.5.33-MSYS-1.0.11.tar.bz2
- regex-0.12-MSYS-1.0.11.tar.bz2 (required by flex)
The first two just execute and install; on my system, that is C:/Prg/msys/1.0. The next one just extracts into the C:/Prg/msys/1.0 directory. The last three are a tad tricky, however--apparently they assume that everything should be installed into a top-level "usr" directory, and that's not quite where we want them; we want them. Apparently, we want them installed directly (so that "/usr/bin" from bison goes into "/bin" inside of "C:/Prg/msys/1.0"), so extract these to a temporary directory, then xcopy everything inside the temp/usr directory over to C:/Prg/msys/1.0. (That is, "cd temp", then "cd usr", then "xcopy /s/e * C:/Prg/msys/1.0".)
At this point, we're done with the setup--create a directory into which you want LLVM built (on my system, that's C:/Prg/LLVM/msys-build, where the source from SVN is held in C:/Prg/LLVM/llvm-svn), and execute the "configure" script in this directory (that is, "cd C:/Prg/LLVM/msys-build" and "../llvm-svn/configure"). The script will deposit a bunch of makefiles and directories into the build directory, after which a simple "make" suffices to build everything (in Debug; if you want Release, do "make ENABLE_OPTIMIZED=1", as per the LLVM documentation).
Thanks again, Anton! Now can you help me get llvm-gcc working? 
|
|
I love it when good accountanting girls go geek
|
|
Erik Mork, C++ and .NET programmer extraordinaire and bright guy in his own right, has subverted my sister-in-law to programming, and the pair of them are now opening the doors of their new company, Silver Bay Labs, with a series of podcasts on Silverlight and "sparkling clients" in general. Have a listen, if you're interested in the whole "rich client" thing....
.NET | Windows
Saturday, February 23, 2008 2:21:33 AM (Pacific Standard Time, UTC-08:00)
|
|
 Friday, February 22, 2008
|
URLs as first-class concepts in a language
|
|
While perusing the E Tutorial, I noticed something that was simple and powerful all at the same time: URLs as first-class concepts in the language. Or, if you will, URLs as a factory for creating objects. Check out this snippet of E: ? pragma.syntax("0.8")
? def poem := <http://www.erights.org/elang/intro/jabberwocky.txt>
# value: <http://www.erights.org/elang/intro/jabberwocky.txt>
? <file:c:/jabbertest>.mkdirs(null);
? <file:c:/jabbertest/jabberwocky.txt>.setText(poem.getText())
Notice how the initialization of the "poem" variable is set to what looks like an HTTP URL? This essentially downloads the contents of that file and stores it into poem (in a form I don't precisely understand yet--I think it's an object that wraps the contents, but I could be wrong). Then the script uses file URLs to create the local directory (jabbertest) and to create a new file (jabberwocky.txt) and set the contents of that file to be the same as the contents of the stored "poem" object.
That, my friends, is just slick. It also neatly avoids the whole "how are files and directories and stuff different from URLs" that tends to make doing this same bit of code in Java or C# that much more difficult.
|
|
More language features revisited
|
|
Since we're examining various aspects of the canonical O-O language (the three principals being C++, Java and C#/VB.NET), let's take in a review of another recent post, this time on the use of "new" in said languages. All of us have probably written code like this: Foo f = new Foo(); And what could be simpler? As long as the logic in the constructor is simple (or better yet, the constructor is empty), it would seem that the simplest code is the best, so just use the constructor. Certainly the MSDN documentation is rife with code that uses public constructors. You can probably find plenty of public constructors used right here on my blog. Why invest the effort in writing (and using) a factory class that will probably never do anything useful, other than call a public constructor? In his excellent podcast entitled "Emergent Design: The Evolutionary Nature of Software Development," Scott Bain of Net Objectives nevertheless makes a strong case against the routine use of public constructors. The problem, notes Scott, is that the use of a public constructor ties the calling code to the implementation of Foo as a concrete class. But suppose that you later discover that there need to be many subtypes of Foo, and Foo should therefore be an abstract class instead of a concrete class--what then? You've got a big problem, that's what; a lot of client code that has been making use of Foo's public constructor suddenly becomes invalid. I just love it when people rediscover advice that they could have had much earlier, had they only been aware of the prior art in the field. I refer the curious C#/VB.NET developer to the book Effective Java, by Joshua Bloch, in which Item 1 states, "Consider providing static factory methods instead of constructors". Quoting from said book, we see: One advantage of static factory methods is that, unlike constructors, they have names. If the parameters to a constructor do not, in and of themselves, describe the object being returned, a static factory with a well-chosen name can make a class easier to user and the resulting client code easier to read. ... A second advantage of static factory methods is that, unlike constructors, they are not required to create a new object each time they're invoked. This allows immutable classes (Item 13) to use preconstructed instances or to cache instances as they're constructed and to dispense these instances repeatedly so as to avoid creating unnecessary duplicate values. ... A third advantage of static factory methods is that, unlike constructors, they can return an object of any subtype of their return type. This gives you great flexibility in choosing the class of the returned object. ... The main disadvantage of static factory methods is that classes without public or protected constructors cannot be subclassed. The same is true for nonpublic classes returned by public static factories. A second disadvantage of static factory methods is that they are not readily distinguishable from other static methods. They do not stand out in API documentation the way that constructors do. C# and VB.NET developers are encouraged to read the book to discover about 30 or so other nuggets of wisdom that are directly applicable to the .NET framework. Note that Josh is in the process, this very month, of revising the book for rerelease as a second edition, taking into account the wide variety of changes that have taken place in the Java language since EJ's initial release. Meanwhile.... One thing that's been nagging at me is how I think Java and C# missed the boat in respect to the various ways we'd like to construct objects. The presumption was always that allocation and initialization would (a) always take place at the same time, and (b) always take place in the same manner--the underlying system would allocate the memory, the object would be laid out in this newly-minted chunk of heap, and your constructor would then initialize the contents. Neither assumption can be taken to be true, as we've seen over the years; the object may need to come from pre-existing storage (a la the object cache), or the object may need to be a derived type (a la the covariant return Josh mentions in #3 advantage above), or in some cases you want to mint the object from an entirely different part of the process. C++ actually had an advantage over C# and Java here, in that you could overload operator new() for a class (which then meant you had to overload operator delete(), and oh-by-the-way don't forget to overload array new, that is, operator new[]() and its corresponding twin, array delete, operator delete[](), which was a bit of a pain) to gain better control over both allocation and initialization, to a degree. Initially we always used it to control allocation--the idea being one would create a class-specific allocator, on the grounds that knowing some of the assumptions of the class, such as its size, would allow you to write faster allocation routines for it. But one of the rarely-used features of operator new() was that it could take additional parameters, using a truly obscure syntactic corner of C++: 1: void* operator new(size_t s, const string& message) 2: { 3: cout << "Operator new sez " << message << endl; 4: // allocate s bytes and return; Foo ctor will be invoked automagically 5: } 6: Foo* newFoo = new ("Howdy, world!") Foo();
Officially, one such overloaded operator was recognized, the placement new operator, which took a void* as a parameter, indicating the exact location in which your object was to be allocated and thus laid down. This meant that C++ developers could allocate from some other part of the process (including shudder a pointer they'd made up out of thin air) and drop the initialized object right there. While useful in its own right, placement new opened up a whole new world of construction options to the C++ developer that we never really took advantage of, since now you could pass parameters to the construction process without involving the constructor.
That's kind of nifty, in an obscure and slightly terrifying fashion. One thought I'd always had was that it would be cool if a C++ O/R-M overloaded operator new() for database-bound objects to indicate which database connection to use during construction:
1: DBConnection conn; 2: 3: Person* newFoo = new (conn) Person("Ted", "Neward");
Of course, such syntax has the immediate drawback of eliciting a chorus of "WTF?!?" at the next code review, but still....
Meanwhile, other languages choose to view new as one of those nasty static methods Gilad dislikes so much, Ruby and Smalltalk being two of them. That is to say, construction now basically calls into a static method on a class, which has the nice effect of keeping the number of "special" parts of the language to a minimum (since now "new" is just a method, not a keyword), makes it easier to have different-yet-similar names to represent slightly different concepts ("create" vs "new" vs "fetch" vs "allocate", and so on) sitting side by side, and helps eliminate Josh's second disadvantage above. I'm not certain how exactly this could eliminate Josh's first disadvantage (that of inheritance and inaccessible constructors), but it's not entirely unimaginable that the language would have a certain amount of incestuous knowledge here to be able to reach those static method (constructors) in the same way it does currently.
(It actually works better if they aren't static methods at all, but instance methods on class objects, to which the language automatically defers when it sees a "classname.new"; that is, when it sees
Person ann = Person.new("Ann", "Sheriff");
the language automatically changes this to read:
Person ann = Person.class.new("Ann", "Sheriff");
which would be eminently doable in Java, were class objects available for modification/definition somehow. In a language built on top of the JVM or CLR, the class object would be a standalone singleton, a la "object" definitions in Scala.)
|
 Thursday, February 21, 2008
|
Static considered harmful?
|
|
Gilad makes the case that static, that staple of C++, C#/VB.NET, and Java, does not belong: Most imperative languages have some notion of static variable. This is unfortunate, since static variables have many disadvantages. I have argued against static state for quite a few years (at least since the dawn of the millennium), and in Newspeak, I’m finally able to eradicate it entirely. I think Gilad conflates a few things, but he's also got some good points. To the dissecting table! To begin: Static variables are bad for security. See the E literature for extensive discussion on this topic. The key idea is that static state represents an ambient capability to do things to your system, that may be taken advantage of by evildoers. Eh.... I'm not sure I buy into this. For evildoers to be able to change static state, they have to have some kind of "poke" access inside the innards of your application, and if they have that, then just about anything is vulnerable. Now, granted, I haven't spent a great deal of time on the E literature, so maybe I'm missing the point here, but if an attacker has data-manipulability into my program, then I'm in a whole world of pain, whether he's attacking statics or instances. Having said that, statics have to be stored in a particular well-known location inside the process, so maybe that makes them a touch more vulnerable. Still, this seems a specious argument. Static variables are bad for distribution. Static state needs to either be replicated and sync’ed across all nodes of a distributed system, or kept on a central node accessible by all others, or some compromise between the former and the latter. This is all difficult/expensive/unreliable. Now this one I buy into, but the issue isn't the "static"ness of the data, but the fact that it's effectively a Singleton, and Singletons in any distributed system are Evil. I talked a great deal about this in Effective Enterprise Java, so I'll leave that alone, but let me point out that any Singleton is evil, whether it's represented in a static, a Singleton object, a Newspeak module, or a database. The "static"ness here is a red herring. Static variables are bad for re-entrancy. Code that accesses such state is not re-entrant. It is all too easy to produce such code. Case in point: javac. Originally conceived as a batch compiler, javac had to undergo extensive reconstructive surgery to make it suitable for use in IDEs. A major problem was that one could not create multiple instances of the compiler to be used by different parts of an IDE, because javac had significant static state. In contrast, the code in a Newspeak module definition is always re-entrant, which makes it easy to deploy multiple versions of a module definition side-by-side, for example. Absolutely, but this is true for instance fields, too--any state that is modified as part of two or more method bodies is vulnerable to a re-entrancy concern, since now the field is visibly modified state to that particular instance. How deeply do you want your code to be re-entrant? Gilad's citation of the javac compiler points out that the compiler was hardly re-entrant at any reasonable level, but the fact is that the compiler *could* have been used in a parallelized fashion using the isolational properties of ClassLoaders. (Its ugly, and Java desperately needs Isolates for that reason.) Static variables are bad for memory management. This state has to be handed specially by implementations, complicating garbage collection. The woeful tale of class unloading in Java revolves around this problem. Early JVMs lost application’s static state when trying to unload classes. Even though the rules for class unloading were already implicit in the specification, I had to add a section to the JLS to state them explicitly, so overzealous implementors wouldn’t throw away static application state that was not entirely obvious. This one I can't really comment on, since I'm not in the habit of writing memory-management code. I'll take Gilad's word for it, though I'm curious to know why this is so, in more detail. Static variables are bad for for startup time. They encourage excess initialization up front. Not to mention the complexities that static initialization engenders: it can deadlock, applications can see uninitialized state, and unless you have a really smart runtime, you find it hard to compile efficiently (because you need to test if things are initialized on every use). I'm not sure I see how this is different for any startup/initialization code--anything that the user can specify as part of startup will run the risk of deadlocks and viewing uninitialized state. Consider the alternative, however--if the user didn't have the ability to specify startup code, then they would have to either write their own, post-runtime, startup code, or else they have to constantly check the state of their uninitialized objects and initialize them on first use, the very thing that he claims is hard to compile efficiently. Static variables are bad for for concurrency. Of course, any shared state is bad for concurrency, but static state is one more subtle time bomb that can catch you by surprise. Absolutely: any shared state is bad for concurrency. However, I think we need to go back to first principles here. Since any shared state is bad for concurrency, and since static data is always shared by definition, it follows that static data is bad for concurrency. Pay particular attention to that chain of reasoning, however: any shared state is bad for concurrency, whether it's held by the process in a special non-instance-aligned location or in an data store that happens to be reachable from multiple paths of control. This means that your average database table is also bad for concurrency, were it not for the transactional protections that surround the table. This isn't an indictment of static variables, per se, but of shared state. Gilad goes on to describe how Newspeak solves this problem of static: It may seem like you need static state, somewhere to start things off, but you don’t. You start off by creating an object, and you keep your state in that object and in objects it references. In Newspeak, those objects are modules. Newspeak isn’t the only language to eliminate static state. E has also done so, out of concern for security. And so has Scala, though its close cohabitation with Java means Scala’s purity is easily violated. The bottom line, though, should be clear. Static state will disappear from modern programming languages, and should be eliminated from modern programming practice. I wish Newspeak were available for widespread use, because I'd love to explore this concept further; in the CLR, for example, there is the same idea of "modules", in that modules are singleton entities in which methods and data can reside, at a higher level than individual objects themselves. Assemblies, for example, form modules, and this is where "global variables" and "global methods" exist (when supported by the compiling language in question). At the end of the day, though, these are just statics by another name, and face most, if not all, of the same problems Gilad lays out above. Scala "objects" have the same basic property. I think the larger issue here is that one should be careful where one stores state, period. Every piece of data has a corresponding scope of accessibility, and developers have grown complacent about considering that scope when putting data there: they consider the accessibility at the language level (public, private, what-have-you), and fail to consider the scope beyond that (concurrency, re-entrancy, and so on). At the end of the day, it's simple: static entities and instance entities are just entities. Nothing more, nothing less. Caveat emptor.
|
 Tuesday, February 19, 2008
|
The Fallacies Remain....
|
|
Just recently, I got this bit in an email from the Redmond Developer News ezine: TWO IF BY SEA In the course of just over a week starting on Jan. 30, a total of five undersea data cables linking Europe, Africa and the Middle East were damaged or disrupted. The first two cables to be lost link Europe with Egypt and terminate near the Port of Alexandria. http://reddevnews.com/columns/article.aspx?editorialsid=2502 Early speculation placed the blame on ship anchors that might have dragged across the sea floor during heavy weather. But the subsequent loss of cables in the Persian Gulf and the Mediterranean has produced a chilling numbers game. Someone, it seems, may be trying to sabotage the global network. It's a conclusion that came up at a recent International Telecommunication Union (ITU) press conference. According to an Associated Press report, ITU head of development Sami al-Murshed isn't ready to "rule out that a deliberate act of sabotage caused the damage to the undersea cables over two weeks ago." http://tinyurl.com/3bjtdg You think? In just seven or eight days, five undersea cables were disrupted. Five. All of them serving or connecting to the Middle East. And thus far, only one cable cut -- linking Oman and the United Arab Emirates -- has been identified as accidental, caused by a dragging ship anchor. So what does it mean for developers? A lot, actually. Because it means that the coming wave of service-enabled applications needs to take into account the fact that the cloud is, literally, under attack. This isn't new. For as long as the Internet has been around, concerns about attacks on the network have centered on threats posed by things like distributed denial of service (DDOS) and other network-borne attacks. Twice -- once in 2002 and again in 2007 -- DDOS attacks have targeted the 13 DNS root servers, threatening to disrupt the Internet. But assaults on the remote physical infrastructure of the global network are especially concerning. These cables lie hundreds or even thousands of feet beneath the surface. This wasn't a script-kiddie kicking off an ill-advised DOS attack on a server. This was almost certainly a sophisticated, well-planned, well-financed and well-thought-out effort to cut off an entire section of the world from the global Internet. Clearly, efforts need to be made to ensure that the intercontinental cable infrastructure of the Internet is hardened. Redundant, geographically dispersed links, with plenty of excess bandwidth, are a good start. But development planners need to do their part, as well. Web-based applications shouldn't be crafted with the expectation of limitless bandwidth. Services and apps must be crafted so that they can fail gracefully, shift to lower-bandwidth media (such as satellite) and provide priority to business-critical operations. In short, your critical cloud-reliant apps must continue to work, when almost nothing else will. And all this, I might add, as the industry prepares to welcome the second generation of rich Internet application tools and frameworks. Silverlight 2.0 will debut at MIX08 next month. Adobe is upping the ante with its latest offerings. Developers will enjoy a major step up in their ability to craft enriched, Web-entangled applications and environments. But as you make your plans and write your code, remember this one thing: The people, organization or government that most likely sliced those four or five cables in the Mediterranean and Persian Gulf -- they can do it again. There's a couple of things to consider here, aside from the geopolitical ramifications of a concerted attack on the global IT infrastructure (which does more to damage corporations and the economy than it does to disrupt military communications, which to my understanding are mostly satellite-based). First, this attack on the global infrastructure raises a huge issue with respect to outsourcing--if you lose touch with your development staff for a day, a week, a month (just how long does it take to lay down new trunk cable, anyway?), what sort of chaos is this going to strike with your project schedule? In The World is Flat, Friedman mentions that a couple of fast-food restaurants have outsourced the drive-thru--you drive up to the speaker, and as you place your order, you're talking to somebody half a world way who's punching it into a computer that's flashing the data back to the fast-food join in question for harvesting (it's not like they make the food when you order it, just harvest it from the fields of pre-cooked burgers ripening under infrared lamps in the back) and disbursement as you pull forward the remaining fifty feet to the first window. The ludicrousness of this arrangement notwithstanding, this means that the local fast-food joint is now dependent on the global IT infrastructure in the same way that your ERP system is. Aside from the obvious "geek attraction" to a setup like this, I find it fascinating that at no point did somebody stand up and yell out, "What happened to minimizing the risks?" Effective project development relies heavily on the ability to touch base with the customer every so often to ensure things are progressing in the way the customer was anticipating. When the development team is one ocean and two continents away in one direction, or one ocean and a whole pile of islands away in the other direction, or even just a few states over, that vital communication link is now at the mercy of every single IT node in between them and you. We can make huge strides, but at the end of the day, the huge distances involved can only be "fractionalized", never eliminated. Second, as Desmond points out, this has a huge impact on the design of applications that are assuming a 100% or 99.9% Internet uptime. Yes, I'm looking at you, GMail and Google Calendar and the other so-called "next-generation Internet applications" based on technologies like AJAX. (I categorically refuse to call them "Web 2.0" applications--there is no such thing as "Web 2.0".) As much as we keep looking to the future for an "always-on" networking infrastructure, the more we delude ourselves to the practical realities of life: there is no such thing as "always-on" infrastructure. Networking or otherwise. I know this personally, since last year here in Redmond, some stronger-than-normal winter storms knocked down a whole slew of power lines and left my house without electricity for a week. To very quickly discover how much of modern Western life depends on "always-on" assumptions, go without power to the house for a week. We were fortunate--parts of Redmond and nearby neighborhoods got power back within 24 hours, so if I needed to recharge the laptop or get online to keep doing business, much less get a hot meal or just find a place where it was warm, it meant a quick trip down to the local strip mall where a restaurant with WiFi (Canyon's, for those of you that visit Redmond) kept me going. For others in Redmond, the power outage meant a brief vacation down at the Redmond Town Center Marriott, where power was available pretty much within an hour or two of its disruption. The First Fallacy of Enterprise Systems states that "The network is reliable". The network is only as reliable as the infrastructure around it, and not just the infrastructure that your company lays down from your workstation to the proxy or gateway or cable modem. Take a "traceroute" reading from your desktop machine to the server on which your application is running--if it's not physically in the building as you, then you're probably looking at 20 - 30 "hops" before it reaches the server. Every single one of those "hops" is a potential point of failure. Granted, the architecture of TCP/IP suggests that we should be able to route around any localized points of failure, but how many of those points are, in fact, to your world view, completely unroutable? If your gateway machine goes down, how does TCP/IP try to route around that? If your ISP gets hammered by a Denial-of-Service attack, how do clients reach the server? If we cannot guarantee 100% uptime for electricity, something we've had close to a century to perfect, then how can you assume similar kinds of guarantees for network availability? And before any of you point out that "Hey, most of the time, it just works so why worry about it?", I humbly suggest you walk into your Network Operations Center and ask the helpful IT people to point out the Uninterruptible Power Supplies that fuel the servers there "just in case". When they in turn ask you to point out the "just in case" infrastructure around the application, what will you say? Remember, the Fallacies only bite you when you ignore them: 1) The network is reliable 2) Latency is zero 3) Bandwidth is infinite 4) The network is secure 5) Topology doesn't change 6) There is one administrator 7) Transport cost is zero 8) The network is homogeneous 9) The system is monolithic 10) The system is finished Every project needs, at some point, to have somebody stand up in the room and shout out, "But how do we minimize the risks?" If this is truly a "mission-critical" application, then somebody needs the responsibility of cooking up "What if?" scenarios and answers, even if the answer is to say, "There's not much we can reasonably do in that situation, so we'll just accept that the company shuts its doors in that case".
|
 Monday, February 18, 2008
|
Who herds the cats?
|
|
Recently I've been looking more closely at the various (count them, four of them) proposals for adding new features into the Java language, the "BGGA", "FCM", "CICE" and "JCA" proposals. All of them are interesting and have their merits. A few other proposals for Java 7 have emerged as well, such as extension methods, enhancements to switch, the so-called "multi-catch" enhancement to exceptions, properties, better null support, and some syntax to support lists and maps natively. All of them intriguing ideas, and highly subject to reasonable debate among reasonable people. My concern lies in a different direction. Who herds this bunch of cats? This isn't just a question of process within the JCP. And it's not just a question of closures or the other features we're looking at for Java 7. This is a question about the moral leadership of Java. In the C# space, we have Anders. He clearly "owns" language, and acts as the benevolent dictator. Nothing goes into "his" language without his explicit and expressed OK. Other languages have similar personages in similar roles. Python has Guido. Perl has Larry. C++ has Bjarne. Ruby has Matz. Certainly other individuals "float" around these languages and lend their impressive weight towards the language's design--Scott Meyers, and Herb Sutter in C++, for example, or Dave Thomas and Martin Fowler in Ruby--but the core language design principles rest firmly inside the head of one man. Whereso for Java? James Gosling? Please--Jimmy abandoned the language shortly after its release, and now only comes out every so often to launch T-shirts into the crowd, answer reporters' questions whenever something Java-related comes up, and blog his two cents' worth. He's a reminder of the "good old days", for sure, but he's not coming out with new directions of his own accord and taking the reins to lead us there. He's the Teddy Kennedy of the Java Party. His endorsement weighs in as about as influential as Bob Dole's--interesting to an analytical few, but hardly meaningful in the grand scheme of things. Unfortunately, the two most recognized "benevolent dictators" of the Java language, Neal Gafter and Joshua Bloch, are on opposing sides of the aisle on this. Each has put forth a competing proposal for how the Java language should evolve. Each has his good reasons for how he wants to implement closures in Java. Each has his impressive list of names supporting him. It's Clinton and Obama, Java Edition. The fact is, though, that when these two disagreed on how to move forward, lots of Java developers found themselves in the uncomfortable position faced by the children when the parents fight: do you take sides? Do you try to make peace between them? Or do you just go hide your head under a pillow until the yelling stops? This is the real danger facing Java right now: there is no one with enough moral capital and credibility in the Java space to make this call. We can take polls and votes and strawman proposals until the cows come home, but language design by committee has generally not worked well in the past. If someone without that authority ends up making the decision, it will alienate half the Java community regardless of which way the decision goes. The split is too even to expect one to come out as the obvious front-runner. And expecting a JSR committee process to somehow resolve the differences between these four proposals into a single direction forward is asking a lot. So who makes the call?
Java/J2EE | Languages
Monday, February 18, 2008 9:47:38 PM (Pacific Standard Time, UTC-08:00)
|
|
|