 Sunday, April 06, 2008
More on Paradise

A couple of people have commented on the previous entry, arguing, essentially, that Google needs to do this to be "the best". I understand the argument completely: Google wants to attract the top talent, or retain the top talent, or at least entice the top talent, not to mention give them every reason to be horribly productive, so all of that extravagance is a justifiable--and some might argue necessary--expense.

Thing is, I don't buy into that argument for a second. Talent wants to be rewarded, granted, but think about this for a moment: what kind of hours are these employees buying into by working there? There's an implicit tradeoff here, one that says, "If you are insanely productive, then the cost of this office is justified", meaning the pressure is on. Having an off day? Better pull the all-nighter to make up for it. Got stuck on something you didn't anticipate? Better pull the all-weekender to compensate. You're not in the bush leagues any more, sonny--you're at Google, and we paid a lot of money to make this office your home away from home, so snap to it!

I'm not suggesting that Google is explicitly demanding this of their employees... but neither did Microsoft, back in the day.

See, all of this--including the justification arguments--is eerily reminiscent of Microsoft in their heyday, with the best example being the original Windows NT team. The hours they pulled over the last few months (some say years) of that project were nothing short of marathon sprints, and Microsoft laid everything they could at the feet of these developers (though nothing like what Google has built in Zurich, mind you) to help them focus on shipping the project.

The Wall Street Journal ran an article about the whole thing, and one quote from that article stuck with me: that the pressure to work the insanely long hours didn't come from upper management, but from the other developers on the team. "Are you signed up for this thing or not?" was a euphemism for "Why the hell are you leaving at 9PM? And you're not back until 8AM? What are you, some kind of slacker?" (I felt like screaming back, "Just say no!", and I wasn't even there.) The peer pressure was insane, and drove several members of the team to get outta Dodge at the first opportunity. Or some took off for bike rides across the country to recharge. Or some just... broke.

Microsoft doesn't do this anymore. Nobody is expected to put in 60 hour work weeks as a matter of course; now, the average is around 45, which I believe (though I have no factual evidence to support this) is about average for the industry as a whole. (C'mon, admit it, even if you're a strict 9-to-5'er, you still do a little reading at home or stick around after hours to help with the big rollout. It needs to be done, and you're professional enough to want to see it done right.)

In college, I learned a lot about startups and established companies. Like most of the folks I hung out with in college, I used to stay up way too late with friends hanging out at Woodstock's (the local pizzeria) arguing politics/sex/religion/operating systems or playing role-playing games*, then come home and bang out a 5-page paper in a few hours before cracking open my notes to study for the final that morning. I could do this without any real penalty, but I usually ended up taking the final, then coming back to my apartment and passing out for a few hours, only to awake to my roommates chanting "Piz-za! Piz-za! Piz-za!" and starting the whole thing over again. I was young, I had energy, I was fundamentally stupid, and I can't do that anymore. I can't sprint like that and still be able to function coherently over time, and as you get older, you realize that while college can be managed as a series of sprints, life requires you to have a more marathon attitude, particularly because you can't know when the sprints are coming, the way you can in college.

As companies grow larger, their initial lifecycle is a series of sprints: roll the first release out, take a breather while the sales guys gather in the customer(s) and figure out what the next iteration will be, then do the whole thing over again. This effect is even more pronounced if the company has that one Really Big Customer, the one that represents some significant portion (over 50%) of the company's revenue; it's that customer that drives the feature set and its delivery date. Meeting their needs and challenges becomes the source of the sprints to come.

As time goes on, however, and assuming the company has somehow managed to find success, they find that the Really Big Customer is actually now just one of several, and the features and the timing of the releases need to be balanced across the entire customer set. In other words, while some sprints are still necessary, their frequency and intensity begin to smooth out and the focus shifts to structure, pace, and consistency.

That is, if the company has successfully transitioned from "startup" to "established". Some startups never do, and try to sprint themselves from one scenario to another, and eventually run themselves into the ground. Managing this transition isn't easy, and is something that generally only comes from having lived through it once or twice... or three or four times... Ah, good times.

Remember that whole "work/life balance" thing? We're discovering, over and over again, that having a good work/life balance is a key part of maintaining a sound outlook on life, much less your basic sanity. Creating a "home away from home" where employees can put in an insane number of hours is not healthy "work/life balance"... unless you presume your company will be staffed with fresh-from-college twenty-something singles who have nobody to go home to and all their friends in the office. And that, folks, is not a sustainable model.

Point is, Google's extravagance here smacks of startups, sprints, and fevered intensity. What's worse, I hear little bits and pieces of rumors that Google reveres the eleventh-hour developer-god who swoops in, pulls the all-weeker to get the release out the door, to high praise from management and his/her peers.

That's not sustainable, and the sooner Google--or any other company, for that matter--realizes that, the better they will be in the long term.

* - Does this really surprise you? Yes, I am a huge geek.


Development Processes

Sunday, April 06, 2008 12:49:56 AM (Pacific Daylight Time, UTC-07:00)
The Complexities of Black Boxes

Kohsuke Kawaguchi has posted a blog entry describing how to watch the assembly code get generated by the JVM during execution, using a non-product (debug or fastdebug) build of Hotspot and the -XX:+PrintOptoAssembly flag, a trick he says he learned while at TheServerSide Java Symposium a few weeks ago in Vegas. He goes on to do some analysis of the generated assembly instructions, offering up some interesting insights into the JVM's inner workings.

There's only one problem with this: the flag doesn't exist.

Looking at the source for the most recent builds of the JVM (b24, plus whatever new changesets have been applied to the Mercurial repositories since then), and in particular at the "globals.hpp" file (jdk7/hotspot/src/share/vm/runtime/globals.hpp), where all the -XX flags are described, no such flag exists. It must have at one point, since he's obviously been able to use it to get an assembly dump (as must whoever taught him how to do it), but it's not there anymore.

OK, OK, I lied. It never was there for the client build (near as I can tell), but it is there if you squint hard enough (jdk7/hotspot/src/share/vm/opto/c2_globals.hpp). As the pathname to the source file implies, it's only there for the server build, which is why Kohsuke has to specify the "-server" flag on the command line; if you leave that off, you get an error message from the JVM saying the flag is unrecognized, leading you to believe Kohsuke (and whoever taught him this trick) is clearly a few megs shy in their mental heap. So when you try this trick, make sure to use "-server", and make sure to run methods enough to force JIT to take place (or set the JIT threshold using -XX:CompileThreshold=1) in order to see the assembly actually get generated.
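If you want something small to point the flag at, here's a minimal sketch. The class and method names (HotLoop, sumTo) are made up for illustration, and the command line in the comment simply strings together the flags described above; it assumes you have a debug or fastdebug build of Hotspot on your path, since a product build will reject the flag.

```java
// Hypothetical example class; nothing here comes from Kohsuke's post.
// Assumed invocation on a debug or fastdebug Hotspot build:
//   java -server -XX:+PrintOptoAssembly -XX:CompileThreshold=1 HotLoop
public class HotLoop {
    // A small, side-effect-free method that the server compiler
    // should pick up once it has been called often enough.
    static int sumTo(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        long checksum = 0;
        // Call the method many times so it crosses the JIT threshold
        // even if -XX:CompileThreshold is left at its default.
        for (int i = 0; i < 20000; i++) {
            checksum += sumTo(1000);
        }
        // Print the result so the loop cannot be discarded as dead code.
        System.out.println(checksum);
    }
}
```

Keeping the method tiny is deliberate: the less there is in the loop, the easier it should be to match the dumped instructions back to the source.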

Oh, and make sure to swing the dead chicken--fresh, not frozen--by the neck over your head three times, counterclockwise, while facing the moon and chanting, "Ohwah... Tanidd... Eyah... Tiam...". If you don't, the whole thing won't work. Seriously.

...

Ever feel like that's how we tune the JVM? Me too. The whole thing is this huge black box, and it's nearly impossible to extract any kind of useful information without wandering into the scores of mysterious "-XX" flags, each of which is barely documented, not guaranteed to do anything visibly useful, and barely understood by anybody outside of Sun.

Hey, at least we have those flags in the JVM; the CLR developers have to take whatever the Microsoft guys give them. ("And they'll like it, too! Why, when I was their age, I had to program using nothing but pebbles by the side of the road on my way to school! Uphill! Both ways! In the raging blizzards of Arizona!")

Interestingly enough, this conversation got me into an argument with a friend of mine who works for Sun.

During the conversation, I mentioned that I was annoyed at the difficulty a Java developer has in trying to see how the Java code he/she writes turns into assembly, making it hard to understand what's really happening inside the black box. After all, the CLR makes this pretty trivial--when you set a breakpoint in Visual Studio, if you have the right flags turned on, your C# or VB source is displayed alongside the actual native instructions, making it fairly easy to see the JITted code. This was always of great help when trying to prove to skeptical C++ developers that the CLR wasn't entirely brain-dead, and did a lot of the optimizations their favorite C++ compiler did, in some cases even better than the C++ compiler might have done. "Why don't we have some kind of double-X-show-me-the-code flag, so I can do the same with the JVM?", I lamented.

His contention was that this lack of a flag is a good thing.

Convinced I was misunderstanding his position, I asked him what he meant by that, and he said, roughly paraphrasing, that there are only about 20 or so people in the world who could look at that assembly dump and not draw incredibly misguided impressions of how the JVM operates internally; more importantly, because so few people could do anything useful with that output, it was to our collective benefit that this information was so hard to obtain.

To quote one of my favorite comedians, "Well excuuuuuuuuuuse ME." I was a bit... taken aback, shall we say.

I understand his point--that sometimes knowledge without the right context around it can lead to misinterpretation and misunderstanding. I'll agree totally with the assertion that the JVM is an incredibly complex piece of software that does some very sophisticated runtime analysis to get Java code to run faster and faster. I'll even grant you that the timing involved in displaying the assembly dump is critical, since Hotspot re-JITs methods that get used repeatedly, something the CLR has talked about ("code pitching") but thus far hasn't executed on.

But this idea that only a certain select group of people are smart enough and understand the JVM well enough to interpret the results correctly? That's dangerous, on several levels.

First, it's potentially an elitist attitude to take, essentially presenting a "We look down on you poor peasants who just don't get it" persona, and if word gets out that this is how Sun views Java developers as a whole, then it's a black mark on Sun's PR image and causes them some major pain and credibility loss. Now, let me be brutally frank here: For the record, I don't think this is the case--everybody I've met at Sun thus far is helpful and down-to-earth, and scary-smart. I have a hard time believing that they're secretly thumbing their nose at me. I suppose it's possible, but it's also possible that Bill Gates and Scott McNealy were in cahoots the whole time, too.

Second, and more importantly, there will never be any more than those 20 people we have now, unless Sun works to open the deep dark internals of the JVM to more people. I know I'm not alone in the world in wanting to know how the JVM works at the same level as I know how the CLR works, and now that the OpenJDK is up and running, if Sun wants to see any patches or feature enhancements from the community, then they need to invest in more educational infrastructure to get those of us who are interested in this stuff more up to speed on the topic.

Third, and most important of all, so long as the JVM remains a black box, the "myths, legends and lore" will haunt us forever. Remember when all the Java performance articles went on and on about how methods marked "final" were better-performing and so therefore should be how you write your Java code? Now, close to ten years later, we can look back at that and laugh, seeing it for the micro-optimization it is, but if challenged on this idea, we have no proof. There is no way to create demonstrable evidence to prove or disprove this point. Which means, then, that Java developers can argue this back and forth based on nothing more than our mental model of the JVM and what "logically makes sense".

Some will suggest that we can use micro-benchmarks to compare the two options and see how, after a million iterations, the total elapsed time compares. Brian Goetz has spent a lot of time and energy refuting this myth, but to put it in some degree of perspective, a micro-benchmark to prove or disprove the performance benefits of "final" methods is like changing the motor oil in your car and then driving across the country over and over again, measuring how long until the engine explodes. You can do it, but there's going to be so much noise from everything else around the experiment--weather, your habits as a driver, the speeds at which you're driving, and so on--that the results will be essentially meaningless unless there is a huge disparity, capable of shining through the noise.
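To make the complaint concrete, here's a sketch of the kind of naive micro-benchmark people usually reach for. Everything in it (the FinalBench, Plain and Sealed names, the iteration count) is invented for illustration, and it's offered as an example of the flawed technique, not as a way to settle the question.

```java
// Hypothetical naive micro-benchmark: "final" versus non-final methods.
public class FinalBench {
    // Two classes that differ only in the "final" modifier.
    static class Plain {
        int next(int x) { return x + 1; }
    }

    static class Sealed {
        final int next(int x) { return x + 1; }
    }

    // Calls the non-final method "iterations" times and returns the result.
    static int runPlain(int iterations) {
        Plain p = new Plain();
        int x = 0;
        for (int i = 0; i < iterations; i++) {
            x = p.next(x);
        }
        return x;
    }

    // Same loop, but against the final method.
    static int runSealed(int iterations) {
        Sealed s = new Sealed();
        int x = 0;
        for (int i = 0; i < iterations; i++) {
            x = s.next(x);
        }
        return x;
    }

    public static void main(String[] args) {
        int iterations = 1000000;
        long t0 = System.nanoTime();
        int a = runPlain(iterations);
        long t1 = System.nanoTime();
        int b = runSealed(iterations);
        long t2 = System.nanoTime();
        // The two timings include interpretation, JIT compilation and
        // whatever else the machine happened to be doing at the time.
        System.out.println("plain:  " + (t1 - t0) + " ns (result " + a + ")");
        System.out.println("sealed: " + (t2 - t1) + " ns (result " + b + ")");
    }
}
```

Run it a few times, or swap the order of the two calls, and the numbers will likely move around by more than any difference "final" could plausibly make. That's the noise drowning out the signal.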

This is a position borne out across history--we've never been able to understand a system until we can observe it from outside the system; examples abound, such as the early medical understanding of Aristotle's theories weighed against the medical experiments performed by the Renaissance thinkers. One story says a skeptic, looking at the body in front of him disproving one of Aristotle's theories, shook his head and said, "I would believe you except that it was Aristotle who said it." When mental models are built on faith, rather than on fact and evidence, progress cannot reasonably occur.

Don't think the analogy holds? How long did we as Java developers hold faith with the idea that object pools were a good idea, and that objects are expensive to create, despite the fact that the Hotspot FAQ has explicitly told us otherwise since JDK 1.3? I still run into Java developers who insist that object pools are a good idea all across the board. I show them the Hotspot FAQ page, and they shake their head and say, "I would believe you except that it was (so-and-so article author) who said it."

Oh, and don't get me started on the near-total opacity of the Parrot and Ruby environments, among others--this isn't a "static vs dynamic" thing, this is something everybody running on a managed platform needs to be able to do.

I'm tired of arguing from a position of faith. I want evidence to either prove or disprove my assertions, and more importantly, I want my mental model of how the JVM operates to improve until it's more reflective of and closer to the reality. I can't do that until the JVM offers up some mechanisms for gathering that evidence, or at least for gathering it more easily and comprehensively. You shouldn't have to be a JVM expert to get some verification that your understanding of how the JVM works is correct or incorrect.


.NET | Java/J2EE | LLVM | Parrot | Ruby

Sunday, April 06, 2008 12:48:50 AM (Pacific Daylight Time, UTC-07:00)
 Thursday, April 03, 2008
Developer paradise?

Check out this video. No, go on, watch it. The rest of this won't make much sense until you do.

Now that you've seen it, take a moment, do the "WOW" thing in your head, imagine how cool it would be to work there, all of it. Go on, I know you want to, I did too when I first saw it. Go ahead, take a moment; you'll be distracted until you do, and you'll miss the rest of the point of this blog entry, and then I'll be sad. Go on, now. Here, I'll do it with you, even.

Mmmmmm. Slide to lunch. Ahhhh. Massage chair in front of the fish tank. Wow, just think of how cool it must be to work at Google. I mean, they work hard and all, but still... now there's a company that knows how to take care of its engineers, right?

OK, daydreaming done? Let's think about this for a moment.

First, how can anybody get anything done with all that noise surrounding them? Oh, I don't mean actual audio noise, I know they've created quiet zones and all that, I mean the myriad distractions that float around that office building. I'll be honest--I find myself getting work done better in an environment without that additional stimulus and excitement (legacy of my ADD, I'm sure). Knowing that I could just nip on over to the video game room to spend some "thinking time" in front of an all-you-can-play Galaga machine would drive me batty.

Maybe that's just me, and others are just begging to be given the chance to prove me wrong, and if that's the case, then by all means, please feel free. But I've heard this same experience from lots of people doing the work-at-home thing, and I don't think the anecdotal evidence here is wildly skewed. Sometimes you want work to be... just work. Vanilla, boring, and predictable.

Don't get me wrong--I don't exactly look forward to my next engagement that plops me down in the middle of the cube farm. There's a continuum here, and Google is clearly as far toward the opposite end of that spectrum from the Dilbert-esque cubicle prairie as anyone can get. But had I my personal preference here, it would be a desk, fairly plain, comfortable, yet focused more on the functional than the "fun".

But second, there's a deeper concern that I have, one which I worry a lot more about than just peoples' preference in work space.

When's the last time you saw this kind of extravagance being lavished on developers? For me, it was at a number of different Silicon Valley firms during the dot-com boom of the late 90's... and all of those firms are desiccated remains of what they once were, or else dried up completely into dust and have long since blown away with the coastal breeze. This was classic startup behavior: drop a ton of money

I'll call it: If Google sees nothing wrong with this kind of extravagance in setting up an office, then they have just done their first evil.

Pause for a moment to think about the costs involved in setting up that office. I submit to you, dear reader, that Google is being financially irresponsible with that office, all nice perks aside. Google's money machine isn't going to last forever--nobody's ever does--and the company (desperately, IMHO) needs to find something else to prove to Wall Street and Developer Street that they're still a company that knows how to write cool software and make money. (Plenty of companies write cool software, and close their doors a few years later, and plenty of companies know how to make money, but having a company who can do both is a real rarity.)

Look at Google's habits right now: they're pouring money out left and right in an effort to maintain or improve the Google "image"; tons of giveaways at conferences, tons of offices all across the world, incredible office spaces like the one in the video, and a ton of projects created by Google engineers just because said engineers think it's cool. While that's a developer's dream, it doesn't pay the rent. I want to work for a company that offers me a creative, productive work environment, true, but more than that, I want to work for a company that knows how to make sure my checks still cash. (Yes, I remember the late 90's well, and the collapse that followed.)

I'm worried about Google--they appear to be on a dangerous arc, spending what would seem to be far in excess of what they're taking in, and that's not even considering some of the companies they would be well-advised to consider buying in order to flesh out more of their corporate profile (which is its own interesting discussion, one for a later day). What is Google's principal source of income right now? Near as I can tell, it's AdWords, and I just can't believe that the AdWords gravy train will run any longer than...

... well, than the DOS gravy train. Granted, that train ran for a long time, but eventually it ran out, and the company had to find alternative sources of income. Microsoft did, and now it's Google's turn to prove they can put money back into their corporate coffers.

The parallels between Google and Microsoft are staggering, IMHO.


Development Processes

Thursday, April 03, 2008 3:02:54 AM (Pacific Daylight Time, UTC-07:00)
 Wednesday, April 02, 2008
Is Microsoft serious?

Recently I received a press announcement from Waggener-Edstrom, Microsoft's PR company, about their latest move in the interoperability space; I reproduce it here in its entirety for your perusal:

Hi Ted,

Microsoft is announcing another action to promote greater interoperability, opportunity and choice across the IT industry of developers, partners, customers and competitors. 

Today Microsoft is posting additional documentation of the XAML (eXtensible Application Markup Language) formats for advanced user experiences, enabling third parties to access and implement the XAML formats in their own client, server and tool products.  This documentation is publicly available, for no charge, at http://go.microsoft.com/fwlink/?LinkId=113699

It will assist developers building non-Microsoft clients and servers to read and write XAML to process advanced user experiences – with lots of animation, rich 2D and 3D graphic and video. Specifically, non-Microsoft servers can more easily generate XAML files to be handled, for example, by applications running on Windows client machines.  In addition, non-Microsoft clients can be written more easily to interpret XAML files. This action will assist ISVs in creating design tools and file format converters to read and write XAML to create advanced user experiences.

Microsoft is making this documentation available under the Microsoft Open Specification Promise (OSP), which will allow developers of all types anywhere in the world to access and implement the XAML formats in their own client, server or tool products without having to take a license or pay a fee to Microsoft.

The following quote is attributable to Tom Robertson, general manager, Interoperability and Standards, Microsoft.

“Microsoft’s posting of the expanded set of XAML format documentation to assist third parties to access and implement the XAML formats in their own client, server and tool products will help promote interoperability, opportunity and choice across the IT community.  Use of the Open Specification Promise assures developers that they can use any Microsoft patents needed to implement all or part of the XAML formats for free, anywhere in the world, now and in the future.” 

Please let me know if you have any questions or if I can provide you with any additional information. 

Best,

N--

This marks the most recent in a slew of efforts by the Borg of the Pacific Northwest to "promote greater interoperability, opportunity and choice", and I know it's left a lot of people feeling decidedly skeptical and... well, let's just call it what it is, paranoid, about the company's plans and the ulterior motives behind all these efforts. After all, this is the company that tried to co-opt Java, put Stacker out of business, used their monopoly operating system power to crush Novell, used their monopoly office suite power to crush the Mac, bribed an entire country to vote their way on the new office-file specifications, and I don't know what all else.

I know, I know, all my blog-readers who work at Microsoft are going nuts right now, protesting, claiming that this isn't the same company that they work for now, and so on. Fact is, folks, if you work at Microsoft, you work for a company whose name is not well-received in many quarters, and while some of it is undeserved... some of it is. Microsoft has done some pretty stupid things in its history, and if that reputation doesn't sit well with you now, I can't help but wonder if somewhere in that great Corporate Heaven, Stac Electronics isn't just jumping up and down, foaming at the mouth and screaming, "Ha! Serves you right!"

I don't want to use this blog as a chance for everybody who ever got burned by Microsoft (or thought they got burned by Microsoft, which is much more widespread and just as much more likely to be in their own minds) to trot out "reap and sow" cliches. Instead, I want to revisit one of my favorite topics, that of interoperability, and see exactly what this new shift in Microsoft's attitude towards interoperability really means.

Let's take these one at a time. Note that I have no "Deep Throat" at Microsoft feeding me "the Redmondagon Papers"; this is all based on my own conjecture and perspective.

What does releasing the XAML spec really mean?

Honestly, it means that now non-Microsoft platforms can try to create competitors to Aero and Windows Presentation Foundation, and have the same kind of rich client experiences that Windows users can enjoy.

Honestly, I expect this to go pretty much nowhere.

Realistically speaking, if a non-Microsoft app server wanted to generate XAML, it was a simple matter of generating the appropriate XML, tagging it with an appropriate MIME type in the HTTP header, and serving it up over an HTTP request; I've been giving this demo at conferences for three or four years now, pretty much since the first betas of WPF were stable enough to use. This really isn't rocket science.
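For the skeptical, here's roughly what that demo boils down to, sketched with the small HTTP server that ships in the Sun JDK (com.sun.net.httpserver). The class name, the /hello.xaml path, the port and the button label are all made up for illustration, and "application/xaml+xml" is the MIME type I'm assuming the client expects for loose XAML; check your own client before relying on it.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a plain Java process serving a XAML document.
public class XamlServer {
    // Assumed MIME type for loose XAML.
    static final String XAML_MIME = "application/xaml+xml";

    // Minimal escaping so the label is safe inside an XML attribute.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace("\"", "&quot;");
    }

    // XAML is just XML, so generating it is ordinary string building.
    static String buildXaml(String label) {
        return "<Page xmlns=\"http://schemas.microsoft.com/winfx/2006/xaml/presentation\">"
             + "<Button Content=\"" + escape(label) + "\"/>"
             + "</Page>";
    }

    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/hello.xaml", exchange -> {
            byte[] body = buildXaml("Hello from Java").getBytes(StandardCharsets.UTF_8);
            // Tag the response so the client treats it as XAML, not plain XML.
            exchange.getResponseHeaders().set("Content-Type", XAML_MIME);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080);
        System.out.println("Serving XAML at http://localhost:8080/hello.xaml");
    }
}
```

Nothing in there needed a spec from Microsoft: it's XML, a header, and an HTTP response.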

But more importantly, XAML has always been misunderstood: it's not a presentation format, it's an object graph format. XAML simply "wires up" a collection of objects into a tree, and it's the underlying object model that provides the functionality or power or presentation or whatever. It's an easier way of writing "Button b = new Button(...);", nothing more, nothing less. Sure, it would be nice to have some kind of equivalent for the Swing space, but doing so would tie the corresponding XML (XSML?) to the Swing APIs, just as WPF XAML is tied to the WPF API.
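To show how little magic is involved, here's a toy sketch of the Swing-flavored equivalent. It is not how WPF's XAML loader works; the MiniMarkupLoader class and its tiny markup dialect are invented for illustration. Element names map to javax.swing classes, attributes map to String setters, and nested elements become child components.

```java
import java.awt.Component;
import java.awt.Container;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical toy loader: XML in, a tree of Swing objects out.
public class MiniMarkupLoader {
    // Turn one XML element into one Swing object, then recurse into its children.
    static Component build(Element element) throws Exception {
        // The element name is the class name: <JButton> -> javax.swing.JButton.
        Class<?> type = Class.forName("javax.swing." + element.getTagName());
        Component component = (Component) type.getDeclaredConstructor().newInstance();

        // Each attribute is a String-valued setter: text="OK" -> setText("OK").
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            String name = attribute.getNodeName();
            String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            type.getMethod(setter, String.class).invoke(component, attribute.getNodeValue());
        }

        // Child elements become child components: the XML tree is the object tree.
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                ((Container) component).add(build((Element) children.item(i)));
            }
        }
        return component;
    }

    // Parse the markup and build the object tree; any failure is reported
    // as an IllegalArgumentException to keep the calling code simple.
    static Component load(String markup) {
        try {
            byte[] bytes = markup.getBytes(StandardCharsets.UTF_8);
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes)).getDocumentElement();
            return build(root);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not load markup", e);
        }
    }

    public static void main(String[] args) {
        Component root = load("<JPanel><JButton text=\"OK\"/><JLabel text=\"Ready\"/></JPanel>");
        System.out.println(((Container) root).getComponentCount());
    }
}
```

Notice that the loader knows nothing about buttons or panels; all of the behavior lives in the Swing classes it instantiates, which is exactly why the markup ends up welded to the API underneath it.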

Does releasing the XAML specs mean that now Linux and Mac OS will get WPF features?

They've had them for years, in the guise of the OpenGL APIs, and nobody knew what to do with them, except maybe for a sliver of folks building games and interesting "effects". Unless somebody really feels the desire to try and create an adapter layer to map the WPF Button over to an OpenGL button, I really don't see much point.

This is one of the most dangerous points in the discussion: attempting to build an adapter to another platform's API is almost always a failed experiment from the day it's begun, and Microsoft's own attempt to port the MFC APIs over to the Mac OS (back in the pre-OS X days, circa 1995) was just a miserable, abject failure. Not because of any lack of intelligence on Microsoft's part, mind you, but because the two operating systems are just too different. Want to see what I mean? Bring a Mac guy and a Windows guy into the same room, and ask them each where God intended the menu bar to live.

Then creep, quietly, out of the room, before you get caught in the blood frenzy.

Why does Microsoft suddenly care about interoperability?

This is the crown jewel of the lot: why should this company, so famous for going it alone on so many issues, suddenly decide that it's important for them to embrace the other kids on the playground and make nice? Is this back to the "embrace" part of the "embrace and extend and extinguish" cycle that they're so famous for?

Partly.

To understand the point I am about to make, let's set some context.

(In other words, gather 'round, children, it's story time.)

Truth is, there was a time back there in the '90s when I think Microsoft really thought they could take over the world. COM was on the ascendancy, and it was a better platform for building software than anything else out there (at the time), particularly in the area of building rich media applications (remember when embedding a sound clip into your email message embedded inside your spreadsheet was all the rage?). The CORBA initiative was going strong, true, but its great claim to fame was to allow two remote processes to talk to one another--the rest of the CORBA "push" was in standards that either never materialized, or else materialized but turned out to be really hard to build, or use, or deploy, or all of the above. IBM's great competitor--SOM--wasn't even in beta on anything other than OS/2 (another great IBM product). Then, when DCOM shipped, it was seen by some as the final nail in the CORBA coffin; Microsoft clearly was going to "win".

Along came Java.

Java literally took the rug out from underneath the COM platform, almost overnight. It provided a platform with most of the same benefits as the COM/DCOM platform, but without having to memorize the QueryInterface rules or knowing what IUnknown was or how IDispatch was required to work or how static_cast<> and dynamic_cast<> and QueryInterface were all related. ("Would you, should you, static_cast? Not if you want your code to last..." Ah, those were heady days.) Suddenly, "mere mortals" could program on this platform, and feel a strong sense of confidence that their code would work, over time, regardless of whether they remembered to set references to null when they were done with them.

At first, Microsoft was "down with it", because in Java they saw a great marriage: the Java language as the "sweet spot" between C++'s expressive power and VB's layers of abstraction, running on top of the JVM as a "sweet spot" intermingled with the COM platform to provide the easiest, most powerful Windows programming environment yet. Visual J++ was clearly the favored child of the litter.

And then the lawyers got involved, and Sun saw their chance to steal a march on Microsoft, and maybe break the feared operating system monopolist, and maybe even get a few more percentage points for Solaris (because, after all, "Write Once, Run Anywhere" meant that you wouldn't have to run sucky operating systems like Windows and instead could trade up to real operating systems like Solaris, right? Hey, where'd that penguin come from, anyway, and why is he eating all our fish?). Sun refused to let Microsoft's marriage of the JVM (technically the MSVM) and COM take place, and Microsoft, rather than seek to fight it out, instead decided to cede the battle, and look for a battleground of their own choosing, instead. Thus was the thing that would become called ".NET" born.

But this "master plan" would take four or so years to develop, and in the meantime...

... in the meantime, EJB and Servlets and later J2EE and "app servers" and Spring and all those wonderful things that came with them, they were eating Microsoft's lunch. There is simply no comparison between J2EE (even with EJB in the mix!) and the complexities of writing unmanaged COM code on top of COM+--again, the power of the managed platform simply proved to be too hard to turn away without compelling reason, and the COM/DCOM/COM+ story simply didn't have that compelling reason. Microsoft watched their "inevitable victory" sail into the sunset without them, just as the Department of Justice came up to them and shackled them with the first of many, many papers about "anti-competitive practices".

In many respects, the positions got reversed--Sun inherited a huge share (an unhealthy dose, in fact) of Microsoft's arrogance, and for a long time there, thought they were suddenly destiny's child, that Java (meaning Sun, of course) would be the one to "win", and thus would Sun's world dominance be assured.

Except it didn't play out that way.

Sun found that by embracing standards over implementations, they spent long hours thrashing out specifications, only to provide instant credibility to other vendors' products while their own languished. Weblogic stole the EJB early adopter window. A number of small vendors provided servlet implementations before Tomcat was born... which, although written by Sun employees, was an open-source project and yielded no financial benefit. JMS... well, JMS was always the redheaded stepchild of the J2EE family, at least until vendors like Sonic and Fiorano rescued it for the common Java programmer. (Those who'd been using IBM MQSeries all the while never really could see why you'd want to program against JMS APIs instead of IBM's own.) In each and every case, Sun found their product to be the third or fourth entry into the race, usually years after the others had started, and as a result....

Meanwhile, back in Redmond....

Microsoft comes to the game with .NET in 2003. (The early betas don't count because many people openly wonder if Microsoft is really serious about this ".NET" thing in the first place. After all, remember Microsoft Bob?) And despite .NET's obvious advantage of being formulated nearly a decade after Java's initial release, thus able to apply hindsight to fix or improve the obvious blemishes in the Java environment, Microsoft finds that they're playing catch-up in the all-too-important enterprise space. Microsoft's tools and products have always been seen as "second-class citizens" to the "big boys" in the enterprise space, particularly at the ends of the "high scale" continuum, and the lack of an obvious "app server" in the .NET arena only serves to underscore and reinforce that opinion among many large firms.

More importantly, Microsoft doesn't ever want to get blindsided by the Java experience again. They want to make sure that they are never in a position where it looks like their tools are vastly out-of-date, underfeatured, underpowered, and underused. They need to remain somewhere near the bleeding edge, but not so close that their customers are the ones doing the bleeding.

(We pause for the inevitable Vista joke.)

To Microsoft, Java is that near-death experience that pulls many adrenaline (and other) junkies back from the brink they so callously teetered on before. They need some kind of forward progress, some kind of advancement in the game, so that their customers and their would-be customers feel like Microsoft is on top of it at all times.

Result: Somewhere in the 2000-2003 timeframe, Microsoft looks around, sees the landscape, and realizes it needs to make itself relevant to a largely J2EE-based universe, and fast.

At first, Microsoft sees a play through the establishment of some standards between the big vendors, around this new "XML" thing, a largely portable data format, and so they throw themselves heart and soul into that space. Doing so will allow them to show existing J2EE-based shops that the power of the .NET platform lies in complementing the existing infrastructure, not replacing it. (Microsoft is smart enough to realize that preaching the software equivalent of hellfire-and-brimstone, known as "rip-and-replace", will not cater well to this congregation.)

(Rubyists could have learned a valuable lesson here, but either weren't paying attention, didn't realize the value of the lesson, or else just chose not to.)

But this play doesn't turn out the way they expect: the WS-* standards become top-heavy, and start to resemble the very thing Microsoft sought to smash fifteen years earlier: CORBA. The number of WS- specifications available through the W3C (and OASIS, and WS-I and whatever other industry consortiums are formed) is exceeded only by the number of Cos- specifications available from the OMG. The complexities therein leave many Java--and .NET--programmers confused, bewildered, and hopelessly lost when trying to get all but the most simple services to work. Thus does the community turn to alternatives--JSON, simple sockets, REST, whatever--to try and find something that works, even if it only addresses a subset of the problems they will eventually face.

Meanwhile...

Open source grows ever more important, and Microsoft-the-company realizes they have to either kill it or join it. It's hard to kill something that has no body (unlike their previous competitors), so joining it is the only viable option. Unlike many other software product companies, however, Microsoft has too large an established software base to just "flip the switch", and has far too deeply entrenched a corporate community to take any kind of radical action without a well-thought plan. (Wall Street, a place few programmers ever bother to consider, much less visit, would not take kindly to Microsoft essentially giving away their core product without something in its place to generate revenue, and regardless of how many programmers would like to imagine a world with a bankrupt Microsoft, this would be bad for business for everybody.)

And thus do we come to the present.

Microsoft needs a play that is Wall Street friendly, programmer friendly, and corporate friendly. They are slowly flirting more and more deeply with open source, yet still firmly committed to turning a profit (something a few of these other open-source-based companies should probably learn to do at some point--just maneuvering to the point of being bought out by a larger fish, like Oracle, is not really a long-term competitive strategy, just so you know).

Microsoft wants--arguably, needs--to keep Office relevant in a world where software isn't always paid for, so they need a play that keeps Office ubiquitous and out in the forefront of developer mindshare. If they can't get you to buy Office, then at least let's get you to use tools that keep the Office file formats ubiquitous. If (and this is a big "if") the Office formats turn out to be technically superior to their competition, then Microsoft succeeds. If not, they find a new play.

In short, Microsoft needs an interoperability story, and they need a real interoperability play, because their reputation is damaged from the many "embrace, extend, extinguish" plays they've made in the past. The era of a large vendor "winning" is clearly well behind us (if it was ever, in fact, more than just a marketing VP's wet dream), and if Microsoft is going to make sure that they're never in a vulnerable come-from-behind position again, they need to make sure that they can work well with all the other new technologies out there, whether up-and-coming or well-established or even fading-fast. They need to have an interoperability story that developers can believe in, which means some kind of open-source-friendly play, and one that carries serious "street cred" for actually working.

What's the lesson that I, a developer, take away from this?

If you are a Java developer, get past your old prejudices and accept that .NET is a viable platform. The Java developer who refuses to learn how to write C# code on the grounds that "Micro$oft is a company that just puts out crap" or that "M$FT sux" is going to be a Java developer whose value to the business is reduced compared to those with less virulent politics. Thanks to tools like VMWare and Virtual PC, you don't have to give up your Mac or your Linux environment to write .NET code and prove that you can offer value to those projects that need to talk to .NET. Look into more than just the WS-* or REST stacks for communication, as well; explore some of the interoperability options I've been ranting about for four years, a la IKVM, Jace, Hessian, even CORBA.

If you are a Ruby developer, get over yourself and your "we're more agile and more powerful" meme. Ruby is a tool, nothing more, and one whose shine is fast coming off. IT organizations are discovering the myriad problems with the original Ruby runtime, and are unwilling to risk enterprise apps on a runtime that has zero monitoring and zero manageability play. Yes, you can certainly do lots of things yourself to make your Ruby apps more manageable and more monitorable--but that's all time you have to spend building it, or figuring out how to hook it into the existing IT infrastructure, and when all that time gets added up, it's not going to look all that different from a Java or .NET app's timecycle arc. If you don't have an answer to the question, "How will we make this work with the existing infrastructure we've got?", then you have a problem, and no amount of chanting "Obi-Dave Thomas-Kenobi, you and dynamic typing are my only hope" will save you.

If you are a .NET developer, it's high time you accepted that the Java folks are about five years ahead of you on this "managed code" arc, and that they suffered through a lot of hard lessons before arriving at the decisions they came to. Don't be stupid; learn from their mistakes. Why do Java programmers chant "dependency injection" with holy fervor? Why do Java programmers put so much stress on unit testing? What has Microsoft not given you with the latest release of Visual Studio that Java developers think you're an idiot for not demanding in the next release? Yes, C# has some interesting new features in it that Java-the-language doesn't have... but why are the Java guys getting all misty-eyed over Groovy? What do they know that you don't?

If you are a developer outside of these areas, you're swimming in dangerous waters, because while I'm sure you're not having any problems finding a job, chances are your next job is going to require you to talk to one of those three environments. Better have your integration/interoperability story worked out, whether it's Phalanger for the PHP developer who needs to talk to .NET (and damn if PHP script driving a WinForms app isn't an interesting idea in and of itself... and a useful way to bridge yourself into an entirely new area of employment), or figuring out how to apply your mad Haskell skillz to F# or Scala. You need to have a good idea of what those languages are (and aren't) and how your knowledge of functional concepts can catapult you to the head of the class the next time a massively-scalable system needs to be built.

If you are a Microsoft employee, don't blow this. Don't make this into another "embrace, extend, extinguish" cycle. Accept that your company made some bone-headed maneuvers in the past, and rather than try to defend them, accept that your reputation outside of the Redmond Reality-Distortion Bubble is not what it looks like from the inside. As hard as this will be to do sometimes, just stop and listen to what others are saying about the company and the paranoia that creeps up every time Microsoft moves into an area of interest. Take the extra moment to hear the concerns, not just the words.

And if you are a Google employee, tattoo this on your forehead: Reputation Matters. The first time anybody at your company does something even remotely "evil", you will be branded as "the next Microsoft" and all of these problems will be yours to share and enjoy, as well.


.NET | C++ | F# | Flash | Java/J2EE | Mac OS | Parrot | Ruby | Solaris | VMWare | Windows | XML Services

Wednesday, April 02, 2008 6:12:22 AM (Pacific Daylight Time, UTC-07:00)
 Tuesday, April 01, 2008
MSDN "F# Primer" Article Feedback

Since the publication of the F# article in the MSDN Launch magazine, I've gotten some feedback from readers (for which I heartily thank you all, by the way), but in particular I've gotten two emails from "tms" that I thought deserved more widespread notice and commentary.

I'm happy to give full credit to "tms" for his comments, but thus far I haven't heard back from him saying it was OK to do so; that said, his points are valid, and I think important for the rest of the world to hear, so I'm posting this under a pseudonym until he gives permission to offer up his real name.

In his first note, tms says....

I appreciated the (F#) article. I would like to point out one error.

You wrote:

Like many functional languages, F# permits currying ...

let add5 a =
    add a 5

Your example does not demonstrate currying well, as it could be written in any non-currying language such as C#. (It is indeed an "idiom" that one uses in C# to manually do the equivalent of currying, where desired.)

Here are two statements, either of which would demonstrate currying:

    let add5 = add 5

    5 |> (add 5)

Neither of these two statements has any direct equivalent in C#, because C# lacks the concept of currying.
What is significant about these statements is "add 5" -- the use of add with only one of its two parameters. This is the essence of currying. It takes a function that requires n parameters, and directly turns it into a function that requires n-1 parameters, with no need to name or otherwise talk about the "missing" parameter.

Agreed, but even there, it's possible to do in C# with the use of (multiples of) anonymous methods. For example, the "add5" example you use can be seen as something akin to this:

    // Note this has not been compiled with anything except the
    // Neward & Associates Blog Compiler (i.e., my eyes)
    public class Container
    {
        // Static, so the field initializers below can refer to it
        public static int add(int a, int b) { return a + b; }

        // This is the simple, hard-coded version
        public static int add5(int b) { return add(5, b); }

        // This is the more complex approach that arguably is closer to F#
        public delegate int AddMethod(int a, int b);
        public static AddMethod Add = new AddMethod(add);

        // Add5 fixes the first argument of Add at 5
        public delegate int Add5Method(int b);
        public static Add5Method Add5 = new Add5Method(b => Add(5, b));
    }

Your second example, using the pipeline operator can, in fact, also be done using C# and a well-established set of delegate types arranged into a pipeline, a la how PowerShell passes objects (or lists of objects) from one Cmdlet to another....

... but your point is still well taken; there are much better examples of currying in the world. Don Syme (who tech-reviewed the article) openly questioned whether or not currying was a good thing to bring up in this introduction, and I argued that I thought it was necessary to at least open the subject in order to explain some of the inherent power of functional programming (and, by extension, some of the motivation for learning F#).
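For anyone who wants the distinction side by side, here's a minimal sketch. It assumes add is the obvious two-argument function (its definition isn't shown in the excerpt above), and add5Explicit and piped are names made up purely for illustration.

```fsharp
// Assumed definition of add; F# infers int -> int -> int.
let add a b = a + b

// The wrapper from the article: it names its parameter explicitly,
// which is something any language with functions can do.
let add5Explicit a = add a 5

// Partial application: supply only the first argument and get back
// a function of type int -> int. The missing parameter is never named.
let add5 = add 5

// The same partially applied function on the right of the pipeline operator.
let piped = 5 |> add 5

printfn "%d %d %d" (add5Explicit 10) (add5 10) piped   // prints: 15 15 10
```

The only difference between add5Explicit and add5 is whether the second parameter ever gets a name, which is exactly the point tms is making.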

Net result: there is some smoothing of the story on F# yet to be done. You only find this out from presenting a story to an audience, hearing their feedback, and iterating on it further.

 

In his second note, tms points out

Your evolution of

let results = [ for i in 0 .. 100 -> (i, i*i) ]

into:

let compute2 x = (x, x*x)
let compute3 x = (x, x*x, x*x*x)
let results2 = [ for i in 0 .. 100 -> compute2 i ]
let results3 = [ for i in 0 .. 100 -> compute3 i ]

I think this could use a better explanation about what is being shown.

When I first read it, my reaction was:

'I can do the same thing in C# -- you just replaced an expression in the language, "(i, i*i)" with a function that returns the value of that expression, "compute2 i".'

It wasn't until I sat down to write the C# equivalent that I saw what the benefit is: in F# it is easy to define functions anywhere. In C#, the code would have occurred somewhere in a method of some class, so if "compute2" were a static method on the same class, it would be just as easy to use -- it would simply be "compute2(i)". But in C# I can't embed it as is done in F#. Somewhere else in the class I have to add the function:

    // This is C#
    class MyClass {

      method SomeMethod() {
        ...
        result.Push( Pair(i, i*i) )
        ...
      }
    }

== can be turned into ==>

    class MyClass {
      ...
      static method compute2 (int a) { return Pair(a, a*a); }
      ...

      method SomeMethod() {
        ...
        result.Push( compute2(i) )
        ...
      }
    }

It would be really cool if C# let you define a function locally, something like:

    class MyClass {

      method SomeMethod() {
        ...
        function compute2 (int a) { return Pair(a, a*a); }
        result.Push( compute2(i) )
        ...
      }
    }

Is that the benefit you were describing?

Weeeelll..... I'd like to say that was the case, but in truth, I don't think I had that in mind when I was writing the article. In fact, it's a bit hard, looking back, to say exactly what I had in mind during that particular section of the article, except perhaps to try and explain a bit more of the F# syntax. I think what I was trying to do was show how functions could be used in a higher-order manner, but in a simple (arguably trivial) way, which, in retrospect, doesn't really do the concept of higher-order functions much justice. I'd like to use as my excuse the technical writer's traditional escape, which is to say, "Hey, you try explaining a complex concept in 5000 words, along with introducing basic syntax and still make it relevant to the audience", but in truth, that's just an excuse, and I admit it. *sigh* Fortunately, folks like you are around to point out the flaws in my prose, and (hopefully) make it stronger the next time around. :-)
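For what it's worth, here is a rough sketch of what a more honest higher-order example might have looked like, with tms's "define it locally" point folded in. The tabulate helper, the double function and the evens value are all hypothetical names invented for this sketch; none of them appear in the article.

```fsharp
// Hypothetical helper: tabulate is higher-order because its first
// parameter, f, is itself a function.
let tabulate f upper = [ for i in 0 .. upper -> f i ]

let compute2 x = (x, x * x)
let compute3 x = (x, x * x, x * x * x)

// The same list-building code, handed two different functions.
let results2 = tabulate compute2 100
let results3 = tabulate compute3 100

// A function can also be defined locally, right where it is needed.
let evens =
    let double x = x * 2
    tabulate double 5

printfn "%A" results3.[3]   // prints: (3, 9, 27)
printfn "%A" evens          // prints: [0; 2; 4; 6; 8; 10]
```

Here the comprehension is written once and the function is the thing that varies, which is closer to what "higher-order" is supposed to mean.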

The other thing to remember, too, is that, as with most language comparisons, it isn't so much a matter of what I can or can't do in a particular language vis-a-vis a different language (F# vis-a-vis C#, in this case), but more a question of "What does this language allow me to express as a first-class concept that the other one forces me to express via much lower-level constructs?" Just about everything that F# offers can be replicated in C#--thanks in no small part to anonymous methods/lambdas, to be frank--but doing so forces the C# developer to write much of the scaffolding that has to be in place. (If you think about it, this has to be true, at least at some level, because both F# and C# run on top of the CLR, which means they each have to 'boil down' to CIL at some level, and given the relatively high level of fidelity between C# and CIL, almost any construct expressed in CIL can be 'redrawn' in C#, if we're willing to.)

Case in point: consider the snippet tms calls out above:

let compute2 x = (x, x*x)
let compute3 x = (x, x*x, x*x*x)
let results2 = [ for i in 0 .. 100 -> compute2 i ]
let results3 = [ for i in 0 .. 100 -> compute3 i ]

If we take this snippet and run it through the F# compiler grinder, then look at the results in ILDasm, we get an interesting comparison of how F#'s first-class support for functions maps into C#'s view of the world.

First, ILDasm:

[Screenshot: fsharp-ildasm (ILDasm view of example.exe)]

(You'll note I spared you the huge text dump of "ildasm /out:example.il example.exe", since that would have more noise than signal. Feel free to perform the experiment on your own, if you'd like to see the raw output.)

As you can see here, the F# "top-level" code gets stored into a static method _main stored in the class "<StartupCode$example>" in the namespace "<StartupCode$example>", and yes, _main() is marked with the CIL ".entrypoint" directive, telling the CLR that this is where life begins for this particular assembly. Notice as well how the filename becomes the class "container" for the functions defined therein (the class "Example"), and the functions in particular--compute2() and compute3()--are exported as public static methods. You can see, however, that their parameter types are definitely more complex than the form we would use in traditional idiomatic C#, tuples instead of a list of individual parameters, which tms tries to keep fidelity to in his pseudo-C# translation. The "results2" and "results3" identifiers are in turn kept as properties, exposed on the Example class, and to top it off, are actually defined (not once, but twice) as nested classes of the Example class, because these are, in fact, lists of results, not a single result.

I could go on, but frankly, the noise would begin to swamp the signal. I leave the exercise of opening example.exe in Reflector up to the interested reader. (If you're even remotely interested in F#, I highly recommend doing so once or twice, just to get an idea of how much scaffolding and infrastructure F# is putting into place for you. It's also incredibly useful for when you're trying to figure out C#-calling-F# interop issues.) It's particularly interesting to walk the path of how results2 gets generated, and how wildly different that is from the traditional C# "for" loop. It turns out that everything I'm doing in the code snippet above can be done in C#, but wow, why would you want to? Particularly if you want to get exactly the same kind of fidelity to side effects (that is to say, none at all) that the F# approach gives you?
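To make the contrast concrete without wading through the IL, here's a rough sketch in F# itself. The second half is a hand-written transliteration of the kind of loop a C# programmer would reach for, not what the compiler actually emits, and results2Loop is a made-up name.

```fsharp
let compute2 x = (x, x * x)

// The comprehension from the article: no mutable state anywhere.
let results2 = [ for i in 0 .. 100 -> compute2 i ]

// Roughly what a C#-style loop looks like when transliterated:
// a mutable collection that each iteration changes in place.
let results2Loop = ResizeArray<int * int>()
for i in 0 .. 100 do
    results2Loop.Add(compute2 i)

// Both hold the same 101 pairs.
printfn "%b" (results2 = List.ofSeq results2Loop)   // prints: true
```

The results are the same; the difference is that the second version only works by mutating results2Loop as it goes, which is precisely the side effect the comprehension avoids.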

Both are excellent points, tms, and thanks for taking the time to offer feedback.


.NET | F# | Languages | Windows

Tuesday, April 01, 2008 12:15:43 AM (Pacific Daylight Time, UTC-07:00)
 Sunday, March 30, 2008
Leopard broke my MacBook Pro's wireless!

So I took the plunge and installed Leopard onto my MacBook Pro tonight, and as of right now, I'm not a happy camper.

The installation started off well enough--pop in the DVD, bring up the installer, double-click, answer a few form fields, then wait as it verifies the DVD, reboots into the CD-launched installer again, answer a few form fields, then sit and read my latest copy of Ellery Queen Mystery Magazine while the installation completes. Roughly an hour or so later, it's done, and I have a bright and shiny new Leopard installation on my Mac. Yay.

Software Update tells me there are a few things that need updating--sure, that makes sense, since I think the latest version of Leopard is actually now 10.5.2, so go ahead.

Bad move.

Ever since that update, any attempt to join my home wireless network fails miserably. AirPort can clearly see the network--it discovers the SSID without a problem--but joining it yields no love. The error that shows up in the console log is always this pair:

airportd Error: Apple80211Associated() failed -6

_emUIServer Error: airport MIG failed = -6 ((null) port = 60027)

I've tried several things suggested in the Apple forums, from changing the order of connected systems to put the AirPort on the top, to clearing out my list of remembered SSIDs, to turning the AirPort off and back on again, to downloading the Time Machine upgrade and installing it, even to blowing out the PRAM on boot. Nothing doing.

Tomorrow we make a trip to the Apple Genius Bar to see what those geniuses have to say, but I'm not optimistic. I will update this blog and apologize profusely if I'm wrong, of course, but given the number of unsuccessful support calls that people are lamenting, I'm guessing this will be one of those "Well, if you want to ship it back to the factory, sir, ...." responses, which is NOT an option.

Well... OK, it is an option, given that I do most everything in VMWare images, sure, but the thought of going back to my T42p (with only 1.5GB of RAM on it, compared to the full 4GB on my MBP) is not endearing to me, particularly because Vista has a problem with releasing the USB hard drives that I store most of the VMWare images on....

Somebody please tell me they have an easy fix for this, one which Googling has not yet revealed....

Update: So I took my MBP into the Apple Store... and, naturally, the wireless on the MBP picks up the Apple Store's network just fine. Grrr. Regardless, I had them do an "archive and install" of 10.5.1 onto the machine, and when I got home... perfect! Sees and connects to my home wireless without a hitch. So I'd suggest for those who recently moved up to 10.5.2, try dropping back to 10.5.1 and see if that solves the problem.

Meanwhile, I think I'll be holding off on doing the 10.5.2 update for a while. Of course, that probably also means I can't do the iPhone SDK, so I may try the update once more just to see if it'll take this time, and if it doesn't, then off to the Apple Store again for the 10.5.1 re-install. But at least this time, I'd know what the viable solution is. (I hope....)


Mac OS | VMWare

Sunday, March 30, 2008 5:15:52 AM (Pacific Daylight Time, UTC-07:00)
 Saturday, March 29, 2008
The torrent has begun...

Not the BitTorrent of some particular movie or game, but the torrent of changes to the JDK that were held up pending a final blessing on the OpenJDK Mercurial transition. How do I, a non-Sun employee, know this? Because I'm subscribed to the build-dev mailing list (which seems to be getting the Mercurial changeset notification emails), and on Wednesday (March 26th), one such email contained 72 new changesets, ranging from extensions to the query API for JMX 2.0:

6602310: Extensions to Query API for JMX 2.0

6604768: IN queries require their arguments to be constants

Summary: New JMX query language and support for dotted attributes in queries.

to bug fixes in javaw.exe for the Windows JRE:

6596475: (launcher) javaw should call InitCommonControls

Summary: javaw does not show error window after manifest changes.

to some changes to the Process class to better allow for IO redirection:

4960438: (process) Need IO redirection API for subprocesses

and more beyond that.

I have to say, I'm getting a little giddy watching all these things flow into the JDK--it's been a while since I just sat and watched the build notification messages on a large project like this, and it always gives me this weird sense of accomplishment, even though it's not work that I'm doing or arguably even care about. But it should stand as a clear sign to anybody who thinks Java-the-platform is "done"--the guys at Sun certainly don't think so, and more importantly, are putting in the effort to improve it.

Except now, we can see the work being done, which makes all the difference in the world.

Some of you may remember that on several speaker panels I was on, I was a bit bearish (on the surface of things) about the OpenJDK process. I think my exact comments were, "I think for the majority of Java developers, this is a 'No Big Deal, Move Along, Nothing to See Here' kind of step." I still believe that, in fact: I believe that to the vast majority of Java developers, the fact that anybody can now see the sausage being made yields no real advantage, and therefore is of no real interest.

But to the handful of Java developers who refuse to see the JVM or the Java libraries (or even the Java compiler) as a black box, this is huge. We can now not only post the bugs that we run across during development but, more importantly, subscribe to the mailing lists, watch for the bug fix notification, apply the Mercurial changeset that patches the bug, and, if the patch doesn't work, notify Sun. And if the patch does work, not only can we confirm the bug's elimination, but we can move beyond it, even before the production release of the next Java build. It may not be something you do on a regular basis, but when you're completely blocked waiting for a bug fix from Sun...

... that's huge.


Java/J2EE | Languages

Saturday, March 29, 2008 2:17:58 AM (Pacific Daylight Time, UTC-07:00)
 Friday, March 28, 2008
eWeek posts a review of the TSSJS Languages Keynote

Quick note before I head off to the conference center to do the Scala talk: Darryl Taft covers the "Why the Next Five Years..." keynote from TSSJS. Thanks, Darryl.

Update: Just noticed that Darryl also covered Brian's and my "SOAP and REST" talk, as well.




Friday, March 28, 2008 11:35:37 AM (Pacific Daylight Time, UTC-07:00)
Hangin' in Vegas

I hate Las Vegas.

I'm here for TheServerSide Java Symposium 2008, which has been held here in Vegas for the past (umm... three? four? five?) years, and every time I come here I'm reminded why I really don't like Vegas. It's loud, both in auditory volume and visual noise, it's boisterous bordering on raunchy, the locals are almost always soured by their near-constant exposure to tourists, the tourists are... well, they're American tourists and that says a lot right there, and there's no way to escape it. Ugh.

Fortunately for me, the hotels have conveniently painted a nice blue sky on the roof (in the Venetian, where the conference is held) so I don't have to go outside to see if it's sunny, they provided a nice winding river of bright neon blue water/Windex to have our leisurely cafe lunch next to, and no fake recreation of Venice would be complete without fake gondolas poled by fake gondoliers singing to tourists on the fake Windex river that's all of about two minutes in ride length before they have to do a U-turn and pole back the other way.

Wow, it's all so magical.

About the only thing that makes Vegas palatable is some of the shows you can catch here, like one of Cirque du Soleil's six (!) different presentations going on here. But, of course, you must be careful when you buy tickets, or the guy at the concierge desk will start finding tickets for you, only to discover later that he thought you said "Tah", meaning "Tom Jones", when you said, "Ka", the Cirque du Soleil show, because my California accent is too thick to be understood.

I hate Las Vegas.

The upshot is that when I'm here for this show, I get to hang out with some cool people, NFJS speaker alum and otherwise. Brian Sletten and I did a tag-team talk on SOAP and REST that was billed to be controversial but probably disappointed the crowd in that we (a) didn't throw any punches at one another, (b) didn't really proclaim a "victor" between the two, and (c) laid down some basic rules for when to look to a RESTful approach and when to take advantage of the existing SOAP-based infrastructure that is currently SOAP's greatest strength.

Note to those who didn't attend the session: you didn't hear me say it, so I'll repeat it: I hate WSDL almost as much as I hate Las Vegas. Ask me why sometime, or if I get enough of a critical mass of questions, I'll blog it. If you've seen me do talks on Web Services, though, you've probably heard the rant: WSDL creates tightly-coupled endpoints precisely where loose coupling is necessary, WSDL encourages schema definitions that are inflexible and unevolvable, and WSDL intrinsically assumes a synchronous client-server invocation model that doesn't really meet the scalability or feature needs of the modern enterprise. And that's just for starters.

I hate WSDL.

I still hate Vegas more, though.

Meanwhile, Glenn Vanderberg, NFJS speaker alum and current Chief Scientist over at Relevance, pulled me aside for a few minutes to show me how to build apps for the iPhone using the newly-released iPhone SDK (something I'd asked about once before and that he's been exploring recently). We basked in the glory that is Objective-C (now there's a language that should have gotten more traction than it did, IMHO), and then in the glory that is the iPhone (OpenGL, OpenAL, which I didn't know but Glenn tells me is basically like an audio-equivalent library for OpenGL), and then we swapped some ideas about what people might do with the iPhone now that the SDK is available. I've always been pretty bullish on the mobile device market, and I still am, but the iPhone might be the turning point in that space. I'll reserve judgment for now, and just enjoy hacking on my own for the time being. :-)

Neal Ford did the Wednesday morning keynote, and I got the chance to present "Why the Next Five Years Will Be About Languages" after lunch today, which seemed to go over well, at least based on what the attendees who came up to me afterwards were saying. (Of course, that's always a biased assessment, since the ones who hate it are hardly likely to come up and tell me that, so I always take that statistic with a grain of salt.) They videoed it, so I imagine it'll be online before long.

Of course, TSS wouldn't be TSS without speaker panels saying really controversial things... but I wouldn't know about them this year, I wasn't on any. (Perhaps the conference organizers finally took everybody's advice...)

Tomorrow (well, actually, today as I write this, since I'm up way too late as usual) I'll be doing a talk on Scala, having dinner with a few friends, then off to McCarran airport and home. I don't think the Scala talk will be taped, but you can catch me doing much the same stuff (well, as much as it ever is the same stuff when I speak, since I mostly make everything up on the fly anyway) at the NFJS symposium near you, so you don't have to come to Vegas to hear about it in between ducking packs of drunk twenty-something guys chasing packs of drunk twenty-something girls all the while dodging the attentions of finger-snapping sidewalk vultures handing out glossy business cards saying "Girls Direct to You".

I so hate Vegas.

Update: Hah, that'll teach me to blog that before the conference is over--Eugene and Joe drafted me into the final panel session of the conference, on "Cross-Cutting Concerns, a Multi-Disciplinary Approach to Java", for which none of us--including our emcees--had any idea what was supposed to be included in such a discussion. Glenn Vanderburg and Patrick Linskey then sought to take a vote and change the topic of the panel to "Shearing off Ted's Ponytail". Fortunately a kind attendee asked a question and we moved on, ponytail intact.

Of course, given that this was Vegas, I probably could have gotten Carrot Top to do it and made some money on the deal.


Conferences | Java/J2EE

Friday, March 28, 2008 4:27:37 AM (Pacific Daylight Time, UTC-07:00)
Rules for Review

Apparently, I'm drawing enough of an audience through this blog that various folks have started to send me press releases and notifications and requests for... well, I dunno exactly, but I'm assuming some blogging love of some kind. I'm always a little leery about that particular subject, because it always has this dangerous potential to turn the blog into a less-credible marketing device, but people at conferences have suggested that they really are interested in what I think about various products and tools, so perhaps it's time to amend my stance on this.

With that in mind, if you are a vendor and have a product that you'd like me to take a look at and (possibly) offer up a review here, here's the basic rules:

  1. No guarantees. Sending me something will in no way guarantee that I will review your product, for several reasons, two of which are (a) I get really busy sometimes, and (b) I may have no interest whatsoever in your product and I refuse to pretend otherwise. (Readers can usually tell when the reviewer isn't all that excited about the subject, I've found.)
  2. If you're not going to send me a "real" version (meaning not the time-locked or feature-crippled demo), don't bother. I have no idea when I will get around to a review, and I have no desire to review something that isn't "the real deal". I will in turn promise that the licensed version you send me (if necessary) will not be used for any purpose other than my own research and exploration (signing contract if necessary to give you that "fresh-from-the-lawyer's-office" warm and fuzzy feeling).
  3. I say what I think, pro and con. I will not edit my review to suit your marketing purpose, and if you ask me to do so I will simply note in the review that you have asked me to do so. I retain full editorial control over what I say about your product.
  4. Having established #1, I will try to be as fair as I can about your product, and point out things that I liked and things that I didn't. (Of course, if I hated it from top to bottom, I may end up with the only positive thing being "It didn't set the atmosphere on fire when I started the app", but hey, that's something positive, right?)
  5. Also in the spirit of #1, if you send me mail answering questions or complaints in my review, I will of course amend the review with your comments. You are always welcome to post comments to the blog entry itself, too. Unless you insult my grandmother, then I will have to get all DELETE-key on you.

The reason I'm posting this here is twofold: one, so my faithful audience of four blog readers will know the rules under which I'm looking at these products and (hopefully) realize that I'm not financially vested in any of these products, and two, so the various vendor folks can read this and know what the rules are up front before even asking.

I know it sounds a little cheeky to lay this out. The image I get in my head is that of the kid at Christmas declaring to his grandparents as they walk through the door, presents in hand, "Make sure it's not a scratchy sweater, I hate scratchy sweaters. And G.I. Joe was only popular when my Dad was a kid. And if you give me another lunchbox I will scream until you buy me something cool, like a new GameBoy." Ugh. But I value the trust that people seem to have in me, and so I risk the perception of cheekiness for this tiny window in time in order to (hopefully) establish full disclosure over the reviews that come to pass (which, by the way, will always have the category "review" applied to them, so you know which is an official review and which is just me exploring, like the LLVM and Parrot posts of recent time).

We now return you to the regularly-scheduled blog.


.NET | C++ | Flash | Java/J2EE | Languages | LLVM | Mac OS | Parrot | Reading | Review | Ruby | Security | Solaris | VMWare | Windows | XML Services

Friday, March 28, 2008 4:18:12 AM (Pacific Daylight Time, UTC-07:00)
Lang.NET 2008 videos back online

For those who were skimming my blog looking for the notification that the Lang.NET 2008 Symposium videos were back online, look no further.


.NET | Java/J2EE | Languages | Windows

Friday, March 28, 2008 3:55:49 AM (Pacific Daylight Time, UTC-07:00)
 Saturday, March 22, 2008
Reminder

A couple of people have asked me over the last few weeks, so it's probably worth saying out loud:

No, I don't work for a large company, so yes, I'm available for consulting and research projects. If you've got one of those burning questions like, "How would our company/project/department/whatever make use of JRuby-and-Rails, and what would the impact to the rest of the system be", or "Could using F# help us write applications faster", or "How would we best integrate Groovy into our application", or "How does the new Adobe Flex/AIR move help us build richer client apps", or "How do we improve the performance of our Java/.NET app", or other questions along those lines, drop me a line and let's talk. Not only will I cook up a prototype describing the answer, but I'll meet with your management and explain the consequences of the research, both pro and con, for them to evaluate.

Shameless call for consulting complete, now back to the regularly-scheduled programming.


.NET | C++ | Conferences | Development Processes | Flash | Java/J2EE | Languages | LLVM | Mac OS | Parrot | Reading | Ruby | Security | Solaris | VMWare | Windows | XML Services

Saturday, March 22, 2008 3:43:18 AM (Pacific Daylight Time, UTC-07:00)
 Thursday, March 20, 2008
Eclipse gets some help... building Windows apps... from Microsoft?

This delicious little tidbit just crossed my desk, and for those of you too scared to click the link, check this out:

Microsoft will begin collaborating with the Eclipse Foundation to improve native Windows application development on Java.

Sam Ramji, the director of Microsoft's open-source software lab, announced at the EclipseCon conference in Santa Clara, Calif., on Wednesday that the lab will work with Eclipse.

The goal of the joint work, which will include contributions from Microsoft engineers, is to make it easier to use Java to write applications that take full advantage of the look and feel of Windows Vista. Ramji wrote about the planned collaboration on Microsoft's Port25 blog.

"Among a range of other opportunities (which we're still working on), we discovered that Steve Northover (the SWT team lead) had gotten requests to make it easy for Java developers to write applications that look and feel like native Windows Vista. He and a small group of developers built out a prototype that enables SWT to use Windows Presentation Foundation (WPF). We're committing to improve this technology with direct support from our engineering teams and the Open Source Software Lab, with the goal of a first-class authoring experience for Java developers," he wrote.

The move builds on several initiatives coming from Microsoft's open-source software labs to ensure that open-source products work well on Windows and other Microsoft products.

My first reaction has to be characterized as... WTF?!?

My second reaction has to be characterized as... WTF?!?

There's some serious credulity issues here. Not credibility, mind you, because I believe the reporter is entirely accurate in this story, but credulity. As in, "That's incredulous!", which is another way of saying...

WTF?!?

First, it hasn't been that long since Microsoft and Java were actively trying to beat one another into something vaguely resembling... well, resembling either a strawberry after that oh-so-ill-advised blind date with a blind steamroller, or else the end-product of the local butcher's sausage grinder. I seriously doubt anybody's memory has lapsed so deeply that they forget the rather nasty shooting war that erupted over J++ and Microsoft's Application Foundation Classes.

(For those of you who weren't writing Java code at the time, AFC was Microsoft's variation on AWT designed to make it easier to write native Windows apps, making heavy use of a language/library construct that was an extension to Java, known as "delegates". Yes, those same delegates that appeared in C# a few years later, and those same delegates that became the core implementation behind C# 2.0's anonymous methods and C# 3.0's lambda expressions, and arguably the same delegates that everybody is looking to incorporate into the Java language today. Funny how things turn out, no?)

Second, Microsoft partnering with IBM (yes, I know, the news piece says Eclipse, but who runs most of the Eclipse projects? IBM is to Eclipse what Sun is to the JCP, folks) to do this is just not going to make the whole IBM-Sun rift any smoother, or calm the turbulent waters in the Java ecosystem any further. Granted, SWT is the logical place for Microsoft to go when trying to make it easier for Java devs to write Windows apps (which, by the way, was always a core principle behind the design and implementation of the CLR, which is why the CLR has such a powerful and simple P/Invoke and COMInterop story), but the last thing Microsoft wants at this point, it would seem to me, is more controversy around it and Java. After all, how hard would it be for Sun to haul them into court again, claiming that this somehow violates the Microsoft/Sun peace agreement of a few years ago?

And while I applaud the fact that Microsoft is looking for ways to contribute to the open-source space, it just seems to me that there were a lot of other places they could have gone to start doing so without incurring this kind of reaction. Go write a standard Perl implementation, for example, or, even better, do a "Visual Lisp" and integrate it on top of the CLR, if you want to make a mark in the open-source world. There's thousands of places the gathering-steam Microsoft open-source direction could have gone, with far greater success for both the open-source community and Microsoft. The skin here is just too sensitive and the past wounds just too raw for this company to go rubbing elbows up against this space again.

Oh, just as a footnote, in case you're looking for more reasons to dislike the JBoss guys....

"It just makes sense to enable Java on Windows. We started a collaborative effort with JBoss two years ago that continues to this day. At the end of the day, it's all about the developer," Ramji said.

See? They sold out a long time ago!

*grin*


.NET | Java/J2EE | Languages | Windows

Thursday, March 20, 2008 2:35:35 AM (Pacific Daylight Time, UTC-07:00)
Cirque du Soleil for geeks

Watch this guy beat calculators, doing two-, three- and then four-digit squares in his head. Have a look if you ever thought you were good at doing numbers in your head. Have a look even if you're of the opposite extreme.

(I'm sure there's some other tricks in his head he's using to be able to do this, but the net effect is still impressive, regardless.)




Thursday, March 20, 2008 12:10:12 AM (Pacific Daylight Time, UTC-07:00)
 Saturday, March 15, 2008
The reason for conferences

People have sometimes asked me if it's really worth it to go to a conference these days, given that so much material is appearing online via blogs, webcasts, online publications and Google. I think the answer is an unqualified "yes" (what else would you expect from a guy who spends a significant part of his life speaking at conferences?), but not necessarily for the reasons you might think.

A long time ago, Billy Hollis said something very profound to me: "Newbies go to conferences for the technical sessions. Seasoned veterans go to conferences for the people." At the time, I thought this was Billy's way of saying that the sessions really weren't "all that" at most conferences (JavaOne and TechEd come to mind, for example--whatever scheduling gods think that project managers on a particular project make good technical speakers on that subject really need to be taken out back and shot), and that you're far better off spending the time networking to improve your social network. Now I think it's for a different reason. By way of explanation, allow me to recount a brief travel anecdote.

I spend a lot of time on airplanes, as you might expect. Periodically, while staring out the window (trying to rearrange words in my head in order to make them sound coherent for the current email, blog entry, book chapter or article), I will see another commercial aircraft traveling in the same air traffic control lane going the other way. Every time I see it, I'm simply floored at how fast they appear to be going--they usually don't stay within my visibility for more than a few seconds. "Whoosh" is the first thought that goes through my easily-amused consciousness, and then, "Damn, they're really moving." Then I realize, "Wow--somebody on that plane over there is probably looking at this plane I'm on, and thinking the exact same thing."

This is why you go to conferences.

In the agile communities, they talk about velocity, the amount of work a team can get done in a particular iteration. But I think teams need to have a sense of their velocity relative to the rest of the industry, too. It helps put things into perspective. All too often, I find teams that look at me in meetings and conference calls and say, "Surely the rest of the industry isn't this bad, right?" or "Surely, somebody else has found a solution to this problem by now, right?" or "Please, dear God, tell me this kind of WTF-style of project management is unique to my company". While I am certainly happy to answer those questions, the fact of the matter is, at the end of the day they're still left taking my word for it, and let's be blunt: my answer can really only cover those companies and/or project teams I've had direct contact with. I can certainly tell you what I've heard from others (usually at conferences), but even that's now getting into secondhand information, which to you will be third-hand. (And that, of course, assumes I'm getting it from the source in the first place.)

This isn't just about project management styles--agile or waterfall or WHISCEY (Why the Hell Isn't Somebody Coding Everything Yet) or what-have-you--but also about technical trends. Is Ruby taking off? Is Scala becoming more mainstream? Is JRuby worth exploring? Is C++ making a comeback? What are people's experiences with Spring 2.5? Has Grails reached a solid level of performance and/or stability? Sure, I'm happy to come to your company, meet with your team, talk about what I've seen and heard and done--but sending your developers (and managers, though *ahem* preferably to conferences that aren't in Las Vegas) to a conference like No Fluff Just Stuff or JavaOne or TechEd or SD West can give them some opportunities to swap stories and gain some important insights about your team's (and company's) velocity relative to the rest of the industry.

All of which is to say, yes, Billy was right: it's about the people. Which means, boss, it's OK to let the developers go to the parties and maybe sleep in and miss a session or two the next morning.


Conferences | Development Processes

Saturday, March 15, 2008 8:58:52 AM (Pacific Daylight Time, UTC-07:00)
Mort means productivity

Recently, a number of folks in the Java space have taken to openly ridiculing Microsoft's use of the "Mort" persona, latching on to the idea that Mort is somehow equivalent to "Visual Basic programmer", which is itself somehow equivalent to "stupid idiot programmer who doesn't understand what's going on and just clicks through the wizards". This would be a mischaracterization, one which I think Nikhilik's definition helps to clear up:

Mort, the opportunistic developer, likes to create quick-working solutions for immediate problems and focuses on productivity and learn as needed. Elvis, the pragmatic programmer, likes to create long-lasting solutions addressing the problem domain, and learn while working on the solution. Einstein, the paranoid programmer, likes to create the most efficient solution to a given problem, and typically learn in advance before working on the solution. ....

The description above is only rough summarization of several characteristics collected and documented by our usability folks. During the meeting a program manager on our team applied these personas in the context of server controls rather well (I think), and thought I should share it. Mort would be a developer most comfortable and satisfied if the control could be used as-is and it just worked. Elvis would like to able to customize the control to get the desired behavior through properties and code, or be willing to wire up multiple controls together. Einstein would love to be able to deeply understand the control implementation, and want to be able to extend it to give it different behavior, or go so far as to re-implement it.

Phrased this way, it's fairly easy to recognize that it's possible that these are more roles than actual categorizations for programmers as individuals--sometimes you want to know how to create the most efficient solution, and sometimes you just want the damn thing (whatever it is) to get out of your way and let you move on.

Don Box called this latter approach (which is a tendency on the part of all developers, not just the VB guys) the selective ignorance bit: none of us have the time or energy or intellectual capacity to know how everything works, so for certain topics, a programmer flips the selective ignorance bit and just learns enough to make it work. For myself, a lot of the hardware stuff sees that selective ignorance bit flipped on, as well as a lot of the frou-frou UI stuff like graphics and animation and what-not. Sure, I'm curious, and I'd love to know how it works, but frankly, that's way down the priority list.

If trying to find a quick-working solution to a particular problem means labeling yourself as a Mort, then call me Mort. After all, raise your hand if you didn't watch a team of C++ guys argue for months over the most efficient way to create a reusable framework for an application that they ended up not shipping because they couldn't get the framework done in time to actually start work on the application as a whole....


.NET | C++ | Development Processes | Java/J2EE | Languages | Ruby

Saturday, March 15, 2008 8:57:39 AM (Pacific Daylight Time, UTC-07:00)
 Tuesday, February 26, 2008
Lang.NET videos are now online

If you read the three days of Lang.NET posts I did last month and wondered "Man, I wish I could've seen...", fret no more.

My personal favorites:

Of course the other presentations are good, but each of these had a moment in them when I said, "Hmm...."




Tuesday, February 26, 2008 7:49:40 PM (Pacific Standard Time, UTC-08:00)
 Sunday, February 24, 2008
Apropos of nothing: Job trends

While tracking some of the links relating to the Groovy/Ruby war, I found this website, which purportedly tracks job trends based on a whole mess of different job sites. So, naturally, I had to plug in to get a graph of C#, C++, Java, Ruby, and VB:

Interesting. I don't think it proves anything one way or another, mind you, but interesting nonetheless. Having said that, a few things stand out to me after looking at this for all of thirty seconds:

  • Wow, what the hell happened in 1Q and 2Q of 2005? Java takes a huge drop in 2005, and all of them take a small drop of some form around the same time in 2006. What is it with summertime? Did the HR supervisor suddenly take a look at the company's job board and mutter, "Damn, I thought we closed all those listings already..."? (Or maybe, "Thank God for cheap college interns..."?)
  • C++ jobs still outnumber C# jobs, even in 4Q 2007?
  • C++ jobs remain essentially flat from 1Q 2005 to 4Q 2007; apparently, there's a lot more C++ going on than most companies are willing to admit to.... (Can't you picture it? The nervous candidate, sitting at the table, as the interviewer shuffles the paper and says, "So, you're here for a programming job?" The candidate sort of squirms in his chair as he replies, "Well, actually, I was hoping for a... a... C++ job." The interviewer quickly looks around to see who might be listening as he says loudly, "C++? What ever gave you the idea that we do C++ here at BigCorp?" Meanwhile, he surreptitiously scribbles on the back of a business card and slides it across the table to the candidate, then stands up and says loudly, "I'm afraid you've come to the wrong place, sir. You can see yourself out, I take it?" The candidate palms the card, and only once he has left the building does he look at the back, which reads, "8PM, corner of Mission and Vine, password is 'Lippman, Stroustrup, Sutter, and Meyers!' Viva C++!"...)
  • VB jobs fall to below C#? So much for those vast hordes of VB programmers that supposedly form the "long tail" of the .NET community....
  • Java jobs remain essentially flat from 1Q 2005 to 4Q 2007, despite numerous ups and downs. So much for the idea that Java is somehow going away....
  • Ruby's penetration into the job market is much smaller than what I would have guessed.
  • I couldn't help myself, I did another query with "cobol" added in, but I'll leave it to you to run your own query to see what that looks like. It's surprising....

Of course, statistics without any sort of understanding of how they were gathered or from what sources are essentially meaningless, but ooooh, it's in color....


.NET | C++ | Java/J2EE | Languages | Ruby

Sunday, February 24, 2008 9:33:02 PM (Pacific Standard Time, UTC-08:00)
Some interesting tidbits about LLVM

LLVM definitely does some interesting things as part of its toolchain.

Consider the humble HelloWorld:

#include <stdio.h>

int main() {
  printf("hello world\n");
  return 0;
}

Assuming you have llvm and llvm-gcc working on your system, you can compile it into LLVM bitcode (hello.bc in the commands that follow). This bitcode is directly executable using the lli.exe from llvm:

$ lli < hello.bc
hello world

Meh. Not so interesting. Let's look at the LLVM bitcode for the code, though--that's interesting as a first peek at what LLVM bitcode might look like:

; ModuleID = '<stdin>'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64"
target triple = "mingw32"
@.str = internal constant [12 x i8] c"hello world\00"        ; <[12 x i8]*> [#uses=1]

define i32 @main() {
entry:
    %tmp2 = tail call i32 @puts( i8* getelementptr ([12 x i8]* @.str, i32 0, i32 0) )        ; <i32> [#uses=0]
    ret i32 0
}

declare i32 @puts(i8*)

Hmm. Now of course, LLVM also has to be able to get down to actual machine instructions, and in point of fact there is a tool in the LLVM toolchain, called llc, that can do this transformation ahead-of-time, like so:

$ llc hello.bc -o hello.bc.s -march x86

And, looking at the results, we see...

    .text
    .align    16
    .globl    _main
    .def     _main;    .scl    2;    .type    32;    .endef
_main:
    pushl    %ebp
    movl    %esp, %ebp
    subl    $8, %esp
    andl    $4294967280, %esp
    movl    $16, %eax
    call    __alloca
    call    ___main
    movl    $_.str, (%esp)
    call    _puts
    xorl    %eax, %eax
    movl    %ebp, %esp
    popl    %ebp
    ret
    .data
_.str:                # .str
    .asciz    "hello world"
    .def     _puts;    .scl    2;    .type    32;    .endef

Bleah. Assembly language, and in AT&T (GNU assembler) syntax, to boot. (What did you expect, anyway?)

Of course, assembly language and C were always considered fairly close together in terms of their abstraction layer (C was designed as a replacement for assembly language when porting Unix, remember), so it might not be too hard to...

$ llc hello.bc -o hello.bc.c -march c

And get...

/* Provide Declarations */
#include <stdarg.h>
#include <setjmp.h>
/* get a declaration for alloca */
#if defined(__CYGWIN__) || defined(__MINGW32__)
#define  alloca(x) __builtin_alloca((x))
#define _alloca(x) __builtin_alloca((x))
#elif defined(__APPLE__)
extern void *__builtin_alloca(unsigned long);
#define alloca(x) __builtin_alloca(x)
#define longjmp _longjmp
#define setjmp _setjmp
#elif defined(__sun__)
#if defined(__sparcv9)
extern void *__builtin_alloca(unsigned long);
#else
extern void *__builtin_alloca(unsigned int);
#endif
#define alloca(x) __builtin_alloca(x)
#elif defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__)
#define alloca(x) __builtin_alloca(x)
#elif defined(_MSC_VER)
#define inline _inline
#define alloca(x) _alloca(x)
#else
#include <alloca.h>
#endif

#ifndef __GNUC__  /* Can only support "linkonce" vars with GCC */
#define __attribute__(X)
#endif

#if defined(__GNUC__) && defined(__APPLE_CC__)
#define __EXTERNAL_WEAK__ __attribute__((weak_import))
#elif defined(__GNUC__)
#define __EXTERNAL_WEAK__ __attribute__((weak))
#else
#define __EXTERNAL_WEAK__
#endif

#if defined(__GNUC__) && defined(__APPLE_CC__)
#define __ATTRIBUTE_WEAK__
#elif defined(__GNUC__)
#define __ATTRIBUTE_WEAK__ __attribute__((weak))
#else
#define __ATTRIBUTE_WEAK__
#endif

#if defined(__GNUC__)
#define __HIDDEN__ __attribute__((visibility("hidden")))
#endif

#ifdef __GNUC__
#define LLVM_NAN(NanStr)   __builtin_nan(NanStr)   /* Double */
#define LLVM_NANF(NanStr)  __builtin_nanf(NanStr)  /* Float */
#define LLVM_NANS(NanStr)  __builtin_nans(NanStr)  /* Double */
#define LLVM_NANSF(NanStr) __builtin_nansf(NanStr) /* Float */
#define LLVM_INF           __builtin_inf()         /* Double */
#define LLVM_INFF          __builtin_inff()        /* Float */
#define LLVM_PREFETCH(addr,rw,locality) __builtin_prefetch(addr,rw,locality)
#define __ATTRIBUTE_CTOR__ __attribute__((constructor))
#define __ATTRIBUTE_DTOR__ __attribute__((destructor))
#define LLVM_ASM           __asm__
#else
#define LLVM_NAN(NanStr)   ((double)0.0)           /* Double */
#define LLVM_NANF(NanStr)  0.0F                    /* Float */
#define LLVM_NANS(NanStr)  ((double)0.0)           /* Double */
#define LLVM_NANSF(NanStr) 0.0F                    /* Float */
#define LLVM_INF           ((double)0.0)           /* Double */
#define LLVM_INFF          0.0F                    /* Float */
#define LLVM_PREFETCH(addr,rw,locality)            /* PREFETCH */
#define __ATTRIBUTE_CTOR__
#define __ATTRIBUTE_DTOR__
#define LLVM_ASM(X)
#endif

#if __GNUC__ < 4 /* Old GCC's, or compilers not GCC */
#define __builtin_stack_save() 0   /* not implemented */
#define __builtin_stack_restore(X) /* noop */
#endif

#define CODE_FOR_MAIN() /* Any target-specific code for main()*/

#ifndef __cplusplus
typedef unsigned char bool;
#endif


/* Support for floating point constants */
typedef unsigned long long ConstantDoubleTy;
typedef unsigned int        ConstantFloatTy;
typedef struct { unsigned long long f1; unsigned short f2; unsigned short pad[3]; } ConstantFP80Ty;
typedef struct { unsigned long long f1; unsigned long long f2; } ConstantFP128Ty;


/* Global Declarations */
/* Helper union for bitcasts */
typedef union {
  unsigned int Int32;
  unsigned long long Int64;
  float Float;
  double Double;
} llvmBitCastUnion;

/* External Global Variable Declarations */

/* Function Declarations */
double fmod(double, double);
float fmodf(float, float);
long double fmodl(long double, long double);
unsigned int main(void);
unsigned int puts(unsigned char *);
unsigned char *malloc();
void free(unsigned char *);
void abort(void);


/* Global Variable Declarations */
static unsigned char _2E_str[12];


/* Global Variable Definitions and Initialization */
static unsigned char _2E_str[12] = "hello world";


/* Function Bodies */
static inline int llvm_fcmp_ord(double X, double Y) { return X == X && Y == Y; }
static inline int llvm_fcmp_uno(double X, double Y) { return X != X || Y != Y; }
static inline int llvm_fcmp_ueq(double X, double Y) { return X == Y || llvm_fcmp_uno(X, Y); }
static inline int llvm_fcmp_une(double X, double Y) { return X != Y; }
static inline int llvm_fcmp_ult(double X, double Y) { return X <  Y || llvm_fcmp_uno(X, Y); }
static inline int llvm_fcmp_ugt(double X, double Y) { return X >  Y || llvm_fcmp_uno(X, Y); }
static inline int llvm_fcmp_ule(double X, double Y) { return X <= Y || llvm_fcmp_uno(X, Y); }
static inline int llvm_fcmp_uge(double X, double Y) { return X >= Y || llvm_fcmp_uno(X, Y); }
static inline int llvm_fcmp_oeq(double X, double Y) { return X == Y ; }
static inline int llvm_fcmp_one(double X, double Y) { return X != Y && llvm_fcmp_ord(X, Y); }
static inline int llvm_fcmp_olt(double X, double Y) { return X <  Y ; }
static inline int llvm_fcmp_ogt(double X, double Y) { return X >  Y ; }
static inline int llvm_fcmp_ole(double X, double Y) { return X <= Y ; }
static inline int llvm_fcmp_oge(double X, double Y) { return X >= Y ; }

unsigned int main(void) {
  unsigned int llvm_cbe_tmp2;

  CODE_FOR_MAIN();
  llvm_cbe_tmp2 =  /*tail*/ puts((&(_2E_str[((signed int )((unsigned int )0))])));
  return ((unsigned int )0);
}

Granted, it's some ugly-looking C code, with all those preprocessor fragments floating around in there, but if you take a few moments and go down to the main() definition, it's C to bitcode to C. We've come full circle.

Looking back at that first disassembly dump, I'm struck by how LLVM bitcode looks a lot like any other high-level assembly or low-level virtual machine language, even reminiscent of MSIL. In fact, there's probably a pretty close correlation between LLVM bitcode and MSIL.

In point of fact, LLVM knows this, too:

$ llc hello.bc -o hello.bc.il -march msil

And check out what it generates:

.assembly extern mscorlib {}
.assembly MSIL {}

// External
.method static hidebysig pinvokeimpl("MSVCRT.DLL")
    unsigned int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'puts'(void* ) preservesig {}

.method static hidebysig pinvokeimpl("MSVCRT.DLL")
    vararg void* modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'malloc'() preservesig {}

.method static hidebysig pinvokeimpl("MSVCRT.DLL")
    void modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'free'(void* ) preservesig {}

.method public hidebysig static pinvokeimpl("KERNEL32.DLL" ansi winapi)  native int LoadLibrary(string) preservesig {}
.method public hidebysig static pinvokeimpl("KERNEL32.DLL" ansi winapi)  native int GetProcAddress(native int, string) preservesig {}
.method private static void* $MSIL_Import(string lib,string sym)
 managed cil
{
    ldarg    lib
    call    native int LoadLibrary(string)
    ldarg    sym
    call    native int GetProcAddress(native int,string)
    dup
    brtrue    L_01
    ldstr    "Can no import variable"
    newobj    instance void [mscorlib]System.Exception::.ctor(string)
    throw
L_01:
    ret
}

.method static private void $MSIL_Init() managed cil
{
    ret
}

// Declarations
.class value explicit ansi sealed 'unsigned int8 [12]' { .pack 1 .size 12 }

// Definitions
.field static private valuetype 'unsigned int8 [12]' '.str' at '.str$data'
.data '.str$data' = {
int8 (104),
int8 (101),
int8 (108),
int8 (108),
int8 (111),
int8 (32),
int8 (119),
int8 (111),
int8 (114),
int8 (108),
int8 (100),
int8 (0) [1]
}

// Startup code
.method static public int32 $MSIL_Startup() {
    .entrypoint
    .locals (native int i)
    .locals (native int argc)
    .locals (native int ptr)
    .locals (void* argv)
    .locals (string[] args)
    call    string[] [mscorlib]System.Environment::GetCommandLineArgs()
    dup
    stloc    args
    ldlen
    conv.i4
    dup
    stloc    argc
    ldc.i4    4
    mul
    localloc
    stloc    argv
    ldc.i4.0
    stloc    i
L_01:
    ldloc    i
    ldloc    argc
    ceq
    brtrue    L_02
    ldloc    args
    ldloc    i
    ldelem.ref
    call    native int [mscorlib]System.Runtime.InteropServices.Marshal::StringToHGlobalAnsi(string)
    stloc    ptr
    ldloc    argv
    ldloc    i
    ldc.i4    4
    mul
    add
    ldloc    ptr
    stind.i
    ldloc    i
    ldc.i4.1
    add
    stloc    i
    br    L_01
L_02:
    call void $MSIL_Init()
    call    unsigned int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) main()
    conv.i4
    ret
}

.method static public unsigned int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'main'
    () cil managed
{
    .locals (unsigned int32 'ltmp_0_1')
    .maxstack    16
ltmp_1_2:

//    %tmp2 = tail call i32 @puts( i8* getelementptr ([12 x i8]* @.str, i32 0, i32 0) )        ; <i32> [#uses=0]

    ldsflda    valuetype 'unsigned int8 [12]' '.str'
    conv.u4
    call    unsigned int32 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 'puts'(void* )
    stloc    'ltmp_0_1'

//    ret i32 0

    ldc.i4    0
    ret
}

Holy frickin' crap. I think I'm in love.


.NET | C++ | Languages | LLVM

Sunday, February 24, 2008 5:00:17 AM (Pacific Standard Time, UTC-08:00)
Quotables

Some quotes I've found to be thought-provoking over the last week or so:

"Some programming languages manage to absorb change, but withstand progress."

"In a 5 year period we get one superb programming language. Only we can't control when the 5 year period will begin."

"Every program has (at least) two purposes: the one for which it was written and another for which it wasn't."

"If a listener nods his head when you're explaining your program, wake him up."

"A language that doesn't affect the way you think about programming, is not worth knowing."

"Wherever there is modularity there is the potential for misunderstanding: Hiding information implies a need to check communication."

(All of the above, Alan Perlis)

 

"Program testing can be used to show the presence of bugs, but never to show their absence!"

"The competent programmer is fully aware of the limited size of his own skull. He therefore approaches his task with full humility, and avoids clever tricks like the plague."

"How do we convince people that in programming simplicity and clarity —in short: what mathematicians call "elegance"— are not a dispensable luxury, but a crucial matter that decides between success and failure?"

"Are you quite sure that all those bells and whistles, all those wonderful facilities of your so called powerful programming languages, belong to the solution set rather than the problem set?"

"Object-oriented programming is an exceptionally bad idea which could only have originated in California."

"The prisoner falls in love with his chains."

"Write a paper promising salvation, make it a 'structured' something or a 'virtual' something, or 'abstract', 'distributed' or 'higher-order' or 'applicative' and you can almost be certain of having started a new cult."

"I remember from those days two design principles that have served me well ever since, viz.

  1. before really embarking on a sizable project, in particular before starting the large investment of coding, try to kill the project first, and
  2. start with the most difficult, most risky parts first."

(All of the above, Edsger Dijkstra)

Make of them what you will....


Languages | Reading

Sunday, February 24, 2008 3:16:52 AM (Pacific Standard Time, UTC-08:00)
 Saturday, February 23, 2008
Building LLVM on Windows using MinGW32

As I've mentioned in passing, one of the things I'm playing with in my spare time (or will play with, now that I've got everything working, I think) is the LLVM toolchain. In essence, it looks to be a parallel to Microsoft's Phoenix, except that it's out, it's been in use in production environments (Apple is a major contributor to the project and uses it pretty extensively, it seems), and it supports not only C/C++ and Objective-C, but also Ada and Fortran. It's also a useful back-end for people writing languages, hence my interest.

One of the things that appeals about LLVM is that it uses an "intermediate representation" that in many ways reminds me of Phoenix's Low IR, though I'm sure there are significant differences that I'm not well-practiced enough to spot. Consider this bit of Fibonacci code, for example:

define i32 @fib(i32 %AnArg) {
EntryBlock:
    %cond = icmp sle i32 %AnArg, 2        ; <i1> [#uses=1]
    br i1 %cond, label %return, label %recurse

return:        ; preds = %EntryBlock
    ret i32 1

recurse:        ; preds = %EntryBlock
    %arg = sub i32 %AnArg, 1        ; <i32> [#uses=1]
    %fibx1 = tail call i32 @fib( i32 %arg )        ; <i32> [#uses=1]
    %arg1 = sub i32 %AnArg, 2        ; <i32> [#uses=1]
    %fibx2 = tail call i32 @fib( i32 %arg1 )        ; <i32> [#uses=1]
    %addresult = add i32 %fibx1, %fibx2        ; <i32> [#uses=1]
    ret i32 %addresult
}

declare void @abort()

It's rather interesting to imagine this as a direct by-product of that first pass off of the hypothetical Universal AST....
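In case the IR listing is hard to read at a glance, here's my own hand translation of it back into C++. This is a reading of the listing, not source I pulled from the LLVM tree; the local names just mirror the IR values (%cond, %fibx1, %fibx2, %addresult), and I'm assuming i32 maps to a 32-bit signed int:

```cpp
#include <cstdint>

// Hand-written C++ reading of the @fib listing; one statement per IR instruction.
std::int32_t fib(std::int32_t anArg) {
    // %cond = icmp sle i32 %AnArg, 2   (signed less-than-or-equal)
    const bool cond = anArg <= 2;
    if (cond) {
        return 1;                                   // the "return" block
    }
    // the "recurse" block
    const std::int32_t fibx1 = fib(anArg - 1);      // %fibx1
    const std::int32_t fibx2 = fib(anArg - 2);      // %fibx2
    const std::int32_t addresult = fibx1 + fibx2;   // %addresult
    return addresult;
}
```

Each basic block in the IR maps onto one branch of the if, which is a big part of why the representation feels so approachable.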

Getting this thing to build has been an exercise of patience, however.

The documentation on the website, while extensive, isn't very Windows-friendly. For example, there's a page that describes how to build it with Visual Studio, but it's a touch out-of-date. On top of that, it turns out that the VS/LLVM tools can't compile to LLVM bitcode, only execute it once it's in that format; you need "llvm-gcc" to compile to bitcode, which means you're left with a two-machine solution: a *nix box using llvm-gcc to compile the code, and then your Windows box to run it. Ugh.

Fortunately, Windows users have two choices for dealing with *nix solutions: Cygwin and MinGW32. The first tries to lay down a *nix-like layer on top of the Win32 APIs (meaning everything depends on cygwin1.dll once built), the second tries to provide an adapter layer such that when a *nix tool is done building, it has no dependencies beyond what you'd see from any other Win32 app. Debates rage about the validity of each, and rather than seem like I'm coming down in favor of one or the other, I'll simply note that I have both installed in my Languages VMWare image now, and leave it at that.

Building LLVM with MinGW was a bit more painful than I expected, however, so for a long time I just didn't bother. Last night that changed, thanks to Anton Korobeynikov, who spent the better part of three or four hours in back-and-forth email conversation with me, walking me patiently through the step-by-step of getting MinGW and msys up and running on my machine long enough to build the LLVM 2.2+ (meaning the tip beyond the current 2.2 release) code base. I can't thank him enough--both for the direct help in getting the MinGW bits up and in the right places as well as for the casual conversation about MinGW along the way--so I thought I'd replicate what we did on my box to the 'Net in an attempt to spare others the effort.

First, there's a pile of tarballs from the MinGW download page that require downloading and extracting:

  • gcc-g++-3.4.5-20060117-1.tar.gz
  • binutils-2.18.50-20080109.tar.gz
  • mingw-runtime-3.14.tar.gz
  • gcc-core-3.4.5-20060117-1.tar.gz
  • w32api-3.11.tar.gz

Note that I also pulled down the other gcc- tarballs (gcj, objc and so on), just because I wanted to play with the MinGW versions of these tools. Extract all of these into a directory; on my system, that's C:/Prg/MinGW.

(There is a .exe installer on the Sourceforge page that supposedly manages all this for you, but it installed the binutils-2.17 package instead of 2.18, and I couldn't figure out how to get it to grab 2.18. All it does is download these packages and extract them, so going without it isn't a huge ordeal.)

By the way, if you're curious about experimenting with gcj as well (hey, it's a Java compiler that compiles to native code--that's interesting in its own right, if you ask me), take careful note that as it stands right now in the installation process, you can run gcj but can't compile Hello.java with it--it complains about a missing library, "iconv". This is a known bug, it seems, and the solution is to install libiconv from the GnuWin32 project--just extract the "bin" and "lib" packages into C:/Prg/MinGW.

At this point, you're done with C:/Prg/MinGW.

Next, there's a couple of installers and additional tarballs that need downloading and extracting:

  • MSYS-1.0.10.exe
  • msysDTK-1.0.1.exe
  • bash-3.1-MSYS-1.0.11-1.tar.bz2
  • bison-2.3-MSYS-1.0.11.tar.bz2
  • flex-2.5.33-MSYS-1.0.11.tar.bz2
  • regex-0.12-MSYS-1.0.11.tar.bz2 (required by flex)

The first two just execute and install; on my system, that is C:/Prg/msys/1.0. The next one just extracts into the C:/Prg/msys/1.0 directory. The last three are a tad tricky, however--apparently they assume that everything should be installed into a top-level "usr" directory, and that's not quite where we want them. We want them installed directly (so that "/usr/bin" from bison goes into "/bin" inside of "C:/Prg/msys/1.0"), so extract these to a temporary directory, then xcopy everything inside the temp/usr directory over to C:/Prg/msys/1.0. (That is, "cd temp", then "cd usr", then "xcopy /s/e * C:\Prg\msys\1.0"; xcopy wants backslashes in the destination path, since it treats a forward slash as the start of a switch.)

At this point, we're done with the setup--create a directory into which you want LLVM built (on my system, that's C:/Prg/LLVM/msys-build, where the source from SVN is held in C:/Prg/LLVM/llvm-svn), and execute the "configure" script in this directory (that is, "cd C:/Prg/LLVM/msys-build" and "../llvm-svn/configure"). The script will deposit a bunch of makefiles and directories into the build directory, after which a simple "make" suffices to build everything (in Debug; if you want Release, do "make ENABLE_OPTIMIZED=1", as per the LLVM documentation).

Thanks again, Anton! Now can you help me get llvm-gcc working? :-)


C++ | Java/J2EE | Languages | LLVM | Windows

Saturday, February 23, 2008 8:34:35 PM (Pacific Standard Time, UTC-08:00)
I love it when good accounting girls go geek

Erik Mork, C++ and .NET programmer extraordinaire and bright guy in his own right, has subverted my sister-in-law to programming, and the pair of them are now opening the doors of their new company, Silver Bay Labs, with a series of podcasts on Silverlight and "sparkling clients" in general. Have a listen, if you're interested in the whole "rich client" thing....


.NET | Windows

Saturday, February 23, 2008 2:21:33 AM (Pacific Standard Time, UTC-08:00)
 Friday, February 22, 2008
URLs as first-class concepts in a language

While perusing the E Tutorial, I noticed something that was simple and powerful all at the same time: URLs as first-class concepts in the language. Or, if you will, URLs as a factory for creating objects. Check out this snippet of E:

? pragma.syntax("0.8")

? def poem := <http://www.erights.org/elang/intro/jabberwocky.txt>
# value: <http://www.erights.org/elang/intro/jabberwocky.txt>

? <file:c:/jabbertest>.mkdirs(null);
? <file:c:/jabbertest/jabberwocky.txt>.setText(poem.getText())

Notice how the initialization of the "poem" variable is set to what looks like an HTTP URL? This essentially downloads the contents of that file and stores it into poem (in a form I don't precisely understand yet--I think it's an object that wraps the contents, but I could be wrong). Then the script uses file URLs to create the local directory (jabbertest) and to create a new file (jabberwocky.txt) and set the contents of that file to be the same as the contents of the stored "poem" object.

That, my friends, is just slick. It also neatly avoids the whole "how are files and directories and stuff different from URLs" that tends to make doing this same bit of code in Java or C# that much more difficult.


.NET | C++ | Java/J2EE | Languages | Parrot

Friday, February 22, 2008 11:40:06 PM (Pacific Standard Time, UTC-08:00)
More language features revisited

Since we're examining various aspects of the canonical O-O language (the three principals being C++, Java and C#/VB.NET), let's take in a review of another recent post, this time on the use of "new" in said languages.

All of us have probably written code like this:

Foo f = new Foo();

And what could be simpler?  As long as the logic in the constructor is simple (or better yet, the constructor is empty), it would seem that the simplest code is the best, so just use the constructor.  Certainly the MSDN documentation is rife with code that uses public constructors.  You can probably find plenty of public constructors used right here on my blog.  Why invest the effort in writing (and using) a factory class that will probably never do anything useful, other than call a public constructor?

In his excellent podcast entitled "Emergent Design: The Evolutionary Nature of Software Development," Scott Bain of Net Objectives nevertheless makes a strong case against the routine use of public constructors.  The problem, notes Scott, is that the use of a public constructor ties the calling code to the implementation of Foo as a concrete class.  But suppose that you later discover that there need to be many subtypes of Foo, and Foo should therefore be an abstract class instead of a concrete class--what then?  You've got a big problem, that's what; a lot of client code that has been making use of Foo's public constructor suddenly becomes invalid.

I just love it when people rediscover advice that they could have had much earlier, had they only been aware of the prior art in the field. I refer the curious C#/VB.NET developer to the book Effective Java, by Joshua Bloch, in which Item 1 states, "Consider providing static factory methods instead of constructors". Quoting from said book, we see:

One advantage of static factory methods is that, unlike constructors, they have names. If the parameters to a constructor do not, in and of themselves, describe the object being returned, a static factory with a well-chosen name can make a class easier to use and the resulting client code easier to read. ...

A second advantage of static factory methods is that, unlike constructors, they are not required to create a new object each time they're invoked. This allows immutable classes (Item 13) to use preconstructed instances or to cache instances as they're constructed and to dispense these instances repeatedly so as to avoid creating unnecessary duplicate values. ...

A third advantage of static factory methods is that, unlike constructors, they can return an object of any subtype of their return type. This gives you great flexibility in choosing the class of the returned object. ...

The main disadvantage of static factory methods is that classes without public or protected constructors cannot be subclassed. The same is true for nonpublic classes returned by public static factories.

A second disadvantage of static factory methods is that they are not readily distinguishable from other static methods. They do not stand out in API documentation the way that constructors do.

C# and VB.NET developers are encouraged to read the book to discover about 30 or so other nuggets of wisdom that are directly applicable to the .NET framework. Note that Josh is in the process, this very month, of revising the book for rerelease as a second edition, taking into account the wide variety of changes that have taken place in the Java language since EJ's initial release.
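To make the three advantages concrete, here's a minimal sketch. It's in C++ rather than Java, and none of it comes from Effective Java: Shape, Square, Circle, unitSquare and circleOfRadius are all names I made up for illustration.

```cpp
#include <memory>
#include <string>

// Hypothetical abstract type; clients never name a concrete subclass.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const = 0;

    // Advantage 1: the factories have names that say what you get.
    static std::shared_ptr<Shape> unitSquare();
    static std::shared_ptr<Shape> circleOfRadius(double r);
};

class Square : public Shape {
public:
    std::string name() const override { return "square"; }
};

class Circle : public Shape {
public:
    explicit Circle(double r) : radius_(r) {}
    std::string name() const override { return "circle"; }
    double radius() const { return radius_; }
private:
    double radius_;
};

std::shared_ptr<Shape> Shape::unitSquare() {
    // Advantage 2: no new object per call; one cached instance is handed out.
    static const std::shared_ptr<Shape> cached = std::make_shared<Square>();
    return cached;
}

std::shared_ptr<Shape> Shape::circleOfRadius(double r) {
    // Advantage 3: the factory returns a subtype of its declared return type.
    return std::make_shared<Circle>(r);
}
```

Note that Shape is abstract from day one here, which is exactly the change Scott Bain worries about: callers of Shape::unitSquare() wouldn't notice if Square were later split into a dozen subtypes.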

Meanwhile....

One thing that's been nagging at me is how I think Java and C# missed the boat with respect to the various ways we'd like to construct objects. The presumption was always that allocation and initialization would (a) always take place at the same time, and (b) always take place in the same manner--the underlying system would allocate the memory, the object would be laid out in this newly-minted chunk of heap, and your constructor would then initialize the contents. Neither assumption can be taken to be true, as we've seen over the years; the object may need to come from pre-existing storage (a la the object cache), or the object may need to be a derived type (a la the covariant return Josh mentions in the third advantage above), or in some cases you want to mint the object from an entirely different part of the process.

C++ actually had an advantage over C# and Java here, in that you could overload operator new() for a class (which then meant you had to overload operator delete(), and oh-by-the-way don't forget to overload array new, that is, operator new[]() and its corresponding twin, array delete, operator delete[](), which was a bit of a pain) to gain better control over both allocation and initialization, to a degree. Initially we always used it to control allocation--the idea being one would create a class-specific allocator, on the grounds that knowing some of the assumptions of the class, such as its size, would allow you to write faster allocation routines for it. But one of the rarely-used features of operator new() was that it could take additional parameters, using a truly obscure syntactic corner of C++:

void* operator new(size_t s, const string& message)
{
    cout << "Operator new sez " << message << endl;
    // allocate s bytes and return them; the Foo ctor will be invoked automagically
    return ::operator new(s);
}

Foo* newFoo = new ("Howdy, world!") Foo();
 

Officially, one such overloaded operator was recognized, the placement new operator, which took a void* as a parameter, indicating the exact location in which your object was to be allocated and thus laid down. This meant that C++ developers could allocate from some other part of the process (including, shudder, a pointer they'd made up out of thin air) and drop the initialized object right there. While useful in its own right, placement new opened up a whole new world of construction options to the C++ developer that we never really took advantage of, since now you could pass parameters to the construction process without involving the constructor.
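For anybody who hasn't seen the standard placement form in the wild, here's a minimal sketch of it. The Person struct and the placementDemo function are hypothetical names of mine; the only library piece involved is the placement operator new declared in the <new> header.

```cpp
#include <new>      // declares the standard placement operator new
#include <string>
#include <utility>

// Hypothetical type for illustration.
struct Person {
    std::string first;
    std::string last;
    Person(std::string f, std::string l)
        : first(std::move(f)), last(std::move(l)) {}
};

std::string placementDemo() {
    // Storage we already own, sized and aligned for one Person.
    alignas(Person) unsigned char buffer[sizeof(Person)];

    // Placement new: no allocation happens, the constructor runs at `buffer`.
    Person* p = new (buffer) Person("Ted", "Neward");
    std::string full = p->first + " " + p->last;

    // There is no matching delete for this storage, so destroy by hand.
    p->~Person();
    return full;
}
```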

That's kind of nifty, in an obscure and slightly terrifying fashion. One thought I'd always had was that it would be cool if a C++ O/R-M overloaded operator new() for database-bound objects to indicate which database connection to use during construction:

DBConnection conn;

Person* newFoo = new (conn) Person("Ted", "Neward");

 

Of course, such syntax has the immediate drawback of eliciting a chorus of "WTF?!?" at the next code review, but still....
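For the curious, here's roughly what it would take to make that line compile. This is a sketch under loud assumptions: DBConnection here is a stand-in of my own that only counts allocations, not any real database API, and createdThrough is a hypothetical helper that exists just to exercise the syntax.

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical stand-in for a real connection; it just counts allocations.
struct DBConnection {
    int objectsCreated = 0;
};

struct Person {
    std::string first;
    std::string last;

    Person(std::string f, std::string l)
        : first(std::move(f)), last(std::move(l)) {}

    // Class-specific allocation function with an extra parameter;
    // `new (conn) Person(...)` resolves to this overload.
    static void* operator new(std::size_t size, DBConnection& conn) {
        ++conn.objectsCreated;
        return ::operator new(size);
    }

    // Matching form; the compiler calls it only if the constructor throws.
    static void operator delete(void* p, DBConnection&) noexcept {
        ::operator delete(p);
    }

    // Usual form, used by a plain `delete person;`.
    static void operator delete(void* p) noexcept {
        ::operator delete(p);
    }
};

// Creates and destroys one Person through `conn`, returning the running count.
int createdThrough(DBConnection& conn) {
    Person* p = new (conn) Person("Ted", "Neward");
    delete p;
    return conn.objectsCreated;
}
```

One side effect worth knowing: once the class declares its own operator new, a plain "new Person(...)" no longer compiles, so every allocation is forced to name a connection. Whether that's a feature or a bug probably depends on how that code review goes.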

Meanwhile, other languages choose to view new as one of those nasty static methods Gilad dislikes so much, Ruby and Smalltalk being two of them. That is to say, construction now basically calls into a static method on a class, which has the nice effect of keeping the number of "special" parts of the language to a minimum (since now "new" is just a method, not a keyword), makes it easier to have different-yet-similar names to represent slightly different concepts ("create" vs "new" vs "fetch" vs "allocate", and so on) sitting side by side, and helps eliminate Josh's second disadvantage above. I'm not certain how exactly this could eliminate Josh's first disadvantage (that of inheritance and inaccessible constructors), but it's not entirely unimaginable that the language would have a certain amount of incestuous knowledge here to be able to reach those static methods (constructors) in the same way it does currently.

(It actually works better if they aren't static methods at all, but instance methods on class objects, to which the language automatically defers when it sees a "classname.new"; that is, when it sees

Person ann = Person.new("Ann", "Sheriff");

the language automatically changes this to read:

Person ann = Person.class.new("Ann", "Sheriff");

which would be eminently doable in Java, were class objects available for modification/definition somehow. In a language built on top of the JVM or CLR, the class object would be a standalone singleton, a la "object" definitions in Scala.)


.NET | C++ | Java/J2EE | Languages | Parrot | Ruby

Friday, February 22, 2008 1:49:49 AM (Pacific Standard Time, UTC-08:00)
 Thursday, February 21, 2008
Static considered harmful?

Gilad makes the case that static, that staple of C++, C#/VB.NET, and Java, does not belong:

Most imperative languages have some notion of static variable. This is unfortunate, since static variables have many disadvantages. I have argued against static state for quite a few years (at least since the dawn of the millennium), and in Newspeak, I’m finally able to eradicate it entirely.

I think Gilad conflates a few things, but he's also got some good points. To the dissecting table!

To begin:

Static variables are bad for security. See the E literature for extensive discussion on this topic. The key idea is that static state represents an ambient capability to do things to your system, that may be taken advantage of by evildoers.

Eh.... I'm not sure I buy into this. For evildoers to be able to change static state, they have to have some kind of "poke" access inside the innards of your application, and if they have that, then just about anything is vulnerable. Now, granted, I haven't spent a great deal of time on the E literature, so maybe I'm missing the point here, but if an attacker has data-manipulability into my program, then I'm in a whole world of pain, whether he's attacking statics or instances. Having said that, statics have to be stored in a particular well-known location inside the process, so maybe that makes them a touch more vulnerable. Still, this seems a specious argument.

Static variables are bad for distribution. Static state needs to either be replicated and sync’ed across all nodes of a distributed system, or kept on a central node accessible by all others, or some compromise between the former and the latter. This is all difficult/expensive/unreliable.

Now this one I buy into, but the issue isn't the "static"ness of the data, but the fact that it's effectively a Singleton, and Singletons in any distributed system are Evil. I talked a great deal about this in Effective Enterprise Java, so I'll leave that alone, but let me point out that any Singleton is evil, whether it's represented in a static, a Singleton object, a Newspeak module, or a database. The "static"ness here is a red herring.

Static variables are bad for re-entrancy. Code that accesses such state is not re-entrant. It is all too easy to produce such code. Case in point: javac. Originally conceived as a batch compiler, javac had to undergo extensive reconstructive surgery to make it suitable for use in IDEs. A major problem was that one could not create multiple instances of the compiler to be used by different parts of an IDE, because javac had significant static state. In contrast, the code in a Newspeak module definition is always re-entrant, which makes it easy to deploy multiple versions of a module definition side-by-side, for example.

Absolutely, but this is true for instance fields, too--any state that is modified as part of two or more method bodies is vulnerable to a re-entrancy concern, since now the field is visibly modified state to that particular instance. How deeply do you want your code to be re-entrant? Gilad's citation of the javac compiler points out that the compiler was hardly re-entrant at any reasonable level, but the fact is that the compiler *could* have been used in a parallelized fashion using the isolational properties of ClassLoaders. (It's ugly, and Java desperately needs Isolates for that reason.)
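Here's a small sketch of what I mean about instance fields, with not a static in sight. Formatter, wrap, wrapSafe and the two helper functions are hypothetical names invented for this example; the only point is that a nested call on the same instance tramples the field.

```cpp
#include <string>

// Hypothetical class that keeps its working storage in an instance field.
class Formatter {
public:
    // Not re-entrant: a nested call on the same object overwrites scratch_.
    template <class Callback>
    std::string wrap(const std::string& s, Callback inner) {
        scratch_ = "[" + s;
        inner();               // may call back into wrap() on this object
        scratch_ += "]";
        return scratch_;
    }

    // Re-entrant: the working state lives on the stack of each call.
    template <class Callback>
    std::string wrapSafe(const std::string& s, Callback inner) {
        std::string local = "[" + s;
        inner();
        return local + "]";
    }

private:
    std::string scratch_;
};

// The outer call expects "[outer]" but gets the inner call's leftovers.
std::string reentrantCallResult() {
    Formatter f;
    return f.wrap("outer", [&f] { f.wrap("inner", [] {}); });
}

// Same nesting, but nothing is shared between the two calls.
std::string safeCallResult() {
    Formatter f;
    return f.wrapSafe("outer", [&f] { f.wrapSafe("inner", [] {}); });
}
```

The first helper comes back with "[inner]]" rather than "[outer]", and nobody declared anything static to get there.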

Static variables are bad for memory management. This state has to be handled specially by implementations, complicating garbage collection. The woeful tale of class unloading in Java revolves around this problem. Early JVMs lost application’s static state when trying to unload classes. Even though the rules for class unloading were already implicit in the specification, I had to add a section to the JLS to state them explicitly, so overzealous implementors wouldn’t throw away static application state that was not entirely obvious.

This one I can't really comment on, since I'm not in the habit of writing memory-management code. I'll take Gilad's word for it, though I'm curious to know why this is so, in more detail.

Static variables are bad for startup time. They encourage excess initialization up front. Not to mention the complexities that static initialization engenders: it can deadlock, applications can see uninitialized state, and unless you have a really smart runtime, you find it hard to compile efficiently (because you need to test if things are initialized on every use).

I'm not sure I see how this is different for any startup/initialization code--anything that the user can specify as part of startup will run the risk of deadlocks and viewing uninitialized state. Consider the alternative, however--if the user didn't have the ability to specify startup code, then they would have to either write their own, post-runtime, startup code, or else they have to constantly check the state of their uninitialized objects and initialize them on first use, the very thing that he claims is hard to compile efficiently.
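The initialize-on-first-use alternative looks something like the sketch below, using a C++11 function-local static. The keywords function and the initCount counter are hypothetical names of mine; the counter exists only so the laziness is observable.

```cpp
#include <string>
#include <vector>

// Hypothetical counter, here only to show when initialization actually runs.
int initCount = 0;

const std::vector<std::string>& keywords() {
    // Built the first time control reaches this line, not at program startup.
    // C++11 makes this initialization thread-safe, which means the compiler
    // has to emit an "is it initialized yet?" check on every call.
    static const std::vector<std::string> table = [] {
        ++initCount;
        return std::vector<std::string>{"static", "new", "delete"};
    }();
    return table;
}
```

Nothing runs up front, but the price is precisely the per-use check Gilad says is hard to compile efficiently. You pay somewhere.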

Static variables are bad for concurrency. Of course, any shared state is bad for concurrency, but static state is one more subtle time bomb that can catch you by surprise.

Absolutely: any shared state is bad for concurrency. However, I think we need to go back to first principles here. Since any shared state is bad for concurrency, and since static data is always shared by definition, it follows that static data is bad for concurrency. Pay particular attention to that chain of reasoning, however: any shared state is bad for concurrency, whether it's held by the process in a special non-instance-aligned location or in a data store that happens to be reachable from multiple paths of control. This means that your average database table is also bad for concurrency, were it not for the transactional protections that surround the table. This isn't an indictment of static variables, per se, but of shared state.
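A quick sketch of the same point in code: the shared thing below is a plain instance, not a static, and it still needs a lock playing the role of the table's transaction. Tally and countFrom are hypothetical names for this example, and I'm assuming a toolchain where std::thread works out of the box (some older ones want -pthread at link time).

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared tally: an ordinary instance, yet shared all the same.
class Tally {
public:
    void add(int n) {
        std::lock_guard<std::mutex> guard(lock_);   // the "transaction"
        total_ += n;
    }
    int total() {
        std::lock_guard<std::mutex> guard(lock_);
        return total_;
    }
private:
    std::mutex lock_;
    int total_ = 0;
};

// Starts `threads` workers that each add 1 to the same Tally `perThread` times.
int countFrom(int threads, int perThread) {
    Tally tally;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&tally, perThread] {
            for (int i = 0; i < perThread; ++i) {
                tally.add(1);
            }
        });
    }
    for (auto& worker : pool) {
        worker.join();
    }
    return tally.total();
}
```

Take the lock_guard out of add() and the total becomes anybody's guess, with no "static" keyword anywhere in the file.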

Gilad goes on to describe how Newspeak solves this problem of static:

It may seem like you need static state, somewhere to start things off, but you don’t. You start off by creating an object, and you keep your state in that object and in objects it references. In Newspeak, those objects are modules.

Newspeak isn’t the only language to eliminate static state. E has also done so, out of concern for security. And so has Scala, though its close cohabitation with Java means Scala’s purity is easily violated. The bottom line, though, should be clear. Static state will disappear from modern programming languages, and should be eliminated from modern programming practice.

I wish Newspeak were available for widespread use, because I'd love to explore this concept further; in the CLR, for example, there is the same idea of "modules", in that modules are singleton entities in which methods and data can reside, at a higher level than individual objects themselves. Assemblies, for example, form modules, and this is where "global variables" and "global methods" exist (when supported by the compiling language in question). At the end of the day, though, these are just statics by another name, and face most, if not all, of the same problems Gilad lays out above. Scala "objects" have the same basic property.

I think the larger issue here is that one should be careful where one stores state, period. Every piece of data has a corresponding scope of accessibility, and developers have grown complacent about considering that scope when putting data there: they consider the accessibility at the language level (public, private, what-have-you), and fail to consider the scope beyond that (concurrency, re-entrancy, and so on).

At the end of the day, it's simple: static entities and instance entities are just entities. Nothing more, nothing less. Caveat emptor.


.NET | C++ | Java/J2EE | Languages | Parrot | Ruby | XML Services

Thursday, February 21, 2008 8:07:37 PM (Pacific Standard Time, UTC-08:00)
 Tuesday, February 19, 2008
The Fallacies Remain....

Just recently, I got this bit in an email from the Redmond Developer News ezine:

TWO IF BY SEA

In the course of just over a week starting on Jan. 30, a total of five undersea data cables linking Europe, Africa and the Middle East were damaged or disrupted. The first two cables to be lost link Europe with Egypt and terminate near the Port of Alexandria.

http://reddevnews.com/columns/article.aspx?editorialsid=2502

Early speculation placed the blame on ship anchors that might have dragged across the sea floor during heavy weather. But the subsequent loss of cables in the Persian Gulf and the Mediterranean has produced a chilling numbers game. Someone, it seems, may be trying to sabotage the global network.

It's a conclusion that came up at a recent International Telecommunication Union (ITU) press conference. According to an Associated Press report, ITU head of development Sami al-Murshed isn't ready to "rule out that a deliberate act of sabotage caused the damage to the undersea cables over two weeks ago."

http://tinyurl.com/3bjtdg

You think?

In just seven or eight days, five undersea cables were disrupted.

Five. All of them serving or connecting to the Middle East. And thus far, only one cable cut -- linking Oman and the United Arab Emirates -- has been identified as accidental, caused by a dragging ship anchor.

So what does it mean for developers? A lot, actually. Because it means that the coming wave of service-enabled applications needs to take into account the fact that the cloud is, literally, under attack.

This isn't new. For as long as the Internet has been around, concerns about attacks on the network have centered on threats posed by things like distributed denial of service (DDOS) and other network-borne attacks. Twice -- once in 2002 and again in 2007 -- DDOS attacks have targeted the 13 DNS root servers, threatening to disrupt the Internet.

But assaults on the remote physical infrastructure of the global network are especially concerning. These cables lie hundreds or even thousands of feet beneath the surface. This wasn't a script-kiddie kicking off an ill-advised DOS attack on a server. This was almost certainly a sophisticated, well-planned, well-financed and well-thought-out effort to cut off an entire section of the world from the global Internet.

Clearly, efforts need to be made to ensure that the intercontinental cable infrastructure of the Internet is hardened. Redundant, geographically dispersed links, with plenty of excess bandwidth, are a good start.

But development planners need to do their part, as well. Web-based applications shouldn't be crafted with the expectation of limitless bandwidth. Services and apps must be crafted so that they can fail gracefully, shift to lower-bandwidth media (such as satellite) and provide priority to business-critical operations. In short, your critical cloud-reliant apps must continue to work, when almost nothing else will.

And all this, I might add, as the industry prepares to welcome the second generation of rich Internet application tools and frameworks.

Silverlight 2.0 will debut at MIX08 next month. Adobe is upping the ante with its latest offerings. Developers will enjoy a major step up in their ability to craft enriched, Web-entangled applications and environments.

But as you make your plans and write your code, remember this one thing: The people, organization or government that most likely sliced those four or five cables in the Mediterranean and Persian Gulf -- they can do it again.

There's a couple of things to consider here, aside from the geopolitical ramifications of a concerted attack on the global IT infrastructure (which does more to damage corporations and the economy than it does to disrupt military communications, which to my understanding are mostly satellite-based).

First, this attack on the global infrastructure raises a huge issue with respect to outsourcing--if you lose touch with your development staff for a day, a week, a month (just how long does it take to lay down new trunk cable, anyway?), what sort of chaos is this going to wreak on your project schedule? In The World is Flat, Friedman mentions that a couple of fast-food restaurants have outsourced the drive-thru--you drive up to the speaker, and as you place your order, you're talking to somebody half a world away who's punching it into a computer that's flashing the data back to the fast-food joint in question for harvesting (it's not like they make the food when you order it, just harvest it from the fields of pre-cooked burgers ripening under infrared lamps in the back) and disbursement as you pull forward the remaining fifty feet to the first window.

The ludicrousness of this arrangement notwithstanding, this means that the local fast-food joint is now dependent on the global IT infrastructure in the same way that your ERP system is. Aside from the obvious "geek attraction" to a setup like this, I find it fascinating that at no point did somebody stand up and yell out, "What happened to minimizing the risks?" Effective project development relies heavily on the ability to touch base with the customer every so often to ensure things are progressing in the way the customer was anticipating. When the development team is one ocean and two continents away in one direction, or one ocean and a whole pile of islands away in the other direction, or even just a few states over, that vital communication link is now at the mercy of every single IT node in between them and you.

We can make huge strides, but at the end of the day, the huge distances involved can only be "fractionalized", never eliminated.

Second, as Desmond points out, this has a huge impact on the design of applications that assume 100% or 99.9% Internet uptime. Yes, I'm looking at you, GMail and Google Calendar and the other so-called "next-generation Internet applications" based on technologies like AJAX. (I categorically refuse to call them "Web 2.0" applications--there is no such thing as "Web 2.0".) The more we keep looking to the future for an "always-on" networking infrastructure, the more we delude ourselves about the practical realities of life: there is no such thing as "always-on" infrastructure. Networking or otherwise.

I know this personally, since last year here in Redmond, some stronger-than-normal winter storms knocked down a whole slew of power lines and left my house without electricity for a week. To very quickly discover how much of modern Western life depends on "always-on" assumptions, go without power to the house for a week. We were fortunate--parts of Redmond and nearby neighborhoods got power back within 24 hours, so if I needed to recharge the laptop or get online to keep doing business, much less get a hot meal or just find a place where it was warm, it meant a quick trip down to the local strip mall where a restaurant with WiFi (Canyon's, for those of you that visit Redmond) kept me going. For others in Redmond, the power outage meant a brief vacation down at the Redmond Town Center Marriott, where power was available pretty much within an hour or two of its disruption.

The First Fallacy of Enterprise Systems states that "The network is reliable". The network is only as reliable as the infrastructure around it, and not just the infrastructure that your company lays down from your workstation to the proxy or gateway or cable modem. Take a "traceroute" reading from your desktop machine to the server on which your application is running--if it's not physically in the same building as you, then you're probably looking at 20 - 30 "hops" before it reaches the server. Every single one of those "hops" is a potential point of failure. Granted, the architecture of TCP/IP suggests that we should be able to route around any localized points of failure, but how many of those points are, in fact, to your world view, completely unroutable? If your gateway machine goes down, how does TCP/IP try to route around that? If your ISP gets hammered by a Denial-of-Service attack, how do clients reach the server?
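
A back-of-the-envelope calculation shows how quickly those hops add up. This sketch assumes every hop is available 99.9% of the time and that hops fail independently; both numbers are invented for illustration, and real networks are messier in both directions.

```python
def path_availability(per_hop: float, hops: int) -> float:
    """Probability that every hop on a serial path is up at the same time,
    assuming independent failures (an illustrative assumption only)."""
    return per_hop ** hops

for hops in (1, 20, 30):
    print(hops, "hops:", round(path_availability(0.999, hops), 4))
```

Under those assumptions, thirty "three nines" hops in a row leave the whole path at roughly 97%, which is a long way from the uptime most application designs quietly take for granted.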

If we cannot guarantee 100% uptime for electricity, something we've had close to a century to perfect, then how can you assume similar kinds of guarantees for network availability? And before any of you point out that "Hey, most of the time, it just works so why worry about it?", I humbly suggest you walk into your Network Operations Center and ask the helpful IT people to point out the Uninterruptible Power Supplies that fuel the servers there "just in case".

When they in turn ask you to point out the "just in case" infrastructure around the application, what will you say?
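
One possible answer, sketched in Python: wrap the remote call so that a network failure degrades to the last known good answer instead of an error page. StaleCacheFallback, flaky_fetch and the one-hour limit are hypothetical names and values chosen for this example, not part of any real library.

```python
import time

class StaleCacheFallback:
    """Serve live data when the network cooperates, and the most recent
    cached answer (flagged as stale) when it does not."""

    def __init__(self, fetch, max_age_seconds=3600):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._value = None
        self._stamp = None

    def get(self):
        try:
            self._value = self._fetch()
            self._stamp = time.monotonic()
            return self._value, "live"
        except OSError:  # ConnectionError and TimeoutError are OSError subclasses
            fresh_enough = (self._stamp is not None
                            and time.monotonic() - self._stamp <= self._max_age)
            if fresh_enough:
                return self._value, "stale"
            raise  # nothing usable cached: let the caller decide what to do

# A stand-in for a remote service that works once and then loses its link.
calls = {"n": 0}

def flaky_fetch():
    calls["n"] += 1
    if calls["n"] > 1:
        raise ConnectionError("cable cut")
    return {"orders": 3}

source = StaleCacheFallback(flaky_fetch)
first = source.get()    # served live
second = source.get()   # network is gone, served from cache
```

The interesting design decision is not the try/except; it is that somebody had to decide, ahead of time, how stale is still acceptable and what happens when even that runs out.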

Remember, the Fallacies only bite you when you ignore them:

1) The network is reliable

2) Latency is zero

3) Bandwidth is infinite

4) The network is secure

5) Topology doesn't change

6) There is one administrator

7) Transport cost is zero

8) The network is homogeneous

9) The system is monolithic

10) The system is finished

Every project needs, at some point, to have somebody stand up in the room and shout out, "But how do we minimize the risks?" If this is truly a "mission-critical" application, then somebody needs the responsibility of cooking up "What if?" scenarios and answers, even if the answer is to say, "There's not much we can reasonably do in that situation, so we'll just accept that the company shuts its doors in that case".


.NET | C++ | Development Processes | Java/J2EE | Ruby | Security | XML Services

Tuesday, February 19, 2008 9:25:03 PM (Pacific Standard Time, UTC-08:00)
 Monday, February 18, 2008
Who herds the cats?

Recently I've been looking more closely at the various (count them, four of them) proposals for adding new features into the Java language, the "BGGA", "FCM", "CICE" and "JCA" proposals. All of them are interesting and have their merits. A few other proposals for Java 7 have emerged as well, such as extension methods, enhancements to switch, the so-called "multi-catch" enhancement to exceptions, properties, better null support, and some syntax to support lists and maps natively. All of them intriguing ideas, and highly subject to reasonable debate among reasonable people. My concern lies in a different direction.

Who herds this bunch of cats?

This isn't just a question of process within the JCP. And it's not just a question of closures or the other features we're looking at for Java 7. This is a question about the moral leadership of Java.

In the C# space, we have Anders. He clearly "owns" the language, and acts as the benevolent dictator. Nothing goes into "his" language without his explicit and expressed OK. Other languages have similar personages in similar roles. Python has Guido. Perl has Larry. C++ has Bjarne. Ruby has Matz. Certainly other individuals "float" around these languages and lend their impressive weight towards the language's design--Scott Meyers and Herb Sutter in C++, for example, or Dave Thomas and Martin Fowler in Ruby--but the core language design principles rest firmly inside the head of one man.

Whereso for Java? James Gosling? Please--Jimmy abandoned the language shortly after its release, and now only comes out every so often to launch T-shirts into the crowd, answer reporters' questions whenever something Java-related comes up, and blog his two cents' worth. He's a reminder of the "good old days", for sure, but he's not coming out with new directions of his own accord and taking the reins to lead us there. He's the Teddy Kennedy of the Java Party. His endorsement weighs in as about as influential as Bob Dole's--interesting to an analytical few, but hardly meaningful in the grand scheme of things.

Unfortunately, the two most recognized "benevolent dictators" of the Java language, Neal Gafter and Joshua Bloch, are on opposing sides of the aisle on this. Each has put forth a competing proposal for how the Java language should evolve. Each has his good reasons for how he wants to implement closures in Java. Each has his impressive list of names supporting him. It's Clinton and Obama, Java Edition. The fact is, though, that when these two disagreed on how to move forward, lots of Java developers found themselves in the uncomfortable position faced by the children when the parents fight: do you take sides? Do you try to make peace between them? Or do you just go hide your head under a pillow until the yelling stops?

This is the real danger facing Java right now: there is no one with enough moral capital and credibility in the Java space to make this call. We can take polls and votes and strawman proposals until the cows come home, but language design by committee has generally not worked well in the past. If someone without that authority ends up making the decision, it will alienate half the Java community regardless of which way the decision goes. The split is too even to expect one to come out as the obvious front-runner. And expecting a JSR committee process to somehow resolve the differences between these four proposals into a single direction forward is asking a lot.

So who makes the call?


Java/J2EE | Languages

Monday, February 18, 2008 9:47:38 PM (Pacific Standard Time, UTC-08:00)
Why we need both static and dynamic in the same language

Stu demonstrates one of the basic problems with an all-dynamic language: "I just spent an hour figuring out why some carefully-tested code went no-op after adding RSpec to a project." As much as I berate Stu at times (both in person and in blog), the fact is, I deeply respect and admire his programming skill, and if he can lose an hour to something that (I submit for your consideration) could have been caught by a static analysis tool fairly easily, then clearly that was a wasted hour of Stu's life. Worse, the problem is not yet solved, since now he has to make a hard choice about which definition to use, or else find a way to hack around the two definitions and create a third. Or perhaps something even uglier than this....

And this presumes that all developers using Ruby will have Stu's skill and his sense of responsibility when coming up with the solution. Asking that of all programmers across the globe is simply too much.
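
To give a feel for how little static analysis it takes to flag a silently replaced definition, here is a rough sketch. It is in Python rather than Ruby, it looks at only a single source file (Stu's clash was between libraries, which takes more plumbing), and the function name duplicate_definitions is made up for the example.

```python
import ast
from collections import defaultdict

def duplicate_definitions(source: str) -> dict:
    """Return {qualified name: [line numbers]} for every function or class
    defined more than once in the same scope of the given source text."""
    duplicates = {}

    def scan(body, scope):
        seen = defaultdict(list)
        for node in body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                seen[node.name].append(node.lineno)
                if isinstance(node, ast.ClassDef):
                    scan(node.body, f"{scope}{node.name}.")
        for name, lines in seen.items():
            if len(lines) > 1:
                duplicates[scope + name] = lines

    scan(ast.parse(source).body, "")
    return duplicates

sample = '''\
def should(value):
    return value

class Spec:
    def run(self):
        return "first"
    def run(self):
        return "second"

def should(value):
    return None
'''
report = duplicate_definitions(sample)
```

Nothing here runs the code under inspection; the check works purely on the parsed source, which is exactly the kind of help a dynamic language can still accept from a static tool.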

But clearly we cannot simply abandon the power of the dynamic language, either. Quoting again from the same source, Stu points out the very reason why dynamic languages are so powerful: "Once you start treating code as data, the elegance of your code is dependent on your skill. You cannot hide behind the limitations of your programming language anymore, because there aren't any."

What's a language designer left to do?

Choose both, of course.

The more I think about it, the more I think Cobra (and other languages) has it right: a programming language should have both static and dynamic features within it, simultaneously. This is the first "modern" language I've seen come along that espouses the "static when you can, dynamic when you want" principle as a first-class concept. Even at that, I imagine that there's much more that could be done than what Cobra espouses. Imagine combining the power of Scala's type inferencing system with the flexibility of a Groovy or Ruby.
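
As a loose approximation of the idea (not Cobra, and not how Cobra implements it), Python with optional type annotations gets part of the way there. The annotations are only checked if you run an external checker such as mypy, which is an assumption about your toolchain, and the Record class below is invented for the example.

```python
from typing import Any

def total_price(quantity: int, unit_price: float) -> float:
    # "Static when you can": with these annotations an external checker
    # can reject total_price("3", 2.5) before the code ever runs.
    return quantity * unit_price

class Record:
    """The 'dynamic when you want' half: fields are discovered at run time."""

    def __init__(self, **fields: Any) -> None:
        self._fields = dict(fields)

    def __getattr__(self, name: str) -> Any:
        # Invoked only when normal attribute lookup fails.
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name) from None

order = Record(quantity=3, unit_price=2.5)
amount = total_price(order.quantity, order.unit_price)
```

The seam between the two halves is visible: anything that comes out of Record is typed Any, so the checker stops helping at exactly the point where the code goes dynamic.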

Shivers.




Monday, February 18, 2008 4:22:11 AM (Pacific Standard Time, UTC-08:00)
Modular Toolchains

During the Lang.NET Symposium, a couple of things "clicked" all simultaneously, giving me one of those "Oh, I get it now" moments that just doesn't want to leave you alone.

During the Intentional Software presentation, as the demo wound onwards I (and the rest of the small group gathered there) found myself looking at the same source code, but presented in a variety of new ways, some of which appealed to me as the programmer, others of which appealed to the mathematicians in the room, others of which appealed to the non-programmers in the room. (I heard one of the Microsoft hosts, a non-technical program manager, I think, say, "Wow, even I could understand that spreadsheet view, and that was writing code?")

During the spreadsheet-written-in-IronPython presentation (ResolverOne), we were essentially looking at new ways of writing IronPython code, thus leveraging all the syntactic power of a programming language with a nicer front end.

During the aspect-oriented talk (the one by Stefan Wenig and Fabian Schmied), we found ourselves looking at a tool that essentially takes compiled assemblies and weaves in additional code based on descriptors from outside that codebase; in essence, just another aspect-oriented tool.

But combine this with my own investigations into Soot, LLVM, Parrot, and Phoenix, alongside the usual discussions around the DLR, CLR, JVM and DaVinci machine, couple that with the presentation Harry gave about parsing expression grammars and the research in the functional community into parser combinators, throw in the aspect-oriented and metaprogramming facilities that the Rubyists and other dynamic linguists go on for days about, and what do you end up with?

Folks, the future is in modular toolchains.

This is an oversimplification, and a radical oversimplification at that, but imagine for a moment:

  1. A parser takes your source code (let's assume it is Java, just for grins) and builds an AST out of it. Not an AST that's inherently deeply coupled to the Java language, mind you, but a general-purpose one that stands as a union of Java, C#, C++, Perl, Python, Smalltalk, and other languages. (Note that some of the linguistic concepts in some of those languages may not end up in this AST, but instead operate on the AST itself, a la C++'s template facilities.) Said parser is now finished, and can either output a binary (or potentially XML, though it'd probably be hideously verbose) version of this AST to disk for later consumption, or would more than likely be passed directly along to the next beast in the chain.
  2. In the simplest scenario, the next beast would be a code generator, which takes the AST and seeks to export some kind of back-end code out of it. Here, since we're working with a general-purpose AST, we can assume that this back-end is flexible and open, a la the Phoenix toolkit (where either native or MSIL can be generated).
  3. In a slightly more complicated scenario, verification of the correctness of the AST (against whatever libraries are specified) is checked, usually prior to code-gen, thus making this particular toolchain a statically-checked chain; were verification left out, it would need to happen at runtime, in which case we'd be talking about a dynamically-checked chain.
    Note that I stay away from the term "statically-typed" or "dynamically-typed" for the moment. That would be a measurement of the parser, not the verifier. Verification still occurs in a lot of these dynamically-typed languages, just as it does in statically-typed languages.
    Assuming the verification process succeeds, the AST can, again, be written out or passed to the next step in the chain.
  4. Another potential step in the process, usually post-parser and pre-verification, would be an "aspect" step, in which a tool takes the AST, consults some external descriptors, and modifies the AST based on what it finds there. (This is how most of your non-AspectJ-like AOP tools work today, except that they have to rebuild the AST from compiled .class files or assemblies first.)
  5. Naturally, another step in the process would be an optimize step, but this has to be considered carefully, since some "high-level" optimizations can be done without regard to code-gen backend, and some will need to be done with regard to code-gen backend; for example, register spill is (from what I've heard, can't say I know too much about this) generally only useful if you know how many registers you're targeting. Plus, it's not hard to imagine certain optimizations that are only generally useful on the x86 architecture, versus those that are useful on other CPU platforms. Even operating systems I would imagine would have an impact here. (It turns out that many compiler toolchains go through a dozen or so optimization steps today, so it's not hard to imagine a "code-gen backend" being a series of a half-dozen or so targeted optimization steps before actually generating code.)
  6. Bear in mind, too, that these ASTs should have enough information to be directly executable, thus giving us an interpreter back-end instead of a code-generation back-end, a la the DLR instead of the CLR.
  7. Also, given the standard AST format, it would be relatively trivial to create a whole series of different "parser"s to get to the AST, along the lines of what the Intentional Software guys have created, thus blowing open the whole concept of "DSL" into areas that heretofore have only been imagined. You still get the complete support of the rest of the toolchain, which is what makes the whole DSL concept viable in the first place, including aspects and verification and your choice of either interpretation or compilation.
  8. While we're at it, bear in mind that this AST could/should also be reachable from within the code itself, thus giving languages that want to operate on their own AST at runtime the ability to do so, because the AST is in a standard format and the interpreter could be bundled as part of the generated executable, thus providing a compile-when-you-can-interpret-when-you-must flavor that is currently the reigning meme in language/platform environments like JRuby. (It would also have the happy side effect of making Paul Graham shut up about Lisp, at least for a while. Yes, Paul, code-as-data, it's brilliant, it's wonderful, we get it.)
  9. Nothing says this toolchain needs be one-way, by the way: many of the toolkits I mentioned before (LLVM, Phoenix, Soot) can start from compiled binary and work back to AST, thus offering us the opportunity to do surgery of either the exploratory kind (static analysis) or the manipulative kind (aspect-weaving, etc) on compiled code in a relatively clean way. Reflector demonstrates the power of being able to go "back and forth" in this way (even in the relatively limited way Reflector does so), so imagine how powerful it would be to do this from end-to-end throughout the toolchain.
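
The shape of such a chain can be sketched in a few dozen lines using Python's own ast module. This is nowhere near the general-purpose AST described above, since Python's AST is tied to Python; it only illustrates the plumbing. The step names (TraceEntry, verify, FoldConstants, run_chain) and the _trace hook are invented for the example, and ast.unparse needs Python 3.9 or later.

```python
import ast

class TraceEntry(ast.NodeTransformer):
    """Aspect step: weave a trace call into the top of every function."""
    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        hook = ast.parse(f"_trace.append({node.name!r})").body[0]
        node.body.insert(0, hook)
        return node

def verify(tree, known=()):
    """Verification step: a crude, scope-insensitive check that every name
    read somewhere in the tree is also defined somewhere in it."""
    defined = set(known)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            defined.add(node.id)
    missing = sorted({n.id for n in ast.walk(tree)
                      if isinstance(n, ast.Name)
                      and isinstance(n.ctx, ast.Load)
                      and n.id not in defined})
    if missing:
        raise NameError(f"undefined names: {missing}")
    return tree

class FoldConstants(ast.NodeTransformer):
    """Optimize step: fold additions and multiplications of integer literals."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        left, right = node.left, node.right
        if (isinstance(node.op, (ast.Add, ast.Mult))
                and isinstance(left, ast.Constant) and isinstance(left.value, int)
                and isinstance(right, ast.Constant) and isinstance(right.value, int)):
            if isinstance(node.op, ast.Add):
                value = left.value + right.value
            else:
                value = left.value * right.value
            return ast.copy_location(ast.Constant(value), node)
        return node

def run_chain(source, steps, env):
    tree = ast.parse(source)                  # parser: text -> AST
    for step in steps:                        # pluggable middle of the chain
        tree = step(tree)
    ast.fix_missing_locations(tree)
    exec(compile(tree, "<toolchain>", "exec"), env)   # back end: AST -> bytecode -> run
    return tree

source = """
def area(width):
    return width * (2 + 3)

result = area(4)
"""
trace = []
env = {"_trace": trace}
steps = [TraceEntry().visit,
         lambda tree: verify(tree, known={"_trace"}),
         FoldConstants().visit]
tree = run_chain(source, steps, env)
regenerated = ast.unparse(tree)   # and back again: AST -> source text
```

Each step takes a tree and returns a tree, so reordering, dropping, or adding steps is a matter of editing the list. That is the property the whole idea hinges on, and it is also where a real cross-language AST would have to do far more work than this toy does.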

How likely is this utopian vision? I'm not sure, honestly--certainly tools like LLVM and Phoenix seem to imply that there's ways to represent code across languages in a fairly generic form, but clearly there's much more work to be done, starting with this notion of the "uber-AST" that I've been so casually tossing around without definition. Every AST is more or less tied to the language it is supposed to represent, and there's clearly no way to imagine an AST that could represent every language ever invented. Just imagine trying to create an AST that could incorporate Java, COBOL and Brainf*ck, for example. But if we can get to a relatively stable 80/20, where we manage to represent the most-commonly-used 80% of languages within this AST (such as an AST that can incorporate Java, C#, and C++, for starters), then maybe there's enough of a critical mass there to move forward.

Now all I need to do is find somebody who'll fund this little bit of research... anybody got a pile of cash they don't know what to do with? :-)

Update: By the way, in case you want a graphical depiction of what I'm thinking about, the Phoenix page has one (though obviously it's limited to the Phoenix scope of vision, and you may have to be a Microsoft CONNECT member to see it).


.NET | C++ | Flash | Java/J2EE | Languages | Mac OS | Parrot | Ruby

Monday, February 18, 2008 1:55:53 AM (Pacific Standard Time, UTC-08:00)
 Sunday, February 10, 2008
An Appeal: www.findjohnglasgow.com

Long-time readers of this blog know that as a general rule, I try not to include much in the way of personal stuff here; I try (sometimes with more success than others) to keep the subject material focused on the technology space: Java, .NET, Ruby, languages, XML services, and so on.

This, however, is a deviation from that norm.

A near and dear friend of mine has asked that I help spread the word about the disappearance of a family member (a cousin, in fact). I don't know the details of the disappearance other than what anybody else can read on the website, but I do know that if someone in my family were to go missing for an inexplicable reason, I would want the help of anybody and everybody I knew to try and find them.

I would like to ask everyone's help in finding my brother John. He went missing January 28, 2008 and official search efforts were called off last Friday, even though the family has mounted their own search.   Please go to www.findjohnglasgow.com to see a picture of John and print off a flyer.   If you could put it in your car window or some other visible place, it would help us a lot.   It is possible that he could have traveled out of the area where he went missing, so we are trying to get the word out on a national level, to cover all possible scenarios.   Thank you all for your help.

Donna Jean Glasgow

February 6, 2008

If you reside near the Little Rock, Arkansas area in particular, please take a look at the photo below ...

[Photo: John Glasgow]

... and let somebody (either me or through the above-mentioned website) know if you've seen him, one way or another.

I won't ask you to forward this to everyone you know; instead I just ask that if you feel a twinge of sympathy for a missing family member that's connected to you through less than two degrees of separation, then do what you think would help.

Thanks for your time.




Sunday, February 10, 2008 6:09:10 AM (Pacific Standard Time, UTC-08:00)
 Sunday, February 03, 2008
Maybe 'twould be better to suggest "done like the Giants"

Wow. Giants 17, Patriots 14, when just about everybody had the Patriots by two touchdowns or so.

Just goes to show, shouldn't count the little guy out 'til the fat lady sings and the cows come home.

Also just goes to show, I shouldn't be blogging after an emotional heart-jerker like that one.




Sunday, February 03, 2008 9:38:19 PM (Pacific Standard Time, UTC-08:00)
 Saturday, February 02, 2008
My Secret (?) Shame (Or, Building Parrot 0.5.2)

OK, after a week of getting the Internet equivalent of Bad Mojo being sent my way by every Perl developer on the planet, I have to admit something that may strike readers as inconsistent and incongruous.

I want Parrot to work.

I don't really care about Perl 6, per se. As I've said before, the language has a lot of linguistic inconsistencies and too many violations of the Principle of Least Surprise to carry a lot of favor with me. Whether Perl-the-language lives or dies really doesn't make a significant dent in my life.

But Parrot.... now there's something I care about.

Following the open debate on Perl (a surprising side-effect, given the subject matter of the post that spawned it), and chromatic's insistence that Parrot development was moving along, I decided to give in to my secret hopes, and pull the Parrot bits down again for a look-see.

In the spirit of the OpenJDK post last month, this is a quick chronicle of how I got Parrot to build on a Win32 system.

Installation details

Just for the record, I'm doing this in a VMWare image (one in which I keep all the languages I play with) with both Visual Studio 2008 and Visual Studio 2005 installed. The Parrot docs explicitly reference using Visual Studio 2003 (or the free Visual C++ Toolkit, which has since turned into Visual C++ 2005 Express), but I'm going to first have a shot at it with VS 2008 before falling back to VS 2005. This shouldn't make any difference, because 2008 is supposed to be a superset of 2005, but... well, you know how that old chestnut goes.

svn co parrot

Checking Parrot's code out is easy: just svn co https://svn.perl.org/parrot/trunk parrot-svn . (I use the -svn suffix on directories to distinguish between svn-pulled source trees and downloaded source trees. Helps in case I ever need/want to pull down a named release and keep the svn-pulled source at the same time.) I pull all this into a directory underneath C:\Prg, so the total path to Parrot's source base is C:\Prg\parrot-svn.

Configure

From there, as with many Unix-based projects, you have to run the "Configure.pl" script. I opened up a VS 2008 Command Prompt, and used ActiveState's Perl [1] to run the Configure script. It chugs away and comes back with this message:

C:\Prg\parrot-svn>perl Configure.pl
Parrot Version 0.5.2 Configure 2.0
Copyright (C) 2001-2008, The Perl Foundation.

Hello, I'm Configure. My job is to poke and prod your system to figure out
how to build Parrot. The process is completely automated, unless you passed in
the `--ask' flag on the command line, in which case I'll prompt you for a few
pieces of info.

Since you're running this program, you obviously have Perl 5--I'll be pulling
some defaults from its configuration.

Checking MANIFEST.....................................................done.
Setting up Configure's default values.................................done.
Setting up installation paths.........................................done.
Tweaking settings for miniparrot...................................skipped.
Loading platform and local hints files................................done.
Finding header files distributed with Parrot..........................done.
Determining what C compiler and linker to use.........................done.
Determining whether make is installed..................................yes.
Determining whether lex is installed...............................skipped.
Determining whether yacc is installed..............................skipped.
Determining if your C compiler is actually gcc..........................no.
Determining whether libc has the backtrace* functions (glibc only)......no.
Determining Fink location on Darwin................................skipped.
Determining if your C compiler is actually Visual C++..................yes.
Detecting compiler attributes (-DHASATTRIBUTE_xxx)....................done.
Detecting supported compiler warnings (-Wxxx)......................skipped.
Enabling optimization...................................................no.
Determining flags for building shared libraries.......................done.
Determine if parrot should be linked against a shared library..........yes.
Determining what charset files should be compiled in..................done.
Determining what encoding files should be compiled in.................done.
Determining what types Parrot should use..............................done.
Determining what opcode files should be compiled in...................done.
Determining what pmc files should be compiled in......................done.
Determining your minimum pointer alignment......................... 1 byte.
Probing for C headers.................................................done.
Determining some sizes................................................done.
Computing native byteorder for Parrot's wordsize.............little-endian.
Test the type of va_ptr (this test is likely to segfault)............stack.
Figuring out how to pack() Parrot's types.............................done.
Figuring out what formats should be used for sprintf..................done.
Determining if your C library has a working S_ISREG.....................no.
Determining CPU architecture and OS...................................done.
Determining architecture, OS and JIT capability.......................done.
Generating CPU specific stuff.........................................done.
Verifying that the compiler supports function pointer casts............yes.
Determining whether your compiler supports computed goto................no.
Determining if your compiler supports inline...........................yes.
Determining what allocator to use.....................................done.
Determining if your C library supports memalign.........................no.
Determining some signal stuff.........................................done.
Determining whether there is socklen_t..................................no.
Determining if your C library has setenv / unsetenv...............unsetenv.
Determining if your platform supports AIO...............................no.
Determining if your platform supports GMP...............................no.
Determining if your platform supports readline..........................no.
Determining if your platform supports gdbm..............................no.
Testing snprintf......................................................done.
Determining whether perldoc is installed...............................yes.
Determining whether python is installed.........................yes, 2.5.1.
Determining whether GNU m4 is installed................................yes.
Determining whether (exuberant) ctags is installed......................no.
Determining Parrot's revision.......................................r25452.
Determining whether ICU is installed................................failed.
Generating C headers..................................................done.
Generating core pmc list..............................................done.
Generating runtime/parrot/include.....................................done.
Configuring languages.................................................done.
Generating makefiles and other build files............................done.
Moving platform files into place......................................done.
Recording configuration data for later retrieval......................done.
Okay, we're done!

You can now use `nmake' to build your Parrot.
After that, you can use `nmake test' to run the test suite.

Happy Hacking,
        The Parrot Team

C:\Prg\parrot-svn>

Looks good so far. I kick off nmake (which is still running as I write this). Note that the Configure script discovers ActiveState's Perl as part of its rummaging around on my system, so that's what it uses to do the build steps that require execution of Perl. I have no idea what the least-acceptable version of AS Perl is, but the version I have is one I pulled down probably about a year ago.

(Note: I have to admit, the Configure stuff is slick. I don't like opening those files and looking at what's in there, but you'll never hear me criticize the existence of Perl, for this reason alone: having a scripting language that can rummage around your machine and figure out the paths to all the cr*p it needs to build is a hideously useful thing. I do admit to wishing those scripts were written in something I feel better about reading, though, like Ruby, but this is a practice that far pre-dates me, so I'll just shut up and ride along because I find it useful when it works. As it does here.)

Note to the Parrot guys: under VS 2008, the build generates a ton of warnings. Most of all, VS 2008 complains about the use of the Wp64 flag, which it says is deprecated and will be removed in a future release. (Chromatic, if you want a full build log, I can clean-and-build again and send you the piped output, if it'll help.)

After about 10 minutes of disk churn and a ton of warnings reported (most of which seem to be just three or four warnings being repeated throughout the code, so either it's something in a couple of header files that're included from everywhere, or these are spurious warnings that could be turned off via a #pragma)... success! I have a parrot.exe, along with a few other .exe utilities, in the root of the parrot-svn directory.

Next step: "nmake test".

Well, clearly parrot must be working pretty well, because it's churning through a ton of tests with "ok" results for everything except that which is platform-specific (a la the Fink tests intended for Darwin/Mac OS X, which are obviously going to fail on my XP box and therefore get skipped). A couple of tests get skipped (in the compilers tree?) with explanations that I don't quite understand, but it doesn't look like these are errors, per se, so I'm willing to accept on faith that we're all kosher. So while the tests are still running, I'll post this and offer up kudos to chromatic and the crew for something that at least builds, runs, and passes a whole slew of unit tests. Now for the fun part--finding out how extensive PMC, PIR and PASM are, and thinking about how this VM fits in the Grand Scheme of Things against the Da Vinci Machine and the DLR and the JVM and the CLR.... :-)

(Note to self: must suggest to John Lam and the guys on the DLR team to invite chromatic up to the Lang.NET 2009 Symposium. If the Sun folks can be made to feel welcome on the Microsoft campus for this kind of event, then surely the Parrot guys can come and feel welcome and--hopefully--carry away some interesting ideas, too.)

Update: Well, might have spoken too soon, looks like the tests failed after all. To be exact, the tests hung for a while, and I Ctrl-C'ed the process because it didn't look like it was going anywhere; these are the last few lines:

t/library/cgi_query_hash.....................ok
t/library/coroutine..........................ok
t/library/data_escape........................ok
        1/22 skipped: test not written
t/library/dumper.............................ok
t/library/File_Spec..........................ok
t/library/getopt_obj.........................ok
t/library/iter...............................ok
t/library/md5................................ok
t/library/mime_base64........................ok
t/library/parrotlib..........................ok
t/library/pcre...............................
t/library/pcre...............................NOK 1#     Failed test (t/library/p
cre.t at line 48)
# Exited with error code: 1
# Received:
# ok 1
# ok 2
# Null PMC access in invoke()
# current instr.: 'parrot;PCRE;compile' pc 118 (C:\Prg\parrot-svn\runtime\parrot
\library\pcre.pir:127)
# called from Sub 'main' pc 83 (C:\Prg\parrot-svn\t\library\pcre_1.pir:49)
#
# Expected:
# ok 1
# ok 2
# ok 3
# ok 4
# ok 5
#
# Looks like you failed 1 test of 1.
t/library/pcre...............................dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 1
        Failed 1/1 tests, 0.00% okay
t/library/pg.................................Terminating on signal SIGINT(2)
NMAKE : fatal error U1077: NMAKE : fatal error U1058: terminated by user
Stop.

C:\Prg\parrot-svn>

Not sure what this means, but bear in mind, this is off today's tip, so it may be a temporary thing.

 

 

 

[1] Why, you may ask, do I have ActiveState's Perl installed if I so despise the language? Rotor (SSCLI 2.0) uses it as part of its build process, and I like spelunking with Rotor, as some of you will have noticed.


Languages | Parrot | Windows

Saturday, February 02, 2008 6:43:19 PM (Pacific Standard Time, UTC-08:00)
Diving into the Grails-vs-Rails wars (Or, Here we go again....)

Normally, I like to stay out of these kinds of wars, but this post by Stu (whom I deeply respect and consider a friend, though he may not reciprocate by the time I'm done here) just really irked me somewhere sensitive. I'm not entirely sure why, but something about it just... rubbed me the wrong way, I guess is the best way to say it.

Let's dissect, shall we?

Stu begins with the following two candidates:

1. Joe has a problem to solve. The problem is specific, the need is immediate, and the scope is well-constrained.
2. Jane has a problem to solve. The problem is poorly understood, the need is ongoing, and the scope is ambiguous.

For starters, Joe doesn't exist. Or rather, exists only in the theoretical. Of course, neither does Jane really exist, either. Fact is, almost all projects are a combination of Joe and Jane. More importantly, Stu's effort here to force people into the "either/or" approach to categorization is a subtle (or perhaps not so) ploy to force people into the decision-making path he thinks should be taken.

It's sort of like saying, most people fall into two categories:

  1. Joe lives in Ghettopia, where all the men are dumb, the women are ugly, and the children are rejects from the ADHD Clinic.
  2. Jane lives in Utopia, where all the men are smart, the women are good-looking, and the children are well-behaved.

Think about it: you're at work, you have a project, and you happen across Stu's page. Faced with the typical project (too little time, too few resources, too vague in the understanding of requirements and domain comprehension), with whom are you likely to identify? Disturbingly happy Joe, who has a specific problem in a well-constrained scope? Hardly. So from the beginning, you're expected to identify with Jane, which (not surprisingly) leads you into Stu's preferred conclusion.

He goes on:

How should Joe and Jane think differently about software platforms?

   1. Joe's platform needs to be mainstream. It needs to offer immediate productivity, and the toolset should closely match the problem. Also, Joe doesn't want to climb a learning curve.
   2. Jane's needs are quite the opposite. Jane needs flexibility. She needs glue that doesn't set. She needs a way to control technical debt (Joe doesn't care.)

For my part, I am interested in Jane's problems. (And anyway, Joe often discovers he is actually Jane midway through projects.)

Hey, Stu, quick reality check for ya: most developers want all of the above. It's not a binary choice, productivity and toolset vs. flexibility and dynamism. The fact is, the Java language has a degree of flexibility, just not as much as is offered by the Ruby language. For that matter, if you want real flexibility, maybe you oughta look into Lisp, or even Smalltalk, since it (ST) can get at the underlying stack frames from the ST language itself! Now that's flexibility you Ruby guys can only dream of. (Oh, I know, Rubinius will give you that flexibility. Someday. Justin even alludes to how Rubinius is essentially an attempt to recapture that dynamism from Smalltalk. Ironic, then, isn't it, that the guys who wrote the fastest Smalltalk VM on the planet (Strongtalk, which is open-source now, by the way) ended up working at Sun... on the thing that later came to be called Hotspot? You think maybe they have a little familiarity and experience with VMs?)

And that crack about "control technical debt (Joe doesn't care)"?

Bullshit.

Let me repeat that in case you missed it: BULL-SHIT.

Joe and Jane both care about technical debt. Each may be willing to spend their currency on different problems, granted, but both of them care about technical debt. Not caring about technical debt is what got Chandler into trouble, and it had nothing to do with language or tools whatsoever. It's insulting to suggest that either of them doesn't care about technical debt, particularly the guy that chooses differently than you.

(Shame on you, Stu. You know better. Quit trolling.)

We continue:

So how does this affect platform choice? If you are Joe, you care about specific details about what a toolset can do right now. Most of Graeme's Top 10 reasons are in the "Right here, right now" category. This is true regardless of whether you think he is right. (Sometimes he is, sometimes not.)

I'll grant you, some of Graeme's Top 10 reasons are a bit spurious, and Stu-and-company do a good job of pointing those out. Frankly, anybody who makes a technical selection based on version numbers or whether or not a book exists for it seems to be missing the point, if you ask me. Of far greater concern is the stability of the language/tool, or the wealth of documentation for it. (And yes, this may seem to fly in the face of my arguments against Parrot a few posts ago; actually, it doesn't. If Parrot were more stable and/or more fully fleshed out, and the version updates just kept going, I'd be happy to say, "Go get this thing and give it a spin". But it doesn't feel stable to me, so I can't.)

But Stu's argument here is spurious: I don't care if you're Joe or if you're Jane, you always care about specific details about what a toolset can do, right now or otherwise. Certain concerns may be concerns that you can put off until later, but those concerns are always a part of the platform selection. Consider a hypothetical for a second: you currently are developing on Windows, and your project will run on Windows servers, with a possibility that it may need to run on non-Windows servers at some point in the future. Do you consider .NET or not? This is exactly the kind of detail that needs to be discussed--how likely is the move to a non-Windows server? If it's <25%, then the CLR and ASP.NET might be a good choice, particularly if your developers are less "plumbing wonk" than "GUI designer", and you rely on being able to move the assemblies to a non-Windows server later via Mono.

Note: I'm not suggesting this is a good choice in all scenarios. I'm making the point that the details of the toolset matter in your choice of toolsets, based on what your particular project needs are.

Jane cares just as much about toolset details as Joe does. I can't imagine a scenario where either of them doesn't care.

To continue:

My advice to Joe: Know exactly what you need, and then pick the platform that comes closest to solving it out of the box. Depending on Joe's needs, either Rails or Grails might be appropriate (or neither!). A particular point in Grails' favor would be an established team of Spring ninjas.

"Know exactly what you need"? Ah, right, because Joe belongs to that .01% of projects that have "specific problems, immediate need, and well-constrained scope". Nothing like conceding a point to the other guys, in preparation for the "killer blow":

If you are Jane, you care more about architecture. I mean this term in two senses:

   1. Architecture: the decisions you cannot unmake easily.
   2. Architecture: the constraints on how you think and work.

If you are Jane, you care about how and why the platform was assembled, because you are likely to have to adapt it quite a bit.

You know, I don't think I've ever been on a project where I didn't care about architecture or about having to "adapt it quite a bit". Of course, back in the days when I was writing C++, this meant either subclassing CWnd or TWindow in interesting ways, or else sometimes even going so far as to reach into the source code and make some tweaks, either at compile-time or through some well-established hackery. (Yes, I wrote a template class called THackOMatic that allowed me to bang away on private fields. Sue me. It worked, I documented the hell out of it, and ripped the hack back out once the bug was fixed.) Point is, both Joe and Jane care about the architecture.

Now, I think what Stu means here is that the architecture of the web framework is more malleable in Rails than it is in Grails, because Rails is written on top of Ruby and Grails is written on top of Groovy, Spring, the JEE container architecture, and Java:

Most of the commenters on my earlier post (and Graeme in his addendum) correctly identified the real architectural difference between Grails and Rails. Rails builds on Ruby, while Grails builds on Groovy and Spring.

Yes! I agree with this so far. (In fact, everybody should, because these are simple statements of fact.) But then Stu takes the cake for the Best Parting Non-Supported Shot Ever:

Rails wins this architecture bakeoff twice:

    * Ruby is a better language than Groovy.
    * Spring does most of its heavy lifting in the stable layer, which is not the right place.

Huh?

Ruby is perhaps a more flexible language than Groovy (and that's an arguable point, folks, and one which I really don't care to get into), but Ruby also runs on a less-flexible and less-scalable and less-supported platform than Groovy. I dunno that this makes Ruby better. It simply makes it different. Try convincing your IT guys to add yet another platform into their already-overwhelmingly complex suite of tools, particularly given the surprisingly sparse amount of monitoring information that the Ruby platform offers. Stu may want to argue that Ruby-the-language is more flexible, regardless of what platform it runs on, and if so, then we're arguing languages not platforms, and while he might win much of his "Ruby is a better language than Groovy" argument, he's going to lose the "Ruby is more dynamic than Groovy" argument, because on the JVM they have to be implemented under the same set of restrictions. You can't have it both ways.

(By the way, if you're one of those Ruby/Rails enthusiasts who's going to counterclaim that "Ruby-meaning-MRI is fast enough", I've heard the argument, and I think it's specious and ignorant. "Fast enough" is an argument that rests on your project being able to remain within the expected performance and scalability curve known at the beginning of the project, and remember, Jane's problem is that she doesn't know those sorts of things yet. So either you know, and have some better scope around the problem than Stu gives credit to Jane for having, or else you don't know, and can't assume that the Ruby interpreter will be able to handle the load.)

And WTF is up with the idea that "Spring does most of its heavy lifting in the stable layer, which is not the right place"? I think Stu means to say that Spring is a static layer, not stable layer[1], because hey, stability is kinda important to a few folks. (I'll give Stu the benefit of the doubt here and assume he cares about stability, too. I know his customers do.) Spring has its flaws, mind you, but arguing that it's not up to the heavy lifting seems to be like arguing that Java cannot scale. (Even Microsoft has given up on that argument, by the way.)

The worst part of this is, I've had discussions like this with Stu in the past, and he's much more articulate about it in person than he is in this blog post. Frankly, I think the most interesting space here is the intersection of Graeme's and Stu's positions, which is to say JRuby (and IronRuby or Ruby.NET, but that's for a different platform and out of the scope of this discussion entirely... yet still compelling and relevant, strangely enough). At the end of the day, these arguments about "my web framework is better than your web framework" are really just stupid. (As long as you're not trying to claim that Perl is the best web framework, anyway. Yes, Perl enthusiasts, I'm picking on you.)

My advice to Jane: Rails over Grails.

My advice to Jane: pick a consulting firm that doesn't have preconceived dogma about which web framework... or language, or any other toolset... to use. [2]

And if Jane can't afford a consulting firm, then Jane needs to do the research on her own and make her own decision based on the problem set, the context, and the whole range of tools available to her. (Anybody making a decision solely on the basis of a blog-post-flame-war deserves what they get, regardless.)

As for Joe? Well, Joe could probably benefit from the goodness inherent in the dynamic languages that are popping up all over the place, too, not to mention the goodness inherent in the type-inferred languages that are starting to poke their heads through the Barrier of Adoption, all the while not ignoring the fact that he could probably benefit from the inherent performance and scalability of the major virtual machine technologies that have been a decade or more in production...

Meaning Joe probably needs to go through the same decision-making criteria Jane does. Thank God both of them, it turned out, work on the same project, as is often the case.

Meanwhile, I'm done with this thread. It's a pointless, stupid argument. Use the right tool for the job. Or, if you prefer, "From each language, according to its abilities, to each project, according to its needs."

Just remember that both shipping and supporting are features, too. Don't neglect the other in favor of the one.

 

 

 

[1] Yes, I saw the hyperlink to Ola's post about languages, and his definitions therein. Ironically, Ola's own comments there state that "Java is really the only choice here", which directly contradicts Stu's choice of MRI (the native Ruby interpreter). More importantly, I think Stu's point is resting on the static nature of the Java layer in Groovy, and while it's certainly more flexible to be able to hack at any layer of the stack, this is only realistically possible in small applications--this isn't my opinion, it's the opinion of Gregor Kiczales, who spent many years in CLOS and determined that CLOS's extremely flexible MOP system (more so than what Ruby currently supports, in fact) led to inherent problems in larger-scale projects. It was this thought that led him to create AspectJ in the first place.

[2] By the way, if there's any temptation in you[3] to post commentary and say, "Dude, you just don't understand Ruby" or "How can you agree with Graeme this way?", just don't. I do understand Ruby, and I like the language. (Much more than I do Rails, anyway.) And I'm not intrinsically agreeing that Grails is better than Rails, because I don't believe that, either. I believe in the basic equation that says the solution you pick is the one that is the right solution to the given problem in the stated context that yields the most desirable consequences.

[3] This includes you, Stu. Or Justin, or Graeme, or anybody working for Relevance, or anybody working for G2One, Inc.


.NET | C++ | Java/J2EE | Languages | Ruby | Windows

Saturday, February 02, 2008 3:14:20 AM (Pacific Standard Time, UTC-08:00)
 Friday, February 01, 2008
Latest installment of "Pragmatic Architecture" (Data Access) is up ...

... here. (Yes, it's an MSDN web page, but the article itself--as have all of its brethren in the series--is actually quite technology-neutral.) Enjoy and flame away....




Friday, February 01, 2008 9:57:38 PM (Pacific Standard Time, UTC-08:00)
 Wednesday, January 30, 2008
Apparently I'm the #2 Perl Lover on the Internet

perllover

ROFL!

Update: Apparently, this post (and two more referencing it) pushed me to #'s 1-4 on the "perl lover" Google list, out of 250,000. That is just so wrong, on so many levels.... :-)




Wednesday, January 30, 2008 8:08:34 PM (Pacific Standard Time, UTC-08:00)
Highlights of the Lang.NET Symposium, Day Three (from memory)

My Mac froze when I tried to hook it up to the projector in the afternoon to do a 15-minute chat on Scala, thus losing the running blog entry in its entirety. Crap. This is my attempt to piece this overview together from memory--accordingly, details may suffer. Check the videos for verification when they come out. Of course, details were never my long suit anyway, so you probably want to do that for all of these posts, come to think of it...

I got to the conference about a half-hour late, owing to some personal errands in the morning; as I got there, Wayne Kelly was talking about his work on the Ruby.NET compiler.

Wayne Kelly: Parsing Ruby is made much harder by the fact that there is no Ruby specification to work from, which means the parser can't easily be generated from a parser generator. He tried, but couldn't get it to work cleanly and finally gave up in favor of getting to the "fun stuff" of code generation. Fortunately, the work he spent on the parser was generalized into the Gardens Point Parser Generator tools, which are also used in other environments and are included (?) as part of the Visual Studio SDK download. Good stuff. Ruby.NET uses a "wrapper" class around the .NET type that contains a hash of all the symbols for that type, which permits them to avoid even constructing (or even knowing!) the actual .NET type behind the scenes, except in certain scenarios where they have to know ahead of time. Interesting trick--probably could be used to great effect in a JSR-223 engine. (I know Rhino makes use of something similar, though I don't think they defer construction of the Java object behind the Rhino object.)
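For the curious, here's roughly how I picture that wrapper trick. This is a minimal Python sketch from my memory of the talk, not Ruby.NET's actual code: the LazyWrapper class, its responds_to method, and the make_list factory are all names I made up for illustration. The wrapper carries a set of member names and only builds the real object the first time one of those members is actually touched.

```python
class LazyWrapper:
    """Hypothetical stand-in for a wrapper around a host-platform type.

    It knows the member names (symbols) up front and defers building the
    real underlying object until a member is actually used.
    """

    def __init__(self, symbols, factory):
        self._symbols = set(symbols)  # names we can answer for without the real object
        self._factory = factory       # zero-argument callable that builds the real object
        self._target = None           # nothing constructed yet

    def responds_to(self, name):
        # Answered from the symbol table alone; no construction needed.
        return name in self._symbols

    @property
    def constructed(self):
        return self._target is not None

    def __getattr__(self, name):
        # Python only calls this when normal lookup fails, i.e. for wrapped members.
        if name.startswith("_") or name not in self._symbols:
            raise AttributeError(name)
        if self._target is None:
            self._target = self._factory()  # deferred construction happens here
        return getattr(self._target, name)


builds = []  # records each time the factory really runs


def make_list():
    builds.append("built")
    return [3, 1, 2]


wrapped = LazyWrapper({"count", "index"}, make_list)
```

Asking wrapped.responds_to("count") never runs the factory; the first real call, such as wrapped.count(3), does, and later calls reuse the same object.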

In general, I'm hearing this meme that "Ruby's lack of a specification is making X so much harder". I hate to draw the parallel, but it's highly reminiscent of the state of Perl until Perl 6, when Larry decided it was finally time to write a language specification (and the language has languished ever since), but maybe it's time for Matz or another of the Ruby digerati to sit down and write a formal specification for the Ruby language. Or even just its grammar.

Luke Hoban: Luke is the PM on the F# team, which is a language that I've recently been spending some quality time with, so I'm looking forward to this talk and how he presents the language. (Note to self: steal slides... I mean leverage slides... for my own future presentations on F#.) Not surprisingly, he makes pretty heavy use of the F# Interactive window in Visual Studio, using a trick I hadn't known before this: swipe some text in the editor, then press Alt-Enter, and it sends it to the Interactive window for execution. Nifty.

Then he starts showing off F#'s fidelity to the underlying CLR, and just for effect creates a DirectX surface and starts graphing functions on it. Then he starts playing with the functions, while the graph is still up, which has the neat effect of changing the function's graph in the DirectX surface without any explicit additional coding. Then he tightens up the mesh of the graph, and adds animation. (Mind you, these are all one-to-four lines of F# at a time he's pasting into the Interactive window.) What gets even more fun is when he pastes in a page and a half more of F# code that introduces balls rolling on the graphed surface. Very nifty. Makes Excel's graphing capabilities just look silly by comparison, in terms of "approachability" by programmers.

I will say, though, that I think that the decision to use significant whitespace in F# the same way Python does is a mistake. We don't have to go back to the semicolon everywhere, but surely there has to be A Better Way than significant whitespace.

Harry Pierson: Harry works in MS IT, so he doesn't play with languages on a regular basis, but he likes to explore, and recently has been exploring Parsing Expression Grammars, which purport to be an easier way to write parsers based on an existing grammar. He shows off some code he wrote in F# by hand to do this (a port of the original Haskell code from the PEG paper), then shows the version that Don (Syme) sent back, which made use of active patterns in F#. (Check out Don's Expert F# for details.)
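If you haven't bumped into PEGs before, the core idea is small enough to sketch. What follows is my own toy in Python, not Harry's F# or the Haskell from the paper; the grammar and every function name here are assumptions for illustration. Each parser takes the text and a position and returns either a (value, new position) pair or None, and "choice" is ordered: the first alternative that matches wins.

```python
DIGITS = "0123456789"


def literal(s):
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse


def number(text, pos):
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    return (int(text[pos:end]), end) if end > pos else None


def sequence(*parsers):
    # Every parser must match, one after the other.
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def choice(*parsers):
    # Ordered choice: try alternatives in order, keep the first success.
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


def many(parser):
    # Zero or more repetitions; never fails.
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None:
                return values, pos
            value, pos = result
            values.append(value)
    return parse


# Expr <- Term ('+' Term)*
def expr(text, pos):
    result = sequence(term, many(sequence(literal("+"), term)))(text, pos)
    if result is None:
        return None
    (first, rest), pos = result
    return first + sum(value for _, value in rest), pos


# Term <- Number / '(' Expr ')'
def term(text, pos):
    def parenthesized(text, pos):
        result = sequence(literal("("), expr, literal(")"))(text, pos)
        if result is None:
            return None
        (_, value, _), pos = result
        return value, pos
    return choice(number, parenthesized)(text, pos)


def evaluate(text):
    result = expr(text, 0)
    if result is None or result[1] != len(text):
        raise ValueError("parse error")
    return result[0]
```

The grammar comments map one-to-one onto the functions, which is most of the appeal: the parser reads like the grammar it implements.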

Harry prefaced this talk with his experience talking with the creators of Glassbox (a C#-based tool that wanted to do something similar to what the C# mixins guys were doing from yesterday), and when he heard how much pain they were going through taking the Mono C# compiler and hacking it to introduce their extensions, he realized that compilers needed to be more modular. I had an interesting thought on this today, which I'll talk about below.

Magnus ???: Again, this was a lightning talk, a quick-hit lecture on the tool that his company is building, and I can't tell if the name of the tool was Intentional Software, or the name of the company was Intentional Software, or both. It's a derivative of what Charles Simonyi was working on at Microsoft (Intentional Programming), and basically they're presenting programming language source trees in various ways while preserving the contents of the tree. So, for example, he takes some sample code (looked like C#, I don't think he said exactly what it was--assume some random C-family language), and presto, the curly braces are now in K&R style instead of where they belong (on new lines). Yawn. Then he presses another button, and suddenly the mathematical expressions are using traditional math "one over x" (with the horizontal line, a la MathML-described output) instead of "one slash x". That got a few people's attention. As did the next button-press, which essentially transformed whole equations in code into their mathematical equivalents. Then, he promptly button-presses again, and now the if/else constructs that are part of the equation are displayed inside the equation as "when"/"otherwise" clauses. Another button press, and suddenly we have a Lisp-like expression tree of the same function. Another button press, and we have a circuit diagram of the same function.
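The "one tree, many views" idea is easy to approximate in miniature. Here's a small Python sketch, purely my own illustration and nothing to do with how Intentional's tool is actually built; the Num, Var and BinOp node types and the two rendering functions are names I invented. The tree is stored once, and each "button press" is just a different function that walks it.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, Var, BinOp]


def as_infix(node):
    # C-family view: operator between fully parenthesized operands.
    if isinstance(node, Num):
        return str(node.value)
    if isinstance(node, Var):
        return node.name
    return f"({as_infix(node.left)} {node.op} {as_infix(node.right)})"


def as_sexpr(node):
    # Lisp-like view: operator first, then operands.
    if isinstance(node, Num):
        return str(node.value)
    if isinstance(node, Var):
        return node.name
    return f"({node.op} {as_sexpr(node.left)} {as_sexpr(node.right)})"


# "one over x, plus two", stored once as a tree
tree = BinOp("+", BinOp("/", Num(1), Var("x")), Num(2))
```

Calling as_infix(tree) and as_sexpr(tree) gives two different texts for the same untouched tree. The hard part, which this sketch doesn't attempt, is editing through a view and keeping full fidelity back to the source.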

Wow. I'm floored. With this, completely non-programmer types can write/edit/test code, with full fidelity back to the original textual source. And, in fact, he promptly demonstrates that, with a table-driven representation of some business rules for a Dutch bank. It's a frickin' spreadsheet we're looking at, yet underneath (as he shows us once or twice), it's live and it's really code.

Combine this with some unit tests, and you have a real user-friendly programming environment, one that makes Rails look amateurish by comparison.

Now, if this stuff actually ships.... but this talk leads me to some deeper insight in conjunction with Harry's comments, which I'll go into below.

Wesner Moise: Wesner presents his product, NStatic, which is a static analysis tool that scans .NET assemblies for violations and bugs, much in the same way that FindBugs does in the Java space. It operates on binary assemblies (I think), rather than on source files the way FxCop does (I think), and it has a very spiffy GUI to help present the results. It also offers a sort of "live" view of your code, but I can't be certain of how it works because despite the fact that he takes the time to fire it up, he doesn't actually walk us through using it. (Wesner, if you read this, this is a HUGE mistake. Your slides should be wrapped around a demo, not the other way around. In fact, I'd suggest strongly ditching the slides altogether and just bring up an assembly and display the results.)

As readers of this blog (column?) will know, I'm a big fan of static analysis tools because I think they have the advantageous properties of being "always on" and, generally, "extensible to include new checks". Compilers fall into the first category, but not the second, in addition to being pretty weak in terms of the checks they do perform--given the exposure we're getting to functional languages and type inferencing, this should change pretty dramatically in the next five years. But in the meantime, I'm curious to start experimenting with the AbsIL toolkit (from MS Research) and F# or a rules engine (either a Prolog variant or something similar) to do some of my own tests against assemblies.

Unfortunately, it's a commercial product, so I don't think source will be available (in case you were wondering).
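To make the "always on" and "extensible to include new checks" point a bit more concrete, here's a toy checker. It's a sketch under loud assumptions: it uses Python's standard ast module against Python source, not AbsIL against .NET assemblies, and the CHECKS registry, the check decorator and both rules are things I invented for the example. Adding a new check means writing one more decorated function.

```python
import ast

CHECKS = []  # hypothetical registry of rule functions


def check(func):
    """Register a rule; each rule yields (line, message) pairs for a parsed tree."""
    CHECKS.append(func)
    return func


@check
def bare_except(tree):
    for node in ast.walk(tree):
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            yield node.lineno, "bare 'except:' swallows every exception"


@check
def mutable_default(tree):
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for default in node.args.defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    yield default.lineno, f"mutable default argument in '{node.name}'"


def analyze(source):
    """Run every registered rule over the source and return sorted findings."""
    tree = ast.parse(source)
    findings = []
    for rule in CHECKS:
        findings.extend(rule(tree))
    return sorted(findings)


SAMPLE = '''
def add(item, bucket=[]):
    try:
        bucket.append(item)
    except:
        pass
    return bucket
'''
```

Running analyze(SAMPLE) reports the mutable default and the bare except, each with its line number. Wire something like that into the build and it's "always on" in exactly the sense I mean.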

Chuck ...: Chuck stands up and does a quick-hit lecture on his programming language, CORBA... er, sorry about that, flashback from a bad acid trip. I mean of course, the language Cobra, which according to his blog he was working on at the Lang.NET 2006 Symposium. It's a Python derivative (ugh, more significant whitespace) with some interesting features, including a combination of static and dynamic typing, contracts, a "squeaky-clean" syntax, first-class support for unit tests (directly in the method definition!), and uses source-to-source "compilation", in this case from Cobra to C#, rather than compilation directly to IL.

It's a fascinating little piece of work, and I'm planning on playing with it some.

Miguel de Icaza: Miguel is another one of those who has more energy than any human being should have a right to, and he spends the entire talk in fast-forward mode, speaking at a rapid-fire pace. He talks first of all about some experiences with Mono and the toolchain, then gets around to the history of Moonlight ("Can you give us a demo in 3 weeks?") and their (Mono's/Novell's) plans to get Moonlight out the door. They're already an impressive amount of the way there, but they have to make use of a "no-release" codecs library that also (potentially) contains some copyrighted stuff, so they're instead going to incorporate Microsoft codecs, which they have rights to thanks to the Microsoft/Novell agreement of n months ago.

The thought of all these Linux devs running Microsoft code in their browser as they work with Moonlight just tickles my demented funny bone to no end.

He then switches tacks, and moves into gaming, because apparently a number of game companies are approaching Novell about using Mono for their gaming scripting engine. (It is being adopted by Second Life, apparently, but the demo tanks because the Second Life servers aren't up. That, or the Microsoft firewall is doing its job.) He jumps into some discussion about UnityScript, a {ECMA/Java}Script-like language for a game engine (called Unity, I think) that Rodrigo (creator of Boo) was able to build a parser for (in Boo) in 20 hours.

He then demonstrates the power of game engines and game editors by giving a short demo of the level editor for the game. He modifies the Robot bad guys to shoot each other instead of the player. If you're a game modder, this is old hat. If you're a business programmer, this is wildly interesting, probably because now you have visions of pasting your boss' face on the robots as you blast them.

Aaron Marten and Carl Brochu: I think his co-presenter's name was Carl something, but memory fails me, sorry. These two are from the Visual Studio Ecosystem team (which I think gets the prize for strangest product team name, ever), and they're here to give an overview of the Visual Studio integration API and tooling, with some sample code around how to plug into VS. This is good, because not an hour or two before, during Chuck's Cobra talk, he was talking about wanting to integrate into VS as a formal "next steps" for his language. Frankly, the whole area of IDE integration is a Dark Art to most folks (ranking behind custom languages, but still high up there), and the more the VSX team can do to dispel that myth, the more we'll start to see interesting and useful plugins for VS a la what we see in the Eclipse space. (Actually, let's hope the plugins we see for VS work more than a quarter of the time--Eclipse has become the dumping ground for every programmer who had an idea for a plugin, created a space on Sourceforge, wrote twenty lines of code, then got stuck and went away, leaving a nifty idea statement and a plugin that crashes Eclipse when you fire it up, not that I'm bitter or anything.)

The code demo they show off is a RegEx language colorization sample, nothing too terribly useful but still a nice small example of how to do it in VS. As VS starts to put more and more of a managed layer into place inside of VS, this sort of thing should become easier and easier, and thus a lot more approachable to the Average Mortal.

Me: I did a 15-minute presentation on Scala, since the name had come up a few times during the week, and promptly watched in horror as hooking my Mac up to the overhead projector locked the Mac completely. Ugh. Hard reboot. Ugh. Shuffle and dance about the history of Scala while waiting for the Mac to reboot and the VMWare image in which I have Scala installed to reboot. Ugh. I have no prepared slides, so I open up a random Scala example and start talking briefly about the syntax of a language whose list of features alone is so long it would take all fifteen minutes just to read aloud, much less try to explain. Cap it off with a leading question from Don Box ("Is this Sun's attempt to catch up to the C# compiler, given that Java is 'done' like the Patriots or the Dolphins?") that I try to answer as honestly and truthfully as possible, and a second question from Don (again) that forcefully reminds me that I'm out of time despite the "5 min" and "1 min" signs being held up by the guy next to him ("What would you say, in the two minutes you have left to you, is the main reason people should look at Scala?"), and I can safely say that I was thoroughly disgusted with myself at presenting what had to be the crappiest talk at the conference. *sigh*

That's it, no more presentations on technical topics, ever.

OK, not really, but a man can dream....

Don Box and Chris Andersen: I had to leave about ten minutes into their talk, so I still have no idea what Don and he are working on deep inside their incubating little cells in Microsoft. Something to do with "modeling and languages", and something that seeks to bring data to the forefront instead of code. *shrug* Not sure what to make of it, but I'm sure the video will make it more clear.

Meanwhile...

Overall: Here are some thoughts I think I think:

  • A blog is not a part of your presentation, and your presentation is not part of your blog. I find it frustrating when speakers say, in their presentation, "Oh, you can find Y on my blog" and don't go into any more detail about it. I don't want to have to go look up your blog after the talk, when the context of the question or situation is swapped out of memory, and I don't want to have to go look it up during your presentation and miss whatever follows in your talk. If you blogged it, you should be able to give me a 30-second summary about the blog entry or what not, enough to tell me whether or not I want the deeper details of what's on your blog. Exception: files that contain examples of a concept you're discussing or sample code or whatnot.
  • Don't hook your Mac up to the projector when you have a VMWare session on an external USB disk running. This happened to me at Seattle Code Camp, too, with the same result: Mac lockup. Dunno what the deal is, but from now on, the rule is, connect thy Mac, then fire up thy suspended VMWare VM.
  • Language design and implementation is a lot more approachable now than it was even five years ago. Don't assume, for even a second, that the only way to go building a "DSL" or "little language" is by way of Rake and Rails--it's still a fair amount of work to build a non-trivial language, but between parser combinators and toolkits like the DLR and Phoenix, I'd go head-to-head against a Ruby-based DSL development process any day of the week.
  • Don't go in front of Don Box at a conference. Dude may like to go long on his own talks, but man, he watches the clock like a hawk when it's time for him to start. (I may sound like I'm angry at Don--I'm not--but I'm not going to resist a chance to poke at him, either. *grin*)
  • Modular tool chains are the future. Actually, this is a longish idea, so I will defer that for a future post.
  • This conference rocks. It's not the largest conference, you get zero swag, and the room is a touch crowded at times, but man, this little get-together has one of the highest signal-to-noise ratios of any get-together I've been to, and without a doubt, within the realm of languages and language design, this is where the Cool Kids like to hang out.

Bye for now, and thanks for listening....


.NET | C++ | Conferences | Java/J2EE | Languages | Ruby

Wednesday, January 30, 2008 7:32:14 PM (Pacific Standard Time, UTC-08:00)
 Tuesday, January 29, 2008
What about Context?

Andrew Wild emails me:

I vaguely remember one of your blog posts in which you went into a bit of an exposition of 'context'.
Did you ever come up with anything solid or did you wind up talking yourself in self-referential circles?

Because that post was actually a part of the old weblog hosted at neward.net, I decided to repost it and the followup discussion to this blog in order to make it available again, although the WayBack Machine also has it and its followup tucked away.

Context

I'm not normally one to promote myself as a "pattern miner"--those who "discover" patterns in the software systems around us--since I don't think I have that much experience yet, but one particular design approach, a "patlet", if you will, has been showing up with such frightening regularity (such as Sandy Khaund's mention of EDRA, the format of a SOAP 1.2 message, which in itself forms a Context, and more), and yet hasn't, to my knowledge, been documented anywhere, that I thought I'd take a stab at documenting it and see what comes out of it. Treat this as an alpha, at best, and be brutal in your feedback.

Context (Object Behavioral)

Define a wrapper object that encapsulates all aspects of an operation, including details that may not be directly related to that operation. Context allows an object or graph of objects to be handled in a single logical unit, as part of a logical unit of work.

Motivation

Frequently an operation, which consists fundamentally of inputs and a generated output, requires additional information by which to carry out its work. In some cases, this consists of out-of-band information, such as historical data, previous values, or quality-of-service data, which needs to travel with the operation regardless of its execution path within the system. The desire is to decouple the various participants working with the operation from having to know everything that is being "carried around" as part of the operation.

In many cases, a Context will be what is passed around between the various actors in a Chain of Responsibility (223).
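A minimal sketch of the idea, written in Python purely for illustration; the class and handler names (`Context`, `attach_transaction`, `audit`, `shout`) and the `"tx-001"` value are invented here, not drawn from any of the systems mentioned. The Context carries the payload plus a bag of out-of-band items, and each handler in the chain touches only the parts it cares about.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Context:
    """Wraps an operation's payload plus out-of-band data that travels with it."""
    payload: Any
    # Transaction IDs, security info, history and so on live here, so each
    # participant reads only the keys it knows about.
    items: Dict[str, Any] = field(default_factory=dict)


Handler = Callable[[Context], None]


def attach_transaction(ctx: Context) -> None:
    # Adds out-of-band data; later handlers need not know who put it there.
    ctx.items.setdefault("transaction_id", "tx-001")


def audit(ctx: Context) -> None:
    # Records history without touching the payload itself.
    ctx.items.setdefault("history", []).append(f"saw {ctx.payload!r}")


def shout(ctx: Context) -> None:
    # The "real" operation: it only cares about the payload.
    ctx.payload = ctx.payload.upper()


def run_chain(ctx: Context, handlers: List[Handler]) -> Context:
    # Hand the same Context to each handler in turn.
    for handler in handlers:
        handler(ctx)
    return ctx


result = run_chain(Context("hello"), [attach_transaction, audit, shout])
print(result.payload, result.items)
```

Note that `shout` never learns a transaction ID exists, which is the decoupling the Motivation section is after.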

Consequences

I'm not sure yet.

Known Uses

Several distributed communication toolkits make use of Context or something very similar to it. COM+, for example, uses the notion of Context as an interception barrier, allowing for a tightly-coupled graph of objects to be treated as an atomic unit, synchronizing multi-threaded calls into the context, also called an apartment. Transactions are traced as they flow through different parts of the system, such that each Context knows the transaction ID it was associated with and can allow that same transaction ID (the causality) to continue to flow through, thus avoiding self-deadlock.

Web Services also make use of Context, using the SOAP Message format as a mechanism in which out-of-band information, such as security headers and addressing information, can be conveyed without "polluting" the data stored in the message itself. WS-Security, WS-Transaction, WS-Routing, among others, are all examples of specifications that simply add headers to SOAP messages, so that other "processing nodes" in the Web service call chain can provide the appropriate semantics.

(I know there are others, but nothing's coming to mind at the moment.)

Related Patterns

Context is often the object passed along a Chain of Responsibility; each ConcreteHandler in the Chain examines the Context and potentially modifies it as necessary before handing it along to the next Handler in the Chain.

Context is also used as a wrapper for a Command object, providing additional information beyond the Command itself. The key difference between the two is that Context provides out-of-band information that the Command object may not even know is there, for processing by others around the Command.

The followup looked like this:

Wow--lots of you have posted comments about Context. Let's address some of them and see what comes up in the second iteration:

  • Michael Earls wrote:

    Very timely. I'm building a system right now that fits this pattern. We spent about five minutes determining what to call "it" (the little "black box" that holds the core command, entity, and metadata information). We settled on "nugget". Now there's prior art I can refer to. I'm using Context with WSE 2.0 and SOAP extensions for the pipeline in exactly the way you describe. Nice.
    and
    Another Related Pattern: Additionally, the Context may also be an container/extension/augmentation/decoration on the UnitOfWork (???).
    I suspect you're right--Context can be used to hold the information surrounding a UnitOfWork, including the transaction it's bound to (if the transaction is already opened). This is somewhat similar to what the MTS implementation does, if I'm not mistaken.

  • Kris wrote:

    The HTTP pipeline in ASP.NET comes to mind, with the HttpContext being passed through for various things like session state, security, etc. One possible side effect that I can see (hopefully you can drop some thoughts on this one), is how to manage dependencies between the chain, as well as order of invocation of chain elements. The MS PAG stuff I believe talks about this somewhat with the Pipelines & Filters pattern, but I'd love to hear your thoughts as well.
    The PAG stuff (Sandy Khaund's post) was part of what triggered this post in the first place, but I want to be careful not to rely too much on Microsoft prior art (WSE, Shadowfax, HttpContext, COM/MTS/COM+) since in many cases those systems were designed by people who had worked together before and/or shared ideas. The Rule of Three says that the pattern needs to be discovered "independently" of any other system, although with Google around these days that's becoming harder and harder to do. :-) As to managing dependencies between the chain, I think that's out of scope to Context itself--in fact, that raises another interesting pattern relationship, in that Context can be the thing operated upon by a Blackboard [POSA1 71]. Context doesn't care who interacts with it when, IMHO.

  • Dan Moore wrote:

    Another context (pun intended? --TKN) I've read a lot about is the transactional context used in enterprise transaction processing systems. This entity contains information about the transaction, needed by various participants.
    Yep. Read on (Dan Malks' post).

  • Dan Malks wrote:

    Hi Ted, Good start with your pattern writeup :) We document "Context Object" in our pattern catalog in our book "Core J2EE Patterns" 2nd ed., Alur, Crupi, Malks. I hope you'll find some interesting content in our writeup, so please feel free to have a look and let me know what you think. Thanks, Dan Malks
    Thanks, Dan. As soon as you posted I ran off and grabbed my copy of your book, and looked it up, and while I think there's definitely some overlap (boy what I wouldn't give to workshop this thing at PLOP this year), Context Object, given the protocol-independence focus that you guys gave it in Core J2EE, looks like a potential combination of Context and Chain of Responsibility. I wanted to elevate Context out of just the Chain of Responsibility usage, though, to create something that isn't "part of" another pattern--I'll leave it to you guys to decide whether Context makes sense in that regard.

  • Mark Levison wrote:

    On related patterns: Context is what is passed into a Flyweight (alias Glyph's). We've been using Context for over two years on current project.
    Really? Wow; I never would have considered that, but of course it makes sense when you describe it that way. I'm not really keeping track, but I think we've reached the Rule of Three.

    By the way, if there's anybody listening in on this weblog that's going to the PLOP conference this year in Illinois, I would LOVE for you to workshop this one, if there's time. (I wish I could go, but I'm going to be otherwise occupied.) Drop me a note if you're going, are interested, and think there's still time to get it onto the program.

To answer your question, Andrew, no, I never did follow up on this further, but I think Context did emerge as a pattern at one of the PLoP conferences, though I don't know which one and can't find it via Google right now. (I write this at the Lang.NET conference, and I'm trying to keep up with the presentations.)




Tuesday, January 29, 2008 5:41:42 PM (Pacific Standard Time, UTC-08:00)
Highlights of the Lang.NET Symposium Day Two

No snow last night, which means we avoid a repeat of the Redmond-wide shutdown of all facilities due to a half-inch of snow, and thus we avoid once again the scorn of cities all across the US for our wimpiness in the face of fluffy cold white stuff.

Erik Meijer: It's obvious why Erik is doing his talk at 9AM, because the man has far more energy than any human being has a right to have at this hour of the morning. Think of your hyperactive five-year-old nephew. On Christmas morning. And he's getting a G.I. Joe Super Bazooka With Real Bayonet Action(TM). Then you amp him up on caffeine and sugar. And speed.

Start with Erik's natural energy, throw in his excitement about Volta, compound in the fact that he's got the mic cranked up to 11 and I'm sitting in the front row and... well, this talk would wake the dead.

Volta, for those who haven't seen it before, is a MSIL->JavaScript transformation engine, among other things. In essence, he wants to let .NET developers write code in their traditional control-eventhandler model, then transform it automatically into a tier-split model when developers want to deploy it to the Web. (Erik posted a description of it to LtU, as well.) He's said a couple of times now that "Volta stretches the .NET platform to cover the Cloud", and from one perspective this is true--Volta automatically "splits" the code (in a manner I don't quite understand yet) to run JavaScript in the browser and some amount of server-side code that remains in .NET.

A couple of thoughts came to mind when I first saw this, and they still haven't gone away:

  • How do I control the round trips? If Volta is splitting the code, do I have control over what runs locally (on the server) and what runs remotely (in the browser)? The fact that Volta will help break things out from synchronous calls is nice, but I get much better perf and scale from avoiding the remote call entirely. [Erik answers this later, sort of: use of the RunAtOrigin attribute on a class defines that class to run on the server. He also addresses this again later in the section marked "End-to-End Profiling". Apparently you use a tool called "Rotunda" to profile where the tier split would be most effective.]
  • How do I avoid the least-common denominator problem? Any time a library or language has tried to "cover up" the differences between the various UI models, it's left a bad taste in my mouth. Volta doesn't try to hide the markup, per se, but it's not hard to imagine a model where somebody says, "Well, if I write a control that I want to use in both WPF and HTML...."
  • Is JavaScript really fast enough to handle the whole .NET library translated into JS? This is a general concern for both GWT and Volta--if I'm putting that much weight on top of the JS engine, will it collapse under several megs of JS code and who-knows-how-much data/objects inside of it?

Still, the idea of transforming MSIL into some other interesting useful form is a cool idea, and one I hope gets more play in other ways, too.

Gilad Bracha: Gilad discusses Newspeak, a Smalltalk- and Self-influenced language that, as John Rose puts it, "is one of the world's smallest languages while still remaining powerful". It bases on message send and receive, a la Smalltalk, but there's some immutability and some other ideas in there as well, on top of a pretty small syntactic core (a la Lisp, I think). Most of the discussion is around Newspeak's influences (Smalltalk, Self, Beta, and a little Scala, plus some Scheme and E), with code examples drawn from a compiler framework. Most notably, Gilad shows how because the language is based on message-sends, it becomes pretty trivial to build a parser combinator that combines both scanning and actions by breaking lexing/scanning into a base class and the actions into a derived class. Elegant.
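To make the base-class/derived-class split concrete, here is a small Python sketch. It is not Newspeak and not a real combinator library; it is a hand-rolled recognizer for sums like "1 + 2 + 3", and the names `SumGrammar`, `Evaluator`, `on_number` and `on_add` are invented for this example. The base class owns scanning and grammar, while a subclass supplies the actions by overriding two hook methods.

```python
import re


class SumGrammar:
    """Recognizes NUMBER ('+' NUMBER)*; the default actions build a tree."""

    _token = re.compile(r"\s*(\d+|\+)")

    def _scan(self, text):
        # Lexing lives in the base class, next to the grammar.
        tokens, pos = [], 0
        text = text.strip()
        while pos < len(text):
            match = self._token.match(text, pos)
            if match is None:
                raise SyntaxError(f"unexpected character at offset {pos}")
            tokens.append(match.group(1))
            pos = match.end()
        return tokens

    def _expect_number(self, tokens, pos):
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("expected a number")
        return tokens[pos]

    def parse(self, text):
        tokens = self._scan(text)
        result = self.on_number(self._expect_number(tokens, 0))
        pos = 1
        while pos < len(tokens):
            if tokens[pos] != "+":
                raise SyntaxError("expected '+'")
            right = self.on_number(self._expect_number(tokens, pos + 1))
            result = self.on_add(result, right)
            pos += 2
        return result

    # Hooks: a subclass overrides these to change what parsing produces.
    def on_number(self, text):
        return text

    def on_add(self, left, right):
        return ("+", left, right)


class Evaluator(SumGrammar):
    """Same grammar, different actions: compute the value instead of a tree."""

    def on_number(self, text):
        return int(text)

    def on_add(self, left, right):
        return left + right


print(SumGrammar().parse("1 + 2 + 3"))
print(Evaluator().parse("1 + 2 + 3"))
```

Because the hooks are ordinary method calls dispatched on the receiver, swapping the actions requires no change to the grammar class at all.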

Unfortunately, no implementation is available, though Gilad strongly suggests that anybody who wants to see it should send him a letter on company letterhead so he can show it to the corporate heads back at the office in order to get it out to the world at large. I'm sufficiently intrigued that I'm going to send him one, personally.

Giles Thomas: Giles talks about Resolver One, his company's spreadsheet product, which is built in IronPython and exposes it as the scripting language within the spreadsheet, a la Excel's formula language and VBA combined. It's an interesting talk from several perspectives:

  1. he's got 110,000 lines of code written in IronPython and hasn't found the need to go to C# yet (implying that, yes, dynamic languages can scale(1))
  2. he's taking the position that spreadsheets are essentially programs, and therefore should be accessible in a variety of ways outside of the spreadsheet itself--as a back-end to a web service or website, for example
  3. he's attended a conference in the UK on spreadsheets. Think about that for a moment: a conference... on spreadsheets. That sounds about as exciting as attending the IRS' Annual Tax Code Conference and Social.
  4. he's effectively demonstrating the power of scripting languages exposed inside an application engine, in this case, the scripting language runs throughout the product/application. Frankly I personally think he'd be better off writing the UI core in C# or VB and using the IronPython as the calculation engine, but give credit where credit is due: it runs pretty damn fast, there was no crash ever, and it's fascinating watching him put regular .NET objects (like IronPython generators or lambdas) into the spreadsheet grid and use them from other cells. Nifty.
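A toy Python sketch of points 2 and 4, with the loud caveat that this is not how Resolver One is implemented; `Sheet`, `Formula` and `as_function` are names made up for the illustration. Cells hold arbitrary objects (including a lambda), formulas pull values from other cells, and the whole grid can be wrapped as an ordinary function for use outside the spreadsheet.

```python
class Formula:
    """Marks a cell whose value is computed from the sheet on every read."""

    def __init__(self, compute):
        self.compute = compute


class Sheet:
    """A toy grid: each cell holds either a plain object or a Formula."""

    def __init__(self):
        self._cells = {}

    def __setitem__(self, name, value):
        self._cells[name] = value

    def __getitem__(self, name):
        value = self._cells[name]
        # Formulas are recomputed lazily; anything else is returned as-is.
        return value.compute(self) if isinstance(value, Formula) else value


def as_function(sheet, input_cell, output_cell):
    """Expose the sheet as a plain callable: set one cell, read another."""
    def call(value):
        sheet[input_cell] = value
        return sheet[output_cell]
    return call


sheet = Sheet()
sheet["A1"] = 3
sheet["A2"] = lambda x: x * x                       # an ordinary object in a cell
sheet["A3"] = Formula(lambda s: s["A2"](s["A1"]))   # calls the lambda held in A2
sheet["A4"] = Formula(lambda s: s["A3"] + 1)

first = sheet["A4"]
square_plus_one = as_function(sheet, "A1", "A4")
second = square_plus_one(5)
print(first, second)
```

Once the sheet is just a function, putting it behind a web service or calling it from a batch job is ordinary plumbing.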

This is a really elegant design. I'm impressed. JVMers (thanks to JSR 223), CLRers (thanks to DLR), take note: this is the way to build applications/systems with emergent behavior.

Seo Sanghyeon: Seo had a few problems with his entirely gratuitous demo for his context-free talk (although I could've sworn he said "content-free" talk, but it was probably just a combination of his accent and my wax-filled ears). In essence, he wants to produce new backends for the DLR, in order to reuse the existing DLR front- and middle-ends and make lots of money (his words). I can get behind that. In fact, he uses a quote from my yesterday's blog (the "DLR should produce assemblies out the back end" one), which is both flattering and a little scary. ("Wait, that means people are actually reading this thing?!?")

Jim Hugunin had an interesting theme threaded through his talk yesterday that I didn't explicitly mention, and that was a mistake, because it's recurring over and over again this week: "Sharing is good, but homogeneity is bad". I can completely agree with this; sharing implies the free exchange of resources (such as assemblies and type systems, in this case) and ideas (at the very least), but homogeneity--in this case, the idea that there exists somewhere in space and time the One Language God Intended--is something that just constrains our ability to get stuff done. Imagine trying to access data out of a relational database using C++, for example.

Paul Vick: Paul's from the VB team [cue bad one-liner disparaging VB here], and he's talking on "Bringing Scripting (Back) to Visual Basic", something that I can definitely agree with.

Editor's Note: I don't know what Visual Basic did to anger the Gods of Computer Science, but think about it for a second: they were a dynamic language that ran on a bytecode-based platform, used dynamic typing and late name-based binding by default, provided a "scripting glue" to existing applications (Office being the big one), focused primarily on productivity, and followed a component model from almost the first release. Then, after languishing for years as the skinny guy on the beach as the C++ developers kicked sand on their blanket, they get the static-typing and early-binding religion, just in time to be the skinny guy on the beach as the Ruby developers kick sand on their blanket.

Oh, and to add insult to injury, the original code name for Visual Basic before it got a marketing name? Ruby.

Whatever you did, VB, your punishment couldn't have fit the crime. Hopefully your highly-publicized personal hell is almost over.

Paul points out that most VBers come to the language not by purchasing the VB tool chain, but through VBA in Office, and demos VB inside of Excel to prove the point. The cool thing is (and I don't know how he did this), he has a Scripting Window inside of Excel 2007 and demos both VB and IronPython in an interactive mode, flipping from one to the other. A couple of people have done this so far, and I'd love to know if that's a core part of the DLR or something they just built themselves. (Note to self: pick apart DLR code base in my copious spare time.) He does an architectural overview of the VB compilation toolchain, which is nice if you're interested in how to architect a modern IDE environment. The VB guys split things into Core services (what you'd expect from a compiler), Project services (for managing assembly references and such), and IDE services (Intellisense and so on). Note that the Project services implementation is different (and simpler) for the command-line compiler, and obviously the command-line compiler has no IDE services. Their goal for Visual Basic v.Next is to provide the complete range of Core/compiler, Project and even IDE services for people who want to use VB as a scripting engine, and he demos a simple WinForms app that hosts a single control that exposes the VB editor inside of it. Cool beans.

Serge Baranovsky: (Serge goes first because Karl Prosser has problems hooking his laptop up to the projector.) Serge is a VB MVP and works for a tools company, and he talks about doing some code analysis work. He runs a short demo that has an error in it (he tries to serialize a VB class that has a public event, which as Rocky Lhotka has pointed out prior to now, is a problem). The tool seems somewhat nice, but I wish he'd talked more about the implementation of it rather than the various patterns it spots. (The talk kinda feels like it was intended for a very different audience than this one.) Probably the most interesting thing is that he runs the tool over newTelligence's dasBlog codebase, and finds close to 4000 violations of Microsoft's coding practices. While I won't hold that up as a general indictment of dasBlog, I will say that I like static analysis tools precisely because they can find errors or practice violations in an automated form, without requiring human intervention. Compilers need to tap into this more, but until they do, these kinds of standalone tools can hook into your build process and provide that kind of "always on" effect.
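Since the talk didn't go into implementation, here is a guess at the general shape of such a tool, sketched in Python with the standard `ast` module. This is not Serge's product, and the single rule it checks (flagging bare `except:` clauses) is just a stand-in for a real rule set. The tool parses the source into a syntax tree, walks it, and collects violations without ever running the code.

```python
import ast


class BareExceptChecker(ast.NodeVisitor):
    """Collects (line, message) pairs for every bare 'except:' clause."""

    def __init__(self):
        self.violations = []

    def visit_ExceptHandler(self, node):
        # node.type is None when no exception class was named.
        if node.type is None:
            self.violations.append((node.lineno, "bare 'except:' hides errors"))
        self.generic_visit(node)


def check_source(source):
    """Parse the source text and return the list of rule violations."""
    checker = BareExceptChecker()
    checker.visit(ast.parse(source))
    return checker.violations


# Hypothetical input; the code is only parsed, never executed.
SAMPLE = '''\
try:
    risky()
except:
    pass
try:
    risky()
except ValueError:
    pass
'''

violations = check_source(SAMPLE)
for line, message in violations:
    print(f"line {line}: {message}")
```

To get the "always on" effect, a build script would run a checker like this over every source file and fail the build when the violations list is non-empty.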

Karl Prosser: Karl's talking about PowerShell, but I'm worried as he gets going that he's talking from a deck that's intended for an entirely different audience than this one. Hopefully I'm just being paranoid. As the talk progresses, he's right down the middle: he's showing off some interesting aspects of PowerShell-the-language, and has some interesting ideas about scripting languages in general (which obviously includes the PowerShell language) in the console vs. in a GUI, but he also spends too much time talking about the advantages of PowerShell-the-tool (and a little bit about his product, which I don't mind--he's got a kick-ass PowerShell console window). He also talks about some of the advantages of offering a console view instead of a GUI view, which I already agree with, and how to create apps to be scripted, which I also already agree with, so maybe I'm just grumpy at not hearing some more about experiences with PowerShell-the-language and how it could be better or lessons learned for other languages. He talks about the value of the REPL loop, which I think is probably already a given with this crowd (even though it most definitely wouldn't be at just about any other conference on the planet, with possible exception of OOPSLA).

One thing he says that I find worth contemplating more is that "Software is a 2-way conversation, which is why I dislike waterfall so much." I think he's mixing metaphors here--developing software may very well be a 2-way conversation which is why agile methodologies have become so important, and using software may very well also be a 2-way conversation, but that has nothing to do with how the software was built. User interaction with software is one of those areas that developers--agile or otherwise--commonly don't think about much beyond "Does the user like it or not?" (and sometimes not even that much, sadly). What makes this so much worse is that half the time, what the user thinks they want is nowhere close to what they actually want, and the worst part about it is you won't know it until they see the result and then weigh in with the, "Oh, man, that's just not what I thought it would look like."

Which raises the question: how do you handle this? I would tend to say, "I really don't think you'll like this when it's done", but then again I've been known to be high-handed and arrogant at times, so maybe that's not the best tack to take. Thoughts?

Wez Furlong: Wez is talking about PHP, which he should know about, because apparently he's a "Core Developer" (his quotes) of PHP. This promises to be interesting, because PHP is one of those language-slash-web-frameworks that I've spent near-zero time with. (If PHP were accessible outside of the web world, I'd be a lot more interested in it; frankly, I don't know why it couldn't be used outside of the web world, and maybe it already can, but I haven't spent any time studying it to know for sure one way or another.) His question: "Wouldn't it be great if the web devs could transfer their language knowledge to the client side--Silverlight?" Honestly, I'm kind of tired of all these dynamic language discussions being framed in the context of Silverlight, because it seems to pigeonhole the whole dynamic language thing as "just a Silverlight thing". (Note to John Lam: do everything you can to get the DLR out of Silverlight as a ship vehicle, because that only reinforces that notion, IMHO.) Direct quote, and I love it: (slide) "PHP was designed to solve the specific problem of making it easy for Rasmus to make his home page; Not a good example of neat language design." (Wez) "It's a kind of mishmash of HTML, script, code, all thrown together into a stinking pile of a language." He's going over the basics of PHP-the-language, which (since I don't know anything about PHP) is quite interesting. PHP has a "resource" type, which is a "magical handle for an arbitrary C structure type", for external integration stuff.

He's been talking to Jim (Hugunin, I presume) about generics in PHP. Dude... generics... in PHP? In a language with no type information and no first-class support for classes and interfaces? That just seems like such a wrong path to consider....

Interesting--another tidbit I didn't know: PHP uses a JIT-compilation scheme to compile into its own opcode and runs it in the Zend (sp?) engine. Yet another VM hiding in plain sight. I have to admit, I am astounded at how many VMs and execution engines I keep running into in various places.

Another direct quote, which I also love: (slide) "PHP 4: Confirmed as a drunken hack." (Wez) "There's this rumor that one night in a bar, somebody said, Wouldn't it be cool if there were objects in PHP, and the next day there was a patch..." If Wez is any indication of the rest of the PHP community, I could learn to like this language, if only for its self-deprecating sense of humor about itself.

He then mentions Phalanger, a CLR implementation of PHP, and hands the floor over to Thomas for his Phalanger talk. Nice very high-level intro of PHP, and probably entirely worthless if you already knew something about PHP... which I didn't, so I liked it. :-)

Thomas Petricek; Peli de Halleux and Nikolai Tillman; Jeffrey Sax:

(I left the room to get a soda, got roped into doing a quick Channel 9 video about why the next five years will be about languages, then ran into Wez and we talked for a bit about PHP's bytecode engine, then ran into Jeffrey Snover, PM from the PowerShell team, and we talked for a bit about PSH, hosting PSH, and some other things. Since I don't have a lot of call for numeric computing, I didn't catch most of Jeffrey's talk. I wish I'd caught the Phalanger talk, though. I'll have to collar Thomas in the hallway tomorrow.)

(Just as a final postscript to this talk--John Rose of Sun is sitting next to me during Jeff's talk, and he has more notes on this one talk than any other I've seen. Combined with the cluster of CLR guys that swarmed Jeff as soon as he was done, and I'll go out on a really short limb here and say that this was definitely one of the ones you want to catch when the videos go online "in about a week", according to one of the organizers.)

Stefan Wenig and Fabian Schmied: Oh, this was a fun talk. Very humorous opening, particularly the (real) town's sign they show in the first five or so slides. But their point is good, that enterprise software for various different customers is not easy. They write all their code in C#, so they have to handle this. They cite Jacobson's "Aspect-Oriented Software Development with Use Cases" as an exemplar of the problem, and go through a few scenarios that don't work to solve it: lots of configuration or scripting, multiple inheritance, inheriting one from another, and so on. (slide) "Inheritance is not enough." (To those of you not here--this is a great slide deck and very well delivered. Even if you don't care about C# or mixins, watch this talk if you give presentations.) Stefan sets up the problem, and Fabian discusses their mixin implementation. (slide) "Mixin programming is the McFlurry programming model." *grin* Mixins in their implementation can be configured "either way": either the mixins can declare what classes they apply to, or the target class can declare which mixins it implements. They create a derived class of your class which implements the mixin interface and mixes in the mixin implementation, then you create the generated derived class via a factory method.

I asked if this was a compile-time, or run-time solution; it's run-time, and they generate code using Reflection.Emit once you call through their static factory (which kicks the process off).
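
For those who'd like to see the shape of that mechanism, here's a minimal sketch in Python (not C#, and emphatically not their library's API). A hypothetical create() factory builds, at run time, a derived class of the target that also carries the mixin's members; Python's built-in type() stands in for the Reflection.Emit step. Every name below (Customer, AuditMixin, MIXINS, create) is made up for illustration.

```python
class Customer:
    """Target class: knows nothing about auditing."""
    def __init__(self, name):
        self.name = name


class AuditMixin:
    """Mixin: behavior we want bolted onto some classes."""
    def audit(self):
        return f"audited {self.name}"


# Hypothetical registry saying which mixins apply to which target class.
MIXINS = {Customer: (AuditMixin,)}

_generated = {}  # cache, so each derived class is generated only once


def create(target, *args, **kwargs):
    """Factory: generate (once) a derived class of `target` plus its mixins, then instantiate it."""
    if target not in _generated:
        bases = (target,) + MIXINS.get(target, ())
        _generated[target] = type(f"{target.__name__}_Mixed", bases, {})
    return _generated[target](*args, **kwargs)


c = create(Customer, "Acme")
print(isinstance(c, Customer), c.audit())  # True audited Acme
```

The registry here is the "mixin declares its targets" direction turned into a lookup table; the "target declares its mixins" direction would just be a different way of populating it.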

Their mixin implementation is available here.


.NET | C++ | Conferences | Java/J2EE | Languages | Ruby | Windows

Tuesday, January 29, 2008 5:29:14 PM (Pacific Standard Time, UTC-08:00)
 Monday, January 28, 2008
The "Count your keystrokes" concept; or, blogs or email?

Jon Udell has a great post about the multiplier effect of blogs against private email.

For those of you who didn't share my liberal arts background, the "multiplier effect" is a concept in economics that says if I put $10 in your pocket, you'll maybe save $1 and spend the other $9, thus putting $9 in somebody else's pocket, who will save $1 and spend $8, and so on. Thus, putting $10 into the hands of somebody inside the economy has the effect of putting $10 + $9 + $8 + ... into the economy as a whole, thus creating a clear multiplier effect from that one $10 drop.
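
For the arithmetic-minded, a quick sketch of that sum. The first function follows my simplified telling above (each person saves a flat $1); the second is the usual textbook form, under the assumption that each person spends a fixed fraction (here 90%) of what they receive, in which case the series is geometric. Function names are just for illustration.

```python
def simplified_total(initial=10, saved_each_step=1):
    """Sum 10 + 9 + 8 + ... where each holder saves a flat $1."""
    total, amount = 0, initial
    while amount > 0:
        total += amount
        amount -= saved_each_step
    return total


def geometric_total(initial=10.0, spend_fraction=0.9):
    """Limit when each holder spends a fixed fraction: initial / (1 - fraction)."""
    return initial / (1.0 - spend_fraction)


print(simplified_total())            # 55
print(round(geometric_total(), 2))   # 100.0
```

Either way, the one $10 drop shows up as several times that much activity.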

Jon's point is that when you email, you're putting $10 worth of information into the email recipients' pocket, which may go to two or three people, or maybe even to a mailing list. When you blog, you're putting that $10 on the Internet where Google can find it, and people you've never met can comment, respond, and enhance it, maybe even making it $11 or $15 or $20, which is a HUGE multiplier effect. :-)

People often email me with questions or comments or suggestions and what not, and I'm always a bit unsure about how to treat it: I'd like to blog it, but email has an implicit privacy element associated with it that I'm reluctant to violate without permission. But Jon's post gives me a new idea about how to handle this:

If you email me, and you want me to email in turn (thus keeping the communication private, for whatever reason), say so in your email. Say exactly what policy you want regarding the privacy of your email; otherwise I will assume that if you email me, it's OK to blog it and thus take advantage of the blogging multiplier effect.

Which reminds me: please feel free to email me! Commentary on blog items, items you'd like me to venture an opinion on, whatever comes to mind. ted AT tedneward DOT com.




Monday, January 28, 2008 8:46:22 PM (Pacific Standard Time, UTC-08:00)
Highlights of the Lang.NET Symposium, Day One

Thought I'd offer a highly-biased interpretation of the goings-on here at the Lang.NET Symposium. Quite an interesting crowd gathered here; I don't have a full attendee roster, but it includes Erik Meijer, Brian Goetz, Anders Hejlsberg, Jim Hugunin, John Lam, Miguel de Icaza, Charlie Nutter, John Rose, Gilad Bracha, Paul Vick, Karl Prosser, Wayne Kelly, Jim Hogg, among a crowd in total of about 40. Great opportunities to do those wonderful hallway chats that seem to be the far more interesting part of conferences.

Jason Zander: Jason basically introduces the Symposium, and the intent of the talk was mostly to welcome everybody (including the > 50% non-Microsoft crowd here) and offer up some interesting history of the CLR and .NET, dating all the way back to a memo/email sent by Chris Brumme in 1998 about garbage collection and the "two heaps", one around COM+ objects, and the other for malloc-allocated data. Fun stuff; hardly intellectually challenging, mind you, but interesting.

Anders Hejlsberg: Anders walks us through the various C# 3.0 features and how they combine to create the subtle power that is LINQ (it's for a lot more than just relational databases, folks), but if you've seen his presentation on C# 3 at TechEd or PDC or any of the other conferences he's been to, you know how that story goes. The most interesting part of his presentation was a statement he made that I think has some interesting ramifications for the industry:

I think that the taxonomies of programming languages are breaking down. I think that languages are fast becoming amalgams. ... I think that in 10 years, there won't be any way to categorize languages as dynamic, static, procedural, object, and so on.

(I'm paraphrasing here--I wasn't typing when he said it, so I may have it wrong in the exact wording.)

I think, first of all, he's absolutely right. Looking at languages like F# and Scala, for example, we see a definite hybridization of functional and object languages, and it doesn't take much exploration of C#'s and VB's expression trees facility to realize that they're already a half-step shy of a full (semantic or syntactic) macro system, something that traditionally has been associated with dynamic languages.
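
To give a feel for why "code as a tree you can rewrite" sits so close to a macro system, here's a small sketch using Python's standard ast module as a loose stand-in for C#/VB expression trees (the analogy is mine and only approximate). It parses an expression into a tree, rewrites every addition into a multiplication, then compiles and runs the result. The SwapAddForMul class is hypothetical.

```python
import ast


class SwapAddForMul(ast.NodeTransformer):
    """Rewrite every `a + b` in the tree into `a * b`."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


tree = ast.parse("x + 3", mode="eval")    # the code, as a data structure
tree = ast.fix_missing_locations(SwapAddForMul().visit(tree))

original = eval("x + 3", {"x": 4})
rewritten = eval(compile(tree, "<expr>", "eval"), {"x": 4})
print(original, rewritten)  # 7 12
```

Once a program can inspect and rewrite its own expressions before running them, the remaining gap to a macro system is mostly about when the rewriting happens and how nicely the language lets you spell it.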

Which then brings up a new question: if languages are slowly "bleeding" out of their traditional taxonomies, how will the vast myriad hordes of developers categorize themselves? We can't call ourselves "object-oriented" developers if the taxonomy doesn't exist, and this will have one of two effects: either the urge to distinguish ourselves in such a radical fashion will disappear and we'll all "just get along", or else the distinguishing factor will be the language itself and the zealotry will only get worse. Any takers?

Jim Hugunin: Jim talks about the DLR... and IronPython... by way of a Lego Mindstorms robot and balloon animals. (You kinda had to be there. Or watch the videos--they taped it all; I don't know if they're going to make them publicly available, but if they do, they're highly recommended.) He uses a combination of Microsoft Robotics Studio, the XNA libraries, his Lego Mindstorms robot, and IronPython to create an Xbox-controller-driven program to drive the robot in a circle around him. (Seriously, try to get the video.)

(Note to self: go grab the XNA libraries and experiment. The idea of using an Xbox controller to drive Excel or a Web browser just appeals at such a deep level, it's probably a sign of serious dementia.)

Jim talks about the benefits of multiple languages running on one platform, something that a large number of the folks here can definitely agree with. As an aside, he shows the amount of code required to build a C-Python extension in C, and the amount of code required to build an IronPython extension in C#. Two or three orders of magnitude difference, seriously. Plus now the Python code can run on top of a "real" garbage collector, not a reference-counted GC such as the one C-Python uses (which was news to me).
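
Since the reference-counting bit was news to me, here's a small sketch of what it means in practice. It assumes CPython specifically (other Python implementations, IronPython included, manage memory differently), and it relies on the fact that CPython pairs reference counting with a separate cycle detector in the gc module. The Node class is made up for the demo.

```python
import gc
import weakref


class Node:
    """Plain object that can point at another Node."""
    def __init__(self):
        self.other = None


gc.disable()  # keep the cycle detector from kicking in on its own during the demo

# Case 1: no cycle. Dropping the last reference frees the object immediately.
a = Node()
a_alive = weakref.ref(a)
del a
no_cycle_freed_at_once = a_alive() is None

# Case 2: a reference cycle. The counts never hit zero, so refcounting alone can't free it.
x, y = Node(), Node()
x.other, y.other = y, x
x_alive = weakref.ref(x)
del x, y
cycle_survives_refcounting = x_alive() is not None

gc.collect()  # CPython's supplementary cycle detector
cycle_freed_after_collect = x_alive() is None
gc.enable()

print(no_cycle_freed_at_once, cycle_survives_refcounting, cycle_freed_after_collect)
# True True True (on CPython)
```

On a tracing collector like the CLR's, neither case is special: unreachable is unreachable, cycle or not.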

Personally, I continue to dislike Python's use of significant whitespace, but I'm sure glad he came to Microsoft and put it there, because his work begat IronRuby, and that work in turn begat the DLR, which will in turn beget a ton more languages.

Thought: What would be truly interesting would be to create a compiler for the DLR--take a DLR AST, combine it with the Phoenix toolkit, and generate assemblies out of it. They may have something like that already in the DLR, but if it's not there, it should be.

Martin Maly: Martin talks about the DLR in more depth, about the expression trees/AST trees, and the advantages of writing a language on top of the DLR instead of building your own custom platform for it. He shows implementation of the Add operation in ToyScript, the language that ships "with" the DLR (which is found, by the way, in the source for the IronPython and IronRuby languages), and how it manages the reflection (if you will) of operations within the DLR to find the appropriate operation.

Martin is also the one responsible for LOLcode-DLR, and pulls it out in the final five minutes because he just had to give it one final hurrah (or GIMMEH, as you wish). The best part is writing "HAI VISIBLE "Howdy" KTHXBYE" at the DLR console, and just to get even more twisted, he uses the DLR console to define a function in ToyScript, then call it from LOLCODE (using his "COL ... WIT ... AN ..." syntax, which is just too precious for words) directly.

I now have a new goal in life: to create a WCF service in LOLCode that calls into a Windows Workflow instance, also written in LOLcode. I don't know why, but I must do this. And create a UI that's driven by an XBox-360 controller, while I'm at it.

I need a life.

Charlie Nutter/John Rose: Charlie (whom I know from a few No Fluff Just Stuff shows) and John (whom I know from a Scala get-together outside of JavaOne last year) give an overview of some of the elements of the JVM and JRuby, some of the implementational details, and some of the things they want to correct in future versions. John spent much time talking about the "parallel universe" he felt he'd walked into, because he kept saying, "Well, in the JVM we have <something>... which is just like what you [referring to the Microsoft CLR folk who'd gone before him] call <something else>...." It was both refreshing (to see Microsoft and Sun folks talking about implementations without firing nasty white papers back and forth at one another) and disappointing (because there really were more parallels there than I'd thought there'd be, meaning there are fewer interesting bits for each side to learn from the other) at the same time.

In the end, I'm left with the impression that the JVM really needs something akin to the DLR, because I'm not convinced that just modifying the JVM itself (the recently-named Da Vinci Machine) will be the best road to take--if it's implemented inside the VM, then modifications and enhancements will take orders of magnitude longer to work their way into production use, since there will be so much legacy (Java) code that will have to be regression-tested against those proposed changes. Doing it in a layer-on-top will make it easier and more agile, I believe.

That said, though, I'm glad they (Sun) are (finally) taking the steps necessary to put more dynamic hooks inside the JVM. One thing that John said that really has me on tenterhooks is that Java really does need a lightweight method handle, similar to (sort of, kind of, well OK exactly just like) .NET delegates (but we'll never admit it out loud). Once they have that, lots of interesting things become possible, but I have no idea if it would be done in time for Java 7. (It would be nice, but first the Mercurial repositories and other OpenJDK transition work need to be finished; in the meantime, though, John's been posting patches on his personal website, available as a link off of the Da Vinci Machine/mlvm project page.)

Dan Ingalls: Dan shows us the Lively Kernel project from Sun Labs, which appears to be trying to build the same kind of "naked object" model on top of the Browser/JavaScript world that the Naked Objects framework did on top of the CLR/WinForms and JVM/AWT, both of which are trying essentially to recapture the view of objects as Alan Kay originally intended them (entities directly manipulable by the user). For example, there's a "JavaScript CodeBrowser" which looks eerily reminiscent of the Object Browser from Smalltalk environments, except that the code inside of it is all {Java/ECMA}Script. A bit strange to see if you're used to seeing ST code there.

I can't help but wonder, how many people are watching this, thinking, "Great, we're back to where we were 30 years ago?" Granted, there's a fair amount of that going on anyway, given how many concepts that are hot today were invented back in the 50's and 60's, but still, reinventing the Smalltalk environment on top of the browser space just... seems... *sigh*...

It's here if you want to play with it, though when I tried just now it presented me with authentication credentials that I don't have; you may have better luck choosing the 0.8b1 version from here, and the official home page (with explanatory text and a tutorial) for it is here.

Pratap Lakshman: Pratap starts with a brief overview of {Java/ECMA}Script, focusing initially on prototype-based construction. Then he moves into how the DLR should associate various DLR Expression and DLR Rule nodes to the language constructs. Interesting, but a tad on the slow/redundant side, and perhaps a little bit more low-level than I would have liked. That said, though, Charlie spotted what he thought would be a race condition in the definition of types in the code demonstrated, and he and Jim had a discussion around lock-free class definition and modification, which was well worth hearing, if slightly off-topic.
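
If "prototype-based construction" doesn't ring a bell, here's a rough sketch of the idea in Python (not ECMAScript, and not anything Pratap showed): objects have no classes to speak of, just a link to another object that gets consulted when a property is missing. The Proto class and its proto field are hypothetical names.

```python
class Proto:
    """Object whose missing properties are looked up on its prototype."""
    def __init__(self, proto=None, **props):
        self.proto = proto
        self.__dict__.update(props)

    def __getattr__(self, name):
        # Only called when normal lookup fails: walk up the prototype chain.
        if self.proto is None:
            raise AttributeError(name)
        return getattr(self.proto, name)


animal = Proto(sound="...", legs=4)
dog = Proto(animal, sound="woof")   # overrides sound, inherits legs

print(dog.sound, dog.legs)  # woof 4
animal.legs = 3             # change the prototype after the fact...
print(dog.legs)             # 3  ...and the derived object sees it
```

That last line is the part that makes life interesting for a runtime: the "type" of dog can change underneath it at any moment, which is exactly where questions like Charlie's race condition come from.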

Roman Ivantsov: Roman's built the Irony parser, which is a non-code-gen C# parser language reminiscent of the growing collection of parser combinators running around, and he had some thoughts on an ERP language with some interesting linguistic features. I'm going to check out Irony (already pulled it down, in fact), but I'm also very interested to see what comes out of Harry's talk on F# Parsing tomorrow.

Dinner: Pizza. Mmmmm, pizza.

More tomorrow, assuming I don't get stuck here on campus due to the City of Redmond shutting almost completely down due to 2 inches (yes, 2 inches) of snow on the ground from last night. (If you're from Boston, New York, Chicago, Vermont, Montana, North Dakota, or anyplace that gets snow, please don't comment--I already know damn well how ludicrous it is to shut down after just 2 frickin' inches.)


.NET | C++ | Java/J2EE | Languages | Mac OS | Ruby

Monday, January 28, 2008 5:26:46 PM (Pacific Standard Time, UTC-08:00)
 Friday, January 25, 2008
By the way, if anybody wants to argue about languages next week...

... or if you're a-hankering to kick my *ss over my sacrilegious statements about Perl, I'll be at Building 20 on the Microsoft campus in Redmond, at the Lang.NET Symposium with a few other guys who know something about language and VM implementation: Jim Hugunin, Gilad Bracha, Wayne Kelly, Charlie Nutter, John Rose, John Lam, Erik Meijer, Anders Hejlsberg....

I wish there were more "other VMs" representation showing up (some of the Parrot or Strongtalk or Squeak folks would offer up some great discussion points), but in the event they don't, it'll still be an interesting discussion. Some of the topics I'm looking forward to:

"Targeting DLR" (Martin Maly)

"Multiple Languages on the Java VM" (John Rose and Charles Nutter)

"Vision of the DLR" (Jim Hugunin)

"Retargeting DLR" (Seo Sanghyeon)

"Ruby" (John Lam)

"Ruby.NET" (Wayne Kelly)

"Integrating Languages into the VSS" (Aaron Marten) [I presume VSS means Visual Studio Shell and not Visual Source Safe...]

"JScript" (Pratap Lakshman) [He can't be looking forward to this, based on what I'm hearing about the debates around ECMAScript 4.0....]

"Volta" (Erik Meijer)

"Parsing Expression Grammars in F#" (Harry Pierson) [I can't be certain, but I think I turned Harry on to F# in the first place, so I'm curious to learn what he's doing with it in Real Life]

And for those of you living within easy driving distance of Redmond, take a trip out to DigiPen this Saturday and Sunday for the Seattle Code Camp. I'll be doing a talk on F# and another one on Scala on Saturday (modulo any scheduling changes). Those of you already coming should check out the xUnit.NET presentation (currently scheduled for 4:45PM on Saturday)--some of James' and Brad's ideas of what a unit-testing framework should really look like are kinda radical, very intriguing, and guaranteed to be thought-provoking. Dunno if there's an xUnit.JVM yet...

... but there should be.


.NET | C++ | Conferences | Java/J2EE | Languages | Ruby | Windows | XML Services

Friday, January 25, 2008 4:16:16 AM (Pacific Standard Time, UTC-08:00)
So I Don't Like Perl. Sue Me.

A number of folks commented on the last post about my "ignorant and apparently unsupported swipes against Parrot and Perl". Responses:

  1. I took exactly one swipe at Perl, and there was a smiley at the end of it. Apparently, based on the heavily-slanted pro-Perl/anti-Perl-bigotry comments I've received, Perl programmers don't understand smileys. So I will translate: "It means I am smiling as I say this, which is intended as a way of conveying light-heartedness or humor."
  2. I didn't take any swipes at Parrot. I said, "Parrot may change that in time, but right now it sits at a 0.5 release and doesn't seem to be making huge inroads into reaching a 1.0 release that will be attractive to anyone outside of the "bleeding-edge" crowd." It is sitting at a 0.5 release (up from a 0.4 release at this time last year), and it doesn't seem to be making huge inroads into reaching a 1.0 release, which I have had several CxO types tell me is the major reason they won't even consider looking at it. That's not a "swipe", that's a practical reality. The same CxO types stay the hell away from Microsoft .NET betas and haven't upgraded to JDK 1.6 yet, either, and they're perfectly justified in doing so: it's called the bleeding edge for a reason.
  3. Fact: I don't like Perl. Therefore, on my blog, which is a voice for my opinion and statements, Perl sucks. I don't like a language that has as many side effects and (to my mind) strange symbolic syntax as Perl uses. The side effects I think are a bad programming language design artifact; the strange symbolic syntax is purely an aesthetic preference.
  4. Fact: I don't pretend that everybody should agree with me. If you like Perl, cool. I also happen to be Lutheran. If you're Catholic, that's cool, too. Doesn't mean we can't get along, so long as you respect my aesthetic preferences so I can respect yours.
  5. I don't have to agree with you to learn from you, and vice versa. In fact, I like it better when people argue, because I learn more that way.
  6. I also don't have to like your favorite language, and you don't have to like mine (if I had one).
  7. I'm not ignorant, and please don't try to assert your supposed superiority by taking that unsupported swipe at me, either. I've tried Perl. I've tried Python, too, and I find its use of significant whitespace to be awkward and ill-considered, and a major drawback to what otherwise feels like an acceptable language. Simply because I disagree with your love of the language doesn't make me ignorant any more than you are if you dislike Java or C# or C++ or any of the languages I like.
  8. Fact: I admit to a deep ignorance of the Perl community. I've never claimed otherwise. I also admit to a deep ignorance of the Scientology community, yet that doesn't stop me from passing personal judgment on the Scientologists' beliefs, particularly as expressed by Tom Cruise, or Republicans' beliefs, as expressed by Pat Robertson. And honestly, I don't think I need a deep understanding of the Perl community to judge the language, just as I don't need a deep understanding of Tom Cruise to judge Scientology, or just as you don't need a deep understanding of me to judge my opinions.
  9. If by "homework", by the way, you mean "Spend years writing Perl until you come to love it as I do", then yes, I admit, by your definition of "homework", I've not done my homework. If by "homework" you mean "Learn Perl until you become reasonably proficient in it", then yes, I have done my homework. I had to maintain some Perl scripts once upon an eon ago, not to mention the periodic deciphering of the Perl scripts that come with the various Linux/Solaris/Mac OS images I work with, and both my familiarity with the language and my dislike of it stem from that experience. I have a similar dislike of 65C02 assembler.
  10. I've met you, chromatic, though you may not remember it: At the second FOO Camp, you and I and Larry Wall and Brad Merrill and Dave Thomas and Peter Drayton had an impromptu discussion about Parrot, virtual machines, the experiences Microsoft learned while building the Common Type System for the CLR, some of the lessons I'd learned from playing with multiple languages on top of the JVM, and some of the difficulties in trying to support multiple languages on top of a single VM platform. I trust that you don't consider Dave Thomas to be ignorant; he and I had a long conversation after that impromptu round table and we came to the conclusion that Parrot was going to be in for a very rough ride without some kind of common type definitions across the various languages built for it. (He was a little stunned at the idea that there wasn't some kind of common hash type across the languages, if that helps to recall the discussion.) This in no way impugns the effort you're putting into Parrot, by the way, nor should you take this criticism to suggest that you should stop your work. Frankly, I'd love to see how Parrot ends up, since it takes a radically different approach to a virtual execution engine than other environments do, and stark contrast is always a good learning experience. The fact that Parrot has advanced all of a minor build number in the last year seems to me, an outsider who periodically grabs the code, builds it and pokes around, to be indicative of the idea that Parrot is taking a while.
  11. Oh, and by the way, chromatic, since I've got your attention, while there, you argued that the Parrot register-based approach was superior to the CLR or JVM approach because "passing things in registers is much faster than passing them on the stack". (I may be misquoting what you said, but this is what Peter, Brad, Dave and I all heard.) I wanted to probe that statement further, but Brad jumped in to explain to you (and the subject got changed fairly quickly, so I don't know if you picked up on it) that the execution stack in the CLR (and the JVM) is an abstraction--both virtual machines make use of registers where and when possible, and can do so fairly easily. Numerous stack-based VMs have done this over the years as a performance enhancement. I assume you know this, so I'm curious to know if I misunderstood the rationale behind a register-based VM.
  12. Fact: Perl 6 recently celebrated the fifth anniversary of its announcement. Not its ship date, but the announcement. Fact: Perl 6 has not yet shipped.
  13. Opinion: I hate to say this if you're a Perl lover, but based on the above, Perl 6 is quickly vying for the Biggest Vaporware Ever award. The only language that rivals this in terms of incubation length is the official C++ standard, which took close to or more than a decade. And it (rightly) was crucified in the popular press for taking that long, too. (And there was a long time where we--a large group of other C++ programmers I worked with--weren't sure it would ship at all, much less before the language was completely dead, because there was no visible progress taking place: no new features, no new libraries, no new changes, nothing.)
  14. Fact: I would love for Parrot to ship, because I would love to be able to start experimenting with building languages that emit PIR. I would love to embed Parrot as an execution engine inside of a larger application, using said language as the glue around the core parts of the application. I would love to do all of this in a paid project. When Parrot reaches a 1.0 release, I'll consider it, just as I had to wait until the CLR and C# reached a 1.0 release when I started playing with them in July of 2001.
  15. Fact: The JVM and CLR are not nearly as good for heavily-recursive languages (such as what we see in functional languages like Haskell and ML and F# and Erlang and Scala) because neither one, as of this writing, supports tail-call recursion optimization; the CLR pretends to, via the "tail" opcode that is essentially ignored as of CLR v2.0 (the CLR that ships with .NET 2, 3 and 3.5), but the JVM doesn't even go that far. JIT compilers can do a few things to help optimize here, but realistically both environments need this if they're to become reasonable dynamic language platforms.
  16. Fact: Lots of large systems have been built in COBOL, too, and scale even better than systems built in Perl, or C#, or Java, or C++. That doesn't mean I like them, want to program in them, or that the COBOL community should be any less proud of them. Again, just because I don't care for abstract art doesn't undermine the brilliance of an artist like Mark Rothko.
  17. And I find the statement, "If you need X, don't look at other languages" to be incredibly short-sighted. Even if I were only paid to write Java, I would look at other languages, because I learn more about programming in general by doing so, thus improving my Java code. I would heartily suggest the same thing for the C# programmer, the C++ programmer, the VB programmer, the Ruby programmer, the Perl programmer, ad infinitum.
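
On item 15's point about tail calls: here's a sketch of why the missing optimization hurts, and of one common workaround (a "trampoline") that language implementers reach for on a VM that won't do it for them. It's written in Python, which also doesn't eliminate tail calls, so the problem is easy to reproduce; the function names are made up, and this is an illustration rather than what any particular JVM or CLR language actually emits.

```python
import sys


def count_down_recursive(n):
    """Tail-recursive in shape, but the runtime still pushes a frame per call."""
    return "done" if n == 0 else count_down_recursive(n - 1)


def count_down_step(n):
    """Same logic, but returns a thunk for the 'tail call' instead of making it."""
    return "done" if n == 0 else (lambda: count_down_step(n - 1))


def trampoline(result):
    """Run thunks in a loop so the stack depth stays constant."""
    while callable(result):
        result = result()
    return result


depth = sys.getrecursionlimit() * 10

try:
    count_down_recursive(depth)
    overflowed = False
except RecursionError:
    overflowed = True

print(overflowed, trampoline(count_down_step(depth)))  # True done
```

The trampoline works, but every "call" now costs an allocation and an indirect invocation, which is why having the VM handle tail calls natively is so much more attractive.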

At the end of the day, the fact that I don't like Perl doesn't undermine its efficacy amongst those who use it. The fact that Perl scale(1)s and scale(2)s doesn't take away from the fact that I don't like its syntax, semantics, or idioms. The fact that the Perl community can't take a ribbing over the large numbers of incomprehensible Perl scripts out there only reinforces the idea that Perl developers like incomprehensible syntax. (If you want a kind of dirty revenge, ask the Java developers about generics.)

Besides, if you listen to Paul Graham, all these languages are just footnotes on Lisp, anyway, so let's all quit yer bitchin' and start REPLing with lots of intuitively selected (or, if you prefer, irritatingly silly) parentheses.

But, in the interests of making peace with the Perl community....

65C02 assembler sucks way worse than Perl. (And no smiley; that's a statement delivered in straight-faced monotone.)


.NET | C++ | Java/J2EE | Ruby

Friday, January 25, 2008 3:53:25 AM (Pacific Standard Time, UTC-08:00)
 Wednesday, January 23, 2008
Can Dynamic Languages Scale?

The recent "failure" of the Chandler PIM project generated the question, "Can Dynamic Languages Scale?" on TheServerSide, and, as is all too typical these days, it turned into a "You suck"/"No you suck" flamefest between a couple of posters to the site.

I now make the perhaps vain attempt to address the question meaningfully.

What do you mean by "scale"?

There's an implicit problem with using the word "scale" here, in that we can think of a language scaling in one of two very orthogonal directions:

  1. Size of project, as in lines-of-code (LOC)
  2. Capacity handling, as in "it needs to scale to 100,000 requests per second"

Part of the problem that appears on the TSS thread, I think, is that the posters never really clearly delineate the differences between these two. Assembly language can scale(2), but it can't really scale(1) very well. Most people believe that C scales(2) well, but doesn't scale(1) well. C++ scores better on scale(1), and usually does well on scale(2), but you get into all that icky memory-management stuff. (Unless, of course, you're using the Boehm GC implementation, but that's another topic entirely.)

Scale(1) is a measurement of a language's ability to extend or enhance the complexity budget of a project. For those who've not heard the term "complexity budget", I heard it first from Mike Clark (though I can't find a citation for it via Google--if anybody's got one, holler and I'll slip it in here), he of Pragmatic Project Automation fame, and it's essentially a statement that says "Humans can only deal with a fixed amount of complexity in their heads. Therefore, every project has a fixed complexity budget, and the more you spend on infrastructure and tools, the less you have to spend on the actual business logic." In many ways, this is a reflection of the ability of a language or tool to raise the level of abstraction--when projects began to exceed the abstraction level of assembly, for example, we moved to higher-level languages like C to help hide some of the complexity and let us spend more of the project's complexity budget on the program, and not with figuring out which register needed to have the value of the interrupt to be invoked. This same argument can be seen in the argument against EJB in favor of Spring: too much of the complexity budget was spent in getting the details of the EJB beans correct, and Spring reduced that amount and gave us more room to work with. Now, this argument is at the core of the Ruby/Rails-vs-Java/JEE debate, and implicitly it's obviously there in the middle of the room in the whole discussion over Chandler.

Scale(2) is an equally important measurement, since a project that cannot handle the expected user load during peak usage times will have effectively failed just as surely as if the project had never shipped in the first place. Part of this will be reflected in not just the language used but also the tools and libraries that are part of the overall software footprint, but choice of language can obviously have a major impact here: Erlang is being tossed about as a good choice for high-scale systems because of its intrinsic Actors-based model for concurrent processing, for example.

Both of these get tossed back and forth rather carelessly during this debate, usually along the following lines:

  1. Pro-Java (and pro-.NET, though they haven't gotten into this particular debate so much as the Java guys have) adherents argue that a dynamic language cannot scale(1) because of the lack of type-safety commonly found in dynamic languages: the compiler is not there to methodically ensure that parameters obey a certain type contract, that objects are not asked to execute methods they couldn't possibly satisfy, and so on. In essence, statically-typed languages are theorem provers, in that they take the assertion (by the programmer) that this program is type-correct, and validate that. This means less work for the programmer, as an automated tool now runs through a series of tests that the programmer doesn't have to write by hand; as one contributor to the TSS thread put it:
    "With static languages like Java, we get a select subset of code tests, with 100% code coverage, every time we compile. We get those tests for "free". The price we pay for those "free" tests is static typing, which certainly has hidden costs."
    Note that this argument frequently derails into the world of IDE support and refactoring (as is witnessed on the TSS thread), pointing out that Eclipse and IntelliJ provide powerful automated refactoring support that is widely believed to be impossible on dynamic language platforms.
  2. Pro-Java adherents also argue that dynamic languages cannot scale(2) as well as Java can, because those languages are built on top of their own runtimes, which are arguably vastly inferior to the engineering effort that goes into the garbage collection facilities found in the JVM Hotspot or CLR implementations.
  3. Pro-Ruby (and pro-Python, though again they're not in the frame of this argument quite so much) adherents argue that the dynamic nature of these languages means less work during the creation and maintenance of the codebase, resulting in a far lower lines-of-code count than one would have with a more verbose language like Java, thus implicitly improving the scale(1) of a dynamic language.

    On the subject of IDE refactoring, scripting language proponents point out that the original refactoring browser was an implementation built for (and into) Smalltalk, one of the world's first dynamic languages.

  4. Pro-Ruby adherents also point out that there are plenty of web applications and web sites that scale(2) "well enough" on top of the MRI (Matz's Ruby Interpreter) that comes "out of the box" with Ruby, despite the widely-described fact that MRI Ruby threads are what Java used to call "green threads", where the interpreter manages thread scheduling and management entirely on its own, effectively using one native thread underneath.
  5. Both sides tend to get caught up in "you don't know as much as me about this" kinds of arguments as well, essentially relying on the idea that the less you've coded in a language, the less you could possibly know about that language, and the more you've coded in a language, the more knowledgeable you must be. Both positions are fallacies: I know a great deal about D, even though I've barely written a thousand lines of code in it, because D inherits much of its feature set and linguistic expression from both Java and C++. Am I a certified expert in it? Hardly--there are likely dozens of D idioms that I don't yet know, and certainly haven't elevated to the state of intuitive use, and those will come as I write more lines of D code. But that doesn't mean I don't already have a deep understanding of how to design D programs, since it fundamentally remains, as its genealogical roots imply, an object-oriented language. Similar rationale holds for Ruby and Python and ECMAScript, as well as for languages like Haskell, ML, Prolog, Scala, F#, and so on: the more you know about "neighboring" languages on the linguistic geography, the more you know about that language in particular. If two of you are learning Ruby, and you're a Python programmer, you already have a leg up on the guy who's never left C++. Along the other end of this continuum, the programmer who's written half a million lines of C++ code and still never uses the "private" keyword is not an expert C++ programmer, no matter what his checkin metrics claim. (And believe me, I've met way too many of these guys, in more than just the C++ domain.)
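
For anybody who never ran into "green threads" (item 4 above), here's a toy sketch of the idea in Python: several logical threads of control, scheduled entirely in user space, with exactly one native thread underneath. It's a simplification under a big assumption, namely that the tasks switch cooperatively at explicit yield points; a real interpreter decides when to switch on its own. The worker and run names are made up.

```python
from collections import deque


def worker(name, steps, log):
    """A 'green thread': does a bit of work, then hands control back."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # switch point


def run(tasks):
    """Round-robin scheduler living entirely in user space, on one native thread."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)   # still has work: back of the queue
        except StopIteration:
            pass                 # finished: drop it


log = []
run([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

The work interleaves nicely, but nothing ever runs in parallel, which is the crux of the scale(2) objection: a second core does this scheduler no good at all.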

A couple of thoughts come to mind on this whole mess.

Just how refactorable are you?

First of all, it's a widely debatable point as to the actual refactorability of dynamic languages. On NFJS speaker panels, Dave Thomas (he of the PickAxe book) would routinely admit that not all of the refactorings currently supported in Eclipse were possible on a dynamic language platform given that type information (such as it is in a language like Ruby) isn't present until runtime. He would also take great pains to point out that simple search-and-replace across files, something any non-trivial editor supports, will do many of the same refactorings as Eclipse or IntelliJ provides, since type is no longer an issue. Having said that, however, it's relatively easy to imagine that the IDE could be actively "running" the code as it is being typed, in much the same way that Eclipse is doing constant compiles, tracking type information throughout the editing process. This is an area I personally expect the various IDE vendors will explore in depth as they look for ways to capture the dynamic language dynamic (if you'll pardon the pun) currently taking place.
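A minimal sketch of where plain search-and-replace runs out of road, written in Python as a stand-in for Ruby. The Account and Connection classes and the naive_rename helper are invented purely for illustration; the point is only that a textual rename cannot tell two unrelated methods apart when they happen to share a name.

```python
import re

# Hypothetical code base: two unrelated classes that both define close().
SOURCE = '''
class Account:
    def close(self):
        return "account"

class Connection:
    def close(self):
        return "connection"

def shutdown(account, connection):
    return account.close(), connection.close()
'''

def naive_rename(source, old, new):
    """Rename every whole-word occurrence of `old`, ignoring types entirely."""
    return re.sub(r"\b" + re.escape(old) + r"\b", new, source)

# The intent was to rename only Account.close, but the editor has no idea
# which receiver is an Account, so Connection.close gets renamed as well.
renamed = naive_rename(SOURCE, "close", "deactivate")
print(renamed)
```

A type-aware tool would rename one definition and one call site; the textual version renames both definitions and both call sites. Whether that matters depends on how often names collide in a given code base.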

Who exactly are you for?

What sometimes gets lost in this discussion is that not all dynamic languages need be for programmers; a tremendous amount of success has been achieved by creating a core engine and surrounding it with a scripting engine that non-programmers use to exercise the engine in meaningful ways. Excel and Word do it, Quake and Unreal (along with other equally impressively-successful games) do it, UNIX shells do it, and various enterprise projects I've worked on have done it, all successfully. A model whereby core components are written in Java/C#/C++ and are manipulated from the UI (or other "top-of-the-stack" code, such as might be found in nightly batch execution) by these less-rigorous languages is a powerful and effective architecture to keep in mind, particularly in combination with the next point....
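As a rough sketch of that architecture (in Python for brevity, where a real system would more likely have the engine in Java/C#/C++): the InventoryEngine class, the run_script function, and the three-word command format are all made up for this example and are not taken from any of the products named above.

```python
class InventoryEngine:
    """The 'core engine': stands in for the rigorously-built components."""

    def __init__(self):
        self.stock = {}

    def add(self, item, quantity):
        self.stock[item] = self.stock.get(item, 0) + quantity

    def remove(self, item, quantity):
        if self.stock.get(item, 0) < quantity:
            raise ValueError(f"not enough {item} in stock")
        self.stock[item] -= quantity


def run_script(engine, script):
    """The 'scripting layer': one command per line, e.g. 'add widget 10'."""
    commands = {"add": engine.add, "remove": engine.remove}
    for line_number, line in enumerate(script.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()  # allow comments and blank lines
        if not line:
            continue
        verb, item, quantity = line.split()
        if verb not in commands:
            raise ValueError(f"line {line_number}: unknown command {verb!r}")
        commands[verb](item, int(quantity))


# Something a non-programmer could plausibly write for a nightly batch run.
nightly_batch = """
# restock, then ship an order
add widget 10
add gadget 3
remove widget 4
"""

engine = InventoryEngine()
run_script(engine, nightly_batch)
print(engine.stock)
```

The engine enforces the rules (you cannot ship stock you do not have); the script only decides what to ask for, which is the division of labor described above.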

Where do you run again?

With the release of JRuby, and the work on projects like IronRuby and Ruby.NET, it's entirely reasonable to assume that these dynamic languages can and will now run on top of modern virtual machines like the JVM and the CLR, completely negating arguments 2 and 4. While a dynamic language will usually take some kind of performance and memory hit when running on top of VMs that were designed for statically-typed languages, work on the DLR and the MLVM, as well as enhancements to the underlying platform that will be more beneficial to these dynamic language scenarios, will reduce that. Parrot may change that in time, but right now it sits at a 0.5 release and doesn't seem to be making huge inroads into reaching a 1.0 release that will be attractive to anyone outside of the "bleeding-edge" crowd.

So where does that leave us?

The allure of the dynamic language is strong on numerous levels. Without having to worry about type details, the dynamic language programmer can typically slam out more work-per-line-of-code than his statically-typed compatriot, given that both write the same set of unit tests to verify the code. However, I think this idea that the statically-typed developer must produce the same number of unit tests as his dynamically-minded coworker is a fallacy--a large part of the point of a compiler is to provide those same tests, so why duplicate its work? Plus we have the guarantee that the compiler will always execute these tests, regardless of whether the programmer using it remembers to write those tests or not.

Having said that, by the way, I think today's compilers (C++, Java and C#) are pretty weak in the type expressions they require and verify. Type-inferencing languages, like ML or Haskell and their modern descendants, F# and Scala, clearly don't require the degree of verbosity currently demanded by the traditional O-O compilers. I'm pretty certain this will get fixed over time, a la how C# has introduced implicitly typed variables.

Meanwhile, why the rancor between these two camps? It's eerily reminiscent of the ill-will that flowed back and forth between the C++ and Java communities during Java's early days, leading me to believe that it's more a concern over job market and employability than it is a real technical argument. In the end, there will continue to be a ton of Java work for the rest of this decade and well into the next, and JRuby (and Groovy) afford the Java developer lots of opportunities to learn those dynamic languages and still remain relevant to her employer.

It's as Marx said, lo these many years ago: "From each language, according to its abilities, to each project, according to its needs."

Oh, except Perl. Perl just sucks, period. :-)

PostScript

I find it deeply ironic that the news piece TSS cited at the top of the discussion claims that the Chandler project failed due to mismanagement, not its choice of implementation language. It doesn't even mention what language was used to build Chandler, leading me to wonder if anybody even read the piece before choosing up their sides and throwing dirt at one another.


.NET | C++ | Development Processes | Java/J2EE | Languages | Ruby

Wednesday, January 23, 2008 11:51:02 PM (Pacific Standard Time, UTC-08:00)
 Thursday, January 17, 2008
You're Without A Point, Mr. Zachmann

In the latest Redmond Developer News, William Zachmann writes "Game programming is fundamental to understanding where software development is headed in the years ahead", which is a position I happen to believe quite strongly myself. And then...

... then he says absolutely nothing at all.

Oh, there are a couple of book recommendations, plus two paragraphs arguing that the techniques of game programming mirror the development of the GUI in the 80s and 90s and that, since GUIs obviously became important in time, so will game programming. What parts of game programming, you ask? Why, just this list:

Full 3-D modeling, person and vehicle animation, scripting, textures, lighting effects, object physics, particle effects, voice and video creation and streaming, plotting, goal setting and scoring, scenario building, player interaction strategies, lighting effects, heads-up displays (HUDs), object rendering, damage-level maintenance, artificial intelligence (AI) and virtual-reality rendering are just a few of the component technologies that go into game creation and development. Any one of them can be a totally absorbing learning experience all in itself. Mastering game development requires learning about them all -- and more.

Frankly, the whole article was essentially fluff. Zero in the way of logical defense to his argument, and zero in the way of prescriptive advice, aside from "Learn it all, my son, learn it all."

So here's how I think the article should have read:

Only a game? Think Again (the Ted Neward Version)

Developing enterprise software has never been an easy task, and the demands of corporate IT departments in the next decade are only going to get more stringent. Users demand snappier user interfaces, more expressive displays of data and information, higher performance and scalability, much better interaction among the various user- and machine-driven nodes in the network, and more and more "assistance" from the software to get users from "A" to "B" without having to do all the grunt work themselves. (It's a tough job, moving the mouse, clicking it, moving it some more, clicking again and again and again.... And Lord, then you have to type on the keyboard.... It's amazing the average IT knowledge worker doesn't draw hazard pay.)

So where does the enterprise developer find the skills necessary to stand out in the 2010s?

From his free-wheeling high-flying long-haired pizza-snorting DietCoke-mainlining cousin over in the entertainment software industry, of course.

Consider, if you will, the best-selling game World of Warcraft, not from a point of view that describes the domain of the software, but from its non-domain requirements, what some people also refer to as the non-functional requirements:

  1. Performance: if the software behaves sluggishly, users will complain and quit using the software, which directly affects Blizzard's bottom line, since it charges access fees on an hourly basis.
  2. Security: if users can hack the software to grant themselves higher access or change their data (gaining more gold, items, whatever), users will complain and quit using the software. (This is a huge deal, by the way--an entire economy has sprung up around MMORPGs like WoW, where people will pay real-world money--or other real-world currency--for WoW-world goods and services. If attackers can alter their WoW accounts, that can translate directly into hard real-world cash.)
  3. Scalability: the more simultaneous users, the more cash in Blizzard's pocket.
  4. Concurrency: these users are all interacting in sub-second timeframes with each other and the rest of the system, so accuracy of information exchange is critical.
  5. Portability: the more systems the software can run on, the more potential users the software can attract (and, again, the more cash in Blizzard's pocket in return).
  6. User Interface: a tremendous amount of information needs to be available to the user at a moment's notice, and a huge variety of options must be quickly and easily selectable/actionable.
  7. Extensibility: users will need new and different elements (scenarios/quests, character types, races, worlds, and so on) in order to stay interested in using the software. (This isn't generally a problem with enterprise software, since it's not like you're going to be excited about using the HR system anyway, but extensibility there is still going to look a lot like extensibility here.)
  8. Resiliency: In the inevitable event of a crash, data must not be lost, or users will be... miffed... to say the least. Clear distinctions between transient and durable data must be drawn, and must be communicated to the user, so as to manage expectations accordingly. And it goes without saying that if a server (or server farm) goes down, it must come back up or be hot-swapped with another server/farm as quickly as humanly possible.

No doubt hard-core gamers could come up with a variety of other features that would--once the gaming domain is removed from them--be recognizable to the enterprise developer. Naturally, the entertainment industry has other areas that generally a software developer doesn't run into--physics modeling and what-not--but surprisingly a great deal of the modern video game can, and undoubtedly will, make its way into the enterprise software arena. Some thoughts that come to mind:

  1. Animation. Apple has certainly been at the forefront of incorporating animation into user interface, but this is just the tip of the iceberg. Particularly for software that will reach out to the general public, first impressions mean a great deal, and a UI that grabs your attention without being overly dramatic will leave users with warm fuzzies and fond memories. This doesn't even begin to consider the more practical applications of animation, such as a travel reservations system providing a map with your trip itinerary graphically plotted, with zoom-in/zoom-out effects as you work with different parts of the trip (zoom out to look at the air routing, zoom in to look at a particular city for the hotel options, and so on).
  2. User interface paradigms. Modern video games, particularly those that involve deeper strategic and tactical thought (a la real-time strategy games like Command & Conquer), make heavy use of the heads-up display, or HUD, to provide a small-real-estate control panel that doesn't distract from the main focus of the user's attention (the map). Microsoft has started to work with this idea some, with the new Office 2007 taking a very different approach to the ubiquitous menubar-across-the-top, going for what they call "ribbon" elements that fly out and fly back, much the same way that the HUD does in C&C. Also something to consider is the map navigation system, where simply moving the mouse to the far edge of the screen starts scrolling in that direction. Consider this before you dismiss the idea: horizontal scrolling is completely verboten in the word processing app, yet we do it all the time (without too much complaint) in the modern RTS. Why? I submit to you that it is because scrolling is so much easier in the RTS than it is in Word/Excel/whatever.
  3. Player-to-computer interaction. This is different from UI in that the computer often has to masquerade as a player, and in order to do so in strategy games (a la Civilization IV), the programmers typically limit the interaction to very specific statements. Now consider natural language parsing (an offshoot branch of AI research), which can take English statements, break them down, analyze them and respond according to the content of the statement. How much easier would it be for users to say, "Show me all the unsold merchandise for the Northern California region for the years 2005 to 2006", rather than "SELECT * FROM merchandise WHERE ..." or navigating a complex report form?
  4. Speech and sound. Consider the user who is blind, or is missing digits from either or both hands. How useful is a computer then? Now consider the same user who can speak to the machine (a la the natural language point above) or converse with the machine, as blind people do with other people, every day. Not everything has to be presented visually--I eagerly await the interaction of cell phones with Interactive Voice Response systems that are backed by a natural language parser. It's coming, folks.
  5. Scripting languages. Most games are built as game engines written by programmers, with scenarios or missions or quests (or whatever) written in some kind of scripting engine and/or scenario editor. This is the epitome of the domain-specific language, and was done to allow the non-technical knowledge worker (the game designer and playtest leads) to be able to adjust the scenarios without requiring complex development steps.
  6. Explosions and "ka-boom" sound effects. Well... I suppose you could get one of these when you deleted an employee from the system, but that's just getting a little gratuitous.
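To give the natural-language point (item 3) a little shape, here is a deliberately tiny Python sketch that turns the one example sentence into a parameterized query. It is pattern matching, not real natural language parsing, and the merchandise table and its status, region, and sale_year columns are assumptions made up for the example.

```python
import re

# One hard-coded sentence shape; a real parser would handle far more.
PATTERN = re.compile(
    r"show me all the (?P<status>\w+) merchandise "
    r"for the (?P<region>[\w ]+?) region "
    r"for the years (?P<start>\d{4}) to (?P<end>\d{4})",
    re.IGNORECASE,
)

def to_query(request):
    """Translate a request into (sql, params), or raise if not understood."""
    match = PATTERN.search(request)
    if match is None:
        raise ValueError("sorry, I only understand one kind of question")
    # Hypothetical schema: merchandise(status, region, sale_year, ...)
    sql = (
        "SELECT * FROM merchandise "
        "WHERE status = ? AND region = ? AND sale_year BETWEEN ? AND ?"
    )
    params = (
        match["status"].lower(),
        match["region"],
        int(match["start"]),
        int(match["end"]),
    )
    return sql, params

sql, params = to_query(
    "Show me all the unsold merchandise for the Northern California "
    "region for the years 2005 to 2006"
)
print(sql)
print(params)
```

The user never sees the SELECT statement, and the values travel as bound parameters rather than being pasted into the SQL text. Everything hard about the problem lives in replacing that single regular expression with something that understands more than one sentence.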

The point is, all of these things, and more, could--and I submit, will--radically change how we build business software. And considering that most game development isn't about twiddling assembly instructions but writing in modern high-level languages (native C++ being the most common, with Java and C# bringing up a close-and-rapidly-growing second), complete with high-level abstractions and libraries to handle the ugly details (including lighting effects, object interaction, and more), it's fast becoming reasonable to learn these skills without having to throw away everything the enterprise developer already knows.

As for resources, a trip down to your local computer book store or Amazon will yield a plethora of game-related titles, some of which focus on the details of 3D graphics, others of which focus on game design (the actual modeling of the game domain itself--how many units, hit points, etc). One interesting series to consider picking up is "Game Programming Gems", whose volumes are collections of short essays on a huge variety of topics--including the recently-discovered concept of "unit testing" that the entertainment industry has just picked up.

So yes, we have a few things we can contribute to them, as well. *grin*

And besides, it'll finally be nice to explain to your non-technical friends and family what you do for a living. "Well, you see this? I wrote this..." will generate "oohs" and "aahs" rather than "Um... that's just text on a screen, what did that do?"




Thursday, January 17, 2008 1:59:31 PM (Pacific Standard Time, UTC-08:00)
 Tuesday, January 15, 2008
My Open Wireless Network

People visiting my house have commented from time to time on the fact that at my house, there's no WEP key or WPA password to get on the network; in fact, if you were to park your car in my driveway and open up your notebook, you can jump onto the network and start browsing away. For years, I've always shrugged and said, "If I can't spot you sitting in my driveway, you deserve the opportunity to attack my network." Fortunately, Bruce Schneier, author of the insanely-good-reading Crypto-Gram newsletter, is in the same camp as I:

My Open Wireless Network

Whenever I talk or write about my own security setup, the one thing that surprises people -- and attracts the most criticism -- is the fact that I run an open wireless network at home.

There's no password. There's no encryption. Anyone with wireless capability who can see my network can use it to access the internet.

To me, it's basic politeness. Providing internet access to guests is kind of like providing heat and electricity, or a hot cup of tea. But to some observers, it's both wrong and dangerous.

I'm told that uninvited strangers may sit in their cars in front of my house, and use my network to send spam, eavesdrop on my passwords, and upload and download everything from pirated movies to child pornography. As a result, I risk all sorts of bad things happening to me, from seeing my IP address blacklisted to having the police crash through my door.

While this is technically true, I don't think it's much of a risk. I can count five open wireless networks in coffee shops within a mile of my house, and any potential spammer is far more likely to sit in a warm room with a cup of coffee and a scone than in a cold car outside my house. And yes, if someone did commit a crime using my network the police might visit, but what better defense is there than the fact that I have an open wireless network? If I enabled wireless security on my network and someone hacked it, I would have a far harder time proving my innocence.

This is not to say that the new wireless security protocol, WPA, isn't very good. It is. But there are going to be security flaws in it; there always are.

I spoke to several lawyers about this, and in their lawyerly way they outlined several other risks with leaving your network open.

While none thought you could be successfully prosecuted just because someone else used your network to commit a crime, any investigation could be time-consuming and expensive. You might have your computer equipment seized, and if you have any contraband of your own on your machine, it could be a delicate situation. Also, prosecutors aren't always the most technically savvy bunch, and you might end up being charged despite your innocence. The lawyers I spoke with say most defense attorneys will advise you to reach a plea agreement rather than risk going to trial on child-pornography charges.

In a less far-fetched scenario, the Recording Industry Association of America is known to sue copyright infringers based on nothing more than an IP address. The accused's chance of winning is higher than in a criminal case, because in civil litigation the burden of proof is lower. And again, lawyers argue that even if you win it's not worth the risk or expense, and that you should settle and pay a few thousand dollars.

I remain unconvinced of this threat, though. The RIAA has conducted about 26,000 lawsuits, and there are more than 15 million music downloaders. Mark Mulligan of Jupiter Research said it best: "If you're a file sharer, you know that the likelihood of you being caught is very similar to that of being hit by an asteroid."

I'm also unmoved by those who say I'm putting my own data at risk, because hackers might park in front of my house, log on to my open network and eavesdrop on my internet traffic or break into my computers.

This is true, but my computers are much more at risk when I use them on wireless networks in airports, coffee shops and other public places. If I configure my computer to be secure regardless of the network it's on, then it simply doesn't matter. And if my computer isn't secure on a public network, securing my own network isn't going to reduce my risk very much.

Yes, computer security is hard. But if your computers leave your house, you have to solve it anyway. And any solution will apply to your desktop machines as well.

Finally, critics say someone might steal bandwidth from me. Despite isolated court rulings that this is illegal, my feeling is that they're welcome to it. I really don't mind if neighbors use my wireless network when they need it, and I've heard several stories of people who have been rescued from connectivity emergencies by open wireless networks in the neighborhood.

Similarly, I appreciate an open network when I am otherwise without bandwidth. If someone were using my network to the point that it affected my own traffic or if some neighbor kid was dinking around, I might want to do something about it; but as long as we're all polite, why should this concern me? Pay it forward, I say.

Certainly this does concern ISPs. Running an open wireless network will often violate your terms of service. But despite the occasional cease-and-desist letter and providers getting pissy at people who exceed some secret bandwidth limit, this isn't a big risk either. The worst that will happen to you is that you'll have to find a new ISP.

A company called Fon has an interesting approach to this problem. Fon wireless access points have two wireless networks: a secure one for you, and an open one for everyone else. You can configure your open network in either "Bill" or "Linus" mode: In the former, people pay you to use your network, and you have to pay to use any other Fon wireless network.

In Linus mode, anyone can use your network, and you can use any other Fon wireless network for free. It's a really clever idea.

Security is always a trade-off. I know people who rarely lock their front door, who drive in the rain (and, while using a cell phone), and who talk to strangers. In my opinion, securing my wireless network isn't worth it. And I appreciate everyone else who keeps an open wireless network, including all the coffee shops, bars and libraries I have visited in the past, the Dayton International Airport where I started writing this, and the Four Points Sheraton where I finished. You all make the world a better place.

I'll admit that he's gone to far greater lengths to justify the open wireless network than I; frankly, the idea that somebody might try to sit in my driveway in order to hack my desktop machine and store kitty porn on it had never occurred to me. I was always far more concerned that somebody might sit on my ISP's server, hack my desktop machine's IP from there and store kitty porn on it. Which is why, like Schneier, I keep any machine that's in my house as up to date as possible. Granted, that doesn't protect me against a zero-day exploit, but if an attacker is that determined to put kitty porn on my machine, I probably couldn't stop them from breaking down my front door while we're all at work and school and loading it on via a CD-ROM, either.

And, at least in my neighborhood, I can (barely) find the signal for a few other wireless networks that are wide open, too, so I know I'm not the only target of opportunity here. So the prospective kitty porn bandit has his choice of machines to attack, and frankly I'll take the odds of my machines being the more hardened targets over my neighbors' machines any day. (Remember, computer security is often an exercise in convincing the bad guy to go play in somebody else's yard. I wish it were otherwise, but until we have effective response and deterrence mechanisms, it's going to remain that way for a long time.)

I've known a lot of people who leave their front doors unlocked--my grandparents lived in rural Illinois for sixty some-odd years in the same house, leaving the front door pretty much unlocked all the time, and the keys to their cars in the drivers' side sun shade, and never in all that time did any seedy character "break in" to their home or steal their car. (Hell, after my grandfather died a few years ago, the kids--my mom and her siblings--descended on the place to get rid of a ton of the junk he'd collected over the years. I think they would have welcomed a seedy character trying to make off with the stuff at that point.)

Point is, as Schneier points out in the last paragraph, security is always a trade-off, and we must never lose sight of that fact. Remember, dogma is the root of all evil, and should never be considered a substitute for reasoned thought processes.

And meanwhile, friends, when you come to my house to visit, enjoy the wireless, the heat, and the electricity. If you're nice, we may even let you borrow a chair for a while, too. :-)


Development Processes | Mac OS | Security | Windows

Tuesday, January 15, 2008 9:45:10 AM (Pacific Standard Time, UTC-08:00)
Commentary Responses: 1/15/2008 Edition

A couple of people have left comments that definitely deserve response, so here we go:

Glenn Vanderburg comments in response to the Larraysaywhut? post, and writes:

Interesting post, Ted ... and for the most part I agree with your comments.  But I have to ask about this one:

Actually, there are languages that do it even worse than COBOL. I remember one Pascal variant that required your keywords to be capitalized so that they would stand out. No, no, no, no, no! You don't want your functors to stand out. It's shouting the wrong words: IF! foo THEN! bar ELSE! baz END! END! END! END!

[Oh, now, that's just silly.]

Seriously?  You don't think Larry has a point there?  That's one of the primary things I always hated about Wirth's languages, for exactly the reason cited here.  Most real-world Pascal implementations relaxed that rule to recognize upper- and lowercase keywords, but he didn't learn, making the same horrible mistake in Modula-2 and Oberon.

Capitalized words draw your attention, and make it hard to see the real code in between.

Rather than disagree with him, I agree with Larry: uppercased keywords, in a language, are just SOOOO last-century. But so are line-numbering, declaration-before-use, and hailing recursion as a feature. It just seems silly to put this out there as a point of language design, when I can't imagine anyone, with the possible exception of the old COBOL curmudgeon in the corner ("In MY day, we wrote code without a shift key, and we LIKED it! Uphill, both ways, I tell you!"), thinks that uppercased keywords are a good idea.

As for Mr. Wirth, well, dude had some good ideas, but even Einstein had his wacky moments. Repeat after me, everybody: "Just because some guy is brilliant and turns out to be generally right doesn't mean we take everything he says as gospel". It's true for Einstein, it's true for Wirth, and it's true even for Dave Thomas (whom I am privileged to call friend, love deeply, and occasionally think is off his rocker... but I digress).

Actually, Glenn, I think case-sensitivity as a whole is silly. Let's face it, all ye who think that the C-family of languages have this one right, when's the last time you thought it was perfectly acceptable to write code like "int Int = 30;" ? Frankly, if anybody chose to overload based on case, I'd force them to maintain that same code for the next five years as punishment.

(I thought about ripping their still-beating hearts out of their chests instead, but honestly, having to live with the mess they create seems worse, and more fitting to boot.)

What's ironic, though, is that to be perfectly frank, I do exactly this with my SQL code, and it DOESN'T! SEEM! TO! SHOUT! to me AT! ALL! For some reason, this

SELECT name, age, favorite_language FROM programmers WHERE age > 25 AND favorite_language != 'COBOL';

just seems to flow pretty easily off the tongue. Err... eyeball. Whatever.

Meanwhile, 'Of Fibers and Continuations' drew some ire from Mark Murphy:

Frankly, this desire to accommodate the nifty feature of the moment smacks a great deal of Visual Basic, and while VB certainly has its strengths, coherent language design and consistent linguistic facilities is not one of them. It's played havoc with people who tried to maintain code in VB, and it's played hell with the people who try to maintain the VB language. One might try to argue that the Ruby maintainers are just Way Smarter than the Visual Basic maintainers, but I think that sells the VB team pretty short, having met some of them.

Conversely, I think you're selling the Ruby guys a bit short. And this is coming from a guy who's old enough to have written code in Visual Basic for DOS several years into his programming experience.

Wow. Next thing you know, Bruce Tate will be in here, talking about the "chuck the baby out the window" game he wrote for QuickBASIC. (True story.) And, FWIW, I too know the love of BASIC, although in this case I did QuickBasic (DOS) for a while, before it became known as QBasic, and Applesoft BASIC even before that. (Anybody else remember lo-res vs. hi-res graphics debates?) Ah, the sweet, sweet memories of PEEK and POKE and.... *shudder* Never mind.

[insert obligatory "get off my lawn!" reference here]

Get off my lawn, ya hooligan!

The death-knell for VB is widely considered to be the move from VB6 to VB.NET. In doing that, they changed significant quantities of the VB syntax. That's why there was so much hue and cry to keep maintaining VB6, because folk didn't want to take the time to port their zillions of lines of VB6 code.

Actually, much of that hue and cry was from a corner of the VB world that really just didn't want to learn something new. It turned out that most of the VB hue'ers and cry'ers were those who'd been hue'ing and cry'ing with every successive release of VB, and in the words of one very popular VB speaker and programmer, "If they don't want to come along, well, frankly, I think we're better off without 'em anyway."

Truthfully? VB seems to have moved along just fine since. And, interestingly enough, since its transition to the CLR, VB has had a much stronger "core vision" to the language than it did for many years. I don't know if this is because the CLR helps them keep that vision clear, or if trying to keep up with C# is good intra-corporate competition, or what, but I haven't heard anywhere near the kinds of grousing about new linguistic changes in the two successive revisions of VB since VB.NET's release (VS 2005 and VS 2008) that I did prior to its move to the CLR.

The changes Ruby made in 1.9 had very little syntax impact (colons in case statements, and not much else, IIRC). Fibers, in particular, are just objects, supplied as part of the stock Ruby class library. I'm not aware of new syntax required to use fibers.

Grousing about a language adding to its standard class library seems a little weak. When Microsoft added new APIs to .NET when they released 3.0, I suspect you didn't bat an eye.

Oh, heavens, no. Quite the contrary--when .NET 3.0 shipped with WCF, Workflow and WPF in it, I was actually a little concerned, because the CLR's basic footprint is just ballooning like mad. How long before the CLR installation rivals that of the OS itself? Besides, this monolithic approach has its limitations, as the Java folks have discovered to their regret, and it's not too long before people start noticing the five or six different versions of the CLR all living on their machine simultaneously....

Let's be honest here--an API release is different from changing the execution model of the virtual machine, and that's partly what fibers do.

But of even more interest to this particular discussion, I wasn't so much grousing about the syntax, or the addition of fibers, as I was pointing out that this is something that other platforms (notably Win32) have had before, and that it ended up being a "ho-hum, another subject I can safely ignore" topic for the world's programmers. That, and the interesting--and heretofore unrecognized, to me--link between fibers and coroutines and continuations.
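For anyone who hasn't bumped into the fiber/coroutine idea before, here is a rough sketch of the shared concept using Python generators. These are not Ruby 1.9 Fibers or Win32 fibers, and the worker and run_round_robin names are invented for the example; what carries over is the cooperative part: each task runs until it voluntarily yields, and a scheduler resumes the next one, all on a single native thread.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative task: it runs until it chooses to yield control."""
    for step in range(1, steps + 1):
        log.append(f"{name} step {step}")
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    """Resume each task in turn on the current (single) native thread."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # resume until the task's next yield
            queue.append(task)  # not finished: back of the line
        except StopIteration:
            pass                # task ran to completion; drop it

log = []
run_round_robin([worker("A", 2, log), worker("B", 3, log)])
print(log)
```

The two tasks interleave without any preemption or locking, because control only changes hands at the explicit yield. That same "the runtime, not the OS, decides who runs" arrangement is what the green-threads description of MRI amounts to as well.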

In particular, grousing about how Language X adds something to its class library that duplicates a portion of something "baked into" Language Y seems really weak. Does this mean that once something is invented in a language, no other language is supposed to implement it in any way, shape, or form?

Heavens, no! Just like if you want to use objects, you're more than welcome to do so in C, or Pascal, or even assembly!

What if fibers weren't part of the Ruby 1.9 distribution, but rather were done by a third party and released as a popular gem? (I'm not sure if this would have been possible, as there may have been changes needed to the MRI to support fibers, but let's pretend for a moment.) Does this mean that nobody writing class libraries for any programming language are allowed to implement features that are "baked into" some other programming language?

Um... no: witness LINQ, stealing... *ahem* leveraging... a great deal of the concepts that are behind functional languages. Or the Win16 API (or the classic Mac OS API, or the Xt API, or ...), using object concepts from within the C language.

If so, C# should have never been created.

Huh?

Look, I have nothing against Ruby swiping ideas from another language. But let's not pretend that Ruby was built, from the ground up, as a functional language. The concepts that Ruby is putting forth in its 1.9 release are "bolted on", and will show the same leaks in the abstraction model as any other linguistic approach "bolted on" after the fact. This is a large part of the beef with generics in Java, with objects in C, with O/R-Ms, and so on. Languages choose, very precisely, which abstractions they want to make as first-class citizens, and usually when they try to add more of those concepts in after the fact, backwards compatibility and the choices they made earlier trip them up and create a suboptimal scenario. (Witness the various attempts to twist Java into a metaprogramming language: generics, AOP, and so on.)

Besides, if you're going to explore those features, why not go straight to the source? Since when has it become fashionable to discourage people from learning a new concept in the very environment where it is highlighted? Ruby is a phenomenal dynamic language (as are Lisp and Smalltalk, among others), and anybody who wants to grok dynamic languages should learn Ruby (and/or Lisp, and/or Smalltalk). Ditto for functional languages (Haskell and ML/OCaml being the two primary candidates in that camp).

Don't get me wrong -- I agree that there are way better languages for FP than Ruby, even with fibers. That's part of the reason why so many people are tracking JRuby and IronRuby, as having Ruby implementations on common VMs/LRs gives developers greater flexibility for mixing-and-matching languages to fit specific needs (JRuby/Scala/Groovy/Java on JVM, IronEverything/LotsOf# on CLR/DLR).

Which is the same thing I just said. Cool. :-)

I just think you could have spun this more positively and made the same points. The Rails team is having their hats handed to them over the past week or two; casting fibers as a "whither Ruby?" piece just feels like piling on.

Well, frankly, I don't track what's going on in the Rails space at all [and, to be honest, if one more programmer out there invents one more web framework that rhymes with "ails" in any way, so help me God I will SCREAM], so I can honestly say that I wasn't trying to "pile on". What I do find frustrating, however, is the general belief that Ruby is somehow God's Original Scripting Language, and that the Ruby community is constantly innovating while the rest of the programming world is staring on in drooling slack-jawed envy. Most of what Ruby does today is Old Hat to Smalltalkers, and I fully expect that PowerShellers will come along and find most of what the Ruby guys are doing to be interesting experiments in just how powerful the PSH environment really is.

Of deeper concern is the blending of "shell language" and "programming language" that Ruby seems to encourage; the only other language that I think really crosses that line is Perl, and honestly, that's not necessarily good company to be in on this score. When a language tries to hold fealty to too many masters, it loses coherence. Time will tell how well Ruby can chart that narrow course; to my mind, this is what ultimately doomed (and continues to dog) Perl 6.


.NET | C++ | Java/J2EE | Languages | Ruby | Windows

Tuesday, January 15, 2008 3:16:20 AM (Pacific Standard Time, UTC-08:00)
Comments [0]
Java: "Done" like the Patriots, or "Done" like the Dolphins?

English is a wonderful language, isn't it? I'm no linguist, but from what little study I've made of other languages (French and German, so far), English seems to have this huge propensity, more so than the other languages, to put multiple meanings behind the same word. Consider, for example, the word "done": it means "completed", as in "Are you finished with your homework? Yes, Dad, I'm done.", or it can mean "wiped out, exhausted, God-just-take-me-now-please", as in "Good God, another open-source Web framework? That's it, I give up. I'm done. Code the damn thing in assembler for all I care."

So is Java "done" like the Patriots, a job well accomplished, or "done" like the Dolphins, the less said, the better?

(For those of you who are not American football fans, the New England Patriots have gone completely undefeated this season, a mark only set once before in the game's history, and the Miami Dolphins almost went completely unvictorious this season, a mark never accomplished. [Update: Hamlet D'Arcy points out, "Actually, a winless season has been accomplished before. Tampa Bay started their first two seasons winless with an overall 0-26 record before finally winning its first game in 1977." Thanks, Hamlet; my fact-checking on that one was lax, as I was trusting the commentary by a sportscaster during the Dolphins-Ravens game, and apparently his fact-checking was a tad lax, as well. :-)] The playoffs are still going on, but the Patriots really don't look beatable by any of the teams remaining. Meanwhile, the Dolphins managed to eke one out just before the season ended, posting a final record of 1-15, something reserved usually for new teams in the league, not a team with historical greatness behind them. And that's it for Sports, back to you in the studio, Tom.)

Bruce Eckel seems to suggest that Java is somewhere more towards Miami than New England, and that generics were the major culprit. (He also intimates that his criticism of generics has swayed Josh and Neal's opinions to now being critical of generics, something I highly doubt, personally. More on that later.) Now, I'll be the first to admit that I think generics in Java suck, and I've said this before, but the fact remains, no one feature can sink a language. Consider multiple inheritance in C++, something that Stroustrup himself admits (in Design and Evolution of C++) he did before templates or exceptions because he wanted to know how he could do it. Lots of people argued for years (decades, even) over MI and its inclusion in the language, and in the end....

... in the end MI turns out to be a useful feature of the language, but not the way anybody figured it would be. Ditto for templates, by the way. After looking at the Boost libraries, even just the basic examples using them, I feel like I'm looking at Sanskrit or something. As Scott Meyers put it once, "We're a long way from Stack-of-T here, folks."

And that is my principal complaint about generics: the fact that they aren't fully reified down into the JVM means that we lost 90% of the power of generics, and more importantly, we lost all of the emergent behavior and functionality that came out of C++ templates. Nothing new could come out of Java generics, because they were designed to do exactly what they were designed to do: give us type-safe collections. Whee. We're cooking with gas now, folks. Next thing you know, they'll give us printf() back, too.

(Oh, wait, they did that, too.)

Fact is, there's a lot of things that could be done to Java as a language to make it more attractive, but doing so risks that core element that Sun refuses to surrender, that of backwards compatibility. This was evident as far back as JavaPolis 2006, when I interviewed Neal and Josh on the subject; when asked, point-blank, why generics didn't "go all the way down", as .NET generics do, they both basically said, "that would break backwards compatibility, and that was a core concern from the start". (I disagreed with them, off-camera, mind you, particularly on the grounds that the Collections library, the major source of concern around backwards compatibility, could have been ported over, but then Neal pointed out to me that it wasn't just the library itself but all the places it was used, particularly all those libraries outside of Sun, that was at stake. Perhaps, but I still believe that a happier middle ground could have been eked out.) That is still the message today, from what I can see of Neal's and Josh's public statements.

And the fact is, so far as it goes, Java generics are (ugh) useful. Useful solely as a Java compiler trick, perhaps, and far more verbose than we'd prefer, but useful nonetheless. Using them is about as exciting as using a new hammer, but they can at least get the job done.

There, I've made the obligatory "generics don't completely suck" disclaimer, and I'll be the first one to tell you, I just live with the warnings when I write Java code. Possibly that's because I don't worry too much about type-safe collections in my code, but I know lots of other programmers (particularly those on teams where the team composition isn't perhaps as strong as they'd like it to be) who do, and thus take the extra time to write their code to be generics-friendly and thus warning-free.

The mere fact that we have to work at it to create code that is "generics-friendly" is part of the problem here. For all those who came from C++ years ago, you'll know what I mean when I say that "Java generics are the new C++ const": Writing const-correct code was always a Good Thing To Do, it's just that it was also just such a Damn Hard Thing To Do. Which meant that nobody did it.

Languages should enable you to fall into the pit of success. That's the heart of the Principle of Least Surprise, even if it's not always said that way. (I'm not sure that C# 3 does this, time will tell. I'm reasonably certain that Ruby doesn't, despite the repeated insistence of Ruby advocates, many of whom I deeply respect. I'm nervous that Scala and F# will fall into this same trap, owing to their unusual syntax in places. It will be fun to see how ActionScript 3 turns out.)

Here's a thought: Let's leave Java where it is, and just start creating new JVM languages that cater to specific needs. You can call them Java, too, if you like. Or something else, like Scala or Clojure or Groovy or JRuby or CJ or whatever suits your fancy. Since everybody compiles down to JVM bytecode, it's all really academic--they're all Java, in some fundamental way. Which means that Java can thus rest easy, knowing that it fought the good fight, and that others equally capable are carrying on the tradition of JVM programming.

Eckel makes a good point:

Arguably one of the best features of C is that it hasn't changed at all for decades.

... which completely ignores some of the changes that were proposed and accepted for the C99 standard, but we'll leave that alone for now. The point is, the core C language now is the same core C language that I learned back in my high school days, and most, if not all, C code from even that far back will still compile under today's compilers. (Granted, there's likely to be a ton of warnings if you're using old "K-and-R" C, but the code will still compile.)

What about evolution, though? Don't languages need to evolve in order to stay relevant?

Consider the C case: C++ came along, made a whole bunch of changes to the language, but went zooming off in its own direction, to the point where a standards-compliant C++ compiler won't compile even relatively recent C code.

And how many people have complained about that?

By the way, if you're a C/C++ programmer and you haven't looked at D, you're about to get leapfrogged on the evolutionary ladder again. Just an FYI.

As a matter of fact, if you're a Java or .NET programmer, you'd be well-advised to take a look at D, too. It's one of the more interesting native-compilation languages I've seen in a while, and yet arguably it's just what a C++ compiler author would come up with after studying Java and C# for a while (which, as far as I can tell, is exactly what happened). And because D can essentially mimic C bindings for dynamic libraries, it means that a Java guy can now write a JNI DLL in a garbage-collected language that (mostly) does away with pointer arithmetic for most of its work... just as Java did.

Heck, I'd love to see a D-for-the-JVM variant. And D-for-the-CLR, while we're at it. Just for fun.

Let's do this: somebody take the old, pre-Java5 javac source, and release it as "JWH" (short for Java Work Horse), and maintain it as a separate branch of the Java compiler. Then we can hack on the new Java5 language for years, maybe call it "JWNFF" (short for Java With New-Fangled Features), and everybody can get back to work without complaints.

Well, at least those who want to go back to work can do so; there'll always be people who'd rather complain than Get Stuff Done. *shrug*

Now, on the other hand, let's talk about the JVM, and specifically what needs to change there if the JVM platform is to be the workhorse of the 21st century like it was for the latter half of the last decade of the 20th....


.NET | C++ | Java/J2EE | Languages | Ruby

Tuesday, January 15, 2008 2:27:12 AM (Pacific Standard Time, UTC-08:00)
Comments [6]
 Thursday, January 10, 2008
Of Fibers and Continuations

Dave explains Ruby fibers, as they're called in Ruby 1.9. Now, before I get going here, let me explain my biases up front: in the Windows world, we've had fibers for near on to half a decade, I think, and they're basically programmer-managed cooperative tasks. In other words, they're much like threads before threads were managed by the operating system--you decide when to switch to a different fiber, you manage the scheduling, the fiber just gives you a data structure and some basic housekeeping. (I know I'm oversimplifying and glossing over details, but that's the core, as I remember it. It's been a while since I tried to use them.) Legend has it that fibers were introduced into the Win32 API on behalf of the SQL Server team, who needed to take that kind of control over thread scheduling in order to best manage the CPU, but here's the rub: they never served much purpose otherwise.

Frankly, nobody could figure out what to do with them. I'm beginning to wonder if it was because our languages of the time (C, C++) didn't have any real idea of freezing execution of a task at a certain point, putting it aside, then coming back to it and restoring it. In other words, the very behavior we see out of a continuation.

In Dave's explanation, Ruby fibers take on a different meaning:

A fiber is somewhat like a thread, except you have control over when it gets scheduled. Initially, a fiber is suspended. When you resume it, it runs the block until the block finishes, or it hits a Fiber.yield. This is similar to a regular block yield: it suspends the fiber and passes control back to the resume. Any value passed to Fiber.yield becomes the value returned by resume.

They sound a lot like Win32 fibers combined with Python generators, with a touch more by way of API support. (The Win32 API version was codified using C bindings, for starters, not objects.) But Dave quickly points out that fibers can become full-fledged coroutines by allowing fibers to transfer control from one to another, which is interesting, though I suspect lots of people will explore this feature and write lots of bad code as a result. Oh, well: bright shiny new toys have that effect on programmers sometimes.
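For the curious, here is roughly what that fiber-to-fiber hand-off looks like. This is only a minimal sketch: the names `ping`, `pong`, and `log` are mine, not Dave's, and it assumes a Ruby where `Fiber#transfer` and `Fiber.current` are available (on 1.9 through 3.0 they live behind `require 'fiber'`).

```ruby
require 'fiber' # Fiber#transfer and Fiber.current need this on Ruby 1.9 through 3.0

log  = []
main = Fiber.current   # the fiber that is running this script
ping = nil
pong = nil

ping = Fiber.new do
  3.times do |i|
    log << "ping #{i}"
    pong.transfer      # hand control straight to the other coroutine
  end
  main.transfer        # finished: hand control back to the main program
end

pong = Fiber.new do
  loop do
    log << "pong"
    ping.transfer      # hand control straight back
  end
end

ping.transfer          # kick things off
puts log
```

Notice that neither fiber "returns" to a caller; each names the fiber that runs next, which is exactly where the opportunity for spaghetti comes from.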

He then goes on to describe how Ruby can provide pipelines:

As a starting point, let's write two fibers. One's a generator--it creates a list of even numbers. The second is a consumer. All it does is accept values from the generator and print them. We'll make the consumer stop after printing 10 numbers.

    evens = Fiber.new do
      value = 0
      loop do
        Fiber.yield value
        value += 2
      end
    end

    consumer = Fiber.new do
      10.times do
        next_value = evens.resume
        puts next_value
      end
    end

    consumer.resume

Note how we had to use resume to kick off the consumer. Technically, the consumer doesn't have to be a Fiber, but, as we'll see in a minute, making it one gives us some flexibility.

Ah, the classic producer-consumer example. Gotta love it. The interesting thing here, though, is that evens, prior to the call to resume, has done nothing. No execution has taken place. In essence, the fiber here is in deferred execution mode (now, where have I heard that before?), meaning nothing actually fires until asked for. It then runs until it hits the yield, essentially going to sleep again.
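To make the deferred part concrete, here is a small sketch of my own (the `trace` array and the `counter` fiber are illustrative names, not part of Dave's post) that records when the fiber's body actually runs and what flows across each resume/yield boundary:

```ruby
trace = []

counter = Fiber.new do
  trace << :started
  first = Fiber.yield 1      # suspend; 1 becomes the return value of resume
  trace << first
  Fiber.yield 2              # suspend again
  trace << :finishing
  :done                      # the block's final value is returned by the last resume
end

trace << :created            # nothing inside the fiber has run yet
a = counter.resume           # runs up to the first Fiber.yield
b = counter.resume(:hello)   # :hello becomes the value of that first Fiber.yield
c = counter.resume           # runs the block to its end

p trace      # [:created, :started, :hello, :finishing]
p [a, b, c]  # [1, 2, :done]
```

The `:created` entry lands in the trace before `:started`, which is the whole point: creating the fiber captures the work, and only resume performs it.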

Is it me, or does this smell suspiciously like continuations?

More interesting, Dave goes on to define the consumer fiber to take the name of a source to resume, then shows how one can abstract the coupling between producer and consumer away even further by creating a filter that only allows multiples of three through the pipeline:

    def evens
      Fiber.new do
        value = 0
        loop do
          Fiber.yield value
          value += 2
        end
      end
    end

    def multiples_of_three(source)
      Fiber.new do
        loop do
          next_value = source.resume
          Fiber.yield next_value if next_value % 3 == 0
        end
      end
    end

    def consumer(source)
      Fiber.new do
        10.times do
          next_value = source.resume
          puts next_value
        end
      end
    end

    consumer(multiples_of_three(evens)).resume

Running this, we get the output

0
6
12
18
. . .

This is getting cool. We write little chunks of code, and then combine them to get work done. Just like a pipeline.

Actually, instead of calling it a pipeline, let's call it a comprehension and be done with it.
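As a point of comparison, the same pipeline can be sketched with Ruby's `Enumerator` class instead of raw fibers. This is my own rewrite, not Dave's code; it reuses his method names `evens` and `multiples_of_three`, and `first_ten` is a name I made up.

```ruby
# An endless source of even numbers.
def evens
  Enumerator.new do |out|
    value = 0
    loop do
      out << value
      value += 2
    end
  end
end

# A filter stage: passes along only the multiples of three from its source.
def multiples_of_three(source)
  Enumerator.new do |out|
    source.each { |value| out << value if value % 3 == 0 }
  end
end

# take(10) stops the endless source after ten values have come through.
first_ten = multiples_of_three(evens).take(10)
p first_ten  # [0, 6, 12, 18, 24, 30, 36, 42, 48, 54]
```

Each stage still knows nothing about its neighbors beyond "give me the next value", which is what makes the stages composable.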

See, Ruby apparently has discovered the joys of functional programming, something that Scala and F# have baked in from the beginning, instead of bolted on from the outside. No offense intended to the Ruby community or to Matz, but I get a little lost as to exactly what Ruby's core concepts are--it's a scripting language, it's a development language, it's a DSL platform, it's object-oriented, it's functional, it's a bird, it's a plane, it's horribly confused.

Dave touches on this point in one of his responses to comments:

The thing that's interesting to me about Ruby in this context is how much it can bend into multiple paradigms. Haskell does FP way better than Ruby. Smalltalk does OO (marginally) better. But Ruby does them all, and in a way that interoperates nicely.

I like a lot of Ruby's core concepts--open classes, mixins, and so on--but I'm worried that Ruby's trying to do too much, much as another language I know and love is. Frankly, this desire to accommodate the nifty feature of the moment smacks a great deal of Visual Basic, and while VB certainly has its strengths, coherent language design and consistent linguistic facilities are not among them. It's played havoc with people who tried to maintain code in VB, and it's played hell with the people who try to maintain the VB language. One might try to argue that the Ruby maintainers are just Way Smarter than the Visual Basic maintainers, but I think that sells the VB team pretty short, having met some of them.

Don't get me wrong here, I think it's nifty that Ruby has come around to realize the power of atomic components doing one thing well, passing its results on into the pipeline for something else to process, and this is a large part of why PowerShell is, in my mind, the sleeper programming language of 2008/2009. Pipelines also scale very well, since they encourage immutable state, since the results of each processing step are essentially fed in from the outside and the results are passed back out to the next step in the chain--all state is passed from one step to the next, meaning I can run lots of these pipelines in parallel with no fear of deadlocks or bottlenecks, since each processing step is itself essentially state-free. This is also, in fact, a lot of how original transaction-processing systems were designed, which also scaled pretty well, at least until we got the bright idea to store mutable state in them (*cough* EJB *cough*).
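A rough sketch of what "state-free steps" buys you, in Ruby for consistency with the examples above. Everything here (`double`, `increment`, `describe`, `run_pipeline`) is a hypothetical name of mine, and this is an illustration of the idea rather than a claim about how any real transaction-processing system is built:

```ruby
# Each step is a pure function: a value comes in, a new value goes out,
# and nothing outside the step is read or modified.
double    = lambda { |n| n * 2 }
increment = lambda { |n| n + 1 }
describe  = lambda { |n| "value=#{n}" }

steps = [double, increment, describe]

# Feed one input through every step, in order.
run_pipeline = lambda do |input|
  steps.inject(input) { |acc, step| step.call(acc) }
end

# Because no step shares mutable state, separate inputs can be pushed
# through the same pipeline on separate threads without any locking.
threads  = [1, 2, 3].map { |n| Thread.new { run_pipeline.call(n) } }
results  = threads.map { |t| t.value }

p results  # ["value=3", "value=5", "value=7"]
```

The moment one of those lambdas starts writing to a shared variable, the no-locks guarantee is gone, which is the EJB lesson in miniature.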

Oh, and for what it's worth, this concept is trivial to do in F#, via the pipeline operator ( "|>" ). Ditto for Scala. If you're going to think in pipelines, you may as well work with a language that has the concept baked in a little more deeply, IMHO. And before the Rubyists beat me over the head about this, Dave himself admits this is true in another comment response:

Paolo: I don't think Ruby or Smalltalk really do functional programming to any deep level. However, both can be used to implement particular FP constructs (such as generators).

And maybe, in the end, that's the important thing: recognizing what aspects of functional programming can be easily lifted into your language of choice and used to make your life simpler. Still, I'm always looking for languages that take the concepts that float in my head and let me express them as first-class constructs, not as duck-taped partial implementations thereof. I felt the same way about doing "objects" in C (back in the Win16 programming day, before C++ Windows frameworks emerged), and about doing "aspects" in Java using interception.

If you're going to think in a concept, you generally want a language that expresses that concept as a first-class citizen, or you'll get frustrated quickly. Ruby's fibers may be the gateway drug for developers to learn functional programming, but they're not going to get it at any deep level until they dive into Haskell or ML or one of its derivatives (Scala or F#). For example, once you see the power inherent in Scala's comprehensions, you never look at a simple for loop the same way again.

Oh, and Groovyists? I'm sure they could do this, but I dunno if it's worth it, given that Groovy and Scala, at some level, are fundamentally interoperable as well. (Note to self: must do a blog post about Groovy calling into Scala code, just to show it can be done. Y'all hold me to that, if you don't see it in a week or two.)

Meanwhile, the link between continuations and Ruby fibers (and Win32 fibers, while we're at it) still tickles at the back of my mind.... But that's a thought waiting to be explored another day.


.NET | Java/J2EE | Languages | Ruby

Thursday, January 10, 2008 5:28:00 AM (Pacific Standard Time, UTC-08:00)
Comments [5]
 Wednesday, January 09, 2008
Larraysaywhut?

Larry Wall, (in)famous creator of that (in)famous Perl language, has contributed a few cents' worth to the debate over "scripting" languages:

I think, to most people, scripting is a lot like obscenity. I can't define it, but I'll know it when I see it.

Aside from the fact that the original quote reads "pornography" instead of "obscenity", I get what he's talking about. Finding a good definition for scripting is like trying to find a good definition for "object-oriented" or "service-oriented" or... come to think of it, like a lot of the terms that we tend to use on a daily basis. So I'm right there along with him, assuming that his goal here is to call out a workable definition for "scripting" languages.

Here are some common memes floating around:

    Simple language
    "Everything is a string"
    Rapid prototyping
    Glue language
    Process control
    Compact/concise
    Worse-is-better
    Domain specific
    "Batteries included"

...I don't see any real center here, at least in terms of technology. If I had to pick one metaphor, it'd be easy onramps. And a slow lane. Maybe even with some optional fast lanes.

I'm not sure where some of these memes come from, but some of them I recognize (Simple language, Rapid prototyping, glue language, compact/concise), some of them are new to me ("Everything is a string", process control), and some of them I seriously question the sanity of anybody suggesting them (worse-is-better, domain specific, "batteries included"). Fortunately he didn't include the "dynamically typed" or "loosely coupled" memes, which I hear tagged on scripting languages all the time.

But basically, scripting is not a technical term. When we call something a scripting language, we're primarily making a linguistic and cultural judgment, not a technical judgment. I see scripting as one of the humanities. It's our linguistic roots showing through.

I can definitely see the use of the term "scripting" as a term of value judgement, but I'm not sure I see the idea that scripting languages somehow demonstrate our linguistic roots.

We then are treated to one-sentence reviews of every language Larry ever programmed in, starting from his earliest days in BASIC, with some interesting one-liners scattered in there every so often:

On Ruby: "... a great deal of Ruby's syntax is borrowed from Perl, layered over Smalltalk semantics."

On Lisp: "Is LISP a candidate for a scripting language? While you can certainly write things rapidly in it, I cannot in good conscience call LISP a scripting language. By policy, LISP has never really catered to mere mortals. And, of course, mere mortals have never really forgiven LISP for not catering to them."

On JavaScript: "Then there's JavaScript, a nice clean design. It has some issues, but in the long run JavaScript might actually turn out to be a decent platform for running Perl 6 on. Pugs already has part of a backend for JavaScript, though sadly that has suffered some bitrot in the last year. I think when the new JavaScript engines come out we'll probably see renewed interest in a JavaScript backend." Presumably he means a new JavaScript backend for Perl 6. Or maybe a new Perl 6 backend for JavaScript.

On scripting languages as a whole: "When I look at the present situation, what I see is the various scripting communities behaving a lot like neighboring tribes in the jungle, sometimes trading, sometimes warring, but by and large just keeping out of each other's way in complacent isolation."

Like the prize at the bottom of the cereal box, if you can labor through all of this, though, you get treated to one of the most amazing succinct discussions/point-lists of language design and implementation I've seen in a long while; I've copied that section over verbatim, though I annotate with my own comments in italics:

early binding / late binding

Binding in this context is about exactly when you decide which routine you're going to call for a given routine name. In the early days of computing, most binding was done fairly early for efficiency reasons, either at compile time, or at the latest, at link time. You still tend to see this approach in statically typed languages. With languages like Smalltalk, however, we began to see a different trend, and these days most scripting languages are trending towards later binding. That's because scripting languages are trying to be dwimmy (Do What I Mean), and the dwimmiest decision is usually a late decision because you then have more available semantic and even pragmatic context to work with. Otherwise you have to predict the future, which is hard.

So scripting languages naturally tend to move toward an object-oriented point of view, where the binding doesn't happen 'til method dispatch time. You can still see the scars of conflict in languages like C++ and Java though. C++ makes the default method type non-virtual, so you have to say virtual explicitly to get late binding. Java has the notion of final classes, which force calls to the class to be bound at compile time, essentially. I think both of those approaches are big mistakes. Perl 6 will make different mistakes. In Perl 6 all methods are virtual by default, and only the application as a whole can tell the optimizer to finalize classes, presumably only after you know how all the classes are going to be used by all the other modules in the program.

[Frankly, I think he leaves out a whole class of binding ideas here, that being the "VM-bound" notion that both the JVM and the CLR make use of. In other words, the Java language is early-bound, but the actual linking doesn't take place until runtime (or link time, as it were). The CLR takes this one step further with its delegates design, essentially allowing developers to load a metadata token describing a function and construct a delegate object--a functor, as it were--around that. This is, in some ways, a highly useful marriage of both early and late binding.]

[I'm also a little disturbed by his comment that "only the application as a whole can tell the optimizer to finalize classes, presumably only after you know how all the classes are going to be used by all the other modules in the program." Since when can programmers reasonably state that they know how classes are going to be used by all the other modules in the program? This seems like a horrible set-you-up-for-failure point to me.]
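For readers who want to see late binding in action, Ruby makes a handy demonstration, since the target of a call is looked up at dispatch time. The sketch below is mine, with a made-up `Greeter` class; it is not taken from Larry's talk.

```ruby
class Greeter
  def greet
    "hello"
  end
end

greeter = Greeter.new
before  = greeter.greet      # bound to the first definition, at call time

# Reopen the class while the program is running and replace the method.
class Greeter
  def greet
    "bonjour"
  end
end

after = greeter.greet        # same object, same call, different target

# The method name itself can also be decided at runtime.
name    = :greet
dynamic = greeter.public_send(name)

p [before, after, dynamic]   # ["hello", "bonjour", "bonjour"]
```

An early-bound language would have fixed the target of `greeter.greet` when the code was compiled or linked; here the existing object picks up the new definition without being recreated.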

single dispatch / multiple dispatch

In a sense, multiple dispatch is a way to delay binding even longer. You not only have to delay binding 'til you know the type of the object, but you also have to know the types of all the rest of the arguments before you can pick a routine to call. Python and Ruby always do single dispatch, while Dylan does multiple dispatch. Here is one dimension in which Perl 6 forces the caller to be explicit for clarity. I think it's an important distinction for the programmer to bear in mind, because single dispatch and multiple dispatch are philosophically very different ideas, based on different metaphors.

With single-dispatch languages, you are basically sending a message to an object, and the object decides what to do with that message. With multiple dispatch languages, however, there is no privileged object. All the objects involved in the call have equal weight. So one way to look at multiple dispatch is that the objects are completely passive. But if the objects aren't deciding how to bind, who is?

Well, it's sort of a democratic thing. All the routines of a given name get together and hold a political conference. (Well, not really, but this is how the metaphor works.) Each of the routines is a delegate to the convention. All the potential candidates put their names in the hat. Then all the routines vote on who the best candidate is, and the next best, and the next best after that. And eventually the routines themselves decide what the best routine to call is.

So basically, multiple dispatch is like democracy. It's the worst way to do late binding, except for all the others.

But I really do think that's true, and likely to become truer as time goes on. I'm spending a lot of time on this multiple dispatch issue because I think programming in the large is mutating away from the command-and-control model implicit in single dispatch. I think the field of computation as a whole is moving more toward the kinds of decisions that are better made by swarms of insects or schools of fish, where no single individual is in control, but the swarm as a whole has emergent behaviors that are somehow much smarter than any of the individual components.

[I think it's a pretty long stretch to go from "multiple dispatch", where the call is dispatched based not just on the actual type of the recipient but on the types of the other arguments as well, to suggesting that whole "swarms" of objects are going to influence where the call comes out. People criticized AOP for creating systems where developers couldn't predict, a priori, where a call would end up; how will they react to systems where nondeterminism--having no real idea at source level which objects are "voting", to use his metaphor--is the norm, not the exception?]
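Since Ruby only does single dispatch, the closest you can get is to fake it. Here is a deliberately naive sketch, assuming a hand-built table keyed on the classes of both arguments; `Asteroid`, `Ship`, `COLLISIONS`, and `collide` are all hypothetical names, and there is no "voting" here, only an exact-match lookup.

```ruby
class Asteroid; end
class Ship; end

# Hypothetical dispatch table: the key is the pair of argument classes,
# so no single argument is the privileged receiver.
COLLISIONS = {
  [Asteroid, Asteroid] => lambda { |a, b| "asteroids bounce" },
  [Asteroid, Ship]     => lambda { |a, b| "ship takes damage" },
  [Ship, Asteroid]     => lambda { |a, b| "ship takes damage" },
  [Ship, Ship]         => lambda { |a, b| "ships dock" }
}

# Pick the routine by looking at the runtime types of *both* arguments.
def collide(a, b)
  handler = COLLISIONS.fetch([a.class, b.class]) do
    raise ArgumentError, "no collide routine for #{a.class} and #{b.class}"
  end
  handler.call(a, b)
end

puts collide(Ship.new, Asteroid.new)
puts collide(Ship.new, Ship.new)
```

A real multiple-dispatch language would also rank candidates by how closely each parameter type matches (subclasses, roles, and so on), which is where the convention metaphor, and my unease about predictability, comes in.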

eager evaluation / lazy evaluation

Most languages evaluate eagerly, including Perl 5. Some languages evaluate all expressions as lazily as possible. Haskell is a good example of that. It doesn't compute anything until it is forced to. This has the advantage that you can do lots of cool things with infinite lists without running out of memory. Well, at least until someone asks the program to calculate the whole list. Then you're pretty much hosed in any language, unless you have a real Turing machine.

So anyway, in Perl 6 we're experimenting with a mixture of eager and lazy. Interestingly, the distinction maps very nicely onto Perl 5's concept of scalar context vs. list context. So in Perl 6, scalar context is eager and list context is lazy. By default, of course. You can always force a scalar to be lazy or a list to be eager if you like. But you can say things like for 1..Inf as long as your loop exits some other way a little bit before you run into infinity.

[This distinction is, I think, becoming one of continuum rather than a binary choice; LINQ, for example, makes use of deferred execution, which is fundamentally a lazy operation, yet C# itself as a whole generally prefers eager evaluation where and when it can... except in certain decisions where the CLR will make the call, such as with the aforementioned delegates scenario. See what I mean?]
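
The same mix of eager and lazy can be sketched in Python, assuming generators as a stand-in for lazy lists. The function names squares and first_square_over are made up for the example; this is an analogy to the for 1..Inf idea, not a rendering of Perl 6 semantics.

```python
from itertools import count, islice

def squares():
    """Lazy: an unbounded stream; nothing is computed until someone asks."""
    for n in count(1):
        yield n * n

# Eager: force exactly five values out of the infinite stream.
first_five = list(islice(squares(), 5))

def first_square_over(limit):
    """Loop over an infinite sequence, exiting before we reach infinity."""
    for sq in squares():
        if sq > limit:
            return sq

print(first_five)             # [1, 4, 9, 16, 25]
print(first_square_over(50))  # 64
```

Calling list(squares()) without the islice would be the "someone asks for the whole list" case, and it would never return.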

eager typology / lazy typology

Usually known as static vs. dynamic, but again there are various positions for the adjustment knob. I rather like the gradual typing approach for a number of reasons. Efficiency is one reason. People usually think of strong typing as a reason, but the main reason to put types into Perl 6 turns out not to be strong typing, but rather multiple dispatch. Remember our political convention metaphor? When the various candidates put their names in the hat, what distinguishes them? Well, each candidate has a political platform. The planks in those political platforms are the types of arguments they want to respond to. We all know politicians are only good at responding to the types of arguments they want to have...

[OK, Larry, enough with the delegates and the voting thing. It just doesn't work. I know it's an election year, and everybody wants to get in on the whole "I picked the right candidate" thing, but seriously, this metaphor is getting pretty tortured by this point.]

There's another way in which Perl 6 is slightly more lazy than Perl 5. We still have the notion of contexts, but exactly when the contexts are decided has changed. In Perl 5, the compiler usually knows at compile time which arguments will be in scalar context, and which arguments will be in list context. But Perl 6 delays that decision until method binding time, which is conceptually at run time, not at compile time. This might seem like an odd thing to you, but it actually fixes a great number of things that are suboptimal in the design of Perl 5. Prototypes, for instance. And the need for explicit references. And other annoying little things like that, many of which end up as frequently asked questions.

[Again, this is a scenario where smarter virtual machines and execution engines can help--in Java, for example, the JVM can make some amazing optimizations in its runtime compiler (a.k.a. JIT compiler) that a normal ahead-of-time compiler simply can't make, such as inlining monomorphic interface calls. One area that I think he's hinting at here, though, which I think is an interesting area of research and extension, is that of being able to access the context in which a call is being made, a la the .NET context architecture, which had some limited functionality in the EJB space, as well. This would also be a good "middle-ground" for multi-dispatch, since now the actual dispatch could be done on the basis of the context itself, which could be known, rather than on random groups of objects that Larry's gathered together for an open conference on dispatching the method call.... I kid, I kid.]

limited structures / rich structures

Awk, Lua, and PHP all limit their composite structures to associative arrays. That has both pluses and minuses, but the fact that awk did it that way is one of the reasons that Perl does it differently, and differentiates ordered arrays from unordered hashes. I just think about them differently, and I think a lot of other people do too.

[Frankly, none of the "popular" languages really has a good set-based first-class concept, whereas many of the functional languages do, and thanks to things like LINQ, I think the larger programming world is beginning to see the power in sets and set projections. So let's not limit the discussion to associative arrays; yes, they're useful, but in five years they'll be useful in the same way that line-numbered BASIC and use of the goto keyword can still be useful.]

symbolic / wordy

Arguably APL is also a kind of scripting language, largely symbolic. At the other extreme we have languages that eschew punctuation in favor of words, such as AppleScript and COBOL, and to a lesser extent all the Algolish languages that use words to indicate blocks where the C-derived languages use curlies. I prefer a balanced approach here, where symbols and identifiers are each doing what they're best at. I like it when most of the actual words are those chosen by the programmer to represent the problem at hand. I don't like to see words used for mere syntax. Such syntactic functors merely obscure the real words. That's one thing I learned when I switched from Pascal to C. Braces for blocks. It's just right visually.

[Sez you, though I have to admit my own biases agree. As with all things, though, this can get out of hand pretty quickly if you're not careful. The prosecution presents People's 1, Your Honor: the Perl programming language.]

Actually, there are languages that do it even worse than COBOL. I remember one Pascal variant that required your keywords to be capitalized so that they would stand out. No, no, no, no, no! You don't want your functors to stand out. It's shouting the wrong words: IF! foo THEN! bar ELSE! baz END! END! END! END!

[Oh, now, that's just silly.]

Anyway, in Perl 6 we're raising the standard for where we use punctuation, and where we don't. We're getting rid of some of our punctuation that isn't really pulling its weight, such as parentheses around conditional expressions, and most of the punctuational variables. And we're making all the remaining punctuation work harder. Each symbol has to justify its existence according to Huffman coding.

Oddly, there's one spot where we're introducing new punctuation. After your sigil you can add a twigil, or secondary sigil. Just as a sigil tells you the basic structure of an object, a twigil tells you that a particular variable has a weird scope. This is basically an idea stolen from Ruby, which uses sigils to indicate weird scoping. But by hiding our twigils after our sigils, we get the best of both worlds, plus an extensible twigil system for weird scopes we haven't thought of yet.

[Did he just say "twigil"? As in, this is intended to be a serious term? As in, Perl wasn't symbol-heavy enough, so now they're adding twigils that will hide after sigils, with maybe forgils and fivegils to come in Perl 7 and 8, respectively?]

We think about extensibility a lot. We think about languages we don't know how to think about yet. But leaving spaces in the grammar for new languages is kind of like reserving some of our land for national parks and national forests. Or like an archaeologist not digging up half the archaeological site because we know our descendants will have even better analytical tools than we have.

[Or it's just YAGNI, Larry. Look, if your language wants to have syntactic macros--which is really the only way to have language extensibility without having to rewrite your parser and lexer and AST code every n years--then build in syntactic macros, but really, now you're just emulating LISP, that same language you said wasn't for mere mortals, waaaay back there up at the top.]

Really designing a language for the future involves a great deal of humility. As with science, you have to assume that, over the long term, a great deal of what you think is true will turn out not to be quite the case. On the other hand, if you don't make your best guess now, you're not really doing science either. In retrospect, we know APL had too many strange symbols. But we wouldn't be as sure about that if APL hadn't tried it first.

[So go experiment with something that doesn't have billions of lines of code scattered all across the planet. That's what everybody else does. Witness Gregor Kiczales' efforts with AspectJ: he didn't go and modify Java proper, he experimented with a new language to see what AOP constructs would fit. And he never proposed AspectJ as a JSR to modify core Java. Not because he didn't want to, mind you, I know that this was actively discussed. But I also know that he was waiting to see what a large-scale AOP system looked like, so we could find the warts and fix them. The fact that he never opened an AspectJ JSR suggests to me that said large-scale AOP system never materialized.]

compile time / run time

Many dynamic languages can eval code at run time. Perl also takes it the other direction and runs a lot of code at compile time. This can get messy with operational definitions. You don't want to be doing much file I/O in your BEGIN blocks, for instance. But that leads us to another distinction:

declarational / operational

Most scripting languages are way over there on the operational side. I thought Perl 5 had an oversimplified object system till I saw Lua. In Lua, an object is just a hash, and there's a bit of syntactic sugar to call a hash element if it happens to contain code. That's all there is. [Dude, it's the same with JavaScript/ECMAScript. And a few other languages, besides.] They don't even have classes. Anything resembling inheritance has to be handled by explicit delegation. That's a choice the designers of Lua made to keep the language very small and embeddable. For them, maybe it's the right choice.
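
To make the "an object is just a hash" idea concrete, here is a rough sketch using Python dicts. The call helper and the "proto" key are my own inventions for the example; they only imitate the flavor of what Lua and JavaScript do and are not either language's actual mechanism.

```python
def call(obj, name, *args):
    """Find `name` in the dict, following the 'proto' link on a miss,
    then call the entry with the original object as the receiver."""
    target = obj
    while target is not None:
        if name in target:
            return target[name](obj, *args)
        target = target.get("proto")      # explicit delegation, by hand
    raise KeyError(name)

animal = {
    "proto": None,
    "speak": lambda self: f"{self['name']} makes a sound",
}

dog = {
    "proto": animal,                      # "inherit" by pointing at another hash
    "name": "Rex",
    "fetch": lambda self: f"{self['name']} fetches",
}

print(call(dog, "fetch"))  # Rex fetches
print(call(dog, "speak"))  # Rex makes a sound
```

No classes anywhere: the only thing that looks like inheritance is the lookup loop in call.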

Perl 5 has always been a bit more declarational than either Python or Ruby. I've always felt strongly that implicit scoping was just asking for trouble, and that scoped variable declarations should be very easy to recognize visually. That's why we have my. It's short because I knew we'd use it frequently. Huffman coding. Keep common things short, but not too short. In this case, 0 is too short.

Perl 6 has more different kinds of scopes, so we'll have more declarators like my and our. But appearances can be deceiving. While the language looks more declarative on the surface, we make most of the declarations operationally hookable underneath to retain flexibility. When you declare the type of a variable, for instance, you're really just doing a kind of tie, in Perl 5 terms. The main difference is that you're tying the implementation to the variable at compile time rather than run time, which makes things more efficient, or at least potentially optimizable.
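
A loose analogy for "declaring a type is really a kind of tie", sketched in Python: a descriptor that gets hooked in when the class is defined and then intercepts every read and write. Typed and Point are hypothetical names, and Python attaches the hook at class-definition time rather than at compile time, so this only illustrates the idea of a declaration that is operationally hookable underneath.

```python
class Typed:
    """Descriptor that intercepts reads and writes of one attribute."""
    def __init__(self, kind):
        self.kind = kind

    def __set_name__(self, owner, name):
        self.slot = "_" + name            # where the real value is stored

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        return getattr(obj, self.slot)

    def __set__(self, obj, value):
        if not isinstance(value, self.kind):
            raise TypeError(f"expected {self.kind.__name__}, got {value!r}")
        setattr(obj, self.slot, value)

class Point:
    x = Typed(int)                        # the "declaration" installs the hook

    def __init__(self, x):
        self.x = x                        # goes through Typed.__set__

p = Point(3)
print(p.x)  # 3
```

Assigning p.x = "three" raises TypeError, because the write is routed through the hook rather than straight into the object.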

[The whole declarational vs operational point here seems more about type systems than the style of code; in a classless system, a la JavaScript/ECMAScript, objects are just objects, and you can mess with them at runtime as much as you wish. How you define the statements that use them, on the other hand, is another axis of interest entirely. For example, SQL is a declarational language, really more functional in nature (since functional languages tend to be declarational as well), since the interpreter is free to tackle the sub-clauses of a statement in any order it wishes, rather than having to start from the beginning and work left to right. There are definitely finer distinctions waiting to be made here, IMHO, since there's still a lot of fuzziness in the taxonomy.]

immutable classes / mutable classes

Classes in Java are closed, which is one of the reasons Java can run pretty fast. In contrast, Ruby's classes are open, which means you can add new things to them at any time. Keeping that option open is perhaps one of the reasons Ruby runs so slow. But that flexibility is also why Ruby has Rails. [Except that Ruby now compiles to the JVM, and fully supports open classes there, and runs a lot faster than the traditional Ruby interpreter, which means that either the mutability of classes has nothing to do with the performance of a virtual machine, or else the guys working on the traditional Ruby interpreter are just morons compared to the guys working on Java. Since I don't believe the latter, I believe that the JVM has some intrinsic engineering in it that the Ruby interpreter could have--given enough time and effort--but simply doesn't have yet. Frankly, from having spelunked the CLR, there's really nothing structurally restricting the CLR from having open classes, either, so long as the semantics of modifying a class structure in memory were well understood: concurrency issues, outstanding objects, changes in method execution semantics, and so on.]
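
For anybody who hasn't played with open classes, this is roughly what they look like, sketched in Python rather than Ruby. Greeter and shout are made-up names for the example.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

g = Greeter("world")        # created before the class is reopened

def shout(self):
    return f"HELLO, {self.name.upper()}!"

Greeter.shout = shout       # add a method to the class at run time

print(g.shout())            # HELLO, WORLD!
```

Notice that the object created before the change picks up the new method. That is exactly the "outstanding objects" question any runtime with open classes has to answer one way or another.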

Perl 6 will have an interesting mix of immutable generics and mutable classes here, and interesting policies on who is allowed to close classes when. Classes are never allowed to close or finalize themselves, for instance. Sorry, for some reason I keep talking about Perl 6. It could have something to do with the fact that we've had to think about all of these dimensions in designing Perl 6.

class-based / prototype-based

Here's another dimension that can open up to allow both approaches. Some of you may be familiar with classless languages like Self or JavaScript. Instead of classes, objects just clone from their ancestors or delegate to other objects. For many kinds of modeling, it's actually closer to the way the real world works. Real organisms just copy their DNA when they reproduce. They don't have some DNA of their own, and an @ISA array telling you which parent objects contain the rest of their DNA.

[I get nervous whenever people start drawing analogies and start pursuing them too strongly. Yes, this is how living organisms replicate... but we're not designing living organisms. A model is just supposed to represent a part of reality, not try to recreate reality itself. Having said that, though, there's definitely a lot to be said for classless languages (which don't necessarily have to be prototype-based, by the way, though it makes sense for them to be). Again, what I think makes the most sense here is a middle-of-the-road scenario combined with open classes. Objects belong to classes, but fully support runtime reification of types.]

The meta-object protocol for Perl 6 defaults to class-based, but is flexible enough to set up prototype-based objects as well. Some of you have played around with Moose in Perl 5. Moose is essentially a prototype of Perl 6's object model. On a semantic level, anyway. The syntax is a little different. Hopefully a little more natural in Perl 6.

passive data, global consistency / active data, local consistency

Your view of data and control will vary with how functional or object-oriented your brain is. People just think differently. Some people think mathematically, in terms of provable universal truths. Functional programmers don't much care if they strew implicit computation state throughout the stack and heap, as long as everything looks pure and free from side-effects.

Other people think socially, in terms of cooperating entities that each have their own free will. And it's pretty important to them that the state of the computation be stored with each individual object, not off in some heap of continuations somewhere.

Of course, some of us can't make up our minds whether we'd rather emulate the logical Sherlock Holmes or sociable Dr. Watson. Fortunately, scripting is not incompatible with either of these approaches, because both approaches can be made more approachable to normal folk.

[Or, don't choose at all, but combine as you need to, a la Scala or F#. By the way, objects are not "free willed" entities--they are intrinsically passive entities, waiting to be called, unless you bind a thread into their execution model, which then makes them "active objects" or sometimes called "actors" (not to be confused with the concurrency model Actors, such as Scala uses). So let's not get too hog-wild with that "individual object/live free or die" meme, not unless you're going to differentiate between active objects and passive objects. Which, I think, is a valuable thing to differentiate on, FWIW.]
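
The passive/active distinction is easy to show in a few lines. This is a minimal sketch in Python, assuming a thread plus a message queue as the execution model; PassiveCounter and ActiveCounter are hypothetical names, and a real active-object implementation would need error handling and a way to return results to callers.

```python
import queue
import threading

class PassiveCounter:
    """Passive: does work only on the thread of whoever calls it."""
    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n

class ActiveCounter:
    """Active: owns a thread; callers only drop messages in its inbox."""
    def __init__(self):
        self.total = 0
        self._inbox = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            n = self._inbox.get()
            if n is None:                 # sentinel: time to stop
                return
            self.total += n               # only the worker thread touches total

    def add(self, n):
        self._inbox.put(n)                # returns immediately

    def stop(self):
        self._inbox.put(None)
        self._worker.join()               # wait until the inbox is drained
        return self.total

counter = ActiveCounter()
for n in (1, 2, 3, 4):
    counter.add(n)
print(counter.stop())  # 10
```

The passive version has no will of its own, free or otherwise; the active one at least has its own thread of control.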

info hiding / scoping / attachment

And finally, if you're designing a computer language, there are a couple bazillion ways to encapsulate data. You have to decide which ones are important. What's the best way to let the programmer achieve separation of concerns?

object / class / aspect / closure / module / template / trait

You can use any of these various traditional encapsulation mechanisms.

transaction / reaction / dynamic scope

Or you can isolate information to various time-based domains.

process / thread / device / environment

You can attach info to various OS concepts.

screen / window / panel / menu / icon

You can hide info various places in your GUI. Yeah, yeah, I know, everything is an object. But some objects are more equal than others. [NO. Down this road lies madness, at least at the language level. A given application might choose to, for reasons of efficiency... but doing so is a local optimization, not something to consider at the language level itself.]

syntactic scope / semantic scope / pragmatic scope

Information can attach to various abstractions of your program, including, bizarrely, lexical scopes. Though if you think about it hard enough, you realize lexical scopes are also a funny kind of dynamic scope, or recursion wouldn't work right. A state variable is actually more purely lexical than a my variable, because it's shared by all calls to that lexical scope. But even state variables get cloned with closures. Only global variables can be truly lexical, as long as you refer to them only in a given lexical scope. Go figure.
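
A rough Python analogue of the difference may help here. Python has no state declarator, so the function attribute below is my stand-in for one, and the whole thing is an approximation of the Perl behavior being described, not a translation of it. The names tick and make_counter are invented for the example.

```python
def tick():
    """Stand-in for a `state` variable: one counter shared by every call."""
    tick.count += 1
    return tick.count

tick.count = 0

def make_counter():
    """Each call creates a fresh `count`; the closure captures that copy."""
    count = 0

    def bump():
        nonlocal count
        count += 1
        return count

    return bump

a = make_counter()
b = make_counter()
```

Two calls to tick() return 1 and then 2, because every call shares the one counter. With the closures, a() returns 1 and then 2 while b() starts over at 1, since each closure got its own clone of count.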

So really, most of our scopes are semantic scopes that happen to be attached to a particular syntactic scope.

[Or maybe scope is just scope.]

You may be wondering what I mean by a pragmatic scope. That's the scope of what the user of the program is storing in their brain, or in some surrogate for their brain, such as a game cartridge. In a sense, most of the web pages out there on the Internet are part of the pragmatic scope. As is most of the data in databases. The hallmark of the pragmatic scope is that you really don't know the lifetime of the container. It's just out there somewhere, and will eventually be collected by that Great Garbage Collector that collects all information that anyone forgets to remember. The Google cache can only last so long. Eventually we will forget the meaning of every URL. But we must not forget the principle of the URL. [This is weirdly Zen, and either makes no sense at all, or has a scope (pardon the pun) far outside of that of programming languages and is therefore rendered meaningless for this discussion, or he means something entirely different from what I'm reading.] That leads us to our next degree of freedom.

use Lingua::Perligata;

If you allow a language to mutate its own grammar within a lexical scope, how do you keep track of that cleanly? Perl 5 discovered one really bad way to do it, namely source filters, but even so we ended up with Perl dialects such as Perligata and Klingon. What would it be like if we actually did it right?

[Can it even be done right? Lisp had a lot of success here with syntactic macros, but I don't think they had scope attached to them the way Larry is looking at trying to apply here. Frankly, what comes to mind most of all here is the C/C++ preprocessor, and multiple nested definitions of macros. Yes, it can be done. It is incredibly ugly. Do not ask me to remember it again.]

Doing it right involves treating the evolution of the language as a pragmatic scope, or as a set of pragmatic scopes. You have to be able to name your dialect, kind of like a URL, so there needs to be a universal root language, and ways of warping that universal root language into whatever dialect you like. This is actually near the heart of the vision for Perl 6. We don't see Perl 6 as a single language, but as the root for a family of related languages. As a family, there are shared cultural values that can be passed back and forth among sibling languages as well as to the descendants.

I hope you're all scared stiff by all these degrees of freedom. I'm sure there are other dimensions that are even scarier.

But... I think it's a manageable problem. I think it's possible to still think of Perl 6 as a scripting language, with easy onramps.

And the reason I think it's manageable is because, for each of these dimensions, it's not just a binary decision, but a knob that can be positioned at design time, compile time, or even run time. For a given dimension X, different scripting languages make different choices, set the knob at different locations.

Somewhere in the universe, a budding programming language designer reads that last paragraph, thinks to himself, I know! I'll create a language where the programmer can set that knob wherever they want, even at runtime! Sort of like a "Option open_classes on; Option dispatch single; Option meta-object-programming off;" thing....

And with any luck, somebody will kill him before he unleashes it on us all.

Meanwhile, I just sit back and wonder, All this from the guy who proudly claimed that Perl never had a formal design to it whatsoever?


.NET | C++ | Java/J2EE | Languages | Ruby

Wednesday, January 09, 2008 9:35:49 PM (Pacific Standard Time, UTC-08:00)