JOB REFERRALS
    ON THIS PAGE
    ARCHIVES
    CATEGORIES
    BLOGROLL
    LINKS
    SEARCH
    MY BOOKS
    DISCLAIMER
 
 Tuesday, January 15, 2008
My Open Wireless Network

People visiting my house have commented from time to time on the fact that at my house, there's no WEP key or WPA password to get on the network; in fact, if you were to park your car in my driveway and open up your notebook, you can jump onto the network and start browsing away. For years, I've always shrugged and said, "If I can't spot you sitting in my driveway, you deserve the opportunity to attack my network." Fortunately, Bruce Schneier, author of the insanely-good-reading Crypto-Gram newsletter, is in the same camp as I:

My Open Wireless Network

Whenever I talk or write about my own security setup, the one thing that surprises people -- and attracts the most criticism -- is the fact that I run an open wireless network at home.

There's no password. There's no encryption. Anyone with wireless capability who can see my network can use it to access the internet.

To me, it's basic politeness. Providing internet access to guests is kind of like providing heat and electricity, or a hot cup of tea. But to some observers, it's both wrong and dangerous.

I'm told that uninvited strangers may sit in their cars in front of my house, and use my network to send spam, eavesdrop on my passwords, and upload and download everything from pirated movies to child pornography. As a result, I risk all sorts of bad things happening to me, from seeing my IP address blacklisted to having the police crash through my door.

While this is technically true, I don't think it's much of a risk. I can count five open wireless networks in coffee shops within a mile of my house, and any potential spammer is far more likely to sit in a warm room with a cup of coffee and a scone than in a cold car outside my house. And yes, if someone did commit a crime using my network the police might visit, but what better defense is there than the fact that I have an open wireless network? If I enabled wireless security on my network and someone hacked it, I would have a far harder time proving my innocence.

This is not to say that the new wireless security protocol, WPA, isn't very good. It is. But there are going to be security flaws in it; there always are.

I spoke to several lawyers about this, and in their lawyerly way they outlined several other risks with leaving your network open.

While none thought you could be successfully prosecuted just because someone else used your network to commit a crime, any investigation could be time-consuming and expensive. You might have your computer equipment seized, and if you have any contraband of your own on your machine, it could be a delicate situation. Also, prosecutors aren't always the most technically savvy bunch, and you might end up being charged despite your innocence. The lawyers I spoke with say most defense attorneys will advise you to reach a plea agreement rather than risk going to trial on child-pornography charges.

In a less far-fetched scenario, the Recording Industry Association of America is known to sue copyright infringers based on nothing more than an IP address. The accused's chance of winning is higher than in a criminal case, because in civil litigation the burden of proof is lower. And again, lawyers argue that even if you win it's not worth the risk or expense, and that you should settle and pay a few thousand dollars.

I remain unconvinced of this threat, though. The RIAA has conducted about 26,000 lawsuits, and there are more than 15 million music downloaders. Mark Mulligan of Jupiter Research said it best: "If you're a file sharer, you know that the likelihood of you being caught is very similar to that of being hit by an asteroid."

I'm also unmoved by those who say I'm putting my own data at risk, because hackers might park in front of my house, log on to my open network and eavesdrop on my internet traffic or break into my computers.

This is true, but my computers are much more at risk when I use them on wireless networks in airports, coffee shops and other public places. If I configure my computer to be secure regardless of the network it's on, then it simply doesn't matter. And if my computer isn't secure on a public network, securing my own network isn't going to reduce my risk very much.

Yes, computer security is hard. But if your computers leave your house, you have to solve it anyway. And any solution will apply to your desktop machines as well.

Finally, critics say someone might steal bandwidth from me. Despite isolated court rulings that this is illegal, my feeling is that they're welcome to it. I really don't mind if neighbors use my wireless network when they need it, and I've heard several stories of people who have been rescued from connectivity emergencies by open wireless networks in the neighborhood.

Similarly, I appreciate an open network when I am otherwise without bandwidth. If someone were using my network to the point that it affected my own traffic or if some neighbor kid was dinking around, I might want to do something about it; but as long as we're all polite, why should this concern me? Pay it forward, I say.

Certainly this does concern ISPs. Running an open wireless network will often violate your terms of service. But despite the occasional cease-and-desist letter and providers getting pissy at people who exceed some secret bandwidth limit, this isn't a big risk either. The worst that will happen to you is that you'll have to find a new ISP.

A company called Fon has an interesting approach to this problem. Fon wireless access points have two wireless networks: a secure one for you, and an open one for everyone else. You can configure your open network in either "Bill" or "Linus" mode: In the former, people pay you to use your network, and you have to pay to use any other Fon wireless network.

In Linus mode, anyone can use your network, and you can use any other Fon wireless network for free. It's a really clever idea.

Security is always a trade-off. I know people who rarely lock their front door, who drive in the rain (and, while using a cell phone), and who talk to strangers. In my opinion, securing my wireless network isn't worth it. And I appreciate everyone else who keeps an open wireless network, including all the coffee shops, bars and libraries I have visited in the past, the Dayton International Airport where I started writing this, and the Four Points Sheraton where I finished. You all make the world a better place.

I'll admit that he's gone to far greater lengths to justify the open wireless network than I; frankly, the idea that somebody might try to sit in my driveway in order to hack my desktop machine and store kitty porn on it had never occurred to me. I was always far more concerned that somebody might sit on my ISP's server, hack my desktop machine's IP from there and store kitty porn on it. Which is why, like Schneier, I keep any machine that's in my house as up to date as possible. Granted, that doesn't protect me against a zero-day exploit, but if an attacker is that determined to put kitty porn on my machine, I probably couldn't stop them from breaking down my front door while we're all at work and school and loading it on via a CD-ROM, either.

And, at least in my neighborhood, I can (barely) find the signal for a few other wireless networks that are wide open, too, so I know I'm not the only target of opportunity here.So the prospective kitty porn bandit has his choice of machines to attack, and frankly I'll take the odds of my machines being the more hardened targets over my neighbors' machines any day. (Remember, computer security is often an exercise in convincing the bad guy to go play in somebody else's yard. I wish it were otherwise, but until we have effective response and deterrence mechanisms, it's going to remain that way for a long time.)

I've known a lot of people who leave their front doors unlocked--my grandparents lived in rural Illinois for sixty some-odd years in the same house, leaving the front door pretty much unlocked all the time, and the keys to their cars in the drivers' side sun shade, and never in all that time did any seedy character "break in" to their home or steal their car. (Hell, after my grandfather died a few years ago, the kids--my mom and her siblings--descended on the place to get rid of a ton of the junk he'd collected over the years. I think they would have welcomed a seedy character trying to make off with the stuff at that point.)

Point is, as Schneier points out in the last paragraph, security is always a trade-off, and we must never lose sight of that fact. Remember, dogma is the root of all evil, and should never be considered a substitute for reasoned thought processes.

And meanwhile, friends, when you come to my house to visit, enjoy the wireless, the heat, and the electricity. If you're nice, we may even let you borrow chair for a while, too. :-)


Development Processes | Mac OS | Security | Windows

Tuesday, January 15, 2008 9:45:10 AM (Pacific Standard Time, UTC-08:00)
Comments [3]  | 
Commentary Responses: 1/15/2008 Edition

A couple of people have left comments that definitely deserve response, so here we go:

Glenn Vanderberg comments in response to the Larraysaywhut? post, and writes:

Interesting post, Ted ... and for the most part I agree with your comments.  But I have to ask about this one:

Actually, there are languages that do it even worse than COBOL. I remember one Pascal variant that required your keywords to be capitalized so that they would stand out. No, no, no, no, no! You don't want your functors to stand out. It's shouting the wrong words: IF! foo THEN! bar ELSE! baz END! END! END! END!

[Oh, now, that's just silly.]

Seriously?  You don't think Larry has a point there?  That's one of the primary things I always hated about Wirth's languages, for exactly the reason cited here.  Most real-world Pascal implementations relaxed that rule to recognize upper- and lowercase keywords, but he didn't learn, making the same horrible mistake in Modula-2 and Oberon.

Capitalized words draw your attention, and make it hard to see the real code in between.

Rather than disagree with him, I agree with Larry: uppercased keywords, in a language, are just SOOOO last-century. But so is line-numbering, declaration-before-use, and hailing recursion as a feature. It just seems silly to put this out there as a point of language design, when I can't imagine anyone, with the possible exception of the old COBOL curmudgeon in the corner ("In MY day, we wrote code without a shift key, and we LIKED it! Uphill, both ways, I tell you!"), thinks that uppercased keywords is a good idea.

As for Mr. Wirth, well, dude had some good ideas, but even Einstein had his wacky moments. Repeat after me, everybody: "Just because some guy is brilliant and turns out to be generally right doesn't mean we take everything he says as gospel". It's true for Einstein, it's true for Wirth, and it's true even for Dave Thomas (whom I am privileged to call friend, love deeply, and occasionally think is off his rocker... but I digress).

Actually, Glenn, I think case-sensitivity as a whole is silly. Let's face it, all ye who think that the C-family of languages have this one right, when's the last time you thought it was perfectly acceptable to write code like "int Int = 30;" ? Frankly, if anybody chose to overload based on case, I'd force them to maintain that same code for the next five years as punishment.

(I thought about ripping their still-beating hearts out of their chests instead, but honestly, having to live with the mess they create seems worse, and more fitting to boot.)

What's ironic, though, is that to be perfectly frank, I do exactly this with my SQL code, and it DOESN'T! SEEM! TO! SHOUT! to me AT! ALL! For some reason, this

SELECT name, age, favorite_language FROM programmers WHERE age > 25 AND favorite_language != 'COBOL';

just seems to flow pretty easily off the tongue. Err... eyeball. Whatever.

Meanwhile, 'Of Fibers and Continuations' drew some ire from Mark Murphy:

Frankly, this desire to accommodate the nifty feature of the moment smacks a great deal of Visual Basic, and while VB certainly has its strengths, coherent language design and consistent linguistic facilities is not one of them. It's played havoc with people who tried to maintain code in VB, and it's played hell with the people who try to maintain the VB language. One might try to argue that the Ruby maintainers are just Way Smarter than the Visual Basic maintainers, but I think that sells the VB team pretty short, having met some of them.

Conversely, I think you're selling the Ruby guys a bit short. And this is coming from a guy who's old enough to have written code in Visual Basic for DOS several years into his programming experience.

Wow. Next thing you know, Bruce Tate will be in here, talking about the "chuck the baby out the window" game he wrote for QuickBASIC. (True story.) And, FWIW, I too know the love of BASIC, although in this case I did QuickBasic (DOS) for a while, before it became known as QBasic, and Applesoft BASIC even before that. (Anybody else remember lo-res vs. hi-res graphics debates?) Ah, the sweet, sweet memories of PEEK and POKE and.... *shudder* Never mind.

[insert obligatory "get off my lawn!" reference here]

Get off my lawn, ya hooligan!

The death-knell for VB is widely considered to be the move from VB6 to VB.NET. In doing that, they changed significant quantities of the VB syntax. That's why there was so much hue and cry to keep maintaining VB6, because folk didn't want to take the time to port their zillions of lines of VB6 code.

Actually, much of that hue and cry was from a corner of the VB world that really just didn't want to learn something new. It turned out that most of the VB hue'ers and cry'ers were those who'd been hue'ing and cry'ing with every successive release of VB, and in the words of one very popular VB speaker and programmer, "If they don't want to come along, well, frankly, I think we're better off without 'em anyway."

Truthfully? VB seems to have move along just fine since. And, interestingly enough, since its transition to the CLR, VB has had a much stronger "core vision" to the language than it did for many years. I don't know if this is because the CLR helps them keep that vision clear, or if trying to keep up with C# is good intra-corporate competition, or what, but I haven't heard anywhere near the kinds of grousing about new linguistic changes in the two successive revisions of VB since VB.NET's release (VS 2005 and VS 2008) than I did prior to its move to the CLR.

The changes Ruby made in 1.9 had very little syntax impact (colons in case statements, and not much else, IIRC). Fibers, in particular, are just objects, supplied as part of the stock Ruby class library. I'm not aware of new syntax required to use fibers.

Grousing about a language adding to its standard class library seems a little weak. When Microsoft added new APIs to .NET when they released 3.0, I suspect you didn't bat an eye.

Oh, heavens, no. Quite the contrary--when .NET 3.0 shipped with WCF, Workflow and WPF in it, I was actually a little concerned, because the CLR's basic footprint is just ballooning like mad. How long before the CLR installation rivals that of the OS itself? Besides, this monolithic approach has its limitations, as the Java folks have discovered to their regret, and it's not too long before people start noticing the five or six different versions of the CLR all living on their machine simultaneously....

Let's be honest here--an API release is different from changing the execution model of the virtual machine, and that's partly what fibers do.

But of even more interest to this particular discussion, I wasn't really grousing about the syntax, or the addition of fibers, as I was pointing out that this is something that other platforms (notably Win32) has had before, and that it ended up being a "ho-hum, another subject I can safely ignore" topic for the world's programmers. That, and the interesting--and heretofore unrecognized, to me--link between fibers and coroutines and continuations.

In particular, grousing about how Language X adds something to its class library that duplicates a portion of something "baked into" Language Y seems really weak. Does this mean that once something is invented in a language, no other language is supposed to implement it in any way, shape, or form?

Heavens, no! Just like if you want to use objects, you're more than welcome to do so in C, or Pascal, or even assembly!

What if fibers weren't part of the Ruby 1.9 distribution, but rather were done by a third party and released as a popular gem? (I'm not sure if this would have been possible, as there may have been changes needed to the MRI to support fibers, but let's pretend for a moment.) Does this mean that nobody writing class libraries for any programming language are allowed to implement features that are "baked into" some other programming language?

Um... no: witness LINQ, stealing... *ahem* leveraging... a great deal of the concepts that are behind functional languages. Or the Win16 API (or the classic Mac OS API, or the Xt API, or ...), using object concepts from within the C language.

If so, C# should have never been created.

Huh?

Look, I have nothing against Ruby swiping ideas from another language. But let's not pretend that Ruby was built, from the ground up, as a functional language. The concepts that Ruby is putting forth in its 1.9 release are "bolted on", and will show the same leaks in the abstraction model as any other linguistic approach "bolted on" after the fact. This is a large part of the beef with generics in Java, with objects in C, with O/R-Ms, and so on. Languages choose, very precisely, which abstractions they want to make as first-class citizens, and usually when they try to add more of those concepts in after the fact, backwards compatibility and the choices they made earlier trip them up and create a suboptimal scenario. (Witness the various attempts to twist Java into a metaprogramming language: generics, AOP, and so on.)

Besides, if you're going to explore those features, why not go straight to the source? Since when has it become fashionable to discourage people from learning a new concept in the very environment where it is highlighted? Ruby is a phenomenal dynamic language (as is Lisp and Smalltalk, among others), and anybody who wants to grok dynamic languages should learn Ruby (and/or Lisp, and/or Smalltalk). Ditto for functional languages (Haskell and ML/OCaml being the two primary candidates in that camp).

Don't get me wrong -- I agree that there are way better languages for FP than Ruby, even with fibers. That's part of the reason why so many people are tracking JRuby and IronRuby, as having Ruby implementations on common VMs/LRs gives developers greater flexibility for mixing-and-matching languages to fit specific needs (JRuby/Scala/Groovy/Java on JVM, IronEverything/LotsOf# on CLR/DLR).

Which is the same thing I just said. Cool. :-)

I just think you could have spun this more positively and made the same points. The Rails team is having their hats handed to them over the past week or two; casting fibers as a "whither Ruby?" piece just feels like piling on.

Well, frankly, I don't track what's going on in the Rails space at all [and, to be honest, if one more programmer out there invents one more web framework that rhymes with "ails" in any way, so help me God I will SCREAM], so I can honestly say that I wasn't trying to "pile on". What I do find frustrating, however, is the general belief that Ruby is somehow God's Original Scripting Language, and that the Ruby community is constantly innovating while the rest of the programming world is staring on in drooling slack-jawed envy. Most of what Ruby does today is Old Hat to Smalltalkers, and I fully expect that PowerShellers will come along and find most of what the Ruby guys are doing to be interesting experiments in just how powerful the PSH environment really is.

Of deeper concern is the blending of "shell language" and "programming language" that Ruby seems to encourage; the only other language that I think really crosses that line is Perl, and honestly, that's not necessarily good company to be in on this score. When a language tries to hold fealty to too many masters, it loses coherence. Time will tell how well Ruby can chart that narrow course; to my mind, this is what ultimately doomed (and continues to dog) Perl 6.


.NET | C++ | Java/J2EE | Languages | Ruby | Windows

Tuesday, January 15, 2008 3:16:20 AM (Pacific Standard Time, UTC-08:00)
Comments [0]  | 
Java: "Done" like the Patriots, or "Done" like the Dolphins?

English is a wonderful language, isn't it? I'm no linguist, but from what little study I've made of other languages (French and German, so far), English seems to have this huge propensity, more so than the other languages, to put multiple meanings behind the same word. Consider, for example, the word "done": it means "completed", as in "Are you finished with your homework? Yes, Dad, I'm done.", or it can mean "wiped out, exhausted, God-just-take-me-now-please", as in "Good God, another open-source Web framework? That's it, I give up. I'm done. Code the damn thing in assembler for all I care."

So is Java "done" like the Patriots, a job well accomplished, or "done" like the Dolphins, the less said, the better?

(For those of you who are not American football fans, the New England Patriots have gone completely undefeated this season, a mark only set once before in the game's history, and the Miami Dolphins almost went completely unvictorious this season, a mark never accomplished. [Update: Hamlet D'Arcy points out, "Actually, a winless season has been accomplished before. Tampa Bay started their first two seasons winless with an overall 0-26 record before finally winning its first game in 1977." Thanks, Hamlet; my fact-checking on that one was lax, as I was trusting the commentary by a sportscaster during the Dolphins-Ravens game, and apparently his fact-checking was a tad lax, as well. :-)] The playoffs are still going on, but the Patriots really don't look beatable by any of the teams remaining. Meanwhile, the Dolphins managed to eke one out just before the season ended, posting a final record of 1-15, something reserved usually for new teams in the league, not a team with historical greatness behind them. And that's it for Sports, back to you in the studio, Tom.)

Bruce Eckel seems to suggest that Java is somewhere more towards Miami than New England, and that generics were the major culprit. (He also intimates that his criticism of generics has swayed Josh and Neal's opinions to now being critical of generics, something I highly doubt, personally. More on that later.) Now, I'll be the first to admit that I think generics in Java suck, and I've said this before, but the fact remains, no one feature can sink a language. Consider multiple inheritance in C++, something that Stroustrup himself admits (in Design and Evolution of C++) he did before templates or exceptions because he wanted to know how he could do it. Lots of people argued for years (decades, even) over MI and its inclusion in the language, and in the end....

... in the end MI turns out to be a useful feature of the language, but not the way anybody figured they would be. Ditto for templates, by the way. After looking at the Boost libraries, even just the basic examples using them, I feel like I'm looking at Sanskrit or something. As Scott Meyers put it once, "We're a long way from Stack-of-T here, folks."

And that is my principal complaint about generics: the fact that they aren't fully reified down into the JVM means that we lost 90% of the power of generics, and more importantly, we lost all of the emergent behavior and functionality that came out of C++ templates. Nothing new could come out of Java generics, because they were designed to do exactly what they were designed to do: give us type-safe collections. Whee. We're cooking with gas now, folks. Next thing you know, they'll give us printf() back, too.

(Oh, wait, they did that, too.)

Fact is, there's a lot of things that could be done to Java as a language to make it more attractive, but doing so risks that core element that Sun refuses to surrender, that of backwards compatibility. This was evident as far back as JavaPolis 2006, when I interviewed Neal and Josh on the subject; when asked, point-blank, why generics didn't "go all the way down", a la .NET generics do, they both basically said, "that would break backwards compatibility, and that was a core concern from the start". (I disagreed with them, off-camera, mind you, particularly on the grounds that the Collections library, the major source of concern around backwards compatibility, could have been ported over, but then Neal pointed out to me that it wasn't just the library itself but all the places it was used, particularly all those libraries outside of Sun, that was at stake. Perhaps, but I still believe that a happier middle ground could have been eked out.) That is still the message today, from what I can see of Neal's and Josh's public statements.

And the fact is, so far as it goes, Java generics are (ugh) useful. Useful solely as a Java compiler trick, perhaps, and far more verbose than we'd prefer, but useful nonetheless. Using them is about as exciting as using a new hammer, but they can at least get the job done.

There, I've made the obligatory "generics don't completely suck" disclaimer, and I'll be the first one to tell you, I just live with the warnings when I write Java code. Possibly that's because I don't worry too much about type-safe collections in my code, but I know lots of other programmers (particularly those on teams where the team composition isn't perhaps as strong as they'd like it to be) who do, and thus take the extra time to write their code to be generics-friendly and thus warning-free.

The mere fact that we have to work at it to create code that is "generics-friendly" is part of the problem here. For all those who came from C++ years ago, you'll know what I mean when I say that "Java generics are the new C++ const": Writing const-correct code was always a Good Thing To Do, it's just that it was also just such a Damn Hard Thing To Do. Which meant that nobody did it.

Languages should enable you to fall into the pit of success. That's the heart of the Principle of Least Surprise, even if it's not always said that way. (I'm not sure that C# 3 does this, time will tell. I'm reasonably certain that Ruby doesn't, despite the repeated insistence of Ruby advocates, many of whom I deeply respect. I'm nervous that Scala and F# will fall into this same trap, owing to their unusual syntax in places. It will be fun to see how ActionScript 3 turns out.)

Here's a thought: Let's leave Java where it is, and just start creating new JVM languages that cater to specific needs. You can call them Java, too, if you like. Or something else, like Scala or Clojure or Groovy or JRuby or CJ or whatever suits your fancy. Since everybody compiles down to JVM bytecode, it's all really academic--they're all Java, in some fundamental way. Which means that Java can thus rest easy, knowing that it fought the good fight, and that others equally capable are carrying on the tradition of JVM programming.

Eckel makes a good point:

Arguably one of the best features of C is that it hasn't changed at all for decades.

... which completely ignores some of the changes that were proposed and accepted for the C99 standard, but we'll leave that alone for now. The point is, the core C language now is the same core C language that I learned back in my high school days, and most, if not all, C code from even that far back will still compile under today's compilers. (Granted, there's likely to be a ton of warnings if you're using old "K-and-R" C, but the code will still compile.)

What about evolution, though? Don't languages need to evolve in order to stay relevant?

Consider the C case: C++ came along, made a whole bunch of changes to the language, but went zooming off in its own direction, to the point where a standards-compliant C++ compiler won't compile even relatively recent C code.

And how many people have complained about that?

By the way, if you're a C/C++ programmer and you haven't looked at D, you're about to get leaped on the evolutionary ladder again. Just an FYI.

As a matter of fact, if you're a Java or .NET programmer, you'd be well-advised to take a look at D, too. It's one of the more interesting native-compilation languages I've seen in a while, and yet arguably it's just what a C++ compiler author would come up with after studying Java and C# for a while (which, as far as I can tell, is exactly what happened). And because D can essentially mimic C bindings for dynamic libraries, it means that a Java guy can now write a JNI DLL in a garbage-collected language that (mostly) does away with pointer arithmetic for most of its work... just as Java did.

Heck, I'd love to see a D-for-the-JVM variant. And D-for-the-CLR, while we're at it. Just for fun.

Let's do this: somebody take the old, pre-Java5 javac source, and release it as "JWH" (short for Java Work Horse), and maintain it as a separate branch of the Java compiler. Then we can hack on the new Java5 language for years, maybe call it "JWNFF" (short for Java With New-Fangled Features), and everybody can get back to work without complaints.

Well, at least those who want to go back to work can do so; there'll always be people who'd rather complain than Get Stuff Done. *shrug*

Now, on the other hand, let's talk about the JVM, and specifically what needs to change there if the JVM platform is to be the workhorse of the 21st century like it was for the latter half of the last decade of the 20th....


.NET | C++ | Java/J2EE | Languages | Ruby

Tuesday, January 15, 2008 2:27:12 AM (Pacific Standard Time, UTC-08:00)
Comments [6]  | 
 Thursday, January 10, 2008
Of Fibers and Continuations

Dave explains Ruby fibers, as they're called in Ruby 1.9. Now, before I get going here, let me explain my biases up front: in the Windows world, we've had fibers for near on to half-decade, I think, and they're basically programmer-managed cooperative tasks. In other words, they're much like threads before threads were managed by the operating system--you decide when to switch to a different fiber, you manage the scheduling, the fiber just gives you a data structure and some basic housekeeping. (I know I'm oversimplifying and glossing over details, but that's the core, as I remember it. It's been a while since I tried to use them.) Legend has it that fibers were introduced into the Win32 API on behalf of the SQL Server team, who need to take that kind of control over thread scheduling in order to best manage the CPU, but here's the rub: they never served much purpose otherwise.

Frankly, nobody could figure out what to do with them. I'm beginning to wonder if it was because our languages of the time (C, C++) didn't have any real idea of freezing execution of a task at a certain point, putting it aside, then coming back to it and restoring it. In other words, the very behavior we see out of a continuation.

In Dave's explanation, Ruby fibers take on a different meaning. According to Dave's explanation:

A fiber is somewhat like a thread, except you have control over when it gets scheduled. Initially, a fiber is suspended. When you resume it, it runs the block until the block finishes, or it hits a Fiber.yield. This is similar to a regular block yield: it suspends the fiber and passes control back to the resume. Any value passed to Fiber.yield becomes the value returned by resume.

They sound a lot like Win32 fibers combined with Python generators, with a touch more by way of API support. (The Win32 API version was codified using C bindings, for starters, not objects.) But Dave quickly points out that fibers can become full-fledged coroutines by allowing fibers to transfer control from one to another, which is interesting, though I suspect lots of people will explore this feature and write lots of bad code as a result. Oh, well: bright shiny new toys have that effect on programmers sometimes.

He then goes on to describe how Ruby can provide pipelines:

As a starting point, let's write two fibers. One's a generator—it creates a list of even numbers. The second is a consumer. All it does it accept values from the generator and print them. We'll make the consumer stop after printing 10 numbers.

    evens = Fiber.new do
      value = 0
      loop do
        Fiber.yield value
        value += 2
      end
    end

    consumer = Fiber.new do
      10.times do
        next_value = evens.resume
        puts next_value
      end
    end

    consumer.resume

Note how we had to use resume to kick off the consumer. Technically, the consumer doesn't have to be a Fiber, but, as we'll see in a minute, making it one gives us some flexibility.

Ah, the classic producer-consumer example. Gotta love it. The interesting thing here, though, is that evens, prior to the call to resume, has done nothing. No execution has taken place. In essence, the fiber here is in deferred execution mode (now, where have I heard that before?), meaning nothing actually fires until asked for. It then runs until it hits the yield, essentially going to sleep again.

Is it me, or does this smell suspiciously like continuations?

More interesting, Dave goes on to define the consumer fiber to take the name of a source to resume, then shows how once can abstract the coupling between producer and consumer away even further by creating a filter that only allows multiples of three through the pipeline:

    def evens
      Fiber.new do
        value = 0
        loop do
          Fiber.yield value
          value += 2
        end
      end
    end

    def multiples_of_three(source)
      Fiber.new do
        loop do
          next_value = source.resume
          Fiber.yield next_value if next_value % 3 == 0
        end
      end
    end

    def consumer(source)
      Fiber.new do
        10.times do
          next_value = source.resume
          puts next_value
        end
      end
    end

    consumer(multiples_of_three(evens)).resume

Running this, we get the output

0
6
12
18
. . .

This is getting cool. We write little chunks of code, and then combine them to get work done. Just like a pipeline.

Actually, instead of calling it a pipeline, let's call it a comprehension and be done with it.

See, Ruby apparently has discovered the joys of functional programming, something that Scala and F# have baked in from the beginning, instead of bolted on from the outside. No offense intended to the Ruby community or to Matz, but I get a little lost as to exactly what Ruby's core concepts are--it's a scripting language, it's a development language, it's a DSL platform, it's object-oriented, it's functional, it's a bird, it's a plane, it's horribly confused.

Dave touches on this point in one of his responses to comments:

The thing that's interesting to me about Ruby in this context is how much is can bend into multiple paradigms. Haskell does FP way better than Ruby. Smalltalk does OO (marginally) better. But Ruby does them all, and in a way that interoperates nicely.

I like a lot of Ruby's core concepts--open classes, mixins, and so on--but I'm worried that Ruby's trying to do too much, much as another language I know and love is. Frankly, this desire to accommodate the nifty feature of the moment smacks a great deal of Visual Basic, and while VB certainly has its strengths, coherent language design and consistent linguistic facilities is not one of them. It's played havoc with people who tried to maintain code in VB, and it's played hell with the people who try to maintain the VB language. One might try to argue that the Ruby maintainers are just Way Smarter than the Visual Basic maintainers, but I think that sells the VB team pretty short, having met some of them.

Don't get me wrong here, I think it's nifty that Ruby has come around to realize the power of atomic components doing one thing well, passing its results on into the pipeline for something else to process, and this is a large part of why PowerShell is, in my mind, the sleeper programming language of 2008/2009. Pipelines also scale very well, since they encourage immutable state, since the results of each processing step are essentially fed in from the outside and the results are passed back out to the next step in the chain--all state is passed from one step to the next, meaning I can run lots of these pipelines in parallel with no fear of deadlocks or bottlenecks, since each processing step is itself essentially state-free. This is also, in fact, a lot of how original transaction-processing systems were designed, which also scaled pretty well, at least until we got the bright idea to store mutable state in them (*cough* EJB *cough*).

Oh, and for what it's worth, this concept is trivial to do in F#, via the pipeline operator ( "|>" ). Ditto for Scala. If you're going to think in pipelines, you may as well work with a language that has the concept baked in a little more deeply, IMHO. And before the Rubyists beat me over the head about this, Dave himself admits this is true in another comment response:

Paolo: I don't think Ruby or Smalltalk really do functional programming to any deep level. However, both can be used to implement particular FP constructs (such as generators).

And maybe, in the end, that's the important thing: recognizing what aspects of functional programming can be easily lifted into your language of choice and used to make your life simpler. Still, I'm always looking for languages that take the concepts that float in my head and let me express them as first-class constructs, not as duck-taped partial implementations thereof. I felt the same way about doing "objects" in C (back in the Win16 programming day, before C++ Windows frameworks emerged), and about doing "aspects" in Java using interception.

If you're going to think in a concept, you generally want a language that expresses that concept as a first-class citizen, or you'll get frustrated quickly. Ruby's fibers may be the gateway drug for developers to learn functional programming, but they're not going to get it at any deep level until they dive into Haskell or ML or one of its derivatives (Scala or F#). For example, once you see the power inherent in Scala's comprehensions, you never look at a simple for loop the same way again.

Oh, and Groovyists? I'm sure they could do this, but I dunno if it's worth it, given that Groovy and Scala, at some level, are fundamentally interoperable as well. (Note to self: must do a blog post about Groovy calling into Scala code, just to show it can be done. Y'all hold me to that, if you don't see it in a week or two.)

Meanwhile, the link between continuations and Ruby fibers (and Win32 fibers, while we're at it) still tickles at the back of my mind.... But that's a thought waiting to be explored another day.


.NET | Java/J2EE | Languages | Ruby

Thursday, January 10, 2008 5:28:00 AM (Pacific Standard Time, UTC-08:00)
Comments [5]  | 
 Wednesday, January 09, 2008
Larraysaywhut?

Larry Wall, (in)famous creator of that (in)famous Perl language, has contributed a few cents' worth to the debate over "scripting" languages:

I think, to most people, scripting is a lot like obscenity. I can't define it, but I'll know it when I see it.

Aside from the fact that the original quote reads "pornography" instead of "obscenity", I get what he's talking about. Finding a good definition for scripting is like trying to find a good definition for "object-oriented" or "service-oriented" or... come to think of it, like a lot of the terms that we tend to use on a daily basis. So I'm right there along with him, assuming that his goal here is to call out a workable definition for "scripting" languages.

Here are some common memes floating around:

    Simple language
    "Everything is a string"
    Rapid prototyping
    Glue language
    Process control
    Compact/concise
    Worse-is-better
    Domain specific
    "Batteries included"

...I don't see any real center here, at least in terms of technology. If I had to pick one metaphor, it'd be easy onramps. And a slow lane. Maybe even with some optional fast lanes.

I'm not sure where some of these memes come from, but some of them I recognize (Simple language, Rapid prototyping, glue language, compact/concise), some of them are new to me ("Everything is a string", process control), and some of them I seriously question the sanity of anybody suggesting them (worse-is-better, domain specific, "batteries included"). Fortunately he didn't include the "dynamically typed" or "loosely coupled" memes, which I hear tagged on scripting languages all the time.

But basically, scripting is not a technical term. When we call something a scripting language, we're primarily making a linguistic and cultural judgment, not a technical judgment. I see scripting as one of the humanities. It's our linguistic roots showing through.

I can definitely see the use of the term "scripting" as a term of value judgement, but I'm not sure I see the idea that scripting languages somehow demonstrate our linguistic roots.

We then are treated to one-sentence reviews of every language Larry ever programmed in, starting from his earliest days in BASIC, with some interesting one-liners scattered in there every so often:

On Ruby: "... a great deal of Ruby's syntax is borrowed from Perl, layered over Smalltalk semantics."

On Lisp: "Is LISP a candidate for a scripting language? While you can certainly write things rapidly in it, I cannot in good conscience call LISP a scripting language. By policy, LISP has never really catered to mere mortals. And, of course, mere mortals have never really forgiven LISP for not catering to them."

On JavaScript: "Then there's JavaScript, a nice clean design. It has some issues, but in the long run JavaScript might actually turn out to be a decent platform for running Perl 6 on. Pugs already has part of a backend for JavaScript, though sadly that has suffered some bitrot in the last year. I think when the new JavaScript engines come out we'll probably see renewed interest in a JavaScript backend." Presumably he means a new JavaScript backend for Perl 6. Or maybe a new Perl 6 backend for JavaScript.

On scripting langauges as a whole: "When I look at the present situation, what I see is the various scripting communities behaving a lot like neighboring tribes in the jungle, sometimes trading, sometimes warring, but by and large just keeping out of each other's way in complacent isolation."

Like the prize at the bottom of the cereal box, if you can labor through all of this, though, you get treated to one of the most amazing succinct discussions/point-lists of language design and implementation I've seen in a long while; I've copied that section over verbatim, though I annotate with my own comments in italics:

early binding / late binding

Binding in this context is about exactly when you decide which routine you're going to call for a given routine name. In the early days of computing, most binding was done fairly early for efficiency reasons, either at compile time, or at the latest, at link time. You still tend to see this approach in statically typed languages. With languages like Smalltalk, however, we began to see a different trend, and these days most scripting languages are trending towards later binding. That's because scripting languages are trying to be dwimmy (Do What I Mean), and the dwimmiest decision is usually a late decision because you then have more available semantic and even pragmatic context to work with. Otherwise you have to predict the future, which is hard.

So scripting languages naturally tend to move toward an object-oriented point of view, where the binding doesn't happen 'til method dispatch time. You can still see the scars of conflict in languages like C++ and Java though. C++ makes the default method type non-virtual, so you have to say virtual explicitly to get late binding. Java has the notion of final classes, which force calls to the class to be bound at compile time, essentially. I think both of those approaches are big mistakes. Perl 6 will make different mistakes. In Perl 6 all methods are virtual by default, and only the application as a whole can tell the optimizer to finalize classes, presumably only after you know how all the classes are going to be used by all the other modules in the program.

[Frankly, I think he leaves out a whole class of binding ideas here, that being the "VM-bound" notion that both the JVM and the CLR make use of. In other words, the Java language is early-bound, but the actual linking doesn't take place until runtime (or link time, as it were). The CLR takes this one step further with its delegates design, essentially allowing developrs to load a metadata token describing a function and construct a delegate object--a functor, as it were--around that. This is, in some ways, a highly useful marriage of both early and late binding.

[I'm also a little disturbed by his comments "only the application as a whole can tell the optimizer to finalize classes, presumably only after you know how all that classes are going to be used by all the other modules in the program. Since when can programmers reasonably state that they know how classes are going to be used by all the other modules in the program? This seems like a horrible set-you-up-for-failure point to me.]

single dispatch / multiple dispatch

In a sense, multiple dispatch is a way to delay binding even longer. You not only have to delay binding 'til you know the type of the object, but you also have to know the types of all rest of the arguments before you can pick a routine to call. Python and Ruby always do single dispatch, while Dylan does multiple dispatch. Here is one dimension in which Perl 6 forces the caller to be explicit for clarity. I think it's an important distinction for the programmer to bear in mind, because single dispatch and multiple dispatch are philosophically very different ideas, based on different metaphors.

With single-dispatch languages, you are basically sending a message to an object, and the object decides what to do with that message. With multiple dispatch languages, however, there is no privileged object. All the objects involved in the call have equal weight. So one way to look at multiple dispatch is that the objects are completely passive. But if the objects aren't deciding how to bind, who is?

Well, it's sort of a democratic thing. All the routines of a given name get together and hold a political conference. (Well, not really, but this is how the metaphor works.) Each of the routines is a delegate to the convention. All the potential candidates put their names in the hat. Then all the routines vote on who the best candidate is, and the next best, and the next best after that. And eventually the routines themselves decide what the best routine to call is.

So basically, multiple dispatch is like democracy. It's the worst way to do late binding, except for all the others.

But I really do think that's true, and likely to become truer as time goes on. I'm spending a lot of time on this multiple dispatch issue because I think programming in the large is mutating away from the command-and-control model implicit in single dispatch. I think the field of computation as a whole is moving more toward the kinds of decisions that are better made by swarms of insects or schools of fish, where no single individual is in control, but the swarm as a whole has emergent behaviors that are somehow much smarter than any of the individual components.

[I think it's a pretty long stretch to go from "multiple dispatch", where the call is dispatched based not just on the actual type of the recipient but the caller as well, to suggesting that whole "swarms" of objects are going to influence where the call comes out. People criticized AOP for creating systems where developers couldn't predict, a priori, where a call would end up, how will they react to systems where nondeterminism--having no real idea at source level which objects are "voting", to use his metaphor--is the norm, not the exception?]

eager evaluation / lazy evaluation

Most languages evaluate eagerly, including Perl 5. Some languages evaluate all expressions as lazily as possible. Haskell is a good example of that. It doesn't compute anything until it is forced to. This has the advantage that you can do lots of cool things with infinite lists without running out of memory. Well, at least until someone asks the program to calculate the whole list. Then you're pretty much hosed in any language, unless you have a real Turing machine.

So anyway, in Perl 6 we're experimenting with a mixture of eager and lazy. Interestingly, the distinction maps very nicely onto Perl 5's concept of scalar context vs. list context. So in Perl 6, scalar context is eager and list context is lazy. By default, of course. You can always force a scalar to be lazy or a list to be eager if you like. But you can say things like for 1..Inf as long as your loop exits some other way a little bit before you run into infinity.

[This distinction is, I think, becoming one of continuum rather than a binary choice; LINQ, for example, makes use of deferred execution, which is fundamentally a lazy operation, yet C# itself as a whole generally prefers eager evaluation where and when it can... except in certain decisions where the CLR will make the call, such as with the aforementioned delegates scenario. See what I mean?]

eager typology / lazy typology

Usually known as static vs. dynamic, but again there are various positions for the adjustment knob. I rather like the gradual typing approach for a number of reasons. Efficiency is one reason. People usually think of strong typing as a reason, but the main reason to put types into Perl 6 turns out not to be strong typing, but rather multiple dispatch. Remember our political convention metaphor? When the various candidates put their names in the hat, what distinguishes them? Well, each candidate has a political platform. The planks in those political platforms are the types of arguments they want to respond to. We all know politicians are only good at responding to the types of arguments they want to have...

[OK, Larry, enough with the delegates and the voting thing. It just doesn't work. I know it's an election year, and everybody wants to get in on the whole "I picked the right candidate" thing, but seriously, this metaphor is getting pretty tortured by this point.]

There's another way in which Perl 6 is slightly more lazy than Perl 5. We still have the notion of contexts, but exactly when the contexts are decided has changed. In Perl 5, the compiler usually knows at compile time which arguments will be in scalar context, and which arguments will be in list context. But Perl 6 delays that decision until method binding time, which is conceptually at run time, not at compile time. This might seem like an odd thing to you, but it actually fixes a great number of things that are suboptimal in the design of Perl 5. Prototypes, for instance. And the need for explicit references. And other annoying little things like that, many of which end up as frequently asked questions.

[Again, this is a scenario where smarter virtual machines and execution engines can help with this--in Java, for example, the JVM can make some amazing optimizations in its runtime compiler (a.k.a. JIT compiler) that a normal ahead-of-time compiler simply can't make, such as monomorphic interface calls. One area that I think he's hinting at here, though, which I think is an interesting area of research and extension, is that of being able to access the context in which a call is being made, a la the .NET context architecture, which had some limited functionality in the EJB space, as well. This would also be a good "middle-ground" for multi-dispatch, since now the actual dispatch could be done on the basis of the context itself, which could be known, rather than on random groups of objects that Larry's gathered together for an open conference on dispatching the method call.... I kid, I kid.]

limited structures / rich structures

Awk, Lua, and PHP all limit their composite structures to associative arrays. That has both pluses and minuses, but the fact that awk did it that way is one of the reasons that Perl does it differently, and differentiates ordered arrays from unordered hashes. I just think about them differently, and I think a lot of other people do too.

[Frankly, none of the "popular" languages really has a good set-based first-class concept, whereas many of the functional languages do, and thanks to things like LINQ, I think the larger programming world is beginning to see the power in sets and set projections. So let's not limit the discussion to associative arrays; yes, they're useful, but in five years they'll be useful in the same way that line-numbered BASIC and use of the goto keyword can still be useful.]

symbolic / wordy

Arguably APL is also a kind of scripting language, largely symbolic. At the other extreme we have languages that eschew punctuation in favor of words, such as AppleScript and COBOL, and to a lesser extent all the Algolish languages that use words to indicate blocks where the C-derived languages use curlies. I prefer a balanced approach here, where symbols and identifiers are each doing what they're best at. I like it when most of the actual words are those chosen by the programmer to represent the problem at hand. I don't like to see words used for mere syntax. Such syntactic functors merely obscure the real words. That's one thing I learned when I switched from Pascal to C. Braces for blocks. It's just right visually.

[Sez you, though I have to admit my own biases agree. As with all things, though, this can get out of hand pretty quickly if you're not careful. The prosecution presents People's 1, Your Honor: the Perl programming langauge.]

Actually, there are languages that do it even worse than COBOL. I remember one Pascal variant that required your keywords to be capitalized so that they would stand out. No, no, no, no, no! You don't want your functors to stand out. It's shouting the wrong words: IF! foo THEN! bar ELSE! baz END! END! END! END!

[Oh, now, that's just silly.]

Anyway, in Perl 6 we're raising the standard for where we use punctuation, and where we don't. We're getting rid of some of our punctuation that isn't really pulling its weight, such as parentheses around conditional expressions, and most of the punctuational variables. And we're making all the remaining punctuation work harder. Each symbol has to justify its existence according to Huffman coding.

Oddly, there's one spot where we're introducing new punctuation. After your sigil you can add a twigil, or secondary sigil. Just as a sigil tells you the basic structure of an object, a twigil tells you that a particular variable has a weird scope. This is basically an idea stolen from Ruby, which uses sigils to indicate weird scoping. But by hiding our twigils after our sigils, we get the best of both worlds, plus an extensible twigil system for weird scopes we haven't thought of yet.

[Did he just say "twigil"? As in, this is intended to be a serious term? As in, Perl wasn't symbol-heavy enough, so now they're adding twigils that will hide after sigils, with maybe forgils and fivegils to come in Perl 7 and 8, respectively?]

We think about extensibility a lot. We think about languages we don't know how to think about yet. But leaving spaces in the grammar for new languages is kind of like reserving some of our land for national parks and national forests. Or like an archaeologist not digging up half the archaeological site because we know our descendants will have even better analytical tools than we have.

[Or it's just YAGNI, Larry. Look, if your language wants to have syntactic macros--which is really the only way to have langauge extensibility without having to rewrite your parser and lexer and AST code every n number of years, then build in syntactic macros, but really, now you're just emulating LISP, that same language you said wasn't for mere mortals, waaaay back there up at the top.]

Really designing a language for the future involves a great deal of humility. As with science, you have to assume that, over the long term, a great deal of what you think is true will turn out not to be quite the case. On the other hand, if you don't make your best guess now, you're not really doing science either. In retrospect, we know APL had too many strange symbols. But we wouldn't be as sure about that if APL hadn't tried it first.

[So go experiment with something that doesn't have billions of lines of code scattered all across the planet. That's what everybody else does. Witness Gregor Kiczales' efforts with AspectJ: he didn't go and modify Java proper, he experimented with a new language to see what AOP constructs would fit. And he never proposed AspectJ as a JSR to modify core Java. Not because he didn't want to, mind you, I know that this was actively discussed. But I also know that he was waiting to see what a large-scale AOP system looked like, so we could find the warts and fix them. The fact that he never opened an AspectJ JSR suggests to me that said large-scale AOP system never materialized.]

compile time / run time

Many dynamic languages can eval code at run time. Perl also takes it the other direction and runs a lot of code at compile time. This can get messy with operational definitions. You don't want to be doing much file I/O in your BEGIN blocks, for instance. But that leads us to another distinction:

declarational / operational

Most scripting languages are way over there on the operational side. I thought Perl 5 had an oversimplified object system till I saw Lua. In Lua, an object is just a hash, and there's a bit of syntactic sugar to call a hash element if it happens to contain code. Thats all there is. [Dude, it's the same with JavaScript/ECMAScript. And a few other langauges, besides.] They don't even have classes. Anything resembling inheritance has to be handled by explicit delegation. That's a choice the designers of Lua made to keep the language very small and embeddable. For them, maybe it's the right choice.

Perl 5 has always been a bit more declarational than either Python or Ruby. I've always felt strongly that implicit scoping was just asking for trouble, and that scoped variable declarations should be very easy to recognize visually. Thats why we have my. It's short because I knew we'd use it frequently. Huffman coding. Keep common things short, but not too short. In this case, 0 is too short.

Perl 6 has more different kinds of scopes, so we'll have more declarators like my and our. But appearances can be deceiving. While the language looks more declarative on the surface, we make most of the declarations operationally hookable underneath to retain flexibility. When you declare the type of a variable, for instance, you're really just doing a kind of tie, in Perl 5 terms. The main difference is that you're tying the implementation to the variable at compile time rather than run time, which makes things more efficient, or at least potentially optimizable.

[The whole declarational vs operational point here seems more about type systems than the style of code; in a classless system, a la JavaScript/ECMAScript, objects are just objects, and you can mess with them at runtime as much as you wish. How you define the statements that use them, on the other hand, is another axis of interest entirely. For example, SQL is a declarational language, really more functional in nature (since functional languages tend to be declarational as well), since the interpreter is free to tackle the statement in any sub-clause it wishes, rather than having to start from the beginning and parse right. There's definitely greater distinctions waiting to be made here, IMHO, since there's still a lot of fuzziness in the taxonomy.]

immutable classes / mutable classes

Classes in Java are closed, which is one of the reasons Java can run pretty fast. In contrast, Ruby's classes are open, which means you can add new things to them at any time. Keeping that option open is perhaps one of the reasons Ruby runs so slow. But that flexibility is also why Ruby has Rails. [Except that Ruby now compiles to the JVM, and fully supports open classes there, and runs a lot faster than the traditional Ruby interpreter, which means that either the mutability of classes has nothing to do with the performance of a virtual machine, or else the guys working on the traditional Ruby interpreter are just morons compared to the guys working on Java. Since I don't believe the latter, I believe that the JVM has some intrinsic engineering in it that the Ruby interpreter could have--given enough time and effort--but simply doesn't have yet. Frankly, from having spelunked the CLR, there's really nothing structurally restricting the CLR from having open classes, either, so long as the semantics of modifying a class structure in memory were well understood: concurrency issues, outstanding objects, changes in method execution semantics, and so on.]

Perl 6 will have an interesting mix of immutable generics and mutable classes here, and interesting policies on who is allowed to close classes when. Classes are never allowed to close or finalize themselves, for instance. Sorry, for some reason I keep talking about Perl 6. It could have something to do with the fact that we've had to think about all of these dimensions in designing Perl 6.

class-based / prototype-based

Here's another dimension that can open up to allow both approaches. Some of you may be familiar with classless languages like Self or JavaScript. Instead of classes, objects just clone from their ancestors or delegate to other objects. For many kinds of modeling, it's actually closer to the way the real world works. Real organisms just copy their DNA when they reproduce. They don't have some DNA of their own, and an @ISA array telling you which parent objects contain the rest of their DNA.

[I get nervous whenever people start drawing analogies and start pursuing them too strongly. Yes, this is how living organisms replicate... but we're not designing living organisms. A model is just supposed to represent a part of reality, not try to recreate reality itself. Having said that, though, there's definitely a lot to be said for classless languages (which don't necessarily have to be prototype-based, by the way, though it makes sense for them to be). Again, what I think makes the most sense here is a middle-of-the-road scenario combined with open classes. Objects belong to classes, but fully support runtime reification of types.]

The meta-object protocol for Perl 6 defaults to class-based, but is flexible enough to set up prototype-based objects as well. Some of you have played around with Moose in Perl 5. Moose is essentially a prototype of Perl 6's object model. On a semantic level, anyway. The syntax is a little different. Hopefully a little more natural in Perl 6.

passive data, global consistency / active data, local consistency

Your view of data and control will vary with how functional or object-oriented your brain is. People just think differently. Some people think mathematically, in terms of provable universal truths. Functional programmers don't much care if they strew implicit computation state throughout the stack and heap, as long as everything looks pure and free from side-effects.

Other people think socially, in terms of cooperating entities that each have their own free will. And it's pretty important to them that the state of the computation be stored with each individual object, not off in some heap of continuations somewhere.

Of course, some of us can't make up our minds whether we'd rather emulate the logical Sherlock Holmes or sociable Dr. Watson. Fortunately, scripting is not incompatible with either of these approaches, because both approaches can be made more approachable to normal folk.

[Or, don't choose at all, but combine as you need to, a la Scala or F#. By the way, objects are not "free willed" entities--they are intrinsically passive entities, waiting to be called, unless you bind a thread into their execution model, which then makes them "active objects" or sometimes called "actors" (not to be confused with the concurrency model Actors, such as Scala uses). So let's not get too hog-wild with that "individual object/live free or die" meme, not unless you're going to differentiate between active objects and passive objects. Which, I think, is a valuable thing to differentiate on, FWIW.]

info hiding / scoping / attachment

And finally, if you're designing a computer language, there are a couple bazillion ways to encapsulate data. You have to decide which ones are important. What's the best way to let the programmer achieve separation of concerns?

object / class / aspect / closure / module / template / trait

You can use any of these various traditional encapsulation mechanisms.

transaction / reaction / dynamic scope

Or you can isolate information to various time-based domains.

process / thread / device / environment

You can attach info to various OS concepts.

screen / window / panel / menu / icon

You can hide info various places in your GUI. Yeah, yeah, I know, everything is an object. But some objects are more equal than others. [NO. Down this road lies madness, at least at the language level. A given application might choose to, for reasons of efficiency... but doing so is a local optimization, not something to consider at the language level itself.]

syntactic scope / semantic scope / pragmatic scope

Information can attach to various abstractions of your program, including, bizarrely, lexical scopes. Though if you think about it hard enough, you realize lexical scopes are also a funny kind of dynamic scope, or recursion wouldn't work right. A state variable is actually more purely lexical than a my variable, because it's shared by all calls to that lexical scope. But even state variables get cloned with closures. Only global variables can be truly lexical, as long as you refer to them only in a given lexical scope. Go figure.

So really, most of our scopes are semantic scopes that happen to be attached to a particular syntactic scope.

[Or maybe scope is just scope.]

You may be wondering what I mean by a pragmatic scope. That's the scope of what the user of the program is storing in their brain, or in some surrogate for their brain, such as a game cartridge. In a sense, most of the web pages out there on the Internet are part of the pragmatic scope. As is most of the data in databases. The hallmark of the pragmatic scope is that you really don't know the lifetime of the container. It's just out there somewhere, and will eventually be collected by that Great Garbage Collector that collects all information that anyone forgets to remember. The Google cache can only last so long. Eventually we will forget the meaning of every URL. But we must not forget the principle of the URL. [This is weirdly Zen, and either makes no sense at all, or has a scope (pardon the pun) far outside of that of programming languages and is therefore rendered meaningless for this discussion, or he means something entirely different from what I'm reading.] That leads us to our next degree of freedom.

use Lingua::Perligata;

If you allow a language to mutate its own grammar within a lexical scope, how do you keep track of that cleanly? Perl 5 discovered one really bad way to do it, namely source filters, but even so we ended up with Perl dialects such as Perligata and Klingon. What would it be like if we actually did it right?

[Can it even be done right? Lisp had a lot of success here with syntactic macros, but I don't think they had scope attached to them the way Larry is looking at trying to apply here. Frankly, what comes to mind most of all here is the C/C++ preprocessor, and multiple nested definitions of macros. Yes, it can be done. It is incredibly ugly. Do not ask me to remember it again.]

Doing it right involves treating the evolution of the language as a pragmatic scope, or as a set of pragmatic scopes. You have to be able to name your dialect, kind of like a URL, so there needs to be a universal root language, and ways of warping that universal root language into whatever dialect you like. This is actually near the heart of the vision for Perl 6. We don't see Perl 6 as a single language, but as the root for a family of related languages. As a family, there are shared cultural values that can be passed back and forth among sibling languages as well as to the descendants.

I hope you're all scared stiff by all these degrees of freedom. I'm sure there are other dimensions that are even scarier.

But... I think its a manageable problem. I think its possible to still think of Perl 6 as a scripting language, with easy onramps.

And the reason I think its manageable is because, for each of these dimensions, it's not just a binary decision, but a knob that can be positioned at design time, compile time, or even run time. For a given dimension X, different scripting languages make different choices, set the knob at different locations.

Somewhere in the universe, a budding programming language designer reads that last paragraph, thinks to himself, I know! I'll create a language where the programmer can set that knob wherever they want, even at runtime! Sort of like a "Option open_classes on; Option dispatch single; Option meta-object-programming off;" thing....

And with any luck, somebody will kill him before he unleashes it on us all.

Meanwhile, I just sit back and wonder, All this from the guy who proudly claimed that Perl never had a formal design to it whatsoever?


.NET | C++ | Java/J2EE | Languages | Ruby

Wednesday, January 09, 2008 9:35:49 PM (Pacific Standard Time, UTC-08:00)
Comments [2]  |