Lair of the Gecko: Moving Up, Part 2 (or Language Evolution)

In yesterday's Part 1, I expounded on the differences between most current development environments (at least in my experience) and a theoretical ideal, which comes remarkably close to the reality of Smalltalk.

Watching the development of computer languages, I've become increasingly convinced that they do not -- and perhaps cannot -- evolve. (Obviously, when speaking of computer language evolution, I'm using the term as a shorthand for "incrementally adding changes between versions," so don't get hung up on semantics.)

At work, we've started looking at moving our codebase up from C# 1.1 to C# 2.0. It's eventually going to be a great change, and we're really looking forward to using the new features (plus, frankly, it'll be required for any future Windows development, so it's not as if we have much choice anyway). Considering the revisions we're going to have to make to our existing code, though, has bolstered a hypothesis that I've been cultivating for a few years now: computer languages don't evolve, in the sense that new versions are not expansions of the previous versions, but are merely additions to them.

Perl 6, for example, is not Perl 5 Plus More. In fact, it appears that it will really be Perl in name (and spirit?) only, but not in actual practice. Perl 5 and Perl 6 will only speak to each other through a translator. Likewise, early info on Ruby 2 indicates that it may be significantly different from Ruby 1. Ditto all the Java versions, the C# versions, and most other languages.

The hypothesis can be broadly generalized as follows:

When you set out to upgrade a computer language, what you're really doing is designing a whole new language inspired by the original.

In this sense, new language versions are really remakes of the original rather than sequels. Don't get me wrong, there's really nothing wrong with this! Perl 6 and Parrot look like they could be a great platform. The new Java and C# versions have added some great new tools to the tool chest.

So, if this is a valid theory, the next question is why it seems to be true. To answer that question, my next hypothesis can be stated thusly:

Languages which are not self-descriptive cannot evolve.

Natural languages are self-descriptive. Take English, for instance. The grammatical rules that govern the English language are themselves written in English. All the words in the dictionary are defined in English. When English assimilates a new word or concept from another language, it is redefined in English. When English needed a convenient term for "the spirit and energy of a time period," it found zeitgeist in German, redefined it in English, and started happily using it whenever the need arose.

Most computer languages are different. When C# needed generics, changes had to be made to the CLR, to the C# syntax, to the compiler, etc. In essence, a new language had to be created that was based on C#, but which supported the new features.

On the other hand, computer languages which are self-describing, like Lisp and Smalltalk, don't have to be replaced in order to support new features; they merely have to be added to. They can actually be used to expand themselves, in much the same way that natural languages can. In and of themselves, Smalltalk and Lisp (especially Scheme) are really small languages which have been expanded to form large, robust language environments. This isn't something that's possible for externally-defined languages. [1]
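
To make "used to expand themselves" a little more concrete, here is a rough sketch in Python (not Smalltalk, and not how any real Smalltalk is implemented). It imitates the Smalltalk idea that a conditional is just a message sent to a boolean object, and then adds a new control structure, unless, as ordinary library code with no change to the interpreter. All of the names here (Bool, TrueObject, FalseObject, if_true_if_false, unless) are made up for illustration.

```python
class Bool:
    """Hypothetical base class: control flow is an ordinary method call."""

    def if_true_if_false(self, then_block, else_block):
        raise NotImplementedError


class TrueObject(Bool):
    def if_true_if_false(self, then_block, else_block):
        # The "true" object runs the first block and ignores the second.
        return then_block()


class FalseObject(Bool):
    def if_true_if_false(self, then_block, else_block):
        # The "false" object runs the second block and ignores the first.
        return else_block()


TRUE, FALSE = TrueObject(), FalseObject()


def unless(self, block):
    """A new control structure, defined only in terms of the existing one."""
    return self.if_true_if_false(lambda: None, block)


# "Upgrading the language" is just adding a method to an existing class.
Bool.unless = unless

print(TRUE.if_true_if_false(lambda: "yes", lambda: "no"))
print(FALSE.unless(lambda: "ran because the receiver was false"))
```

The point is only that the addition lives at the same level as the thing it extends, which is the property I'm attributing to Smalltalk and Lisp.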

Yes, they're all Turing complete, so yes, you can implement any of the features in any of the languages (eventually). But we don't tend to do that. Why not? I suppose the obvious answer is performance, probably combined with a lot of cultural tradition in the language communities, but maybe a large part of the answer is that most languages tend to follow a sort of punctuated-equilibrium approach: needed features get approximated via the existing language until enough changes have accumulated to justify re-encoding them in the underlying defining language and calling it a new version of LanguageX.
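
As an illustration of "approximated via the existing language," here is a minimal sketch (in Python, purely as a stand-in) of the kind of workaround people write when a language has no generics: a collection that checks its element type by hand at run time. TypedList and its methods are hypothetical names of my own, not anything from the .NET libraries.

```python
class TypedList:
    """Hypothetical stand-in for a generic list: the element type is
    enforced by a run-time check rather than by the language itself."""

    def __init__(self, item_type):
        self._item_type = item_type
        self._items = []

    def append(self, item):
        # The check a compiler with generics would do for us.
        if not isinstance(item, self._item_type):
            raise TypeError(
                f"expected {self._item_type.__name__}, "
                f"got {type(item).__name__}"
            )
        self._items.append(item)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


names = TypedList(str)
names.append("gecko")

try:
    names.append(42)  # wrong type: rejected at run time, not compile time
except TypeError as error:
    print("rejected:", error)
```

It works, but only as an approximation: the mistake is caught when the program runs rather than when it is compiled, and that gap is the sort of thing that eventually gets pushed down into the defining language.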

That approach has advantages, and it's certainly the dominant approach in the market, but it's not really what I would call computer language evolution. Nevertheless, it is instructive to note the long, relatively unbroken lines representing Smalltalk and Lisp in the language timeline.

[1] From what I've seen and read, C# 3.0 is actually intended to be bytecode compatible with C# 2.0. No new foundational changes are supposed to be necessary. Instead, the new features are based on features added to 2.0, so it really is a kind of evolution, which could set C# apart as the first mainstream language to become self-descriptive.

Friday, January 20, 2006
