Tuesday, August 14, 2007

the typing debate is a red herring

Or: now more than ever, dynamic vs. static variable typing is not the salient point of differentiation in programming language debates.

I've hinted at this previously, but my discovery of Jomo Fisher's blog on the more cutting-edge aspects of .Net reminded me of it. The casually readable entry The Least You Need to Know about C# 3.0 describes multiple features which put the dynamic-static (compilation-runtime) distinction in a new light: the keyword "var" (for type-inferred variables, which in turn makes "anonymous types" usable), extension methods, and expression trees. These additions help make LINQ possible.
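To make those features concrete, here's a minimal sketch of my own (a toy, not code from Fisher's entry) showing "var", an anonymous type, and an extension method cooperating; the Shout method and the other names are invented for illustration:

    using System;
    using System.Linq;

    static class StringExtensions
    {
        // An extension method: bolts Shout() onto every string.
        public static string Shout(this string s)
        {
            return s.ToUpper() + "!";
        }
    }

    class Demo
    {
        static void Main()
        {
            // "var" is compile-time type inference, not dynamic typing:
            // words is statically a string[].
            var words = new[] { "static", "dynamic", "herring" };

            // An anonymous type: the compiler invents a type with Word and
            // Length, and "var" is the only way to declare a variable of it.
            var report = words.Select(w => new { Word = w, Length = w.Length });

            foreach (var item in report)
                Console.WriteLine(item.Word.Shout() + ": " + item.Length);
        }
    }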

As .Net increases its explicit support for dynamic code (as opposed to dynamic code support through a Reflection API), the barrier between the capabilities of a "static" and a "dynamic" language keeps shrinking. If "expression tree" objects in .Net 3.5 allow someone to generate and execute a customized switch statement at runtime, then what we have is a solution with the flexibility of a dynamic language and the performance of a static language. Ideally, the smart folks working on porting dynamically-typed languages to statically-typed VM platforms would accomplish something similar in the implementation innards. The code that accomplishes this feat is a bit daunting, but it is cutting-edge, after all.
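I won't paste that daunting code here, but a much smaller sketch conveys the trick: assemble an expression tree as ordinary data at runtime, then compile it to IL and execute it. This toy builds arithmetic rather than a customized switch statement, and the names are mine, but the mechanism is the same:

    using System;
    using System.Linq.Expressions;

    class ExpressionTreeDemo
    {
        static void Main()
        {
            // Build (x, y) => x * y + 1 as a data structure at runtime...
            ParameterExpression x = Expression.Parameter(typeof(int), "x");
            ParameterExpression y = Expression.Parameter(typeof(int), "y");
            BinaryExpression body = Expression.Add(
                Expression.Multiply(x, y),
                Expression.Constant(1));
            Expression<Func<int, int, int>> tree =
                Expression.Lambda<Func<int, int, int>>(body, x, y);

            // ...then compile it to real IL and run it: the flexibility of a
            // dynamic language, static-language speed once compiled.
            Func<int, int, int> f = tree.Compile();
            Console.WriteLine(f(6, 7)); // prints 43
        }
    }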

Some of those irascible Lisp users may be snickering at the "cutting-edge" label. As they should. I've read that Lisp implementations have had the ability to employ either static or dynamic typing for many years. Moreover, Lisp uses lists for both data and program structures, so it doesn't need a special expression tree object. It has also had a REPL that made the compilation-runtime distinction mushy before .Net was a gleam in Microsoft's eye. On the other hand, Lisp is, well, Lisp. Moving along...

The way I see it (echoing and elaborating on what I've written before), there are three reasons why the question of static vs. dynamic variable typing has always been, if not a red herring, at best a flawed litmus test for language comparisons.
  1. The time when a variable's data type is set doesn't affect the plain fact that the data the variable refers to has one definite type at runtime. Ruby and Python cheerleaders are fond of reminding others that their variables are strongly typed, thankyouverymuch! Here "strongly typed" means the code doesn't perform inappropriate operations on data by silently applying coercion rules to one or more operands. The timeless example is how to evaluate 1 + "1". Should it be "11", 2, or Exception? Strongly-typed languages are more likely than weakly-typed languages to evaluate it as Exception (whether a static CompileException or a dynamic RuntimeException); the first sketch after this list shows how one statically-typed language answers. So dynamic typing is separate from strong typing, precisely because variable typing, a part of the code, is different from data typing, which governs what the code receives and processes at runtime. Data is typed, even null values, for which the type is also null. Regardless of language, the next question after "what is the name and scope of this variable?" is "what can I do with it?", and the answer is tied to the type of the data in the variable. In fact, this connection is how ML-like languages can infer the data type of a variable from what the code does with it. Similarly, Haskell's type classes appear to define a data type precisely by the operations it supports. No matter how strictly a variable's type is pinned down at compile time or run time, once the code executes such considerations are distractions from what the actual data and its type really are.
  2. Programming languages are intangible entities until someone tries to use them, and therefore publicity (ideas about ideas) is of prime importance. One example is the stubborn insistence on calling a language "dynamic" instead of "scripting"; with one word programmers are working on active and powerful code (it's dynamite, it's like a dynamo!) while with the other word programmers are merely "writing scripts". Unfortunately, applying the word "dynamic" to an entire language/platform can also be misleading. Languages/platforms with static variable typing are by no means excluded from a degree of dynamism, even apart from support for reflection or expression tree objects. Consider varargs, templates, casting (both up and down the hierarchy), runtime object composition/dependency injection, delegates, dynamically-generated proxies, DSL "little languages" (in XML or Lua or BeanShell or Javascript or...) parsed and executed at runtime by an interpreter written in the "big" language, map data structures, even linked lists of shifting size; the second sketch after this list shows one humble specimen. The capabilities available for creating dynamic, or perhaps truly meta-programmatic, programs can be vital in some situations, but in any case it's too simplistic to assume static variable typing precludes dynamic programs. I don't often hear Haskell cheerleaders complaining about a static-typing straitjacket (or is that because they're too busy trying to force their lazy expressions to lend a hand solving the problem?).
  3. OOP has been in mainstream use for a long time. I think it's uncontroversial to note the benefits (in maintainability and reuse) of breaking a program into separate units of self-contained functions and data called objects, and then keeping the interactions between the objects minimal and well-defined, especially for large, complex programs. This fundamental idea behind OOP is independent of variable typing, and also independent of object inheritance. Anyone with a passing familiarity with Smalltalk or Objective-C would agree. A language might allow one object to send a message to, or call a method on, any other object, with a defined fallback behavior if the object can't handle the message. Or it might not allow message-passing to be that open-ended. Maybe, for reasons of performance or error-checking, it has a mechanism to enforce that an object must be able to handle every message passed to it. This "message-passing contract" may be explicit in the code or inferred by the compiler; the third sketch after this list makes one such contract explicit. Most likely, if the language has object inheritance, it supports using a descendant object directly in place of one of its ancestors (the Liskov substitution principle). My point is that OOP support may be intimately connected to a language's scheme for typing variables (Java), or it may not be (Perl). A tendency to confuse OOP with a typing system (as in "Java's too persnickety about what I do with my object variables! OOP must be for framework-writing dweebs!") is another way the typing question can lead to ignorant language debates.
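To ground point 1, here is how C#, a language with static variable typing, answers the timeless 1 + "1" question. The snippet is my own toy, but the behavior is standard C#, and its choice of coercion is a nice reminder that static typing and strong typing are separate axes:

    using System;

    class OnePlusOne
    {
        static void Main()
        {
            // C# coerces the int to a string and concatenates: prints "11".
            // Static variable typing, yet a weakly-typed answer here.
            Console.WriteLine(1 + "1");

            // Stricter answers live elsewhere: the line below would be a
            // compile-time error, and Python raises TypeError at runtime.
            // int n = 1 + "1";
        }
    }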
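For point 2, a sketch of garden-variety dynamism under static typing: a map from strings to delegates, with behavior chosen at runtime even though every variable's type is pinned at compile time (the command names and numbers are purely illustrative):

    using System;
    using System.Collections.Generic;

    class RuntimeDispatch
    {
        static void Main(string[] args)
        {
            // Behavior selected at runtime, types checked at compile time.
            var commands = new Dictionary<string, Func<int, int>>
            {
                { "double", n => n * 2 },
                { "square", n => n * n }
            };

            string name = args.Length > 0 ? args[0] : "double";
            Func<int, int> command;
            if (commands.TryGetValue(name, out command))
                Console.WriteLine(command(21));
            else
                Console.WriteLine("unknown command: " + name);
        }
    }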
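And for point 3, the "message-passing contract" made explicit in a static language: the interface guarantees at compile time that every receiver handles the message, and Liskov substitution lets any implementer stand in (IQuacker and friends are invented names for illustration):

    using System;

    // The contract: anything claiming to be an IQuacker must handle Quack().
    interface IQuacker
    {
        string Quack();
    }

    class Duck : IQuacker
    {
        public string Quack() { return "Quack!"; }
    }

    class RobotDuck : IQuacker
    {
        public string Quack() { return "BEEP."; }
    }

    class Pond
    {
        // Liskov substitution: any IQuacker works here, checked at compile time.
        static void Listen(IQuacker q)
        {
            Console.WriteLine(q.Quack());
        }

        static void Main()
        {
            Listen(new Duck());
            Listen(new RobotDuck());
        }
    }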
When faced with comparing one language to another, one important question is when (and if) the type of a variable gets set, and how strictly. However, that question is merely one of many.
