Wednesday, August 22, 2007

knowledge labor is a great racket

Recently I had the pleasant experience of adding a tricky new "twist" to some software, but because of some previous design choices, I was able to introduce the twist with rapid, minimal effort. Later, it occurred to me that if someone walked by my desk and noticed a lack of typing, he or she might assume I was slacking off and no longer working on the twist! (To satisfy the nosy: I wasn't typing any more because I was reading some Web documentation about some configuration options.)

One good analogy to my situation is an old story I heard once (so I'll probably tell it wrong) about a customer and his auto mechanic. The mechanic looked closely at the customer's engine, fiddled with a few of the parts to check them out, then finally used one of his tools to make a single adjustment. The customer balked at the repair cost...the mechanic had barely done anything! The mechanic remarked that the cost wasn't just for making the adjustment, but for knowing where to make the adjustment.

How much do code monkeys and grease monkeys have in common?

Tuesday, August 21, 2007

lives effected by Hurricane Dean

Let me start out by saying that my sympathy goes out to all those in hurricane areas.

But reading "whose lives were effected by Hurricane Dean" in a real news article makes me chuckle. If you don't know why, search for "affect effect" on the Web to learn the distinction between the verbs "affect" and "effect".

If your life was effected by Hurricane Dean, your father is a blow-hard. (KA-ching!)

Tuesday, August 14, 2007

the typing debate is a red herring

Or: even more than before, dynamic vs. static variable typing is not the salient point of differentiation in programming language debates.

I've hinted at this previously, but my discovery of Jomo Fisher's blog on the more cutting-edge aspects of .Net reminded me of it. The casually-readable entry The Least You Need to Know about C# 3.0 describes multiple features which put the dynamic-static (compilation-runtime) distinction in a new light: the keyword "var" (for type-inferred or "anonymous type" variables), extension methods, and expression trees. These additions help make LINQ possible.

As .Net increases its explicit support for dynamic code (as opposed to dynamic code support through a Reflection API), the barrier between the capabilities of a "static" and a "dynamic" language keeps shrinking. If "expression tree" objects in .Net 3.5 allow someone to generate and execute a customized switch statement at runtime, then what we have is a solution with the flexibility of a dynamic language and the performance of a static language. Ideally, the smart folks working on porting dynamically-typed languages to statically-typed VM platforms would accomplish something similar in the implementation innards. The code that accomplishes this feat is a bit daunting, but it is cutting-edge, after all.
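To get a flavor of that flexibility without any .Net machinery, here's a minimal JavaScript sketch (JavaScript standing in for a dynamic language; buildDispatcher is a made-up helper, not a real API): a dispatch table assembled at runtime plays the role of a generated switch statement.

```javascript
// Hypothetical sketch: building a "switch" at runtime from data,
// much as an expression tree might be compiled on the fly.
function buildDispatcher(cases, fallback) {
  // cases: an object of { key: handler } pairs assembled at runtime
  return function (key, arg) {
    var handler = cases[key] || fallback;
    return handler(arg);
  };
}

var dispatch = buildDispatcher(
  {
    double: function (n) { return n * 2; },
    square: function (n) { return n * n; }
  },
  function () { return null; } // default branch
);

console.log(dispatch("double", 21)); // 42
console.log(dispatch("square", 5));  // 25
console.log(dispatch("unknown", 1)); // null
```

New cases can be added to the table while the program runs, which is the whole point: the "switch" is data, not code frozen at compile time.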

Some of those irascible Lisp users may be snickering at the "cutting-edge" label. As they should. I've read that Lisp implementations have had the ability to employ either static or dynamic typing for many years. Moreover, Lisp uses lists for both data and program structures, so it doesn't need a special expression tree object. It also has had a REPL that made the compilation-runtime distinction mushy before .Net was a gleam in Microsoft's eye. On the other hand Lisp is, well, Lisp. Moving along...

The way I see it (and echoing/elaborating what I have written before now), there are three reasons why the question of static and dynamic variable typing has always been, if not a red herring, at best a flawed litmus-test for language comparisons.
  1. The time when a variable's data type is set doesn't affect the clear fact that the data the variable refers to has one definite data type at runtime. Ruby and Python cheerleaders are fond of reminding others that their variables are strongly typed, thankyouverymuch! Here "strongly typed" means that the code doesn't perform inappropriate operations on data by silently applying coercion rules to one or more operands. The timeless example is how to evaluate 1 + "1". Should it be "11", 2, or Exception? Strongly-typed languages are more likely than weakly-typed languages to evaluate it as Exception (whether a static CompileException or a dynamic RuntimeException). So dynamic typing is separate from strong typing, precisely because variable typing, a part of the code, is different from data typing, which is what the code receives and processes in one particular way at runtime. Data is typed--even a null value has a type (in some languages, a null type all its own). Regardless of language, the next question after "what is the name and scope of a variable?" is "what can I do with the variable?", and its answer is tied to the type of data in the variable. In fact, this connection is how ML-like languages can infer the data type of a variable from what the code does with it. Similarly, Haskell's type classes appear to define a data type precisely by the operations it supports. No matter how strictly a variable's type is pinned down at compile time or run time, once the code executes, what matters is the actual data and its actual type.
  2. Programming languages are intangible entities until someone tries to use them, and therefore publicity (ideas about ideas) is of prime importance. One example is the stubborn insistence on calling a language "dynamic" instead of "scripting"; with one word programmers are working on active and powerful code (it's dynamite, it's like a dynamo!) while with the other word programmers are merely "writing scripts". Unfortunately, applying the word "dynamic" to an entire language/platform can also be misleading. Languages/platforms with static variable typing are by no means necessarily excluded from a degree of dynamism, apart from support for reflection or expression tree objects. Consider varargs, templates, casting (both up and down the hierarchy), runtime object composition/dependency injection, delegates, dynamically-generated proxies, DSL "little languages" (in XML or Lua or BeanShell or Javascript or...) parsed and executed at runtime by an interpreter written in the "big" language, map data structures, even linked lists of shifting size. The capabilities available to the programmer for creating dynamic, or perhaps true meta-programmatic, programs can be vital in some situations, but in any case it's too simplistic to assume static variable typing precludes dynamic programs. I don't often hear the Haskell cheerleaders complaining about a static-typing straitjacket (or is that because they're too busy trying to force their lazy expressions to lend a hand solving the problem?).
  3. OOP has been in mainstream use for a long time. I think it's uncontroversial to note the benefits (in maintainability and reuse) of breaking a program into separate units of self-contained functions and data called objects, and then keeping the interactions between the objects minimal and well-defined, especially for large, complex programs. This fundamental idea behind OOP is independent of variable typing, and also independent of object inheritance. Anyone with a passing familiarity with Smalltalk or Objective-C would agree. A language might allow one object to send a message to, or call a method on, any other object, with a defined fallback behavior if the object isn't able to handle the message. Or, it might not allow message-passing to be that open-ended. Maybe, for reasons of performance or error-checking, it has a mechanism to enforce that an object can handle all messages passed to it. This "message-passing contract" may be explicit in the code or inferred by the compiler. Most likely, if it has object inheritance it supports using a descendant object directly in place of one of its ancestor objects (Liskov substitution principle). My point is that OOP support may be intimately connected to a language's scheme for typing variables (Java), or it may not be (Perl). A tendency to confuse OOP with a typing system (as in "Java's too persnickety about what I do with my object variables! OOP must be for framework-writing dweebs!") is another way in which the typing question can lead to ignorant language debates.
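Point 1's timeless example is easy to run in JavaScript, which takes the weakly-typed route and coerces instead of complaining:

```javascript
// JavaScript is dynamically but weakly typed: it coerces rather than throw.
console.log(1 + "1");          // "11" -- the number is coerced to a string
console.log(typeof (1 + "1")); // "string"

// The data still has exactly one definite type at runtime, coercion or not:
var x = 1;   // x refers to a number...
x = "one";   // ...and now to a string; the *data* is always typed,
             // even though the *variable* never was.
console.log(typeof x); // "string"
```

A strongly-typed dynamic language like Python instead raises a TypeError for the same expression, which is exactly the strong/weak distinction, independent of static/dynamic.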
When faced with comparing one language to another, one important question is when (and if) the type of a variable gets set, and how strictly. However, that question is merely one of many.
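As for that open-ended message passing with a defined fallback, here's a minimal JavaScript sketch (the send helper and messageMissing hook are inventions for illustration, in the spirit of Smalltalk's doesNotUnderstand:):

```javascript
// Hypothetical sketch of open-ended message passing with a fallback.
function send(obj, message) {
  var args = Array.prototype.slice.call(arguments, 2);
  if (typeof obj[message] === "function") {
    return obj[message].apply(obj, args);     // the object handles the message
  }
  if (typeof obj.messageMissing === "function") {
    return obj.messageMissing(message, args); // defined fallback behavior
  }
  throw new Error("does not understand: " + message);
}

var greeter = {
  hello: function (name) { return "hello, " + name; },
  messageMissing: function (message) { return "no handler for " + message; }
};

console.log(send(greeter, "hello", "world")); // "hello, world"
console.log(send(greeter, "goodbye"));        // "no handler for goodbye"
```

Nothing here depends on variable typing or inheritance; the "contract" is just a runtime check, which is the point of the third item above.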

Friday, August 10, 2007

abstract algebra sometimes drives me nuts

The problem with learning new abstractions is that, from then on, the abstractions have the infuriating tendency to crop up everywhere. All someone must do is take a concrete idea, throw away some information, and recognize what's left. Applied geometry has some trivial examples. To compute a geometrical property of a ball, throw out such facts as: what color it is, the store it came from, its texture. Recognize the shape that's left, a sphere, and apply the geometrical formulas for spheres. There's nothing exotic about this process.

However, because shapes in the real world are so complex, geometrical abstractions don't break into my thoughts too often. Abstract algebra is worse, perhaps because its abstractions have minimal assumptions or requirements: sets, mappings, operations, groups, rings. Mathematicians and scientific theorists explicitly apply these concepts all the time, if only for classification.

But I don't want to keep thinking along those lines in everyday life. Specifically, I was thinking about changing the days of the week I water a plant, so the same number of days passes in between. Right now I water on Monday and Thursday, leaving one gap of two days and another gap of three days. (Yes, I realize this problem is most likely not worth the mental expenditure I'm describing--but what are hobbies for?). Mentally starting at Wednesday, I counted in three day increments until I reached Wednesday again...

...and after I was done I went on a long tangent looking up information about the cyclic group Z7, merely one in this list at wikipedia, but also an important group because it is a simple group. By associating each day of the week with an integer in the range 0-6 (like 0 := Sunday and then assigning progressively greater numbers in chronological order), and setting the group "multiplication" operation to be plain addition followed by modulo 7, the days of the week match that group perfectly. Although people might be a bit bewildered if you mention multiplying Monday by Friday (or Friday by Monday, abelian, baby!) to get Saturday [(1 + 5) modulo 7 = 6].

This group has no nontrivial subgroups (the trivial subgroup is just 0 alone, which is useless because I must assume that time is passing and I must water the plant more than once!). The lack of subgroups implies that no matter what interval of days in the range 1-6 I try, I'll end up having to include every day of the week in my watering schedule!

I mentioned before that I tried 3, which is Wednesday according to the mapping (isomorphism, whatever). Three days past Wednesday is Saturday [(3 + 3) modulo 7 = 6]. Three days past Saturday is Tuesday [(6 + 3) modulo 7 = 2]. Three days past Tuesday is Friday [(2 + 3) modulo 7 = 5]. Three days past Friday is Monday [(5 + 3) modulo 7 = 1]. Three days past Monday is Thursday [(1 + 3) modulo 7 = 4]. Three days past Thursday is Sunday [(4 + 3) modulo 7 = 0]. Finally, three days past Sunday is Wednesday [(0 + 3) modulo 7 = 3]. The cycle is complete, but all I have is the same group I started with--the entire week!
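The walk above is easy to verify mechanically. A quick JavaScript sketch, using the same 0 := Sunday mapping:

```javascript
// Stepping through Z7 with a fixed step size, starting at Wednesday (3).
var days = ["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"];

function cycle(start, step) {
  var visited = [], d = start;
  do {
    visited.push(days[d]);
    d = (d + step) % 7; // the group operation: addition modulo 7
  } while (d !== start);
  return visited;
}

console.log(cycle(3, 3));
// ["Wednesday","Saturday","Tuesday","Friday","Monday","Thursday","Sunday"]

// Because Z7 has no nontrivial subgroups, EVERY step size 1-6
// visits all seven days before returning to the start:
for (var s = 1; s <= 6; s++) {
  console.log(s, cycle(3, s).length); // always 7
}
```

Which is the whole disappointment in code form: no step size gives a proper subgroup, so no evenly-spaced schedule exists short of watering every day (or every week).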

I'm not sure what's worse: that all this pondering about abstract algebra didn't lead to any useful insights, or that I got caught up in doing it. I don't know how I manage to get anything done. Ghaah, I can be so nerdy even I can't stand it...

Thursday, August 09, 2007

willful ignorance

This account of people at OSCON dumping scorn on MySQL and PHP struck a chord with me, especially when the writer, a prominent Perl user, had the insight to compare it to when a Java fan scoffs "People still use Perl?" The "grass is always stupider on the other side" perspective can afflict anyone, going in either direction.

I doubt I need to cluck my tongue and admonish everyone to ignore the usual ideological conflicts and use what works. (The fallacy of identifying programmers with languages was the topic of a rant a while ago.) My new revelation is recognizing that one of the ingredients of robust technological snobbery is a disturbingly natural human trait: willful ignorance.

Willful ignorance is the compulsion to either 1) avoid learning something which could conflict with a truth someone is personally invested in, or 2) learn something just enough to find valid reasons for dismissing it. Willful ignorance is everywhere, because people not only want something to be true but need it to be true. Once someone's mental and emotional energy are mixed up with some piece of information, employing a mechanism to defend it is as instinctual as tensing up when a toothy predator pounces at him or her.

Interestingly, this compulsion also applies to people who pride themselves on their rationality; they need it to be true that rationality matters and works. For instance, someone who has the grating habit of correcting others' grammar to make it internally consistent may also be someone who both desperately wants to consider language as a logical system and to believe in the importance of logic for valid thinking. Such a person would be well-advised to keep willfully ignorant of the notion of language as a creative act of communication. Starting from the wrong axioms more or less guarantees the failure of rationality to reach the right conclusion. I remember a time when I couldn't figure out why some code under test never produced the expected test result. I took a short break after realizing the test had been wrong.

Therefore the proper antidote to willful ignorance is not rationality alone (rationality can mislead someone into excluding too many possibilities). But a wide-open mind isn't the antidote, either. Accepting information without discrimination leaves one open to each passing fad, or paralyzed with indecision. The best strategy is to make one's mind a centrifuge. Pour in whatever ideas are lying around, then stir and process and compare until the useful portions are separated out. If one of the challengers to your beliefs has value, isn't it better to figure that out as soon as possible? At the risk of getting off point, I respect the Java thinkers who heard about Rails and worked to apply some of its ideas to the Java world, instead of willfully ignoring Rails altogether or trying to immediately convert to it (thereby incurring significant costs).

The above was about willful ignorance. It doesn't apply to ignorance born out of a lack of time/energy. I haven't been able to look into Seaside at all, for that very reason.

Wednesday, August 08, 2007

design patterns: just relax

My feeds have been showing some more debate about design patterns. Are design patterns still applicable? Do design patterns lead to unnecessary complexity? Is the use of a design pattern an indication that a platform and/or programming language is inadequate? And on and on.

I suppose I just don't understand the hubbub. As I see it, a design pattern is a technique to solve one or more problems, by trading complexity for flexibility. The tricky part isn't recognizing a problem in existing code, and then refactoring it to use a relevant design pattern; the tricky part is deciding when (and if) it makes sense to apply a design pattern preventively. Maybe part of the reason some people hold their noses around Java code is because of their dislike for design pattern overuse.

Given that the goal of software development is to solve information problems, design patterns are organized descriptions of solutions. Solving a problem with a design pattern is fine. Solving a problem without a design pattern is even better, because that means the problem isn't one of the harder ones. Solving a problem inside a platform, framework, or language that makes the design pattern unnecessary (at least in some common cases) is the best, because the problem is solved on behalf of the programmer. Lastly, solving a difficult problem at runtime using some wild black-magic, while enjoyable, may demand careful organization and documentation to keep it maintainable (stuffing that genie into its own separate bottle, behind a good API, may be a good idea).
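As a concrete illustration of that third case: in a language with first-class functions, the classic Strategy pattern collapses to passing a plain function (a generic sketch, not tied to any particular framework):

```javascript
// In class-only languages, Strategy means an interface plus one class per
// algorithm. With first-class functions, the "pattern" is just an argument.
function sortBy(items, compare) {
  return items.slice().sort(compare); // strategy injected as a plain function
}

var byLength = function (a, b) { return a.length - b.length; };

var words = ["pear", "fig", "banana"];
console.log(sortBy(words, byLength)); // ["fig", "pear", "banana"]
console.log(words);                   // original untouched: ["pear", "fig", "banana"]
```

The problem the pattern solves (pluggable behavior) hasn't gone away; the language just solved it on behalf of the programmer.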

Design patterns: use them if you need to. In any case, learn them just in case they may be handy later, and also so you can communicate with other design pattern aficionados. A name for the "callbacks within callbacks design pattern" would have allowed me to end that post with a sentence that wasn't so lame.

Tuesday, August 07, 2007

actual news about the zap2it labs shutdown

For those who care, I noticed that the Schedules Direct site has updated its page! Schedules Direct is the zap2it labs replacement service, according to the MythTV wiki.

The gist appears to be that although the "free-as-in-speech" software ride is not over (of course), the "free-as-in-beer" tv listings ride is. In return for a DVR that does whatever I want it to (for standard-definition TV, anyhow, *sigh*), I'm willing to cough up some dollars. Already have, when I assembled the computer parts...

good language for internal DSLs

It's tiny, mature, and performs well. Its syntax is minimal and straightforward, but it still supports a number of fancy features. It's embeddable. It's portable. It's extensible through a mechanism somewhat like a meta-object protocol. Its C interface is remarkably convenient.

It's Lua. You didn't think I was describing Ruby, did you?

Monday, August 06, 2007

shifting programming metaphors

Someone at work has a habit of using "Fortran" as a synonym for "simple, imperative-style programming", regardless of the specific language: "Fortran Java", for instance, means clumping all code into the main method.

What makes this amusing is that Fortran hasn't been sitting still since he last used it. Even Fortran has objects now.

Saturday, August 04, 2007

callbacks within callbacks

Oh, the travails of an Ajax developer. In the beginning, the requirement was straightforward. Grab some data asynchronously by passing along an "update-display" callback function (a function with a 'data' parameter). The callback function could even be anonymous, or refer to variables in enclosing function scopes at the time of definition--nothing out of the ordinary.

Then someone has the nerve to change his mind about the interface, so now one display needs to show what is actually the patched-together result of several invocations of the same data call, all at once. At first glance, this is still easy--just fire off all the asynchronous requests required and fuhgeddaboutit, because each one has a callback to update only its part of the display.

Wrong! In this case, the parts of the display are ordered and the necessary extra data calls to fill in "missing" points aren't known until some of the data has been gathered, meaning the part of the display generated from call C must be finished after the part of the display generated from call B is finished, and so on. But all the calls are asynchronous, so there is no guarantee whatsoever of the time or order in which individual calls will return.

Flashbacks of semaphores and mutexes and Dining Philosophers dance before your eyes. No--that way lies madness. Since the display is ordered, better to just get all the data pieces, assemble them, then display the whole as if it had been from a single data call. Unfortunately, the "update-display" function can't immediately run after firing off the asynchronous calls, of course, because then it might run before all the data was gathered! So the "update-display" function must be in a callback of some kind. Yet if the calls are asynchronous, which callback should contain the overall "update-display" function? The last one to finish, of course. But which one will that be?

Puzzle it till your puzzler is sore, and one answer, not necessarily the best, may emerge. The time spent studying the techniques of functional programming may generate (yield?) a payoff. Consider the following almost-real code, in which a callback function takes its own function parameter.

function baseCallback(data, updateDisplayFn) {
  // innermost layer of the chain: just update the display
  var afterFunction = function (dataParm) {
    updateDisplayFn(dataParm);
  };

  function addAnotherAfterFunction(currentAfterFunction, patchingInfo) {
    // wrap the current chain in one more callback layer
    return function (dataParm) {
      var innerCallback = function (innerData) {
        // patch dataParm using innerData and patchingInfo
        // (the patch location is reachable through the closure)
        currentAfterFunction(dataParm);
      };
      // makeDataCall: the underlying asynchronous request (not shown)
      makeDataCall(patchingInfo, innerCallback);
    };
  }

  // process data in a loop; within the loop, whenever another data
  // call turns out to be needed to fill in a missing piece:
  //   afterFunction = addAnotherAfterFunction(afterFunction, patchingInfo);
  // after the data processing loop, run the fully-wrapped chain:
  afterFunction(data);
}

function dataCallInvoke(originalParams, updDispFn) {
  var normalCallback = function (dataParm) {
    baseCallback(dataParm, updDispFn);
  };
  makeDataCall(originalParams, normalCallback);
}
The idea is that addAnotherAfterFunction repeatedly operates, as needed, on the function in the afterFunction variable. One callback becomes embedded inside another callback inside another callback. When the outermost callback (the final contents of the afterFunction variable) runs, it makes a data call and passes in the next-inner callback. The data call runs, then executes the next-inner callback that was passed to it. The next-inner callback perhaps makes its own data call...and so on, until the "true original" callback, the "update-display" function, takes the fully-constructed data and does what it will.

One downside is that the invocations of the data calls can't run in parallel (although since order explicitly matters so much in this situation, parallel execution won't work anyway). Also, tossing around so many functions with their own individual closed-over scopes surely eats up some memory (although I wonder if some "copy-on-write" implementation could be a workaround). Honestly, I'm not sure what terminology describes this technique. Um, mumble, mumble, continuation-passing-style, mumble, combinator, mumble, monadic bind, er...