Thursday, June 05, 2008

when code simplicity is detrimental

I like code simplicity. Code is a composition of pieces or elements, and simplicity means keeping those to a minimum. Generally, simplicity confers greater productivity: quicker initial development, safer modifications and later additions, faster understanding for newcomers, easier debugging and profiling, and a smaller proportion of time spent on peripheral tasks. I suspect that the insightful application of simplicity is part of the huge skill disparity between exceptional programmers and the rest. An E. F. Schumacher quotation says it better:
Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius—and a lot of courage—to move in the opposite direction.
However, slavish attachment to code simplicity can backfire, producing solutions that make the task harder or that solve it wrongly.
  • NIH (Not Invented Here) syndrome. I doubt this term needs any more explanation. Simplicity may be one of its motivations, as in "The solutions that were Not Invented Here are all needlessly intricate for my use case(s); I'll circumvent the learning curve and integration headache by writing for myself just exactly what I need". I admit that occasionally such reasoning is valid, because the outside solutions truly are either too cumbersome or lacking. But NIH is prone to producing code (a custom web framework and/or, horror of horrors, another template language?) that combines breathtaking simplicity with the subtle bugs and limitations that its matured counterparts patched several versions ago.
  • Flattened "lightweight". Obviously, one of the best ways to keep code small and simple is to focus it on a minimal, carefully-selected set of features or capabilities, and maybe also sacrifice unwanted flexibility and reuse. Code written explicitly with that design decision is "lightweight". Lightweight code is often excellent at fulfilling limited requirements in a way that's easy to understand. Unfortunately, when requirements expand, the trickier situations that the lightweight code ignores could abruptly become more important. The good news is that several approaches virtually eliminate the lightweight/heavyweight dilemma: plugin-oriented code for using separate modules as needed, code (component) specialization for connecting up several expert "lightweights" to match a generalized "heavyweight", APIs and standardized cooperation mechanisms for combining code that's similar but has complementary pros and cons.
  • Security. This is one of the inherently trickiest domains of software development. Its experts tend to be tricksy and unconventional, no matter what color of hat they wear. Here more than elsewhere, simple thoughts are typically naive thoughts. Developers who aren't specialists shouldn't be attempting to create "fast, easy, elegant, novel" data protection algorithms. When security matters, simplicity is a much lower priority than proven best practices. The untrained are perilously likely to misjudge security weaknesses, and the cliché is accurate: a whole security scheme is no stronger than its weakest point.
  • Interoperability. Any successful communication depends on a mutually-understood encoding strategy. For code, whose "intelligence" is drastically limited, deviation from a chosen encoding strategy will result in communication failure in some way, although the failure might not be absolute or fatal, e.g., discarding just the unexpected symbols. Therefore, de-facto and de-jure standards for data encoding are important. But in the majority of situations, the entire depth and breadth of one or more standards really isn't strictly applicable or necessary. Thus, the temptation is to write or reuse code that simply handles the "core subset" of a communication standard. Sometimes this choice is justified, when all the communicators are exhaustively known and under tight control (within an intranet, for instance). Other times, this choice is short-sighted, because the code on the other end of the conversation may rely on exactly the complex parts of the standard that were left out for simplicity's sake. XML namespaces and Unicode are two common casualties. Fortunately, conforming XML parsers are everywhere and Unicode support is built right into most modern programming languages.
  • Scalability. As hardware's twin performance pushes of miniaturization and faster clock frequencies give way to the alternate path of workload parallelization (more cores, more sockets), ignoring code scalability in favor of simplicity becomes a less competitive option. Sure, the individual units of parallelized code are usually pretty simple. Depending on the architecture, the local memory available to each unit might be tiny in modern terms. The complexity arises in the efficient coordination of all the units and in the decomposition of an algorithm into parallel tasks. Another scalability realm is the deployment of code to multiple hosts, especially to high-traffic Web servers. Code that's not written toward this goal has the potential to hinder an even distribution of the user load.
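To make the interoperability point concrete, here is a minimal Python sketch of "core subset" handling next to the standard-aware alternative. The sample document, its `urn:example:...` namespaces, and the function names are all hypothetical and invented for illustration. The only library calls are Python's standard `re` and `xml.etree.ElementTree` modules and the built-in `bytes.decode`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document: the same local name "id" appears in two namespaces.
DOC = """<order xmlns="urn:example:shop" xmlns:ship="urn:example:shipping">
  <id>42</id>
  <ship:id>TRACK-7</ship:id>
</order>"""


def naive_ids(xml_text):
    """'Core subset' approach: a regex that treats namespace prefixes as noise."""
    return re.findall(r"<(?:\w+:)?id>([^<]*)</(?:\w+:)?id>", xml_text)


def order_id(xml_text):
    """Conforming parser: look the element up by its namespace-qualified name."""
    root = ET.fromstring(xml_text)
    return root.findtext("{urn:example:shop}id")


def ascii_only_decode(data):
    """'Core subset' text handling: silently discard any byte that is not ASCII."""
    return data.decode("ascii", errors="ignore")


if __name__ == "__main__":
    print(naive_ids(DOC))   # the order id and the shipping id are conflated
    print(order_id(DOC))    # only the shop's order id
    wire = "na\u00efve".encode("utf-8")
    print(ascii_only_decode(wire))  # non-ASCII bytes dropped without any error
    print(wire.decode("utf-8"))     # round-trips intact
```

Neither shortcut raises an error, which is the uncomfortable part: the simpler code quietly returns a wrong or lossy answer as soon as the other side uses the parts of the standard that were skipped.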
