Friday, March 28, 2008

a link to the past

Discovering Delphi. Just marvel at its design sense.

I found the link on the Object Pascal entry at Wikipedia. I arrived at that entry after noticing that Delphi is in the top 10 at TIOBE.

Wednesday, March 26, 2008

Comments on Head First OOAD -or- The Fanboy Speaketh

A while back, I read Head First Design Patterns almost solely because the sample chapter I downloaded was so effective at making the Decorator pattern seem obvious, to the immediate benefit of understanding how Java I/O streams wrap around one another. I became a fan of the "Head First" approach, although I stopped short of buying the design patterns poster. My second "Head First" book, Head First Object-Oriented Analysis and Design, hasn't dampened my enthusiasm. As you would probably guess, this book is about the up-front tasks of software development: gathering feature lists and requirements and use cases, decomposing larger tasks into smaller ones, focusing on eliminating overall risk, determining how to divide a process among objects, writing tests, and ensuring the code is written with reuse and extension in mind.
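That stream-wrapping insight is easy to show in code. Here's a minimal sketch of my own (not taken from the book): each java.io class below decorates the one it wraps, adding behavior without touching the underlying byte source.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class DecoratorDemo {
    // Reads the first line of a byte source through a chain of java.io
    // decorators, each wrapping the previous one.
    static String firstLine(byte[] data) {
        InputStream raw = new ByteArrayInputStream(data);        // concrete component: the byte source
        InputStream buffered = new BufferedInputStream(raw);     // decorator: adds buffering
        Reader decoded = new InputStreamReader(buffered, StandardCharsets.UTF_8); // decorator: bytes -> chars
        BufferedReader lines = new BufferedReader(decoded);      // decorator: adds readLine()
        try {
            return lines.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLine("hello streams".getBytes(StandardCharsets.UTF_8)));
    }
}
```

The point the sample chapter made so well: the caller composes whatever stack of behaviors it needs at construction time, and every layer still looks like a plain stream.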

The style of these books is so jarring that it needs a rationale. Hence the books use the intro partly for that very explanation and partly to advise the reader on how to respond. My opinion on the Head First formula is mixed. For dull nonfiction, expressing information visually is more memorable and expressive than words alone; developers who've never read a "Head First" book still make diagrams at work. Exercises and thoughtful questions are good for keeping the reader engaged. Has any other book ever asked so many questions? (Such a strategy is reminiscent of the well-known study technique of raising and answering mental questions about the text while reading it.) I also appreciate that the text describing images/code is in nearby snippets with clearly-pointing arrows, like the "word balloons" of a comic strip or an application tooltip. Each snippet is thereby closely associated with its object, trimmed down to an inviting size, and yet stimulates sustained, active attention because the eye must rove to keep up.

Nevertheless, the layout has its downsides. Sometimes, conversational writing is too conversational. I prefer terser, denser language. On the other hand, the readability and redundancy ensure that the intended point is unambiguously communicated, and the "there are no Dumb Questions" Q&A sections are excellent--from time to time the questions matched my own misgivings perfectly. The humor flops more often than not, but I realize a failed joke is still more interesting than no joke. Packing fewer of the repetitive images into each page, or shrinking the images, would leave more room to introduce ideas. Smaller font sizes in some cases would also conserve space, averting the occasional sensation that the reader disproportionately paid for a lot of blankness. Widespread images, varying fonts, and free-form positioning enhance both enjoyment and impact; however, here and there these books sprint into flying leaps off the deep end. I imagine OOP books are especially suitable for illustrations.

Regardless of style, the content is superb, as the quoted reviews mention. It's not advanced or exhaustive...it has a grab-bag appendix that swiftly summarizes extra subjects like the design patterns book's appendix of extra patterns. Its explicit purpose is to show and tell how to use a collection of procedures and guidelines to make great software. Some might argue that all professional developers should already know (have been taught?) this material. While that's true, that also presumes too much about the "education" of a "professional" "developer". Taking some Java classes doesn't imply that someone also has good grounding in OOAD, nor that someone groks the goal of and reasons behind object orientation. At the same time, OO development in languages besides Java can benefit from this knowledge.

As the title indicates, this isn't a "Java book", even though its code is Java-only. As with Head First Design Patterns, the topics are in context because vital underlying principles persist throughout. Chapter 8's sole focus is design principles. I wasn't expecting a mention of the Liskov Substitution Principle, but there it was. (What I expected even less was the clue in the end-of-chapter crossword calling Liskov "he"!) Other fundamentals included encapsulation, inheritance, polymorphism, delegation, composition, aggregation, DRY, cohesion, the single-responsibility principle, interfaces, coupling, and design patterns.

Again, some might argue that those concepts should be common knowledge (or, worse, some might argue the concepts are common sense). While that's true, the genius of "Head First" is intermixing principles and prolonged scenarios. Instead of generally introducing delegation and then proceeding to walk through a piece of code specifically contrived for that purpose, "Head First" is more likely to start with a simple yet sufficient situation. After organically producing working but naively-designed code, the situation changes, and as a result the code must be reconsidered. At this point, the problems in the original code become apparent, maybe with some references to previously-covered principles. Then delegation as both a general concept and specific solution comes into the picture and improves the code. This method works best when the first-time reader isn't skipping around the book. It also means that the summaries and bullet points at the end of each chapter are better for reference than the chapter itself. The fully-descriptive table of contents aids in reference, too.
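To make that delegation step concrete, here's a hypothetical sketch (the class and method names are mine, not the book's): the container stops comparing fields itself and hands the question to the object that owns them.

```java
public class DelegationDemo {
    // Hypothetical spec object: it knows how to compare itself,
    // so callers no longer need to inspect its fields.
    static class GuitarSpec {
        private final String builder;
        private final String model;
        GuitarSpec(String builder, String model) {
            this.builder = builder;
            this.model = model;
        }
        boolean matches(GuitarSpec other) {
            return builder.equals(other.builder) && model.equals(other.model);
        }
    }

    static class Guitar {
        private final GuitarSpec spec;
        Guitar(GuitarSpec spec) { this.spec = spec; }
        // Delegation: Guitar forwards the question to its spec instead of
        // duplicating field-by-field comparison logic.
        boolean matches(GuitarSpec wanted) { return spec.matches(wanted); }
    }

    public static void main(String[] args) {
        Guitar g = new Guitar(new GuitarSpec("Gibson", "Les Paul"));
        System.out.println(g.matches(new GuitarSpec("Gibson", "Les Paul")));
    }
}
```

When a new field is added to the spec, only GuitarSpec.matches changes; the naive version would have forced every comparing caller to change too.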

Thus, this book has a strong practical bent along with its OOAD generalities. Its stated, oft-repeated emphasis is to make great software. And great software does more than function. It meets the customer's needs and supports change and reuse. To the degree that a technical book has a major theme, its theme is change. Requirements change. Designs change. Objects change, though each one should change for only one reason. Another book might have code that's perfect when it appears, or portrayed as perfect. The truth is that OOAD doesn't lead to one right solution for all time. In fact, a frequent side comment in the exercise answers, perhaps inserted to reassure perfectionists, is "don't worry if you may not have come up with this exact solution, but yours should have contained _____ ". Readers who become frustrated with the multiple revisions of the designs and code are missing the point. They may also simply be doing what I did with the inventory search example: anticipating the eventual shape of the code in later chapters. (My thinking at the time was "What?! I would have used an interface to achieve looser coupling, because there's no need to assume anything more about the shape of the object(s). I guess I'll just smugly keep my superior design in mind, and continue reading." I was gratified when mine showed up many pages afterward. The fictional customer forced a design shift.)
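The interface-based version I had in mind looks roughly like this (a hypothetical sketch, not the book's actual code): the search method depends only on an interface, so it assumes nothing about the shape of the concrete items.

```java
import java.util.ArrayList;
import java.util.List;

public class SearchDemo {
    // The search code depends only on this interface, not on any concrete
    // item class -- the coupling stays loose.
    interface Matchable {
        boolean matches(String keyword);
    }

    static class Guitar implements Matchable {
        private final String model;
        Guitar(String model) { this.model = model; }
        public boolean matches(String keyword) { return model.contains(keyword); }
    }

    // Works unchanged for any future Matchable: mandolins, banjos, whatever
    // the fictional customer asks for next.
    static List<Matchable> search(List<Matchable> inventory, String keyword) {
        List<Matchable> hits = new ArrayList<>();
        for (Matchable item : inventory) {
            if (item.matches(keyword)) hits.add(item);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Matchable> inv = new ArrayList<>();
        inv.add(new Guitar("Stratocaster"));
        inv.add(new Guitar("Telecaster"));
        System.out.println(search(inv, "Strat").size());
    }
}
```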

The two "Head First" books I have contain sizable amounts of real, minimal code despite being about rather high-level, abstract parts of software development. Since unapplied principles don't matter, connecting principles and code and "real" pictured entities is a reasonable way to teach them. Look past the goofy pictures and one-liners. A relative lack of information density is offset by a dogged determination to engage and inform, to spread practical and usable knowledge. If the table of contents has items someone would like to know better, I'd recommend learning those items via this book. Chances are, by the end he or she will have absorbed unexpected tips that will prevent future development pain...

Tuesday, March 18, 2008

define maintainable

Here's a John Lam quote from the end of here:
Finally, why should .NET developers bother with Ruby? According to Lam: "You spend less time writing software than you spend maintaining software. Optimizing for writing software versus maintaining software is probably the wrong thing to do. Static typing makes it harder to maintain software because it's harder to change it."
What makes this quote so noteworthy is how its definition of the word "maintainable" is apparently different from the definition used by static typing proponents. They would say that static typing makes it easier to maintain software because the static types document the code (some people might even characterize static types as metadata), enable additional checking of the code by the compiler, enable automatic refactoring, and generally make code easier to read/optimize/execute at the expense of making it harder to write.

The static typing cheerleaders allege that dynamic typing optimizes for writing software versus maintaining software. But based on this quote from John Lam, dynamic typing cheerleaders allege the same against static typing.

Does "maintainable" refer to preserving maximum code flexibility, or does it refer to preserving maximum code clarity (through static types, low dynamism, simple concepts)? The answer is both. Flexibility versus clarity is just one of the design trade-offs canny developers mull every day.

UPDATE: I feel obligated to comment on the linked article title. "No Borg-like release train for Ruby on .Net" doesn't make any sense. Borg have no releases! Borg assimilate. I know that it's a Register-related website and the chances are good that readers know that "Borg" is a tee-hee term for (the non-open-source portions of?) Microsoft, but c'mon, at least don't use the Borg label when it isn't appropriate.

Monday, March 17, 2008

life is a metaphor for video games

It happens to me every time I pick up a quick lunch from a fast-burger franchise. I think "hey, this reminds me of Monolith Burger....hee hee hee". Then when I go back to my cubicle, I think, "hey, this reminds me of the cubicle maze on Pestulon, but that one had people walking around cracking whips". Then when I go home and watch some show that has killer robots, I think, "hey, this reminds me of being hunted by Arnoid".

And at that point I'm quite grateful that in my everyday life making one little mistake won't result in bloody death or encasement in lime gelatin.

Thursday, March 13, 2008

not only n00bs model data

Prelude

It seems to me that there are two measures for how "successful" a blogger is. One is regular readership. Like the ratings for a TV show, the box office receipts for a movie, or the number of copies sold for a book, the size of the audience is solid proof of a work's attractiveness. The second is the amount of discussion or "buzz" a blog provokes, which is more visibly obvious for a blog than for other media because a blog is part of a network of hyperlinks. (Of course, sometimes works in other media primarily aim for the second measure, like "Oscar bait" movies that contain few of the usual elements most casual moviegoers want.) These two measures may or may not be orthogonal: lots of comments and trackbacks might expose a blog to a broader pool of regular readers, but a blog with one resonant post might not maintain any lasting popularity unless it routinely delivers similar posts, while a "comfortably inane" blog might combine a set of loyal subscribers with a competent focus on uncontroversial topics. I don't think one of the two measures is necessarily better, but then again I like to both learn new ideas and let my old ideas be challenged.

Yegge's blog seems to succeed based on either measure, despite breaking one of the Laws of Blogging: posts should be short, cohesive, and maybe even a bit brusque (well, I suppose he nevertheless has "brusque" down pat). He has a gift for taking the usual programming arguments that blogs beat to death, expressing each side in colorful metaphors, then working step-by-step to his own conclusion through a conversational writing style that emphasizes narratives and examples over premises. I've been provoked into linking and responding to his blog several times before, but am I done? Oh, no. Here I go again, with Portrait of a Noob.

Portrait of a Data-Modeling Vet

I'm using "vet" because that's the term commonly used as the opposite of a n00b. I'm using "data-modeling" because that seems to be the term settled on towards the end to represent a complex of concepts: code comments, code blank lines, metadata, thorough database schemas, static typing, large class hierarchies. By the way, I doubt that all those truly are useful analogues for each other. In my opinion, part of being a vet is the virtue of "laziness" (i.e., seeking an overall economy of effort), and part of "laziness" is creating code in which informative comments and blank lines lessen the amount of time required to grok it. Compare code to a picture that's simultaneously two faces and a vase. Whitespace can express meaning too, which is partly why I believe it makes perfect sense for an organization or project to enforce a style policy regardless of what the specific policy is. Naturally I still think a code file should contain more code lines than comment and whitespace lines, though the balance depends on how much each code line accomplishes--some of the Lisp-like code and Haskell code I've seen surely should have had more explanatory comments.

But my reason for responding isn't to offer advice for commenting and spacing or to dispute the accuracy of the alleged "data-modeling" complex that afflicts n00bs. My objection is to the "portrayal" that n00bs focus too heavily on modeling a problem in an attempt to tame its scariness and avoid doing the work, and vets "just get it done". Before proceeding further I acknowledge that I have indeed had the joy of experiencing someone's fanatic effort to produce the Perfect Class Collection for a problem, with patterns A-Z inserted seemingly on impulse and both inheritance and interfaces stacking so deeply that UML diagrams aren't merely helpful but more or less required to grasp how the solution actually executes. Then a user request comes in for data that acts a little like X but also like Y and Z, and the end result is modifying more than five classes to accommodate the new ugly duckling class. As Yegge calls for moderation, so do I.

However, the n00b extreme of Big Design Up Front (iterations that try to do and assume too much) shouldn't overshadow the truth that data modeling isn't a distraction from or the "kidney" of software. We can bloviate as long as we want about code and data being the same, particularly about the adaptability that equivalence enables, but the mundane reality is programs with instruction segments, data segments, call stacks, and heaps. Computers shlep data values, combine data values, overwrite data values. The data model is the bridge between computer processing and the problem domain that can be seen, heard, felt (smelt and tasted?). A math student who solves a word problem but forgets the units label, like "inches" or "degrees" or "radians" or "apples", has failed. The data model is part of defining the problem, and also part of the answer.

Yet the difference between the data model of the n00b and the data model of the vet is cost-effectiveness: how complicated must the data model be to meet the needs of customers, future code maintainers, and other programmers who may reuse the objects (and if you don't want or expect others' code to call your object methods, ponder why you're making the methods public rather than package-private). Yegge makes the same point, but I want to underscore the value and importance of not blindly fleeing from one extreme to the other. Explicit data modeling with static classes is not EVIL. By all means, decompose a program into more classes if that makes it more maintainable, understandable, and reusable; apply an OO design pattern if the situation warrants it, since a design pattern can make the data model more adaptive than one might guess.
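On that package-visibility aside: in Java, omitting the access modifier gives exactly that middle ground. A small invented illustration:

```java
public class Order {
    private final double subtotal;

    public Order(double subtotal) { this.subtotal = subtotal; }

    // Public: the API other packages are meant to call.
    public double total() { return subtotal + tax(); }

    // No modifier = package-private: visible to collaborators in the same
    // package, invisible outside it. If no one outside the package needs
    // a method, this is the honest visibility for it.
    double tax() { return subtotal * 0.05; }

    public static void main(String[] args) {
        System.out.println(new Order(100.0).total());
    }
}
```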

N00bs and vets data-model, but the vet's data model is only as strict as it must be to meet everyone's needs. According to his usual form, Yegge's examples of languages for lumbering and pedantic n00bs are Java and C++, and his examples of languages for swift and imaginative vets are Perl and Python and Ruby. (And how amusing is it to read the comments that those languages "force you to get stuff done" and "force you to face the computation head-on" when the stated reason people give for liking those languages is that they don't force anything on the programmer-who-knows-best?) My reply hasn't changed. A language's system for typing variables and its system for supporting OOP don't change the software development necessities already mentioned: a data model and code to process that data model. Change the data model, and the code may need to change. Refactor how the code works, and the data model may need to change. Change the data stored in the data model, e.g. an instance with an uninitialized property, and the code may break at runtime anyway. The coupling between code and data model is unavoidable whether the data model is implicit (a hashmap or a concept of an object being more or less a hashmap) and the code contains the smarts OR the data model is explicit (a class hierarchy or a static struct/record) and the code is chopped into little individually-simple snippets encapsulated across the data model.
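The two ends of that spectrum sit side by side in this sketch (invented names): either way the code is coupled to the data model; the question is only whether the compiler or the runtime reports the breakage.

```java
import java.util.HashMap;
import java.util.Map;

public class CouplingDemo {
    // Explicit model: the schema lives in the type. Rename 'name' and
    // every use fails to compile until it's fixed.
    static class User {
        final String name;
        User(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        // Implicit model: the schema lives in string keys. Misspell a key
        // and nothing complains until the value comes back null at runtime.
        Map<String, Object> userMap = new HashMap<>();
        userMap.put("name", "Ada");
        Object maybeName = userMap.get("nmae"); // typo: silently null

        User user = new User("Ada");
        System.out.println(user.name + " / " + maybeName);
    }
}
```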

Having said that, I see Yegge's comments about "Haskell, OCaml and their ilk" as a bit off. (Considering I self-identify as an F# cheerleader, probably nobody is surprised that I would express that opinion.) I disagree with the statement that the more sound (more complete and subtle, fewer loopholes) a type system is, the more programmers perceive it as inflexible, hated, and metadata-craving. OK, it may be true that's how other programmers feel, but not me. I admit that at first the type system is bewildering, but once someone becomes acquainted with partial application, tuples, unions, and function types, other systems seem lacking. A shift in orientation also happens: the mostly-compiler-inferred data types in a program aren't a set of handcuffs any longer but a crucial factor of how all the pieces fit together.

When a data structure changes and the change doesn't result in any problems at compile time, I can be reasonably certain all my code still runs. When the change causes some kind of contradiction in my code, like a missing pattern-match clause, the compiler shows what needs to change (though like other compiler errors, interpretation takes practice). I change the contradicting lines, and I can be reasonably certain the modified code will run.
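Java can approximate that experience with an enum and a switch expression (a rough stand-in for an ML-style union; the guarantee is weaker than in F# or OCaml, but the workflow is the same): leave out the default clause, and the compiler demands exhaustiveness.

```java
public class Exhaustive {
    enum Shape { CIRCLE, SQUARE }

    // A switch *expression* with no default must cover every constant.
    // Add Shape.TRIANGLE and this method stops compiling until a new
    // clause is written -- the compiler points at the contradiction.
    static String describe(Shape s) {
        return switch (s) {
            case CIRCLE -> "round";
            case SQUARE -> "four equal sides";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(Shape.CIRCLE));
    }
}
```

(Requires Java 14 or later for switch expressions.)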

I can be reasonably certain not that it works, but that it runs. Using its types right is necessary to ensure the program runs well, not sufficient to ensure the program runs well. As many people would rightly observe, a program with self-consistent types is useless if it doesn't solve the problem right, or solves the wrong problem, or doesn't yield the correct result for some edge condition, or blows up due to a null returned from an API written in a different language (gee, thanks!). Unit and integration tests clearly retain positions as excellent tools apart from the type system.

I can't help coming to the conclusion that when a type system has a rich feature set combined with crisp syntax (commas for tuples, curly braces and semicolons for struct-like records, square brackets and semicolons for lists) and type inference (all the checking, none of the bloat!), the type system isn't my enemy. And I should include classes and interfaces when lesser constructs aren't enough or I need larger explicit units of code organization. Don't feel bad for modeling data. We all must do it, in whatever language, and our programs are better for it.

Monday, March 10, 2008

futilely confronting the foolishness of crowds

This isn't a news flash to anyone I suspect, but it can sometimes be incredibly frustrating to publish an acute comment or reply online (or offline). Once some people feel that they're under attack, they will proceed to strike back at the comment or reply however they can: mod it down, ridicule it, counter it with irrelevant nonsense, etc. In any case, they don't seem to bother trying to comprehend the point it's making, or if they do then they still feel obligated to chivalrously "defend the honor" of banner foo. I've found that even if the tone is mild and evenhanded and all of the supporting axioms are undeniable, the same outcome can happen. The possibility of a moderate thought is simply crowded out by the sacred cows and cargo cults. How dare anyone attempt to offer a novel perspective! Evolved pack cooperation be damned, let's go back to competing dominance displays!

When I'm in a philosophical mood such situations often remind me of what Kris Kringle says in Miracle on 34th Street to explain why he purposely failed the mental examination after Dr. Sawyer the peevish company psychologist committed him to an asylum:
Oh, it's not just Doris. There's Mr. Sawyer. He's contemptible, dishonest, selfish, deceitful, vicious...yet he's out there and I'm in here. He's called normal and I'm not. Well, if that's normal, I don't want it. That's why I answered incorrectly.
When I'm in an impatient mood such situations merely make me want to yell "Doesn't anybody notice this? I feel like I'm taking crazy pills!"

Tuesday, March 04, 2008

peeve no. 257 is the idea of Internet "governance"

I'm not sure if this opinion places me in the category of realist or idealist, but the whole idea of Internet "governance" irks me. The Internet is a "network of networks", and a network is a connection between devices. It's a simple idea, while at the same time being fiendishly complicated to put into practice. The Internet is "devices talking to one another".

Nobody should find it necessary to comment that the Internet is decentralized, or to comment that the value of the Internet is at the "edges", i.e. the devices actually sending and receiving data. In simple terms, when ma and pa click on an icon for a Web browser to "go to the Internet", the closer metaphor isn't switching TV channels or turning pages in a massive encyclopedia (Information Superhighway, remember that?). They're telling the computer to "make a phone call" to other computers. This point was easier to emphasize back when people plugged a regular phone line right into the computer's internal or external modem. IPTV and VOIP probably interfere further with the casual user's attempt at a metaphorical understanding...

If the fundamental Internet is just computers talking, then discussing its fundamental "governance" is nutty. Who "governs" TCP, UDP, and so on? Who "governs" HTML, CSS? Who "governs" SMTP, POP, IMAP? Who "governs" IP, IPv6? The Internet is communication. What matters is that the communication works right here, right now. Standards and protocols enable valid communication. The "penalty" is miscommunication, e.g., a malformed data packet or a web page that's "broken" in an idiosyncratic renderer.

No entity--company, government, standards body--"governs" the Internet. The Internet isn't owned by them either. I want to believe this is true. The irresistible tendency to see the Internet as a marketplace, public diary, fan convention, software development technique, leisure entertainment, newsroom, information repository, and so forth masks the reality. There is no such thing as cyberspace and no such thing as blogosphere. The Internet is device A exchanging data with remote device B, via a path of complex technologies and intermediaries. If the day comes when almost all important information interaction happens through this medium, and strict governance extends to that same level of micromanagement of communication, previous losses of privacy will seem tame.