Showing posts with label Programming Languages General Topics. Show all posts

Thursday, September 08, 2011

software developers like punctuation

Comparison of equivalent snippets in various programming languages leads to a stable conclusion about what developers like: punctuation. Namespaces/packages, object hierarchies, composing reusable pieces into the desired aggregate, and so on are relegated to the despicable category of "ceremony". Better to use built-in punctuation syntax than to type letter sequences that signify items in the standard libraries. Developers don't hate objects. They hate typing names. Hand them a method and they'll moan. Hand them a new operator that accomplishes the same purpose and they'll grin. What's the most noticeable difference between Groovy and Java syntax, in many cases? Punctuation as shortcuts. Why do some of them have trouble following Lisp-y programming languages? Containment in place of punctuation.

Oddly enough, some of the same developers also immediately switch opinions when they encounter operators overloaded with new meanings by code. Those punctuation marks are confusing, unlike the good punctuation marks that are built in to the language. Exceedingly common behavior can still be unambiguous if its method calls are replaced by punctuation, but behavior of user modules is comparatively rare so the long names are less ambiguous than overloaded operators. "By this module's definition of plus '+', the first argument is modified? Aaaaaghhhh! I wish the module writer had just used a call named 'append' instead!"
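The complaint practically writes itself in any language with operator overloading. A minimal Python sketch (the class and its behavior are hypothetical, invented to illustrate the quoted gripe): a module overloads '+' so that it modifies its first operand, which no reader of `a + b` would expect.

```python
class Playlist:
    """Hypothetical module type that overloads '+' surprisingly."""
    def __init__(self, tracks):
        self.tracks = list(tracks)

    def __add__(self, other):
        # Surprise: '+' mutates the left operand instead of
        # returning a fresh Playlist.
        self.tracks.extend(other.tracks)
        return self

    def append(self, other):
        # The boring, unambiguous spelling of the same behavior.
        self.tracks.extend(other.tracks)
        return self

a = Playlist(["intro"])
b = Playlist(["outro"])
c = a + b
print(a.tracks)  # ['intro', 'outro'] -- the "plus" modified a!
print(c is a)    # True
```

The built-in '+' on lists is fine; the user-defined '+' that quietly mutates is what triggers the "Aaaaaghhhh".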

Wednesday, July 20, 2011

"feels dynamic" is a ridiculous expression

I should start out by saying that most of the actual information content in Scala: The Static Language that Feels Dynamic is great and worth your time. Scala deserves any attention it receives. And obviously any complaints I advance about what Eckel writes are gadflies to his cow.

Regardless, my reaction to "feels dynamic" is extreme annoyance. Dynamic typing has no feel. Dynamic typing is merely one of many aspects to a language. I might agree that languages have a "feel", although it still sounds risible in a serious technological discussion. Go ahead and mention "feel", but at least trot out numerous examples that partially convey your meaning. Fortunately, Eckel does go on to do this in the article, which as a communicator puts him above the typical Rubyist/Pythonista/whatever-stupid-name-they-enjoy.

I'd suggest an alternative expression "feels Pythonic". Still better, "is as succinct as Python syntax". Based on the text, that's almost exactly what Eckel intends. Distinguishing a "dynamic feel" from a "static feel" just sounds like someone has an astounding lack of programming language breadth. For instance, ML-family or Lisp-family languages, not to mention Prolog or Forth or J, probably have different "feels" than that simplistic two-state viewpoint. For frak's sake, a hypothetical Java "replacement" more or less identical to Java, but with a type-inferring compiler and a much more convenient standard library, would "feel dynamic" according to some language cheerleaders. Get thee out to read about not only Scala but Groovy++.

Friday, October 29, 2010

peeve no. 261 is Obi and programming language mentality

The Force can have a strong influence on the weak-minded. (--Obi-Wan)

Similarly, a programming language can have a strong influence on weak-minded programmers. If your programming language "changed your life", "made you a better coder", "expanded your mind", etc., then I suggest that you may be going about this programming thing the wrong way. If a lack of language support for OO results in you producing a tangled web of dependencies among procedural functions, or easy access to global/file scope results in you spraying program state everywhere, then the programming language is your crutch. It's the guide-rail for your brain.

I'm glad that you found a programming language that you enjoy, but consider me unimpressed by rather impractical claims about the ways that it alters your consciousness.

Wednesday, July 07, 2010

explicit is better than implicit: in favor of static typing

Right now, static rather than dynamic typing more closely fits my aesthetic preferences and intellectual biases. And the best expression of the reason is "explicit is better than implicit". The primary problem I have with dynamic typing, i.e. checking types solely as the program executes, is that the type must be left implicit/unknown despite its vital importance to correct reasoning about the code's operation. The crux is whether the upsides of a type being mutable and/or loosely-checked outweigh the downside of it being implicit.

Dynamism. I'm inclined to guess that most of the time most developers don't in fact require or exploit the admittedly-vast possibilities that dynamic typing enables. The effectiveness of tracing compilers and run-time call-site optimizations confirms this. My experiences with C#'s "var" have demonstrated that, for mere avoidance of type declarations for strictly-local variables, type inference almost always works as well as a pure dynamic type. Stylistically speaking, rebinding a name to multiple data types probably has few fans. The binding of a name to different data types is more handy for parameters and returns...

Ultimate API Generality. As for the undeniable generality of dynamically-typed APIs, I'm convinced that most of the time utterly dynamic parameters are less accurate and precise than simple parameters of high abstraction. This is seen in LINQ's impressive range of applicability to generic "IEnumerable<T>" and in how rarely everyday Objective-C code needs to use the dynamic type id. With few exceptions, application code needs to implicitly or explicitly assume something, albeit very little, about a variable's contents in order to meaningfully manipulate it. In languages with dynamic typing, this truth can be concealed by high-level data types built-in to the syntax, which may share many operators and have implicit coercion rules. Of course, in actuality the API may not react reasonably to every data type passed to it...
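The same point holds outside C#: code written against a thin abstraction is nearly as general as code written against no type at all. A sketch in Python (function name hypothetical), using the standard library's `Iterable` abstract base class instead of a fully dynamic parameter; the code assumes only iteration, which is the "very little" it actually needs.

```python
from collections.abc import Iterable

def longest(items: Iterable) -> str:
    """Works on any iterable of strings: list, tuple, set, generator."""
    best = ""
    for s in items:
        if len(s) > len(best):
            best = s
    return best

print(longest(["ab", "abcd", "a"]))        # abcd
print(longest(("x", "yy")))                # yy
print(longest(s for s in ["q", "rrr"]))    # rrr
```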

"Informal Interfaces". According to this design pattern, as long as a group of objects happen to support the same set of methods, the same code can function on the entire group. In essence, the code's actions define an expected interface. The set of required methods might differ by the code's execution paths! This pattern is plainly superior for adapting code and objects in ways that cut across inheritance hierarchies. Yet once more I question whether, most of the time, the benefits are worth the downside in transparency. Every time the code changes, its informal interface could change. If someone wants to pass a new type to the code, the informal interface must either be inferred by perusing the source or by consulting documentation that may be incomplete or obsolete. If an object passed to the code changes, the object could in effect violate the code's informal interface and lead to a bug that surprises users and developers alike. "I replaced a method on the object over here, why did code over there abruptly stop working?" I sympathize with complaints about possible exponential quantities of static interface types, but to me it still seems preferable to the effort that's required to manually track lots of informal interfaces. But in cases of high code churn, developers must expend effort just to update static interface types as requirements and objects iterate...

Evolutionary Design. There's something appealing about the plea to stop scribbling UML and prod the project forward by pushing out working code regardless of an anemic model of the problem domain. In the earliest phases, the presentation of functioning prototypes is a prime tool for provoking the responses that inform the model. As the model evolves, types and type members come and go at a parallel pace. Isn't it bothersome to explicitly record all these modifications? Well, sure, but there are no shortcuts around the mess of broken abstractions. When the ground underneath drops away, the stuff on top should complain as soon as possible, rather than levitating like a cartoon figure until it looks down, notices the missing ground, and dramatically plummets. Part of the value of explicit types lies precisely in making code dependencies not only discoverable but obvious. This is still more important whenever separate individuals or teams develop the cooperating codebases. The other codebase has a stake in the ongoing evolution of its partner. A "handshake" agreement could be enough to keep everyone carefully synchronized, but it's more error-prone compared to an enforced interface to which everyone can refer. During rapid evolution, automated type-checking is an aid (although not a panacea!) to the task of reconciling and integrating small transforming chunks of data and code to the overall design. Types that match offer at least a minimum of assurance that contradictory interpretations of the domain model haven't slipped in. On the other hand, unrestricted typing allows for a wider range of modeling approaches...

Traits/Advanced Object Construction. No disagreement from me. I wish static type schemes would represent more exotic ideas about objects. Still, most of the time, applied ingenuity, e.g. OO design patterns, can accomplish a lot through existing features like composition, delegation, generics, inheritance.

I want to emphasize that my lean toward static typing for the sake of explicitness isn't my ideal. I direct my top level of respect at languages and platforms that leave the strictness of the types up to the developer. I like a functioning "escape hatch" from static types, to be employed in dire situations. Or the option to mix up languages as I choose for each layer of the project. I judge types to be helpful more often than not, but I reserve the right to toss 'em out when needs demand.

foolish usage of static and dynamic typing

Whenever I read the eternal sniping between the cheerleaders of static and dynamic typing, I'm often struck by the similarity of the arguments for each side (including identical words like "maintainable"; let's not even get started on the multifarious definitions of "scalability"). Today's example is "I admit that your complaints about [static, dynamic] typing are valid...but only if someone is doing it incorrectly. Don't do that." Assuming that sometimes the criticisms of a typing strategy are really targeted at specimens of its foolish usage, I've therefore paired common criticisms with preventive usage guidelines. The point is to neither tie yourself to the mast when you're static-typing nor wander overboard when you're dynamic-typing.

Static
  • "I detest subjecting my fingers to the burden of type declarations." Use an IDE with type-completion and/or a language with type inference. 
  • "The types aren't helping to document the code." Stop reflexively using the "root object" type or "magic-value" strings as parameter and return types. Y'know, these days the space/speed/development cost of a short semantically-named class is not that much, and the benefit in flexibility and code centralization might pay off big later on - when all code that uses an object is doing the same "boilerplate" tasks on it, do you suppose that just maybe those tasks belong in methods? If a class isn't appropriate, consider an enumeration (or even a struct in C#).
  • "Branches on run-time types (and the corresponding casts in each branch) are bloating my code." Usually, subtypes should be directly substitutable without changing code, so the type hierarchy probably isn't right. Other possibilities include switching to an interface type or breaking the type-dependent code into separate methods of identical name and distinct signatures so the runtime can handle the type-dispatch.
  • "I can't reuse code when it's tightly coupled to different types than mine." Write code to a minimal interface type, perhaps a type that's in the standard library already. Use factories, factory patterns, and dependency injection to obtain fresh instances of objects.
  • "I need to change an interface." Consider an interface subtype or phasing in a separate interface altogether. Interfaces should be small and largely unchanging.
  • "When the types in the code all match, people don't realize that the code can still be wrong." Write and run automated unit tests to verify behavior. Mocks and proxies are easier than ever.
Dynamic
  • "It's impossible to figure out a function's expected parameters, return values, and error signals without scanning the source." In comments and documentation, be clear about what the function needs and how it responds to the range of possible data. In lieu of this, furnish abundant examples for code monkeys to ape.
  • "Sometimes I don't reread stuff after I type it. What if a typo leads to the allocation of a new variable in place of a reassignment?" This error is so well-known and typical that the language runtime likely can warn of or optionally forbid assignments to undeclared/uninitialized variables.
  • "Without types, no one can automatically verify that all the pieces of code fit together right."  Write and run automated unit tests that match the specific scenarios of object interactions.
  • "Beyond a particular threshold scripts suck, so why would I wish to implement a large-scale project in that manner?" A language might allow the absence of namespaces/modules and OO, but developers who long to remain sane will nevertheless divide their code into comprehensible sections. Long-lived proverbs like "don't repeat yourself" and "beware global state" belong as much on the cutting-edge as in the "enterprise".
  • "My head spins in response to the capability for multiple inheritance and for classes to modify one another and to redirect messages, etc." Too much of this can indeed produce a bewildering tangle of dependencies and overrides-of-overrides, which is why the prudent path is often boring and normal rather than intricate and exceptional. Mixin classes should be well-defined and not overreach.
  • "In order to read the code I must mentally track the current implied types of variables' values and the available methods on objects." Descriptive names and comments greatly reduce the odds that the code reader would misinterpret the intended variable contents. Meanwhile, in normal circumstances a developer should avoid changes to an object's significant behavior after the time of creation/class definition, thus ensuring that the relevant class definitions (and dynamic initialization code) are sufficient clues for deducing the object's abilities.

Monday, June 21, 2010

my turn to explain type covariance and contravariance

The Web has a wealth of resources about type covariance and contravariance, but I often find the approach unsatisfying: either code snippets or abstract-ish theoretical axioms. I prefer an additional, intermediate level of understanding that may be called pragmatic. Pragmatically understood, any pattern or concept isn't solely "standalone" snippets or logical axioms. A learner should be able to connect the concept to his or her prior knowledge and also to the question "When would I use this?" I'll attempt to remedy this shortcoming.

For code reuse, a pivotal issue is which data type substitutions in the code will work. Can code written for type Y also function properly for data of type Z? Well, if the definition of data type Z is strictly "Y's definition plus some other stuff", then yes, since the code can treat Z data as if it were Y data. Z is said to be derived from Y. Z is a stricter data type relative to Y. We then say that Z is covariant to Y but Y is contravariant to Z. Code that acts on Y can also act on Z because Z encompasses Y, but code that acts on Z may not be able to act on Y because the code may rely on the "other stuff" in Z beyond Y's definition. In languages that support it, the creation of a data type that's covariant to an existing type can be easily accomplished through inheritance of the existing type's language artifacts. (When someone uses the language feature of inheritance to create a type whose data can't be treated by code as if it were of the original type, confident code reuse is lost, and confident code reuse is the whole goal behind type covariance/contravariance!)
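In code (a hypothetical Python sketch, keeping the post's Y and Z names): Z's definition is "Y plus other stuff", so code written for Y also handles Z, while the reverse can fail.

```python
class Y:
    def greet(self):
        return "hello"

class Z(Y):  # Z is covariant to Y: Y's definition plus other stuff
    def brag(self):
        return "I have extra methods"

def code_for_y(obj):
    return obj.greet()  # relies only on Y's definition

def code_for_z(obj):
    return obj.greet() + "; " + obj.brag()  # relies on Z's extras

print(code_for_y(Y()))   # hello
print(code_for_y(Z()))   # hello -- Z substitutes for Y
print(code_for_z(Z()))   # hello; I have extra methods
# code_for_z(Y()) would raise AttributeError: Y lacks brag().
```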

Thus far I've only mentioned the reuse of one piece of code over any data type covariant to the code's original type. The data varies but the code doesn't. A different yet still eminently practical situation is the need for the code to vary and nevertheless join up with other pieces of code without any of the code failing to work (e.g. a callback). What relationship must there be between the data types of the code pieces in order for the whole to work?

All that's necessary to approach this question is to reapply the same principle: a piece of code that works with data type Y will also work with data type Z provided that Z is covariant to Y. Assume an example gap to be filled by varying candidates of code. In this gap, the incoming data is of original type G and the outgoing data is of original type P.

Consider the code that receives the outgoing data. If the candidate code sends data of type P, the receiving code will work simply because it's written for that original type. If the candidate code sends data of a type covariant to (derived or inherited from) P, the receiving code will work because, as stated before, covariant types can "stand in" for the original type. If the candidate code sends data of a type contravariant to P, the receiving code won't work because it may rely on the parts of P's definition that are included in P alone. (In fact, this should always be true. Whenever it isn't, the receiving code should have instead targeted either a less-strict type or even just an interface. A "snug" data-type yields greater flexibility.) So the rule for the candidate code is that it must be written to send data of type P or covariant to P.

Meanwhile, the code that sends the gap's incoming data has committed to sending it as type G. If G is the original type that the candidate code is written to receive, it will work simply because it's written for that type. If a type covariant to G is the original type that the candidate code is written to receive, it won't work because it may rely on the parts of that type that aren't in G. If a type contravariant to G is the original type that the candidate code is written to receive, it will work because G is covariant to it, and once again covariant types are substitutable. So the rule for the candidate code is that it must be written to receive data of type G or contravariant to G.

Hence, the rules for covariance and contravariance in various programming languages (delegates in C#, generics in Java or C#) are neither arbitrarily set nor needlessly complex. The point is to pursue greater code generality through careful variance among related data types. Static data types are "promises" that quite literally bind code together. Pieces of code may exceed those promises (legal covariance of return); however, the code may not violate those promises (illegal contravariance of return). Pieces of code may gladly accept what is seen as excessively-generous promises (legal contravariance of parameters); however, no code should expect anything that hasn't been promised (illegal covariance of parameters).
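The rules can be sketched at run time (class names hypothetical; the "gap" is a callback slot that is handed a G and must send back a P):

```python
class Animal: ...
class Dog(Animal): ...        # Dog is covariant to Animal

class Vehicle: ...
class Truck(Vehicle): ...     # Truck is covariant to Vehicle

# The "gap": a callback that receives a Dog (incoming type G = Dog)
# and must return a Vehicle (outgoing type P = Vehicle).
def run_gap(callback):
    result = callback(Dog())
    assert isinstance(result, Vehicle)  # the receiving code's demand
    return result

# Legal candidate: written to receive a *supertype* of Dog
# (contravariant parameter) and to send a *subtype* of Vehicle
# (covariant return). It exceeds the promise, never violates it.
def candidate(animal: Animal) -> Truck:
    return Truck()

print(type(run_gap(candidate)).__name__)   # Truck
# An illegal candidate would, say, rely on a Dog-only method that
# Animal lacks, or return a bare object where a Vehicle is demanded.
```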

Wednesday, August 12, 2009

the importance of syntax sugar, ctd.

A while ago I ranted that the common term "syntax sugar" minimizes the practical importance of programming language syntax. One example is when people sometimes state (or imply) that their preference for dynamic languages is tied to the languages' sweeter syntax.

While that may be true, it's a mistake to assume that the saccharine in the syntax is dependent on a language's dynamic typing. This point is amply demonstrated for those who've been keeping up with the development of C#, a static language whose syntax is sufficiently honeyed that the below snippet is remarkably comparable to the Javascript version after it. I tried to match up the code as much as possible to make the comparison easier.

C#

using System;
using System.Collections.Generic;

public class myClass
{
    public string myProp { get; set; }

    static void Main(string[] args)
    {
        var myc = new myClass() { myProp = "English" };
        myc.manhandleStr();
        System.Console.Out.WriteLine(myc.myProp);
    }
}

public static class ExtMethodClass
{
    public static void manhandleStr(this myClass cl)
    {
        var palaver = new
        {
            myName = "Nym",
            myTaskList = new List<Action<string>>
            {
                next => { cl.myProp = next + ".com"; },
                next => { cl.myProp =
                    next.Replace("l", "r").ToLower(); }
            }
        };
        palaver.myTaskList.ForEach(task => task(cl.myProp));
    }
}

Javascript

var myc =
{
    myProp: "English"
};

myc.manhandleStr = function() {
    var palaver =
    {
        myName: "Nym",
        myTaskList:
        [
            function(next) { myc.myProp = next + ".com"; },
            function(next) { myc.myProp =
                next.replace("l", "r").toLowerCase(); }
        ]
    };
    palaver.myTaskList.forEach(function(task) { task(myc.myProp); });
};

myc.manhandleStr();
alert(myc.myProp);


By the way, other C# programmers might scold you for writing code like this. And yes, I realize that fundamentally (i.e. semantically rather than superficially) the dynamic Javascript object "myc" differs from the static C# instance "myc". Stay tuned for the "dynamic" keyword that's on the way in C# for duck-typing variables going to and from CLR dynamic languages.

Friday, June 12, 2009

I'm GLAD I'm not programming in natural human language

At least one of the hard and undeniable facts of software development doesn't really swipe a developer across the face until he or she starts work: the difficulty of extracting good information from people. Someone may say "the bill is a set rate per unit" but a mere five minutes later, after further interrogation, is finally coaxed into saying "customers pay a flat rate for items in categories three and four". Similarly, the reliability of the words "never" or "always" should be closely scrutinized when spoken by people who aren't carefully literal. (On the other hand, xkcd has illustrated that literal responses are inappropriate in contexts like small talk...)

I'm convinced that this difficulty is partially caused by natural human language. It's too expressive. It isn't logically rigorous. By convention it supports multiple interpretations. While these attributes enable it to metaphorically branch out into new domains and handle ambiguous or incompletely-understood "analog" situations, the same attributes imply that it's too imprecise for ordering around a computing machine. Just as "I want a house with four bedrooms and two bathrooms" isn't a sufficient set of plans to build a house, "I want to track my inventory" isn't a sufficient set of software instructions to build a program (or even a basis on which to select one to buy).

Every time I perform analytical/requirements-gathering work, I'm reminded of why I doubt that natural human language will ever be practical for programming, and why I doubt that my job will become obsolete any time soon. I can envision what the programming would be like. In my head, the computer sounds like Majel Barrett.

Me: "Computer, I want to periodically download a file and compare the data it contains over time using a graph."
Computer: "Acknowledged. Download from where?"
Me: "I'll point at it with my mouse. There."
Computer: "Acknowledged. Define periodically."
Me: "Weekly."
Computer: "Acknowledged. What day of the week and what time of the day?"
Me: "Monday, 2 AM."
Computer: "Acknowledged. What if this time is missed?"
Me: "Download at the next available opportunity."
Computer: "Acknowledged."
Me: "No, wait, only download at the next available opportunity when the system load is below ___ ."
Computer: "Acknowledged. What if there is a failure to connect?"
Me: "Retry."
Computer: "Acknowledged. Retry until the connection succeeds?"
Me (getting huffy): "No! No more than three tries within an eight-hour interval."
Computer: "Acknowledged. Where is the file stored?"
Me: "Storage location ______ ."
Computer: "Acknowledged. What if the space is insufficient?"
Me: "Remove the least recent file."
Computer: "Acknowledged. What data is in the file?"
Me (now getting tired): "Here's a sample."
Computer: "Acknowledged. What is the time interval for the graph?"
Me: "The last thirty data points."
Computer: "Acknowledged. What is the color of the points? Does the graph contain gridlines? What are the graph dimensions? How will the graph be viewed?"
Me: "Oh, if only I had an analyst!"

Wednesday, November 12, 2008

functional languages and template engines: a StringTemplate docs quote

From the StringTemplate documentation (I started trying StringTemplate out just yesterday, when I was looking around for a currently-maintained template library for C#):

Just so you know, I've never been a big fan of functional languages and I laughed really hard when I realized (while writing the academic paper) that I had implemented a functional language. The nature of the problem simply dictated a particular solution. We are generating sentences in an output language so we should use something akin to a grammar. Output grammars are inconvenient so tool builders created template engines. Restricted template engines that enforce the universally-agreed-upon goal of strict model-view separation also look remarkably like output grammars as I have shown. So, the very nature of the language generation problem dictates the solution: a template engine that is restricted to support a mutually-recursive set of templates with side-effect-free and order-independent attribute references.

Monday, August 25, 2008

8 perceptual metaphors for OOP objects

Why Learning F# Is So Difficult at got net? (I compliment the blog's subtitle "Kevin Hazzard's Brain Spigot") has an insightful explanation for why functional programming code is so confusing at first: usually, programmers don't have the necessary perceptual primitives, the prototypes, that would enable rapid recognition/comprehension of the code. They're expecting mutable variables, not immutable lists. They're expecting "foreach", not "map". They're not expecting statements, expressions, and functions to be somewhat synonymous. They're not expecting functions to be inputs or outputs of functions, nor are they expecting partial function evaluation. They're not expecting recursion to play a large role, perhaps because they assume it will blow the stack (which is a good opportunity to explain tail-recursion optimization). And, naturally, they're not expecting the operators to look and act so differently.
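Those unfamiliar primitives show up even in a multi-paradigm language. A short Python sketch of the expectations Hazzard's list upends: map instead of foreach, functions as inputs and outputs, partial function evaluation, and recursion in place of a loop:

```python
from functools import partial

nums = [1, 2, 3, 4]

# "map", not "foreach": a transformed list, no mutation.
doubled = list(map(lambda n: n * 2, nums))

# Partial function evaluation: fix one argument, get a new function.
def power(base, exponent):
    return base ** exponent
square = partial(power, exponent=2)

# A function that returns a function.
def adder(n):
    return lambda x: x + n

# Recursion instead of a loop. (Python does not optimize tail calls,
# so depth is limited -- a good opening to explain why other runtimes do.)
def total(xs):
    return 0 if not xs else xs[0] + total(xs[1:])

print(doubled)      # [2, 4, 6, 8]
print(square(5))    # 25
print(adder(3)(4))  # 7
print(total(nums))  # 10
```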

That said, the need to learn new perceptual prototypes as part of adjusting to a different programming paradigm applies to OOP, too. My experience of switching to OOP had its own moments of bewilderment, such as "Huh? Invoking a subroutine on this data type runs code two levels up in an 'inheritance hierarchy'?". I know of no fewer than 8 ways or metaphors for perceiving the most fundamental concept of OOP, objects:
  • Supervariables. The supervariable metaphor is most applicable to small classes that don't do much more than provide a handful of convenience methods for representing and modifying abstract data types. "Supervariable" emphasizes that an object not only stores a small conglomeration of data but also provides prefabricated code for common tasks on that data. An alternative name for this metaphor is "smart variable". A ComplexNumber object that has a produceConjugate method fits the supervariable metaphor.
  • Miniature programs. The miniature program metaphor is most applicable to classes that, like a control panel, expose intricate, full-featured capabilities. "Engine" in the class name could be a clue. (In practice, objects that are closest to "miniature programs" probably employ I/O and various other objects internally.) The point is that creating an object and running one or more of its methods can be similar to starting a program and directing it to carry out actions. Of course, the metaphor is explicit when a program specifically furnishes an "API object" interface in addition to GUI or command-line.
  • Nouns. The noun metaphor is the most straightforward and the most taught. The object represents a noun and simply consists of the noun's attributes and activities. It's a Car, a Square, a Dog (that inherits from Vehicle, Shape, Animal, respectively). It's an abstract simulation of its real counterpart in the problem domain. Since in nontrivial problems 1) the problem domain overflows with too many irrelevant nouns and 2) not all necessary parts of a program have real-world analogues in the problem domain, the noun metaphor isn't seamless nor self-sufficient. It's a good starting point for analysis and design, however.
  • Memo senders. One of the tricky mental shifts for OOP beginners is decentralization of control flow, especially when it's event-driven. Each object has its own limited, distinct responsibility and role, so the way to accomplish a larger purpose is collaboration and communication among objects. No object is inherently primary, though execution must start at one particular method. In OOP, the metaphor of object interaction isn't a top administrator dictating orders or stage actors enacting a script; it's a cooperative group of employees who send memos to one another in order to complete individual assignments. And "employees" shouldn't need to "send memos" to every other employee in the "building" - see Law of Demeter.
  • Lieutenants. From the standpoint of the memo-sending metaphor, the "lieutenant" metaphor is about the contents of the "memos". A lieutenant (or vice-president) isn't told exactly what to do (the "how") but the goal to be met (the "what"). The value of the lieutenant is in delegation. Objects shouldn't need to know much about other objects' work, as long as the objects do their "jobs". There's a good reason for this advice: the more an object knows and therefore assumes about other objects, the harder it is to modify and/or reuse each object independently. Effective error handling often requires a delicate compromise of an object knowing just enough about possible errors to respond the way it should - no more, no less (should the object abort, retry, ignore, fail?).
  • Instantiable scopes. Unlike the noun metaphor, the "instantiable scopes" metaphor is not likely to be included in introductory texts. It's one of the more obvious metaphors to seasoned programmers but one of the least obvious to novices. This is a rather literal and stark "metaphor", which is why it's of more interest to compiler and interpreter writers than to analysts. An object is a "bundle" of functions and variables whose implementation involves "this" (or "self") pointers and vtables. Sometimes, people involuntarily learn this metaphor when a serious problem occurs. Developers who are trying to add tricksy feature extensions to a language may need to think of objects in this way.
  • Data guards. Substitute "bouncers" for "guards" if desired. In this metaphor, objects are protective intermediaries between code and data. The object somehow mitigates the downsides of direct data access: checking that indexes are in-bounds, tying the disposal of acquired resources to garbage collection, etc. Like the lieutenant metaphor, a data guard object can separate an accessing object from dangerous knowledge, because the accessing object knows only as much about the data as the guard allows. This means, for instance, that the guard could change to obtain the data from a different source, or the accessing object could be reused in several different contexts at once as long as its data guard functions consistently. The obvious downside is that superfluous guards and multiple layers of guards complicate and slow the program.
  • Sets of behaviors. This is the most abstract metaphor of all. Many OO design patterns use it. According to this metaphor, objects represent actions instead of, well, objects. The object's focus is the execution of a specific algorithm, and not the expression of data. Therefore, the typical name of a behavioral object refers to the role it serves in the program: bridge, adapter, factory, iterator, strategy. Viewed through this metaphor, the important difference between all objects is behavior. Objects that have the same behavior shouldn't require multiple classes. The behavior of subclasses shouldn't violate the behavior expectations of the superclass. An object is what an object does.
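The "instantiable scopes" metaphor above can be made literal with a short sketch (the `Counter` class is invented for illustration): a method call is just a function lookup on the class, with the instance's bundle of variables passed in explicitly.

```python
# The "instantiable scopes" metaphor, literally: an object is a bundle
# of variables, and its methods are plain functions that receive the
# bundle as "self".

class Counter:
    def __init__(self, start):
        self.value = start           # per-instance variable in the "scope"

    def increment(self, step=1):
        self.value += step
        return self.value

c = Counter(10)
c.increment(5)                       # sugar for Counter.increment(c, 5)
assert c.value == 15
assert vars(c) == {"value": 15}      # the instance really is a bundle of data

d = Counter(10)
Counter.increment(d, 5)              # the desugared, lookup-table form
assert d.value == 15
```

Seeing the desugared call spelled out is exactly the "rather literal and stark" view that compiler and interpreter writers live with.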

Tuesday, March 18, 2008

define maintainable

Here's a John Lam quote from the end of this article:
Finally, why should .NET developers bother with Ruby? According to Lam: "You spend less time writing software than you spend maintaining software. Optimizing for writing software versus maintaining software is probably the wrong thing to do. Static typing makes it harder to maintain software because it's harder to change it."
What makes this quote so noteworthy is how its definition of the word "maintainable" is apparently different from the definition used by static typing proponents. They would say that static typing makes it easier to maintain software because the static types document the code (some people might even characterize static types as metadata), enable additional checking of the code by the compiler, enable automatic refactoring, and generally make code easier to read/optimize/execute at the expense of making it harder to write.

The static typing cheerleaders allege that dynamic typing optimizes for writing software versus maintaining software. But based on this quote from John Lam, dynamic typing cheerleaders allege the same against static typing.

Does "maintainable" refer to preserving maximum code flexibility, or does it refer to preserving maximum code clarity (by static types, low dynamism, simple concepts)? The answer is both. The tension between flexibility and clarity is just one of the design trade-offs canny developers mull every day.

UPDATE: I feel obligated to comment on the linked article title. "No Borg-like release train for Ruby on .Net" doesn't make any sense. Borg have no releases! Borg assimilate. I know that it's a Register-related website and the chances are good that readers know that "Borg" is a tee-hee term for (the non-open-source portions of?) Microsoft, but c'mon, at least don't use the Borg label when it isn't appropriate.

Saturday, February 16, 2008

clarifications of metaprogramming, monads, DSLs

The last post led to some interesting feedback that has prompted me to make some clarifications. Specifically, I'm responding mostly to the entry on chromatic's use.perl.org Journal (I often enjoy pondering what chromatic writes there and over at O'Reilly Network Blogs). I'm also including a short response to the reddit comments, and tossing in a couple other thoughts I had after posting.
  • I remembered a few other metaprogramming techniques. My quick reference to a "preprocessing step that applies templates/macros to the source" doesn't give due recognition to the (sometimes headache-inducing) tricksy acrobatics performed by template metaprogramming. The general category of compiler-compilers was a gaping omission, considering how many of them Wikipedia lists and how frequently people use these great tools. Camlp4 is a snazzy way to create parsers in OCaml, even for extending OCaml syntax (see the slides for the One-Day Compilers presentation). And F# has a "quotations" feature for treating code like data.
  • Perl 6 isn't a DSL, of course. When I wrote that "Parsec can implement external DSLs, even an external DSL like Perl 6", I meant something more along the lines of "Parsec can implement languages that are much different from the host language Haskell, even a language as nontrivial as Perl 6". My intent was merely to comment in passing that "monads in the form of a parser combinator like Parsec" can and do function as an effective technique for external DSLs. Those of us who're familiar with Higher-Order Perl (the later chapters...) aren't (much) fazed by the concept of parser combinators...
  • Monads aren't DSLs. Actually, I said the two were opposites. The line "A monad is an inside-out DSL" (and in bold, no less) is using DSLs somewhat as a metaphor or counterpoint for monads, although as I wrote a while ago my preferred monad metaphor is an abstract algebra group or ring (and if you see what the origin for monads was, the suitability of this metaphor isn't surprising).
  • However, I did write "In fact, as monad tutorials instruct, Haskell's 'do-notation' is syntax sugar, i.e. a DSL, for using monads..." My intent was merely to comment in passing that do-notation is like a DSL within Haskell syntax for the >>= operator, through which the domain "using monads" is made easier to work in. It reminds me of how LINQ's SQL-ish syntax, with its keywords like "from" and "where", acts like a DSL for data queries within applicable languages. As parts of a language's intrinsic syntax, are "do-notation" or extra-special LINQ syntax true stand-alone DSLs? No. Do those language features, like DSLs, make it easier to program for a specific purpose, a specific task's "domain"? Darn tootin' (yes).
  • Are monads a genuine example of metaprogramming, or programming in a language and to an API? Given that a monad is a feature of the language syntax, a monad sure doesn't seem like metaprogramming to a Haskell programmer any more than defmacro does to a Lisper. However, think of how a language without monad capabilities would achieve monad-like effects (consider the "before" and "after" advice of AOP). On the whole, monads seem like a sufficiently sophisticated mechanism for code transformation, regardless of the language, to qualify for "metaprogramming". Someday, when monads aren't considered fancy wizardry, maybe the "plain programming" label will be the consensus. (As an aside, one of my personal syntactical/semantic wishes is that the freedom to choose dynamic or static variable type checking becomes commonplace.)
  • reddit comments: Is my only point that do-notation is a DSL, and I'm just trying to be clever about it? 1) Well, I'm always trying to be clever on this blog, as a general rule. It's one of my motivations for writing at all. But I'm sure everyone realizes that my motivation has nothing to do with whether what I'm writing is useful or not to each individual reader. Applying what you read is your responsibility. I'm not one for writing how-tos. 2) As I already admitted, calling the do-notation a DSL is shaky (the "internal" qualifier notwithstanding). That isn't at all the main point. The main point is metaprogramming. The later paragraphs are about contrasting two applications or approaches of metaprogramming, monads and DSLs. If this "Grand Unified Theory" isn't as interesting or persuasive to anyone else, that's okay. It might offer some answers for people who ask, "Can I make DSLs in Haskell?"
  • Woo-hoo! As of right now, not only is the blog the top Google result for "rippling brainwaves", but the discussed blog entry is the top Google result for "metaprogramming monads DSLs". (Using Blogger probably helps. You Google searchers have got to ask yourselves a question: "Do I feel lucky?" Well, do ya, punks?) But I didn't write the title or the entry to "bait" readers, honest. That's partly what I was referring to in the self-deprecating remark, "This must be what happens to your mind after reading too many blogs."
  • Thanks to the commentators who said they grasped what I was struggling to communicate! I appreciate the occasional reassurance that I'm being lucid, rather than gibbering...
  • Update (Feb 18, 2008): I noticed that Writing a Lisp Interpreter In Haskell seems to match some of my conclusions. Nice to see that I finally caught on to what I read many months ago... First a short quote: "Haskell comes standard with a parsing library called Parsec which implements a domain specific language for parsing text." Now a long quote:
    A common misconception among Haskell beginners is that monads are an unfortunate evil necessary in a lazy, side-effect free language. While it is true that monads solve the lazy IO problem, it is only one example of their power. As I plunged deeper and deeper into Haskell I began to realize that monads are a desirable feature that allows the programmer to build otherwise impossible abstractions. Monads, together with higher order functions, provide excellent facilities for abstracting away boilerplate code. In this sense Haskell's monads serve a similar purpose to Lisp's macros (though the concept is entirely different).
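Since parser combinators like Parsec come up in several of these points, here's a hedged toy sketch of the idea, not Parsec itself: a parser is a function from input text to a (result, remaining-text) pair, or None on failure, and combinators assemble large parsers from small ones. All names below are invented.

```python
# Toy parser combinators: each parser maps a string to (value, rest) or None.

def char(c):
    """Parser that matches a single literal character."""
    def parse(s):
        return (c, s[1:]) if s.startswith(c) else None
    return parse

def seq(p, q):
    """Run p, then q on the leftover input; succeed only if both succeed."""
    def parse(s):
        r1 = p(s)
        if r1 is None:
            return None
        v1, rest = r1
        r2 = q(rest)
        if r2 is None:
            return None
        v2, rest2 = r2
        return ((v1, v2), rest2)
    return parse

def alt(p, q):
    """Try p; if it fails, try q on the same input."""
    def parse(s):
        return p(s) or q(s)
    return parse

ab = seq(char("a"), char("b"))
assert ab("abc") == (("a", "b"), "c")
assert alt(char("x"), char("y"))("yz") == ("y", "z")
assert ab("xy") is None
```

Real combinator libraries add repetition, error reporting, and (in Haskell) a monadic interface, but the compositional skeleton is the same.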

Thursday, February 14, 2008

of metaprogramming, monads, DSLs

Update: I wrote a follow-up as feedback to the feedback.

This must be what happens to your mind after reading too many blogs.

Like the writer of "I'm not tired of Java yet", I'm one of those many people who still (must) predominantly use Java at work for all the usual boring reasons, like platform stability and understandable syntax and 3rd-party APIs, but who also enjoy learning about other ways of doing things. The functional programming paradigm is one I've covered before (e.g. in this comparison of FP's and other paradigms' answers to the fundamental concerns of programming). Much of functional programming still appeals to me (e.g. when I implemented an Ajax callback that, if necessary, would make a data-specific number of additional Ajax calls before doing its original task(s) such as updating the DOM). However, over time I'm gaining more appreciation for another paradigm: metaprogramming, programs that manipulate programs, especially when the manipulating and manipulated program are in the same language and the manipulation happens at run-time. Metaprogramming can enable a higher level of abstraction and drastically reduce the need for code to repeat itself, though I'd advise first exhausting the plain reuse mechanisms of the language: namespaces, modules, classes, traits, functions, etc. (On the other hand, in cases in which a complex OO design pattern's careful balance between rigor and flexibility isn't vital and metaprogramming isn't absurdly difficult or error-prone, I prefer the metaprogramming option...)

A broad collection of techniques falls into the metaprogramming category. In Lisp-like languages (homoiconic languages, in which the language syntax is itself one of the language's data structures), read-write metaprogramming is natural. In other languages, metaprogramming could be supported through implementation-provided reflection objects. If the language implementation supports loading/linking additional code at run-time, a program could generate source, prepare it, and load it. With some limitations, metaprogramming might be possible through byte-level operations performed by a library, like ASM. When the language (or the platform it runs on, such as .Net 3.5) offers "expression tree" objects, the creation of new code at run-time is less messy but not necessarily easier. In still other languages that as a matter of course interpret/compile source at run-time (or immediately before), a program could just synthesize source as a string and run an "eval" function on it. Finally, a comparatively primitive yet widely-used technique is a preprocessing step that applies templates/macros to the source to produce the complete source.
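As a minimal illustration of the synthesize-and-eval technique, here's a hedged sketch (function names are invented; Python's `exec` stands in for any runtime "eval" facility):

```python
# Program A writes the source of program B as a string, then compiles
# and loads it at run-time.

def make_adder_source(n):
    return f"def add_{n}(x):\n    return x + {n}\n"

namespace = {}
exec(make_adder_source(3), namespace)   # compile and load the new code
assert namespace["add_3"](10) == 13
```

Primitive, yes, but this is the same move that template preprocessing makes at build time, shifted to run-time.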

One language characteristic blurs the line between metaprogramming and programming somewhat like homoiconic languages do: classes and objects that are modifiable without restriction. Learning some ways to exploit this characteristic has been called "Basic Training" for Ruby, but the technique applies to any language that shares the characteristic. Its relationship to metaprogramming is clear. Given that classes and objects combine code and data, and classes and objects constitute the structure of program B (so OOP is a prerequisite), and program A can freely modify the classes and objects of program B (and vice-versa), then program A can manipulate program B--the definition of metaprogramming, in a limited sense. Nevertheless, the modification of classes and objects can accomplish much of the effect of metaprogramming rather simply. (Languages with this freedom of unrestricted object modification seem to usually have an "eval", too.)
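A quick hedged sketch of this characteristic using Python's open classes (Ruby's class reopening works similarly; the `Greeter` class is invented):

```python
# Program B's class...
class Greeter:
    def hello(self):
        return "hello"

g = Greeter()
assert g.hello() == "hello"

# ...modified freely by program A at run-time:
def shout(self):
    return self.hello().upper() + "!"

Greeter.shout = shout        # new behavior for every instance, past and future
assert g.shout() == "HELLO!"
```

Note that the already-created instance `g` picks up the new method, which is exactly why this simple move delivers much of the effect of heavier metaprogramming.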

Regardless of the specific metaprogramming technique, two applications of the paradigm have been getting publicity: monads (Monads in Python is a good example) and DSLs (find an example by your preferred Ruby cheerleader). The DSLs under discussion are internal, not external, which means the DSLs act as an API short-hand that uses the syntax and semantics of a host general-purpose programming language. (But monads in the form of a parser combinator like Parsec can implement external DSLs, even an external DSL like Perl 6...) Monads and DSLs are both applications of metaprogramming because each of the two effectively transforms minimal code into a complete form for execution. In fact, as monad tutorials instruct, Haskell's "do-notation" is syntax sugar, i.e. a DSL, for using monads...
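For concreteness, here's one hedged sketch of an internal DSL: an API shaped so that ordinary host-language method chaining reads like a small query language. The `Query` class and its data are invented for illustration.

```python
# An internal DSL: nothing but host-language objects and methods, but the
# call chain reads like a declarative query.

class Query:
    def __init__(self, rows):
        self.rows = rows

    def where(self, pred):
        return Query([r for r in self.rows if pred(r)])

    def select(self, *fields):
        return [{f: r[f] for f in fields} for r in self.rows]

people = [{"name": "Ada", "age": 36}, {"name": "Bob", "age": 17}]
adults = Query(people).where(lambda r: r["age"] >= 18).select("name")
assert adults == [{"name": "Ada"}]
```

The "expand out" quality is visible here: each short DSL expression stands in for the loops and temporary lists the implementation generates around it.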

The surprising revelation I came to a short while ago is that these two applications of the metaprogramming paradigm are opposites. In a DSL, statements and expressions of the source code "expand out" into larger blocks of code. The DSL implementation translates the source code and extrapolates a context of other code around it and from it. A monad implementation, by contrast, constitutes a context of other code around the source code that translates the source code and makes the monad's purpose "focus in" on the source code. A monad is an inside-out DSL. In the DSL application of metaprogramming, one observes and implements "operations" and "entities" in a domain, then creates a DSL--a language for those operations and entities. In a monad application of metaprogramming, one observes and implements "operations" and "entities" in a domain, figures out how to interweave those operations and entities with code in general (via the monad "return" and "bind" definitions), then creates a monad--a "domain" for those operations and entities, in which code has extra significance/side-effects. Obviously, the opposite function of monads and DSLs also reflects opposite purposes for the host programming language: an internal DSL acts as a highly-specific dialect and a monad acts as an extension. DSLs could perhaps be friendly to non-programmers (presuming a minimum level of aptitude), but monads would be definitely unfriendly to them.
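To make the contrast concrete, here's a hedged sketch of a Maybe-style monad in Python (the tuple encoding and all names are my own invention): "return" wraps a value, "bind" threads values through a pipeline, and the monad supplies the surrounding "domain" (here, failure may short-circuit any step) rather than expanding statements outward the way a DSL does.

```python
# A Maybe-style monad: values are ("Just", x) or ("Nothing", None).

def unit(x):                 # the monad "return"
    return ("Just", x)

def bind(m, f):              # the monad "bind" (Haskell's >>=)
    return f(m[1]) if m[0] == "Just" else m

NOTHING = ("Nothing", None)

def safe_div(a, b):
    return unit(a / b) if b != 0 else NOTHING

# The "context of other code around the source code": failure anywhere
# short-circuits the rest of the pipeline without explicit checks.
ok = bind(unit(10), lambda x: safe_div(x, 2))
assert ok == ("Just", 5.0)

result = bind(ok, lambda y: safe_div(y, 0))
assert result == NOTHING
```

Haskell's do-notation is, per the earlier post, sugar over exactly these nested `bind` calls.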

I'd like to try employing a monad application of metaprogramming to make certain tasks easier. Of course, outside of the pure functional paradigm, where both mainstream languages and reality have state, monads are less essential. Monads could still be beneficial whenever I'll need to model a complicated commonality among several statements, without repeating myself. Those who think monads are still a long way off should note that F# is on its way to becoming an officially-supported .Net language (it also works on Mono), and F# has "computation expressions" whose similarity to monads and list comprehensions is no accident...

Tuesday, August 14, 2007

the typing debate is a red herring

Or: even more than before, dynamic vs. static variable typing is not the salient point of differentiation in programming language debates.

I've hinted at this previously, but my discovery of Jomo Fisher's blog on the more cutting-edge aspects of .Net reminded me of it. The casually-readable entry The Least You Need to Know about C# 3.0 describes multiple features which put the dynamic-static (compilation-runtime) distinction in a new light: the keyword "var" (for type inferenced or "anonymous type" variables), extension methods, and expression trees. These additions help make LINQ possible.

As .Net increases its explicit support for dynamic code (as opposed to dynamic code support through a Reflection API), the barrier between the capabilities of a "static" and a "dynamic" language keeps shrinking. If "expression tree" objects in .Net 3.5 allow someone to generate and execute a customized switch statement at runtime, then what we have is a solution with the flexibility of a dynamic language and the performance of a static language. Ideally, the smart folks working on porting dynamically-typed languages to statically-typed VM platforms would accomplish something similar in the implementation innards. The code that accomplishes this feat is a bit daunting, but it is cutting-edge, after all.

Some of those irascible Lisp users may be snickering at the "cutting-edge" label. As they should. I've read that Lisp implementations have had the ability to employ either static or dynamic typing for many years. Moreover, Lisp uses lists for both data and program structures, so it doesn't need a special expression tree object. It also has had a REPL loop that made the compilation-runtime distinction mushy before .Net was a gleam in Microsoft's eye. On the other hand Lisp is, well, Lisp. Moving along...

The way I see it (and echoing/elaborating what I have written before now), there are three reasons why the question of static and dynamic variable typing has always been, if not a red herring, at best a flawed litmus-test for language comparisons.
  1. The time when a variable's data type is set doesn't affect the clear fact that the data the variable refers to has one definite data type at runtime. Ruby and Python cheerleaders are fond of reminding others that their variables are strongly typed, thankyouverymuch! Here "strongly typed" means the language doesn't paper over inappropriate operations on data by silently applying coercion rules to one or more operands. The timeless example is how to evaluate 1 + "1". Should it be "11", 2, or Exception? Strongly-typed languages are more likely than weakly-typed languages to evaluate it as Exception (whether a static CompileException or a dynamic RuntimeException). So dynamic typing is separate from strong typing, precisely because variable typing, a part of the code, is different from data typing, which is what the code receives and processes in one particular way at runtime. Data is typed--even null values, for which the type is also null. Regardless of language, the next question after "what is the name and scope of a variable?" is "what can I do with the variable?", and its answer is tied to the type of data in the variable. In fact, this connection is how ML-like languages can infer the data type of a variable from what the code does with it. Similarly, Haskell's type classes appear to define a data type precisely by the operations it supports. No matter how strict a variable type is at compile time or run time, when the code executes such considerations are distractions from what the actual data and type really is.
  2. Programming languages are intangible entities until someone tries to use them, and therefore publicity (ideas about ideas) is of prime importance. One example is the stubborn insistence on calling a language "dynamic" instead of "scripting"; with one word programmers are working on active and powerful code (it's dynamite, it's like a dynamo!) while with the other word programmers are merely "writing scripts". Unfortunately, applying the word "dynamic" to an entire language/platform can also be misleading. Languages/platforms with static variable typing are by no means necessarily excluded from a degree of dynamism, apart from support for reflection or expression tree objects. Consider varargs, templates, casting (both up and down the hierarchy), runtime object composition/dependency injection, delegates, dynamically-generated proxies, DSL "little languages" (in XML or Lua or BeanShell or Javascript or...) parsed and executed at runtime by an interpreter written in the "big" language, map data structures, even linked lists of shifting size. The capabilities available to the programmer for creating dynamic, or perhaps true meta-programmatic, programs can be vital in some situations, but in any case it's too simplistic to assume static variable typing precludes dynamic programs. I don't seem to hear the Haskell cheerleaders often complaining about a static-typing straitjacket (or is that because they're too busy trying to force their lazy expressions to lend a hand solving the problem?).
  3. OOP has been in mainstream use for a long time. I think it's uncontroversial to note the benefits (in maintainability and reuse) of breaking a program into separate units of self-contained functions and data called objects, and then keeping the interactions between the objects minimal and well-defined, especially for large, complex programs. This fundamental idea behind OOP is independent of variable typing, and also independent of object inheritance. Anyone with a passing familiarity with Smalltalk or Objective-C would agree. A language might allow one object to send a message to, or call a method on, any other object, with a defined fallback behavior if the object isn't able to handle the message. Or, it might not allow message-passing to be that open-ended. Maybe, for reasons of performance or error-checking, it has a mechanism to enforce the fact that an object must be able to handle all messages passed to it. This "message-passing contract" may be explicit in the code or inferred by the compiler. Most likely, if it has object inheritance it supports using a descendant object directly in place of one of its ancestor objects (Liskov substitution principle). My point is that OOP support may be intimately connected to a language's scheme for typing variables (Java), or it may not be (Perl). A tendency to confuse OOP with a typing system (as in "Java's too persnickety about what I do with my object variables! OOP must be for framework-writing dweebs!") is another way in which the typing question can lead to ignorant language debates.
When faced with comparing one language to another, one important question is when (and if) the type of a variable gets set, and how strictly. However, that question is merely one of many.
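Point 1's timeless 1 + "1" example, run in a strongly- yet dynamically-typed language: Python refuses to coerce and raises at runtime, and any coercion has to be requested explicitly.

```python
# Strong typing in a dynamic language: no silent coercion rules.

try:
    1 + "1"
    outcome = "coerced"
except TypeError:
    outcome = "Exception"

assert outcome == "Exception"       # the dynamic RuntimeException case
assert str(1) + "1" == "11"         # "11" only when the coercion is explicit
assert 1 + int("1") == 2            # 2 only when the coercion is explicit
```

A weakly-typed language would have quietly picked "11" or 2 for the bare expression; that choice, not when the variable's type was fixed, is the strong/weak axis.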

Saturday, March 03, 2007

comparative roundup of FP paradigm aspects

Functional programming is a broad topic. When I've read about it, the context has often been either its applications in a specific language, longish screeds about abstract concepts (Church numerals?), or teachings about other subjects entirely (Scheme, I'm looking at you!). But the important bits of FP, the parts that people are probably referring to when they advocate "learning about FP just to expand mental horizons", are independent of that stuff. As I understand it, FP is a way of structuring a program; it's a paradigm. Both to organize my own thoughts and help dissolve FP's perceived mystery, here is a roundup of FP's answers to fundamental programming concerns, with comparisons to others' answers to the same concerns. Nothing new or detailed here (go elsewhere for that), but I hope the conciseness and comparisons could: 1) aid in quicker understanding of what makes FP different, 2) serve as a very-high-level introduction, 3) answer the honest question "how is FP applicable to real programming problems again?". I would have appreciated something like this when I first started reading up on it.
  • Programming metaphor. FP's metaphor seems to be math. People in FP talk about "provable" programs. Functions in FP are more like mathematical functions that always map the same arguments to the same result, rather than procedures that happen to return a value. In contrast, OOP's metaphor is objects passing messages, and (high-level) imperative programming's metaphor is...well, a simplified computer architecture, in which memory locations (variables) have names and operators can be used to represent the end goal of several machine instructions. Declarative's metaphor is a document of assertions written in terms of the problem domain.
  • Code modularization. Code is easier to handle when the humble programmer can divide a program into parts. Of course, FP uses functions for this, but in FP functions are themselves values with types, just like any other program entity ("first-class"). OOP uses objects. Imperative uses functions/procedures that group a series of statements, but the functions have no presence in the code aside from program flow control. Declarative might use an 'include' to designate one document as a part of another document. In practice, no matter the paradigm, there's a higher level, the module or package level, for collecting several units of modularized code into a single reusable chunk or library.
  • Localized code effects. Closely related to code modularization. Code should not have much of an "action-at-a-distance" quality. Code units must be connected in some way, but this connection should be as minimal and well-defined as possible to make unit changes less hazardous and the units' execution more easily understood. In particular, program state over there should not affect program state here. In FP (only the paradigm in the "pure" abstract is under discussion here) there is no modifiable program state, because that might cause functions to not always map the same arguments to the same value. In effect, an FP program is a series of independent, generalized data transformations. FP accomplishes iteration by recursion. A function that ends just by returning the result of another function call makes a "tail call", which allows the compiler/interpreter to save stack space by discarding the intermediate call frame (cutting out the useless middle man whose only remaining purpose would be forwarding the tail-call result) and returning the tail call's result directly to the function that called the intermediate function in the first place. In OOP, code's effect is localized by encapsulating or hiding data inside objects' private scopes. Imperative has local function variables.
  • Deferred code execution. Often, code needs to be manipulated but nevertheless deferred in execution, such as handler code that must be connected up to an event-processing framework (like a GUI toolkit). In FP, functions are values, so functions can be passed as arguments to other, "higher-order" functions. An object in OOP would send a message to another object to identify itself as a Listener or Observer of that object's events. Or, alternatively, an object might always delegate or forward certain messages to other designated objects. In a more general context in FP, if one wishes to delay the execution of a particular operation at runtime, one way is to construct a function with no arguments, known as a "thunk", around the operation, and then pass the thunk around until the operation is really needed, at which point the code just runs the thunk and does the operation.
  • Reconfiguring code. To avoid marginal effort, code should be reusable. In some happy cases, the code can be reused as is because the requirement is the same (actually, the problems themselves can sometimes be mapped onto each other to accomplish this - one could perform a sort on a set of expensive-to-copy data by creating a parallel array of references to the data and then sorting the references array using any code that can sort arrays by dereferencing the values). But many times a requirement will be similar but distinct, and the code must be reconfigured to be reused. The easiest solution is to add a parameter to the code. For FP-style functions, that have values and types of their own, this can take the form of currying: evaluating a function with an incomplete number of parameters returns not the function's result but a new function that takes the remaining parameters. Currying add(a,b) by evaluating add(3) would return a new function that could be named add3 - a function that returns the result of adding 3 to its single parameter. Moreover, the fact that functions can be parameters to functions means that FP makes it easy to create "light-weight frameworks": the parts of a generalized function or "framework" that vary can be smaller, distinct functions that are passed to it like any other parameters, and then are called within its body. In OOP, a slew of patterns might apply to reconfiguring code to a new situation: Adapter, Mediator, Decorator, Strategy, among others. Declarative might have a technique known as an "overlay" for merging a new document fragment with another named fragment.
  • Metaprogramming. One of the great epiphanies of programming that set it apart from other disciplines is metaprogramming - programs manipulating programs. Here, comparing programming to writing would imply that a book that could write books, and comparing programming to construction would imply that a house could build a house (although I suppose one could say metaprogramming is like building a robot that can build robots - er, Skynet?). FP's enhanced notion of functions greatly pays off in this context, because functions can now create other functions. "Lambda" is the traditional name for the magic function, keyword, or operator that creates a new function (value) from a code block and parameter list. Any outside variables referenced by the new function in lexical scope are captured at creation time and "ride along" with the function - this is a closure because it "closes over" the scope. Since closures are related to variables, closures pertain more to "impure" FP than FP per se. In OOP, metaprogramming support refers to constructing new original objects (as opposed to instantiating an existing class), especially when the OOP is prototype-based rather than class-based - class-based OOP can still do it if the language supports metaclasses. Metaprogramming can also be done with macros/templates within any paradigm, but unless this facility is part of the language (like the Lisp-y ones), the macros may be part of a messy pre-processing step that is notoriously tricky.
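Several of the FP tools from the bullets above can be sketched in a few lines of Python: a thunk that defers execution, partial application of `add` to get the `add3` described earlier, and a lambda-built closure (`make_multiplier` and the other names are invented for illustration).

```python
from functools import partial

# Deferred execution: a zero-argument function ("thunk") wraps the work.
thunk = lambda: 2 ** 10
assert thunk() == 1024            # the work happens only when forced

# Currying/partial application: add with one argument supplied is add3.
def add(a, b):
    return a + b

add3 = partial(add, 3)
assert add3(4) == 7

# Lambda + closure: a function that creates functions, where n
# "rides along" in the closed-over scope of each new function.
def make_multiplier(n):
    return lambda x: x * n

double = make_multiplier(2)
assert double(21) == 42
```

Note `partial` simulates currying on demand; in a language like Haskell or ML, every multi-argument function curries this way automatically.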
Yow. I was hoping this would turn out to be smaller as I was writing it. I haven't even covered any real FP techniques or patterns; I've only talked in generalities about the tools, and not examples of how to effectively use them! There's no mention of concepts related to/enabled by FP, like lazy evaluation or pattern matching! The breadth and longevity of the topic have foiled me again!

Note that a paradigm is mostly independent from a type system, and also the question of compilation vs. interpretation vs. JIT from bytecode. Comparing paradigms is about how each one enables differences in coding styles, not how the paradigms and code happen to be implemented. Comparing languages and language implementations, well, that's a different question.

I should also point out that FP, or any other paradigm, blends with other paradigms in the wild. A common tactic, which has the virtue of being more familiar to the OOP-trained, is to make an FP-style function an object that happens to be callable with arguments. A different tactic, which may have advantages in underlying flexibility for esoteric tricks, is to use a closure like an object, where the variables in the closure's scope are like an object scope. Yet another tactic is to rely on conventions and practices and libraries to use one paradigm like another, like C code that uses structs and functions for object-oriented "lite" - in its early stages, C++ worked by translating C++ code into backend C code. The quite permeable boundaries between paradigms are why knowledge of FP can come in handy in contexts outside languages like Haskell.

Tuesday, November 14, 2006

First rule of proselytization is to respect the infidel

The idea behind the clash of civilizations post was that the Java (and C#, etc.) folks have different values than the dynamic languages camp, so arguments that are completely convincing to one side are considered irrelevant by the other side. They talk over each other's heads.

However, if someone wants to try to talk across that gap, and possibly help someone cross, I think that insults are not a good start. In defense of J2EE's complexity addresses the layered, many-pieced architecture of J2EE through a historical point of view. As a general rule, technology (especially software) is invented, popularized, and evolved to serve a specific need, even if that need may be quite broad. J2EE wasn't an accident! The problem with this blog entry is the argument at the end: that software engineering should be complex enough to keep out BOZOs. chromatic called him on it over on his O'Reilly blog, because, obviously, there are many reasons for a non- or anti-BOZO to use a simpler technology.

An example of a good way to communicate with your intellectual opponents is this post by Sam Griffith. He's the same writer who previously expounded on a Java macro language, which I linked to from my MOP post a while ago. He gives Java its due, but at the same time explains how Java could learn some things from other languages/platforms that had similar goals. Another good example is the Crossing Borders series over at IBM developerWorks (even if it exhibits the annoying trait of non-Java content in a Java-centered context). Each page in the series demonstrates an alternative approach to the common Java way of accomplishing a task, and then it compares and contrasts. Some of the articles, like one on Haskell, honestly don't seem to offer much for the Java developer, and one on web development seems to basically suggest that we should go back to embedding straight Java code in our page templates. But the one on the "secret" sauce of RoR is enlightening.

Personally, I often read these articles with a half-smirk, because the pendulum swing here is clear: we started out writing web stuff in C, then switched to Perl, then Java, and now back to Python/Ruby/PHP or what have you. The other reason for my smirk is because I'm now forced to work in ASP.Net anyway. But if Microsoft can do anything, it's copy and clone, so there's work afoot to use IronPython for more rapid web work.

Monday, October 09, 2006

again too good not to share

Apparently the Drunken Blog Rants guy has been complaining about overzealous Agile folks. And he continued the discussion with another post, Egomania Itself. (The title is an anagram of Agile Manifesto.) I'm not commenting about the "debate" because I frankly don't care. As long as management doesn't force a rigid methodology on me, and I can get my work done well and on time, and the users are overjoyed, I think that everything is fine. Oh, and pigs fly.

No, the real reason I bring this up is because of the Far Side cartoon reference. Behold! The ultimate example of, as Yegge says, "non-inferring static type systems"!

My stance, if I consistently have one, on the merits of static vs. dynamic typing is as follows:
  • To the computer, all data is just a sequence of bits. The necessity of performing number operations on one sequence and date operations on another pretty much implies that the first sequence is a number and the second sequence is a more abstract Date. Data has a (strong) type so it can make sense and we can meaningfully manipulate it. Don't let that bother you.
  • An interface with data types is more descriptive than it otherwise would be. Especially if your code intends for the third argument to be an object with the method scratchensniff. I shouldn't have to read your code in order to use it. Admittedly, data types still aren't good enough for telling you that, for instance, an integer argument must be less than 108. Where the language fails, convention and documentation and, most importantly, discipline can certainly suffice...like an identifier in ALLCAPS or beginning with an _underscore.
  • Speaking optimistically, we programmers can see the big picture of what code is doing. We can tell exactly what will happen and to what, and for what reason. Compilers and computers don't, because compilers and computers can't read. However, when data is typed, the compiler can use that additional information to optimize.
  • Leaving type unspecified automatically forces the code to work more generically, because the type is unknown. The code can be set loose on anything. It may throw a runtime error because of unsuitable data, but on the other hand it can handle cases the original writer never even thought of. This can also be done with generics or an explicitly simple interface, but not as conveniently.
  • Not having the compiler be so picky about types and interfaces means that mock objects for unit testing are trivial.
  • There is a significant overhead to keeping track of often-complex object hierarchies and data types. Knowing that function frizzerdrap is a part of object class plunky is hardly intuitive, to say nothing of static methods that may have been shoehorned who-knows-where. Thanks be to Bob, IDEs have gotten good at assisting programmers with this, but it's also nice to not have to search as many namespaces.
  • Dynamic typing goes well with full-featured collections. Processing data collections is a huge part of what programming is. The more your collection enables you to focus on the processing, and not the mechanics of using the collection, the better. With dynamic typing, a collection can be designed for usability and then reused to hold anything. Some of the most confusing parts of a statically-typed language are its collections, whether they use generics or ML-style type constructors.
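The generic-code and trivial-mock points in the list above can be sketched in Python; everything here, names included, is illustrative:

```python
def total_price(items):
    # No declared types: anything iterable whose elements expose a .price
    # attribute will do, including cases the original writer never thought of.
    return sum(item.price for item in items)

class Book:
    def __init__(self, price):
        self.price = price

# A trivial mock for unit testing: no framework, no interface to implement,
# just an object with the one attribute the code under test actually touches.
class FakeItem:
    price = 5

print(total_price([Book(10), Book(2)]))       # 12
print(total_price([FakeItem(), FakeItem()]))  # 10
```

The same function serves production objects and throwaway test doubles alike, which is the convenience the list is pointing at - with the runtime-error risk noted above as the trade-off.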
Is static typing or dynamic typing better? Silly rabbit, simplistic answers are for kids!

Tuesday, September 12, 2006

revenge of multiple inheritance

I count myself fortunate to not know much about C++, in spite of that being the language I learned in college (4-yr degree). One of the many details I was blissfully unaware of was multiple inheritance. In fact, I didn't know that C++ supported multiple inheritance until I read about Java interfaces. A statement similar to "interfaces are Java's response to multiple inheritance in C++" usually appeared in the introduction. That inspired me to read about multiple inheritance in C++, but when I encountered the term "virtual base class", I decided to give up on that. The moral I learned was that, as with biology, tangled-up inheritance trees don't work well in practice.

When I started to pick up Python, I was startled to find multiple inheritance sitting right there in plain view, and no one was complaining. Actually, I could have found multiple inheritance in Perl, but that would have never happened because I only used objects in Perl; I never implemented them (is that like saying, "I never inhaled"?). This is a clear instance of the oft-observed difference in values between these languages and Java. In Java, you make the language simple enough that programmers, especially those on teams other than yours, won't mess the code up by abusing complicated features. In these other languages born from scripting, you make the language capable of anything, or at least extensible, so the programmer can do whatever he wants to get his job done. The relevant quote is "easy things easy and hard things possible".
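Python keeps multiple inheritance in plain view by resolving the classic diamond with a deterministic method resolution order, rather than anything like C++'s virtual base classes. A minimal sketch (class names invented for illustration):

```python
class Base:
    def greet(self):
        return "base"

class Left(Base):
    def greet(self):
        return "left"

class Right(Base):
    def greet(self):
        return "right"

# The diamond: Child inherits Base twice, once through each parent.
class Child(Left, Right):
    pass

# The MRO linearizes the tangle: Child -> Left -> Right -> Base -> object.
print(Child().greet())  # "left", because Left comes first in the MRO
print([c.__name__ for c in Child.__mro__])
```

The ambiguity that demands ceremony in C++ is settled here by a single, inspectable ordering rule, which is presumably why nobody was complaining.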

The reason I bring up multiple inheritance now is because of a great blog post by chromatic about Roles: Composable Units of Object Behavior. The concept of object roles is slated for Perl 6. chromatic goes into great detail about how roles basically work as a middle ground between Java interfaces and C++ multiple inheritance. Squint enough and one might even think of it as AOP for classes. There's no reason someone couldn't accomplish a subset of the same effects in existing OO languages using a design pattern or two, but as chromatic explains, there are benefits to having roles built into the type system. Having roles built-in means that an interface and a default implementation don't need to be a matched pair, as in Java, but a single unit that can be applied to a class and checked at compile time or at run time. I still think I would rather have one class, parameterized by constructors into specific objects, and perhaps combined with another class using the decorator pattern, than one class parameterized by many roles into many classes. But maybe I'm stuck in the wrong paradigm. Roles seem like a great answer to the quandary of multiple inheritance. Leveraging the compiler is good. Code reuse is good.
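Roles proper are built into Perl 6's type system, but a rough approximation of the core idea - a named unit of behavior carrying both default methods and a requirement that is checked when it's composed, not when it's called - can be sketched in Python with a mixin. This sketch is mine, not from chromatic's post:

```python
class Comparable:
    # A role-like mixin: default methods plus a composition-time check.
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Fail when the "role" is applied, not later at some call site.
        if not hasattr(cls, "compare_to"):
            raise TypeError(f"{cls.__name__} must define compare_to to use Comparable")

    def less_than(self, other):  # default implementation, free to the consuming class
        return self.compare_to(other) < 0

class Version(Comparable):
    def __init__(self, n):
        self.n = n

    def compare_to(self, other):
        return self.n - other.n

print(Version(1).less_than(Version(2)))  # True
```

A class that omits `compare_to` blows up at class-definition time, which is the "checked when applied" benefit; what this sketch can't give you is the single interface-plus-implementation unit as a first-class citizen of the type system, which is chromatic's point about building roles in.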

Tuesday, August 22, 2006

clash of programming language civilizations

Why yes, the title of this post is sensationalist and even a little insulting to the real, violent culture clashes in the world. Thanks for noticing.

A few days ago I pointed to a blog posting that I saw through the use.perl.org feed. This time, I'll be pointing to blog postings through the javablogs.com feed. (I'm subscribed to those feeds because I frequently use Java and Perl at work. Outside of work, I experiment with whatever.) A proposal to add closures to Java has brought out some interesting arguments for the status quo. Java 5 Language changes were a bad idea even argues that the status quo, Java 5, isn't all that much an improvement over previous versions. Here are two quotes from there, one short, one long: "Verbosity is a good thing(tm)!", "Encapsulation is very important and is defined using several levels of hiding: class, package, protected and public... Simple and powerful. Why is this powerful? Because I can rely on the scope and find the usage of X (X being a field or method) within his scope far more easily."

What I find so intriguing is that the same valued characteristics this blog used to support an exceedingly lean, simple Java are what Java detractors routinely trot out as arguments for Java being weak, wordy, and over-engineered. The same characteristics! It's as if one person's language vice is another person's language virtue. Contrast the above quotes to quotes from the dynamic-typing, stop-calling-us-scripting-languages camp: "do what I mean (DWIM)", "convention over configuration", "the first postmodern computer language". With values this different, neither "side" even has an agreed-upon ruler for comparing their...er, syntax.

This clash in values is even more dramatic when you start considering examples. About Microsoft's "Delegates" transparently explains why Java chose not to have delegates. One blogger compared Hani's Bile Blog to some of the oh-so-friendly Ruby resources, and got some surprisingly shrill replies in spite of the fact that one's attitude has no correlation to one's programming language. A blog entry which is not so tongue-in-cheek is Our long Java nightmare. He says this isn't an anti-Java rant, but with a title like that, I wonder what would be. An irritated individual over on javalobby has the legitimate complaint that Ruby users keep discussing Ruby on Java sites. There's even a guy who feels compelled to argue that Java is not in the midst of a close battle with the Next Big Thing.

Another twist on this deep clash in philosophy is the group who works in Java for a living, and is recreating (mostly server-side) Java development "from the inside". Truth be told, if you want to use closures along with Java-ish syntax, you can just use Groovy right now and use Grails to develop a website around an EJB 3 object. Web service development isn't that hard at all if you use Glassfish. Even sticking with Eclipse is easier than it used to be with Callisto. If you prefer VB-like development, you may be able to do that sometime in the future. Want the source for the language implementation, so you can solve the most obscure bugs? It's on the way. Java programmers are not buffoons. Whether they cash in on it or not, they (we?) know how hard it has been to get things done in Java.

Conclusion? There is none. The rationale behind a choice of programming language is up to you, or your managers. Which punctuation marks or paradigm a language uses is only part of the equation..."it's how you use it". As Ovid explains, Perl probably wouldn't have the reputation it does if people used it better. More to the point, in another place Ovid advises his readers to ignore language holy wars and use the language whose strengths best fit your problem.

I would also advise that rather than choosing 1. the side of Turing Machines and statefulness and efficiency and static typing, or 2. the side of lambda calculus and statelessness and runtime interpretation and deferred typing, pick both. Think laterally. Use a convenient RAD language for most of the program and a speedy language for the parts that are a performance bottleneck. Use a language that can check types at run time OR compile time. Use a language that is functional but statically typed, like F#. There doesn't have to be One Language to Rule Them All, as long as there is One Interoperable Platform To In the Darkness Bind Them. But it certainly is entertaining to watch the two programming language civilizations clash.

Tuesday, August 08, 2006

aw, just read this page

I've been pondering some more posts about data typing in languages, the practical use of functional programming, etc., but after I reread the following page, I decided to just link to it since it makes pretty much all of the points I was thinking of, and more: Scalable computer programming languages. Even better, it was written by someone who's arguably more qualified than me to opine about any computer science topic whatsoever, and a more effective writer. I must confess, his mention of some language called Dylan has piqued my curiosity. I wonder what he would think about Ruby.
If programming languages were cars... acts like a humorous shorthand for his article's opinion.

Bonus observation (i.e. unconnected topic that doesn't warrant its own post): I saw the Last Exit to Springfield episode of the Simpsons not too long ago. It's so good, so emblematic of what the Simpsons did right, that I want to cry for the series' current state. Homer and Burns misinterpreting each other, Lisa playing Classical Gas on guitar, "dental plan...lisa needs braces...dental plan...lisa needs braces...", Burns and Smithers running the plant by themselves, a nod to the Grinch cartoon. So many great moments, and so wonderfully surreal.