Wednesday, December 03, 2008

gleaning good design advice from unit tests

I'm a beginner at the rigorous use of xUnit-style unit tests, which make tests easy to create, execute, and repeat by expressing them as code a computer can run non-interactively. I'm well-acquainted with the concept, of course, but mostly from blogs. At work, I'm alone in writing such tests. My guess is that the typical programmer categorizes up-to-date, comprehensive, effective test suites like good documentation: "nice but nonessential". The comparison is apt, since both must be maintained in parallel with the code. And the test source can function as sketchy documentation (open source projects, I'm scowling at you!).

But as so many others have observed in practice, good tests are of sufficient value to merit the developer's scarce attention. The upfront cost of writing or updating a test produces benefits repeatedly thereafter. When the code is first written, the test confirms that it does what it's supposed to do. When the code changes later (it will), the test confirms that it still behaves as expected and therefore won't break other code that relies on those past expectations (this is even more important if any points of code interaction are resolved at run-time). When the code has a bug, the test written to check for the bug confirms that the bug is fixed and stays fixed. When the code is reorganized with the wisdom of hindsight, the test confirms that the transition hasn't accidentally abandoned established dependencies. You don't need to trust TDD or Agile wonks to enjoy these results. Just trust your tests.
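To make the bug scenario concrete, here's a minimal JUnit 4 sketch. The class, the bug, and every name in it are invented for illustration; the point is only that the test written for the bug stays in the suite and keeps the bug fixed.

```java
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class OrderCodeTest {

    // Unit under test, inlined here (and made up) to keep the sketch self-contained.
    static class OrderCode {
        /** Normalizes user input like " ab-123 " to "AB-123". */
        static String normalize(String raw) {
            return raw.trim().toUpperCase();
        }
    }

    @Test
    public void stripsSurroundingWhitespace() {
        // Hypothetical regression test: padded input used to slip through
        // un-trimmed and break lookups downstream.
        assertEquals("AB-123", OrderCode.normalize(" ab-123 "));
    }

    @Test
    public void leavesAlreadyCleanCodesAlone() {
        assertEquals("AB-123", OrderCode.normalize("AB-123"));
    }
}
```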

The trickier aspect of unit tests, the facial mole that everybody notices and newcomers should acknowledge sooner rather than later, is the all-too-likely possibility that a typical OOP program's objects aren't the neat, independent, well-defined, composable units that facilitate unit testing. To some degree this is unavoidable, as no object is an island. Objects that are excellent individually still need to collaborate and delegate in order to perform their own useful tasks; when asked for its price including tax for a specific political domain, a sales item should have to ask yet another object for the tax rate, because tax rates are not one of an item's responsibilities. But when writing unit tests for an object is horrendously difficult, that's a sign that its design, or the design of the whole set of objects, should be reexamined.
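A hypothetical sketch of that tax-rate collaboration (every name here is made up): the item delegates the rate lookup to a collaborator handed in through its constructor, so a unit test can substitute a canned rate for a real tax-table service.

```java
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class SalesItemTest {

    // Collaborator that owns the tax-rate responsibility.
    interface TaxRateSource {
        /** Tax rate (e.g. 0.07 for 7%) for a jurisdiction code. */
        double rateFor(String jurisdiction);
    }

    static class SalesItem {
        private final long priceInCents;
        private final TaxRateSource taxRates;

        SalesItem(long priceInCents, TaxRateSource taxRates) {
            this.priceInCents = priceInCents;
            this.taxRates = taxRates;
        }

        /** Price including tax; the rate lookup is not the item's job. */
        long priceWithTaxInCents(String jurisdiction) {
            return Math.round(priceInCents * (1.0 + taxRates.rateFor(jurisdiction)));
        }
    }

    @Test
    public void addsTaxUsingTheCollaboratorsRate() {
        // A canned fake stands in for whatever real tax service exists.
        TaxRateSource sevenPercentEverywhere = new TaxRateSource() {
            public double rateFor(String jurisdiction) { return 0.07; }
        };
        SalesItem item = new SalesItem(1000, sevenPercentEverywhere);
        assertEquals(1070, item.priceWithTaxInCents("US-OH"));
    }
}
```

Handing the collaborator in, rather than having the item construct or look it up, is also what keeps the test free of any real tax-table setup.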

For, considered as one more design constraint, greater "testability" encourages:
  • cohesion - limiting each object to a fixed and bounded purpose
  • loose coupling - limiting brittle dependencies between objects
  • the Law of Demeter - limiting the number of objects an object interacts with directly
  • referential transparency - limiting the tendency for methods to rely on a tangled web of tedious-to-establish object states
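To make the referential-transparency point concrete, here's a contrived before-and-after sketch with invented names: the pure version computes its answer from its parameters alone, so a test can call it with literal values instead of first arranging object state through a chain of setters.

```java
public class ShippingExample {

    // Harder to test: the answer depends on mutable state that every test
    // must remember to arrange (and reset) before calling quote().
    static class StatefulShippingCalculator {
        private double weightKg;
        private String destinationZone;

        void setWeightKg(double weightKg) { this.weightKg = weightKg; }
        void setDestinationZone(String zone) { this.destinationZone = zone; }

        double quote() {
            double perKg = "remote".equals(destinationZone) ? 9.0 : 4.0;
            return weightKg * perKg;
        }
    }

    // Easier to test: the same arithmetic as a pure function of its inputs.
    static double quote(double weightKg, boolean remoteZone) {
        double perKg = remoteZone ? 9.0 : 4.0;
        return weightKg * perKg;
    }

    public static void main(String[] args) {
        // One-line check, no setup ceremony required.
        System.out.println(quote(2.5, false)); // 10.0
    }
}
```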

However, like other design constraints, testability can be detrimental when misinterpreted or carried to uncalled-for extremes. This list of downsides is more applicable to statically-typed source code that strictly enforces encapsulation (although dynamic typing is no excuse for shoddy object design, of course).
  • Exhibitionist getters and setters. The abuse of getter and setter methods is one of the evergreen blog debates. In the context of pursuing testability, inappropriate getters and setters creep in because setters make object setup less of a hassle and getters make verifying a test result less of a hassle. One of the key guidelines to remember when writing unit tests is that, as much as is feasible, the test should treat the object the same way actual client code does, so that the test checks scenarios that matter. Would actual code reach down through the object's throat in order to get a drink? The true problem, as always, is that every access point an object publicly exposes is by definition part of its (implicit) interface. Details that an object doesn't hide now have the potential to cause cascading maintenance headaches when the details change later. (Please realize that this isn't an attack against dependency injection. Getters and setters for systemic "service" collaborators, abstracted behind interfaces, are different from getters and setters for the object's internal data.)
  • Interface explosion. Since unit tests are meant for repetition, side effects are undesirable. A foolproof avoidance technique is to store objects with side effects as interface types and substitute fakes behind the interfaces during tests (this is also helpful for integration tests). The tradeoff is that interface types added purely for the sake of testing enlarge the code's size and complexity without enabling the code to do more (dynamic typing cheerleaders would say this is true of all static types). Although switching to an interface hypothetically prepares the code for the class to be swapped out for another, few programmers (certainly not me) have the rare ability to foretell exactly what common elements belong in an eternal interface. Moreover, the extraneous types, indirection, etc. conflict with the principles of avoiding big up-front design and gold plating (stuff that's good but possibly inconsequential). Programming to an interface instead of an implementation is good practice if an object must accommodate frequent or dynamic replacements of its fellow objects, but this is not always the case. At this time my preference is a different strategy known as "extract and override": extract just the code that causes side effects, then override that code with a stub in a "testing version" subclass that matches the actual class as closely as feasible (similar to the Template Method OO design pattern); see the sketch after this list.
  • Ugly factories. Where many interfaces are present, factory objects may be nearby, hence testability can also lead to more factories. I appreciate the decoupling that factories make possible. I recognize that factories are sometimes essential - for example, when the code needs some unknown concrete instance to fulfill an interface, or when an unfortunately-written object is a chore to initialize. But I dislike factories. It feels icky and hacky to create a concrete object via a different object rather than a constructor. When objects don't have sufficient contextual information to create the right helper objects - since the more an object must know about its environment to function, the less easily it can be reused - my preference is handing those objects in (through a constructor or setter method). I'll admit there's a limit to pushing out the burden of object creation, because the objects must be selected and instantiated somewhere. Centralizing object creation in an overarching Factory or Configuration singleton is a simple option, but it also defeats modularity. A few uses of the Abstract Factory OO design pattern could be a happy medium between factory-per-object and factory-for-all.
  • Bureaucratizing the simple. The balancing of design constraints is a cruel problem in which a compromise does what a compromise does: leaves everyone a little disappointed. After successfully dicing a jumble of concerns into testable object atoms that don't overreach, the way to accomplish a useful requirement is to...assemble an application from atoms. That's an exaggerated negative perspective, but bloggers have been making or insinuating similar remarks about the standard Java APIs for a while (others might scoff, but we understand the value of writing code against standardized stream and reader abstractions, as well as the necessity of not confusing bytes and characters in the Unicode world [Earth]). Like a UI, an API object design should have its knobs and buttons laid out understandably. Typical activities belong front and center, with unobtrusive advanced flexibility/extensibility in the corners for heavy-duty users who know what they need. In terms of test coverage, convenience methods that straddle multiple objects are fine on the condition that the methods do nothing but delegate to fine-grained, unit-tested objects.
  • Library insulation. Libraries complicate unit tests. Closed-source libraries can't be modified for greater testability, and the cost of modifying open-source libraries for testability is often significant. Insulating the code from the libraries with layer upon layer of interfaces, just to fake the libraries during tests, seems like an overreaction. On the other hand, it's also wasteful to write unit tests that, truth be told, mostly exercise a mature, vetted library and only slightly exercise the new custom code that calls it. Once again, my inclination is to extract the code that needs testing from the library code that doesn't, and to minimize the untested "glue" code that actually bridges the two (like the rule that the Model and View of MVC can each be intricate, but not the code in between).
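As promised, a minimal sketch of the "extract and override" strategy from the interface-explosion bullet. The ReportMailer class, its methods, and the mail-sending scenario are all invented for illustration; the real deliver() would hold whatever side effect needs to be kept out of the unit test.

```java
import org.junit.Test;
import static org.junit.Assert.assertEquals;

// Class under test: the side effect is extracted into one overridable method.
class ReportMailer {

    /** Builds the daily report and hands it off for delivery. */
    public String sendDailyReport(String recipient, int orderCount) {
        String body = "Orders processed today: " + orderCount;
        deliver(recipient, body);            // the only side effect
        return body;
    }

    /** Extracted side effect: the real version would talk to a mail server. */
    protected void deliver(String recipient, String body) {
        throw new UnsupportedOperationException("real delivery omitted from this sketch");
    }
}

public class ReportMailerTest {

    // "Testing version" subclass: matches the real class except for the side effect.
    static class RecordingMailer extends ReportMailer {
        String lastRecipient;
        String lastBody;

        @Override
        protected void deliver(String recipient, String body) {
            lastRecipient = recipient;       // record instead of sending
            lastBody = body;
        }
    }

    @Test
    public void buildsTheBodyAndHandsItToDelivery() {
        RecordingMailer mailer = new RecordingMailer();
        mailer.sendDailyReport("ops@example.com", 42);
        assertEquals("ops@example.com", mailer.lastRecipient);
        assertEquals("Orders processed today: 42", mailer.lastBody);
    }
}
```

The testing subclass changes nothing except the extracted side effect, so the test still exercises the real sendDailyReport() logic.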
I've read that the intrusiveness of unit testing on design is meant to be embraced. But I'm sure no one is suggesting unit testing at the expense of an uncluttered, understandable, maintainable design.