Friday, November 16, 2007

has MVC in webapps been over-emphasized?

Most of the prior entries' titles don't contain a question mark. This one does, because I've become torn on this topic. As I understand it, MVC (Model-View-Controller) refers to the interface design principle of preserving code/implementation separation between the data (Model), the appearance or rendering (View), and the processing of user interactions (Controller). MVC in webapps typically means...
  • The Model is the data structures and data processing. Ultimately speaking, the Model consists of the information manipulation which is the prime purpose of the webapp, even if it merely accomplishes database lookup.
  • The View is the components, templates, and formatting code that somehow transforms the Model into Web-standard output to be sent to the client.
  • The Controller processes HTTP requests by running the relevant 'action' code, which interacts with the Model and the View. Part of the Controller is a pre-built framework, since the Controller's tasks, like parsing the URL, are so common from webapp to webapp.
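
To make the three roles concrete, here is a minimal sketch in Python rather than any particular framework. Every name in it (ARTICLES, find_article, render_article, show_article, ROUTES, handle_request) is my own invention for illustration, and the in-memory dictionary merely stands in for a database lookup.

```python
from html import escape
from urllib.parse import urlparse, parse_qs

# Model: the data and its processing (a dict stands in for a database here).
ARTICLES = {1: "MVC basics", 2: "Templates"}

def find_article(article_id):
    return ARTICLES.get(article_id)

# View: transforms Model data into Web-standard output.
def render_article(title):
    return f"<h1>{escape(title)}</h1>"

def render_not_found():
    return "<h1>Not found</h1>"

# Controller, part 1: the 'action' code, which talks to the Model
# and then picks a View.
def show_article(params):
    raw_id = params.get("id", [""])[0]
    title = find_article(int(raw_id)) if raw_id.isdecimal() else None
    if title is None:
        return 404, render_not_found()
    return 200, render_article(title)

# Controller, part 2: a tiny stand-in for the pre-built framework piece
# that parses the URL and dispatches to the relevant action.
ROUTES = {"/article": show_article}

def handle_request(url):
    parsed = urlparse(url)
    action = ROUTES.get(parsed.path)
    if action is None:
        return 404, render_not_found()
    return action(parse_qs(parsed.query))
```

Calling handle_request("/article?id=1") returns a status code and the rendered page as a pair; an unknown path or id falls through to the not-found View.
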
Post-Struts, MVC may seem somewhat obvious. However, speaking as someone who has had the joy of rewriting an old webapp, no more than a jumble of .jsp and some simplistic "beans", to run on a properly-organized MVC framework, the benefit of MVC is palpable. I'm sure the difference shows itself just as strongly in, say, the comparison between Perl CGI scripts written before the Internet bubble and recent Perl modules that use Catalyst. (As an aside, I can't stop myself from remarking that well-written and quite-maintainable Perl does exist; I've seen it.)

Nevertheless, I wonder if the zeal for MVC has misled some people (me included) into the fool's bargain of trading a minor uptick in readability/propriety for a substantial drag in productivity. For instance, consider the many template solutions for implementing the View in JEE. "Crossing borders: Web development strategies in dynamically typed languages" serves as a minimal overview of some alternative approaches. I firmly believe that embedding display-oriented, output-generation code into a template is better than trying to avoid it at all costs through "code-like" tags; markup is a dreadful programming language. Any CFML backers in the audience? I don't mind excellent component tag libraries that save time, but using a "foreach" tag in preference to a foreach statement just seems wonky.
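
As a rough illustration of what I mean by display-oriented code, here is a sketch in Python where the "template" is simply a function and the loop is an ordinary for statement. The function name render_item_list is hypothetical, and a real template engine would look different; the point is only that the iteration lives in a real programming language instead of a tag.

```python
from html import escape

def render_item_list(items):
    """Display-only code: the loop is a plain for statement, not a markup tag."""
    rows = []
    for item in items:
        # Escaping is a display concern, so it belongs here in the View.
        rows.append(f"  <li>{escape(item)}</li>")
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"
```

Nothing in this function touches the data source or decides what to show; it only decides how to show it, which is the separation that matters.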

Indeed, it is true that including code inside a template may make it harder to distinguish static from dynamic content, and the burden of switching between mental parsing of HTML and programming language X falls upon the reader. Developers won't mind much, but what about the guy with the Queer Eye for Visual Design, whose job is to make the pages look phat? In my experience, the workflow between graphic artist and web developer is never as easy as everyone wishes. Page templates don't round-trip well, no matter how the template was originally composed in the chosen View technology. A similar phenomenon arises for new output formats for the same data. When one wishes to produce a syndication feed in addition to a display page, the amount of intermingled code will seem like a minor concern relative to the scale of the entire template conversion.

Consider another example of possibly-overdone MVC practices, this time for the Model rather than the View. One Model approach that has grown in popularity is ActiveRecord. Recently my dzone.com feed showed a kerfuffle over the question of whether or not ActiveRecord smells like OO spirit. The resulting rumble of the echo chamber stemmed more from confusion over definitions than irreconcilable differences, I think; in any case, I hope I grasp the main thrust of the original blog entry, despite the unfortunate choice of using data structures as a metaphor for ActiveRecord objects. Quick summary, with a preemptive apology if I mess this up:
  • In imperative-style programming, a program has a set of data structures of arbitrary complexity and a set of subroutines. The subroutines act on the data structures to get the work done.
  • In object-oriented-style programming, a program has a set of (reusable) objects. Objects are units having data structures and subroutines. As the program runs, objects send messages to get the work done. Objects don't interact or share data except by the messages. The advantage is that now a particular object's implementation can be anything, such as an instance of a descendant class, as long as the messages are still handled.
  • Since ActiveRecord objects represent a database record by freely exposing data, other code in the program can treat the ActiveRecord like an imperative data structure. To quote the blog: "Almost all ActiveRecord derivatives export the database columns through accessors and mutators. Indeed, the Active Record is meant to be used like a data structure." You know how people sometimes complain about objects having public "getters" and "setters" even when the getters/setters aren't strictly necessary for the object to fulfill its responsibility? Roughly the same complaint here.
  • The blog entry ends by reiterating that ActiveRecord is still a handy technique, as long as its messy lack of information hiding (leaky encapsulation) is isolated from other code to 1) avoid the creation of brittle dependencies on the associated table's schema, and 2) avoid the database design dictating the application design.
In my opinion, the real debate here is not if ActiveRecord objects break encapsulation and thereby aren't shining examples of OO (they do, so they aren't), but the degree to which anyone should care in any particular case. Rigidly-followed MVC, which aims to divide Model from View, would tend to encourage interfaces that prevent the Model data source, i.e. a database or remote service or file, from affecting the View. The result is layers of abstraction.
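
A small sketch may help show the complaint and the suggested isolation side by side. It is written in Python, not in any real ActiveRecord library, and the class names, column names, and the overdraft rule are all assumptions of mine for the sake of the example.

```python
class AccountRecord:
    """ActiveRecord-style object: the table's columns are freely exposed."""
    def __init__(self, balance_cents, overdraft_limit_cents):
        self.balance_cents = balance_cents
        self.overdraft_limit_cents = overdraft_limit_cents


def can_withdraw_from_record(record, amount_cents):
    # Imperative style: outside code reads the columns directly, so it
    # depends on the schema (two column names) to get its work done.
    return record.balance_cents + record.overdraft_limit_cents >= amount_cents


class Account:
    """Object style: callers send a message; the record stays hidden inside."""
    def __init__(self, record):
        self._record = record

    def can_withdraw(self, amount_cents):
        available = self._record.balance_cents + self._record.overdraft_limit_cents
        return available >= amount_cents
```

If the table's columns are renamed, only Account needs to change; every caller of can_withdraw_from_record would have to be found and fixed. Whether that insurance is worth the extra class is exactly the judgment call in question.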

In contrast, I'm steadily less convinced of the need for doing that. The number of times I've 1) completely swapped out the Model data source, or 2) significantly tweaked the Model without also somehow exposing that tweak in the View, is minuscule. Similarly, the majority of webapps I work on are almost all View, anyway. Therefore, as long as the View is separate from the Model, meaning no program entity executing both Model and View responsibilities, inserting an abstraction or indirection layer between the two probably is unnecessary extra work and complexity. When the coupling chiefly consists of feeding a dynamic data structure, like a list of hashes of Objects, into the View, the weakness of the contract implies that the Model could change in numerous ways without breaking the call--the parameter is its own abstraction layer. Of course, that flexibility comes at the cost of the View needing to "know" more about its argument to successfully use it. In effect, that information has shifted from the signature of the call to the body of the code being called.
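
Here is a sketch of that weak contract in Python, with a list of dicts playing the part of the list of hashes. The function render_table and the sample rows are made up for illustration.

```python
from html import escape

def render_table(rows, columns):
    """View: accepts any list of dicts and shows only the columns it knows."""
    lines = ["<table>"]
    for row in rows:
        # The View's knowledge of the data lives here, in the body,
        # as key names; a missing key renders as an empty cell.
        cells = "".join(
            f"<td>{escape(str(row.get(col, '')))}</td>" for col in columns
        )
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)

# The Model grows an extra field; the call into the View does not change.
old_rows = [{"name": "Ann", "city": "Oslo"}]
new_rows = [{"name": "Ann", "city": "Oslo", "email": "ann@example.org"}]
```

Both render_table(old_rows, ["name", "city"]) and render_table(new_rows, ["name", "city"]) produce the same markup, so the added field breaks nothing. The flip side is visible too: rename "city" in the Model and the View quietly renders empty cells instead of failing loudly.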

Lastly, after retooling in ASP.Net, the importance of a separate Controller has also decreased for me. I can appreciate the necessity of subtler URL-to-code mapping, and the capability to configure it on the application level in addition to the web server level. Yet the concept of each dynamic server page consisting of a page template and an event-driven code-behind class is breathtakingly simple and potent as a unifying principle. Moreover, the very loose MVC enforcement in ASP.Net (although explicit MVC is coming) is counterbalanced by the strong focus on components. To be reusable across Views, components must not be tied too closely to a specific Model or Controller. The Controller code for handling a post-back or other events is assigned to the relevant individual components as delegates. The Model...er, the Model may be the weak point, until the .Net Entity Framework comes around. As with any other webapp framework, developers should have the discipline to separate out the bulk of the nitty-gritty information processing from the page when it makes sense--fortunately, if that code is to be reused in multiple pages, the awkwardness acts as a nudge.
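
The shape of that idea can be sketched outside of ASP.Net. The following Python is only an analogy under my own assumptions: Button, GreetingPage, on_click, and fire_click are hypothetical names, not the ASP.Net API, and fire_click stands in for whatever the framework does when a post-back arrives.

```python
class Button:
    """Reusable component: it knows nothing about any Model or page."""
    def __init__(self, label):
        self.label = label
        self.on_click = None  # delegate slot, filled in by the owning page

    def render(self):
        return f"<button>{self.label}</button>"

    def fire_click(self):
        # Stand-in for the framework raising the event on a post-back.
        if self.on_click is not None:
            self.on_click(self)


class GreetingPage:
    """Code-behind analogue: the page wires its handlers to its components."""
    def __init__(self):
        self.message = ""
        self.button = Button("Greet")
        self.button.on_click = self.handle_greet  # Controller code as a delegate

    def handle_greet(self, sender):
        self.message = f"Clicked {sender.label}"

    def render(self):
        return f"<p>{self.message}</p>{self.button.render()}"
```

The Controller has not vanished here; it has been split into small handlers that live next to the components they serve, while Button stays reusable on any other page.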

The practical upshot is to use MVC to separate the concerns of the webapp, while ensuring that separation is proportionate to the gain.

3 comments:

  1. good read, thanks. I feel choosing between MVC or not depends on the complexity of the envisioned site. If you are making it quick and dirty, forget MVC. If you will be working on the site for years and it will become more a "web app" than a web site, MVC with its inherent TDD will save much time in the future.

    I just left a job where the code was ASP.NET still using Response.Write() and dynamic includes (includes within a loop). It was pure hell. Templates, MVC and C# 3.0 are making me look forward to programming again.

  2. It's all about context and maintainability of the code.

    Applications with 50+ developers need the logical breakdown. Your 15-year-old kid pumping out PHP sites from his room in the middle of the night does not. (Come to think of it, he should go outside and play.)

    Of course, MVC can be overdone, but that mostly happens on first-time projects, or when the whole concept is not understood. I mean, when MVC became popular for web enterprise applications, even I wanted to write "Hello, World!" apps with perfectly defined layers.

    You summarized the controller as "The Controller processes HTTP requests by running the relevant 'action' code, which interacts with the Model and the View. Part of the Controller is a pre-built framework, since the Controller's tasks, like parsing the URL, are so common from webapp to webapp."

    It is, but maybe I'm missing something from your explanation. For example, the controller is what controls the view. In other words, it not only interacts with the back-end but knows (or makes decisions on) which view to display depending on the results of your business logic. But maybe that's what you mean by "interacts with the Model and the View."

  3. Yes, as I have read, the Controller is any "middle-man" code that does work on the Model, selects the View, and passes the Model to the View. The Controller facilitates two kinds of separation: 1) the View code doesn't do work on the Model (but only reads it), 2) the Model code doesn't use any View code. The "action" code is also part of the Controller. In frameworks that aren't Action-based, the Controller is harder to distinguish.

    I don't argue the complexity/app size/team size factor in making choices about MVC (to reiterate: use MVC in proportion to the gain), but I would note that sites which start out "quick and dirty" have a tendency to cling to life and gradually become behemoths.
