A sure sign of a dysfunctional object-oriented design is the nudist community anti-pattern. Its defining characteristic is that the design's important objects lack the mutual data privacy that true encapsulation yields. (Still more care is required when working with an implementation that relies primarily on conventions to enforce data hiding.)
At first glance, the information has been properly separated and gathered into individual objects with distinct responsibilities. In practice, however, each object performs its single responsibility by directly requesting additional data from its fellow objects. Hence, when the shape of the data inevitably changes, not only must the responsible object change but so must every other object that asks it directly for that data.
The remedy is to revise the nature of the interactions between objects to better respect individual privacy. Objects shouldn't need to openly publish members to accomplish tasks, but instead should send structured messages that demurely ask an object to produce a necessary sub-goal or intermediate result. If it's difficult to decide which object should be responsible for a role, that could be an indication that the design needs refactoring because requirements have changed.
When an object simply must have information from another, the package of information can still be suitably wrapped for handling by the outsider object, e.g. limiting the exposure to part of the data rather than all of it, or restricting access to "look, don't touch". Also, by putting the information in a "box" before handing it over, the interaction as a whole can be kept loose, because each participant need only depend on a shared abstraction with semantic meaning (cf. "dependency inversion").
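As a rough sketch of the contrast (hypothetical Invoice and TaxPolicy names, nothing from a real codebase), compare an object that asks its collaborator for an intermediate result with one that would pull out the raw rate data and combine it itself:

using System;

public class TaxPolicy
{
    // The "nudist" version would expose baseRate and surcharge publicly
    // and let every caller combine them on its own.
    private decimal baseRate = 0.05m;
    private decimal surcharge = 0.01m;

    // Encapsulated version: callers send a message asking for the intermediate result.
    public decimal TaxFor(decimal subtotal)
    {
        return subtotal * (baseRate + surcharge);
    }
}

public class Invoice
{
    private readonly TaxPolicy policy = new TaxPolicy();

    public decimal Total(decimal subtotal)
    {
        // If the shape of the rate data changes, only TaxPolicy changes.
        return subtotal + policy.TaxFor(subtotal);
    }
}

With these made-up rates, new Invoice().Total(100m) comes out to 106, and Invoice never learns how the tax was assembled.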
Friday, October 30, 2009
the unlikelihood of murderous AI
Sometimes suspension of disbelief is difficult. For me, a murderous AI in a story can be one of those obstacles. By murderous AI I don't mean an AI that's designed to kill but malfunctions. I mean an AI that's designed for other purposes but later decides to murder humans instead. It seems so unlikely for a number of reasons.
- Domain. An AI is created for a particular purpose. In order for this purpose to be useful to people and not subject to "scope creep", it probably isn't "imitating a person as completely as possible". Moreover, in order to be still more useful for that circumscribed purpose, expect the AI to be mostly filled with expert knowledge for that purpose rather than a smattering of wide-ranging knowledge. (The AI might be capable of a convincing conversation, but it's likely to be a boring one!) For yet greater cost-effectiveness, the AI's perceptual and categorization processes would be restricted to its purpose as well. For instance, it might not need human-like vision or hearing or touch sensations and so its "world" could be as alien to us as a canine's advanced sense of smell. A data-processing AI could have no traditional senses at all but instead a "feel for datasets". It might not require a mental representation of a "human". Within the confines of these little AI domains, there's unlikely to be a decision to "murder" because the AI doesn't have the "building blocks" for the mere concept of it.
- Goals. Closely connected to its domain, the goals of an AI are likely to be clearly defined in relation to its purpose. The AI has motivations, desires, and emotions to the extent that its goals guide its thoughts. It'd be counterproductive for the AI to be too flexible about its goals or for its emotions to be free-floating and chaotic. It's also odd to assume that an AI would need to have a set of drives identical to that produced in humans over eons of brain evolution (to quote the murderous AI's designer's wail "Why, why did I choose to include paranoid tendencies and overwhelming aggression?...").
- Mortality. Frankly, mortality is a formidably advanced idea because direct experience of it is impossible. Plus, any degree of indirect understanding, i.e. observing the mortality of other entities, is closely tied to knowledge of and empathy for the observed entity. An AI needs to be exceedingly gifted to infer its own risk of mortality and/or effectively carry out a rampage. Even the quite logical conclusion "the most comprehensive method of self-defense is killing everything else" can't be executed without the AI figuring out how to 1) distinguish animate from inanimate for a given being and 2) convert that being from one state to the other. An incomplete grasp of the topic could lead to a death definition of "quiet and still", in which case "playing dead" is an excellent defensive strategy. Would the AI try to "kill" a ringing mobile phone by dropping it, assess success when the ringing stops, and diagnose the phone as "undead" when it rings again the next time the caller tries?
- Fail-safe/Dependency/Fallibility. Many people think of this sooner or later when they encounter a story about a murderous AI. Countless devices much simpler than AI have a "fail-safe" mechanism in which operation immediately halts in response to a potentially-dangerous condition. And without a fail-safe, a device still has unavoidable dependencies such as an energy supply, and to remove those dependencies would incapacitate or break it. A third possibility is inherent weaknesses in its intelligence. The story's AI builder must know his or her own work intimately and therefore know a great number of its points of failure or productive paths of argumentation. Admittedly, the murderous AI in the story could be said to overwrite its software or remove pieces of its own hardware to circumvent fail-safes, but if a device can run without the fail-safe enabled then the fail-safe needed more consideration.
Thursday, October 08, 2009
LINQ has your Schwartzian transform right here...
I was looking around for how to do a decorate-sort-undecorate in .Net. Eventually I realized (yeah, yeah, sometimes I'm slow to catch on) that the LINQ "orderby" clause actually makes it incredibly easy. For instance, if you had some table rows from a database query that you needed to sort according to how the values of one of the columns are ordered in an arbitrary external collection...
var sortedRows = from rw in queryResults.AsEnumerable()
                 orderby externalCollection.IndexOf((string) rw["columnName"])
                 select rw;
Oh brave new world, that has such query clauses in it! Back when I first read about the Schwartzian transform in Perl, with its hot "map-on-map" action, it took a little while for me to decipher it (my education up to that point had been almost entirely in the imperative paradigm with some elementary OO dashed in).
Between this and not having to learn about pointers, either, programmers who start out today have it too easy. Toss 'em in the C, I say...
UPDATE: Old news. Obviously, others have noticed this long before, and more importantly have taken the steps of empirically confirming that LINQ does a Schwartzian under the covers. (I had just assumed it did, because my mental model of LINQ is of clauses transforming sequences into new sequences, never doing things in-place.)
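For comparison, here is the same ordering with the decorate-sort-undecorate steps spelled out as explicit operators; this is just a sketch of the equivalent pipeline over the same queryResults and externalCollection, not a claim about what the orderby clause compiles to.

var sortedRows = queryResults.AsEnumerable()
    .Select(rw => new { Row = rw,                      // decorate: cache the sort key per row
                        Key = externalCollection.IndexOf((string) rw["columnName"]) })
    .OrderBy(pair => pair.Key)                         // sort on the cached key
    .Select(pair => pair.Row);                         // undecorate: keep only the row

The point of the decoration is that IndexOf runs once per row instead of once per comparison.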
Wednesday, August 12, 2009
the importance of syntax sugar, ctd.
A while ago I ranted that the common term "syntax sugar" minimizes the practical importance of programming language syntax. One example is when people sometimes state (or imply) that their preference for dynamic languages is tied to the languages' sweeter syntax.
While that may be true, it's a mistake to assume that the saccharine in the syntax is dependent on a language's dynamic typing. This point is amply demonstrated for those who've been keeping up with the development of C#, a static language whose syntax is sufficiently honeyed that the below snippet is remarkably comparable to the Javascript version after it. I tried to match up the code as much as possible to make the comparison easier.
C#
using System;                       // Action<T>
using System.Collections.Generic;   // List<T>

public class myClass
{
    public string myProp { get; set; }

    static void Main(string[] args)
    {
        var myc = new myClass() { myProp = "English" };
        myc.manhandleStr();
        System.Console.Out.WriteLine(myc.myProp);
    }
}

public static class ExtMethodClass
{
    public static void manhandleStr(this myClass cl)
    {
        var palaver = new
        {
            myName = "Nym",
            myTaskList = new List<Action<string>>
            {
                next => { cl.myProp = next + ".com"; },
                next => { cl.myProp = next.Replace("l","r").ToLower(); }
            }
        };
        palaver.myTaskList.ForEach(task => task(cl.myProp));
    }
}
Javascript
var myc =
{
    myProp: "English"
};

myc.manhandleStr = function() {
    var palaver =
    {
        myName: "Nym",
        myTaskList:
        [
            function(next) { myc.myProp = next + ".com"; },
            function(next) { myc.myProp = next.replace("l","r").toLowerCase(); }
        ]
    };
    palaver.myTaskList.forEach(function(task) { task(myc.myProp); });
};

myc.manhandleStr();
alert(myc.myProp);
By the way, other C# programmers might scold you for writing code like this. And yes, I realize that fundamentally (i.e. semantically rather than superficially) the dynamic Javascript object "myc" differs from the static C# instance "myc". Stay tuned for the "dynamic" keyword that's on the way in C# for duck-typing variables going to and from CLR dynamic languages.
Saturday, August 08, 2009
please state the nature of the medical emergency
The words of the title echo through my mind every time I read some blog or other about the efficient market hypothesis. And if the blog analyzes the Reconstruction Finance Corporation then I start thinking about Internet protocol definitions. Even if its topic is just purchasing power parity, I hear a faint sequence of whirs, clicks, drones, and static.
Monday, July 27, 2009
the most incredible spell in "Half-Blood Prince"
Early on in "Harry Potter and the Half-Blood Prince", the wizard Dumbledore enters a room which is in disarray - items wrecked and disorganized, walls and ceiling missing pieces - and waggles his trusty magic wand. Everything is restored, repaired, and repositioned back to what appears to be the original state of the room. (It trounces Mary Poppins' finger-snap trick.)
This has to be a contender for the most incredible spell in the entire movie. Illumination on demand, movement controlled from a distance, and fire-throwing are great and all, but aren't that fantastic, relatively speaking, from the standpoint of the current level of technological progress. The rollback of the room is more astounding because it reduces entropy without expending energy. Maxwell's demon, thy name is Dumbledore (and you have our condolences).
There's no physical obstacle to roughly accomplishing part of the effect of this Entropy Reversal Spell. Sure, if an automobile crashed into my living room (just posit that I exist in a sitcom and the probability shoots way up!), I or more likely a contracted team of workers could remove the rubble and rebuild the wall to an approximation of its former glory. The energy (and currency) to do so would far exceed zero even apart from questions of efficiency, so reducing the entropy level here would merely increase the entropy level somewhere else.
Instant home repair would still be inferior to the Entropy Reversal Spell, though. To exactly reverse the breaking of a solid or the spill of a liquid first requires the retrieval of all the involved atoms and molecules and then the restoration of the orientations and interactions between them. Assuming nanotechnology or another technique is up to the task, one of its necessary inputs is an information representation of all those mutual orientations and interactions. Unfortunately, this is a clear case of combinatorial explosion, with the accompanying space and time problems inherent to the data processing. (It'd be a good opportunity to try out your quantum computer.)
Of course, the possible uses for the Entropy Reversal Spell extend far beyond perfect housekeeping. For instance, imagine that an engine - a Carnot heat engine, say - converts a temperature change into usable work. Then, at the tail end of each cycle through the casting of the magic of entropy reversal, the atoms could be restored without energy cost from the less-useful low-temperature state back to the handier high-temperature state. Presto! No more fuel necessary for all time and no harmful emissions, either.
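In textbook terms (standard relations, nothing from the film): a Carnot engine drawing heat $Q_h$ from a reservoir at temperature $T_h$ and exhausting $Q_c$ to one at $T_c$ delivers

$$W = Q_h - Q_c, \qquad \frac{Q_c}{T_c} = \frac{Q_h}{T_h} \;\Rightarrow\; W = Q_h\left(1 - \frac{T_c}{T_h}\right),$$

and the exhaust deposits entropy $Q_c / T_c$ in the cold reservoir every cycle. The spell's free reset amounts to moving $Q_c$ back to the hot side with no work input, a total entropy change of $Q_c/T_h - Q_c/T_c < 0$, which is exactly the sign the second law forbids - hence the perpetual motion machine.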
But perpetual motion machines are still thinking too small. Consider the general notion of reversibility. Rolling a pen across my cubicle counter is quite reversible without trouble; just tap it with a finger to move it a few inches then tap it with a different finger to move it back. Putting the pen back on the counter after it falls on the floor takes more work and energy, both because it's a longer distance and because it's working against gravitational force (yeah, yeah, call me lazy for complaining). When the pen falls it moves from a higher (potential) energy state to a lower - the entropy increases and it's more work to reverse. Generally, the greater the entropy change of any occurrence, the harder it is (more work and energy) to reverse, and in any case the reversal ends up increasing overall entropy and energy loss still more because the reversal's energy-efficiency isn't perfect. In contrast, with an Entropy Reversal Spell, entropy isn't a factor anymore and theoretically any occurrence can be reversed without significant penalty. Feeling older all the time? Reverse it and forget about it. Father Time can suck it!
Although entropy reversal is a nifty feat when one can manage it, teleportation should at least get an honorable mention. Yet it's not quite as awe-inspiring since all one needs to do is somehow get all the information describing the object, transfer it elsewhere, and use the information to reproduce the original. Note that you'll need to get your hands on a Heisenberg compensator (find the details online).
Oh yes, I am a nerd.
Monday, June 29, 2009
DVCS and good merges
While it may be true that a DVCS (i.e. multiple repositories) implies good support for branching and merging - else the DVCS would be a nightmare to use - it's incorrect to conclude that good support for branching and merging implies that someone must be using a DVCS (A -> B does not mean B -> A). Decentralization, or support for multiple repositories, is a separate question from whether the version control system has "dumb" merging. The benefits of easier collaboration (pulling and pushing work-in-progress to each other without committing to the "canonical" repo) and offline work are more honest justifications for DVCS than mere good merges.
Monday, June 22, 2009
my favorite word for today is complexify
It ends in "-fy", it's a neat opposite to "simplify", and it's satisfying to have the word itself be a bit cumbersome to say and understand. Call a consultant today to help you complexify your infrastructure!
Friday, June 12, 2009
I'm GLAD I'm not programming in natural human language
At least one of the hard and undeniable facts of software development doesn't really swipe a developer across the face until he or she starts work: the difficulty of extracting good information from people. Someone may say "the bill is a set rate per unit" but a mere five minutes later, after further interrogation, is finally coaxed into saying "customers pay a flat rate for items in categories three and four". Similarly, the reliability of the words "never" or "always" should be closely scrutinized when spoken by people who aren't carefully literal. (On the other hand, xkcd has illustrated that literal responses are inappropriate in contexts like small talk...)
I'm convinced that this difficulty is partially caused by natural human language. It's too expressive. It isn't logically rigorous. By convention it supports multiple interpretations. While these attributes enable it to metaphorically branch out into new domains and handle ambiguous or incompletely-understood "analog" situations, the same attributes imply that it's too imprecise for ordering around a computing machine. Just as "I want a house with four bedrooms and two bathrooms" isn't a sufficient set of plans to build a house, "I want to track my inventory" isn't a sufficient set of software instructions to build a program (or even a basis on which to select one to buy).
Every time I perform analytical/requirements-gathering work, I'm reminded of why I doubt that natural human language will ever be practical for programming, and why I doubt that my job will become obsolete any time soon. I can envision what the programming would be like. In my head, the computer sounds like Majel Barrett.
Me: "Computer, I want to periodically download a file and compare the data it contains over time using a graph."
Computer: "Acknowledged. Download from where?"
Me: "I'll point at it with my mouse. There."
Computer: "Acknowledged. Define periodically."
Me: "Weekly."
Computer: "Acknowledged. What day of the week and what time of the day?"
Me: "Monday, 2 AM."
Computer: "Acknowledged. What if this time is missed?"
Me: "Download at the next available opportunity."
Computer: "Acknowledged."
Me: "No, wait, only download at the next available opportunity when the system load is below ___ ."
Computer: "Acknowledged. What if there is a failure to connect?"
Me: "Retry."
Computer: "Acknowledged. Retry until the connection succeeds?"
Me (getting huffy): "No! No more than three tries within an eight-hour interval."
Computer: "Acknowledged. Where is the file stored?"
Me: "Storage location ______ ."
Computer: "Acknowledged. What if the space is insufficient?"
Me: "Remove the least recent file."
Computer: "Acknowledged. What data is in the file?"
Me (now getting tired): "Here's a sample."
Computer: "Acknowledged. What is the time interval for the graph?"
Me: "The last thirty data points."
Computer: "Acknowledged. What is the color of the points? Does the graph contain gridlines? What are the graph dimensions? How will the graph be viewed?"
Me: "Oh, if only I had an analyst!"
Tuesday, June 09, 2009
your brain on CSPRNG
Not too long ago I used a cryptographically secure pseudo-random number generator API for assigning unique identifiers to users for requesting an unauthenticated but personalized semi-sensitive resource over the Internet. They already have the typical unique organizational IDs assigned to them in the internal databases, but these IDs are far from secret. As much as possible, the resource identifier/URL for a particular user had to be unpredictable and incalculable (and also had to represent sufficient bits such that a brute force attack is impractical, which is no big deal from a performance standpoint because these identifiers are only generated on request). Also, we've committed to the unauthenticated resource itself not containing any truly sensitive details - the user simply must log in to the actual web site to view those.
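A minimal sketch of that kind of identifier generation in C# - the class name, the 32-byte length, and the URL-safe encoding are my assumptions for illustration, not the project's actual code:

using System;
using System.Security.Cryptography;

static class ResourceTokens
{
    public static string NewToken()
    {
        var bytes = new byte[32];                     // 256 bits: far beyond practical brute force
        var rng = RandomNumberGenerator.Create();     // CSPRNG, unlike System.Random
        rng.GetBytes(bytes);

        // Base64, adjusted so the token can sit safely in a URL
        return Convert.ToBase64String(bytes).TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }
}

The important property is where the bytes come from; the encoding just keeps them printable.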
Anyhow, as my mind was drifting aimlessly before I got my coffee today, I wondered if the CSPRNG could be an analogy for the brain's creativity. Unlike a plain pseudo-random number generator that's seeded by a single value, the CSPRNG has many inputs of "entropy" or uncorrelated bits. The bits can come from any number of sources in the computer system itself, including the hardware, which is one reason why the CSPRNG is relatively slow at generating random numbers.
But like a CSPRNG algorithm, the brain is connected to myriad "inputs" all the time, streaming from within and without the body. Meanwhile, the brain is renowned for its incredible feats of creativity. It's common for people to remark "that was a random thought" and wonder "where that came from". Given all these entropic bits and the stunning network effects of the cortex (an implementation of "feedback units" that outshines any CSPRNG), should we be surprised that the brain achieves a pseudorandom state? I hesitate to call it true randomness in the way that someone who believes the brain relies on quantum-mechanical effects might; people who try to recite a "random sequence" tend to epically fail.
I'm not sure that this makes any real sense. I'm not suggesting that a CSPRNG is a suitable computational model for human thought. It's merely interesting to ponder. It's another reminder that just as software isn't a ghost-like nebulous presence that "inhabits" a computer - a microprocessor engineer would tell you that each software instruction is as real as the current that a pacemaker expels to regulate a heartbeat - our thoughts and minds are inseparable from the form or "hardware" of the brain, and the brain is inseparable from the nervous system, and the nervous system is inseparable from the rest of the body.
Tuesday, May 19, 2009
resist the temptation
Memo to the universe at large: resist the temptation to mention one or more of the words ["id","ego", "superego"] in discussions of the relationship among ["McCoy","Kirk","Spock"].
It's been done. To Death. Repeatedly.
Saturday, April 25, 2009
amarok 2 and "never played" playlists
Hey, Amarok, here's a tip: the 2.x series wouldn't be useless if I could order it to randomly append a never played track to the playlist...and have it work.
UPDATE (May 6): After switching to a more recent beta, I can obtain the behavior I want. And the "playlist layout" customization allows me to view the actual play count. Now I just wish that I could search my collection based on play count or last played date...
Monday, February 09, 2009
Haskell comprehension measured through WTF/min
The top compliment I can give to Real World Haskell is that it finally manages to teach me the aspects of Haskell programming that I previously assumed to be both impenetrably complicated and useless. As I read I'm also reminded of what it was like when I first tried to comprehend Haskell code. I've concluded that the clearest sign of greater Haskell comprehension is a noticeable drop in my WTF/min when I'm figuring out a given code example.
WTF/min is "WTFs per minute". According to a highly-linked picture, this unit is "the only valid measurement of code quality" and it's determined through code reviews. My initial experiences of Haskell definitely exhibited high WTF/min. The following are some of the past Haskell-related thoughts I can recall having at one time or another.
- Infinite lists like [1..]? WTF? Oh, lazy evaluation, right.
- Functions defined more than once? WTF? Oh, each declaration pattern matches on a different set of parameters. It's like method overloading.
- The underscore character has nothing to do with this problem domain but it's being matched against. WTF? Oh, it matches anything but discards the match.
- Why is the scoping operator "::" strewn throughout? WTF? Oh, it's being used for types, not scopes.
- Even a simple IO command like "putStrLn" has a type? WTF is "IO ()"? Oh, it's an expression with IO side-effects that evaluates to the value-that-is-not-a-value, ().
- WTF? What is this ubiquitous 'a' or 't' type everywhere? Oh, it's like the type parameters of generics or templates.
- Functions don't need "return" statements? WTF? Oh, all functions are expressions anyway.
- WTF is going on with these functions not being passed all their parameters at once? Oh, applying a function once produces another function that only needs the rest of the parameters. That'll be helpful for reusing the function in different contexts.
- This function definition doesn't have any parameters at all, and all it does is spit out the result from yet another function, a function that itself isn't being passed all the parameters it needs. WTF? Oh, "point-free" style.
- Now I understand all those -> in the types. But WTF is this extra => in front? Oh, it's sorta like a list of interfaces that must be met by the included types, so the code is tied to a minimal "contract" instead of a particular set of explicit types. That's good, but how would I set those up?...
- Ah, now this I'm sure I know. "class" and "instance" are easy. WTF?! How can that be it? Just more functions? Can't I store structured information anywhere? Oh, tuples or algebraic data types.
- I like the look of these algebraic data types with the "|" that I know from regular expressions. Unions and enums in one swell foop. WTF? How do I instantiate it? Oh, what appear to be constituent data types are actually constructor functions.
- After a value has been stuffed into the data type, how can my code possibly determine which data type constructor was used? WTF? Oh, just more pattern-matching.
- WTF? Record syntax for a data type declaration results in automatic accessor functions for any value, but we use these same function names when we're creating a new record value? Oh.
- I'm acquainted with map and filter. WTF are foldr and zip and intercalate? Oh, I'll need to look over the standard list functions reference.
- What's this "seq" sitting in the middle of the code and apparently doing jack? WTF for? Oh, to escape from laziness when needed.
- WTF? How can a function name follow its first argument or a binary operator precede its first argument? Oh, `backticks` and (parentheses).
- How come there's all these string escapes in the flow of code? W...T...F? Oh, lambda. Cute.
- I've always been told that Haskell is heavily functional and pure. WTF are these do-blocks, then? Oh, monads. Wait, what?
- Functor, Monoid, MonadPlus, WTF? Oh, more typeclasses whose definitions, like that of monads, enable highly generalized processing.
- A way to gain the effects of several monads at once is to use "transformers"? WTF? Oh, when a transformer implements the monad functions it also reuses the monad functions of a passed monad.
- Finally...I know that the ($) must be doing something. But what? Why use it? WTF? Oh, low-precedence function application (so one can put together the function and its arguments, then combine them).
Thursday, January 29, 2009
ravens at the intersection of logic and reality
The raven paradox presents an interesting question for anyone seeking to apply logic. It's also short and understandable. 1) The two implications "all ravens are black" and "anything that isn't black isn't a raven" are logically equivalent - either both false or both true. p -> q has the same truth table as ~q -> ~p. 2) A raven that is black is evidence for "all ravens are black". 3) Similar to 2, any object that isn't black and isn't a raven is evidence for "anything that isn't black isn't a raven". 4) Since the two propositions are logically equivalent (1), why wouldn't evidence for "anything that isn't black isn't a raven" (3) also be evidence for "all ravens are black" (2)? To summarize, how many green apples are required to convincingly support the proposition that all ravens are black? And isn't this question ridiculous?
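For point 1, the equivalence is quick to verify by exhausting the four cases:

p | q | p -> q | ~q -> ~p
T | T |   T    |    T
T | F |   F    |    F
F | T |   T    |    T
F | F |   T    |    T

The two right-hand columns never disagree, which is all that "logically equivalent" means here.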
In my opinion, the raven paradox is a matter of perspective. The Wikipedia article probably includes all of the following comments, stated differently. (If equations excite you, as usual the Wikipedia article won't disappoint in that department.)
- Logic works best as a closed, limited system in which a truth neither "appreciates" nor "decays". I like the analogy of a microscope; it's good for ensuring that nothing is overlooked in a small fixed domain but it's unsuited for usefully observing a big unbounded area. We pick out pieces of reality and then apply logic to those pieces. The choice of axioms is vitally important. Logic's utility is tied to the universality of its rules and conclusions. The specific meanings of its "p"s and "q"s are irrelevant to its functioning.
- Given any logical entity p, not-p (~p) is defined as the logical entity that is false whenever p is true and true whenever p is false. In the case of the raven paradox, p is "in the set of ravens" and ~p is "not in the set of ravens". q is "in the set of black" and ~q is "not in the set of black". If the "system" of these statements is all objects in the known universe, clearly ~p and ~q are huge sets in that system. But if the system of these statements is the collection of five doves and two ravens in a birdcage, isn't it more significant that five non-black birds aren't ravens than that two ravens are black? The (Bayesian) quantities matter. Some people downplay statistics because its formulas require assumptions about the source population and the randomness of samples, but it seems to me that a precise number calculated through known assumptions is still much better than an intuitive wide-ranging guess hampered by cognitive biases. When an entire population can't be measured, it's better to estimate and quantify the accompanying uncertainty probabilistically than to give up altogether. (A worked version of this point follows the list.)
- Yet another factor in the perception of the raven paradox is the difference in size not only between p and ~p (and q and ~q) but between the sets of p and q. There are many, many more members in "the set of black" than in "the set of ravens". Consider a more focused implication (regardless of its actual truth being rock-solid or not) like "grandfathers are older than 40". Here, the p is "grandfathers" and q is "older than 40". The overall system is people, not objects, and the sets are more stringent than colors and species. A person who is younger than 40 and not a grandfather makes one more likely to believe that all grandfathers are older than 40. For this implication, it feels more reasonable to think that evidence for ~q -> ~p is also evidence for p -> q.
- Further probing the connection between p and q, some applications of logical implications are tighter than in the raven paradox. Laying aside sets and characteristics of objects, causes are commonly said to imply effects. Assign p to "I start a fire", and q to "the fuel is consumed (well, chemically converted)". When 1) the fuel is not consumed and 2) I haven't ignited a fire, it seems quite reasonable to accept these two facts as evidence that unconsumed fuel implies no fire-starting by me (~q -> ~p) and about as reasonable to advance these facts as evidence that my pyromaniacal actions would have led to the consumption of the fuel (p -> q). However, beware that cause and effect implication is susceptible to its own category of raven paradoxes, some of which are painfully woven into everyday life. After all, if 1) my friend isn't alive (~q) because of an auto accident and 2) I didn't tell him (~p) to avoid highway 30 on the way home, I shouldn't necessarily use these two facts to support the implication that if I had told him (p), then he would be alive (q).
- A creative response to the raven paradox is to continue the example by pondering the unexaggerated multitude of statements that a green apple supports in addition to "all ravens are black". A green apple supports the statement that all roses are red (regardless of white roses...). A green apple supports the statement that all snow is white (again, regardless of yellow snow...). After tiring of that activity, someone could turn it around and name the statements that a black raven supports in addition to "all ravens are black". A black raven supports the statement that all watermelons are green. A black raven supports the statement that all basketballs are brown. Do this long enough and you'll realize that from logic's myopic and therefore unbiased definitions, contradictions are what matter because logic includes only true and false, is-raven and is-not-raven, is-black and is-not-black. This "binary" measure of truth results in there being no way for an implication to be progressively truthful as the evidence pile enlarges. When truth must be absolutely dependable, all-or-nothing, one contrarian member of a set trashes the implications that are blanket statements about all of the set's members.
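To put numbers on the Bayesian point above (a toy setup of my own choosing, not the only way to formalize the paradox): suppose the serious rival to H, "all ravens are black", is H', "exactly one raven is white", in a world with r ravens and k non-black non-ravens. Then checking a randomly chosen raven and finding it black favors H over H' by a factor of

$$\frac{P(\text{black} \mid \text{raven}, H)}{P(\text{black} \mid \text{raven}, H')} = \frac{1}{(r-1)/r} = \frac{r}{r-1},$$

while checking a randomly chosen non-black object and finding it isn't a raven favors H by only

$$\frac{P(\text{non-raven} \mid \text{non-black}, H)}{P(\text{non-raven} \mid \text{non-black}, H')} = \frac{1}{k/(k+1)} = \frac{k+1}{k}.$$

In the two-raven birdcage the first factor is 2; in a world with billions of green apples the second factor is barely above 1. Both observations are evidence for H, just in amounts so lopsided that the green apple feels like a joke.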
Tuesday, January 20, 2009
algorithms everywhere
I have a simple (*mumble* cheap) portable music player that allows file organization through one mechanism: unnested subdirectories listed alphabetically under the root with files listed alphabetically in each. I started pondering the best way to divvy out subdirectories and files to reduce searching time. The number of subdirectories ideally should be small, so it takes less time to select the desired subdirectory. But the number of files within each subdirectory ideally should be small as well, so it takes less time to select the desired file after selecting the subdirectory.
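One way to put a number on that tension (my own back-of-the-envelope framing, not anything from the player's documentation): with N files spread evenly over d subdirectories, a worst-case hunt scrolls past roughly

$$f(d) = d + \frac{N}{d} \ \text{entries}, \qquad f'(d) = 1 - \frac{N}{d^2} = 0 \;\Rightarrow\; d = \sqrt{N},$$

so for a player holding about 300 files the total is smallest somewhere around 17 subdirectories of 17 or so files apiece. That's only a yardstick, since real listening habits make some subdirectories far busier than others.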
To cut the self-indulgent story short, I ended up reading about B-trees on Wikipedia. However, since this application has a maximum depth of one, a B-tree would be inappropriate. Yet I was sufficiently inspired to come up with my own set of insertion algorithms based on a subdirectory minimum of 5 files and a maximum of 10 files (in passing, note that these parameters meet the B-tree criterion that a full tree/subdirectory can split evenly into two acceptable trees/subdirectories).
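A minimal sketch of the kind of insertion rule described above, not the exact procedure I executed by hand - the routing rule, the naming of new subdirectories, and the split point are all assumptions for illustration:

using System;
using System.Collections.Generic;
using System.Linq;

class OneLevelLibrary
{
    const int MaxPerDir = 10;   // a subdirectory that reaches 11 files splits into 5 + 6,
                                // both at or above the 5-file minimum

    // subdirectory name -> alphabetically sorted file names
    readonly SortedDictionary<string, List<string>> dirs =
        new SortedDictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);

    public void Insert(string fileName)
    {
        if (dirs.Count == 0)
            dirs[fileName] = new List<string>();

        // route to the last subdirectory whose name sorts at or before the file name
        string target = dirs.Keys
            .Where(k => StringComparer.OrdinalIgnoreCase.Compare(k, fileName) <= 0)
            .DefaultIfEmpty(dirs.Keys.First())
            .Last();

        var files = dirs[target];
        files.Add(fileName);
        files.Sort(StringComparer.OrdinalIgnoreCase);

        if (files.Count > MaxPerDir)
        {
            // B-tree-style split: move the upper half into a new subdirectory
            int keep = files.Count / 2;
            var upper = files.Skip(keep).ToList();
            files.RemoveRange(keep, files.Count - keep);
            dirs[upper.First()] = upper;   // name the new subdirectory after its first file
        }
    }
}

Feeding file names in one at a time keeps every subdirectory at ten entries or fewer and, once the collection grows past the first handful of files, at five or more.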
Some might say that it's ludicrous to approach this task in this way, given that I "executed" the algorithms by hand inside a file manager in lieu of writing any code and the low-capacity music player contains less than 300 files. Thus, I lost time by analyzing the problem via a theoretical lens and formulating a general solution. I'm practical enough to acknowledge that reality.
My point is that algorithms and data structures are everywhere if one has the right perspective. And this is not strange compared to other specialties. Artists see lines and shapes and visual patterns that I wouldn't notice unless someone told me. Mathematicians see quantitative relationships (or more abstract stuff - abstract algebra sometimes drives me nuts). I could list numerous examples - doctors, lawyers, mechanics, architects, psychologists - who all see aspects of their surroundings differently than I do.
This is the part of vocational training that's hard to teach: to mold one's mind until the subject matter is a familiar mental frame or toolkit. The reason that professional software developers should study the "Computer Science-y stuff" is so that they can recognize and organize their thoughts, thereby avoiding the trap of attempting a solved problem or attacking it in a naive manner. They don't need to memorize what they can find on the Web or in a book, but they need to know enough to comprehend and adapt what they find!