Thursday, January 29, 2009

ravens at the intersection of logic and reality

The raven paradox presents an interesting question for anyone seeking to apply logic. It's also short and understandable. 1) The two implications "all ravens are black" and "anything that isn't black isn't a raven" are logically equivalent - either both false or both true. p -> q has the same truth table as ~q -> ~p. 2) A raven that is black is evidence for "all ravens are black". 3) Similar to 2, any object that isn't black and isn't a raven is evidence for "anything that isn't black isn't a raven". 4) Since the two propositions are logically equivalent (1), why wouldn't evidence for "anything that isn't black isn't a raven" (3) also be evidence for "all ravens are black" (2)? To summarize, how many green apples are required to convincingly support the proposition that all ravens are black? And isn't this question ridiculous?
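
Step 1 is the only part of the paradox that can be checked mechanically. A minimal Python sketch, assuming ordinary material implication (the helper name `implies` is illustrative, not something from a library), enumerates all four rows of the truth table and compares p -> q against ~q -> ~p:

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

# p stands for "is a raven", q stands for "is black".
rows = []
for p, q in product([True, False], repeat=2):
    direct = implies(p, q)                  # p -> q
    contrapositive = implies(not q, not p)  # ~q -> ~p
    rows.append((p, q, direct, contrapositive))

# The two columns agree on every row, so the statements are equivalent.
equivalent = all(direct == contra for _, _, direct, contra in rows)

# The single falsifying row is a raven that is not black.
falsifying = [(p, q) for p, q, direct, _ in rows if not direct]
```

Note that the table says nothing about evidence. It only says that the one case that falsifies either statement, a non-black raven, falsifies the other too.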

In my opinion, the raven paradox is a matter of perspective. The Wikipedia article probably includes all of the following comments, stated differently. (If equations excite you, as usual the Wikipedia article won't disappoint in that department.)
  • Logic works best as a closed, limited system in which a truth neither "appreciates" nor "decays". I like the analogy of a microscope; it's good for ensuring that nothing is overlooked in a small fixed domain but it's unsuited for usefully observing a big unbounded area. We pick out pieces of reality and then apply logic to those pieces. The choice of axioms is vitally important. Logic's utility is tied to the universality of its rules and conclusions. The specific meanings of its "p"s and "q"s are irrelevant to its functioning.
  • Given any logical entity p, not-p (~p) is defined as the logical entity that is false whenever p is true and true whenever p is false. In the case of the raven paradox, p is "in the set of ravens" and ~p is "not in the set of ravens". q is "in the set of black things" and ~q is "not in the set of black things". If the "system" of these statements is all objects in the known universe, clearly ~p and ~q are huge sets in that system. But if the system of these statements is the collection of five doves and two ravens in a birdcage, isn't it more significant that five non-black birds aren't ravens than that two ravens are black? The (Bayesian) quantities matter. Some people downplay statistics because its formulas require assumptions about the source population and the randomness of samples, but it seems to me that a precise number calculated through known assumptions is still much better than an intuitive wide-ranging guess hampered by cognitive biases. When an entire population can't be measured, it's better to estimate and quantify the accompanying uncertainty probabilistically than to give up altogether.
  • Yet another factor in the perception of the raven paradox is the difference in size not only between p and ~p (and q and ~q) but between the sets of p and q. There are many, many more members in "the set of black things" than in "the set of ravens". Consider a more focused implication (regardless of its actual truth being rock-solid or not) like "grandfathers are older than 40". Here, the p is "grandfathers" and q is "older than 40". The overall system is people, not objects, and the sets are more stringent than colors and species. A person who is younger than 40 and not a grandfather makes one more likely to believe that all grandfathers are older than 40. For this implication, it feels more reasonable to think that evidence for ~q -> ~p is also evidence for p -> q.
  • To probe the connection between p and q further, some applications of logical implication are tighter than in the raven paradox. Laying aside sets and characteristics of objects, causes are commonly said to imply effects. Let p be "I start a fire" and q be "the fuel is consumed (well, chemically converted)". When 1) the fuel is not consumed and 2) I haven't ignited a fire, it seems quite reasonable to accept these two facts as evidence that unconsumed fuel implies no fire-starting by me (~q -> ~p) and about as reasonable to advance these facts as evidence that my pyromaniacal actions would have led to the consumption of the fuel (p -> q). However, beware that cause-and-effect implication is susceptible to its own category of raven paradoxes, some of which are painfully woven into everyday life. After all, if 1) my friend isn't alive (~q) because of an auto accident and 2) I didn't tell him (~p) to avoid Highway 30 on the way home, I shouldn't necessarily use these two facts to support the implication that if I had told him (p), then he would be alive (q).
  • A creative response to the raven paradox is to continue the example by pondering the unexaggerated multitude of statements that a green apple supports in addition to "all ravens are black". A green apple supports the statement that all roses are red (regardless of white roses...). A green apple supports the statement that all snow is white (again, regardless of yellow snow...). After tiring of that activity, someone could turn it around and name the statements that a black raven supports in addition to "all ravens are black". A black raven supports the statement that all watermelons are green. A black raven supports the statement that all basketballs are brown. Do this long enough and you'll realize that from logic's myopic and therefore unbiased definitions, contradictions are what matter because logic includes only true and false, is-raven and is-not-raven, is-black and is-not-black. This "binary" measure of truth results in there being no way for an implication to be progressively truthful as the evidence pile enlarges. When truth must be absolutely dependable, all-or-nothing, one contrarian member of a set trashes the implications that are blanket statements about all of the set's members.
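
The remark above that the quantities matter can be made concrete with a rough likelihood-ratio sketch. Everything in it is an assumption chosen for illustration: the rival hypothesis is that exactly one non-black raven exists, the observation is a single object drawn at random from the non-black objects, and the count used for the universe is a made-up placeholder rather than a real estimate.

```python
from fractions import Fraction

def likelihood_ratio(non_black_non_ravens: int,
                     non_black_ravens_if_false: int = 1) -> Fraction:
    """Ratio P(draw is a non-raven | all ravens are black)
           / P(draw is a non-raven | some raven is not black),
    for one object drawn at random from the non-black objects.
    """
    n = non_black_non_ravens
    k = non_black_ravens_if_false  # assumed size of the counterexample set
    p_if_true = Fraction(1)        # every non-black object is a non-raven
    p_if_false = Fraction(n, n + k)
    return p_if_true / p_if_false

# Birdcage: five non-black doves, so one dove is noticeable evidence.
cage = likelihood_ratio(5)

# Universe: a hypothetical 10**12 non-black non-ravens, so one green apple
# moves the ratio by an amount far too small to feel like evidence.
universe = likelihood_ratio(10**12)
```

Under these assumptions the ratio is 6/5 in the birdcage and exceeds 1 by only one part in 10**12 in the universe. Both are greater than 1, which matches the logic of the paradox, but only the first is large enough to matter.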
The way I see it, the raven paradox isn't an argument that logic is useless in judging evidence or that the judgment of evidence proceeds illogically. It's an illustration of why logic is an excellent way to construct a truthful chain of reasoning out of existing truths but a flawed way to produce truths out of messy, endlessly astounding reality.

Tuesday, January 20, 2009

algorithms everywhere

I have a simple (*mumble* cheap) portable music player that allows file organization through one mechanism: unnested subdirectories listed alphabetically under the root, with files listed alphabetically in each. I started pondering the best way to divvy up the files among subdirectories to reduce searching time. The number of subdirectories ideally should be small, so it takes less time to select the desired subdirectory. But the number of files within each subdirectory ideally should be small as well, so it takes less time to select the desired file after selecting the subdirectory.
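
The tension between the two goals can be put into numbers with a toy cost model. This is only a sketch under assumptions that the player may not satisfy: scrolling is linear from the top of each list, files are spread evenly across subdirectories, and every file is equally likely to be wanted. The function name `expected_steps` and the collection size of 300 are illustrative.

```python
import math

def expected_steps(total_files: int, subdirs: int) -> float:
    """Average scroll steps: about half the subdirectory list,
    plus about half of the files in the chosen subdirectory."""
    files_per_subdir = total_files / subdirs
    return subdirs / 2 + files_per_subdir / 2

total = 300  # illustrative collection size

# Try every possible subdirectory count and keep the cheapest one.
best = min(range(1, total + 1), key=lambda d: expected_steps(total, d))

# Under this model the minimum sits near the square root of the file count.
near_sqrt = abs(best - math.sqrt(total)) < 1
```

With these assumptions the cheapest split for 300 files is 17 subdirectories, close to the square root of 300. A different cost model (for example, one where opening a subdirectory is slow) would move that balance point.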

To cut the self-indulgent story short, I ended up reading about B-trees on Wikipedia. However, since this application has a maximum depth of one, a B-tree would be inappropriate. Yet I was sufficiently inspired to come up with my own set of insertion algorithms based on a subdirectory minimum of 5 files and a maximum of 10 files (in passing, note that these parameters meet the B-tree criterion that a full tree/subdirectory can split evenly into two acceptable trees/subdirectories).
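
The post doesn't spell out the insertion algorithms, so what follows is not the author's method. It is one possible sketch of a depth-one, B-tree-inspired insert using the same limits of 5 and 10; the function name, the list-of-lists representation, and the sample file names are all assumptions made for illustration.

```python
import bisect

MIN_FILES = 5   # subdirectory minimum from the text
MAX_FILES = 10  # subdirectory maximum from the text

def insert_file(subdirs: list[list[str]], name: str) -> None:
    """Insert `name` into a sorted list of sorted subdirectories,
    splitting a subdirectory in two when it grows past MAX_FILES."""
    if not subdirs:
        subdirs.append([name])
        return
    # Choose the last subdirectory whose first file sorts at or before `name`
    # (or the first subdirectory if `name` sorts before everything).
    firsts = [d[0] for d in subdirs]
    idx = max(bisect.bisect_right(firsts, name) - 1, 0)
    bisect.insort(subdirs[idx], name)
    if len(subdirs[idx]) > MAX_FILES:
        # 11 files split into 5 and 6, both at or above MIN_FILES.
        full = subdirs[idx]
        mid = len(full) // 2
        subdirs[idx:idx + 1] = [full[:mid], full[mid:]]

# Usage: add 25 hypothetical tracks and inspect the resulting sizes.
subdirs: list[list[str]] = []
for i in range(25):
    insert_file(subdirs, f"track{i:02d}.mp3")
sizes = [len(d) for d in subdirs]
```

A subdirectory can only fall below the minimum while the very first one is still filling up. Naming the subdirectories so that they sort in the same order as their contents is left out of the sketch.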

Some might say that it's ludicrous to approach this task in this way, given that I "executed" the algorithms by hand inside a file manager in lieu of writing any code and that the low-capacity music player contains fewer than 300 files. Thus, I lost time by analyzing the problem via a theoretical lens and formulating a general solution. I'm practical enough to acknowledge that reality.

My point is that algorithms and data structures are everywhere if one has the right perspective. And this is not strange compared to other specialties. Artists see lines and shapes and visual patterns that I wouldn't notice unless someone told me. Mathematicians see quantitative relationships (or more abstract stuff - abstract algebra, for one, sometimes drives me nuts). I could list numerous other examples, like doctors, lawyers, mechanics, architects, and psychologists, who all see aspects of their surroundings differently than I do.

This is the part of vocational training that's hard to teach: to mold one's mind until the subject matter is a familiar mental frame or toolkit. The reason that professional software developers should study the "Computer Science-y stuff" is so that they can recognize and organize their thoughts, thereby avoiding the trap of attempting a solved problem or attacking it in a naive manner. They don't need to memorize what they can find on the Web or in a book, but they need to know enough to comprehend and adapt what they find!