Thursday, January 17, 2008

defining information and data: the economics of information, part 1

Other parts (since blog entries are consumed in reverse-chronological order, the parts are posted in reverse-chronological order):

Part 2, valuing information
Part 3, trading information

Disclaimer: I'm not an actual theorist or scholar of anything. If any of the following series seems derivative, that's because it is. Moving along...

introduction

Per usual practice, I'll use the Thomas Jefferson quote as a preface:

If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it. Its peculiar character, too, is that no one possesses the less, because every other possesses the whole of it. He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me.

Techdirt's "The Grand Unified Theory on the Economics of Free" (and its thesis-defense comments) prompted me to ponder some general questions about information, especially digitally-encoded information. What is it? What is its value, and how is that value derived? How could the answers to those questions tie together into economics, whether in conventional terms or the terms Techdirt's article laid out? I proceeded in careful steps, philosophy-style. The result is loooong but divisible by topic, so it's a series.

defining information and data

As Jefferson hinted, information (what he calls an idea) is fundamentally different from matter. Specifically, information can only be manipulated by a mind. Because many minds exist, many copies of the same information can exist. Rather than attempting to define "mind", and potentially becoming mired in a discussion about the nature and origin of consciousness, I'll resort to circular definitions: "mind" is defined as that which understands and manipulates information, and "information" is defined as the material which minds understand and manipulate. (Someone may break in at this point to protest that human minds also manipulate human bodies. Wrong! Brains manipulate human bodies. Note that I'll also not consider the relationship between the brain and the mind--I don't wanna. However, for extra credit, the reader may ponder this oft-explored hypothetical: given two humans, how much about them must be identical for them to have identical minds? Identical synapses? Identical bodies? Identical experiences? Identical surroundings?)

The next step is to recognize something else the Jefferson quote expresses: a mind can either hoard information, or it can communicate the information to another mind (letting the brainwaves ripple, so to speak). Communication is necessary because minds are disconnected, separate. (By the above circular definitions, two or more minds could only be considered a single mind if all the minds understood and manipulated the exact same set of information, which is not the case. I am Hugh.) Since minds are separate, the separation material must consist of matter. Therefore, communication has multiple steps: 1) the sending mind encodes the information into matter, 2) the matter travels to another mind, 3) the receiving mind decodes the matter into information. Nothing in this model is controversial, aside from being a blatant oversimplification compared to real communication theories/models. For clarity's sake, "data" shall refer to the matter representations of information, used in communication.
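The three steps above can be sketched in a few lines of Python. This is only a toy under loud assumptions: a text string stands in for "information", a sequence of bytes stands in for "data", and the function names (encode, transmit, decode) are invented for this illustration rather than taken from any real communication theory.

```python
def encode(information: str) -> bytes:
    """Step 1: the sending mind turns information into data (matter)."""
    return information.encode("utf-8")


def transmit(data: bytes) -> bytes:
    """Step 2: the data travels; this pretend channel just hands over a copy."""
    return bytes(data)


def decode(data: bytes) -> str:
    """Step 3: the receiving mind turns the data back into information."""
    return data.decode("utf-8")


sent = "he who lights his taper at mine"
received = decode(transmit(encode(sent)))
print(received == sent)  # True: this toy channel happens to be lossless
```

A string survives the round trip intact, which is exactly where the toy oversimplifies: it says nothing about what a mind loses when encoding or adds when decoding.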

In passing, for the sake of full disclosure, note that real communication is inherently imperfect, ultimately stemming from the difference between, or the boundary of, mind and matter. (One might say, "the map is not the territory".) Perhaps the original purpose of minds is to do what matter on its own cannot: apply information to other information, "annotating" new information with meaning, forming context, birthing information of greater worth than the individual bits. To communicate is to dislodge a subset of information from its original context within a mind. The very first step of information encoding is lossy. The very last step of information decoding is extrapolation. Strictly speaking, mental information (as opposed to sensory information recorded almost perfectly by technology) isn't duplicated; it's torn out, encoded to data with approximations, sent, decoded from data approximations, and interpreted. "Reproduction" of information, via a communication channel of data, is not really a "magical" loophole of conservation laws.

Data's most basic form is personal interaction such as gestures, vocalizations, and speech. A variety of creatures create and consume this form of data through a variety of methods, but its expression peaked in humans: spoken language, which has stunning complexity. Advances in communication have at least partly been refinements in data: pictures, stylized/standardized pictures, phonetic pictures, characters/alphabets, the printing press, photography, the phonograph, the telegraph, the telephone, radio, cinema, television, modems/faxes, broadband networks.

Most importantly, data manipulation changed forever once humans realized that by translating or encoding data into a digital (binary-series) form, meaning a succession of either 0/1 or off/on or no/yes, any technology which can handle the digital form can therefore handle the data. And since the presence and absence of an electric current are excellent ways to represent 0/1, the data could also be "electrified", which enables quick transmission, storage, and modification. Experts may smirk at someone who simplistically asks "how to quickly show, to my relative in the other half of the country, a picture I took from my new itty-bitty camera" or at someone who asks "how many songs can my portable music player contain", but that level of abstraction better illustrates the profound communication effect of digitized data. On some level, all websites, within and without the Web 2.0 bubble, are "social".
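For the curious, here is a minimal sketch of what "translating data into a binary series" can look like, assuming the data is a short piece of ASCII text. The helper names to_bits and from_bits are made up for this example; real devices do the equivalent in hardware.

```python
def to_bits(data: bytes) -> str:
    """Write each byte as eight 0/1 characters."""
    return "".join(format(byte, "08b") for byte in data)


def from_bits(bits: str) -> bytes:
    """Regroup the 0/1 series into bytes, eight bits at a time."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


bits = to_bits("Hi".encode("ascii"))
print(bits)                             # 0100100001101001
print(from_bits(bits).decode("ascii"))  # Hi
```

Anything that can store or move that series of 0/1 values can store or move the greeting, without knowing or caring that it is a greeting.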

Digital data's technologically-achieved processing efficiencies naturally extend to yet another operation: copying, which is just retrieving, transmitting, and re-storing the data. Moreover, unlike both the manual and mechanical copies of past forms of data, the copies can be identical: all that's necessary is reliably duplicating each 0/1 in the series, by repeating the procedure bit by bit. For humans, who really only care about the information rather than the data, it would be an excruciating way to copy, but it's ideal for the purpose of automation. Besides gaining speed and accuracy, digital data copying has also become quite cheap. The number of 0/1 bits per unit of currency keeps increasing across the range of all devices, while the cost of each data operation is tiny (cost in the sense of electrical power and ordinary device wear and tear).
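The bit-by-bit procedure is simple enough to write down. The sketch below assumes the data is already a list of 0/1 values; the function name copy_bit_by_bit is invented for this illustration, and no real device copies data this slowly.

```python
def copy_bit_by_bit(bits: list[int]) -> list[int]:
    """Duplicate a 0/1 series by repeating the same step for every bit."""
    duplicate = []
    for bit in bits:
        duplicate.append(bit)  # reproduce one 0 or 1 exactly
    return duplicate


original = [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]

# Make a copy of a copy of a copy, ten generations deep.
tenth_copy = original
for _ in range(10):
    tenth_copy = copy_bit_by_bit(tenth_copy)

print(tenth_copy == original)  # True: no generation loss
```

A tenth-generation photocopy or cassette dub would show its age; the tenth-generation list is indistinguishable from the first.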

Essentially, digital data resembles information more than any previous form of data, and therefore the machines that manipulate the digital data resemble the mind in the same ways. Given that this is so, the urge to apply the Thomas Jefferson quote to data as well as information is not surprising. Also, digital data prompts a reevaluation of the economics of information as a resource or good in itself, rather than just as a factor in economic decisions.
