Saturday, May 02, 2015

data : code :: concept : verification

I've sometimes mused about whether my eventual embrace of a Pragmatism-esque philosophy was inevitable. The ever-present danger in musings like this is ordinary hindsight bias: concealing the actual complexity after the fact with simple, tempting connections between present and past. I can't plausibly propose that the same connections would exert equal force on everyone else. In general, I can't rashly declare that everyone who shares one set of similarities with me is obligated to share other sets of similarities. Hastily viewing everyone else through the tiny lens of myself is egocentrism, not well-founded extrapolation.

For example, I admit I can't claim that my career in software development played an instrumental role in the switch. I know too many competent colleagues whose beliefs clash with mine. At the same time, a far different past career hasn't stopped individuals in the Clergy Project from eventually reaching congenial beliefs. Nevertheless, I can try to explain how some aspects of my specific career acted as clues that prepared and nudged me. My accustomed thought patterns within the vocational context seeped into my thought patterns within other contexts.

During education and on the job, I encountered the inseparable ties between data and code. Most obviously, the final data was the purpose of running the code (in games the final data was for immediately synthesizing a gameplay experience). Almost as obvious, the code couldn't run without the data flowing into it. Superficially, in a single ideal program, code and data were easily distinguishable collaborators taking turns being perfect. Perhaps a data set went in, and a digest of statistical measurements came out, and the unseen code might have run in a machine on the other side of the internet.

At a more detailed level of comprehension, and in messy and/or faulty projects cobbled together from several prior projects, that rosy view became less sensible. When final data was independently shown to be inaccurate, the initial cause was sometimes difficult to deduce. Along the bumpy journey to the rejected result, data flowed in and out of multiple avenues of code. Fortunately the result retained meaningfulness about the interwoven path of data and code that led to it, regardless of its regrettable lack of meaningfulness in regard to its intended purpose. It authentically represented a problem with that path. Thus its externally checked mistakenness didn't in the least reduce its value for pinpointing and resolving that path's problems.
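To make that concrete, here is a minimal, hypothetical Python sketch. The step names, the planted discount bug, and the `Traced` record are all invented for illustration and aren't drawn from any real project. It only shows how a wrong final figure can still faithfully record the path of data and code that produced it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Traced:
    """A value plus the record of every step that touched it (hypothetical helper)."""
    value: float
    path: List[Tuple[str, float]] = field(default_factory=list)


def run_pipeline(source: float, steps: List[Callable[[float], float]]) -> Traced:
    """Push the source data through each step, logging the value after each one."""
    traced = Traced(source, [("source", source)])
    for step in steps:
        traced.value = step(traced.value)
        traced.path.append((step.__name__, traced.value))
    return traced


def to_cents(dollars: float) -> float:
    return dollars * 100


def apply_discount(cents: float) -> float:
    # Planted bug for the example: subtracts 10 cents instead of 10 percent.
    return cents - 10


def to_dollars(cents: float) -> float:
    return cents / 100


result = run_pipeline(50.0, [to_cents, apply_discount, to_dollars])

# The final figure is wrong for its intended purpose (a 10% discount on 50.0
# should give 45.0), yet the recorded path still shows where it went astray.
for name, value in result.path:
    print(f"{name}: {value}")
```

Reading the printed path step by step, the value after `apply_discount` is the first one that disagrees with an independent expectation, which is the sense in which the rejected result stays meaningful about its own path.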

That wasn't all. The reasoning applied to flawless final data as well, which achieved two kinds of meaningfulness. Its success gave it metaphorical meaningfulness in regard to satisfying the intended purpose. But it too had the same kind of meaningfulness as flawed final data: literal meaningfulness about the path that led to it. It was still the engineered aftereffect of a busy model built out of moving components of data and code—a model ultimately made of highly organized currents of electricity. It was a symbolic record of that model's craftsmanship. Its accurate metaphorical meaning didn't erase its concrete roots.

The next stage of broadening the understanding of models was to incorporate humans as components—exceedingly sophisticated and self-guiding components. They often introduced the starting data or reviewed the ultimate computations. On top of that, they were naturally able to handle the chaotic decisions and exceptions that would require a lot more effort to perform with brittle code. Of course the downside was that their improvisations could derail the data. Occasionally, the core of an error was a human operator's unnoticed carelessness filling in a pivotal element two steps ago. Or a human's assumptions for interpreting the data were inconsistent with the assumptions used to design the code they were operating.

In this sense, humans and code had analogous roles in the model. Each was involved in carrying out cooperative series of orderly procedures on source data and leaving discernible traces in the final data. The quality of the final data could be no better than the quality of the procedures (and the source data). A model this huge was more apt to have labels such as "business process" or "information system", abbreviated IS. Cumulatively, the procedures of the complete IS acted as elaborations, conversions, analyses, summations, etc. of the source data. Not only was the final data meaningful for inferring the procedures behind it, but the procedures in turn produced greater meaningfulness for the source data. Meanwhile, they were futilely empty, motionless, and untested without the presence of data.

Summing up, data and code/procedures were mutually meaningful throughout software development. As mystifying as computers appeared to the uninitiated, data didn't really materialize from nothing. Truth be told, if it ever did so, it would arouse well-justified suspicion about its degree of accuracy. "Where was this figure drawn from?" "Who knows, it was found lying on the doorstep one morning." Long and fruitful exposure to this generalization invited speculation of its limits. What if strict semantic linking between data and procedures weren't confined to the domain of IS concepts?

A possible counterpoint was repeating that these systems were useful but also deliberately limited and refined models of complex realities. Other domains of concepts were too dissimilar. Then...what were those unbridgeable differences, exactly? What were the majority of beneficial concepts, other than useful but also deliberately limited and refined models? What were the majority of the thoughts and actions to verify a concept, other than procedures to detect the characteristic signs of the alleged concept? What were the majority of lines of argument, other than abstract procedures ready to be rerun? What were the majority of secondary cross-checks, other than alternative procedures for obtaining equivalent data? What were the majority of serious criticisms of a concept, other than criticisms of the procedures justifying it? What were the majority of definitions, other than procedures to position and orient a concept among other known concepts?

For all that, it wasn't that rare for these other domains to contain some lofty concepts that were said to be beyond question. These were the kind whose untouchable accuracy was said to spring from a source apart from every last form of human thought and activity. Translated into the IS perspective, these were demanding treatment like "constants" or "invariants": small, circular truisms in the style of "September is month 9" and "Clients have one bill per time period". In practice, some constants might need to change from time to time, but those changes weren't generated via the IS. These reliable factors/rules/regularities furnished a self-consistent base for predictable IS behavior.
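As a small illustration of the IS side of that comparison, here is a hedged Python sketch of one constant and one invariant. Only the two truisms come from the text above. The bill records, the client names, and the function name are hypothetical and stand in for whatever a real system would hold.

```python
from collections import Counter
from typing import List, Tuple

# A constant: the system never computes this, it simply relies on it.
MONTH_NUMBER = {"September": 9}

# Hypothetical bill record: (client, billing period, amount).
Bill = Tuple[str, str, float]


def one_bill_rule_violations(bills: List[Bill]) -> List[Tuple[str, str]]:
    """Return every (client, period) pair that breaks the invariant
    'Clients have one bill per time period'."""
    counts = Counter((client, period) for client, period, _ in bills)
    return sorted(key for key, n in counts.items() if n > 1)


bills = [
    ("acme", "2015-09", 120.0),
    ("acme", "2015-09", 80.0),   # second bill in the same period
    ("zenith", "2015-09", 45.0),
]

violations = one_bill_rule_violations(bills)
print(violations)
```

The invariant itself is not produced by any procedure in the sketch. It is assumed from outside, and the data is merely checked against it.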

Ergo, worthwhile constants never received and continually contributed. They were unaffected by data and procedures yet were extensively influential anyway. They probably had frequent, notable consequences elsewhere in the IS. Taken as a whole, those system consequences strongly hinted at the constants at work, including tacit constants never recognized by the very makers of the system. Like following trails of breadcrumbs, with enough meticulous observation, the backward bond from the system consequences to the constants could be as certain as the backward bond from data to procedures.

In other words, on the minimal condition that the constants tangibly mattered to the data and procedures of the IS, they yielded accountable expectations for the outcomes and/or the running of the IS. The principle was more profound when it was reversed: total absence of accountable expectations suggested that the correlated constant itself was either absent or at most immaterial. It had no pertinence to the system. Designers wishing to conserve time and effort would be advised to ignore it altogether. It belonged in the routine category "out of system scope". By analogy, if a concept in a domain besides IS declined the usual methods to be reasonably verified, and distinctive effects of it weren't identifiable in the course of reasonably verifying anything else, then it corresponded to neither data nor constants. Its corresponding status was out of system scope; it didn't merit the cost of tracking or integrating it.

As already stated, the analogy was neither undeniable nor unique. It didn't compel anyone with IS expertise to reapply it to miscellaneous domains, and expertise in numerous fields could lead to comparable analogies. There was a theoretical physical case for granting it wide relevance, though. If real things were made of matter (or closely interconnected with things made of matter), then real things could be sufficiently represented with sufficient quantities of the data describing that matter. If matter was sufficiently represented, including the matter around it, then the ensuing changes of the matter were describable with mathematical relationships and thereby calculable through the appropriate procedures. The domain of real things qualified as an IS...an immense IS of unmanageable depth which couldn't be fully modeled, much less duplicated, by a separate IS feasibly constructed by humans.
