Monday, December 18, 2006

relational database as inference engine

A short while ago, I was reworking an SQL query to ensure that it would yield the precise results the rest of the application expected. (Side observation: my conscience kept hollering at me to make the SQL simpler and just crunch the returned data as needed. Shut up, Jiminy! Go spell encyclopedia and leave me alone.) I noticed that I was adding, removing, and rearranging WHERE conditions to match particular rows. But matching specific data against a generalized set of characteristics is also an activity performed by inference engines! It's not that far a leap to equate database rows to facts, SQL queries to rules, and newly-inserted database rows to concluded facts or assertions. I wouldn't be surprised if the academics in these two camps have been cross-pollinating ideas for some time. If I knew substantially more than jack about real relational database theory, I could offer some insightful comments. Instead, here's an example.

Imagine a set of widgets available for sale. A white widget has no additional cost, a black widget costs an additional 400, and a green widget costs an additional 200. The base cost for round widgets is 100, square widgets are 200, and triangular widgets are 150. How much does a round, black widget cost? An inference engine might have the set of rules (color white) => add 0, (color black) => add 400, (color green) => add 200, (shape round) => set 100, (shape square) => set 200, (shape triangle) => set 150. Then one could add the facts (color black) and (shape round) and have the answer. Or at least the necessary addends.
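For the sake of concreteness, here's a minimal sketch of those rules as data in Python. The rule representation (condition/amount pairs) and the function name are my own inventions, not the syntax of any particular inference engine, but the matching idea is the same: fire every rule whose condition appears in the fact set.

```python
# Each rule pairs a condition on the fact set with a price contribution.
# These names and the tuple format are illustrative, not any real engine's syntax.
RULES = [
    (("color", "white"), 0),
    (("color", "black"), 400),
    (("color", "green"), 200),
    (("shape", "round"), 100),
    (("shape", "square"), 200),
    (("shape", "triangle"), 150),
]

def price(facts):
    """Sum the contributions of every rule whose condition matches a fact."""
    return sum(amount for condition, amount in RULES if condition in facts)

# Assert the facts (color black) and (shape round), then "run" the rules.
facts = {("color", "black"), ("shape", "round")}
print(price(facts))  # 400 + 100 = 500
```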

A (SQL-compatible) database could do a similar operation. One particular combination (or should I say tuple?) of facts becomes a row in table "facts". The rules become: (select 0 from facts where color = 'white'), (select 400 from facts where color = 'black'), (select 200 from facts where color = 'green'), (select 100 from facts where shape = 'round'), (select 200 from facts where shape = 'square'), (select 150 from facts where shape = 'triangle'). Add a row to "facts" that has a "color" column of "black" and a "shape" column of "round", run the queries, and there's your answer. Or at least the necessary addends.
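That version runs as-is against any SQL database. A quick sketch using Python's sqlite3 module and an in-memory database (the table and column names follow the post; everything else is just plumbing):

```python
import sqlite3

# An in-memory SQLite database standing in for any SQL-compatible database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (color TEXT, shape TEXT)")

# Assert the facts: one row with color "black" and shape "round".
conn.execute("INSERT INTO facts VALUES ('black', 'round')")

# Each "rule" is a query that yields an addend only when its WHERE clause matches.
rules = [
    "SELECT 0 FROM facts WHERE color = 'white'",
    "SELECT 400 FROM facts WHERE color = 'black'",
    "SELECT 200 FROM facts WHERE color = 'green'",
    "SELECT 100 FROM facts WHERE shape = 'round'",
    "SELECT 200 FROM facts WHERE shape = 'square'",
    "SELECT 150 FROM facts WHERE shape = 'triangle'",
]

# Run all the rules and sum whatever addends they conclude.
total = sum(row[0] for query in rules for row in conn.execute(query))
print(total)  # 500
```

Only the black-color and round-shape queries return rows, so the sum is 400 + 100 = 500.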

The similarities become more striking when you use a database table in the usual way, i.e., using a row to represent one member out of a collection of similar entities. To refer back to the previous example, this just means renaming the "facts" table to "widgets" and breaking non-widget facts out into separate tables (the usual normalization). SQL queries that match multiple entities at once will employ JOINs. In the inference engine, an entity would look like (widget (shape square) (color green)), and I assume that rules that match multiple entities would work about the same as rules that match multiple individual facts.
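Continuing the sketch in sqlite3, here is one way the normalized layout might look: a "widgets" table with one row per widget, the pricing facts broken out into lookup tables, and a single JOIN-based "rule" that prices every widget at once. The table and column names are my own choices for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Pricing facts, normalized into lookup tables.
    CREATE TABLE color_costs (color TEXT PRIMARY KEY, surcharge INTEGER);
    CREATE TABLE shape_costs (shape TEXT PRIMARY KEY, base INTEGER);
    -- One row per widget entity, in the usual relational style.
    CREATE TABLE widgets (id INTEGER PRIMARY KEY, color TEXT, shape TEXT);

    INSERT INTO color_costs VALUES ('white', 0), ('black', 400), ('green', 200);
    INSERT INTO shape_costs VALUES ('round', 100), ('square', 200), ('triangle', 150);
    INSERT INTO widgets VALUES (1, 'black', 'round'), (2, 'green', 'square');
""")

# A single JOIN matches each widget against both pricing facts at once.
rows = conn.execute("""
    SELECT w.id, s.base + c.surcharge AS price
    FROM widgets w
    JOIN color_costs c ON c.color = w.color
    JOIN shape_costs s ON s.shape = w.shape
    ORDER BY w.id
""").fetchall()

for row in rows:
    print(row)  # (1, 500) then (2, 400)
```

The black round widget prices at 400 + 100 = 500, and the green square at 200 + 200 = 400, matching the per-fact queries above.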

As to whether it makes practical sense to map a relational database to an inference engine or vice versa, I'm inclined to think not. If your problem domain is rigid enough to work in a database, then there's no gain in using an inference engine instead. If your problem domain is a classic example of AI, a rather fuzzy attempt to capture a variety of heuristics working on loosely-structured data, then the database details would bog you down. To say nothing of the limitations (and proper place) of SQL.