Rippling Brainwaves: git graphs compared to data sequence graphs

Friday, March 04, 2011

git graphs compared to data sequence graphs

DataSequenceGraph is available at https://github.com/ArtV/DataSequenceGraph. This is an experimental release whose quality is not assured. A data sequence graph represents a set of data value sequences (IEnumerable<>), called "chunks", as nodes and directed edges in a single graph.

Git represents revision history as a graph of nodes and directed edges, where each node is a commit and each edge points back to an ancestor commit. However, the needs of git are different from the needs of a data sequence graph, which explains the differences in design.

A git graph has no cycles (a commit cannot be its own grandpa, for instance). A data sequence graph may contain cycles, but never within a particular data chunk's route. The routes use a cycle like a traffic roundabout, jumping into the circle and then leaving before completing a circuit. This is possible because of two rules: an edge's requisite edge must already have been met, and the edge to be passively followed is the one whose requisite occurs earliest in the route's history.
Nodes in a git graph aren't centrally numbered like those in a data sequence graph. Git couldn't use that strategy and still be thoroughly decentralized. Instead, the renowned "content addressable filesystem" of git names each commit with a hash of its information, including its immediate ancestors. Git avoids global name collisions, but at the unavoidable expense of the space required for the hash, although in practice a short prefix of the hash usually suffices to be unambiguous. Since a global or decentralized data sequence graph would probably be useless (but it's an intriguing idea...), the same constraint isn't applicable.
Git also requires immutable commits or nodes. Not in the sense that rewriting or amending is impossible, but that a commit's identifier must also change whenever the commit's information changes (well, if you have the know-how you can bend this a bit, but normally...). The state of the revision-tracked content must reflect its ancestors definitively at the point of any commit. A selected commit represents one and only one content state. By contrast, a node in a data sequence graph non-uniquely represents an individual value, and many data chunks/routes could possibly include that value/node. This difference also implies that while the identifier of a git commit would need to be different if it had more or fewer edges to ancestor commits, the simplistic numeric identifier of a node in a data sequence graph must remain the same no matter how other edges and nodes change.
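To make the naming contrast concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: `commit_id` is a simplified stand-in for how git derives a commit's name (real git hashes a structured commit object, not this ad hoc layout), and `NumberedGraph` is a hypothetical toy class, not the DataSequenceGraph API.

```python
import hashlib


def commit_id(content, parent_ids):
    """Simplified stand-in for a git commit name: SHA-1 over parents and content.

    Assumption: real git hashes a structured commit object (tree, parents,
    author, message); this toy keeps only what the comparison needs.
    """
    h = hashlib.sha1()
    for parent in parent_ids:
        h.update(("parent " + parent + "\n").encode("utf-8"))
    h.update(("\n" + content).encode("utf-8"))
    return h.hexdigest()


class NumberedGraph:
    """Hypothetical toy graph whose nodes are named by a central counter."""

    def __init__(self):
        self.values = []    # a node's number is its index in this list
        self.edges = set()  # (from_node, to_node) pairs

    def add_node(self, value):
        self.values.append(value)
        return len(self.values) - 1

    def add_edge(self, from_node, to_node):
        self.edges.add((from_node, to_node))


# Commit names: same content, different ancestry, different name.
root = commit_id("first draft", [])
side = commit_id("side work", [root])
plain = commit_id("second draft", [root])
merge = commit_id("second draft", [root, side])
print(plain != merge)  # the extra parent changes the name

# Numbered nodes: adding edges never renames a node.
graph = NumberedGraph()
a = graph.add_node("a")
b = graph.add_node("b")
graph.add_edge(a, b)
graph.add_edge(b, a)  # a cycle is fine in this toy graph
print(a, b)
```

The hashed name needs no central authority, which is what lets git stay decentralized, while the counter is far more compact but only works when a single graph hands out the numbers.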
