Tuesday, August 23, 2011

git's index is more than a scratchpad for new commits

Someone relatively inexperienced with git could develop a mistaken impression about the index. After referring to the isolated commands on a quick-reference guide or on a "phrasebook" that shows git equivalents to other VCS commands, the learner might, with good reason, start to consider the index as a scratchpad for the next commit. The most common tasks are consistent with that concept.

However, this impression is limiting. More accurately viewed, the index is git's entire "current" view of the filesystem. Commits are just saved git views of the filesystem. Files that the user has added, removed, modified, renamed, etc. aren't included in git's view of the filesystem until the user says, with "git add" for example. With the exception of before the very first commit, the index is unlikely to ever be empty. It isn't truly a scratchpad, then. When checking out a commit, git changes its current view of the filesystem to match that commit; therefore it changes the index. Through checkouts, history can be used to populate git's view of the filesystem. Through adds, the actual filesystem can be used to populate git's view of the filesystem. Through commits, git's view of the filesystem can be stored for future reference as a descendant of the HEAD.

Without this understanding, usage of "git reset" is infamous for causing confusion. With it, the confusion is lessened. A reset command that changes the index, which happens in the default or with option --hard, is like a checkout in that it changes git's view to the passed commit. (Of course the reset also moves the branch ref and HEAD, i.e. the future parent of the next commit.) A reset command that doesn't change the index, which happens with option --soft, keeps git's "view" the same as if it remained at the old commit. A user who wanted to collapse all of the changes on a branch into a single commit could possibly checkout that branch, git reset --soft to the branch ancestor, and then commit. Depending on the desired effect, merge --squash or rebase --interactive might be more appropriate, though.

Post-Script: Since this is aimed at git newcomers, I should mention that before trying to be too fancy with resets, become close friends with "reflog" and "stash".

Post-Script The Second: Drat. The Pro Git blog addressed the same general topic, but based more directly around "reset". And with attractive pictures. And a great reference table at the bottom.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.