Dave Anderson edited this page Jul 31, 2013 · 19 revisions

Object Store

Files Whose Names Are Their Own Hash

The basic unit of git is a file whose name is its SHA-1 hash, 40 hex characters long. To prevent file-system issues, git uses the first two characters as a subdirectory name and the remaining 38 characters as the file name. For simplicity, I'll use a seven-character, undivided file name instead. (see git hash-object and git cat-file)
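As a rough illustration of this naming scheme, here is a minimal Python sketch. It assumes git's loose-object convention of hashing a short header (`blob <size>` followed by a NUL byte) ahead of the content, a detail the simplified description above leaves out. The function names `blob_hash` and `object_path` are hypothetical, not part of git.

```python
import hashlib

def blob_hash(content: bytes) -> str:
    """Return the SHA-1 that git would assign to a blob with this content."""
    # git hashes a header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def object_path(sha: str) -> str:
    """Split a 40-character hash into a subdirectory and a file name."""
    return f".git/objects/{sha[:2]}/{sha[2:]}"

sha = blob_hash(b"hello, world\n")
print(sha)               # 40 hex characters
print(object_path(sha))  # first two characters become the subdirectory
```

The printed hash should match what `git hash-object` reports for a file containing the same bytes.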

Three types of files

  • Blobs: The content. A file of any kind.
  • Trees: A text representation of a directory.
  • Commits: A text representation of a commit.

Blobs

A245621

hello, world

B534142

Some other file’s text

Trees

In a clever reuse of objects, the representation of a directory, called a tree, is a text file. (see git write-tree)

C534543

A245621 hello.txt
B534142 other.txt

Because the file name is the hash, files with the same content point to the same object. It is the directory (tree) object that holds the different names.

D534544

A245621 hello.txt
B534142 other.txt
B534142 other_copy.txt
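The sharing described above can be sketched with a small in-memory store. This is an illustration only: it follows the simplified text format used on this page (a shortened hash and a name per line), not git's real tree encoding, and the `store` dictionary and `put` helper are hypothetical.

```python
import hashlib

store = {}  # hypothetical in-memory stand-in for .git/objects

def put(content: str) -> str:
    """Store content under a shortened hash of itself and return that key."""
    key = hashlib.sha1(content.encode("utf-8")).hexdigest()[:7]
    store[key] = content
    return key

hello = put("hello, world")
other = put("Some other file's text")
other_copy = put("Some other file's text")  # same content, so same key

# The tree is itself just text: one "<hash> <name>" line per entry.
tree = put(
    f"{hello} hello.txt\n"
    f"{other} other.txt\n"
    f"{other_copy} other_copy.txt\n"
)

print(store[tree])
print(len(store))  # two blobs and one tree, despite three file names
```

Storing the duplicate file adds no new object; only the tree grows by one line.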

Commits

Similarly, a commit is a text file (see git commit-tree). It holds the hash of the top-level tree, the author and committer info, the commit times, and the commit message.

E534535

tree C534543
author First Last <[email protected]> 1243040974 -0700
committer First Last <[email protected]> 1243040974 -0700

commit message

Commits track Parents

G345807

tree F345346
parent E534535
author First Last <[email protected]> 1243054344 -0700
committer First Last <[email protected]> 1243054344 -0700

second commit message

Merge Commits Track Multiple Parents

L345312

tree K534511
parent G345807
parent J345310
author First Last <[email protected]> 1243075625 -0700
committer First Last <[email protected]> 1243075625 -0700

merge commit message
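Because a commit is plain text, reading its tree and parents takes only a few lines. The sketch below parses the merge commit shown above; `parse_commit` is a hypothetical helper, and the e-mail address is a placeholder chosen for the example.

```python
def parse_commit(text: str) -> dict:
    """Split a commit's text into header fields and message."""
    # A blank line separates the header lines from the message.
    header, _, message = text.partition("\n\n")
    commit = {"tree": None, "parents": [], "message": message.strip()}
    for line in header.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # zero, one, or several parents
        else:
            commit[key] = value
    return commit

merge_text = (
    "tree K534511\n"
    "parent G345807\n"
    "parent J345310\n"
    "author First Last <first.last@example.com> 1243075625 -0700\n"
    "committer First Last <first.last@example.com> 1243075625 -0700\n"
    "\n"
    "merge commit message\n"
)

merge = parse_commit(merge_text)
print(merge["tree"])     # the top-level tree for this snapshot
print(merge["parents"])  # both parents of the merge
```

Following the `parent` entries from commit to commit is what reconstructs the history.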

Managing Commits

Now we move on to managing this data. These files use a different convention: the file name is a human-readable name rather than a hash, and the file's content is a text reference to the thing it points to.

Branches

A branch is simply a file. The name of the branch is the name of the file, and the content of the file is the hash (and therefore the file name) of the commit the branch points to.

refs/heads/master

L345312

HEAD

The current working branch is stored in the file called HEAD.

HEAD

ref: refs/heads/master
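Putting the branch file and HEAD together, finding the current commit is a two-step lookup. The sketch below builds a throwaway directory that mirrors the example files and resolves it; `resolve_head` is a hypothetical helper. It assumes every branch is stored as its own file and ignores the `packed-refs` file that real repositories may also use.

```python
import tempfile
from pathlib import Path

def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to the commit hash it ultimately names."""
    content = (git_dir / "HEAD").read_text().strip()
    if content.startswith("ref: "):
        # HEAD names a branch file; that file holds the commit hash.
        ref_path = git_dir / content[len("ref: "):]
        return ref_path.read_text().strip()
    # Otherwise HEAD holds a hash directly (a "detached" HEAD).
    return content

# Build a throwaway layout mirroring the example files above.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp)
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "refs" / "heads" / "master").write_text("L345312\n")
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    resolved = resolve_head(git_dir)

print(resolved)
```

Switching branches, in this picture, only rewrites the single line in HEAD.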

config

The config file stores configuration information, including remote repository reference information.

config example remote entry snippet

[remote "origin"]
        url = https://github.com/davious/Test.git
        fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
        remote = origin
        merge = refs/heads/master
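To show how these entries relate, the sketch below looks up which remote the master branch tracks and then that remote's URL. It uses Python's `configparser`, which happens to accept this snippet; git's config syntax is not strictly INI, so treat this as an illustration rather than a general parser. In a real repository, `git config --get` is the dependable way to read these values.

```python
import configparser

CONFIG_TEXT = """\
[remote "origin"]
        url = https://github.com/davious/Test.git
        fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
        remote = origin
        merge = refs/heads/master
"""

# Only "=" separates keys from values; interpolation is off so that
# values are taken literally.
parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
parser.read_string(CONFIG_TEXT)

# The branch section names its remote; the remote section holds the URL.
remote_name = parser['branch "master"']["remote"]
remote_url = parser[f'remote "{remote_name}"']["url"]
print(remote_name, remote_url)
```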