Git Is Just Text Files Pointing at Other Text Files

March 2026

Inside every Git repo is just a folder of small text files pointing at each other. Here is what they are.

Git has a reputation for being complicated. But once you look inside, most of it is just text files. Small ones. Pointing at each other. Organized into a simple database.

That’s it.

The .git Folder Is the Actual Repo

When you run git init, Git creates a hidden .git folder. Your project files sit outside it. The .git folder is where Git lives.

project/
├─ index.js
├─ package.json
└─ .git/         ← here

Inside .git, the important stuff:

.git/
├─ HEAD
├─ refs/
├─ objects/
└─ config

The project files you edit are just a working copy. The .git folder is the real repository. That objects/ folder in particular is where things get interesting.

Git Is a Small Database of Objects

Most people picture Git storing “versions of files,” like a stack of full copies filed away in a photo album. That’s not quite right.

Git stores a database of objects. Three kinds matter here (a fourth, the tag object, only backs annotated tags):

blob    → file contents
tree    → directory structure
commit  → snapshot + metadata

Everything lives in .git/objects/. Let’s go through each one.

Blobs: the actual file contents

A blob is just the raw contents of a file. Nothing else. No filename, no path, no metadata.

Take a file called hello.txt containing the single line Hello (with a trailing newline). Git stores that text as a blob and gives it a hash:

e965047ad7c57865823c7d992b1d046ea66edf78

That hash is a SHA-1 digest calculated from the content of the file, plus a tiny header giving the object type and size. Change one character, you get a completely different hash. The hash is the identity of the blob.

Inside .git/objects/ it appears as:

.git/objects/e9/65047ad7c57865823c7d992b1d046ea66edf78

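Git uses the first two characters of the hash as a folder name and the remaining 38 as the file name, which keeps any single directory from growing too large. The file itself is zlib-compressed, so a plain text editor will show gibberish. Practical, if inelegant.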
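To make the hashing concrete, here is a minimal Python sketch of how a blob ID and its path can be computed. It assumes a repository using the default SHA-1 object format, and the function names blob_id and loose_object_path are made up for illustration; they are not part of Git.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would give a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Return where a loose object with this ID would sit inside the repo."""
    # First two hex characters name the folder, the other 38 name the file.
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


oid = blob_id(b"example contents\n")  # arbitrary sample content
print(oid)
print(loose_object_path(oid))
```

If you have Git installed, git hash-object <file> should print the same ID for a file with the same bytes.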

Trees: directory structure

A tree represents a folder. It maps filenames to blob hashes (and to other trees, for nested folders).

For a project like this:

project/
├─ README.md
└─ src/
    └─ app.js

Git creates:

root tree
├─ README.md → blob abc123
└─ src → tree
          └─ app.js → blob def456

Trees connect names to content. Blobs don’t know their own names. Trees are what give blobs their place in the world.
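Here is a rough Python sketch of how a tree object is put together, assuming SHA-1 object IDs and ordinary files (mode 100644) and folders (mode 40000). The helper names object_id and tree_body are hypothetical, and the file contents are invented for the example. Note that the encoded tree is binary: each entry stores the child’s hash as 20 raw bytes.

```python
import hashlib


def object_id(kind: bytes, body: bytes) -> str:
    """Hash an object the way Git does: '<kind> <size>\\0' header plus body."""
    header = kind + b" " + str(len(body)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Encode (mode, name, hex_id) entries as the body of a tree object."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git orders entries by name, comparing folders as if they ended in "/".
        return (name + "/" if mode == "40000" else name).encode("utf-8")

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        body += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\x00"
        body += bytes.fromhex(hex_id)  # 20 raw bytes, not 40 hex characters
    return body


# Invented file contents, mirroring the project layout above.
readme = object_id(b"blob", b"# Project\n")
app = object_id(b"blob", b"console.log('hi');\n")

src_tree = object_id(b"tree", tree_body([("100644", "app.js", app)]))
root_tree = object_id(b"tree", tree_body([
    ("100644", "README.md", readme),
    ("40000", "src", src_tree),
]))
print(root_tree)
```

The blob IDs never see the names README.md or app.js. Only the trees carry them.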

Commits: a snapshot, not a diff

Here’s where a lot of people have the wrong mental model. A commit doesn’t store changes. It stores a pointer to a tree.

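Stripped down to the essentials, a commit object looks like this (a real one also records a committer, email addresses, and timestamps):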

commit
tree   7f8a3c...
parent a91c0e...
author Eddie
message "add login"

The tree field points to the root tree for that snapshot. That tree points to all the folders and blobs. The full picture of “what the project looked like at this commit” is: commit points to root tree, root tree points to everything else.

The parent field is what creates history. Each commit points backward to the one before it:

A <- B <- C <- D <- E
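For the curious, this Python sketch builds the text of a commit object in roughly the layout Git uses: header lines, a blank line, then the message. It assumes SHA-1 IDs. The identity string, email address, and timestamp are invented, and commit_body and commit_id are illustrative names rather than anything Git provides.

```python
import hashlib


def commit_body(tree, parents, author, committer, message):
    """Assemble the text of a commit object from its fields."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # a root commit has no parent line
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


def commit_id(body: bytes) -> str:
    """Hash a commit body with Git's 'commit <size>\\0' header."""
    header = b"commit " + str(len(body)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # ID of a tree with no entries
who = "Eddie <eddie@example.com> 1700000000 +0000"  # hypothetical identity and time

root = commit_body(EMPTY_TREE, [], who, who, "initial commit")
child = commit_body(EMPTY_TREE, [commit_id(root)], who, who, "add login")

print(child.decode("utf-8"))
```

The child names its parent by hash, and that single line is the entire link between the two commits.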

Why This Design Is Surprisingly Clever

No duplication

Because blobs are identified by their content hash, Git reuses them automatically.

Say you have two commits, and README.md didn’t change between them. Git doesn’t store README.md twice. Both commits’ trees point to the same blob. One copy, referenced from multiple places.

Only changed files produce new blobs:

Commit 1: README.md + old app.js
Commit 2: README.md + new app.js

blobs stored:
blob1 → README.md contents     (shared by both commits)
blob2 → old app.js contents
blob3 → new app.js contents

The more your files stay the same across commits, the more Git deduplicates. A 1000-commit project isn’t storing 1000 full copies of your codebase.
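The deduplication falls out of using the hash as the storage key. This toy in-memory store is a sketch of the idea only, not Git’s on-disk format; the ObjectStore class and the file contents are invented for the example.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store: the key is the hash of the content."""

    def __init__(self):
        self.objects = {}

    def put_blob(self, content: bytes) -> str:
        header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
        oid = hashlib.sha1(header + content).hexdigest()
        # Identical content produces the identical key, so nothing new is stored.
        self.objects[oid] = content
        return oid


store = ObjectStore()

commit1 = {"README.md": store.put_blob(b"# Project\n"),
           "app.js": store.put_blob(b"old code\n")}
commit2 = {"README.md": store.put_blob(b"# Project\n"),
           "app.js": store.put_blob(b"new code\n")}

print(len(store.objects))  # 3 blobs stored for 4 file versions
```

Nobody had to ask for README.md to be shared. Both snapshots simply ended up holding the same key.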

History is immutable by design

A commit’s hash is calculated from its contents, which include the parent hash, the tree hash, the author, the timestamp, and the message.

Change anything in a commit and you get a completely different hash. And since every child commit includes the parent hash, changing one commit invalidates every commit that comes after it. You can’t quietly edit history without the hashes giving you away.

This is why commit hashes are a reliable way to refer to a specific state of the code. The hash is the proof.
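A small experiment shows the ripple effect. The sketch below uses a deliberately simplified commit body (no author or committer lines) and hypothetical helper names, so the IDs will not match a real repository, but the chaining behaves the same way.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # ID of a tree with no entries


def commit_id(tree: str, parent: str, message: str) -> str:
    """Hash a simplified commit: tree, optional parent, blank line, message."""
    body = f"tree {tree}\n"
    if parent:
        body += f"parent {parent}\n"
    body += f"\n{message}\n"
    data = body.encode("utf-8")
    header = b"commit " + str(len(data)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()


def build_chain(messages):
    """Return the commit IDs of a linear history, oldest first."""
    ids, parent = [], ""
    for message in messages:
        parent = commit_id(EMPTY_TREE, parent, message)
        ids.append(parent)
    return ids


original = build_chain(["A", "B", "C", "D"])
rewritten = build_chain(["A", "B (edited)", "C", "D"])

for before, after in zip(original, rewritten):
    print(before[:7], after[:7], "same" if before == after else "changed")
```

Only the message of the second commit was touched, yet the second, third, and fourth IDs all change, because each one bakes in the hash of its parent.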

Corruption gets caught automatically

Because every object is named by its hash, Git can check its own work. If an object on disk no longer hashes to its own name, Git knows something went wrong, and git fsck runs that check across the whole database. The integrity check isn’t bolted on. It falls out of how objects are named.
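The check itself is short enough to sketch in Python. This assumes loose objects stored as zlib-compressed bytes and SHA-1 IDs; encode_object and verify_object are illustrative names, not Git commands.

```python
import hashlib
import zlib


def encode_object(kind: bytes, body: bytes):
    """Return (object_id, compressed_bytes) as a loose object would be stored."""
    raw = kind + b" " + str(len(body)).encode("ascii") + b"\x00" + body
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)


def verify_object(expected_id: str, compressed: bytes) -> bool:
    """Re-hash the stored bytes and compare against the name they were filed under."""
    try:
        raw = zlib.decompress(compressed)
    except zlib.error:
        return False  # not even valid compressed data
    return hashlib.sha1(raw).hexdigest() == expected_id


oid, stored = encode_object(b"blob", b"important data\n")
print(verify_object(oid, stored))

# Simulate corruption: same name, different bytes behind it.
_, tampered = encode_object(b"blob", b"important dada\n")
print(verify_object(oid, tampered))
```

The first check passes and the second fails, because the name on the outside no longer matches the content on the inside.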

Cloning is straightforward

When you clone a repo, Git by default downloads every object reachable from the remote’s branches and tags. Once you have them, you can reconstruct every commit because they all reference each other by hash. Nothing is missing. Nothing is ambiguous. The objects are self-contained.

This idea turns out to be pretty important

Git basically built a content-addressed storage system. The same core idea shows up in Docker (image layers), IPFS (distributed file storage), and Nix (reproducible builds). Each of them, in its own way, names stored content by hash, references it by pointer, and shares identical content instead of copying it.

Branches and HEAD Are Also Just Files

Back to the .git folder. The refs/ directory holds your branches:

.git/refs/heads/main
.git/refs/heads/dev
.git/refs/heads/feature-login

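Open any of them. You’ll find one thing: a commit hash. (If a branch file is missing, Git has probably packed it into .git/packed-refs, which is also plain text.)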

a1b93f4c9a3c2c6e7...

That’s a branch. A single-line text file pointing to a commit. This is why creating a branch is instant: Git writes one file with one line. No duplication, no copying, no waiting.

HEAD works the same way. Open .git/HEAD and you’ll see:

ref: refs/heads/main

That’s how Git knows which branch you’re on. HEAD points to a branch file, which points to a commit. Two hops.

When you make a new commit, Git creates the commit object, then updates the branch file to contain the new hash. That’s the whole “branch moved forward” operation.
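The two hops, and the “branch moved forward” step, fit in a few lines of Python. This is a sketch against a fake .git folder built in a temp directory. It ignores packed refs, the hashes are placeholders, and resolve_head and advance_branch are made-up names.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to a commit hash: at most two hops."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # HEAD names a branch file, and that file holds the commit hash.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head  # HEAD already holds a commit hash


def advance_branch(git_dir: Path, new_commit: str) -> None:
    """Point the current branch at a new commit by rewriting its one-line file."""
    head = (git_dir / "HEAD").read_text().strip()
    if not head.startswith("ref: "):
        raise ValueError("HEAD is not on a branch")
    (git_dir / head[len("ref: "):]).write_text(new_commit + "\n")


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    (git_dir / "refs" / "heads" / "main").write_text("a" * 40 + "\n")  # placeholder hash

    advance_branch(git_dir, "b" * 40)  # placeholder for a new commit's hash
    print(resolve_head(git_dir))
```

Notice that HEAD itself never changes when you commit on a branch. Only the branch file does.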

Checkout Changes HEAD (and Loads Files)

When you run git checkout dev, Git rewrites .git/HEAD:

ref: refs/heads/dev

Then it reads the commit that branch points to, reads the tree from that commit, and loads those files into your working directory. Checkout is: update one pointer, then sync your files to match.

Detached HEAD Is a Scary Name for a Simple Thing

If you run git checkout a1b93f4 (a specific commit hash instead of a branch name), HEAD stops pointing at a branch file and points directly at a commit:

ref: refs/heads/main   ← normal
a1b93f4...            ← detached (the full 40-character hash)

This is “detached HEAD state,” which sounds like a medical emergency. It just means you’re parked at a commit with no branch label. If you make commits here, they won’t be attached to any branch. They’ll exist in the objects database, but once you switch away, no branch points to them. The reflog keeps them recoverable for a while, and after that Git will eventually garbage-collect them.

The fix: git checkout -b new-branch-name. That creates a branch file pointing to where you are. HEAD has something to point to again. You’re back on the map.
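At the file level, reattaching amounts to two small writes. The Python sketch below imitates only the pointer updates, using a fake .git folder and a placeholder hash; real Git also records reflog entries and validates the branch name. The function names is_detached and create_branch_here are hypothetical.

```python
import tempfile
from pathlib import Path


def is_detached(git_dir: Path) -> bool:
    """HEAD is detached when it holds a hash instead of a 'ref: ...' line."""
    return not (git_dir / "HEAD").read_text().startswith("ref: ")


def create_branch_here(git_dir: Path, name: str) -> None:
    """Create a branch file at the current commit, then point HEAD at it."""
    commit = (git_dir / "HEAD").read_text().strip()
    if commit.startswith("ref: "):
        raise ValueError("HEAD is already attached to a branch")
    branch_file = git_dir / "refs" / "heads" / name
    branch_file.parent.mkdir(parents=True, exist_ok=True)
    branch_file.write_text(commit + "\n")
    (git_dir / "HEAD").write_text(f"ref: refs/heads/{name}\n")


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    git_dir.mkdir()
    (git_dir / "HEAD").write_text("a" * 40 + "\n")  # placeholder hash, detached

    print(is_detached(git_dir))
    create_branch_here(git_dir, "new-branch-name")
    print(is_detached(git_dir))
    print((git_dir / "HEAD").read_text().strip())
```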

Where Worktrees Fit In

Normally there’s one HEAD file. That’s why you can only be on one branch at a time.

Worktrees get around this by creating additional HEAD files:

.git/worktrees/hotfix/HEAD
.git/worktrees/feature/HEAD

Each worktree has its own pointer into the same objects database. Multiple branches checked out simultaneously, no duplication, no conflict. The objects are shared. Only the pointers multiply.
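You can see the multiplying pointers by listing every HEAD file a repository owns. This Python sketch assumes the layout shown above and runs against a fake .git folder; head_files is an invented helper name.

```python
import tempfile
from pathlib import Path


def head_files(git_dir: Path):
    """Return the main HEAD file plus one HEAD per linked worktree."""
    heads = [git_dir / "HEAD"]
    worktrees = git_dir / "worktrees"
    if worktrees.is_dir():
        heads += sorted(worktrees.glob("*/HEAD"))
    return heads


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    git_dir.mkdir()
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    for name in ("hotfix", "feature"):
        (git_dir / "worktrees" / name).mkdir(parents=True)
        (git_dir / "worktrees" / name / "HEAD").write_text(f"ref: refs/heads/{name}\n")

    for head in head_files(git_dir):
        print(head.relative_to(tmp), "->", head.read_text().strip())
```

Three HEAD files, one objects folder. In a real repository, git worktree list gives you the same picture.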

The Mental Model

blob     = file contents (named by hash)
tree     = directory structure (maps names to blobs and subtrees)
commit   = snapshot (points to a tree + its parent)
branch   = a label (one file, one commit hash)
HEAD     = which label you're using right now
checkout = move the pointer, sync the files

Once this settles in, Git stops feeling like a mysterious black box. It’s a database of hashed objects, linked by pointers, with a few text files on top to track where you are.

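If you want to explore the object database yourself, look up git cat-file. Running git cat-file -t <hash> prints an object’s type, and git cat-file -p <hash> pretty-prints its contents, so you can read commits, trees, and blobs directly by their hash.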
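If you would rather poke at the files by hand, this Python sketch reads a loose object roughly the way git cat-file would: decompress, split off the header, return the rest. It only handles loose objects (not packfiles), assumes SHA-1 IDs, and uses invented helper names. The demo writes its own object into a temp folder so it runs anywhere.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_loose_object(git_dir: Path, kind: str, body: bytes) -> str:
    """Store an object under objects/<2 chars>/<38 chars> and return its ID."""
    raw = f"{kind} {len(body)}".encode("ascii") + b"\x00" + body
    object_id = hashlib.sha1(raw).hexdigest()
    path = git_dir / "objects" / object_id[:2] / object_id[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(raw))
    return object_id


def read_loose_object(git_dir: Path, object_id: str):
    """Return (kind, body) for a loose object, checking the recorded size."""
    path = git_dir / "objects" / object_id[:2] / object_id[2:]
    raw = zlib.decompress(path.read_bytes())
    header, _, body = raw.partition(b"\x00")
    kind, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return kind, body


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    oid = write_loose_object(git_dir, "blob", b"example contents\n")
    kind, body = read_loose_object(git_dir, oid)
    print(kind, body)
```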
