Managing Repository Size
A Git repository that started at a few megabytes can balloon to gigabytes over time. Large repositories make clones and fetches slower and drag down many everyday Git operations. Understanding why repos grow, and how to diagnose and fix the problem, is an important skill for any team maintaining a long-lived codebase.
Why Repositories Grow
Binary files committed without LFS: images, videos, compiled assets, design files; every version stored in full
Build artifacts committed accidentally: node_modules/, dist/, .class files, compiled binaries
Secrets and credential files: .env files committed before gitignore was configured
Large files deleted in a later commit: the file is gone from the working tree but still lives in history
Log files, database dumps, or large data files committed for convenience
Merged feature branches with large temporary files that were never cleaned up
Diagnosing Repository Size
Check total object count and size
git count-objects -vH
Example output
count: 0
size: 0 bytes
in-pack: 24891
packs: 1
size-pack: 1.24 GiB ← your repo's on-disk size
prune-packable: 0
garbage: 0
size-garbage: 0 bytes
Total .git directory size
du -sh .git
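The size check above can also be scripted, for example in a CI job. The following is a minimal sketch, not an official Git feature: check_pack_size is a hypothetical helper name, and it assumes the input comes from git count-objects -v without -H, in which case size-pack is reported as a plain number of KiB.

```shell
# Hypothetical helper: reads `git count-objects -v` output on stdin and
# compares the size-pack value (in KiB) against a limit given in KiB.
# Exit status: 0 = within limit, 1 = over limit, 2 = no size-pack line seen.
check_pack_size() {
  limit_kib=$1
  awk -v limit="$limit_kib" '
    BEGIN { status = 0; found = 0 }
    /^size-pack:/ {
      found = 1
      if ($2 + 0 > limit + 0) {
        printf "over limit: %d KiB packed\n", $2
        status = 1
      } else {
        printf "ok: %d KiB packed\n", $2
      }
    }
    END {
      if (!found) { print "no size-pack line found"; exit 2 }
      exit status
    }
  '
}
```

A possible invocation is git count-objects -v | check_pack_size 1048576, where 1048576 KiB corresponds to 1 GiB. The limit is a free choice; pick whatever threshold your team wants to be alerted at.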
Finding the Biggest Objects in History
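The most direct way to identify what is bloating your repository is to combine git rev-list (to enumerate every object reachable from any ref) with git cat-file (to look up each object's type and size). The pipeline works on any repository, packed or not, and lists the largest blobs across your entire history. The sizes reported by %(objectsize) are uncompressed, so they show relative weight rather than exact on-disk cost.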
Find the 20 largest objects in git history
git rev-list --objects --all |
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
grep '^blob' |
sort -k3 -n -r |
head -20 |
awk '{print $3, $4}'
Example output, showing culprits
156293120 videos/product-demo.mp4
89478485 design/mockups-v3.psd
52428800 build/app-release.apk
41943040 data/dump-2023-01-15.sql
31457280 node_modules.tar.gz
24117248 assets/hero-video.mov
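Raw byte counts are hard to scan. As a small sketch, the "size path" lines produced by the pipeline can be piped through a formatter that prints mebibytes instead. The name humanize_sizes is a hypothetical helper, not part of Git.

```shell
# Hypothetical helper (not a Git command): reads "<bytes> <path>" lines on
# stdin and prints "<size in MiB><TAB><path>".
humanize_sizes() {
  awk '{
    # Everything after the first space is the path, so spaces in paths survive.
    path = substr($0, index($0, " ") + 1)
    printf "%.1f MiB\t%s\n", $1 / 1048576, path
  }'
}
```

Appending | humanize_sizes to the end of the pipeline turns a line such as 156293120 videos/product-demo.mp4 into 149.1 MiB followed by the path.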
Finding When a File Was Added
Find all commits that ever touched a specific file
git log --all --full-history -- "videos/product-demo.mp4"
# Show the commit that first added the file
git log --all --full-history --diff-filter=A -- "videos/product-demo.mp4"
Example output
commit 7d3e5f2a1b9c4d6e8f0a2b4c6d8e0f1a3b5c7d9e
Author: Jane Smith <jane@example.com>
Date: Tue Mar 14 11:23:45 2023
feat: add product demo video to assets
GitHub Size Guidelines
| Threshold | Behavior |
|---|---|
| < 1 GB | Ideal: fast to clone, no issues |
| 1 GB | GitHub recommended upper limit; start investigating |
| 5 GB | GitHub may email a warning about repository size |
| > 5 GB total | May experience degraded performance and clone failures |
| 50 MB per file | GitHub shows a warning when pushing files above this size |
| 100 MB per file | GitHub hard limit on individual file pushes (enforced by pre-receive hook) |
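The per-file thresholds in the table can be applied mechanically to the "size path" list produced earlier. This is an illustrative sketch only: classify_sizes is a hypothetical helper, and it assumes the 50 MB and 100 MB limits are counted in binary megabytes (1,048,576 bytes each).

```shell
# Per-file thresholds from the table, interpreted here as binary megabytes
# (an assumption for this sketch).
WARN_BYTES=$((50 * 1024 * 1024))
BLOCK_BYTES=$((100 * 1024 * 1024))

# Hypothetical helper: reads "<bytes> <path>" lines on stdin and prints
# "<label><TAB><path>", where label is BLOCKED, WARNING, or ok.
classify_sizes() {
  awk -v warn="$WARN_BYTES" -v block="$BLOCK_BYTES" '{
    path = substr($0, index($0, " ") + 1)
    if ($1 + 0 > block + 0)     label = "BLOCKED"
    else if ($1 + 0 > warn + 0) label = "WARNING"
    else                        label = "ok"
    printf "%s\t%s\n", label, path
  }'
}
```

Fed the example output above, the 156293120-byte video would be labeled BLOCKED and the 89478485-byte PSD would be labeled WARNING, while a file of exactly 52428800 bytes stays ok because the comparison is strictly greater than.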
Option 1: git filter-repo (Recommended)
git filter-repo is the modern, fast replacement for git filter-branch, which the Git documentation itself now discourages in favor of filter-repo. It rewrites history to remove specific files or paths from every commit in which they ever appeared.
Install git filter-repo
pip install git-filter-repo
# or
brew install git-filter-repo
Remove a specific file from all history
# IMPORTANT: work on a fresh clone
git clone --mirror https://github.com/user/repo.git
cd repo.git
# Remove the large file from all history
git filter-repo --path videos/product-demo.mp4 --invert-paths
# Also remove a whole directory
git filter-repo --path node_modules --invert-paths
# Remove multiple paths
git filter-repo --path videos/product-demo.mp4 --path data/dump.sql --invert-paths
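When many paths need to go, listing them by hand gets tedious. One possible approach, sketched here under assumptions, is to derive the path list from the "size path" output of the earlier pipeline and hand it to filter-repo through its --paths-from-file option. The helper name paths_over_limit and the file name big-paths.txt are hypothetical.

```shell
# Hypothetical helper: reads "<bytes> <path>" lines on stdin and prints the
# unique paths whose size exceeds the limit (in bytes) given as $1.
paths_over_limit() {
  awk -v limit="$1" '$1 + 0 > limit + 0 {
    # Keep everything after the first space so paths with spaces survive.
    print substr($0, index($0, " ") + 1)
  }' | sort -u
}

# Possible usage (left as comments because it rewrites history):
#   <largest-objects pipeline> | paths_over_limit 52428800 > big-paths.txt
#   git filter-repo --paths-from-file big-paths.txt --invert-paths
```

Review big-paths.txt before running filter-repo: every path listed there is removed from all of history, including versions of the file that were below the limit.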
Force push rewritten history
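# filter-repo normally removes the origin remote as a safety measure; add it back if it is missing
git remote add origin https://github.com/user/repo.git
# Push all rewritten branches and tags
git push origin --force --all
git push origin --force --tags
# Expire reflogs and run gc so the old objects are actually deleted from the local clone
git reflog expire --expire=now --all
git gc --prune=now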
Option 2: Git LFS for Future Large Files
For new large files going forward, use Git LFS. This keeps binary data out of the Git object store while still tracking it with version control. See the Git LFS page for full details.
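To make "keeps binary data out of the Git object store" concrete: for an LFS-tracked file, Git commits a small text pointer instead of the binary itself. The sketch below prints what such a pointer looks like for a given file, following the version 1 pointer layout (version, oid, size). It is an illustration only; lfs_pointer_for is a hypothetical helper and no substitute for the git lfs client.

```shell
# Hypothetical helper: prints the pointer text that would stand in for the
# given file under Git LFS (spec v1: version, sha256 oid, size in bytes).
lfs_pointer_for() {
  file=$1
  # Hash the content via stdin so the file name never appears in the output.
  if command -v sha256sum >/dev/null 2>&1; then
    oid=$(sha256sum < "$file" | awk '{ print $1 }')
  else
    oid=$(shasum -a 256 < "$file" | awk '{ print $1 }')
  fi
  size=$(wc -c < "$file" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\n'
  printf 'oid sha256:%s\n' "$oid"
  printf 'size %s\n' "$size"
}
```

The pointer is three short lines regardless of how large the real file is, which is why LFS-tracked binaries add almost nothing to repository history.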
Prevention is Better than Cleanup
Add a comprehensive .gitignore before the first commit: include node_modules/, dist/, build/, *.log, *.env
Use .env.example (committed) and .env (gitignored) for environment variables
Run git lfs track "*.psd" "*.mp4" "*.ai" before adding any binary assets
Add a pre-commit hook that rejects files over a size limit (e.g., 10 MB)
Review git status and git diff --stat before every commit
Set up GitHub's secret scanning and size warnings on your organization
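The first item in the list can be scripted so that a project bootstrap step always leaves the key ignore patterns in place. This is a minimal sketch: ensure_ignored is a hypothetical helper name, and the patterns in the usage note are the ones from the list above.

```shell
# Hypothetical helper: appends each pattern to the given ignore file unless
# an identical line is already present. Usage: ensure_ignored FILE PATTERN...
ensure_ignored() {
  file=$1
  shift
  touch "$file"
  for pattern in "$@"; do
    # -x: match the whole line, -F: treat the pattern as a literal string
    grep -qxF -- "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
  done
}
```

For example, ensure_ignored .gitignore 'node_modules/' 'dist/' 'build/' '*.log' '*.env' can be run repeatedly without creating duplicate entries. Quote the patterns so the shell does not expand the wildcards before they reach the function.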
Pre-commit hook to reject large files
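#!/bin/sh
# Save as .git/hooks/pre-commit and chmod +x
LIMIT=10485760 # 10 MB in bytes
# Staged files that were added, copied, or modified, one path per line.
# Keep this pipeline as the last command: its exit status becomes the hook's.
git diff --cached --name-only --diff-filter=ACM |
while IFS= read -r FILE; do
    # Measure the staged blob, not the working-tree copy, so files already
    # tracked by Git LFS (staged as small pointers) are not rejected
    SIZE=$(git cat-file -s ":$FILE" 2>/dev/null) || continue
    if [ "$SIZE" -gt "$LIMIT" ]; then
        echo "Error: $FILE is $((SIZE / 1048576)) MB, over the 10 MB limit."
        echo "Use Git LFS for large files: git lfs track '$FILE'"
        exit 1
    fi
done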