Git History Rewrite at Scale: Removing 100MB+ Files Safely

Introduction

Large files inside Git repositories are a silent problem. They increase clone times, inflate repository size, and, on platforms like Bitbucket Cloud, can block pushes entirely once a file exceeds 100MB.

During a migration exercise, we encountered multiple repositories containing large binary files embedded directly in Git history. Some were intentionally added during testing; others were legacy artifacts. Regardless of origin, the impact was the same: repository growth, push failures, and migration risk.

We needed a scalable, production-safe solution to:

  • Identify files larger than 100MB
  • Preserve those files safely
  • Remove them from Git history
  • Maintain traceability
  • Avoid Git LFS
  • Process multiple repositories in batch
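
As a rough illustration of the first goal, the sketch below lists every blob in a repository's history that exceeds a size threshold. It is not necessarily the method used in this article. It relies on two standard Git commands, `git rev-list --objects --all` and `git cat-file --batch-check`. The function names, the `THRESHOLD_BYTES` constant, and the reading of "100MB" as 100 MiB are assumptions made for the example.

```python
import subprocess

# Assumption: "100MB" is read as 100 MiB; adjust to match the hosting platform's limit.
THRESHOLD_BYTES = 100 * 1024 * 1024


def parse_large_blobs(batch_check_output, threshold=THRESHOLD_BYTES):
    """Return (size, sha, path) tuples for blobs over the threshold, largest first."""
    found = []
    for line in batch_check_output.splitlines():
        # Expected line shape: "<type> <sha> <size> <path>"; the path may contain spaces.
        parts = line.split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue  # skip commits, trees, and tags
        size = int(parts[2])
        if size > threshold:
            found.append((size, parts[1], parts[3]))
    return sorted(found, reverse=True)


def scan_repo(repo_path, threshold=THRESHOLD_BYTES):
    """Hypothetical helper: scan all objects reachable from any ref in repo_path."""
    # Enumerate every object reachable from any ref, with its path where one exists.
    objects = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Ask Git for each object's type and size; %(rest) echoes the path back.
    sizes = subprocess.run(
        ["git", "-C", repo_path, "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_large_blobs(sizes, threshold)
```

Calling `scan_repo("/path/to/repo")` on a local clone would return the oversized blobs with their object IDs and paths, which could then be copied elsewhere for preservation and recorded for traceability before any rewrite. Because the scan walks history rather than the working tree, it also finds files that were deleted in later commits but still occupy space in the repository.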

This article explains the approach, implementation, and verification process.