Friday, October 07, 2011

Removing ignored files from a git repository

When I am using TFS, Visual Studio manages the files which should not be committed. So when I create a git repository I often forget to add the .gitignore file. The first reminder I get about my oversight is when I see all the DLLs being added during the first commit.

Today I decided to find out how to clean up the repository. First I added this .gitignore file to my repository:

Then I searched the internet. The first hit from Google was this post by Aral Balkan. The content and the comments provided me with all the information I needed to manage the git repository.

Searching and cleaning the repository

A git repository can be thought of as an isolated file system, so commands can be run against it in much the same way as against a normal file system.
The first command I needed was git ls-files, which works much like ls. The command git ls-files -i -X .gitignore lists all the files in the repository which would have been excluded had I remembered to set up the .gitignore.
Removing a file from git is done using git rm. As git is a versioned file system, there is the file on disk and a reference to that file in the index. The command git rm --cached removes the reference from the index but leaves the file on disk.
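To see the two commands working together, here is a minimal sketch that can be run in a throwaway repository. The file names App.dll and App.cs and the single *.dll rule are assumptions made for illustration only. The extra -c flag is also an addition: newer Git versions refuse -i unless -c or -o is given as well, and older versions accept it without complaint.

```shell
#!/usr/bin/env bash

# Throwaway repository so the commands can be tried safely.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Stage a build artefact by mistake, before any .gitignore exists.
echo "binary" > App.dll
echo "source" > App.cs
git add App.dll App.cs

# Add the ignore rule after the fact (illustrative rule only).
echo "*.dll" > .gitignore

# List the indexed files that match the ignore rules.
# -c (cached) must accompany -i on newer Git versions.
ignored=$(git ls-files -c -i -X .gitignore)
echo "$ignored"

# Remove the index entry only; the file itself stays on disk.
git rm -q --cached App.dll

# What is still tracked afterwards.
remaining=$(git ls-files)
echo "$remaining"
```

After this runs, App.dll is no longer in the index but is still present in the working directory.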

A script to do that

Manually removing each file from the index would take some time. It would also go against all of my computing instincts. The job needs a script.

Here I simply loop round the results from git ls-files sending each one to git rm. I am sure there are many ways to achieve the same result but this method worked well for me. I am using git bash and Windows.
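A loop of the kind described can be sketched as below. This is not necessarily the script as originally written; the function name untrack_ignored is hypothetical, and the -c and -z flags are additions (the first for newer Git versions, the second so that file names containing spaces survive the loop).

```shell
#!/usr/bin/env bash

# untrack_ignored is a hypothetical helper name, not part of Git.
# Run it from the root of the repository, next to .gitignore.
untrack_ignored() {
    # -c is required with -i on newer Git versions.
    # -z separates file names with NUL instead of newline.
    git ls-files -c -i -z -X .gitignore |
    while IFS= read -r -d '' f; do
        # Remove the index entry only; the file on disk is untouched.
        git rm -q --cached -- "$f"
    done
}
```

Calling untrack_ignored from the repository root and then committing records the removals; the files themselves remain in the working directory.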


Simon said...

Ooh, that's very pleasing.

Reuben Helms said...

I'm using bash and Windows as well, and some of my files have spaces in them.

This slight modification will include spaces as a part of the file name.


# Split only on newline (and backspace), so spaces stay inside file names
IFS=$(echo -en "\n\b")
for f in $(git ls-files -i -X .gitignore)
do
  git rm --cached "$f"
done

It's probably not the quickest way to do it, but it handles spaces in file names.

Patrick Hastings said...

Hi Keith,

Very useful little script that.

Tamás Árpád said...

It can be made even simpler:
git rm --cached `git ls-files -i -X .gitignore`
note the backticks