Friday, October 07, 2011

Removing ignored files from a git repository

When I am using TFS, Visual Studio manages the files which should not be committed. So when I create a git repository I often forget to add the .gitignore file. The first reminder I get about my oversight is when I see all the DLLs being added during the first commit.

Today I decided to find out how to clean up the repository. First I added this .gitignore file to my repository:


Then I searched the internet. The first hit from Google was this post by Aral Balkan. The content and the comments provided me with all the information I needed to manage the git repository.

Searching and cleaning the repository

A git repository can be thought of as an isolated file system, so commands can be run against it in much the same way as against a normal file system.
The first command I needed was git ls-files, which works in much the same way as ls. The command git ls-files -i -X .gitignore lists all the files in the repository which would have been excluded had I remembered to set up the .gitignore.
Removing a file from git is done using git rm. As git is a versioned file system, there is both the file on disk and a reference to that file in the index. The command git rm --cached removes the reference from the index but leaves the file on disk.
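
To see the two commands working together, here is a minimal sketch run against a throwaway repository. The directory, the file names and the single *.dll rule are made up for illustration. Recent versions of git refuse -i unless -c or -o is also given, so the sketch adds -c (list tracked files), which older versions accept as well.

```shell
# Sketch only: a scratch repository with hypothetical file names.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a DLL by mistake, before any .gitignore exists.
echo "binary" > app.dll
echo "source" > app.cs
git add app.dll app.cs
git commit -q -m "first commit, no .gitignore"

# Add the ignore rule after the fact.
echo "*.dll" > .gitignore

# List tracked files (-c) that match the ignore patterns (-i -X).
ignored=$(git ls-files -c -i -X .gitignore)
echo "Tracked but ignored: $ignored"

# Remove the index entry only; the file itself stays on disk.
git rm --cached -q -- app.dll
tracked=$(git ls-files)
echo "Still tracked: $tracked"
```

The removal is only staged at this point. It still needs a git commit to become part of the history, and earlier commits continue to contain the file.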

A script to do that

Manually removing each file from the index would take some time. It would also go against all of my computing instincts. The job needs a script.


Here I simply loop round the results from git ls-files, sending each one to git rm. I am sure there are many ways to achieve the same result, but this method worked well for me. I am using Git Bash on Windows.
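
A loop of that kind might look like the following sketch. The function name untrack_ignored is hypothetical, the -c flag is there because newer versions of git require it alongside -i, and the loop assumes file names without newlines or characters that git would quote in its output.

```shell
# Sketch only: untrack every file matching the patterns in .gitignore.
# Run from the root of the repository; the function name is made up.
untrack_ignored() {
    git ls-files -c -i -X .gitignore | while IFS= read -r file; do
        # --cached drops the index entry and leaves the file on disk.
        git rm --cached -q -- "$file"
    done
}
```

After calling untrack_ignored, git status should show the ignored files as staged deletions, ready to be committed along with the new .gitignore.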

1 comment:

Simon said...

Ooh, that's very pleasing.