Destroying deleted branches in TFS

January 13, 2016

Everyone using a version control system will have ever used a delete command to delete files/folders which are not required anymore.

Team Foundation Verion Control (TFVC) treats a delete [tf delete] as a pending change in your workspace (type = delete). The final checkin command removes the item from the version control server, but does not delete the item permanently. All historical changes to the file remain available for lookup. In fact it’s also possible to undelete [tf undelete] an item and bring it back into play.

In only a few scenarios you might want to go one step further and really destroy the deleted files from TFS …

Recently a customer was complaining about the exponential growth of their TFS 2013 databases (exceeding 400GB for a specific TPC database). Running some SQL scripts on the tbl_content table of the Team Project Collection database revealed that a lot of version control and file container data was added monthly to the TFS database. Ranging from 2.5GB a few months ago up to 8GB in the last months.

After some investigation I noticed that the development team was creating a lot of feature branches (150+) for a specific application to isolate their feature development. This shouldn’t have a big impact on the exponential growth of the TFS databases because a branch is like a shallow copy of each of the files from the source branch. Both branches will originally refer to the same copy of each file in TFS. Extra changes will trigger a deltafication process to optimize storage. More info of this complex procedure is explained in this old TFS 2010 blog post which should still be valid for the latest version of TFS. But the main problem was that the feature branches also contained a lot of references (NuGet packages / binary files) which were added to version control as well and many dependency changes were initiated in the feature branches. As described in the old blog post, TFS will also try to compute deltas for binary files, but it’s way more difficult to predict database growth based on the sizes of these deltas. The results from deltafication vary greatly depending on what type of file it is, and the nature of the changes.

After further discussions, we decided to permanently destroy old deleted feature branches in version control and see the impact of the database size. I wrote a powershell script to iterate over a version control root folder to search for deleted branches and to destroy it only if the latest check-in did not occur in the last 100 days. In the script I used the startcleanup switch to immediately trigger the clean up process in TFS instead of waiting for the daily clean up background process.

I don’t remember the exact results on the database size, but it certainly decreased the total size of the TFS database for the involved Team Project Collection. With the upcoming planned upgrade to TFS 2015, we also used the Test Attachment Cleaner (part of the TFS Power Tools) to remove old diagnostic test data. So, in the end we were able to reduce the total size of the TPC database with a fair amount of GB in order to also speed up the duration of the future upgrade to TFS 2015.

Another initiative was started to not use TFS anymore to keep (physical) track of the dependencies. TFS is indeed a version control repository to store sources, but there are better tools out there to store software packages/artefacts.