http://www.developer.com/

Subversion vs. Git: Choosing the Right Open Source Version Control System


September 7, 2010

As with everything in the open source world, version control systems (VCSs) come in several flavors. The grandfather of open source VCSs is CVS, a tool that was the de facto standard in the industry for several years until the likes of Subversion came along and made it almost obsolete.

VCSs fall into two broad categories:

  1. Centralized version control systems -- The repository is centrally stored on a server.
  2. Distributed version control systems -- Every client has the entire repository stored locally.

CVS and Subversion are two of the most popular centralized VCSs, while Git, Mercurial, Bazaar and Monotone are their distributed counterparts.

Choosing the right type of VCS for a project depends on several things, including:

  • The types of files you are working with
  • The kinds of people who are going to use the VCS
  • The operating systems they use

The first choice you need to make, however, is whether to use a centralized system or a distributed system, which depends on your preferences and your team's experience with each kind of VCS. Since Subversion and Git are the most popular in their respective categories, I have chosen to match them against one another in this article. Each has its own advantages. To help you choose the right one, I will explain their major differences and then list the advantages and disadvantages of each.

Differences Between Subversion and Git

Subversion is currently the most popular centralized VCS. In 2000, CollabNet, Inc. began seeking developers to build a replacement for CVS that would keep the CVS fundamentals but fix its most glaring issues. Subversion became self-hosting in August 2001, reached version 1.0 in February 2004, and has been extremely popular among developers ever since.

As previously mentioned, Subversion is a centralized VCS. Clients check out the repository (make a local working copy of it) on their own machines, and after making changes, they check in, or commit, those changes to the server. As with any VCS, you can compare two versions, create a branch and merge it back into the main line, and resolve conflicts when they occur.

Git, on the other hand, was built from the ground up as a distributed version control system. Every user's clone holds a complete copy of the code, history and meta information, so there are as many redundant copies as there are users. Even if the central repository is lost to a system failure, all the data, other than the latest changes that had not yet been shared, can be retrieved from any client machine. In a centralized VCS like Subversion, only the central repository has the complete history. Clients must communicate over the network to fetch the history of any file they need. So, if the central system goes down, you have to fall back on the last backup of the server.
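The recovery argument above can be illustrated with a toy model. The following is a minimal sketch, not how Git or Subversion are implemented: the `Repository` and `WorkingCopy` classes and the names `alice` and `bob` are hypothetical, and a "history" is reduced to a list of commit messages.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Repository:
    """A toy repository: just an ordered list of commit messages."""
    history: List[str] = field(default_factory=list)

    def commit(self, message: str) -> None:
        self.history.append(message)

    def clone(self) -> "Repository":
        # Distributed model: a clone receives a full copy of the history.
        return Repository(history=list(self.history))


@dataclass
class WorkingCopy:
    """Centralized model: the client only records which revision it holds."""
    revision: int


central = Repository()
central.commit("initial import")
central.commit("fix login bug")

alice = central.clone()                           # distributed client
bob = WorkingCopy(revision=len(central.history))  # centralized client

central = None  # simulate losing the server

recovered = alice.clone()
print(recovered.history)  # the full history survives in Alice's clone
print(bob.revision)       # Bob knows a revision number, but has no history
```

In this model, any distributed client can rebuild the server, while the centralized client can contribute only the files of the one revision it checked out.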

So, the biggest difference between Subversion and Git is that in Subversion only the central repository stores the complete history, while in Git every client has all the information. An immediate reaction might be to give a couple of extra points to Git because the chances of losing data are minimized, but this design has its downsides too. If the source code you store in Git is proprietary, you will want to secure it as much as possible. In a distributed system, that is much harder because there are so many complete copies of the data.

With Subversion, all the data must be written to, and is read from, the central server. Securing that server goes a long way toward securing your data, although each working copy still holds a checkout of the source. To ensure the durability of the data, you can set up regular backups of the Subversion server.

Platform Support: Advantage Subversion

As I mentioned earlier, one consideration in choosing a VCS is the operating system on which you need to access your data. Subversion works well on all major operating systems, while Git's support for Windows is weaker. Windows lacks a mature graphical application for accessing Git repositories, which has left Git popular primarily among Linux/UNIX users and some Mac users. Many of the best-known projects that use Git, such as the Linux kernel, the Fedora project and Wine, are Linux-based.

Ease of Use with User Interfaces: Advantage Subversion

Currently, Subversion has a wider range of user interface tools than Git. For example, Subversion plugins are available for most major IDEs on all platforms, as are a Windows Explorer shell extension and a number of native tools for Windows and Mac OS X.

Git, on the other hand, is primarily a command-line tool. While some developers are comfortable using the command line, others -- especially designers -- dread using Git because it lacks a graphical user interface.

Branch Handling: Advantage Git

One of the biggest advantages of a VCS is the ability to branch off from the main line of development. In Git, branches are not a dirty word -- they are used often, several times a day by some developers. For every feature or bug fix, you create a new branch, and when you are done you merge it back, usually painlessly. In Subversion, creating a branch is easy, but merging is more error-prone: when a merge produces conflicts, you may need to merge the conflicting files by hand.
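To make the idea of merging and conflicts concrete, here is a simplified file-level three-way merge. It is a sketch of the general technique, not the algorithm Git or Subversion actually use (both also merge within files, line by line). The function name `three_way_merge` and the sample branch contents are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def three_way_merge(
    base: Dict[str, str], ours: Dict[str, str], theirs: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Merge two branches file by file, using their common ancestor `base`.

    Returns the merged files and the names of the files that conflict.
    """
    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        result: Optional[str]
        if o == t:
            result = o              # both sides agree (or both deleted it)
        elif o == b:
            result = t              # only their branch changed the file
        elif t == b:
            result = o              # only our branch changed the file
        else:
            conflicts.append(name)  # both changed it in different ways
            continue
        if result is not None:      # None means the file was deleted
            merged[name] = result
    return merged, conflicts


base = {"app.py": "v1", "README": "hello"}
feature = {"app.py": "v2", "README": "hello"}       # edits app.py only
bugfix = {"app.py": "v1", "README": "hello world"}  # edits README only

merged, conflicts = three_way_merge(base, feature, bugfix)
print(merged)     # both edits are kept
print(conflicts)  # empty: the branches touched different files
```

A conflict arises only when both branches change the same file differently relative to the common ancestor. That is the case a person has to resolve by hand.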

Speed of Performance and Space Occupied: Advantage Git

When it comes to comparing the processing speeds and the space occupied by the meta information of the VCSs, Git wins hands down. In Git, because the entire repository is available locally to the user, the following actions are performed locally and there is no network latency:

  • Viewing history of a file
  • Performing a diff
  • Obtaining another version of a file
  • Merging branches
  • Committing changes

The only operations that need a network connection are those that exchange data with other repositories, such as clone, fetch, pull and push. Compare this to Subversion, where viewing history, committing, branching and merging all require a round trip to the server.

Subversion's space requirements are also much larger than Git's. Take for example the Mozilla project, which requires over 12 GB in Subversion to store the 10-year history of the project source code. In Git, the same history is stored in 420 MB -- roughly a 30x reduction in space.
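As a quick check of that ratio, the arithmetic can be written out. This is only a back-of-the-envelope sketch that assumes "12GB" means 12 × 1024 MB; the variable names are illustrative.

```python
svn_size_mb = 12 * 1024  # "over 12GB", taken here as 12 x 1024 MB (an assumption)
git_size_mb = 420

ratio = svn_size_mb / git_size_mb
print(f"{ratio:.1f}x")   # a little under the rounded 30x figure
```

With decimal gigabytes (12,000 MB) the ratio comes out slightly lower still, so "30x" is best read as a rounded figure.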

Part of the reason for this difference is the way each system stores its data. Git compresses its history aggressively, packing objects as zlib-compressed deltas. Working copies differ as well: Git keeps a small index file holding about 100 bytes of data per tracked file, while a Subversion working copy stores two versions of each file: one for the user to actually work with and the other hidden in the .svn/ directory to support operations such as status, diff and commit.
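The working-copy figures above can be turned into a rough estimate. This sketch uses only the two numbers quoted in the text (about 100 bytes of index data per file, and one extra hidden copy of each file under .svn/). The file sizes and function names are hypothetical. It deliberately ignores everything else, including Git's own object store under .git/ and Subversion's other metadata.

```python
from typing import List

INDEX_BYTES_PER_FILE = 100  # approximate figure quoted in the text


def svn_pristine_overhead(file_sizes: List[int]) -> int:
    """Extra bytes for the hidden pristine copy of every file in .svn/."""
    return sum(file_sizes)


def git_index_overhead(file_sizes: List[int]) -> int:
    """Approximate bytes of index data: one small entry per tracked file."""
    return INDEX_BYTES_PER_FILE * len(file_sizes)


sizes = [4_000, 12_000, 250_000]     # hypothetical file sizes in bytes
print(svn_pristine_overhead(sizes))  # grows with total file size
print(git_index_overhead(sizes))     # grows only with the number of files
```

The point of the comparison is the scaling: the pristine copies grow with the total size of the checked-out files, while the index grows only with how many files there are.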

Conclusion

If you have ever collaborated with other people on a project, then you are well aware of the frustrations of sharing files. Whether you email files to one another or upload them to a server, every so often you curse one of your team members because they overwrote a change that you made. The lines of code you wrote are gone -- buried for good. The simplest way to avoid this problem is to make sure that no two people are working on the same file at the same time. As simple as that sounds, it can be quite a pain to enforce such a policy on a team. Enter version control systems.

In this article, I covered some of the key differences between the Git and Subversion version control systems. There are several others that you might want to consider before choosing one over the other. If you are not using any version control yet, start using one immediately, if not sooner. If you can't make up your mind about which one to go with, I suggest using Subversion because its learning curve is less steep.
