Search This Blog


DVCS Branching - Why Mercurial Sucks and What Mercurial Advocates Are Missing


Version control software is basically used to track changes to files (usually plain-text files) over time. It is primarily used by software developers to track source code changes. It used to be that a centralized server was used to host a centralized repository of changes, and clients would connect to it to checkout or check-in changes.

Then came a revolution known as distributed version control. This model basically eliminates the centralized repository and gives every user (developer) their own complete clone of the repository.

There are basically two popular distributed version control systems (DVCS) available right now, and both are free/libre/open source software. They are Git and Mercurial. Both were developed to replace BitKeeper for use in maintaining the Linux kernel, but only one was written by Linus himself so the other one sucks. ;) There are other free ones and also commercial ones, but I don't know of any good reason to use them. Linus Torvalds did at one point say that if for whatever reason you needed a commercial DVCS that BitKeeper was the one to use.

Git and Mercurial work relatively similarly. As far as I know, they both represent history internally using a directed acyclic graph (DAG). Combining this with the distributed design means that branching and merging happens to work quite well in both (it has to). Effectively, whenever you work in a separate clone of the repository you create a new physical branch, even if you don't mean to. As a result, you often have to merge when you synchronize your local repository. That only tells part of the story of branching though.

Branching In Mercurial

The Mercurial community has basically come up with 3 branching strategies that you can use with Mercurial. The reason there's so many of them is because none of them really works very well. Proponents look for ways to justify these misfeatures, but when you actually put the strategies into practice you see that it's nonsense.

Named Branches

The core branching feature is called named branches. This is what the `hg branch' and `hg branches' commands manage. The implementation is a little bit finicky. It's effectively a name attribute stored with each changeset. That's all there really is to it. I imagine it must have been inspired by Subversion properties or something silly like that. The way it works is that you are basically free to set the working branch name anytime you want with `hg branch <name>'. The attribute is remembered and when you commit it is stored in the changeset as a permanent part of history. The head of a named branch is basically the most recent commit with that branch name.

Branches in Mercurial are shared by default. If you want to only synchronize selective branches then you need to explicitly specify only the ones you want synchronized with every relevant command. I find it extremely tedious. It basically makes it difficult for anyone to work under the radar without distracting others. In a big project you can expect that many people are not going to be interested in the work that others are doing. You're only really interested in the parts of the project that impact you in some way and don't want to be bothered with all of the other noise. Mercurial makes it difficult to do that.

Another problem is that in this distributed environment there is only one namespace for branches, and branches are permanent parts of history. This means that bad things can happen if two completely distinct developers happen to choose the same branch name for different branches. It also means that if you just want to create an experimental branch and throw it away later then you are forced to share it with the world anyway (short of history editing, which the Mercurial community also discourages). A workaround that was added later is `hg commit --close-branch', which basically adds a new changeset with a new special attribute to indicate that the named branch is "closed" (i.e., finished with). This basically just tells Mercurial to hide it from view by default. The branch is automatically reopened though if you accidentally commit onto it again. In my experience, it's very difficult to keep things straight when the entire universe's branches are always available to the `hg update' command, which is used to update your working copy to some specific version (e.g., the latest) as well as change branches.

Repository Clone

What the Mercurial community would have you do for experimental changes that you may wish to throw away is clone the repository again and work in a throw-away repository instead. Due to the nature of the distributed DAG history you are actually automatically branching whenever you work on a cloned repository. This solution does work, but it's very Subversion-like in that you have to manually manage "branches" within your file system. Mercurial will use hard links on supported file systems to save space if you clone locally (basically each clone will share whatever physical files on disk that it can), but that's little comfort. In my experience, this strategy doesn't work overly well, and it clutters your project namespace (e.g., I generally keep all source code projects in ~/src or "%USERPROFILE%\src").

You also can't directly share these experimental changes with other people or with a shared remote repository without pushing a new anonymous head and screwing everybody up. At best you could tell your collaborator to explicitly clone anew, and keep the history separate from the main history, just as you have done, but that's a very error-prone approach.


A lot of Git users complain about the branching options in Mercurial, and the Mercurial community's solution was to offer an alternative branching mechanism that is similar to how branches work in Git. The solution is called "bookmarks". Bookmarks began as a Mercurial extension, but has since been merged into the core.

A bookmark is basically an association between a name and a commit identifier. These associations are stored in .hg/bookmarks. When you commit on a bookmark the bookmark is automatically updated to point to the new head. Since bookmarks are stored outside of history they can be easily created, renamed, and deleted at any time. They are very lightweight compared to the other branching options. Indeed, they are similar to a Git branch. The problem with bookmarks is that they aren't shared by default. If you choose to use bookmarks to manage branches in your project then you either have to tediously communicate each bookmark with all collaborators or risk them getting quite confused because there will be multiple seemingly anonymous branch heads when they pull your branches in.

Basically, you have to explicitly push bookmarks with `hg push -B <name>', and explicitly pull them with `hg pull -B <name>'. You can check for new bookmarks in the remote repository with `hg incoming -B'. Unfortunately, this puts a lot of responsibility on the collaborators to manually keep bookmarks up to date. The documentation says that once both sides of the connection have a bookmark it will be updated automatically, but that's not much help. Without them, at least, Mercurial is quite happy to jump branches on you without really raising any alarms (I'm not sure it would stop you even with the bookmarks, to be honest, though I've been told it will). The current implementation is just not sufficient for real world usage. Most of my collaborators are even in the same room. I can't imagine if you had people spread out across a building or the world.

Branching In Git

Git is quite smart about most of what it can do. A branch in Git is a much more logical concept. You get to think of Git branches as tangible things. There is basically only one strategy for branching and it works quite well. Perhaps most importantly it's distribution friendly.

Git manages your branches for you. A branch is actually represented by a very light file that basically just references the head commit identifier of the branch. This allows you to create, rename, and delete branches at will, very efficiently, similar to Mercurial bookmarks. Git takes care of efficiently storing the actual history for you so you can be sure that it doesn't duplicate anything needlessly. The operation of switching branches is also different than the operation of updating your working copy [fix:] local branches (`git checkout' and `git pull'[1], respectively).

Git branches are not all implicitly synchronized with a remote repository. If you `git push' then only the current branch's commits will be pushed. Similarly, when you `git pull' then only the current branch will be pulled. Branches are namespaced by repository. Local ones are just in the global namespace, but remote branches are namespaced by the remote name. For example, origin/master refers to the 'master' branch of the remote 'origin'. This puts the user in complete control of their repository. The remote might refer to a branch as 'foo', but you can use a local branch named 'bar' if you want to. Git doesn't implicitly synchronize any branches with remote branches. You have to explicitly associate them in configuration before they will implicitly be synced. Otherwise, you can explicitly choose which local and remote branches you want to synchronize.


Mercurial proponents typically try to defend Mercurial's branching madness with excuses that branches should be permanent history and that it's a feature, not a bug. The reality is that Mercurial has a very non-distributed nature to its branching policies. Instead of letting each user be in control of his view of the world, the Mercurial community seems to be in favor of encouraging everybody to always share the entire world with the entire world.

In short, Git is a much better DVCS for branching (and in my experience, pretty much everything else too, but I digress). Mercurial is a good alternative for people that need extra hand holding and don't plan to make extensive use of the tools. Branching in Mercurial is so painful that we basically don't do it where I work, just as we didn't really do it with Subversion. The little bit of branching that we do do ends up being ugly and distracting and it's very error-prone in my experience.


[1] - `git pull' is equivalent to `git fetch && git merge'. Note that if you're wise to it then you can use git rebase instead of git merge if you want to linearize [local] history.