
Branch is Easy, Merge is ****!

The title of this column is a quote from a frustrated Subversion user. He had used SnapshotCM on a different project, and was expressing his frustration about the relative difficulty of merging in Subversion. His comment came to mind as I read the recent announcement that a commercial company that has built its business on top of Subversion plans to fix several of Subversion's problems, notably its rename and merge issues. Read on for my guess as to what's really happening, what it means, and whether they'll really be able to fix it.

Problem Description

A Subversion repository stores a single versioned hierarchy of files. That's it. One hierarchy per repository. In order to store multiple projects in a repository, a user designates locations within the hierarchy for each project. That is, perhaps /Projects/P1, /Projects/P2, etc. Even branches are stored in the same hierarchy. In fact, projects and branches and project directories are all stored in the same hierarchy without any inherent distinctions between them. That is, path /A/B/C, for example, could be a project root, a branch root, a directory inside a branch, a directory outside a branch, or even the branch of part of a project, and even Subversion doesn't explicitly know. So how do you keep them straight? Well, that's up to you. Yes, Subversion depends on its users to establish and maintain the distinction between directories and projects and branches. While that's a problem, it doesn't appear to be one they intend to address. So what are they addressing? Merge and rename.
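
To make the ambiguity concrete, here is a minimal sketch in Python. It is a toy model, not Subversion code: the path list and the classify_by_convention helper are hypothetical, and the trunk/branches naming it relies on is only the commonly recommended layout convention, which is an assumption added here. The point is that a path's role can only be guessed from naming, so a path that ignores the convention cannot be classified at all.

```python
# Toy model (not Subversion code): the repository is one tree of paths,
# and nothing in a path records whether it is a project, a branch, or a
# plain directory. The role is inferred purely from a naming convention.
REPO_PATHS = [
    "/Projects/P1/trunk",
    "/Projects/P1/branches/fix-42",
    "/Projects/P1/branches/fix-42/src",
    "/A/B/C",
]

def classify_by_convention(path: str) -> str:
    """Guess a path's role from the (assumed) trunk/branches layout."""
    parts = path.strip("/").split("/")
    if "branches" in parts:
        i = parts.index("branches")
        if i + 1 == len(parts):
            return "branches container"
        # The entry directly under "branches" is taken to be a branch root.
        return "branch root" if i + 2 == len(parts) else "directory inside a branch"
    if "trunk" in parts:
        return "trunk root" if parts[-1] == "trunk" else "directory inside trunk"
    # No convention marker in the path: the role cannot be determined.
    return "unknown"

roles = {p: classify_by_convention(p) for p in REPO_PATHS}
```

In this sketch /A/B/C comes back as "unknown", and only users who know how the repository was laid out can say what it is.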

The Rename Problem

The rename problem is actually pretty simple: Subversion doesn't have a true rename operation. Rather, a rename creates a copy of the file in the new location and deletes the file from the original location, all in one atomic action. So does Subversion track the relationship between the old and new names? Yes, though the new name appears to be a new file. This affects merging if, for example, additional changes are made to the file under its original name. The documentation isn't too clear on this, other than to note the various situations after a rename where the merge needs help from the user (and I thought it was supposed to be the other way around)!
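
The copy-plus-delete behavior described above can be sketched with a toy model in Python. This is an illustration only and not Subversion's implementation; the dictionaries, the function names, and the revision number are all hypothetical. It shows why a path-based comparison sees one added file and one deleted file, with the link between them kept only as a side note on the copy.

```python
# Toy model (not Subversion's implementation): a tree maps paths to content,
# and a rename is modeled as "copy to the new path, delete the old path".

def rename_as_copy_delete(tree, copy_source, old, new, revision):
    """Apply a rename as a copy plus a delete in one step."""
    tree[new] = tree[old]                # the copy appears as a new file
    copy_source[new] = (old, revision)   # where the copy came from
    del tree[old]                        # the original path is removed

def path_diff(before, after):
    """Compare two trees by path only, as a path-based tool would."""
    added = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    return added, deleted

tree = {"src/util.c": "v1"}
copy_source = {}
before = dict(tree)
rename_as_copy_delete(tree, copy_source, "src/util.c", "src/helpers.c", revision=7)
added, deleted = path_diff(before, tree)
# added is ["src/helpers.c"] and deleted is ["src/util.c"]
```

A later change made to src/util.c elsewhere has no path to land on in this tree, which is the kind of situation where a merge has to ask the user for help.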

Besides the rename issues affecting merge, Subversion merge has other issues. Branch merge involves copying the changes unique to one branch onto another branch. Since a branch is initially created by copying from another branch, Subversion notes that source location and revision in the root of the copied hierarchy, and uses that information to figure out what's changed when merge is done later. However, you must merge all changes on a branch to avoid confusing Subversion for later merges.

The difficulty of Subversion merge is also apparent if you read the manual. It's full of warnings and exceptions about things the user (guru) needs to do properly. Yes, it has gotten better with the newer releases, but see, for example, the discussion about reintegrating a branch (i.e., repeated merge) in the latest manual and the discussion of the mergeinfo property. Why does the user (or even guru) need to care? The bottom line is that Subversion's handling of merge information is overly complex and fragile, and results in an overly complex model for the user to deal with. The original design didn't handle merge info at all, and changes since then have failed to address the fundamental design issues.

Any competent merge tool today must handle the details of a branch merge automatically. That Subversion burdens the user with merge details means that branching and merging fail to be useful for most teams, except in rare cases where there is a strong need, and then any branch must be carefully handled by the resident Subversion guru.

Design or Implementation Problem?

As alluded to above, these issues stem from fundamental design choices. The Subversion model does not have a project object or a branch object. It doesn't record merges in a file's history. It can't represent a true rename. As a result, Subversion depends on users to keep track of many more things than necessary. And anything dependent on convention makes it harder to build decent tools. Fixing it will require significant changes to the model. Or a major hack (which appears to be how merge is handled already). Either way, productivity, usability and teams will suffer.

How Does SnapshotCM Compare?

SnapshotCM provides solid project and branch (snapshot) support, each with its own first-class object in the repository. Branches have logical relationships with other branches which are shown and manipulated graphically.

Every file and directory has a unique, non-changing id, which is used to track all copies of a file across all projects and branches. Renames and moves are treated as just another attribute, a clean and consistent solution. Even directory renames and moves do not cause any difficulties for SnapshotCM.
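
The idea of matching items by a stable id rather than by path can be sketched as follows. This is a hypothetical Python model and not SnapshotCM's actual data structures; the Item class, its fields, and match_by_id are assumptions made for illustration. Because the path is just an attribute of the item, a rename on one branch does not break the pairing with the same item on another branch.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """A versioned file; the id never changes, the path is an attribute."""
    item_id: int
    path: str
    content: str

def match_by_id(branch_a, branch_b):
    """Pair up items present in both branches, whatever their paths are."""
    shared = branch_a.keys() & branch_b.keys()
    return {i: (branch_a[i].path, branch_b[i].path) for i in shared}

# The same file (id 1) was renamed and edited on branch_b only.
branch_a = {1: Item(1, "src/util.c", "v1")}
branch_b = {1: Item(1, "src/helpers.c", "v2")}

pairs = match_by_id(branch_a, branch_b)
# pairs is {1: ("src/util.c", "src/helpers.c")}
```

With this pairing in hand, a merge tool can treat the path change and the content change as two separate attribute updates on one item.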

Merges are recorded in file history and branch to branch merging is fully symmetrical. That is, one does not need to use one type of command to merge one direction and another to merge the other way (as Subversion requires when reintegrating a branch).

Most importantly, because SnapshotCM can reliably match items during a branch to branch merge (copy), and because it records merges in a file's history, it can also reliably determine how to handle most branch merge actions. The user need only say "do it" (graphically), and does not need to specify what to do. Even in situations where changes are being made and merged in several directions at once, SnapshotCM will do the right thing. And should a true conflict exist, or if a user wants to revert a change (i.e., do something other than the normal action), he can, quickly and easily.
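
Why recorded merges make the decision automatic can be shown with a generic sketch. The Python below is not SnapshotCM's algorithm; it is a common history-graph technique, and the revision names and the parents table are hypothetical. Each revision of a file lists the revisions it was derived from, including any it merged in, and the set of changes still to merge is whatever the source can reach that the target cannot. The same function works in either direction.

```python
def reachable(rev, parents):
    """All revisions reachable from rev via predecessor and merge links."""
    seen, stack = set(), [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents.get(r, ()))
    return seen

def pending(source_tip, target_tip, parents):
    """Revisions on the source that the target has not yet incorporated."""
    return reachable(source_tip, parents) - reachable(target_tip, parents)

# One file's history on branches "a" and "b" (hypothetical names).
# a3 records that it merged b1, so b1 is never offered to "a" again.
parents = {
    "a1": [],
    "a2": ["a1"],
    "b1": ["a1"],        # branch b starts from a1
    "b2": ["b1"],
    "a3": ["a2", "b1"],  # merge of b1 into branch a
}

b_into_a = pending("b2", "a3", parents)  # {"b2"}
a_into_b = pending("a3", "b2", parents)  # {"a2", "a3"}
```

Nothing here requires the user to name revision ranges or to use a different command for the return direction, because the recorded merge in a3 already says what has been done.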

And beyond the technical side, SnapshotCM organizes your branches into interactive visual graphs and hierarchies. Simple, understandable, and clear, it takes mere seconds for most branch level operations. Subversion can't compare.

Here is a key point: Branches are so easy to merge in SnapshotCM that everyone can use them. They are manipulated graphically, are quick to create, quick to sync up, quick to merge, easy to understand, and easy to remove. In short, developers can easily use SnapshotCM branches for lots of light- (and heavy-) weight tasks precisely because SnapshotCM makes them easy. Branching is useless unless used, and in SnapshotCM, branches are used.

For further reading:

A user having problems merging. Note the extensive syntax and knowledge required to resolve his problem. To be fair, the problem was related to a beta release, but my point is not the problem itself; it is that the user must become an expert on Subversion merge details. There is a better way!

Read the latest Subversion manual yourself and note the description of the complexity involved with merge!