Why I choose Bazaar (a history of revision control)

Like any sensible software developer, I have a close relationship with revision control systems. In my previous job, I was an SCM Engineer (see Software configuration management) which meant I had an even closer relationship than most, since we were running the CVS servers and actively using them to track changes and deployments.

We all know, deep down, that revision control systems shouldn’t exist. This kind of thing should be inherent in the design of the operating system, through standard file and filesystem formats. The OLPC interface is making some headway towards that, but for the rest of us, it means using a revision control tool throughout the development process.

Unfortunately, even though a revision control tool is expected to be one of the most-used commands on your system, very few of them are particularly easy to use. The learning curve is therefore steep, and people become religious about their choice because they have invested significant time in it.

Just to spice the mix up, not only will people religiously defend their choice of revision control system, but they’ll do so while actively hating it.

In the beginning there was CVS and we all thought that it was pretty good. It was based on the simpler RCS and shared a file-format with it, but introduced control of directory trees and remote operation.

In reality, CVS wasn’t that good. Its command set could be strange and inconsistent (e.g. it’s not possible to diff between two dates on a branch); its support for branching assumed that all branches would be merged into the mainline, and only once; and nobody ever really knew how to create a new project in a repository (tip: cvs import is wrong).

But we all used it anyway, and we muddled through. It did have some good features; it was simple, fast and pretty reliable–when it did break, you could usually fix the repository yourself. And most importantly of all, we understood how to drive it.

And so it was for many years, until Subversion (SVN) came along. Subversion was intended to be “a better CVS”. Perhaps this goal should have made us suspicious at the time, since CVS was already a pretty good CVS by itself; unfortunately, we hated CVS so much that we flocked to the new system in hope.

In hindsight, Subversion didn’t really improve on CVS much at all. In fact, arguably, the only real improvement was the addition of atomic commits (in CVS, each commit is per-file, so it’s manual labour to work out which change was made to two files at the same time).

(Its support for branching, tagging, copying, renaming, etc. was no better than CVS’s when done in the repository by hand.)

The cost of this single new feature was a much more complicated interface (with two separate commands), a backend that tended to break down weekly and a lethargic slowness to its operation.

Most people I know now justify their use of Subversion instead of CVS by “Subversion is maintained, CVS isn’t” which is a somewhat self-fulfilling justification.

While the mass conversion to Subversion, and the ensuing disappointment and frustration, was going on, something new appeared on the horizon: Arch.

Arch was different: it broke one of the core assumptions of revision control, that of the repository as a cathedral. In CVS, and in Subversion like it, if somebody wants to modify your code (even on a branch) you need to give them access to your own repository. In some cases (especially with CVS), vast access control and permission structures would be in place to ensure proper behaviour.

With Arch, you don’t; all you need to give to anyone is read access. Anybody can make their own branch by copying yours and committing to their own copy.

This model also necessitated fixing a long-standing problem that CVS had: Arch has repeatable (smart) merging. If you merge from a branch, you can merge again later, and again, and again.

Arch made this possible by giving each commit (changeset) a globally unique identifier, made from the branch’s own globally unique identifier and the changeset number in the branch.

Unfortunately, while this was a massive step in a new direction, Arch had an absolutely terrible user interface. Its command list was terrifying, with over 100 commands, many of which had multiple-word names (tla set-tree-version). It exposed too many of its own innards, and expected you to learn them. It also forced baroque file naming semantics and strange policy on its users (thou shalt not commit without first running “make clean”).

Efforts were made to improve Arch’s user interface through projects such as baz, but they were doomed from the start.

We’ve since seen an explosion of new revision control systems: Monotone, Darcs, Git and Bazaar.

What’s especially interesting is the commonality between these systems. They are all “distributed” like Arch, though they also all discard the strange “unique branch identifier” convention and instead simply assign a unique identifier to each file or commit.

This means that they all support personal branches, and by necessity all support repeatable (smart) merging.

So how do they differ? What are their killer features and killer problems?

Monotone is all about repository integrity, ensuring that every commit is both authorised and intact. It pays for this with a severe lack of speed.

Darcs is based around a “theory of patches”: a branch is defined not by its history but by the collection of patches in it. Unfortunately this often breaks down, and darcs frequently gets stuck calculating even trivial and commonplace branch models.

Git is very strange to me; its killer feature appears to be the speed at which it can handle very large trees, but the interface is as insane as Arch’s was. It is heavily optimised for the “I only apply patches” development model, at the expense of ordinary development models (it shares an issue with Arch where calculating annotations on an individual file is an expensive operation).

What about Bazaar? Its killer feature is that it is designed to work the way you do. The command set is relatively small, and each of them works in the most obvious manner. It also supports plugins so that you can always implement your own workflow.

Of all the revision control systems, it’s the only one (that I’m aware of) that supports both distributed and centralised workflows (and lets you go distributed when you need to, e.g. when you’re on a plane).
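As a rough sketch of that mix (assuming the bzr client is installed; the directory names, file name and identity below are invented for illustration), a checkout commits straight to a central branch, while commit --local keeps a commit on your own machine:

```shell
# Sketch only: assumes bzr is installed. All names below are hypothetical.
export BZR_EMAIL="Example User <user@example.com>"   # identity for the commits
work=$(mktemp -d)
cd "$work"

if command -v bzr >/dev/null 2>&1; then
    # A stand-in for the central branch, kept in a local directory.
    mkdir trunk
    bzr init trunk

    # Centralised: a checkout is bound to trunk, so commits land there.
    bzr checkout trunk mywork
    cd mywork
    echo "hello" > README
    bzr add README
    bzr commit -m "Add README"

    # Distributed: a local commit stays in the checkout only.
    echo "written offline" >> README
    bzr commit --local -m "Offline change"

    cd ..
    bzr revno trunk     # revision number of the central branch
    bzr revno mywork    # revision number of the checkout
fi
```

For longer stretches away from the server, bzr unbind and bzr bind switch a checkout between the two modes.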

Here are a few examples of how Bazaar’s command set works the way you do. To start managing some code in bzr:

$ cd myproject
$ bzr init

To add the files, copy in your usual .bzrignore file and just add everything:

$ cp ~/bzrignore .bzrignore
$ bzr add
added foo.c
added bar.c

Check the output for mistakenly added files, adjust .bzrignore and remove the file with bzr rm.
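One way that clean-up might look, as a sketch rather than a recipe: it assumes bzr is installed, and the file names and the *.log pattern are made up. It uses bzr rm --keep so the mistakenly added file stays on disk, and bzr ignore to record the pattern instead of editing .bzrignore by hand:

```shell
# Sketch only: assumes bzr is installed. File names are hypothetical.
work=$(mktemp -d)
cd "$work"

if command -v bzr >/dev/null 2>&1; then
    mkdir myproject
    cd myproject
    bzr init
    echo 'int main(void) { return 0; }' > foo.c
    echo 'compiler chatter' > build.log    # a by-product we do not want tracked

    bzr add                    # adds everything, including build.log by mistake
    bzr rm --keep build.log    # stop versioning it but leave the file on disk
    bzr ignore "*.log"         # record the pattern in .bzrignore
    bzr status                 # review what is scheduled for the first commit
fi
```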

A common operation is realising that the commit you’re about to make should really go on a new branch for now:

$ cd ..
$ cp -a myproject myproject-foo
$ cd myproject-foo
$ bzr commit

A copy of a Bazaar branch is a different branch; you can commit to it separately. There’s a bzr branch command for it too (which deals with issues such as bound branches, checkouts, etc.) but it’s nice to demonstrate that Bazaar does what you’d expect even when you don’t use its own commands.
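For comparison, here is a sketch of the bzr branch route, assuming bzr is installed and using invented names throughout. One difference worth knowing: bzr branch copies committed revisions only, so unlike cp -a it leaves uncommitted edits behind in the original tree.

```shell
# Sketch only: assumes bzr is installed. All names below are hypothetical.
export BZR_EMAIL="Example User <user@example.com>"   # identity for the commits
work=$(mktemp -d)
cd "$work"

if command -v bzr >/dev/null 2>&1; then
    # Create a small branch with one committed file.
    mkdir myproject
    cd myproject
    bzr init
    echo 'int main(void) { return 0; }' > foo.c
    bzr add foo.c
    bzr commit -m "Initial import"

    # Edit foo.c but do not commit the change.
    echo '/* work in progress */' >> foo.c

    # The new branch gets the committed history; the edit stays in myproject.
    cd ..
    bzr branch myproject myproject-foo
    bzr revno myproject-foo     # revision number of the new branch
fi
```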

Pulling changes from another branch (where you haven’t made any modifications yet) is easy:

$ bzr pull ../myproject

As is merging (when your branches have diverged):

$ bzr merge ../myproject

One particularly nice feature is that after a merge, you see the merge as a single commit and it can be treated as such; but it also has the set of merged commits indented under it–you can examine these as individual commits as well!
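To see this for yourself, the sketch below (assuming bzr is installed; branch names, file names and messages are invented) diverges two branches, merges them and prints the log. The -n0 option asks bzr log for every level of merged history; as far as I know some older releases showed the nested commits without it.

```shell
# Sketch only: assumes bzr is installed. All names below are hypothetical.
export BZR_EMAIL="Example User <user@example.com>"   # identity for the commits
work=$(mktemp -d)
cd "$work"

if command -v bzr >/dev/null 2>&1; then
    # Start a trunk with one commit.
    mkdir trunk
    bzr init trunk
    cd trunk
    echo "one" > a.txt
    bzr add a.txt
    bzr commit -m "Add a.txt"
    cd ..

    # Branch it, then commit on both sides so the branches diverge.
    bzr branch trunk feature
    cd feature
    echo "feature" > b.txt
    bzr add b.txt
    bzr commit -m "Add b.txt on feature"
    cd ../trunk
    echo "two" >> a.txt
    bzr commit -m "Extend a.txt on trunk"

    # Merge the feature branch and record the merge as one commit.
    bzr merge ../feature
    bzr commit -m "Merge feature"

    # Show the merge commit with the merged commits nested under it.
    bzr log -n0
fi
```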

What’s the downside of Bazaar? Well, it’s not the fastest system (but by no means the slowest). For small to medium-sized projects this is never an issue, but it may be for extremely large projects–fortunately the developers are improving its performance all the time!

But that doesn’t matter; it is, honestly, the first revision control system that I don’t hate.

21 Comments

  1. Faidon Liambotis says:

    What about Mercurial?

  2. When I decided where to go from Arch (yes, I was insane enough to learn how to use it), I went back to Subversion. Why? Everybody knows it. It is nice to say that anybody can easily create his own branch, but he would have to learn the VCS you use. Subversion is now very well known and widely supported by various tools, and that is its greatest advantage.

    And I don’t lose that much. I can use bzr-svn to get distributed working back, and it works nicely.

  3. I couldn’t agree more! Glad to hear I’m not the only one who finds git needlessly difficult to use.

  4. What is the “correct” way to create a new project with CVS and what’s wrong with “cvs import”?

  5. Rudd-O says:

    Bazaar is pretty much Git. Except Git is faster. Use one of the quick start tutorials available on the Web. I picked it up (theory and practice) in three hours, coming from a Subversion background.

    Oh, and check Gitk out. It’s definitely ugly-ass, but it gives you a perspective of your history like NO other revision control system.

  7. Brendan says:

    It sounds like you’d like Mercurial. It’s not far from git in speed, but the user interface is much more natural.

  8. carlospc says:

    You miss “Once upon a time” at the beginning of the post ;-)

    Great history. Have a look at Free revision control software for a complete list of this kind of software.

  9. Will says:

    I love Bazaar. There is one feature it really needs to steal from Mercurial (and Git, because I think that’s where Mercurial got it): A batteries-included, dead-simple to deploy web interface.

    I sometimes pick Mercurial for small projects just because of the ease of publishing a repository on the web via hgweb.

    I know there are a few different web interfaces for Bazaar ([bzrweb](http://vmlinux.org/jocke/bzr/index.py/log/bzrweb/head), [bazaar-webserv](http://goffredo-baroncelli.homelinux.net/bazaar/), [loggerhead](http://www.lag.net/loggerhead/)), but none of them work as well (in my opinion) as hgweb.

  10. Mercurial: not mentioned for the same reason as any other revision control system that’s not on here: I’ve never looked at it or used it.

    CVS: the correct way to import a project is to make the directory on the server, check it out, copy the project in, and cvs add the files. Yes, it’s annoying; but cvs import gives you an unwanted vendor branch.

    git: picked it up in three hours?! That’s three hours too much.

    gitk: looks a bit like bzrk, there’s similar programs for every revision control tool

  11. bartman says:

    You say that git has a lot of commands, and is like Arch. You know
    Bazaar is not really winning by much. Git comes with a ton of features
    that are not in other VCS’s… no wonder it has more commands.

    Let me translate your examples to git:

    $ cd myproject
    $ git init

    $ cp ~/gitignore .gitignore
    $ git add .

    $ cd ..
    $ cp -a myproject myproject-foo
    $ cd myproject-foo
    $ git commit -a

    $ git fetch ../myproject

    $ bzr pull ../myproject

    I don’t understand what the big deal is. It’s almost word for word
    identical.

    Git has a lot of features. But like most shell tools, the higher the bar
    of entry the bigger the benefits.

    I could not go back to a single-head development model, after having used
    git for almost two years. Being able to flip between multiple streams
    of development w/o losing changes is a huge plus.

    Anyway, like you with bzr, I could talk about git for hours. :)
    Have fun with your new SCM.

  12. bartman says:

    err, s/bzr/git/ in the last example.

  13. anon says:

    Sorry, but git works the way I do. bzr/Arch/etc. all left unnecessary detritus in my history, both for solo work on different platforms and in a close-knit team. Nice thing about git is that it tries to play nice with other systems. I can use git myself while still contributing to projects that use svn, bzr, cvs…

    So ultimately my choice is up to me, yours is up to you. Just say that you like bzr more, not that system XXX is lacking (except cvs, which really does suck for most uses).

  14. pachi says:

    I think you should give Mercurial a try. It’s as fast as git (even faster sometimes) and, like Bazaar, has a much saner UI. Also, IMHO, it’s more cleanly designed.

  15. Ben says:

    “The cost of this single new feature was a much more complicated interface (with two separate commands), a backend that tended to break down weekly and a lethargic slowness to its operation.”

    I do appreciate that this statement was a segue on your way to bzr, but I’ve never had any of these problems with svn. It’s a great tool, and it achieves its goal of being a better cvs.

  16. James Stansell says:

    Thanks for this post.

    In “trial and commonplace” should that be “trivial”?

    -james.

  17. Brian Schlining says:

    I’ve been using mercurial.

  18. Spacebat says:

    The greatest strengths of SVN are its maturity and broad acceptance, and the use of SVK as a front-end. With SVK you get many of the benefits of other more distributed solutions. I also like that SVN has two storage formats, Berkeley DB or SVN’s own FSFS. I use the latter in order to have fewer package dependencies and find SVN rock solid.

  19. madduck says:

    Really, please don’t spout fud about Git when you don’t seem to understand it: http://blog.madduck.net/vcs/2007.09.29_uninformed-slagger

  20. Mikkel Høgh says:

    @madduck: I don’t think that “uninformed slagger” is a really courteous way to describe the opinion of someone else. That aside, I think you might well consider that git has some serious tradeoffs:
    http://www.markshuttleworth.com/archives/125
    http://www.markshuttleworth.com/archives/123

    That might not matter to you, but it sure does to me…
