Anyone who has worked on a software project with a team of people
knows how difficult the coordination of efforts can be. Students of
Fred Brooks' The Mythical Man-Month, or people who understand
it implicitly by virtue of their experiences in the work place, are
well aware of this. Intelligence, quite simply, does not scale. The
number of communication channels for a team grows with the square of
the number of participants, and presumably the efficiency of such a
team shrinks with a corresponding inverse relationship. To keep
things from grinding to a halt, good tools must be at one's disposal.
Among these tools, a good source code management tool must reside on a
developer's belt. There are many options from which to choose, some
better than others. In the open source and Unix community, CVS has
long been a favorite. There also exist myriad commercial
alternatives, such as ClearCase, Razor and Perforce. More recently,
the open source community has added Subversion to its arsenal, a tool
superior to CVS and one that has breathed new life into the
non-commercial offerings.
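To put a number on the earlier claim about communication channels: if
every pair of people on a team is a potential channel, a team of n has
n * (n - 1) / 2 of them. A minimal sketch, with a function name of my
own choosing:

```python
def communication_channels(team_size: int) -> int:
    """Number of distinct pairs of people in a team: n * (n - 1) / 2."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


# Doubling the team roughly quadruples the number of channels.
for n in (2, 5, 10, 20):
    print(f"{n:>2} people -> {communication_channels(n):>3} channels")
```

Going from 10 people to 20 takes you from 45 channels to 190.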
CVS has long been maligned by proponents of commercial alternatives,
but Subversion largely undermines any claims that proponents of such
tools can make. For a full list of the improvements that Subversion
makes over CVS, see the documentation, but the main ones follow...
For one thing, Subversion provides atomic commits. This is huge.
When you commit code from your sandbox back to the repository, there
exists a race condition with CVS that may cause you to fumble. When
doing a commit, even a batch commit, you are actually just committing
one file at a time. Furthermore, the commit operation may fail should
it be the case that a newer version of the file has been committed
since you last updated. If you try to commit a batch of files in CVS,
one or more of the file commits may fail, leaving the repository in a
broken state. This is bad, very bad.
Subversion, on the other hand, operates not on a per-file basis for
commits, but rather on the basis of change sets. You can commit a
group of files together, and either they all commit or none of them
commit. This does wonders to forfend breaking the build.
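The difference can be sketched with a toy model. This is not how either
tool is implemented internally; it only mimics the failure mode
described above, and every name in it (Conflict, commit_per_file,
commit_atomic, demo) is made up for illustration.

```python
class Conflict(Exception):
    """Raised when a file changed in the repository since the last update."""


def commit_per_file(repo, base, changes):
    """Per-file toy: each file is checked and committed on its own, in order."""
    for path, content in changes.items():
        if repo.get(path) != base.get(path):
            raise Conflict(path)  # files committed before this one stay committed
        repo[path] = content


def commit_atomic(repo, base, changes):
    """Change-set toy: validate every file first, then apply all of them."""
    stale = [p for p in changes if repo.get(p) != base.get(p)]
    if stale:
        raise Conflict(", ".join(stale))  # the repository is left untouched
    repo.update(changes)


def demo(commit):
    base = {"foo.c": "v1", "foo.h": "v1"}          # what we last updated to
    repo = {"foo.c": "v1", "foo.h": "v2"}          # someone else changed foo.h
    changes = {"foo.c": "mine", "foo.h": "mine"}   # our two-file change
    try:
        commit(repo, base, changes)
    except Conflict:
        pass
    return repo


print(demo(commit_per_file))  # foo.c went in, foo.h did not
print(demo(commit_atomic))    # neither file went in
```

In the per-file version the repository ends up holding my foo.c next to
somebody else's foo.h, which is exactly the half-applied change that
breaks builds.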
On a related note, revision numbers in Subversion are not done on a
file by file basis. Rather, the whole repository has a revision
number that increments with the commission of each change set. This
has the advantage of lumping changes together into logical units. For
a given revision number you don't really care that file Foo changed.
What you want to know is what feature was implemented for a given
revision, and what modifications were made in the course of doing so.
Subversion makes this a reality. From revision 442 to 443, you can
see the list of files that changed, and the corresponding comment
"fixed bug #256". To do this in CVS you will have to resort to ad-hoc
measures, perhaps writing tools of your own.
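As a rough sketch of the idea, here is a toy repository with a single
revision counter, where each revision records its comment and the set
of paths it touched. The class and method names are invented for this
example and say nothing about how Subversion stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    number: int
    message: str
    changed_paths: tuple


@dataclass
class ToyRepository:
    files: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def commit(self, changes: dict, message: str) -> int:
        """Apply a whole change set and bump the one repository revision."""
        self.files.update(changes)
        number = len(self.log) + 1
        self.log.append(Revision(number, message, tuple(sorted(changes))))
        return number

    def describe(self, number: int) -> Revision:
        """Look up what a revision did, across every file it touched."""
        return self.log[number - 1]


repo = ToyRepository()
repo.commit({"parser.c": "v1", "parser.h": "v1"}, "added parser")
rev = repo.commit({"parser.c": "v2", "lexer.c": "v1"}, "fixed bug #256")

entry = repo.describe(rev)
print(entry.number, entry.message, entry.changed_paths)
```

The point is that the question "what happened in revision 2?" has one
answer, a comment plus a list of files, rather than one answer per file.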
Another big win with Subversion is that file renaming is a first class
operation and directories are first class objects. In CVS, moving a
file was a horribly clumsy operation. You had to copy the file to its
new location, issue a delete command for the old location, and issue
an add command for the new location. In the process of doing so you
would lose the history of the file. If you did not wish to lose the
file's history, you were literally forced to go into the repository
and hand move the corresponding file at your own risk. That sucked,
and with Subversion it is entirely obviated. Now you just issue a "svn
mv" command and life is good. Given how central this operation is to
refactoring code, I wonder how I ever lived without it as a CVS user.
Another big gripe about CVS is the tedium of branching with it. In
Subversion, branching is trivially easy. Whereas in CVS the general
opinion seems to be that branching is akin to being bitten by rabid
coyotes, in Subversion it basically consists of a directory copy
operation. You issue a "svn copy" command, which is actually a
lightweight copy-on-write operation that creates a new node in the
directory tree to serve as the branch. An "svn merge" command
specifying that branch will merge things to wherever you desire with
minimal hassle.
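Why a branch can be this cheap is easier to see in a toy model. The
sketch below treats directories as nested dicts and is only meant to
illustrate copy-on-write sharing; it is not Subversion's storage
format, and the write helper is hypothetical.

```python
def write(tree: dict, path: list, content: str) -> dict:
    """Return a new tree with `path` set to `content`.

    Only the directories along `path` are copied; every other subtree is
    shared with the original tree rather than duplicated.
    """
    head, *rest = path
    new_tree = dict(tree)  # shallow copy: children are still shared
    if rest:
        new_tree[head] = write(tree.get(head, {}), rest, content)
    else:
        new_tree[head] = content
    return new_tree


trunk = {"src": {"main.c": "v1"}, "docs": {"README": "v1"}}
root = {"trunk": trunk}

# "Branching" is just a new directory entry pointing at the same subtree.
root = {**root, "branches": {"feature": root["trunk"]}}
assert root["branches"]["feature"] is root["trunk"]  # nothing was duplicated

# Editing on the branch copies only the path that changed.
root = write(root, ["branches", "feature", "src", "main.c"], "v2")

print(root["trunk"]["src"]["main.c"])                # trunk is unaffected
print(root["branches"]["feature"]["src"]["main.c"])  # branch sees the edit
```

After the edit, the branch's docs directory is still the very same
object as the trunk's. Only the nodes on the way down to main.c were
copied.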
All in all, Subversion is a fantastic tool, and one that I believe to
represent the future of version control. However, this doesn't stop
the snooty and recalcitrant proponents of commercial tools from
claiming that Subversion is an inferior tool, and that everyone would
be better off if only they would adopt a "real" tool. This claim, as
far as I am concerned, is nonsense.
In dealing with said Philistines, it would seem that the most common
complaint centers on Subversion's lack of deference to strict file
locking. This is unfortunate, as said mechanism is in fact a crutch
for a crippled process. If coordination of developers in your group
depends on knowing who is editing what file, then, quite frankly, you
are doing something very wrong.
In developing software, the fundamental unit of work is a feature, or
perhaps a bug fix. The fact that you are operating on a directory
tree of files is accidental. It is an artifact, something that is not
an intrinsic facet of software development. Along these lines I would
argue that file locking operations represent a faux feature. They
provide a false sense of security that lulls developers into a state
of complacency, a state of mind that assumes that everything is A-OK
just as long as people aren't working on the same files.
Nothing could be further from the truth. If someone is working on
file A, and a co-worker is working on file B, and file A includes file
B, depending on the API that it exports, then the benefits of file
locking to you are precisely NOTHING. The functionality
provided by file B could change out from under the developer working
on file A, and unless the developer working on file A is communicating
with the developer working on file B, then he is hosed. That's all
there is to it.
Quite simply, there is no substitute for good communication between
developers. Furthermore, there is no tool in existence that truly
facilitates it. You can lead a horse to water, but you can't make it
drink. If your developers refuse to talk with one another, then there
is no piece of software that can save you. Software development is a
very social process, and if developers are not talking to one another
then all is lost. File locking will not keep people from stepping on
one another's toes. It will merely provide the illusion of
concurrency management, and that illusion can oft prove disastrous.
Of course, it would seem that the various and sundry commercial tools
are in the habit of providing such features. This raises the question,
why? The answer, presumably, is simple. Lots of people are hot for
the aforementioned security blanket, and commercial enterprises are
typically more interested in making money than in improving the
state of software engineering. As such, they are naturally inclined
to give people what they want, even if what they want is stupid and
short-sighted.
Software engineers don't need snazzy tools. They need solid tools,
ones that implement a few core features in a robust fashion and that
refrain from providing nonsensical features that appeal to PHBs and
the talentless hacks who are self-styled "software engineers". As
such, there exists a conflict between those who would sell said tools
and those who would use them. Having smart developers who communicate
effectively and write good suites of automated unit tests will buy you
much more than any piece of software that comes in a shiny box.
This brings me to my last point... I have noticed a distinction
between the various people that one encounters in the work place, a
distinction made by Pirsig in Zen and the Art of Motorcycle
Maintenance. Apart from the truly useless wards of the state,
there are two kinds of people: those who are educated, and those who
are trained.
Educated individuals tend to be generalists. They might be especially
good in a few specific things, but that is not their main selling
point. Far and away, their main virtue is adaptability. They have
absorbed a large body of knowledge and generalized it into a framework
that allows them to reason about and solve novel problems. They do
not tend to get into ruts. They do not tend to bump up against walls
and get stuck. No matter what the environment is into which they are
thrust, they can adapt and prevail.
Trained individuals, when in their domain, are indistinguishable from
educated folk, or perhaps even superior. They have absorbed some
particular body of knowledge and are now purveyors of it in the same
way that followers of a religion are purveyors of their faith.
Pointed in the right direction they are very capable people. Put them
up against a novel situation, however, and they are damn near useless.
They will spout ideological nonsense that aligns with their
programming, and there is quite simply nothing that you can do to
convince them that there exists a better way. If they learned system
A, and part of their indoctrination is that system A is the best
system ever, then there is nothing in this world that you can do to
convince them that system B is better. You may just as well try to
make a Satanist of a fundamentalist Christian.
It is this latter variety of people against whom I have vainly butted
my head trying to convince them of the insanity of relying on such
silly features as strict file locking. It does not matter how
destructive a reliance on such a strategy may be. If they cut their
teeth on a broken process, and were trained instead of educated, then
there is nothing that can be done to break them of this habit. You
cannot reason with such people and are wasting your time in doing so.
Realistically speaking, it's probably going to be twenty to fifty
years before we have a software development process that doesn't
totally suck. Until then, your best bet is probably to use Subversion,
write unit tests competently, and regularly chew the fat with your
fellow developers. Also, for the love of God, use an editor that
doesn't suck. If you're not using Emacs, then I hope that you're using
vi, and if you're not using one of the two, then please kill
yourself.
That is all.