It seems rather common (around here, at least) for people to recommend SVN to newcomers to source control because it's "easier" than one of the distributed options. As a very casual user of SVN before switching to Git for many of my projects, I found this to be not the case at all.
It is conceptually easier to set up a DVCS repository with git init (or whichever), without the problem of having to set up an external repository as in the case of SVN.
And the basic functionality of SVN, Git, Mercurial, and Bazaar uses essentially identical commands to commit, view diffs, and so on, which is all a newcomer is really going to be doing.
The small difference in the way Git requires changes to be explicitly added before they're committed, as opposed to SVN's "commit everything" policy, is conceptually simple and, unless I'm mistaken, not even an issue when using Mercurial or Bazaar.
So why is SVN considered easier? I would argue that this is simply not true.
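To make the staging difference mentioned in the question concrete, here is a toy Python model. It is only a sketch: `ToyRepo` and its method names are made up for illustration and do not mirror how Git or SVN work internally.

```python
class ToyRepo:
    """Toy model of a working copy; names here are illustrative, not a real API."""

    def __init__(self):
        self.modified = set()   # files changed since the last commit
        self.staged = set()     # files explicitly added (Git-style index)
        self.history = []       # each commit is a sorted list of file names

    def edit(self, path):
        self.modified.add(path)

    def add(self, path):
        # Git-style: mark a modified file for inclusion in the next commit
        if path in self.modified:
            self.staged.add(path)

    def commit_staged(self):
        # Git-style commit: only what was explicitly added goes in
        committed = sorted(self.staged)
        self.modified -= self.staged
        self.staged = set()
        self.history.append(committed)
        return committed

    def commit_all(self):
        # SVN-style commit: every modified tracked file goes in
        committed = sorted(self.modified)
        self.modified = set()
        self.staged = set()
        self.history.append(committed)
        return committed
```

With two edited files and only one of them added, `commit_staged()` records just that one file, while `commit_all()` would have recorded both. That is the whole conceptual gap a newcomer has to cross.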
If you use the version control only for yourself, SVN is probably harder, since the setup is harder. If you however want to work with multiple developers over the web, a server side control has advantages:
You have one central place, that always has the official state-of-the-art source, being the SVN server
Since everyone always merges their changes against a central server, there are far fewer collisions and much less manual fixing of collisions
You have central control over the source
You have an official revision number instead of a revision hash, being the same among all developers and showing the official progress (it's an up-counting number, unlike a hash, which is just an identity fingerprint, so you can see which code is newer or older just by this number)
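The last point, counting numbers versus hashes, can be sketched in a few lines of Python. This is a simplification under stated assumptions: real DVCS ids are computed over more than a message and a parent id, and the function names and the sample log are invented for this example.

```python
import hashlib


def central_revisions(messages):
    """Central-server style: each commit gets the next integer."""
    return [(number, message) for number, message in enumerate(messages, start=1)]


def hashed_revisions(messages):
    """DVCS style (simplified): each id is a hash of the parent id plus the content."""
    revisions = []
    parent = ""
    for message in messages:
        digest = hashlib.sha1((parent + message).encode("utf-8")).hexdigest()
        revisions.append((digest, message))
        parent = digest
    return revisions


log = ["add parser", "fix parser bug", "add tests"]
numbered = central_revisions(log)
hashed = hashed_revisions(log)

# Integers sort in commit order, so "2 is older than 3" is readable at a glance.
print([number for number, _ in numbered])
# Hash ids identify a commit but say nothing about age; order comes from the parent links.
print([digest[:8] for digest, _ in hashed])
```

The trade-off is that an integer only works when one server hands the numbers out, whereas a hash can be computed independently on every machine.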
A distributed versioning system is A Very Good Thing (tm), but I find the primary barrier to adoption is educating users on the new possibilities a new SCM gives.
This, coupled with an often lackluster set of UI tools (half-finished Tortoise implementations etc.), brings a blank stare to the eye of many co-workers who long since forswore the command line for the sake of a good UI tool.
Also, with tools like CVS I find that people loathe branching and merging because they really really don't want to be stuck an entire day doing three way merges, often not really sure which would be the right merge to do.
My point is: Start out by telling people what they gain (not just "hey watch this new cool toy"), and prep them for the fact that using a command line IS the way to go and that frequent, constant-time branching is a good thing.
Many systems such as Mercurial come with a complete patch-queue system, meaning that from a Continuous Integration standpoint you know that whatever goes into production has been approved by QA. Stuff like this is hard to do properly with CVS or SVN.
With Mercurial people would have private repositories for their current work and all developers share a developer-tree on a server. The CI system monitors the developer-tree and pulls all changes, builds, and performs unit tests. If everything passes it propagates the changes to a Testing-tree from where a deliverable is built for the QA persons. Every changeset that is added gets a token. When QA deems a feature to be complete, they annotate the Testing tree with this token, and the associated changesets are then automatically propagated to the Production-tree.
Using this approach you will never commit anything by hand to the production branch, or the testing branch. Rather, the state of the code and the sign-off from QA determine the contents of your production branch.
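The promotion flow described above can be modelled roughly as follows. This is a hedged sketch, not Mercurial functionality: `Changeset`, `Pipeline`, the `token` field and the `tests_pass` flag are all hypothetical names standing in for real repositories, a real CI run and a real QA annotation.

```python
from dataclasses import dataclass, field


@dataclass
class Changeset:
    ident: str
    token: str          # hypothetical feature token attached to the changeset
    tests_pass: bool    # stand-in for the result of a real build and test run


@dataclass
class Pipeline:
    developer: list = field(default_factory=list)
    testing: list = field(default_factory=list)
    production: list = field(default_factory=list)

    def push(self, changeset):
        # Developers push only to the shared developer tree.
        self.developer.append(changeset)

    def run_ci(self):
        # CI promotes changesets whose build and unit tests pass.
        for changeset in self.developer:
            if changeset.tests_pass and changeset not in self.testing:
                self.testing.append(changeset)

    def qa_sign_off(self, token):
        # QA approves a feature by token; matching changesets move to production.
        for changeset in self.testing:
            if changeset.token == token and changeset not in self.production:
                self.production.append(changeset)
```

Note that nothing in this model lets a person append to `testing` or `production` directly; both are only ever filled by `run_ci` and `qa_sign_off`, which is the property the answer is after.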
I believe it is conceptually simpler to think of a centralized repository where each developer commits his work vs. multiple copies of the entire repository, none of which represents the 'truth'. Since most developers are familiar with the notion of a client-server model and backend database, this is a natural concept.
Of course, the very strength of distributed source control systems is that they don't have to adhere to this model, but for a newcomer, it seems easier to grasp.
Svn forces a single work-line. You can either commit, or you can't. Having similar experiences, I don't find Hg or Git hard to use; however, I work on a team who seem to find them nightmarish to use.
The whole concept of auto-branching, multiple heads, when to and when not to merge completely eludes them, not to mention they have trouble breaking free of the "commit/checkout with $somecore" mentality which leads to complete confusion when they see pushes between 2 checked out copies.
( Had a problem for a while where somebody repeatedly merged 2 branches that were not supposed to be merged because they were unable to grasp the concept. )
However, the same people who have trouble with distributed SCMs caused me to ask this question.
edit/note: The biggest difference I've noted between Mercurial and Git that makes Mercurial harder for newbies is that Mercurial's default behavior for push/pull is like doing git push --all / git pull --all, which can propagate private branches and add lots of confusion (especially as, when a new branch turns up, Mercurial freezes in fear and asks you how to handle it instead of just keeping on trucking). The default merge/conflict resolution tool on Mac also just clobbers one set of changes blindly.
I think the toolset for SVN is much broader, so you could sit down and teach people (TortoiseSVN, RapidSVN etc) even if they did not have much conceptual idea of how the repository worked. It is also relatively easy to get SVN hosted for you (with trac for example) without needing to know anything. Distributed ones have not had this backing yet and I am sure opinions of them will change when they do.
I would argue that setting up a repository with a DVCS is practically easier, but conceptually harder. After all, with a centralized VCS the users do not set up their own repository, they just create an account on Assembla or have the repo set up for them.
DVCS is currently lacking good desktop clients. Despite what most people say, version control systems can be quite hard to use correctly so a good desktop client can really help - and here TortoiseSVN excels.
We struggle to make it as easy as possible at Codice, but it's always a little bit harder to explain; of course, it depends on the audience.
For OSS projects and small teams, especially people working on their laptops and moving here and there, working at home, sometimes on a plane, and so on, it's pretty easy. But whenever you talk to corporations/enterprises, they get excited about its multi-site role, but not that much about distributed at first glance. It all depends on whether the group has a majority of advanced developers or not.
It's poor marketing, simple as that. Far too many DVCS introductions focus on the command line and say "wow, isn't it fantastic, you can do a merge just by typing hg merge" completely oblivious to the fact that many people (especially in Windows land) are terrified of the command line. Yes, Joel Spolsky, I'm looking at your own hginit.com here -- we need a TortoiseHg version please!
Maybe it was the case two years ago that they had poor GUI implementations, but they've come on in leaps and bounds recently. TortoiseHg is now in version 1.0, and while it may not be anything to write home about visually, it's pretty solid, stable, and easy to use. TortoiseGit is also rock solid, and it does a great job of abstracting away all the complexities of the git command line.
Related
When working remotely, our team only has access to our source code by remote desktop into our office PCs so we never really work in offline mode. Does a distributed version control system like Mercurial or Git still give us advantages over our current centralized Subversion set up? If so, what are they? Are there any drawbacks or pitfalls? I've read in numerous places that shifting to distributed version control requires a change in thinking. Can someone explain what needs to change in this regard?
As explained in the differences between DVCS and CVCS (Centralized VCS), the main advantages are:
local commits (you can commit more often in private branches, then clean up the history you want to push to other repos)
publication process (you pull from multiple repos, or quickly established intermediate repos to push to, where you can do intermediate tasks like continuous integration tests)
That last point required the most "change in thinking" and is a bit scary ("I can pull from any repo?!")
But once you realize the benefits, you can really have more productive development cycles because you are able to monitor (by fetching commits from your peers) the development of some of your colleagues. If they are developing a function that you need, you can start integrating it sooner.
(The thing to remember with a DVCS is that it doesn't prevent the setup of a "central" repo for other developers to pull from)
As for continuous integration, instead of pushing directly from your repo to a central server in charge of CI, you can push to a local repo on your desktop, which will run all the tests, before pushing automatically (if "green") the code to a "central" repo.
It is so effective that you can now push to the official central repo a code that "never breaks the build", rendering your CI server pretty much useless ;)
I would recommend HgInit as a very thorough explanation of just how svn is improved upon by a decentralized toolset. It will also help you to understand the conceptual differences.
One of the big improvements I'd like to emphasize is the notion of merge tracking. Subversion didn't have this feature at all until 1.5, and with the difference in the way it treats revisions and branches, it will probably never be as good as the decentralized tools can be. Nobody likes merges. Might as well reduce as much of that pain as you can. Also see: Why is branching and merging easier in Mercurial than in Subversion?.
The biggest change in thinking for me when making the switch from subversion was getting over the idea that history is strictly linear, and branching is nothing but copying code to another directory. Note that in Git and Mercurial, you don't checkout a subdirectory of the repository. You won't see 'git checkout http://github.com/project/branches/v2.0' or anything. Eric Sink wrote a really good explanation of the difference in the way the history is stored. I recommend taking a look.
The development machines might stand next to each other, but the source code is still distributed between them. That the machines are in close physical proximity really doesn't matter for managing source code changes made by different developers.
I've been using SVN for some time now, and am pretty happy with how it works (but I can't say I'm an expert, and I haven't really done much with branches and merging). However an opportunity has arisen to put in some new practises on a new team and so I thought I'd take a look at DVCSs to see if it's worth making the jump.
The company I work for is a pretty standard company where we all work in the same location (or sometimes at home) and we want to keep a central store of all code.
My question is: if all you are doing with a DVCS is creating a central hub that everyone pushes their changes to, is there really any benefit to moving to a DVCS and its extra overheads in this sort of environment?
With DVCS's people can maintain their own local branches without making any changes in the central repository, and push their changes to the master repository when they think it's cooked up. Our project is stored in an SVN repository but personally I use git-svn to manage my local changes and find it quite useful, because we are not allowed to submit all the changes immediately(they have to be approved by the integrator first).
It all depends on how you want to work on projects. Distributed environments are great if everybody wants to build on their own branch. I prefer a central repository for my work (in a small team) as it makes the developers think about releasing one version of our product.
In my experience I see a lot of DVCS users who think of their own changes as the ones they don't have to review and these users review the changes of all other developers before merging them in their own tree. I like to see my changes as the change to the core product, so I review these changes before I commit them. As a result we try to keep the product pretty stable during the entire development cycle. Refactoring works OK, as we all update often.
Several DVCS users I know prefer to work on their feature on an independent tree and leave the integration with the central product to the final phase of their development. This works fine if the feature is independent, but I wouldn't like to be the one who has to integrate all the features developed this way with a deadline in sight.
If you integrate often, DVCS's don't differ much from central VCS's, and most DVCS's support a central repository, while more and more central VCS's support several features that were unique to DVCS's before, like offline commit and shelving.
(FYI: Offline commits are planned for Subversion 1.8)
Personally, I find it's a huge benefit. Even with a central repo, a DVCS changes the flow from "edit code, update from central, commit" to "edit code, commit, push to central". Among other things, that means that conflict resolution is far less stressful. It can also encourage development in smaller chunks, since you don't have to push after every commit. If your team is OK with it, that means your individual commits might leave the app in a strange state, as long as it's working when you finally push to the central repo. If they're not OK with that, as long as you're using git (or patch queues for hg, IIRC), you can still do dev in the same style, but then condense all your smaller commits into one larger commit that is complete before you push it to the central repo.
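The "condense smaller commits into one before pushing" idea can be illustrated with a toy model. It is a sketch only: the `squash` function and the sample commits are invented here, and this is not how Git's history rewriting or Mercurial's patch queues are implemented.

```python
def squash(commits, message):
    """Fold a list of small commits into one; later changes to a file win.

    Each commit is a (message, {path: new_content}) pair. This is a toy model,
    not how Git's rebase or Mercurial's patch queues are implemented.
    """
    combined = {}
    for _, changes in commits:
        combined.update(changes)
    return (message, combined)


local_commits = [
    ("wip: start parser", {"parser.py": "v1"}),
    ("wip: broken, trying something", {"parser.py": "v2", "util.py": "v1"}),
    ("wip: works now", {"parser.py": "v3"}),
]

# Only this single, complete commit would be pushed to the central repo.
pushed = squash(local_commits, "Add parser")
```

The intermediate "broken" state still existed locally as a safety net, but the team only ever sees the finished result.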
The big benefit of using a DVCS for me is that I can commit to my local repository without having to share these changes with everyone else. So when working on a big change I do small incremental commits, meaning I can revert just the last 30 minutes work, or do a diff against a version that was working yesterday, but then only push to the central repository once all my work is complete.
I think this benefit alone is worth moving to a DVCS for.
However, using a DVCS does require a little more thought and understanding than using a "standard" version control system like SVN or CVS, so you will need to consider the training overhead if moving to a DVCS, or your central repository will end up full of a lot of different branches people didn't realise they were creating.
You'll get the inevitable war of Git vs. Mercurial starting here soon... :-) I personally use Mercurial, but what I've got to say should be suitable for all DVCS.
In my opinion, yes, they are suitable for corporate use. I use them at my own company, albeit with a small number of developers using it, but if you're worried about scalability, look at the large Open source projects using git and mercurial, e.g. Mozilla, Python.
The central hub approach works well - it's a familiar working model to users of subversion and you've always got a "definitive" version. Lock down access to this and apply any hooks to enforce commit policies and after that, developers have a large amount of freedom to work how they like with their local copies.
Another big plus is that I've found merging much less painful with mercurial than with subversion.
What's trickier with a DVCS is managing binary files - you can't require a lock on a binary file like you can with subversion (amongst others). Manage this with communication ideally.
Finally, cloning a repo is great for keeping checkouts in sync if you're working from several PCs.
Hope this helps.
K
I think the main benefit of a DVCS comes when you want to push your changes directly to other people (or machines, e.g. taking the repository home with you), without going through a central hub. If you have the need to do this, a DVCS is definitely the way to go. If, as you say,
all you are doing with a DVCS is creating a central hub that everyone pushes their changes to
then you’re not really taking advantage of the main purpose of a DVCS and I would say SVN is sufficient.
P.S. One might also make the argument that a DVCS encourages users to commit more often since they can do so in their personal repository and only publish their changes when they’re ready — but this can be easily accomplished in SVN using branches, with the only “downside” being that “personal” commits increment the version number of the whole repository.
Even with a hub workflow, a DVCS gets you the ability to make small commits locally, merge them only when you want to, and push them when they are ready.
With a non-DVCS, you are forced to either:
do your work without committing, until it's polished and you push a huge commit.
make small commits as you go, which everyone has to merge often, though merging intermediate commits brings them nothing of value.
And if you explore a dead end, without DVCS: with the first method, you can't rewind, you don't have a commit to go back to; with the second, both your commits and their reverts had to be merged pointlessly by everyone else.
Personally I think the biggest advantage of DVCS is that you commit (locally) before merging, so if halfway through the merge it turns out to be more complex than you originally thought, it is trivial to get back to a clean state without losing your work. Compare to CVCS, where you usually have to merge successfully before you can commit.
Additional advantages include:
working from home or at a client's site becomes easier, as you don't require a network connection just to check something in, and if you wait till you are back at base to push changes, the history is preserved rather than lumping everything into one change.
Most DVCS operations are actually a lot faster as they don't need to pull data over the network
Some things (e.g. user settings scripts) are better shared directly between developers who want to share them rather than via a central location
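The commit-before-merge point made at the start of this answer is easy to see in a toy model. The sketch below uses made-up names (`LocalRepo`, `try_merge`) and a deliberately crude conflict rule (any file that exists locally with different content counts as a conflict); it only illustrates why having a local commit makes a botched merge harmless.

```python
import copy


class LocalRepo:
    """Toy local repository; a commit is just a snapshot of the working files."""

    def __init__(self, files):
        self.working = dict(files)
        self.commits = []

    def commit(self):
        self.commits.append(copy.deepcopy(self.working))

    def reset_to_last_commit(self):
        # Throw away whatever the working copy looks like now.
        self.working = copy.deepcopy(self.commits[-1])


def try_merge(repo, incoming):
    """Apply incoming changes; return False if the result needs manual work."""
    for path, content in incoming.items():
        if path in repo.working and repo.working[path] != content:
            # Crude rule for illustration: differing content is a conflict.
            repo.working[path] = "<<< conflict >>>"
            return False
        repo.working[path] = content
    return True
```

If `try_merge` returns `False` and the working copy is left half-merged, `reset_to_last_commit()` restores exactly what was committed beforehand. Without that prior commit there would be nothing to go back to.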
In my experience there are several ways to use a DVCS inside a corporate environment:
Multi-site support: you've separated teams and you use your DVCS to set up different "servers" at each location so they're not limited by the underlying network problems (and believe me, there will be). It used to be done with "big things" like Clearcase Multi-site or Wandisco (for SVN/CVS) but now it's pretty doable with DVCS systems.
Support "roaming users": you're a corp. developer but you want to work at home for a certain time (or forever): instead of relying on the VPN you can have a DVCS on your laptop and then you're free to commit, review, diff, branch and merge without being slowed down by the central server. You sync back when you're online or back at the office.
True "distributed development": which is the extreme case: each developer having his own DVCS (like you'd do on the OSS world). It will really depend on team's skills and motivation: if the team really wants to move into it, they'll benefit, otherwise it will be SYSADM's nightmare having to manage not a single repo but hundreds... with their corresponding issues.
The overhead is not so big; in fact, in our environment, the added hg push is less of an overhead than committing to the central svn repo was. But the biggest plus is all the bells and whistles that come with Mercurial, which are great for an individual developer regardless of the team size or workflow. First and foremost, the fact that every wc is a repo is great, since you can experiment much more freely without polluting the master repo. Then, there is functionality that builds on the wc == repo equality: bisect to quickly find the revision where a bug sneaked in, grep to, well, grep the history, as well as functionality simply missing from subversion, like colored diffs in the terminal.
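For readers who have not used bisect, the idea behind it is a binary search over history. The sketch below shows the principle only; `find_first_bad`, the `check` callback and the 20-revision history are assumptions made for this example and not the tool's actual interface.

```python
def find_first_bad(revisions, is_bad):
    """Return the first revision for which is_bad(revision) is true.

    Assumes revisions are ordered oldest to newest, the first one is good,
    the last one is bad, and once the bug appears it stays.
    """
    low, high = 0, len(revisions) - 1   # low is known good, high is known bad
    while high - low > 1:
        middle = (low + high) // 2
        if is_bad(revisions[middle]):
            high = middle
        else:
            low = middle
    return revisions[high]


# Hypothetical history: the bug was introduced in revision 13.
history = list(range(1, 21))
tested = []


def check(revision):
    # Stand-in for "build this revision and run the failing test".
    tested.append(revision)
    return revision >= 13


first_bad = find_first_bad(history, check)
```

Here only four of the twenty revisions get tested before the culprit is found, and because every working copy holds the full history, none of those checkouts needs the network.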
Bazaar VCS can work as a distributed VCS and as a centralized VCS, so you have the freedom to select the workflow you need. At the same time, local private branches (where people can experiment while working on new features and still commit their progress regularly) are a huge benefit.
A DVCS also makes for a natural development workflow when mandatory code review is required before new changes land on trunk. This workflow (for SVN) is described brilliantly in the UQDS article. And although the article describes an SVN-based workflow, you'll find it more natural when using any DVCS instead of SVN, because in a DVCS branching and merging are basic first-class operations.
Somewhat stuck trying to find a newer/better SCC system for my employer. My personal darling is SVN as it's compatible across a good swath of machines and relatively fast, but with past/present experience, it's not as friendly/easy to do branching.
The needs assessment is as follows:
Must be easy to use (CVS is considered easy)
Branches must be first class citizens
Prefer a mechanism like SVN's external property for repositories
Must be multiple OS friendly (linux, unix, Mac, MS Windows)
Proprietary/Commercial might be all right depending on license costs
Git is out as an option as some of the machines are running Vista and it's been a nightmare getting any developer tool to function with some sense of stability on that OS.
I'm also looking at Mercurial but not sure yet if it's going to work right for how the company operates.
I'm biased here, but take a look at Plastic SCM. It's easy to use (much easier than CVS) and it's all about branching.
If you're looking at Mercurial or Git, maybe you're interested in distributed, aren't you? If so, Plastic is still an option since, AFAIK, it is the only distributed commercial one together with BitKeeper.
SVN is much easier with branching now that mergeinfo properties have been implemented (since v1.5). You just branch and merge away, and it remembers which revisions of which branches have been merged where.
Obviously I'm being a little blasé, but it does make svn worth another look if branching was your main issue with it. No longer do you have to keep text files around with notes on which trunk revisions you've merged into your branch. You just say "merge trunk". Then, you can "reintegrate" a branch automatically.
It's worth noting, though, that you need to upgrade your svn server to 1.5 for this to work. The clients are backwards compatible with older servers, but you won't get the new functions. Back up your repo just before you upgrade, obviously.
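What "it remembers which revisions of which branches have been merged where" buys you can be sketched with a small bookkeeping model. This is an illustration under assumptions, not Subversion's implementation: the `Branch` class, its `mergeinfo` dictionary and the revision numbers are invented for the example.

```python
class Branch:
    """Toy branch: a name, its own revisions, and a record of what was merged in."""

    def __init__(self, name):
        self.name = name
        self.revisions = []     # revision numbers committed on this branch
        self.mergeinfo = {}     # source branch name -> set of merged revisions

    def merge_from(self, source):
        # Apply only revisions not already recorded as merged.
        already = self.mergeinfo.setdefault(source.name, set())
        pending = [rev for rev in source.revisions if rev not in already]
        already.update(pending)
        return pending
```

Merging trunk into a feature branch twice in a row returns the new trunk revisions the first time and an empty list the second, which is exactly the bookkeeping people used to do by hand in text files.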
Oh, and I believe 1.6 is out now, so you may as well jump straight on that. Of course TortoiseSVN (and VisualSVN for Visual Studio users) goes along with it.
My company uses Perforce and I've had a reasonable time with branches, and integrating changes between branches. It will even let you integrate between files that didn't share the same root, if you absolutely insist on it.
Using TortoiseSVN (and likely many other clients as well that I have no experience with), creating branches is a piece of cake, and merging them with other branches or back into the trunk is only slightly more difficult. If you have not tried branching/merging since 1.5, then SVN definitely warrants another look!
I'd like to hear from people who are using distributed version control (aka distributed revision control, decentralized version control) and how they are finding it. What are you using, Mercurial, Darcs, Git, Bazaar? Are you still using it? If you've used client/server rcs in the past, are you finding it better, worse or just different? What could you tell me that would get me to jump on the bandwagon? Or jump off for that matter, I'd be interested to hear from people with negative experiences as well.
I'm currently looking at replacing our current source control system (Subversion) which is the impetus for this question.
I'd be especially interested in anyone who's used it with co-workers in other countries, where your machines may not be on at the same time, and your connection is very slow.
If you're not sure what distributed version control is, here are a couple articles:
Intro to Distributed Version Control
Wikipedia Entry
I've been using Mercurial both at work and in my own personal projects, and I am really happy with it. The advantages I see are:
Local version control. Sometimes I'm working on something, and I want to keep a version history on it, but I'm not ready to push it to the central repositories. With distributed VCS, I can just commit to my local repo until it's ready, without branching. That way, if other people make changes that I need, I can still get them and integrate them into my code. When I'm ready, I push it out to the servers.
Fewer merge conflicts. They still happen, but they seem to be less frequent, and are less of a risk, because all the code is checked in to my local repo, so even if I botch the merge, I can always back up and do it again.
Separate repos as branches. If I have a couple development vectors running at the same time, I can just make several clones of my repo and develop each feature independently. That way, if something gets scrapped or slipped, I don't have to pull pieces out. When they're ready to go, I just merge them together.
Speed. Mercurial is much faster to work with, mostly because most of your common operations are local.
Of course, like any new system, there was some pain during the transition. You have to think about version control differently than you did when you were using SVN, but overall I think it's very much worth it.
At the place where I work, we decided to move from SVN to Bazaar (after evaluating git and mercurial). Bazaar was easy to start off with, with simple commands (not like the 140 commands that git has).
The advantages that we see are the ability to create local branches and work on them without disturbing the main version, and being able to work without network access. Doing diffs is also faster.
One command in bzr which I like is the shelve extension. If you start working on two logically different pieces of code in a single file and want to commit only one piece, you can use the shelve extension to literally shelve the other changes for later. In Git you can do the same by playing around in the index (staging area), but bzr has a better UI for it.
Most of the people were reluctant to move over as they have to type in two commands to commit and push (bzr ci + bzr push). Also it was difficult for them to understand the concept of branches and merging (no one uses branches or merges them in svn).
Once you understand that, it will increase the developer's productivity. Till everyone understands that, there will be inconsistent behaviour among everyone.
At my workplace we switched to Git from CVS about two months ago (the majority of my experience is with Subversion). While there was a learning curve involved in becoming familiar with the distributed system, I've found Git to be superior in two key areas: flexibility of working environment and merging.
I don't have to be on our VPN, or even have network connectivity at all, to have access to full versioning capabilities. This means I can experiment with ideas or perform large refactorings wherever I happen to be when the urge strikes, without having to remember to check in that huge commit I've built up or worrying about being unable to revert when I make a mess.
Because merges are performed client-side, they are much faster and less error-prone than initiating a server-side merge.
My company currently uses Subversion, CVS, Mercurial and git.
When we started five years ago we chose CVS, and we still use that in my division for our main development and release maintenance branch. However, many of our developers use Mercurial individually as a way to have private checkpoints without the pain of CVS branches (and particularly merging them) and we are starting to use Mercurial for some branches that have up to about 5 people. There's a good chance we'll finally ditch CVS in another year. Our use of Mercurial has grown organically; some people still never even touch it, because they are happy with CVS. Everyone who has tried Mercurial has ended up being happy with it, without much of a learning curve.
What works really nicely for us with Mercurial is that our (home brewed) continuous integration servers can monitor developer Mercurial repositories as well as the mainline. So, people commit to their repository, get our continuous integration server to check it, and then publish the changeset. We support lots of platforms so it is not feasible to do a decent level of manual checks. Another win is that merges are often easy, and when they are hard you have the information you need to do a good job on the merge. Once someone gets the merged version to work, they can push their merge changesets and then no one else has to repeat the effort.
The biggest obstacle is that you need to rewire your developers' and managers' brains so that they get away from the single linear branch model. The best medicine for this is a dose of Linus Torvalds telling you you're stupid and ugly if you use centralised SCM. Good history visualisation tools would help but I'm not yet satisfied with what's available.
Mercurial and CVS both work well for us with developers using a mix of Windows, Linux and Solaris, and I've noticed no problems with timezones. (Really, this isn't too hard; you just use epoch seconds internally, and I'd expect all the major SCM systems get this right).
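The epoch-seconds remark can be checked quickly in Python. The date and the eight-hour offset below are arbitrary example values, not taken from any real repository.

```python
from datetime import datetime, timedelta, timezone

# The same instant as seen on two developers' wall clocks, eight hours apart.
utc_plus_8 = timezone(timedelta(hours=8))
office_a = datetime(2009, 3, 1, 9, 0, tzinfo=timezone.utc)
office_b = datetime(2009, 3, 1, 17, 0, tzinfo=utc_plus_8)

# Stored as seconds since the epoch, both commits carry the same number,
# so ordering does not depend on where the commit was made.
stamp_a = int(office_a.timestamp())
stamp_b = int(office_b.timestamp())
```

Both stamps come out identical, so a tool that stores the epoch value and converts to local time only for display has nothing to get confused about.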
It was possible, with a fair amount of effort, to import our mainline CVS history into Mercurial. It would have been easier if people had not deliberately introduced corner cases into our mainline CVS history as a way to test history migration tools. This included merging some Mercurial branches into the CVS history, so the project looks like it was using Mercurial from day one.
Our silicon design group chose Subversion. They are mainly eight timezones away from my office, and even over a fairly good dedicated line between our offices Subversion checkouts are painful, but workable. A big advantage of centralised systems is that you can potentially check big binaries into them (e.g. vendor releases) without making all the distributed repositories huge.
We use git for working with the Linux kernel. Git would be more suitable for us once a native Windows version is mature, but I think the Mercurial design is so simple and elegant that we'll stick with it.
Not using distributed source control myself, but maybe these related questions and answers give you some insights:
Distributed source control options
Why is git better than Subversion
I personally use the Mercurial source control system. I've been using it for a bit more than a year now. It was actually my first experience with a VCS.
I tried Git, but never really pushed into it because I found it was too much for what I needed. Mercurial is really easy to pick up if you're a Subversion user since it shares a lot of commands with it. Plus I find the management of my repositories to be really easy.
I have 2 ways of sharing my code with people:
I share a server with a co-worker and we keep a main repo for our project.
For some OSS projects I work on, we create patches of our work with Mercurial (hg export) and the maintainer of the project just applies them to the repository (hg import)
Really easy to work with, yet very powerful. But generally, choosing a VCS really depends on your project's needs...
Back before we switched off of Sun workstations for embedded systems development we were using Sun's TeamWare solution. TeamWare is a fully distribution solution using SCCS as the local repository file revision system and then wrappers that with a set of tools to handle the merging operations (done through branch renaming) back to the centralized repositories of which there can be many. In fact, because it is distributed, there really is no master repository per se' (except by convention if you want it) and all users have their own copies of the entire source tree and revisions. During "put back" operations, the merge tool using 3-way diffs algorithmically sorts out what is what and allows you combine the changes from different developers that have accumulated over time.
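The 3-way idea mentioned here, comparing both developers' versions against their common ancestor, can be sketched at whole-file granularity. This is an illustrative model only, not TeamWare's algorithm; `three_way_merge` and the dictionary representation are assumptions for the example, and real tools also merge within files.

```python
def three_way_merge(base, mine, theirs):
    """Merge two sets of file changes against their common ancestor.

    Each argument maps a path to its content. Returns (merged, conflicts).
    Whole-file granularity only; real tools merge within files as well.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(path), mine.get(path), theirs.get(path)
        if m == t:
            result = m              # both sides agree (or neither changed it)
        elif m == b:
            result = t              # only their side changed it
        elif t == b:
            result = m              # only my side changed it
        else:
            conflicts.append(path)  # both changed it differently
            result = m              # keep mine as a placeholder
        if result is not None:
            merged[path] = result
    return merged, conflicts
```

The common ancestor is what lets the tool decide automatically in the first three cases; only the last case, where both sides changed the same file differently, needs a human.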
After switching to Windows for our development platform, we ended up switching to AccuRev. While AccuRev, because it depends on a centralized server, is not truly a distributed solution, logically, as a workflow model, it comes very close. Where TeamWare would have had completely separate copies of everything at each client, including all the revisions of all files, under AccuRev this is maintained in the central database, and the local client machines only have the flat-file current version of things for editing locally. However, these local copies can be versioned through the client connection to the server and tracked completely separately from any other changes (i.e. branches) implicitly created by other developers.
Personally, I think the distributed model implemented by TeamWare or the sort of hybrid model implemented by AccuRev is superior to completely centralized solutions. The main reason for this is that there is no notion of having to check out a file or having a file locked by another user. Also, users don't have to create or define the branches; the tools do this for you implicitly. When there are larger teams or different teams contributing to or maintaining a set of source files, this resolves "tool generated" locking-related collisions and allows the code changes to be coordinated more at the level of the developers, who ultimately have to coordinate changes anyway. In a sense, the distributed model allows for a much finer-grained "lock" rather than the coarse-grained locking instituted by the centralized models.
Have used darcs on a big project (GHC) and for lots of small projects. I have a love/hate relationship with darcs.
Pluses: incredibly easy to set up repository. Very easy to move changes around between repositories. Very easy to clone and try out 'branches' in separate repositories. Very easy to make 'commits' in small coherent groups that makes sense. Very easy to rename files and identifiers.
Minuses: no notion of history---you can't recover 'the state of things on August 5'. I've never really figured out how to use darcs to go back to an earlier version.
Deal-breaker: darcs does not scale. I (and many others) have gotten into big trouble with GHC using darcs. I've had it hang with 100% CPU usage for 9 days trying to pull in 3 months' worth of changes. I had a bad experience last summer where I lost two weeks trying to make darcs function and eventually resorted to replaying all my changes by hand into a pristine repository.
Conclusion: darcs is great if you want a simple, lightweight way to keep yourself from shooting yourself in the foot for your hobby projects. But even with some of the performance problems addressed in darcs 2, it is still not for industrial strength stuff. I will not really believe in darcs until the vaunted 'theory of patches' is something a bit more than a few equations and some nice pictures; I want to see a real theory published in a refereed venue. It's past time.
I really love Git, especially with GitHub. It's so nice being able to commit and roll back locally. And cherry-picking merges, while not trivial, is not terribly difficult, and far more advanced than anything Svn or CVS can do.
My group at work is using Git, and it has been all the difference in the world. We were using SCCS and a steaming pile of csh scripts to manage quite large and complicated projects that shared code between them (attempted to, anyway).
With Git, submodule support makes a lot of this stuff easy, and only a minimum of scripting is necessary. Our release engineering effort has gone way, way down because branches are easy to maintain and track. Being able to cheaply branch and merge really makes it reasonably easy to maintain a single collection of sources across several projects (contracts), whereas before, any disruption to the typical flow of things was very, very expensive. We've also found the scriptability of Git to be a huge plus, because we can customize its behavior through hooks or through scripts that source git-sh-setup (. git-sh-setup), and it doesn't seem like a pile of kludges like before.
We also sometimes have situations in which we have to maintain our version control across distributed, non-networked sites (in this case, disconnected secure labs), and Git has mechanisms for dealing with that quite smoothly (bundles, the basic clone mechanism, formatted patches, etc).
Some of this is just us stepping out of the early 80s and adopting some modern version control mechanisms, but Git "did it right" in most areas.
I'm not sure of the extent of answer you're looking for, but our experience with Git has been very, very positive.
Using Subversion with SourceForge and other servers, over a number of different connections and with medium-sized teams, and it's working very well.
I am a huge proponent of centralized source control for a lot of reasons, but I did try BitKeeper on a project briefly. Perhaps after years of using a centralized model in one format or another (Perforce, Subversion, CVS) I just found distributed source control difficult to use.
I am of the mindset that our tools should never get in the way of the actual work; they should make work easier. So, after a few head pounding experiences, I bailed. I would advise doing some really hardy tests with your team before rocking the boat because the model is very different than what most devs are probably accustomed to in the SCM world.
I've used Bazaar for a little while now and love it. Trivial branching and merging back in give great confidence in using branches as they should be used. (I know that central VCS tools should allow this, but the common ones, including Subversion, don't allow it easily.)
bzr supports quite a few different workflows from solo, through working as a centralised repository to fully distributed. With each branch (for a developer or a feature) able to be merged independently, code reviews can be done on a per branch basis.
bzr also has a great plugin (bzr-svn) allowing you to work with a subversion repository. You can make a copy of the svn repo (which initially takes a while as it fetches the entire history for your local repo). You can then make branches for different features. If you want to do a quick fix to the trunk while half way through your feature, you can make an extra branch, work in that, and then merge back to trunk, leaving your half done feature untouched and outside of trunk. Wonderful. Working against subversion has been my main use so far.
Note I've only used it on Linux, and mostly from the command line, though it is meant to work well on other platforms, has GUIs such as TortoiseBZR and a lot of work is being done on integration with IDEs and the like.
I'm playing around with Mercurial for my home projects. So far, what I like about it is that I can have multiple repositories. If I take my laptop to the cabin, I've still got version control, unlike when I ran CVS at home. Branching is as easy as hg clone and working on the clone.
Using Subversion
Subversion isn't distributed, so that makes me think I need a wikipedia link in case people aren't sure what I'm talking about :)
Been using darcs 2.1.0 and it's great for my projects. Easy to use. Love cherry picking changes.
I use Git at work, together with one of my coworkers. The main repository is SVN, though. We often have to switch workstations and Git makes it very easy to just pull changes from a local repository on another machine. When we're working as a team on the same feature, merging our work is effortless.
The git-svn bridge is a little wonky, because when checking into SVN it rewrites all the commits to add its git-svn-id comment. This destroys the nice history of merges between my coworker's repo and mine. I predict that we wouldn't use a central repository at all if every team member used Git.
You didn't say what OS you develop on, but Git has the disadvantage that you have to use the command line to get all the features. Gitk is a nice GUI for visualizing the merge history, but the merging itself has to be done manually. Git-Gui and the Visual Studio plugins are not that polished yet.
We use distributed version control (Plastic SCM) for both multi-site and disconnected scenarios.
1- Multi-site: if you have distant groups, sometimes you can't rely on the internet connection, or it's not fast enough and slows down developers. Having an independent server which can synchronize back (Plastic replicates branches back and forth) is then very useful and speeds things up. It's probably one of the most common scenarios for companies, since most of them are still wary of "totally distributed" practices where each developer has their own replicated repository.
2- Disconnected (or truly distributed if you prefer): every developer has his own repository which is replicated back and forth with his peers or the central location. It's very convenient to go to a customer's location or just go home with your laptop, and continue being able to switch branches, checkout and checkin code, look at the history, run annotates and so on, without having to access the remote "central" server. Then whenever you go back to the office you just replicate your changes (normally branches) back with a few clicks.
What would be the best version control system to learn as a beginner to source control?
Anything but Visual Source Safe; preferably one which supports the concepts of branching and merging. As others have said, Subversion is a great choice, especially with the TortoiseSVN client.
Be sure to check out (pardon the pun) Eric Sink's classic series of Source Control HOWTO articles.
I'd suggest you try Subversion, for example with the 1-click SVN installer. Try searching SO for "Subversion", and you'll find loads of questions with answers that point to good tutorials.
Good luck!
I'd go straight for Git. I've used subversion before, but always felt like I was doing it wrong. Git made sense from day one.
Useful resources:
Linus Torvalds on Git
Scott Chacon "Getting Git"
There are a few core concepts that I think are important to learn:
Check-ins/check-outs (obviously)
Local versions vs. server versions
Mapping/Binding a local workspace to a remote store or repository.
Merging your changes back into a file that contains changes from others.
Branching (what it is, when/why to use it)
Merging changes from a branch back into a main branch or trunk.
Most modern source control systems require some knowledge of the above topics and should help facilitate you learning them. Then you have distributed source control, which I don't have any experience with but is supposed to be fairly complicated and may not be suitable for a beginner.
Subversion is great because it has all of the modern features you'd want and is free.
Git is also becoming an increasingly popular option and is another free or very low cost alternative to Subversion. Knowledge of the concepts of branching and merging becomes critical for using Git, however.
You can use Unfuddle as a free and easy way to experiment with both Git and Subversion. I use it to host a couple of Subversion repositories for some side projects I've worked on in the past.
I'm not an advanced source control user, but I'm learning. Here is my experience with source control products:
A long time ago, the company I was working for at the time decided to use source control. They introduced the concept to developers and got everyone willing to give it a try. They chose to use PVCS, and implemented it. Before too long, developers had to coordinate to lock/unlock modules and objects, and we really didn't see much benefit.
A few years later, I was playing around with making an open source project and at the time rubyforge was offering CVS repositories. I tried it out and it was marginally better than PVCS. Granted I was the only one using the repository. I did however become frustrated when I tried to rearrange the structure of my files because I didn't like the way I had initially imported them. It didn't really work out in CVS.
A few years after that I was working on another personal project and my web hosting provider offered easy-to-set-up Subversion (SVN) repositories. It took me a little bit of research to get it up and running correctly, but once I got past the initial learning curve, I liked it.
Not long after that I realized that I liked having source control and that my current job didn't have it. So I evangelized, and after a long period of time, my team implemented Source Safe because we work in Visual Studio and are generally a Microsoft shop. I was eager to use it, but before long I found that I was losing files and that Visual Studio was putting things in the wrong place and that I'd work on a project for a while and then go to export my work to another location and find that it either wouldn't export or would only export some of the projects in a solution. This made me realize that even though I thought I was using a "version control system", the copy of the code that was most secure, robust and complete was my working copy. The exact opposite of what source control is supposed to do.
So last week I was so fed up with Source Safe that I went searching. After looking into a few solutions, I decided to try Git. I won't say it's all been roses, since I have again had some learning curve to get it to do what I want it to do. However, I have liked it enough to convert all of my work and personal projects over to it. One of the really nice things about it is that I don't need a centralized repository, so I can use it without going through a ton of red tape at work to get it installed.
So in short I would recommend Git. I use msysGit on Windows and it has the added bonus of giving me a bash shell. On Linux you can just install it from your package manager. If you don't like Git, try Subversion. If you don't like either of those you probably won't like CVS or PVCS either. Under no circumstances try Source Safe; it's awful.
I found http://unfuddle.com saved me messing about with installing SVN or git. You can get a free account in there and use either of those - plus you can use your OpenID there.
Then you avoid having to mess about setting it up right and focus on how you're going to use it!
Vault from SourceGear.com is superb. It is free for single users and provides a superb VS 2005/2008 interface. I love it!
rp
@Ian Nelson:
I agree with you that Source Safe is bad as a source control system, but keep in mind that using Source Safe is a lot better than "carrying around floppy disks" as Joel Spolsky said.
For a beginner it might not be a bad idea, since the cost of having no source control at all is a lot higher.
Each tool has its strengths and weaknesses. It's very much a question of what your requirements are. Unfortunately with this issue, like many others, it's often not the best tool that is selected but the one that someone is familiar with. For instance, if you don't require many branches and your team is small and local, almost any VCS will do the job (except SourceSafe). Things change if you need branches (which almost by necessity means you also need to do merges), your team is distributed, or you need advanced security (subcontractors are not allowed access to the entire source tree), task tracking, etc. There is also the question of cost in three different ways: cost of licenses, cost of maintenance (some tools are so complicated that in practice you need someone just to control the repositories) and cost of training.
Therefore suggesting one tool over another is like suggesting what would be the best programming language.
Just some pointers:
StarTeam is the easiest of the tools I have used. It required very little training. I got a one-day training since I was to be the maintainer. This maintaining took me less than 30 minutes per week. Users I "trained" by writing a two-page manual, and after that I had very few questions to answer.
Continuus was the other end of the scale as far as ease of use is concerned. On the other hand, task handling was great and it offered good support for release management. Trouble is, even as a release manager I never thought ease of making releases (it was easy once you learned how, but that took a considerable amount of time) should be more important than the daily work that developers do.
Merging and branch creation differ wildly between tools. Some tools make this simple, like Git and ClearCase (although the latter is very slow); some basically force you to do the merge by hand. If you need to do merges a lot, the cost can get high. ClearCase was also expensive in all three categories mentioned before (although it has to be said we used all the advanced stuff, which isn't necessary). Git, on the other hand, lacks a good UI, and some of its concepts differ from what you might be used to. Git's security features are also lacking (gitosis addresses some issues but not all).
Most tools I have used are also quite slow. Tools like PVCS/Dimensions were just slow, no matter what (basic things like opening a directory in the repository); some were very slow in more specific ways (like ClearCase).
Of the tools I have used, I would select StarTeam if your developers are not very experienced (and if you don't mind paying for the license, which is quite expensive) and Git if you have some experienced VCS people on board who can set up the environment for the others. Mercurial also looks like an interesting competitor and seems to have slightly better UIs.
Continuus, PVCS/Dimensions and ClearCase are just too slow, too complex and too expensive for almost any project. If someone insists on selecting one of these, I would go for ClearCase.
I haven't used Subversion which many seem to like (yet, I have a feeling this is about to change in the near future) so can't comment how it compares to the other tools I have used (usually as a build and/or release manager).
As for the first tool to choose, the problem with Git, Bazaar and Mercurial is that they are distributed VCSs. This is different from the traditional client-server model, where you have a central repository. For just learning the stuff I would also recommend reading about the concepts. Branching, for instance, is something that you might not understand correctly just by trying it yourself (there are different branching strategies for different situations). Plus, it is very different if you are the only one accessing the repository: merge conflicts, for instance, wouldn't be a problem (you might get to see them, but you would also easily fix them since you know the code in both branches). Of course you would learn about check-outs, check-ins, and such, but I don't think these issues are particularly difficult in the first place.
An added problem with VCSs is that they tend to use different terms. StarTeam, which is otherwise easy to use, for some reason insists on using the terms "check out" and "check out and lock". The latter is what most people think the first does. There is a reason for this (you can edit files even if you don't have an exclusive lock), but it would still make much more sense to call these "Get" and "Check out" to avoid confusion.
Anything, but I would learn a modern system like git or subversion myself. My first VCS was RCS, but I got the basics down.
Well, if you are just wanting to learn on your own, I would say you should go with something free, like subversion. If you are a company who has never used source control before, then it really depends on your needs.
My first exposure was CVS with WinCVS as a client. It was horrid. Next was Subversion, with TortoiseSVN and Eclipse's integration. It was intuitive, and heavenly. I think that using CVS with TortoiseCVS and Eclipse's integration would be nice as well, though I prefer the way SVN handles revisioning: the entire repository is versioned with each check-in, not individual files.
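A toy model of what "the entire repository is versioned with each check-in" means, as a rough sketch only: every commit produces one new revision number for the whole tree, so a single number identifies a consistent state of all files. The class TinyRepo and the file names are invented for illustration, and real Subversion stores changes far more efficiently than the full snapshots kept here.

```python
class TinyRepo:
    """Conceptual model of repository-wide revision numbers."""

    def __init__(self):
        # Revision 0 is the empty tree, as in a freshly created repository.
        self.revisions = [{}]

    def commit(self, changes):
        """Apply changes (file name -> new content) and return the new revision number."""
        tree = dict(self.revisions[-1])
        tree.update(changes)
        self.revisions.append(tree)
        return len(self.revisions) - 1

    def checkout(self, rev):
        """Return the state of every file as of the given revision."""
        return dict(self.revisions[rev])


repo = TinyRepo()
r1 = repo.commit({"a.txt": "one", "b.txt": "one"})
r2 = repo.commit({"a.txt": "two"})  # only a.txt changes

# b.txt was not touched in r2, yet it is still part of revision 2.
print(r1, repo.checkout(r1))
print(r2, repo.checkout(r2))
```

With per-file revisions, as in CVS, a.txt and b.txt would each carry their own counter, and recovering "the project as of the second commit" would take a tag or a date instead of one number.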
I'd also recommend Subversion. It does not take too long to set up, it is free, and there is a really good book available online that goes over the basics as well as some advanced topics: http://svnbook.red-bean.com/
Subversion with TortoiseSVN. (TortoiseSVN because you can see a lot of what goes on visually, and it will provide a good jumping-off point for the command line stuff.) There is tons of documentation out there and you will most likely run into it at some point in your career. Almost every company I have worked for and interviewed with runs SVN.
If you're looking to learn a commercial product while getting started Perforce provides a free client and server, with the server supporting two users and five client workspaces.
At my previous place of employment it was used religiously not only for code by our programmers, but for art assets and game levels, and my own documentation.
Subversion is a good place to start. It is a very stable and modern version control system.
The best online resource to start learning about Subversion would be Version Control with Subversion. There are a lot of choices as far as server and client software are concerned. For a Windows environment, I personally prefer:
VisualSVN server
TortoiseSVN shell-integrated client and
AnkhSVN Visual Studio Subversion Add-On
Again, with Subversion there are a lot of options available. Also, it is a continually evolving version control system (unlike the outdated SourceSafe). It can be easily integrated with numerous automated build tools (CruiseControl, FinalBuilder) and bug/issue tracking systems (JIRA).
If you are looking for a state-of-the-art version control system, go for Git (developed by Linus Torvalds). But if you are totally new to version control systems, I would suggest starting with Subversion.