What are common practices for deployment of large scale systems? - deployment

Given a large scale software project with several components written in different languages, configuration files, configuration scripts, environment settings and database migration scripts - what are the common practices for deployment to production?
What are the difficulties to consider? Can the process be simplified with tools like Ant or Maven? How can rollback and database management be handled? Is it advisable to use version control for the production environment?

As I see it, you're mostly asking about best practices and tools for release engineering AKA releng -- it's important to know the "term of art" for a subject, because it makes it much easier to search for more information.
A configuration management system (CMS -- aka revision control system or version control system) is indispensable for today's software development; if you use one or more IDEs, it's also nice to have good integration between them and the CMS, though that's more of an issue for purposes of development than for purposes of deployment / releng.
From a releng viewpoint, the key thing about a CMS is that it must have good support for "branching" (under whatever name), because releases must be made from a "release branch" where all the code under development and all of its dependencies (code and data) are in a stable "snapshot" from which the exact, identical configuration can be reproduced at will.
The need for good branching support may be more obvious if you have to maintain multiple branches (customized for different uses, platforms, whatever), but even if your releases are always, strictly in a single linear sequence, releng best practices still dictate making a release branch. "Good branching support" includes ease of merging (and "conflict resolution" when different changes are made to a file), "cherry-picking" (taking one patch or changeset from one branch, or the head/trunk, and applying it to another branch), and the like.
In practice, you start off the release process by making a release branch; then, you run exhaustive testing on that branch (typically MUCH more than what you run everyday in your continuous build -- including extensive regression testing, integration testing, load testing, performance verification, etc, and possibly even more costly quality assurance processes, depending). If and when the exhaustive testing and QA reveal defects in the release-candidate (including regressions, performance degradation, etc), they must be fixed; in a large team, development on head/trunk may be continuing while the QA is being done, whence the need for ease of cherry-picking / merging / etc (whether your practice is to perform the fix on head or on the release branch, it still needs to be merged to the other side;-).
Last but not least, you're NOT getting full releng value from your CMS unless you're somehow tracking with it "everything" that your releases depend on -- simplest would be to have copies or hard links to all the binaries for the tools you need to build your release, etc, but that may often be impractical; so at the very least track the exact release, version, bugfix &c numbers of those tools that are used (operating system, compilers, system libraries, tools that preprocess image, sound or video files into final form, etc, etc). The key is being able, at need, to exactly reproduce the environment required to rebuild the exact version that's proposed for release (otherwise you'll go mad tracking down subtle bugs that may depend on third party tools' changes as their versions change;-).
After a CMS, the second most important tool for releng is a good issue tracking system -- ideally one that's well integrated with the CMS. That's also important for the development process (and other aspects of product management), but in terms of the release process the importance of the issue tracker is the ability to easily document exactly what bugs have been fixed, what features have been added, removed, or changed, and what modifications in performance (or other user-observable characteristics) are expected in the new forthcoming release. For the purpose, a key "best practice" in development is that every changeset that gets committed to the CMS must be connected to one (or more) issue in the issue tracking system: after all, there's gotta be some purpose for that change (fix a bug, change a feature, optimize something, or some internal refactor that's supposed to be invisible to the software's user); similarly, every tracked issue that's marked as "closed" must be connected to one (or more) changesets (unless the closing is of the "won't fix / working as intended" kind; issues related to bugs &c in third-party components, which have been fixed by the third-party supplier, are easy to treat similarly if you do manage to keep track of all third-party components in the CMS too, see above; if you don't, at least there should be text files under CMS documenting third-party components and their evolution, again see above, and they need to be changed when some tracked issue on a 3rd party component gets closed).
Automating the various releng processes (including building, automated testing, and deployment tasks) is the third top priority -- automated processes are much more productive and repeatable than asking some poor individual to manually go through a list of steps (for sufficiently complex tasks, of course, the workflow of the automation may need to "get a human being in the loop"). As you surmise, tools such as Ant (and SCons, etc, etc) can help here, but inevitably (unless you're lucky enough to get away with very simple and straightforward processes) you'll find yourself enriching them with ad-hoc scripts &c (some powerful and flexible scripting language such as perl, python, ruby, &c, will help). A "workflow engine" can also be precious when your release workflow is sufficiently complex (e.g. involving specific human beings or groups thereof "signing off" on QA compliance, legal compliance, UI guidelines compliance, and so forth).
Some other specific issues you're asking about vary enormously depending on specifics of your environment. If you can afford programmed downtime, your life is relatively easy, even with a large database in play, as you can operate sequentially and deterministically: you shut the existing system down gracefully, ensure the current database is saved and backed up (easing rollback, in the hopefully VERY rare case it's needed), run the one-off scripts for schema migration or other "irreversible" environment changes, fire the system back up again in a mode that's still unaccessible to general users, run another extensive suite of automated tests -- and finally if everything's gone smoothly (including the saving and backup of the DB in its new state, if relevant) the system's opened up to general use again.
If you need to update a "live" system, without downtime, this may range anywhere from a slight inconvenience to a systematic nightmare. In the best case, transactions are reasonably short, and synchronization between the state set by transactions may be delayed a bit without damage... and you have a reasonable abundance of resources (CPUs, storage, &c). In this case, you run two systems in parallel -- the old one and the new one -- and just make sure all new transactions target the new system, while letting old ones complete on the old system. A separate task periodically syncs "new data in the old system" to the new system, as transactions on the old system terminate. Eventually you can determine that no transactions are running on the old system and all changes that happened there are synced up to the new one -- and at that time you can finally shut down the old system. (You need to be prepared to "reverse sync" too of course, in case a rollback of the change is needed).
This is the "simple, sweet" extreme for live system updating; at the other extreme, you can find yourself in such an overconstrained situation that you can prove the task is impossible (you just cannot logically meet all the stated requirements with the given resources). Long sessions opened on the old system which just cannot be terminated -- scarce resources that make it impossible to run two systems in parallel -- core requirements for hard-real-time sync of every transaction -- etc, etc, can all make your life miserable (and as I noticed, at the extreme, can make the stated task absolutely impossible). The two best things you can do about this: (1) ensure you have abundant resources (this will also save your skin when some server unexpected goes belly-up... you'll have another one to fire up to meet the emergency!-); (2) consider this predicament from the start, when initially defining the architecture of the overall system (e.g.: prefer short-lived transactions to long-lived sessions that just can't be "snapshot, closed down, and restarted seamlessly from the snapshot", is one good arcitectural pointer;-).

disclaimer: where I work we use something I wrote that I'm going to mention
I'll tell you how I do it.
For configuration settings, and general deployment of code and content, I use a combination of NAnt, a CI server, and dashy (an automatic deployment tool). You can replace dashy with any other 'thing', that automates the uploads to your server (perhaps capistrano).
For DB scripts, we use RedGate SQL Compare to get the scripts, and then for most changes, I actually make them manually, where appropriate. This is because some of the changes are slightly complex, and I feel more comfortable doing it by hand. You can actually use this tool to do it for you, (or at least generate the scripts).
Depending on your language, there are some tools that can script the DB updates for you (I think someone on this forum wrote one; hopefully he'll reply), but I have no experience with those. It's something I'd like to add though.
Difficulties
I forgot to answer one of your questions.
The biggest problem in updating any significantly complicated/distributed site is DB synchronisation. You need to consider if you will have any downtime, and if you do, what will happen to the DBs. Will you shutdown everything, so that no transactions can be processed? Or will you shift everything to one server, update DB B, and then sync DB A & B, then update DB B? Or something else?
Whatever you choose, you need to choose it, and either say "Okay for each update, there will be X downtime", or whatever. Just have it documented.
The worst thing you can do is have someones transaction fail, mid processing, because you were updating that server; or to somehow leave only part of your system operational.

I don't think you have any option about using versioning or not.
You cannot go without versioning (as mentioned in a comment).
I am talking from experience since I am currently working on a website where we have several items that have to work together:
The software itself, its functionalities at a given point in time
The external systems (6 of them) we are linked with (messages are versioned)
A database, which holds the configuration (and translations)
The static resources (images, css, javascripts) hosted on an Apache server (several in fact)
A java applet, which has to be synchronized with the javascript and software
This works because we use versioning, although I must admit that our database is pretty simple, and because the deployment is automated.
Versioning means that we have actually at any point in time several concurrent versions of the messages, the database, the static resources and the java applet.
It is especially important in the event of a 'fallback'. If you discover a flaw when loading you new software and suddenly you cannot afford to load, you'll have a crisis if you have not versioned, and you'll just have to load the old soft if you have.
Now as I said, the deployment is scripted:
- the static resources are deployed first (+ the java applet)
- the database goes next, several configurations cohabit since they are versioned
- the soft is queued for load in a 'cool' time-window (when our traffic is at its lowest point, so at night)
And of course we take care of backward / forward compatibility issues with the external servers, which should not be underestimated.

Related

How to implement continuous migration for large website?

I am working on a website of 3,000+ pages that is updated on a daily basis. It's already built on an open source CMS. However, we cannot simply continue to apply hot fixes on a regular basis. We need to replace the entire system and I anticipate the need to replace the entire system on a 1-2 year basis. We don't have the staff to work on a replacement system while the other is being worked on, as it results in duplicate effort. We also cannot have a "code freeze" while we work on the new site.
So, this amounts to changing the tire while driving. Or fixing the wings while flying. Or all sorts of analogies.
This brings me to a concept called "continuous migration." I read this article here: https://www.acquia.com/blog/dont-wait-migrate-drupal-continuous-migration
The writer's suggestion is to use a CDN like Fastly. The idea is that a CDN allows you to switch between a legacy system and a new system on a URL basis. This idea, in theory, sounds like a great idea that would work. This article claims that you can do this with Varnish but Fastly makes the job easier. I don't work much with Varnish, so I can't really verify its claims.
I also don't know if this is a good idea or if there are better alternatives. I looked at Fastly's pricing scheme, and I simply cannot translate what it means to a specific price point. I don't understand these cryptic cloud-service pricing plans, they don't make sense to me. I don't know what kind of bandwidth the website uses. Another agency manages the website's servers.
Can someone help me understand whether or not using an online CDN would be better over using something like Varnish? Is there free or cheaper solutions? Can someone tell me what this amounts to, approximately, on a monthly or annual basis? Any other, better ways to roll out a new website on a phased basis for a large website?
Thanks!
I think I do not have the exact answers to your question but may be my answer helps a little bit.
I don't think that the CDN gives you an advantage. It is that you have more than one system.
Changes to the code
In professional environments I'm used to have three different CMS installations. The fist is the development system, usually on my PC. That system is used to develop the extensions, fix bugs and so on supported by unit-tests. The code is committed to a revision control system (like SVN, CVS or Git). A continuous integration system checks the commits to the RCS. When feature is implemented (or some bugs are fixed) a named tag will be created. Then this tagged version is installed on a test-system where developers, customers and users can test the implementation. After a successful test exactly this tagged version will be installed on the production system.
A first sight this looks time consuming. But it isn't because most of the steps can be automated. And the biggest advantage is that the customer can test the change on a test system. And it is very unlikely that an error occurs only on your production system. (A precondition is that your systems are build on a similar/equal environment. )
Changes to the content
If your code changes the way your content is processed it is an advantage when your
CMS has strong workflow support. Than you can easily add a step to your workflow
which desides if the content is old and has to be migrated for the current document.
This way you have a continuous migration of the content.
HTH
Varnish is a cache rather than a CDN. It intercepts page requests and delivers a cached version if one exists.
A CDN will serve up contents (images, JS, other resources etc) from an off-server location, typically in the cloud.
The cloud-based solutions pricing is often very cryptic as it's quite complicated technology.
I would be careful with continuous migration. I've done both methods in the past (continuous and full migrations) and I have to say, continuous is a pain. It means double the admin time for everything, and assumes your requirements are the same at all points in time.
Unfortunately, I would say you're better with a proper rebuilt on a 1-2 year basis than a continuous migration, but obviously you know best about that.
I would suggest you maybe also consider a hybrid approach? Build yourself an export tool to keep all of your content in a transferrable state like CSV/XML/JSON so you can just import into a new system when ready. This means you can incorporate new build requests when you need them in a new system (what's the point in a new system if it does exactly the same as the old one) and you get to keep all your content. Plus you don't need to build and maintain two CMS' all the time.

Rule development and deployment management with Drools Guvnor

Introduction
Drools Guvnor has it's own versioning system, that in production use allows the users of an application to modify the rules and decision tables in order to adapt to change in their business. Yet, the same assets continue to live on the development version control system, where new features to the app are developed.
This post is for looking insight/ideas/experience on rule development and deployment when working with Drools rules and Guvnor.
Below are some key concepts I've been puzzling about.
Deployment to Guvnor
First of all, what is the best way to deploy the drl files and decision tables to production environment? Just simply put them on a zip package and then unzip to Web-Dav folder? What I have navigated around Drools, I haven't found a way to import more than one file at a time. The fact model can be added as a jar archive, though. Guvnor seems to have a REST API of some sort, but using that would require custom deployment scripts.
Change management
Secondly, once the application is in production, the users will likely want to change the values in decision tables in order to set the discount percentages to higher for premium clients etc. This is all fine and dandy, until comes the time to start development of version 2.0 of the app.
Now what we have at this point is
drl files and decision tables in version controlling system
drl files and decision tables in production environment with user modifications, versioned by Guvnor
Now we are in the point of getting the rules and decision tables back from the Guvnor. And again is the Web-Dav folder the best for this, what other options there are?
Merge tools today can even handle Excel file diffs, but sounds like a merge hell to me on a big scale projects.
Keeping the fact model backwards compatible
Yet another topic is fact model integrity. For the assumed version 2.0, developers always want to make refactoring and tear the whole fact model upside down. Still, it must remain backwards compatible with the previous versions as there may be user modified rules that depend on that. Any tips on this? Just keep the fact model simple and clean? Plan ahead / suggest what the users could want to change?
Summary
I'm certain I'm not the first, and surely not the last, to consider options on deployment and change management with Drools and Guvnor. So, what I'd like to hear is comment, discussion, tips etc. on some best (and also the worst in order to avoid them) practices to handle these situations.
Thanks.
The best way to do things depends very much on your specific application and the environment you work in. However the following are pointers from my own experience. Note that I'll add just a few points for now. I'll probably come back to this answer when things come to me.
After the initial go-live, keep releases incremental and small
For your first release you have the opportunity to try things out. Take advantage of this opportunity, and do as much refactoring as possible, because...
Your application has gone live and your business users are maintaining rules in decision tables. You have made great gains in what folks in the industry like to call "business agility". Unfortunately, this tends to be at the expense of application development agility. All of your guided editor rules and decision table rules are tied to your fact model, so any changes existing properties of that fact model will break your decision tables. Unlike in most IDEs these days, you can't just right-click on a fact's getX() method, rename it, and expect all code which relies on that property to be updated.
Decision tables and guided rules are hard to refactor. If a fact has been renamed, then in many (all?) versions of Guvnor, that rule/table will no longer open. You need to get at the underlying XML file via WebDav and do some text searching and replacing. That may be very difficult, considering that to do this, you need to download the file from production to a test environment, make changes, test them, deploy them to a test environment. When you're happy with your changes you need to push them back up to the 'production' Guvnor. Unfortunately, while you were doing that, the users have updated a number of decision tables and you need to either start again, or re-apply the past couple of days' changes. In an ideal world, your business users will agree to make no rule changes for a period of time. But the only way to make that feasible is to keep the changes small so that you can make them in a couple of hours or a day depending on how generous they feel.
To mitigate this:
Keep facts used within Guvnor separate from your application domain classes. Your application developers can then refactor the internal application model to their hearts content, but such changes will not affect the business model. Maintain mappings between the two and ensure there are tests covering those mappings.
Avoid changes such as renaming facts or their properties. Make sure that facts you create and their properties have names which suit the domain and agree these with the business. On the other hand, adding a new property is relatively painless. It is well worth prompting the users to give you an eye on their future plans.
Keep facts as simple as possible. Don't go more complex than name-value pairs unless you really need to. For one thing, your business users will find it much easier to maintain rules. Your priority with anything managed within Guvnor should always be about making it easy for business users to maintain.
Keep external dependencies out of your facts. A developer may think it's a good idea to annotate a fact as a JPA #Entity, for easy persistence. Unfortunately, that adds dependencies which need to be added to Guvnor's classpath, requiring a restart.
Tips & tricks
My personal technique for making cross-environment changes is to connect Eclipse to two Guvnor WebDav directories, and checkout the rules into local directories, where each local directory maps to an environment. I then use the Eclipse diff tooling.
When building a Guvnor-managed knowledge base, I create a separate Maven project containing only the facts, and with no dependencies on anything else. It makes it a lot easier to keep them clean this way. Also, when I really do need to add a dependency (i.e. I use JodaTime where possible), then the build can have a step to generate a shaded JAR containing all the dependencies. That way you only ever deploy one JAR to Guvnor, which is guaranteed to contain the correct versions of your dependencies.
I'm sure there will be more that I think of. I'll try to remember to come back to this...

Source control Branching needs

we are creating hospital information system software. The project will be different hospital to hospital and contain different use cases. But lots of parts will be the same. So we will use branching mechanism of the source control. If we find a bug in one hospital, how can we know the other branches have the same bug or not.
IMAGE http://img85.imageshack.us/img85/5074/version.png
The numbers in the picture which we attached show the each hospital software.
Do you have a solution about this problem ?
Which source control(SVN,Git,Hg) we will be suitable about this problem ?
Thank you.!
Ok, this is not really a VCS question, this is first and foremost and architectural problem - how do you structure and build your application(s) in such a way that you can deliver the specific use cases to each hospital as required whilst being able, as you suggest, to fix bugs in commom code.
I think one can state with some certainty that if you follow the model suggested in your image you aren't going to be able to do so consistently and effectively. What you will end up with is a number of different, discrete, applications that have to be maintained separately even though they have at some point come from a common set of code.
Its hard to make more than good generalisations but the way I would think about this would be something along the following lines:
Firstly you need a core application (or set of applications or set of application libraries) these will form the basis of any delivered system and therefore a single maintained set of code (although this core may itself include external libraries).
You then have a number of options for your customised applications (the per hospital instance) you can define the available functionality a number of means:
At one extreme, by configuration - having one application containing all the code and effectively switching things on and off on a per instance basis.
At the other extreme by having an application per hospital that substantially comprises the core code with customisation.
However the likelyhood is that whilst the sum of the use cases for each hospital is different individual use cases will be common across a number of instances so you need to aim for a modular system i.e. something that starts with the common core and that can be extended and configured by composition as much as by any other means.
This means that you are probably want to make extensive use of Inversion of Control and Dependency Injection to give you flexibility within a common framework. You want to look at extensibility frameworks (I do .NET so I'd be looking at the Managed Extensibility Framework - MEF) that allow you to go some way towards "assembling" an application at runtime rather than at compile time.
You also need to pay attention to how you're going to deploy - especially how you're going to update - your applications and at this point you're right you're going to need to have both your version control and you build environment right.
Once you know how you're going build your application then you can look at your version control system - #VonC is spot on when he says that the key feature is to be able to include code from shared projects into multiple deliverable projects.
If it were me, now, I'd probably have a core (which will probably itself be multiple projects/solutions) and then one project/solution per hospital but I would be aiming to have as little code as possible in the per hospital projects - ideally just enough framework to define the instance specific configuration and UI customisation
As to which to use... if you're a Microsoft shop then take a good long hard look at TFS, the productivity gains from having a well integrated environment can be considerable.
Otherwise (and in any case), DVCS (Mercurial, Git, Bazaar, etc) seem to me to be gaining an edge on the more traditional systems as they mature. I think SVN is an excellent tool (I use it and it works), and I think that you need a central repository for this kind of development - not least because you need somewhere that triggers your Continuous Integration Server - however you can achieve the same thing with DVCS and the ability to do frequent local, incremental, commits without "breaking the build" and the flexibility that DVCS gives you means that if you have a choice now then that is almost certainly the way to go (but you do need to ensure that you establish good practices in ensuring that code is pushed to your core repositories early)
I think there is still a lot to address purely from the VCS question - but you can't get to that in useful detail 'til you know how you're going to structure your delivered solution.
All of those VCS (Version Control System) you mention are compatible with the notion of "shared component", which allows you to define a common shared and deployed code base, plus some specialization in each branch:
CVCS (Centralized)
Subversion externals
DVCS (Distributed)
Git submodules (see true nature of submodules)
Hg SubRepos
Considering the distributed aspect of the release management process, a DVCS would be more appropriate.
If the bug is located in the common code base, you can quickly see in the other branches if:
what exact version of the common component they are referring to.
they refer the same or older version of that common component than the one in which the bug has been found (in which case chances are they also do have the bug)

ClearCase advantages/disadvantages [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Because I'm currently struggling to learn IBM Rational ClearCase, I'd like to hear your professional opinion.
I'm particularly interested in advantages/disadvantages compared to other version-control-systems like Subversion or Git.
You can find a good comparison between ClearCase and Git in my SO answer:
"What are the basic ClearCase concepts every developer should know?", illustrating some major differences (and some shortcomings of ClearCase)
File-centric operations
The most single important shortcoming of ClearCase is its old "file-centric" approach (as opposed to "repository-centric" like in SVN or Git or Perforce...)
That means each checkout or check-in is done file per file. The atomicity of operation is at file levels.
Combine that with a very verbose protocol and a network with potentially several nodes between the developer workstation and the VOB server, and you can end up with a fairly slow and inefficient file server (which ClearCase is at its core).
File-per-file operations means: slow recursive operations (like recursive checkout or recursive "add to source control", even by clearfsimport).
A fast LAN is mandatory to mitigate the side-effects of that chatty protocol.
Centralized VCS
The other aspect to take into account is its centralized aspect (even though it can be "distributed" with its multi-site replicated VOB feature)
If the network does not allow access to the VOBs, the developers can:
still work within snapshot views (but with hijacked files only)
wait for the restoration of the network if they are using dynamic views
Expensive Distributed VCS option
You can have some distributed VCS feature by replicating a Vob.
But:
you need a special kind of license to access it.
that license is expensive and add to the cost of the regular license
any vob that uses the replicated vob (admin vob, admin pvob, ...) must be replicated as well (meaning some projects not directly concerned with a distributed development will still have to pay multi-site license...)
Old and not user-friendly GUI
the GUI is very old school and impractical (mid-90's MFC look, completely synchronous GUI, meaning you have to wait for a refresh before clicking elsewhere): when browsing baselines, you cannot quickly look for one in particular.
the GUI on Unix is not exactly the same than on Windows (the latest 7.1 version is better but not there yet)
the installation process is quite complicated (although the latest Installer Manager introduced by CC7.1 is now a coherent GUI on Windows or Unix and does simplify the procedure)
the only real rich application has only been developed for CCRC (the Remote Client)
UCM inconsistencies and in coherencies
As mentioned in "How to Leverage ClearCase’s features", dynamic views are great (a way to see data through the network without having to copy them to the disk), but the main feature remain UCM: it can be a real asset if you have big project with complex workflow.
Some shortcomings on that front:
the dependencies between components is not well managed for a depth superior to one (because of the bug of "parasite baseline")
UCM still has some in coherencies and inconsistencies as documented in CM Crossroads
Limited policies with Base ClearCase
Using ClearCase without using UCM means having to define a policy to:
create branch (otherwise anyone can create any branch, and you end up with a gazillon of them, with merge workflow nightmare)
put labels (otherwise you forget to label some files, or you put a label where you were not supposed to, or you "move" (gasp) a label from one version to another: at least UCM baselines cannot be moved)
define changeset. ChangeSets only exist with UCM activities. With Base ClearCase, you are reduced to clever "cleartool find" requests...
No application rights
ClearCase right management is entirely built on system rights.
That means you need to register your user to the correct system group, which is not always easy to do when you have to enter a ticket to your IT service in order for them to make the proper registration.
Add to that an heterogeneous environment (users on Windows, and server on Unix), and you need to register your user on Unix as well as Windows! (with the same login/group name). Unless you put some sort of LDAP correspondence between the two world (like Centrify)
No advanced API
only CLI is complete ("cleartool" is the ClearCase Command Line Interface), meaning that any script (in Perl or other language) consists in parsing the output of those cleartool commands)
ClearCase Automation Library (CAL) exists, but is quite limited
Java API exists, but only for web views for the CCRC client.
View Storages not easily centralized/backed up
The View storages are the equivalent of the ".svn" of SubVersion, exept there is only one "view storage" per view instead of many .svn in all the directories of a SubVersion workspace. That is good.
What is bad is that each operations within a view (a simple "ls", checkout, checking, ...) will trigger a network request to the view_server process that manages your view server.
2 options:
declare your view storage on your workstation: great for scalability, you can have as many view as you want without taxing the LAN: all communications are directly done on your workstation. BUT if that machine dies on you, you loose your views.
declare your view storage on a centralized server: that means all view_server process will be created there and that all operations on a view by any user will have to communicate with that server. It can be done if the infrastructure is "right" (special high-speed LAN, dedicated server, constant monitoring), but in practice, your LAN will not support this mode.
The first mode means: you have to backup yourself your work in progress (private files or checked-out files)
The second mode means: your workstation can be unavailable, you can just log on another a get back your views (execpt for the private files of a snapshot view)
Side discussion about dynamic views:
To add to the "dynamic view" aspect, it has one advantage (it's dynamic) and one shortcoming (it's dynamic).
Dynamic views are great for setting a simple environment to quickly share a small development between a small team: for a small development effort, a dynamic view can help 2 or 3 developers to constantly stay in touch one with another, seeing instantly when one's commit breaks something in the other views.
For more complex development effort, the artificial "isolation" provided by snapshot view is preferable (you see changes only when you refresh - or "update" - your snapshot view)
For real divergent development effort or course, a branch is still required to achieve true code isolation (merges will be required at some point, which ClearCase handles very well, albeit slowly, file-by-file)
The point is, you can use both, for the right reasons.
Note: by small team I do not mean "small project". ClearCase is best used for large project, but if you want to use dynamic views, you need to setup up "task branches in order to isolate a small development effort per branch: that way a "small team" (a subset of your large team) can work efficiently, sharing quickly its work between its members.
If you use dynamic views on a "main" branch where everyone is doing anything, then any check-in would "kill you" as it could introduced some "build breaks" unrelated with your current development effort.
That would then be a poor usage of dynamic views, and that would forget its other usages:
additional way of accessing data, in addition of snapshot views, meaning it is a great tool to just "see" the files (you can for example use a dynamic view to tweak its config spec until you see what you want and then copy those select rules into your usual snapshot view)
a side view to make merges: you work with your snapshot view, but for merges you can use your dynamic "sister-view" ("sister" as in "same config spec"), in order to avoid having a failed merge because of checked-out files (on which you would be currently working on your snapshot view), or because of a snapshot view not completely up-to-date. Once the merge is complete, you update your regular snapshot view and resume your work.
Developing directly in a dynamic view is not always the best option since all (non-checked-out) files are read over the network.
That means the dll or jar or exe needed by your IDE would be accessed over the network, which can slow down considerably the compilation process.
Possible solutions:
one snapshot view with all in it
a snapshot view with dll or jar or exe in it (files which do not changes every five minutes: one update per day), and dynamic view with only the sources visible.
The cost is a fairly obvious disadvantage. Not just the license cost, but also the cost of a ClearCase guru's salary. Almost every company I'm aware of that uses ClearCase seems to have at least one person whose only purpose is to tame the unruly beast.
The very fact that it's complicated enough to require a full-time nanny is also worrying.
An absolute nightmare of a system. It made me wish we could go back to VSS! (Never mind any modern source-control system like Subversion or Git!)
It's slooooow.
If you use dynamic views and the network goes down you cannot access your working copy of the source. You can do nothing but sit and wait for it to be fixed.
If you use snapshot views you seem to run into conflicts and "hijacked" files all the time, so the files in your working copy are never quite the same as in the source repository.
Whenever you try a large update or deliver operation it invariably FAILS for one reason or another, requiring your ClearCase guru to spend a few hour/days figuring it out. Oh yes, you must have a dedicated, full-time ClearCase guru!
When it fails you often cannot roll back the operation, either, so you're stuck with an operation in progress and the developers are blocked.
When you look past the pretty(?) icons, the GUI is very poor - right down to things like being unable to resize windows to see full file paths!
Their support staff are quite reluctant to fix anything. Their first response is always "this is by design" and "can you work around it?" If they do ultimately provide a fix (after much arguing) it will be the most basic possible fix to the most immediate problem.
Basically, it's slow, complicated and unreliable as hell. Oh, and did I mention it's ridiculously expensive? The only way they can possibly sell it is by talking to decision-makers who have never used the product and never will! I'm quite sure that no developer in the world would ever buy it.
Atomic commits and changesets are my biggest gripes against ClearCase. Let's say you check in five files as part of a bug fix or refactoring. Then it is discovered that something got messed up and you need to revert. Good luck finding which five files they are and what version each one needs to be on. But let's take a step back. You have just finished editing those five files, and it's time to commit. The first four go through just fine. That last one requires a massive merge. The other four files are already checked in. They don't wait for you to finish your necessary changes in the last file. I sure hope that no one updated or is using a dynamic view. A continuous integration build server is going to fail too.
Sometimes we make a whole new directory full of files that need to be checked in, but we don't want to check them in until they are done. It's early and everything is still volatile, so why check things in that you might delete very soon? OK, fine so far. Now it's time to check in. You add the newly created folder to source control. Well, ClearCase isn't recursive, so only that single folder is checked in. With SVN, that folder and everything below it is added, as you choose. The developer needs to remember to add everything, otherwise, a lot of files are going to be missing.
ClearCase owns the files and folders so you cannot modify anything unless you have checked it out first. The eclipse plugin takes away a lot of the nuisance here. I can't tell you how many times I opened a file in vi to make a quick change, only to find that I had forgotten to check it out first. Checkout isn't recursive either.
Updates can be painfully slow without changesets. When you update with a snapshot view, every file updates, not just the modified files. I worked on a project with 20,000+ files. I would remote in to my work machine, start the update, then drive to work; get coffee; go to my desk while it was finishing up. That might sound like an exaggeration, but it sadly isn't.
Dynamic views are terrible unless you are in a very small team. And if that's the case, why do you even have ClearCase? I have seen countless people's views getting hosed because someone checked in files that broke the views of everyone else. You should always update and merge any conflicts on your own view. That way, the changes only affect you. With a dynamic view, you cannot merge down before pushing back up; you just commit and hope.
I know cost probably isn't a big concern, but the developers who make the money for the company would enjoy spending the $50k-$100k (depending on ClearQuest license, which is a common addition) on either fun events or new equipment (chairs, monitors, etc.). IBM recommends having staff to keep ClearCase going. Why not re-purpose those people to generate revenue for the company, instead of making sure things don't crash and burn?
Some of the reasons that I have heard for not switching:
Learning will take time and money
Learning SVN or Mercurial should take no more than a day. Only ClearCase suggests having a certain ratio of admins to developers.
Migration will be painful
This is why tools exist: cc2svn
It's not as easy with Mercurial
Security
There are no known gaping holes in SVN AFAIK, and the development team is dedicated to fixing anything that is found quickly. The Department of Defense seems OK with SVN.
No real productivity gain afterwards
It takes forever trying to track down bugs without changesets. I love being able to roll back until I can't see the bug. You can't do that in ClearCase.
Multisite
WANdisco solves that problem. It's not free though.
The only thing that ClearCase does better than the rest is branching individual files, while keeping the others on the same track as another branch.
Everything I have done in Clearcase always seems hard. Whereas, I've never had that impression with other systems(except maybe CVS on occasion).
I've used SVN, CVS, Clearcase, and Mercurial.
My experience with ClearCase was a disaster, and I will second Don's statement that it requires a resident expert-- unfortunately we had more than one. I had experience with CVS and other version control systems, I was familiar with the concepts, but I found the ClearCase documentation incomprehensible and had to ask for help several times; different experts gave me conflicting advice to the point where we actually broke cd. That is, after I issued a ClearCase command in a UNIX shell, the "cd" command failed with an error message.
The basic task of a version control system is really pretty simple. Honestly, I think that half a dozen commands should suffice, using a file scheme that plays well with others. To me ClearCase looks like the result of a marketing exec deliberately complicating the hell out of things to make the product look sophisticated and powerful. I've heard that it can be configured to behave in a simple, safe, reliable way, but again that requires the services of an expert-- out of the box it's like a motorized swiss army knife.
Everything I've experienced related in any capacity to ClearCase is inefficient, ugly, overly complex, slow, confusing, expensive and inconvenient.
It seems to attract managers and engineers that JUST HAVE GOT IT ALL WRONG.
Damn, IBM and Rational must have amazing sales guys to sell such a crappy product.
We are just migrating off CC onto Git for many of the reasons given here. I would like to add one reason to stay away from CC or any other commercial source control system.
Your vital business data is hostage to ClearCase. You can't get it out.
Your vital business data is the code, its version history and all metadata such as commit comments, who checked in and when.
All software will have a limited useful life. You should always ask yourself when you introduce a new system that swallows important business data, whether it is code, bugs, customer data or what not: How do I get my data out again? If you can't answer that question, you should not introduce that system.
When we migrated out we lost most of our history and all of our metadata. Essentially we only have history corresponding to released versions, but information about what changes were done in response to what customer requests is lost (we have that data in the customer support and bug ticket system, so it is not completely lost, but the coupling to the source code is gone).
This will be somewhere between a nuisance and a problem for us on short to medium term. In a couple of years time, it is not important anymore, but perhaps for 1-3 years it will matter.
(There are commercial tools to migrate CC to other SCM, but they were not deemed adequate to our needs, and I doubt it would have been feasible. The minimal export we did took long enough.)
The lesson learnt is: Never entrust vital business data to proprietary systems.
No atomic commits
Once you checked in files it is very hard to revert to a certain state, because atomic commits aren’t supported. When checking in multiple files, each file gets a new revision (similar to CVS) and not the check-in itself. I think this is a crucial feature, because you hardly want revert single files but complete commit actions (which should map tasks). With ClearCase you can only revert to certain states by using Labels. In practice using ClearCase Labels for each check-in is overkill and thus not done.
Crappy user interface
The GUI of ClearCase Explorer is just a big joke. Horrible in usability and ugly looking. Different and often necessary functions aren’t provided (e.g. recursively checking in worked on artifacts). Command line tool cleartool used with cygwin is much better, but still some things aren’t available like recursively adding new files/folders to source control. I have to laugh my head off if I read a 50 lines of code long script to workaround this.
High administration efforts
Administrating ClearCase beast is far from obvious or lightweight (in difference to other scm-systems like CVS, subversion or Git). Expect to put quite a few dedicated ClearCase experts to just keep it running.
Horrible performance
Nothing is worse as making your developers wait while interfacing with SCM-tool, it is like driving with hand brakes enabled. It slows down your brain and also your work. Getting fresh new files to your snapshot view takes around 30 minutes for 10K artifacts. An update (no artifacts were changed) for the same amount takes roughly 5 minutes. When experimenting a lot and jumping between different up-to-date views means a lot of waiting. It gets even worse, when you’re working on files and you want to check-in or update them. Check-out, check-in and adding to source control cycles take around 10-15 seconds which is obviously a nightmare. It gets very annoying when you’re refactoring renaming/moving types or methods (many files can be affected).
Lack of support of distributed development
Today software development is often a distributed thing (developers are spread around the world working on the same product/project). ClearCase definetely isn’t suitable for this, because it is badly suited for offline work. Doing a check-out (action before you can edit a file/folder) requires that you are network connected. Here you could use the hijack option but this is rather a workaround as a feature (you basically just unlock the file on the filesystem). If your development sites are far away from your ClearCase server the check-in/check-out latency can even increase so dramatically that it is not usable at all. There are workarounds for that like using ClearCase Multisite (scm DB replica technology), but you have to pay extra for it and is not trivial to adminstrate.
Git as alternative
Though being a big fan+supporter of Open Source I am still willing to pay money for good software. But looking at IBM-monster ClearCase I wouldn’t invest my money here, it has all these discussed shortcomings, and further more IBM doesn’t seem to invest money to improve their product significantly. Recently I had a look a Git scm which looks very good, especially for its branching+merging features, where ClearCase has its major strengths.
This information taken from http://www.aldana-online.de/2009/03/19/reasons-why-you-should-stay-away-from-clearcase/
Possibly the worst software ever made. I will not work for any firm that uses rational anything. Aside from CC completely crashing and restarting my workstation frequently on dynamic builds. What happens when you are pushing something to source control and CC does what it does best, crash? Is your code then put in lost+found, backed up somewhere maybe? No, it is gone forever. So if you are ever in the god-awful situation of using this giant piece of expensive software, keep duplicates of everything. Good job Rational / IBM. Way to capture the most important part of source control, reliability. Die slow.
Downsides of ClearCase - an addition to the most in-depth post here.
The merge tool is not worthwhile. It barely helps you, remembers no decisions you made, its just a glorified diff.
The merge tool has to check out directories to even CHECK if they need a merge. Its a bit insane.
I use BitKeeper at work (let's assume Git), and merging two repositories even if there are conflicts is so trivial and user friendly even with command line, while ClearCase having tons of GUI tools is a long and laborious process which is also extremely error prone.
All GUI tools require a ton of latency. Even seeing what can be done on a file requires a high speed connection. So right-clicking in the ClearCase tool on a file working from home could take a minute or two having high speed internet because of the extreme amount of networking requirements.
Someone can completely mess up the repository or check-ins if they make their view spec different than the team. Which is quite insane that nobody can just check out some branch; they need the appropriate view spec which will incidentally give them the right stuff. The whole concept can be nice and flexible but 99% of the time it just causes lots of pain. Did I mention you can't email your spec via Microsoft Outlook since CC tools don't accept UTF-8 so you can't copy-paste it?
I have absolutely nothing nice to say about CC. I used it for 2 years at 2 companies and dropped it in a heartbeat feeling happy the entire time. It is also impossible to just experiment with at home with your own projects, so you will still learn SVN or Git at home, and be forced to go through ClearCase pains at work. Nobody I know has ever used CC voluntarily. They only use it because some manager at work decided CC is the path to salvation and forced everyone to migrate to it. In fact my last company migrated from CVS to ClearCase, and after one year from ClearCase to SVN. It was that hated.
ClearCase is not just one thing that makes you say no. It's like living in a house infested with ants. Each ant is just a minor inconvenience at best, but the infestation will drive you mad.
I'm trying to consolidate a few comments into an actual post here. I'm not really here to persuade you that one is better than the other, except by way of making a few points:
If you're comparing git and ClearCase, I respectfully submit that you need to better define your needs - if you are considering ClearCase for a "good" reason, the git probably isn't even in the equation - it's far too new to trust for enterprise-level source control, imo.
ClearCase introduces a lot of concepts into the version control space that other systems don't have, so it can be pretty daunting and confusing. Especially if the only experience you have is reading the documentation, as appears to be the case for a few people here.
ClearCase is definitely not well suited to huge code bases supported by developers who are not on a LAN with a VOB server. You can have many replicated (multi-site) VOB servers to get them close to remote developers, but this isn't necessarily practical if those remote sites are just a single developer.
Do you want file versioning or repository versioning? This is a pretty important question, and one that will necessarily filter out an entire set of tools, making your job easier. Repository versioning has a lot of advantages (and it's not "new", like some posters claimed - commercial tools like Perforce have been around for more than a dozen years, and there may have been tools that did repository versioning even before Perforce), but it isn't a panacea.
With a sufficiently large installation of any source control system, you're going to need help. When considering tools, you need to consider how easy it will be to find people to help you (either job applicants who have experience, or consultants who will be there at a moments' notice to address any issues). There's no such thing as a maintenance-free SCM system, and assuming you have one will get you into more trouble than picking one that requires "too much" administration.
Don't pay too much attention to people who talk about how bad "dynamic views" are - bad SCM policies are bad, regardless of the tool you're using. Your configuration management policies and practices have to be separate from your choice of tool - no tool will stop people from smashing all over your codebase if you don't define sensible branching and merging policies. If someone suggests that having developers directly commit onto /main is ever a sensible idea, you might want to walk away from that conversation.
ClearCase is a fine tool, but it is a complicated tool. There is no getting around this - it does not have an "easy install" mode. :-) From a technical standpoint, there's nothing that git or SVN can do that ClearCase cannot (although often the terminology is different, since Open Source projects tend to just invent new taxonomy where there already existed one), but some things are definitely easier/harder for a given system, depending on their design. ClearCase "snapshot" views are basically the same thing you would have if you checked out a repository from SVN or CVS - it's a local copy of the source code on your machine, with pointers back into the central server for tools to query version history, etc. You can work with these views without any network connection to the ClearCase server at all once they have been created, and you can "recycle" them to avoid downloading your entire repository again when you want to move to work on another branch, for example. "Dynamic Views" are basically a ClearCase invention, and the standard operating mode for a LAN. They appear the same as checking out an SVN repository, but they don't actually copy any files until you make changes. In this way the view is available immediately, but it obviously cannot be worked with if the main clearcase server is unavailable, and is unpleasant to work with over a high-latency connection. They also have the convenience of being able to be mounted as a network drive on any machine with access to the server on which they were created, so if your windows workstation dies, you can just log onto another one, mount your view, and get back to work, since all the files are stored either in the VOB server (for files you haven't modified on this branch), or the view_server (for files you have created or modified just in this view).
Also, and this deserves its' own paragraph....clearmerge is nearly worth the price of admission alone. It's hands down the best merge tool that I've ever used in my life. I firmly believe a lot of bad practice in SCM has developed because of a lack of high-quality merge tools, so CVS users never learned to use branches properly and this fear of branching has propagated to the current day for no particularly good reason.
Ok, all that being said, if you're looking for reasons not to use ClearCase, they're not hard to find, although I think that's the wrong way to go about it. Really you should need to come up with good reasons TO use ClearCase, not reasons for NOT using ClearCase. You should come into any SCM situation assuming that ClearCase is too much tool or too complicated a tool for the job, and then see if you have some situation that encourages you to use it anyhow. Having IBM or Rational logos is not a good reason.. :-)
I would not even consider ClearCase unless you could say yes to all the following statements:
You do now, or will eventually have, more than 50 developers working on the same codebase.
Most of those developers are centrally located, or have high-throughput low-latency connections to a central location.
You have a set of SCM policies and can identify how to use ClearCase to enforce those policies (really you should consider this for any tool)
Money really is no object
My experience is mostly limited by CC, CVS and SVN. In principle, CC is technologically capable, enterprise ready and comparable by features with any modern VCS. But it has several flaws that make it unusable in any people-oriented environment. For process oriented environments it is probably more appropriate, though I doubt that such environments are appropriate by themselves. Maybe, in military, cosmic or medical software, I don't know. Anyway, I believe that even for these domains there are appropriate and still more friendly tools.
Beside being technically capable VCS, CC has several distinctive advantages:
Dynamic views
Nice version tree
Triggers
Good merge versioning, including renames
In my opinion, their use is limited excepting last one; and they don't compensate flaws. Dynamic view nice in theory, but not always available in practice. Version tree has much less use in other VCS, while necessary in CC because of proliferation of branches (see 6). Triggers, as I know, very detailed and capable, but I think that for most practical tasks SVN hooks are good enough. And now about disadvantages that mostly concerns usability:
CC totally fails in sense of usability for main user group: for developers. And that is the main reason why I think that it should never be used in any environment, be it enterprise or not. Even if it were free, it would nevertheless suck your company's money by wasting time of your developers and frustrating them. This point is composed from:
"Check out-Check In" with strict locking approach - it is counter-productive, refactoring unfriendly, and dangerous in repository organizations with single development branch for multiple developers. Meanwhile, the advantages of strict locking are negligible.
Poor performance and high load
It effectively cannot be used remotely without multi-site (due to 2). Multisite is expensive too. ClearCase Remote client is very limited. It don't even have cleartool (before version 7.1), leaving alone dynamic views.
It can hardly be used offline. Dynamic views are just not work. Snapshot views are effectively read only, because you cannot check out without access to repository (see 1). Hijack is poor option which in fact means that CC gives up any responsibility for hijacked file. And CC cannot show you difference with previous revision when offline. SVN is able to show difference with previous revision even being offline.
Overly complicated model, especially with UCM: VOBs, PVOBs, Projects, streams, branches, views, deliver, update, load, restore, rebase, merge, baseline, check in, check out. I think that half of this concepts are just superfluous and doesn't add value, while increasing both technical and conceptual complexity. Few developers understand even basic stuff about CC.
Proliferation of branches. For example, repository often organized with stream per developer (due to 1). It just has no sense in SVN or most other VCSs.
No repository wide revisions. Well, there are such revisions as understand, they called baselines. But when I see some file revision and want to get repository snapshot at the moment of that file revision, I will get some problems. I will need to do black magic with config spec to create a snapshot view, or find somehow through dynamic view if it is available.
Crappy user GUI. Version tree, even being nice, has mediocre usability. Merge tool is just a pity. Other "features": not resizeable windows, absence of incremental search in some places, mouse-centric interface, look and feel in 1995 style, strange work flow distributed between Client and Project Explorer etc.
CC provokes rare and vast check ins. You all know, that check ins must be small and frequent. But developers usually refrains from additional interactions with CC, hijack files and work in local VCS or even without VCS at all (which is more often, unfortunately). And then, after two weeks of development they begin commit comlex feature that adds 20 files and affects another 20 files. It lasts for a day or two, because they hijacked files and now need to perform manual merge with all new changes from repo and resolve all conflicts and discrepancies. During that process, code lies not compilable, because several files successfully got checked in and others do not. And after that it still lies not compilable because they forgot to add another 2 files to CC.
It is very expensive
It is very complex in terms of infrastructure and requires dedicated administrators
ClearCase seems extremely powerful, from the outside. But really, it's just that the number of commands and options you need to use for basic workflow is so high that these get hidden behind a few aliases or scripts, and you end up with something less powerful than CVS, with the usability of Visual Source Safe. And any time you want to do something a little more complicated than your scripts allow, you get a sick feeling in your stomach.
Compare this with Git, which seems complicated from the outside, but after a week working with it you feel completely in control. The repository model is simple to understand, and incredibly powerful. Because it's easy to get at the nuts and bolts, it's actually enjoyable to dig below the surface of your daily workflow.
For example, figuring out a trivial task such as how to just view a non-HEAD version of a file in a snapshot view took me a couple of hours and what I ended up with was a complete hack. Not the enjoyable sort of hack either.
But in Git, figuring out a seemingly complicated task such as how to interactively commit only some changes, (and leave the rest for later) was great fun, and all the time I have the feeling that the VCS is allowing me to organise code and history in a way that suits me, rather than history being an accident of how we used the VCS. "Git means never having to say 'you should have'".
At my work, I use Git for all sorts of lightweight tasks, even within ClearCase. For instance, I do TDD, and I commit to Git whenever a bunch of tests pass and I'm about to refactor. When the task's eventually done, I check in to ClearCase, and Git helps me review exactly what I'm changing. Just try to get ClearCase to produce a diff across a couple of files - it can't! Use Google to find out the various hacks people have tried to work around this. This is something version control should do out of the box, and it should be easy! CVS has had this for decades!
Nightmare to administer in secure environments
Outdated technology
Non-intuitive GUI
Expensive
Resource monster
Sellout to Microsoft
In my opinion? Only reason to have it? If you are religiously following RUP.
The support is terrible. We've had tickets open for years. Our eclipse guru actually fixed a bug in their eclipse plugin locally in about 30 minutes by disassembling the java file. But the ticket still hasn't got past level one support. Every so often they either try to sneakily close it or ping it back to us 'to try on the latest version' (even though we sent them a reproduction recipe which they could try for themselves.).
Do not touch with a barge pole.
Performance.
ClearCase is powerful, stable (IF properly maintained and supervised) but it's slow.
Geological sometimes.
Dynamic views views lead to horrible build times, snapshot views can take ages to update (lunch break for large projects) or checkout (go home for the day).
Clearcase is so annoying it actually drives people to write poetry about it:
http://digital-compulsion.blogspot.com/2007/01/poetic-pathetic-version-control.html
http://grahamis.com/blog/2007/01/24/if-it-was-free-no-one-would-download-it/
The developers will spend 1/2 their time figuring out clearcase before doing any work and once they've figured it out they'll install git locally and only push to the clearcase repo as needed.
You'll have to employ a dedicated Clearcase admin.
I would suggest SVN for toolset and Git for scaling/workflow. I'd also suggest avoiding CC where possible. (Not counting money, the fact it is such a pain to use that is requires a full time admin is a total joke)
I recently had to wrangle with a similar situation. Maybe you can learn from my story.
The team I was newly assigned to was using a heavyweight tool in an convoluted, error-prone manner. I first attempted to sell them on my tools and processes of choice. This attempt failed miserably. I was flabbergasted that they would pick such a burdensome environment over one that was both easier and more effective. Turns out that they wanted to be disciplined, and using a painful process felt disciplined to them. It sounds wierd, but it's true. They had a lot of other misconceptions too. After I figured out what they were after, we actually stuck with the same tool suite (Serena), but massively changed how it was configured.
My advice to you is to figure out what matters to your team. Ripping on ClearCase won't get you anywhere unless you speak to their interests. Also, find out why they don't want to use alternatives. Basically do a little requirements gathering and fit your tool choices to your needs. Depending on your options, who knows, Clear Case may end up being the best option after all.
I'm not totally against ClearCase ( it does have it's advantages ), but to list out the disadvantages:
License Limitations - I can't easily work from home, because I don't have access to the license server. Even with a snapshot view on my laptop I have to play tricks because I can't get a license. There is a special remote client, but adds tons of its own limitations to the mix
License Limitations again - Only so many seats to go around, and then no one can use it.
Unix tools out of date - ClearCase seems to run best on Unix systems, but the GUI tools suck there. Windows/Unix integration of ClearCase introduces all sorts of its own pains.
The biggest downfall for me is both the performance (especially if your VOB is multisite or offsite), and potentially lengthy downtimes.
If you're like me and work in a relatively small office as part of a large company (with no onsite IT), Clearcase servers going down can cost you the better part of a workday in non-productivity as well as getting the right people to get it fixed.
Bottom line, use it only if you really need it for what you are doing and make sure you have a beefy IT budget to maintain it.
ClearCase is perfectly usable if your willing to also use another version control system on top of it! personally I find using mercurial ontop of CC to work quite well.
no atomic checkins
As of the new version of version 7.1 CC provides atomic checkin as functionality IF you like that. Personally I would really not want it but apparently some people see that as "an essential feature". I NEVER would want one big bulk in one go as a sort of massive version. Then again... if you want it just turn it on.
so... no longer an argument.
We used UCM ClearCase integrated with ClearQuest (DR Tracking/change request system) for the last 4 years with more than 50 developers. We have over 50 UCM projects over thousand of streams that handled over 35K DRs and change requests. During this period we have officially made over 600 integration deliveries and while having up to 6 concurrent development and release efforts.
I am the main CM/ClearCase guy with a backup who is able to perform the regular delivery/merge and integration builds. The network and servers are supported by the IT team. All I can say is we have had virtually no problems coming from the CM side of this huge development effort and were never a show stopper. Our developers where trained with just the basic stuff and a simple steps were given to them whenever a new project (branch) was created at the request from the project management.
Too many developer complained about ClearCase because they lack the proper CM/IT/ClearCase/Process/Management support. Developers should focus on development not SCM or be a tool specialist. For a large software development, at least 5-7% of the budget should be spent on CM and tool support.
Running a JDK from a VOB in Linux.
Try it, you need to play with the LD_PRELOAD variable (I know!)
the point of "it needs a dedicated person" and "it is complicated" etc....
The core issue here with finding this a problem is that you have to define if you want to have configuration management performed in your organization (which is NOT version management). Configuration Management is like Project Management: even without a tool you still can do project managment and without a tool you can do Configuration Management. Lots of people have a hard time understanding this and lots of people think Configuration Management is equal to a tool which versions sources of software or something...... (therefore comparisons with subversions or other VERSION management systems)
ClearCase is a solution that is build for usage in a Configuration Management environment ERGO: there is a configuration manager (just like "there is a project manager").
So... if in your perception that dedicated person is there to manage a tool I think there is something very wrong. In my perception there is a dedicated person who does configuration management who from an end-user perpective only shows up when there is a problem with the tool but regards this as only 1% of his job.
So what you need to do (like in any other software project) go back to your requirements and put a list of requirements together on what your organisation wants with configuration management. AND YES like in any other software project you will have users (like e.g. developers) who totally not agree with other users (like e.g. management) on certain requirements. There lies the key imho on some reactions I read here.
And IMHO if you have the organization list of requirements AND a configuration manager in the mix.... the choice is pretty clear (see also the forum on www.cmcrossroads.com)
ClearCase is not a tool only for end-users entering their sources under version control like subversion or git. That is only 1% of why a configuration manager really wants a mature configuration management tool.
And... I think the choice of a CM system should never lay with developers equal to choosing the right project management tool or the right CRM system. Developers are end-users of a certain part of the functionality of the tool.
I will be maybe alone here, but ClearCase is not that bad as everyone says. It can handle huge repositeories. Dynamic view are pretty cool and powerful feature too. It is reliable, can be customized by adding triggers and constraints on a pef file basis, permissions, etc.
Unfortunatelly, it comes with a price, big price. It is costly, and to operate properly needs to be properly configured and maintained by dedicated IT team. It makes it really good for BigCo, but not so wise choice for SmallFirm.
I'm a big fan of DVCS and git, but can understand why would BigCo choose ClearCase over SVN and Git. What I can't understand why would anyone choose SVN over Git ;>
Dynamic Views. Must admire a fully functional translucent file system.
One big benefit is that the Intellectual Property is always in the corporate network. A laptop can be lost/stolen and no source code in jeopardy.
Another is the instant access to source code and changed files, no time is ever spent downloading anything.
It serves well for the purpose it has.

Revision control locking: Is the jury still out?

When I'm online it seems that everyone has agreed that using the exclusive locking workflow in source control is a Bad Thing. All the new revision control systems I see appear to be built for edit and merge workflows, and many don't even support exclusive locks at all.
However, everyone I work with is of the opinion that exclusive locks are a "must have" for any source control system, and working any other way would be a nightmare. Even recent hires from other companies seem to come in with this idea.
My question isn't who is right. I'm pretty sure what answer I'd get to that. My question is, is there really still any debate over this matter? Is there an honest "pro locking" camp out there that makes a serious case? Is any work being done to advance the art of revision control based around the locking model? Or are locking fans flogging a dead horse?
EDIT:
The answers so far have done a good job of explaining why exclusive locks are a good feature to use occasionally. However, I'm talking about promoting a workflow where exclusive locks are used for everything.
If you're used to exclusive locking, then it's hard to embrace the edit-merge workflow.
Exclusive locking has its benefits, especially for binary files (images, videos, ...) which can't be merged automatically or not at all.
But: the need for exclusive locking always indicates another problem: good communication between people working on the project. Exclusive locking provides a poor replacement: it tells users that someone else is already working on that particular file - something they should know without using a version control system.
Since there are better ways to help with the communication among team members, most (all?) version control systems don't implement exclusive locking anymore or just a reduced version (i.e., locking, but such that those locks are not enforced).
It's not the job of a version control system to help with the communication.
I like having the option to exclusive-lock some file[s].
Having an exclusive lock is necessary, e.g. for binary files.
It's also semi-necessary for some machine-generated non-binary files (e.g. for Visual Studio project files, which don't 'merge' at all well if ever there are two parallel changes to be merged).
If you believe that merges are hard (and while we've come a long way, they can be in some circumstances), and you don't have programmers frequently wanting to edit the same file, exclusive locking isn't necessarily that bad.
I wouldn't use them on an open source project, obviously, but in the corporate world where the rules are stricter and you can walk over to a guy and say "can I break your lock?", it gives visibility into what people are working on and avoids conflicts so they don't have to be resolved later.
If two people really need to work on a file at the same time, often you can branch that file, and so long as the tool makes it clear that that branch needs to be merged back in, you can do that and resolve any conflicts then.
That said, I don't think I want to have to work in an exclusive locking world again.
Exclusive locking is the best you can do in a worst-case scenario, so its presence always tells me that there are bigger problems.
One of those bigger problems is bad organization of code. On one of my consulting gigs for a major telecomm, eight out of thirty team members were constantly working on the same source file (a VB.NET "god" form). We would wait for someone else to finish their work and release the exclusive lock (VSS), then the next person in the pecking order would immediately lock the file to apply their changes. This took forever because they had to reintegrate all their work into the new code that they could see only just then. Since I was the new guy, I was on the bottom of the pecking order and I NEVER was allowed to check in my code changes. I eventually went to the project manager/director and suggested that I be tasked with another part of the application functionality. This project eventually self-destructed, but most of us left as we realized that inevitability. Note that the use of VSS integration was a crucial part of this failure, too, since it forces early acquisitions of that precious file lock.
So, a well-organized project should almost never result in two people working on the same part of the same source file at the same time. Therefore, no need for exclusive locking.
Another of those bigger problems is putting binary files into source control. Source control tools are not designed to handle binaries, and that is a good thing. Binaries require different treatment, and source control tools cannot support that special treatment. Binaries must be managed as a whole, not as parts (lines). Binaries tend to be far more stable/unchanging. Binaries tend to need explicit versioning different from source control versioning, often with multiple versions available simultaneously. Binaries are often generated from source, so only the source needs to be controlled (and the generation scripts). See Maven repositories for a binary-friendly storage design (but please don't use Maven itself, use Apache Ivy).
The arguments for locking are really bad. They basically say: our tools are so bad they cannot merge, so we lock.
In my experience, it's often necessary for two people to work on the same file at the same time. (There are, presumably, shops where this doesn't happen, but I have fairly varied experience to draw on.)
In these cases, it is necessary for both people to get copies of the original, do their work, and merge their changes. In non-locking VCSs, this is done automatically and accurately for the most part.
With a locking VCS, what happens is that one person gets the lock. If that person finishes first, great; if not, the other person gets to wait before being able to introduce changes. Sometimes it is necessary to make a quick bug fix while somebody is in the middle of a long change process, for example. Once the second person has the lock, the second person needs to merge the changes, typically manually, and check in the new version. On several occasions, I've seen the first person changes dropped entirely, as the second person didn't bother with the manual merge. Frequently this merge is done hastily, either out of distaste for the job or simple time pressure.
Therefore, if two people need to work on the same file at the same time, a non-locking VCS is almost always better.
If this doesn't come up, and two people never need to work on the same file at the same time, it doesn't matter which you use.
On an open-source project like a game, it makes sense to keep images under revision control, and those are nice to be able to lock (Subversion supports this). For source files, it's better to get into the edit-merge work flow. It's not hard and increases productivity in my experience.
Here's my $0.02.
Locking is an old school of thought for textual Code. Once programmers use merging a couple of times they learn and usually like the power of it.
Valid cases for locks still exist.
Graphics alterations. 99% of the time you cannot merge 2 peoples work on the same graphic.
Binary updates.
Sometimes code can be complex/simple enough to justify only 1 person working on it at a time. In this case it's a project management choice to use a feature.
Interesting question. To me, the issue is not so much whether to lock, but how long to lock. In this shop, I'm a minority of one, because I like to merge. I like to know what other people have done to to the code. So what I do is:
Always work in a local copy of the source tree.
Run Windiff often against the "official" code and if necessary merge changes down to my local copy. For merging, I use an old Emacs (Epsilon) and have the compare-buffers command bound to a hot-key. Another key says "make the rest of this line like the one in the other file", because many changes are small.
When I'm ready to check in changes, Windiff tells me what files I need to lock, check in, and unlock. So I keep them locked as short a time as possible, like minutes.
So when Fearless Leader says "Have you checked in your code?" the answer is "I don't have any checked out."
But as I said, I'm a minority of one.
In my opinion, the main reason people use exclusive locking is its simplicity and for them, lack of risk.
If I have exclusive access to a file, I won't have to try and understand someone elses changes to the same file. I don't have to risk making my changes and then having to merge with someone elses when I check in.
If I have an exclusive lock on the files I'm changing, then I know that when I checkin, I will be able to checkin a coherent changeset; it is simpler for me to do this.
The other aspect of merging (esp. automatic merging) is the potential for regression problems. Without good automated tests, every time you do a automatic merge, you may get problems. At least if you have an exclusive lock on something you ensure that someone is looking at the code before it's checked in. This, for some, reduces risk.
What exclusive locking takes away is the potential parallelism of changes. You can't have two people working on a file.
The open source model (lots of people around the world collaborating on different stuff) has promoted the view that locking is bad, but it really does work for some teams. It does avoid real problems. I'm not saying that these problems can't be overcome, but it requires a change in behaviour for people; if you want to change to a non-locking model, you have to persuade them to change to a way of working which can seem harder for them, and can actually (in their view) increase risk cause regressions.
Personally, I prefer not to use locks, but I can see why some people don't like it.
A bit late to this discussion but to anybody who has read and digested Steve Maguire's Writing Solid Code a central point is to have your tools detect and identify as many problems as possible.
A compiler doesn't get tired after 12 or more straight hours and forget to issue a syntax error. But it is quite easy to forget a manual step of initiating a communication and paying for it later.
If you need to version control a binary file then you need some form of locking to prevent - or at least warn of - an accidental overwrite. It must be a fundamental feature of any VCS even a distributed one. To use such a feature in a DVCS may require the creation of a central repository but is that so evil? If you use a DVCS in any sort of corporate environment, you will have a central repo to ensure business continuity.
Addressing your edit comments.
Even RCS and SCCS (the grandfather VCS for most of what runs on Unix/Linux these days) permit concurrent editing access to files, and I'm not referring to separate branches. With SCCS, you could do 'get -k SCCS/s.filename.c' and obtain an editable copy of the file -- and you could use an option ('-p' IIRC) to get it to standard output. You could have other people doing the same. Then, when it came time to check-in, you'd have to ensure that you started with the correct version or do a merge to deal with changes since your starting version was collected, etc. And none of this checking was automated, and conflicts were not handled or marked automatically, and so on. I didn't claim it was easy; just that it could be done. (Under this scheme, the locks would only be held for a short time, while a checkin/merge was in progress. You do have locks still - SCCS requires them, and RCS can be compiled with strict locking required - but only for short-ish durations. But it is hard work - no-one did it because it is such hard work.)
Modern VCS handle most of the issues automatically, or almost automatically. That is their great strength compared to the ancestral systems. And because merging is easy and almost automatic, it allows different styles of development.
I still like locking. The main work system I use is (IBM Rational) Atria ClearCase; for my own development, I use RCS (having given up SCCS around Y2K). Both use locking. But ClearCase has good tools for doing merging, and we do a fair amount of that, not least because there are at least 4 codelines active on the product I work on (4 main versions, that is). And bug fixes in one version quite often apply to the other versions, almost verbatim.
So, locking-only VCS typically do not have good enough merge facilities to encourage the use of concurrent editing of files. More modern VCS have better merging (and also branching) facilities, and therefore do not have as strong a need for locking for more than the shortest term (enough to make the operations on the file - or files in the more advanced systems - atomic).