Looking for a version control system that supports standard development and customer specific development [closed] - version-control

At our company we've built a data integration tool that we have sold to several customers. Most of the customers have distinct requirements. We implemented these customer-specific extensions by using a self-made mechanism based on inheritance (so every installation knows which classes to load and which not). But all this customer-specific code is still in the same codebase as the standard code.
Now, this is no longer workable for several reasons (the codebase is getting ugly and large, requirements clash, etc.).
For this reason we have decided to separate the codebases: one for the standard product, and several customer-specific codebases.
I am now trying to find a version control system that supports this approach. Here's my wishlist:
support for several "standard" codebases for different releases
1.0 release
1.1 release
2.0 beta/development
support for multiple "customer" codebases
ability to create a customer codebase by cloning a standard codebase
ability to change standard code in a customer codebase
ability to update a customer codebase with a new standard release (and somehow marking the conflicts that come from changed standard code in the customer codebase)
As our team is still very small (~4 programmers), it should also be easy to handle by the developers themselves.
Btw, our software is built using Spring with STS (so, an Eclipse plugin would be great too).
All the VCSs that I have researched so far seem to target building one piece of software, not several. I am hoping for some suggestions or best-practice approaches.

Simply get git, adopt a pull request process, and take advantage of a GUI that supports this workflow.
Are releases much different from custom development?
To clarify the situation you are facing: "standard" development comes in versions, which may live on independently for maintenance; you may need to incorporate fixes from new versions into older releases; and you need a way to handle hotfixes.
All these things are well solved by distributed version control systems like git, hg or others. I started with hg, but later found that git is used more often and that a standard installation offers everything that is needed (which is not the case for some hg features).
Regarding custom development: conceptually, it does not differ much from the standard versions. You just need another variant of your program, identified by a unique name that marks it as custom work.
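As a rough sketch of what "another variant under a unique name" could look like with git, the snippet below only builds the command lines for starting a customer line from a release and later merging a newer release into it. The `customer/<name>` branch naming and the `v1.0`/`v1.1` release tags are assumptions for illustration, not something prescribed above; the same merge step applies if each customer lives in its own repository instead of a branch.

```python
from typing import List


def customer_branch_commands(customer: str, base_release: str) -> List[List[str]]:
    """Commands that start a customer line of development from a release tag.

    The "customer/<name>" naming scheme is a hypothetical convention.
    """
    branch = f"customer/{customer}"
    return [
        # Create the customer branch, starting at the given release.
        ["git", "checkout", "-b", branch, base_release],
    ]


def upgrade_commands(customer: str, new_release: str) -> List[List[str]]:
    """Commands that bring a newer standard release into a customer line.

    If the customer changed standard code, git stops the merge and marks
    the conflicting regions in the affected files for manual resolution.
    """
    branch = f"customer/{customer}"
    return [
        ["git", "checkout", branch],
        ["git", "merge", "--no-ff", new_release],
    ]


if __name__ == "__main__":
    for cmd in customer_branch_commands("acme", "v1.0") + upgrade_commands("acme", "v1.1"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` inside a working copy; they are kept as plain data here so the sequence can be reviewed before anything is executed.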
Branching or pull request process?
Now how to approach different "swim lanes" for different versions and custom developments?
Branching workflow models
The obvious answer is "branching". There are a lot of tutorials on various branching models, and they should solve your problem.
However, branching is not trivial either and you may find long disputes on what style is the best one.
Topical repos and pull request workflow
Fortunately, there are even simpler solutions. Pieter Hintjens' article "Branching considered harmful" (http://hintjens.com/blog:24) describes a simpler model, using topical repositories and a pull request process. This is how many projects on GitHub and BitBucket are managed, and I have found it to be the most effective solution with minimal risks.
Final recommendations
For a pull request process, it is handy to have a GUI that supports the related communication. Apart from GitHub and BitBucket, there are other solutions on the market (including some open source ones).
Prepare yourself for a long run. Starting with the linked article by Pieter Hintjens may make your run a bit shorter; the next step could be playing with a project on BitBucket or similar, and then designing "the final" system (which will evolve over time anyway, but git repos are well suited to keeping up with the changes).


Is there any version control software with the functionality of Git, but which is not under a viral license? - A "viral license" being, by my definition, one which requires derived software to be under the same or an equally-restrictive license.
I'm not interested in an argument on or discussion about the GPL; it's outside the scope of this question and website.
Fossil is (and Codeville was) a BSD-licensed distributed revision control system.
Note that unless you're actually modifying the version control software itself, the license doesn't affect you; you're free to develop non-GPL'ed software using a GPL'ed tool to manage revisions.
The other options are :
Now that two years have passed since I started working professionally with git (after 20 years of not-git...), I can say this:
GIT has its advantages when it comes to merging code bases between branches and multiple users. Once you master it, and learn to ignore its sometimes utterly confusing command line UI, it can be easy to work with.
On the downside, GIT IS complex to understand and LEARN. There is a long, steep learning phase, especially if you work from the command line in a repository with multiple branches (the common and recommended approach). Working with UI tools like the IntelliJ IDEs can hide some of the details, but these demand learning time and attention too, plus some not-so-basic GIT knowledge. And this knowledge is required by ALL members of your team.
Forget the license... You want to NOT use GIT for so many other reasons...
If you want things to work faster for your team - stay away from GIT. Why not use SVN? It is supported by any service that supports GIT, and is the most popular alternative to GIT (as far as I know).
To commit/merge/manage a team in GIT will take you exponentially more time than with SVN/Fossil/... All in the name of an advanced "distributed" design, and a rich set of methods to kill your code, merge it wrongly, give you so many options to make horrible mistakes (that happen to pros and newbies alike), and do simple things the HARD HARD way. Whereas in reality it only serves the ritual-hungry souls of geeky programmers, who would otherwise have to go home late and face the empty walls of their houses... (poetic answer too).
REALLY - It would actually be funny if it wasn't the number one pain-in-the-arse time killer in the office. And once you go GIT you can never go back, so my advice, don't let the geeks have it. Keep it out or pay the price.
And, yeah, I know the crowd here, and I am more than willing to lose a few points. It's not like it means anything real.

I would like to hear some experiences concerning the writing of software documentation and user guides.
When I write formal documents like software specifications, every document gets a version number and in the document there is a change history after the table of contents, where you can keep track of the changes made to the document.
If I'm now writing software documentation or a user guide for an application, and the software is versioned itself, one could confuse the version number of the document with that of the product: e.g. application version 1.5, documentation version 1.3.
What's the common way / best practice to write software documentation? Do you keep track of changes to the documents there? If you print a change history - do you show changes to the product and/or the document?
I've encountered this issue at every company for which I've worked that 1) had a significant code base, 2) attempted professional quality documentation, and 3) had separate development and documentation groups.
I have come to agree with Anders, convinced that software and documentation should have different versioning and version control systems. Although similar and having the same target, documentation and code have different lifecycles, which can be fully independent, only being mapped one to the other at release time.
As for generating the documentation with each software build, ask yourself: does that really make sense? Is the documentation historical or is it prescriptive? Any documentation that is generated with each build better have the tools in place to do it. Currently, that only works for API documentation and there are Doxygen-/Javadoc-style tools to support it. That is likely to never be doable for User's Guides and Installation Guides because they are context sensitive.
The need for different version control systems holds, particularly, for the newer structured documentation methodologies. Structured documentation needs to be managed at a much finer level of granularity than source code to be able to efficiently handle something even as seemingly simple as rebranding; usually managed at either the paragraph, sentence, or word level, unlike the file level, which is sufficient for source code. Further, it is generally economical for document elements to be shared among multiple products or departments (engineering, marketing, ...). And, for this level of documentation sophistication, only a content management system is sufficient for tracking content and managing change; the CVS-/SVN-/Perforce-/Clearcase-style SCCSs are abysmally inadequate for managing real-world documentation. Using different version management tools ensures different version numbers for documentation and software.
Documentation may even have a higher rate of change than software when the need for handling typos, grammar errors, and corporate style changes is considered.
Separating documentation and development processes reduces dependencies, which is the fundamental metric needed for producing a quality product. Further, late binding is desirable to best accommodate the rapid rate of change and unpredictable events like late feature additions or deletions. Only at the moment of the final (or alpha/beta) release should the documentation version be mapped to the software version. But I agree with High-Performance Mark that the end user shouldn't see different version numbers. The documentation version number does not need to appear on the document. That number can, within the documentation process, be maintained and hidden from the public.
The only time that software and documentation versioning can be maintained in lockstep is when documentation is a fully-integrated part of the development process. Over the last 30 years, I've seen this becoming less and less the case because there is less formal, upfront design than there used to be, relying, instead, on an iterative, quick-prototyping approach to software development. The original well-intentioned notions of having documentation drive software development have mostly been put aside but the new methodology also hasn't given us improved documentation or software. Whether the documentation is done upfront or as an afterthought, it's still going to double the time it takes to develop a commercial-quality product.
I think that the documentation and the software are different items, which each have different version numbers. You want to be able to update the documentation without having to update the software number. I would have named it something like:
System Documentation for productX 1.3
Documentation revision 1.7
By clearly including both the software version and the document version in the same place there shouldn't be any confusion.
We tend to use a plain text format for our documentation, mainly LaTeX, and treat it just like source code from a revision point of view: it goes in the repo, we can do diffs and patches, etc. We're not big for change histories in published documents, we can always audit what has gone on if necessary but it rarely is.
As for synchronising code and documentation version numbers, our preferred approach is that v1.1.1 of a document matches v1.1.1 of the software, 3.2.45 matches 3.2.45 and so on. However, in practice we often only have documentation for the first 2 digits (i.e. 1.1, 3.2) since the third digit is mainly for bug-fixes or performance enhancements. The repo revision number is inserted into the documentation (and into the source code) using svn:keywords should we ever need it.
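The "first 2 digits" mapping described here is simple enough to sketch. The function name `doc_version_for` is made up for illustration, and the sketch assumes dotted numeric versions with an optional leading "v":

```python
def doc_version_for(software_version: str) -> str:
    """Map a software version to the documentation version that covers it.

    Assumes the scheme described above: documentation exists per
    major.minor release, and the third number (bug-fix level) is ignored.
    """
    parts = software_version.lstrip("v").split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected version format: {software_version!r}")
    return ".".join(parts[:2])


if __name__ == "__main__":
    print(doc_version_for("3.2.45"))
    print(doc_version_for("v1.1.1"))
```

With this convention, software 3.2.45 and 3.2.46 both point at the 3.2 documentation, so a bug-fix release does not force a documentation release.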
I'd like to tell you that the same makefile which builds our new version software also builds the new version of the documentation, but we haven't got there yet. We are, however, working on it.
Why don't you just use version control and use that as the automatic document revision? You can have most systems update some text on checkout.

When starting a project and using source control, I find it hard to separate the things people are working on so that they don't write duplicate code or disagree on what things should be named, and so on.
This problem diminishes over time because the general foundation is in place and it's easier to separate the tasks so they don't overlap as much.
How do you manage working with source control in the beginning phase?
I can see that it doesn't really have anything to do with source control, but it gets more apparent when you have source control too. So the question becomes more along the lines of "how do you manage to separate the tasks so they don't overlap too much?" I think it's really hard and I haven't really seen much about how to do it.
Well, as far as source control goes, somebody needs to take the lead and set up the basic structure of the project, directories, etc. and communicate it to the team. On projects I work on, this is usually an architect or senior developer, someone who knows the best practices for project organization for the team/company.
With respect to avoiding having multiple people working on the same tasks, that's a project management function; someone needs to determine what tasks need to be done, and communicate it to the team. If you are working in an agile/scrum environment, the team may divide and hand out work items amongst themselves, but in either case you need to communicate to avoid doing the same work twice.
To address the issue of multiple people working on the same task, I tend to work on smaller teams, 2-6 people; in this environment, I have had a lot of success with a scrum-influenced approach using the Crystal Clear methodology:
Architect(s)/designer(s) come up with high level design
Architect(s)/designer(s) define iterations/deliveries, the first of which is a "project skeleton" which consists of architectural and back-end components and a thin slice of the app
Lead person breaks up features into 1-3 day tasks/units of work (estimated)
Team meets and discusses priority, timing and dependencies of tasks, and divides up the first set of tasks
The team has brief daily meetings to discuss status/priorities and dependencies, and change direction if necessary
With larger projects/teams, you will almost certainly need someone whose main job is dedicated to tracking status, dependencies and conflicts.
I don't think source control has much to do with the problem of coordinating people's efforts (except that it can catch some "conflicts" when people erroneously try to modify the same files in different ways -- but, that's not as good as preventing conflicts, and even just "preventing conflicts" does not per se ensure that everybody is working on what they should ideally be working right now, in terms of priorities). Coordination is properly managed with practices (and perhaps tools, e.g. Pivotal Tracker -- but, using the right practices is even more important than using nice tools!-) that specifically focus on ensuring coordination. For example, the practices that Tracker is designed to support and enhance, such as story-based iterative planning, and other compatible ones, such as stand-ups, offer ways to meet these needs.
You need a base version that everyone is using. Check that into the repository, and then make incremental changes to it; make sure that everyone works on a different part of the code, commit every working change, and resolve conflicts as and when they occur. That is how I would do it.

Currently we use FogBugz for tracking issues and have found it to be OK. I'm looking for something else that lets end users track their cases along with us, and that actually works well with email. I've found a few alternatives that support those features, but they don't integrate with version control. We've got all the SVN hooks in FogBugz and we use them, but I haven't really found them all that useful. Has anyone found a really good reason to need version control integration with the bug tracker?
Clearly, this kind of integration is not something that is essential to the operation of the software. With a bit of discipline every check-in can be accompanied with a bug number manually, and every bug resolution can manually have a version control tag added to it.
All else being equal, however, I personally will always prefer automation over 'discipline of the users', because the latter will sooner or later let you down. Not because the users are malicious or incompetent, but simply because people cannot be 100% alert all of the time.
I find the integration of SVN with Trac very helpful. Through SVN hooks, commits to the repository with a ticket number insert a comment on the ticket with a link to a nice visual HTML representation of the revision, showing inserts, deletes, and diffs.
As a supervisor of a small team of programmers, I find this a helpful tool for doing code reviews, so I can verify that the commit truly addresses the associated issue. I wouldn't exactly call this integration essential, but it was a nice free extra on my issue tracker that I've grown to love.
It is absolutely critical for us.
Here is a typical commit log for one of our projects (sample):
Make sure filedes is cleared in child list prior to reallocating
When p->child-filedes is > 0, the child list is active and can not
be collected.
[ Impact: Closes bug 123457 ]
Note the [ Impact: ] line, which could also be "Relates-To", "Caused" or any number of other things.
This lets us use simple greps and automated scripts allowing the person committing to automatically close, or even re-open a bug.
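A minimal sketch of such an automated script follows. It assumes the bracketed `[ Tag: text ]` layout from the sample commit above and a "bug <number>" wording inside it; the function name and the returned tuple shape are invented for illustration, and what a hook then does with the result (closing or re-opening a bug) depends on the tracker.

```python
import re
from typing import List, Tuple

# A bracketed trailer line such as "[ Impact: Closes bug 123457 ]".
TRAILER = re.compile(
    r"^\[[ \t]*(?P<tag>[A-Za-z-]+):[ \t]*(?P<text>.*?)[ \t]*\][ \t]*$",
    re.MULTILINE,
)
# A bug reference inside the trailer text, e.g. "bug 123457".
BUG = re.compile(r"\bbug\s+(\d+)", re.IGNORECASE)


def parse_trailers(message: str) -> List[Tuple[str, str, int]]:
    """Return (tag, first word of the text lowercased, bug number) tuples."""
    found = []
    for match in TRAILER.finditer(message):
        text = match.group("text")
        for bug in BUG.findall(text):
            found.append((match.group("tag"), text.split()[0].lower(), int(bug)))
    return found


if __name__ == "__main__":
    sample = (
        "Make sure filedes is cleared in child list prior to reallocating\n"
        "\n"
        "[ Impact: Closes bug 123457 ]\n"
    )
    print(parse_trailers(sample))
```

A post-commit hook could feed each tuple to the bug tracker; only the parsing step is shown because it is independent of both the VCS and the tracker.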
Though we typically use Git and Mercurial, this sort of hook would work on (almost) any VCS, especially proprietary ones that do not feature the modular plug-in that you need.
If you think of your bug system as just another part of your VCS, it's really easy to see how they depend upon each other.
Other stuff, such as fetching patches submitted with bugs is possible, too.
It is a question about your code size, and how many bugs you need to track.
And it is also really useful for non-coders in the organisation, i.e. managers and customer support. They can find answers to questions like "When and where was this bug fixed?"
I think it's helpful to distinguish between bugs found internal to the development organization, e.g. from peer code review, versus bugs found by a test group that is external to the development organization.
The (small) benefit to coordinating version control with bugs found by an external test group would be for historical reference.
The larger benefit is in coordinating bugs found via peer code review with version control -- by doing so you can certify that all code is peer review bug free before releasing it to external test groups; a common requirement.
FYI, Code Collaborator from SmartBear, Inc. handles this nicely.
I have found version control integration to be extremely helpful in maintaining and managing multiple versions (stable, development trunk, etc.) of a project.
Using the version control integration and a bit of discipline from coders to reference bug tickets in commits (or some pre-commit hooks to forcibly require ticket references) has allowed us to quickly and easily generate lists of changesets that are required to fix any given bug. This is instrumental when merging the fixes into various stable branches of the code.
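To make the idea concrete, here is a small sketch of both halves: a check that a pre-commit hook could apply, and an index from ticket to changesets. The `#<number>` ticket syntax and the revision labels are assumptions; a real tracker integration would define its own reference format.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical ticket reference format: "#" followed by digits.
TICKET_REF = re.compile(r"#(\d+)")


def has_ticket_reference(message: str) -> bool:
    """Check a pre-commit hook could use to reject unreferenced commits."""
    return TICKET_REF.search(message) is not None


def changesets_by_ticket(commits: Iterable[Tuple[str, str]]) -> Dict[int, List[str]]:
    """Group revisions by the tickets their commit messages mention.

    `commits` is an iterable of (revision, message) pairs in history order.
    """
    index: Dict[int, List[str]] = defaultdict(list)
    for revision, message in commits:
        for ticket in sorted({int(t) for t in TICKET_REF.findall(message)}):
            index[ticket].append(revision)
    return dict(index)


if __name__ == "__main__":
    history = [
        ("r101", "Fix null check, refs #42"),
        ("r102", "Tidy build script"),
        ("r103", "Follow-up for #42 and #7"),
    ]
    print(changesets_by_ticket(history))
```

The list for a given ticket is then the set of changesets to merge into each stable branch that needs the fix.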
It's not a necessity, but it certainly makes life easier for release management.
I've used SVN + Trac and Atlassian's Jira product with Fisheye SVN plugin and have found both tools to be very good. Trac seems to be a bit simpler, but very easy to use. Jira, in my opinion, had a nicer look and feel and quite a few more bells and whistles, but was almost too much at times.

Does your work environment use Harvest SCM? I've used this now at two different locations and find it appalling. In one situation I wrote a conversion script so I could use CVS locally and then daily import changes to the Harvest system while I was sleeping. The corp was fanatic about using Harvest, despite 80% of the programmers crying for something different. It was needlessly complicated, slow and heavy. It is now a job requirement for me that Harvest is not in use where I work.
Has anyone else used Harvest before? What's your experience? As bad as mine? Did you employ other, different workarounds? Why is this product still purchased today?
I had the benefit of using Harvest at a bank, and you'll never find a more wretched hive of scum and villainy: backwards, triple-forking, undocumented check-in gauntlets that require 15 steps to make one simple change. Never mind that they weren't even using branching. This is an evil tool; don't let it get you in its clutches.
Chances are, your company has some sort of contract with CA - are you using a lot of other CA software in-house?
Edit: Guess so!
OK, I'm going to answer this in a couple of episodes because it's late here and Harvest is a big topic.
Firstly, CA Harvest (which is what version 7 of the product is called; version 5 is CCC, whose expansion I can't recall; version 12 is called CA SCM) is a lot more than just an SCM tool, in the same way ClearCase is a lot more than an SCM tool. SVN, CVS, git, hg are all base-standard SCM and little more.
What you get with Harvest is SCM + Policy. It gives you a place to store and version your code, and wraps it all in a policy of how that code matures through your organization from dev to prod. Do you have a policy in your organization that a Lead Developer needs to sign off on the code before it's released to QA? Harvest allows you to define the signoff as a policy, and enforces it: you can't migrate the code from the "Dev" state to the "QA" state until one of the people in the project designated as a Lead Dev does exactly that. Do you have a policy that any SQL code needs signoff by a DBA before it progresses? Harvest allows you to define that policy, and enforces it, so you might need both Lead Dev and DBA signoff before code migrates.
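The "SCM + Policy" idea can be illustrated with a toy model. This is not Harvest's API or data model; the class, the transition table, and the role strings are all invented here to show what "enforced signoff before promotion" means in principle.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# Hypothetical policy table: which roles must sign off for each transition.
TRANSITION_SIGNOFFS: Dict[Tuple[str, str], Set[str]] = {
    ("Dev", "QA"): {"Lead Dev"},
}


@dataclass
class Package:
    """A unit of change moving through lifecycle states (illustrative only)."""
    name: str
    state: str = "Dev"
    contains_sql: bool = False
    signoffs: Set[str] = field(default_factory=set)


def required_roles(package: Package, target: str) -> Set[str]:
    """Roles that must have signed off before the package may move on."""
    key = (package.state, target)
    if key not in TRANSITION_SIGNOFFS:
        raise ValueError(f"no policy allows {package.state} -> {target}")
    roles = set(TRANSITION_SIGNOFFS[key])
    if package.contains_sql:
        roles.add("DBA")  # SQL changes additionally need a DBA signoff
    return roles


def promote(package: Package, target: str) -> None:
    """Move the package to `target`, refusing if any signoff is missing."""
    missing = required_roles(package, target) - package.signoffs
    if missing:
        raise PermissionError("missing signoff: " + ", ".join(sorted(missing)))
    package.state = target


if __name__ == "__main__":
    pkg = Package("billing-fix", contains_sql=True, signoffs={"Lead Dev"})
    try:
        promote(pkg, "QA")
    except PermissionError as err:
        print(err)
    pkg.signoffs.add("DBA")
    promote(pkg, "QA")
    print(pkg.state)
```

The point of the sketch is that the tool, not the developer's discipline, decides whether code may move to the next state.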
Harvest is by no means a tool for most software organizations. It is typically used in the finance industry, or in businesses where a very strong regulatory framework governs what they can do. Banks need to comply with Sarbanes-Oxley, which has very strong auditing requirements. Harvest provides the ability to define all kinds of controls and processes around how changes to the bank's assets move through their lifecycle. I know large public transport organizations, responsible for the safety and punctuality of millions of people every day, that need the tightly defined control mechanisms that a tool like Harvest provides. I have also seen Harvest used in environments where thousands of developers use it every day. I'm not exaggerating: literally thousands of devs in one organization, writing code for a worldwide retailer, pushing IT solutions out the door every day to stores around the world.
Harvest is not perfect, though version 12 is much better. It has too many "that's just stupid" moments; it does per-file versioning à la CVS, and CVS-like branching and directory versioning (or lack thereof), with all the fun we've come to know and fear. Once you know it and accept it, though, it isn't inherently slower than any other SCM I've used. It just has a bigger job to do than just versioning your code.
Another big win, and it's even bigger with version 12, is its integration with other CA tools (and its ability to integrate with non-CA tools, though not many at the moment): defect tracking with Quality Center, trouble ticketing with Unicenter Service Desk, software deployment to the desktop with SDM. You can define bridges between these apps that result in a lot tighter integration of these concerns, with the usual positive effects on accuracy and timeliness.
If you're dealing with getting software out to a worldwide enterprise, with thousands of desktops and servers, mainframe/midrange/middleware systems, iron-clad change control processes, regulations, contracts, auditors, just a whole bunch of complexity, Harvest is just one tool in a whole suite of tools you're going to need. If you just want a simple SCM for a team of 10 devs supporting a few hundred customers, it's not a great way to go.
I'll try to add something about how Harvest actually works next time: repositories, projects, views, packages, forms, processes etc. That might help explain why some organizations use it, and why it's not for everyone.
I used Harvest during a short gig in the banking industry a few years ago. I agree that it was practically unusable, but the people in charge of QA seemed to love it.
I worked for a company that had two choices; ClearCase or Harvest. Subversion hadn't ever been considered, and the reason was that ClearCase (IBM) and Harvest (CA) both had longstanding mainframe contracts already.
We've used Harvest for about ten years (2000-2010) and even though we are now looking at replacing it I believe it has served us very well.
Harvest (let's stick with that name even though it's no longer its official name) was the first major tool we implemented to support us in R&D, and at the time none of us knew much about the many aspects of the application lifecycle (versioning of code, branching, automated testing, regression testing, quality assurance, deployment to numerous runtime environments and production, rollback, emergency fixes, maintenance updates etc.); today we know a lot more and our development processes serve us very well (not that there isn't room for many improvements).
We do not have a very hierarchical organisation (we don't have a lot of inspectors that need to approve changes), but it's very helpful to have support for "checkpoints": points in the development process where something needs to happen (e.g. functional testing or integration testing).
The drawback (for us) with Harvest in regard to usability has been "what a programmer needs to do to change x lines of code". Today there are a lot of easier and more efficient ways than Harvest to get write access to source code files, make your updates and then return the files or move them to another stage of the development process (testing, deployment etc.). Another drawback is the price tag; it's expensive.
Gains we've had with Harvest:
It supports workflow, and therefore we've been able to have a single system to manage code versioning, workflow and process automation. Where possible, it's easier to maintain and improve a single system than many.
In addition to providing cmd line access to internal processes (making it possible to script special solutions when so required by your processes) Harvest also is easily configured by graphic interface.
It has the concept of a "Package", which makes it easy to attach plenty of metadata to code changes and to handle the changes independently of other changes (versioning on file level rather than change sets containing the complete code mass). This is helpful for handling independent emergency and maintenance changes.
If a developer is only a programmer and only thinks about the coding aspect of software development, then I imagine he/she might get very frustrated with Harvest.
If a developer is a developer and understands that software development is a lot more than coding, and that the coding is only the very beginning of the lifecycle of software, then I believe he/she would see a lot of benefits in Harvest.
I have been using HARVEST for the last 4 years and I love it. The kind of support it gives you to control the code movement is really fantastic. We use HARVEST to deploy applications onto WebSphere. It also does amazing work in deploying the plugins into the web server along with the application. When you want to have a process in place for moving the code in a big enterprise environment, I don't think any other tool can even come close to HARVEST.