Which version control programs can enforce running & passing of tests before integration of changes? - version-control

At my work we currently use Aegis version control/SCM. The way we have it configured, we have a bunch of tests, and it forces the following things to be true before a change can be integrated:
The full set of tests must have been run.
All tests must have passed.
With test-driven development (TDD) these seem like sensible requirements. But I haven't heard of any way you can do this with any other version control systems. (We're not currently planning to switch, but I would like to know how to do it in the future without using Aegis.)
I would be interested in any VCS (distributed or not) that can do this. I'm also interested in any plugins/extensions to existing VCSs that allow this, preferably open source.
ETA: OK, it seems like the usual thing to do is have VCS + continuous integration software, and running the tests is automated as part of the build, instead of as a separate step. If I understand correctly, that still lets you commit code that doesn't pass the tests, just you get notified about it -- is that right? Is there anything that would stop you from being able to integrate/commit it at all?

IMO you're much better off using a continuous integration system like CruiseControl or Hudson if you want to enforce that your tests pass, and making the build rather than the check-in dependent on the test results. The tools are straightforward to set up, and you get the advantages of built-in notification of the results (via email, RSS or browser plugins) and test result reporting via a Web page.
Regarding the update to the question, you're right - VCS + CI lets you commit code that doesn't pass the tests; with most CI setups, you just won't get a final build of your product unless all the tests pass. If you really want to stop anyone from even committing unless all the tests pass you will have to use hooks in the VCS as others have suggested. However, this looks to me to be hard to deal with - either developers would have to run all of the tests every time they made a checkin, including tests that aren't relevant to the checkin they are making, or you'd have to make some very granular VCS hooks that only run the tests that are relevant to a given checkin. In my experience, it's much more efficient to rely on developers to run the relevant tests locally, and have the CI system pick up on the occasional mistakes.

With subversion and git you can add pre-commit hooks to do this.
It sounds like you need to look at Continuous Integration (or a variant of it).
I think Git also has hooks that run when a patch is applied.
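To make the pre-commit hook idea concrete, here is a minimal sketch in Python. It assumes the whole test suite can be started with a single command (the TEST_COMMAND below is only a placeholder), and the function names are made up for illustration. The one thing it relies on is that a pre-commit hook which exits with a non-zero status makes the VCS reject the commit.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit hook body: reject the commit unless the tests pass."""
import subprocess
import sys

# Assumption: the project's full test suite can be started with this command.
TEST_COMMAND = [sys.executable, "-m", "unittest", "discover"]


def tests_pass(command=TEST_COMMAND, run=subprocess.call):
    """Return True when the test command exits with status 0."""
    return run(command) == 0


def hook_exit_code(command=TEST_COMMAND, run=subprocess.call):
    """Exit status for the hook: 0 allows the commit, non-zero rejects it."""
    if tests_pass(command, run):
        return 0
    sys.stderr.write("Commit rejected: the test suite did not pass.\n")
    return 1
```

With Git, a file like this would live at .git/hooks/pre-commit, be made executable, and end with sys.exit(hook_exit_code()). A Subversion pre-commit hook runs on the server and is given the repository path and transaction name instead, so there the tests would have to be run against the contents of the transaction, which takes more work than this sketch shows.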

Subversion and git both support this via pre-commit hooks.
Visual Studio Team System supports this natively via checkin policies.
I believe that Rational ClearCase also supports it, though I've never seen that demonstrated so I can't say for certain.

We use Git and Buildbot to do something similar, though not quite the same. We give each developer their own Git repository, and have Buildbot set to build any time someone pushes to one of those repositories. Then there is someone who acts as the integrator, who can check the Buildbot status, review changes, and merge their changes or tell them to fix something as appropriate.
There are plenty of variations of this workflow that you could do with Git. If you didn't want to have someone be the integrator manually, you could probably set the buildbot up to run a script on success, which would automatically merge that person's change into the master repository (though it would have to deal with cases in which automatic merge didn't work, and it would have to test the merge results as well since even code that merges cleanly can sometimes introduce other problems).
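A rough sketch of what such an on-success script could look like, in Python. It is not our actual setup. It assumes it runs inside a clone that has the mainline checked out, and run_tests is a hypothetical callable that runs the suite and returns True on success. The merge is staged with --no-commit so the merged tree can be tested before anything is recorded.

```python
import subprocess


def git(*args):
    """Run a git command in the current directory and return its exit status."""
    return subprocess.call(["git", *args])


def integrate(branch, run_tests, run_git=git):
    """Merge `branch` into the checked-out mainline only if the merged tree passes.

    Returns "merged", "conflict" or "tests-failed".
    """
    # --no-commit leaves the merge staged, so it can be tested before it is recorded.
    if run_git("merge", "--no-ff", "--no-commit", branch) != 0:
        run_git("merge", "--abort")  # automatic merge failed; restore the mainline
        return "conflict"
    if not run_tests():
        run_git("merge", "--abort")  # merged cleanly, but the result is broken
        return "tests-failed"
    run_git("commit", "-m", "Merge " + branch)
    return "merged"
```

The "conflict" and "tests-failed" results are the two cases where a human integrator would still have to step in.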

I believe continuous integration software such as TeamCity allows you to do a pre-commit build and test. I don't know of any VCS that provides it directly. There may be some, like the one you use, but I'm not familiar with them.

You can also use pre-commit hooks in Perforce. And, if you're a .NET shop, Visual Studio can be configured to require "gated" checkins.

VSTS with custom Work Items, right? I don't see anything wrong with using this. Built in reporting. The choice to automate. Why not?

What I do here is follow a branch-per-task pattern, which lets you test code that has already been submitted to version control while still keeping the mainline pristine. More on this pattern here.
You can find more information about integration strategies here and also comments about Mark Shuttleworth on version control here.

Most CI implementations have a mechanism to reject check-ins that don't meet all the criteria (most notably, passing all the tests). They go by different names:
TeamCity - Pre-tested commit
TFS - Gated check-ins
A VCS should do what it does best: version source code.

Related

How to link continuous integration to my latest sprint trunk?

Using continuous integration on my project, I need to check out the code of the latest sprint from Bazaar at bzr://path/to/myproject/sprint/123
As this path is changing repeatedly (for each sprint), I'm currently using externals to create a bzr://path/to/myproject/current pointing to bzr://path/to/myproject/sprint/123.
So, I just need to change the externals to redirect the continuous integration tool to the latest project.
Is there another way to do this?
What I don't want is to change the configuration of my project inside the continuous integration tool (CruiseControl.NET).
One option (which might not be suitable for your team's processes) would be to stop using a separate "sprint" location in bzr for each iteration's changes. Instead, just use a "trunk" (or perhaps your "current" above). If you are usually in a situation where multiple sprints have changes at the same time, then this would probably not be appropriate.
I suppose you can use a lightweight checkout.
bzr checkout --lightweight bzr://path/to/myproject/iterations/123 bzr://path/to/myproject/current
You can then use bzr switch to switch to the next branch (I'm not sure if it will work over the network):
bzr switch -d bzr://path/to/myproject/current bzr://path/to/myproject/iterations/124
After searching the web, I've found some articles about this question.
There are two solutions so far:
Automatically detect newly finished branches and build them. There is an example here using CC.NET, so it is applicable to my iterations.
Another way is to give developers scripts that run most of what the CI tool runs. This is not perfect, but it may detect issues before merging into the trunk.
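For the first solution, the selection step could look roughly like the Python sketch below. It only covers picking the newest sprint from a list of branch URLs; how that list is obtained is left out and would depend on the server. The function name and the .../sprint/<number> layout are assumptions based on the paths in the question.

```python
import re


def latest_sprint(branch_urls):
    """Return the URL with the highest sprint number, or None if none match."""
    pattern = re.compile(r"/sprint/(\d+)/?$")
    numbered = []
    for url in branch_urls:
        match = pattern.search(url)
        if match:
            # Compare as integers so that sprint 123 sorts after sprint 99.
            numbered.append((int(match.group(1)), url))
    return max(numbered)[1] if numbered else None
```

A wrapper script could then point the "current" location at the returned URL before the CI tool starts its build, so the CI configuration itself never changes.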
Other references:
Best branching strategy when doing continuous integration?

TFS 2010 project structure and getting the best out of it

We recently decided to move to TFS 2010. We would also like to improve our source control structure and project structure.
Here is the structure the team agreed on:
|OurCompanyName (or common root name)
|
+--Windows
+----Applications
+------App1
+------App2
+----Services
+------WindowsService1
+------WindowsService2
|
+--Web
+----Applications
+------WebApp1
+------WebApp2
+----Services
+------WebService1
+------WebService2
|
+--Common
+----ThirdParty
+----Libs
+------DataAccessLib
+------BusinessLogicLib
|
+--Tests
+----TestProject1
+----TestProject1
The Common folder holds third-party libraries and our in-house libraries, which are used all over (App1, App2, WebApp1, etc.).
We need to achieve the following:
Release versions must depend on the latest production release of Libs.
If tests fail, dependent projects shouldn't build and the team should be notified.
Simple branching: development, production, and versioned releases, and how we can structure them accordingly.
I have already read the following guide Visual Studio TFS Branching Guide 2010 but it only addresses the branching bit of it.
You aren't really asking a question from what I can tell. But I can give some feedback/discussion on your goals.
Release versions must depend on latest production release of Libs.
A release version should depend on whatever it used while it was being developed. Not whatever the current version is. You may want to go into more depth on what this requirement is and why you think you need it.
If tests fail, dependent projects shouldn't build and the team should be notified.
TFS doesn't support chaining builds out of the box. You can modify the build template to add support, but it's not a particularly clean solution (imo).
You can subscribe yourself to failing builds using the built-in TFS alert subscriptions; however, it is up to each developer to do so (unless you subscribe a mailing group or create a custom event mailer).
Again, why are you automatically updating dependencies in other projects? Surely you'd be better off pulling updates rather than pushing them, and using a technology like NuGet to handle your references.
Simple branching: development, production, and versioned releases, and how we can structure them accordingly.
That sounds like simply branching each time you do a release, which is very simple.
If, however, you knew which changeset you released, you wouldn't have to branch up front and could branch only when you needed to (e.g. to fix a production bug). It takes a lot more work, as you either need to manually label your code on release (at which point you gain nothing over branching) or have an automated release process which does it for you.
Other notes
You don't want to use multiple Team Project Collections - this becomes a nightmare when it comes to managing build servers.
You may want to update your diagram to show what is a Team Project, Branch, and what is just a standard folder.
Having used TFS for a while, I would like to give a caution:
You are looking at things from the developer's side, as we did when we started thinking about how best to deploy the projects. However, you should also take project management requirements into account.
Having different TFS projects means different reporting data for the manager.
Thus, if the person who runs your projects sees App1 and WebApp1 as part of the same overall project, and you have them in different TFS projects, questions of the form 'How many hours did my team spend on this project?' will be difficult to answer.
I would seriously look into this issue before deciding on the project structure.
Now regarding your questions:
Release versions must depend on latest production release of Libs
As Betty (above) mentions, this is not good practice. What will happen if development took place with production release of Lib v1.0 and sometime during stabilization Lib changed to v2.0?
If tests fail, dependent projects shouldn't build and the team should be notified.
I believe this is a matter of your build script, not of your layout.
Simple branching
We try to implement a simple MAIN-line based approach, where we have one or more development branches (it really depends on your specific requirements).
Once in a while, when dev code is considered 'stable', i.e. it has passed basic unit tests, it is merged onto the MAIN line. Developers carry on working on their development branches, whereas code on the MAIN line undergoes more extensive testing. Bugs found are reported, fixed initially on the DEV branches, and merged back onto the MAIN line. Once code on the MAIN line is good enough, stabilization starts on a RELEASE branch. After that point, bug fixes take place on the RELEASE branch and are merged back into the MAIN line. Note that 'stable' and 'good enough' mean different things to different organizations.

project upgrade changes merging with customized version

I'm running an open source ecommerce store (nopcommerce) and have made a lot of customizations to the store.
Every time a new version of the software is released, I use WinMerge to try to detect which files have changed, and then merge these changes into the project. This works OK, but as my customizations have grown, this task has become increasingly problematic.
What I'd really like to do is be able to get a diff from my current version to the new version, and then go through and apply the changes that I want.
If I use TFS for this, is there a standard way to accomplish this? Perhaps a 3-way merge app would do the job better?
To complicate things a bit further, I'm using the theming support to add my modified views in another location, so the changes from version to version need to be figured out and applied to the files in this additional folder as well.
In fact this is where the big headache comes in: determining which changes I made, and which ones are new changes from the new version.
nopCommerce hosts its source code in Mercurial via CodePlex. All you really need to do is clone their repository and make changes to your local clone. Then, you can either keep up with their modifications or wait until the next release comes out, then get an update from their repo and merge it with your changes. Mercurial, being a distributed version control system, just does merges well, and you will have fewer problems than if you try to do something manually yourself using Subversion, TFS, or anything but Mercurial. Go download TortoiseHg, which gives you both a nice GUI and the command-line tools for Mercurial. TortoiseHg comes with the KDiff3 merge tool, but I highly recommend Beyond Compare. It's not free, but I'd pay for this software a hundred times over.
As always, if you need help with using Mercurial, see the Hg Book.
I have used both TFS and Subversion and I strongly recommend Subversion (source repository) with TortoiseSVN (Windows Explorer integration) and VisualSVN (integrated into Visual Studio).
With these tools, it is very, very easy to find out exactly what files have changed and, more importantly, rollback to a previous version in the event that something goes horribly wrong.
You can also add CruiseControl continuous integration to automatically build your solution and run unit tests on each checkin to ensure that you didn't inadvertently break something.

Procedures before checking in to source control?

I am starting to get a reputation at work as the "guy who breaks the builds".
The problem is not that I am writing dodgy code, but when it comes to checking my fixes back into source control, it all goes wrong.
I am regularly doing stupid things like :
forgetting to add new files
accidentally checking in code for a half fixed bug along with another bug fix
forgetting to save the files in VS before checking them in
I need to develop some habits / tools to stop this.
What do you regularly do to ensure the code you check in is correct and is what needs to go in?
Edit
I forgot to mention that things can get pretty chaotic in this place. I quite often have two or three things that I'm working on in the same code base at any one time. When I check in, I only really want to check in one of those things.
A few suggestions:
Try to work on one issue at a time. It's easy to make unrelated changes to the codebase that then end up being committed as one big chunk with a poor log message. Git excels here since you can easily switch branches, and stash and cherry-pick changes.
Run the status command before a commit to see which files you've touched and whether you've created new files that need to be added to version control.
Run the diff command to see what you've actually changed. Often you find that you've left in some debug logging that should be taken out, or made some unnecessary change that is just cluttering up the diff. Try to make your diffs as small and clean as possible.
Make sure your working copy builds with your changes in it.
Update before checking in and make sure that your working copy builds with other people's changes in it.
Run whatever smoke test suite you might have to make sure that your changes work correctly.
Make small and frequent commits. It's a lot easier to figure out what has broken the build when the breaking commit is small.
Another thing the team can do is set up a continuous integration server, like David M suggested, so that a broken build is discovered automatically and as soon as possible.
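The "run the status command" step is easy to automate. Here is a small Python sketch for the "forgot to add new files" case. It assumes Git, and relies on the fact that `git status --porcelain` prints one line per file with a two-character status code, where "??" means untracked. The function names are made up for illustration.

```python
import subprocess


def untracked_files(porcelain_output):
    """Return the paths that `git status --porcelain` reports as untracked.

    In that format each line is a two-character status code, a space, and the
    path; untracked files carry the code "??".
    """
    return [line[3:] for line in porcelain_output.splitlines()
            if line.startswith("?? ")]


def current_untracked(repo_dir="."):
    """Ask Git for the working copy status and return the untracked paths."""
    output = subprocess.check_output(
        ["git", "status", "--porcelain"], cwd=repo_dir, text=True)
    return untracked_files(output)
```

A wrapper around your commit command could call current_untracked() and ask for confirmation whenever the list is not empty.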
I always do a Get Latest first, then build. If the build is good, I check in my code.
Here is what I have been doing. I have used ClearCase and CVS in the past for source control, and most recently I have been using Subversion and Visual Studio 2008 as my IDE.
1) Make my code changes and build on the local machine.
2) Make sure they do, in fact, fix the bug in question.
3) Run an SVN update on the local machine and repeat steps 1 and 2.
4) Run through the automated unit tests to verify that they pass.
5) If an automated smoke test is available, which automatically tests a lot of the system's capabilities, run it. Verify that the results are correct.
6) Then go to the build machine and run the build script.
If the project's configuration has changed, this could definitely break a build. Perform an SVN update on the build machine, whether the build script does that or not. Open the build machine's copy of the IDE, and do a complete rebuild. This will show you whether the build box has any problems that you have taken care of on your machine but not on the build box.
The suggestions to keep separate branches for each issue are also very good, if you can keep track of all of the issues you are working on.
First, use multiple working copies (a.k.a. sandboxes), one per issue. So, if you've been working on some complex feature for a while, and you need to deal with a quick bug fix on the same project, check out a new clean working copy and do the bug fix there. With independent working copies for each issue there is no confusion about which changes to commit from the working copy to the repository.
Second, before committing changes, always perform the following three steps:
Build the software.
Run a smoke test (does it start and run without crashing).
Inspect the changes you're checking in by diffing your changes against the baseline.
These should be repeated after any merge operations (e.g. after an SVN update).
At my workplace, the safety net for this is peer review. That is, get someone else to build, run, and reproduce your solution on their machine, on their view.
I cannot recommend this enough. It has caught so many omissions, would-be problems, and other accidental pieces of junk to make it a valuable part of the process. Not to mention that the mere knowledge that you have to place your work in front of someone else before having it go on to the main branch means that you raise your own quality standards.
In the past I have used branching in ClearCase to help with this issue. The process I used is below. I've never used SourceDepot so I do not know how this can be adapted to work with it.
Create a branch for the bug fix
Code all changes on the branch
Code Review
Merge to stable branch in a different view (the different view is important)
BEFORE checking in: compile, test, and run
Check in code to stable branch
By creating the branch and then merging the changes to a different view (I use Merge Manager to do the merge) any files that were not included or checked in immediately cause issues. This way everything gets tested as it will be when checked in on the stable branch.
The best way to avoid your problems is to use hooks, which are provided by most SCMs (they are for sure in SVN and Mercurial, and I believe they must be in other advanced SCMs). Attach unit tests to the hook and make it run every time someone checks code in, just before the check-in is accepted. This way you will achieve two things:
code in SCM repo will always pass the tests,
you won't make most simple mistakes, because they should be easily detectable, if you have decent test suite.
I like having Tortoise plugins for Windows Explorer. The file icons are all badged with committed, modified or not added icons making it very easy to see what status the files are in. I also enable the meta data for Modified so I can sort changed files in the list (Details) view, where they bubble to the top so I can see them.
I bet there is a Tortoise* plugin for your SCM; I saw one for Mercurial and SVN (and CVS, ugh). I really wish Mac OS X's Finder would accept plugins like Tortoise; it's so much easier than having to pop open a dedicated app most of the time.
Get someone else to go through "every" change "before" you check in the code.

When to start to use source control in early stages of development?

We have 2 kinds of people at my shop:
Those who start checking in code from the first successful compilation.
Those who only check in code when the project is almost done.
I am part of group 1, and I am trying to convince the people in group 2 to act like me. Their arguments are like the following:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
If I am right, please help me to raise arguments to convince them. If you agree with them tell me why.
When someone asked for good excuses not to use version control, they got 75 answers and 45 upvotes.
And when they asked Why should my team adopt source control, they got 26 answers.
Maybe you'll find something helpful there.
You don't need "arguments to convince them." Discourse is not a game, and you should not use your work as a debating platform. That's what your spouse is for :) Seriously, though, you need to explain why you care how other devs work on solo projects in which other people are not involved. What are you missing because they don't use source control? Do you need to see their early ideas to understand their later code? If you can successfully do that, you may be able to convince them.
I personally use version control at all times, but only because I don't walk a tightrope without a net. Other people have more courage, less time to spend on infrastructure, etc. Note that in 2009, in my opinion, hard disks rarely fail and rewritten code is often better than the code that it replaces.
While I'm answering a question with a question, let me ask another one: does your code need to compile/work/not-break-the-build to be checked in? I like my branches to get good and broken, then fixed, working, debugged, etc. At the same time, I like other devs to use source control however they want. Branches were invented for just that reason: so that people who can't get along do not have to cohabitate.
Here's my view to your points.
1) Even solo developers need somewhere to keep their code when their PC fails. What happens if they accidentally delete a file without source control?
2/3) Prototypes belong in source control so other team members can look at the code. We put our prototype code in a separate location from the mainline branch. We call it Spike. Here's a great article on why you should keep Spike code: http://odetocode.com/Blogs/scott/archive/2008/11/17/12344.aspx
If I'm the sole developer on a project (in other words, the repository, or part of it, is under my complete control), then I start committing source code as soon as it's written, and I tend to check in after every incremental change, whether or not it works or represents any kind of milestone.
If I'm working in a repository on a project with others, then I tend to try and make my commits such that they don't break the mainline development, pass any tests, etc.
Whether or not it's a prototype, it deserves to go into source control; prototypes represent a lot of work, and lessons learned from them are valuable. Plus, prototypes have an awful habit of becoming production code, which you'll want in source control.
I try to only write code that compiles (everything else is commented out with a TODO/FIXME tag)... and also add everything to source control.
Argument 1: Even as a single dev it's nice to roll back to a running version, to track your progress, etc.
Argument 2: Who cares if it's just a prototype? You might stumble upon a similar problem in six months or so, and then just start looking for this other code...
Argument 3: Why not use more than one repo? I like to file misc stuff to my personal repo.
Start using source control about 20 minutes before you write the first line of your first artifact. There is never a good time to start after you've begun writing things.
some people can only learn from experience.
like a hard drive failure. or coding yourself into a dead-end after deleting code that actually worked
now, i'm not saying that you should erase their hard drive and then taunt them with "if only you had used source control"...but if something like that were to happen, hopefully there would be a backup done first ;-)
Early and Often. As the Pragmatic Programmers say, source control is like a time machine, and you never know when you'll want to go back.
I would say to them...
I'm the solo developer of this project.
And when you leave or hand it off we'll have 0 developers. All the more reason to use source control.
The code belongs to the company not you and the company would like some accountability. Checking in code doesn't require too much effort:
svn ci <files> -m "implement ajax support for grid control"
Next time someone new wants to make some changes to the grid control or do something related, they will have a great starting point. All projects start off with one or two people. Source control is easier now than it ever was. Have you arranged a 30-minute demo of TortoiseSVN with them?
It's just a prototype, maybe I'll have to rewrite from scratch again.
Are they concerned about storage? Storage is cheap. Are they concerned about time wasted on versioning? It takes less time than the cursory email checks. If they are rewriting bits, then source control is even more important, to be able to reference the old bits.
I don't want to pollute the Source Control with incomplete versions.
That's actually a good concern. I used to think the same thing at one point and avoided checking in code until it was nice and clean, which is not a bad thing in and of itself, but many times I just wanted to goof around. At this point learning about branching helps. Though I wish SVN had full support for purging folders like Perforce.
Let's see their arguments:
I'm the solo developer of this project.
It's just a prototype, maybe I'll have to rewrite from scratch again.
I don't want to pollute the Source Control with incomplete versions.
First, the 3rd one. I can see the reasoning, but it is based on a bad assumption.
At work, we use Perforce, a centralized VCS, and indeed we only check in source that compiles successfully and doesn't break anything (in theory, of course!), after peer review.
So when I start a non-trivial change, I feel the need for intermediate commits. For example, recently I started to make some changes (somewhat solo for this particular task, so this addresses point 1) to Java code using JDOM (XML parsing). Then I got stuck and wanted to use Java 1.6's built-in XML parsing instead. It was obviously time to keep a trace of the current work, in case my attempt failed and I wanted to go back. Note that this case also somewhat addresses point 2.
The solution I chose is simple: I use an alternative SCM! Although some centralized VCSs like SVN are usable locally (on the developer's computer), I was seduced by distributed VCSs, and after briefly testing Mercurial (which is good), I found Bazaar better suited to my needs and taste.
DVCSs are well suited for this task because they are lightweight and flexible, allow alternative branches, don't "pollute" the source directory (all data is in one directory at the root of the project), etc.
By making a parallel source management, you don't pollute the source of other developers, while keeping the possibility to go back or quickly try alternative solutions.
At the end, by committing the final version to the official SCM, the result is the same, but with added security at the level of the developer.
I'd like to add two things. With version control you can:
Revert to the last version that worked, or at least check what it looked like. For that you would need an SCM which supports changesets / uses whole-tree commits.
Use it to find bugs with so-called 'diff debugging', i.e. finding the commit in the history that introduced the bug. You would want an SCM which supports this in an automated or semi-automated fashion.
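The idea behind 'diff debugging' is a binary search over the history, which is what commands like git bisect and hg bisect automate. A minimal Python sketch of the search itself is below. It assumes the oldest commit in the list is known to be good, the newest is known to be bad, and the bug stays present once introduced. The function name and the is_bad callback (e.g. "check out this commit and run the failing test") are hypothetical.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an ordered history for the first commit where is_bad is True.

    Assumes commits[0] is good, commits[-1] is bad, and that once the bug
    appears it stays in every later commit.
    """
    low, high = 0, len(commits) - 1  # low: known good, high: known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid  # the bug was introduced at mid or earlier
        else:
            low = mid   # the bug was introduced after mid
    return commits[high]
```

This needs only about log2(n) checks for n commits, which is why it pays off to commit early and often: the smaller the commits, the smaller the change you end up staring at.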
Personally, I often start version control after the first successful compile.
I just wonder why nobody mentioned distributed version control systems in this context: if you could manage to switch over to a distributed system (Git, Bazaar, Mercurial), most arguments of your second group would become pointless, since they can just start their repository locally and push it to the server when they want (and they can also just remove it if they want to restart from scratch).
For me, it's about having a consistent process. If you are writing code, it should follow the same source control process that your production code does. That helps build and enforce good development practices across the development team.
Categorizing the code as a prototype or other non-production type of project should just be used to determine where in the source control tree you put it.
We use both CVS (for non .NET projects) and TFS (for .NET projects) where I work, and the TFS repository has a Developer Sandbox folder where developers can check in personal experimental projects (prototypes).
If and when a project starts to get used in production, the code is moved out of the Developer Sandbox folder into its own folder in the main tree.
I would say you should start adding the source and checking in before you even build the first time. It is then much easier to avoid checking in generated artifacts. I always use some source control, even for my small hobby hacks, just because it automatically filters the relevant from the noise.
So when I start prototyping I might create a project and then before building it I do "git init, git add ., git commit -a -m ..." just so that when I want to move the interesting parts I just clone over using git and then I can add it to the subversion repository or whatever is used where I am working at the moment.
It's called branching, people; try to get with the program :p Prototyping? Work in a branch. Experimenting? Work in a branch. New feature? Work in a branch.
Merge your branches into the main trunk when it makes sense.
I guess people tend to be laid back when it comes to setting up source control initially if the code may never be used. I have projects I coded belonging to both groups and the ones outside source control are not less important. It is one of those things that gets postponed everyday when it really should not.
On the other hand, I sometimes commit too seldom, which complicates a revert once I screw up some CSS code and don't know what I changed, e.g. to make the footer of the site end up behind the header.
I check-in the project in source control before I start coding.
The first thing I do is create and organize the projects and support files (such as .sln files in .NET development) with the necessary support libraries and tools (usually in a lib folder) I know I will use in my project.
If I already have some code written, then I add it too, even if it is an incomplete application. Then I check in everything. From there, everything is as usual: write some code, compile it, test it, check it in...
You probably won't need to branch from this point or revert your changes, but I think it is a good practice to have everything under source control since the beginning, even if you don't have anything to compile.
I create a directory in source control before I start writing code for a project. I do the first commit after creating the project skeleton.
i'm drunk and i first do git init and then vim foo.cpp.
Any decent modern source control platform (of which VSS is not one) should not in any way be polluted by putting source code files into it. I am of the opinion that anything that has a life expectancy of more than about half an hour should be in source control as early as possible. Solo development is no longer a valid excuse for not using source control. It is not about security; it is about bugs and long-term history.