TFS commit best practice? - version-control

In my previous jobs, programmers were encouraged to check-in code often with comments. At my new job now, the rule is that no one checks in anything until his or her code is QA'ed. But because QA lags so much behind, we could rarely check-in. We can probably check in once every 2 weeks or so. And when we are asked to check in, it is really a pain in the neck to sort out what's changed for which ticket that needs to be checked in. Do you guys understand the pain? And the consequence is that very often, we programmers forget to check in some important files for some ticket. Another consequence is that this: I have modified file1.html for both ticket 1 and ticket 2. Now we are asked to check in changes for ticket 1 only, and then I have to save a copy of file1.html outside of my solution and then determine what changes are for ticket 1 and remove those changes for ticket 2 before I check in. Pain!
What's your suggestion? What should I say to convince the team here to stop this check-in policy and allow us to check in as often as we want before it is QA'ed? Thanks!

I'd say you seem to understand the problem fairly well, you just need to lay out the pros/cons of the two approaches.
Error-prone to cherry-pick files to check-in at a later date
The version/files you check-in may not match what QA tested. Ideally QA should be testing the same code that will be release, and your version control system is used to enforce that.
Having QA test code that included other changes that may or may not be in the same release can lead to test passes/failures that are dependent on code that will not be in the release. This can invalidate the QA process.
What you are doing is different than what most other teams are doing (honestly this usually resonates with managers more than the other points - in my experience as a consultant at least)
It sounds like what your team is trying to achieve is have a set of code that has all undergone QA and is "releasable" at any time. This is a good goal, but it is usually achieved by using an appropriate branching strategy.
One approach is to do branch by feature (that's essentially what you're trying to do now, only without the support of a version control system). This means you have a branch for each independent change/feature that you make. QA happens against your feature branch. Once QA passes that feature branch is merged into MAIN (aka trunk).
This way developers have their own feature branch that they can check-in to often (best practice is at least one check-in per day). And you still have the copy of the code that is restricted to only code that has passed QA (MAIN), and is always release-ready.
If you fail to convince them you can also use a local Git repository to keep your personal changes organized, then use the Git-tfs tool to send them over to TFS when it's time to check-in.

Related

Simulate a tfs style changeset in the enterprise version of github

I have three environments; dev, test and staging/prod. In our previous model of using Team Foundation Server, we would have three branches of code that matched up to each of these three environments.
Developers would work locally and when they had their code complete, they'd check it into the dev branch. When checking in, TFS automatically creates something called a changeset. This check-in would kick off a build of the files into code which then gets deployed to the dev environment.
When a developer was happy with their code in dev, they'd merge just their changeset into the test branch. They'd pull up a complete list of all of the available changesets that dhad not been merged into test, they'd select theirs and check those into the test branch. Again, this would kick off a build and the output files would get deployed to test.
Once QA was happy with the changes, the dev would merge this changeset into the prod branch. Kicking off a build and the files would be deployed to the staging area. The developer and QA would them promote these files to prod.
All of would allow multiple developers to work on the same files using this changeset mentality. When a specific changeset (or set of changesets) was merged into another environment, only those changes would get merged.
In my relatively new exposure to git, I cannot seem to find a way to select specific "pull requests" (which I assume is similar to a TFS changeset) from one branch to another branch. When I try to make a pull request from one branch to another branch, it wants to pull in not only my pull request, but every other pull request made in the lower branch by other developers too. What is the magic way to make this happen?
Note: Unfortunately we don't have the notion of a "release". We have five scrum teams working on one website with over 200 pages. Each scrum team has their own sprints and can release multiple scrum stories during their sprint. We have internally only one DEV environment, and one TEST environment and one PROD environment. Not only are our environments used by these five scrum teams, but these DEV/TEST/PROD sites are is also used by various other teams for integration efforts with applications we sell and also for customer account management and purchasing. We cannot change that infrastructure.
Note: this is not for a discussion as to if this "changeset" methodology is correct or proper. This is a question of how to achieve this behavior in github/git.
Note: we are a set of scrum-based agile teams. We work from stories. As many as 60 stories can be actively in development at any one time with our large team of 25+ developers. When one story is ready for prod, we promote it to the prod environment as an atomic unit. So think of a changeset as a story.
I have two thoughts:
Don't do it this way. Instead, you should look to git-flow. http://danielkummer.github.io/git-flow-cheatsheet/ and http://nvie.com/posts/a-successful-git-branching-model/ are good explanations. At it's core, git-flow is a naming convention for branches, so it's really not tied to git at all. In essence, you have feature branches that each developer or dev team works on. Once they complete a feature, they merge into develop. develop is "done features" -- not "done done" but rather "feature complete." When we deem it time to release, we fork to a new release/someversion branch (name to match the release name), and then work with QA to harden the release. Commits on the release/someversion branch are only bug fixes. Once it's good enough to deploy, we fast-forward the master branch up to the release branch as we push it into production. master then represents what's in production. As we deploy, we also merge release/someversion into develop so the bug fixes get into the mainline of development. The project manager / product owner can then think of the develop branch as "the latest," and developers can continue on their feature branches until they're feature complete. (Hint, make features small -- like an hour or a day. Features are not releases.)
So why is this better than the way you were doing it? If the feature is done, ready enough for QA to start banging on it, it's done enough to be part of the next release. Picking and choosing features around each other will lead you into very subtle and unpredictable bugs. Since you're re-merging at each step, you have the possibility that you'll merge incorrectly, creating a bug. You're also now creating unique product with each step, so you could get to production with a completely different set of features than you vetted in dev and test. (Will this do bad things? Ask your pharmacist if these drugs interact when taken together.)
Git-flow works great for cadences where you have well coordinated, infrequent, larger releases. As you get closer to continuous delivery, this ceremony will get in your way. At that point, you may choose to flip to GitHub flow or a similar lighter-weight naming convention.
If you're really, really, really (see the above "you shouldn't do it this way" comment) convinced you should do it this way, first, go convince a rubber duck and hopefully you will have talked yourself out of it. If you're still really, really convinced you need to do this, you'll frequently need to squash your commits together creating one large commit for the entire feature, then cherry pick the changeset between the branches.
There's a few disadvantages to this "squash and cherry-pick" approach. 1. You lose history. Since you're squashing the history together, you have to now keep features in very contained bundles, and frequently edit the bundle as a whole. One of the primary premises of source control is you get an audit history -- both to roll back to if something goes wrong, and to reference when you need to learn why something works this way or who to talk to about it. (See "git blame".) When you squash, you intentionally remove that learning tool. 2. You're playing features into place in different orders. So you're frequently doing merges. What makes git so awesome is merging is easy. What makes git merging easy is you do it in small pieces. This methodology of squashing everything associated with this feature into one huge commit and cherry-picking it between branches means you're doing very large merges ... which means it will be hard.
Yeah, I know you're quite enamored with the way it's always been, and you really don't want someone telling you your baby is ugly. Sorry. Your baby was ugly. On the bright side, it doesn't need to be. Git flow is awesome, and can definitely facilitate the velocity your team needs.
You previous behavior was dysfunctional. Although not unusual: http://nakedalm.com/avoid-pick-n-mix-branching-anti-pattern/
In Git you most likely want to do two things. The first is to follow Git Flow: http://nvie.com/posts/a-successful-git-branching-model/
Once you have this you can look at creating a deployment pipeline for binaries, not for source. You should do a build from MASTER and that build goes through your environments. Happy to discuss here and offline.

How to protect trunk from devs

We are using TFS for our code: trunk + branches for coding activities. There are 6 devs in my team.
Problem: sometimes developers don't want to create a new branch (or use an old one) to fix/develop something. They just do it in trunk. OK, in some cases it's acceptable. But most of the time it creates a lot of troubles.
How can I enforce protection of trunk and force devs to create new or reuse old branches?
UPD: I don't want to give read-only access to the devs on trunk (they have to be able to create branches and merge them back by themselves). I want some compromise - can create branches/do merging but can't develop in the trunk.
Working on the trunk directly is almost always incorrect. Yes it can be the most efficient way sometimes, but breaking process is breaking process and that will bite you eventually.
I think this problem is best solved with education, but limiting trunk write access to senior devs might help too - if they aren't "infected" too :)
Wortyh bearing in mind though that any good source-repository (read: not VSS) will save you from terminal problems in this area, it's just a matter of effort and watchfulness. You never want to rely on rollbacks, just saying "don't panic".
You can set permissions at the folder level.
Creating a branch is a powerfull permission. You will probably have to have one person who creates the branches and then sets the permissions.
For information on setting permissions see: http://msdn.microsoft.com/en-us/library/ms252587.aspx
I'll second what #annakata said. Additionally, I highly recommend that whomever is responsible for managing SCM in your organization to set up a check-in alert that lets you know when someone checks code into the trunk. That way, you can follow up (with the previously-mentioned cricket bat, if necessary) with the developer responsible.
Some other techniques to consider:
Only allow senior developers to check in. Developers shelve their changes, and the senior devs review then check-in. They can help be your gatekeeper.
Use the gated-check-in feature of TFS2010 to help you. Turn on gated check-ins for the trunk.
Education in a form that developers can understand. Make them know exactly why building off the trunk is a bad thing. SCM process education can go a long way to getting people to comply. If they think it's just an arbitrary rule, they don't feel bad about violating it.
Add consequences (to whatever degree your organization allows). Things like a beer/pizza fund that they need to contribute to when they screw up, or a funny hat that needs to be worn, or even a loud announcement to the entire development organization when someone checks in to trunk. It gets the point across quickly.
Does TFS has support for running hook scripts like subversion has?
If it has you can run pre-commit and post-commit checks to see if commits follows process guidelines, reject patches with an email explaining why etc.
If thats too much work my best advice is talking to people and use good old reward for adhering to rules and punishment for breaking them.
What kind of "fixes" are they making to the TRUNK? Typically you should never check-in to TRUNK but only merge...
If they have enhancements or bug fixes that can wait and that are not emergencies they should do their development in the DEV branch.
If it is an emergency then branch off of TRUNK and make a HOTFIX branch. This will be a copy of what is in production.
Example of when you would want to use HOTFIX:
Let's say you have a change you want to make to production or QA but you don't want the future work done in DEV to go out just yet as it has breaking changes for the QA environment or maybe you just want to be as safe as possible and make sure only the code you know you wanted to change went out with your deployment. If you do not have a HOTFIX branch then click on TRUNK and select "Branch" and name it HOTFIX or something meaningful to you. Then make your changes in HOTFIX, check them in, and deploy from the HOTFIX branch. The HOTFIX will only contain two things then A. What is in TRUNK and B. Your one-off changes. It will not include all of the extra work that you haven't validated or tested from the DEV branch, which is a good thing.
You can create user group within TFS to give readonly or no acess at all. If you right mouse click over the team project and click group membership, then add those groups to the folder structure in the source control Explorer.

Version control best practices

I just made the move to version control the other day, and after a bad experience with Subversion, I switched to Mercurial, and so far am happy with it.
Although I understand and appreciate the idea of version control, I don't really have any practical experience with it.
Right now, I am using it for a couple websites I am working on, and a couple questions have come to mind:
When/how often should I commit? After any major change, whether it works or not? When I'm done for the night? Only when it reaches it's next stable iteration? After any bugfixes?
Would I branch off when I wanted to, say, change the layout of a menu, then merge back in?
Should I branch? What is the difference (for just me, a lone developer) between branching, then merging back in, and cloning the repository and pulling it back in?
Any other advice for a version control newbie?
So far, everyone has given me good advice, but very team-oriented. I would like to clarify:
At the moment, I am just using VC on some websites I do on the side. Not quite full-out freelance work, but for the purposes of VC, I am the only one that really touches the website code.
Also, since I am using PHP on the sites, there is no compiling to be done.
Does this change your answers significantly?
Most of the questions you're asking about depends mostly on who you are working with. If you're a lone developer it shouldn't matter a lot, since you can do whatever you'd like. But if you're in a team where you have to share your code then you should discuss with your team members what the code of conduct should be since sharing changes between one another can become tricky at times.
The discussion regarding code of conduct doesn't need to be lengthy, it can be very brief; as long everyone is on the same page on how to use the repository that is shared between the programmers in the team. If you want to use the more advanced features in Mercurial, such as cherry picking or patch queues, then try using them so that it won't impact your team members in a negative way, such as rebasing on a public repository.
Remember version control has to be easy to use for everyone in the team, or else it won't be used.
When/how often should I commit? After any major change, whether it works or not? When I'm done for the night? Only when it reaches it's next stable iteration? After any bugfixes?
While working with a team there are several approaches, but the common rule is to commit early and often. The main reason on why you should commit often is to make merge conflicts easier to handle.
A merge conflict is simply put whenever merging a file that has been changed by at least two people doesn't work because they've been editing on the same lines. If you're holding on to a commit that involves a very large change with several lines of changes across several files, it will become very difficult to manage for the receiver to manage the conflicts that may occur. The merge conflict becomes even more difficult to handle if the said set of changes are held on for too long.
There are some exceptions to the rule of committing often and one is whenever you have a breaking change. although if you have the ability to commit locally (which you are doing in Mercurial and git inherently) you could commit breaking changes. As long as you fix whatever broke, you should push it upstream to the shared repository when you've fixed your own breaking change.
Would I branch off when I wanted to, say, change the layout of a menu, then merge back in?
Should I branch?
There are many branching strategies to choose from (there is the Streamed Lines paper from 1998 that has an exhaustive pattern list of branching strategies) and when you're making them for yourself it should be open game for yourself. However when working in teams, you'd better discuss openly with the team if you need to branch or not. Whenever you have the urge to branch though you should ask yourself the following questions:
Will my future changes be breaking the work of others?
Will my team have a direct negative impact from the changes I'll be doing until I'm done?
Is my code throwaway code?
If the answer is yes in any of the questions above you should probably branch publically, or keep it for yourself (since you can do that in Mercurial in several ways). You should first discuss with your team on how to execute the whole endavour to see if there is any other way of doing it and if you're going to merge your changes back in, sometimes there are factors at play where there is no need to branch (this is mostly related to how modular the code is).
When you decide to branch be prepared to handle a merge conflict. It is sane to assume the one who created the branch and made the commits to be able to merge it back into the "main branch". At these times it would be great if everyone in the team made relevant commit comments.
As a side note: You do write good commit comments, right? RIGHT!? A good commit comment usually tells why that particular change was made or what feature the committer was working on instead of a nondescript "I made a commit" kind of comment. This makes it easier for the one who is handling the big merge conflict to figure out what line changes can be overwritten and which ones to keep while going through the revision history.
Compile times, or build times rather, sometimes play into the branch discussion you may have. If your project has a slow build time then it might be a good idea to use a staging strategy in your branches. This strategy takes into account that all developers should integrate to a "main line" and changes that are approved are elevated (or "promoted") to the next stage, such as testing or release lines. It is classically illustrated with tag names for open source software like this:
main -o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-> ...
\ \ \
test o-----------o--------------o---------> ...
1.0 RC1 \ 1.0 RC2 2.0 RC1
release o----------------------> ...
1.0
The point with this is that testers can work without being interrupted by the programmers and that there is a known baseline for those who are in release management. In distributed version control, the different lines could be cloned repositories and it may look a bit different since repositories share the versioning graph. The principle however is the same.
Regarding web development, there are virtually no build times. But branching in stages (or by tagging your release revisions) it becomes easier to roll-back if you want to check a difficult-to-track-down bug.
However, a whole other thing comes into play and that is the time it takes to deploy the site. Version control tools in my experience are really bad at asset management. Handling art assets that are in total up to several GB usually is a huge pain in the butt to handle in Subversion (more so in Mercurial). Assets may require you to handle them in another way that is less time consuming, such as putting them in a shared space that are synched and backed up in a traditional manner (art assets are usually not worked on concurrently as with source code files).
What is the difference (for just me, a lone developer) between branching, then merging back in, and cloning the repository and pulling it back in?
The concepts of branching and keeping remote repositories are closer now than with centralized version control tools. You could almost consider them being the same thing. In Mercurial (and in git) you can "branch" either by:
Cloning a repository
Creating a named branch
Creating a named branch means that you're making a new path in the versioning graph for the repository you're creating it on. Creating a cloned repository means you're copying the source repository into a new location, and making a new path in the cloned repository's versioning graph. They are both two different implementations of branching as a general concept in version control.
In practice, the only difference between both methods that you should care about is in usage. You clone a repository to have a copy of the source code and have a place to store your own changes in and you create named branches whenever you want to do small experiments for yourself.
Since browsing through branches is a bit quirky for those who accustomed to a straight line of commits, advanced users know how to manipulate their versions so the version history is "clean" with e.g. cherry picking or rebase. At the moment git docs actually explain rebase rather well.
These are the practices that I follow
Each commit should make sense: one bug fix (or a set of bugs related to each other), one (small) new feature, etc. The idea is that if you need to rollback, your rollbacks fall on well defined "boundaries"
Every commit should have a good message explaining what you are committing. Really get into this habit, you will thank yourself later. Doesn't have to be verbose, a few sentences can do. If you are using a bug tracking system, associating a bug number with your commit is also extremely helpful
Now that I use git and branching is so incredibly fast and cheap, I tend to make a new branch for each new feature I'm about to implement. I'd never even consider doing this for many other VCSes. So branching depends on the system you are using, your codebase, your team, etc, there are no hard rules there.
I prefer to always use the command line and get to know my VCS's commands directly. The disconnect that a GUI based frontend can cause can be a pain, and even damaging. Controlling your source code is very important, it's worth getting in there and doing it directly. But that's just my preference.
Back up your VCS. I back up my local repository with Time Machine, and then I push out to a remote repository on my server, and that server is backed up as well. VCS alone is not really a "backup", it can go down too just like anything else.
When/how often should I commit?
You'll probably get lots of contradictory answers on this one. My view is that you should commit changes when they are working, and each commit (or checkin) should contain exactly one "edit". An "edit" is an atomic set of changes that go together to fix a bug or implement a new feature.
There is a theory that you should check in code every few hours even if it's not working, but in that case you will need to be working on your own branch - you don't want to be checking in broken code to your main line, or onto a shared branch.
The advantage of checking in every night is that you have backups (assuming that your repository is on a different machine).
As for branching:
you should have main line that contains always working code.
you should have a current development branch that contains the latest code. When you are happy with this (and it's passed all it's tests) you should merge it back into the main line.
you might want a branch that contains the last released version. This can be used for testing/debugging bugs and releasing patches (in extremis).
update before each commit
provide commit comments
commit as soon as you have something finished
don't commit anything that makes the code in the repository not compiling or buggy
update every morning
sometimes verbally communicate with colleages if there is something important to update
commit code relevant to exactly one thing (i.e. fixing a bug, implementing a feature)
don't worry to make very small commits, as long as they conform to the previous rule
Btw, what's the bad experience with Subversion?
I commit when I am finished a piece of work and only if it is working. It's bad practise to commit to somewhere where other people use the code.
Branching is something that people will argue about. Some people say never branch and just have switches to get something working or not. Do what you feel more comfortable but don't branch just because you can. I use branching and Branch when i am working on a major bit of work where if I commit broken code by accident its not going to affect everyone else.
Q: When/how often should I commit? After any major change, whether it works or not? When I'm done for the night? Only when it reaches it's next stable iteration? After any bugfixes?
A: Whenever you are feeling comfortable, I am commiting as soon as a unit of work is finished and working (which does not mean that the complete task has to be finished). But you should not commit something that does not compile (might inhibit other people in the team,if any). Also, you should not commit incomplete stuff to the trunk if there is any possibility that you have to implement a quick fix or small change before completing it.
Q: Would I branch off when I wanted to, say, change the layout of a menu, then merge back in?
A: Only if there is a possibility that you have to implement a quick fix or small change before completing your task.
The nice thing about branching is that all commits you are doing in the branch will still be available for future reference (if necessary). Also it is much simpler and faster than cloning the repo, I think ;-)
I agree with others on commit times.
Regarding branching, I generally branch only when working on something that breaks what others are doing or when a patch needs to be rolled to production in a file that already has changes that should not go to production. If you're only one developer, then the first scenario doesn't really apply.
I use tags to manage releases - the "production" tag is always associated with the current prod code, and each release is tagged with "release-YYYYMMDD". This allows you to roll back if necessary.

What Check-In Policies should be considered for version control?

I'm tasked with helping to set up the process templates and check-in policies for my company's TFS 2008 installation.
Aside from three check-in policies (a check-in action must have comments against it, a code file must be peer-reviewed, there must be a work item associated with a check-in), I have been asked to consider and implement any others.
What are some of the most important or useful policies to enforce for version control?
The fewer the better.
Usually in an organization you want to ease the friction of check-in to ensure that you are encouraging developers to make frequent small discrete check-ins rather than checking out a load of stuff at once. Then again you want to ensure that you have a working codebase for everyone who needs it and are capturing the data that you need to improve your software delivery process.
Personally, a policy to enforce changeset comments and a work item association policy are ok - as they capture meta-data that is very easy to remember at the time but hard to find afterwards. It also encourages developers to get into the habit of having a work item to track all pieces of work - even experimental development or spikes.
The peer review process might be better performed using branching or another process rather than forcing a peer review on every check-in - however that depends on your process. Remember as well that you can have mandatory check-in notes in TFS to capture meta-data such as code reviewer. A check-in note is slightly different to a check-in policy and is often confused.
If you want read more discussion about check-in policies, take a look at a blog post I did on the balancing act a while ago. Also to hear some more discussion about check-in policies, I recorded a podcast recently with a fellow Team System MVP talking about their use of TFS and it might be interesting (Radio TFS, Using TFS with Ed Blankenship). Finally we also did a Radio TFS episode all about check-in policies in 2008 that might be of interest.
Don't break the build! Of course, finding an automated way to check on that and reject the check-in are the challenge.
Some rules that we follow in our company:
Commit all changes related to the same task at once (that will help review the changes and future rollbacks or merges if needed).
template based comments (eg: prefix all comments with a code that represents what was done, + for adds, - for removes, * for updates, ! for important modifications, etc).
Obviously always check-in code that compiles, and finished work to the main-line.
check-in daily unfinished work to branches.
The ones we use where I work on TFS are:
Code Analysis
This ensures that all the code was compiled on the devs machine before it was checked in
Work Item Association
If you've done a change there should have been an assigned task!
Last Build Successful
Using the TFS Build Server to check that the current code in source control compiled on an independant machine
Check In Comments (part of the TFS Powertools - http://msdn.microsoft.com/en-us/teamsystem/bb980963.aspx)
It's good to be able to see a summary of the check in without having to go to the work item(s)
Try to keep the number of developers working on the same branch small. That way the branch stays stable with respect to compilation, the unit tests, and regressions. It's a nightmare if a developer does a check in which compiles but his code breaks a key area of the application (such as login).
If you really have to have more than 10 developers checking code into the same branch, we've started an email policy where the developer checking in warns everyone that they're checking in, so that no one attempts to update their copy of the branch in the midst of a check in. Sometimes, we've had to have the converse, where we set aside an time in the date to prohibit check ins, so that updates are safe.
Frankly, the less policies, the better. The more policies you have, the greater the incentive for NOT using version control. What happens then is:
Code is developed on parallel, uncontrolled source control systems, and just the final revision goes to the official one.
People delay committing as much as possible, decreasing visibility of what they are doing to other developers.
People will actually avoid committing something if they can get away with it, and some will find a way to get away with it.
In fact, I think your three check-in policies are already too much. For instance:
Having code being peer-reviewed before check-in makes it much more difficult to have work-in-progress stored there. Instead, if the source control system allows it (and many do), control whether the source is peer reviewed or not. With some systems you can create a life cycle for a revision, with others you might create branches, and still others you might use tags.
Having a work-item associated with a check-in makes it impossible for developers to do exploratory programming, or having initiative on possible improvements. It stifles the developers. Instead, make sure that any revision going into integration tests or user acceptance tests, not to mention production itself, is associated with a work item.
This might sound anti-Enterprise, but it's just some things we have learned in a few decades of software development. Most enterprise organizations haven't been clued in to this, but, eventually, they will. So, you might go the very opposite way, but don't say no one ever told you.
I recommend the Agile Manifest, and, particularly, Lean Software Development for general principles.
Or, taking Stack Overflow design philosophy into account, make the system reward the behavior you want.

How do you handle the tension between refactoring and the need for merging?

Our policy when delivering a new version is to create a branch in our VCS and handle it to our QA team. When the latter gives the green light, we tag and release our product. The branch is kept to receive (only) bug fixes so that we can create technical releases. Those bug fixes are subsequently merged on the trunk.
During this time, the trunk sees the main development work, and is potentially subject to refactoring changes.
The issue is that there is a tension between the need to have a stable trunk (so that the merge of bug fixes succeed -- it usually can't if the code has been e.g. extracted to another method, or moved to another class) and the need to refactor it when introducing new features.
The policy in our place is to not do any refactoring before enough time has passed and the branch is stable enough. When this is the case, one can start doing refactoring changes on the trunk, and bug-fixes are to be manually committed on both the trunk and the branch.
But this means that developpers must wait quite some time before committing on the trunk any refactoring change, because this could break the subsequent merge from the branch to the trunk. And having to manually port bugs from the branch to the trunk is painful. It seems to me that this hampers development...
How do you handle this tension?
Thanks.
This is a real practical problem. It gets worse if you have several versions you need to support and have branched for each. Even worse still if you have a genuine R&D branch too.
My preference was to allow the main trunk to proceed at its normal rate and not to hold on because in an environment where release timings were important commercially I could never argue the case that we should allow the code to stabilise ("what, you mean you released it in an unstable state?").
The key was to make sure that the unit tests that were created for the bug fixes were transitioned across when the bug was migrated into the main branch. If your new code changes are genuinely just re-factoring then the old tests should work equally well. If you changes are such that they are no longer valid then you can't just port you fix in any case and you'll need to have someone think hard about the fix in the new code stream.
After quite a few years managing this sort of problem I concluded that you probably need 4 code streams at a minimum to provide proper support and coverage, and a collection of pretty rigorous processes to manage code across them. It's a bit like the problem of being able to draw any map in 4 colours.
I never found any really good literature on this subject. It will inevitably be linked to your release strategy and the SLAs that you sign with your customers.
Addendum: I should also mention that it was necessary to write the branch merging as specific milestones into the release schedule of the main branch. Don't under-estimate the amount of work that might be entailed in bring your branches together if you have a collection of hard-working developers doing their job implementing features.
Where I work, we create temporary, short-lived (less than day -- a few weeks) working branches for every non-trivial change (feature add or bugfix). Trunk is stable and (ideally) potentially releasable all the time; only done items get merged into it. Everything committed from trunk gets merged into the working branches every day; this can be largely automated (we use Hudson, Ant and Subversion). (This last point because it's usually better to resolve any conflicts sooner than later, of course.)
The current model we use was largely influenced by an excellent article (which I've plugged before) by Henrik Kniberg: Version Control for Multiple Agile Teams.
(In our case, we have two scrum teams working on one codebase, but I've come to think this model can be beneficial even with one team.)
There is some overhead about the extra branching and merging, but not too much, really, once you get used to it and get better with the tools (e.g. svn merge --reintegrate is handy). And no, I do not create a temp branch always, e.g. for smaller, low-risk refactorings (unrelated to the main items currently under work) that can easily be completed with one commit to trunk.
We also maintain an older release branch in which critical bugs are fixed from time to time. Admittedly, there may be manual (sometimes tedious) merging work if some particular part of code has evolved in trunk significantly compared to the branch. (This hopefully becomes less of an issue as we move towards continually releasing increments from trunk (internally), and letting marketing & product management decide when they want to do an external release.)
I'm not sure if this answers your question directly, or if you can apply this in your environment (with the separate QA team and all) - but at least I can say that the tension you describe does not exist for us and we are free to refactor whenever. Good luck!
Maybe Git (or other DVCS) are better at handling merges to updated code thanks to the fact that they (really) manage changes rather than just compare files... As Joel says :
With distributed version control, merges are easy and work fine. So you can actually have a stable branch and a development branch, or create long-lived branches for your QA team where they test things before deployment, or you can create short-lived branches to try out new ideas and see how they work.
Not tried yet, though...
Where I work, we keep with the refactoring in the main branch. If the merges get tricky, they just have to be dealt with on an ad-hoc basis, they're all do-able, but occasionally take a bit of time.
Maybe our problem comes from the fact that we have branches that must have quite a long life (up to 18 months), and there are many fixes that have to be done against them.
Making sure that we only branch from extremely stable code would probably help, but will not be so easy... :(
I think the tension can be handled by adding following ingredients to your development process:
Continuous integration
Automated functional tests (I presume you already count with unit tests)
Automated delivery
With continuous integration, every commit implies a build where all unit tests get executed and you are alarmed if anything goes wrong. You start working more with head and you are less prone to branching the code base.
With automated functional tests, you are able to test your application at the click of the button. Generally, since these tests take more time, these are run nightly.
With this, the classic role of versioning starts to lose the importance. You do not make your decision on when to release based on the version and its maturity, it’s more of the business decision. If you have implemented unit and functional testing and your team is submitting tested code, head should always in state that can be released. Bugs are continuously discovered and fixed and release delivered but this is not any more cyclical process, it’s the continuous one.
You will probably have two types of detractors, since this implies changing some long rooted practices. First, the paradigm shift of continuous delivery seems counter-intuitive to managers. “Aren’t we risking to ship a major bug?” If you take a look at Linux or Windows distros, this is exactly what they are doing: pushing releases towards customers. And since you count with suite of automated tests, dangers are further diminished.
Next, QA team or department. (Some would argue that the problem is their existence in itself!) They will generally be aversive towards automating tests. It means learning new and sometimes complicated tool. Here, the best is to preach by doing it. Our development team started working on continuous integrations and in the same time writing the suite of functional tests with Selenium. When QA team saw the tool in action, it was difficult to oppose its implementation.
Finally, the thurth is that the process I described is not as simple as addig 3 ingredinents to your development process. It implies a deep change to the way you develop software.