A question on Feature/Task branches when using Subversion - version-control

I have come across sources that say that it is better to have separate branches for different features than implementing the feature in trunk.
A question that I have with this approach is, when merging a feature to trunk, we will be merging the whole thing at once. If we do this and annotate a file to see the reasons behind different lines of code, we would see a comment saying "merging xyz feature" which would not be very helpful.
Is there a way to overcome this issue?

Is your problem that SVN Blame (aka Annotate) doesn't show you the commit comments for the merged revisions?
If so, you need to add the -g flag to the command, or if using TortoiseSVN, check the 'Include Merge Info' checkbox.
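For reference, here is a minimal Python sketch of building that command line. The helper names `blame_command` and `run_blame` and the example path are made up for illustration; `-g` is the short form of `--use-merge-history` in the `svn blame` command-line client.

```python
import subprocess


def blame_command(path: str, include_merged: bool = True) -> list[str]:
    """Build an `svn blame` argument list (hypothetical helper)."""
    cmd = ["svn", "blame"]
    if include_merged:
        # -g is short for --use-merge-history, so merged lines are
        # attributed to the revisions they originally came from.
        cmd.append("-g")
    cmd.append(path)
    return cmd


def run_blame(path: str) -> str:
    """Run the command; needs the svn client and a working copy."""
    result = subprocess.run(
        blame_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout


print(" ".join(blame_command("src/main.c")))
```

Only `blame_command` is exercised here; `run_blame` is left uncalled because it depends on a real working copy.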

There are sources that would say that, but there are also sources that say that the earth is flat.
Task branching in a distributed vcs is workable, though something of a solution in search of a problem in the vast majority of cases. In previous generations of systems, it will be pure pain for no gain. It won't do anything useful; and if (when) you get it wrong it will cause confusion and unnecessary rework.
In the worst case your task branches become stealth feature branches. And then you end up maintaining several times the volume of code you need to; every other team member has their own private, slightly different versions of common libraries representing work in progress. That means every merge breaks things, so merges happen less often, so every branch diverges more.
That's the kind of thing you might have to do, the kind of costs you might have to bear, if you had multiple customers with incompatible needs, or tortuous legal restrictions. Except you are doing it to your own team, for no reason other than a vague feeling it's the right thing to do.

Related

Best practice: To clean or not to clean old branches

I've worked in different teams. In one team, people tend to clean up old branches as soon as they merge them. In another team, branches stay forever. What is the benefit of deleting or keeping old branches? Does it depend on which source control system we use? (In my case, SVN.)
The answer may depend on the version control system that you use. For example, if you use Git, then you should not try to remove any branch, since the branching system and the way commit and push history is handled (which depends on branches) is very different from SVN.
In general, however, I tend to keep the old branches and not delete them. And in the professional places where I have worked, they tend to keep the branches too. From my point of view, keeping a branch not only provides you with code history, but also:
Failed attempts history. You may later think about doing something that has failed before. If you keep the failed branch, you will be able to understand why it failed in the first place.
Good reusable code may exist in these branches. Sometimes when the main stable branch ends up discarding a lot of code, good code developed specifically for this branch may end up in the trash, too. However, some of this code may prove useful in other situations in later stages of development. So, why reinvent the wheel?
Spinoff projects. In big projects, branches sometimes contain features that did not make it into the final product. Some of these features may contain new ideas that could form a standalone project by themselves.
Proof. Let's face it, in companies, especially big ones, there are managerial concerns that need to be taken into account while committing code. For example, while looking at code history, you can immediately see who has committed faulty or good code, and avoid misunderstandings. I know it sounds cynical, but sometimes it saves people a lot of trouble.
In general, it's history. Why delete branches that remind you of the paths that development has followed up until now? I doubt it will have a significant impact on disk space (in most cases, at least; where it does have a big impact, companies should deal with the space problem before it actually becomes a concern). Branches represent thousands of man-hours of work. Deleting them is like throwing that time away.
As far as discarding the branches, I cannot think of any reason other than to save space.
Simple... You can back track as long as you want, if you have them.
In my case it is also SVN. I archive old branches under different tags and move them to a different folder. So there is always one hot folder (Live) with parallel dev branches; once a merge is completed, the branch gets archived.

Is using “feature branches” compatible with refactoring?

“Feature branches” is when each feature is developed in its own branch and only merged into the main line when it has been tested and is ready to ship. This allows the product owner to choose the features that go into a given shipment and to “park” features that are partly written if more important work comes in (e.g. a customer phones up the MD to complain).
“Refactoring” is transforming the code to improve its design so as to reduce the cost of change. Without doing this continually you tend to get an uglier code base that is more difficult to write tests for.
In real life there are always customers that have been sold new features, and due to politics all the customers have to see that progress is being made on “their” group of features. So there is very rarely a time without a lot of half-finished features sitting on branches.
If any refactoring has been done, merging in the “feature branches” becomes a lot harder, if not impossible.
Do we just have to give up on being able to do any refactoring?
See also "How do you handle the tension between refactoring and the need for merging?"
My view these days is that, given the political reasons that resulted in these long-living branches and the disempowerment of the development director that prevented him from taking action, I should have started looking for a new job sooner.
Feature branches certainly make refactoring much harder. They also make things like continuous integration and deployment harder, because you are ballooning the number of parallel development streams that need to be built and tested. You are also undermining the central tenet of "continuous integration" -- that everyone is working on the same codebase and "continuously" integrating their changes with the rest of the team's changes. Typically, when feature branches are in use, the feature branch isn't continuously built or tested, so the first time the "feature branch" code gets run through the production build/test/deploy process is when it is "done" and merged into the trunk. This can introduce a whole host of problems at a late and critical stage of your development process.
I hold the controversial opinion that you should avoid feature branches at (nearly) all costs. The cost of merging is very high, and (perhaps more importantly) the opportunity cost of failing to "continuously integrate" into a shared code base is even higher.
In your scenario, are you sure you need a separate feature branch for each client's feature(s)? Could you instead develop those features in the trunk but leave them disabled until they are ready? Generally, I think it is better to develop "features" this way -- check them in to trunk even if they aren't production-ready, but leave them out of the application until they are ready. This practice also encourages you to keep your components well-factored and shielded behind well-designed interfaces. The "feature branch" approach gives you the excuse to make sweeping changes across the code base to implement the new feature.
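A rough sketch of what "in trunk but disabled" can look like, in Python. The flag name, the `FEATURE_FLAGS` dictionary and the invoice functions are all invented for illustration; a real project would likely read flags from configuration rather than a module-level dict.

```python
from typing import Mapping

# Hypothetical flag store; in practice this might come from a config file.
FEATURE_FLAGS = {
    "new_invoice_layout": False,  # checked in to trunk, not yet released
}


def is_enabled(name: str, flags: Mapping[str, bool] = FEATURE_FLAGS) -> bool:
    """Unknown features are treated as disabled."""
    return flags.get(name, False)


def render_invoice(total: float, flags: Mapping[str, bool] = FEATURE_FLAGS) -> str:
    # The unfinished code path ships with trunk but stays unreachable
    # until the flag is switched on.
    if is_enabled("new_invoice_layout", flags):
        return f"== Invoice ==\nTotal due: {total:.2f}"
    return f"Invoice total: {total:.2f}"


print(render_invoice(12.5))
print(render_invoice(12.5, {"new_invoice_layout": True}))
```

Both code paths live on the same line of development, so a refactoring of `render_invoice` touches the old and the new layout at once instead of having to be merged later.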
I like this provoking thesis ('giving up refactoring'), because it enriches discussion :)
I agree that you have to be very careful with bigger refactorings when you have lots of parallel codelines, because conflicts can increase integration work a lot and even introduce regression bugs during merging.
Because of this refactoring vs. feature-branches problem, there are lots of tradeoffs. Therefore I decide on a case-by-case basis:
On feature branches I only do refactorings if they make my feature easier to implement. I always try to focus on the feature only. Branches should differ from trunk/mainline as little as possible.
Conversely, I sometimes even have refactoring branches, where I do bigger refactorings (reverting multiple steps is very easy and I don't distract my trunk colleagues). Of course I will tell my team that I am doing this refactoring and try to plan it for a clean-up development cycle (call it a sprint if you like).
If the politics you mention are a big thing, then I would encapsulate the refactoring efforts internally and add them to the estimate. In my view, customers will see faster progress in the medium term when code quality is better. Most likely they won't understand refactoring (which makes sense, because it is out of their scope...), so I hide this from them.
What I would never do is to refactor on a release-branch, whose target is stability. Only bug-fixes are allowed there.
In summary, I would plan my refactorings depending on the codeline:
feature-branch: only smaller ones (if they "help" my feature)
refactoring-branch: for bigger ones, where the refactoring target isn't completely clear (I often call them "scribble refactorings")
trunk/mainline: OK, but I have to communicate with developers on feature-branches to not create an integration nightmare.
release-branch: never ever
Refactoring and merging are the two combined topics Plastic SCM focuses on. In fact there are two important areas to focus on: one is dealing (during merge) with files that have been moved or renamed on a branch. The good news here is that all the "new age" SCMs will let you do that correctly (Plastic, Git, Hg) while the old ones simply fail (SVN, Perforce and the even older ones).
The other part is dealing with refactored code inside the same file: you know, you move your code and another developer modifies it in parallel. It is a harder problem, but we focus on it too with the new merge/diff toolset. Find the xdiff info here and the xmerge (cross-merging) here. A good discussion about how to find moved code here (compared to "beyond compare").
While the "directory merging" or structure merging issue is a core one (whether the system does it or not), the second one is more a tooling problem (how good your three-way merge and diff tools are). You can have Git and Hg for free to solve the first problem (and even Plastic SCM is now free too).
Part of the problem is that most merge tools are just too stupid to understand any refactoring. A simple rename of a method should be merged as a rename of the method, not as an edit to 101 lines of code. Then, for example, additional calls to the method in another branch would be coped with automatically.
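To make the rename point concrete, here is a small Python sketch using the standard `difflib` module as a stand-in for a line-based diff. The sample source text and the helper name `changed_line_count` are invented for illustration; real merge tools use their own diff algorithms, so treat the numbers as indicative only.

```python
import difflib


def changed_line_count(before: str, after: str) -> int:
    """Count lines that a line-based comparison sees as removed or added."""
    matcher = difflib.SequenceMatcher(
        a=before.splitlines(), b=after.splitlines()
    )
    return sum(
        (i2 - i1) + (j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


before = """\
def calc(x):
    return x * 2

a = calc(1)
b = calc(2)
c = calc(3)
"""
# One logical change: rename calc -> double.
after = before.replace("calc", "double")

print(changed_line_count(before, after))  # 8: four lines removed, four added
```

A single rename shows up as eight line edits even in this tiny file, and any of those lines can collide with an unrelated edit on another branch.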
There are now some better merge tools (for example SemanticMerge) that are based on language parsing and designed to deal with code that has been moved and modified. JetBrains (the creator of ReSharper) has just posted a blog on this.
There has been lots of research on this over the years; at last some products are coming to market.

Do you feel comfortable merging code?

This morning, I read two opinions on refactoring.
Opinion 1 (Page not present)
Opinion 2 (Page not present)
They recommend branching (and subsequently merging) code to:
Keep the trunk clean.
Allow a developer to walk away from risky changes.
In my experience (particularly with Borland's StarTeam), merging is a non-trivial operation. And for that reason, I branch only when I must (i.e. when I want to freeze a release candidate).
In theory, branching makes sense, but the mechanics of merging make it a very risky operation.
My questions:
Do you feel comfortable merging code?
Do you branch code for reasons other than freezing a release candidate?
Branching might be painful but it shouldn't be.
That's what Git-like projects (Mercurial, Bazaar) tell us about CVS and SVN. On Git and Mercurial, branching is easy. On SVN it's easy, but with big projects it can be a bit hardcore to manage (because the time spent on the branching/merging process can be very long compared to Git and Mercurial, and difficult if there are non-obvious conflicts). That doesn't help users who are not used to branching often to have confidence in it. A lot of users who are unaware of the powerful uses of branching just stay away from it to avoid adding new problems to their projects, letting fear of the unknown keep them from being efficient.
Branching should be an easy and powerful tool that we use for any reason good enough to branch.
Some good reasons to branch:
working on a specific feature in parallel with other people (or while working on other features alternatively if you're alone on the project);
having several brand versions of the application;
having parallel versions of the same application, like competing techniques developed at the same time by two parts of the team to see what works better;
having the application's resources changed on an artist/designer-specific branch (for example in games) where the application is "stable", while other branches and trunk are used for feature additions and debugging;
[add here useful usages]
Some loose guiding principles:
Branch late and only when you need to
Merge early and often
Get the right person to do the merge; either the person who made the changes or the person who wrote the original version is best
Branching is just another tool, you need to learn how to use it effectively if you want the maximum benefit.
Your attitude to branching should probably differ between distributed open source projects (such as those on Git) and your company's development projects (possibly running on SVN). For distributed projects you'll want to encourage branching to maximize innovation and experimentation. For the latter variety you'll want tighter control, with checkin policies for each code line that dictate when branching should or should not occur, mostly to "protect" the code.
Here is a guide to branching:
http://www.vance.com/steve/perforce/Branching_Strategies.html
Here is a shorter guide with some high level best practices:
https://www.perforce.com/pdf/scm-best-practices.pdf
Branching is trivial. Merging is not. For that reason, we rarely branch anything.
Using SVN, I've found branching to be relatively painless. Especially if you periodically merge the trunk into your branch to keep it from getting too far out of sync.
We use svn. It only takes us about 5 minutes to branch code. It's trivial compared to the amount of pain it saves us from messing up trunk.
Working in a code base of millions of lines of code with hundreds of developers, branching is an everyday occurrence. The life of the branch varies depending on the amount of work being done.
For a small fix:
designer makes a sidebranch off the main stream
makes changes
tests
reviews
merges accumulated changes from main stream to sidebranch
iterates through one or more of the previous steps
merges back to main stream
For a multi-person team feature:
team makes a feature sidebranch off the main stream
individual team member operates on feature sidebranch as in "small fix" approach and merges to feature sidebranch.
sidebranch prime periodically merges accumulated changes from main stream to feature sidebranch. Small incremental merges from the mainstream to feature sidebranch are much easier to deal with.
when feature works, do final merge from main stream to feature sidebranch
merge feature sidebranch to main stream
For a customer software release:
make a release branch
deliver fixes as needed to release branch
fixes are propagated to/from the main stream as needed
Customer release streams can be very expensive to support. They require testing resources: people and equipment. After a year or two, developer knowledge on specific streams starts to get stale as the main stream moves forward.
Can you imagine how much it must cost for Microsoft to support XP, Vista and Windows 7 concurrently? Think about the test beds, the administration, documentation, customer service, and finally the developer teams.
Golden rule: Never break the main stream since you can stall a large number of developers. $$$
The branching problem is why I use a Distributed Version Control system (Git in my case, but there are also Mercurial and Bazaar) where creating a branch is trivial.
I use short lived branches all the time for development. This lets me mess around in my own repository, make mistakes and bad choices, and then rebase the changes to the main branch so only clean changes are kept in history.
I use tags to mark frozen code, and it is easy in these systems to go back and branch off these for bug fixes without having a load of long lived branches in the code base.
I use Subversion and consider branching very simple and easy. So to answer question 1.. Yes.
The reason for branching can vary massively. I branch if I feel I should. Quite hard to put rules and reasons down for all possibilities.
However, as for the "Allow a developer to walk away from risky changes." comment, I totally agree with that one. I create a branch whenever I want to really play around with the code and wish I was the only developer working on it. When you branch, you can do that...
I've been on a project using svn and TFS and branching by itself is a really simple thing.
We used branching for release candidate as well as for long lasting or experimental features and for isolating from other team's interference.
The only painful moment in branching is merging, because an old or intensely developed branch may differ a lot from trunk and might require significant effort to merge back.
Having said the above, I would say that branching is a powerful and useful practice which should be taken into account while developing.
If merging is too much of a pain, consider migrating to a better VCS. That will be a bigger pain, but only once.
We use svn and have adopted a rule to branch breaking changes. Minor changes are done right in the trunk.
We also branch releases.
Branching and merging have worked well for us. Granted there are times we have to sit and think about how things fit together, but typically svn does a great job of merging everything.
I use svn, it takes less than a minute to branch code. I used to use Clearcase, it took less than a minute to branch code. I've also used other, lesser, SCMs and they either didn't support branches or were too painful to use. Starteam sounds like the latter.
So, if you cannot migrate to a more useful one (actually, I've only heard bad things about Starteam) then you might have to try a different approach: manual branching. This involves checking out your code, copying it to a different directory and then adding it as a new directory. When you need to merge, you'd check out both directories and use WinMerge to perform the merge, checking in the results to the original directory. Awkward and potentially difficult if you continue to use the branch, but it works.
The trick with branching is not to treat it as a completely new product. It is a branch: a relatively short-lived device used to make changes separately and safely to a main product trunk. Anyone who thinks merging is difficult is either refactoring the code files so much (i.e. they are renaming, copying, creating new, deleting old) that the branch becomes a completely different thing, or they are keeping the branch so long that the accumulated changes bear little resemblance to the original.
You can keep a branch for a long time, you just have to merge your changes back regularly. Do this and branching/merging becomes very easy.
I've only done it a couple times, so I'm not exactly comfortable with it.
I've done it to conduct design experiments that would span over some checkins, so branching is an easy way to wall off yourself a garden to play in. Also, it allowed me to tinker while other people worked on the main branch, so we didn't lose much time.
I've also done it when making wide ranging changes that would render the trunk uncompilable. It became clear in my project that I'd have to remove compile-time type safety for a large portion of the codebase (go from generics to system.object). I knew this would take a while and would require changes all over the codebase which would interfere with other people's work. It would also break the build until I was complete. So I branched and stripped out the generics, working until that branch compiled. I then merged it back into the trunk.
This turned out pretty well. It prevented a lot of toe-stepping, which was great. Hopefully nothing like this will ever come up again. It's kind of a rare thing for a design change to require these kinds of wide-ranging edits without a lot of code being thrown out...
Branches have to be managed correctly to make merging painless. In my experience (with Perforce) regular integration to the branch from the main line meant that the integration back into the main line went very smoothly.
There were only rare occasions when the merging failed. The constant integration from the main line to the branch may well have involved merges, but they were only of small edits that the automatic tools could handle without human intervention. This meant that the user didn't "see" these happening.
Thus any merges required in the final integration could often be handled automatically too.
Perforce's 3-way merge tools were a great help when they were actually needed.
Do you feel comfortable branching code?
It really depends on the tool I'm using. With StarTeam, branching is indeed non-trivial (TBH, StarTeam sucks at branching). With Git, branching is a regular activity and is very easy.
Do you branch code for reasons other than freezing a release candidate?
Well, this really depends on your version control pattern, but the short answer is yes. Actually, I suggest reading the following articles:
Version Control for Multiple Agile Teams by Henrik Kniberg
FeatureBranch by Martin Fowler
I really like the pattern described in the first article and it can be applied with any (non Distributed) Version Control System, including Starteam.
I might consider the second approach (actually, a mix of both strategies) with (and only with) a Distributed Version Control System (DVCS) like Git, Mercurial...
We use StarTeam and we only branch when we have a situation that requires it (i.e. hotfix to production during release cycle or some long reaching project that spans multiple release windows). We use View Labels to identify releases and that makes it a simple matter to create branches later as needed. All builds are based on these view labels and we don't build non-labeled code.
Developers should be following a "code - test - commit" model, and if they need a view for some testing purpose or "risky" development they create it and manage it. I manage the repository and create branches only when we need them. Those times include (but are not limited to):
Production hotfix
Projects with long or overlapping development cycles
Extensive rewriting or experimental development
The merge tool in StarTeam is not the greatest, but I have yet to run into an issue caused by it. Whoever is doing the merge just needs to be VERY certain they know what they're doing.
Creating a "Read Only Reference" view in StarTeam and setting it to a floating configuration will allow changes in the trunk to automatically show in the branch. Set items to branch on change. This is good for concurrent development efforts.
Creating a "Read Only Reference" view with a labeled configuration is what you'd use for hot fixes to existing production releases (assuming you've labeled them).
Branching is trivial, as most have answered, but merging, as you say, is not.
The real keys are decoupling and unit tests. Try to decouple before you branch, and keep an eye on the main to be sure that the decoupling and interface are maintained. That way when it comes time to merge, it's like replacing a lego piece: remove the old piece, and the new piece fits perfectly in its place. The unit tests are there to ensure that nothing got broken.
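As a sketch of the "lego piece" idea in Python: the interface, both implementations and the checks below are invented for illustration. The assumption is that the old implementation lives on the main line and the replacement is developed on a branch, with both expected to pass the same tests.

```python
from typing import Protocol


class PriceCalculator(Protocol):
    """The interface that stays fixed while the branch is alive."""

    def total(self, prices: list[float]) -> float: ...


class LoopCalculator:
    """Stands in for the implementation currently on the main line."""

    def total(self, prices: list[float]) -> float:
        result = 0.0
        for price in prices:
            result += price
        return result


class SumCalculator:
    """Stands in for the replacement developed on a branch."""

    def total(self, prices: list[float]) -> float:
        return float(sum(prices))


def check_contract(calc: PriceCalculator) -> None:
    """The same checks run against any implementation of the interface."""
    assert calc.total([]) == 0.0
    assert calc.total([1.5, 2.5]) == 4.0


for implementation in (LoopCalculator(), SumCalculator()):
    check_contract(implementation)
print("both implementations satisfy the contract")
```

As long as callers depend only on `PriceCalculator`, merging the branch amounts to swapping one class for the other.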
Branching and merging should be fairly straightforward.
I feel very comfortable branching/merging.
Branching is done for different reasons, depending on your development process model.
There are a few different branch models:
Here's one:
Trunk
  .
  .
  .
  ..
  . ....
  .     ...
  .        ..Release1
  .
  .
  ...
  .  ....
  .      ...Release2
  .
  .
  ..
  . ...
  .    ..
  .      ...Release3
  .
  .
Now here's a curious thing. Suppose Release1 needed some bugfixing. Now you need to branch Release1 to develop 1.1. That is OK, because now you can branch R1, do your work, and then merge back to R1 to form R1.1. Notice how this keeps the diffs clear between releases?
Another branching model is to have all development done on the Trunk, and each release gets tagged, but no further development gets done on that particular release. Branches happen for development.
Trunk
  .
  .
  .
  .Release1
  .
  .
  .
  .
  .Release2
  .
  .......
  .      ......
  .            ...DevVer1
  .               .
  .               .
  .            ...DevVer2
  .        ....
  .    ....
  ...
  .Release3
  .
There may be one or two other major branch models, I can't recall them off the top of my head.
The bottom line is, your VCS needs to support flexible branching and merging.
Per-file VCS systems present a major pain IMO (RCS, Clearcase, CVS).
SVN is said to be a hassle here as well, not sure why.
Mercurial does a great job here, as does (I think) Git.

How can I author changes that are not prone to merge conflicts?

Automated merging isn't perfect. Just because there isn't a line-edit conflict doesn't mean there isn't a syntactic conflict, and that doesn't mean there isn't a semantic conflict.
Does anyone have strategies for authoring low-conflict changes? Is this something that falls out of TDD or other approaches (Certainly TDD will help catch them, but does it actually prevent)?
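To illustrate the difference between a line-edit conflict and a semantic conflict, here is a deliberately naive Python sketch. The `merge3` function is a toy: it assumes all three versions have the same number of lines and compares them position by position, which is not how real merge tools align text. The sample code being merged is invented.

```python
def merge3(base: list[str], ours: list[str], theirs: list[str]) -> list[str]:
    """Naive position-wise three-way merge of equal-length line lists."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the line count")
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:
            merged.append(o)  # only our side changed, or both sides agree
        elif o == b:
            merged.append(t)  # only their side changed
        else:
            raise ValueError(f"line-edit conflict: {o!r} vs {t!r}")
    return merged


base = ["def total(items):", "    return sum(items)", "", "result = 0"]
# Our branch renames the function.
ours = ["def compute_total(items):", "    return sum(items)", "", "result = 0"]
# Their branch adds a call using the old name.
theirs = ["def total(items):", "    return sum(items)", "", "result = total([1, 2])"]

merged = merge3(base, ours, theirs)  # no line-edit conflict is reported
try:
    exec("\n".join(merged), {})
    outcome = "ok"
except NameError as err:
    outcome = f"semantic conflict: {err}"
print(outcome)
```

The merge completes cleanly because the two edits touch different lines, yet the merged program fails as soon as it runs. A test suite run after the merge would catch this; whether any authoring practice prevents it is the open question.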
I've always found that the smaller my commits, the less likely they are to have merge conflicts. The folks who have big problems always seem to go off for days and work on things, then try to merge them all at once.
Right now I'm working on a 2 man team where we are right in the same codebase all the time. We each work in a personal branch and then integrate to a shared branch whenever we have something working. That's usually several times a day. We almost never have merge conflicts, and when we do they're pretty trivial.
So... get the latest code from the repository frequently. Work in your own branch, so you can commit your changes and merge other folks' work without affecting the rest of the team. Then push your own code up to the shared branch as frequently as possible so the changes will be as small as possible.
Also, talk to your team. If you know someone else is working in a specific file, you might want to wait until they get their work in before you jump in. Sometimes you can't help it, but communication at least lets you plan for a complicated merge rather than being surprised.
Classes that violate the single responsibility principle are the hardest to merge. Finding a class that was difficult to merge is probably a sign that it needs to be refactored, probably in the direction of more parts.
First of all, your code base should be modular. Second, what you need is communication with the rest of your team. Everybody should know who is working on what. If there is a change in the internal API, it should be made clear to the whole team.
Also, before committing, always fetch the latest version, and if complex merging is needed, do it locally.
This is really a human problem, not a technical one. Source control doesn't replace proper communication channels. Your Project Manager should be on top of every change, and should realize when a change will span several people.
Also, common sense is needed. :)
Unit testing is of course a big help to catch the most elusive bugs that can come up when merging.
Talk to your fellow developers, and try to avoid synchronous editing of the same block of code wherever possible. Having a well-modularised architecture (small classes, decoupled functionality) makes this possible almost all the time.
If we ever do have a clash, we often resolve it by one of us switching to writing unit tests for untested code for a few minutes.

How often to commit changes to source control? [closed]

How often should I commit changes to source control? After every small feature, or only for large features?
I'm working on a project and have a long-term feature to implement. Currently, I'm committing after every chunk of work, i.e. every sub-feature implemented and bug fixed. I even commit after I've added a new chunk of tests for some feature after discovering a bug.
However, I'm concerned about this pattern. In a productive day of work I might make 10 commits. Given that I'm using Subversion, these commits affect the whole repository, so I wonder if it is indeed good practice to make so many?
Anytime I complete a "full thought" of code that compiles and runs I check-in. This usually ends up being anywhere between 15-60 minutes. Sometimes it could be longer, but I always try to checkin if I have a lot of code changes that I wouldn't want to rewrite in case of failure. I also usually make sure my code compiles and I check-in at the end of the work day before I go home.
I wouldn't worry about making "too many" commits/check-ins. It really sucks when you have to rewrite something, and it's nice to be able to rollback in small increments just in case.
When you say you are concerned that your "commits affect the whole repository" --- are you referring to the fact that the whole repository's revision number increases? I don't know how many bits Subversion uses to store it, but I'm pretty sure you're not going to run out of revision numbers! Many commits are not a problem. You can commit ten times as often as the guy next door and you won't increase your carbon footprint at all.
A single function or method should be named for what it does, and if the name is too long, it is doing too much. I try to apply the same rule to check-ins: the check-in comment should describe exactly what the change accomplishes, and if the comment is too long, I'm probably changing too much at once.
I like this small article from Jeff Atwood: Check In Early, Check In Often
I personally commit every logical group of code that is finished/stable/compiles and try not to leave the day without committing what I did that day.
If you are making major changes and are concerned about affecting others working on the code, you can create a new branch, and then merge back into the trunk after your changes are complete.
If your version control comment is longer than one or two sentences, you probably aren't committing often enough.
I follow the open-source mantra (paraphrased) - commit early, commit often.
Basically whenever I think I've added useful functionality (however small) without introducing problems for other team members.
This commit-often strategy is particularly useful in continuous integration environments as it allows integration testing against other development efforts, giving early detection of problems.
I commit every time I'm done with a task. That usually takes 30 mins to 1 hr.
Don't commit code that doesn't actually work. Don't use your repository as a backup solution.
Instead, back up your incomplete code locally in an automated way. Time Machine takes care of me, and there are plenty of free programs for other platforms.
The rule of thumb I use is to check in when the group of files being checked in can be covered by a single check-in comment.
This is generally to ensure that check-ins are atomic and that the comments can be easily digested by other developers.
This is especially true when your changes affect a configuration file (such as a Spring context file or a Struts config file) that has application-wide scope. If you make several 'groups' of changes before checking in, their edits overlap in the configuration file, and the two groups end up merged with each other in a single check-in.
I don't think you should worry so much about how often. The important thing here is what, when and why. Saying that you have to commit every 3 hours or every 24 hours really makes no sense. Commit when you have something to commit, don't if you don't.
Here's an extract from my recommended best practices for version control:
[...] If you are doing many changes to a project at the same time, split them up into logical parts and commit them in multiple sessions. This makes it much easier to track the history of individual changes, which will save you a lot of time when trying to find and fix bugs later on. For example, if you are implementing feature A, B and C and fixing bug 1, 2 and 3, that should result in a total of at least six commits, one for each feature and one for each bug. If you are working on a big feature or doing extensive refactoring, consider splitting your work up into even smaller parts, and make a commit after each part is completed. Also, when implementing independent changes to multiple logical modules, commit changes to each module separately, even if they are part of a bigger change.
Ideally, you should never leave your office with uncommitted changes on your hard drive. If you are working on projects where changes will affect other people, consider using a branch to implement your changes and merge them back into the trunk when you are done. When committing changes to libraries or projects that other projects—and thus, other people—depend on, make sure you don’t break their builds by committing code that won’t compile. However, having code that doesn’t compile is not an excuse to avoid committing. Use branches instead. [...]
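One way to picture the "commit changes to each module separately" advice from the extract is to group the changed files before committing. The sketch below assumes, purely for illustration, that each top-level directory is one logical module; the function name and the sample paths are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_changes_by_module(changed_paths):
    """Group changed file paths by their top-level directory.

    Assumes (hypothetically) that each top-level directory is one logical
    module; files at the repository root go under the key "(root)".
    """
    groups = defaultdict(list)
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        module = parts[0] if len(parts) > 1 else "(root)"
        groups[module].append(path)
    return dict(groups)
```

Each resulting group would then become its own commit with its own comment. In a real project the grouping would usually follow features and bug fixes rather than directories alone.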
Your current pattern makes sense. Keep in mind how you use this source control: what if you have to rollback, or if you want to do a diff? The chunks you describe seem like exactly the right differential in those cases: the diff will show you exactly what changed in implementing bug #(specified in checkin log), or exactly what the new code was for implementing a feature. The rollback, similarly, will only touch one thing at a time.
I also like to commit after I finish a chunk of work, which is often several times a day. I think it's easier to see what's happening in small commits than big ones. If you're worried about too many commits, you may consider creating a branch and merging it back to the trunk when the whole feature is finished.
Here's a related blog post: Coding Horror: Check In Early, Check In Often
As others have stated, try to commit one logical chunk that is "complete" enough that it does not get in other devs' way (e.g., it builds and passes automated tests).
Each dev team / company must define what is "complete enough" for each branch. For example, you may have feature branches that require the code only to build, a Trunk that also requires code to pass automated tests, and labels indicating something has passed QA testing... or something like that.
I'm not saying that this is a good pattern to follow; I'm only pointing out that how done is "done" depends on your team's / company's policies.
I also like to check in regularly, that is, every time I have completed a step towards my goal.
This is typically every couple of hours.
My difficulty is finding someone willing and able to perform so many code reviews.
Our company policy is that we need to have a code review before we can check anything in, which makes sense, but there is not always someone in the department who has time to perform a code review immediately.
Possible solutions:
More work per check-in; fewer check-ins == fewer reviews.
Change the company check-in policy. If I have just done some refactoring and the unit tests all run green, maybe I can relax the rule?
Shelve the change until someone can perform the review, and continue working. This can be problematic if the reviewer does not like your code and you have to redesign. Juggling different stages of a task by 'shelving' changes can become messy.
The moment you think about it.
(as long as what you check in is safe)
Depends on your source code system and what else you have in place. If you're using Git, then commit whenever you finish a step. I use SVN and I like to commit when I finish a whole feature, so, every one to five hours. If I were using CVS I'd do the same.
I agree with several of the responses: do not check in code that will not compile; use a personal branch or repository if your concern is having a "backup" of the code or its changes; check in when logical units are complete.
One other thing that I would add is that depending on your environment, the check-in rate may vary with time. For example, early in a project checking in after each functional piece of a component is complete makes sense for both safety and having a revision history (I am thinking of cases where earlier bits get refactored as later ones are being developed). Later in the project, on the other hand, entirely complete functionality becomes more important, especially during integration development/testing. A half-integration or half-fix does not help anyone.
As for checking in after each bug fix: unless the fix is trivial, absolutely! Nothing is more of a pain than finding that one check in contained three fixes and one of them needs to be rolled back. More often than not it seems that in that situation the developer fixed three bugs in one area and unwinding which change goes to which bug fix is a nightmare.
I like to commit changes every 30-60 minutes, as long as it compiles cleanly and there are no regressions in unit tests.
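A commit gate along those lines could look like the sketch below: run a build step and a test step, and treat the change as ready only if both succeed. The function name and the example commands (byte-compiling a hypothetical `src` directory and running tests from a hypothetical `tests` directory) are assumptions; substitute whatever your project actually uses.

```python
import subprocess
import sys


def safe_to_commit(check_commands):
    """Run each check in order; return True only if every one exits with 0.

    `check_commands` is a list of argument lists, for example a build step
    followed by a unit-test step. Stops at the first failing check.
    """
    for command in check_commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            return False
    return True


# Hypothetical checks: byte-compile the sources, then run the unit tests.
EXAMPLE_CHECKS = [
    [sys.executable, "-m", "compileall", "-q", "src"],
    [sys.executable, "-m", "unittest", "discover", "-s", "tests"],
]
```

Calling `safe_to_commit(EXAMPLE_CHECKS)` before `svn commit` keeps the "compiles and passes tests" rule mechanical rather than a matter of memory.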
Well, you could have your own branch to which you can commit as often as you like, and when you are done with your feature, you could merge it to the main trunk.
On the frequency of commits, I think of it this way: how much pain would it cause me if my hard disk crashed and I hadn't committed something? For me, the quantum of this "something" is about 2 hours of work.
Of course, I never commit something that doesn't compile.
At least once a day.
I don't have a specific time limit per commit. I tend to commit once a test has passed and I'm happy with the code. I wouldn't commit code that does not compile or is otherwise in a state that I would not feel good about reverting to in case of failure.
You have to balance the compromise between safety and recoverability on the one hand and ease of change management for the entire project on the other.
The best scheme that I've used has had two answers to that question.
We used two completely separate repositories: one was the project-wide repository and the other was our own personal repository (we were using RCS at the time).
We would check into our personal repository very regularly, pretty much each time we saved our open files. As such, the personal repository was basically a big, long-ranging undo buffer.
Once we had a chunk of code that compiled, tested OK and was accepted as being ready for general use, it was checked into the project repository.
Unfortunately, this system relied on the use of different VCS technologies to be workable. I've not found any satisfactory method of achieving the same results while using two VCSs of the same type (e.g. two Subversion repositories).
However, I have had acceptable results by creating "personal" development branches in a subversion repository - checking into the branch regularly and then merging into the trunk upon completion.
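The personal-branch workflow can be sketched as a sequence of `svn` commands. The helper below only builds the command lists; it does not run them. It assumes the conventional `trunk`/`branches` repository layout and Subversion 1.8 or later, where a plain `svn merge` handles merging a branch back (older releases needed `--reintegrate`). The function name, URL, and branch name are hypothetical.

```python
def personal_branch_commands(repo_url, branch_name, summary):
    """Return the svn commands for a branch-then-merge workflow as argv lists."""
    trunk = f"{repo_url}/trunk"
    branch = f"{repo_url}/branches/{branch_name}"
    return [
        # 1. Create the personal branch as a server-side copy of trunk.
        ["svn", "copy", trunk, branch, "-m", f"Create branch {branch_name}"],
        # 2. Point the working copy at the branch and commit there freely.
        ["svn", "switch", branch],
        ["svn", "commit", "-m", "Work in progress"],
        # 3. When the work is complete, return to trunk and merge the branch.
        ["svn", "switch", trunk],
        ["svn", "merge", branch],
        ["svn", "commit", "-m", summary],
    ]
```

The middle commit would be repeated as often as you like. Only the final commit lands on the trunk, so its comment should summarize the whole change.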
If you're working on a branch which won't be released, a commit is always safe.
However, if you are sharing it with other developers, committing non-working code is likely to be a bit annoying (particularly if it's in an important place). Normally I only commit code which is effectively "working": not that it's been fully tested, but that I've ascertained that it does compile and does not fail immediately.
If you're using an integrated bug tracker, it may be helpful to do separate commits if you've fixed two bugs, so that the commit log can go against the right bugs. But then again, sometimes one code change fixes two bugs, so then you just have to choose which one to put it against (unless your system allows one commit to be associated with multiple bugs).
I still believe in the phrase 'commit often, commit early'. I prefer a decentralized VCS like Mercurial, where it's no problem to commit several things and push them upstream later.
This is really a common question, but the real question is: Can you commit unfinished code?
Whenever you finish some code that works and won't screw anyone else up if they get it in an update.
And please make sure you comment properly.