Best practices to deal with "slightly different" branches of source code - version-control

This question is tool-agnostic rather than tied to a particular version control program.
Assume there is a source code tree under some distributed version control system. Let's call it A.
At some point somebody else clones it and gets their own copy. Let's call it B.
I'll call A and B branches, even if some version control tools have different definitions for branches (some might call A and B repositories).
Let's assume that branch A is the "main" branch. In the context of distributed version control this only means that branch A is modified much more actively and the owner of branch B periodically syncs (pulls) new updates from branch A.
Suppose a certain source file in branch B contains a class (again, this is language-agnostic). The owner of branch B decides that some class methods belong together and groups them by moving them within the class body. Functionally nothing has changed: this is a trivial refactoring of the code. But the change shows up in diffs. Now, assuming that this change from branch B will never get merged into branch A, the owner of branch B will always see this difference when pulling from branch A and merging into their own workspace. Even if there's only one such trivial change, the owner of branch B needs to resolve conflicts every time they pull from branch A. As long as branches A and B are modified independently, more and more conflicts like this appear. What is the workaround for this situation? Which workflow should the owner of branch B follow to minimize the effort of periodically syncing with branch A?

The owner of branch B should have discussed the change with the owner of branch A. They should have decided that either the change was worth making, in which case it should have been committed to the trunk (A) or it wasn't, in which case it should have never been made. VCS is not a substitute for communication between developers.

Dev B should ask Dev A to pull from him.
The situation is not avoidable if two branches diverge and never converge. It's similar to a fork in a project: it's very difficult to merge two different forks.

Dev B should either un-refactor, or else add some meaningful functionality that has enough momentum to warrant merging back into A. Otherwise tell him "Them's the breaks".
To take an extreme example, he could decide that he doesn't like C and rewrite everything in Pascal, line by line, using the same variable names, class names, etc., to the point that the only difference is the language. So what would you tell him then, when he insists that there aren't any changes, so there should be no diff?

One workaround is to have a fancy diff wrapper around code comparison that will intelligently ignore certain very specific diffs.
On a simple level it's usually implemented via gdiff; on a fancier level, with custom diff code.
However, I must say that such a situation is really a bad thing, and it's much better to merge the changes back into A, for example by submitting a patch to A's maintainers.
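As a rough illustration of the "custom diff code" idea, here is a minimal Python sketch that treats two versions of a file as equivalent when they contain the same blank-line-separated blocks in a different order. The function names and the "one method per block" convention are assumptions made for this example, not part of any existing diff tool.

```python
from __future__ import annotations

from collections import Counter


def split_blocks(source: str) -> list[str]:
    """Split text into blocks separated by one or more blank lines."""
    blocks, current = [], []
    for line in source.splitlines():
        if line.strip():
            current.append(line.rstrip())
        elif current:
            blocks.append("\n".join(current))
            current = []
    if current:
        blocks.append("\n".join(current))
    return blocks


def same_up_to_reordering(a: str, b: str) -> bool:
    """Return True if both texts contain the same blocks, in any order."""
    return Counter(split_blocks(a)) == Counter(split_blocks(b))


# Hypothetical file contents: B only moved one method above the other.
branch_a = "def save():\n    pass\n\ndef load():\n    pass\n"
branch_b = "def load():\n    pass\n\ndef save():\n    pass\n"
print(same_up_to_reordering(branch_a, branch_b))
```

A check like this can only tell you that a difference is a pure reordering; it does not make the version control tool's own merge skip the conflict.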

Related

What is the standard workflow for applying conflicting patches?

This is a programming language and version control system agnostic question.
There is a source code tree and two patches, X and Y. Each of them applies cleanly to the source code tree. But applying one of them (either X or Y first) and then the other results in the second patch failing to apply (the patches conflict).
Is my only option to apply one of them (probably the biggest one, so most of the work gets done automatically), then merge the other one by hand and resolve the conflicts, or are there better tools/practices to handle this scenario?
The goal is to prevent this situation from happening, as there's no easy way to merge.
To avoid it, make small commits with their tests and push them to the source repository. Other people on the team will be forced to pull the latest changes in order to commit their code, and this will ensure that nothing gets broken.
I encourage you to avoid having multiple teams manipulating the same part of the source code. Create a good structure and if possible break down the project into smaller projects.
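To spot likely conflicts before applying anything, one option is to compare which base-file lines each patch touches. The sketch below assumes both patches are unified diffs against the same single file; the helper names are made up for this example. Hunk ranges include context lines, so the check is conservative.

```python
import re

# Matches unified-diff hunk headers such as "@@ -10,7 +10,8 @@".
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+\d+(?:,\d+)? @@", re.MULTILINE)


def touched_ranges(patch_text):
    """Return (first_line, last_line) pairs of base lines covered by each hunk."""
    ranges = []
    for match in HUNK_HEADER.finditer(patch_text):
        start = int(match.group(1))
        count = int(match.group(2)) if match.group(2) is not None else 1
        # A zero-length hunk (pure insertion) is still anchored at `start`.
        ranges.append((start, start + max(count, 1) - 1))
    return ranges


def may_conflict(patch_x, patch_y):
    """Return True if any hunk of X overlaps a hunk of Y on the base file."""
    return any(
        x_start <= y_end and y_start <= x_end
        for x_start, x_end in touched_ranges(patch_x)
        for y_start, y_end in touched_ranges(patch_y)
    )


# Hypothetical hunk headers; the hunk bodies are omitted for brevity.
patch_x = "@@ -10,7 +10,8 @@\n"
patch_y = "@@ -14,6 +14,6 @@\n"
patch_z = "@@ -40,3 +40,4 @@\n"
print(may_conflict(patch_x, patch_y), may_conflict(patch_x, patch_z))
```

Overlapping ranges only indicate that manual merging is probably needed; non-overlapping patches can still conflict semantically.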

Merge in local, base or other?

I have a project in which I've set up a bitbucket repo using mercurial.
There are three of us working on it, so we're using branches.
When we did merges, we did them rather haphazardly, so they often failed.
I'm using Meld, and I don't really know in which pane I should choose the parts of the source code I want to merge.
So, when I merge, where should I do it?
I'm not really sure whether I have to do it in local, base or other, even though I know local corresponds to my latest modifications and other corresponds to the latest modifications on the branch I want to merge. Well, actually, I'm not really sure what other is...
On careful review, I have figured it out: you want to merge into local.
Please correct me if I'm wrong, but I am pretty sure after doing some tests.
Local
The place to merge changes into. These are the local files that will result from the merge, and they will likely already contain a mix of some auto-merged lines.
Base
The common ancestor of the two versions being merged, shown for reference.
Other
The changes you're pulling in from the branch being merged.
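The roles of the three versions can be illustrated with a simplified line-by-line three-way merge. This is only a sketch: it assumes all three versions have the same number of lines, whereas real merge tools first align the texts with a diff algorithm. The names are chosen for this example.

```python
class MergeConflict(Exception):
    """Raised when local and other changed the same line differently."""


def three_way_merge(local, base, other):
    """Merge three equally long lists of lines; base is the common ancestor."""
    if not (len(local) == len(base) == len(other)):
        raise ValueError("this sketch only handles versions of equal length")
    merged = []
    for index, (mine, ancestor, theirs) in enumerate(zip(local, base, other)):
        if mine == theirs:
            merged.append(mine)    # both sides agree
        elif mine == ancestor:
            merged.append(theirs)  # only other changed this line
        elif theirs == ancestor:
            merged.append(mine)    # only local changed this line
        else:
            raise MergeConflict(f"line {index + 1}: {mine!r} vs {theirs!r}")
    return merged


base = ["a", "b", "c"]
local = ["a", "B", "c"]   # my change
other = ["a", "b", "C"]   # their change
print(three_way_merge(local, base, other))
```

Only lines that both sides changed differently relative to base need a human decision, which is why the base version matters even though nothing is written to it.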
This may not be the "right" answer, but when in doubt, I make them ALL match by making them look ALL merged 'correctly' (sometimes I have to discuss with my coworkers what the 'correct' look is based on their changes).
By doing this, I ensure the merge will be successful because meld cannot and will not actually change upstream data in mercurial. So there's no downside. For the life of me I also cannot tell which pane to merge into (mostly because the term 'base vs local' is ambiguous). So this is kind of an odd way to do it, but it works

Merging two branches in both directions in mercurial

I would like to support the following situation:
development happens on two branches - they are both a bit like "default" (actual development happens on feature branches, but they are branched off and merged into one of these two branches)
I would like to merge changes from one branch to the other in both directions without grafting individual commits
the branches have a diff (one big merge of a feature branch) that I would like to always keep and support
I tried to do a dummy merge as described here, first in one direction, then after several successful merges, in another direction (dummy merge of dummy merge). Now I need to do a merge again in another direction, and here another dummy merge (of dummy merge of dummy merge) does not help me any more (and I hoped that one dummy merge would be enough anyway).
Is it possible to do development in this fashion, or is it better to do most of development in one branch? (well, I know it is better for hg, but I have reasons)
Preface
If both branches share the same functionality (unstable common DEVEL), I can't see any reason for such a split, except added headache.
Face
You can avoid merging the unwanted changeset from one branch into the other, and use ordinary merges, if you convert this mergeset into an MQ patch (or maybe a shelveset) and always merge from and to the branch with the patch unapplied (read "Merging patches with new upstream revisions" for a hint on merging to the branch).

Mercurial workflow with stable and default branches

We are trying to migrate from Subversion to Mercurial but we are encountering some problems. First a bit of background:
Desired workflow:
We would like to have just two named branches, stable and default, within one repository.
Development takes place on default branch.
Bug fixes are committed to stable branch and merged to default.
After every Sprint we tag our default branch.
Eventually we can release a new version, for which we bring some code (possibly the latest Sprint tag) from default over to stable (update stable, merge Sprint_xyz), tag the branch (tag Release_xyz) and release.
We also want the following jobs on our Jenkins build server for CI:
End-of-Sprint job: This job should tag default with something like Sprint_xyz
Release job: This job should bring the latest "Sprint" tag changes over to the stable branch, then tag stable with something like Release_6.0.0 and build a release.
Some more background:
Mercurial is new to us, but from what we have seen, this seems like a sane approach. We chose tags to mark releases over named-branches and cloned-branches trying to make the development workflow as straightforward as possible (single merge step, single checkout, only a couple of branches to keep track of...).
We use scrum and potentially (but not necessarily) release a version after each sprint which may (or may not) become part of the stable branch and turn into a "shippable" release.
The problem we are encountering (and which is making us wonder if we are approaching this the right way...) is the following:
We work on the default branch ('d' on the poor-man's-graph that follows):
d -o-o-o-o-
We finish a sprint and trigger an End-of-Sprint job (using Jenkins) which tags default with "Sprint 1":
d -o-o-o-o-o-
           |
       Sprint 1
To release Sprint 1 we update to stable branch ('s') and merge changes from the Sprint 1 tag revision and commit:
       Sprint 1
           |
d -o-o-o-o-o-
            \
s -o-o-o-o-o-o-
Tag stable and commit:
       Sprint 1
           |
d -o-o-o-o-o-
            \
s -o-o-o-o-o-o-o-
               |
           Release 1
Update to default and merge stable since default should stay a superset of stable, commit and push:
       Sprint 1
           |
d -o-o-o-o-o-o-o-o-o-
            \   /
s -o-o-o-o-o-o-o-
               |
           Release 1
The problem is that when merging .hgtags from 's' to 'd', Mercurial encounters a conflict which keeps the release job from completing. The resulting .hgtags should contain information from both involved tags.
We have searched for a solution to this, and could probably automate the resolution of this type of merge conflict with some hooks and scripts, but it looks like an unnecessary and error-prone hack to support a workflow that otherwise seems nothing out of the ordinary.
Is there something inherently wrong with our approach that causes us to encounter these problems?
If not, what is the best way to solve these issues without having to rely on a scripts/hooks approach?
Is there a better approach that would support our workflow?
I would go for the special case hooks. The problem you're seeing is related to the Mercurial philosophy of versioning metadata in the same way as normal repository data. This is simple and effective, and leads to a system that's overall easier to understand. But in this case it also leads to your merge conflict.
The reason it leads to a merge conflict is relatively simple. The .hgtags file is just a text file with a bunch of lines in it. Each line contains a hash and the associated tag. In one branch you've added the Sprint 1 tag. In another branch you've added the Release 1 tag. These show up as one line being added to the end of the file in one branch, and a different line being added to the end of the file in another branch.
Then you merge the two branches. Suddenly Mercurial is faced with a decision. Which line should it take? Should it take both of them? If it were source code, there would really be no way to tell without human intervention.
But it isn't source code. It's a bunch of tags. The rule should be 'if the two lines being added refer to different tags, just take both of them'. But it isn't, because Mercurial is treating it like a bog-standard text file that could be important source code.
Really, the .hgtags file should be handled in a fairly special way for merges. And it might actually be good to add code that handles it that way into mainline Mercurial to support your use-case.
IMHO Mercurial should be modified so that the .hgtags file would only give you a conflict warning if you have two different hashes for the same tag. The other weird case would be if you have a tag with a hash that isn't an ancestor of the change in which the tag appears. That case should be called out somehow when doing a merge, but it isn't really a conflict.
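The rule proposed above could be sketched as a small merge helper, for example one called from a hook or a custom merge tool. This is an illustration only, not how Mercurial itself behaves: it assumes each line is "<hash> <tag name>", ignores lines removed relative to the base, and treats any tag that is newly added with two different hashes as a conflict (so a tag deliberately moved within one branch would also be flagged). The function names and the short hashes are made up.

```python
def parse_tags(text):
    """Parse '.hgtags'-style text into a list of (hash, tag) pairs."""
    pairs = []
    for line in text.splitlines():
        if line.strip():
            node, tag = line.split(" ", 1)  # tag names may contain spaces
            pairs.append((node, tag.strip()))
    return pairs


def merge_tags(base, local, other):
    """Keep the base lines and append every line added on either side."""
    merged = parse_tags(base)
    known = set(merged)
    added = {}  # tag name -> hash, for newly added lines only
    for node, tag in parse_tags(local) + parse_tags(other):
        if (node, tag) in known:
            continue  # already present in base or added by the other side
        if tag in added and added[tag] != node:
            raise ValueError(f"tag {tag!r} added with two different hashes")
        added[tag] = node
        known.add((node, tag))
        merged.append((node, tag))
    return "".join(f"{node} {tag}\n" for node, tag in merged)


# Hypothetical shortened hashes; real .hgtags entries use 40 hex digits.
base = "aaa Old\n"
local = base + "bbb Sprint 1\n"    # tag added on default
other = base + "ccc Release 1\n"   # tag added on stable
print(merge_tags(base, local, other), end="")
```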
I suspect you're merging the tagged changeset from default to stable. If you merge the tagging changeset instead, you shouldn't get the merge conflict when you merge the second (probably also tagging!) changeset back to default.

Branch per promotion tradeoffs in TFS

Suppose we have a big TFS 2010 project with three branches: MAIN, TST and PRD.
The strategy is: whenever a Sprint finishes MAIN is copied/merged into TST. Whenever TST is deemed stable it is copied/merged into PRD. Whenever TST or PRD have fixes, they're merged back to MAIN, or MAIN and TST. (Don't ask me why, I can't control this and I don't particularly like it.)
At each promotion step, as I understand, one can either:
1. delete the target branch and branch again - this entails losing immediate access to that branch's history (it can always be recovered, right?);
2. merge and resolve with acceptTheirs - this entails losing changes that may not have been merged back from target to origin.
For the merge-backs, it is important to have ancestry information. With 1. I would expect ancestry information to be kept. With 2. I am unsure.
So, two questions:
1. Are those two the possible/desirable ways to go about promoting software between branches?
2. In which cases is ancestry information not kept?
Extra points for any additional tradeoffs that might be relevant for big-size repositories.
1. Are those two the possible/desirable ways to go about promoting software between branches?
If MAIN has a child branch TST, which has a child branch PRD then without resorting to baseless merges, these are the only merges possible to promote changes between branches.
Whether this is a desirable branching strategy depends on many factors, such as how many parallel releases are put out and team sizes. A good reference on this is the branching guidance of the TFS Rangers: http://vsarbranchingguide.codeplex.com/ The strategy you seem to be using is a variation of the basic dual branch plan (what you call main, they call dev, and your production branches aren't uniquely labeled). This branching strategy works best if only one version is in production and releases should always contain everything made.
2. In which cases is ancestry information not kept?
If files are copied or branches are destroyed. However, if you need to delete and/or recreate branches all the time, or need to use acceptTheirs continuously, then it's often an indication of an inadequate branching strategy, inadequate TFS training, or issues with the testing and patching strategy (bugs found in production and in development are found and fixed at the same time, resulting in merge conflicts).