How does merging a branch actually work (under the hood)? - merge

this may be a naive question, but, as asked in the object, what is the actual way used by versioning softwares to merge a branch back into the main trunk without generating broken code?
Here's a quick example: I make a branch out of the main trunk for the program "Hello World Power edition". I add support for Klingon. This is a radical change that changes the way the function printHelloWorld() works.
Meanwhile, because of bug #749 that cause "Hello World" to be written "Helo World", the function printHelloWorld() in the main trunk has been changed.
Now, the problem that I see here is: when i merge by branch back to the main trunk i experiment a clash in the function printHelloWorld() within the file sayHello.py
How does a VCS program know how to add the Klingon support from my branch and keep the bug fix in the main trunk? What are the human-driven or software-driven strategies to avoid this?
Thanks in advance.

How does a VCS program know how to add
the Klingon support from my branch and
keep the bug fix in the main trunk?
VCS knows nothing about semantic of your source code it sees it as a bunch of text/binary files. VCS system uses diff / merge algorithms to detect conflicts between yours and current file version. It is your responsibility to resolve such conflicts because only you know semantic of these changes. Some VCSs like SVN would require you to update your working copy with latest changes from trunk before allowing you to commit changes to make sure that nothing is lost.
To make sure that you didn't break anything and all previous bug fixes were not broken you should use code reviews, unit tests and other practices. Continuous integration is a good way to keep software healthy.

In such a case the version control system can't merge automatically, you have to do the merge by hand. Good unit tests will help you to ensure that no functionality is lost.

Before you can merge your branch back to trunk, the version control system will update your working copy with the changes in trunk since you branched out. It will just not allow you to merge without this update. This ensures you get all the bug fixes in the trunk in your next commit.
A good strategy for working on a branch is to port changes in trunk into you branch on very regular basis. This ensures that you don't drift too apart from trunk, leading you to have problems when you eventually merge back into trunk.

Related

Mercurial merge changeset between branches brings all ancestors with it

I have a problem with merging changes between branches.
I have default branch default.
When I'm starting working under new issue I make new branch like #565 from default and working with it. After working I'm merging code changes from #565 into default - it work great.
Also I have branch anotherbranch - this is like production branch, we can merge something there if it was tested on default.
Sometime I need to merge code changes from #565 into anotherbranch. When I trying to do this, Mercurial offer me to merge ALL code changes between #565 AND anotherbranch (because #565 is child of default).
How I can merge ONLY code changes of #565?
Someone suggested the graft command, and that's probably what you'll end up using, but let me take a step back and suggest that this problem will go away if you improve your process a bit.
In general if you fix bugs in new branches branched from anotherbranch and add new features in branches from default you'll have the best of both worlds. Fix bugs off of your "production" branch and then merge that production branch, called anotherbranch, back into default immediately.
That's the usual process and that makes sure that the bug fixes are everywhere but that new features don't hit production until they've gotten good testing.
I think what you need is the graft command ("copy changes from other branches onto the current branch").
The best way to fix this is, as Ry4an suggests, to modify your workflow. You may also want to consider using the mq extension to create patches. Patches are a great way to apply only certain changes to a repository, particularly if you're not ready for a full-on merge.

Subversion in Eclipse - Switching branch with uncommitted work in repository

Suppose I have the following senario:
Trunk is my main development line where programmers commit to when done.
Branch V1.0 is a branch I created when releasing version 1.0.
The programmer is working on the trunk but needs to switch to the branch in order to fix bugs.
When switching back to the trunk Subversion will give me the latest in the SVN repository which does not include the recent changes.
So, in order to switch he will have to commit what he has because otherwise the changes are "lost". I know they are still kept in the local repository but it would still mean restoring them one by one once he switchesback.
Am I missing something here?
Edit:
Now I am thinking down these lines:
Each programmer would have his own "private" development branch off of the trunch. He could commit to there whenever he wants. When he has finished what he has written he can them merge it into the trunk. He starts again for the next assignment.
If at any point he is required to fix a bug in some other release he can just commit to his own private branch, fetch the odl release and fix it. Then, after committing the changes to the fix, he can easily switch back to his own development branch.
Would that work?
I think that subversion is not thought for switching as you described. A solution would be to have two workspaces, one for the branch and another one the trunk, so you can switch from one to another without problems. I know it is not a nice one.
Consider switching to a version control tool that is better suited for your approach. My own suggestion is Mercurial, coupled with MercurialEclipse. The only drawbacks I'm aware of are that Subversion is better suited to store binary files and that Mercurial's subrepositories don't work so well as Subversion's externals.
In Mercurial your programmers would be able to commit their changes to their private repositories, merge and commit locally again, and then push the resulting changes to an official repository, from which other programmers would pull them into their own private repositories.
The best solution for such purposes is to have two separate working copies. One for working with trunk and one working with the branch.

Managing code in Mercurial: how to revert individual files, "tag" it and be able to maintain it

Update: We ended up using a process very much like this schema (thanks to neuro for the link). We massaged out repository into a state where default is stable (and has the same code as our production environment), we have a dev branch, feature branches for new stuff and use release branches for releases. All seems to be working perfectly.
Backstory
Our team has recently switched from using SVN (using ToroiseSVN Windows client) to Mercurial (using TortoiseHg Windows client) for version control. We have successfully exported our SVN repository and imported it into a Mercurial repository.
We now have a Mercurial repository where we can see the entire history of revisions (changesets in Mercurial).
How we did it in the old days
Life was simpler in the old days; our development process was not really multi-stream like it is now. The trunk was used to hold all code - even changes that were still in-flight (as long as it didn't break the trunk). When it came to managing releases with SVN, we would checkout the trunk (which holds all code), revert the individual changes we didn't want as part of the release, and create a tag for it.
Cherrypicking the code we want with SVN was easy. Bug-fixing previous releases and ensuring it was part of the trunk was simple too.
What we are doing now
In Mercurial, we need to be able to get a snapshot of the "trunk" (default in Mercurial) with individual changes reverted out. We can do this using hg revert.
To snapshot this, we have created a "named branch" - let's call it Build-4.0.1 for now.
Where the challenge arises
Development continues on default as normal when a bug is found in Build-4.0.1. Let's assume the bug is in one of the reverted files. We change the code from the branch for Build-4.0.1, create a new "named branch" (Build-4.0.2) and want to merge it back into default without pushing the reverted code over the top of newer code. How can we accomplish this?
Alternatively, is there a better workflow for managing the releases and our code in Mercurial? I quite like the look of this wonderful SO answer on managing release branches, although I am not sure how we can transition to it from the state we are in now (with in-flight stuff in default).
Note: I have looked at the Transplant extension, but haven't used it yet - could it be part of the solution to this challenge?
Well, to begin with, your use of revert seems strange to me. Usually it is used to revert modifications done to the working copy back to the version of the repository.
The usual way to get the working copy to some point backward is to update :
hg update -r 1234
from there, you can tag, modify, commit, etc.
To merge back you only have to merge your release branch to the default branch. It will work like a charm, unless it is to different/old a release.
Transplant works fine, but do something a bit different from merge : it take your changeset as a "diff" and apply it as a new modification.
To manage your releases, you can look this other answer (by me) :
How to use mercurial for release management?
What we use is a clone / main branch that holds the most stable version, which is released at some points. On this clone : branch, we can fix critical bugs (hotfix). In parallel, we use a dev clone / branch to develop. The hotfixes are merge as soon as completed from stable to dev. When the current development is done, we merge the dev on stable / default.
This schema is pretty good to understand things :)
Good luck !
Going through all the changes and taking out the ones you don't want is not a common way of creating a release, to put it mildly. The Common Branching Patterns section in the SVN book suggest some more popular work flows:
release branches: create release branch from unstable trunk, fix bugs to stabilize it, cherry pick bug fixes between them while the branch is in maintenance mode.
feature branches: keep the trunk stable and ready for release by only merging in the feature branches that you want
The second one is probably the best fit here, because it gives you a place to put experimental or risky changes until you feel confident about them - these are the changes you would have reverted before a release in your old workflow.
Both of these branching patterns should carry over just fine to mercurial. In case you go for the first approach, note that mercurial (since 2.0) now has a graft command, you no longer need the transplant extension.

Branch-per-feature workflow using Mercurial

We have team of 10 developers who works parallel for different features, sometimes these features use common code sometime no.
And now we're changing our process to branch-per-feature and it seems mercurial is more suitable for such development.
I see this process so:
1. make release branch (r-b) from default(trunk)
2. make feature branch (f-b) from default(trunk)
When developer thinks his feature is done he can merge f-b to r-b. When it's time to go to QA we merge all finished f-b to r-b and create release for our QAs.
Questions:
When QA finds a bug developer should modify his f-b and merge it again to r-b. Does it mean that developer just switch to his f-b and start fixing the bug and then makes simple merge f-b to r-b again?
When release is passed QA it goes to PROD - how can we freeze changes? "hg tag" is good choice but someone can update tag if he really wants it.
Thanks
If you're going to merging into specific release branches then your feature branches should be branched from the release branch, not the trunk. It is simpler to merge with the parent branch than a non-parent branch.
1) If you really want to do feature branches then each bug would have its own branch. This will help keep bug fixes separate from new features. After all, it's branch-per-feature not branch-per-developer.
2) Hg tag is what I have used. You are right that someone change move a tag if they really want to, but tags are versioned and you can install hooks on the main hg repo to throw alerts if a tag is moved. I really wouldn't worry about tags being moved unless you can't trust your developers, in which case you are screwed.
The answer to your first question is 'yes'.
The best way to freeze for release is to have a separate release clone that only the release manager can push/pull changesets to. Just because you're using branches doesn't mean multiple-clones don't have a place in your workflow. Have a clone that QA does final pre-flight testing on to which developers can't push changes makes for a great firewall.
Also, consider using bookmarks for your feature branches. Since, as I'm sure you know, Mercurial named branch names never go away the git-like bookmarks work well for sort lived concepts like features and bugs.

Please suggest a better workflow in Mercurial

I'm new to Mercurial and what I'm starting to realize is that my basic workflow may not be the most efficient way of working because I perform commits so frequently and for feature improvements that are so small that when I need to find some earlier step to revert to, it is extremely difficult.
Here's what I do after I have a project set up in Mercurial and have already completed my first commit.
Make some changes to a file and get it to a state where a small improvement is working
hg commit -m "improvement A works"
Make some changes to the same file and get it to a state where the next minor improvement is working.
hg commit -m "improvement B works"
Check whether all of the minor improvements add up to a single minor feature working correctly.
hg commit -m "feature A works"
If I find a mistake that was made in "improvement A", I open up history (with the Netbeans Mercurial visual plugin) and copy and paste some of the code back into my current version and start again from there.
This doesn't seem like a good system - I would appreciate any suggestions.
I agree with Jon that branches are the solution, but I would create branches for features rather than for the individual improvements that make up a feature. The workflow pattern would then be this:
Create branch for feature A.
Complete work for improvement A and commit.
Complete work for improvement B and commit.
When the feature seems to be working, merge the feature A branch back into trunk.
If you find a mistake in an improvement A of feature A, instead of starting over, you switch to the feature A branch and do the following:
Fix improvement A and commit.
Merge the feature A branch back into trunk.
You could isolate the changes for the improvements into branches maintaining a stable trunk.
Take a look at Branch wiki page.
Workflow pattern would be:
Create branch for improvement A
Complete work for improvement A and checkin
Test changes and merge back into trunk if successful
Create branch for improvement B
Complete work for improvement B and checkin
Test changes and merge back into trunk if successful
If you find a mistake you can abandon the branch (or correct the bug in the branch prior to merging back into trunk).
I disagree with the branches approach. If no parallel development is needed, why add the complexity of branches? There's nothing wrong with small 'checkpoint' commits. Tags can be used to point to important commits, which might be more clear.