Branching vs Merging in source control

Branching vs Merging in source control - version-control

Can anyone explain in plain English the difference between the two approaches? A link to a good clearly written tutorial would suffice also.
thank you.

They are not different approaches, but different sides to the same coin.
If you develop using branches, you need to merge them together eventually.
To answer the question:
A branch is a copy of a selected part of your source code, used for specific work (feature, project whatever).
A merge is the act of bringing two branches into sync and (normally) getting rid of one branch in the process.
From wikipedia, Revision control:
Branch
A set of files under version control may be branched or forked at a point in time so that, from that time forward, two copies of those files may develop at different speeds or in different ways independently of each other.
Merge
A merge or integration is an operation in which two sets of changes are applied to a file or set of files. Some sample scenarios are as follows:
A user, working on a set of files, updates or syncs their working copy with changes made, and checked into the repository, by other users.
A user tries to check-in files that have been updated by others since the files were checked out, and the revision control software automatically merges the files (typically, after prompting the user if it should proceed with the automatic merge, and in some cases only doing so if the merge can be clearly and reasonably resolved).
A set of files is branched, a problem that existed before the branching is fixed in one branch, and the fix is then merged into the other branch.
A branch is created, the code in the files is independently edited, and the updated branch is later incorporated into a single, unified trunk.

Branching is starting a new development line based on the state of the project where you branch from.
Merging is reporting changes from one development line or between development lines to another.
So clearly you need both at different points in time.
Links:
Branching / Merging primer from
microsoft
A more advanced site about source
code management

I don't think you can "oppose" merging vs. branching.
You branch when you want to isolate a development effort, as I explained in "When should you branch?".
The key question to ask, once you have a branch in which you (and other developers) can make evolution to (refactor? enhance? bugfix? ...), is:
"What should you do with the content of that branch?"
In other word, do you leave it alone, or do you re-integrate all its evolutions into another branch. And that is merging.
But merging alone isn't enough. What give merging its purpose is its workflow: from where to where are you merging your code?
Define your workflow, and you will really take advantage of what SCM is about.

In plain English, branching is like "the branch of a tree" where the tree is your software product and the branch is a special part of it that 'extends' from the center-column but has its own features.
In real life you make a branch of your software if you have to support two customers with slightly different whishes. You then have the core-product (A) and the customer-specific-product (B and C) which are basically the core-product with small changes.
Merging is taking this customer-specific branch (B or C), and re-integrating it with your core product (A). This is handy if you implemented a nice feature is branch B, that you'd like to have for everybody, so you merge it into the core product, A.

Related

revision control for many unrelated files

I'm curious to get people's thoughts on how to manage version control for unrelated functions in Matlab.
I keep a reasonably large set of general purpose scripts, each of which is more or less independent of the others. I've been keeping them all in a single directory, containing a single repository in Mercurial. I'm starting to collaborate much more, and I'd like my collaborators to be able modify the files, commit, branch, and merge.
The problem is that the files are independent of one another. Essentially, they're like many separate little projects. But Mercurial treats the repository as a single entity. So if a collaborator modifies file A and B, and I only want to merge in the changes from file A, things get complicated. I know that I could merge from the collaborator, then revert file B, but I'm wondering if there's a simpler way to handle this setup.
I could set up many tiny repositories to manage each file separately, but that also gets complicated.
I'm open to changing version control systems (although I like Mercurial a lot). Any suggestions?

It is considered a best practice to check in code after each bug fix/feature addition/or what not. Given your files are really independent "projects" it seems unlikely a bug or feature would span multiple files. Probably the best you can do is encourage your colleagues in best practices to commit changes only for a single file at once. Explain that better discipline about checking in leads to more manageable source control later. Hopefully you can get most to follow the practice and the few obstinate ones just stop taking their commits for.

It really depends on your typical reasons for merging one change but not the other. If you're using it to create a software configuration, i.e. sometimes you want to use version 1 of file A and version 2 of file B and sometimes it's the other way around, then you probably want to use subrepos to hold each file. If it's because you never want to accept part of a collaborator's change, then they need to be instructed how to make their changes more cohesive and submit them separately. That can sometimes be a difficult concept for people who either haven't used source control before, or who are accustomed to source control like svn that has little or no intrinsic concept of a changeset.

It depends whether you want to maintain a single 'master' version of the files, merging in changes that you like and ignoring others. If collaborators want to develop other branches, then they should perhaps clone the repository, and you can then accept the changesets that you want in the master.
If you want to veto changes by other collaborators, then the changes either need to be kept separate (via a cloned repository or branch) or you need a review process before changes are pushed back to the trunk.

I always use incoming repositories for collaborators. They match what the other person has made, but it avoids messing with my own repository. When you do this, you can then cherrypick their new changesets into your own repository with the transplant extension.

Branching and Merging Strategies

I have been tasked with coming up with a strategy for branching, merging and releasing over the next 6 months.
The complication comes from the fact the we will be running multiple projects all with different code changes and different release dates but approximately the same development start dates.
At present we are using VSS for code management, but are aware that it will probably cause some issues and will be migrating to TFS before new development starts.
What strategies should I be employing and what things should I be considering before setting a plan down?
Sorry if this is vague, feel free to ask questions and I will update with more information if required.

This is the single best source control pattern that I have come across. It emphasizes the importance of leaving the trunk free of any junk (no junk in the trunk). Development should be done in development branches, and regular merges (after the code has been tested) should be made back into the trunk (Pic 1), but the model also allows for source to be patched while still under development (Pic 2). I definitely recommend reading the post in its entirety, to completely understand.
Pic 1
Pic 2
Edit: The pictures are definitely confusing without words. I could explain, but I would basically be copying the original author. Having said that, I probably should have selected a better picture to describe the merge process, so hopefully this helps. I'd still recommend reading the post, however:

The simplest and most usual way I've seen branching work is off two premises. Trunk and Release. I think this is known as the "Unstable trunk, stable branch" philosophy.
Trunk is your main source. This contains the "latest and the greatest" code and is forward looking. It generally isn't always stable.
Release is a one-to-many association with trunk. There is one trunk but many releases that derive from the trunk. Releases generally start with a branch of the trunk once a particular functionality milestone has been hit so the "only" things left to go in for a particular deployment should just be bug fixes. You then branch the trunk, give it a label (e.g. 1.6 Release is our current latest Release), build and send the release to QA. We also push the version number (usually the minor number) of the trunk up at this point to ensure we don't have two releases with the same number.
Then you begin the testing cycle on your release branch. When sufficient testing has been perfomed you apply bug fixes to the release branch, merge these back to the trunk (to ensure bug fixes are carried forward!) and then re-release a build of the branch. This cycle with QA continues until you are both happy and the release is finally given to the customer(s). Any bug reports from the customer(s) that are accurate (i.e. they are a bug!) start another QA cycle with the branch in question.
As you create future releases it is a good idea to also try to move older customers onto newer branches to reduce the potential number of branches you might have to back-patch a bug fix into.
Using this technique you can deploy solutions using your technology to a variety of customers that require different levels of service (starting with least first), you can isolate your existing deployments from "dangerous" new code in the trunk and the worst merge scenario is one branch.

My first recommendation would be to read Eric Sink's Source Control HOWTO - specifically the branches and branch merge chapters.
We have 3 containers - DEV, MAIN, and RELEASE for our work. MAIN contains all our "ready-to-release" code and we tend to think of it as "basically stable." DEV/Iteration (or DEV/Feature, or DEV/RiskyFeatureThatMightBreakSomeoneElse) are branches from MAIN and are merged up when the Iteration/Feature is ready to promote up past the DEV environment. We also have TFS builds set up from the DEV/Iteration branch and the MAIN branch.
Our RELEASE container contains numbered releases (similar to the "tags" container used in many Subversion repositories). We simply take a branch from MAIN each time - I like to say we're "cutting" a RELEASE branch to signify this shouldn't have a lot of activity going on once the merge is finished.
As for VSS->TFS - Microsoft supports an upgrade path which should keep your version history, but if you don't need it the history, I would just get the latest version from VSS, check it into TFS and archive the VSS repository.
One final tip - get your team members familiar with source control. They must understand branching and merging or you will be stuck doing a lot of cleanup work :).
Good luck!

The subversion book describes some common branching patterns. Maybe you can also apply these to TFS.

When to use a Tag/Label and when to branch?

Using TFS, when would you label your code and when would you branch?
Is there a concept of mainline/trunk in TFS?

A label in TFS is a way of tagging a collection of files. The label contains a bunch of files and the version of the file. It is a very low cost way of marking which versions of files make up a build etc.
A branch can be thought of as a copy of the files (of a certain version) in a different directory in TFS (with TFS knowing that this is a branch and will remember what files and versions it was a branch of).
As Eric Sink says, a branch is like a puppy. It takes some care and feeding.
Personally, I label often but branch rarely. I create a label for every build, but only branch when I know that I need to work on a historical version or that I need to work in isolation from the main line of code. You can create a branch from any point in time (and also a label) so that works well and means that we don't have branches lying around that are not being used.
Hope that helps,
Martin.

In any VCS, one usually tags when you want a snapshot of the code, to be kept as reference for the future. You branch when you want to develop a new feature, without disturbing the current code.

Andrew claims that labeling is lazier than branching; it's actually more efficient in most cases, not lazy. Labeling can allow users to grab a project at any point in time, keep a history of files changed for a version or build, and branch off of/work with the code at any point and later merge back into the main branch. Instead of what Andrew said, you're advised to only branch when more than one set of binaries is desired- when QC and Dev development are going on simultaneously or when you need to apply a hotfix to an old version, for example.

I always see labels as the lazy man's branch. If you are going to do something so significant that it requires a full-source label then it is probably best to denote this with a branch so that all tasks associated with that effort are in an organized place with only the effected code.
Branching is very powerful however and something worth learning about. TFS is not the best source control but it is not the worst either. TFS does support the concept of a trunk from which all branches sprout as well.
I would recommend this as a good place to read up on best practices - at least as far as TFS is concerned.

The theory (and terminology) behind Source Control

I've tried using source control for a couple projects but still don't really understand it. For these projects, we've used TortoiseSVN and have only had one line of revisions. (No trunk, branch, or any of that.) If there is a recommended way to set up source control systems, what are they? What are the reasons and benifits for setting it up that way? What is the underlying differences between the workings of a centralized and distributed source control system?

Think of source control as a giant "Undo" button for your source code. Every time you check in, you're adding a point to which you can roll back. Even if you don't use branching/merging, this feature alone can be very valuable.
Additionally, by having one 'authoritative' version of the source control, it becomes much easier to back up.
Centralized vs. distributed... the difference is really that in distributed, there isn't necessarily one 'authoritative' version of the source control, although in practice people usually still do have the master tree.
The big advantage to distributed source control is two-fold:
When you use distributed source control, you have the whole source tree on your local machine. You can commit, create branches, and work pretty much as though you were all alone, and then when you're ready to push up your changes, you can promote them from your machine to the master copy. If you're working "offline" a lot, this can be a huge benefit.
You don't have to ask anybody's permission to become a distributor of the source control. If person A is running the project, but person B and C want to make changes, and share those changes with each other, it becomes much easier with distributed source control.

I recommend checking out the following from Eric Sink:
http://www.ericsink.com/scm/source_control.html
Having some sort of revision control system in place is probably the most important tool a programmer has for reviewing code changes and understanding who did what to whom. Even for single person projects, it is invaluable to be able to diff current code against previous known working version to understand what might have gone wrong due to a change.

Here are two articles that are very helpful for understanding the basics. Beyond being informative, Sink's company sells a great source control product called Vault that is free for single users (I am not affiliated in any way with that company).
http://www.ericsink.com/scm/source_control.html
http://betterexplained.com/articles/a-visual-guide-to-version-control/
Vault info at www.vault.com.

Even if you don't branch, you may find it useful to use tags to mark releases.
Imagine that you rolled out a new version of your software yesterday and have started making major changes for the next version. A user calls you to report a serious bug in yesterday's release. You can't just fix it and copy over the changes from your development trunk because the changes you've just made the whole thing unstable.
If you had tagged the release, you could check out a working copy of it and use it to fix the bug.
Then, you might choose to create a branch at the tag and check the bug fix into it. That way, you can fix more bugs on that release while you continue to upgrade the trunk. You can also merge those fixes into the trunk so that they'll be present in the next release.

The common standard for setting up Subversion is to have three folders under the root of your repository: trunk, branches and tags. The trunk folder holds your current "main" line of development. For many shops and situations, this is all they ever use... just a single working repository of code.
The tags folder takes it one step further and allows you to "checkpoint" your code at certain points in time. For example, when you release a new build or sometimes even when you simply make a new build, you "tag" a copy into this folder. This just allows you to know exactly what your code looked like at that point in time.
The branches folder holds different kinds of branches that you might need in special situations. Sometimes a branch is a place to work on experimental feature or features that might take a long time to get stable (therefore you don't want to introduce them into your main line just yet). Other times, a branch might represent the "production" copy of your code which can be edited and deployed independently from your main line of code which contains changes intended for a future release.
Anyway, this is just one aspect of how to set up your system, but I think giving some thought to this structure is important.

Branching Strategies [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
The company I work for is starting to have issues with their current branching model and I was wondering what different kinds of branching strategies the community has been exposed to?
Are there any good ones for different situations? What does your company use? What are the advantages and disadvantages of them??

Here is the method I've used in the past with good success:
/trunk - bleeding edge. Next major release of the code. May or may not work at any given time.
/branches/1.0, 1.1, etc. Stable maintenance branches of the code. Used to fix bugs, stabilize new releases. If a maintenance branch, it should compile (if applicable) and be ready for QA/shipping at any given time. If a stabilization branch, it should compile and be feature complete. No new features should be added, no refactoring, and no code cleanups. You can add a pre- prefix to indicate stabilization branches vs maintenance branches.
/branches/cool_feature. Used for highly experimental or destructive work that may or may not make it into trunk (or a maintenance branch). No guarantees about code compiling, working, or otherwise behaving sanely. Should last the minimum time as possible before merging into the mainline branch.
/tags/1.0.1, 1.0.2, 1.1.3a, etc. Used for tagging a packaged & shipped release. Never EVER changes. Make as many tags as you want, but they're immutable.

Our repository looks like:
/trunk
/branches
/sandbox
/vendor
/ccnet
/trunk is your standard, bleeding edge development. We use CI so this must always build and pass tests.
/branches this is where we put 'sanctioned' large changes, ie something we KNOW will make it into trunk but may need some work and would break CI. Also where we work on maintenance releases, which have their own CI projects.
/sandbox each developer has their own sandbox, plus a shared sandbox. This is for things like "Lets add a LINQ provider to our product" type of tasks that you do when you are not doing your real work. It may eventually go into trunk, it may not, but it is there and under version control. No CI here.
/vendor standard vendor branch for projects where we compile but it is not code that we maintain.
/ccnet this is our CI tags, only the CI server can write in here. Hindsight would have told us to rename this to something more generic such as CI, BUILDS, etc.

One branch for the active development (/main or master, depending on the jargon)
One branch for each maintenance release -> it will receive only really small fixes, while all major development goes to /main
One branch for each new task: create a new branch to work on every new entry on your Bugzilla/Jira/Rally. Commit often, self document the change using inch pebble checkins, and merge it back to its "parent" branch only when it's finished and well tested.
Take a look at this http://codicesoftware.blogspot.com/2010/03/branching-strategies.html for a better explanation

The first thing: KISS (Keep it simple stupid!)
/branches
/RB-1.0 (*1)
/RB-1.1 (*1)
/RB-2.0 (*1)
/tags
/REL-1.0 (or whatever your version look like e.g. 1.0.0.123 *2)
/REL-1.1
/REL-2.0
/trunk
current development with cool new features ;-)
*1) Keep version maintainable - e.g. Service Packs, Hotfixes, Bugfixes which may be merged to trunk if necessary and/or needed)
*2) major.minor.build.revision
Rules of the thumb:
The Tags folder need not to be checked out
Only few coding in release branches (makes merging simpler) - no code cleanup etc.
Never to coding in tags folder
Never put concrete version information into source files. Use Place-holders or 0.0.0.0 which the build mechanism will replace by the version number you're building
Never put third party libraries into your source control (also no one will add STL, MFC etc. libraries to SVN ;-))
Only commit code that compiles
Prefer using environment variables instead of hard-coded paths (absolute and relative paths)
--hfrmobile

We branch when a release is ready for final QA. If any issues are discovered during the QA process, the bugs are fixed in the branch, validated and then merged to the trunk. Once the branch passes QA we tag it as a release. Any hotfixes for that release are also done to the branch, validated, merged to the trunk and then tagged as a separate release.
The folder structure would look like this (1 QA line, 2 hotfix releases, and the trunk):
/branches
/REL-1.0
/tags
/REL-1.0
/REL-1.0.1
/REL-1.0.2
/trunk

We use the wild, wild, west style of git-branches. We have some branches that have well-known names defined by convention, but in our case, tags are actually more important for us to meet our corporate process policy requirements.
I saw below that you use Subversion, so I'm thinking you probably should check out the section on branching in the Subversion Book. Specifically, look at the "repository layout" section in Branch Maintenance and Common Branch Patterns.

The alternative I'm not seeing here is a "Branch on Change" philosophy.
Instead of having your trunk the "Wild West", what if the trunk is the "Current Release"? This works well when there is only one version of the application released at a time - such as a web site. When a new feature or bug fix is necessary a branch is made to hold that change. Often this allows the fixes to be migrated to release individually and prevents your cowboy coders from accidentally adding a feature to release that you didn't intend. (Often it's a backdoor - "Just for development/testing")
The pointers from Ben Collins are quite useful in determining what style would work well for your situation.

Gnat has written this excellent break down on the various bits of advice your can find on branching strategies.
There's not one branching strategy, it's what works for:
Your team size
Your product and the lifecycle periods
The technology you're using (web, embedded, windows apps)
Your source control, e.g. Git, TFS, Hg
Jeff Atwood's post breaks down a lot of possibilities. Another to add is the concept of promotion (from Ryan Duffield's link). In this setup you have a dev branch, test bracnh and release branch. You promote your code up until it reaches the release branch and is deployed.

We currently have one branch for ongoing maintenance, one branch for "new initiatives" which just means "stuff that will come out sometime in the future; we're not sure when." We have also occasionally had two maintenance branches going on: one to provide fixes for what is currently in production and one that is still in QA.
The main advantage we've seen is the ability to react to user requests and emergencies more rapidly. We can do the fix on the branch that is in production and release it without releasing anything extra that may have already been checked in.
The main disadvantage is that we end up doing a lot of merging between branches, which increases the chance that something will get missed or merged incorrectly. So far, that hasn't been a problem, but it is definitely something to keep in mind.
Before we instituted this policy, we generally did all development in the trunk and only branched when we released code. We then did fixes against that branch as needed. It was simpler, but not as flexible.

The philosophy that we follow at work is to keep the trunk in a state where you can push at any time without drastic harm to the site. This is not to say that the trunk will always be in a perfect state. There will of course be bugs in it. But the point is to never, ever leave it broken drastically.
If you have a feature to add, branch. A design change, branch. There have been so many times where I thought, "oh I can just do this in the trunk it isn't going to take that long", and then 5 hours later when I can't figure out the bug that is breaking things I really wished that I had branched.
When you keep the trunk clean you allow the opportunity to quickly apply and push out bug fixes. You don't have to worry about the broken code you have that you conveniently branched off.

For Subversion, I agree with Ryan Duffield's comment. The chapter he refers to provides a good analyses on which system to use.
The reason I asked is that Perforce provides a completely different way to create branches from SVN or CVS. Plus, there are all the DVCSs that give it's own philosophy on branching. Your branching strategy would be dictated by which tool(s) you're using.
FYI, Svnmerge.py is a tool to assist with merging branches in SVN. It works very well as long as you use it frequently ( every 10-30 ) commits, otherwise the tool can get confused.

No matter which branching pattern chosen, you should try to keep your branches in a binary tree form like this:
trunk - tags
|
next
/ \ \
bugfix f1 f2
/ \ \
f11 f21 f22
Child nodes should only merge with the direct parent.
Try ur best to merge only the whole branch with the parent branch. never merge subfolders within a branch.
You may cherry pick commits when needed as long as you only merge and pick from whole branch.
The next branch in the above figure is only for illustration, you may not need it.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse