Mercurial branches with different codebase [closed] - version-control

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I am struggling with figuring out a good method for handling my work flow using Mercurial. I have read many related questions here on SO and other places but couldn't find a reasonable solution.
Suppose I have two branches:
the default branch I do normal development in. Deploying releases from and marking them with release tags
a branch for a specific customer that is maintained separately. It runs on a separate version number, branched from an older version of the application.
These two branches are mostly identical now but there are already some differences. Over time, they will drift apart more.
It means that I have 4 types of source files:
Type-A: Files that remain identical in both branches (changes, if introduced, should be present in both). These are most of the files.
Type-B: Files that are only in the default branch and do not need to be merged into the customer branch
Type-C: Files that are only in the customer branch and do not need to be merged into the default branch
Type-D: Files that are present in both branches, have shared code but also contain code that should remain separate and specific to each branch
Development done on the default branch and there are regular releases which are mostly incremental. But I also have these two scenarios:
Some changes done to the default branch, need to also be merged to the customer branch (e.g. a bug or feature that need to be fixed/added to both).
A "hotfix" done to the customer branch and cannot be immediately merged to the default branch but needs to be merged eventually at some point.
Problem is that I can't figure out a reasonably simple, clean and safe way to support these two scenarios. Mercurial doesn't have the concept of partial merges or ignoring files on merge. It merges changesets but it insists on including the files that were introduced in previous revisions.
Merging default into customer branch or customer into default in these scenarios forces me to add files of Type-B and Type-C. Making a change to file of Type-A that doesn't need be added to the other branch (making it Type-D) introduces a challenge.
Now obviously I can work around some of these problems by using compiler defines (thus keeping source files the same in both branches), editing code manually and manually removing files after merges but this doesn't feel like the most efficient and clean way to handle this.
Surely this is a common enough work flow that wiser people than me already figured out. Can anyone suggest any method or best practices that can streamline the work in these scenarios? Or is there something fundamentally flawed with my setup?
Also, does Git handles these flows more gracefully?
Thanks.

I would do any development that a the customer requires on the customer branch and merge it into default. This will work for hotfixes and it is simpler than cherry-picking your development changesets from your default branch. It's easier to merge forwards rather than backwards. It also gets rid of Type-B file problem because there are no Type-B files in the customer branch.
Type C files I would merge into default and then delete on the default branch. Any further modifications to these files should generate a warning that file modified on branch was deleted on the other branch.

Separate changesets can be exchanged between branches using hg graft -r SRCREV commands
More complex and probably hard (but also more flexible and manageable) way may be using MQ and more than one queue (Queue per Type?), but even with one queue of MQ-patches with good naming convention you will not lost

Related

What should go in the 'default' branch of a Hg repository? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
In large Libre Source software projects, versioned with Mercurial or similar DVCS tools, which of the following is considered to be more conventional:
Keeping the latest "stable" version of the software in the default branch. Tagging each release in default so you know which revision got packaged up as a download. Merging patches into default as soon as they are tested. Keeping new features, etc. in named branches to be merged into default on the next release.
Keeping each release in a named branch, or similar. Using default to keep bleeding-edge code that's only intended to be run by developers or the very foolhardy.
Or... is there some better pattern of workflow that it widely accepted?
Mercurial has a fairly strong opinion on what you should use your default branch for. It's documented in the Standard Branching wiki page. The summary is:
You should not use a name other than default for your main development branch.
The reason is that default is the branch that is checked out by new clones. If you try to use some other name for your "main" branch, users will get a more or less random branch when they clone and commit things in the wrong place, which is generally undesirable.
Even with tons of documentation that says "branch before adding a new feature" (see next point), people will forget this when they send you patches. They then have the trouble of cleaning up things by moving changesets around.
So always put the bleeding-edge code in the default branch and use other branches for your stable releases.
Don't treat branch names as disposable
Branch names are a permanent part of each commit and allow identifying on which branch each commit was introduced. Thus you will want to give some thought to your branch names so that you don't pollute the branch namespace.
Also, if you attempt to use a branch per bugfix, you may eventually run into performance issues. Mercurial and the tools surrounding it are designed to work well with hundreds of branches. Mercurial itself still works quite well with ten thousand branches, but some commands might show noticeable overhead which you will only see after your workflow alredy stabilized.
We have caches in place internally in Mercurial, so the problems are mostly UI problems: hosting sites and log viewers might run hg branches to load all 10,000 branches into a single drop-down menu. That is really slow and useless for the poor user that want to select a single branch from the gigantic menu.
If the branches are closed, then they wont show up in hg branches, and so the problem should be minimized. However, the tools might want to show closed branches too — it all depends on the tool.
I'm sorry this is a little vague. The main point is that Mercurial is built to scale in the number of changesets, not the number of named branches. We have addressed the biggest performance problems with named branches with the cache I mentioned before, so today I'm not too concerned about having many branches, especially if the number of open branches is kept small (less than, say, 100).
I have fallen into the habit if using default in Mercurial and master in Git for the actual work, the bleeding edge, and using tags and branches for the releases. hgsubversion and Git-Svn seem to take this tack.
There are not, in common, such thing as "most conventional" - each and every workflow is a matter of local convention and development policy in team.
I saw both mentioned policy often, and intermediate variations - also.
In case of strong testing|release policy and intensively used branches ("branch per task") "default" branch often exist only as merges-only branch (merges from feature-branches before QA-testing) and means "Code, which work with finished features, without throwing errors, but with unstested functionality".
Minor versions form named branches, each release in such branch is tag. Bugfix branches are merged after completing into "default" and active versions branches
But this workflow is just one more example, not better|worse than others, suitable for mid-size teams with responsibility separation established, doesn't work well in "chaotic anarchy" development
There's not a huge amount in it. If we're talking about just DEV & STABLE branches, which is default is mainly just a naming convention. I'd tend to have DEV as default, just because most work goes happens on the dev branch and if this is the default branch, it's less hassel.
Personally I prefer a named branch per release. Bugfixes can then go on those branches and be forward ported with relative ease to all releases after using hg merge. If you try to do the same with DEV and STABLE, you can only ever have one maintained release (the last one), or your stable branch starts growing branches and you end up with a (possibly less organised) version of the branch per release structure.

revision control for many unrelated files

I'm curious to get people's thoughts on how to manage version control for unrelated functions in Matlab.
I keep a reasonably large set of general purpose scripts, each of which is more or less independent of the others. I've been keeping them all in a single directory, containing a single repository in Mercurial. I'm starting to collaborate much more, and I'd like my collaborators to be able modify the files, commit, branch, and merge.
The problem is that the files are independent of one another. Essentially, they're like many separate little projects. But Mercurial treats the repository as a single entity. So if a collaborator modifies file A and B, and I only want to merge in the changes from file A, things get complicated. I know that I could merge from the collaborator, then revert file B, but I'm wondering if there's a simpler way to handle this setup.
I could set up many tiny repositories to manage each file separately, but that also gets complicated.
I'm open to changing version control systems (although I like Mercurial a lot). Any suggestions?
It is considered a best practice to check in code after each bug fix/feature addition/or what not. Given your files are really independent "projects" it seems unlikely a bug or feature would span multiple files. Probably the best you can do is encourage your colleagues in best practices to commit changes only for a single file at once. Explain that better discipline about checking in leads to more manageable source control later. Hopefully you can get most to follow the practice and the few obstinate ones just stop taking their commits for.
It really depends on your typical reasons for merging one change but not the other. If you're using it to create a software configuration, i.e. sometimes you want to use version 1 of file A and version 2 of file B and sometimes it's the other way around, then you probably want to use subrepos to hold each file. If it's because you never want to accept part of a collaborator's change, then they need to be instructed how to make their changes more cohesive and submit them separately. That can sometimes be a difficult concept for people who either haven't used source control before, or who are accustomed to source control like svn that has little or no intrinsic concept of a changeset.
It depends whether you want to maintain a single 'master' version of the files, merging in changes that you like and ignoring others. If collaborators want to develop other branches, then they should perhaps clone the repository, and you can then accept the changesets that you want in the master.
If you want to veto changes by other collaborators, then the changes either need to be kept separate (via a cloned repository or branch) or you need a review process before changes are pushed back to the trunk.
I always use incoming repositories for collaborators. They match what the other person has made, but it avoids messing with my own repository. When you do this, you can then cherrypick their new changesets into your own repository with the transplant extension.

Branching vs Merging in source control

Can anyone explain in plain English the difference between the two approaches? A link to a good clearly written tutorial would suffice also.
thank you.
They are not different approaches, but different sides to the same coin.
If you develop using branches, you need to merge them together eventually.
To answer the question:
A branch is a copy of a selected part of your source code, used for specific work (feature, project whatever).
A merge is the act of bringing two branches into sync and (normally) getting rid of one branch in the process.
From wikipedia, Revision control:
Branch
A set of files under version control may be branched or forked at a point in time so that, from that time forward, two copies of those files may develop at different speeds or in different ways independently of each other.
Merge
A merge or integration is an operation in which two sets of changes are applied to a file or set of files. Some sample scenarios are as follows:
A user, working on a set of files, updates or syncs their working copy with changes made, and checked into the repository, by other users.
A user tries to check-in files that have been updated by others since the files were checked out, and the revision control software automatically merges the files (typically, after prompting the user if it should proceed with the automatic merge, and in some cases only doing so if the merge can be clearly and reasonably resolved).
A set of files is branched, a problem that existed before the branching is fixed in one branch, and the fix is then merged into the other branch.
A branch is created, the code in the files is independently edited, and the updated branch is later incorporated into a single, unified trunk.
Branching is starting a new development line based on the state of the project where you branch from.
Merging is reporting changes from one development line or between development lines to another.
So clearly you need both at different points in time.
Links:
Branching / Merging primer from
microsoft
A more advanced site about source
code management
I don't think you can "oppose" merging vs. branching.
You branch when you want to isolate a development effort, as I explained in "When should you branch?".
The key question to ask, once you have a branch in which you (and other developers) can make evolution to (refactor? enhance? bugfix? ...), is:
"What should you do with the content of that branch?"
In other word, do you leave it alone, or do you re-integrate all its evolutions into another branch. And that is merging.
But merging alone isn't enough. What give merging its purpose is its workflow: from where to where are you merging your code?
Define your workflow, and you will really take advantage of what SCM is about.
In plain English, branching is like "the branch of a tree" where the tree is your software product and the branch is a special part of it that 'extends' from the center-column but has its own features.
In real life you make a branch of your software if you have to support two customers with slightly different whishes. You then have the core-product (A) and the customer-specific-product (B and C) which are basically the core-product with small changes.
Merging is taking this customer-specific branch (B or C), and re-integrating it with your core product (A). This is handy if you implemented a nice feature is branch B, that you'd like to have for everybody, so you merge it into the core product, A.

Best practices to keep up a diverging branch of code [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I'm in a situation where some minor patches I've submitted to an open-source project were ignored or explicitly not accepted.
I consider them useful, but more important is that I need the functionality they implement.
I don't want to push my ideas and suggestions anymore to the main contributors, because I don't want to turn this into an ego issue. I've decided that my best bet would be just to use what I wrote for my own purposes. I don't want to fork the whole source code tree because I like how things are generally working, I'm just not happy with details.
But I do realize that the project will evolve and I would like to use the new features that will eventually appear. I understand that I'll have to merge all new things into my own source tree. Are there any best practices for this scenario?
The standard approach is to maintain a vendor branch in your repository. The idea is that you import a pristine copy of the original sources (called a vendor drop) into your local repository, and store it on a branch. This is the version of the code prior to applying your mods. You tag that with the version, then copy it to the main trunk and apply your patches.
When subsequent new versions of the vendor code are released, you check out the vendor branch (without your mods), and overlay the new version on top. Finally you merge the new branch with your mods, checking that they are still applicable/relevant, and you're ready to go again.
There can be complications e.g. with files being renamed, deleted etc. The script svn_load_dirs.pl, that comes with Subversion, can help with this by allowing you to identify files which have changed name and automating some of the bureaucracy.
This approach is discussed in detail (and much more clearly) in the Subversion book, under the section Vendor Branches.
If you are using Git, or could get used to use it, perhaps you should take a look at Stacked Git or Guilt.
They are layers on top of Git to keep track of patches.
Generally you would create a branch of the project in the repository. In your specific case, I would create a branch of just the directory that contains your code if that is possible. A lot of repositories like subversion will allow you to then check out your branch along side the main trunk. That should allow you to maintain your patches and ensure that they work with future changes that are made to the trunk.

Branching Strategies [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
The company I work for is starting to have issues with their current branching model and I was wondering what different kinds of branching strategies the community has been exposed to?
Are there any good ones for different situations? What does your company use? What are the advantages and disadvantages of them??
Here is the method I've used in the past with good success:
/trunk - bleeding edge. Next major release of the code. May or may not work at any given time.
/branches/1.0, 1.1, etc. Stable maintenance branches of the code. Used to fix bugs, stabilize new releases. If a maintenance branch, it should compile (if applicable) and be ready for QA/shipping at any given time. If a stabilization branch, it should compile and be feature complete. No new features should be added, no refactoring, and no code cleanups. You can add a pre- prefix to indicate stabilization branches vs maintenance branches.
/branches/cool_feature. Used for highly experimental or destructive work that may or may not make it into trunk (or a maintenance branch). No guarantees about code compiling, working, or otherwise behaving sanely. Should last the minimum time as possible before merging into the mainline branch.
/tags/1.0.1, 1.0.2, 1.1.3a, etc. Used for tagging a packaged & shipped release. Never EVER changes. Make as many tags as you want, but they're immutable.
Our repository looks like:
/trunk
/branches
/sandbox
/vendor
/ccnet
/trunk is your standard, bleeding edge development. We use CI so this must always build and pass tests.
/branches this is where we put 'sanctioned' large changes, ie something we KNOW will make it into trunk but may need some work and would break CI. Also where we work on maintenance releases, which have their own CI projects.
/sandbox each developer has their own sandbox, plus a shared sandbox. This is for things like "Lets add a LINQ provider to our product" type of tasks that you do when you are not doing your real work. It may eventually go into trunk, it may not, but it is there and under version control. No CI here.
/vendor standard vendor branch for projects where we compile but it is not code that we maintain.
/ccnet this is our CI tags, only the CI server can write in here. Hindsight would have told us to rename this to something more generic such as CI, BUILDS, etc.
One branch for the active development (/main or master, depending on the jargon)
One branch for each maintenance release -> it will receive only really small fixes, while all major development goes to /main
One branch for each new task: create a new branch to work on every new entry on your Bugzilla/Jira/Rally. Commit often, self document the change using inch pebble checkins, and merge it back to its "parent" branch only when it's finished and well tested.
Take a look at this http://codicesoftware.blogspot.com/2010/03/branching-strategies.html for a better explanation
The first thing: KISS (Keep it simple stupid!)
/branches
/RB-1.0 (*1)
/RB-1.1 (*1)
/RB-2.0 (*1)
/tags
/REL-1.0 (or whatever your version look like e.g. 1.0.0.123 *2)
/REL-1.1
/REL-2.0
/trunk
current development with cool new features ;-)
*1) Keep version maintainable - e.g. Service Packs, Hotfixes, Bugfixes which may be merged to trunk if necessary and/or needed)
*2) major.minor.build.revision
Rules of the thumb:
The Tags folder need not to be checked out
Only few coding in release branches (makes merging simpler) - no code cleanup etc.
Never to coding in tags folder
Never put concrete version information into source files. Use Place-holders or 0.0.0.0 which the build mechanism will replace by the version number you're building
Never put third party libraries into your source control (also no one will add STL, MFC etc. libraries to SVN ;-))
Only commit code that compiles
Prefer using environment variables instead of hard-coded paths (absolute and relative paths)
--hfrmobile
We branch when a release is ready for final QA. If any issues are discovered during the QA process, the bugs are fixed in the branch, validated and then merged to the trunk. Once the branch passes QA we tag it as a release. Any hotfixes for that release are also done to the branch, validated, merged to the trunk and then tagged as a separate release.
The folder structure would look like this (1 QA line, 2 hotfix releases, and the trunk):
/branches
/REL-1.0
/tags
/REL-1.0
/REL-1.0.1
/REL-1.0.2
/trunk
We use the wild, wild, west style of git-branches. We have some branches that have well-known names defined by convention, but in our case, tags are actually more important for us to meet our corporate process policy requirements.
I saw below that you use Subversion, so I'm thinking you probably should check out the section on branching in the Subversion Book. Specifically, look at the "repository layout" section in Branch Maintenance and Common Branch Patterns.
The alternative I'm not seeing here is a "Branch on Change" philosophy.
Instead of having your trunk the "Wild West", what if the trunk is the "Current Release"? This works well when there is only one version of the application released at a time - such as a web site. When a new feature or bug fix is necessary a branch is made to hold that change. Often this allows the fixes to be migrated to release individually and prevents your cowboy coders from accidentally adding a feature to release that you didn't intend. (Often it's a backdoor - "Just for development/testing")
The pointers from Ben Collins are quite useful in determining what style would work well for your situation.
Gnat has written this excellent break down on the various bits of advice your can find on branching strategies.
There's not one branching strategy, it's what works for:
Your team size
Your product and the lifecycle periods
The technology you're using (web, embedded, windows apps)
Your source control, e.g. Git, TFS, Hg
Jeff Atwood's post breaks down a lot of possibilities. Another to add is the concept of promotion (from Ryan Duffield's link). In this setup you have a dev branch, test bracnh and release branch. You promote your code up until it reaches the release branch and is deployed.
We currently have one branch for ongoing maintenance, one branch for "new initiatives" which just means "stuff that will come out sometime in the future; we're not sure when." We have also occasionally had two maintenance branches going on: one to provide fixes for what is currently in production and one that is still in QA.
The main advantage we've seen is the ability to react to user requests and emergencies more rapidly. We can do the fix on the branch that is in production and release it without releasing anything extra that may have already been checked in.
The main disadvantage is that we end up doing a lot of merging between branches, which increases the chance that something will get missed or merged incorrectly. So far, that hasn't been a problem, but it is definitely something to keep in mind.
Before we instituted this policy, we generally did all development in the trunk and only branched when we released code. We then did fixes against that branch as needed. It was simpler, but not as flexible.
The philosophy that we follow at work is to keep the trunk in a state where you can push at any time without drastic harm to the site. This is not to say that the trunk will always be in a perfect state. There will of course be bugs in it. But the point is to never, ever leave it broken drastically.
If you have a feature to add, branch. A design change, branch. There have been so many times where I thought, "oh I can just do this in the trunk it isn't going to take that long", and then 5 hours later when I can't figure out the bug that is breaking things I really wished that I had branched.
When you keep the trunk clean you allow the opportunity to quickly apply and push out bug fixes. You don't have to worry about the broken code you have that you conveniently branched off.
For Subversion, I agree with Ryan Duffield's comment. The chapter he refers to provides a good analyses on which system to use.
The reason I asked is that Perforce provides a completely different way to create branches from SVN or CVS. Plus, there are all the DVCSs that give it's own philosophy on branching. Your branching strategy would be dictated by which tool(s) you're using.
FYI, Svnmerge.py is a tool to assist with merging branches in SVN. It works very well as long as you use it frequently ( every 10-30 ) commits, otherwise the tool can get confused.
No matter which branching pattern chosen, you should try to keep your branches in a binary tree form like this:
trunk - tags
|
next
/ \ \
bugfix f1 f2
/ \ \
f11 f21 f22
Child nodes should only merge with the direct parent.
Try ur best to merge only the whole branch with the parent branch. never merge subfolders within a branch.
You may cherry pick commits when needed as long as you only merge and pick from whole branch.
The next branch in the above figure is only for illustration, you may not need it.