Perforce and Feature Branching - version-control

At our company we use Perforce for source control and version management. Currently we implement feature branching with the following "tree":
Hotfix01 - Hotfix branch
Hotfix02 - Hotfix branch
HotfixNN - Hotfix branch
Main (continuous trunk)
RC - Release Candidate branch (next release)
Working01 - Working branch, feature branch.
Working02 - Working branch, feature branch.
Working03 - Working branch, feature branch.
WorkingNN - Working branch, feature branch.
We set up the branches ahead of time, not that this really matters. Branches off RC are feature branches. Right now we have more than 50 developers, analysts and QA team members working individually or in teams on various projects, defect fixes, etc. When work comes in, you find a free branch (we track that separately), claim it (e.g. Working56), and do a force merge/sync from RC to that branch to make sure it is exactly what is in RC at that moment. You then do your work (code, peer review, QA) while continuously merging changes from RC into your working branch (at least once per day, more often as needed). When you are done, you copy the changes up to RC.
This works, but it means we have (at this time) 300 working branches to manage. We would like to implement feature branching in a saner way: use branch mapping and such to create a branch as needed, named after what it is for, and once it has been merged back to RC have it stop showing up in the depot (at least, not for every developer). Basically we want to see only the feature branches with active development and hide the branches we are done with.
What is the best practice? Can this be done with Perforce using either branches or streams? Are we missing something with branch specifications that would allow this? Should we just not worry and allow thousands or tens of thousands of feature branches to pile up in the depot view under RC? Is the way we are doing it now the best we can hope for?
We've been using Perforce for 10 years and this is still a question that haunts us daily.

Task streams are designed for this.
https://www.perforce.com/perforce/doc.current/manuals/p4v/streams_task.html
Create the task stream, do your work in it, merge/copy back to the parent, then unload the task stream.
Common task stream gotchas:
Do not try to reuse or reparent them. One stream per task! If you try to reuse task streams their beneficial features mostly don't work but you still get all of the limitations.
The modified depot files live on after the task stream is unloaded. Adjust your mental model to use the list of streams as the source of truth for which streams are active, not the folders in the depot tree.
If you can handle those, task streams are hugely useful.
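The lifecycle and the two gotchas above can be modeled in a few lines. This is only a toy sketch in Python, not Perforce's API: `StreamRegistry`, `TaskStream` and the stream names are hypothetical, and `deliver` merely stands in for the real merge/copy back to the parent. It shows the "one stream per task" rule and the idea that the list of active streams, not the depot folder tree, tells you what is live.

```python
from dataclasses import dataclass, field


@dataclass
class TaskStream:
    name: str            # hypothetical task name, e.g. "fix-login-timeout"
    parent: str          # the stream it was branched from, e.g. "RC"
    delivered: bool = False
    unloaded: bool = False


@dataclass
class StreamRegistry:
    streams: dict = field(default_factory=dict)

    def create(self, name, parent):
        # One stream per task: a name is never reused, even after unload.
        if name in self.streams:
            raise ValueError(f"task stream {name!r} already used; create a new one")
        self.streams[name] = TaskStream(name, parent)
        return self.streams[name]

    def deliver(self, name):
        # Stands in for merging/copying the finished work back to the parent.
        self.streams[name].delivered = True

    def unload(self, name):
        stream = self.streams[name]
        if not stream.delivered:
            raise RuntimeError(f"{name!r} has not been copied back to {stream.parent}")
        stream.unloaded = True

    def active(self):
        # Source of truth for "what is in flight": streams that are not unloaded.
        return sorted(s.name for s in self.streams.values() if not s.unloaded)


registry = StreamRegistry()
registry.create("fix-login-timeout", parent="RC")
registry.create("new-report-export", parent="RC")
registry.deliver("fix-login-timeout")
registry.unload("fix-login-timeout")
print(registry.active())  # ['new-report-export']
```

The unloaded stream still exists in the registry, just as the modified depot files live on in Perforce; it simply no longer appears in the active list.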

Related

Trunk Based Development suggestion required

I was looking up good resources on branching strategies after doing feature branching for quite a few years and struggling with lots of branches and merge nightmares. Feature branches did give us good isolation and let us manage releases in a fairly granular way, choosing which features should go into a release. However, the problems they posed (many branches, merge conflicts) far outweighed the benefits.
We work with an Oracle database (with 5000 objects) at the back end. We also have multiple teams working on different areas of the same product. We are using Visual Studio with TFS (no DVCS).
The more branches we create, the more database instances we need to give proper isolation for functional testing in those branches (one db instance per branch), which brings another set of problems.
We are adopting scrum and are searching for a branching model that will suit our release cycle (4 times a year) and CI builds. We are planning to do 5 regular sprints and 1 hardening sprint for each release.
From a feature branching model, we reworked our branching model into a very simple one, as below:
The Development branch works as our "Trunk" (for Trunk Based Development). ALL developers (all teams) commit to this branch (for the quarterly release), testers test in this branch, and the CI server (Jenkins) builds this branch daily. We just need a clean MAIN at any time for safety as the "Single Source of Truth for Last Release", which we often need for several reasons.
The Maintenance branch is our bug fixing (hotfix) branch and is released several times during the year (irrespective of the quarterly release). We prefer not to work directly on the Main branch because we want to keep Main "clean". We do not want to let code go to Main without "manual" / functional testing being done. Once a bug fix release is done, code is merged from Maintenance -> Main -> Development to integrate the bug fixes into Development.
We typically do not require the "Release Branches" suggested in TBD, since we continuously do the bug fixes in the Maintenance branch, release from Maintenance and then merge the changes to Main (and then Development). We maintain only the "Last release"; if a fix to a previous release is required, we create an old release branch from Labels in Main.
Have we modified the Trunk Based Development to an extent that it would pose problems in future? What are your suggestions?
Refer:
http://paulhammant.com/2013/12/04/what_is_your_branching_model
http://paulhammant.com/2013/04/05/what-is-trunk-based-development/#comment-2765204723
You should make a maintenance branch from the tag that was released, and only if you encounter a bug. Actually that's a release branch, and it should be named for the release, say rel_1.1. By the time you've pushed out release 1.2 and it is clear you're not going to roll back, delete rel_1.1.

Simulate a tfs style changeset in the enterprise version of github

I have three environments; dev, test and staging/prod. In our previous model of using Team Foundation Server, we would have three branches of code that matched up to each of these three environments.
Developers would work locally and when they had their code complete, they'd check it into the dev branch. When checking in, TFS automatically creates something called a changeset. This check-in would kick off a build of the files into code which then gets deployed to the dev environment.
When a developer was happy with their code in dev, they'd merge just their changeset into the test branch. They'd pull up a complete list of all of the available changesets that had not been merged into test, select theirs and check those into the test branch. Again, this would kick off a build and the output files would get deployed to test.
Once QA was happy with the changes, the dev would merge this changeset into the prod branch, kicking off a build whose files would be deployed to the staging area. The developer and QA would then promote these files to prod.
All of this would allow multiple developers to work on the same files using this changeset mentality. When a specific changeset (or set of changesets) was merged into another environment, only those changes would get merged.
In my relatively new exposure to git, I cannot seem to find a way to select specific "pull requests" (which I assume is similar to a TFS changeset) from one branch to another branch. When I try to make a pull request from one branch to another branch, it wants to pull in not only my pull request, but every other pull request made in the lower branch by other developers too. What is the magic way to make this happen?
Note: Unfortunately we don't have the notion of a "release". We have five scrum teams working on one website with over 200 pages. Each scrum team has their own sprints and can release multiple scrum stories during their sprint. Internally we have only one DEV environment, one TEST environment and one PROD environment. Not only are our environments used by these five scrum teams, but these DEV/TEST/PROD sites are also used by various other teams for integration efforts with applications we sell and also for customer account management and purchasing. We cannot change that infrastructure.
Note: this is not for a discussion as to if this "changeset" methodology is correct or proper. This is a question of how to achieve this behavior in github/git.
Note: we are a set of scrum-based agile teams. We work from stories. As many as 60 stories can be actively in development at any one time with our large team of 25+ developers. When one story is ready for prod, we promote it to the prod environment as an atomic unit. So think of a changeset as a story.
I have two thoughts:
Don't do it this way. Instead, you should look to git-flow. http://danielkummer.github.io/git-flow-cheatsheet/ and http://nvie.com/posts/a-successful-git-branching-model/ are good explanations. At its core, git-flow is a naming convention for branches, so it's really not tied to git at all. In essence, you have feature branches that each developer or dev team works on. Once they complete a feature, they merge into develop. develop is "done features" -- not "done done" but rather "feature complete." When we deem it time to release, we fork to a new release/someversion branch (name to match the release name), and then work with QA to harden the release. Commits on the release/someversion branch are only bug fixes. Once it's good enough to deploy, we fast-forward the master branch up to the release branch as we push it into production. master then represents what's in production. As we deploy, we also merge release/someversion into develop so the bug fixes get into the mainline of development. The project manager / product owner can then think of the develop branch as "the latest," and developers can continue on their feature branches until they're feature complete. (Hint, make features small -- like an hour or a day. Features are not releases.)
So why is this better than the way you were doing it? If the feature is done, ready enough for QA to start banging on it, it's done enough to be part of the next release. Picking and choosing features around each other will lead you into very subtle and unpredictable bugs. Since you're re-merging at each step, you have the possibility that you'll merge incorrectly, creating a bug. You're also now creating a unique product with each step, so you could get to production with a completely different set of features than you vetted in dev and test. (Will this do bad things? Ask your pharmacist if these drugs interact when taken together.)
Git-flow works great for cadences where you have well coordinated, infrequent, larger releases. As you get closer to continuous delivery, this ceremony will get in your way. At that point, you may choose to flip to GitHub flow or a similar lighter-weight naming convention.
If you're really, really, really (see the above "you shouldn't do it this way" comment) convinced you should do it this way, first, go convince a rubber duck and hopefully you will have talked yourself out of it. If you're still really, really convinced you need to do this, you'll frequently need to squash your commits together creating one large commit for the entire feature, then cherry pick the changeset between the branches.
There are a few disadvantages to this "squash and cherry-pick" approach.
1. You lose history. Since you're squashing the history together, you have to now keep features in very contained bundles, and frequently edit the bundle as a whole. One of the primary premises of source control is you get an audit history -- both to roll back to if something goes wrong, and to reference when you need to learn why something works this way or who to talk to about it. (See "git blame".) When you squash, you intentionally remove that learning tool.
2. You're playing features into place in different orders. So you're frequently doing merges. What makes git so awesome is merging is easy. What makes git merging easy is you do it in small pieces. This methodology of squashing everything associated with this feature into one huge commit and cherry-picking it between branches means you're doing very large merges ... which means it will be hard.
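To make the "squash and cherry-pick" idea concrete, here is a toy model in Python. It is only a sketch under heavy assumptions: a branch is a plain dict of path to content, a commit replaces whole files rather than applying a diff, and the story IDs and file names are made up. In real git the two steps would roughly correspond to `git merge --squash` (or an interactive rebase) followed by `git cherry-pick`.

```python
def squash(story, commits):
    """Collapse every commit belonging to one story into a single changeset."""
    files = {}
    for commit in commits:
        if commit["story"] == story:
            files.update(commit["files"])  # later commits overwrite earlier ones
    return {"story": story, "files": files}


def cherry_pick(branch, changeset):
    """Return a copy of the branch with only this changeset applied."""
    promoted = dict(branch)
    promoted.update(changeset["files"])
    return promoted


# Hypothetical history of the dev branch: two stories interleaved.
dev_commits = [
    {"story": "STORY-1", "files": {"cart.html": "v1"}},
    {"story": "STORY-2", "files": {"login.html": "v1"}},
    {"story": "STORY-1", "files": {"cart.html": "v2", "cart.js": "v1"}},
]

test_branch = {"index.html": "v1"}
test_branch = cherry_pick(test_branch, squash("STORY-1", dev_commits))
print(test_branch)
```

Only STORY-1's files reach the test branch; `login.html` from STORY-2 stays behind. The model also shows the first disadvantage: the squashed changeset keeps the final state of `cart.html` but not the intermediate `v1` commit.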
Yeah, I know you're quite enamored with the way it's always been, and you really don't want someone telling you your baby is ugly. Sorry. Your baby was ugly. On the bright side, it doesn't need to be. Git flow is awesome, and can definitely facilitate the velocity your team needs.
Your previous behavior was dysfunctional, although not unusual: http://nakedalm.com/avoid-pick-n-mix-branching-anti-pattern/
In Git you most likely want to do two things. The first is to follow Git Flow: http://nvie.com/posts/a-successful-git-branching-model/
Once you have this you can look at creating a deployment pipeline for binaries, not for source. You should do a build from MASTER and that build goes through your environments. Happy to discuss here and offline.

Branching/merging strategy when one of multiple changes is not signed-off

I am trying to come up with a good branching and merging strategy for a scenario where multiple features are being tested simultaneously and at least one of them is not signed off by stakeholders. I want to get the signed-off changes pushed through to production with the least effort in terms of SCM operations and retesting.
I am using CVS (and I can't change this), but the question is SCM technology agnostic to certain extent.
Imagine that at any given time there are multiple features being developed on a common baseline. They're all worked on in isolation in their own branches. At some stage a build is pushed to the test environment from the trunk, which contains all the changes that were finished, tested and merged back to trunk. Let's say there are 5 branches merged in total and one is not good enough to pass testing or for some reason is not signed off by the business.
If there was a silver bullet like 'unmerge' operation it would be perfect, but as far as I know there isn't.
My other idea is not to merge all the changes to trunk but to a separate branch forked off trunk and push a build off that branch to test server. If some change is withdrawn for any reason it would require forking a new branch off trunk and merging all but the withdrawn changes. Once all the changes are accepted this temporary branch can be merged to trunk.
I am wondering whether this is overkill, though.
Any other ideas?
While this doesn't really answer your question as asked, you might want to look into the concepts of feature toggles and "branch by abstraction" as discussed in Continuous Delivery. These are a couple of the core concepts that allow iterative mainline/trunk development.
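As a rough illustration of "branch by abstraction" combined with a toggle, here is a minimal Python sketch. All names (`PriceCalculator`, `LegacyCalculator`, `DiscountCalculator`, `make_calculator`) and the 10% discount are hypothetical. The point is that the unfinished change lives on trunk behind an abstraction, and a change that is not signed off is withdrawn by flipping a switch rather than by "unmerging".

```python
from typing import Protocol, Sequence


class PriceCalculator(Protocol):
    """The abstraction that callers depend on."""

    def total(self, amounts_in_cents: Sequence[int]) -> int: ...


class LegacyCalculator:
    """Existing, released behaviour."""

    def total(self, amounts_in_cents):
        return sum(amounts_in_cents)


class DiscountCalculator:
    """New behaviour developed on trunk, possibly not signed off yet."""

    def total(self, amounts_in_cents):
        return sum(amounts_in_cents) * 90 // 100  # hypothetical 10% discount


def make_calculator(signed_off: bool) -> PriceCalculator:
    # The toggle: ship the new code path only once the business has signed off.
    return DiscountCalculator() if signed_off else LegacyCalculator()


def checkout(amounts_in_cents, calculator: PriceCalculator) -> int:
    # Caller code is identical for both implementations.
    return calculator.total(amounts_in_cents)


print(checkout([1000, 2000], make_calculator(signed_off=False)))  # 3000
print(checkout([1000, 2000], make_calculator(signed_off=True)))   # 2700
```

In practice `signed_off` would come from configuration, so the same build can go to test with the feature on and to production with it off.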

TFS -- Sustainability of Cascading Branches

Branching guidance usually describes an immortal "Main" branch, with features branched from Main, and merged back to Main, and Releases branched from Main, with further branches of a Release as necessary for Service Packs, RTMs, etc. The guidance regarding Main is often simplified to "no trash in Main."
I'm working with a group that releases regularly (as often as monthly) and serially. To them it seems unnecessary to ever return work to the Main branch. They use TFS 2010; diagrammatically, their branching structure looks like this:
Daily builds on a branch are made; eventually the branch goes to production. Any hotfixes to a branch are applied directly to that branch, and optionally merged forward to any future in-flight branches.
This group's branching strategy has been described pejoratively as the "Cascading Branches Antipattern." But is it really, given that these branches release to production, and then (usually) have a fairly short time to live?
Is this practice of cascading branches in TFS sustainable over the long term? If not, what are the limits, and when (after how many branches) might they be reached?
Is there any reason to NOT "destroy" Main, R1, R2 (etc.) eventually, or is there a "gotcha" that will prevent destroying and reclamation of space on the SQL server that is hosting the source code repository?
Cascading branches can work. I also can't think of any technical reason why destroying very old (preferably archived) branches would impact the newer cascaded branches. Here are some issues to consider:
Developers have to map a new branch to their workspace after every release.
Developers have to manually move any work to a new branch if they weren't able to check it in before release (vs. just checking in to the same working Dev or Main branch after release.)
If you have one or more developers working in a child branch of Rn and a decision is made to move their work to Rn+1 then a baseless merge will be required to avoid checking into the original parent Rn branch.
MAKE SURE YOU SECURELY LOCK EACH BRANCH after release. All those branches will increase risk of a developer accidentally checking in a change to a released branch.
You need to adjust build definitions and any other path-specific artifacts after each cascade. If all development just works out of Dev (or Main) then the primary workspace and related build/project artifacts remain the same over time.
How do you work on parallel features in isolation when you don't know which feature(s) will ship in Rn? (If you have a Main branch, then you can have multiple child feature dev branches from Main and merge a feature branch only when it is stable and ready to ship in the next release.)
I believe Jeff Levinson did a presentation that described branching evolution starting with single branch, then cascading branch, then Main+Release and a couple variations (while describing pros and cons of each). Check out Branching and Merging Practices - Jeff Levinson (Teched 2010 Video) (or related Branching & Merging PPT).
Enjoy! -Zephan

When is the right time to branch and when is the wrong time?

Is there a specific rule I should be using for when to branch in source control? Branches seem to be expensive because they require that the team have extra knowledge about where the features they want to work on should go.
Our development team sometimes finds itself working on a long term feature and a shorter term feature at the same time. That means we end up with:
Trunk
-Branch A (Short Term)
-Branch B (Long Term)
After they complete we have to merge A in to the trunk, then merge the changes to the trunk back in to B to make sure those edits still function. It's messy.
I am wondering if we can cut down on branches by using Labels (Or tags, or pins or whatever your Source Control Software of choice calls it). Maybe it makes sense to branch for the longer term project, but we could just do the edits for the short term project right in the trunk after applying a label to the stable release. That way we can always retrieve the source code that was stable if we have to do an emergency bug fix, but we don't have to deal with the branch.
What rules do you use to decide when to branch?
One way to reduce branching is to implement new features (especially smaller ones) directly on trunk. This is how we do it:
small features, which are guaranteed to be completed before the next release, are implemented on trunk
for larger features, we create a feature branch ("Branch B" in your example)
once we are ready to create a release, we create a release branch (from trunk), e.g. named "branches/2.x". This branch is then used for testing and finalizing the release.
once the release is built, we tag the corresponding revision from the release branch (e.g. tags/2.0.0).
normal development then continues on the trunk. the release branch is used for maintenance of the 2.x line of the product (e.g. bug fixes are merged from trunk, or implemented directly on that branch)
In a small team, the time to branch is when you can't commit directly into the trunk. With svn (as I guess with other version controls as well), it is possible to postpone the decision to branch till the time one realizes that one cannot commit into the trunk.
To minimize the need to branch, a new feature can be worked on in the trunk itself by keeping the new-feature code behind compile-time or run-time flags. This approach also lets you turn the feature off later if it is not needed, run A/B split testing experiments with the feature, etc.
Of course with this approach it always helps to have a continuous testing that gives an early alert whenever the build/test-suite breaks on the trunk.
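A run-time flag with a percentage rollout (for A/B split testing) could look roughly like the sketch below. It is a minimal Python example, not any particular flag library: the `FLAGS` dict, the flag name `new_checkout` and the function names are assumptions made for illustration. Users are bucketed by hashing their ID so the same user always sees the same variant.

```python
import hashlib

# Hypothetical flag store; in practice this might be a config file or service.
FLAGS = {"new_checkout": {"enabled": True, "rollout_percent": 50}}


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, flags=FLAGS) -> bool:
    config = flags.get(flag)
    if config is None or not config["enabled"]:
        return False  # unknown or switched-off features stay dark
    return bucket(user_id) < config["rollout_percent"]


def render_checkout(user_id: str, flags=FLAGS) -> str:
    # Both code paths live on trunk; the flag picks one at run time.
    if is_enabled("new_checkout", user_id, flags):
        return "new checkout"
    return "old checkout"


everyone = {"new_checkout": {"enabled": True, "rollout_percent": 100}}
nobody = {"new_checkout": {"enabled": False, "rollout_percent": 100}}
print(render_checkout("alice", everyone))  # new checkout
print(render_checkout("alice", nobody))    # old checkout
```

Setting `enabled` to `False` turns the feature off for everyone without touching the branch structure, which is what makes trunk-only development of unfinished features workable.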
For one thing, this depends on the tool you use. Branches are more 'expensive' in Subversion than in Mercurial or git, because merges are harder to do. For another, it depends on your project/organization: you should probably have at least one branch per maintained version.
It depends on the VCS you are using. If you are using a tool that has good support for merging, then you should branch whenever you feel like it. When in doubt, create a new branch. If the UNIX epoch time is even, then you should branch. If it's odd, you should wait a second, and then branch. If you are using a tool that doesn't support merging well, then you should consider changing tools. In other words, stop using a tool that makes it necessary to ask this question.
It’s normally poor practice to develop against the mainline or trunk. The trunk should be used as the master code set and should reflect the code that currently represents production. If you are not in production yet, it should represent the gold copy and should always build and be subjected to automated regression tests. It should not be used to show development status or activity. Protect your trunk from change and resist the temptation to allow developers to check out and lock code on a trunk. The only updates in my view should be via the merge process, when you are ready to repatriate your code to the mainline.
When branching you should consider the purpose, complexity and duration of the development.
• Is it to support a team of developers working on a new feature or a substantial piece of development?
• Are you using traditional processes or the various agile flavors that are out there?
• Is it to accommodate the development of a patch or fix for production?
• What development and in particular, test activity will you accommodate on the branch and will you retain the branch until the derived artifacts are built, tested and deemed releasable?
There are many models out there but few give sufficient consideration to the "build" process and the implications of regenerating your releasable artifacts.
Let’s assume you have the following lifecycle: DEV->SYSTEM-INTEGRATIONTEST->UAT->PRE-PROD->PRODUCTION. Assume you create a branch from mainline to accommodate the development and build processes. Your development/build/test cycle continues right through to UAT. The artifacts produced from this branch have been exposed to sufficient testing to deem them potentially suitable for release. You are able to state that the artifacts signed off by the users were also exposed to system and integration testing.
Some folks advocate merging the source code to the trunk at this point and recommend that you create a RELEASE branch upon a successful trunk rebuild. For me this is fine if the solution is stable and requires no further change prior to production; otherwise you risk propagating bugs elsewhere. Invariably it will need to change.
If you do unearth issues in PRE-PROD, where are these “Fix” changes going to be made? Many suggest that you can make the code changes directly in the release branch. If you proceed, this modification will produce a new build and a new set of artifacts. You may elect to push these artifacts back through PRE-PROD and on to production, on the grounds that the underlying code has been validated through previous testing and the modifications made to stabilize the release are deemed risk free. But you have a problem.
You cannot state that the executables/artifacts released to pre-prod, and subsequently production, have been tested in your lower environments. Despite confidence being high, the output from the release branch build is different from that produced from the development builds. This may fail audit.
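One way to make that audit statement checkable is to record a digest of every artifact that passed testing and refuse to promote anything else. The following is a minimal Python sketch of that idea; the function names and the byte strings standing in for build outputs are hypothetical, and a real pipeline would hash the actual files.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """SHA-256 fingerprint of a build artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def can_promote(candidate: bytes, tested_digests: set) -> bool:
    """Allow promotion only for the exact bytes that were tested."""
    return artifact_digest(candidate) in tested_digests


# Stand-ins for two build outputs (hypothetical content).
dev_build = b"app-1.4.0 built from the development branch"
release_build = b"app-1.4.0 rebuilt from the release branch after a fix"

# Digests recorded when the development build passed SIT and UAT.
tested = {artifact_digest(dev_build)}

print(can_promote(dev_build, tested))      # True
print(can_promote(release_build, tested))  # False
```

The rebuilt release-branch artifact fails the check even though most of its source is unchanged, which is exactly the gap described above.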
Branching for me is about managing your code and not the build output or solely the release. If you advocate branching for release and release stabilization (pre-prod fixing), you must take the above risk combined with the need for significant regression testing into consideration.
On the basis that the trunk should represent production code, you cannot push code to it unless it has been pushed to production first. I advocate creating a branch that supports the development, build and release as a single cycle. To avoid branch longevity and unnecessary divergence from the trunk (and potential big bang conflict issues) limit the development as much as you can and release and repatriate often with the trunk to keep other development efforts current.