Managing software with custom local patches - deployment

I'm trying to find a way to deploy software with custom patches in production. The base software is opensource with their own repos (SVN) and we've got some patches to select only for one service and not the other (so we've got base+patchA+patchB on one server and base+patchA+patchC on another).
Everything will be deployed as packages, which is pretty simple. The issue I'm thinking about is: how do we store the modifications? These are some ways I thought about:
I've tried using quilt / patch series + downloading a specific revision from upstream when building. This works well, until we need to port patches to a new version. Since quilt needs a copy of the tree for patch modifications, it takes a really long time to fix stuff. Also, having different versions deployed on different servers means we need to have different build-scripts.
I could clone the repository into local git/hg and create a branch with local modifications. That's great for porting patches forwards and I can make local release tags on my branch, but unfortunately I lose the information about separate patches. I can see the diff to upstream of course, but I lose the clear separation of different local modifications.
I tried stacked git / hg-mq, but as far as I can see, they don't export the patch series easily. They're good for storing work in progress. I can do the same trick that BitBucket does and create patch queues "inside the repository" to export them, but then every patch queue would be assigned to a branch separately (same modification would be needed in every local branch that uses patchA for example).
Do you have any other ideas, articles, standardised ways of doing this? Additional pros & cons for the ways I mentioned?

I know it's a long time since you asked, but you had the answer right here:
I could clone the repository into local git/hg and create a branch
with local modifications. That's great for porting patches forwards
and I can make local release tags on my branch, but unfortunately I
lose the information about separate patches. I can see the diff to
upstream of course, but I lose the clear separation of different local
modifications.
You still get the clear separation of different local modifications through Git's "commit" mechanism.
The way I see it, a Git "commit" is essentially the same thing as a patch sent by email. That is, it encapsulates a single change with a subject line and descriptive body text explaining the reason for the change. Of course, it also contains the changes to all the relevant files.
The git format-patch command (and its cousin, git send-email) will actually output in this format if you do need to send the changes to an outside system (eg, upstream), but the information is there anyway.

Related

Is is possible to re-integrate a sub-project and re-connect mercurial history?

A while back, we pulled a number of more stable packages out of our main application into separate mercurial repositories. We share them in a limited way with some other clients, who access them via artifactory, although these external clients don't generally bother or have a need to stay up to date with our changes. (They are many months behind and doing fine because it's only a few interfaces that are crossing over.)
It's arguable that splitting into separate repositories has made things less efficient for us in that (a) it's more heavy-weight to make changes to the other jars and we sometimes don't bother and (b) it's harder to peruse the changeset history of a feature that involves changes in two or more repos.
We're considering bringing these back into the main repository, but I'm wondering is there any way now to re-connect the histories when doing so? Ideally I'd like to be able trace the history of a given code file, progresing through recent changes, changes during the separation phase, and hopefully changes from before we split them apart. Is that possible?
I guess you can either pull all separate histories into one repository (hg will complain they're unrelated, but will let you proceed anyway if you insist), and then merge them normally (move them into separate directories in each of their branches, and then merge them in a way files from all branches get placed in a single place), or you can filter the history with convert, export/import and maybe mq, but that will be quite difficult to implement.

Recommended DVCS mechanism for hosting many independent patches

I have a project just getting started at http://sourceforge.net/projects/iotabuildit/ (more details at http://sourceforge.net/p/iotabuildit/wiki/Home/) that is currently using Mercurial for revision control. And it seems like Mercurial and SourceForge almost have all the right features or elements to put together the collaboration mechanism I have in mind for this project, but I think I'm not quite there yet. I want people to be able to submit, discuss and vote on individual changes from a large number of individuals (more developers than a project would normally have). And I want it to be as easy as possible for users to participate in this. The thought right now is that people can clone the "free4all" fork, which is a clone of the base "code" repository, or they can create their own fork in their own SourceForge user project (SourceForge now provides a workspace for every user to host miscellaneous project-related content). Then they can clone that to their local repository (after downloading TortoiseHg or their preferred Mercurial client). Then they can make modifications, commit them, push them to the fork, and request a merge into the base "code" repository, at which point we can discuss/review the merge request. This all is still far too many steps, and more formal than I'd like.
I see there is such a thing as "shelving" in Mercurial, but I don't see how/if that is supported in the SourceForge repository. And there probably isn't a way to discuss shelved changes as there is merge requests.
I'm looking for any suggestions that would make this easier. Ideally, I would like users to be able to:
Specify any version that they would like to play, and have that requested version extracted from source control hosted for the user to play at SourceForge (because the game can't be played locally due to security restrictions the Chrome browser properly applies to javascript code accessing image content in independent files)
Allow the user to download the requested version of the project for local editing (a C# version built from the same source is also playable locally, or Internet Explorer apparently ignores the security restriction, allowing local play in a browser)
Accept submitted modifications in a form that can be merged with any other compatible "branch" or version of the game that has been submitted/posted (ideally this would be very simple -- perhaps used just uploads the whole set of files back to the server and the compare and patch/diff extraction is performed there)
Other players can see a list of available submitted patches and choose any set to play/test with, then discuss and vote on changes.
Clearly some of these requirements are very specific, and I will probably need to write some server side code if I want to reach the ideal goal. But I want to take the path of least resistance and use the technologies available if much of the functionality I need is already almost there. Or I'd like to see if I can get any closer than the process I outlined earlier without writing any server code. So what pieces will help me do this? Does Mercurial & SourceForge support storing and sharing shelved code in the way I would want? Is there something to this "Patch Queue" (that I see, but can't understand or get to work yet) that might help? Is there a way to extract a patch file from a given set of files compared to a specific revision in a repository (on the server side), without having the user download any Mercurial components?
It sounds like something you could do with mercurial queue (mq) patch queues. The patch queue can be is own, separate versioned repository, and people can use 'guards' to apply only the patches they want to try.
But really it sounds even easier to use bitbucket or github, both of which have excellent patch-submission, review, and acceptance workflows built into them.

Migrating from CVS to distributed version control (Mercurial)

Some background: We're working on projects that involve projects across 2 different countries, and we've been using CVS. Developers in the country not hosting the CVS server will take forever to connect to the remote server, so we've set up this system to have 2 separate CVS servers in each country and have a sync job that keeps them in sync every hour or so.
Given this, we're looking at migrating to a distributed version control system, mostly because we've been having problems with the sync job failing and the limitation that for a given set of files only one side can have the writelock for it at a time.
We're currently looking at Mercurial for this purpose, so can anyone help tell us if:
a. Will Mercurial be a good fit for our use case above? How easy will it be for devs to make the transition, i.e. will they still be able to work the same way? etc
b. Can Mercurial support branching a specific folder only?
c. We also hold a lot of binary docs in version control, will they be suitable for Mercurial?
d. Is there support for getting the "writelock" of particular files? i.e. I want no other people to update these particular files while I'm working on them
Thanks!
a/ and d/: yes and no. Yes, a DVCS like Mercurial is a good fit for distributed development, but by nature, there is no more "writelock", since there is no one "central server" which would be notified each time you want to modify anything.
You will pull (or check incomings) regularly from the remote repo.
b/ no, this isn't how a DVCS works, since a branch is no longer a copy of a directory.
c/ binaries are best kept outside a DVCS (since it will be cloned around, and binaries would make its size grow too fast)
See "How is Mercurial/Git worse than Subversion with binary files?"

DVCS strategies for mixed-role small team?

I've done a lot of reading, and have been trialing GIT, GIT Tortoise, Tortoise SVN and PlasticSCM, to find the right source control for our small team (5-10 users).
Some background on our team: 6 copy writers/editors (2 remote), 2 developers, 2 graphic designers. We are not always working on projects together, sometimes up to 5 of us might be working on a given project. I'm unconcerned about the developers with DVCS, my concern is mainly around the other roles who are (in the nicest way) limited in their technical capability. Some of our copy writers update multiple source files (HTML, PDFs and adding concept graphics) to live, unversioned build directories (backed up as build.23.06.11.new.new.final.zip!). The copy and GD team will not have time, or to be brutally honest, the inclination to merge/resolve conflicts, or probably even remember to switch branches.
A few SO questions have shed light on what what seems to be a fairly consistent approach - main trunk (no junk in the trunk!) with teams having their own branches, and having release branches etc.
Every time I've re-read the links...
https://stackoverflow.com/questions/3854583/version-control-system-for-small-in-house-team
Getting started with Version Control
http://svn-ref.assembla.com/subversion-how-tos.html
...and google in general, I still end up asking myself the same questions:
Is it a Bad Idea to create role-specific branches for "trouble points" (copy team), where they can push to the repo, then our developers will merge their work into the actual project branch?
Should I still try to enforce a task-per-branch for everyone else?
Should I do task-per-branch for everyone but let the copy team create very broad tasks?
Is there usually a team/group/person who is considered an "admin" role for a repo who does crucial merges?
(is there an alternative suggested workflow where copy writers don't touch source?)
Unfortunately, the copy teams play a vital role in updating files which in turn affect layouts and all sorts of things, on a continual basis during dev. Its not like I can keep them in a bubble until the end of a project and chuck their work in.
... the good news is that hopefully, after a number of years, I'm ready to force everyone to move to version control! We've also settled on PlasticSCM for its intuitive GUI and Windows integration.
The best answer to this question would try to answer the 4 points above - tackle point 5 if you like - explain weak points if possible, and provide advice, gotchyas, etc.
cheers!
So basically you want to know how to get team-member of different skill-levels to use SCM and play nice with each other.
Buy-in from your team is priority #1. If you can't make them learn it, then you're left with providing a path of least resistance. So you really need to be flexible. There might be a wrong-way and a right-way to use the tool, but if the users won't accept the right-way, then the wrong-way is better than them not using it at all. How you achieve this balance is going to be different for every team.
Is it a Bad Idea to create role-specific branches for "trouble points" (copy team), where they can push to the repo, then our developers will merge their work into the actual project branch?
No, maybe its not optimal, but if this makes it easy for the Copy Team, then thats what you're left with. You could probably go even further and setup each user with their own branch. Then they never have to worry about merging other peoples changes.
Should I still try to enforce a task-per-branch for everyone else?
Each dev should have a unique "local" branch, that is not tracking an upstream branch. For example, use something generic like mydev. This makes it easy for them to switch between their local code and the current upstream branch.
You don't necessarily need to force everyone to create a local branch for every task, cause in the end, you're going to want them just to rebase their working branch onto the upstream one, and commit so it just becomes a fast-forward (i.e. linear commit).
Now for tasks that multiple devs are working on, or it is a feature that involves groups of smaller commits, then yes it does make sense to force them to create a new specific task branch. When they merge they can make sure to force a merge-commit, then it is clear that a set of commits are grouped together and all were part of a specific task. The merge commit will display like merged branch feature-X.
Should I do task-per-branch for everyone but let the copy team create very broad tasks?
It's really up to how much buy-in you can get from the Copy Team. I think if they really get confused with the DVCS tools, then you have to scale back until you can find something that does not cause too much of an impact.
One solution, is to have one of your devs help integrate the Copy Teams changes into another branch that everyone else will look at. That will help offload the learning-curve of the tool onto someone outside of the Copy Team.
Is there usually a team/group/person who is considered an "admin" role for a repo who does crucial merges?
Yes, this makes sense. However the great thing about SCM, is that everyone will be able to go back and do a code review on a merge. So if a merge breaks the code, you can either append the corrections after the merge, or remove the merge, and do it over.
(is there an alternative suggested workflow where copy writers don't touch source?)
Well, one possible technique is the Integration Manager model. The developers commit changes to their own share repos, but its up to the integration manger, to merge in the changes to the blessed repository.
I'm sure there are other methods that might work for your users, but this question is slightly ambiguous.

Good github structure when dealing with many small projects that have a common code base?

I'm working for a web development company and we're thinking about using GitHub for version control. We work with several different .NET-based CMS-platforms and of course with a lot of different customers.
We have a standard code base for each CMS which we start from when building a new site. We of course would like to maintain that and make it possible to merge some changes from a developed site (when the standard code base has been improved in some way).
We often need to make small changes to a published site at a later date and would like to be able to do this with minimal effort (i.e. the customer gladly pays for us to fix his problem in 2 hours, but doesn't want to pay for a 2 hour set up first).
How should we set this up to be able to work in an efficient fashion? I'm not very used to distributed version control (I've worked with CVS, Subversion and Clear Case before), but from my imagination we could:
Set up one repository for each customer, start with a copy of the standard code base and go from there. Lots of repositories of course.
Set up one repository for each CMS and then branch off one branch for each customer. This is probably (?) the best way, but what happens when we have 100 customers (=branches) in the same repository? It also doesn't feel entirely nice that we create a lot of branches that we don't really have any intention of ever merging back to the main branch.
I don't know, perhaps lots of branches is only a problem in my imagination or perhaps there are better ways to do this that I haven't thought about. I would be interested in any experince in similar problems.
Thank you for your time and help.
With Git, several repos make sense for submodules purpose (sharing common component, see nature of Git submodules, in the third part of the answer)
But in your case, one repo with a branch per customer can work, provided you are using the branches to:
isolate some client-specific changes (and long-lived branch with no merging back to master are ok),
while rebasing those same branches on top of master (which contains the common code, with common evolutions needed by all the branches).