Recommended DVCS mechanism for hosting many independent patches

I have a project just getting started at http://sourceforge.net/projects/iotabuildit/ (more details at http://sourceforge.net/p/iotabuildit/wiki/Home/) that is currently using Mercurial for revision control. It seems like Mercurial and SourceForge almost have all the right features or elements to put together the collaboration mechanism I have in mind for this project, but I think I'm not quite there yet. I want people to be able to submit, discuss and vote on individual changes from a large number of individuals (more developers than a project would normally have), and I want it to be as easy as possible for users to participate.
The thought right now is that people can clone the "free4all" fork, which is a clone of the base "code" repository, or they can create their own fork in their own SourceForge user project (SourceForge now provides a workspace for every user to host miscellaneous project-related content). They then clone that to their local repository (after downloading TortoiseHg or their preferred Mercurial client), make modifications, commit them, push them to the fork, and request a merge into the base "code" repository, at which point we can discuss/review the merge request. That is still far too many steps, and more formal than I'd like.
I see there is such a thing as "shelving" in Mercurial, but I don't see how/if that is supported in the SourceForge repository. And there probably isn't a way to discuss shelved changes the way there is for merge requests.
I'm looking for any suggestions that would make this easier. Ideally, I would like users to be able to:
Specify any version that they would like to play, and have that requested version extracted from source control and hosted at SourceForge for the user to play (because the game can't be played locally due to security restrictions the Chrome browser properly applies to JavaScript code accessing image content in independent files)
Allow the user to download the requested version of the project for local editing (a C# version built from the same source is also playable locally, or Internet Explorer apparently ignores the security restriction, allowing local play in a browser)
Accept submitted modifications in a form that can be merged with any other compatible "branch" or version of the game that has been submitted/posted (ideally this would be very simple -- perhaps the user just uploads the whole set of files back to the server and the compare and patch/diff extraction is performed there)
Other players can see a list of available submitted patches and choose any set to play/test with, then discuss and vote on changes.
Clearly some of these requirements are very specific, and I will probably need to write some server-side code if I want to reach the ideal goal. But I want to take the path of least resistance and use the technologies available if much of the functionality I need is already almost there. Or I'd like to see if I can get any closer than the process I outlined earlier without writing any server code. So what pieces will help me do this? Do Mercurial and SourceForge support storing and sharing shelved code in the way I would want? Is there something to this "Patch Queue" (that I see, but can't understand or get to work yet) that might help? Is there a way to extract a patch file from a given set of files compared to a specific revision in a repository (on the server side), without having the user download any Mercurial components?

It sounds like something you could do with Mercurial Queues (MQ) patch queues. The patch queue can be its own, separate versioned repository, and people can use 'guards' to apply only the patches they want to try.
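For example, a rough sketch of the MQ/guards mechanics (the patch name and the "jump" guard here are made up for illustration):

    # enable MQ in the repo's .hg/hgrc (or ~/.hgrc):
    #   [extensions]
    #   mq =
    hg qinit -c                          # create a patch queue that is itself a versioned repo
    hg qnew feature-jump.patch           # capture a change as a named patch
    hg qguard feature-jump.patch +jump   # guard it so it only applies when "jump" is selected
    hg qpop -a                           # unapply all patches
    hg qselect jump                      # choose which guards are active
    hg qpush -a                          # reapply only the patches whose guards match
    hg commit --mq -m "Add jump patch"   # commit the queue repo itself so it can be shared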
But really, it sounds even easier to use Bitbucket or GitHub, both of which have excellent patch-submission, review, and acceptance workflows built into them.
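As for extracting a patch on the server from a set of files a user uploads, without the user installing any Mercurial component, one rough sketch (the paths and the revision are placeholders, and this assumes the repository lives on the server) would be to export the requested revision and diff the upload against it:

    # export a clean copy of revision REV from the server-side repository
    hg -R /path/to/repo archive -r REV /tmp/base-REV
    # compare the user's uploaded tree against it and keep the result as a patch
    diff -ruN /tmp/base-REV /tmp/uploaded-files > /tmp/submission.patch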

Automatically store each compile on github?

My code is compiled by multiple people (multiple machines) multiple times per day. I would like to have each compile automatically uploaded to our Github account each time someone completes a compile. Basically, these compiled zips get sent to actual hardware via either flash drive or email or dropbox (any number of ways based on many conditions). Since the zip is always named the same, sometimes the old version is deleted on the device, sometimes stored in an /old directory. I would like to stop losing old versions and retain a central repository of each version stored chronologically. Github seems the perfect place for that.
I could of course ask each user to upload the finished zip that they created to a central location, but I would like for it to be an automatic process if possible. So - does Github offer a feature like that?
Github seems the perfect place for that.
Not really, since putting large binaries in a distributed repo (i.e., a repo which is cloned around in its entirety) is not a good idea.
To get a specific version of your binary, you would have to clone your "artifacts" repo from GitHub, before being able to select the right one to deploy.
And with the multiple deliveries, that repo would get bigger and bigger, making the clone longer.
However, if you have only one place to deploy, it is true that a git fetch would only get the new artifacts (incremental update).
But:
GitHub doesn't offer unlimited space (and again, that repo would grow rapidly with all the deliveries)
cleaning a repo (i.e. deleting old versions of binaries you might not need) is hard.
So again, using GitHub or any other DVCS (Distributed VCS) for delivery purposes doesn't seem adequate.
If you can set up a public artifact repository like Nexus, then you will be able to deliver as many binaries as you want, and you will be able to clean them up (delete them) easily.
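For instance, with a Nexus "raw"-style hosted repository (the URL, repository name, version and credentials below are placeholders), publishing and later fetching a build can be as simple as:

    # publish a versioned build
    curl -u deployer:secret --upload-file build.zip \
      "https://nexus.example.com/repository/builds/myapp/1.2.3/build.zip"
    # fetch that specific build later
    curl -O "https://nexus.example.com/repository/builds/myapp/1.2.3/build.zip"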
GitHub has the concept of files attached to a repository (not actually in the repo itself - they're stored on S3), and there's an API for uploading files to it.
You could call that API as part of your build process.
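The original per-repository downloads mechanism has since been retired; the closest equivalent today is uploading an asset to a release. A sketch with curl (OWNER, REPO, RELEASE_ID and the token are placeholders you would fill in from your own account):

    # upload build.zip as an asset of an existing release
    curl -X POST \
      -H "Authorization: token $GITHUB_TOKEN" \
      -H "Content-Type: application/zip" \
      --data-binary @build.zip \
      "https://uploads.github.com/repos/OWNER/REPO/releases/RELEASE_ID/assets?name=build.zip"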
If you have a continuous integration server building your code after every commit, you should be able to get it to store the build products somewhere, but you might have to handle the integration yourself if you want them stored on GitHub (as opposed to on the CI server's hard disk).
While GitHub is perfect for collaborating on the sources, it is not for managing the build and its artifacts. You may eventually want to look at companies like CloudBees, which provide hosted build and integration environments that target exactly the workflow parts beyond source management. But those are mostly targeted towards Java development, which may or may not fit your needs.
Besides that, if you really only want to have a lot of time-stamped zip files from your builds accessible by a lot of people, wouldn't a good old-fashioned FTP server be enough for your needs?

Source control system to branch by user instead of version

Once again, I'm a bit stumped about the best stack-exchange site on which to post this question. But I think developers are best suited to answer questions about source control, so here it is.
I am considering a crowd-sourced, user-rated game development project and am wondering what, if any, source control and merging systems might best be capable of hosting the kinds of source control I'm interested in. By user-rated, I mean that there will be some kind of rating/voting system like that found here on StackOverflow. For some details on the project idea, you can read my posting about it at http://gamedev.enigmadream.com/index.php?topic=1589.0. What I think I need is:
Ability to branch by user and maximize merging capabilities. I know source control systems are mainly focused on branching by version, and we could maybe think of each user maintaining their own version. But I guess we need some really robust merging capabilities to maximize the abilities of one user to merge changes from another user into their own branch, for example. So I think I would like the ability for "cross-branch" merging without having to merge into the common root branch first. (I'm most familiar with Team Foundation Server (TFS), which doesn't easily support this.)
Massive branching and merging. If there are hundreds or thousands of people wanting to incorporate their own changes into the project, there could be a lot of branches, and the system would need to be able to handle that without a meltdown. A single user might want to create multiple branches deriving from multiple other users' branches under their own name too, ideally, with the ability to merge among them to some extent.
Permission control by branch. I see SourceForge supports Subversion and Mercurial, but does not currently support permission controls by path/branch on these (as far as I can tell), although that does appear to be a feature under consideration. Users should be limited from pushing their code into other branches. I suspect the normal operations for a user would be pulling edits from other branches into their own branch, and checking in additional changes in their own branch.
A voting system. I know I shouldn't expect a source control system to support voting natively, but anything that could contribute to making this possible would be helpful. For example, maybe a voting system would involve or rely on the ability to label the best edits from various branches and pull them into a single file based on a label or a set of labels. And anything that would assist in merging the results of a selected set of labels from various branches (perhaps applying a new label to the set) could help too.
Very few files and possibly no directories. I would be willing to give up the ability to manage a large number of files or directories in exchange to gain any of the above because the format for the game file I'm considering is generally contained in a single text (XML or HTML5 -- haven't decided yet) file. But this does mean that the system should be pretty good at merging edits to relatively large text files efficiently. I know Team Foundation Server does a pretty good job of maintaining just changes to a file. I hope other source control systems do at least as well.
Or is source control not the proper paradigm to be talking about here? Is there some other technology ideal for merging code like this, one that doesn't involve source control and/or branching the way I'm thinking about it?
Any VCS, because "...source control systems are mainly focused on branching by version..." is just wrong; VCSs support diverging changes of code over time, nothing more and nothing less.
Any DVCS, because they have reasonably good branch-merge capabilities from the ground up.
Mercurial, which has branch-level ACLs; SVN has path-based ACLs. And because a Subversion branch is (to some degree) just a path in the repository tree, ACLs can be applied to any part of the subtree, i.e. to branches as well (see the sketch after this list).
Any code review tool, integrated with the VCS and adapted to your specific requirements.
Fossil SCM, which is a single-file portable executable with the repo in one file; any DVCS likewise adds only a single repo directory to the existing tree and handles big files without headache.
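To illustrate the path-based ACL point, a minimal Subversion authz fragment might look like this (the group, user and branch names are invented; Mercurial's bundled acl extension can enforce something similar per branch on push):

    # conf/authz for svnserve or mod_authz_svn
    [groups]
    alice_team = alice

    [/branches/alice]
    @alice_team = rw
    * = r

    [/trunk]
    * = r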

I want to separate binary files (media) from my code repositories. Is it worth it? If so, how can I manage them?

Our repositories are getting huge because of the tons of media we have (hundreds of 1 MB JPEGs, hundreds of PDFs, etc.).
Because of this, our developers have to wait an abnormally long time when checking out certain repos.
Has anyone else had this dilemma before? Am I going about it the right way by separating code from media? Here are some issues/worries I had:
If I migrate these into a media server then I'm afraid it might be a pain for the developer to use. Instead of making updates to one server, they will now have to update two servers if they are doing both programming logic and media updates.
If I migrate these into a media server, I'll still have to revision control the media, no? So the developer would have to commit code updates and commit media updates.
How would the developer test locally? I could make my site use absolute urls, eg src="http://media.domain.com/site/blah/image.gif", but this wouldn't work locally. I assume I'd have to change my site templating to decide whether it's local/development or production and based on that, change the BASE_URL.
Is it worth all the trouble to do this? We deal with about 100-150 sites (not a dozen or so major sites), so we have around 100-150 repositories. We won't have the time or resources to change existing sites, and we can only implement this on brand new sites.
I would still have to keep scripts that generate media ( pdf generators ) and the generated media on the code repository, right? It would be a huge pain to update all those pdf generators to POST files to external media servers, and an extra pain taking caching into account.
I'd appreciate any insight into the questions I have regarding managing media and code.
First, yes, separating media and generated content (like the generated pdf) from the source control is a good idea.
That is because of:
disk space and checkout time (as you describe in your question)
the lack of VCS features actually usable with this kind of file (no diff, no merge, only labels and branches)
That said, any transition of this kind is costly to put in place.
You need to separate the release management process (generating the right files at the right places) from the development process (getting the right material from one or two referentials to develop/update your projects).
Binaries fall generally into two categories:
non-generated binaries:
They are best kept in an artifact repository (like Nexus for instance), under a label that would match the label used for the text sources in a VCS
generated binaries (like your pdf):
ideally, they shouldn't be kept in any repository, but only generated during the release management phase in order to be deployed.

Managing software with custom local patches

I'm trying to find a way to deploy software with custom patches in production. The base software is open source with its own repos (SVN), and we've got some patches to select only for one service and not the other (so we've got base+patchA+patchB on one server and base+patchA+patchC on another).
Everything will be deployed as packages, which is pretty simple. The issue I'm thinking about is: how do we store the modifications? These are some ways I thought about:
I've tried using quilt / patch series + downloading a specific revision from upstream when building. This works well, until we need to port patches to a new version. Since quilt needs a copy of the tree for patch modifications, it takes a really long time to fix stuff. Also, having different versions deployed on different servers means we need to have different build-scripts.
I could clone the repository into local git/hg and create a branch with local modifications. That's great for porting patches forwards and I can make local release tags on my branch, but unfortunately I lose the information about separate patches. I can see the diff to upstream of course, but I lose the clear separation of different local modifications.
I tried stacked git / hg-mq, but as far as I can see, they don't export the patch series easily. They're good for storing work in progress. I can do the same trick that BitBucket does and create patch queues "inside the repository" to export them, but then every patch queue would be assigned to a branch separately (same modification would be needed in every local branch that uses patchA for example).
Do you have any other ideas, articles, standardised ways of doing this? Additional pros & cons for the ways I mentioned?
I know it's a long time since you asked, but you had the answer right here:
I could clone the repository into local git/hg and create a branch with local modifications. That's great for porting patches forwards and I can make local release tags on my branch, but unfortunately I lose the information about separate patches. I can see the diff to upstream of course, but I lose the clear separation of different local modifications.
You still get the clear separation of different local modifications through Git's "commit" mechanism.
The way I see it, a Git "commit" is essentially the same thing as a patch sent by email. That is, it encapsulates a single change with a subject line and descriptive body text explaining the reason for the change. Of course, it also contains the changes to all the relevant files.
The git format-patch command (and its cousin, git send-email) will actually output in this format if you do need to send the changes to an outside system (eg, upstream), but the information is there anyway.
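For example (the branch names below are just placeholders for your local patch branch and the upstream branch):

    # write each local commit that is not in upstream as its own numbered patch file
    git format-patch upstream/master..local-patches -o patches/
    # or mail them straight to the upstream list
    git send-email patches/*.patch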

Best practice to maintain source code under version control with multiple companies?

I'm wondering if there is any best practice for maintaining your source code under version control among different companies. In Open Source there is a maintainer, who receives patches, decides on them and applies them. But what about closed-source projects where different companies get different workloads and just commit them to the trunk and branches? Is this maintainer concept applicable to a project on which multiple companies work?
You can choose from a wide range of version control systems (not only Subversion).
With the versioning concept you are safe in knowing that no one can damage the project permanently.
So there is no need for a manual approval process, especially when there are, for example, contracts between the participating companies.
I'd also set up a commit mailing list so you have some kind of peer review of changes. That way no changes can be made without anyone noticing them.
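A very small sketch of such a notification as a Subversion post-commit hook (the repository path and revision arguments are standard for hooks, but the list address and the use of a local mail command are assumptions):

    #!/bin/sh
    # hooks/post-commit: mail every commit to the review list
    REPOS="$1"
    REV="$2"
    AUTHOR=$(svnlook author -r "$REV" "$REPOS")
    LOG=$(svnlook log -r "$REV" "$REPOS")
    {
      echo "Author: $AUTHOR"
      echo "Log: $LOG"
      echo
      svnlook diff -r "$REV" "$REPOS"
    } | mail -s "[commits] r$REV by $AUTHOR" commits@example.com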
If applicable, set up some kind of continuous integration environment to keep the quality up.
I don't understand the question about the branches. The decision whether to use them or not does not, IMHO, depend on whether the committers are employed by the same company or not.
It's really up to you to decide which workflow works best for the companies involved. Subversion has the ability to add permissions to your trunk and branches, allowing you to lock down certain parts of your repository to people who are "trusted" with merge access to trunk. You'll need good communication amongst the companies. Using the open-source Trac provides a wiki, integrated RSS feeds of the commits to the project, and a code browser.
Usually, each site works on its dedicated branch and can import the other remote site's branch, to decide what to integrate into its own work.
But if a site needs to work directly on the other site's branch, one possible practice is the concept of branch membership, which allows only one site at a time to work on a given branch.
(not sure it is possible with SVN though)
That allows two remote sites (with a large time shift) to work on the same task in a tightly integrated manner.
My recommendation: Subversion. With that configured, you give out a URL, and then people check out, update, get things done, and when you judge that the project is ready, you snapshot and deliver.