Best practice to maintain source code under version control with multiple companies? - version-control

I'm wondering if there is any best practice for maintaining source code under version control across different companies. In open source there is a maintainer who receives patches, decides on them, and applies them. But what about closed-source projects where different companies get different workloads and just commit them to the trunk and branches? Is this maintainer concept applicable to a project that multiple companies work on?

You can choose from a wide range of version control systems (not only Subversion).
Because every change is versioned, no one can damage the project permanently: any commit can be reverted.
So there is no need for a manual approval process, especially when there are contracts between the participating companies, for example.
I'd also set up a commit mailing list so you have some kind of peer review of changes. That way no change can be made without anyone noticing it.
If applicable, set up some kind of continuous integration environment to keep the quality up.
I don't understand the question about the branches. The decision whether to use them or not does not, IMHO, depend on whether the committers are employed by the same company.
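To make the commit mailing list idea a bit more concrete, here is a minimal sketch of what a post-commit notification could build. It is only an illustration: the function name, its arguments, and the list address commits@example.org are hypothetical, and how your VCS hands the commit data to a hook depends on the system you pick.

```python
from email.message import EmailMessage


def build_commit_mail(author, revision, log_message, changed_paths,
                      list_address="commits@example.org"):
    """Build (but do not send) a notification mail for one commit.

    All argument names and the list address are hypothetical placeholders.
    """
    log_message = log_message.strip()
    first_line = log_message.splitlines()[0] if log_message else "(no log message)"

    msg = EmailMessage()
    msg["From"] = author
    msg["To"] = list_address
    msg["Subject"] = f"[commit] r{revision}: {first_line}"

    body = [
        f"Author: {author}",
        f"Revision: {revision}",
        "",
        "Log message:",
        log_message,
        "",
        "Changed paths:",
    ]
    body += [f"  {path}" for path in changed_paths]
    msg.set_content("\n".join(body))
    return msg


mail = build_commit_mail(
    author="alice@company-a.example",
    revision=42,
    log_message="Fix build\n\nThe Makefile referenced a removed file.",
    changed_paths=["trunk/Makefile"],
)
```

The resulting message could then be handed to something like smtplib's SMTP.send_message, assuming you have a mail server available for the list.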

It's really up to you to decide which workflow works best for the companies involved. Subversion lets you set permissions on your trunk and branches, so you can lock down certain parts of your repository to people who are "trusted" with merge access to trunk. You'll need good communication among the companies. The open source Trac provides a wiki, RSS feeds of the commits to the project, and a code browser.
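As a rough illustration of what "locking down" trunk to trusted people means, here is a simplified model of path-based permissions. It is not Subversion's actual authorization file format or algorithm; the rule layout, the user names, and the "longest matching prefix wins" rule are assumptions made for the sketch.

```python
def can_write(rules, user, path):
    """Return True if `user` may write to `path` under the given rules.

    `rules` maps a repository path prefix to {user: "r" or "rw"}; the key
    "*" stands for everyone not listed. The longest matching prefix decides.
    """
    best = None
    for prefix in rules:
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return False  # no rule matches: deny by default
    access = rules[best]
    return access.get(user, access.get("*", "")) == "rw"


# Hypothetical setup: one integrator may merge to trunk,
# each company writes only to its own branch.
rules = {
    "/": {"*": "r"},
    "/trunk": {"integrator": "rw", "*": "r"},
    "/branches/company-a": {"alice": "rw", "*": "r"},
}
```

With these rules, alice can commit under /branches/company-a but not under /trunk, while integrator can commit to /trunk only.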

Usually, each site works on its own dedicated branch and can import the other remote site's branch, to decide what to integrate into its own work.
But if a site needs to work directly on the other site's branch, one possible practice is the concept of branch membership, which allows only one site at a time to work on a given branch.
(Not sure whether that is possible with SVN, though.)
That allows two remote sites (with a large time shift) to work on the same task in a tightly integrated manner.
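The "only one site at a time" rule can be sketched as a small ownership table. This is a toy model under my own assumptions (the class and method names are made up, and no particular VCS implements it this way); it only shows the hand-over logic.

```python
class BranchMastership:
    """Tracks which site currently owns (may commit to) each branch."""

    def __init__(self):
        self._owner = {}  # branch name -> site name

    def create(self, branch, site):
        self._owner[branch] = site

    def may_commit(self, branch, site):
        return self._owner.get(branch) == site

    def hand_over(self, branch, from_site, to_site):
        """Transfer ownership; only the current owner can give it away."""
        if self._owner.get(branch) != from_site:
            raise PermissionError(f"{from_site} does not own {branch}")
        self._owner[branch] = to_site


# Hypothetical sites in different time zones sharing one task branch.
m = BranchMastership()
m.create("task-123", "site-europe")
m.hand_over("task-123", "site-europe", "site-asia")  # end of the European day
```

After the hand-over, only site-asia may commit to task-123 until it hands the branch back.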

My recommendation: Subversion. Once it is configured, you hand out a URL; people then check out, update, and get things done, and when you judge that the project is ready, you snapshot and deliver.

Related

Recommended DVCS mechanism for hosting many independent patches

I have a project just getting started at http://sourceforge.net/projects/iotabuildit/ (more details at http://sourceforge.net/p/iotabuildit/wiki/Home/) that is currently using Mercurial for revision control. And it seems like Mercurial and SourceForge almost have all the right features or elements to put together the collaboration mechanism I have in mind for this project, but I think I'm not quite there yet. I want people to be able to submit, discuss and vote on individual changes from a large number of individuals (more developers than a project would normally have). And I want it to be as easy as possible for users to participate in this. The thought right now is that people can clone the "free4all" fork, which is a clone of the base "code" repository, or they can create their own fork in their own SourceForge user project (SourceForge now provides a workspace for every user to host miscellaneous project-related content). Then they can clone that to their local repository (after downloading TortoiseHg or their preferred Mercurial client). Then they can make modifications, commit them, push them to the fork, and request a merge into the base "code" repository, at which point we can discuss/review the merge request. This all is still far too many steps, and more formal than I'd like.
I see there is such a thing as "shelving" in Mercurial, but I don't see how/if that is supported in the SourceForge repository. And there probably isn't a way to discuss shelved changes as there is for merge requests.
I'm looking for any suggestions that would make this easier. Ideally, I would like users to be able to:
Specify any version that they would like to play, and have that requested version extracted from source control hosted for the user to play at SourceForge (because the game can't be played locally due to security restrictions the Chrome browser properly applies to javascript code accessing image content in independent files)
Allow the user to download the requested version of the project for local editing (a C# version built from the same source is also playable locally, or Internet Explorer apparently ignores the security restriction, allowing local play in a browser)
Accept submitted modifications in a form that can be merged with any other compatible "branch" or version of the game that has been submitted/posted (ideally this would be very simple -- perhaps the user just uploads the whole set of files back to the server and the compare and patch/diff extraction is performed there)
Other players can see a list of available submitted patches and choose any set to play/test with, then discuss and vote on changes.
Clearly some of these requirements are very specific, and I will probably need to write some server side code if I want to reach the ideal goal. But I want to take the path of least resistance and use the technologies available if much of the functionality I need is already almost there. Or I'd like to see if I can get any closer than the process I outlined earlier without writing any server code. So what pieces will help me do this? Does Mercurial & SourceForge support storing and sharing shelved code in the way I would want? Is there something to this "Patch Queue" (that I see, but can't understand or get to work yet) that might help? Is there a way to extract a patch file from a given set of files compared to a specific revision in a repository (on the server side), without having the user download any Mercurial components?
It sounds like something you could do with Mercurial Queues (mq) patch queues. The patch queue can be its own, separate versioned repository, and people can use 'guards' to apply only the patches they want to try.
But really it sounds even easier to use Bitbucket or GitHub, both of which have excellent patch-submission, review, and acceptance workflows built into them.
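On the last part of the question (extracting a patch on the server from an uploaded set of files, without the user installing Mercurial): if you do end up writing server code, the comparison itself is not hard. Here is a minimal sketch using Python's standard difflib. It assumes you have already read the base revision and the upload into {filename: text} dictionaries; how you get the base revision out of the repository is left open, and the function name and labels are made up.

```python
import difflib


def extract_patch(base_files, uploaded_files, base_label="base", new_label="upload"):
    """Return a unified diff between two {filename: text} snapshots.

    Texts are assumed to end with a newline. A file missing on one side
    is treated as empty, so added and removed files show up in the patch.
    """
    chunks = []
    for name in sorted(set(base_files) | set(uploaded_files)):
        old = base_files.get(name, "").splitlines(keepends=True)
        new = uploaded_files.get(name, "").splitlines(keepends=True)
        chunks.extend(difflib.unified_diff(
            old, new,
            fromfile=f"{base_label}/{name}",
            tofile=f"{new_label}/{name}",
        ))
    return "".join(chunks)


# Hypothetical single-file game, as described in the question.
base = {"game.js": "var speed = 1;\nvar lives = 3;\n"}
upload = {"game.js": "var speed = 2;\nvar lives = 3;\n"}
patch = extract_patch(base, upload)
```

The resulting text is an ordinary unified diff, so it could be stored, listed, and discussed as a "submitted patch" independently of whichever VCS holds the base revision.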

Source control system to branch by user instead of version

Once again, I'm a bit stumped about the best stack-exchange site on which to post this question. But I think developers are best suited to answer questions about source control, so here it is.
I am considering a crowd-sourced, user-rated game development project and am wondering what, if any, source control and merging systems might best be capable of hosting the kinds of source control I'm interested in. By user-rated, I mean that there will be some kind of rating/voting system like that found here on StackOverflow. For some details on the project idea, you can read my posting about it at http://gamedev.enigmadream.com/index.php?topic=1589.0. What I think I need is:
Ability to branch by user and maximize merging capabilities. I know source control systems are mainly focused on branching by version, and we could maybe think of each user maintaining their own version. But I guess we need some really robust merging capabilities to maximize the abilities of one user to merge changes from another user into their own branch, for example. So I think I would like the ability for "cross-branch" merging without having to merge into the common root branch first. (I'm most familiar with Team Foundation Server (TFS), which doesn't easily support this.)
Massive branching and merging. If there are hundreds or thousands of people wanting to incorporate their own changes into the project, there could be a lot of branches, and the system would need to be able to handle that without a meltdown. A single user might want to create multiple branches deriving from multiple other users' branches under their own name too, ideally, with the ability to merge among them to some extent.
Permission control by branch. I see SourceForge supports Subversion and Mercurial, but does not currently support permission controls by path/branch on these (as far as I can tell), although that does appear to be a feature under consideration. Users should be limited from pushing their code into other branches. I suspect the normal operations for a user would be pulling edits from other branches into their own branch, and checking in additional changes in their own branch.
A voting system. I know I shouldn't expect a source control system to support voting natively, but anything that could contribute to making this possible would be helpful. For example, maybe a voting system would involve or rely on the ability to label the best edits from various branches and pull them into a single file based on a label or a set of labels. And anything that would assist in merging the results of a selected set of labels from various branches (perhaps applying a new label to the set) could help too.
Very few files and possibly no directories. I would be willing to give up the ability to manage a large number of files or directories in exchange for any of the above, because the format for the game file I'm considering is generally contained in a single text (XML or HTML5 -- haven't decided yet) file. But this does mean that the system should be pretty good at merging edits to relatively large text files efficiently. I know Team Foundation Server does a pretty good job of maintaining just changes to a file. I hope other source control systems do at least as well.
Or is source control not the proper paradigm to be talking about here? Is there some other technology ideal for merging code like this, one that doesn't involve source control and/or branching the way I'm thinking about it?
Any VCS, because "...source control systems are mainly focused on branching by version..." is just wrong: a VCS supports diverging changes to code over time, nothing more and nothing less.
Any DVCS, because they have reasonably good branch-merge capabilities from the ground up.
Mercurial, which has branch-level ACLs; SVN has path-based ACLs. And because a Subversion repository is (to some degree) a physical tree, ACLs can be applied to any part of the subtree, i.e. to branches as well.
Any code review tool, integrated with the VCS and modified for your specific requirements.
Fossil SCM is a single-file portable EXE, and its repo is one file; any DVCS also adds only one repo directory to the existing tree and handles big files without headaches.

Choosing version control system

In our current project we are using VSS and SVN to keep track of the versions. For some reason, the developers at our site are not allowed to commit to them. So when many developers work with the same file, we run into versioning issues. It is very difficult to keep track of it. Can anyone suggest a version control system?
1. It should be light-weight.
2. We are going to manage individual files. Not whole projects.
3. It should have a GUI.
4. Learning curve should be reduced to a minimum.
Not sure if these are high expectations, but do let me know about your thoughts.
For multi-site development, a DVCS (Distributed Version Control System) is actually recommended because it allows:
private commit
"backup" publication (you push your branch which will then be mirrored in the remote repo, still as your branch: nobody will be impacted)
common publication: you push a common branch (which you have pulled first to take into account other commits)
That publication workflow (orthogonal to branching) really opens up more possibilities in terms of code management.
Pick one (Git, Mercurial, ...) and you have a valid solution to your issues.
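The three publication steps above (private commit, "backup" publication of your own branch, common publication after pulling) can be shown with a toy model. This is not how Git or Mercurial are implemented: branches are plain lists of commit ids, pull simply replays local commits on top of the remote history, and all names are made up for the sketch.

```python
class Repo:
    """Toy repository: each branch is a list of commit ids (oldest first)."""

    def __init__(self):
        self.branches = {}

    def commit(self, branch, commit_id):
        # Private commit: only this repository changes.
        self.branches.setdefault(branch, []).append(commit_id)

    def pull(self, remote, branch):
        # Take the remote history, then replay local-only commits on top.
        theirs = remote.branches.get(branch, [])
        mine = self.branches.get(branch, [])
        self.branches[branch] = theirs + [c for c in mine if c not in theirs]

    def push(self, remote, branch):
        # Accepted only if the remote history is a prefix of ours;
        # otherwise the caller has to pull first.
        theirs = remote.branches.get(branch, [])
        mine = self.branches[branch]
        if mine[:len(theirs)] != theirs:
            raise RuntimeError("remote has new commits: pull before pushing")
        remote.branches[branch] = list(mine)


shared, alice, bob = Repo(), Repo(), Repo()

alice.commit("alice/feature", "a1")   # private commit
alice.push(shared, "alice/feature")   # "backup" publication: nobody else is impacted

bob.commit("main", "b1")
bob.push(shared, "main")              # common publication by someone else

alice.commit("main", "a2")
alice.pull(shared, "main")            # take bob's commit into account first
alice.push(shared, "main")            # then publish to the common branch
```

If alice had tried to push "main" without pulling, the toy push would have been rejected, which is the point of pulling the common branch first.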
To elaborate on VonC's answer: a DVCS would allow all the off-site devs to commit to one server, while allowing the on-site devs to control (by pulling) what is merged into their controlled branch/repo if they want.
I.e., if the on-site guys are scared of you committing, it is probably because they don't understand branching and merging. And at the moment DVCSs are the kings of branching and merging.

Parallel Dev: Should developers work within the same branch?

Should multiple developers work within the same branch, and update - modify - commit? Or should each developer have his/her own branch exclusively? And how would sharing branches impact an environment where you are doing routine maintenance as opposed to unmaintained code streams? Also, how would this work if you deploy each developer's work as soon as it is done and passes testing (rapidly, as opposed to putting all of their work into a single release)?
In general, I have found that having developers (who are working on the same project) use the same branch is better for finding integration problems sooner. If developers are each using an individual branch, then you're just delaying possible integration problems until later, when you merge the branches.
Of course, having developers work on the same branch means you need to have actual communication between those developers, but that's a social problem and not a technical one.
Developers would work on separate branches when there is a good reason for that branch to exist in the first place (such as a patch release of a previous version of the software, or a special build for a specific customer).
Note that tools such as Git and Mercurial allow developers to easily create their own private branches to organise their own work. This is a different situation from more than one developer sharing a branch, and (usually short-lived) private branches should be encouraged.
Branches are meant as a way to version control any feature or experimental piece of code that may break the mainline/trunk.
While it is common for developers to have their own personal branches for deep experimentation, often branches center around a new feature being added. These new features often require more than one person to be committing.
For example, on a web project, two developers and a designer may be doing a facelift to their company website. They still need to keep their mainline/trunk code clean in case they need to make a quick change to it before the facelift is complete. So they create a "facelift" branch and work on that instead. While the developers are committing javascript, the designer can be committing CSS and images. Once the facelift feature is complete, they can merge it into the mainline and send it live.
The only reason any of them would need personal branches would be for experimenting. Perhaps the designer is trying to implement "sliding door" tabs and can't get the padding right in IE6, for example. If he solves the problem, he can merge it into the facelift branch, if he can't, he simply ignores it and continues with the rest of the design back in the facelift branch.
To some extent, the version control software you are using will nudge you into a particular approach. GIT is geared toward open-source contributors and resembles the "one developer" model (branching isn't even a concept in GIT. GIT is more about managing changes). Clearcase is more corporate, so you do have multiple developers on a branch, but each developer gets to play in his or her own view.
I agree with Greg's answer, this is more a social planning issue. Lots of devs on one branch will step on each other's toes. I've been on a project where there were more developers than individual source files :)
I think that merging of branches can be problematic (dropped or inconsistent functionality), regardless of how good the source control tools are. I would more readily opt for multiple developers working on a single main branch. There could be other branches for things like production bug fixes or proof-of-concepts (POCs), where merging could/should happen very soon after change (bug fixes) or good chance that merging may not need to happen (POCs).

Source Control - Distributed Systems vs. Non Distributed - What's the difference?

I just read Spolsky's last piece about Distributed vs. Non-Distributed version control systems http://www.joelonsoftware.com/items/2010/03/17.html. What's the difference between the two? Our company uses TFS. What camp does this fall in?
The difference is in the publication process:
a CVCS (Centralized) means: to see the work of your colleague, you must wait for them to publish (commit) to the central repository. Then you can update your workspace.
You are an active producer: if you don't publish anything, nobody sees anything.
You are a passive consumer: you discover new updates when you refresh your workspace, and have to deal with those changes whether you want it or not.
a DVCS means: there is no "one central repository", but every workspace is a repository, and to see the work of your colleague, you can refer to his/her repo and simply pull its history into your local repo.
You are a passive producer: anyone can "plug in" into your repo and pull local commits that you did into their own local repo.
You are an active consumer: any update you are pulling from other repo is not immediately integrated into your active branch unless you explicitly make it so (through merge or rebase).
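The passive producer / active consumer idea can be illustrated with a toy model. It is a sketch under simplifying assumptions, not how any real DVCS stores history: a workspace has one active branch (a list of commit ids), fetching copies a colleague's history next to it, and nothing reaches the active branch until an explicit merge. All names are invented for the example.

```python
class Workspace:
    """Toy DVCS workspace: a full repository with one active branch."""

    def __init__(self, name):
        self.name = name
        self.active = []    # commits on the active branch
        self.fetched = {}   # peer name -> copy of that peer's history

    def commit(self, commit_id):
        self.active.append(commit_id)

    def fetch_from(self, peer):
        # The peer does nothing (passive producer); we copy its history.
        self.fetched[peer.name] = list(peer.active)

    def merge_from(self, peer_name):
        # Explicit integration step (active consumer).
        for commit_id in self.fetched.get(peer_name, []):
            if commit_id not in self.active:
                self.active.append(commit_id)


alice, bob = Workspace("alice"), Workspace("bob")
alice.commit("a1")
bob.commit("b1")

bob.fetch_from(alice)    # bob now has alice's history locally...
before = list(bob.active)  # ...but his active branch is still just b1
bob.merge_from("alice")  # only now is a1 integrated
```

In a CVCS, by contrast, the equivalent of the fetch step (updating your workspace) brings the changes straight into what you are working on.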
A version control system is about mastering the complexity of the changes in data (because of parallel tasks and/or parallel work on one task), and the way you collaborate with others (other tasks and/or other people) is quite different between a CVCS and a DVCS.
TFS (Team Foundation Server) is a project management system which includes a CVCS: Team Foundation Version Control (TFVC), centered around the notion of "work item".
Its centralized aspect enforces a consistency (of other elements than just sources)
See also this VSS to TFS document, which illustrates how it is adapted to a team having access to one referential.
One referential means it is easier to maintain (no synchronization or data refresh to perform), hence the greater number of elements (tasks lists, project plans, issues, and requirements) managed in it.
Simply speaking, a centralized VCS (including TFS) has a central storage location, and each user gets from and commits to this one location.
In a distributed VCS, each user has the full repository and can make changes that are then synchronized to other repositories; a server is usually not really necessary.
Check out http://hginit.com. Joel wrote a nice tutorial for Mercurial, which is a DVCS. I hadn't done any reading about DVCS before (I've always used SVN) and I found it easy to understand.
A centralized VCS (CVCS) involves a central server that is interacted with. A distributed VCS (DVCS) doesn't need a centralized server.
DVCS checkouts are complete and self-contained, including repository history. This is not the case with CVCS.
With a CVCS, most activities require interacting with the server. Not so with DVCS, since they are "complete" checkouts, repo history and all.
You need write access to commit to a CVCS; users of DVCS "pull" changes from each other. This leads to more social coding facilitated by the likes of GitHub and BitBucket.
Those are a few relevant items, no doubt there are others.
The difference is huge.
In distributed systems, each developer works in his own sandbox; he has the freedom to experiment as much as he wants, and only pushes to the "main" repository when his code is ready.
In central systems, everyone works in the same sandbox. This means that if your code is not stable, you can't check it in, because you will break everyone else's code.
If you're working on a feature, it will naturally take a while before it stabilizes, and because you can't afford to commit any unstable code, you would sit on changes until they're stable. This makes development really, really slow, especially when you have lots of people working on the project. You just can't add new features easily because you have this stabilization issue where you want the code in the trunk to be stable but you can't!
With distributed systems, because each developer works in his own sandbox, he doesn't need to worry about messing up anyone else's code. And because these systems tend to be really good at merging, you can still keep your codebase up to date with the main repository while maintaining your changes in your local repository.
I would recommend reading Martin Fowler's review of Version Control Tools
In short the key difference between CVCS and DVCS is that the former (of which TFS is an example) have one central repository of code and in the latter case, there are multiple repositories and no one is 'by default' the central one - they are all equal.