Mercurial "vendor branches" from external repositories? - version-control

I want to store a project in Mercurial that contains external code (which can be modified by me) coming from Git and SVN repositories. In SVN I would solve this with vendor branches and copy the code around, but I understood that in Mercurial it's better to have different repositories for different projects, and pull between them when needed.
The project layout will be like this:
- externalLibraryA [comes from a SVN repo]
- ...with some extra files from me
- externalLibraryB [comes from a SVN repo]
- ...with some extra files from me
- externalPluginForExternalLibraryB [comes from a Git repo]
In Subversion I would create a vendor dir and a trunk dir, copy all external libraries into vendor first, and then into the right place in trunk. I think I can do this in Mercurial too, with subrepositories, but is this the best way to do it?
I tried setting up different repositories for the external libraries, but then it seems I can't pull the externalLibraryARepo into the externalLibraryA directory of my main repository? It goes in the main directory, which is not what I want. I can also create a Mercurial mirror repository and include it as a subrepo in my main repository, but then the changes in this subdirectory go to the mirror repository, while I want them to stay in the main repository.

I'd probably just store this in one repository - note that in the link you give they are using their build system in the end to bring together the binary output from the different repos. I'm not clear on their rationale there.
If the underlying problem you're trying to solve is how to update the externals in a clean way, I'd probably use anonymous branching for that.
I.e. add the external lib to your project, and your modifications. Make sure it works. Tag with ExternalA-v1.0. Hack away on your actual project. Now ExternalA, Inc. has a new version of their stuff. Update your repo to ExternalA-v1.0 tag. Import their new version and apply your modifications on top. Commit. Now you have two heads: one with the latest version of your code (that works with ExternalA-v1.0) and one with the latest version of ExternalA (that does not work with your code, maybe). So then you merge and reconcile the two. Tag again, now with ExternalA-v2.0. Repeat as needed.
You can still keep your externals in separate repositories, but I assume that the project that is using those does not need to be up to date with changes there all the time - looks like the whole point of vendor branches is to have some point of isolation between dependee and dependants. Of course, moving the changes from the externalA project to the project that is using that will then be a manual affair (well, a copy, much like in SVN really).

It depends on whether your vendor code is going to be customized by your team or not. Our teams have had a great deal of success maintaining a named "vendor" branch on repositories with our own customizations on branches named by project name. This vendor code is then easily included in a project as a subrepository.
A caveat to this approach: if active development is going on in the subrepository, it is best to edit the subrepository directly in a separate clone. Otherwise you have to pay close attention to the top-level repository so that you don't inadvertently bump your .hgsubstate forward to the wrong revision and break your build.
Watch out for merges of the top-level repository (your project) between versions which point to different named branches of your subrepository, as this can result in a merge between the "vendor" and "project" branches in the subrepository as it recurses, which may not be desirable.
Note that this functionality may change in the future as well, as some "warm" discussions have been taking place in recent months on the mercurial-devel mailing lists about the future of subrepository recursion.
edit:
I just saw this discussion in the related links as well, which seems relevant: https://stackoverflow.com/a/3998791/1186771

Related

Any way to relate two local Mercurial repos to each other at some point?

We have been using Mercurial for over two years for our Flex/AS3 projects. Over time, we ended up with several projects and a common library which is used/referenced by the projects. In order to maintain the versioning properly, each time a new version is tagged in a project repo, the same tag is applied to the library repo as well.
This is done manually. I am not sure whether the approach described above is the correct one, but I wonder if there is any (other) way to relate project repos to the library repo, so that whenever I want to pull a specific version of a project, I will know exactly which version of the library should be pulled.
Any thoughts? Thanks.
Have you tried subrepositories? With a subrepo, hg manages the whole project together with the nested repository.
For example, if you have HG_DIR/lib as the library and it is a subrepo of HG_DIR, then when you commit in HG_DIR, hg records which revision of HG_DIR/lib you are using. In other words, a commit in HG_DIR is linked to a specific commit in HG_DIR/lib.
Mercurial subrepositories can also point at repositories from other version control systems (Git and Subversion).
Update:
As @alexis suggested, follow the recommendation from the official documentation: the libraries should be treated as siblings of the project:
HG_DIR/
    lib1/
    lib2/
    project/

Version control on an external project

I am working on an enormous project ("the project") which is open-source, and I am changing the project but don't have permission to commit. I'm looking for strategies for maintaining my own branch of the project. Some issues I am contemplating:
How to put my own work in a version control system, given that I'm altering the project's source code, adding new files and so on.
How to keep in sync with the project without having to manually merge my own changes over and over again.
I've never been in this situation - I've always maintained my complete project in some version control system. My plan right now is something like that:
Creating a directory tree in my SVN, similar to the one in the project.
Keeping all the changed files (and only them) in my svn.
Every time I decide to sync with the new baseline of the project, I'll do a checkout, merge my svn tree into the new version, test, then commit my changes to my svn and distribute them along with the latest project baseline.
The problems here are ENDLESS. Way too many manual steps, more and more work over time, and so on. The correct way to go would be, of course, to be a part of the original project, but this seems to be quite irrelevant right now for various reasons and is out of the question.
Ideas?
I'd use git or mercurial for this; simply import the project into git or mercurial, and merge the upstream changes into a branch in your project for easy merging into your trunk.
If the upstream project has a repository of their own, the import is even easier. Both git and mercurial have support for directly importing other version control systems. I did this recently to adapt an existing project that lives in SVN: https://github.com/mjpieters/rod.recipe.rabbitmq
Note that that project has an 'upstream' branch. That particular project has now accepted my proposed changes after reviewing the changes in github.com.
There are a few questions here on SO on the subject:
Fork and synchronize Google Code Subversion repository into GitHub
Tracking upstream svn changes with git-svn and github?
Best way to fork SVN project with Git
It should be trivial to create a similar setup with mercurial.
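The upstream-branch arrangement can be sketched with plain Git as follows. This is an illustration under assumptions, not the setup of the linked project: the "upstream" directory stands in for the real project (for an SVN upstream it would be a git-svn import instead), and all file names and contents are hypothetical.

```shell
#!/bin/sh
# Sketch: keep pristine upstream history on an "upstream" branch and
# merge it into your own trunk whenever upstream moves.
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the upstream project.
git init -q upstream
cd upstream
git config user.name "demo"
git config user.email "demo@example.com"
echo "v1" > core.txt
git add core.txt
git commit -q -m "Upstream release 1"
cd ..

# Your fork: a clone with an extra branch that only ever mirrors upstream.
git clone -q upstream fork
cd fork
git config user.name "demo"
git config user.email "demo@example.com"
trunk=$(git symbolic-ref --short HEAD)    # "master" or "main"
git branch upstream "origin/$trunk"

# Your own work goes on the trunk.
echo "my feature" > feature.txt
git add feature.txt
git commit -q -m "Add my feature"

# Meanwhile, upstream publishes a new release.
(cd ../upstream && echo "v2" > core.txt && git commit -q -am "Upstream release 2")

# Sync: fast-forward the mirror branch, then merge it into the trunk.
git fetch -q origin
git checkout -q upstream
git merge -q --ff-only "origin/$trunk"
git checkout -q "$trunk"
git merge -q -m "Merge upstream release 2" upstream
git log --oneline --graph
```

Because the mirror branch never carries local commits, it can always be fast-forwarded, and every merge into the trunk only has to reconcile what changed upstream since the previous merge.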
You can use git to maintain your source control on your local system. In fact Git can be used to maintain just about any directory under version control. There is no need to sync to anything, git maintains all changes locally.
If you need to commit to SVN, check out the documentation: http://git-scm.com/docs/git-svn

How should I work on a CVS hosted project to both (1) fix bugs and (2) maintain my own private fork with additional features

The question
An open source program uses CVS for version control. I would like to make a number of bug-fixes and submit patch bombs to the developers with commit access. I would also like to maintain my own semi-private fork that mainly tracks the main code-base but that includes my own features (these features, right now, should not be incorporated into the main code-base.)
I prefer to use mercurial for my own version control needs, but I am open to other version control systems if necessary.
I'd like to:
Be able to easily create patch-bombs against the current CVS source with my own bug-fixes
Keep track of history on my own features
Have fixes and improvements from the main tree easily incorporated in my new-feature fork
Easily apply my own bug-fixes to my new-feature fork
Be able to work and track change history without an Internet connection.
What suggestions do you have for doing this?
My current idea
My own best guess is below, to give you a better idea of what I am thinking about.
I will have 3 mercurial repositories.
The first two repos are managed as specified at (https://wiki.mozilla.org/Using_Mercurial_locally_with_CVS). One just mirrors the latest changes from the CVS upstream. I do "cvs update" then "hg commit" in this repo. The second repo holds my bug-fixes as patches using the mq extension, and I pull from the first repo and re-base my patches every so often. When my patches are incorporated into the main tree, I remove the patches from the patch queue/make them permanent commits.
The third repo is my local fork. It will start out as a clone of the first repo. Then each time I do an update of the first repo, I'll pull from it into repo 3. My own features will be directly present as commits in this repo. When I fix a bug, I'll export a patch from repo 2 and apply it to the appropriate pull from repo 1.
I have used Git to manage changes on top of a CVS repository in a similar way. My solution in Git uses local branches instead of multiple repositories, but it sounds essentially similar to your proposed idea.
I found that this arrangement works best if you commit all the CVS metadata (in the CVS/ subdirectories) to your mirrored repository. This means that the CVS metadata gets replicated in the other repositories, but it doesn't cause any harm (and lets you run commands like cvs diff if you need to).

Basic Subversion questions

I've just started using Subversion, and have read the official documentation (svn book), a cheat sheet and a couple of guides. I know how to install Subversion (on Linux), create a repository (svnadmin create), import my Eclipse project into the repository (svn import), and view the repository files (using svn list).
But I am unable to understand some of the other terminology. For example, after importing my Eclipse project into the newly created repository I have made changes to my Eclipse project (to more than one file). Now, how should I update the repository with these added files/changes made to my Eclipse project?
The svn update command brings the changes from the repository into your working copy - which is the opposite of what I want i.e. bring the changes I made in my Eclipse project into the previously imported project in repository. If I am correct, you update the repository more often (as you keep extending your project implementation) than your current project (with update).
Also, I do not understand when you would use svn merge. The svn book states it applies the differences between 2 sources to a working copy. Is there a scenario which would explain this?
Finally, can I have more than 1 project checked into the repository? Or is it better to create a new repository for each project?
The term you are looking for is "commit".
Subversion does not exclusively lock a file for editing (though there is a command to do this if you really, really want to). So it is possible that you will need to merge two different users' sets of edits on a file, or even edits from two different working copies in two different locations on your machine.
Multiple projects is fine. Best approach IMHO is repository/project/trunk etc rather than repository/trunk/project.
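A minimal sketch of the commit cycle, using the repository/project/trunk layout mentioned above. The repository location, the project name myproject and the file Main.java are all hypothetical. One point worth noting: svn import does not turn the imported directory into a working copy, so the edits have to be made in a directory obtained with svn checkout.

```shell
#!/bin/sh
# Sketch: create a repository, check out a working copy, commit changes.
set -e
# Skip the demo when Subversion is not installed.
command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1 \
    || { echo "svn not found, skipping demo"; exit 0; }

work=$(mktemp -d)
cd "$work"
svnadmin create repo
url="file://$work/repo"

# repository/project/trunk layout, created directly in the repository.
svn mkdir -q --parents -m "Create project layout" \
    "$url/myproject/trunk" "$url/myproject/branches" "$url/myproject/tags"

# A working copy is what you edit; commits go from it to the repository.
svn checkout -q "$url/myproject/trunk" wc
cd wc

echo "class Main {}" > Main.java
svn add -q Main.java                 # register the new file
svn commit -q -m "Add Main.java"     # working copy -> repository

echo "// more work" >> Main.java
svn status                           # lists Main.java as modified
svn commit -q -m "Extend Main.java"

svn update -q                        # repository -> working copy
svn log -q
```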
Three things about SVN you should know:
Trunk - The main version of your code
Tags - 'Tagged' Versions of your code (i.e. v1.2.5-release)
Branches - Forks of the code for divergent development paths. We typically fork new branches to work on different versions, so if the current version is 1.2.4, you'd branch for 1.3's development. Then, if emergency changes to 1.2 need to be made (e.g. 1.2.5), you can work on them without worrying about what you broke by refactoring or adding features in your 1.3 branch. The merge operation is designed so you can merge 1.3's branch back into trunk when you're ready to release 1.3, or perform a similar operation. You can also merge individual files (if two or more developers edited the same file at the same time and now you need to 'merge' the changes into the same file).
Each project in your repository should have 3 folders in it:
/trunk
/branches
/tags
These house the three points above. You don't have to have these folders, but you should. Other more mature VCS like Mercurial/Git have the concepts of tags and branches baked into the system. In SVN these are more of a convention/recommendation.
Terminology
Working Copy - The copy on your hard-drive, that contains all your edits, etc...
Add - Registers a file for tracking in version control
Update - Updates the working copy with changes from the server repository
Commit - Updates the server repository with changes from the working copy
Switch - Replaces the working copy with another folder within the server repository
Diff - Does a differential analysis of two files / versions of a file to see the changes between them.
Merge - Attempts to apply the changes from one or more files into another, highlighting conflicts.
Patch - A set of differences that can be used to update a file.
You commit changes to the repository
Merge is useful when you need to maintain two branches of a repository, for example v1.x with the most recent security fixes and the alpha of version 2. You can make the fixes in the 1.x code, ship the resulting binary to existing customers, and merge the same changes into version 2, fixing the bugs there if they weren't already caught.
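That scenario can be sketched in a scratch repository as follows. It is a simplified illustration: the trunk plays the role of version 2, branches/1.x is the maintenance branch, and the file app.txt and its contents are hypothetical.

```shell
#!/bin/sh
# Sketch: fix a bug on a maintenance branch, then merge the fix into trunk.
set -e
# Skip the demo when Subversion is not installed.
command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1 \
    || { echo "svn not found, skipping demo"; exit 0; }

work=$(mktemp -d)
cd "$work"
svnadmin create repo
url="file://$work/repo"
svn mkdir -q --parents -m "Create layout" "$url/trunk" "$url/branches"

# Initial code on trunk.
svn checkout -q "$url/trunk" trunk-wc
(cd trunk-wc && echo "original code" > app.txt \
    && svn add -q app.txt && svn commit -q -m "Initial version")

# Branch 1.x off trunk; a branch is a cheap copy inside the repository.
svn copy -q -m "Create 1.x maintenance branch" "$url/trunk" "$url/branches/1.x"

# Fix a bug on the maintenance branch.
svn checkout -q "$url/branches/1.x" fix-wc
(cd fix-wc && echo "security fix" >> app.txt \
    && svn commit -q -m "Security fix on 1.x")

# Bring the fix into trunk: merge into an up-to-date working copy, then commit.
cd trunk-wc
svn update -q
svn merge -q "$url/branches/1.x"
svn commit -q -m "Merge 1.x fixes into trunk"
cat app.txt
```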
I suggest you look around for 'typical svn workflows'. They will give you the big picture of the 'most common tasks'.
What you want to do is 'commit' the changes made to your files to the repository.
You need to merge in case of a conflict (when two or more people are working on a project and commit to the same repo, conflicts might arise).
Check the available articles on SVN and remember to read about the sample/typical workflows or working scenarios with SVN.
Fully agree with David, but as far as question 3 is concerned, personally, I would distinguish between use cases:
Production: One project per repository. And do get familiar with the tag/trunk/branch concept mentioned above; it really helps a lot.
Testing: I have one single repository where I have put virtually all my experimental codes (approx. 10 languages with x codes per language). Reason is: One experimental code takes me 1-2 minutes, creating a repository on a remote host, using ssh-security sometimes takes longer ;-)
Cheers
EL

Bazaar newbie question about repository structures

I want to use Bazaar on Windows XP for web-development and related tasks. Most of the files are edited locally and then transferred via FTP to the server. Right now the repository sits on my local workstation. Later on it should be shared locally with some co-workers. Perhaps we will use a local Linux server as a centralized repository, but this structure is not decided for now. But first I need to understand the impact of the different repository setups, which I do not understand at all yet.
Using Bazaar-Explorer on Windows XP I’ve created a ‘shared tree repository’ from the option list of the init-dialogue in some location dev-filter/. Bazaar Explorer tells me:
Created repository with treeless branches at F:/bzr.local/dev-filter
Created branch at F:/bzr.local/dev-filter/trunk
Created working tree at F:/bzr.local/dev-filter/work
OK so far. Now I move a bunch of files into the work directory and add and commit them as Rev 1 ‘Start Revision’. Then I work on some of these files and commit them again as Rev 2. Here my confusion starts. Shouldn’t both revisions go into the trunk? The trunk is still empty, beside the .bzr directory which only holds some management information. If I delete my working directory, which I have tried during these first experiments, everything is gone. There’s obviously no hidden storage of those files.
OK. Perhaps I need to push it into the trunk? This does not work either. Entering the work/ directory and initializing the ‘push’ to the trunk, Bazaar-Explorer tells me
No new revisions to push.
So what? This looks like a severe conceptual misunderstanding about what should happen on my side.
Edit, 2010-02-03: Some conclusions
What I learned meanwhile is this:
I think I should switch to the command line until I really understand what’s going on, at least for creating the repositories and branches. Bazaar Explorer introduces a new level of abstraction which I only can handle if I understand the level beneath
One of the secrets of working with Bazaar, at least for me, is to understand those .bzr directories, their particular properties and states when created with ‘bzr init’, ‘bzr init-repository’, ‘bzr branch’ etc. in all their variants, and how they are plumbed together.
While there’s a whole chapter of ‘Organizing your workspace’ in the Bazaar User Guide, it’s more or less workflow oriented. The manual contains a lot of directory structures for the given examples. What I would prefer beside this and have not (or only rudimentary) found so far is some graphical representation of those ‘Lego like’ .bzr building blocks which create the linking of all the parts. So I started to invent some simple notation while working through the examples and looking into the .bzr directories to document what information is stored there, where does it come from, how and to what is it linked, is it complete or shared, etc.
Erich Schreiber
Created repository with treeless branches at F:/bzr.local/dev-filter
This part of the output looks suspicious to me. Are you sure you chose 'Shared repository' and not 'Shared repository with treeless branches' from the init dialog?
Treeless branches are branches without a working tree. If you did create a treeless branch for trunk, then it makes sense that there are no files there.
Your changes are still saved in the F:/bzr.local/dev-filter/trunk/.bzr, and have indeed been committed there. You don't see those changes reflected in the file system because Bazaar has created trunk as a treeless branch, with `` as a lightweight checkout. See checkouts in the Bazaar User Reference.
If you open F:/bzr.local/dev-filter/trunk in Bazaar Explorer, you should see your revisions. If you create a new branch with a working tree or checkout based on trunk, Bazaar will create the files with your changes for you.
Typically it goes like this:
bzr init-repo --no-trees F:/bzr.local/dev-filter
cd F:/bzr.local/dev-filter
bzr init trunk
bzr branch trunk work
--- none of the above creates a working tree
Now, in a new directory, say F:\temp:
cd F:\temp
bzr checkout F:/bzr.local/dev-filter/work
cd work
--- copy your files into this checkout, then:
bzr add
bzr commit -m "Start Revision"
--- back to F:/bzr.local/dev-filter/work
cd F:/bzr.local/dev-filter/work
bzr push F:/bzr.local/dev-filter/trunk