CVS virtual modules & directory mapping to mercurial repositories - version-control

My question is similar to this one but for Mercurial (converting using cvs2hg). However, there are some differences. This is part of our CVSROOT/modules file and shows the problem nicely:
PD1 -a PROD/PD1/Drivers Drivers/PD1/Firmware KernelHeaders Shared IppLibs
PD2 -a PROD/PD2/ Drivers/PD1 KernelHeaders Shared IppLibs
#PD2Linux Driver
PD2Linux PROD/PD2/Drivers/Linux/BuildFiles &PD2LinuxSource
PD2LinuxSource -d src &PD2 &PD2LibUSB
PD2LibUSB -d ThirdParty/libusb libusb
As you can see, the driver structure is complicated. We're definitely looking to rationalise the driver structure, rather than including the entire older driver (PD1) in the newer one.
As I understand it, in Mercurial you can use the share extension to do the sub directory mapping.
My questions are
Is there a way in Mercurial to bring files located further down in the directory tree (in this case the autoconf files) up to the root, as is done in the first line of the PD2Linux Driver?
Is there a way to create directories, as per the -d flag?
How to merge changesets that span PD1 & PD2?
e.g. if changes were made in PD2 that spanned both drivers and were checked in to PD in CVS. This is a bit of a long shot, as CVS doesn't have changesets.
I wonder whether cvs2hg takes the CVS modules file into account?
At the moment I am converting each PD directory individually (creating a cvsroot in each subdir). Would it be better to convert them all together and then split them up into separate hg repos?

You write:
As I understand it, in Mercurial you can use the share extension to do the sub directory mapping.
Not quite. The share extension lets you associate several working copies with a single repository; it's not about remapping (sub-)directories.
Is there a way in mercurial to bring files located further down in the directory tree (in this case the autoconf files) up to the root as is done in the first line of the PD2Linux Driver?
The answer to this and your other questions is: no. The core problem is that Mercurial (and other distributed version control tools) requires you to check out the full repository every time. You cannot just clone repo/some/dir/, you must always clone repo/.
At the moment I am converting each PD directory individually (creating a cvsroot in each subdir). Would it be better to convert them all together and then split them up into separate hg repos?
The end result should be separate Mercurial repositories — precisely because you need to clone the full repository. So make sure to make a 1–1 mapping between repositories and your drivers.
One tool that you might find useful is subrepositories. A subrepository is a nested repository that Mercurial will check out when you clone the outer repository. They come with a number of caveats, but big companies are using them today (I've helped a number of companies with setting up subrepos).
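Before deciding which directories become separate repositories or subrepositories, it can help to flatten the modules file from the question into plain (working directory, CVS directory) pairs. The following is a minimal sketch under a simplified reading of the modules syntax: it only understands -a, -d and &module, ignores file lists and all other options, and the helper names (parse_modules, expand) are made up for illustration. Real CVS has more corner cases, so treat the output as a starting point for review rather than as what cvs checkout will do in every case.

```python
MODULES_TEXT = """\
PD1 -a PROD/PD1/Drivers Drivers/PD1/Firmware KernelHeaders Shared IppLibs
PD2 -a PROD/PD2/ Drivers/PD1 KernelHeaders Shared IppLibs
#PD2Linux Driver
PD2Linux PROD/PD2/Drivers/Linux/BuildFiles &PD2LinuxSource
PD2LinuxSource -d src &PD2 &PD2LibUSB
PD2LibUSB -d ThirdParty/libusb libusb
"""


def parse_modules(text):
    """Parse a modules file into {name: {"alias", "dest", "args"}}."""
    modules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, *rest = line.split()
        entry = {"alias": False, "dest": None, "args": []}
        i = 0
        while i < len(rest):
            token = rest[i]
            if token == "-a":          # alias module
                entry["alias"] = True
            elif token == "-d":        # checkout directory name
                i += 1
                entry["dest"] = rest[i]
            else:                      # directory or &module reference
                entry["args"].append(token)
            i += 1
        modules[name] = entry
    return modules


def join(*parts):
    """Join path fragments with single slashes, dropping empty ones."""
    return "/".join(p.strip("/") for p in parts if p and p.strip("/"))


def expand(modules, name, prefix=""):
    """Yield (working_dir, cvs_dir) pairs for one module (simplified)."""
    entry = modules.get(name)
    if entry is None:
        # Not a module name: a plain directory, checked out under its own path.
        yield join(prefix, name), name.strip("/")
        return
    if entry["alias"]:
        # Alias modules behave as if their arguments were typed directly.
        for arg in entry["args"]:
            yield from expand(modules, arg, prefix)
        return
    target = join(prefix, entry["dest"] or name)
    directory_seen = False
    for arg in entry["args"]:
        if arg.startswith("&"):
            # Ampersand modules land inside the current module's directory.
            yield from expand(modules, arg[1:], target)
        elif not directory_seen:
            yield target, arg.strip("/")
            directory_seen = True
        # Further plain arguments would be file names; ignored in this sketch.


for working_dir, cvs_dir in expand(parse_modules(MODULES_TEXT), "PD2Linux"):
    print(f"{working_dir}  <-  {cvs_dir}")
```

Each pair in the output is a candidate for either a directory inside one converted repository or a subrepository entry, which makes the overlap between PD1 and PD2 explicit.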

Related

Converting CVS multi-project tree hierarchy to Git?

I am using CVS and I have this hierarchy:
/ROOT
    /JAVA
        /JavaProject1
        /JavaProject2
        .project
    /PHP
        /PHPProject1
        /PHPProject2
        .project
In Eclipse > CVS Repository Exploring, I can see this hierarchy and I can Check Out only the project that I want.
Also I can check out (import) JAVA and PHP folders (I created them as Eclipse General project for import) to Eclipse Package Explorer and can synchronize and commit all together.
When I want to use Git, it only supports one project.
I don't want a flat hierarchy (with all the Java and PHP projects next to each other); I want to keep the tree hierarchy and check out only the project that I want, as with CVS.
Is my CVS hierarchy possible in Git or what technique should I use?
I think you're mixing what you want to do locally with how you want
to arrange things remotely. All git commands access only the local
repository. The 'push' and 'fetch' commands appear to access a remote
repository, but in fact each effectively starts its counterpart on the
remote machine, running against the repository local to that machine. So
the tasks you can do remotely are very limited: specifically, copying
"branch" and "tag" references and the commit histories those references
point at.
This means that for the simple case there is ONLY the local repository;
it lives in the .git directory inside the working directory.
You can arrange working directories, with their .git directories however
you wish on your local machine. Likewise, you can arrange the remote
repositories in any way allowed by the remote hosting service. The
layouts do not have to match. If the remote is your own Linux server you
can make the layout just like your local one. If the remote is (for example)
GitHub you're more limited.
You'll need to back up the .git directory to back up
the repository; the rest of the working directory is probably not significant. You can use git push to do this backup, as long as you never use '--force'.
Git isn't really very keen on you having multiple working directories
for one repository. It is possible; however, in the simple case they
will each have their own copy of the repository and you will need to
push/pull the updates individually either to a "central" repository
or more "randomly". None of these repositories have to be physically
"remote".
Git much prefers you to switch between branches in one working directory
and use 'make install'-style processes to send builds out.
It is also possible to have unrelated branches in one repository, but most people find this too confusing as you still only have one working directory.

Is it possible to automate the changing of an svn repository layout?

I have an SVN repository which was moved from CVS a while back. It has a single branches/tags/trunk structure. Over time the main project was split to multiple library projects. The structure now looks something like this:
trunk\projects\prj1
trunk\projects\prj2
trunk\tools\tool1
trunk\tools\tool2
branches\b1\projects\prj1
...
Is there an easy way (through a script perhaps) to convert this repository into a structure similar to:
projects\prj1\trunk
projects\prj1\branches\b1
...
You don't need a script, unless you're going to have to repeat the process multiple times. Just use svn move and svn rename as appropriate to rearrange the structure. You can do it with repository URLs, or check out as little of the repository as possible using sparse directories, move everything around, then commit.
svn mkdir projects/prj1 --parents
svn mkdir projects/prj1/branches --parents
svn move trunk/projects/prj1 projects/prj1/trunk
svn move branches/b1/projects/prj1 projects/prj1/branches/b1
And so on. The reason I say you don't need to script it is because you're going to have so many different names & conditions to work through that it won't be worth the trouble.
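If the process does have to be repeated, the path mapping itself is mechanical. Below is a rough sketch that only prints the commands for review instead of running them. It assumes a working copy of the repository root, forward-slash paths, and that every project sits exactly two levels below trunk or below a branch or tag, as in the example from the question; the function names are invented for illustration, and anything that does not fit the pattern is rejected rather than guessed at.

```python
def new_location(old_path):
    """Map an old-layout path to its place in the per-project layout."""
    parts = old_path.strip("/").split("/")
    if parts[0] == "trunk" and len(parts) == 3:
        _, group, name = parts
        return f"{group}/{name}/trunk"
    if parts[0] in ("branches", "tags") and len(parts) == 4:
        kind, label, group, name = parts
        return f"{group}/{name}/{kind}/{label}"
    raise ValueError(f"path does not match the assumed layout: {old_path}")


def move_commands(old_paths):
    """Return the svn commands (as strings) that would perform the moves."""
    commands = []
    created = set()
    for old in old_paths:
        new = new_location(old)
        parent = new.rsplit("/", 1)[0]
        if parent not in created:
            # Create the target's parent directory once.
            commands.append(f"svn mkdir --parents {parent}")
            created.add(parent)
        commands.append(f"svn move {old} {new}")
    return commands


for command in move_commands([
    "trunk/projects/prj1",
    "branches/b1/projects/prj1",
]):
    print(command)
```

Printing rather than executing keeps a human in the loop, which matters because the odd names and special cases are exactly where a fully automatic script tends to go wrong.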

Sharing a core codebase between multiple projects

We have several product lines built around a common core and currently maintain them in SVN using externals. Moving to Mercurial, the natural choice is to use hg sub-repositories.
The thing is, the core is quite large (probably over a GB, judging by the SVN repo), and a typical developer sometimes wishes to work simultaneously on several products, say 3-4.
Did I get it right that this usually means each developer would have the core replicated 3-4 times, with its entire history?
Also, if a developer wishes to perform some simple operation in another product, would the core have to be pulled first, even though it is already available on the client (several times...)?
In order to truly share the subrepository (and not its working copy), you can use the share extension. However, that makes the cloning process a bit counter-intuitive:
hg clone -U remote_core core
hg clone -U remote_projectA projectA
cd projectA
hg share ../core core
hg update
cd ..
hg clone -U remote_projectB projectB
cd projectB
hg share ../core core
hg update
And so on. But I warn you that you are going to have more than one headache with this setup. At work, we have a similar setup, but instead the shared subrepository has a branch (not a named branch, but a clone branch, a dedicated master repository) for each project that uses it. That way projects can modify the shared code independently while still having the easy merging between them.

Emulating symlink-like behaviour in a source control repository

Suppose I have the following (desired) folder structure:
*CommonProject
*Project#1
----> CommonProject(link)
*Project#2
----> CommonProject(link)
Where the CommonProject is the location of the source belonging to that project, and CommonProject(link) is merely a soft link to the main location. If we imagine this as a tree-view in a visual client, if I expand Project#1 I will see CommonProject there as a subdirectory, even though the files are not actually stored there.
The purpose of this is to enable the following behaviour:
When I check out Project#1 I get the files associated with that project as well as a subfolder CommonProject containing all of its files (as if Project#1 contained the copy of the files in the Version Control repository). Now if I were to modify CommonProject's files inside of Project#1 and submit my changes to the repository, the changes would go into the CommonProject location (no file is actually stored under Project#1 in the repository). Now if I were to sync Project#2, as it also contains a symlink to CommonProject, it would get my updates.
Essentially the duplication of files only exists on my machine, but in the repository there is only one version of CommonProject.
I know Perforce can't do this without juggling 3 specs. This is very complicated and error-prone, especially when a lot of people do it. Is there a source control repository out there that can do this? (a pointer to some docs on how it can be done is a plus)
Thank you.
Subversion can directly store symlinks in the repository. This only works for operating systems that support symlinks though, as svn just stores the symlink the same way it would with any other file.
I think what you really want is to link to separate projects though. Subversion supports this through externals and git through submodules. Another alternative is to manage this sort of thing within your build process, so that some static resources are gathered when you initialize the build. Generally, updating a utilities library that changes often is going to cause stability problems, so you can do this manually (or with clever scripts) when you need to.
You'd probably be much better off just storing the projects in a flat directory (1 directory per project, all at the same level), and using whatever your build system or IDE is to link all the stuff together.

how does version control work?

How does version control usually work? Does it save diff files as a trail, with hashes to validate the trail?
Check out Eric Sink's blog series on version control.
Also, Joel Spolsky wrote Hg Init: a Mercurial tutorial, that finally made me "get" what distributed source control is all about.
There is more than one way to skin a cat...
Different VCS use different approaches. CVS, for example, will create a file on the server for each file which you commit. This is essentially a file in RCS format; CVS is only a wrapper around RCS which runs the RCS commands over many files in a directory subtree (RCS can only work on single files).
The RCS file contains a list of changes (version number, checkin message and how much was changed). After that comes a copy of the current HEAD version. The rest of the file consists of the diffs between the versions (long explanation).
This way, CVS can quickly return the HEAD version (which is most often requested) and it can compute the other versions.
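The idea of keeping HEAD in full and computing older versions from diffs can be sketched in a few lines. This is not the RCS file format (which stores its own edit scripts); it is only an illustration of reverse deltas, using Python's difflib to compute the differences, and the class and function names are invented for the example.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Describe how to rebuild `target` from `source` (lists of lines)."""
    delta = []
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))        # reuse lines from source
        elif tag in ("replace", "insert"):
            delta.append(("add", target[j1:j2]))  # lines only in target
        # "delete": lines only in source are simply not copied
    return delta


def apply_delta(source, delta):
    """Rebuild the target lines from `source` and a delta."""
    result = []
    for op in delta:
        if op[0] == "copy":
            result.extend(source[op[1]:op[2]])
        else:
            result.extend(op[1])
    return result


class ReverseDeltaStore:
    """Keeps the newest revision in full and older ones as reverse deltas."""

    def __init__(self):
        self.head = None   # full text of the newest revision
        self.deltas = []   # deltas[i] rebuilds revision i+1 from revision i+2

    def commit(self, lines):
        lines = list(lines)
        if self.head is not None:
            # Store how to get back from the new revision to the old head.
            self.deltas.append(make_delta(lines, self.head))
        self.head = lines
        return len(self.deltas) + 1  # revision number, starting at 1

    def get(self, revision):
        newest = len(self.deltas) + 1
        if self.head is None or not 1 <= revision <= newest:
            raise ValueError("no such revision")
        lines = self.head
        # Walk backwards from HEAD, one delta per older revision.
        for index in range(newest - 2, revision - 2, -1):
            lines = apply_delta(lines, self.deltas[index])
        return lines


store = ReverseDeltaStore()
store.commit(["a\n", "b\n"])
store.commit(["a\n", "b\n", "c\n"])
store.commit(["a\n", "x\n", "c\n"])
print(store.get(1))
```

Fetching the newest revision costs nothing beyond reading it, while each step further back applies one more delta, which matches the trade-off described above.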
CVS doesn't do any validation; if one of your files becomes corrupt, you need a backup. Since CVS is based on RCS, it can't version directories nor can it track renames. CVS and RCS use the standard diff(1) command to create the diffs.
Subversion (SVN) works similarly but adds versioning of directories and renames. Moreover, SVN uses a better diff algorithm (xdelta) which gives a smaller repository.
For an explanation of how Git works, see here.
Darcs is very different and, IMHO, more intuitive than other SCMs, even distributed ones.
There's an excellent guide for beginners about how it works: Understanding Darcs.