Extending the history of a Mercurial repository into the past

I started development on a project (which used CVS) by downloading its sources, creating a fresh HG repository, and working in that. However, the original project has now been converted to Mercurial as well.
Can I add its history from before my initial commit to my repository?
Alternatively, how can I push my repository to the remote one so as to preserve the history of both?

You can't change the ancestors of your current repo without altering the hash IDs of every changeset, which essentially makes it a different repo. The hashes of the "left parent" and "right parent" are part of "who a changeset is", so giving a parent to the first parent-less changeset in your current repo would change that first changeset's hash. Since it is the parent of the second changeset, that would change the second changeset's hash, and so on down the line.
If you're okay with changing the hashes of your existing repo (which you shouldn't be if anyone else out in the wild has clones of it), you could use the convert extension or even just import/export to attach your repo to their newly converted repo.
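To see why the hashes cascade, here is a toy sketch rather than Mercurial's real algorithm. It assumes a changeset ID is simply the SHA-1 of the parent ID plus the content, and that `sha1sum` from GNU coreutils is available. The `node_id` helper and the commit texts are made up for illustration.

```shell
#!/bin/sh
# Toy model only: ID = SHA-1 of "parent ID + content".
# Real Mercurial hashes both parents plus the changeset data; this is simplified.
node_id() {
    # $1 = parent ID, $2 = content
    printf '%s\n%s\n' "$1" "$2" | sha1sum | cut -d' ' -f1
}

null=0000000000000000000000000000000000000000

# Existing history: the first changeset has no parent.
a1=$(node_id "$null" "first commit")
b1=$(node_id "$a1" "second commit")

# Same contents, but the first changeset now hangs off imported upstream history.
upstream=$(node_id "$null" "converted CVS history")
a2=$(node_id "$upstream" "first commit")
b2=$(node_id "$a2" "second commit")

echo "first:  $a1 -> $a2"
echo "second: $b1 -> $b2"
```

Both lines should show two different IDs: the second changeset's content is untouched, yet its ID changes because its parent's ID did.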


shun a whole commit in fossil

I have added a directory of files in my fossil repository, but:
the files contained occupy way more space than I expected
I realized afterward that adding it was completely superfluous.
So now I find myself with a repository one order of magnitude bigger than it needs to be to contain files that were never useful.
The whole directory was added in a single commit, nothing else was done in that commit, and it has never been modified since, but I had to do other commits afterward (now that I'm more confident with fossil, I know that I could have used undo before doing anything else, but at the time I wasn't aware of the possibility).
The only way I found to do the job is to shun the data to remove it, but I also read online that this operation can wreak havoc in the database. Given that this is a work-related repository, I'm concerned about causing damage.
Is there a way to get rid of those files that is safe and will not leave the database in a corrupted/full of warning state?
If the bad checkin exists only in your repository (or your repository plus a server) and has not been pulled by other users, the simplest solution is to use fossil purge.
Use fossil purge checkins <tag> to move those checkins to the "graveyard"; the <tag> part can also be the hash of a checkin, not just a symbolic tag. Be aware that if you specify a branch, the entire branch will be purged; even if you don't specify a branch, all descendants of the checkin will be purged (as they depend on it). Once you've confirmed that everything is in order, use fossil purge obliterate to get rid of the graveyard if you need to free up the disk space. If you don't need the disk space, you can let the graveyard sit around for a while until you're certain that everything is okay. Consult fossil help purge for further options.
You may want to keep a backup of the repository (it's just a single file, you can just copy it) for a bit in case something didn't go right.
The shunning mechanism exists only to purge artifacts globally and is meant to be used on a central server as a last resort: it will prevent those artifacts from being propagated anymore to other users via that server. If your changes are local only or if you have access to all the servers and can use fossil purge instead, shunning is unnecessary.
If you actually need to purge something in the middle of a branch, additional steps are required.
1. Make a backup of the repository file, as you're going to do non-trivial surgery on it.
2. Use fossil update to move to the checkin just prior to the defective one.
3. Use fossil merge --cherrypick to copy the first "good" checkin. Do fossil commit --allow-fork to commit the copy of that checkin; the editor should be prepopulated with the original commit message. You will be prompted to confirm that you don't want to change the commit message. Press "y".
4. Repeat step 3 (fossil merge --cherrypick + fossil commit) for all remaining "good" checkins. You won't need --allow-fork for these.
5. You should now have a fork with all the checkins that you want to preserve and a separate fork with the bad checkin and the original version of the good ones. Verify the graph in fossil ui to see that everything is in order. Once that is done, use fossil purge to get rid of the bad checkin and its descendants as described above.
The process in steps 3+4 can be automated with a shell script:
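#!/bin/sh
# Replay each checkin given on the command line onto the current checkout.
set -e
for commit in "$@"; do
    # Apply the changes of this one checkin to the working directory.
    fossil merge --cherrypick "$commit"
    # VISUAL=true skips the editor; "yes" confirms keeping the original message.
    echo yes | VISUAL=true fossil commit --allow-fork
done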
Put this in a file, say fossil-replay.sh, make it executable, then use fossil-replay.sh commit1 commit2 ... commitn to replay commit1 through commitn from the current position in the repository. Obviously, replace commit1 etc. with the actual commit hashes.

Reconstruct / rebuild a mercurial repository

First off: I know that hg branches are immutable and cannot be renamed. I am also aware of the mutable branches extension for hg. But I'd prefer a different approach: I can never be sure that all of our developers have it installed and active, and it's still "only" an extension.
My question: We have a repo with about 20 branches in it. Due to various reasons (inexperienced use, bad choices, experiments that became production environments) some of those branches were named badly and now our repo is a little confusing. What we'd like to do is rename a few of those branches, because obviously, the more we work with them, the more it's becoming a problem.
Do you have any suggestions? I already thought of a "tool" or some kind of script that recreates the whole repo from scratch, getting changesets of the old repo and committing them - with new branch names - to a new repo, "rebuilding" it. But before I go and waste time in writing something like that, I'd like to hear if there are other possibilities.
FYI: there are about 600 commits with frequent merges across the various branches.
You can rebuild the repository by doing a Mercurial to Mercurial conversion using hg convert. Enable the convert extension first and create a branchmap to do the mapping of branch names from old to new:
a-bad-name new-name
another-bad-name better-name
You can use that to map multiple bad names into a single good name, for example.
After the conversion, you will have a new repository with the same history, but with different branch names. The changeset hashes will thus be different and people will have to reclone (but I think you're aware of this already).
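As a rough sketch of the whole procedure: the snippet below writes the branchmap shown above to a file and wraps the conversion in a small helper. The file name branchmap.txt, the helper name `rename_branches`, and the repository paths are placeholders I picked, not anything Mercurial requires.

```shell
#!/bin/sh
# Branch map: one "old-name new-name" pair per line.
cat > branchmap.txt <<'EOF'
a-bad-name new-name
another-bad-name better-name
EOF

# Convert repository $1 into a new repository $2, renaming branches on the way.
# --config extensions.convert= enables the bundled convert extension for this
# one invocation, so nothing has to be added to your hgrc.
rename_branches() {
    hg --config extensions.convert= convert --branchmap branchmap.txt "$1" "$2"
}
```

With the helper defined, `rename_branches old-repo new-repo` would leave old-repo untouched and write the renamed history to new-repo, so you can inspect the result with hg log before asking anyone to reclone.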

How to push only a subset of committed files?

We recently switched from svn to mercurial. We're using Aptana as our IDE with the MercurialEclipse plugin, along with BitBucket for our repositories and SourceTree as our (additional) source control GUI.
I created 2 new files in Aptana, and committed each of them. Now in the Synchronize view, where the 2 files are listed as "outgoing", I'd like to push only one of them. I avoided using the "push all" icon at the top which would push all outgoing changes - instead I right-clicked a specific file in the outgoing list and chose "push" from the context menu. However, this caused both outgoing changes to be pushed. I can't seem to find any option to push only a specific file or subset of files of the committed changes. Is there any way to accomplish this in Aptana?
Note: My answer doesn't relate to Aptana, but instead covers what I think your issue is.
I think the main problem is a misunderstanding of how Mercurial stores its changes, which coming from a Subversion background is perfectly reasonable.
In Subversion, change history can be considered to be stored per file. That is, if you change two files and commit them, you can easily, and often do, have a situation where files in your working copy are at different versions.
In Mercurial, change history is stored across the whole repository. Committing will create a new "Changeset", which stores the state of the entire repository at that time. When you decide to push a change out to another repository, all modifications (or adds, or deletes, or...) will be pushed out with that change.
A caveat is that when you decide to commit a new changeset to your repository, you can selectively include or exclude files. Files not included will remain in your working copy as pending modifications, which can be committed in a new changeset.
I hope that makes sense to you - if you already understand it, it's a logical concept, but I find it tricky to explain.
So, on to your problem.
Let's say you have two files in your repository, file1 and file2 (it's that or foo and bar). You've changed them both, but they relate to different issues - they can be committed as different changesets:
$ hg log
changeset 0:....
summary: First commit
$ hg st
M file1
M file2
$ hg commit -I file1 -m "Changed file1"
$ hg log
changeset 1:....
summary: Changed file1
changeset 0:....
summary: First commit
$ hg st
M file2
Here you can see that we've committed only one file into the repository, and it's made a new changeset with the complete state of the repository at that time, minus the changes to file2. We can now do the same, committing file2, which will create another changeset. The problem with this approach is that changesets are ordered according to their parent, and so you couldn't easily push just the change to file2, without also pushing its parent - but it may be closer to what you're after.
TL;DR : SVN stores the state of individual files, Mercurial stores the state of the repository as a whole.
I very much recommend reading Mercurial: The Definitive Guide. It's a little out-of-date in places, but I think it will do a much better job of getting the concept across.

Propagation of changes in Mercurial

I have the following question. Suppose I have a bunch of hosted repositories that form a hierarchy, e.g. A -> B -> C (meaning A is the central repository and all the rest are its descendants).
Now suppose I work with the clone of C. Suppose I want to get the changes not just from C, but from the central repository, so I do the following commands:
hg pull [Address of A]
hg up
That seems perfectly legal, but what happens when I commit my changes and push them to C? Not only my local modifications will be pushed, but also the modifications of central repository (if there are any). What will happen if someone will try to pull the changes from A to C? Will there be a conflict or it will merge successfully the changes A -> local -> C with changes A -> C. Will Mercurial recognize it as the same changeset or not?
The identical situation takes place if I decide that my code is stable enough and can be placed in central repository:
hg commit -u spirit -m "A local modification that is stable"
hg push [Address of A]
What will happen if I make a pull from A to C and then pull from C to my local repo again, will it recognize these changeset as originating from my local repo, or will it report a conflict and suggest a merge?
And what is the best practice in that case anyway? Performing just subsequent pulls and pushes (i.e. A<->B, B<->C, C<->local)? But the problem is that I have just access to my local repository that is clone of C. How can I make a pull from B to C if I would want to on my local machine? How does Mercurial handle
What will happen if someone will try to pull the changes from A to C? Will there be a conflict or it will merge successfully the changes A -> local -> C with changes A -> C. Will Mercurial recognize it as the same changeset or not?
The changesets that already exist in C will be seen to already exist because their IDs will already be present. This is why changesets are immutable. If you could modify a changeset, its ID would change (the ID is a hash of the changeset contents). Mercurial would then (correctly) see it as a different changeset. By keeping the changesets immutable we can be sure that their hash will be the same whichever repo they come from.
pull = copy changesets from over there that have IDs I haven't seen
push = copy changesets to over there that have IDs they haven't seen
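Those two definitions can be played through with plain text files. This is only a toy model, not how Mercurial stores anything: each "repository" is a sorted file of made-up changeset IDs, and push/pull is the set difference computed with comm.

```shell
#!/bin/sh
# Toy model: a repository is a sorted list of changeset IDs (made-up short IDs).
dir=$(mktemp -d)
printf '%s\n' a1 b2 c3    > "$dir/A"      # central repo
printf '%s\n' a1 b2       > "$dir/C"      # hosted clone, one changeset behind A
printf '%s\n' a1 b2 c3 d4 > "$dir/local"  # pulled c3 from A, then committed d4

# push local -> C: send the IDs that C has not seen yet
pushed=$(comm -23 "$dir/local" "$dir/C")
sort -u "$dir/local" "$dir/C" > "$dir/C.new" && mv "$dir/C.new" "$dir/C"

# pull A -> C: fetch the IDs that C has not seen yet
pulled=$(comm -23 "$dir/A" "$dir/C")

echo "pushed to C:   $(echo $pushed)"
echo "pulled from A: ${pulled:-nothing}"
rm -r "$dir"
```

The push carries both c3 (which came from A) and d4 (the local commit). The later pull from A into C finds nothing to transfer, because c3 is already in C under the same ID.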
Will there be a conflict
No
or it will merge successfully the changes A -> local -> C with changes A -> C.
No merge takes place because they are the same changes. It just sees that it already has them.
Merges don't happen unless you want to combine the changes from two parallel sets of changes, and not unless you explicitly ask for it.
As long as the changesets are identical, i.e. they have the same ID hash, Mercurial considers them to be the same.
When you modify changesets with rebase or mq commands, the ID hashes change, and Mercurial might place two very similar changesets into a repository.

What am I doing wrong with SVN merging?

When SVN with merge tracking works, it's really nice, I love it. But it keeps getting twisted up. We are using TortoiseSVN. We continuously get the following message:
Error: Reintegrate can only be used if revisions 1234 through 2345 were previously merged from /Trunk to the reintegrate source, but this is not the case
For reference, this is the method we are using:
Create a Branch
Develop in the branch
Occasionally Merge a range of revisions from the Trunk to the Branch
When branch is stable, Reintegrate a branch from the branch to the trunk
Delete the branch
I Merge a range of revisions from the trunk to the branch (leaving the range blank, so it should be all revisions) just prior to the reintegrate operation, so the branch should be properly synced with the trunk.
Right now, the Trunk has multiple SVN merge tracking properties associated with it. Should it? Or should a Reintegrate not add any merge tracking info?
Is there something wrong with our process? This is making SVN unusable - 1 out of every 3 reintegrates forces me to dive in and hack at the merge tracking info.
This problem sometimes happens when a partial merge has been done from trunk to branch in the past. A partial merge is when you perform a merge on the whole tree but only commit part of it. This will give you files in your tree that have mergeinfo data that is out of sync with the rest of the tree.
The --reintegrate error message above should list the files that svn is having a problem with (at least it does in svn 1.6).
You can either:
Merge the problem files manually from trunk to branch, using the range from the error message. Note: you must subtract 1 from the start of the range, so the command you'd run would be:
cd <directory of problem file in branch working copy>
svn merge -r1233:2345 <url of file in trunk>
svn commit
or
If you're certain that the contents of the files in your branch are correct and you just want to mark the files as merged, you could use the --record-only flag to svn merge:
cd <directory of problem file in branch working copy>
svn merge --record-only -r1233:2345 <url of file in trunk>
svn commit
(I think you can use --record-only on the entire tree, but I haven't tried it and you'd have to be absolutely sure that there are no real merges that need to come from trunk)
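The "subtract 1" rule follows from how -rN:M works: it applies the difference between revision N and revision M, so the changes made in revision N itself are not included. A small sketch of the arithmetic, using the revision numbers from the error message in the question (the variable names are mine):

```shell
#!/bin/sh
# Revisions quoted in the reintegrate error: "revisions 1234 through 2345".
first=1234
last=2345

# -rN:M covers the changes made after N, up to and including M,
# so the lower bound has to be one less than the first revision you want.
range="-r$((first - 1)):$last"
echo "svn merge $range <url of file in trunk>"
```

This only prints the command; substitute the real trunk URL of the problem file before running it.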
Bunny hopping might be the solution.
Basically, instead of continuously merging trunk changes into a single branch (branches/foo, let's call it), when you want to pull those changes from trunk:
Copy trunk to a new branch (branches/foo2).
Merge in the changes from the old branch (merge branches/foo into branches/foo2).
Delete the old branch (delete branches/foo).
Your problem is that you're trying to use Reintegrate merge on a branch that has been 'corrupted' by having a 'half merge' already done on it. My advice is to ignore reintegrate and stick to plain revision-range merging if this is your workflow.
However, the big reason you get errors is that SVN is performing some checks for you. In this case, if the merge has extra mergeinfo from individual files in there, then svn will throw a wobbly and prevent you from merging - mainly because this case can produce errors that you might not notice. This is called a subtree merge in svn reintegrate terminology (read the Reintegrate to the Rescue section, particularly the controversial reintegrate check at the end).
You can stop recording mergeinfo when you perform your intermediary merges, or just leave the branch alone until it's ready - then the merge will pick up changes made to trunk. I think you can also bypass this check by only ever merging the entire trunk to branch, not individual files, thus keeping mergeinfo safe for the final reintegrate at the end.
EDIT:
@randomusername: I think (never looked too closely) the problem with moving is that it falls into the 'partial merge' trap. One cool feature of SVN is that you can do a sparse checkout - only get a partial copy of a tree. When you merge a partial tree in, SVN cannot say that the entire thing was merged, as it obviously wasn't, so it records the mergeinfo slightly differently. This doesn't help with reintegrate, as the reintegrate has to merge everything back to the trunk, and now it finds that some bits were modified without being merged, so it complains. A move appears to be kind of the same thing - a piece of the branched tree now appears differently in the mergeinfo than it expects. I would not bother with reintegrate, and stick with the normal revision range merge. It's a nice idea, but it's trying to be too many things to too many users in too many different circumstances.
The full story for mergeinfo is here.
I suspect you're not following the merge instructions correctly:
"Now, use svn merge with the --reintegrate option to replicate your branch changes back into the trunk. You'll need a working copy of /trunk. You can get one by doing an svn checkout, dredging up an old trunk working copy from somewhere on your disk, or using svn switch (see the section called “Traversing Branches”). Your trunk working copy cannot have any local edits or contain a mixture of revisions (see the section called “Mixed-revision working copies”). While these are typically best practices for merging anyway, they are required when using the --reintegrate option.
Once you have a clean working copy of the trunk, you're ready to merge your branch back into it:"
I have few problems with merging.