Conflict marking confusion when pulling a deleted file with darcs - version-control

My confusion arises from the following statement taken from here:
When pulling patches that conflict each other (e.g., change the same part of the file) Darcs detects the conflict and marks it in the repository content. It then lets the user resolve the problem.
This seemed inconsistent with what I was seeing, so I created the following work-flow using darcs 2.5.2:
Create repo foo;
Create a non-empty file in foo and record it;
Clone foo to bar;
Remove the file in foo and record it;
Add another line to the file in bar and record it;
Pull from foo into bar, obtain conflict notification;
After taking these steps I ran darcs whatsnew in bar, and was shown two 'patch primitives':
A hunk removing all of the "non-empty file in foo", but with no mention of the line added and recorded in bar;
A rmfile removing the file.
My question is: Why is there no mention of the line added and recorded in bar?
If I run darcs revert in bar, then everything makes sense: I see the "non-empty file" affected by neither conflicting patch, as per this statement taken from here:
The command darcs revert will remove the conflict marking and back up to state before conflicting patches.
But then if I run darcs mark-conflicts I am back to the same state as after the pull, with the two 'patch primitives' mentioned above, and no mention of the line added and recorded in bar.
For reference / reproduction here is my complete work-flow from the command line:
$ mkdir foo
$ cd foo/
foo$ darcs initialize
foo$ touch shopping
foo$ vi shopping <-- add a couple of lines
foo$ darcs add shopping
foo$ darcs record
addfile ./shopping
Shall I record this change? (1/2) [ynW...], or ? for more options: y
hunk ./shopping 1
+cake
+pie
Shall I record this change? (2/2) [ynW...], or ? for more options: y
What is the patch name? Added shopping
Do you want to add a long comment? [yn]n
Finished recording patch 'Added shopping'
foo$ cd ..
$ darcs get foo/ bar
$ cd bar/
bar$ vi shopping <-- add another line
bar$ darcs record
hunk ./shopping 2
+beer
Shall I record this change? (1/1) [ynW...], or ? for more options: y
What is the patch name? Added beer
Do you want to add a long comment? [yn]n
Finished recording patch 'Added beer'
bar$ cd ../foo
foo$ rm shopping
foo$ darcs record
hunk ./shopping 1
-cake
-pie
Shall I record this change? (1/2) [ynW...], or ? for more options: y
rmfile ./shopping
Shall I record this change? (2/2) [ynW...], or ? for more options: y
What is the patch name? Removed shopping
Do you want to add a long comment? [yn]n
Finished recording patch 'Removed shopping'
foo$ cd ../bar
bar$ darcs pull
Pulling from "../foo"...
Mon Nov 14 19:26:44 GMT 2011 dukedave#gmail.com
* Removed shopping
Shall I pull this patch? (1/1) [ynW...], or ? for more options: y
Backing up ./shopping(-darcs-backup0)
We have conflicts in the following files:
./shopping
Finished pulling and applying.
bar$ darcs whatsnew
hunk ./shopping 1
-cake
-pie
rmfile ./shopping

If you run darcs changes -v inside bar, you'll see the history of your
changes, including the conflictor introduced as a result of you pulling
conflicting patches.
I've summarised your example to something ever so slightly shorter:
DARCS=/usr/bin/darcs
$DARCS init --repo foo
cd foo
echo 'a' > myfile
$DARCS add myfile && $DARCS record -am 'Add myfile'
$DARCS get . ../bar
rm myfile
$DARCS record -am 'Remove myfile'
cd ../bar
echo 'b' >> myfile
$DARCS record -am 'Change myfile'
$DARCS pull -a ../foo
$DARCS changes -v
Now, after that, I see this output from darcs changes -v
Tue Nov 15 19:44:38 GMT 2011 Owen Stephens <darcs#owenstephens.co.uk>
* Remove myfile
conflictor [
hunk ./myfile 2
+b
]
|:
hunk ./myfile 1
-a
conflictor {{
|:
hunk ./myfile 2
+b
|:
hunk ./myfile 1
-a
}} []
|hunk ./myfile 1
|-a
|:
rmfile ./myfile
Tue Nov 15 19:44:38 GMT 2011 Owen Stephens <darcs#owenstephens.co.uk>
* Change myfile
hunk ./myfile 2
+b
Tue Nov 15 19:44:38 GMT 2011 Owen Stephens <darcs#owenstephens.co.uk>
* Add myfile
addfile ./myfile
hunk ./myfile 1
+a
So, let's explain the crazy output of "Remove myfile". "Remove myfile" exists
as the following in foo:
Tue Nov 15 19:44:38 GMT 2011 Owen Stephens <darcs#owenstephens.co.uk>
* Remove myfile
hunk ./myfile 1
-a
rmfile ./myfile
So, a hunk at line 1 and removal of the file.
Pulling "Remove myfile" into bar, we modify the patch contents by introducing special "conflictor" primitives that represent the primitives within "Remove myfile" that conflict with other primitives in bar. N.B. there is no information loss here: we can always get back to the original primitives by unpulling the conflicting changes, in this case by unpulling "Change myfile".
Conflictors are confusing, but AFAICT they essentially separate the changes that
conflict with a current patch, x, into 2 sets:
"ix" which is the set of patches that includes:
i) patches that conflict with x and some other patch in the repo
ii) patches that conflict with a patch that conflicts with x
"xx" which is the sequence of patches that only conflict with the patch x.
I think the reason that this is done, is that Conflictors have the effect of
"undoing" primitives that cause conflicts, but only those that haven't been
undone by another Conflictor.
The output we see is something like:
"conflictor" ix "[" xx "]" x
I'm abusing notation, but hopefully you can somewhat decipher that (see
src/Darcs/Patch/V2/(Real.hs|Non.hs) in the darcs.net repo for the full story)
In this case, "Remove myfile" has 2 primitive patches, and (in this case) 2
corresponding conflictors when pulled into bar.
The first primitive (remove line 1 from myfile) only conflicts with the
primitive within "Change myfile" (add 'b' to line 2 of myfile) and so that's
the first conflictor:
conflictor [ <--- The start of xx (no ix here)
hunk ./myfile 2
+b
] <--- The end of xx
|:
hunk ./myfile 1 <--- x
-a
N.B. ("|:" is a marker that delimits a "Non" primitive's context from the
primitive itself. I won't try to explain it further; just read below |: to
see the primitive in question.)
The second primitive (remove myfile) is only slightly more complicated: (rmfile
myfile) conflicts with (add 'b' to line 2 of myfile) which as we know conflicts
with (remove line 1 from myfile), so they both go into "ix", with no patches in
"xx". I'll remove the unnecessary "|:" delimiters and space things out:
conflictor {{
hunk ./myfile 2
+b
hunk ./myfile 1
-a
}}
[] <--- no xx
|hunk ./myfile 1 <--- start of x
|-a
|:
rmfile ./myfile <--- end of x
The final (rmfile myfile) has some context to identify the exact primitive
that we're referring to (I'm not really sure why/how this is required, but
there we are), which is marked by leading '|'s and delimited by "|:".
Finally, to attempt to explain the output of darcs whatsnew in bar: when
multiple patches conflict, I think the actual effect of the conflictor is to
"undo" any conflicting patches, giving the effect of neither.
The following page gives the start of some explanations: http://en.wikibooks.org/wiki/Understanding_Darcs/Patch_theory_and_conflicts.
I think what you're seeing is the result of the forced commutation of "Change myfile" and "Remove myfile"; call them A and B respectively. To merge the two, Darcs creates A^-1 and commutes A^-1 and B to give B' and (A^-1)', where B' has the effect of A^-1 (since we're forcing the commutation to work). This means that the effect of B' (i.e. the merged "Remove myfile") is actually just to undo the addition of the line made by "Change myfile".
I haven't had time to look at how darcs mark-conflicts works, so I can't yet explain the working changes you're seeing with darcs changes in bar.


Why is Mercurial matching a nonexistent local revision number?

Quick intro: In Mercurial there are two different ways to numerically refer to a changeset.
First, there's the node ID hash. It is global and functions like a git commit hash. It consists of 40 hexadecimal digits.
Second, there's the local revision number. It is a decimal number that starts at 0 and counts up. Unlike the node hash, this is local, meaning the same changeset can have different local revision numbers in two different repos. This depends on what other changesets are present in each repo and depends even on the order each repo received their changesets.
A revision can be specified numerically to Mercurial as a local revision number, a full 40-digit hash, or "a short-form identifier". The latter gives a unique prefix of a hash; that is, if only one full hash starts with the given string then the string matches that changeset.
I found that in certain cases, Mercurial commands (such as hg log with an -r switch), given plain decimal numbers, will match some revision even though there aren't enough local revisions for the given number to match as a local revision number.
Here's an example I constructed after coming across such a case by chance:
test$ hg --version
Mercurial Distributed SCM (version 6.1)
(see https://mercurial-scm.org for more information)
Copyright (C) 2005-2022 Olivia Mackall and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
test$ hg init
test$ touch a
test$ hg add a
test$ hg ci -d "1970-01-01 00:00:00 +0000" -u testuser -m a
test$ touch b
test$ hg add b
test$ hg ci -d "1970-01-01 00:00:00 +0000" -u testuser -m b
test$ hg log
changeset: 1:952880b76ae5
tag: tip
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: b
changeset: 0:d61f66df66f9
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: a
test$ hg log -r 2
abort: unknown revision '2'
test$ hg log -r 9
changeset: 1:952880b76ae5
tag: tip
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: b
test$
As is evident, hg log -r 9 matches a changeset even though there aren't that many changesets to match the 9 as a local revision number.
The question: Why is this? Additionally, how can we avoid matching a nonexistent local revision number?
This is due to how Mercurial parses revision specifiers. Here's how Olivia Mackall explains it in a mail from 2014:
Here is a hexadecimal identifier:
60912eb2667de45415eff601bfc045ae0fe8db42
See how it starts with 6? If you ask for revision 6, Mercurial will:
a) look for revision 6
b) if that fails, look for a hex identifier starting with "6"
c) if we find more than one match, complain
d) if we find no matches, complain
e) we found one match: success!
That is, if hg log -r 9 doesn't match any local revision number (because there are fewer than ten changesets in the repo), Mercurial will next try to match a node hash that happens to start with a 9.
To avoid this ambiguity, she responded that one should use hg log -r 'rev(9)' to match only local revision numbers, and hg log -r 'id(9)' to match only prefixes or full hashes.
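The lookup order described in the quoted mail can be sketched in a few lines of Python. This is only an illustration of the fallback logic, not Mercurial's actual code: the function names resolve, rev_only and id_only are made up for this sketch, and real Mercurial handles more cases (negative numbers, tags, bookmarks and so on). The two short hashes are the ones from the hg log output above.

```python
class UnknownRevision(Exception):
    """Raised when a specifier matches nothing (like "abort: unknown revision")."""


class AmbiguousIdentifier(Exception):
    """Raised when a hex prefix matches more than one node hash."""


def rev_only(spec, nodes):
    # Roughly what rev(N) does: match only as a local revision number.
    if spec.isdigit() and int(spec) < len(nodes):
        return int(spec)
    return None


def id_only(spec, nodes):
    # Roughly what id(S) does: match only as a prefix of a node hash.
    matches = [rev for rev, node in enumerate(nodes)
               if node.startswith(spec.lower())]
    if len(matches) > 1:
        raise AmbiguousIdentifier(spec)
    return matches[0] if matches else None


def resolve(spec, nodes):
    # Plain "-r SPEC": try the revision number first, then fall back
    # to a hash prefix, which is where the surprise comes from.
    rev = rev_only(spec, nodes)
    if rev is None:
        rev = id_only(spec, nodes)
    if rev is None:
        raise UnknownRevision(spec)
    return rev


# Index in the list = local revision number (hashes taken from the log above).
NODES = ["d61f66df66f9", "952880b76ae5"]

print(resolve("9", NODES))   # falls through to the hash starting with "9"
print(rev_only("9", NODES))  # None: there is no local revision 9
```

With these two changesets, resolve("2", NODES) raises UnknownRevision, because there is neither a revision 2 nor a hash starting with "2", which matches the abort shown in the transcript.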
In the documentation on revsets, these predicates are listed as:
"id(string)"
Revision non-ambiguously specified by the given hex string prefix.
And:
"rev(number)"
Revision with the given numeric identifier.
Unfortunately, neither this page nor the help page on revisions (as of version 6.1) explicitly points out the ambiguity between numbers that can match either as local revision numbers or as node hash prefixes. The 2014 mailing list thread I quoted does contain suggestions to clarify this, but it appears nothing came of it.
Additionally, here is a changeset message in which I explained the entire affair and how it came to affect the operation of a script of mine:
fix to use 'rev(x)' instead of just x to refer to local rev number
The revsets syntax to unambiguously refer to a local revision
number is to wrap the number in rev(). Without this, a number
that doesn't exist (eg -r 2) may be misinterpreted to refer to
a changeset that has a node hash starting with the requested
number.
In our case this bug happened to act up after the revision on at
2022-04-02 16:10:42 2022 Z "changed encoding from cp850 to utf8"
which the day after it was added was converted as the second
(local revision number 1) changeset from the svn repo. The
particular hg convert command was:
hg convert svn-mirror DEST \
--config hooks.pretxncommit.checkcommitmessage=true \
--config convert.svn.startrev=1152
This created a changeset known as 1:2ec9f101bc31 from that svn
revision. Lacking a local revision number 2, the -r 2 picked up
this changeset because its hash started with the digit "2".
Thus the NEWNODE variable received the changeset hash for this
changeset. Because our hg rebase command is configured to keep
empty changesets, the changeset got added atop its already
existing copy in the destination repo.
Ever since the akt.sh script would pick up the wrong revision
number from the destination repo and abort its run with the
message indicating "Revisions differ!".

How do you find the changesets between two tags in mercurial?

If I have two tags named 2.0 and 2.1, how do I find the changeset messages between the two? I'm trying to find a way to use hg to make release notes and list the different messages associated with the commits.
Example Changeset:
changeset: 263:5a4b3c2d1e
user: User Name <user.name#gmail.com>
date: Tue Nov 27 14:22:54 2018 -0500
summary: Added tag 2.0.1 for changeset 9876fghij
Desired Output:
Added tag 2.1 for changeset 67890pqrst
Change Info...
Added tag 2.0.1 for changeset 9876fghij
Change Info...
Added tag 2.0 for changeset klmno12345
Preface
"Any challenge has a simple, easy-to-understand wrong decision". And Boris's answer is a nice illustration of this rule: the "::" topo-range will produce good results only in the case of pure single-branch development (which is, in general, The Bad Idea (tm) anyway).
Face
A good solution must correctly handle complex DAGs and answer the question "Which new changesets are included in NEW but missing in OLD (regardless of how they got there)?"
For me, that is the "only()" function in revsets, used with both parameters:
"only(set, [set])"
Changesets that are ancestors of the first set that are not ancestors
of any other head in the repo. If a second set is specified, the
result is ancestors of the first set that are not ancestors of the
second set (i.e. ::set1 - ::set2).
hg log -r "only(2.1,2.0)"
maybe for better presentation powered by predefined style "changelog"
hg log -r "only(2.1,2.0)" -s changelog
or custom style|template
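To see why only(2.1,2.0) and 2.0::2.1 can differ, here is a minimal Python sketch over a toy DAG. The graph, the revision numbers and the function names are invented for illustration; only the set definitions (::set1 - ::set2 for only(), and "descendants of the start that are also ancestors of the end" for x::y) are taken from the revset documentation.

```python
def ancestors(parents, rev):
    """Return rev plus all of its ancestors (like the revset ::rev)."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def only(parents, new, old):
    """Like only(new, old): ancestors of new that are not ancestors of old."""
    return ancestors(parents, new) - ancestors(parents, old)


def dag_range(parents, start, end):
    """Like start::end: ancestors of end that have start as an ancestor."""
    return {r for r in ancestors(parents, end)
            if start in ancestors(parents, r)}


# Hypothetical history: revision 1 is tagged 2.0 and revision 4 is tagged 2.1.
# Revision 3 is a side branch started from 0 and merged into 4.
#
#   0 --- 1 --- 2 --- 4
#    \               /
#     +----- 3 -----+
PARENTS = {0: [], 1: [0], 2: [1], 3: [0], 4: [2, 3]}

print(sorted(only(PARENTS, 4, 1)))       # [2, 3, 4]
print(sorted(dag_range(PARENTS, 1, 4)))  # [1, 2, 4]
```

In this toy history the range form misses revision 3 (work merged in from a branch that did not descend from 2.0) and includes the 2.0 changeset itself, while only() lists exactly what is new in 2.1.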
You'll want to use a revset to select all changesets between two tags, for example: 2.0::2.1 will likely do the trick. You can validate the selected changesets by running: hg log -G -r '2.0::2.1'. (See hg help revset for more information about revsets).
Once you have selected the right changesets, you can apply a template to retrieve only the information you need. For example, hg log -r '2.0::2.1' -T '{desc}\n' prints the whole description of each changeset, while hg log -r '2.0::2.1' -T '{desc|firstline}\n' prints only the first line of each description.
If you want to add even more information, hg help template is your friend.

Workflow to synchronise Mercurial repositories via email with bundles

I have two directories on two different computers - machine A (Windows) and machine B (OSX) - and I want to keep the two directories via Mercurial in sync. [*]
The restriction is that the two machines are not connected via LAN/WAN; the only way to move data between them is via email. So I thought emailing Mercurial bundles as deltas could do the trick.
My current workflow is roughly this (using a local tag lcb for the latest change bundle):
Say I work on machine A. At the end of the day I do:
hg commit -A -m "changes 123"
hg bundle --base lcb bundle_123.hg
hg tag --local -f lcb --rev tip
finally then I email that bundle to machine B.
Then sitting at machine B I do
hg unbundle bundle_123.hg
hg merge
hg commit -A -m "remote changes 123"
hg tag --local -f lcb --rev tip
Now I'm working on machine B and at the end of the day I do what's listed under 1., but on machine B. And the cycle continues...
However, I'm worried this system is not robust enough:
In-between changes: What happens when, after creating a bundle (Step 1) and before applying it remotely (Step 2), a change occurs on the remote machine B? I had a case where it just overwrote the changes with the new bundle without a conflict warning or merge suggestion.
Double-applying of bundle: What happens when a bundle is accidentally applied twice? Would I need to record the applied bundles somehow with local tags?
Or is there another better workflow to transfer Mercurial deltas via email?
[*] From the answer to a superuser question I figured that Mercurial might be the most feasible way to do this.
In-between changes: What happens when, after creating a bundle (Step 1) and before applying it remotely (Step 2), a change occurs on the remote machine B? I had a case where it just overwrote the changes with the new bundle without a conflict warning or merge suggestion.
If a change is made on machine B, then this change will have been made in parallel with the changes you bundled from machine A. It doesn't really matter if the changes are made before or after you create the bundle (time-wise), it only matters that the changes on machine B don't have the head from machine A as their ancestor.
In other words, the world looks like this when the two machines are in sync:
A: ... [a]
B: ... [a]
You then create some new commits on machine A:
A: ... [a] --- [b] --- [c]
B: ... [a]
You bundle using [a] as base, so you get a bundle with [b] and [c]. Let us now say that someone (perhaps yourself) makes a commit on machine B:
A: ... [a] --- [b] --- [c]
               ( bundled )
B: ... [a] --- [x]
So far nothing has been exchanged between the two repositories, so this is just a normal case of people working in parallel. This is the norm in a distributed version control system: people working in parallel is what creates the need for merge commits.
The need for a merge is not evident in either repository at this point, they both have linear histories. However, when you unbundle on machine B, you see the divergence:
A: ... [a] --- [b] --- [c]
               ( bundled )
B: ... [a] --- [x]
          \
           [b] --- [c]
           ( unbundled )
It is helpful to realize that hg unbundle is exactly like hg pull, except that it can be done offline. That is, the data stored in a bundle is really just the data that hg pull would have transferred if you had had an online connection between the two repositories.
You would now proceed by merging the two heads [x] and [c] to create [y] on machine B:
A: ... [a] --- [b] --- [c]
B: ... [a] --- [x] --- [y]
          \           /
           [b] --- [c]
When you next create a bundle on machine B, remember that your last bundle was created with [a] as a base. However, you also know that machine A has commit [c], so you can specify that as an additional base if you like:
$ hg bundle --base a --base c stuff-from-machine-b.hg
That will put [x] and [y] into the bundle:
bundle: (a) --- [x] --- [y]
                       /
                    (c)
Here I use (a) and (c) to denote the required bases of the bundle. You can only unbundle this bundle if you have both [a] and [c] in your repository. If you leave out the second base (only use [a]), you will also bundle [b] and [c]:
bundle: (a) --- [x] --- [y]
           \           /
            [b] --- [c]
Here you included everything except [a] in the bundle. Bundling too much is okay, as we will see next.
Double-applying of bundle: What happens when a bundle is accidentally applied twice? Would I need to record the applied bundles somehow with local tags?
Applying a bundle twice is exactly like running hg pull twice: nothing happens the second time. When unbundling, Mercurial looks in the bundle and imports the missing changesets. So if you unbundle twice, there is nothing to do the second time.
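The behaviour described above can be modelled in a few lines of Python. This is a toy model, not how Mercurial stores or transfers data: a repository is just a dict mapping a changeset name to its parents, and make_bundle and unbundle are names invented for this sketch. It reuses the [a], [b], [c] and [x] changesets from the diagrams.

```python
def ancestors(repo, heads):
    """Return the given changesets plus all of their ancestors."""
    seen = set()
    stack = list(heads)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(repo[node])
    return seen


def make_bundle(repo, bases):
    """Model of "hg bundle --base ...": everything the bases do not already imply."""
    known = ancestors(repo, bases)
    return {node: ps for node, ps in repo.items() if node not in known}


def unbundle(repo, bundle):
    """Import only the missing changesets and return how many were added."""
    for parents in bundle.values():
        for p in parents:
            if p not in repo and p not in bundle:
                raise ValueError("missing base changeset: " + p)
    added = 0
    for node, parents in bundle.items():
        if node not in repo:
            repo[node] = parents
            added += 1
    return added


machine_a = {"a": [], "b": ["a"], "c": ["b"]}
machine_b = {"a": [], "x": ["a"]}

bundle = make_bundle(machine_a, ["a"])  # contains [b] and [c]
print(unbundle(machine_b, bundle))      # 2: both changesets were missing
print(unbundle(machine_b, bundle))      # 0: nothing to do the second time
```

After the first unbundle, machine_b holds both [x] and [c] as heads, which is the divergence that hg merge then resolves.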
Initial state
A>hg log --template "{rev}:{node|short} \"{desc}\" - files: {files}\n"
2:415231dbafb8 "Added C" - files: C.txt
1:6d9709a42687 "Added B" - files: B.txt
0:e26d1e14507e "Initial data" - files: .hgignore A.txt
B>hg log --template "{rev}:{node|short} \"{desc}\" - files: {files}\n"
1:72ef13990d0d "Edited A" - files: A.txt
0:e26d1e14507e "Initial data" - files: .hgignore A.txt
i.e:
Identical repos diverged at revision 1 at both sides: independent changes appeared
Test for case 1 - parallel changes
72ef13990d0d in B doesn't interfere with 6d9709a42687:415231dbafb8 in A
A>hg bundle --base e26d1e14507e ..\bundle1-2.hg
2 changesets found
B>hg pull ..\bundle1-2.hg
pulling from ..\bundle1-2.hg
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 2 files (+1 heads)
(run 'hg heads' to see heads, 'hg merge' to merge)
Because B had its own child of e26d1e14507e, pulling from the bundle added an additional head (an anonymous branch for the changesets from A):
B>hg glog --template "{rev}:{node|short} \"{desc}\" - files: {files}\n"
o 3:415231dbafb8 "Added C" - files: C.txt
|
o 2:6d9709a42687 "Added B" - files: B.txt
|
| @ 1:72ef13990d0d "Edited A" - files: A.txt
|/
o 0:e26d1e14507e "Initial data" - files: .hgignore A.txt
Test for case 2 - applying bundle twice
I know a priori that changesets already in the repo will not be pulled again (and I prefer the uniform style of hg pull from a bundle over hg unbundle), but here it is demonstrated:
B>hg pull ..\bundle1-2.hg
pulling from ..\bundle1-2.hg
searching for changes
no changes found
An additional benefit of pull's behaviour: you don't have to worry about moving the base changeset for the bundle, and can always use a single one, the oldest point of divergence. That will (slightly) increase the size of the bundle (only slightly, because by default a bundle is a bzip2-compressed archive), but it also guarantees that all child changesets are included in the bundle and that all missing (and only the missing) changesets are pulled into the destination repository.
And, in any case, even unbundling the same bundle twice will not backfire.
Same bundle in same repo B, attempt to unbundle already pulled bundle
B>hg unbundle ..\bundle1-2.hg
adding changesets
adding manifests
adding file changes
added 0 changesets with 0 changes to 2 files
(run 'hg update' to get a working copy)

How to create patch for a new file, and patch it back to the original directory?

Suppose I have a directory dir1, and have files f1.c and f2.c in it.
I copy all to directory dir2, modify both f1 and f2, and add a new file f3.c.
Then I do the diff to create patch:
diff -ruN dir1/ dir2/ > diff.patch
Now I want to apply the patch back to dir1. The changes in f1 and f2 are patched successfully, but I don't get a new file f3.c in dir1:
[/local/home/tmp]$ patch -p0 < diff.patch
patching file dir1/f1.c
Hunk #1 succeeded at 1 with fuzz 2.
patching file dir1/f2.c
The next patch would create the file dir2/f3.c,
which already exists! Assume -R? [n]
Apply anyway? [n]
Skipping patch.
1 out of 1 hunk ignored
How to apply the patch, so that I can add f3.c in dir1 too?
OK, I've figured out that you must cd into dir1 and use the -p1 parameter:
cd dir1
patch -p1 < ../diff.patch
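The reason this works is the way -pN strips leading path components from the file names in the patch. Here is a small Python sketch of that stripping rule, assuming plain relative paths with single slashes; strip_components is a name made up for this illustration, not part of patch.

```python
def strip_components(path, n):
    """Drop the first n slash-separated components, as patch -pN does
    for simple relative paths."""
    parts = path.split("/")
    return "/".join(parts[n:])


# With -p0, run from the parent directory, the name is used unchanged,
# so the new file is looked up as dir2/f3.c, which already exists.
print(strip_components("dir2/f3.c", 0))  # dir2/f3.c

# With -p1, run from inside dir1, the leading directory is dropped,
# so every name is resolved relative to dir1.
print(strip_components("dir2/f3.c", 1))  # f3.c
print(strip_components("dir1/f1.c", 1))  # f1.c
```

Since both the dir1/ and dir2/ prefixes disappear with -p1, the new file is created as f3.c in the current directory, which is dir1.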

How to view a single Mercurial commit from the command line?

I would like to view from the commandline what was changed in given Mercurial commit similar to what one would see from hg status or from the TortoiseHg tool. The closest I can seem to get is hg log --stat but that prints extra symbols (i.e. pluses and minuses) and I cannot specify at which specific revision I want to look.
I need this because I have developers who have check-in comments like "." or ",". >:-(
It turns out that hg status has a --change argument to which you can pass the revision number (e.g. 109), a relative revision (i.e. -1 is the last commit, -2 is the second-last, etc.), or the hash of the revision, and it will print out the changes (i.e. additions, removals, and modifications) made in that revision.
--change isolates that revision and shows just from that revision, but replacing --change with --rev shows the cumulative effect since that revision to the current state.
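The difference between the two options can be pictured as comparing different pairs of snapshots. The following Python sketch is a simplified model, not Mercurial's implementation: it assumes a linear history stored as a list of file-to-content dicts, and the helper names status, status_change and status_rev are invented for the illustration.

```python
def status(old, new):
    """Compare two snapshots and list modified, added and removed files."""
    modified = ["M " + f for f in sorted(old) if f in new and old[f] != new[f]]
    added = ["A " + f for f in sorted(new) if f not in old]
    removed = ["R " + f for f in sorted(old) if f not in new]
    return modified + added + removed


def status_change(history, rev):
    """Model of --change REV: compare REV with its parent."""
    parent = history[rev - 1] if rev > 0 else {}
    return status(parent, history[rev])


def status_rev(history, rev, working):
    """Model of --rev REV: compare REV with the current working directory."""
    return status(history[rev], working)


# Hypothetical linear history: index = revision number.
HISTORY = [
    {"a.txt": "1"},
    {"a.txt": "2", "b.txt": "x"},
    {"b.txt": "x"},
]
WORKING = dict(HISTORY[-1])  # clean working directory at the last revision

print(status_change(HISTORY, 1))        # ['M a.txt', 'A b.txt']
print(status_rev(HISTORY, 1, WORKING))  # ['R a.txt']
```

So --change answers "what did this one commit do?", while --rev answers "what has changed since that commit?".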
hg log -v -r <changeset>
changeset: 563:af4d66e2bc6e
tag: tip
user: David M. Carr <****>
date: Fri Oct 26 22:46:02 2012 -0400
files: hggit/gitrepo.py tests/test-pull.t
description:
pull: don't pull tags as bookmarks
or, using templates, something like
hg log -r tip --template "{node|short} - files: {files}\n"
with output
af4d66e2bc6e - files: hggit/gitrepo.py tests/test-pull.t