Weird behavior of patch when filename is changed - diff

I am currently learning diff and patch. I created two files: file a with content abc and file b with content def. I then ran diff -u a b > p followed by patch < p, and it behaved correctly, as shown below:
[joe@joe-pc c]$ ls
a b
[joe@joe-pc c]$ more a
abc
[joe@joe-pc c]$ more b
def
[joe@joe-pc c]$ diff -u a b > p
[joe@joe-pc c]$ more p
--- a 2018-12-20 22:56:33.865661540 +0800
+++ b 2018-12-20 22:54:15.241516269 +0800
@@ -1 +1 @@
-abc
+def
[joe@joe-pc c]$ patch < p
patching file a
[joe@joe-pc c]$ more a
def
[joe@joe-pc c]$ more b
def
[joe@joe-pc c]$ ls
a b p
[joe@joe-pc c]$
But after I renamed file a to ab, something strange happened. The patch < p command told me:
patching file b
Reversed (or previously applied) patch detected! Assume -R? [n]
[joe@joe-pc c]$ ls
ab b
[joe@joe-pc c]$ more ab
abc
[joe@joe-pc c]$ more b
def
[joe@joe-pc c]$ diff -u ab b > p
[joe@joe-pc c]$ more p
--- ab 2018-12-20 22:57:29.767980973 +0800
+++ b 2018-12-20 22:54:15.241516269 +0800
@@ -1 +1 @@
-abc
+def
[joe@joe-pc c]$ patch < p
patching file b
Reversed (or previously applied) patch detected! Assume -R? [n] ^C
[joe@joe-pc c]$
The file contents are the same, so why can't patch find the right file, ab, to patch in the second case?
The above operations were performed on a Linux machine with the bash shell.
Thanks in advance.

It's a feature of GNU patch. As you haven't specified the file to patch, it has to be deduced from the input somehow. Basically, if the patch contains no paths (only base names), it assumes that it has to patch the file with the shorter name, unless the --posix command line argument is passed or the POSIXLY_CORRECT environment variable is set:
patch --posix <p
# or
POSIXLY_CORRECT=1 patch <p
In your case, between a and b the first one is chosen correctly, but between ab and b the second one is chosen as the patch target (as the "patching file b" line suggests). Patching then fails, hence the error.
You can also fix this behavior by specifying the patch target explicitly:
patch ab <p
Digging into docs
GNU patch uses the following logic (see patch's manual, "10.6 Multiple Patches in a File" section):
First, patch takes an ordered list of candidate file names as follows:
If the header is that of a context diff, patch takes the old and new file names in the header. ...
...
Then patch selects a file name from the candidate list as follows:
If some of the named files exist, patch selects the first name if conforming to POSIX, and the best name otherwise.
For the unified format, the "old" file is mentioned after --- (a or ab in your case), and the "new" file is mentioned after +++ (b in your case).
If both files exist and patch is not configured to be "conforming to POSIX" (e.g. by setting the POSIXLY_CORRECT environment variable or passing the --posix command line argument, see the "10.12 patch and the POSIX Standard" section of the manual), then patch will choose "the best" name out of the two. "Name" here includes the full path obtained from the patch file (which doesn't matter in your case). The details are specified later:
To determine the best of a nonempty list of file names, patch first takes all the names with the fewest path name components; of those, it then takes all the names with the shortest basename; of those, it then takes all the shortest names; finally, it takes the first remaining name.
"Name component" here is basically a folder/file name (e.g. /foo/bar/baz has three of them), "basename" is just the name of the file (baz).
So, if names are a (old) and b (new) and both files exist, there is no "best name", so the first one is patched.
But if names are ab (old) and b (new) and both files exist, then b is "better", so the tool tries to patch it and fails.
I have no idea why this behavior was made the default.
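The "best name" rule quoted from the manual can be sketched in a few lines of Python. This is only an illustration of the manual's wording, not GNU patch's actual implementation: the function name best_name is hypothetical, and the sketch assumes the candidate list has already been reduced to files that exist.

```python
import posixpath


def best_name(candidates):
    """Pick the "best" name from a non-empty list, following the manual's wording."""
    names = list(candidates)
    if not names:
        raise ValueError("candidate list must not be empty")

    def component_count(name):
        # "/foo/bar/baz" has three path name components.
        return len([part for part in name.split("/") if part])

    def basename_length(name):
        return len(posixpath.basename(name))

    # Apply the three filters in order, keeping only the minimal names each time.
    for key in (component_count, basename_length, len):
        smallest = min(key(name) for name in names)
        names = [name for name in names if key(name) == smallest]

    # Finally, take the first remaining name.
    return names[0]


print(best_name(["a", "b"]))   # a
print(best_name(["ab", "b"]))  # b
```

With a and b the three filters cannot separate the names, so the first (old) name wins. With ab and b the basename-length filter already discards ab.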

Related

How to assign patch to specified git source?

In my foo_git.bb:
SRC_URI = "git://github.com/foo/foo.git;branch=main;protocol=https;name=${BPN};destsuffix=git \
git://github.com/foo2/foo2;branch=main;protocol=https;name=${FOO2};destsuffix=${FOO2} \
file://0001-Modify-A_value.patch\
"
I want my patch to apply to foo2, but it is always applied to foo (and the patch fails).
I found that appending patchdir to the patch entry works.
For example:
file://0001-Modify-A_value.patch;patchdir=${WORKDIR}/${FOO2_path}
From the OE manual:
Patch files will be copied to ${S}/patches and then applied to source from within the source directory, ${S}.
So for your use case to work, the file paths in your patch should include the base repo name.
For example let’s say 0001-Modify-A_value.patch is as follows:
diff --git a/my.txt b/my.txt
index fa5cb9a..59369cc 100644
--- a/my.txt
+++ b/my.txt
@@ -1 +1 @@
-I am foo who lives in bar
+I am bar who lives in foo
To make it apply to foo2 you must modify it as follows:
--- foo2/my.txt
+++ foo2/my.txt
@@ -1 +1 @@
-I am foo who lives in bar
+I am bar who lives in foo
Bitbake uses Quilt for patching, so for errors and the like, consult its manual.
Another handy tool is devtool, which is designed to handle tasks like updating a recipe or patching it.

Workflow to synchronise Mercurial repositories via email with bundles

I have two directories on two different computers - machine A (Windows) and machine B (OSX) - and I want to keep the two directories via Mercurial in sync. [*]
The restriction is that the two machines are not connected via LAN/WAN; the only way to move data between them is via email. So I thought emailing Mercurial bundles as deltas could do the trick.
My current workflow is roughly this (using a local tag lcb for the latest change bundle):
1. Say I work on machine A. At the end of the day I do:
hg commit -A -m "changes 123"
hg bundle --base lcb bundle_123.hg
hg tag --local -f lcb --rev tip
Finally, I email that bundle to machine B.
2. Then, sitting at machine B, I do:
hg unbundle bundle_123.hg
hg merge
hg commit -A -m "remote changes 123"
hg tag --local -f lcb --rev tip
Now I'm working on machine B, and at the end of the day I do what's listed under 1., but on machine B. And the cycle continues...
However, I'm worried this system is not robust enough:
In-between changes: What happens when, after creating a bundle (Step 1) and before applying it remotely (Step 2), a change occurs on the remote machine B? I had a case where it just overwrote the changes with the new bundle without a conflict warning or merge suggestion.
Double-applying of bundle: What happens when a bundle is accidentally applied twice? Would I need to record the applied bundles somehow with local tags?
Or is there a better workflow to transfer Mercurial deltas via email?
[*] From the answer to a superuser question I figured that Mercurial might be the most feasible way to do this.
In-between changes: What happens when, after creating a bundle (Step 1) and before applying it remotely (Step 2), a change occurs on the remote machine B? I had a case where it just overwrote the changes with the new bundle without a conflict warning or merge suggestion.
If a change is made on machine B, then this change will have been made in parallel with the changes you bundled from machine A. It doesn't really matter if the changes are made before or after you create the bundle (time-wise), it only matters that the changes on machine B don't have the head from machine A as their ancestor.
In other words, the world looks like this when the two machines are in sync:
A: ... [a]
B: ... [a]
You then create some new commits on machine A:
A: ... [a] --- [b] --- [c]
B: ... [a]
You bundle using [a] as base, so you get a bundle with [b] and [c]. Let us now say that someone (perhaps yourself) makes a commit on machine B:
A: ... [a] --- [b] --- [c]
               ( bundled )
B: ... [a] --- [x]
So far nothing has been exchanged between the two repositories, so this is just a normal case of people working in parallel. This is the norm in a distributed version control system: people working in parallel is what creates the need for merge commits.
The need for a merge is not evident in either repository at this point, they both have linear histories. However, when you unbundle on machine B, you see the divergence:
A: ... [a] --- [b] --- [c]
               ( bundled )
B: ... [a] --- [x]
          \
           [b] --- [c]
          ( unbundled )
It is helpful to realize that hg unbundle is exactly like hg pull, except that it can be done offline. That is, the data stored in a bundle is really just the data that hg pull would have transferred if you had had an online connection between the two repositories.
You would now proceed by merging the two heads [x] and [c] to create [y] on machine B:
A: ... [a] --- [b] --- [c]
B: ... [a] --- [x] --- [y]
          \           /
           [b] --- [c]
On machine B, your last bundle was created with [a] as a base. However, you also know that machine A has commit [c], so you can specify that as an additional base if you like:
$ hg bundle --base a --base c stuff-from-machine-b.hg
That will put [x] and [y] into the bundle:
bundle: (a) --- [x] --- [y]
                       /
                    (c)
Here I use (a) and (c) to denote the required bases of the bundle. You can only unbundle this bundle if you have both [a] and [c] in your repository. If you leave out the second base (only use [a]), you will also bundle [b] and [c]:
bundle: (a) --- [x] --- [y]
           \           /
            [b] --- [c]
Here you included everything except [a] in the bundle. Bundling too much is okay, as we will see next.
Double-applying of bundle: What happens when by accident a bundle is applied twice? Would be needed to record the applied bundles somehow with local tags?
Applying a bundle twice is exactly like running hg pull twice: nothing happens the second time. When unbundling, Mercurial looks in the bundle and imports the missing changesets. So if you unbundle twice, there is nothing to do the second time.
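The bundle and unbundle behavior described above can be modelled with a small Python sketch. This is a toy model, not Mercurial's API: a repository is just a dict mapping a changeset name to its parents, and the helper names ancestors, bundle and unbundle are hypothetical. It assumes that a bundle contains every changeset that is neither a base nor an ancestor of a base, and that unbundling imports only the changesets that are missing.

```python
def ancestors(repo, nodes):
    """Return the given nodes plus all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(repo[node])
    return seen


def bundle(repo, bases):
    """Collect every changeset that is not a base or an ancestor of one."""
    known = ancestors(repo, bases)
    return {node: parents for node, parents in repo.items() if node not in known}


def unbundle(repo, incoming):
    """Import only the missing changesets and return their sorted names."""
    for parents in incoming.values():
        for parent in parents:
            if parent not in repo and parent not in incoming:
                raise ValueError(f"missing base changeset: {parent}")
    added = sorted(node for node in incoming if node not in repo)
    repo.update({node: incoming[node] for node in added})
    return added


# Each repository maps a changeset name to a tuple of its parents.
machine_a = {"a": (), "b": ("a",), "c": ("b",)}
machine_b = {"a": (), "x": ("a",), "b": ("a",), "c": ("b",), "y": ("x", "c")}

small = bundle(machine_b, ["a", "c"])  # contains x and y
large = bundle(machine_b, ["a"])       # contains b, c, x and y

print(unbundle(machine_a, large))  # ['x', 'y']
print(unbundle(machine_a, large))  # []
```

The second call adds nothing, which mirrors the point that bundling too much and applying a bundle twice are both harmless in this model.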
Initial state
A>hg log --template "{rev}:{node|short} \"{desc}\" - files: {files}\n"
2:415231dbafb8 "Added C" - files: C.txt
1:6d9709a42687 "Added B" - files: B.txt
0:e26d1e14507e "Initial data" - files: .hgignore A.txt
B>hg log --template "{rev}:{node|short} \"{desc}\" - files: {files}\n"
1:72ef13990d0d "Edited A" - files: A.txt
0:e26d1e14507e "Initial data" - files: .hgignore A.txt
That is, the initially identical repos diverged at revision 1: independent changes appeared on both sides.
Test for case 1 - parallel changes
72ef13990d0d in B doesn't interfere with 6d9709a42687:415231dbafb8 in A
A>hg bundle --base e26d1e14507e ..\bundle1-2.hg
2 changesets found
B>hg pull ..\bundle1-2.hg
pulling from ..\bundle1-2.hg
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 2 files (+1 heads)
(run 'hg heads' to see heads, 'hg merge' to merge)
Because B had its own child of e26d1e14507e, pulling from the bundle added an additional head (and an anonymous branch for the changesets from A):
B>hg glog --template "{rev}:{node|short} \"{desc}\" - files: {files}\n"
o 3:415231dbafb8 "Added C" - files: C.txt
|
o 2:6d9709a42687 "Added B" - files: B.txt
|
| @ 1:72ef13990d0d "Edited A" - files: A.txt
|/
o 0:e26d1e14507e "Initial data" - files: .hgignore A.txt
Test for case 2 - applying bundle twice
I know a priori that changesets already existing in the repo will not be pulled again (and I prefer the unified style of hg pull from a bundle over hg unbundle), but here is the demonstration:
B>hg pull ..\bundle1-2.hg
pulling from ..\bundle1-2.hg
searching for changes
no changes found
An additional benefit of pull's behavior: you don't have to worry about moving the base changeset for the bundle and can always use a single one, the oldest point of divergence. This will (slightly) increase the size of the bundle (slightly, because by default a bundle is a bzip2-compressed archive), but it also guarantees that all child changesets are included in the bundle and that all missing (and only missing) changesets are pulled into the destination repository.
And, in any case, even unbundling the same bundle twice will not backfire.
Same bundle in the same repo B, attempting to unbundle an already pulled bundle:
B>hg unbundle ..\bundle1-2.hg
adding changesets
adding manifests
adding file changes
added 0 changesets with 0 changes to 2 files
(run 'hg update' to get a working copy)

How to create patch for a new file, and patch it back to the original directory?

Suppose I have a directory dir1, and have files f1.c and f2.c in it.
I copy everything to directory dir2, modify both f1.c and f2.c, and add a new file f3.c.
Then I run diff to create a patch:
diff -ruN dir1/ dir2/ > diff.patch
Now I want to apply the patch back to dir1. The changes in f1.c and f2.c are patched successfully, but I don't get a new file f3.c in dir1:
[/local/home/tmp]$ patch -p0 < diff.patch
patching file dir1/f1.c
Hunk #1 succeeded at 1 with fuzz 2.
patching file dir1/f2.c
The next patch would create the file dir2/f3.c,
which already exists! Assume -R? [n]
Apply anyway? [n]
Skipping patch.
1 out of 1 hunk ignored
How do I apply the patch so that f3.c is added to dir1 too?
OK, I've figured out that you must cd into dir1 and use the -p1 parameter:
cd dir1
patch -p1 < ../diff.patch

Patch semantics

I've made a patch by using diff:
diff -u /home/user/onderzoeksstage/omf/Rakefile /home/user/onderzoeksstage/Rakefile > rakefile2.patch
I've placed this rakefile2.patch in another directory: /home/user/onderzoeksstage/omf/confine/patches.
Now, I was under the assumption that I could go to the directory where all my patches are collected, call patch < rakefile2.patch, and patch would know where to find the file to patch (the original file /home/user/onderzoeksstage/omf/Rakefile) by reading the rakefile2.patch header.
But when doing that, patch says that it does not find the file to patch:
[user@localhost patches]$ patch < rakefile2.patch
can't find file to patch at input line 3
Perhaps you should have used the -p or --strip option?
The text leading up to this was:
--------------------------
|--- /home/user/onderzoeksstage/omf/Rakefile 2013-02-12 14:11:49.809792527 +0100
|+++ /home/user/onderzoeksstage/Rakefile 2013-02-12 12:17:50.314831492 +0100
--------------------------
File to patch: ...
...
So my assumption was obviously wrong, but then how does patch work?
Going to /home/user/onderzoeksstage/omf/ and calling patch < rakefile2.patch there does work. Does patch only look at the filename at the end of the path in the header and not take the directory into account? And so what I am trying to accomplish will never work?
Why is this? Is it so that a patch can be applied to any file called Rakefile (in my case) and is thus more "generic"?
Thanks
Does patch only look at the header for the filename at the end of the
path and not take in account the directory?
That's what it does by default. See the description of -p option in man patch. Looks like -p0 is what you want here.
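How the -p option maps a header path to the name patch looks for can be sketched in Python. This is an illustration based on the description of -p in man patch (strip the smallest prefix containing the given number of leading slashes, with adjacent slashes counted as one), not patch's real code. The function name strip_components is hypothetical, and returning None when the path has too few slashes is an assumption of this sketch.

```python
import re


def strip_components(name, num=None):
    """Mimic how patch's -pNUM option shortens a file name from a header."""
    if num is None:
        # Without -p, only the base name is used.
        return name.rsplit("/", 1)[-1]
    rest = name
    for _ in range(num):
        match = re.search(r"/+", rest)  # adjacent slashes count as one
        if match is None:
            return None  # assumption: too few slashes, name is unusable
        rest = rest[match.end():]
    return rest


header = "/home/user/onderzoeksstage/omf/Rakefile"
print(strip_components(header))     # Rakefile
print(strip_components(header, 0))  # /home/user/onderzoeksstage/omf/Rakefile
print(strip_components(header, 1))  # home/user/onderzoeksstage/omf/Rakefile
```

Without -p the name collapses to Rakefile, which is looked up relative to the current directory. That is why the command only works from inside /home/user/onderzoeksstage/omf/.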

Conflict marking confusion when pulling a deleted file with darcs

My confusion arises from the following statement taken from here:
When pulling patches that conflict each other (e.g., change the same part of the file) Darcs detects the conflict and marks it in the repository content. It then lets the user resolve the problem.
This seemed inconsistent with what I was seeing, so I created the following work-flow using darcs 2.5.2:
Create repo foo;
Create a non-empty file in foo and record it;
Clone foo to bar;
Remove the file in foo and record it;
Add another line to the file in bar and record it;
Pull from foo into bar, obtain conflict notification;
After taking these steps I ran darcs whatsnew in bar, and was shown two 'patch primitives':
A hunk removing all of the "non-empty file in foo", but with no mention of the line added and recorded in bar;
A rmfile removing the file.
My question is: Why is there no mention of the line added and recorded in bar?
If I run darcs revert in bar, then everything makes sense: I see the "non-empty file" affected by neither conflicting patch, as per this statement taken from here:
The command darcs revert will remove the conflict marking and back up to state before conflicting patches.
But then if I run darcs mark-conflicts I am back to the same state as after the pull, with the two 'patch primitives' mentioned above, and no mention of the line added and recorded in bar.
For reference / reproduction here is my complete work-flow from the command line:
$ mkdir foo
$ cd foo/
foo$ darcs initialize
foo$ touch shopping
foo$ vi shopping <-- add a couple of lines
foo$ darcs add shopping
foo$ darcs record
addfile ./shopping
Shall I record this change? (1/2) [ynW...], or ? for more options: y
hunk ./shopping 1
+cake
+pie
Shall I record this change? (2/2) [ynW...], or ? for more options: y
What is the patch name? Added shopping
Do you want to add a long comment? [yn]n
Finished recording patch 'Added shopping'
foo$ cd ..
$ darcs get foo/ bar
$ cd bar/
bar$ vi shopping <-- add another line
bar$ darcs record
hunk ./shopping 2
+beer
Shall I record this change? (1/1) [ynW...], or ? for more options: y
What is the patch name? Added beer
Do you want to add a long comment? [yn]n
Finished recording patch 'Added beer'
bar$ cd ../foo
foo$ rm shopping
foo$ darcs record
hunk ./shopping 1
-cake
-pie
Shall I record this change? (1/2) [ynW...], or ? for more options: y
rmfile ./shopping
Shall I record this change? (2/2) [ynW...], or ? for more options: y
What is the patch name? Removed shopping
Do you want to add a long comment? [yn]n
Finished recording patch 'Removed shopping'
foo$ cd ../bar
bar$ darcs pull
Pulling from "../foo"...
Mon Nov 14 19:26:44 GMT 2011 dukedave@gmail.com
* Removed shopping
Shall I pull this patch? (1/1) [ynW...], or ? for more options: y
Backing up ./shopping(-darcs-backup0)
We have conflicts in the following files:
./shopping
Finished pulling and applying.
bar$ darcs whatsnew
hunk ./shopping 1
-cake
-pie
rmfile ./shopping
If you run darcs changes -v inside bar, you'll see the history of your
changes, including the conflictor introduced as a result of you pulling
conflicting patches.
I've summarised your example to something ever so slightly shorter:
DARCS=/usr/bin/darcs
$DARCS init --repo foo
cd foo
echo 'a' > myfile
$DARCS add myfile && $DARCS record -am 'Add myfile'
$DARCS get . ../bar
rm myfile
$DARCS record -am 'Remove myfile'
cd ../bar
echo 'b' >> myfile
$DARCS record -am 'Change myfile'
$DARCS pull -a ../foo
$DARCS changes -v
Now, after that, I see this output from darcs changes -v
Tue Nov 15 19:44:38 GMT 2011 Owen Stephens <darcs@owenstephens.co.uk>
* Remove myfile
conflictor [
hunk ./myfile 2
+b
]
|:
hunk ./myfile 1
-a
conflictor {{
|:
hunk ./myfile 2
+b
|:
hunk ./myfile 1
-a
}} []
|hunk ./myfile 1
|-a
|:
rmfile ./myfile
Tue Nov 15 19:44:38 GMT 2011 Owen Stephens <darcs@owenstephens.co.uk>
* Change myfile
hunk ./myfile 2
+b
Tue Nov 15 19:44:38 GMT 2011 Owen Stephens <darcs@owenstephens.co.uk>
* Add myfile
addfile ./myfile
hunk ./myfile 1
+a
So, let's explain the crazy output of "Remove myfile". "Remove myfile" exists
as the following in foo:
Tue Nov 15 19:44:38 GMT 2011 Owen Stephens <darcs@owenstephens.co.uk>
* Remove myfile
hunk ./myfile 1
-a
rmfile ./myfile
So, a hunk at line 1 and removal of the file.
Pulling "Remove myfile" into bar, we modify the patch contents by introducing special "conflictor" primitives that represent the primitives within "Remove myfile" that conflict with other primitives in bar. N.B. there is no information loss here: we can always get back to the original primitives by unpulling the conflicting changes, in this case by unpulling "Change myfile".
Conflictors are confusing, but AFAICT they essentially separate the changes
that conflict with the current patch, x, into 2 sets:
"ix" which is the set of patches that includes:
i) patches that conflict with x and some other patch in the repo
ii) patches that conflict with a patch that conflicts with x
"xx" which is the sequence of patches that only conflict with the patch x.
I think the reason this is done is that Conflictors have the effect of
"undoing" primitives that cause conflicts, but only those that haven't been
undone by another Conflictor.
The output we see is something like:
"conflictor" ix "[" xx "]" x
I'm abusing notation, but hopefully you can somewhat decipher that (see
src/Darcs/Patch/V2/(Real.hs|Non.hs) in the darcs.net repo for the full story)
In this case, "Remove myfile" has 2 primitive patches, and (in this case) 2
corresponding conflictors when pulled into bar.
The first primitive (remove line 1 from myfile) only conflicts with the
primitive within "Change myfile" (add 'b' to line 2 of myfile) and so that's
the first conflictor:
conflictor [ <--- The start of xx (no ix here)
hunk ./myfile 2
+b
] <--- The end of xx
|:
hunk ./myfile 1 <--- x
-a
N.B. ("|:" is a marker that delimits a "Non" primitive's context from the
primitive itself. I won't try to explain it further; just read below |: to
see the primitive in question.)
The second primitive (remove myfile) is only slightly more complicated: (rmfile
myfile) conflicts with (add 'b' to line 2 of myfile) which as we know conflicts
with (remove line 1 from myfile), so they both go into "ix", with no patches in
"xx". I'll remove the unnecessary "|:" delimiters and space things out:
conflictor {{
hunk ./myfile 2
+b
hunk ./myfile 1
-a
}}
[] <--- no xx
|hunk ./myfile 1 <--- start of x
|-a
|:
rmfile ./myfile <--- end of x
The final (rmfile myfile) has some context to identify the exact primitive
that we're referring to (I'm not really sure why/how this is required, but
there we are), which is marked by leading '|'s and delimited by "|:".
Finally, to attempt to explain the output of darcs whatsnew in bar: when
multiple patches conflict, I think the actual effect of the conflictor is to
"undo" any conflicting patches, giving the effect of neither.
The following page gives the start of some explanations: http://en.wikibooks.org/wiki/Understanding_Darcs/Patch_theory_and_conflicts.
I think what you're seeing is the result of the forced commutation of "Change myfile" and "Remove myfile"; call them A and B respectively. To merge the two, Darcs creates A^-1 and commutes A^-1 and B to give B' and (A^-1)', where B' has the effect of A^-1 (since we're forcing the commutation to work). This means that the effect of B' (i.e. the merged "Remove myfile") is actually just to undo the adding of the line made by "Change myfile".
I haven't had time to look at how darcs mark-conflicts works, so I can't yet explain the working changes you're seeing with darcs changes in bar.