Why is Mercurial matching a nonexistent local revision number? - version-control

Quick intro: In Mercurial there are two different ways to numerically refer to a changeset.
First, there's the node ID hash. It is global and functions like a git commit hash. It consists of 40 hexadecimal digits.
Second, there's the local revision number. It is a decimal number that starts at 0 and counts up. Unlike the node hash, this is local, meaning the same changeset can have different local revision numbers in two different repos. This depends on what other changesets are present in each repo and depends even on the order each repo received their changesets.
A revision can be specified numerically to Mercurial as a local revision number, a full 40-digit hash, or "a short-form identifier". The latter gives a unique prefix of a hash; that is, if only one full hash starts with the given string then the string matches that changeset.
I found that in certain cases, Mercurial commands (such as hg log with an -r switch), given plain decimal numbers, will match some revision even though there aren't enough local revisions for the given number to match as a local revision number.
Here's an example I constructed after coming across such a case by chance:
test$ hg --version
Mercurial Distributed SCM (version 6.1)
(see https://mercurial-scm.org for more information)
Copyright (C) 2005-2022 Olivia Mackall and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
test$ hg init
test$ touch a
test$ hg add a
test$ hg ci -d "1970-01-01 00:00:00 +0000" -u testuser -m a
test$ touch b
test$ hg add b
test$ hg ci -d "1970-01-01 00:00:00 +0000" -u testuser -m b
test$ hg log
changeset: 1:952880b76ae5
tag: tip
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: b
changeset: 0:d61f66df66f9
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: a
test$ hg log -r 2
abort: unknown revision '2'
test$ hg log -r 9
changeset: 1:952880b76ae5
tag: tip
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: b
test$
As is evident, hg log -r 9 matches a changeset even though there aren't that many changesets to match the 9 as a local revision number.
The question: Why is this? Additionally, how can we avoid matching a nonexistent local revision number?

This is due to how Mercurial parses revision specifiers. Here's how Olivia Mackall explains it in a mail from 2014:
Here is a hexadecimal identifier:
60912eb2667de45415eff601bfc045ae0fe8db42
See how it starts with 6? If you ask for revision 6, Mercurial will:
a) look for revision 6
b) if that fails, look for a hex identifier starting with "6"
c) if we find more than one match, complain
d) if we find no matches, complain
e) we found one match: success!
That is, if hg log -r 9 doesn't match any local revision number (because there are fewer than ten changesets in the repo), Mercurial then tries the 9 as a node hash prefix, and it matches if exactly one node hash happens to start with a 9.
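The lookup order described in the mail can be modelled in a few lines of Python. This is only a sketch of the quoted steps, not Mercurial's actual code: the resolve function and the nodes list are made up for illustration, the hashes are the short forms from my example repo, and names such as tags, bookmarks and branches are ignored.

```python
def resolve(spec, nodes):
    """Model of the quoted lookup order; nodes[i] is the hex node ID of local revision i.

    Hypothetical helper for illustration only, not part of Mercurial.
    """
    # a) look for a local revision number
    if spec.isdigit() and int(spec) < len(nodes):
        return int(spec)
    # b) if that fails, look for hex identifiers starting with the given string
    matches = [rev for rev, node in enumerate(nodes) if node.startswith(spec)]
    # c) more than one match: complain
    if len(matches) > 1:
        raise LookupError(f"ambiguous identifier '{spec}'")
    # d) no matches: complain
    if not matches:
        raise LookupError(f"unknown revision '{spec}'")
    # e) exactly one match: success
    return matches[0]


# Short hashes of the two changesets from the example repo above.
nodes = ["d61f66df66f9", "952880b76ae5"]

print(resolve("1", nodes))  # 1 (matched as a local revision number)
print(resolve("9", nodes))  # 1 (no revision 9, but one hash starts with "9")
try:
    resolve("2", nodes)
except LookupError as err:
    print("abort:", err)  # abort: unknown revision '2'
```

With only two changesets, "9" cannot be a revision number, so it falls through to the prefix match and lands on revision 1, just like in the transcript.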
To avoid this ambiguity, she responded that one should use hg log -r 'rev(9)' to match only local revision numbers, and hg log -r 'id(9)' to match only prefixes or full hashes.
In the documentation on revsets, these predicates are listed as:
"id(string)"
Revision non-ambiguously specified by the given hex string prefix.
And:
"rev(number)"
Revision with the given numeric identifier.
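To illustrate how the two predicates keep the namespaces apart, here is a rough Python model. The function names rev and node_id and the nodes list are my own for this sketch, and returning None for "no match" is an assumption; how real Mercurial reports a miss or an ambiguous prefix may differ between versions.

```python
def rev(number, nodes):
    """Model of rev(number): match local revision numbers only, never hashes."""
    return number if 0 <= number < len(nodes) else None


def node_id(prefix, nodes):
    """Model of id(string): match hex prefixes only, never revision numbers."""
    matches = [r for r, node in enumerate(nodes) if node.startswith(prefix)]
    # Only a unique prefix counts as a match in this sketch.
    return matches[0] if len(matches) == 1 else None


# Short hashes of the two changesets from the example repo (nodes[i] = revision i).
nodes = ["d61f66df66f9", "952880b76ae5"]

print(rev(9, nodes))        # None: there is no local revision 9
print(node_id("9", nodes))  # 1: exactly one hash starts with "9"
print(node_id("1", nodes))  # None: revision 1 exists, but no hash starts with "1"
```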
Unfortunately, neither this page nor the help page on revisions explicitly points out (as of version 6.1) that a number can match either as a local revision number or as a node hash prefix. The 2014 mailing list thread I quoted does contain suggestions to clarify this, but it appears nothing came of it.
Additionally, here is a changeset message in which I explained the entire affair and how it came to affect the operation of a script of mine:
fix to use 'rev(x)' instead of just x to refer to local rev number
The revsets syntax to unambiguously refer to a local revision
number is to wrap the number in rev(). Without this, a number
that doesn't exist (eg -r 2) may be misinterpreted to refer to
a changeset that has a node hash starting with the requested
number.
In our case this bug happened to act up after the revision on at
2022-04-02 16:10:42 2022 Z "changed encoding from cp850 to utf8"
which the day after it was added was converted as the second
(local revision number 1) changeset from the svn repo. The
particular hg convert command was:
hg convert svn-mirror DEST \
--config hooks.pretxncommit.checkcommitmessage=true \
--config convert.svn.startrev=1152
This created a changeset known as 1:2ec9f101bc31 from that svn
revision. Lacking a local revision number 2, the -r 2 picked up
this changeset because its hash started with the digit "2".
Thus the NEWNODE variable received the changeset hash for this
changeset. Because our hg rebase command is configured to keep
empty changesets, the changeset got added atop its already
existing copy in the destination repo.
Ever since the akt.sh script would pick up the wrong revision
number from the destination repo and abort its run with the
message indicating "Revisions differ!".

Related

How do you find the changesets between two tags in mercurial?

If I have two tags named 2.0 and 2.1, how do I find the changeset messages between the two? I'm trying to find a way to use hg to make release notes and list the messages associated with the commits.
Example Changeset:
changeset: 263:5a4b3c2d1e
user: User Name <user.name@gmail.com>
date: Tue Nov 27 14:22:54 2018 -0500
summary: Added tag 2.0.1 for changeset 9876fghij
Desired Output:
Added tag 2.1 for changeset 67890pqrst
Change Info...
Added tag 2.0.1 for changeset 9876fghij
Change Info...
Added tag 2.0 for changeset klmno12345
Preface
"Any challenge has a simple, easy-to-understand wrong solution". Boris's answer is a nice illustration of this rule: the "::" topological range produces good results only for pure single-branch development (which is, in general, The Bad Idea (tm) anyway).
Face
A good solution must handle complex DAGs correctly and answer the question "Which changesets are included in NEW but missing from OLD (regardless of how they got there)?"
For me, that is the "only()" function in revsets, used with both parameters:
"only(set, [set])"
Changesets that are ancestors of the first set that are not ancestors
of any other head in the repo. If a second set is specified, the
result is ancestors of the first set that are not ancestors of the
second set (i.e. ::set1 - ::set2).
hg log -r "only(2.1,2.0)"
For nicer presentation, maybe combine it with the predefined "changelog" style
hg log -r "only(2.1,2.0)" --style changelog
or with a custom style or template.
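To see why only() and the "::" range can differ, here is a small Python model of both on a toy DAG. The graph, the revision numbers and the helper names (ancestors, only, dag_range) are invented for this sketch; revision 1 plays the role of tag 2.0 and revision 3, a merge, plays the role of tag 2.1.

```python
def ancestors(rev, parents):
    """Return rev and all of its ancestors (like the revset ::rev)."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def only(new, old, parents):
    """Model of only(new, old), i.e. ::new - ::old."""
    return ancestors(new, parents) - ancestors(old, parents)


def dag_range(old, new, parents):
    """Model of old::new: ancestors of new that are also descendants of old."""
    return {r for r in ancestors(new, parents) if old in ancestors(r, parents)}


# Hypothetical history: 1 and 2 are both children of 0, and 3 merges them.
parents = {0: [], 1: [0], 2: [0], 3: [1, 2]}

print(sorted(only(3, 1, parents)))       # [2, 3]
print(sorted(dag_range(1, 3, parents)))  # [1, 3]
```

In this toy history the range 1::3 repeats the old release itself (1) and misses the side-branch changeset (2) that only arrived through the merge, while only(3, 1) lists exactly the changesets that are new in 3.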
You'll want to use a revset to select all changesets between two tags; for example, 2.0::2.1 will likely do the trick. You can check which changesets are selected by running hg log -G -r '2.0::2.1'. (See hg help revset for more information about revsets.)
Once you have the right changesets selected, you can apply a template to retrieve only the information you need. For example, hg log -r '2.0::2.1' -T '{desc}\n' prints the whole description of each changeset, and hg log -r '2.0::2.1' -T '{desc|firstline}\n' prints only the first line of each description.
If you want to add even more information, hg help template is your friend.

How to view a single Mercurial commit from the command line?

I would like to view from the command line what was changed in a given Mercurial commit, similar to what one would see from hg status or from the TortoiseHg tool. The closest I can seem to get is hg log --stat, but that prints extra symbols (i.e. pluses and minuses) and I cannot specify which revision I want to look at.
I need this because I have developers who have check-in comments like "." or ",". >:-(
It turns out that hg status has a --change argument to which you can pass the revision number (e.g. 109), a relative revision (i.e. -1 is the last commit, -2 is the second-last, etc.), or the hash of the revision, and it will print the changes (i.e. additions, removals, and modifications) made in that revision.
--change isolates that revision and shows only its own changes, whereas replacing --change with --rev shows the cumulative changes from that revision to the current state.
hg log -v -r <changeset>
changeset: 563:af4d66e2bc6e
tag: tip
user: David M. Carr <****>
date: Fri Oct 26 22:46:02 2012 -0400
files: hggit/gitrepo.py tests/test-pull.t
description:
pull: don't pull tags as bookmarks
or, using templates, something like
hg log -r tip --template "{node|short} - files: {files}\n"
with output
af4d66e2bc6e - files: hggit/gitrepo.py tests/test-pull.t

How to get the closest revision to the given one that contains other changes than .hgtags only?

I have a revision hash. I would like to get the closest revision that changes something other than just .hgtags.
For instance, consider the following fragment of a Mercurial history:
D:\CI\NC\8.0>hg log -l3 -b 8.0 -v
changeset: 1768:633cf1f61665
branch: 8.0
tag: tip
user: ci
date: Wed Nov 16 21:06:20 2011 +0200
files: .hgtags
description:
Replaced tag 'good.NC.16' with 'rejected.NC.16' for changeset 9451e8f187b1
changeset: 1767:6cad328c622c
branch: 8.0
parent: 1765:9451e8f187b1
user: ci
date: Wed Nov 16 21:04:26 2011 +0200
files: .hgtags
description:
Added tag 'good.NC.16' for changeset 9451e8f187b1
changeset: 1765:9451e8f187b1
branch: 8.0
tag: rejected.NC.16
user: gilad
date: Tue Nov 15 18:26:09 2011 +0200
files: .hgignore
description:
update
In this case, if the given revision is 633cf1f61665, then I am looking for the revision 9451e8f187b1, because it is the closest one, which contains not just .hgtags, but something else.
How, given 633cf1f61665, can I locate 9451e8f187b1 using as few hg.exe invocations as possible?
EDIT
I have fixed the output, it should have displayed revisions from the same branch.
EDIT2
I will try to explain myself. Let us define two notions:
A dull changeset - the one created by the hg tag action.
An interesting changeset - any non dull changeset.
So, my question can be rephrased like so:
Given an arbitrary revision (dull or interesting) I need to find the closest
interesting revision belonging to the same named branch using as few hg invocations
as possible.
For instance, given 633cf1f61665 or 6cad328c622c or 9451e8f187b1 the required revision is 9451e8f187b1.
Try with
$ hg log -r "max(::REV and not file(.hgtags))"
and see if that does what you want. See hg help revsets for more information about the query language.
You can make a revset alias for this if you use it often:
[revsetalias]
interesting($1) = max(::$1 and not file(.hgtags))
and then use hg log -r "interesting(123)" in the future.
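For what it's worth, here is roughly what that revset computes, written out in Python. It is a sketch on hand-built data, not something that talks to hg: the parents and files dictionaries copy the three changesets from the question (treating 1765 as a root, which is an assumption made only to keep the example small), and the helper names are made up.

```python
def ancestors(rev, parents):
    """Return rev and all of its ancestors (like the revset ::rev)."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def interesting(rev, parents, files):
    """Model of max(::rev and not file(.hgtags))."""
    candidates = [r for r in ancestors(rev, parents) if ".hgtags" not in files[r]]
    return max(candidates) if candidates else None


# The three changesets from the question; 1765 is treated as a root here (assumption).
parents = {1765: [], 1767: [1765], 1768: [1767]}
files = {1765: [".hgignore"], 1767: [".hgtags"], 1768: [".hgtags"]}

for start in (1768, 1767, 1765):
    print(start, "->", interesting(start, parents, files))
# 1768 -> 1765
# 1767 -> 1765
# 1765 -> 1765
```

Note that, like the revset, this skips every changeset that touches .hgtags at all, even one that also changes other files, and it does not restrict the search to one named branch.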

Unknown revision numbers in bzr tags command

When I run the bzr tags command on a branch, I often get some tags that are displayed with no revision number. It appears as a question mark. For example, when I run this command:
bzr tags -d lp:~zaber/openobject-client/main
tag 5.0.7 doesn't have a revision number:
5.0.0 930
5.0.0-2 933
5.0.0-3 938
5.0.0-alpha 719
5.0.0-rc1 771
5.0.0-rc1.1 776
5.0.0-rc2 830
5.0.0-rc3 858
5.0.1 946.1.19
5.0.2 976
5.0.3 983
5.0.4 986
5.0.5 993
5.0.6 1000
5.0.7 ?
5.0.7rc1 1022
5.0.7rc2 1042
This may happen more often when I've got shared repositories for several local branches, but I'm not sure.
Those tags are known to bzr (fetched or merged from another branch in some pull or merge operation), but the corresponding revision is not present in your history (not merged into your branch).
Strictly speaking, that's a bug; you can find it in the bzr bug tracker on Launchpad.net.
What you can do about such tags:
remove them from your branch only with bzr tag --delete XXX
use them to merge those revisions later with bzr merge -r tag:YYY lp:XXX
look at the corresponding revision ids with bzr tags --show-ids
As bialix suggested, deleting the tags using bzr tag --delete XXX works. Also, deleting a tag on a checkout also deletes the tag on the master branch. (I guess that's parallel to the way commits work, but it still surprised me.) Sometimes a merge will bring a bunch of broken tags across, so here's a gawk command to remove all unknown tags from the local branch:
bzr tags | gawk '/\?/ { system("bzr tag --delete " $1) }'
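If you would rather review the list before deleting anything, the same filtering can be sketched in Python. This only parses text in the format shown in the question and does not call bzr; the unknown_tags name and the sample string are made up for the example.

```python
def unknown_tags(output):
    """Return the tag names whose revision column is '?' in `bzr tags` output."""
    tags = []
    for line in output.splitlines():
        # Split off the last whitespace-separated field (the revision number).
        parts = line.rsplit(None, 1)
        if len(parts) == 2 and parts[1] == "?":
            tags.append(parts[0].strip())
    return tags


# A few lines copied from the output in the question.
sample = """\
5.0.6 1000
5.0.7 ?
5.0.7rc1 1022
"""

print(unknown_tags(sample))  # ['5.0.7']
```

Unlike the gawk pattern, which fires on any line containing a question mark, this only matches when the revision column itself is "?".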

Why does hg clone of hg.netbeans.org/main report "9 integrity errors"?

I just finished cloning the (huge) netbeans repository for
the second time. I found that I couldn't
successfully pull changes, after my first attempt to clone, earlier this week.
I guessed that some intermittent error had
corrupted the repository the first time around... that appears not to
be the case.
I'm using hg 1.3.1 on Ubuntu 9.4 (32-bit).
I cloned with hg clone http://hg.netbeans.org/main main
hg verify (below) ends with:
9 warnings encountered!
9 integrity errors encountered!
Incidentally, the size of 00manifest.d is 1.1GiB. Is that normal?
What could be causing this? Where does one even report this kind of error?
(assuming for the moment that it's not a PEBKAC.)
This should give you an idea of what I'm seeing (repetitive bits removed to save space):
[smithma@oberon:~/w/netbeans/main]
$ { hg --version ; echo ; echo ; hg --debug verify ; } | tee ../netbeans-main-hg-verify.txt
Mercurial Distributed SCM (version 1.3.1)
Copyright (C) 2005-2009 Matt Mackall and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
manifest@?: rev 149491 points to unexpected changeset 149752
(expected 149754)
[...SNIP...]
repository uses revlog format 1
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
applemenu/src/org/netbeans/modules/applemenu/layer.xml@?: rev 12
points to unexpected changeset 149753
(expected 41473 46378 56815 59563 66079 70568 71017 83303 103972 105432 135060 137239 147766 149755)
warning: cnd.repository/src/org/netbeans/modules/cnd/repository/disk/UnitImpl.java@74688:
copy source revision is nullid cnd.repository/src/org/netbeans/modules/cnd/repository/disk/UnitDiskRepository.java:000000000000
[...SNIP...]
defaults/src/org/netbeans/modules/defaults/mf-layer.xml@?: rev 74
points to unexpected changeset 149753
(expected 25730 25732 25733 25741 25746 25747 25752 25768 26270 26561
27350 27495 27539 27566 27776 28203 28741 29191 29244 29364 29582
32476 33848 34406 35712 35713 36197 38355 40775 40854 42144 43593 44912
46378 46644 46697 46757 48145 48325 49166 50888 54548 54616 54618
55792 56816 56868 56895 56915 57513 58323 59288 59456 59563 59709 60225
66549 67160 67595 76198 77297 85585 86938 87361 93609 93755 113163
113177 117980 117992 124182 124475 135060 147766 149755)
[...SNIP...]
118132 files, 151874 changesets, 591274 total revisions
9 warnings encountered!
9 integrity errors encountered!
First, no, it's not a PEBKAC. The errors from verify are fixable; the best way is probably to contact a Mercurial dev to write a script that fixes the broken linkrevs.
The huge manifest could be dealt with using contrib/shrink-revlog.py; from some quick testing, I think it would shrink to approximately 50MB.