I have a Subversion repository, which is a mirror of a subtree accessable by our team.
The svnadmin dump was created with a account with limited access to the source repository.
The mirror was than initalised with svnadmin load.
svnadmin load repo < svndump
svn propset --revprop -r0 svn:sync-from-uuid af407c0b-a50a-0410-b304-fd27f8ee8df8 %dstrepo%
svn propset --revprop -r0 svn:sync-last-merged-rev 187698 %dstrepo%
svn propset --revprop -r0 svn:sync-from-url %SourceRepo% %dstrepo%
The history contains a great number of revisions without paths and dates.
svnsync sync %dstrepo% %SourceRepo%
Sync Is working
This results in great number of svn revisions looking like this
svn log -vl10 repo
....
------------------------------------------------------------------------
r204669 | ebenneheij | 2014-09-22 12:04:00 +0200 (ma, 22 sep 2014) | 4 lines
------------------------------------------------------------------------
....
r204570 | (no author) | (no date) | 1 line
------------------------------------------------------------------------
r204569 | (no author) | (no date) | 1 line
------------------------------------------------------------------------
r204568 | (no author) | (no date) | 1 line
Related: How to get the SVN revision number in certain Date or date range
Related: http://manned.org/rsvndump/29eed607
It looks like these empty revisions are there for padding, but they are breaking the option to use a date range for revisions.
I'm trying to validate the last 30 days of commits with
svnadmin verify -r {`date -d '30 days ago' +%F `}:HEAD repo
This fails on the revisions without a date in the revprops.
Failed to find time on revision 191375
Somehow the svnadmin dump and svnsync create empty revisions without date information.
This is causing the error I am seeing.
I tried to drop empty revisions with svndumpfilter --drop-all-empty-revs.
The problem with that solution is that svnsync won't connect anymore.
A solution I see now is to replace svnsync with svnrdump and svndumpfilter with the lastsynchronised revision stored somewhere.
Can I fill in the padded revisions with a date so the date range will work?
Update: added mirroring description
Update2: only revisions with paths visible to me contain dates
Update3: alternative suggestion
Update4: empty revisions for padding do not contain date, how to fill the date?
Related
Quick intro: In Mercurial there are two different ways to numerically refer to a changeset.
First, there's the node ID hash. It is global and functions like a git commit hash. It consists of 40 hexadecimal digits.
Second, there's the local revision number. It is a decimal number that starts at 0 and counts up. Unlike the node hash, this is local, meaning the same changeset can have different local revision numbers in two different repos. This depends on what other changesets are present in each repo and depends even on the order each repo received their changesets.
A revision can be specified numerically to Mercurial as a local revision number, a full 40-digit hash, or "a short-form identifier". The latter gives a unique prefix of a hash; that is, if only one full hash starts with the given string then the string matches that changeset.
I found that in certain cases, Mercurial commands (such as hg log with an -r switch), given plain decimal numbers, will match some revision even though there aren't enough local revisions for the given number to match as a local revision number.
Here's an example I constructed after coming across such a case by chance:
test$ hg --version
Mercurial Distributed SCM (version 6.1)
(see https://mercurial-scm.org for more information)
Copyright (C) 2005-2022 Olivia Mackall and others
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
test$ hg init
test$ touch a
test$ hg add a
test$ hg ci -d "1970-01-01 00:00:00 +0000" -u testuser -m a
test$ touch b
test$ hg add b
test$ hg ci -d "1970-01-01 00:00:00 +0000" -u testuser -m b
test$ hg log
changeset: 1:952880b76ae5
tag: tip
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: b
changeset: 0:d61f66df66f9
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: a
test$ hg log -r 2
abort: unknown revision '2'
test$ hg log -r 9
changeset: 1:952880b76ae5
tag: tip
user: testuser
date: Thu Jan 01 00:00:00 1970 +0000
summary: b
test$
As is evident, hg log -r 9 matches a changeset even though there aren't that many changesets to match the 9 as a local revision number.
The question: Why is this? Additionally, how can we avoid matching a nonexistent local revision number?
This is due to how Mercurial parses revision specifiers. Here's how Olivia Mackall explains it in a mail from 2014:
Here is a hexadecimal identifier:
60912eb2667de45415eff601bfc045ae0fe8db42
See how it starts with 6? If you ask for revision 6, Mercurial will:
a) look for revision 6
b) if that fails, look for a hex identifier starting with "6"
c) if we find more than one match, complain
d) if we find no matches, complain
e) we found one match: success!
That is, if hg log -r 9 doesn't match any local revision number (because there are less than ten changesets in the repo), Mercurial next will match a node hash that happens to start with a 9.
To avoid this ambiguity, she responded that one should use hg log -r 'rev(9)' to match only local revision numbers, and hg log -r 'id(9)' to match only prefixes or full hashes.
In the documentation on revsets, these predicates are listed as:
"id(string)"
Revision non-ambiguously specified by the given hex string prefix.
And:
"rev(number)"
Revision with the given numeric identifier.
Unfortunately, both this page and the help page on revisions do not (as of version 6.1) explicitly point out the ambiguity between numbers that can match either as local revision numbers or node hash prefixes. The 2014 mailing list thread I quoted does contain suggestions to clarify this but it appears nothing came off it.
Additionally, here is a changeset message in which I explained the entire affair and how it came to affect the operation of a script of mine:
fix to use 'rev(x)' instead of just x to refer to local rev number
The revsets syntax to unambiguously refer to a local revision
number is to wrap the number in rev(). Without this, a number
that doesn't exist (eg -r 2) may be misinterpreted to refer to
a changeset that has a node hash starting with the requested
number.
In our case this bug happened to act up after the revision on at
2022-04-02 16:10:42 2022 Z "changed encoding from cp850 to utf8"
which the day after it was added was converted as the second
(local revision number 1) changeset from the svn repo. The
particular hg convert command was:
hg convert svn-mirror DEST \
--config hooks.pretxncommit.checkcommitmessage=true \
--config convert.svn.startrev=1152
This created a changeset known as 1:2ec9f101bc31 from that svn
revision. Lacking a local revision number 2, the -r 2 picked up
this changeset because its hash started with the digit "2".
Tnus the NEWNODE variable received the changeset hash for this
changeset. Because our hg rebase command is configured to keep
empty changesets, the changeset got added atop its already
existing copy in the destination repo.
Ever since the akt.sh script would pick up the wrong revision
number from the destination repo and abort its run with the
message indicating "Revisions differ!".
I have an hg repository and I would like to know the last commit date of every file in sources/php/dracca/endpoint/wiki/**/Wiki*.php
So far, I have this one liner:
find sources/php/dracca/endpoint/wiki/ -name "Wiki*.php" -exec hg log --limit 1 --template "{date|shortdate}" {} \; -exec echo {} \;
But this seems utterly slow as (I suppose) find makes 1 hg call per file, leading to 15seconds of computation for the (say) ~40 files I have in there...
Is there a faster way?
The output of this command looks like:
2019-09-20 sources/php/dracca/endpoint/wiki/characters/colmarr/WikiCharactersColmarrEndpoint.php
2019-09-20 sources/php/dracca/endpoint/wiki/characters/dracquints/allgroup/WikiCharactersDracquintsAllgroupEndpoint.php
...
It might be changed a bit if needed (I won't mind having, say, 1 date and then the list of files changed for that date, or whatever like this)
Even with find+exec you can have shorter (by one last exec) chain with modified template {date|shortdate}\n
You can use (accepted) perl-ism from this question or ask anybody to update mentioned lof extension to current Mercurial (code from 2012 will not work now)
Alternatives (dirty ugly hacks)
In any case, you can|have to call hg only once and perform some post-processing of results.
Before these trick, read hg help filesets and get one common fileset for your files (I suppose, it can be just set:sources/php/dracca/endpoint/wiki/**/Wiki*.php but TBT!)
After it, you can:
Perform hg log like this
hg log setup.* --template "{files % '{file} {rev} {date|shortdate}\n'}"
(I used simple pattern for test, you have to have own fileset)
get output in such form
setup.py 1163 2018-11-07
README.md 1162 2018-11-07
setup.py 1162 2018-11-07
hggit/git_handler.py 1124 2018-05-01
setup.py 1124 2018-05-01
setup.cfg 1118 2017-11-27
setup.py 1117 2017-11-27
hggit/git2hg.py 1111 2017-11-27
hggit/overlay.py 1111 2017-11-27
setup.py 1111 2017-11-27
…
(there are some unwanted unexpected files, because I out all files in revision, which affect file in interest, without filter). You have to grep only needed files, sort by cols 1+2 and use date of latest revision of each file
Use hg grep. For the above test-pattern
hg grep "." -I setup.* --files-with-matches -d -q
(find any changes, output only filename+revision, short date)
you'll get something like
setup.py:1163:2018-11-07
setup.cfg:1118:2017-11-27
and 3-rd column will be your needed last modification date of file
I would like to view from the commandline what was changed in given Mercurial commit similar to what one would see from hg status or from the TortoiseHg tool. The closest I can seem to get is hg log --stat but that prints extra symbols (i.e. pluses and minuses) and I cannot specify at which specific revision I want to look.
I need this because I have developers who have check-in comments like "." or ",". >:-(
It turns out that hg status has a --change argument where you can pass the revision number (e.g. 109), relative revision (ie -1 is last commit, -2 is second-last, etc), or the hash of the revision to it and it will print out the changes (i.e. additions, removals, and modification) that revision had.
--change isolates that revision and shows just from that revision, but replacing --change with --rev shows the cumulative effect since that revision to the current state.
hg log -v -r <changeset>
changeset: 563:af4d66e2bc6e
tag: tip
user: David M. Carr <****>
date: Fri Oct 26 22:46:02 2012 -0400
files: hggit/gitrepo.py tests/test-pull.t
description:
pull: don't pull tags as bookmarks
or, using templates, something like
hg log -r tip --template "{node|short} - files: {files}\n"
with output
af4d66e2bc6e - files: hggit/gitrepo.py tests/test-pull.t
I have some old commit messages in a Mercurial repository that should be changed (to adjust for some new tools). I already understand that this hacking has to be done on the master repository and all local repositories would have to be re-cloned, because checksums of all subsequent changesets will also change.
I've tried following the recipes in "How to edit incorrect commit messages in Mercurial?", but with MQ extension I got stuck on error message
X:\project>hg qimport -r 2:tip
abort: revision 2 is the root of more than one branch
and with Histedit quite similarly
X:\project>hg histedit 2
abort: cannot edit history that would orphan nodes
The problem seems to be that there have been branches created after the changeset.
I can see how it would become messy if I'd want to change the contents of patch, but perhaps there's a workaround that I've missed for editing the commit message?
I would use a hacked version of the convert extension to do this. The extension can do hg → hg conversions which lets you alter author and branch names. There is not support for changing commit messages yet, but you can hack it.
Specifically, you should change the getcommit method from:
def getcommit(self, rev):
ctx = self.changectx(rev)
parents = [p.hex() for p in self.parents(ctx)]
if self.saverev:
crev = rev
else:
crev = None
return commit(author=ctx.user(), date=util.datestr(ctx.date()),
desc=ctx.description(), rev=crev, parents=parents,
branch=ctx.branch(), extra=ctx.extra(),
sortkey=ctx.rev())
which is responsible for reading the old commits. Change the
desc=ctx.description()
to
desc=adjust(ctx.description())
and then implement the adjust function at the top of the file:
def adjust(text):
return text.upper()
If these are accidental/duplicate branches due to using --amend and push --force then strip them first and try 'histedit' again then wipe the central repo on bitbucket; try the following which worked for me:
Inspect the repository log and look for branches, you can use the GraphlogExtension which you will have to enable first:
# hg log -G | more
...
o changeset: 43:c2fcca731aa5
| parent: 41:59669b9dfa4a
| user: Daniel Sokolowski (https://webdesign.danols.com)
| date: Tue Aug 27 20:14:38 2013 -0400
| summary: Progress snapshot: major content text and model instance ..
...
| o changeset: 42:c50724a6f1c6
|/ user: Daniel Sokolowski (https://webdesign.danols.com)
| date: Tue Aug 27 20:14:38 2013 -0400
| summary: Progress snapshot: major content text and model instance ...
|
o changeset: 41:59669b9dfa4a
| user: Daniel Sokolowski (https://webdesign.danols.com)
...
Enable the MqExtension and strip all branches.
# hg strip --no-backup 42:c50724a6f1c6
# hg strip --no-backup 45:3420dja12jsa
...
If needed change the commit to 'draft' (see Phases) and re-run 'histedit' and all should be good now.
# hg histedit 14:599dfa4a669b
abort: cannot edit immutable changeset: b7cfa2f28bde
# hg phase -f -d 14:599dfa4a669b
# hg hsitedit 14:599dfa4a669ba
I needed something similar so I filed a feature request.
You can use hg grep, but it searches the contents of all files.
What if I just want to search the file names of deleted files to recover one?
I tried hg grep -I <file-name-pattern> <pattern> but this seems to return no results.
using templates is simple:
$ hg log --template "{rev}: {file_dels}\n"
Update for Mercurial 1.6
You can use revsets for this too:
hg log -r "removes('**')"
(Edit: Note the double * - a single one detects removals from the root of the repository only.)
Edit: As Mathieu Longtin suggests, this can be combined with the template from dfa's answer to show you which files each listed revision removes:
hg log -r "removes('**')" --template "{rev}: {file_dels}\n"
That has the virtue (for machine-readability) of listing one revision per line, but you can make the output prettier for humans by using % to format each item in the list of deletions:
hg log -r "removes('**')" --template "{rev}:\n{file_dels % '{file}\n'}\n"
If you are using TortoiseHg workbench, a convenient way is to use the revision filter. Just hit ctrl+s, and then type
removes("**/FileYouWantToFind.txt")
**/ indicates that you want to search recursively in your repository.
You can use * wildcard in the filename too. You can combine this query with other revision sets using and, or operators.
There is also this Advanced Query Editor:
I have taken other answers and improved it.
Added "--no-merges". On large project with dev teams, there will lots of merges. --no-merger will filter out the log noise.
Change removes("**") to sort(removes("**"), -rev). For a large project with over 100K changesets, this will get to the latest files removed a lot faster. This reverses the order from starting at rev 0 to start at tip instead.
Added {author} and {desc} to ouput. This will give context as to why the files was removed by displaying the log comment and who did it.
So for my use case, it was hg log --template "File(s) deleted in rev {rev}: {author} \n {desc}\n {file_dels % '\n {file}'}\n\n" -r 'sort(removes("**"), -rev)' --no-merges
Sample output:
File(s) deleted in rev 52363: Ansariel
STORM-2141: Fix various inventory floater related issues:
* Opening new inventory via Control-Shift-I shortcut uses legacy and potentinally dangerous code path
* Closing new inventory windows don't release memory
* During shutdown legacy and inoperable code for inventory window cleanup is called
* Remove old and unused inventory legacy code
indra/newview/llfloaterinventory.cpp
indra/newview/llfloaterinventory.h
File(s) deleted in rev 51951: Ansariel
Remove readme.md file - again...
README.md
File(s) deleted in rev 51856: Brad Payne (Vir Linden) <vir#lindenlab.com>
SL-276 WIP - removed avatar_skeleton_spine_joints.xml
indra/newview/character/avatar_skeleton_spine_joints.xml
File(s) deleted in rev 51821: Brad Payne (Vir Linden) <vir#lindenlab.com>
SL-276 WIP - removed avatar_XXX_orig.xml files.
indra/newview/character/avatar_lad_orig.xml
indra/newview/character/avatar_skeleton_orig.xml
Search for a specific file you deleted efficiently, and format the result nicely:
hg log --template "File(s) deleted in rev {rev}: {file_dels % '\n {file}'}\n\n" -r 'removes("**/FileYouWantToFind.txt")'
Sample output:
File(s) deleted in rev 33336:
class/WebEngineX/Database/RawSql.php
File(s) deleted in rev 34468:
class/PdoPlus/AccessDeniedException.php
class/PdoPlus/BulkInsert.php
class/PdoPlus/BulkInsertInfo.php
class/PdoPlus/CannotAddForeignKeyException.php
class/PdoPlus/DuplicateEntryException.php
class/PdoPlus/Escaper.php
class/PdoPlus/MsPdo.php
class/PdoPlus/MyPdo.php
class/PdoPlus/MyPdoException.php
class/PdoPlus/NoSuchTableException.php
class/PdoPlus/PdoPlus.php
class/PdoPlus/PdoPlusException.php
class/PdoPlus/PdoPlusStatement.php
class/PdoPlus/RawSql.php
from project root
hg status . | grep "\!" >> /tmp/filesmissinginrepo.txt