How can I access the committed files from a Subversion pre-commit hook in Perl?

I need to do the following:
Write a pre-commit hook in Perl.
The hook should check all files being committed for the presence of some text, and fail if that text is not found.
Basically, I need an example of a Perl hook that reads the files being committed.
I am really looking for an elegant solution with the least amount of code.
Notes:
The hook should use svnlook or another suitable way to find the files.

pre-commit hook:
#!/usr/bin/perl
use strict;
use warnings;

# Path to svnlook is an assumption; adjust it for your installation.
my $svnlook = '/usr/bin/svnlook';

my $repos = shift;
my $txn   = shift;

foreach my $line (`$svnlook changed -t "$txn" "$repos"`)
{
    chomp($line);
    if ($line !~ /^([AUD_]).\s\s(.+)$/)
    {
        print STDERR "Can't parse [$line].\n";
        exit(1);
    }
    else
    {
        my $action = $1;
        my $file   = $2;
        chomp($file);
        # If the path has a trailing slash, it is a folder and we want to skip folders
        if ($file =~ /\/$/)
        {
            next;
        }
        my $fileContent = `$svnlook cat -t "$txn" "$repos" "$file"`;
        if ($action =~ /[AU]/)
        {
            my @lines = split(/\n/, $fileContent);
            # Check for whatever you need in this file's content
        }
    }
}

It sounds like you've got the foundation figured out already:
get the list of all files being committed
search each one of them in turn for the particular text
if the text is not found, reject the commit
You'll find some information on writing pre-commit hooks in the manual.
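Putting those three steps together, a minimal hedged sketch might look like the following; the svnlook path and the REQUIRED-TEXT marker are assumptions, so substitute your own values:

#!/usr/bin/perl
# Minimal pre-commit sketch: reject the commit if any added or updated file
# does not contain the required text. Paths and the marker are assumptions.
use strict;
use warnings;

my ($repos, $txn) = @ARGV;
my $svnlook = '/usr/bin/svnlook';          # adjust to your installation

for my $line (`$svnlook changed -t "$txn" "$repos"`) {
    chomp $line;
    next unless $line =~ /^[AU].\s\s(.+)$/;   # added or updated items only
    my $path = $1;
    next if $path =~ m{/$};                   # skip directories
    my $content = `$svnlook cat -t "$txn" "$repos" "$path"`;
    if ($content !~ /REQUIRED-TEXT/) {
        print STDERR "File $path is missing the required text.\n";
        exit 1;                               # non-zero exit rejects the commit
    }
}
exit 0;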

It should not be too difficult to modify this example in Python to do what you want. See also the hooks subdirectory of your repository for template hook scripts, as well as the contributed hook scripts.

Related

File::Find in Perl - Looking for files only

I have a script like this to list every FILE inside my root path:
use strict;
use File::Find qw(find);

my $path = "<my root path>";
find(\&Search, $path);

sub Search {
    my $filename = $File::Find::name;
    if (-f $filename) {
        print $filename."\n";
    }
}
My point is to list only the FILES. However, it also listed the symlinks inside my $path. I modified my Search function like this and it worked:
sub Search {
    my $filename = $File::Find::name;
    # Check if $filename is not a symlink first
    if (!-l $filename) {
        if (-f $filename) {
            print $filename."\n";
        }
    }
}
But it seems awkward, right? Why do we need two if conditions just to verify that $filename is a real file and not a symlink?
Can anyone suggest a better, cleaner solution for this?
Thank you and best regards.
Alex
-f tests for a plain file, and it follows symlinks, so a symlink that points to a plain file passes the test. So yes, you do have to test both.
One slightly useful thing is that you can probably just do:
if ( -f and not -l ) {
because File::Find sets $_ to the current file, and the file tests default to using that too. (won't work if you turn on no_chdir though).
You may also want to consider File::Find::Rule as an alternative to File::Find.
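For example, here is a hedged sketch using File::Find::Rule (assuming the module is installed) that keeps plain files and drops symlinks:

use strict;
use warnings;
use File::Find::Rule;

my $path = "<my root path>";
# ->file keeps plain files; ->not(...->symlink) drops symlinks that point to files
my @files = File::Find::Rule->file
                            ->not( File::Find::Rule->symlink )
                            ->in($path);
print "$_\n" for @files;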
stat and lstat are identical except when it comes to symlinks. The former collects information about the linked file, whereas the latter collects information about the link itself.
The -X EXPR file tests use stat, so lstat is what's needed here.
use feature 'say';    # for say()

sub Search {
    my $filename = $File::Find::name;
    if (!lstat($filename)) {
        warn("Can't stat $filename: $!\n");
        return;
    }
    say $filename if -f _;    # '_' reuses the results of the lstat above
}
Bonus: Error checking becomes much simpler when you pre-call stat or lstat.

How to read a file which is gzipped and tarred in Perl

I have placed the text file "FilenameKeyword.txt" in the E:/Test folder. In my Perl script I traverse the folder, look for a file whose name contains the string "Keyword", and then print the content of that file.
Now I wish to do the same thing for a file that is placed inside a compressed tar file.
Hypothetical file from which I am trying to extract the details:
E:\test.tar.gz
I wanted to know whether it is possible in Perl to search and read the file without decompressing/unzipping the hypothetical archive. If that is not possible, I could also allocate some temporary space to decompress the file, which should be deleted after extracting the content from the particular text file.
While searching the internet I saw that it is possible to extract and read gzip/tar files using Archive::Extract. Being new to Perl, I am really confused about how I should actually use it. Could you please help with this?
Input file: FilenameKeyword.txt
Script:
use warnings;
use strict;

my @dirs = ("E:\\Test\\");
my %seen;
while (my $pwd = shift @dirs) {
    opendir(DIR, "$pwd") or die "Cannot open $pwd\n";
    my @files = readdir(DIR);
    closedir(DIR);
    foreach my $file (@files)
    {
        if (-d $file and ($file !~ /^\.\.?$/) and !$seen{$file})
        {
            $seen{$file} = 1;
            push @dirs, "$pwd/$file";
        }
        next if ($file !~ /Keyword/i);
        my $mtime = (stat("$pwd/$file"))[9];
        print "$pwd$file";
        print "\n";
        open (MYFILE, "$pwd$file");
        while (my $line = <MYFILE>){
            #print $line;
            my ($date) = split(/,/, $line, 2);
            if ($line =~ s!<messageText>(.+?)</messageText>!!is){
                print "$1";
            }
        }
    }
}
Output (in this test the file is placed under E:\Test):
E:\Test\FilenameKeyword.txt
1311 messages Picked from the Queue.
Looking for help to retrieve the content of the file which is placed under
E:\test.tar.gz
Desired Output:
E:\test.tar.gz\FilenameKeyword.txt
1311 messages Picked from the Queue.
I was stuck using the CPAN module; it didn't work for me as I had Oracle 10g Enterprise Edition on the same machine. Due to some software conflict, ActiveState Perl was unable to compile and refer to the Perl lib for the CPAN module, so I uninstalled Oracle on my machine to make this work....
#!/usr/local/bin/perl
use Archive::Tar;
my $tar = Archive::Tar->new;
$tar->read("test.tar.gz");
$tar->extract();
If your file was gzipped only, you could read its contents in a "streamed" manner as outlined here (Piping to/from a child process without system or backtick - gzipped tar files). The article illustrates a technique that uses open and a fork to open and decompress the file and make it available to Perl's while(), allowing you to iterate over it.
As tar is basically concatenating things, it might be possible to adapt this to your scenario.
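Alternatively, if Archive::Tar is available (as in the snippet above), a rough sketch along these lines reads a member's content directly from the compressed archive, in memory, without extracting it to disk; the archive path and the Keyword filter are taken from the question:

use strict;
use warnings;
use Archive::Tar;

# The second argument (1) tells Archive::Tar the archive is compressed.
my $tar = Archive::Tar->new('E:/test.tar.gz', 1);
for my $member ($tar->get_files) {
    next unless $member->name =~ /Keyword/i;     # e.g. FilenameKeyword.txt
    my $content = $member->get_content;          # file content, held in memory
    print $member->name, "\n";
    # ...search $content for <messageText> as in the original script...
}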

How to traverse Subversion repository to find specific file, and stop searching further down?

I have this problem: given a Subversion repository http://svn/trunk/ I want to search the whole repository to find/list all files named exp.xml (their whole URL). Once the first occurence has been found I want it to stop searching further down the URL. Just to make it clear, here are some fictitious URLs:
http://svn/trunk/pro1/sub-pro-x/exp.xml/sub-pro-x1/exp.xml
http://svn/trunk/pro2/sub-pro-y/pro-y1/exp.xml/sub-pro-y1/exp.xml
http://svn/trunk/pro3/sub-pro-z/exp.xml/sub-pro-z1/exp.xml/sub-proj/exp.xml
The result should be:
http://svn/trunk/pro1/sub-pro-x/exp.xml
http://svn/trunk/pro2/sub-pro-y/pro-y1/exp.xml
http://svn/trunk/pro3/sub-pro-z/exp.xml
Now I already have a solution, but it's not really very efficient, because I grep for exp.xml only after svn list -R has searched the whole repository (30-40 min). In case you want to know, here is the command:
svn list -R http://svn/trunk | grep /exp.xml
So my question is whether it is possible to make any significant speedup to this query? One thing I am thinking of is maybe using some language, preferably Perl, to directly traverse http://svn/trunk/ and process all the links, stopping the traversal further down when it finds the first exp.xml.
Thanks for your time.
If you want it to be faster, I would try checking out the SVN project and then searching the files on disk. You could perform a search using "find" in the checked-out sandbox (where "." assumes you are in the top directory of your project):
find . -name 'exp.xml'
but, similar to your "grep" solution, I don't think it achieves your "stop searching further" criterion. If you want a Perl script to search for "exp.xml" but stop recursing once it finds a match, try this (it takes the top-level directory as an argument):
#!/usr/bin/env perl
use warnings;
use strict;

my @dirs = $ARGV[0];
my @files;

DIR:
while (my $dir = shift @dirs) {
    opendir(my $dh, $dir) or die "Couldn't open dir $dir: $!";
    my @new_dirs;
    while (my $file = readdir($dh)) {
        # skip special directories (".", "..", and ".svn")
        next if $file =~ /^\./;
        # turn file into correct relative path
        $file = "$dir/$file";
        if (-d $file) {
            push @new_dirs, $file;
        }
        if ($file eq "$dir/exp.xml") {
            # if we matched, next outer loop so we don't recurse further
            push @files, $file;
            next DIR;
        }
    }
    # if we didn't match any files, we need to check sub-dirs
    push @dirs, @new_dirs;
}
print "$_\n" for @files;
Use svn ls [URL] or svn ls -R [URL] with your script to list the SVN repository starting at [URL]. See svn ls --help for more info.
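For instance, a hedged sketch of the same stop-early traversal driven by svn ls (rather than a checked-out sandbox) might look like this; the starting URL is the one from the question, so adjust as needed:

#!/usr/bin/env perl
use strict;
use warnings;

my @urls = ('http://svn/trunk');
while (my $url = shift @urls) {
    chomp(my @entries = `svn ls "$url"`);
    if (grep { $_ eq 'exp.xml' } @entries) {
        print "$url/exp.xml\n";
        next;                                 # stop descending below this directory
    }
    # directories are listed with a trailing slash; queue them for later
    push @urls, map { my $d = $_; $d =~ s{/$}{}; "$url/$d" } grep { m{/$} } @entries;
}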

How to redirect SVN stderr to /dev/null using Perl

I have a script to check whether any data is present in the SVN repo path but not yet added to SVN. It works fine for me, but it produces output on stderr for adding and sending files, like below:
Adding 1/a
Sending 1/a
Transmitting file data ...........
Committed revision 529.
Code:
use strict;
use warnings;

sub notAdded {
    my @svnstatus = `svn st`;
    foreach my $status (@svnstatus) {
        chomp($status);
        if ($status =~ m/^\?/) {
            my ($symbol, $left) = split(' ', $status);
            system("svn add $left");
        }
    }
}

&notAdded();
system("svn commit -m 'comment'");
Can anyone please suggest how I can redirect this output to /dev/null within the script?
The normal way to hide unwanted output with SVN is to use the -q (quiet) flag:
svn -q add nothere
displays nothing.
Or the really easy way:
system("svn add $left 2>/dev/null");

How to search an entire CVS repository (all branches/history/comments)?

If I want to essentially grep every line ever in the repository, is there a way to do it? I know this would take a long time for large projects.
If not all-inclusive, then at least the current branch and its entire source history?
Edit: I should have been more explicit. What if I don't have direct access to the server that the CVS repository is on? In that case I can't directly grep the filesystem that holds the CVS repository.
There is no way to do this with standard CVS tools without access to the repository. A third-party tool out there may do it (I don't know of one, although CS-CVS seems to claim to), but to do it programmatically you would have to run cvs log on all the relevant files, and then retrieve and search each version reported in the logs (cvs log is a CVS command that shows you the revision history of any file, but it doesn't show you the contents).
Here's what I recently used, in a case where I didn't have access to the server. It seemed to work that time. Call it from inside a working copy, with cvs in the PATH. Note that this doesn't search commit messages, but you can simply grep 'cvs log' for that.
#!/usr/bin/perl
# Searches CVS diffs and first revisions behind the current working
# directory for an expression (perlre syntax).
# Synopsis: cvsgrep [-n] <search-expression> [<file_1> ... <file_n>]
# -n means that contents of matching files should not be printed to stdout.
use Getopt::Std;

my %options=();
getopts("n",\%options);
my $no_content_dump=$options{"n"};

my $search_term=shift
    or die "Error: usage is: cvsgrep [-n] <search-expression>".
           " [<file_1> ... <file_n>]";

sub quote_fn
{
    my $fn=shift;
    $fn =~ s/\'/\'\"\'\"\'/g;
    "'".$fn."'";
}

my $args_str;
while(@ARGV)
{
    my $arg=shift;
    $args_str.=' ' if $args_str;
    $args_str.=&quote_fn($arg);
}

print
    "Searching for term: $search_term",
    ($args_str?" in: $args_str":""),
    "\n";

open CVSLOGH,"cvs log -N $args_str|" or die "Cannot execute cvs log: $!";

my @files_revisions=();
my $cur_file;
my $cur_revision;
while(<CVSLOGH>)
{
    chop;
    if(/^Working file\:\s*(.*)$/)
    {
        $cur_file=$1;
        $cur_revision='';
    }
    elsif(/^revision\s+(.*)$/)
    {
        $cur_revision=$1;
    }
    elsif((/^\=\=\=\=/ || /^\-\-\-\-/) && $cur_revision)
    {
        push @files_revisions,{file=>$cur_file,rev=>$cur_revision};
    }
}
close CVSLOGH;

my $matchcount=0;
my $count=0;
my $progress_msg="Scanned %d out of %d commit(s)\r";
my $erase_ln=(" " x (length($progress_msg)+20)) . "\r";

foreach my $file_revision(@files_revisions)
{
    printf($progress_msg,$count++,scalar(@files_revisions));
    my($file,$rev) = ($file_revision->{file},$file_revision->{rev});
    $rev =~ /^(.*\.)([0-9]+)/;
    my $revbase=$1;
    my $revlastdigit=$2;
    my $rev1=$revbase.($revlastdigit - 1);
    my $diffcommand = "cvs diff -N -r $rev1 -r $rev ".&quote_fn($file);
    open CVSDIFFH,"$diffcommand|" or die "Cannot execute cvs diff: $!";
    my $diffresult;
    while(<CVSDIFFH>)
    {
        if(/^[\<\>]/)
        {
            s/^.//;
            $diffresult.=$_;
        }
    }
    close CVSDIFFH;
    if($diffresult =~ /$search_term/s)
    {
        print "${erase_ln}FOUND: in diff for $file $rev1:$rev\n";
        $matchcount++;
        system($diffcommand) unless $no_content_dump;
    }
}
print "${erase_ln}Done ($matchcount match(es)).\n";
It depends on what you're looking for. CVS version files contain all of the edits that have ever happened to the file, in plaintext. So if you're simply looking for all files that contain a particular word, do a recursive grep on the repository.
If you're looking to find specific versions that contain those words, then you're going to have to extract the versions from the repository, which is expensive. However, if you can limit the set of files by grepping the repository, then it's not so bad.
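For the first case, a hedged sketch of that recursive grep in Perl, assuming you have filesystem access to the repository and that it lives at the hypothetical path below, might look like this:

use strict;
use warnings;
use File::Find;

my $cvsroot = '/path/to/cvsroot';     # hypothetical repository location
my $term    = qr/needle/;             # hypothetical search term

find(sub {
    return unless -f && /,v$/;        # CVS history files end in ,v
    open my $fh, '<', $_ or return;
    while (<$fh>) {
        if (/$term/) {
            print "$File::Find::name\n";   # report each matching version file once
            last;
        }
    }
    close $fh;
}, $cvsroot);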