cpan2rpm can't stat /tmp folder - perl

I'm trying to build perl-Heap-Priority for RHEL6. The weird thing is that when I run
cpan2rpm Heap::Priority it shows the following:
...
Tarball extraction: [/root/rpm/SOURCES/Heap-Priority-0.01.tar.gz]
Can't stat /tmp/CldQkErG6r/18:51: No such file or directory
at /usr/bin/cpan2rpm line 392
get_meta(): No such file or directory at /usr/bin/cpan2rpm line 396.
...
This temporary folder is simply not created. But why?
My /tmp folder permissions are 777:
drwxrwxrwt. 3 root root 4096 May 29 16:35 tmp

This is a known problem, see https://rt.cpan.org/Ticket/Display.html?id=72421. The cause is the space in the user column of tar's verbose output:
$ tar -tzvf $HOME/rpmbuild/SOURCES/Heap-Priority-0.01.tar.gz |head -1
drwxr-xr-x James Freeman/544 0 2002-05-07 14:51 Heap-Priority-0.01/
Apply the following patch to fix the problem for this module. To get the name, instead of accessing the fifth column, we're accessing the last one. I do not know what else this patch might break, but it should be less wrong than the original code on average.
diff --git a/cpan2rpm b/cpan2rpm
index 28e8b01..6a36b68 100755
--- a/cpan2rpm
+++ b/cpan2rpm
@@ -1259,7 +1259,7 @@ sub untar($) {
;
chomp($_ = qx/$cmd/);
- $_ = (split)[5] unless $zip;
+ $_ = (split)[-1] unless $zip;
$dst .= "/$1" if m|^(\S+)/?|;
$dst =~ s|/*$||; # path shouldn't end in / or tardir gets wiped
$dst =~ s|\./||; # paths in tarballs shouldn't be relative
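To see concretely why the fifth field goes wrong here, a minimal demonstration using the tarball's header line from above (the owner "James Freeman" contains a space, shifting every later field by one):
my $line = 'drwxr-xr-x James Freeman/544 0 2002-05-07 14:51 Heap-Priority-0.01/';
print +(split ' ', $line)[5], "\n";   # prints "14:51" - a time, not the path
print +(split ' ', $line)[-1], "\n";  # prints "Heap-Priority-0.01/" - the path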
You could have found out all of this by yourself by using the debugger. Learn to use this tool, it is invaluable.

I think this might be a slightly cleaner way to do it. With --numeric-owner, tar prints numeric user and group IDs, which cannot contain spaces, so the original field-splitting code works unchanged:
--- /usr/bin/cpan2rpm.orig 2017-10-20 14:45:57.000000000 -0700
+++ /usr/bin/cpan2rpm 2017-10-23 12:29:07.006118950 -0700
@@ -1258,7 +1258,7 @@
my $cmd = $zip
? "unzip -l $_ | grep -P -o '\\S+/\$' |tail -1"
- : "tar -t${z}vf $_ |head -1"
+ : "tar --numeric-owner -t${z}vf $_ |head -1"
;
chomp($_ = qx/$cmd/);

Related

Why does Perl file test operator "-l" not detect symlinks?

Why does the Perl file test operator "-l" fail to detect symlinks under the following conditions?
System Info
john@testbed-LT:/temp2/test$ uname -a
Linux Apophis-LT 4.13.0-37-generic #42-Ubuntu SMP Wed Mar 7 14:13:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
john@testbed-LT:/temp2/test$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 17.10
Release: 17.10
Codename: artful
Perl Info
john@testbed-LT:/temp2/test$ perl -v
This is perl 5, version 26, subversion 0 (v5.26.0) built for x86_64-linux-gnu-thread-multi (with 56 registered patches, see perl -V for more detail)
Test Resources
john@testbed-LT:/temp2/test$ touch regular_file
john@testbed-LT:/temp2/test$ mkdir dir
john@testbed-LT:/temp2/test$ ln -s regular_file symlink
john@testbed-LT:/temp2/test$ ls -al
total 12
drwxrwxr-x 3 john john 4096 May 6 02:29 .
drwxrwxrwx 6 john john 4096 May 6 02:29 ..
drwxrwxr-x 2 john john 4096 May 6 02:29 dir
-rw-rw-r-- 1 john john 0 May 6 02:29 regular_file
lrwxrwxrwx 1 john john 12 May 6 02:29 symlink -> regular_file
Script Containing Failing "-l" Operator
john@testbed-LT:/temp2/test$ cat ~/.scripts/test.pl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd 'abs_path';

my $targetDir = "/temp2/test";
opendir(DIR, $targetDir) || die "Can't open $targetDir: $!";
while (readdir DIR) {
    my $file = "$_";
    if($file =~ m/^\.{1,2}/) {
        next;
    }
    $file = abs_path($file);
    if(-l "$file") {
        print "Link: $file\n";
    }
    elsif(-d "$file") {
        print "Dir: $file\n";
    }
    elsif(-f "$file") {
        print "File: $file\n";
    }
    else {
        print "\n\n *** Unhandled file type for file [$file]!\n\n";
        exit 1;
    }
}
closedir(DIR);
Script Output
john@testbed-LT:/temp2/test$ perl ~/.scripts/test.pl
File: /temp2/test/regular_file
Dir: /temp2/test/dir
File: /temp2/test/regular_file
Problem I'm Trying to Solve
Note in the above output that the symlink (named "symlink") is not listed while the file, "regular_file," is listed twice (I want "symlink" listed -- the actual link and not the file it points to).
When I change ... if(-l "$file") ... to ... if(lstat "$file") ... in the script, again "symlink" is not listed while "regular_file" is listed twice, but they are being listed from within the block meant to catch symlinks, i.e.:
john@testbed-LT:/temp2/test$ perl ~/.scripts/test.pl
Link: /temp2/test/regular_file
Link: /temp2/test/dir
Link: /temp2/test/regular_file
Goal
The output I'm trying to achieve (which is faked below -- not actually generated by the script, but by hand) is:
john@testbed-LT:/temp2/test$ perl ~/.scripts/test.pl
File: /temp2/test/regular_file
Dir: /temp2/test/dir
Link: /temp2/test/symlink
...but not necessarily in that order (I don't care about the order of the listing).
Why is the above-shown script not achieving the above-stated goal (why is the "-l" operator not working)?
perldoc Cwd:
abs_path
my $abs_path = abs_path($file);
Uses the same algorithm as getcwd(). Symbolic links and relative-path components ("." and "..") are *resolved* to return the canonical pathname, just like realpath(3). On error returns undef, with $! set to indicate the error.
(Emphasis mine.)
If you want to see symlinks, don't use abs_path.
What you want to do instead is
$file = "$targetDir/$file";
i.e. prepend the name of the directory you read $file from.
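You can watch abs_path doing the unwanted resolution from the shell (a quick demonstration, assuming the /temp2/test layout above):
john@testbed-LT:/temp2/test$ perl -MCwd=abs_path -le 'print abs_path("symlink")'
/temp2/test/regular_file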
Additional notes:
opendir(DIR, $targetDir) || die "Can't open $targetDir: $!";
while (readdir DIR) {
my $file = "$_";
should be
opendir(my $dh, $targetDir) || die "Can't open $targetDir: $!";
while (my $file = readdir $dh) {
Why use bareword filehandles when you can just use normal variables (that are scoped properly)?
There's no reason to quote "$_" here.
Why first assign to $_ when you're just going to copy the string to $file in the next step?
Note in the above output that the symlink (named "symlink") is not listed while the file, "regular_file," is listed twice
Yeah, because you used abs_path to turn symlink into /temp2/test/regular_file. Get rid of that line.
By the way, you are missing
$file = "$targetDir/$file";
The only reason your program worked without it is because $targetDir happened to be the current work directory.
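Putting those corrections together, a fixed version of the whole script might look like this (a sketch based on the advice above, not code from the original answer):
#!/usr/bin/perl
use strict;
use warnings;

my $targetDir = "/temp2/test";
opendir(my $dh, $targetDir) || die "Can't open $targetDir: $!";
while (my $file = readdir $dh) {
    next if $file =~ /^\.{1,2}$/;    # skip . and .. (anchored, unlike the original)
    my $path = "$targetDir/$file";   # prepend the directory; no abs_path, so symlinks survive
    if (-l $path) {                  # -l must come before -f, which follows symlinks
        print "Link: $path\n";
    }
    elsif (-d $path) {
        print "Dir: $path\n";
    }
    elsif (-f $path) {
        print "File: $path\n";
    }
    else {
        print "\n\n *** Unhandled file type for file [$path]!\n\n";
        exit 1;
    }
}
closedir($dh);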

How to capture large STDOUT output in Perl when executing an external command

I want to get the list of file names present in the remote location.
I am using the below snippet in my Perl script.
my $command = "sftp -q -o${transferAuthMode}=yes -oPort=$sftpPort ${remoteUsername}\#${remoteHost} 2>\&1 <<EOF\n" .
"cd \"${remotePath}\"\n" .
"ls -l \n" .
"quit\n" .
"EOF\n";
my @files = `$command`;
When the number of files in the remote location is large (>500), not all the file names are captured in @files.
When I manually do SFTP and list the files, all of them are listed, but I don't get the same result through the script. The size of @files is different on each run. This occurs only when there is a large number of files.
I'm unable to find the reason behind this. Could you please help?
This can be achieved without requiring any additional packages or modules. I tested this on my CentOS 7 server (a VM on my Windows machine).
My remote host details: I have ~2000 files in the remote host directory, on a CentOS 6.8 server.
%_gaurav@[remotehost]:/home/gaurav/files/test> ls -lrth|head -3;echo;ls -lrth|tail -2
total 7.9M
-rw-rw-r--. 1 gaurav gaurav 35 Feb 16 23:51 File-0.txt
-rw-rw-r--. 1 gaurav gaurav 35 Feb 16 23:51 File-1.txt
-rw-rw-r--. 1 gaurav gaurav 38 Feb 16 23:51 File-1998.txt
-rw-rw-r--. 1 gaurav gaurav 38 Feb 16 23:51 File-1999.txt
%_gaurav@[remotehost]: /home/gaurav/files/test>
Script output from localhost: please note that I am running your command sans the -o${transferAuthMode}=yes part. As seen below, the script is able to gather all results in an array, well over 500 entries.
I am printing the total count and a few particular index numbers from the array to show the results, but give it a try with the Dumper line uncommented to see the full result.
%_STATION@gaurav * /root/ga/study/pl> ./scp.pl
Read 2003 lines from SCP command.
ArrayIndex: 2,3,1999,2000 contain:
[-rw-rw-r-- 0 501 501 36B Feb 16 23:51 File-58.txt]
[-rw-rw-r-- 0 501 501 37B Feb 16 23:51 File-129.txt]
[-rw-rw-r-- 0 501 501 38B Feb 16 23:51 File-1759.txt]
[-rw-rw-r-- 0 501 501 38B Feb 16 23:51 File-1810.txt]
%_STATION@gaurav * /root/ga/study/pl>
Script and its Working:
#!/usr/bin/perl
use strict ;
use warnings ;
use Data::Dumper ;

my $sftp_port = 22 ;
my ($user, $host) = ("gaurav","192.168.246.137") ;
my $remote_path = '/home/gaurav/files/test' ;
my @result ; # To store result
my $command = "sftp -q -oPort=$sftp_port ${user}\@${host} 2>\&1 <<EOF\n"."cd $remote_path\nls -lrth\nquit\nEOF" ;

# open the command as a file handle, read output and store it.
open FH, "$command |" or die "Something went wrong!!\n" ;
while (<FH>) {
    tr/\r\f\n//d ; # Removing any newline, carriage return or form feed.
    push(@result, "\[$_\]") ;
}
close FH ;

#print Dumper @result ;

# Just for printing a little bit of results from
# the array. Following lines can be deleted.
my $total = scalar @result ;
print "Read $total lines from SCP command.\n" ;
print "\nArrayIndex: 2,3,1999,2000 contain:\n
$result[2]
$result[3]
$result[1999]
$result[2000]
" ;
Another way: one could also get around this issue by writing a shell script, calling it from the Perl script, and reading its output. Below are my shell script, which gets called by the Perl script, and the final output. This can be a quick technique when one doesn't have much time to formulate commands in Perl directly. You can use the qx style (shown below) in the earlier script as well.
Shell script "scp.sh"
%_STATION@gaurav * /root/ga/study/pl> cat scp.sh
#!/bin/bash
sftp -oPort=${1} ${2}@${3} 2>&1 <<EOF
cd ${4}
ls -l
quit
EOF
Perl Script "2scp.pl"
%_STATION@gaurav * /root/ga/study/pl> cat 2scp.pl
#!/usr/bin/perl
use strict ;
use warnings ;
use Data::Dumper ;
my $sftp_port=22 ;
my ($user, $host) = ("gaurav","192.168.246.137") ;
my $remote_path = '/home/gaurav/files/test' ;
# Passing arguments to the shell script using concatenation.
my $command = './scp.sh '." $sftp_port $user $host $remote_path" ;
my @result = qx{$command} ; # Runs the command and stores the result.
my $total = scalar @result ;
print "Read $total lines from SCP command.\n" ;
# End.
Output:
%_STATION@gaurav * /root/ga/study/pl> ./2scp.pl
Read 2004 lines from SCP command.
%_STATION@gaurav * /root/ga/study/pl>
Try it out and let us know.
Thanks.

Pass "file name" from a text file to a command line where each line of a file is file name

I'm running the following code
git log --pretty=format: --numstat -- SOMEFILENAME |
perl -ane '$i += ($F[0]-$F[1]); END{print "changed: $i\n"}' \
>> random.txt
What this does: for the file named "SOMEFILENAME", it sums the added minus removed line counts across the history and appends the total to a text file called "random.txt".
I need to run this program on every file in the repository, and there are lots of them. What would be an easy way to do this?
If you want a total per file:
git log --pretty=format: --numstat |
perl -ane'
$c{$F[2]} += $F[0]-$F[1] if $F[2];
END { print "$_\t$c{$_}\n" for sort keys %c }
' >random.txt
If you want a single total:
git log --pretty=format: --numstat |
perl -ane'
$c += $F[0]-$F[1];
END { print "$c\n" }
' >random.txt
Their respective outputs are:
.gitignore 22
Build.PL 48
CHANGES.txt 0
Changes 25
LICENSE 132
LICENSE.txt 0
MANIFEST 18
MANIFEST.SKIP 9
README.txt 67
TODO.txt 1
lib/feature/qw_comments.pm 129
lib/feature/qw_comments.xs 250
t/00_load.t 13
t/01_basic.t 85
t/02_pragma.t 56
t/03_line_numbers.t 37
t/04_errors.t 177
t/05-unicode.t 39
t/devel-pod-coverage.t 26
t/pod.t 17
and
1151
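For readers unfamiliar with the switches: -n wraps the code in a while (<>) loop over the input lines, -a autosplits each line into @F, and -e supplies the code. The per-file one-liner above is therefore roughly equivalent to this expanded sketch:
while (<>) {
    my @F = split;                         # -a: autosplit on whitespace
    $c{$F[2]} += $F[0] - $F[1] if $F[2];   # skip the blank separator lines
}
print "$_\t$c{$_}\n" for sort keys %c;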
Rather than use find, you can just let git give you all the files by using the name . (representing the current directory). With that, here's a version using awk that prints out stats per file:
git log --pretty=format: --numstat -- . |
awk '
NF == 3 {changed[$3] += $1 - $2}
END { for (name in changed) { printf("%s: %d changed\n", name, changed[name]); } }
'
And an even shorter one that prints a single overall changed line:
git log --pretty=format: --numstat -- . |
awk '
NF == 3 {changed += $1 - $2}
END { printf("%d changed\n", changed); }
'
(The NF == 3 is to account for the fact that git seems to print spurious blank lines in its output. I didn't try to figure out if there's a better git command.)
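For reference, each --numstat record has three whitespace-separated columns, lines added, lines removed, and the path (illustrative values below):
10      3       lib/Foo.pm
0       7       t/01_basic.t
which is why the Perl versions read $F[0], $F[1], $F[2] and the awk versions read $1, $2, $3.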

Script to find all similarly named files (differing only by case?)

I have been working on an SVN repo using the command line only. I now have to bring in users who require a GUI to interface with the repo; however, this is presenting a number of problems with similarly named files.
As it so happens a large number of images have been duplicated for reasons due to lack of communication or laziness.
I would like to be able to search recursively from a given folder and identify all files that differ only by case/capitalization and have the same file size, since it is certainly possible that name conflicts exist between genuinely different files, although I've not encountered any yet.
I don't mind hammering out a Perl script to handle this myself, but I wonder if such a thing already exists, or if anybody has any tips, before I roll my sleeves up?
Thanks :D
I lean on md5sum for this type of problem:
find * -type f | xargs md5sum | sort | uniq -Dw32
If you are using svn, you'll want to exclude your .svn directories. This will print out all files with their paths that have identical content.
If you really want to match only files that differ by case, you can add a few more stages to the pipeline (uniq -D prints all duplicated lines, -w32 compares only the first 32 characters, i.e. the MD5 digest, and the final uniq -i compares case-insensitively):
find * -type f | xargs md5sum | sort | uniq -Dw32 | awk -F'[ /]' '{ print $NF }' | sort -f | uniq -Di
myimage_23.png
MyImage_23.png
A shell script to list all filenames in a Subversion working directory that differ only in case from another filename in the same directory, and therefore will cause problems for Subversion clients on case-insensitive file systems, which cannot distinguish between such filenames:
find . -name .svn -type d -prune -false -o -print | \
perl -ne 'push @{$f{lc($_)}}, $_; END{map{print @{$f{$_}}} grep {@{$f{$_}}>1} sort keys %f}'
I have not used it personally but the Duplicate Files Finder looks like it would be suitable.
However, it will identify any duplicate files, regardless of file name, so you might have to filter the results if you only want duplicates with case-insensitive-matching file names.
It is open source, available on Windows and Linux, has both command line and GUI interfaces, and from the description the algorithm sounds very fast (only compares files with the same size rather than producing a checksum for every file).
I guess it would be something like:
#!perl
use File::Spec;

sub check_dir {
    my ($dir, $out) = @_;
    $out ||= [];
    opendir DIR, $dir or die "Impossible to read dir: $!";
    # Sort case-insensitively so that case-variants end up adjacent;
    # ignore files starting with a dot.
    my @files = map { File::Spec->catfile($dir, $_) }
                sort { lc $a cmp lc $b }
                grep { /^[^\.]/ } readdir(DIR);
    closedir DIR;
    my @nd = grep { ! -d $_ } @files; # non-directories, as full paths
    for my $i (0 .. $#nd-1){
        push @$out, $nd[$i]
            if lc $nd[$i] eq lc $nd[$i+1]
            and -s $nd[$i] == -s $nd[$i+1];
    }
    map { -d $_ ? check_dir($_, $out) : () } @files; # recurse into subdirectories
    return $out;
}
print join "\n", @{ check_dir(shift @ARGV) }, "";
Please check it before using it; I have no access to Windows machines (this does not happen on Un*x). Also, note that in the case of two files with the same name (except for the case) and the same size, only the first will be printed. In the case of three, only the first two, and so on (of course, you will need to keep one!).
As far as I know what you want doesn't exist as such. However, here's an implementation in bash:
#!/bin/bash
dir=("$@")
matched=()
files=()

lc(){ tr '[:upper:]' '[:lower:]' <<< ${*} ; }

in_list() {
    local search="$1"
    shift
    local list=("$@")
    for file in "${list[@]}" ; do
        [[ $file == "$search" ]] && return 0
    done
    return 1
}

while read -r file ; do
    files=("${files[@]}" "$file")
done < <(find "${dir[@]}" -type f | sort)

for file1 in "${files[@]}" ; do
    for file2 in "${files[@]}" ; do
        if
            # check that the file did not match already
            ! in_list "$file1" "${matched[@]}" &&
            # check that the files are not the same file
            # (note: stat -f is the BSD syntax; GNU stat uses -c instead)
            ! [ $(stat -f %i "${file1}") -eq $(stat -f %i "${file2}") ] &&
            # check that the size of the files are the same
            [ $(stat -f %z "${file1}") = $(stat -f %z "${file2}") ] &&
            # check that the non-directory part (aka file name) of the two
            # files match case insensitively
            grep -q $(lc "${file1##*/}") <<<$(lc "${file2##*/}")
        then
            matched=("${matched[@]}" "$file1")
            echo "$file1"
            break
        fi
    done
done
EDIT: Added comments and, inspired by TLP's comment, made only the file part of the path matter for equality comparisons. This has now been tested to a reasonable minimum degree and I expect that it won't explode in your face.
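A hypothetical invocation (the script name find_case_dups.sh is my placeholder, assuming it has been saved and made executable):
$ ./find_case_dups.sh /path/to/svn/working/copy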
Here's a Ruby script to recursively search for files that differ only in case.
#!/usr/bin/ruby
# encoding: utf-8

def search( directory )
  set = {}
  Dir.entries( directory ).each do |entry|
    next if entry == '.' || entry == '..'
    path = File.join( directory, entry )
    key = path.upcase
    set[ key ] = [] unless set.has_key?( key )
    set[ key ] << entry
    search( path ) if File.directory?( path )
  end
  set.delete_if { |key, entries| entries.size == 1 }
  set.each do |key, entries|
    entries.each do |entry|
      puts File.join( directory, entry )
    end
  end
end

search( File.expand_path( ARGV[ 0 ] ) )

How can I remove relative path components but leave symlinks alone in Perl?

I need to get Perl to remove relative path components from a Linux path. I've found a couple of functions that almost do what I want, but:
File::Spec->rel2abs does too little. It does not resolve ".." into a directory properly.
Cwd::realpath does too much. It resolves all symbolic links in the path, which I do not want.
Perhaps the best way to illustrate how I want this function to behave is to post a bash log where FixPath is a hypothetical command that gives the desired output:
'/tmp/test'$ mkdir -p a/b/c1 a/b/c2
'/tmp/test'$ cd a
'/tmp/test/a'$ ln -s b link
'/tmp/test/a'$ ls
b link
'/tmp/test/a'$ cd b
'/tmp/test/a/b'$ ls
c1 c2
'/tmp/test/a/b'$ FixPath . # rel2abs works here
===> /tmp/test/a/b
'/tmp/test/a/b'$ FixPath .. # realpath works here
===> /tmp/test/a
'/tmp/test/a/b'$ FixPath c1 # rel2abs works here
===> /tmp/test/a/b/c1
'/tmp/test/a/b'$ FixPath ../b # realpath works here
===> /tmp/test/a/b
'/tmp/test/a/b'$ FixPath ../link/c1 # neither one works here
===> /tmp/test/a/link/c1
'/tmp/test/a/b'$ FixPath missing # should work for nonexistent files
===> /tmp/test/a/b/missing
Alright, here is what I came up with:
sub mangle_path {
    # NOT PORTABLE
    # Attempt to remove relative components from a path - can return
    # incorrect results for paths like ../some_symlink/.. etc.
    use Cwd qw(getcwd); # needed for getcwd below
    my $path = shift;
    $path = getcwd . "/$path" if '/' ne substr $path, 0, 1;
    my @dirs = ();
    for (split '/', $path) {
        pop @dirs, next if $_ eq '..';
        push @dirs, $_ unless $_ eq '.' or $_ eq '';
    }
    return '/' . join '/', @dirs;
}
I know this is possibly insecure and invalid, but any input to this routine will come from me on the command line, and it solves a couple of tricky use cases for me.
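For instance, replaying a couple of the FixPath examples from the question (a sketch, assuming the current working directory is /tmp/test/a/b as in the bash log above):
print mangle_path('../link/c1'), "\n";  # prints /tmp/test/a/link/c1
print mangle_path('missing'), "\n";     # prints /tmp/test/a/b/missing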