How do you climb up the parent directory structure of a bash script? - perl

Is there a neater way of climbing up multiple directory levels from the location of a script?
This is what I currently have.
# get the full path of the script
D=$(cd ${0%/*} && echo $PWD/${0##*/})
D=$(dirname $D)
D=$(dirname $D)
D=$(dirname $D)
# second level parent directory of script
echo $D
I would like a neat way of finding the nth level. Any ideas other than putting in a for loop?

dir="/path/to/somewhere/interesting"
saveIFS=$IFS
IFS='/'
parts=($dir)
IFS=$saveIFS
level=${parts[3]}
echo "$level" # output: somewhere

#!/bin/bash
ancestor() {
local n=${1:-1}
(for ((; n > 0; n--)); do cd "$(dirname "$PWD")" || exit; done; pwd)
}
Usage:
$ pwd
/home/nix/a/b/c/d/e/f/g
$ ancestor 3
/home/nix/a/b/c/d

A solution without loops would be to use recursion. I wanted to find a config file for a script by traversing backwards up from my current working directory.
rtrav() { test -e "$2/$1" && echo "$2" || { test "$2" != / && rtrav "$1" "$(dirname "$2")"; }; }
To check whether the current directory is inside a Git repo: rtrav .git "$PWD"
rtrav checks for the file named by the first argument in the directory given as the second argument, and then in each of its parent directories. It prints the path of the directory where the file was found, or exits with an error code if the file was not found.
The predicate (test -e "$2/$1") could be swapped for a check of a counter that tracks the traversal depth.
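As a rough sketch of that last idea, the recursion below replaces the file test with a depth counter. The name nth_parent and its argument order are made up for this illustration, and it stops early if it reaches /.

```shell
# nth_parent DIR N: print the directory N levels above DIR.
# Hypothetical helper, not part of the answer above.
nth_parent() {
    if [ "$2" -le 0 ] || [ "$1" = / ]; then
        printf '%s\n' "$1"          # depth exhausted, or nowhere left to climb
    else
        nth_parent "$(dirname -- "$1")" "$(( $2 - 1 ))"
    fi
}

nth_parent /home/nix/a/b/c/d/e/f/g 3   # prints /home/nix/a/b/c/d
```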

If you're OK with including a Perl command:
$ pwd
/u1/myuser/dir3/dir4/dir5/dir6/dir7
The first command prints the path made up of the first N (in my case 5) directories:
$ perl -e 'use File::Spec;
my @dirs = File::Spec->splitdir(
File::Spec->rel2abs( File::Spec->curdir() ) );
my @dirs2 = @dirs[0..5]; print File::Spec->catdir(@dirs2) . "\n";'
/u1/myuser/dir3/dir4/dir5
The second command prints the directory N (in my case 5) levels up (I think you wanted the latter):
$ perl -e 'use File::Spec;
my @dirs = File::Spec->splitdir(
File::Spec->rel2abs( File::Spec->curdir() ) );
my @dirs2 = @dirs[0..$#dirs-5]; print File::Spec->catdir(@dirs2) . "\n";'
/u1/myuser
To use it in your bash script, of course:
D=$(perl -e 'use File::Spec;
my @dirs = File::Spec->splitdir(
File::Spec->rel2abs( File::Spec->curdir() ) );
my @dirs2 = @dirs[0..$#dirs-5]; print File::Spec->catdir(@dirs2) . "\n";')

Any ideas other than putting in a for loop?
In shells, you can't avoid the loop, because traditionally they do not support regexps, only glob matching, and glob patterns do not support any sort of repeat counter.
And BTW, the simplest way to do it in the shell is: echo $(cd $PWD/../.. && echo $PWD) where the /../.. makes it strip two levels.
With a tiny bit of Perl that would be:
perl -e '$ENV{PWD} =~ m#^(.*)(/[^/]+){2}$# && print $1,"\n"'
The {2} in the Perl regular expression is the number of directory entries to strip. Or, making it configurable:
N=2
perl -e '$ENV{PWD} =~ m#^(.*)(/[^/]+){'$N'}$# && print $1,"\n"'
One can also use Perl's split(), join() and splice() for the purpose, e.g.:
perl -e '@a=split("/", $ENV{PWD}); print join("/", splice(@a, 0, -2)),"\n"'
where -2 says that the last two entries of the path are to be removed.
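For what it's worth, the /../.. suffix from the shell one-liner can be generated for any N without writing a loop in the script, by letting printf repeat its format. This is only a sketch: up_levels is a made-up name, and it assumes seq is available and that the resulting directory exists (cd has to succeed).

```shell
# up_levels N [DIR]: print the directory N levels above DIR (default: $PWD).
up_levels() {
    local n="$1" dir="${2:-$PWD}" suffix=""
    if [ "$n" -gt 0 ]; then
        # printf reuses its format once per argument; %.0s prints nothing
        # from the argument itself, so the result is "/.." repeated N times.
        suffix=$(printf '/..%.0s' $(seq 1 "$n"))
    fi
    (cd -- "$dir$suffix" && pwd)
}

N=2
up_levels "$N"    # same result as: cd "$PWD/../.." && pwd
```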

Two levels above the script directory:
echo "$(readlink -f -- "$(dirname -- "$0")/../..")"
All the quoting and -- are to avoid problems with tricky paths.

This method uses the actual full path to the Perl script itself ... TIMTOWTDI
You could just as easily replace $RunDir with the path you would like to start from ...
# resolve the run dir where this script is placed
$0 =~ m/^(.*)(\\|\/)(.*)\.([a-z]*)/;
$RunDir = $1 ;
# change the \'s to /'s if we are on Windows
$RunDir =~ s/\\/\//g ;
my @DirParts = split ('/' , $RunDir) ;
# go four levels up
for (my $count=0; $count < 4; $count++) { pop @DirParts ; }
# presumably the remaining parts are joined back into the base directory
my $ProductBaseDir = join ('/' , @DirParts) ;
$confHolder->{'ProductBaseDir'} = $ProductBaseDir ;

This allows you to work your way up until whatever condition you want is met:
WORKDIR=$PWD
until test -d "$WORKDIR/infra/codedeploy"; do
# move up one level
WORKDIR=$(dirname "$WORKDIR")
done
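One caveat: this loop never terminates if no ancestor matches, because dirname / is / again. A guarded variant might look like the sketch below; find_up is a hypothetical name, and the starting directory is assumed to be an absolute path.

```shell
# find_up RELPATH [START]: print the nearest ancestor of START (default: $PWD)
# that contains RELPATH; return 1 if / is reached without a match.
find_up() {
    local target="$1" dir="${2:-$PWD}"
    while :; do
        if [ -e "$dir/$target" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Stop at the root (or at "." if a relative path slipped in).
        if [ "$dir" = / ] || [ "$dir" = . ]; then
            return 1
        fi
        dir=$(dirname -- "$dir")
    done
}

WORKDIR=$(find_up infra/codedeploy) || echo "infra/codedeploy not found" >&2
```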

Related

Perl script throws syntax error for awk command

I have a file which contains each user's userid and password. I need to fetch the userid and password from that file by passing the userid as a search element to an awk command.
user101,smith,smith#123
user102,jones,passj#007
user103,albert,albpass#01
I am using an awk command inside my Perl script like this:
...
...
my $userid = $ARGV[0];
my $user_report_file = "report_file.txt";
my $data = `awk -F, '$1 ~ /$userid/ {print $2, $3}' $user_report_file`;
my ($user,$pw) = split(" ",$data);
...
...
Here I am getting the error:
awk: ~ /user101/ {print , }
awk: ^ syntax error
But if I run the same command in a terminal window, it gives the result below:
$] awk -F, '$1 ~ /user101/ {print $2, $3}' report_file.txt
smith smith#123
What could be the issue here?
The backticks are a double-quoted context, so you need to escape any literal $ that you want awk to interpret.
my $data = `awk -F, '\$1 ~ /$userid/ {print \$2, \$3}' $user_report_file`;
If you don't do that, you're interpolating the capture variables from the last successful Perl match.
When I have these sorts of problems, I try the command as a string first to see if it is what I expect:
my $data = "awk -F, '\$1 ~ /$userid/ {print \$2, \$3}' $user_report_file";
say $data;
Here's the Perl equivalent of that command:
$ perl -aF, -e '$F[0]=~/101/ && print "@F[1,2]"' report_file
But, this is something you probably want to do in Perl instead of creating another process:
Interpolating data into external commands can go wrong, such as a filename that is foo.txt; rm -rf /.
The awk you run is the first one in the path, so someone can make that a completely different program (so use the full path, like /usr/bin/awk).
Taint checking can tell you when you are passing unsanitized data to the shell.
Inside a program you don't get all the shortcuts, but if this is the part of your program that is slow, you probably want to rethink how you are accessing this data because scanning the entire file with any tool isn't going to be that fast:
open my $fh, '<', $user_report_file or die;
while( <$fh> ) {
chomp;
my @F = split /,/;
next unless $F[0] =~ /\Q$userid/;
print "@F[1,2]\n";
last; # if you only want the first one
}
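If awk is kept, the interpolation problem can also be sidestepped on the awk side by handing the user ID over as an awk variable instead of splicing it into the program text. A minimal shell sketch, assuming the comma-separated layout shown in the question; lookup_user is a name invented here, and it matches the ID exactly rather than as a regex.

```shell
# lookup_user FILE USERID: print "name password" for the first exact match.
lookup_user() {
    # -v passes the ID as data, so it is not parsed as part of the awk program.
    awk -F, -v id="$2" '$1 == id { print $2, $3; exit }' "$1"
}

# Usage: lookup_user report_file.txt user101
```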

how do I find two parent directories up in perl?

I have a path: /path/to/here/file.txt.
I want to get /path/to/
Using
my ($file, $dir) = fileparse($fullPath);
I can get file.txt and /path/to/here/
How do I get just /path/to/?
use Path::Class qw( file );
say file("/path/to/here/file.txt")->dir->parent;
Note that this does not perform any file-system checks, so it will return /path/to even if /path/to is a symbolic link and thus not truly the parent directory.
Using Path::Tiny:
$ perl -MPath::Tiny -e 'CORE::say path($ARGV[0])->parent->parent' /path/to/here/file.txt
This does not perform any file system checks either. Doing it using only File::Spec tends to get tedious. I am not positive the following works:
$ perl -MFile::Spec::Functions=splitpath,catpath,catdir,splitdir -e \
'($v, $d) = splitpath($ARGV[0]); @d = splitdir $d; splice @d, -2;
CORE::say catpath($v, catdir(@d))' /path/to/here/file.txt

perl query using -pie

This works:
perl -pi -e 's/abc/cba/g' hellofile
But this does not:
perl -pie 's/cba/abc/g' hellofile
In other words -pi -e works but -pie does not. Why?
The -i flag takes an optional argument (which, if present, must be immediately after it, not in a separate command-line argument) that specifies the suffix to append to the name of the input file for the purposes of creating a backup. Writing perl -pie 's/cba/abc/g' hellofile causes the e to be taken as this suffix, and as the e isn't interpreted as the normal -e option, Perl tries to run the script located in s/cba/abc/g, which probably doesn't exist.
Because -i takes an optional extension for backup files, e.g. -i.bak, and therefore additional flags cannot follow directly after -i.
From perldoc perlrun
-i[extension]
specifies that files processed by the <> construct are to be edited
in-place. It does this by renaming the input file, opening the output
file by the original name, and selecting that output file as the
default for print() statements. The extension, if supplied, is used to
modify the name of the old file to make a backup copy, following these
rules:
If no extension is supplied, no backup is made and the current file is
overwritten.
If the extension doesn't contain a * , then it is appended to the end
of the current filename as a suffix. If the extension does contain one
or more * characters, then each * is replaced with the current
filename. In Perl terms, you could think of this as:
perl already tells you why :) Try-It-To-See
$ perl -pie " s/abc/cba/g " NUL
Can't open perl script " s/abc/cba/g ": No such file or directory
If you use B::Deparse you can see how perl compiles your code
$ perl -MO=Deparse -pi -e " s/abc/cba/g " NUL
BEGIN { $^I = ""; }
LINE: while (defined($_ = <ARGV>)) {
s/abc/cba/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
-e syntax OK
If you lookup $^I in perlvar you can learn about the -i switch :)
$ perldoc -v "$^I"
$INPLACE_EDIT
$^I The current value of the inplace-edit extension. Use "undef" to
disable inplace editing.
Mnemonic: value of -i switch.
Now if we revisit the first part, add an extra -e, then add Deparse, the -i switch is explained
$ perl -pie -e " s/abc/cba/g " NUL
Can't do inplace edit: NUL is not a regular file.
$ perl -MO=Deparse -pie -e " s/abc/cba/g " NUL
BEGIN { $^I = "e"; }
LINE: while (defined($_ = <ARGV>)) {
s/abc/cba/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
-e syntax OK
Could it really be that e in -pie is taken as extension? I guess so
$ perl -MO=Deparse -pilogicus -e " s/abc/cba/g " NUL
BEGIN { $^I = "logicus"; }
LINE: while (defined($_ = <ARGV>)) {
s/abc/cba/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
-e syntax OK
When in doubt, Deparse or Deparse,-p
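The effect is easy to reproduce in a scratch directory. The sketch below assumes perl is on the PATH; the temporary directory and file contents are invented for the demonstration. With -pie -e, the e is the backup suffix, so a backup named hellofilee is left behind.

```shell
demo_dir=$(mktemp -d)
printf 'abc\n' > "$demo_dir/hellofile"

# -pi -e: in-place edit, no backup file.
perl -pi -e 's/abc/cba/g' "$demo_dir/hellofile"

# -pie -e: "e" is taken as the backup suffix, the second -e supplies the code.
perl -pie -e 's/cba/abc/g' "$demo_dir/hellofile"

ls "$demo_dir"    # lists hellofile and hellofilee
```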

Script to find all similarly named files (differing only by case?)

I have been working on an SVN repo using the command line only. I now have to bring in users who require a GUI to interface with the repo; however, this is presenting a number of problems with similarly named files.
As it so happens, a large number of images have been duplicated because of a lack of communication or laziness.
I would like to search recursively from a given folder and identify all files that differ only by case/capitalization. They must also have the same file size, since it is certainly possible that conflicts exist between genuinely different files, although I've not encountered any yet.
I don't mind hammering out a Perl script to handle this myself, but I'm wondering whether such a thing already exists, or whether anybody has any tips before I roll my sleeves up?
Thanks :D
I lean on md5sum for this type of problem:
find * -type f | xargs md5sum | sort | uniq -Dw32
If you are using svn, you'll want to exclude your .svn directories. This will print out all files with their paths that have identical content.
If you really want to only match files that differ by case, you can add a few more things to the above pipeline:
find * -type f | xargs md5sum | sort | uniq -Dw32 | awk -F'[ /]' '{ print $NF }' | sort -f | uniq -Di
myimage_23.png
MyImage_23.png
A shell script to list all filenames in a Subversion working directory that differ only in case from another filename in the same directory, and therefore will cause problems for Subversion clients on case-insensitive file systems, which cannot distinguish between such filenames:
find . -name .svn -type d -prune -false -o -print | \
perl -ne 'push @{$f{lc($_)}}, $_; END{map{print @{$f{$_}}} grep {@{$f{$_}}>1} sort keys %f}'
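A rough sketch that combines both conditions from the question (paths equal apart from case, and equal size), using only find, wc, awk and sort. case_dupes is a made-up name; whole paths are compared, as in the Perl one-liner, and file names containing tabs or newlines are assumed not to occur.

```shell
# case_dupes [DIR]: list files whose paths differ only by case and whose
# sizes are equal. Skips .svn directories.
case_dupes() {
    find "${1:-.}" -name .svn -prune -o -type f -print |
    while IFS= read -r f; do
        # emit "size<TAB>path" for every file
        printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    done |
    awk -F'\t' '
        { key = $1 "\t" tolower($2); paths[key] = paths[key] $2 "\n"; n[key]++ }
        END { for (k in n) if (n[k] > 1) printf "%s", paths[k] }' |
    sort -f
}

# Usage: case_dupes /path/to/working-copy
```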
I have not used it personally but the Duplicate Files Finder looks like it would be suitable.
However, it will identify any duplicate files, regardless of file name, so you might have to filter the results if you only want duplicates with case-insensitive-matching file names.
It is open source, available on Windows and Linux, has both command line and GUI interfaces, and from the description the algorithm sounds very fast (only compares files with the same size rather than producing a checksum for every file).
I guess it would be something like:
#!perl
use File::Spec;
sub check_dir {
my ($dir, $out) = @_;
$out ||= [];
opendir my $dh, $dir or die "Impossible to read dir $dir: $!";
# Ignore entries starting with a dot; sort case-insensitively so that
# names differing only by case end up next to each other
my @files = sort { lc $a cmp lc $b or $a cmp $b } grep { /^[^\.]/ } readdir($dh);
closedir $dh;
# Use full paths so the file tests do not depend on the current directory
my @paths = map { File::Spec->catfile($dir, $_) } @files;
my @nd = grep { ! -d $_ } @paths;
for my $i (0 .. $#nd-1){
push @$out, $nd[$i]
if lc $nd[$i] eq lc $nd[$i+1]
and -s $nd[$i] == -s $nd[$i+1];
}
check_dir($_, $out) for grep { -d $_ } @paths;
return $out;
}
print join "\n", @{check_dir(shift @ARGV)}, "";
Please check it before using it, I have no access to windows machines (this does not happen in Un*x). Also, note that in the case of two files with the same name (except for the case) and the same size, only the first will be printed. In the case of three, only the first two, and so on (of course, you will need to keep one!).
As far as I know what you want doesn't exist as such. However, here's an implementation in bash:
#!/bin/bash
dir=("$@")
matched=()
files=()
lc(){ tr '[:upper:]' '[:lower:]' <<< "${*}" ; }
in_list() {
local search="$1"
shift
local list=("$@")
for file in "${list[@]}" ; do
[[ $file == "$search" ]] && return 0
done
return 1
}
while read -r file ; do
files=("${files[@]}" "$file")
done < <(find "${dir[@]}" -type f | sort)
for file1 in "${files[@]}" ; do
for file2 in "${files[@]}" ; do
if
# check that the file did not match already
! in_list "$file1" "${matched[@]}" &&
# check that the files are not the same file
# (stat -f is the BSD/macOS form; GNU stat uses -c %i and -c %s)
! [ $(stat -f %i "${file1}") -eq $(stat -f %i "${file2}") ] &&
# check that the sizes of the files are the same
[ $(stat -f %z "${file1}") = $(stat -f %z "${file2}") ] &&
# check that the non-directory part (aka file name) of the two
# files match case insensitively
[ "$(lc "${file1##*/}")" = "$(lc "${file2##*/}")" ]
then
matched=("${matched[@]}" "$file1")
echo "$file1"
break
fi
done
done
EDIT: Added comments and, inspired by TLP's comment, made only the file part of the path matter for equality comparisons. This has now been tested to a reasonable minimum degree and I expect that it won't explode in your face.
Here's a Ruby script to recursively search for files that differ only in case.
#!/usr/bin/ruby
# encoding: utf-8
def search( directory )
set = {}
Dir.entries( directory ).each do |entry|
next if entry == '.' || entry == '..'
path = File.join( directory, entry )
key = path.upcase
set[ key ] = [] unless set.has_key?( key )
set[ key ] << entry
search( path ) if File.directory?( path )
end
set.delete_if { |key, entries| entries.size == 1 }
set.each do |key, entries|
entries.each do |entry|
puts File.join( directory, entry )
end
end
end
search( File.expand_path( ARGV[ 0 ] ) )

perl backticks: use bash instead of sh

I noticed that when I use backticks in perl the commands are executed using sh, not bash, giving me some problems.
How can I change that behavior so perl will use bash?
PS. The command that I'm trying to run is:
paste filename <(cut -d \" \" -f 2 filename2 | grep -v mean) >> filename3
The "system shell" is not generally mutable. See perldoc -f exec:
If there is more than one argument in LIST, or if LIST is an array with more than one value, calls execvp(3) with the arguments in LIST. If
there is only one scalar argument or an array with one element in it, the argument is checked for shell metacharacters, and if there are any, the
entire argument is passed to the system's command shell for parsing (this is "/bin/sh -c" on Unix platforms, but varies on other platforms).
If you really need bash to perform a particular task, consider calling it explicitly:
my $result = `/usr/bin/bash command arguments`;
or even:
open my $bash_handle, '| /usr/bin/bash' or die "Cannot open bash: $!";
print $bash_handle 'command arguments';
You could also put your bash commands into a .sh file and invoke that directly:
my $result = `/usr/bin/bash script.sh`;
Try
`bash -c \"your command with args\"`
I am fairly sure the argument of -c is interpreted the way bash interprets its command line. The trick is to protect it from sh - that's what quotes are for.
This example works for me:
$ perl -e 'print `/bin/bash -c "echo <(pwd)"`'
/dev/fd/63
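The difference between the two shells can be checked directly, outside Perl. A small sketch (the command string is only an illustration): bash accepts process substitution, while a plain POSIX sh such as dash rejects it, which is the failure seen from backticks.

```shell
cmd='cat <(echo hello)'

# bash understands <( ... )
bash_out=$(bash -c "$cmd")
echo "bash: $bash_out"

# /bin/sh may not; on systems where sh is dash this is a syntax error
sh -c "$cmd" 2>/dev/null || echo "sh: process substitution not supported"
```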
To deal with running bash and nested quotes, this article provides the best solution: How can I use bash syntax in Perl's system()?
my @args = ( "bash", "-c", "diff <(ls -l) <(ls -al)" );
system(@args);
I thought perl would honor the $SHELL variable, but then it occurred to me that its behavior might actually depend on your system's exec implementation. In mine, it seems that exec will execute the shell (/bin/sh) with the path of the file as its first argument.
You can always do qw/bash your-command/, no?
Create a perl subroutine:
sub bash { return `cat << 'EOF' | /bin/bash\n$_[0]\nEOF\n`; }
And use it like below:
my $bash_cmd = 'paste filename <(cut -d " " -f 2 filename2 | grep -v mean) >> filename3';
print &bash($bash_cmd);
Or use perl here-doc for multi-line commands:
$bash_cmd = <<'EOF';
for (( i = 0; i < 10; i++ )); do
echo "${i}"
done
EOF
print &bash($bash_cmd);
I like to define two functions, btck (which integrates error checking) and bash_btck (which uses bash):
use Carp;
sub btck ($)
{
# Like backticks but the error check and chomp() are integrated
my $cmd = shift;
my $result = `$cmd`;
$? == 0 or confess "backtick command '$cmd' returned non-zero";
chomp($result);
return $result;
}
sub bash_btck ($)
{
# Like backticks but use bash and the error check and chomp() are
# integrated
my $cmd = shift;
my $sqpc = $cmd; # Single-Quote-Protected Command
$sqpc =~ s/'/'"'"'/g;
my $bc = "bash -c '$sqpc'";
return btck($bc);
}
One of the reasons I like to use bash is for safe pipe behavior:
sub safe_btck ($)
{
return bash_btck('set -o pipefail && '.shift);
}
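What pipefail changes can be seen with a two-stage pipeline whose first stage fails. A minimal sketch, run through bash -c so that it behaves the same regardless of the calling shell; the variable names are arbitrary.

```shell
# Default: the pipeline's status is that of its last command (true -> 0).
bash -c 'false | true'
default_status=$?

# With pipefail: a failing stage makes the whole pipeline fail (-> 1).
bash -c 'set -o pipefail; false | true'
pipefail_status=$?

echo "default=$default_status pipefail=$pipefail_status"   # default=0 pipefail=1
```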