Script to find all similarly named files (differing only by case?) - perl

I have been working on an SVN repo using the command line only. I now have to bring in users who require a GUI to interface with the repo; however, this is presenting a number of problems with similarly named files.
As it happens, a large number of images have been duplicated, due to lack of communication or laziness.
I would like to search for all files recursively from a given folder and identify all files that differ only by case/capitalization and that also have the same file size, since it is certainly possible that genuinely different files have names conflicting only in case, although I've not encountered any yet.
I don't mind hammering out a Perl script to handle this myself, but I'm wondering if such a thing already exists, or if anybody has any tips, before I roll my sleeves up?
Thanks :D

I lean on md5sum for this type of problem:
find * -type f | xargs md5sum | sort | uniq -Dw32
If you are using svn, you'll want to exclude your .svn directories. This will print out all files with their paths that have identical content.
If you really want to only match files that differ by case, you can add a few more things to the above pipeline:
find * -type f | xargs md5sum | sort | uniq -Dw32 | awk -F'[ /]' '{ print $NF }' | sort -f | uniq -Di
myimage_23.png
MyImage_23.png
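If you'd rather do the whole thing in Perl (and skip the .svn directories at the same time), the same group-by-checksum idea might look like this; a minimal sketch using the core File::Find and Digest::MD5 modules:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

my %by_md5;
find(sub {
    # Skip .svn directories entirely
    if (-d $_ && $_ eq '.svn') { $File::Find::prune = 1; return; }
    return unless -f $_;
    open my $fh, '<', $_ or return;
    binmode $fh;
    push @{ $by_md5{ Digest::MD5->new->addfile($fh)->hexdigest } },
        $File::Find::name;
}, shift(@ARGV) // '.');

# Print every group of files with identical content
for my $sum (sort keys %by_md5) {
    next unless @{ $by_md5{$sum} } > 1;
    print "$_\n" for @{ $by_md5{$sum} };
}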

A shell script to list all filenames in a Subversion working directory that differ only in case from another filename in the same directory, and therefore will cause problems for Subversion clients on case-insensitive file systems, which cannot distinguish between such filenames:
find . -name .svn -type d -prune -false -o -print | \
perl -ne 'push @{$f{lc($_)}}, $_; END{map{print @{$f{$_}}} grep {@{$f{$_}}>1} sort keys %f}'
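Unrolled into a readable script, the one-liner is roughly this (a sketch that, like the one-liner, reads the find output on stdin):
#!/usr/bin/perl
use strict;
use warnings;

# Group each incoming path under its lowercased form
my %f;
while (my $path = <STDIN>) {
    push @{ $f{ lc $path } }, $path;
}

# Print every group with more than one member
for my $key (sort keys %f) {
    print @{ $f{$key} } if @{ $f{$key} } > 1;
}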

I have not used it personally but the Duplicate Files Finder looks like it would be suitable.
However, it will identify any duplicate files, regardless of file name, so you might have to filter the results if you only want duplicates with case-insensitive-matching file names.
It is open source, available on Windows and Linux, has both command line and GUI interfaces, and from the description the algorithm sounds very fast (only compares files with the same size rather than producing a checksum for every file).

I guess it would be something like:
#!perl
use strict;
use warnings;
use File::Spec;

sub check_dir {
    my ($dir, $out) = @_;
    $out ||= [];
    opendir my $dh, $dir or die "Impossible to read dir $dir: $!";
    # Ignore entries starting with a dot; sort case-insensitively so that
    # names differing only by case end up adjacent
    my @files = sort { lc($a) cmp lc($b) } grep { /^[^.]/ } readdir($dh);
    closedir $dh;
    # Build full paths so the -d/-s tests also work below the top level
    my @paths = map { File::Spec->catfile($dir, $_) } @files;
    my @nd = grep { ! -d $_ } @paths;    # plain files only
    for my $i (0 .. $#nd - 1) {
        push @$out, $nd[$i]
            if lc $nd[$i] eq lc $nd[$i+1]
            and -s $nd[$i] == -s $nd[$i+1];
    }
    check_dir($_, $out) for grep { -d $_ } @paths;   # recurse into subdirs
    return $out;
}
print join "\n", @{ check_dir(shift @ARGV) }, "";
Please check it before using it; I have no access to Windows machines (this problem does not arise on Un*x, where file names are case-sensitive). Also note that for two files whose names differ only in case and that have the same size, only the first will be printed; for three such files, only the first two, and so on (of course, you will need to keep one!).
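If you want every member of each clashing group reported (not just the first), grouping names in a hash keyed on the lowercased filename may be simpler; a sketch that drops the same-size test for brevity (it could be re-added when printing the groups):
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

sub report_clashes {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot read $dir: $!";
    my @entries = grep { /^[^.]/ } readdir $dh;   # skip dot files, as above
    closedir $dh;
    my %group;
    for my $entry (@entries) {
        my $path = File::Spec->catfile($dir, $entry);
        if (-d $path) { report_clashes($path); }
        else          { push @{ $group{ lc $entry } }, $path; }
    }
    for my $key (sort keys %group) {
        next unless @{ $group{$key} } > 1;
        print "$_\n" for @{ $group{$key} };
    }
}
report_clashes(shift @ARGV);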

As far as I know what you want doesn't exist as such. However, here's an implementation in bash:
#!/bin/bash
dir=("$@")
matched=()
files=()

lc(){ tr '[:upper:]' '[:lower:]' <<< "${*}" ; }

in_list() {
    local search="$1"
    shift
    local list=("$@")
    for file in "${list[@]}" ; do
        [[ $file == $search ]] && return 0
    done
    return 1
}

while read -r file ; do
    files=("${files[@]}" "$file")
done < <(find "${dir[@]}" -type f | sort)

for file1 in "${files[@]}" ; do
    for file2 in "${files[@]}" ; do
        if
            # check that the file did not match already
            ! in_list "$file1" "${matched[@]}" &&
            # check that the files are not the same file
            # (stat -f is BSD syntax; on Linux use stat -c %i and %s)
            ! [ "$(stat -f %i "${file1}")" -eq "$(stat -f %i "${file2}")" ] &&
            # check that the size of the files are the same
            [ "$(stat -f %z "${file1}")" = "$(stat -f %z "${file2}")" ] &&
            # check that the non-directory part (aka file name) of the two
            # files match case insensitively
            [ "$(lc "${file1##*/}")" = "$(lc "${file2##*/}")" ]
        then
            matched=("${matched[@]}" "$file1")
            echo "$file1"
            break
        fi
    done
done
EDIT: Added comments and, inspired by TLP's comment, made only the file part of the path matter for equality comparisons. This has now been tested to a reasonable minimum degree and I expect that it won't explode in your face.

Here's a Ruby script to recursively search for files that differ only in case.
#!/usr/bin/ruby
# encoding: utf-8

def search( directory )
  set = {}
  Dir.entries( directory ).each do |entry|
    next if entry == '.' || entry == '..'
    path = File.join( directory, entry )
    key = path.upcase
    set[ key ] = [] unless set.has_key?( key )
    set[ key ] << entry
    search( path ) if File.directory?( path )
  end
  set.delete_if { |key, entries| entries.size == 1 }
  set.each do |key, entries|
    entries.each do |entry|
      puts File.join( directory, entry )
    end
  end
end

search( File.expand_path( ARGV[ 0 ] ) )

Related

Perl - Trouble with my unzip system call for zip file crack

I am a junior currently taking a scripting languages class that is supposed to turn us out with intermediate-level bash, perl, and python in one semester. Since this class is accelerated, we speed through topics quickly, and our professor endorses using forums to supplement our learning if we have questions.
I am currently working on our first assignment. The requirement is to create a very simple dictionary attack using a provided wordlist "linux.words", plus a basic brute-force attack. The brute force needs to cover every possible 4-letter string.
I have used print statements to check if my logic is sound, and it seems it is. If you have any suggestions on how to improve my logic, I am here to learn and I am all ears.
This is on Ubuntu v12.04 in case that is relevant.
I have tried replacing the scalar within the call with a plain word like unicorn, and it runs fine; it is obviously the wrong password, and it returns correctly. I have done this both in the terminal and in the script itself. My professor looked over this for a good 15 minutes he could spare before referring me to the forum, and said it looked good. He suspected that since I wrote the code using Notepad++ there might be hidden characters, so I rewrote the code straight in the terminal using vim, and it gave the same errors. The code pasted below is from vim.
My actual issue is that my system call is giving me problems: it returns the help output for unzip, showing usage and other help material.
Here is my code.
#!/usr/bin/perl
use strict;
use warnings;

#Prototypes
sub brute();
sub dict();
sub AddSlashes($);

### ADD SLASHES ###
sub AddSlashes($)
{
    my $text = shift;
    $text =~ s/\\/\\\\/g;
    $text =~ s/'/\\'/g;
    $text =~ s/"/\\"/g;
    $text =~ s/\\0/\\\\0/g;
    return $text;
}

### BRUTEFORCE ATTACK ###
sub brute()
{
    print "Bruteforce Attack...\n";
    print "Press any key to continue.\n";
    if (<>)
    {
        #INCEPTION START
        my @larr1 = ('a'..'z'); #LEVEL 1 +
        foreach (@larr1)
        {
            my $layer1 = $_; #LEVEL 1 -
            my @larr2 = ('a'..'z'); #LEVEL 2 +
            foreach (@larr2)
            {
                my $layer2 = $_; # LEVEL 2 -
                my @larr3 = ('a'..'z'); #LEVEL 3 +
                foreach (@larr3)
                {
                    my $layer3 = $_; #LEVEL 3 -
                    my @larr4 = ('a'..'z'); #LEVEL 4 +
                    foreach (@larr4)
                    {
                        my $layer4 = $_;
                        my $pass = ("$layer1$layer2$layer3$layer4");
                        print ($pass); #LEVEL 4 -
                    }
                }
            }
        }
    }
}

### DICTIONARY ATTACK ###
sub dict()
{
    print "Dictionary Attack...\n"; #Prompt User
    print "Provide wordlist: ";
    my $uInput = "";
    chomp($uInput = <>); #User provides wordlist
    (open IN, $uInput) #Bring in wordlist
        or die "Cannot open $uInput, $!"; #If we cannot open file, alert
    my @dict = <IN>; #Throw the wordlist into an array
    foreach (@dict)
    {
        print $_; #Debug, shows what word we are on
        #next; #Debug
        my $pass = AddSlashes($_); #To store the $_ value for later use
        #Check pass call
        my $status = system("unzip -qq -o -P $pass secret_file_dict.zip > /dev/null 2>&1"); #Return unzip system call set to var
        #Catch the correct password
        if ($status == 0)
        {
            print ("Return of unzip is ", $status, " and pass is ", $pass, "\n"); #Print out value of return as well as pass
            last;
        }
    }
}

### MAIN ###
dict();
exit (0);
### DICTIONARY ATTACK ###
sub dict()
{
print "Dictionary Attack...\n"; #Prompt User
print "Provide wordlist: ";
my $uInput = "";
chomp($uInput = <>); #User provides wordlist
(open IN, $uInput) #Bring in wordlist
or die "Cannot open $uInput, $!"; #If we cannot open file, alert
my #dict = <IN>; #Throw the wordlist into an array
foreach (#dict)
{
print $_; #Debug, shows what word we are on
#next; #Debug
my $pass = AddSlashes($_); #To store the $_ value for later use
#Check pass call
my $status = system("unzip -qq -o -P $pass secret_file_dict.zip > /dev/null 2>&1"); #Return unzip system call set to var
#Catch the correct password
if ($status == 0)
{
print ("Return of unzip is ", $status, " and pass is ", $pass, "\n"); #Print out value of return as well as pass
last;
}
}
}
### MAIN ###
dict();
exit (0);
Here is my error
See "unzip -hh" or unzip.txt for more help. Examples:
unzip data1 -x joe => extract all files except joe from zipfile data1.zip
unzip -p foo | more => send contents of foo.zip via pipe into program more
unzip -fo foo ReadMe => quietly replace existing ReadMe if archive file newer
aerify
UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.
Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
Default action is to extract files in list, except those in xlist, to exdir;
file[.zip] may be a wildcard. -Z => ZipInfo mode ("unzip -Z" for usage).
-p extract files to pipe, no messages -l list files (short format)
-f freshen existing files, create none -t test compressed archive data
-u update files, create if necessary -z display archive comment only
-v list verbosely/show version info -T timestamp archive to latest
-x exclude files that follow (in xlist) -d extract files into exdir
modifiers:
-n never overwrite existing files -q quiet mode (-qq => quieter)
-o overwrite files WITHOUT prompting -a auto-convert any text files
-j junk paths (do not make directories) -aa treat ALL files as text
-U use escapes for all non-ASCII Unicode -UU ignore any Unicode fields
-C match filenames case-insensitively -L make (some) names lowercase
-X restore UID/GID info -V retain VMS version numbers
-K keep setuid/setgid/tacky permissions -M pipe through "more" pager
-O CHARSET specify a character encoding for DOS, Windows and OS/2 archives
-I CHARSET specify a character encoding for UNIX and other archives
See "unzip -hh" or unzip.txt for more help. Examples:
unzip data1 -x joe => extract all files except joe from zipfile data1.zip
unzip -p foo | more => send contents of foo.zip via pipe into program more
unzip -fo foo ReadMe => quietly replace existing ReadMe if archive file newer
aerifying
It is obviously not complete. In the main I will switch the brute(); for dict(); as needed to test. Once I get the system call working I will throw that into the brute section.
If you need me to elaborate more on my issue, please let me know. I am focused here on learning, so please add idiot proof comments to any thing you respond to me with.
First: DO NOT USE PERL'S PROTOTYPES. They don't do what you or your professor might wish they did.
Second: Don't write homebrew escaping routines such as AddSlashes. Perl has quotemeta. Use it.
Your problem is not with the specific programming language. How much time your professor has spent on your problem and how many classes you take are irrelevant to the problem. Focus on the actual problem, not all the extraneous "stuff".
Such as, what is the point of sub brute? You are not calling it in this script, it is not relevant to your problem, so don't post it. Narrow down your problem to the smallest relevant piece.
Don't prompt for the wordlist file in the body of dict. Separate the functionality into bite sized chunks so in each context you can focus on the problem at hand. Your dict_attack subroutine should expect to receive either a filehandle or a reference to an array of words. To keep memory footprint low, we'll assume it's a filehandle (so you don't have to keep the entire wordlist in memory).
So, your main looks like:
sub main {
    # obtain name of wordlist file
    # open wordlist file
    # if success, call dict_attack with filehandle
    # dict_attack returns password on success
}
Now, you can focus on dict_attack.
#!/usr/bin/perl
use strict;
use warnings;

main();

sub dict_attack {
    my $dict_fh = shift;
    while (my $word = <$dict_fh>) {
        $word =~ s/\A\s+//;
        $word =~ s/\s+\z//;
        print "Trying $word\n";
        my $pass = quotemeta( $word );
        my $cmd = "unzip -qq -o -P $pass test.zip";
        my $status = system $cmd;
        if ($status == 0) {
            return $word;
        }
    }
    return;
}

sub main {
    my $words = join("\n", qw(one two three four five));
    open my $fh, '<', \$words or die $!;
    if (my $pass = dict_attack($fh)) {
        print "Password is '$pass'\n";
    }
    else {
        print "Not found\n";
    }
    return;
}
Output:
C:\...> perl y.pl
Trying one
Trying two
Trying three
Trying four
Trying five
Password is 'five'
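One more note: instead of quotemeta, the list form of system avoids the shell entirely, so no escaping is needed at all. A sketch of the same call:
# No shell is involved here, so $word needs no quoting or escaping
my $status = system('unzip', '-qq', '-o', '-P', $word, 'test.zip');
The trade-off is that shell redirections such as > /dev/null are not available in this form, so unzip's output would have to be silenced some other way.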

Finding multiple strings on multiple lines in file and manipulating output with bash/perl

I am trying to get the version numbers for content management systems being hosted on my server. I can do this fairly simply if the version number is stored on one line with something like this:
grep -r "\$wp_version = '" /home/
Which returns exactly what I want to stdout:
/home/$RANDOMDOMAIN/wp-includes/version.php:$wp_version = '3.7.1';
The issue I run into is when I start looking for version numbers that are stored on two or more lines, like Joomla! or Magento which use the following formats respectively:
Joomla!:
/** @var string Release version. */
public $RELEASE = '3.2';
/** @var string Maintenance version. */
public $DEV_LEVEL = '3';
Magento:
'major' => '1',
'minor' => '8',
'revision' => '1',
'patch' => '0',
I have gotten it to 'work', in a way, using the following (with this method, if for whatever reason one of the strings I am looking for is missing, the whole command becomes useless, since xargs -l3 expects two rows above the path provided by -print):
find /home/ -type f -name version.php -exec grep " \$RELEASE " '{}' \; -exec grep " \$DEV_LEVEL " '{}' \; -print | xargs -l3 | sed 's/\<var\>\s//g;s/\<public\>\s//g' | awk -F\; '{print $3":"$1""$2}' | sed 's/ $DEV_LEVEL = /./g'
Which gets me output like this:
/home/$RANDOMDOMAIN/version.php:$RELEASE = 3.2.3
/home/$RANDOMDOMAIN/anotherfolder/version.php:$RELEASE = 1.5.0
I also have a working for loop that WILL exclude any file that does not contain both strings, but depending how much it has to sift through, can take significantly longer than the find one liner above:
for path in $(grep -rl " \$RELEASE " /home/ 2> /dev/null | xargs grep -rl " \$DEV_LEVEL ")
do
    joomlaver="$path"
    joomlaver+=$(grep " \$RELEASE " $path)
    joomlaver+=$(echo " \$DEV_LEVEL = '$(grep " \$DEV_LEVEL " $path | cut -d\' -f2)';")
    echo "$joomlaver" | sed 's/\<var\>\s//g;s/\<public\>\s//g;s/;//g' | awk -F\' '{ print $1""$2"."$4 }' | sed 's/\s\+//g'
    unset joomlaver
done
Which gets me output like this:
/home/$RANDOMDOMAIN/version.php$RELEASE=3.2.3
/home/$RANDOMDOMAIN/anotherfolder/version.php$RELEASE=1.5.0
But I have to believe there is a simpler, shorter, more elegant way. Bash is preferred, or if it can somehow be done with a Perl one-liner, that would work as well. Any and all help would be much appreciated. Thanks in advance. (Sorry for all the edits, but I am trying to figure this out myself as well!)
Here is a perl one-liner that will extract the $RELEASE and $DEV_LEVEL from the php file format you showed:
perl -ne '$v=$1 if /\$RELEASE\s*=\s*\047([0-9.]+)\047/; $devlevel=$1 if /\$DEV_LEVEL\s*=\s*\047([0-9.]+)\047/; if (defined $v && defined $devlevel) { print "$ARGV: Release=$v Devlevel=$devlevel\n"; last; }'
The -n makes perl effectively wrap the whole thing inside a while (<>) { } loop. Each line is checked against two regexes; once both have matched, it prints the result and exits.
The \047 escapes are used to match single quotes; writing them literally would confuse the shell, since the one-liner itself is wrapped in single quotes.
If it does not find a match, it does not print anything. Otherwise it prints something like this:
sample.php: Release=3.2 Devlevel=3
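Spelled out as a script, where no shell quoting is involved and plain single quotes can replace \047, the one-liner reads roughly like this (a sketch):
#!/usr/bin/perl -n
# Script form of the one-liner above
$v = $1        if /\$RELEASE\s*=\s*'([0-9.]+)'/;
$devlevel = $1 if /\$DEV_LEVEL\s*=\s*'([0-9.]+)'/;
if (defined $v && defined $devlevel) {
    print "$ARGV: Release=$v Devlevel=$devlevel\n";
    last;
}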
You would use it in combination with find and xargs to traverse down a directory structure, perhaps like this:
find . -name "*.php" | xargs perl -ne '$v=$1 if /\$RELEASE\s*=\s*\047([0-9.]+)\047/; $devlevel=$1 if /\$DEV_LEVEL\s*=\s*\047([0-9.]+)\047/; if (defined $v && defined $devlevel) { print "$ARGV: Release=$v Devlevel=$devlevel\n"; last; }'
You could make a similar version for the other file format (Magento?) you mentioned.
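For the Magento format shown in the question, a similar (untested) sketch could collect the four parts into a hash and print once all of them have been seen:
perl -ne '$v{$1}=$2 if /\047(major|minor|revision|patch)\047\s*=>\s*\047([0-9]+)\047/; if (keys %v == 4) { print "$ARGV: Version=", join(".", @v{qw(major minor revision patch)}), "\n"; last; }'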

Find up-to-date files for different paths but with identical file names

I have the following files
./path/to/stuff1/file1 (x)
./path/to/stuff1/file2
./path/to/stuff1/file3
./path/to/stuff2/file1
./path/to/stuff2/file2 (x)
./path/to/stuff2/file3
./path/to/stuff3/file1 (x)
./path/to/stuff3/file2
./path/to/stuff3/file3
where I marked the files I touched lastly. I want to get exactly those marked files. In other words:
I want to get the up-to-date file for each directory.
I constructed the bash command
for line in $( find . -name 'file*' -type f | awk -F/ 'sub($NF,x)' | sort | uniq ); do
    find $line -name 'file*' -type f -printf '%T# %p\n' | sort -n | tail -1 | cut -f2 -d' '
done
which I am able to use in perl using the system command and escaping the $. Is it possible to do this directly in perl or do you think my approach is fine?
edit
If possible the task should be done in perl without using external modules.
edit2
Sorry, I noticed my question wasn't clear. I thought the answer of @TLP would work, but I have to clarify: I want to check for the newest file in each folder, e.g. the newest file in stuff1. Say I do
touch ./path/to/stuff1/file1
touch ./path/to/stuff2/file2
touch ./path/to/stuff3/file1
before I run the script. It then should output:
./path/to/stuff1/file1
./path/to/stuff2/file2
./path/to/stuff3/file1
The filename can be identical for different stuff directories, but only one file per path should be output.
The script of @codnodder does this, but I wish to search only for the filename and not for the full path. So I want to search for all files beginning with file, and the script should search recursively.
Your find command can be emulated with File::Find's find command. This is a core module in Perl 5, and is almost certainly already on your system. To check the file modification time, you can use the -M file test.
So something like this:
use strict;
use warnings;
use File::Find;

my %times;
find(\&wanted, '.');
for my $dir (keys %times) {
    print $times{$dir}{file}, "\n";
}

sub wanted {
    return unless (-f && /^file/);
    my $mod = -M $_;
    if (!defined($times{$File::Find::dir}) or
        $mod < $times{$File::Find::dir}{mod}) {
        $times{$File::Find::dir}{mod} = $mod;
        $times{$File::Find::dir}{file} = $File::Find::name;
    }
}
If I run this in my test directory, on my system, I get the following Data::Dumper structure, where you can clearly see the directory as the hash key, the full path stored in the file key, and the modification date (in days relative to the script's start time) as the mod.
$VAR1 = {
    './phone' => {
        'file' => './phone/file.txt',
        'mod' => '3.47222222222222e-005'
    },
    './foo' => {
        'file' => './foo/fileb.txt',
        'mod' => '0.185'
    },
    '.' => {
        'file' => './file.conf',
        'mod' => '0.154490740740741'
    }
};
There are 3 general approaches I can think of.
Using opendir(), readdir(), and stat().
Using File::Find.
Using glob().
The most appropriate option depends on the specifics of what you have to work with, which we can't see from your posting.
Also, I assume when you say "no external modules", you are not
excluding modules installed with Perl (i.e., in Core).
Here is an example using glob():
use strict;
use warnings;
use File::Basename qw/fileparse/;

for my $file (newest_file()) {
    print "$file\n";
}

sub newest_file {
    my %files;
    for my $file (glob('./path/stuff*/file*')) {
        my ($name, $path, $suffix) = fileparse($file);
        my $mtime = (stat($file))[9];    # element 9 is the mtime
        if (!exists $files{$path} || $mtime > $files{$path}[0]) {
            $files{$path} = [$mtime, $name];
        }
    }
    # prepend the directory so the full path is printed
    return map { $_ . $files{$_}[1] } keys %files;
}
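For completeness, the first approach (opendir/readdir/stat) might look like the sketch below, assuming the same directory layout as in your question:
use strict;
use warnings;

# Return the most recently modified 'file*' entry in one directory
sub newest_in {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my ($best, $best_mtime);
    for my $entry (readdir $dh) {
        next unless $entry =~ /^file/;
        my $path = "$dir/$entry";
        next unless -f $path;
        my $mtime = (stat $path)[9];   # element 9 is the mtime
        ($best, $best_mtime) = ($path, $mtime)
            if !defined $best_mtime || $mtime > $best_mtime;
    }
    closedir $dh;
    return $best;
}

for my $dir (glob('./path/to/stuff*')) {
    my $newest = newest_in($dir);
    print "$newest\n" if defined $newest;
}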

Make some replacements on a bunch of files depending on the number of columns per line

I'm having a problem dealing with some files. I need to perform a column count for every line in a file and, depending on the number of columns, add several ','s at the end of each line. All lines should have 36 columns separated by ','.
This line solves my problem, but how do I run it over a folder with several files in an automated way?
awk ' BEGIN { FS = "," } ;
{if (NF == 32) { print $0",,,," } else if (NF==31) { print $0",,,,," }
}' <SOURCE_FILE> > <DESTINATION_FILE>
Thank you for all your support
R&P
The answer depends on your OS, which you haven't told us. On UNIX and assuming you want to modify each original file, it'd be:
for file in *
do
    awk '...' "$file" > tmp$$ && mv tmp$$ "$file"
done
Also, in general, to get all records in a file to have the same number of fields, you can do this without needing to specify what that number of fields is (though you can if appropriate):
$ cat tst.awk
BEGIN { FS=OFS=","; ARGV[ARGC++] = ARGV[ARGC-1] }
NR==FNR { nf = (NF > nf ? NF : nf); next }
{
tail = sprintf("%*s",nf-NF,"")
gsub(/ /,OFS,tail)
print $0 tail
}
$
$ cat file
a,b,c
a,b
a,b,c,d,e
$
$ awk -f tst.awk file
a,b,c,,
a,b,,,
a,b,c,d,e
$
$ awk -v nf=10 -f tst.awk file
a,b,c,,,,,,,
a,b,,,,,,,,
a,b,c,d,e,,,,,
It's a short one-liner with Perl:
perl -i.bak -F, -alpe '$_ .= "," x (36-@F)' *
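Spelled out, that one-liner is roughly equivalent to the following script (a sketch; the -i.bak switch additionally rewrites each file in place, keeping a .bak backup):
use strict;
use warnings;

while (my $line = <>) {
    chomp $line;                 # -l strips the input newline
    my @F = split /,/, $line;    # -a with -F, splits on commas
    $line .= ',' x (36 - @F);    # pad to 36 fields
    print $line, "\n";           # -p with -l prints, restoring the newline
}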
If this is only a single folder without subfolders, use:
for oldfile in /path/to/files/*
do
    newfile="${oldfile}.new"
    awk '...' "${oldfile}" > "${newfile}"
done
if you also want to include subdirectories recursively, it's probably easiest to put the awk+redirection into a small shell-script, like this:
#!/bin/bash
oldfile=$1
newfile="${oldfile}.new"
awk '...' "${oldfile}" > "${newfile}"
and then run this script (let's call it runawk.sh) via find:
find /path/to/files/ -type f -not -name "*.new" -exec runawk.sh \{\} \;

How do you climb up the parent directory structure of a bash script?

Is there a neater way of climbing up multiple directory levels from the location of a script?
This is what I currently have.
# get the full path of the script
D=$(cd ${0%/*} && echo $PWD/${0##*/})
D=$(dirname $D)
D=$(dirname $D)
D=$(dirname $D)
# second level parent directory of script
echo $D
I would like a neat way of finding the nth level. Any ideas other than putting in a for loop?
dir="/path/to/somewhere/interesting"
saveIFS=$IFS
IFS='/'
parts=($dir)
IFS=$saveIFS
level=${parts[3]}
echo "$level" # output: somewhere
#!/bin/bash
ancestor() {
    local n=${1:-1}
    (for ((; n != 0; n--)); do cd "$(dirname "$PWD")"; done; pwd)
}
Usage:
$ pwd
/home/nix/a/b/c/d/e/f/g
$ ancestor 3
/home/nix/a/b/c/d
A solution without loops would be to use recursion. I wanted to find a config file for a script by traversing backwards up from my current working directory.
rtrav() { test -e $2/$1 && echo $2 || { test $2 != / && rtrav $1 `dirname $2`;}; }
To check if the current directory is in a GIT repo: rtrav .git $PWD
rtrav will check for the existence of a filename, given by the first argument, in each parent folder of the one given as the second argument, printing the directory path where the file was found, or exiting with an error code if the file was not found.
The predicate (test -e $2/$1) could be swapped for checking of a counter that indicates the traversal depth.
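If a Perl flavour of the same upward traversal is useful, here is a sketch using only core modules:
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Basename qw(dirname);

# Walk upward from $start until a directory containing $name is found;
# returns undef when the root is reached without a match.
sub rtrav {
    my ($name, $start) = @_;
    my $dir = abs_path($start);
    while (1) {
        return $dir if -e "$dir/$name";
        return undef if $dir eq '/';
        $dir = dirname($dir);
    }
}

my $found = rtrav('.git', '.');
print defined $found ? $found : 'not found', "\n";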
If you're OK with including a Perl command:
$ pwd
/u1/myuser/dir3/dir4/dir5/dir6/dir7
The first command lists the directory made up of the first N (in my case 5) path components:
$ perl -e 'use File::Spec; \
my @dirs = File::Spec->splitdir( \
File::Spec->rel2abs( File::Spec->curdir() ) ); \
my @dirs2=@dirs[0..5]; print File::Spec->catdir(@dirs2) . "\n";'
/u1/myuser/dir3/dir4/dir5
The second command lists the directory N (in my case 5) levels up (I think you wanted the latter):
$ perl -e 'use File::Spec; \
my @dirs = File::Spec->splitdir( \
File::Spec->rel2abs( File::Spec->curdir() ) ); \
my @dirs2=@dirs[0..$#dirs-5]; print File::Spec->catdir(@dirs2)."\n";'
/u1/myuser
To use it in your bash script, of course:
D=$(perl -e 'use File::Spec; \
my @dirs = File::Spec->splitdir( \
File::Spec->rel2abs( File::Spec->curdir() ) ); \
my @dirs2=@dirs[0..$#dirs-5]; print File::Spec->catdir(@dirs2)."\n";')
Any ideas other than putting in a for loop?
In shells, you can't avoid the loop, because traditionally they do not support regexps, only glob matching, and glob patterns do not support any sort of repeat counter.
And BTW, the simplest way to do it in shell is: echo $(cd $PWD/../.. && echo $PWD), where the /../.. makes it strip two levels.
With a tiny bit of Perl, that would be:
perl -e '$ENV{PWD} =~ m#^(.*)(/[^/]+){2}$# && print $1,"\n"'
The {2} in the Perl regular expression is the number of directory entries to strip. Or, making it configurable:
N=2
perl -e '$ENV{PWD} =~ m#^(.*)(/[^/]+){'$N'}$# && print $1,"\n"'
One can also use Perl's split(), join() and splice() for the purpose, e.g.:
perl -e '@a=split("/", $ENV{PWD}); print join("/", splice(@a, 0, -2)),"\n"'
where -2 says that the last two entries have to be removed from the path.
Two levels above the script directory:
echo "$(readlink -f -- "$(dirname -- "$0")/../..")"
All the quoting and -- are to avoid problems with tricky paths.
This method uses the actual full path to the perl script itself ... TIMTOWTDI.
You could just as easily replace $RunDir with the path you would like to start from ...
#resolve the run dir where this script is placed
$0 =~ m/^(.*)(\\|\/)(.*)\.([a-z]*)/;
$RunDir = $1 ;
#change the \'s to /'s if we are on Windows
$RunDir =~ s/\\/\//gi ;
my @DirParts = split ('/' , $RunDir) ;
for (my $count=0; $count < 4; $count++) { pop @DirParts ; }
my $ProductBaseDir = join ('/', @DirParts) ;  # rebuild the path four levels up
$confHolder->{'ProductBaseDir'} = $ProductBaseDir ;
This allows you to work your way up until whatever condition is desired:
WORKDIR=$PWD
until test -d "$WORKDIR/infra/codedeploy"; do
    # climb one level up
    WORKDIR=$(dirname "$WORKDIR")
done