Fetching file path from a Perl '.rc' file - perl

So I have a Perl '.rc' file (let's call it 'path.rc' with Perl syntax) which has this line:
$RC{model_root} = '/nfs/fm/disks/fm_fabric_00011/abc_rel//xy/xy-abc1-15aa05e'
I need to fetch the files from directory 'xy-abc1-15aa05e'. I am not supposed to hard code this path in my Perl file (let's call this 'Fetch.pl') as the path may change frequently, so there's a separate .rc file maintained. I'm using:
my $model_root = $RC{model_root};
to link to the path in my Perl code (like a parameter to link to the path in rc file). How do I now open the files in directory 'xy-abc1-15aa05e'? My Perl file is not able to get the path :(
This is breaking the rest of my code... How can I to do this?

If you make your 'rc' file a Perl module like this:
Defs.pm:
package Defs;
our $path = '/nfs/fm/disks/fm_fabric_00011/abc_rel//xy/xy-abc1-15aa05e';
1;
Then you can - in your script:
#Ensure we can 'find' the defs file, by having a library path that's relative to
#the script location.
use FindBin;
use lib $FindBin::Bin;
use Defs;
print $Defs::path, "\n";
If you specifically need to use the format you've listed, then you need to process the contents of the file. One way of doing this is with eval. But I'm not overly keen on doing that unless absolutely necessary.
You could do something like this though:
use Data::Dumper;
open ( my $rcfile, "<", 'rcfile' ) or die $!;
my %RC;
eval <$rcfile>;
print Dumper \%RC;
I dislike using eval in this sort of way though - you need to be quite careful about your inputs, because otherwise odd things might break. (Note - this only works for a one line file - if you have multiple lines, you might need to local $/; to slurp the whole file to eval it).
I would instead be tempted to use a regular expression to parse:
my $model_root;
while ( <$rcfile> ) {
my ( $varname, $value ) = ( m/\A(\S+) = \'(\S+)\'/ );
if ( $varname eq '$RC{model_root}' ) { $model_root = $value; }
}
print $model_root;
foreach my $file ( glob "$model_root/*" ) {
print "Doing something with $file\n";
}

Related

Save contents of those files which contain a specific known string in an single .txt or .tmp file using perl

I'm trying to write a perl script where I'm trying to save whole contents of those files which contain a specific string 'PYAG_GENERATED', in a single .txt/.tmp file one after another. These file names are in a specific pattern and this pattern is 'output_nnnn.txt' where nnnn is 0001,0002 and so on. But I don't know how many number of files are present with this 'output_nnnn.txt' name.
I'm new in perl and I don't know how I can resolve this issue to get the output correctly. Can anyone help me. Thanks in advance.
I've tried to write perl script in different ways but nothing is coming in output file. I'm giving here one of those I've tried. 'new_1.txt' is the new file where I want to save the expected output and "PYAG_GENERATED" is that specific string I'm finding for in the files.
open(NEW,">>new_1.txt") or die "could not open:$!";
$find2="PYAG_GENERATED";
$n='0001';
while('output_$n.txt'){
if(/find2/){
print NEW;
}
$n++;
}
close NEW;
I expect that the output file 'new_1.txt' will save the whole contents of the the files(with filename pattern 'output_nnnn.txt') which have 'PYAG_GENERATED' string at least once inside.
Well, you tried I guess.
Welcome to the wonderful world of Perl where there are always a dozen ways of doing X :-) One possible way to achieve what you want. I put in a lot of comments I hope are helpful. It's also a bit verbose for the sake of clarity. I'm sure it could be golfed down to 5 lines of code.
use warnings; # Always start your Perl code with these two lines,
use strict; # and Perl will tell you about possible mistakes
use experimental 'signatures';
use File::Slurp;
# this is a subroutine/function, a block of code that can be called from
# somewhere else. it takes to arguments, that the caller must provide
sub find_in_file( $filename, $what_to_look_for )
{
# the open function opens $filename for reading
# (that's what the "<" means, ">" stands for writing)
# if successfull open will return we will have a "file handle" in the variable $in
# if not open will return false ...
open( my $in, "<", $filename )
or die $!; # ... and the program will exit here. The variable $! will contain the error message
# now we read the file using a loop
# readline will give us the next line in the file
# or something false when there is nothing left to read
while ( my $line = readline($in) )
{
# now we test wether the current line contains what
# we are looking for.
# the index function gives us the index of a string within another string.
# for example index("abc", "c") will give us 3
if ( index( $line, $what_to_look_for ) > 0 )
{
# we found what we were looking for
# so we don't need to keep looking in this file anymore
# so we must first close the file
close( $in );
# and then we indicate to the caller the search was a successfull
# this will immedeatly end the subroutine
return 1;
}
}
# If we arrive here the search was unsuccessful
# so we tell that to the caller
return 0;
}
# Here starts the main program
# First we get a list of files
# we want to look at
my #possible_files = glob( "where/your/files/are/output_*.txt" );
# Here we will store the files that we are interested in, aka that contain PYAG_GENERATED
my #wanted_files;
# and now we can loop over the files and see if they contain what we are looking for
foreach my $filename ( #possible_files )
{
# here we use the function we defined earlier
if ( find_in_file( $filename, "PYAG_GENERATED" ) )
{
# with push we can add things to the end of an array
push #wanted_files, $filename;
}
}
# We are finished searching, now we can start adding the files together
# if we found any
if ( scalar #wanted_files > 0 )
{
# Now we could code that us ourselves, open the files, loop trough them and write out
# line by line. But we make life easy for us and just
# use two functions from the module File::Slurp, which comes with Perl I believe
# If not you have to install it
foreach my $filename ( #wanted_files )
{
append_file( "new_1.txt", read_file( $filename ) );
}
print "Output created from " . (scalar #wanted_files) . " files\n";
}
else
{
print "No input files\n";
}
use strict;
use warnings;
my #a;
my $i=1;
my $find1="PYAG_GENERATED";
my $n=1;
my $total_files=47276; #got this no. of files by writing 'ls' command in the terminal
while($n<=$total_files){
open(NEW,"<output_$n.txt") or die "could not open:$!";
my $join=join('',<NEW>);
$a[$i]=$join;
#print "$a[10]";
$n++;
$i++;
}
close NEW;
for($i=1;$i<=$total_files;$i++){
if($a[$i]=~m/$find1/){
open(NEW1,">>new_1.tmp") or die "could not open:$!";
print NEW1 $a[$i];
}
}
close NEW1;

How are these quoted strings replaced with the values in perl .pm file?

Below is the Perl code in .pm file which is supposed to replace the specified strings (that are in "quotes") with some values. But its not happening. Can anyone explain what is happening in this code?
package SomePackage;
require Exporter;
#ISA = qw(Exporter);
#EXPORT = qw(send_request, create_mmd_and_transfer, update_mmd_file);
sub send_request {
my ( $service, $action, $torole ) = #_;
my ( $seller_request_mmd );
my $replace_contents = ();
$replace_contents{"REPLACE_Service"} = $service;
$replace_contents{"REPLACE_RequestAction"} = $action;
$replace_contents{"REPLACE_TradingPartner"} = $torole;
$replace_contents{"REPLACE_Requestxml"} = "Request.xml";
create_mmd_and_transfer( \%replace_contents, $seller_request_mmd, "/MMD.xml" );
}
sub create_mmd_and_transfer {
my $local_replace_contents = shift;
my $input_mmd = shift;
my $local_output_mmd = shift;
my $output_mmd = shift;
update_mmd_file( "$input_mmd", "temp_mmd_file.xml", $local_replace_contents );
}
sub update_mmd_file {
my $input_file = shift;
my $output_file = shift;
my $contents = shift;
open( MMD_FILE, "<$input_file" )
or main::error_exit(" Cannot open MMD file template $input_file \n $input_file not found int the Templates folder \n Please place the same and then run the script ");
open( TEMP_MMD_FILE, ">$output_file" );
while ( <MMD_FILE> ) {
s/^M//g; # Getrid of the ^Ms
foreach my $content ( keys( %$contents ) ) {
my $exact_value = ${%$contents}{$content};
if ( $main::test_scenario =~ /^Invalid Request Action \a\n\d Service/
and ( $content =~ /REPLACE_Service|REPLACE_RequestAction/i ) ) {
}
else {
if ( $exact_value ne "" ) {
s/$content/$exact_value/g;
}
}
}
print TEMP_MMD_FILE;
}
close MMD_FILE;
close TEMP_MMD_FILE;
}
The following will not make your script work, just create the better base for some future questions.
Before you even thinking about posting a perl question here:
1.) add to the top of your script:
use strict;
use warnings;
Posting a code here without these two lines, nobody will bother even trying to read the code.
2.) use perl -c SomePackage.pm for the check. If it will tell you: SomePackage.pm syntax OK - you can start thinking about posting a question here. ;)
Some basic problems with your script:
package SomePackage;
use strict; # see the above
use warnings;
require Exporter;
# these variables are defined outside of this package, so, tell perl this fact. use the `our`
our #ISA = qw(Exporter);
#the use warnings will warn you about the following line
# #EXPORT = qw(send_request, create_mmd_and_transfer, update_mmd_file);
#the correct one is without commas
our #EXPORT = qw(send_request create_mmd_and_transfer update_mmd_file); #not saying anything about the #EXPORT rudeness. :)
#my $replace_contents = ();
#the $replace_contents is a scalar. Bellow you using a hash. So,
my %replace_contents;
#or use the scalar but the lines bellow should use the hashref notation, e.g.
# $replace_contents->{"REPLACE_Service"} = $service;
# you decide. :)
# the seller_request_mmd contains undef here.
create_mmd_and_transfer( \%replace_contents, $seller_request_mmd, "/MMD.xml");
# also bellow, in the subroutine definition it wants 4 arguments.
# indicates a problem...
# using 2-arg open is not the best practice.
# Also, you should to use lexical filehandles
# open (MMD_FILE, "<$input_file")
# better
open (my $mmd_file, '<', $input_file)
# of course, you need change every MMD_FILE to $mmd_file
# check the result of the open and die if not successful
# or you can use the
use autodie;
# instead of $exact_value = ${%$contents}{$content};
# you probably want
my $exact_value = $contents->{$content};
Indent your code!
All the above are just about the syntactic problems and not solving anything about the "logic" of your code.
Ps: And me is still an beginner, so, others sure will find much more problems with the above code.
Ok. Here's what I've done to test this.
Firstly, you didn't give us an input file or the code that you use to call the module. So I invented them. I made the simplest possible input file:
REPLACE_Service
REPLACE_RequestAction
REPLACE_TradingPartner
REPLACE_Requestxml
And this driver program:
#!/usr/bin/perl
use strict;
use warnings;
use SomePackage;
send_request('foo', 'bar', 'baz');
sub error_exit {
die #_;
}
The first time, I ran it, I got this error:
Undefined subroutine &main::send_request called at test line 8.
That was because your #EXPORT line was wrong. You had:
#EXPORT = qw(send_request, create_mmd_and_transfer, update_mmd_file);
But the point of qw(...) is that you don't need the commas. So I corrected it to:
#EXPORT = qw(send_request create_mmd_and_transfer update_mmd_file);
Then I re-ran the program and got this error:
Cannot open MMD file template
not found int the Templates folder
Please place the same and then run the script at test line 11.
That looked like there was something missing. I changed the error message, adding indicators of where the variable interpolation was supposed to happen:
open( MMD_FILE, "<$input_file" )
or main::error_exit(" Cannot open MMD file template <$input_file> \n <$input_file> not found int the Templates folder \n Please place the same and then run the script ");
Then the error message looked like this:
Cannot open MMD file template <>
<> not found int the Templates folder
Please place the same and then run the script at test line 11.
So it seems clear that the $input_file variable isn't set in the update_mmd_file() subroutine. Tracing that variable back, we see that this value is originally the $seller_request_mmd variable in send_request(). But in send_request() you declare $seller_request_mmd but you never give it a value. So let's do that:
my ( $seller_request_mmd ) = 'test_input.txt';
Now, when I run your program, it runs to completion without any errors. And I find a new temp_mmd_file.xml is generated. But it is exactly the same as the input file. So more investigation is needed.
Digging into the update_mmd_file() subroutine, we find this interesting line:
my $exact_value = ${%$contents}{$content};
I think you're trying to extract a value from $contents, which is a hash reference. But your syntax is wrong. You were probably aiming at:
my $exact_value = ${$contents}{$content};
But most Perl programmers prefer the arrow notation for working with reference look-ups.
my $exact_value = $contents->{$content};
Making that change and re-running the program, I get an output file that contains:
foo
bar
baz
Request.xml
Which is exactly what I expected. So the program now works.
But there is still a lot of work to do. As you have been told repeatedly, you should always add:
use strict;
use warnings;
to your code. That will find a lot of potential problems in your code - which you should fix.
To be honest, this feels to me like you were trying to run before you could walk. I'd recommend spending some time to work through a good Perl introductory book before taking on my more Perl work.
And there was a lot of useful information missing from your question. It wouldn't have taken as long to get to the solution if you had shown us your driver program and your input data.

How to make recursive calls using Perl, awk or sed?

If a .cpp or .h file has #includes (e.g. #include "ready.h"), I need to make a text file that has these filenames on it. Since ready.h may have its own #includes, the calls have to be made recursively. Not sure how to do this.
The solution of #OneSolitaryNoob will likely work allright, but has an issue: for each recursion, it starts another process, which is quite wasteful. We can use subroutines to do that more efficiently. Assuming that all header files are in the working directory:
sub collect_recursive_includes {
# Unpack parameter from subroutine
my ($filename, $seen) = #_;
# Open the file to lexically scoped filehandle
# In your script, you'll probably have to transform $filename to correct path
open my $fh, "<", $filename or do {
# On failure: Print a warning, and return. I.e. go on with next include
warn "Can't open $filename: $!";
return;
};
# Loop through each line, recursing as needed
LINE: while(<$fh>) {
if (/^\s*#include\s+"([^"]+)"/) {
my $include = $1;
# you should probably normalize $include before testing if you've seen it
next LINE if $seen->{$include}; # skip seen includes
$seen->{$include} = 1;
collect_recursive_includes($include, $seen);
}
}
}
This subroutine remembers what files it has already seen, and avoids recursing there again—each file is visited one time only.
At the top level, you need to provide a hashref as second argument, that will hold all filenames as keys after the sub has run:
my %seen = ( $start_filename => 1 );
collect_recursive_includes($start_filename, \%seen);
my #files = sort keys %seen;
# output #files, e.g. print "$_\n" for #files;
I hinted in the code comments that you'll probabably have to normalize the filenames. E.g consider a starting filename ./foo/bar/baz.h, which points to qux.h. Then the actual filename we wan't to recurse to is ./foo/bar/qux.h, not ./qux.h. The Cwd module can help you find your current location, and to transform relative to absolute paths. The File::Spec module is a lot more complex, but has good support for platform-independent filename and -path manipulation.
In Perl, recursion is straightforward:
sub factorial
{
my $n = shift;
if($n <= 1)
{ return 1; }
else
{ return $n * factorial($n - 1); }
}
print factorial 7; # prints 7 * 6 * 5 * 4 * 3 * 2 * 1
Offhand, I can think of only two things that require care:
In Perl, variables are global by default, and therefore static by default. Since you don't want one function-call's variables to trample another's, you need to be sure to localize your variables, e.g. by using my.
There are some limitations with prototypes and recursion. If you want to use prototypes (e.g. sub factorial($) instead of just sub factorial), then you need to provide the prototype before the function definition, so that it can be used within the function body. (Alternatively, you can use & when you call the function recursively; that will prevent the prototype from being applied.)
Not totally clear what you want the display to look like, but the basic would be a script called follow_includes.pl:
#!/usr/bin/perl -w
while(<>) {
if(/\#include "(\S+)\"/) {
print STDOUT $1 . "\n";
system("./follow_includes.pl $1");
}
}
Run it like:
% follow_includes.pl somefile.cpp
And if you want to hide any duplicate includes, run it like:
% follow_includes.pl somefile.cpp | sort -u
Usually you'd want this in some sort of tree-print.

How to find a file which exists in different directories under a given path in Perl

I'm looking for a method to looks for file which resides in a few directories in a given path. In other words, those directories will be having files with same filename across. My script seem to have the hierarchy problem on looking into the correct path to grep the filename for processing. I have a fix path as input and the script will need to looks into the path and finding files from there but my script seem stuck on 2 tiers up and process from there rather than looking into the last directories in the tier (in my case here it process on "ln" and "nn" and start processing the subroutine).
The fix input path is:-
/nfs/disks/version_2.0/
The files that I want to do post processing by subroutine will be exist under several directories as below. Basically I wanted to check if the file1.abc do exists in all the directories temp1, temp2 & temp3 under ln directory. Same for file2.abc if exist in temp1, temp2, temp3 under nn directory.
The files that I wanted to check in full path will be like this:-
/nfs/disks/version_2.0/dir_a/ln/temp1/file1.abc
/nfs/disks/version_2.0/dir_a/ln/temp2/file1.abc
/nfs/disks/version_2.0/dir_a/ln/temp3/file1.abc
/nfs/disks/version_2.0/dir_a/nn/temp1/file2.abc
/nfs/disks/version_2.0/dir_a/nn/temp2/file2.abc
/nfs/disks/version_2.0/dir_a/nn/temp3/file2.abc
My script as below:-
#! /usr/bin/perl -w
my $dir = '/nfs/fm/disks/version_2.0/' ;
opendir(TEMP, $dir) || die $! ;
foreach my $file (readdir(TEMP)) {
next if ($file eq "." || $file eq "..") ;
if (-d "$dir/$file") {
my $d = "$dir/$file";
print "Directory:- $d\n" ;
&getFile($d);
&compare($file) ;
}
}
Note that I put the print "Directory:- $d\n" ; there for debug purposes and it printed this:-
/nfs/disks/version_2.0/dir_a/
/nfs/disks/version_2.0/dir_b/
So I knew it get into the wrong path for processing the following subroutine.
Can somebody help to point me where is the error in my script? Thanks!
To be clear: the script is supposed to recurse through a directory and look for files with a particular filename? In this case, I think the following code is the problem:
if (-d "$dir/$file") {
my $d = "$dir/$file";
print "Directory:- $d\n" ;
&getFile($d);
&compare($file) ;
}
I'm assuming the &getFile($d) is meant to step into a directory (i.e., the recursive step). This is fine. However, it looks like the &compare($file) is the action that you want to take when the object that you're looking at isn't a directory. Therefore, that code block should look something like this:
if (-d "$dir/$file") {
&getFile("$dir/$file"); # the recursive step, for directories inside of this one
} elsif( -f "$dir/$file" ){
&compare("$dir/$file"); # the action on files inside of the current directory
}
The general pseudo-code should like like this:
sub myFind {
my $dir = shift;
foreach my $file( stat $dir ){
next if $file -eq "." || $file -eq ".."
my $obj = "$dir/$file";
if( -d $obj ){
myFind( $obj );
} elsif( -f $obj ){
doSomethingWithFile( $obj );
}
}
}
myFind( "/nfs/fm/disks/version_2.0" );
As a side note: this script is reinventing the wheel. You only need to write a script that does the processing on an individual file. You could do the rest entirely from the shell:
find /nfs/fm/disks/version_2.0 -type f -name "the-filename-you-want" -exec your_script.pl {} \;
Wow, it's like reliving the 1990s! Perl code has evolved somewhat, and you really need to learn the new stuff. It looks like you learned Perl in version 3.0 or 4.0. Here's some pointers:
Use use warnings; instead of -w on the command line.
Use use strict;. This will require you to predeclare variables using my which will scope them to the local block or the file if they're not in a local block. This helps catch a lot of errors.
Don't put & in front of subroutine names.
Use and, or, and not instead of &&, ||, and !.
Learn about Perl Modules which can save you a lot of time and effort.
When someone says detect duplicates, I immediately think of hashes. If you use a hash based upon your file's name, you can easily see if there are duplicate files.
Of course a hash can only have a single value for each key. Fortunately, in Perl 5.x, that value can be a reference to another data structure.
So, I recommend you use a hash that contains a reference to a list (array in old parlance). You can push each instance of the file to that list.
Using your example, you'd have a data structure that looks like this:
%file_hash = {
file1.abc => [
/nfs/disks/version_2.0/dir_a/ln/temp1
/nfs/disks/version_2.0/dir_a/ln/temp2
/nfs/disks/version_2.0/dir_a/ln/temp3
],
file2.abc => [
/nfs/disks/version_2.0/dir_a/nn/temp1
/nfs/disks/version_2.0/dir_a/nn/temp2
/nfs/disks/version_2.0/dir_a/nn/temp3
];
And, here's a program to do it:
#! /usr/bin/env perl
#
use strict;
use warnings;
use feature qw(say); #Can use `say` which is like `print "\n"`;
use File::Basename; #imports `dirname` and `basename` commands
use File::Find; #Implements Unix `find` command.
use constant DIR => "/nfs/disks/version_2.0";
# Find all duplicates
my %file_hash;
find (\&wanted, DIR);
# Print out all the duplicates
foreach my $file_name (sort keys %file_hash) {
if (scalar (#{$file_hash{$file_name}}) > 1) {
say qq(Duplicate File: "$file_name");
foreach my $dir_name (#{$file_hash{$file_name}}) {
say " $dir_name";
}
}
}
sub wanted {
return if not -f $_;
if (not exists $file_hash{$_}) {
$file_hash{$_} = [];
}
push #{$file_hash{$_}}, $File::Find::dir;
}
Here's a few things about File::Find:
The work takes place in the subroutine wanted.
The $_ is the name of the file, and I can use this to see if this is a file or directory
$File::Find::Name is the full name of the file including the path.
$File::Find::dir is the name of the directory.
If the array reference doesn't exist, I create it with the $file_hash{$_} = [];. This isn't necessary, but I find it comforting, and it can prevent errors. To use $file_hash{$_} as an array, I have to dereference it. I do that by putting a # in front of it, so it can be #$file_hash{$_} or, #{$file_hash{$_}}.
Once all the file are found, I can print out the entire structure. The only thing I do is check to make sure there is more than one member in each array. If there's only a single member, then there are no duplicates.
Response to Grace
Hi David W., thank you very much for your explainaion and sample script. Sorry maybe I'm not really clear in definding my problem statement. I think I can't use hash in my path finding for the data structure. Since the file*.abc is a few hundred and undertermined and each of the file*.abc even is having same filename but it is actually differ in content in each directory structures.
Such as the file1.abc resides under "/nfs/disks/version_2.0/dir_a/ln/temp1" is not the same content as file1.abc resides under "/nfs/disks/version_2.0/dir_a/ln/temp2" and "/nfs/disks/version_2.0/dir_a/ln/temp3". My intention is to grep the list of files*.abc in each of the directories structure (temp1, temp2 and temp3 ) and compare the filename list with a masterlist. Could you help to shed some lights on how to solve this? Thanks. – Grace yesterday
I'm just printing the file in my sample code, but instead of printing the file, you could open them and process them. After all, you now have the file name and the directory. Here's the heart of my program again. This time, I'm opening the file and looking at the content:
foreach my $file_name (sort keys %file_hash) {
if (scalar (#{$file_hash{$file_name}}) > 1) {
#say qq(Duplicate File: "$file_name");
foreach my $dir_name (#{$file_hash{$file_name}}) {
#say " $dir_name";
open (my $fh, "<", "$dir_name/$file_name")
or die qq(Can't open file "$dir_name/$file_name" for reading);
# Process your file here...
close $fh;
}
}
}
If you are only looking for certain files, you could modify the wanted function to skip over files you don't want. For example, here I am only looking for files which match the file*.txt pattern. Note I use a regular expression of /^file.*\.txt$/ to match the name of the file. As you can see, it's the same as the previous wanted subroutine. The only difference is my test: I'm looking for something that is a file (-f) and has the correct name (file*.txt):
sub wanted {
return if not -f $_ and /^file.*\.txt$/;
if (not exists $file_hash{$_}) {
$file_hash{$_} = [];
}
push #{$file_hash{$_}}, $File::Find::dir;
}
If you are looking at the file contents, you can use the MD5 hash to determine if the file contents match or don't match. This reduces a file to a mere string of 16 to 28 characters which could even be used as a hash key instead of the file name. This way, files that have matching MD5 hashes (and thus matching contents) would be in the same hash list.
You talk about a "master list" of files and it seems you have the idea that this master list needs to match the content of the file you're looking for. So, I'm making a slight mod in my program. I am first taking that master list you talked about, and generating MD5 sums for each file. Then I'll look at all the files in that directory, but only take the ones with the matching MD5 hash...
By the way, this has not been tested.
#! /usr/bin/env perl
#
use strict;
use warnings;
use feature qw(say); #Can use `say` which is like `print "\n"`;
use File::Find; #Implements Unix `find` command.
use Digest::file qw(digest_file_hex);
use constant DIR => "/nfs/disks/version_2.0";
use constant MASTER_LIST_DIR => "/some/directory";
# First, I'm going thorugh the MASTER_LIST_DIR directory
# and finding all of the master list files. I'm going to take
# the MD5 hash of those files, and store them in a Perl hash
# that's keyed by the name of file file. Thus, when I find a
# file with a matching name, I can compare the MD5 of that file
# and the master file. If they match, the files are the same. If
# not, they're different.
# In this example, I'm inlining the function I use to find the files
# instead of making it a separat function.
my %master_hash;
find (
{
%master_hash($_) = digest_file_hex($_, "MD5") if -f;
},
MASTER_LIST_DIR
);
# Now I have the MD5 of all the master files, I'm going to search my
# DIR directory for the files that have the same MD5 hash as the
# master list files did. If they do have the same MD5 hash, I'll
# print out their names as before.
my %file_hash;
find (\&wanted, DIR);
# Print out all the duplicates
foreach my $file_name (sort keys %file_hash) {
if (scalar (#{$file_hash{$file_name}}) > 1) {
say qq(Duplicate File: "$file_name");
foreach my $dir_name (#{$file_hash{$file_name}}) {
say " $dir_name";
}
}
}
# The wanted function has been modified since the last example.
# Here, I'm only going to put files in the %file_hash if they
sub wanted {
if (-f $_ and $file_hash{$_} = digest_file_hex($_, "MD5")) {
$file_hash{$_} //= []; #Using TLP's syntax hint
push #{$file_hash{$_}}, $File::Find::dir;
}
}

How do I get the full path to a Perl script that is executing?

I have Perl script and need to determine the full path and filename of the script during execution. I discovered that depending on how you call the script $0 varies and sometimes contains the fullpath+filename and sometimes just filename. Because the working directory can vary as well I can't think of a way to reliably get the fullpath+filename of the script.
Anyone got a solution?
There are a few ways:
$0 is the currently executing script as provided by POSIX, relative to the current working directory if the script is at or below the CWD
Additionally, cwd(), getcwd() and abs_path() are provided by the Cwd module and tell you where the script is being run from
The module FindBin provides the $Bin & $RealBin variables that usually are the path to the executing script; this module also provides $Script & $RealScript that are the name of the script
__FILE__ is the actual file that the Perl interpreter deals with during compilation, including its full path.
I've seen the first three ($0, the Cwd module and the FindBin module) fail under mod_perl spectacularly, producing worthless output such as '.' or an empty string. In such environments, I use __FILE__ and get the path from that using the File::Basename module:
use File::Basename;
my $dirname = dirname(__FILE__);
$0 is typically the name of your program, so how about this?
use Cwd 'abs_path';
print abs_path($0);
Seems to me that this should work as abs_path knows if you are using a relative or absolute path.
Update For anyone reading this years later, you should read Drew's answer. It's much better than mine.
use File::Spec;
File::Spec->rel2abs( __FILE__ );
http://perldoc.perl.org/File/Spec/Unix.html
I think the module you're looking for is FindBin:
#!/usr/bin/perl
use FindBin;
$0 = "stealth";
print "The actual path to this is: $FindBin::Bin/$FindBin::Script\n";
You could use FindBin, Cwd, File::Basename, or a combination of them. They're all in the base distribution of Perl IIRC.
I used Cwd in the past:
Cwd:
use Cwd qw(abs_path);
my $path = abs_path($0);
print "$path\n";
Getting the absolute path to $0 or __FILE__ is what you want. The only trouble is if someone did a chdir() and the $0 was relative -- then you need to get the absolute path in a BEGIN{} to prevent any surprises.
FindBin tries to go one better and grovel around in the $PATH for something matching the basename($0), but there are times when that does far-too-surprising things (specifically: when the file is "right in front of you" in the cwd.)
File::Fu has File::Fu->program_name and File::Fu->program_dir for this.
Some short background:
Unfortunately the Unix API doesn't provide a running program with the full path to the executable. In fact, the program executing yours can provide whatever it wants in the field that normally tells your program what it is. There are, as all the answers point out, various heuristics for finding likely candidates. But nothing short of searching the entire filesystem will always work, and even that will fail if the executable is moved or removed.
But you don't want the Perl executable, which is what's actually running, but the script it is executing. And Perl needs to know where the script is to find it. It stores this in __FILE__, while $0 is from the Unix API. This can still be a relative path, so take Mark's suggestion and canonize it with File::Spec->rel2abs( __FILE__ );
Have you tried:
$ENV{'SCRIPT_NAME'}
or
use FindBin '$Bin';
print "The script is located in $Bin.\n";
It really depends on how it's being called and if it's CGI or being run from a normal shell, etc.
In order to get the path to the directory containing my script I used a combination of answers given already.
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Basename;
my $dir = dirname(File::Spec->rel2abs(__FILE__));
perlfaq8 answers a very similar question with using the rel2abs() function on $0. That function can be found in File::Spec.
There's no need to use external modules, with just one line you can have the file name and relative path. If you are using modules and need to apply a path relative to the script directory, the relative path is enough.
$0 =~ m/(.+)[\/\\](.+)$/;
print "full path: $1, file name: $2\n";
#!/usr/bin/perl -w
use strict;
my $path = $0;
$path =~ s/\.\///g;
if ($path =~ /\//){
if ($path =~ /^\//){
$path =~ /^((\/[^\/]+){1,}\/)[^\/]+$/;
$path = $1;
}
else {
$path =~ /^(([^\/]+\/){1,})[^\/]+$/;
my $path_b = $1;
my $path_a = `pwd`;
chop($path_a);
$path = $path_a."/".$path_b;
}
}
else{
$path = `pwd`;
chop($path);
$path.="/";
}
$path =~ s/\/\//\//g;
print "\n$path\n";
:DD
Are you looking for this?:
my $thisfile = $1 if $0 =~
/\\([^\\]*)$|\/([^\/]*)$/;
print "You are running $thisfile
now.\n";
The output will look like this:
You are running MyFileName.pl now.
It works on both Windows and Unix.
The problem with just using dirname(__FILE__) is that it doesn't follow symlinks. I had to use this for my script to follow the symlink to the actual file location.
use File::Basename;
my $script_dir = undef;
if(-l __FILE__) {
$script_dir = dirname(readlink(__FILE__));
}
else {
$script_dir = dirname(__FILE__);
}
use strict ; use warnings ; use Cwd 'abs_path';
sub ResolveMyProductBaseDir {
# Start - Resolve the ProductBaseDir
#resolve the run dir where this scripts is placed
my $ScriptAbsolutPath = abs_path($0) ;
#debug print "\$ScriptAbsolutPath is $ScriptAbsolutPath \n" ;
$ScriptAbsolutPath =~ m/^(.*)(\\|\/)(.*)\.([a-z]*)/;
$RunDir = $1 ;
#debug print "\$1 is $1 \n" ;
#change the \'s to /'s if we are on Windows
$RunDir =~s/\\/\//gi ;
my #DirParts = split ('/' , $RunDir) ;
for (my $count=0; $count < 4; $count++) { pop #DirParts ; }
my $ProductBaseDir = join ( '/' , #DirParts ) ;
# Stop - Resolve the ProductBaseDir
#debug print "ResolveMyProductBaseDir $ProductBaseDir is $ProductBaseDir \n" ;
return $ProductBaseDir ;
} #eof sub
The problem with __FILE__ is that it will print the core module ".pm" path not necessarily the ".cgi" or ".pl" script path that is running. I guess it depends on what your goal is.
It seems to me that Cwd just needs to be updated for mod_perl. Here is my suggestion:
my $path;
use File::Basename;
my $file = basename($ENV{SCRIPT_NAME});
if (exists $ENV{MOD_PERL} && ($ENV{MOD_PERL_API_VERSION} < 2)) {
if ($^O =~/Win/) {
$path = `echo %cd%`;
chop $path;
$path =~ s!\\!/!g;
$path .= $ENV{SCRIPT_NAME};
}
else {
$path = `pwd`;
$path .= "/$file";
}
# add support for other operating systems
}
else {
require Cwd;
$path = Cwd::getcwd()."/$file";
}
print $path;
Please add any suggestions.
Without any external modules, valid for shell, works well even with '../':
my $self = `pwd`;
chomp $self;
$self .='/'.$1 if $0 =~/([^\/]*)$/; #keep the filename only
print "self=$self\n";
test:
$ /my/temp/Host$ perl ./host-mod.pl
self=/my/temp/Host/host-mod.pl
$ /my/temp/Host$ ./host-mod.pl
self=/my/temp/Host/host-mod.pl
$ /my/temp/Host$ ../Host/./host-mod.pl
self=/my/temp/Host/host-mod.pl
All the library-free solutions don't actually work for more than a few ways to write a path (think ../ or /bla/x/../bin/./x/../ etc. My solution looks like below. I have one quirk: I don't have the faintest idea why I have to run the replacements twice. If I don't, I get a spurious "./" or "../". Apart from that, it seems quite robust to me.
my $callpath = $0;
my $pwd = `pwd`; chomp($pwd);
# if called relative -> add pwd in front
if ($callpath !~ /^\//) { $callpath = $pwd."/".$callpath; }
# do the cleanup
$callpath =~ s!^\./!!; # starts with ./ -> drop
$callpath =~ s!/\./!/!g; # /./ -> /
$callpath =~ s!/\./!/!g; # /./ -> / (twice)
$callpath =~ s!/[^/]+/\.\./!/!g; # /xxx/../ -> /
$callpath =~ s!/[^/]+/\.\./!/!g; # /xxx/../ -> / (twice)
my $calldir = $callpath;
$calldir =~ s/(.*)\/([^\/]+)/$1/;
None of the "top" answers were right for me. The problem with using FindBin '$Bin' or Cwd is that they return absolute path with all symbolic links resolved. In my case I needed the exact path with symbolic links present - the same as returns Unix command "pwd" and not "pwd -P". The following function provides the solution:
sub get_script_full_path {
use File::Basename;
use File::Spec;
use Cwd qw(chdir cwd);
my $curr_dir = cwd();
chdir(dirname($0));
my $dir = $ENV{PWD};
chdir( $curr_dir);
return File::Spec->catfile($dir, basename($0));
}
On Windows using dirname and abs_path together worked best for me.
use File::Basename;
use Cwd qw(abs_path);
# absolute path of the directory containing the executing script
my $abs_dirname = dirname(abs_path($0));
print "\ndirname(abs_path(\$0)) -> $abs_dirname\n";
here's why:
# this gives the answer I want in relative path form, not absolute
my $rel_dirname = dirname(__FILE__);
print "dirname(__FILE__) -> $rel_dirname\n";
# this gives the slightly wrong answer, but in the form I want
my $full_filepath = abs_path($0);
print "abs_path(\$0) -> $full_filepath\n";
use File::Basename;
use Cwd 'abs_path';
print dirname(abs_path(__FILE__)) ;
Drew's answer gave me:
'.'
$ cat >testdirname
use File::Basename;
print dirname(__FILE__);
$ perl testdirname
.$ perl -v
This is perl 5, version 28, subversion 1 (v5.28.1) built for x86_64-linux-gnu-thread-multi][1]
What's wrong with $^X ?
#!/usr/bin/env perl<br>
print "This is executed by $^X\n";
Would give you the full path to the Perl binary being used.
Evert
On *nix, you likely have the "whereis" command, which searches your $PATH looking for a binary with a given name. If $0 doesn't contain the full path name, running whereis $scriptname and saving the result into a variable should tell you where the script is located.