Perl "do", with relative path beginning with "." or ".."

I'm trying to use Perl's do EXPR function as a poor man's config parser, using a second .pl file that just returns a list as configuration information. (I think this is probably the ideal use for do, not least because I can write "do or die" in my code.) Here's an example:
main.pl
# Go read the config file
my %config = do './config.pl';
# do something with it
$web_object->login($config{username}, $config{password});
config.pl
# Configuration file for main script
(
username => "username",
password => "none_of_your_business",
favorite_color => "0x0000FF",
);
Reading Perldoc for do gives a lot of helpful advice about relative paths - searching @INC and modifying %INC, special warnings about 5.26 not searching "." any more, etc. But it also has these bits:
# load the exact specified file (./ and ../ special-cased)...
Using do with a relative path (except for ./ and ../), like...
And then it never actually bothers to explain the Special Case path handling for "./" or "../" - an important omission!
So my question(s) are all variations on "what really happens when you do './file.pl';"? For instance...
Does this syntax still work in 5.26, though CWD is removed from @INC?
From whose perspective is "./" anyway: the Perl binary, the Perl script executed, CWD from the user's shell, or something else?
Are there security risks to be aware of?
Is this better or worse than modifying @INC and just using a base filename?
Any insight is appreciated.

OK, so - to start with, I'm not sure your config.pl is really the right approach - for starters, it isn't standalone Perl, because it doesn't compile on its own. Either way, trying to evaluate code to 'parse config' isn't a great plan in general - it's rather prone to unpleasant glitches and security flaws, so it should be reserved for when it's actually needed.
I would urge you to do it differently by either:
Write it as a module
Something like this:
package MyConfig;
# Configuration file for main script
our %config = (
    username       => "username",
    password       => "none_of_your_business",
    favorite_color => "0x0000FF",
);

1; # a module must end with a true value
You could then in your main script:
use MyConfig; # note - the file must be named MyConfig.pm, and be in a directory in @INC
and access it as:
print $MyConfig::config{username},"\n";
If you can't put it in the existing @INC - and there may be reasons you can't - FindBin lets you use paths relative to your script's location:
use FindBin;
use lib "$FindBin::Bin";
use MyConfig;
Write your 'config' as a defined parsable format, rather than executable code.
YAML
YAML is very solid for a config file particularly:
use YAML::XS;
open ( my $config_file, '<', 'config.yml' ) or die $!;
my $config = Load ( do { local $/; <$config_file> });
print $config -> {username};
And your config file looks like:
username: "username"
password: "password_here"
favourite_color: "green"
air_speed_of_unladen_swallow: "african_or_european?"
(YAML also supports multi-dimensional data structures, arrays etc. You don't seem to need these though.)
JSON
JSON based looks much the same, just the input is:
{
"username": "username",
"password": "password_here",
"favourite_color": "green",
"air_speed_of_unladen_swallow": "african_or_european?"
}
You read it with:
use JSON;
open ( my $config_file, '<', 'config.json' ) or die $!;
my $config = from_json ( do { local $/; <$config_file> });
Using relative paths to config:
You don't have to worry about @INC at all. You can simply open the file by relative path... but a better bet would be to NOT do that, and use FindBin instead - which lets you specify "relative to my script path", and that's much more robust.
use FindBin;
open ( my $config_file, '<', "$FindBin::Bin/config.yml" ) or die $!;
And then you'll know you're reading the one in the same directory as your script, no matter where it's invoked from.
specific questions:
From whose perspective is "./" anyway: the Perl binary, the Perl script executed, CWD from the user's shell, or something else?
The current working directory passes down through processes. So it's the user's shell by default, unless the perl script does a chdir.
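A quick self-contained sketch of that point: "./" in do means the process's current working directory, so the file is only found once we chdir into the right place. (The file name, key, and value here are made up for the demonstration.)

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory holding a throwaway config file.
my $dir = tempdir( CLEANUP => 1 );

open my $fh, '>', "$dir/t_config.pl" or die $!;
print $fh "( colour => 'blue' );\n";
close $fh;

# do './file' resolves against the CWD, not the script's location,
# so we must chdir there first:
chdir $dir or die $!;
my %config = do './t_config.pl' or die "couldn't read config: $@$!";

print $config{colour}, "\n";    # blue
```

Run the script from anywhere and the chdir, not the script's own location, determines what "./t_config.pl" refers to.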
Are there security risks to be aware of?
Any time you 'evaluate' something as if it were executable code (and EXPR can be) there's a security risk. It's probably not huge, because the script will be running as the user, and the user is the person who can tamper with CWD. The core risks are:
the user is in a 'different' directory where someone else has put a malicious thing for them to run (e.g. imagine if 'config.pl' contained the equivalent of rm -rf /*). Maybe there's a 'config.pl' in /tmp that they 'run' accidentally?
The thing you're evaling has a typo, and breaks the script in funky and unexpected ways. (E.g. maybe it redefines $[ and messes with program logic henceforth in ways that are hard to debug)
script does anything in a privileged context. Which doesn't appear to be the case, but see the previous point and imagine if you're root or other privileged user.
Is this better or worse than modifying @INC and just using a base filename?
Worse, IMO. Actually, just don't modify @INC at all - use a full path, or a relative one via FindBin. And don't eval things when it's not necessary.

Related

If I declare a package with multiple levels of embedding, does the module named as the leaf node need to be in a subdirectory?

I am dealing with some legacy Perl code, which I want to change as little as possible. The main script, call it "my_script.pl", has a structure like this
use MyCompany::ABDoc::HTMLFile;
use MyCompany::ABDoc::JavaScriptFile 2014.004_015;
print "Howdy";
...
The HTMLFile.pm module looks like this
package MyCompany::AMDoc::HTMLFile;
...
I am troubleshooting my_script.pl, and wish to run it from the command line. (It normally is triggered by a Jenkins job). When I try
perl -d ./my_script.pl
I get a message about HTMLFile.pm not being found. This is because HTMLFile.pm actually exists at the same level as my_script.pl in the filesystem.
If this was my own script, and I had the freedom to move things around, I would create directory structure
MyCompany/AMDoc/HTMLFile.pm
and I know the script would work. But I am reluctant to do this, because somehow, this all runs fine when triggered by the Jenkins job.
So is it possible to run this code, from the command line, without moving anything? I just want to do some troubleshooting. I haven't found discussion in the Perl documentation about what kinds of command line flags, such as "-I", might help me out here.
I would create directory structure MyCompany/AMDoc/HTMLFile.pm and I know the script would work.
No, moving the file to MyCompany/AMDoc/HTMLFile.pm relative to the script would not work unless the script already takes steps to add its directory to @INC.[1]
For example, adding the following in the script would achieve that:
use FindBin qw( $RealBin );
use lib $RealBin;
This can also be done from without
perl -I "$dir" "$dir"/my_script.pl # General case
perl -I . ./my_script.pl # Specific case
So is it possible to run this code, from the command line, without moving anything?
No, not without modifying the script.[2]
According to what you gave us, it has to be accessible as MyCompany/AMDoc/HTMLFile.pm relative to a directory in @INC.
It would happen to work in old versions of Perl if the script's current working directory happened to match the directory containing the script. But that's just a fluke; the two frequently don't match.
Well, you could use something of the form
perl -e'
    use FindBin qw( $RealBin );
    my $s = shift;
    push @INC, sub {
        my ( undef, $path ) = @_;
        my ( $qfn ) = $path =~ m{^MyCompany/AMDoc/(.*)}s
            or return;
        open( my $fh, "<", $qfn )
            or return;
        return $fh;
    };
    do( $s ) or die( $@ // $! );
' ./my_script.pl
Even then, that expects the script to end in a true value.
My initial assumption was wrong. My code was actually running on a Kubernetes pod, and that pod was configured with the correct directory structure for the module. In addition, PERL5LIB is set in the pod.
PERL5LIB=/perl5lib/perl-modules:/perl5lib/perl-modules/cpan/lib64/perl5:/perl5lib/perl-modules/cpan/share/perl5
and sure enough, that very first path has the path to my module.

Including a module from another module [duplicate]

I don't know how to do one thing in Perl and I feel I am doing something fundamentally wrong.
I am doing a larger project, so I split the task into different modules. I put the modules into the project directory, in the "modules/" subdirectory, and added this directory to PERL5LIB and PERLLIB.
All of these modules use some configuration, saved in external file in the main project directory - "../configure.yaml" if you look at it from the module file perspective.
But, right now, when I use module through "use", all relative paths in the module are taken as from the current directory of the script using these modules, not from the directory of the module itself. Not even when I use FindBin or anything.
How do I load a file, relative from the module path? Is that even possible / advisable?
Perl stores where modules are loaded from in the %INC hash. You can load things relative to that:
package Module::Foo;
use File::Spec;
use strict;
use warnings;
my ($volume, $directory) = File::Spec->splitpath( $INC{'Module/Foo.pm'} );
my $config_file = File::Spec->catpath( $volume, $directory, '../configure.yaml' );
%INC's keys are based on a strict translation of :: to / with .pm appended, even on Windows, VMS, etc.
Note that the values in %INC may be relative to the current directory if you put relative directories in @INC, so be careful if you change directories between the require/use and checking %INC.
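A defensive sketch of that caution: resolve the %INC entry to an absolute path immediately after loading, before any later chdir can invalidate it. (Here the core File::Spec stands in for your own module, and configure.yaml is the hypothetical config file from the question.)

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Basename qw(dirname);
use File::Spec;    # stand-in for your own Module::Foo

# Resolve the %INC entry to an absolute directory right away; a
# relative value stored in %INC breaks as soon as anyone chdirs.
my $module_dir  = dirname( abs_path( $INC{'File/Spec.pm'} ) );
my $config_file = "$module_dir/../configure.yaml";    # hypothetical config path

print "$module_dir\n";
```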
The global %INC table contains an entry for every module you have use'd or require'd, associated with the place that Perl found that module.
use YAML;
print $INC{"YAML.pm"};
>> /usr/lib/perl5/site_perl/5.8/YAML.pm
Is that more helpful?
There's a module called File::ShareDir that exists to solve this problem. You were on the right track trying FindBin, but FindBin always finds the running program, not the module that's using it. ShareDir does something quite similar to ysth's solution, except wrapped up in a nice interface.
Usage is as simple as
my $filename = File::ShareDir::module_file( __PACKAGE__, 'my/data.txt' );
# and then open $filename or whatever else.
or
my $dirname = File::ShareDir::module_dir(__PACKAGE__);
# Play ball!
Change your use Module call to require Module (or require Module; Module->import(LIST)). Then use the debugger to step through the module loading process and see where Perl thinks it is loading the files from.
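Sketched with a core module standing in for yours (List::Util here), the run-time equivalent of `use List::Util qw(sum)` looks like this:

```perl
use strict;
use warnings;

# require + import happen at run time, not compile time, so a debugger
# can single-step through the module-loading process:
require List::Util;
List::Util->import('sum');

print sum( 1, 2, 3 ), "\n";    # 6
```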

Using a Perl module in my own directory [duplicate]

I have a module in the parent directory of my script and I would like to 'use' it.
If I do
use '../Foo.pm';
I get syntax errors.
I tried to do:
push @INC, '..';
use EPMS;
and .. apparently doesn't show up in @INC
I'm going crazy! What's wrong here?
use takes place at compile-time, so this would work:
BEGIN { push @INC, '..' }
use EPMS;
But the better solution is to use lib, which is a nicer way of writing the above:
use lib '..';
use EPMS;
In case you are running from a different directory, though, the use of FindBin is recommended:
use FindBin; # locate this script
use lib "$FindBin::RealBin/.."; # use the parent directory
use EPMS;
There are several ways you can modify @INC.
set PERL5LIB, as documented in perlrun
use the -I switch on the command line, also documented in perlrun. You can also apply this automatically with PERL5OPT, but just use PERL5LIB if you are going to do that.
use lib inside your program, although this is fragile since another person on a different machine might have it in a different directory.
Manually modify @INC, making sure you do that at compile time if you want to pull in a module with use. That's too much work though.
require the filename directly. While this is possible, it doesn't allow that filename to load files in the same directory. This would definitely raise eyebrows in a code review.
Personally I prefer to keep my modules (those that I write for myself or for systems I can control) in a certain directory, and also to place them in a subdirectory. As in:
/www/modules/MyMods/Foo.pm
/www/modules/MyMods/Bar.pm
And then where I use them:
use lib qw(/www/modules);
use MyMods::Foo;
use MyMods::Bar;
As an aside.. when it comes to pushing, I prefer the fat-arrow comma:
push @array => $pushee;
But that's just a matter of preference.
'use lib' is the answer, as @ephemient mentioned earlier. One other option is to use require/import instead of use. It means the module isn't loaded at compile time, but at run time instead.
That will allow you to modify @INC as you tried there, or you could pass require a path to the file instead of the module name. From 'perldoc -f require':
If EXPR is a bareword, the require assumes a ".pm" extension and
replaces "::" with "/" in the filename for you, to make it easy to
load standard modules. This form of loading of modules does not risk
altering your namespace.
You have to have the push processed before the use is -- and use is processed early. So you'll need a BEGIN { push @INC, ".."; } to have a chance, I believe.
As reported by "perldoc -f use":
It is exactly equivalent to
BEGIN { require Module; import Module LIST; }
except that Module must be a bareword.
Putting that another way, "use" is equivalent to:
running at compile time,
converting the package name to a file name,
require-ing that file name, and
import-ing that package.
So, instead of calling use, you can call require and import inside a BEGIN block:
BEGIN {
require '../EPMS.pm';
EPMS->import();
}
And of course, if your module doesn't actually do any symbol exporting or other initialization when you call import, you can leave that line out:
BEGIN {
require '../EPMS.pm';
}
Some IDEs don't work correctly with 'use lib', the favored answer. I found 'use lib::relative' works with my IDE, JetBrains' WebStorm.
see POD for lib::relative
The reason it's not working is that what you're adding to @INC is relative to the current working directory of the shell, rather than the script's directory.
For example, if you're currently in:
a/b/
And the script you're running has this path:
a/b/modules/tests/test1.pl
BEGIN {
unshift( @INC, ".." );
}
The above will mean that .. results in directory a/ rather than a/b/modules.
Either you must change .. to ./modules in your code or do a cd modules/tests in the command line before running the script again.

How can I run a shell script from inside a Perl script run by cron?

Is it possible to run a Perl script (vas.pl) that calls shell scripts (date.sh & backlog.sh) from cron, or vice versa?
Thanks.
0 19 * * * /opt/perl/bin/perl /reports/daily/scripts/vas_rpt/vasCIO.pl 2> /reports/daily/scripts/vas_rpt/vasCIO.err
Error encountered:
date.sh: not found
backlog.sh: not found
Perl script:
#!/opt/perl/bin/perl
system("sh date.sh");
open(FH,"/reports/daily/scripts/vas_rpt/date.txt");
@date = <FH>;
close FH;
open(FH,"/reports/daily/scripts/vas_rpt/$cat1.txt");
@array = <FH>;
system("sh backlog.sh $date[0] $array[0]");
close FH;
cron runs your perl script in a different working directory than your current working directory. Use the full path of your script file:
# I'm assuming your shell script reside in the same
# dir as your perl script:
system("sh /reports/daily/scripts/date.sh");
Or if you're allergic to hardcoding paths like I am, you can use the core FindBin module:
use FindBin qw($Bin);
system("sh $Bin/date.sh");
If your shell script also needs to start in the correct path then it's probably better to first change your working directory:
use FindBin qw($Bin);
chdir $Bin;
system("sh date.sh");
You can do what you want as long as you are careful.
The first thing to remember with cron jobs is that you get almost no environment set.
The chances are, the current directory is / or perhaps $HOME. And the value of $PATH is minimal - your profile has not been run, for example.
So, your script didn't find 'date.sh' because it wasn't in the correct directory.
To get the data from the shell script into your program, you need to pipe it there - or arrange for the 'date.sh' to dump the data into the file successfully. Of course, Perl has built-in date and time handling, so you don't need to use the shell for it.
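For instance, the date.sh step can likely be dropped entirely; here's a sketch using the core POSIX module (the exact format string is an assumption about what date.sh produced):

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# The same information date(1) would give, with no subprocess and no
# working-directory concerns:
my $today = strftime( '%Y-%m-%d', localtime );
print "$today\n";
```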
You also did not run with use warnings; or use strict;, which would also have helped you. For example, $cat1 is never defined.
Personally, I run a simple shell script from cron and let it deal with all the complexities; I don't use I/O redirection in the crontab file. That's partly a legacy of working on ancient systems - but it also leads to portable and reliable running of cron jobs.
It's possible. Just keep in mind that your working directory when running under cron may not be what you think it is - it's the value in your HOME environment variable, or that specified in the /etc/passwd file. Consider fully qualifying the path to your .shes.
There are a lot of things that need care in your script, and I talk about most of them in the "Secure Programming Techniques" chapter of Mastering Perl. You can also find some of it in perlsec.
Since you are taking external data and passing them to other external programs, you should use taint checking to ensure that the data are what you expect. What if someone were able to sneak something extra into those files?
When you want to pass data to external programs, use system in the list form so the shell doesn't get a chance to interpret possible meta-characters.
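Applied to the script above, that might look like the following sketch (the data values are illustrative, and /bin/echo stands in for the full path to backlog.sh):

```perl
use strict;
use warnings;

my $date = '2023-01-01; rm -rf /';    # hostile-looking data stays inert
my $cat  = 'voice';

# List form: each element is passed straight to exec(), so the shell
# never gets a chance to interpret metacharacters in the data.
system( '/bin/echo', $date, $cat ) == 0
    or die "command failed: \$? = $?";
```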
Instead of relying on the PATH to find the programs that you expect to run, specify their full paths explicitly to ensure you are at least running the file you think you are (and not something someone snuck into a directory that is earlier in PATH). If you were really paranoid (like taint checking is), you might also check that those files and directories had suitable permissions (e.g., not world-writeable).
Just as a bonus note, if you only want one line from a filehandle, you can use the line-input operator in scalar context:
my $date = <$fh>;
You probably want to chomp the data too to get rid of possible ending newlines. Even if you don't think a terminating newline should be there because another program created the file, someone looking at the file with a text editor might add it.
Good luck, :)

What's the best practice for changing working directories inside scripts?

Do you think changing directories inside bash or Perl scripts is acceptable? Or should one avoid doing this at all costs?
What is the best practice for this issue?
Like Hugo said, you can't affect your parent process's cwd, so there's no problem.
Where the question is more applicable is if you don't control the whole process, like in a subroutine or module. In those cases you want to exit the subroutine in the same directory as you entered, otherwise subtle action-at-a-distance creeps in which causes bugs.
You can do this by hand...
use Cwd;
sub foo {
    my $orig_cwd = cwd;
    chdir "some/dir";

    ...do some work...

    chdir $orig_cwd;
}
but that has problems. If the subroutine returns early, or dies and the exception is trapped, your code will still be in some/dir. Also, the chdirs might fail, and you have to remember to check each one. Bleh.
Fortunately, there's a couple modules to make this easier. File::pushd is one, but I prefer File::chdir.
use File::chdir;
sub foo {
    local $CWD = 'some/dir';

    ...do some work...
}
File::chdir turns changing directories into assigning to $CWD. And you can localize $CWD so it will reset at the end of your scope, no matter what. It also automatically checks whether the chdir succeeds and throws an exception otherwise. I sometimes use it in scripts because it's just so convenient.
The current working directory is local to the executing shell, so you can't affect the user unless he is "dotting" (running it in the current shell, as opposed to running it normally creating a new shell process) your script.
A very good way of doing this is to use subshells, which i often do in aliases.
alias build-product1='(cd "$working_copy/delivery"; mvn package)'
The parentheses make sure that the command is executed in a sub-shell, and thus will not affect the working directory of my shell. It also will not affect the last working directory, so cd - works as expected.
I don't do this often, but sometimes it can save quite a bit of headache. Just be sure that if you change directories, you always change back to the directory you started from. Otherwise, changing code paths could leave the application somewhere it should not be.
For Perl, you have the File::pushd module from CPAN which makes locally changing the working directory quite elegant. Quoting the synopsis:
use File::pushd;
chdir $ENV{HOME};
# change directory again for a limited scope
{
    my $dir = pushd( '/tmp' );
    # working directory changed to /tmp
}
# working directory has reverted to $ENV{HOME}

# tempd() is equivalent to pushd( File::Temp::tempdir )
{
    my $dir = tempd();
}

# object stringifies naturally as an absolute path
{
    my $dir = pushd( '/tmp' );
    my $filename = File::Spec->catfile( $dir, "somefile.txt" );
    # gives /tmp/somefile.txt
}
I'll second Schwern's and Hugo's comments above. Note Schwern's caution about returning to the original directory in the event of an unexpected exit. He provided appropriate Perl code to handle that. I'll point out the shell (Bash, Korn, Bourne) trap command.
trap "cd $saved_dir" 0
will return to saved_dir on subshell exit (if you're .'ing the file).
mike
Consider also that Unix and Windows shells have a built-in directory stack: pushd and popd. It's extremely easy to use.
Is it at all feasible to try and use fully-qualified paths, and not make any assumptions about which directory you're currently in? e.g.
use FileHandle;
use FindBin qw($Bin);
# ...
my $file = new FileHandle("< $Bin/somefile");
rather than
use FileHandle;
# ...
my $file = new FileHandle("< somefile");
This will probably be easier in the long run, as you don't have to worry about weird things happening (your script dying or being killed before it could put the current working directory back to where it was), and is quite possibly more portable.