Any way to tell if a Perl script is run via do?

I have a small script configuration file which is loaded from a main script.
#main.pl
package MYPACKAGE;
our $isMaster=1;
package main;
my $config = do "./config.pl";
#config.pl
my $runViaDoFlag;
$runViaDoFlag=$0=~/main\.pl/; #Test if main.pl is the executing script
$runViaDoFlag=defined $MYPACKAGE::isMaster; #Test custom package variable
#Is there a 'built-in' way to do this?
die "Need to run from main script! " unless $runViaDoFlag;
{
options=>{
option1=>"some value",
option2=>"some value",
},
mySub=>sub {
# do interesting things here
}
}
In a more complicated config file it might not be so obvious that config.pl script is intended to only be executed by do. Hence I want to include a die with basic usage instructions.
Solutions:
test $0 for the main script name
have custom package variable defined in the main script and checked by the config script
simply have a comment in the config instructing the user how to use it.
These work, however is there some way of knowing if a script is executed via do via built-in variable/subs?

I'd offer a change in design: have that configuration in a normal module, in which you can then test whether it's been loaded by (out of) the main:: namespace or not. Then there is no need for any of that acrobatics with control variables etc.
One way to do that
use warnings;
use strict;
use feature 'say';
use FindBin qw($RealBin);
use lib $RealBin; # so to be able to 'use' from current directory
use ConfigPackage qw(load_config);
my $config = load_config();
# ...
and the ConfigPackage.pm (in the same directory)
package ConfigPackage;
use warnings;
use strict;
use feature 'say';
use Carp;
use Exporter qw(); # want our own import
our @EXPORT_OK = qw(load_config);
sub import {
#say "Loaded by: ", (caller)[0];
croak "Must only be loaded from 'main::'"
if not ( (caller)[0] eq 'main' );
# Now switch to Exporter::import to export symbols as needed
goto &Exporter::import;
}
sub load_config {
# ...
return 'Some config-related data structure';
}
1;
(Note that this use of goto is fine.)
This is just a sketch of course; adjust, develop further, and amend as needed. If this is loaded out of a package other than main::, and so it fails, then that happens in the compile phase, since that's when import is called. I'd consider that a good thing.
If that config code need be able to run as well, as the question may indicate, then have a separate executable that loads this module and runs what need be run.
As for the question as stated, the title and the question's (apparent) quest differ a little, but both can be treated by using caller EXPR. It won't be a clean little "built-in" invocation though.
The thing about do as intended to be used is that
do './stat.pl' is largely like
eval `cat stat.pl`;
except that ...
(That stat.pl is introduced earlier in docs, merely to signify that do is invoked on a file.)
Then caller(0) will have clear hints to offer (see docs). It returns
my ($package, $filename, $line, $subroutine, $hasargs,
$wantarray, $evaltext, $is_require, $hints, $bitmask, $hinthash)
= caller($i);
In a call asked for, do './config.pl', apart from main (package) and the correct filename, the caller(0) in config.pl also returns:
(eval) for $subroutine
./config.pl for $evaltext
1 for $is_require
Altogether this gives plenty to decide whether the call was made as required.
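As a minimal sketch of such a check (not a built-in; the temporary file, the option names, and the usage message are illustrative assumptions), the config file below refuses to run unless caller reports a calling frame, which is the case under do but not when the file is executed directly:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Write a throwaway config file (illustrative content, not the asker's real file).
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'config.pl' );
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} <<'END_CONFIG';
# caller is empty at the top level of a directly executed script,
# but reports the calling frame when the file is loaded via do.
die "Usage: my \$config = do './config.pl';\n" unless caller;
+{ option1 => 'some value', option2 => 'some value' };
END_CONFIG
close $fh or die "Cannot close $file: $!";

# Loaded via do: the check passes and the hash reference is returned.
my $config = do $file;
die "do failed: ", ( $@ || $! ) unless defined $config;
print "via do: option1 = $config->{option1}\n";

# Run directly: the die fires, so the child exits with a non-zero status.
my $status = system $^X, $file;
print "run directly: ", ( $status == 0 ? "ran" : "refused" ), "\n";
```

Note that this only tells "loaded by something" apart from "run directly"; require and use also produce a calling frame, so $is_require or $evaltext would still have to be inspected to narrow it down further.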
However, I would not recommend this kind of involved analysis instead of just using a package, which is also incomparably more flexible.

Related

Calling one Perl program from another

I have two Perl files and I want to call one file from another with arguments
First file a.pl
$OUTFILE = "C://programs/perls/$ARGV[0]";
# this should be some out file created inside work like C://programs/perls/abc.log
Second File abc.pl
require "a.pl" "abc.log";
# $OUTFILE is a variable inside a.pl and want to append current file's name as log.
I want it to create an output file with the name of log as that of current file.
One more constraint I have is to use $OUTFILE in both a.pl and abc.pl.
If there is any better approach please suggest.
The require keyword only takes one argument. That's either a file name or a package name. Your line
require "a.pl" "abc.log";
is wrong. It gives a syntax error along the lines of String found where operator expected.
You can require one .pl file from another .pl, but that is very old-fashioned, badly written Perl code.
If neither file defines a package then the code is implicitly placed in the main package. You can declare a package variable in the outside file and use it in the one that is required.
In abc.pl:
use strict;
use warnings;
# declare a package variable
our $OUTFILE = "C://programs/perls/filename";
# load and execute the other program
require 'a.pl';
And in a.pl:
use strict;
use warnings;
# do something with $OUTFILE, like use it to open a file handle
print $OUTFILE;
If you run this, it will print
C://programs/perls/filename
You should convert your perl file you want to call to a perl module:
Hello.pm
#!/usr/bin/perl
package Hello;
use strict;
use warnings;
sub printHello {
print "Hello $_[0]\n"
}
1;
Then you can call it:
test.pl
#!/usr/bin/perl
use strict;
use warnings;
# you have to put the current directory to the module search path
use lib (".");
use Hello;
Hello::printHello("a");
I tested it in git bash on windows, maybe you have to do some modifications in your environment.
In this way you can pass as many arguments as you would like to, and you don't have to look for the variables you are using and maybe not initialized (this is a less safe approach I think, e.g. sometimes you will delete something you didn't really want) somewhere in the file you want to call. The disadvantage is that you need to learn a bit about perl modules but I think it is definitely worth it.
A second approach could be to use the exec/system call (you can pass arguments in this way too; if forking a child process is acceptable), but that is another story.
I would do this another way. Have the program take the name of the log file as a command-line parameter:
% perl a.pl name-of-log-file
Inside a.pl, open that file to append to it then output whatever you like. Now you can run it from many other sorts of places besides another Perl program.
# a.pl
my $log_file = $ARGV[0] // 'default_log_name';
open my $fh, '>>:utf8', $log_file or die ...;
print { $fh } $stuff_to_output;
But, you could also call it from another Perl program. The $^X is the path to the currently running perl and this uses system in the slightly-safer list form:
system $^X, 'a.pl', $name_of_log_file
How you get something into $name_of_log_file is up to you. In your example you already knew the value in your first program.
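To see the two pieces working together, here is a self-contained sketch. The temporary directory, the stand-in a.pl, and the logged text are assumptions made up for the demonstration; only the $^X plus list-form system call comes from the answer above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir    = tempdir( CLEANUP => 1 );
my $script = File::Spec->catfile( $dir, 'a.pl' );
my $log    = File::Spec->catfile( $dir, 'abc.log' );

# A stand-in for a.pl: append one line to the log file named on the command line.
open my $out, '>', $script or die "Cannot write $script: $!";
print {$out} <<'END_SCRIPT';
use strict;
use warnings;
my $log_file = $ARGV[0] // 'default_log_name';
open my $fh, '>>:utf8', $log_file or die "Cannot append to $log_file: $!";
print {$fh} "logged by a.pl\n";
close $fh or die "Cannot close $log_file: $!";
END_SCRIPT
close $out or die "Cannot close $script: $!";

# List-form system: no shell is involved, so the arguments are passed as-is.
for my $run ( 1 .. 2 ) {
    system( $^X, $script, $log ) == 0
        or die "a.pl failed with status $?";
}

# Each run appended one line to the same log file.
open my $in, '<:utf8', $log or die "Cannot read $log: $!";
my @lines = <$in>;
close $in or die "Cannot close $log: $!";
print "log has ", scalar @lines, " lines\n";
```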

In Perl, why do I get "undefined subroutine" in a perl module but not in main ?

I'm getting an "undefined subroutine" for sub2 in the code below but not for sub1.
This is the perl script (try.pl)...
#!/usr/bin/env perl
use strict;
use IO::CaptureOutput qw(capture_exec_combined);
use FindBin qw($Bin);
use lib "$Bin";
use try_common;
print "Running try.pl\n";
sub1("echo \"in sub1\"");
sub2("echo \"in sub2\"");
exit;
sub sub1 {
(my $cmd) = @_;
print "Executing... \"${cmd}\"\n";
my ($stdouterr, $success, $exit_code) = capture_exec_combined($cmd);
print "${stdouterr}\n";
return;
}
This is try_common.pm...
#! /usr/bin/env perl
use strict;
use IO::CaptureOutput qw(capture_exec_combined);
package try_common;
use Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw(
sub2
);
sub sub2 {
(my $cmd) = @_;
print "Executing... \"${cmd}\"\n";
my ($stdouterr, $success, $exit_code) = capture_exec_combined($cmd);
print "${stdouterr}\n";
return;
}
1;
When I run try.pl I get...
% ./try.pl
Running try.pl
Executing... "echo "in sub1""
in sub1
Executing... "echo "in sub2""
Undefined subroutine &try_common::capture_exec_combined called at
/home/me/PERL/try_common.pm line 20.
This looks like some kind of scoping issue because if I cut/paste the "use IO::CaptureOutput qw(capture_exec_combined);" as the first line of sub2, it works. This is not necessary in the try.pl (it runs sub1 OK), but a problem in the perl module. Hmmmm....
Thanks in Advance for any help!
You imported capture_exec_combined by the use clause before declaring the package, so it was imported into the main package, not the try_common. Move the package declaration further up.
You should take a look at the perlmod document to understand how modules work. In short:
When you use package A (in Perl 5), you change the namespace of the following code to A, and all global symbol (e.g. subroutine) definitions after that point will go into that package. Subroutines inside a scope need not be exported and may be used preceded by their scope name: A::function. This you seem to have found.
Perl uses package as a way to create modules and split code in different files, but also as the basis for its object orientation features.
Most of the time, exports are handled by a special core module called Exporter. See Exporter. This module uses some variables to know what to do, like @EXPORT, @EXPORT_OK or @ISA. The first defines the names that should be exported by default when you include the module with use Module. The second defines the names that may be exported (but need to be mentioned with use Module qw(name1 name2)). The last tells in an object oriented fashion what your module is. If you don't care about object orientation, your module typically "is a" Exporter.
Also, as stated in another answer, when you define a module, the package module declaration should be the first thing to be in the file so anything after it will be under that scope.
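The effect is easy to reproduce in a single file. In this sketch the package names Without::Import and With::Import are made up, and List::Util's sum stands in for capture_exec_combined; a use line imports into whichever package is current at that point in the file:

```perl
use strict;
use warnings;

# No package statement yet, so this import lands in main.
use List::Util qw(sum);

package Without::Import;    # hypothetical: declared after the use line above
sub can_sum { return Without::Import->can('sum') ? 1 : 0 }

package With::Import;       # hypothetical: imports after its own declaration
use List::Util qw(sum);
sub can_sum { return With::Import->can('sum') ? 1 : 0 }
sub total   { return sum(@_) }

package main;
printf "main has sum:            %d\n", main->can('sum') ? 1 : 0;
printf "Without::Import has sum: %d\n", Without::Import::can_sum();
printf "With::Import has sum:    %d\n", With::Import::can_sum();
printf "With::Import::total:     %d\n", With::Import::total( 1, 2, 3 );
```

Calling sum inside Without::Import would fail with the same kind of "Undefined subroutine" error as in the question, because the name was only ever installed in main.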
I hate when I make this mistake although I don't make it much anymore. There are two habits you can develop:
Most likely, make the entire file the package. The first lines will be the package statement and no other package statements show up in the file.
Or, use the new PACKAGE BLOCK syntax and put everything for that package inside the block. I do this for small classes that I might need only locally:
package Foo {
# everything including use statements go in this block
}
I think I figured it out. If, in the perl module, I prefix the "capture_exec_combined" with "::", it works.
Still, why isn't this needed in the main, try.pl ?

Exporting subroutines from a module 'used' via a 'require'

I'm working with a set of perl scripts which our build system is written with. Unfortunately they were not written as a set of modules, but instead a bunch of .pl files which 'require' each other.
After making some changes to a 'LogOutput.pl' which was used by almost every other file, I started to suffer from some issues caused by the file being 'require'd multiple times.
In an effort to fix this, while not changing every file (some of which are not under my direct control), I did the following:
-Move everything in LogOutput.pl to a new file LogOutput.pm, this one having everything needed to make it a module (based on reading http://www.perlmonks.org/?node_id=102347 ).
-Replace the existing LogOutput.pl with the following
BEGIN
{
use File::Spec;
push @INC, File::Spec->catfile($BuildScriptsRoot, 'Modules');
}
use COMPANY::LogOutput qw(:DEFAULT);
1;
This works, except that I need to change calling code to prefix the sub names with the new package (i.e. COMPANY::LogOutput::OpenLog instead of just OpenLog)
Is there any way for me to export the new module's subroutines from within LogOutput.pl?
The well named Import::Into can be used to export a module's symbols into another package.
use Import::Into;
# As if Some::Package did 'use COMPANY::LogOutput'
COMPANY::LogOutput->import::into("Some::Package");
However, this shouldn't be necessary. Since LogOutput.pl has no package, its code is in the package it was required from. use COMPANY::LogOutput will export into the package which required LogOutput.pl. Your code, as written, should work to emulate a bunch of functions in a .pl file.
Here's what I assume LogOutput.pl looked like (using the subroutine "pass" as a stand in for whatever subroutines you had in there)...
sub pass { print "pass called\n" }
1;
And what I assume LogOutput.pl and LogOutput.pm look like now...
# LogOutput.pl
BEGIN
{
use File::Spec;
push @INC, File::Spec->catfile($BuildScriptsRoot, 'Modules');
}
use COMPANY::LogOutput qw(:DEFAULT);
1;
# LogOutput.pm
package test;
use strict;
use warnings;
use Exporter "import";
our @EXPORT_OK = qw(pass);
our %EXPORT_TAGS = (
':DEFAULT' => [qw(pass)],
);
sub pass { print "pass called\n" }
1;
Note this will not change the basic nature of require. A module will still only be required once, after that requiring it again is a no-op. So this will still not work...
{
package Foo;
require "test.pl"; # this one will work
pass();
}
{
package Bar;
require "test.pl"; # this is a no-op
pass();
}
You can make it work. Perl stores the list of what files have been required in %INC. If you delete an entry, Perl will load the file again. However, you have to be careful that all the code in the .pl file is ok with this. That @INC hack has to make sure it's only run once.
BEGIN
{
use File::Spec;
# Only run this code once, no matter how many times this
# file is loaded.
push @INC, File::Spec->catfile($BuildScriptsRoot, 'Modules')
unless $LogOutput_pl::only_once++;
}
use COMPANY::LogOutput qw(:DEFAULT);
# Allow this to be required and functions imported more
# than once.
delete $INC{"LogOutput.pl"};
1;
This is one of the few cases that a global variable is justified. A lexical (my) variable must be declared and would be reset with each loading of the library. A global variable does not need to be declared and will persist between loading.
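The %INC behaviour can be checked in isolation. This is only a sketch: the temporary file and the $load_count counter are assumptions for the demonstration, standing in for the real LogOutput.pl.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

our $load_count = 0;    # a global on purpose: it must survive reloads

# A throwaway library file that just counts how often it is executed.
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'LogOutput.pl' );
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} "\$main::load_count++;\n1;\n";
close $fh or die "Cannot close $file: $!";

require $file;    # first require: the file is compiled and run
require $file;    # second require: a no-op, %INC already lists the file
my $after_two_requires = $load_count;

delete $INC{$file};    # forget that the file was loaded
require $file;         # so this require runs it again
my $after_delete = $load_count;

print "after two requires: $after_two_requires\n";
print "after delete and require: $after_delete\n";
```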
This turned out to just be a stupid mistake on my part, I didn't put the subs into the @EXPORT list, only into @EXPORT_OK.
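For anyone hitting the same thing, the difference between the two lists can be shown in one file. My::LogOutput and the two subs are hypothetical stand-ins for the real module; the %INC line only exists so that use works without a separate .pm file.

```perl
use strict;
use warnings;

BEGIN {
    package My::LogOutput;    # hypothetical stand-in for COMPANY::LogOutput
    use Exporter 'import';
    our @EXPORT    = qw(OpenLog);     # exported by a plain "use"
    our @EXPORT_OK = qw(CloseLog);    # exported only when asked for by name
    sub OpenLog  { return "opened $_[0]" }
    sub CloseLog { return "closed $_[0]" }
    $INC{'My/LogOutput.pm'} = __FILE__;    # mark as loaded so "use" skips the file search
}

package Plain;
use My::LogOutput;                 # receives OpenLog only

package Explicit;
use My::LogOutput qw(CloseLog);    # receives CloseLog only

package main;
for my $pkg (qw(Plain Explicit)) {
    for my $name (qw(OpenLog CloseLog)) {
        printf "%-8s %-8s %s\n", $pkg, $name,
            $pkg->can($name) ? 'imported' : 'missing';
    }
}
```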

Is there a tool to check a Perl script for unnecessary use statements?

For Python, there is a script called importchecker which tells you if you have unnecessary import statements.
Is there a similar utility for Perl use (and require) statements?
Take a look at Devel::TraceUse; it might give you a chunk of what you're looking for.
Here is a script I wrote to attempt this. It is very simplistic and will not automate anything for you but it will give you something to start with.
#!/usr/bin/perl
use strict;
use v5.14;
use PPI::Document;
use PPI::Dumper;
use PPI::Find;
use Data::Dumper;
my %import;
my $doc = PPI::Document->new($ARGV[0]);
my $use = $doc->find( sub { $_[1]->isa('PPI::Statement::Include') } );
foreach my $u (@$use) {
my $node = $u->find_first('PPI::Token::QuoteLike::Words');
next unless $node;
$import{$u->module} //= [];
push @{ $import{$u->module} }, $node->literal;
}
my $words = $doc->find( sub { $_[1]->isa('PPI::Token::Word') } );
my @words = map { $_->content } @$words;
my %words;
@words{ @words } = ();
foreach my $u (keys %import) {
say $u;
foreach my $w (@{$import{$u}}) {
if (exists $words{$w}) {
say "\t- Found $w";
}
else {
say "\t- Can't find $w";
}
}
}
There are a number of ways to load packages and import symbols (or not). I am not aware of a tool which single-handedly and directly checks whether those symbols are used or not.
But for cases where an explicit import list is given,
use Module qw(func1 func2 ...);
there is a Perl::Critic policy TooMuchCode::ProhibitUnusedImport that helps with much of that.
One runs on the command line
perlcritic --single-policy TooMuchCode::ProhibitUnusedImport program.pl
and the program is checked. Or run without --single-policy flag for a complete check and seek Severity 1 violations in the output, which this is.
For an example, consider a program
use warnings;
use strict;
use feature 'say';
use Path::Tiny; # a class; but it imports 'path'
use Data::Dumper; # imports 'Dumper'
use Data::Dump qw(dd pp); # imports 'dd' and 'pp'
use Cwd qw(cwd); # imports only 'cwd'
use Carp qw(carp verbose); # imports 'carp'; 'verbose' isn't a symbol
use Term::ANSIColor qw(:constants); # imports a lot of symbols
sub a_func {
say "\tSome data: ", pp [ 2..5 ];
carp "\tA warning";
}
say "Current working directory: ", cwd;
a_func();
Running the above perlcritic command prints
Unused import: dd at line 7, column 5. A token is imported but not used in the same code. (Severity: 1)
Unused import: verbose at line 9, column 5. A token is imported but not used in the same code. (Severity: 1)
We got dd caught, while pp from the same package isn't flagged since it's used (in the sub), and neither are carp and cwd which are also used; as it should be, out of what the policy aims for.
But note
whatever comes with :constants tag isn't found
word verbose, which isn't a function (and is used implicitly), is reported as unused
if a_func() isn't called then those pp and carp in it are still not reported even though they are then unused. This may be OK, since they are present in code, but it is worth noting
(This glitch-list is likely not exhaustive.)
Recall that the import list is passed to an import sub, which may expect and make use of whatever the module's design deemed worthy; these need not be only function names. It is apparently beyond this policy to follow up on all that. Still, loading modules with the explicit import list with function names is good practice and what this policy does cover is an important use case.
Also, per the clearly stated policy's usage, the Dumper (imported by Data::Dumper) isn't found, nor is path from Path::Tiny. The policy does deal with some curious Moose tricks.
How does one do more? One useful tool is Devel::Symdump, which harvests the symbol tables. It catches all symbols in the above program that have been imported (no Path::Tiny methods can be seen if used, of course). The non-existing "symbol" verbose is included as well though. Add
use Devel::Symdump;
my $syms = Devel::Symdump->new;
say for $syms->functions;
to the above example. To also deal with (runtime) require-ed libraries we have to do this at a place in code after they have been loaded, which can be anywhere in the program. Then best do it in an END block, like
END {
my $ds = Devel::Symdump->new;
say for $ds->functions;
};
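For a rough idea of what such a harvest involves, here is a core-only sketch. It is not Devel::Symdump itself, and functions_in and local_func are made-up names; it simply walks one package's symbol table and keeps the entries that have a subroutine defined.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);    # two imported functions

sub local_func { return basename('/tmp/example.txt') }

# Return the sorted names of all subroutines visible in a package's symbol table,
# whether they were defined there or imported into it.
sub functions_in {
    my ($package) = @_;
    no strict 'refs';    # symbolic references are needed to walk the stash
    return sort grep { defined &{"${package}::$_"} } keys %{"${package}::"};
}

print "$_\n" for functions_in('main');
```

Unlike Devel::Symdump this does not recurse into nested packages, and like it, the list says nothing about whether a function is actually called.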
Then we need to check which of these are unused. At this time the best tool I'm aware of for that job is PPI; see a complete example. Another option is to use a profiler, like Devel::NYTProf.
Another option, which requires some legwork†, is the compiler's backend B::Xref, which gets practically everything that is used in the program. It is used as
perl -MO=Xref,-oreport_Xref.txt find_unused.pl
and the (voluminous) output is in the file report_Xref.txt.
The output has sections for each involved file, which have subsections for subroutines and their packages. The last section of the output is directly useful for the present purpose.
For the example program used above I get the output file like
File /.../perl5/lib/perl5//Data/Dump.pm
...
(some 3,000 lines)
...
File find_unused.pl --> there we go, this program's file
Subroutine (definitions)
... dozens of lines ...
Subroutine (main)
Package main
&a_func &43
&cwd &27
Subroutine a_func
Package ?
@?? 14
Package main
&carp &15
&pp &14
So we see that cwd gets called (on line 27) and that carp and pp are also called in the sub a_func. Thus dd and path are unused (out of all imported symbols found otherwise, by Devel::Symdump for example). This is easy to parse.
However, while path is reported when used, if one uses new instead (also in Path::Tiny as a traditional constructor) then that isn't reported in this last section, nor are other methods.
So in principle† this is one way to find which of the symbols (for functions) reported to exist by Devel::Symdump have been used in the program.
† The example here is simple and easy to process but I have no idea how complete, or hard to parse, this is when all kinds of weird ways for using imported subs are taken into account.

Is there a way to "use" a single file that in turn uses multiple others in Perl?

I'd like to create several modules that will be used in nearly all scripts and modules in my project. These could be used in each of my scripts like so:
#!/usr/bin/perl
use Foo::Bar;
use Foo::Baz;
use Foo::Qux;
use Foo::Quux;
# Potentially many more.
Is it possible to move all these use statements to a new module Foo::Corge and then only have to use Foo::Corge in each of my scripts and modules?
Yes, it is possible, but no, you shouldn't do it.
I just spent two weeks to get rid of a module that did nothing but use other modules. I guess this module started out simple and innocent. But over the years it grew into a huge beast with lots and lots of use-statements, most of which weren't needed for any given run of our webapp. Finally, it took some 20 seconds just to 'use' that module. And it supported lazy copy-and-paste module creation.
So again: you may regret that step in a couple of months or years. And what do you get on the plus side? You saved typing a couple of lines in a couple of modules. Big deal.
Something like this should work:
http://mail.pm.org/pipermail/chicago-talk/2008-March/004829.html
Basically, create your package with lots of modules:
package Lots::Of::Modules;
use strict; # strictly optional, really
# These are the modules we want everywhere we say "use Lots::Of::Modules".
# Any exports are re-imported to the module that says "use Lots::Of::Modules"
use Carp qw/confess cluck/;
use Path::Class qw/file dir/;
...
sub import {
my $caller = caller;
my $class = shift;
no strict;
*{ $caller. '::'. $_ } = \*{ $class. '::'. $_ }
for grep { !/(?:BEGIN|import)/ } keys %{ $class. '::' };
}
Then use Lots::Of::Modules elsewhere;
use Lots::Of::Modules;
confess 'OH NOES';
Yes.
In Foo/Corge.pm
use Foo::Bar;
use Foo::Baz;
use Foo::Qux;
use Foo::Quux;
1; # Be successful
All that is left is to get the directory containing sub-directory Foo added to your library path (@INC). Alternatively, create Foo.pm and have it use the other modules. They would be in a Foo sub-directory beside Foo.pm.
If you think about it, all the complex Perl modules that use other modules do this all the time. They are not necessarily in the same top-level package (Foo in this example), but they are used just as necessarily.
While you could use Carp, and Path::Class and confess, and so on (as jrockway suggests), that seems like overkill from where I'm sitting.
[EDIT: My earlier solution involving use Lots::Of::Modules; had a subtle bug -- see bottom. The fix makes things a bit uglier, but still workable.]
[EDIT #2: Added BEGIN { ... } around the code to ensure that any functions defined are available at compile time. Thanks to jrockway for pointing this out.]
The following code will do exactly what jrockway's code does, only simpler and clearer:
In Lots/Of/Modules.inc:
use Carp qw/confess cluck/;
use Path::Class qw/file dir/;
0; # Flag an error if called with "use" or "require" instead of "do"
To import those 4 functions:
BEGIN { defined( do 'Lots/Of/Modules.inc' ) or die; }
Because we don't have a package Lots::Of::Modules; statement at the start of this file, the use statements will export into the caller's package.
We must use do instead of use or require, since the latter will only load the file once (causing failure if use Lots::Of::Modules; is called more than once, e.g. in separate modules used by the main program). The more primitive do does not throw an exception if it fails to find the file named by its argument in @INC, hence the need for checking the result with defined.
Another option would be for Foo::Corge to just re-export any items of interest normally:
package Foo::Corge;
use base 'Exporter';
BEGIN {
our @EXPORT_OK = qw( bar baz qux quux );
use Foo::Bar qw( bar );
use Foo::Baz qw( baz );
use Foo::Qux qw( qux );
use Foo::Quux qw( quux );
}
1;
(The 'use' statements can probably go outside the BEGIN, but that's where they were in the code I checked to verify that this worked the way I thought it did. That code actually evals the uses, so it has a reason for them to be inside BEGIN which likely doesn't apply in your case.)
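Here is the same re-export idea as a runnable single-file sketch using only core modules. My::Common is a hypothetical name, File::Basename and List::Util stand in for the Foo::* modules, and the %INC line is only there so that use works without a separate .pm file.

```perl
use strict;
use warnings;

BEGIN {
    package My::Common;    # hypothetical umbrella module
    use Exporter 'import';
    use File::Basename qw(basename);
    use List::Util qw(sum);
    our @EXPORT_OK = qw(basename sum);    # pass the imported functions on
    $INC{'My/Common.pm'} = __FILE__;      # mark as loaded for this single-file demo
}

# One use line brings in functions that originate in two different modules.
use My::Common qw(basename sum);

my $name  = basename('/var/log/build.log');
my $total = sum( 1, 2, 3 );
print "$name\n$total\n";
```

The umbrella module first imports the functions into its own namespace, and Exporter then hands those same subs to whoever asks for them.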
Using @EXPORT instead of @EXPORT_OK is simpler.
Library is :
package mycommon;
use strict;
use warnings;
use base 'Exporter';
our @EXPORT = qw(test);
sub test {
print "this is a test";
}
1;
use it:
#!/usr/bin/perl
use strict;
use warnings;
use mycommon;
test();    # exported by default via @EXPORT, so no package prefix is needed