Is it possible to use or require a Perl script without executing its statements? - perl

I need to add unit testing to some old scripts, the scripts are all basically in the following form:
#!/usr/bin/perl
# Main code
foo();
bar();
# subs
sub foo {
}
sub bar {
}
If I try to 'require' this code in a unit test, the main section of the code will run, whereas I want to be able to test just "foo" in isolation.
Is there any way to do this without moving foo and bar into a separate .pm file?

Assuming you have no security concerns, wrap it in a sub { ... } and eval it:
use File::Slurp "read_file";
eval "package Script; sub {" . read_file("script") . "}";
is(Script::foo(), "foo");
(taking care that the eval isn't in scope of any lexicals that would be closed over by the script).

Another common trick for unit testing scripts is to wrap the body of their code into a 'caller' block:
#!/usr/bin/perl
use strict;
use warnings;
unless (caller) {
# startup code
}
sub foo { ... }
When run from the command line, cron, a bash script, etc., it runs normally. However, if you load it from another Perl program, the "unless (caller) {...}" code does not run. Then in your test program, declare a namespace (since the script is probably running code in package main::) and 'do' the script.
#!/usr/bin/perl
package Tests::Script; # avoid the Test:: namespace to avoid conflicts
# with testing modules
use strict;
use warnings;
do 'some_script' or die "Cannot (do 'some_script'): $!";
# write your tests
'do' is more efficient than eval and fairly clean for this.
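Continuing that skeleton, the tests can then call the script's subs directly, since they were compiled into the current package when the script was do'd. A minimal sketch; foo() is the sub from the question's script, but its expected return value here is just a placeholder:
use Test::More tests => 1;
# The subs from 'some_script' now live in Tests::Script (the package in
# effect when do() ran), so they can be called like ordinary functions:
is( foo(), 'expected value', 'foo() returns the expected value' );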
Another trick for testing scripts is to use Expect. This is cleaner, but is also harder to use and it won't let you override anything within the script if you need to mock anything up.

Ahh, the old "how do I unit test a program" question. The simplest trick is to put this in your program before it starts doing things:
return 1 unless $0 eq __FILE__;
__FILE__ is the current source file. $0 is the name of the program being run. If they are the same, your code is being executed as a program. If they're different, it's being loaded as a library.
That's enough to let you start unit testing the subroutines inside your program.
require "some/program";
...and test...
The next step is to move all the code that sits outside any subroutine into a main() subroutine; then you can do this:
main() if $0 eq __FILE__;
and now you can test main() just like any other subroutine.
Once that's done you can start contemplating moving the program's subroutines out into their own real libraries.
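Putting the pieces together, the refactored program and a test for it might look roughly like this. A sketch only; the file names, main()'s body, and the single test are illustrative:
#!/usr/bin/perl
# file: myprog
use strict;
use warnings;
main(@ARGV) if $0 eq __FILE__;   # run only when invoked as a program
sub main {
    foo();
    bar();
}
sub foo { ... }
sub bar { ... }
1;   # a true value, so require() succeeds
And the corresponding test file:
# file: t/myprog.t
use strict;
use warnings;
use Test::More tests => 1;
require './myprog';   # main() does not run: $0 is the .t file, not myprog
ok( defined &main, 'myprog loaded as a library; main() can now be tested' );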

Related

Any way to tell if a Perl script is run via do?

I have a small script configuration file which is loaded from a main script.
#main.pl
package MYPACKAGE;
our $isMaster=1;
package main;
my $config = do "./config.pl";
#config.pl
my $runViaDoFlag;
$runViaDoFlag=$0=~/main\.pl/; #Test if main.pl is the executing script
$runViaDoFlag=defined $MYPACKAGE::isMaster; #Test custom package variable
#Is there a 'built-in' way to do this?
die "Need to run from main script! " unless $runViaDoFlag;
{
options=>{
option1=>"some value",
option2=>"some value",
},
mySub=>sub {
# do interesting things here
}
}
In a more complicated config file it might not be so obvious that the config.pl script is intended to be executed only via do. Hence I want to include a die with basic usage instructions.
Solutions:
test $0 for the main script name
have a custom package variable defined in the main script and checked by the config script
simply have a comment in the config instructing the user how to use it.
These work; however, is there some way of knowing whether a script is executed via do, using a built-in variable or sub?
I'd offer a change in design: have that configuration in a normal module, in which you can then test whether it has been loaded from the main:: namespace or not. Then there is no need for any of those acrobatics with control variables, etc.
One way to do that
use warnings;
use strict;
use feature 'say';
use FindBin qw($RealBin);
use lib $RealBin; # so to be able to 'use' from current directory
use ConfigPackage qw(load_config);
my $config = load_config();
# ...
and the ConfigPackage.pm (in the same directory)
package ConfigPackage;
use warnings;
use strict;
use feature 'say';
use Carp;
use Exporter qw(); # want our own import
our @EXPORT_OK = qw(load_config);
sub import {
#say "Loaded by: ", (caller)[0];
croak "Must only be loaded from 'main::'"
if not ( (caller)[0] eq 'main' );
# Now switch to Exporter::import to export symbols as needed
goto &Exporter::import;
}
sub load_config {
# ...
return 'Some config-related data structure';
}
1;
(Note that this use of goto is fine.)
This is just a sketch of course; adjust, develop further, and amend as needed. If this is loaded from a package other than main::, and so fails, that happens in the compile phase, since that is when import is called. I'd consider that a good thing.
If that config code needs to be able to run on its own as well, as the question may indicate, then have a separate executable that loads this module and runs whatever needs to be run.
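If so, that separate executable can be as small as the following sketch; the file name run_config.pl is made up, and what it actually runs depends on what the old config.pl did when executed directly:
#!/usr/bin/env perl
# run_config.pl -- loads ConfigPackage from main::, so the import check passes
use warnings;
use strict;
use FindBin qw($RealBin);
use lib $RealBin;
use ConfigPackage qw(load_config);
my $config = load_config();
# ... then do whatever config.pl used to do when run directly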
As for the question as stated, the title and the question's (apparent) quest differ a little, but both can be treated by using caller EXPR. It won't be a clean little "built-in" invocation though.
The thing about do as intended to be used is that
do './stat.pl' is largely like
eval `cat stat.pl`;
except that ...
(That stat.pl is introduced earlier in docs, merely to signify that do is invoked on a file.)
Then caller(0) will have clear hints to offer (see docs). It returns
my ($package, $filename, $line, $subroutine, $hasargs,
$wantarray, $evaltext, $is_require, $hints, $bitmask, $hinthash)
= caller($i);
For the call in question, do './config.pl', apart from main (the package) and the correct filename, caller(0) in config.pl also returns:
(eval) for $subroutine
./config.pl for $evaltext
1 for $is_require
Altogether this gives plenty to decide whether the call was made as required.
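For instance, config.pl could inspect those fields itself; a minimal sketch that relies on the values listed above:
# near the top of config.pl
my ($pkg, $file, $line, $sub, $hasargs,
    $wantarray, $evaltext, $is_require) = caller(0);
# When run directly there is no caller frame at all; when pulled in via
# do './config.pl' the frame looks like an eval created by do/require.
die "config.pl is meant to be loaded with do() from main.pl\n"
    unless defined($sub) && $sub eq '(eval)' && $is_require;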
However, I would not recommend this kind of involved analysis over simply using a package, which is also far more flexible.

In Perl, why do I get "undefined subroutine" in a Perl module but not in main?

I'm getting an "undefined subroutine" for sub2 in the code below but not for sub1.
This is the perl script (try.pl)...
#!/usr/bin/env perl
use strict;
use IO::CaptureOutput qw(capture_exec_combined);
use FindBin qw($Bin);
use lib "$Bin";
use try_common;
print "Running try.pl\n";
sub1("echo \"in sub1\"");
sub2("echo \"in sub2\"");
exit;
sub sub1 {
(my $cmd) = @_;
print "Executing... \"${cmd}\"\n";
my ($stdouterr, $success, $exit_code) = capture_exec_combined($cmd);
print "${stdouterr}\n";
return;
}
This is try_common.pm...
#! /usr/bin/env perl
use strict;
use IO::CaptureOutput qw(capture_exec_combined);
package try_common;
use Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw(
sub2
);
sub sub2 {
(my $cmd) = @_;
print "Executing... \"${cmd}\"\n";
my ($stdouterr, $success, $exit_code) = capture_exec_combined($cmd);
print "${stdouterr}\n";
return;
}
1;
When I run try.pl I get...
% ./try.pl
Running try.pl
Executing... "echo "in sub1""
in sub1
Executing... "echo "in sub2""
Undefined subroutine &try_common::capture_exec_combined called at
/home/me/PERL/try_common.pm line 20.
This looks like some kind of scoping issue, because if I cut/paste the "use IO::CaptureOutput qw(capture_exec_combined);" line as the first line of sub2, it works. This is not necessary in try.pl (it runs sub1 OK), but it is a problem in the Perl module. Hmmmm....
Thanks in Advance for any help!
You imported capture_exec_combined with the use clause before declaring the package, so it was imported into the main package, not into try_common. Move the package declaration further up, above the use statements.
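In other words, a corrected try_common.pm would look something like this (the same code as above, with the package statement moved to the top so the import lands in try_common):
package try_common;
use strict;
use warnings;
use IO::CaptureOutput qw(capture_exec_combined);   # now imported into try_common
use Exporter;
our @ISA    = qw(Exporter);
our @EXPORT = qw(sub2);
sub sub2 {
    my ($cmd) = @_;
    print "Executing... \"${cmd}\"\n";
    my ($stdouterr, $success, $exit_code) = capture_exec_combined($cmd);
    print "${stdouterr}\n";
    return;
}
1;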
You should take a look at the perlmod document to understand how modules work. In short:
When you use package A (in Perl 5), you change the namespace of the following code to A, and all global symbol (e.g. subroutine) definitions after that point go into that package. Subroutines in a package need not be exported and can be called with their package-qualified name: A::function. This you seem to have found.
Perl uses package as a way to create modules and split code in different files, but also as the basis for its object orientation features.
Most of the time, modules are handled by a special core module called Exporter. See Exporter. This module uses some variables to know what to do, like @EXPORT, @EXPORT_OK or @ISA. The first defines the names that should be exported by default when you include the module with use Module. The second defines the names that may be exported (but need to be mentioned with use Module qw(name1 name2)). The last tells, in an object-oriented fashion, what your module is. If you don't care about object orientation, your module typically "is a" Exporter.
Also, as stated in another answer, when you define a module, the package declaration should be the first thing in the file, so that everything after it is in that package's scope.
I hate when I make this mistake although I don't make it much anymore. There are two habits you can develop:
Most likely, make the entire file the package. The first lines will be the package statement and no other package statements show up in the file.
Or, use the new PACKAGE BLOCK syntax and put everything for that package inside the block. I do this for small classes that I might need only locally:
package Foo {
# everything including use statements go in this block
}
I think I figured it out. If, in the Perl module, I prefix capture_exec_combined with "::", it works.
Still, why isn't this needed in the main script, try.pl?

How to write unit test cases for a Perl script

I'm able to write unit tests test_case.t for a Perl Module ModuleOne.pm
test_case.t
use strict;
use warnings;
use Test::More;
use Test::Cmd;
use ModuleOne; # Included the module here
my $ret = ModuleOne::methodone(args);
is($ret->{val}, 1, "Checking return value"); # success
I'm trying to achieve the same unit test cases for a Perl script, script_one.pl.
script_one.pl
use strict;
use warnings;
use ModuleOne;
use Getopt::Long;   # provides GetOptions
my $NAME;
my $ID;
# get parameters
GetOptions (
"name" => \$NAME,
"emp_id" => \$ID,
);
validate_params();
sub validate_params {
# this method will validate params
}
sub print_name {
# this method will print name
}
How can I include this perl file script_one.pl in test_case.t and write test cases for methods validate_params and print_name?
There are a couple of options. One is to use Test::Script to see if your code compiles and runs, and does some stuff. It's more of an integration test than a unit test though, and if you have external dependencies like writing to the file system, it's tough to mock those away like this.
Since you've got subs in the script, the easiest way is probably to require or do the script in your test file, maybe inside a different package (but that doesn't really matter). You can then call those functions, because they are in one of your namespaces.
use strict;
use warnings;
use Test::More;
package Foo {
do './script_one.pl';   # './' so the file is taken from the current directory, not searched for in @INC
};
is Foo::print_name(), 'foo', 'prints the right name';
This way you can mock dependencies more easily and you get some more control. The only thing that might be tricky is code that's not in subs and will be run at invocation, like the call to validate_params. You could just use Capture::Tiny to brush that under the rug.
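For example, the load can be wrapped so that any top-level output is swallowed. A sketch; it assumes Capture::Tiny is available and that the top-level code only prints rather than dying:
use Capture::Tiny qw(capture);
# Discard whatever the script's top-level code prints while it is loaded.
my ($stdout, $stderr) = capture {
    package Foo;
    do './script_one.pl';
};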
The best option though is to not have functions in your script. Just make another module that has those functions, and call it in your script. It's fine to have a script like the following.
#!/usr/bin/env perl
use strict;
use warnings;
use My::Modules::Foo;
My::Modules::Foo->run; # or ::run()
It doesn't matter if it's OOP or not. The idea will be the same. If you encapsulate it properly, you can unit-test all your code without ever using that script.
Regarding the GetOpts stuff, those variables can be lexicals to the script, but your naming with the capital letters and the lack of arguments to the validate_params call indicate that they are really package-wide and are used inside the function. Don't do that. Use arguments to the subs. Put all the subs in a package, then have GetOpts in the script, and pass the options as arguments to the function.
That way you can test everything and really don't need the script.
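A sketch of that layout; the module name, option specs, and sub bodies below are made up for illustration:
# lib/My/Modules/Foo.pm
package My::Modules::Foo;
use strict;
use warnings;
use Exporter qw(import);
our @EXPORT_OK = qw(validate_params print_name);
sub validate_params {
    my (%opt) = @_;              # options arrive as arguments, not as globals
    die "emp_id is required\n" unless defined $opt{emp_id};
    return 1;
}
sub print_name {
    my (%opt) = @_;
    print "$opt{name}\n";
    return $opt{name};
}
1;
The script then reduces to option parsing plus calls into the module, and the module is trivially testable on its own:
# script_one.pl
use strict;
use warnings;
use Getopt::Long;
use My::Modules::Foo qw(validate_params print_name);
GetOptions( 'name=s' => \my $name, 'emp_id=s' => \my $id );
validate_params( name => $name, emp_id => $id );
print_name( name => $name );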

identify a procedure and replace it with a different procedure

What I want to achieve:
###############CODE########
old_procedure(arg1, arg2);
#############CODE_END######
I have a huge codebase which has an old procedure in it. I want every call to that old_procedure to become a call to a new procedure (new_procedure(arg1, arg2)) with the same arguments.
Now I know the question seems pretty stupid, but the trick is that I am not allowed to change the code or the bad_function. So the only thing I can do is create something external that reads the code flow, and whenever it finds the bad_function, replaces it with the new_function. They have a void return type, so I don't have to worry about return values.
I am using Perl. If someone knows how to at least start in this direction, please comment or answer. It would be nice if the new code could be done in Perl or C, but other well-known languages are good too: C++, Java.
EDIT: The code is written in shell script and Perl. I cannot edit the code and I don't have the location of the old_function; I mean, I can find it, but it's really tough. So I can use the package approach pointed out, but if there is a way around it I'd prefer that, so that I could parse the code that calls the function and replace the function calls. Please don't remove tags, as I need suggestions from Java and C++ experts as well.
EDIT: #mirod
So I tried it out, and your answer makes a new subroutine, and now there is no way of accessing the old one. I had created a variable whose value decides which way to go (old_sub or new_sub). Is there a way to add that variable to the new code, so that control is sent back to the old_function when it is not set?
like:
use BadPackage; # sub is defined there
BEGIN
{ package BadPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
# check for the variable and send to old_sub if the var is not set
sub bad_sub
{ # good code
}
}
# Thanks #mirod
This is easier to do in Perl than in a lot of other languages, but that doesn't mean it's easy, and I don't know if it's what you want to hear. Here's a proof-of-concept:
Let's take some broken code:
# file name: Some/Package.pm
package Some::Package;
use base 'Exporter';
our @EXPORT = qw(forty_two nineteen);
sub forty_two { 19 }   # the bug: should return 42
sub nineteen { 19 }
1;
# file name: main.pl
use Some::Package;
print "forty-two plus nineteen is ", forty_two() + nineteen();
Running the program perl main.pl produces the output:
forty-two plus nineteen is 38
It is given that the files Some/Package.pm and main.pl are broken and immutable. How can we fix their behavior?
One way we can insert arbitrary code into a perl command is with the -M command-line switch. Let's make a repair module:
# file: MyRepairs.pm
CHECK {
no warnings 'redefine';
*forty_two = *Some::Package::forty_two = sub { 42 };
};
1;
Now running the program perl -MMyRepairs main.pl produces:
forty-two plus nineteen is 61
Our repair module uses a CHECK block to execute code in between the compile-time and run-time phase. We want our code to be the last code run at compile-time so it will overwrite some functions that have already been loaded. The -M command-line switch will run our code first, so the CHECK block delays execution of our repairs until all the other compile time code is run. See perlmod for more details.
This solution is fragile. It can't do much about modules loaded at run-time (with require ... or eval "use ...", which are common) or subroutines defined in other CHECK blocks (these are rare).
If we assume the shell script that runs main.pl is also immutable (i.e., we're not allowed to change perl main.pl to perl -MMyRepairs main.pl), then we move up one level and pass the -MMyRepairs in the PERL5OPT environment variable:
PERL5OPT="-I/path/to/MyRepairs -MMyRepairs" bash the_immutable_script_that_calls_main_pl.sh
These are called automated refactoring tools and are common for other languages. For Perl, though, you may well be in a really bad way, because parsing Perl to find all the references is going to be virtually impossible.
Where is the old procedure defined?
If it is defined in a package, you can switch to the package, after it has been used, and redefine the sub:
use BadPackage; # sub is defined there
BEGIN
{ package BadPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
sub bad_sub
{ # good code
}
}
If the code is in the same package but in a different file (loaded through a require), you can do the same thing without having to switch package.
If all the code is in the same file, then change it:
sed -i 's/old_procedure/new_procedure/g' codefile
Is this what you mean?
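As for the edit above (falling back to the original sub when a flag is not set): one option is to keep a reference to the original before redefining it. A sketch only; $BadPackage::use_new is a hypothetical flag name:
use BadPackage;   # the original bad_sub is defined there
BEGIN
{
    no warnings 'redefine';
    my $old_sub = \&BadPackage::bad_sub;   # grab the original first
    *BadPackage::bad_sub = sub {
        # fall back to the original unless the switch is set
        goto &$old_sub unless $BadPackage::use_new;
        # ... good code goes here ...
    };
}
Note that if bad_sub was already exported into other packages before this runs, those imported aliases still point at the original, so they may need the same glob treatment as in the -MMyRepairs example above.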

Where should I put common utility functions for Perl .t tests?

I am getting started with Test::More, already have a few .t test scripts. Now I'd like to define a function that will only be used for the tests, but across different .t files. Where's the best place to put such a function? Define another .t without any tests and require it where needed? (As a sidenote I use the module structure created by Module::Starter)
The best approach is to put your test functions, like any other set of functions, into a module. You can then use Test::Builder to have your test diagnostics/fail messages act as if the failure originated from the .t file, rather than your module.
Here is a simple example.
package Test::YourModule;
use Test::Builder;
use Sub::Exporter -setup => { exports => ['exitcode_ok'] }; # or 'use Exporter' etc.
my $Test = Test::Builder->new;
# Runs the command and makes sure its exit code is $expected_code. Contrived!
sub exitcode_ok {
my ($command, $expected_code, $name) = @_;
system($command);
my $exit = $? >> 8;
my $message = $!;
my $ok = $Test->is_num( $exit, $expected_code, $name );
if ( !$ok ) {
$Test->diag("$command exited incorrectly with the error '$message'");
}
return $ok;
}
In your script:
use Test::More tests => 1;
use Test::YourModule qw(exitcode_ok);
exitcode_ok('date', 0, 'date exits without errors');
Write a module as rjh has demonstrated. Put it in t/lib/Test/YourThing.pm, then it can be loaded as:
use lib 't/lib';
use Test::YourThing;
Or you can put it straight in t/Test/YourThing.pm, call it package t::Test::YourThing and load it as:
use t::Test::YourThing;
The upside is not having to write the use lib line in every test file, and clearly identifying it as a local test module. The down side is cluttering up t/, it won't work if "." is not in @INC (for example, if you run your tests in taint mode, but it can be worked around with use lib "."), and if you decide to move the .pm file out of your project you have to rewrite all the uses. Your choice.