Overriding a module that is used by a program I'm testing - perl

I am revising a Perl program and I wanted a test harness that could run the original version of the program (call it launch_rockets.pl) and collect the standard output, but somehow skip the system calls that occur inside launch_rockets.pl. The following code successfully overrides system inside launch_rockets.pl:
use subs qw(system);
my $SYSTEM_SUCCESS = 0;
sub system {
print "***\n";
print "system #_\n";
print "***\n\n";
return $SYSTEM_SUCCESS;
}
local #ARGV = #test_args;
do 'launch_rockets.pl';
So far so good. But launch_rockets.pl also contains
use Proc::Background;
and later
Proc::Background->new('perl', 'launch_missiles.pl');
I could copy launch_rockets.pl into a sandbox where Proc::Background is replaced by a stub, but I was wondering if there was any override strategy that would be effective inside a do FILE call in the file's original environment.

use lib '/my/test/library/path';
lib prepends the directory to #INC, so /my/test/library/path/Proc/Background.pm will be the file that gets loaded. Put whatever code you want in there.
Another alternative would be:
{
package Proc::Background;
... # Put stub code here
} # end of package Proc::Background
$INC{'Proc/Background.pm'} = 1; # Make Perl think Proc::Background is loaded

Related

Perl Unit Testing -- is the subroutine testable?

I have been reading up and exploring on the concept of unit testing and test driven development in Perl. I'm looking into how I can incorporate the testing concepts into my development. Say I have a Perl subroutine here:
sub perforce_filelist {
my ($date) = #_;
my $path = "//depot/project/design/...module.sv";
my $p4cmd = "p4 files -e $path\#$date,\#now";
my #filelist = `$p4cmd`;
if (#filelist) {
chomp #filelist;
return #filelist;
}
else {
print "No new files!"
exit 1;
}
}
The subroutine executes a Perforce command and stores the output of that command (which is a list of files) in to the #filelist array. Is this subroutine testable? Would testing if the returned #filelist is empty useful? I am trying to teach myself how to think like a unit test developer.
There are a couple of things that make testing that perforce_filelist subroutine more difficult than it needs to be:
The p4 path is hard-coded
The p4 command is constructed inside the subroutine
The p4 command is fixed (so, it's always the first p4 in the path)
You output directly from the subroutine
You exit from inside the subroutine
But, your subroutine's responsibility is to get a filelist and return it. Anything you do outside of that makes it harder to test. If you can't change this because you don't have control of that, you can write stuff like this in the future:
#!perl -T
# Now perforce_filelist doesn't have responsibility for
# application logic unrelated to the file list
my #new_files = perforce_filelist( $path, $date );
unless( #new_files ) {
print "No new files!"; # but also maybe "Illegal command", etc
exit 1;
}
# Now it's much simpler to see if it's doing it's job, and
# people can make their own decisions about what to do with
# no new files.
sub perforce_filelist {
my ($path, $date) = #_;
my #filelist = get_p4_files( $path, $date );
}
# Inside testing, you can mock this part to simulate
# both returning a list and returning nothing. You
# get to do this without actually running perforce.
#
# You can also test this part separately from everything
# else (so, not printing or exiting)
sub get_p4_files {
my ($path, $date) = #_;
my $command = make_p4_files_command( $path, $date );
return unless defined $command; # perhaps with some logging
my #files = `$command`;
chomp #files;
return #files;
}
# This is where you can scrub input data to untaint values that might
# not be right. You don't want to pass just anything to the shell.
sub make_p4_files_command {
my ($path, $date) = #_;
return unless ...; # validate $path and $date, perhaps with logging
p4() . " files -e $path\#$date,\#now";
}
# Inside testing, you can set a different command to fake
# output. If you are confident the p4 is working correctly,
# you can assume it is and simulate output with your own
# command. That way you don't hit a production resource.
sub p4 { $ENV{"PERFORCE_COMMAND"} // "p4" }
But, you also have to judge if this level of decomposition is worth it to you. For a personal tool that you use infrequently, it's probably too much work. For something you have to support and lots of people use, it might be worth it. In that case, you might want the official P4Perl API. Those value judgements are up to you. But, having decomposed the problem, making bigger changes (such as using P4Perl) shouldn't be as seismic.
As a side note and not something I'm recommending for this problem, this is the use case for the & and no argument list. In this "crypto context", the argument list to the subroutine is the #_ of the subroutine calling it.
These calls keep passing on the same arguments down the chain, which is annoying to type out and maintain:
my #new_files = perforce_filelist( $path, $date );
my #filelist = get_p4_files( $path, $date );
my $command = make_p4_files_command( $path, $date );
With the & and no argument list (not even ()), it passes on the #_ to the next level:
my #new_files = perforce_filelist( $path, $date );
my #filelist = &get_p4_files;
my $command = &make_p4_files_command;
Whether it's testable depends a lot on your environment. You need to ask yourself the following questions:
Does the code depend on a production Perforce installation?
Does running the code with random values interfere with production?
Does running the code with the same values over and over again always yield the same results?
Can the external dependency be unavailable sometimes?
Is the external dependency outside of the control of the test?
Some of those things make it very hard (yet not impossible) to run tests for it. Some can be overcome by refactoring the code a little.
It's also important to define what exactly you want to test. A unit test for the function would make sure that it returns the right thing depending on what you put in, but you control the external dependency. An integration test on the other hand would run the external dependency.
Building an integration test for this is easy, but all the questions I have mentioned above apply. And since you have an exit in your code, you cannot really trap that. You would have to put that function in a script and run that and check the exit codes, or use a module like Test::Exit.
You also need to have your Perforce set up in a way that you always get the same results. That might mean to have dates and files there that you control. I don't know how Perforce works, so I cannot tell you how to do that, but in general these things are called fixtures. It's data that you control. For a database your test program would install them before running the tests, so you have a reproducible result.
You also have output to STDOUT, so you need a tool to grab that, too. Test::Output can do that.
use Test::More;
use Test::Output;
use Test::Exit;
# do something to get your function into the test file...
# possibly install fixtures...
# we will fake the whole function for this demonstration
sub perforce_filelist {
my ($date) = #_;
if ( $date eq 'today' ) {
return qw/foo bar baz/;
}
else {
print "No new files!";
exit 1;
}
}
stdout_is(
sub {
is exit_code( sub { perforce_filelist('yesterday') } ),
1, "exits with 1 when there are no files";
},
"No new files!",
"... and it prints a message to the screen"
);
my #return_values;
stdout_is(
sub {
never_exits_ok(
sub {
#return_values = perforce_filelist('today');
},
"does not exit when there are files"
);
},
q{},
"... and there is no output to the screen"
);
is_deeply( \#return_values, [qw/foo bar baz/],
"... and returns a list of filenames without newlines" );
done_testing;
As you can see, this takes care of all of the things the function does with relative ease. We cover all the code, but we are depending on something external. So this is not a real unit test.
Writing a unit test can be done similarly. There is Test::Mock::Cmd to replace the backticks or qx{} with another function. This could be done manually without that module too. Look at the module's code if you want to know how.
use Test::More;
use Test::Output;
use Test::Exit;
# from doc, could be just 'return';
our $current_qx = sub { diag( explain( \#_ ) ); return; };
use Test::Mock::Cmd 'qx' => sub { $current_qx->(#_) };
# get the function in, I used yours verbatim ...
my $qx; # this will store the arguments and fake an empty result
stdout_is(
sub {
is(
exit_code(
sub {
local $current_qx = sub { $qx = \#_; return; };
perforce_filelist('yesterday');
}
),
1,
"exits with 1 when there are no files"
);
},
"No new files!",
"... and it prints a message to the screen"
);
is $qx->[0], 'p4 files -e //depot/project/design/...module.sv#yesterday,#now',
"... and calls p4 with the correct arguments";
my #return_values;
stdout_is(
sub {
never_exits_ok(
sub {
# we already tested the args to `` above,
# so no need to capture them now
local $current_qx = sub { return "foo\n", "bar\n", "baz\n"; };
#return_values = perforce_filelist('today');
},
"does not exit when there are files"
);
},
q{},
"... and there is no output to the screen"
);
is_deeply( \#return_values, [qw/foo bar baz/],
"... and returns a list of filenames without newlines" );
done_testing;
We now can verify directly that the correct command line has been called, but we do not have to bother with setting up the Perforce to actually have any files, which makes the test run faster and makes you independent. You can run this test on a machine that does not have Perforce installed, which is useful if that function is only a small part of your overall application, and you still want to run the full test suite when you're working on a different part of the app.
Let's take a quick look at the second example's output.
ok 1 - exits with 1 when there are no files
ok 2 - ... and it prints a message to the screen
ok 3 - ... and calls p4 with the correct arguments
ok 4 - does not exit when there are files
ok 5 - ... and there is no output to the screen
ok 6 - ... and returns a list of filenames without newlines
1..6
As you can see it's almost the same as from the first example. I also hardly had to change the tests. Just the mocking strategy was added.
It's important to remember that tests are also code, and the same level of quality should apply to them. They act as documentation of your business logic and as a safety net for you and your fellow developers (including future-you). Clear descriptions of the business case that you are testing is essential for that.
If you want to learn more about the strategy of testing with Perl, and what not to do, I recommend watching the talk Testing Lies by Curtis Poe.
You ask:
Is this subroutine testable?
Yes, it definitely is. However a question instantly arrives; are you doing Development Driven Testing or Test Driven Development? Let me illustrate the difference.
Your current case is that you have written a method earlier than the test, which should drive the development of this function.
If you are trying to follow the basic guidance of TDD, you should write your test case first. In this stage the outcome of your unit test will be red, since there are missing pieces to test for.
Then you write the method with a minimal bits and pieces to make it compile. Now complete the first test case with something that you are asserting from the method you are testing against. If you did it right, your test case is now green, indicating to you that you can now check to see if there are things to refactor.
This will give you the basic principle of TDD, that is; red, green and refactor.
Summarized, you can test and assert for at least two things in your method.
Asserting to see if the #filelist is returned and is not empty
Asserting the failure case when you return 1
Also make sure that you are unit testing without external dependencies, like a file system etc, because that would be integration testing, which is including other moving parts of the system in your test.
As a final note, as with everything, experience comes through trying and learning. Always ask, at least yourself, then your business peers, to see if you are testing the right thing and if it brings any business value to test that part of the system.

How to globally override method in Perl

I am forced to use old version of Apache2::Cookie class which contains bug in method fetch().
I tried to override method in startup script, but it doesn't work later in other modules than startup script:
local *Apache2::Cookie::fetch = sub { ... }
How to override method globally for all other modules?
As Sobrique pointed out in their comment, the local is definitely an issue. But not the only one.
You need to load the pacakge first before you do this. Perl will take the last definition of the sub, just like the last assigned value will be the value of a variable.
my $foo;
$foo = 1;
$foo = 3;
print $foo; # 3, obviously
The same goes for your overwriting.
*Apache2::Cookie::fetch = sub { ... }; # note ... is valid syntax
require 'foo';
$cookie = Apache2::Cookie->new; # not sure if that is correct
# in foo.pm
use Apache2::Cokie; # this will overwrite your implementation
Loading modules in Perl works with %INC, a super-global hash that keeps track of which files have been loaded. If you use a file twice, it only gets loaded and parsed the first time. The second time, it will only call import on the package.
So the trick is to load Apache2::Cookie first, so it's already parsed when the real code loads it again.
use Apache2::Cookie;
BEGIN {
*Apache2::Cookie::fetch = sub { ... };
}
require 'foo'; # or use, no matter
$cookie = Apache2::Cookie->new; # not sure if that is correct
# in foo.pm
use Apache2::Cokie; # now this won't overwrite your implementation
Now Perl already has the file loaded, and the package installed in its guts. Then you overwrite the sub. Then it gets not loaded again and your fix is still intact when it gets called.

Accessing subs from a require'd perl script

I'm going to import some perl code with the require statement. The code I'd like to import is in mylibA.pl:
#!/usr/bin/perl
package FOO::BAR;
sub routine {
print "A message!\n";
}
and mylibB.pl:
#!/usr/bin/perl
package FOO::BAZ;
sub routine {
print "Another message!\n";
}
Then I'm going to use it like this:
#!/usr/bin/perl
foreach my $lib (qw/ mylibA.pl mylibB.pl /){
require $lib;
print "Make a call to ${lib}'s &routine!\n";
}
Is there a way for my script to figure out the namespace that was pulled in with the require statement?
Wow. I have to say this is the one of the most interesting Perl questions I've seen in a while. On the surface this seems like a very simple request - get an included module's namespace, but there really is no way to do this. You can get it while in the package, but not from outside the package. I tried using EXPORT to send the local package name back to the caller script but that ended up going nowhere given the difference in how "use" and "require" work. A more module type of approach probably would have worked with a "use" statement, but the requirement that the required script be able to run by themselves prevented that approach. The only thing left to do was to directly pollute the caller's namespace and hope for the best (assume that the caller had no package namespace) - something that modules are designed to prevent.
BTW - I can't believe this actually works - in strict mode, no less.
caller.pl
#!/usr/bin/perl
use strict;
#package SomePackageName; #if you enable this then this will fail to work
our $ExportedPackageName;
print "Current package=".__PACKAGE__."\n";
foreach my $lib (qw/ mylibA.pl mylibB.pl /){
require $lib;
print "Make a call to ${lib}'s &routine!\n";
print "Package name exported=".$ExportedPackageName."\n";
$ExportedPackageName->routine;
} #end foreach
print "Normal Exit";
exit;
__END__
mylibA.pl
#!/usr/bin/perl
package FOO::BAR;
use strict;
#better hope the caller does not have a package namespace
$main::ExportedPackageName=__PACKAGE__;
sub routine {
print "A message from ".__PACKAGE__."!\n";
}
1;
mylibB.pl
#!/usr/bin/perl
package FOO::BAZ;
use strict;
#better hope the caller does not have a package namespace
$main::ExportedPackageName=__PACKAGE__;
sub routine {
print "Another message, this time from ".__PACKAGE__."!\n";
}
1;
Result:
c:\Perl>
c:\Perl>perl caller.pl
Current package=main
Make a call to mylibA.pl's &routine!
Package name exported=FOO::BAR
A message from FOO::BAR!
Make a call to mylibB.pl's &routine!
Package name exported=FOO::BAZ
Another message, this time from FOO::BAZ!
Normal Exit
Regarding the mostly academical problem of finding the package(s) in a perl source file:
You can try the CPAN module Module::Extract::Namespaces to get all packages within a perl file. It is using PPI and is thus not 100% perfect, but most of the time good enough:
perl -MModule::Extract::Namespaces -e 'warn join ",", Module::Extract::Namespaces->from_file(shift)' /path/to/foo.pm
But PPI can be slow for large files.
You can try to compare the active packages before and after the require. This is also not perfect, because if your perl library file loads additional modules then you cannot tell which is the package of the prinicipal file and what's loaded later. To get the list of packages you can use for example Devel::Symdump. Here's a sample script:
use Devel::Symdump;
my %before = map { ($_,1) } Devel::Symdump->rnew->packages;
require "/path/to/foo.pm";
my %after = map { ($_,1) } Devel::Symdump->rnew->packages;
delete $after{$_} for keys %before;
print join(",", keys %after), "\n";
You can also just parse the perl file for "package" declarations. Actually, that's what the PAUSE upload daemon is doing, so it's probably "good enough" for most cases. Look at the subroutine packages_per_pmfile in
https://github.com/andk/pause/blob/master/lib/PAUSE/pmfile.pm
There are two problems here:
How do I change the behaviour of a script when executed as a standalone and when used as a module?
How do I discover the package name of a piece of code I just compiled?
The general answer to question 2 is: You don't, as any compilation unit may contain an arbitrary number of packages.
Anyway, here are three possible solutions:
Name your modules so that you already know the name when you load it.
Have each module register itself at a central rendezvous point.
Like #1, but adds autodiscovery of your plugins.
The simplest solution is to put all of the API in an ordinary module, and put the standalone logic in a seperate script:
/the/location/
Module/
A.pm
B.pm
a-standalone.pl
b-standalone.pl
Where each standalone basically looks like
use Module::A;
Module::A->run();
If another script wants to reuse that code, it does
use lib "/the/location";
use Module::A;
...
If the loading happens on runtime, then Module::Runtime helps here:
use Module::Runtime 'use_module';
use lib "/the/location";
my $mod_a = use_module('Module::A');
$mod_a->run();
It isn't strictly necessary to place the contents of a-standalone.pl and Module/A.pm into separate files, although that is clearer. If you want to conditionally run code in a module only if it is used as a script, you can utilize the unless(caller) trick.
Of course all of this is tricksing: Here we determine the file name from the module name, not the other way round – which as I already mentioned we cannot do.
What we can do is have each module register itself at a certain predefined location, e.g. by
Rendezvous::Point->register(__FILE__ => __PACKAGE__);
Of course the standalone version has to shield against the possibility that there is no Rendezvous::Point, therefore:
if (my $register = Rendezvous::Point->can("register")) {
$register->(__FILE__ => __PACKAGE__);
}
Eh, this is silly and violates DRY. So let's create a Rendezvous::Point module that takes care of this:
In /the/location/Rendezvous/Point.pm:
package Rendezvous::Point;
use strict; use warnings;
my %modules_by_filename;
sub get {
my ($class, $name) = #_;
$modules_by_filename{$name};
}
sub register {
my ($file, $package) = #_;
$modules_by_filename{$file} = $package;
}
sub import {
my ($class) = #_;
$class->register(caller());
}
Now, use Rendezvous::Point; registers the calling package, and the module name can be retrived by the absolute path.
The script that wants to use the various modules now does:
use "/the/location";
use Rendezvous::Point (); # avoid registering ourself
my $prefix = "/the/location";
for my $filename (map "$prefix/$_", qw(Module/A.pm Module/B.pm)) {
require $filename;
my $module = Rendezvous::Point->get($filename)
// die "$filename didn't register itself at the Rendezvous::Point";
$module->run();
}
Then there are fully featured plugin systems like Module::Pluggable. This system works by looking at all paths were Perl modules may reside, and loads them if they have a certain prefix. A solution with that would look like:
/the/location/
MyClass.pm
MyClass/
Plugin/
A.pm
B.pm
a-standalone.pl
b-standalone.pl
Everything is just like with the first solution: Standalone scripts look like
use lib "/the/location/";
use MyClass::Plugin::A;
MyClass::Plugin::A->run;
But MyClass.pm looks like:
package MyClass;
use Module::Pluggable require => 1; # we can now query plugins like MyClass->plugins
sub run {
# Woo, magic! Works with inner packages as well!
for my $plugin (MyClass->plugins) {
$plugin->run();
}
}
Of course, this still requires a specific naming scheme, but it auto-discovers possible plugins.
As mentioned before it is not possible to look up the namespace of a 'required' package without extra I/O, guessing or assuming.
Like Rick said before, one have to intrude the namespace of the caller or better 'main'. I prefer to inject specific hooks within a BEGIN block of the 'required' package.
#VENDOR/App/SocketServer/Protocol/NTP.pm
package VENDOR::App::SocketServer::Protocol::NTP;
BEGIN {
no warnings;
*main::HANDLE_REQUEST = \&HANDLE_REQUEST;
}
sub HANDLE_REQUEST {
}
#VENDOR/App/SocketServer.pm
my $userPackage= $ARGV[0];
require $userPackage;
main::HANDLE_REQUEST();
Instead of *main:: you can get more specific with *main::HOOKS::HANDLE_REQUESTS i.e. This enables you to resolve all injected hooks easily within the caller by iterating over the HOOK's namespace portion.
foreach my $hooks( keys %main::HOOKS ) {
}

How can I override Perl's open() function but use the same filehandle for testing?

I am currently adding some unit tests to some legacy code and I find myself with the need to override an open function. The live code looks something like this.
if ( !open( F, $filetoopen) ){
# do stuff with <F>
}
What I want to do is make sure that "F" contains a file handle that I have provided from my tests rather than what it thinks its opening.
I have the following code in my .t file...
BEGIN {
*CORE::GLOBAL::open = sub { open(F,$testfiletoopen); };
};
... it does work and the code in test finishes up reading from my test file. However it will only continue to work as long as I use the same filehandle name "F" as the code in test.
If there a way to make this test code less fragile so that if the filehandle name is changed in the live code the test won't fail?
Thanks
Why don't you simply use the parameters your live code provides to open?
BEGIN {
*CORE::GLOBAL::open = sub { open $_[0], $newfilename };
};
Keep in mind that this will break horribly as soon as you use the three-argument-form of open. If anything, this question offers yet more prove that the three-argument version is superior.

How can I access the Apache server configuration in a BEGIN block in mod_perl?

I've been trying to switch from using PerlSetEnv to using custom configuration directives. I have my configuration module with a copy of set_val from the docs:
sub set_val
{
local our ($key, $self, $parms, $arg) = #_;
$self->{$key} = $arg;
unless ($parms->path)
{
local our $srv_cfg = Apache2::Module::get_config($self, $parms->server);
$srv_cfg->{$key} = $arg;
}
}
...which is called by every custom directive sub. Then I have in my .conf:
PerlLoadModule MyModule::ServerConfig
MyCustomDirective 'hello'
This works fine in that httpd -t okays the file's syntax. The problem is that I can't seem to get at the value from the config file from within a BEGIN block, which I need to do.
I've tried tinkering with all sorts of things:
BEGIN
{
use Apache2::CmdParms ();
# use Apache2::Directive ();
use Apache2::Module ();
# use Apache2::ServerUtil ();
# use Apache2::RequestUtil ();
use Data::Dump;
warn ddx(Apache2::Module::get_config('MyModule::ServerConfig', Apache2::CmdParms->server));
# warn ddx(Apache2::Directive->as_hash);
# warn Apache2::ServerUtil->dir_config('MyCustomDirective);
# warn Apache2::CmdParms->server->server_hostname();
}
...but to no avail. Most of my efforts (trying to access CmdParms->server for instance) result in Parent: child process exited with status 3221225477 -- Restarting and an automatic restart of Apache as it says. If I pass ServerUtil->server to get_config(), the server stays alive but the warning only prints out '1'.
I read somewhere that this is because you can't get at anything request-related within a BEGIN block, because requests vary. It kind of makes sense, except that with PerlOptions +GlobalRequest I have been able to see $ENV within a BEGIN block, so why wouldn't I be able to see my own directives, just as dependent as they are on how the request happens? Especially confusing is that if I try to pass Apache2::RequestUtil->request->per\_dir\_config() to get_config(), it says Global $r object is not available. If that's true in a BEGIN block, how is it I can get at $ENV?
Try add what you want to import function to other module and use this module in code where you usually put BEGIN block. It should work same. May be it helps.
Partly, Dump isn't being used correctly. This works better:
use Data::Dump qw(pp);
warn pp(Apache2::Module::get_config('MyModule::ServerConfig', Apache2::ServerUtil->server));
However, it doesn't show any directives that appear within <Directory> blocks.
In my particular case, though, I don't need that functionality, on second thought; that just happens to be where I had stuck them.