Is Perl unit-testing only for modules, not programs? - perl

The docs I find around the ’net and the book I have, Perl Testing, either say or suggest that unit-testing for Perl is usually done when creating modules.
Is this true? Is there no way to unit-test actual programs using Test::More and cousins?

Of course you can test scripts using Test::More. It's just harder, because most scripts would need to be run as a separate process from which you capture the output, and then test it against expected output.
This is why modulinos (see chapter 17 in: brian d foy, Mastering Perl, second edition, O'Reilly, 2014) were developed. A modulino is a script that can also be used as a module. This makes it easier to test, as you can load the modulino into your test script and then test its functions like you would a regular module.
The key feature of a modulino is this:
#! /usr/bin/perl
package App::MyName; # put it in a package
run() unless caller; # Run program unless loaded as a module
sub run {
... # your program here
}
The function doesn't have to be called run; you could use main if you're a C programmer. You'd also normally have additional subroutines that run calls as needed.
Then your test scripts can use require "path/to/script" to load your modulino and exercise its functions. Since many scripts involve writing output, and it's often easier to print as you go instead of doing print sub_that_returns_big_string(), you may find Test::Output useful.

It's not an easiest way to test your code, but you can test the script directly. You can run the script with specific parameters using system (or better IPC::Run3), capture output and compare it with expected result.
But this will be a top level test. It'll be hard to tell which part of your code caused problem.
Unit-tests are used to test individual modules. This makes it easier to see where the problem came from. Also testing functions individually is much easier, because you only need to think about what can happen in smaller piece of code.
The way of testing is depend on your project size. You can, of cause, put everything into single file, but putting your code into module (or even splitting it into different modules) will give you benefits in future: code can be easier reused and tested.

Related

In Perl can I create a test in one file that only runs after a test in another file has run?

I am running test files on a Perl module I have written and I would like to have a condition where some tests are only run after others have been run successfully (but still keeping them in separate files).
I was looking at Test::Builder, but I don't think it caters for cross-file testing.
Just to explain why I want to do this; I have a test file for every subroutine in my module. Some of these subroutines are passed large hashes from other subroutines which are difficult to replicate for testing purposes.
So instead of spending a few hours trying to hard-code a testable hash, I would like to be passed one from the code like it would be, only after the subroutine where that hash has been generated is tested.
I hope that makes sense! I could write a script to just run the tests in a certain order, but thought that there may well already be a feature in a Perl testing module that I haven't seen. When it comes to using the module, I ideally want to be able to run the tests without having to fiddle about with the 'make test' bit.
The Perl test harness seems to run test files in alphabetical order. This is why so many CPAN distributions have test files that start with numbers - so that the author can control the order of test execution. That will break if, for example, someone runs those tests with prove -s.
In general a test file should be seen as a completely separate unit of testing. You should be able to run all of your tests in any order without any of them affecting any of the others. If two tests rely on each other then they should be in the same file.
You haven't explained why you're so keen for these tests to be in separate files. Perhaps that's the assumption that you should be questioning.

In Perl scripts, should we use shell commands or call Perl functions that imitate shell operations?

I want to know about the best practices here. Suppose I want to get the content of some line of a file. I can use a one-line shell command to get my answer, or write a subroutine, as shown in the code below.
A text file named some_text:
She laughed. Then both continued eating in silence, like strangers,
but after dinner they walked side by side; and there sprang up
between them the light jesting conversation of people who are free
and satisfied, to whom it does not matter where they go or what
they talk about.
Code to get content of line 5 of the file
#!perl
use warnings;
use strict;
my $file = "some_text";
my $lnum = 5;
my $shellcmd = "awk 'NR==$lnum' $file";
print qx($shellcmd);
print getSrcLine($file, $lnum);
sub getSrcLine {
my($file, $lnum) = #_;
open FILE, $file or die "$!";
my #ray = <FILE>;
return $ray[$lnum-1];
}
I ask this because I see a lot of Perl scripts where at some point, a shell command was called, while at some later point, the same task was done by a call to a (library or handwritten) function, for example, rm -rf versus File::Path::rmtree. I just want to make it consistent.
What is the recommended thing to do?
If there's a Perl function for the operation, Perl thinks you should use its version. However, you give an example of a Perl module providing a pure Perl way to do it. That's much different. There's no single answer (as in most things), so you have to decide for yourself what to do:
Does the pure Perl approach do it correctly? For example, File::Copy has some limitations because it makes some awkward decisions for the user, so many people think it's broken. See, for instance, File::Copy versus cp/mv.
Does pure Perl approach do it in an acceptable time? Sometimes the external program is orders of magnitude faster. Sometimes it's a lot slower.
External commands usually are portable within a family of systems (e.g. all linux-like systems) but probably not across families (e.g. Windows and linux). Your tolerance for that might affect your answer. Even if you think you are running the same command, the different flavors of unix-like systems might have different switches for the operations.
Passing complicated arguments—spaces, quotes, and special characters—to external commands can make you cry. You have to do a lot of fiddly work to make sure you're handling arguments correctly. Perl subroutines don't care though.
You have to pay much more attention to what you are doing when you are using the external command. If you just call rm, Perl is going to search through your PATH and use the first thing called rm. That doesn't mean it's the program you think it is. I write about this quite a bit in the "Secure Programming Techniques" in Mastering Perl.
If the pure Perl approach requires a module, especially if that module has many complicated dependencies, you might be in for dependency or distribution hell down the road.
Personally, I start with the pure Perl approach until it doesn't work for the situation.
For your particular examples, I'd use Perl. Shelling out to awk, which is a proto-Perl, is just odd. You should be able to do everything awk does right it Perl. If you have an awk program, you can convert it to Perl with the a2p program:
NR==5
a2p turns that into (modulo some setup bits at the start):
while (<>) {
print $_ if $. == 5;
}
Notice that it still scans the entire file even though you have the fifth line. However, you can use the translated program as a start:
while (<>) {
if( $. == 5 ) {
print;
last;
}
}
I don't think you should shell out to some other program to avoid that Perl code.
To remove a directory tree, I like File::Path. It has some dependencies, but they are all in the Perl Standard Library. There's very little pain, if any, associated with that module. I'd use it until I ran into a problem where it didn't work.
If you want your app to be portable to non-unix systems, then definitely code everything in Perl.
If not, it's really up to you... creating a new process is slower, but if it's not important for the task then it doesn't matter. Personally I would pick the solution which I can quicker implement.
It seems to me that code that works should be the first priority. Yours fails if the file name has a space in it, for example.
Using the shell makes it harder to code correctly since your program needs to properly generate another program to be run by sh. (This problem goes away if you use the multi-arg version of system to avoid the shell.)
Furthermore, using external tools can make it hard to handle errors. You didn't even attempt to do so!
On the flip side, there are multiple reasons for using external tools. For example, Perl doesn't provide as good an file copy utility as cp; using the sort tool allows you to sort arbitrary large files with limited RAM; etc.

Is it okay to use modules from within subroutines?

Recently I start playing with OO Perl and I've been creating quite a bunch of new objects for a new project that I'm working on. As I'm unfamilliar with any best practice regarding OO Perl and we're kind in a tight rush to get it done :P
I'm putting a lot of this kind of code into each of my function:
sub funcx{
use ObjectX; # i don't declare this on top of the pm file
# but inside the function itself
my $obj = new ObjectX;
}
I was wondering if this will cause any negative impact versus putting on the use Object line on top of the Perl modules outside of any function scope.
I was doing this so that I feel it's cleaner in case I need to shift the function around.
And the other thing that I have noticed is that when I try to run a test.pl script on the unix server itself which test my objects, it slow as heck. But when the same code are run through CGI which is connected to an apache server, the web page doesn't load as slowly.
Where to put use?
use occurs at compile time, so it doesn't matter where you put it. At least from a purely pragmatic, 'will it work', point of view. Because it happens at compile time use will always be executed, even if you put it in a conditional. Never do this: if( $foo eq 'foo' ) { use SomeModule }
In my experience, it is best to put all your use statements at the top of the file. It makes it easy to see what is being loaded and what your dependencies are.
Update:
As brian d foy points out, things compiled before the use statement will not be affected by it. So, the location can matter. For a typical module, location does not matter, however, if it does things that affect compilation (for example it imports functions that have prototypes), the location could matter.
Also, Chas Owens points out that it can affect compilation. Modules that are designed to alter compilation are called pragmas. Pragmas are, by convention, given names in all lower-case. These effects apply only within the scope where the module is used. Chas uses the integer pragma as an example in his answer. You can also disable a pragma or module over a limited scope with the keyword no.
use strict;
use warnings;
my $foo;
print $foo; # Generates a warning
{ no warnings 'unitialized`; # turn off warnings for working with uninitialized values.
print $foo; # No warning here
}
print $foo; # Generates a warning
Indirect object syntax
In your example code you have my $obj = new ObjectX;. This is called indirect object syntax, and it is best avoided as it can lead to obscure bugs. It is better to use this form:
my $obj = ObjectX->new;
Why is your test script slow on the server?
There is no way to tell with the info you have provided.
But the easy way to find out is to profile your code and see where the time is being consumed. NYTProf is another popular profiling tool you may want to check out.
Best practices
Check out Perl Best Practices, and the quick reference card. This page has a nice run down of Damian Conway's OOP advice from PBP.
Also, you may wish to consider using Moose. If the long script startup time is acceptable in your usage, then Moose is a huge win.
question 1
It depends on what the module does. If it has lexical effects, then it will only affect the scope it is used in:
my $x;
{
use integer;
$x = 5/2; #$x is now 2
}
my $y = 5/2; #$y is now 2.5
If it is a normal module then it makes no difference where you use it, but it is common to use all of those modules at the top of the program.
question 2
Things that can affect the speed of a program between machines
speed of the processor
version of modules installed (some modules have XS versions that are much faster)
version of Perl
number of entries in PERL5LIB
speed of the drive
daotoad and Chas. Owens already answered the part of your question pertaining to the position of use statements. Let me remark on something else here:
I was doing this so that I feel it's
cleaner in case I need to shift the
function around.
Personally, I find it much cleaner to have all the used modules in one place at the top of the file. You won't have to search for use statements to see what other modules are being used and a quick glance will tell you what is being used and even what is not being used.
Regarding your performance problem: with Apache and mod_perl the Perl interpreter will have to parse and compile your used modules only once. The next time the script is run, execution should be much faster. On the command line, however, a second run doesn't get this benefit.

How can I use Perl to test C programs?

I'm looking for some tutorials showing how I could test C programs by writing Perl programs to automate testing.
Basically I want to learn automation testing with Perl programs. Can anyone kindly share such tutorials or any experiences of yours which can help me kick-start this process?
Perl tests usually use TAP. There are a several C libraries for TAP. Watch this Perl testing presentation.
If you want to start learning how to use Perl to test external programs, start with learning to use Perl to test Perl bits. The Test::More module is a good place to start. Once you understand that, look at all of the other Test::* modules on CPAN to see if one of those modules does the sort of thing you need to do.
If you have a specific question, ask about that. This question is really too broad for anyone to provide a useful answer.
You should look at Perl Testing: A Developer's Notebook by chromatic and Ian Langworth.
I keep meaning to buy a copy but as yet I've just skimmed it at perlmongers meetings. But it seems to be spot on what you're looking for.
UPDATE:
Hm, and this shows I should read the question - testing C programs with Perl, not testing Perl programs with Perl.
The book may still be useful (in that you should probably be writing test scripts and using Test::More and friends) but you will need to write a set of Perl functions to control your C if you take that approach. Basically
sub run_my_c_program {
my #args=#_;
#Set up test environment according to #args
system "my-c-program";
# Turns restults into a $rv data structure
return $rv;
}
and then check $rv in the same way as a normal Perl test:
is_deeply(run_my_c_program(...),
{ .. what I think it returns ..},
".. description of what I'm testing ..");
I don't have a tutorial but I used to be on a test team that tested a C++ compiler. The test harness (home brewed) we used was written in perl and it worked for us for many years. Perl was ideal because we could easily use it to invoke the tools to build the program, capture the output for later insertion into our test logs (using the "backticks" to run the compiler. For example: $compilerOutput = `cl -?`;
), then, if the build was successful, run the test programs and capture their output, again for insertion into our test logs.
Other benefits we got from using Perl:
For pass/fail detection, Perl's regular expression support was helpful.
Our program (the compiler) was ported during the course of its life to many different hardware architectures. Since Perl source was available and in C, we could get perl running on a new architecture as soon as a C compiler was available (which is usually the first language implemented in a new environment after an assembler).
Lots of documentation, books, help available.
-Ron
I wrote an article about Testing C with Libtap that uses Perl's Test::Harness to test C programs. Here's an example project: https://github.com/stig/libggtl/tree/master/t -- might not build any more, as the esoteric build system probably doesn't exist, but you should be able to figure out how it works :-)

What's the best way to have two modules which use functions from one another in Perl?

Unfortunately, I'm a totally noob when it comes to creating packages, exporting, etc in Perl. I tried reading some of the modules and often found myself dozing off from the long chapters. It would be helpful if I can find what I need to understand in just one simple webpage without the need to scroll down. :P
Basically I have two modules, A & B, and A will use some function off from B and B will use some functions off from A. I get a tons of warning about function redefined when I try to compile via perl -c.
Is there a way to do this properly? Or is my design retarded? If so what would be a better way? As the reason I did this is to avoid copy n pasting the other module functions again into this module and renaming them.
It's not really good practice to have circular dependencies. I'd advise factoring something or another to a third module so you can have A depends on B, A depends on C, B depends on C.
So... the suggestion to factor out common code into another module is
a good one. But, you shouldn't name modules *.pl, and you shouldn't
load them by require-ing a certain pathname (as in require
"../lib/foo.pl";). (For one thing, saying '..' makes your script
depend on being executed from the same working directory every time.
So your script may work when you run it as perl foo.pl, but it won't
work when you run it as perl YourApp/foo.pl. That is generally not good.)
Let's say your app is called YourApp. You should build your
application as a set of modules that live in a lib/ directory. For
example, here is a "Foo" module; its filename is lib/YourApp/Foo.pm.
package YourApp::Foo;
use strict;
sub do_something {
# code goes here
}
Now, let's say you have a module called "Bar" that depends on "Foo".
You just make lib/YourApp/Bar.pm and say:
package YourApp::Bar;
use strict;
use YourApp::Foo;
sub do_something_else {
return YourApp::Foo::do_something() + 1;
}
(As an advanced exercise, you can use Sub::Exporter or Exporter to
make use YourApp::Foo install subroutines in the consuming package's
namespace, so that you don't have to write YourApp::Foo:: before
everything.)
Anyway, you build your whole app like this. Logical pieces of
functionally should be grouped together in modules (or even better,
classes).
To make all this run, you write a small script that looks like this (I
put these in bin/, so let's call it bin/yourapp.pl):
#!/usr/bin/env perl
use strict;
use warnings;
use feature ':5.10';
use FindBin qw($Bin);
use lib "$Bin/../lib";
use YourApp;
YourApp::run(#ARGV);
The key here is that none of your code is outside of modules, except a
tiny bit of boilerplate to start your app running. This is easy to
maintain, and more importantly, it makes it easy to write automated
tests. Instead of running something from the command-line, you can
just call a function with some values.
Anyway, this is probably off-topic now. But I think it's important
to know.
The simple answer is to not test compile modules with perl -c... use perl -e'use Module'
or perl -e0 -MModule instead.
perl -c is designed for doing a test compile of a script, not a module. When you run it
on one of your
When recursively using modules, the key point is to make sure anything externally referenced is set up early. Usually this means at least making use #ISA be set in a compile time construct (in BEGIN{} or via "use parent" or the deprecated "use base") and #EXPORT and friends be set in BEGIN{}.
The basic problem is that if module Foo uses module Bar (which uses Foo), compilation of Foo stops right at that point until Bar is fully compiled and it's mainline code has executed. Making sure that whatever parts of Foo Bar's compile and run-of-mainline-code
need are there is the answer.
(In many cases, you can sensibly separate out the functionality into more modules and break the recursion. This is best of all.)