Evaluating the success/failure of a subroutine - perl

There's something quite unclear to me about subs return value.
I like to test my modules, sub by sub, and check whether they issue the correct return value or the correct exception if the case arise.
For example, let's say I have the following code (X::Argument::BadFormat is an exception handler derived from Exception::Class):
package My::Module;
use strict;
use warnings;
sub new{#does things unrelated to the current question}
sub my_sub {
my ($self,$possible_value) = #_;
if ($possible_value =~ q{\w}) { #Affect value to current object
$self->{field} = $possible_value;
}else{ #throw an exception
X::Argument::BadFormat->throw(
arg => 'possible_value',
expected_format => 'something that looks like a word',
received_value => $possible_value,
);
}
}
In the test file, I will run tests such as:
my $object = My::Module->new();
throws_ok(sub {$object->my_sub('*')}, 'X::Argument::BadFormat', 'Faulty value will raise an exception');
ok($object->my_sub('turlututu'));
It is easy to test when:
the sub returns a value,
the test conditions must raise an exception,
However, when I just set the value of a field in the current object, I have no reason to return anything.
In that case:
is the simple execution of the code enough to evaluate the sub output as "true" ?
Shall I add an explicit "return 1;" ?
does the sub actually return the last evaluation, in this case the sucess of the
test in the "if"? Something else I did not think about but which is
obvious to everybody?

In this case, I'd just check to ensure that the object's attribute was set correctly. That's all this particular sub does. If it's set ok, the sub ended correctly. If it wasn't set, something went wrong before the sub ended.
my $p='blah';
$obj->my_sub($p);
is $obj->{field}, $p, "my_sub() set the field attr ok";
It would be better if the field attribute had a getter so you're not breaking encapsulation, but I digress.

A sub that has no need to return a value should end with
return;
In your case, without it, you will be returning the value of $possible_value, which is the last thing executed. This doesn't look like a useful thing to return.
Assuming you add the explicit return:
Your throws_ok test looks fine. You should then test that the field was correctly set. Your ok test isn't needed, since your sub won't be returning anything.

Perl returns the result of the last executed code by default.
For example:
print main();
sub main {
my $var = 9 * 7;
}
print will output 63. If your code may be affected by the output of a given subroutine, then you need to set a return value (it's generally considered a best practice to always set an explicit return at the end of a subroutine/method).
print main();
sub main {
my $var = 9 * 7;
return;
}
print will output nothing.
Personally, I always try to set a return value depending on the context of what the subroutine will be returning to, but if you're writing code other people will be using, then it's generally safest to just do return;.
An additional explanation from Perl::Critic (link to the specific policy):
Subroutine "main" does not end with "return" at line 8, near 'sub main {'.
Subroutines::RequireFinalReturn (Severity: 4)
Require all subroutines to terminate explicitly with one of the
following: return',carp', croak',die', exec',exit', goto', or
throw'.
Subroutines without explicit return statements at their ends can be
confusing. It can be challenging to deduce what the return value will
be.
Furthermore, if the programmer did not mean for there to be a
significant return value, and omits a return statement, some of the
subroutine's inner data can leak to the outside. Consider this case:
package Password;
# every time the user guesses the password wrong, its value
# is rotated by one character
my $password;
sub set_password {
$password = shift;
}
sub check_password {
my $guess = shift;
if ($guess eq $password) {
unlock_secrets();
} else {
$password = (substr $password, 1).(substr $password, 0, 1);
}
}
1;
In this case, the last statement in check_password() is the assignment.
The result of that assignment is the implicit return value, so a wrong
guess returns the right password! Adding a `return;' at the end of that
subroutine solves the problem.
The only exception allowed is an empty subroutine.
Be careful when fixing problems identified by this Policy; don't blindly
put a `return;' statement at the end of every subroutine.

Related

Perl sub returns a subroutine

I haven't used Perl for around 20 years, and this is confusing me. I've g******d for it, but I obviously haven't used a suitable search string because I haven't found anything relating to this...
Why would I want to do the following? I understand what it's doing, but the "why" escapes me. Why not just return 0 or 1 to begin with?
I'm working on some code where a sub uses "return sub"; here's a very truncated example e.g.
sub test1 {
$a = shift #_;
if ($a eq "cat") {
return sub {
print("cat test OK\n");
return 0;
}
}
# default if "cat" wasn't the argument
return sub {
print("test for cat did not work\n");
return 1;
}
}
$c = test1("cat");
print ("received: $c\n");
print ("value is: ",&$c,"\n");
$c = test1("bat");
print ("received: $c\n");
print ("value is: ",&$c,"\n");
In your code there is no reason to return a sub. However, with a little tweak
sub test1 {
my $animal = shift #_;
if ($animal eq "cat" || $animal eq "dog") {
return sub {
print("$animal test OK\n");
return 0;
};
}
# default if "cat" or "dog" wasn't the argument
return sub {
print("test for cat or dog did not work\n");
return 1;
};
}
We now have a closure around $animal this saves memory as the test for cat and dog share the same code. Note that this only works with my variables. Also note that $a and $b are slightly special to Perl, they are used in the block of code that you can pass to the sort function and bypass some of the checks on visibility so it's best to avoid them for anything except sort.
You probably want to search "perl closures".
There are many reasons that you'd want to return a code reference, but it's not something I can shortly answer in a StackOverflow question. Mark Jason Dominus's Higher Order Perl is a good way to expand your mind, and we cover a little of that in Intermediate Perl.
I wrote File::Find::Closures as a way to demonstrate this is class. Each subroutine in that module returns two code references—one for the callback to File::Find and the other as a way to access the results. The two share a common variable which nothing else can access.
Notice in your case, you aren't merely calling a subroutine to "get a zero". It's doing other things. Even in your simple example there's some output. Some behavior is then deferred until you actually use the result for something.
Having said that, we have no chance of understanding why the programmer who wrote your particular code did it. One plausible guess was that the system was set up for more complex situations and you're looking at a trivial example that fits into that. Another plausible guess was that the programmer just learned that feature and fell in love with it then used it everywhere for a year. There's always a reason, but that doesn't mean there's always a good reason.

Not enough arguments when redefining a subroutine

When I redefine my own subroutine (and not a Perl built-in function), as below :
perl -ce 'sub a($$$){} sub b {a(#_)}'
I get this error :
Not enough arguments for main::a at -e line 1, near "#_)"
I'm wondering why.
Edit :
The word "redefine" is maybe not well chosen. But in my case (and I probably should have explained what I was trying to do originally), I want to redefine (and here "redefine" makes sense) the Test::More::is function by printing first Date and Time before the test result.
Here's what I've done :
Test::More.pm :
sub is ($$;$) {
my $tb = Test::More->builder;
return $tb->is_eq(#_);
}
MyModule.pm :
sub is ($$;$) {
my $t = gmtime(time);
my $date = $t->ymd('/').' '.$t->hms.' ';
print($date);
Test::More::is(#_);
}
The prototype that you have given your subroutine (copied from Test::More::is) says that your subroutine requires two mandatory parameters and one optional one. Passing in a single array will not satisfy that prototype - it is seen as a single parameter which will be evaluated in scalar context.
The fix is to retrieve the two (or three) parameters passed to your subroutine and to pass them, individually, to Test::More::is.
sub is ($$;$) {
my ($got, $expected, $test_name) = #_;
my $t = gmtime(time);
my $date = $t->ymd('/').' '.$t->hms.' ';
print($date);
Test::More::is($got, $expected, $test_name);
}
The problem has nothing to do with your use of a prototype or the fact that you are redefining a subroutine (which, strictly, you aren't as the two subroutines are in different packages) but it's because Test::More::is() has a prototype.
You are not redefining anything here.
You've set a prototype for your sub a by saying sub a($$$). The dollar signs in the function definition tell Perl that this sub has exactly three scalar parameters. When you call it with a(#_), Perl doesn't know how many elements will be in that list, thus it doesn't know how many arguments the call will have, and fails at compile time.
Don't mess with prototypes. You probably don't need them.
Instead, if you know your sub will need three arguments, explicitly grab them where you call it.
sub a($$$) {
...
}
sub b {
my ($one, $two, $three) = #_;
a($one, $two, $three);
}
Or better, don't use the prototype at all.
Also, a and b are terrible names. Don't use them.
In Perl, prototypes don't validate arguments so much as alter parsing rules. $$;$ means the sub expects the caller to match is(EXPR, EXPR) or is(EXPR, EXPR, EXPR).
In this case, bypassing the prototype is ideal.
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
return &Test::More::is(#_);
}
Since you don't care if Test::More::is modifies yours #_, the following is a simple optimization:
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
return &Test::More::is;
}
If Test::More::is uses caller, you'll find the following useful:
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
goto &Test::More::is;
}

Perl: "Variable will not stay shared"

I looked up a few answers dealing with this warning, but neither did they help me, nor do I truly understand what Perl is doing here at all. Here's what I WANT it to do:
sub outerSub {
my $dom = someBigDOM;
...
my $otherVar = innerSub();
return $otherVar;
sub innerSub {
my $resultVar = doStuffWith($dom);
return $resultVar;
}
}
So basically, I have a big DOM object stored in $dom that I don't want to pass along on the stack if possible. In outerSub, stuff is happening that needs the results from innerSub. innerSub needs access to $dom. When I do this, I get this warning "Variable $dom will not stay shared".
What I don't understand:
Does this warning concern me here? Will my intended logic work here or will there be strange things happening?
If it doesn't work as intended: is it possible to do that? To make a local var visible to a nested sub? Or is it better to just pass it as a parameter? Or is it better to declare an "our" variable?
If I push it as a parameter, will the whole object with all its data (may have several MB) be pushed on the stack? Or can I just pass something like a reference? Or is Perl handling that parameter as a reference all by itself?
In "Variable $foo will not stay shared" Warning/Error in Perl While Calling Subroutine, someone talks about an anonymous sub that will make this possible. I did not understand how that works, never used anything like that.
I do not understand that explanation at all (maybe cause English is not my first language): "When the inner subroutine is called, it will see the value of the outer subroutine's variable as it was before and during the first call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable.":
What does "the first call to the outer subroutine is complete? mean"
I mean: first I call the outer sub. The outer sub calls the inner sub. The outer sub is of course still running. Once the outer sub is complete, the inner sub will be finished as well. Then how does any of this still apply when the inner sub is already finished? And what about the "first" call? When is the "second" call happening... sorry, this explanation confuses me to no end.
Sorry for the many questions. Maybe someone can at least answer some of them.
In brief, the second and later times outerSub is called will have a different $dom variable than the one used by innerSub. You can fix this by doing this:
{
my $dom;
sub outerSub {
$dom = ...
... innerSub() ...
}
sub innerSub {
...
}
}
or by doing this:
sub outerSub {
my $dom = ...
*innerSub = sub {
...
};
... innerSub() ...
}
or this:
sub outerSub {
my $dom = ...
my $innerSub = sub {
...
};
... $innerSub->() ...
}
All the variables are originally preallocated, and innerSub and outerSub share the same $dom. When you leave a scope, perl goes through the lexical variables that were declared in the scope and reinitializes them. So at the point that the first call to outerSub is completed, it gets a new $dom. Because named subs are global things, though, innerSub isn't affected by this, and keeps referring to the old $dom. So if outerSub is called a second time, its $dom and innerSub's $dom are in fact separate variables.
So either moving the declaration out of outerSub or using an anonymous sub (which gets freshly bound to the lexical environment at runtime) fixed the problem.
You need to have an anonymous subroutine to capture variables:
my $innerSub = sub {
my $resultVar = doStuffWith($dom);
return $resultVar;
};
Example:
sub test {
my $s = shift;
my $f = sub {
return $s x 2;
};
print $f->(), "\n";
$s = "543";
print $f->(), "\n";
}
test("a1b");
Gives:
a1ba1b
543543
If you want to minimize the amount of size passing parameters to subs, use Perl references. The drawback / feature is that the sub could change the referenced param contents.
my $dom = someBigDOM;
my $resultVar = doStuffWith(\$dom);
sub doStuffWith {
my $dom_reference = shift;
my $dom_contents = $$dom_reference;
#...
}
Following http://www.foo.be/docs/perl/cookbook/ch10_17.htm , you should define a local GLOB as follows :
local *innerSub = sub {
...
}
#You can call this sub without ->
innerSub( ... )
Note that even if warning is displayed, the result stay the same as it should be expected : variables that are not defined in the inner sub are modified in the outer sub scope. I cannot see what this warning is about.

'Goto undefined subroutine &main::1' writing a simple Perl debugger

I'm trying to write a simple Perl debugger and I'm running into the following problem.
I'm running the following code as a debugger:
{
package DB;
sub DB { }
sub sub
{
&$sub;
# this is what produces the problem
$i = 1*1;
}
}
1;
I'm loading this in by setting the PERL5DB environment variable - e.g.:
export PERL5DB="BEGIN { require './debugger/tracer.pl'; }
Given this simple little Perl script:
#!/usr/bin/env perl
use Getopt::Long;
print "hello world";
I'm running the script as:
perl -d test.pl
When run, it generates the following error:
$ perl -d test.pl
Goto undefined subroutine &main::1 at /home/vagrant/perl5/perlbrew/perls/perl-5.16.0/lib/site_perl/5.16.0/Exporter.pm line 25.
BEGIN failed--compilation aborted at test.pl line 6.
I've isolated the problem to anything that is run after the &$sub; call in sub in the debugger. This problem is happening with certain packages being included in the base Perl script - in this case, Getopt::Long, though I've also found the same result with IO::File.
My Perl is pretty rusty, particularly with respect to advanced topics like the debugger.
Can anyone help me understand how I can get code executing after the &$sub; call in sub in the debugger to place nicely with the packages that I'm importing?
Thanks!
When you leave a Perl subroutine without using an explicit return statement, Perl will return the value of the last statement in the subroutine.
In particular, this means that if you have a subroutine that calls another subroutine as its last statement, like this:
package DB {
sub sub {
warn "Hello from DB::sub, about to call $sub\n";
&$sub;
}
}
then the return value of the other subroutine called via &$sub will be passed to the original caller, just as if you'd done an explicit return &$sub.
However, if the &$sub call is not the last thing in your DB::sub subroutine, then Perl will just throw away its return value and instead return the value of you actual last statement — in this case $i = 1*1, which evaluates to the number 1.
Now, when you define a custom debugger like that, Perl will wrap every ordinary subroutine call with a call to your DB::sub subroutine. Thus, your code causes every subroutine call to return the number 1! It's hardly a surprise that this will break a lot of things very badly.
Specifically, based on your error message, it looks like something in the Exporter module (which is used by many other modules to export symbols to the caller's namespace) is calling a subroutine that should return a reference to another subroutine. But since, because of your debugger, it's actually returning 1, the following attempt to call the returned subroutine ends up trying to call a subroutine named 1 (which gets mapped to the main:: package because numeric symbol names are superglobal), which then fails.
But what if you really need to do something in your DB::sub after calling &$sub? Well, the workaround is to save the return value, like this:
package DB {
sub DB { }
sub sub {
warn "Hello from DB::sub, about to call $sub...\n";
# call &sub, save the return value in #rv
my #rv = (wantarray ? &$sub : scalar &$sub);
warn "Hello again from DB::sub, just called $sub and got #rv!\n";
# ...and return the saved return value
return (wantarray ? #rv : $rv[0]);
}
}
1;
(The code is slightly complicated by the fact that our DB::sub might be called in either list or scalar context, and we need to pass the appropriate context on to &$sub. The wantarray should take care of that, though.)
Adding on to the answer from Ilmari Karonen.
DB::sub can also be called in a no value (void) context, therefore the return handling needs to take this into account. Refer to the documentation in wantarray for more details.
The following code handles all three cases.
package DB {
sub DB { }
sub sub {
# call &sub, save the return value in #rv
my #rv;
if(defined(wantarray)) {
#rv = (wantarray ? &$sub : scalar &$sub);
}
else {
# wantarray is undef
&$sub;
}
# after invoking &$sub
# return #rv
if(defined(wantarray)) {
return (wantarray ? #rv : $rv[0]);
}
else {
return undef
}
}
}
1;

How can a Perl force its caller to return? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Is it possible for a Perl subroutine to force its caller to return?
I want to write a subroutine which causes the caller to return under certain conditions. This is meant to be used as a shortcut for validating input to a function. What I have so far is:
sub needs($$) {
my ($condition, $message) = #_;
if (not $condition) {
print "$message\n";
# would like to return from the *parent* here
}
return $condition;
}
sub run_find {
my $arg = shift #_;
needs $arg, "arg required" or return;
needs exists $lang{$arg}, "No such language: $arg" or return;
# etc.
}
The advantage of returning from the caller in needs would then be to avoid having to write the repetitive or return inside run_find and similar functions.
I think you're focussing on the wrong thing here. I do this sort of thing with Data::Constraint, Brick, etc. and talk about this in Mastering Perl. With a little cleverness and thought about the structure of your program and the dynamic features that Perl has, you don't need such a regimented, procedural approach.
However, the first thing you need to figure out is what you really want to know in that calling subroutine. If you just want to know yes or no, it's pretty easy.
The problem with your needs is that you're thinking about calling it once for every condition, which forces you to use needs to control program flow. That's the wrong way to go. needs is only there to give you an answer. It's job is not to change program state. It becomes much less useful if you misuse it because some other calling subroutine might want to continue even if needs returns false. Call it once and let it return once. The calling subroutine uses the return value to decide what it should do.
The basic structure involves a table that you pass to needs. This is your validation profile.
sub run_find {
my $arg = shift #_;
return unless needs [
[ sub { $arg }, "arg required" ],
[ sub { exists $lang{$arg} }, "No such language: $arg" ],
];
}
...
}
You construct your table for whatever your requirements are. In needs you just process the table:
sub needs($$) {
my ($table) = #_;
foreach $test ( #$table ) {
my( $sub, $message ) = #$test;
unless( $sub->(...) ) {
print $message;
return
}
}
return 1;
}
Now, the really cool thing with this approach is that you don't have to know the table ahead of time. You can pull that from configuration or some other method. That also means that you can change the table dynamically. Now your code shrinks quite a bit:
sub run_find {
my $arg = shift #_;
return unless needs( $validators{run_find} );
...
}
You cna keep going with this. In Mastering Perl I show a couple of solutions that completely remove that from the code and moves it into a configuration file. That is, you can change the business rules without changing the code.
Remember, almost any time that you are typing the same sequence of characters, you're probably doing it wrong. :)
Sounds like you are re-inventing exception handling.
The needs function should not magically deduce its parent and interrupt the parent's control flow - that's bad manners. What if you add additional functions to the call chain, and you need to go back two or even three functions back? How can you determine this programmatically? Will the caller be expecting his or her function to return early? You should follow the principle of least surprise if you want to avoid bugs - and that means using exceptions to indicate that there is a problem, and having the caller decide how to deal with it:
use Carp;
use Try::Tiny;
sub run_find {
my $arg = shift;
defined $arg or croak "arg required";
exists $lang{$arg} or croak "no such language: $arg";
...
}
sub parent {
try { run_find('foo') }
catch { print $#; }
}
Any code inside of the try block is special: if something dies, the exception is caught and stored in $#. In this case, the catch block is executed, which prints the error to STDOUT and control flow continues as normal.
Disclaimer: exception handling in Perl is a pain. I recommend Try::Tiny, which protects against many common gotchas (and provides familiar try/catch semantics) and Exception::Class to quickly make exception objects so you can distinguish between Perl's errors and your own.
For validation of arguments, you might find it easier to use a CPAN module such as Params::Validate.
You may want to look at a similar recent question by kinopiko:
Is it possible for a Perl subroutine to force its caller to return?
The executive summary for that is: best solution is to use exceptions (die/eval, Try::Tiny, etc...). You van also use GOTO and possibly Continuation::Escape
It doesn't make sense to do things this way; ironically, ya doesn't needs needs.
Here's why.
run_find is poorly written. If your first condition is true, you'll never test the second one since you'll have returned already.
The warn and die functions will provide you printing and/or exiting behavior anyway.
Here's how I would write your run_find sub if you wanted to terminate execution if your argument fails (renamed it to well_defined):
sub well_defined {
my $arg = shift;
$arg or die "arg required";
exists $lang{$arg} or die "no such language: $arg";
return 1;
}
There should be a way to return 0 and warn at the same time, but I'll need to play around with it a little more.
run_find can also be written to return 0 and the appropriate warn message if conditions are not met, and return 1 if they are (renamed to well_defined).
sub well_defined {
my $arg = shift;
$arg or warn "arg required" and return 0;
exists $lang{$arg} or warn "no such language: $arg" and return 0;
return 1;
}
This enables Boolean-esque behavior, as demonstrated below:
perform_calculation $arg if well_defined $arg; # executes only if well-defined