Best way to equate 2 subroutines? - perl

I have many modules, each of which has insert_info and update_info methods. From time to time, update_info and insert_info are the same. But I don't want to keep only one of these methods when that happens, because in general they are not the same. So how do I make the two methods equal?
Is this the only way?
sub insert_info {
# code......
}
sub update_info { insert_info(); }

Alias via typeglob
*update_info = \&insert_info;
Adding BEGIN may avoid problems
BEGIN { *update_info = \&insert_info; }
This helps ensure that it runs before other things, which may call it.
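As a minimal sketch of the typeglob alias, assuming a hypothetical package My::Store and a placeholder body for insert_info, the following shows that both names end up pointing at the same code reference:

```perl
use strict;
use warnings;

package My::Store;    # hypothetical package name, for illustration only

sub insert_info {
    my (@args) = @_;
    return "stored: @args";    # placeholder body
}

# Install update_info as a second name for the same code at compile time.
BEGIN { *update_info = \&insert_info; }

package main;

print My::Store::update_info('a', 'b'), "\n";

# Comparing the two code references numerically shows they are one sub.
print "same sub\n" if \&My::Store::update_info == \&My::Store::insert_info;
```

Because there is only one sub body, a later change to insert_info is automatically visible under both names.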
Comments on your Example
Also, your sub update_info { insert_info(); } is not a copy because it will always call insert_info with no parameters. If you passed update_info any values (like update_info('someval')), they would not be passed on to insert_info. Furthermore, they are both declared and defined subroutines - both taking memory.
If you wanted to declare it how you did and automatically pass along the arguments to the inner function, you could do sub update_info { insert_info(@_); }, or better is sub update_info { &insert_info }, since the & without any argument list will automatically pass along @_.
Still these take more memory than using the typeglob assignment, listed at the top.
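The difference between the three wrapper styles can be checked with a small sketch. The sub names other than insert_info are hypothetical, and insert_info here is a placeholder that only reports how many arguments reached it:

```perl
use strict;
use warnings;

# Placeholder body: returns the number of arguments it received.
sub insert_info { return scalar @_ }

sub update_dropping  { insert_info() }      # explicit empty list: arguments are lost
sub update_explicit  { insert_info(@_) }    # builds a new @_ from the current one
sub update_ampersand { &insert_info }       # no parens: callee sees the caller's @_

print update_dropping('x', 'y'),  "\n";
print update_explicit('x', 'y'),  "\n";
print update_ampersand('x', 'y'), "\n";
```

Only the first wrapper loses the arguments; the other two hand both values on to insert_info.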

This is one of the rare opportunities to use goto without being shamed for it.
sub update_info {
goto &insert_info;
}
This has the benefit of passing along any arguments to the inner function, and cleaning up the caller() stack to remove the call to the outer function.
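A small sketch of the caller() effect, using hypothetical wrapper names (update_via_call, update_via_goto, report) and a placeholder insert_info that reports which sub it thinks called it:

```perl
use strict;
use warnings;

# Placeholder body: names the sub one frame above insert_info.
sub insert_info {
    my $caller = (caller(1))[3] // 'top level';
    return "called from $caller with (@_)";
}

sub update_via_call { return insert_info(@_) }    # adds a stack frame
sub update_via_goto { goto &insert_info }         # replaces its own frame

# Driver sub so that both wrappers are invoked from the same place.
sub report {
    return (update_via_call(@_), update_via_goto(@_));
}

print "$_\n" for report('x');
```

With the plain call, insert_info sees the wrapper as its caller. With goto, the wrapper has vanished from the stack and insert_info sees report instead.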

I suggest you use the Sub::Alias module from CPAN, which has the advantages of being explicit and self-documenting as well as neat and clear in use.
Your code becomes
use Sub::Alias 'alias';
sub insert_info {
...
}
alias update_info => 'insert_info';

Related

Perl: "Variable will not stay shared"

I looked up a few answers dealing with this warning, but neither did they help me, nor do I truly understand what Perl is doing here at all. Here's what I WANT it to do:
sub outerSub {
my $dom = someBigDOM;
...
my $otherVar = innerSub();
return $otherVar;
sub innerSub {
my $resultVar = doStuffWith($dom);
return $resultVar;
}
}
So basically, I have a big DOM object stored in $dom that I don't want to pass along on the stack if possible. In outerSub, stuff is happening that needs the results from innerSub. innerSub needs access to $dom. When I do this, I get this warning "Variable $dom will not stay shared".
What I don't understand:
Does this warning concern me here? Will my intended logic work here or will there be strange things happening?
If it doesn't work as intended: is it possible to do that? To make a local var visible to a nested sub? Or is it better to just pass it as a parameter? Or is it better to declare an "our" variable?
If I push it as a parameter, will the whole object with all its data (may have several MB) be pushed on the stack? Or can I just pass something like a reference? Or is Perl handling that parameter as a reference all by itself?
In "Variable $foo will not stay shared" Warning/Error in Perl While Calling Subroutine, someone talks about an anonymous sub that will make this possible. I did not understand how that works, never used anything like that.
I do not understand that explanation at all (maybe cause English is not my first language): "When the inner subroutine is called, it will see the value of the outer subroutine's variable as it was before and during the first call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable.":
What does "the first call to the outer subroutine is complete" mean?
I mean: first I call the outer sub. The outer sub calls the inner sub. The outer sub is of course still running. Once the outer sub is complete, the inner sub will be finished as well. Then how does any of this still apply when the inner sub is already finished? And what about the "first" call? When is the "second" call happening... sorry, this explanation confuses me to no end.
Sorry for the many questions. Maybe someone can at least answer some of them.
In brief, the second and later times outerSub is called will have a different $dom variable than the one used by innerSub. You can fix this by doing this:
{
my $dom;
sub outerSub {
$dom = ...
... innerSub() ...
}
sub innerSub {
...
}
}
or by doing this:
sub outerSub {
my $dom = ...
*innerSub = sub {
...
};
... innerSub() ...
}
or this:
sub outerSub {
my $dom = ...
my $innerSub = sub {
...
};
... $innerSub->() ...
}
All the variables are originally preallocated, and innerSub and outerSub share the same $dom. When you leave a scope, perl goes through the lexical variables that were declared in the scope and reinitializes them. So at the point that the first call to outerSub is completed, it gets a new $dom. Because named subs are global things, though, innerSub isn't affected by this, and keeps referring to the old $dom. So if outerSub is called a second time, its $dom and innerSub's $dom are in fact separate variables.
So either moving the declaration out of outerSub or using an anonymous sub (which gets freshly bound to the lexical environment at runtime) fixes the problem.
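The behaviour described above can be reproduced with a short sketch. All sub names are hypothetical, and the 'closure' warning is switched off only so that the demonstration runs quietly; in real code that warning is the signal to restructure.

```perl
use strict;
use warnings;
no warnings 'closure';    # silences "will not stay shared" for this demo only

sub outer_named {
    my $dom = shift;
    # Named sub: bound once, at compile time, to the first $dom.
    sub inner_named { return $dom }
    return inner_named();
}

sub outer_anon {
    my $dom = shift;
    # Anonymous sub: bound to the current $dom each time this line runs.
    my $inner = sub { return $dom };
    return $inner->();
}

print outer_named('first'),  "\n";
print outer_named('second'), "\n";    # the named inner sub still sees the first $dom
print outer_anon('first'),   "\n";
print outer_anon('second'),  "\n";
```

The second call to outer_named is the case the warning is about: the outer sub has a fresh $dom, but the named inner sub keeps reading the old one.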
You need to have an anonymous subroutine to capture variables:
my $innerSub = sub {
my $resultVar = doStuffWith($dom);
return $resultVar;
};
Example:
sub test {
my $s = shift;
my $f = sub {
return $s x 2;
};
print $f->(), "\n";
$s = "543";
print $f->(), "\n";
}
test("a1b");
Gives:
a1ba1b
543543
If you want to minimize the amount of data copied when passing parameters to subs, use Perl references. The drawback (or feature) is that the sub can change the contents of the referenced parameter.
my $dom = someBigDOM;
my $resultVar = doStuffWith(\$dom);
sub doStuffWith {
my $dom_reference = shift;
my $dom_contents = $$dom_reference;
#...
}
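As a sketch of the same idea with a structured value, assume the DOM is stood in for by a plain hash reference (the field names and both subs are hypothetical). Only the reference is copied into the sub, and changes made through it are visible to the caller:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a large DOM: a plain hash reference.
my $dom = { title => 'draft', nodes => [ 1 .. 5 ] };

sub count_nodes {
    my ($dom_ref) = @_;    # only the reference is copied, not the data
    return scalar @{ $dom_ref->{nodes} };
}

sub retitle {
    my ($dom_ref, $title) = @_;
    $dom_ref->{title} = $title;    # changes the caller's data through the reference
    return;
}

print count_nodes($dom), "\n";
retitle($dom, 'final');
print $dom->{title}, "\n";
```

Objects in Perl are already references, so passing an object such as a DOM to a sub does not copy the underlying data.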
Following http://www.foo.be/docs/perl/cookbook/ch10_17.htm , you should define a local GLOB as follows :
local *innerSub = sub {
...
};
#You can call this sub without ->
innerSub( ... )
Note that even if the warning is displayed, the result stays the same as expected: variables that are not defined in the inner sub are modified in the outer sub's scope. I cannot see what this warning is about.

dynamic scope in recursion function

Is it better to pass args to recursive function or let dynamic scope deal with it?
sub rec {
my ($arg1, $arg2 ..) = (@_);
..
rec(..);
}
or rather:
sub main {
our ($arg1, $arg2 ..) = (@_);
sub rec {
my $arg1 = shift;
.. # use $args > 1
rec($arg1);
}
}
Since I have several rec subs in main, I prefer the second option, which doesn't require passing vars all the time and reduces the amount of bloated code. That said, it's probably not efficient, because it will go through every stack frame in order to resolve the dynamic scope?
Don't place a named subroutine inside another. This causes problems (although use warnings; will find them). If you want to avoid passing a constant argument to every recursion, I recommend the following instead:
sub recurse {
my ($constant, ...) = @_;
local *_recurse = sub {
my (...) = @_;
...
_recurse(...);
...
};
_recurse(...);
}
(No idea why you used our. I switched back to my.)
Or with a sufficiently new version of Perl (5.16+):
sub recurse {
my ($constant, ...) = @_;
my $_recurse = sub {
my (...) = @_;
...
__SUB__->(...);
...
};
$_recurse->(...);
}
Whatever you do, though, don't do the following as it leaks.
sub recurse {
...
my $_recurse;
$_recurse = sub {
...
$_recurse->(...);
...
};
...
}
(The inner sub references $_recurse which holds a reference to the inner sub, forming a reference cycle and thus a memory leak.)
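As a runnable sketch of the __SUB__ variant, assume a hypothetical task: summing the leaves of a nested array while scaling each by a constant. The constant is captured by the inner sub instead of being passed on every call, and the inner sub never refers to the variable that holds it, so no reference cycle is formed.

```perl
use strict;
use warnings;
use feature 'current_sub';    # enables __SUB__ (Perl 5.16+)

# Hypothetical example: $factor is the constant argument, $tree the data.
sub scaled_sum {
    my ($factor, $tree) = @_;
    my $walk = sub {
        my ($node) = @_;
        return $node * $factor unless ref $node;    # leaf: scale and return
        my $total = 0;
        $total += __SUB__->($_) for @$node;         # recurse without naming $walk
        return $total;
    };
    return $walk->($tree);
}

print scaled_sum(2, [ 1, [ 2, 3 ], [ [ 4 ] ] ]), "\n";
```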
If you're using a recursive function that relies on global variables, it can almost certainly be recoded as an iterative subroutine that uses locally scoped variables instead.
I would strongly recommend that you always pass variables and return values, and that you rethink the design of your algorithms to use iteration (while loops) versus recursion.
If you add details for your methods, we might be able to suggest an actual implementation. But given the amount of information you've shared, all we can do is advise design theory.

About using an array of functions in Perl

We are trying to build an API to support commit() and rollback() automatically, so that we don't have to bother with it anymore. By researching, we have found that using eval {} is the way to go.
For eval {} to know what to do, I have thought of giving the API an array of functions, which it can execute with a foreach without the API having to interpret anything. However, this function might be in a different package.
Let me clarify with an example:
sub handler {
use OSA::SQL;
use OSA::ourAPI;
my @functions = ();
push(@functions, OSA::SQL->add_page($date, $stuff, $foo, $bar));
my $API = OSA::ourAPI->connect();
$API->exec_multi(@functions);
}
The question is: is it possible to execute the functions in @functions inside of OSA::ourAPI, even if ourAPI has no use OSA::SQL? If not, would it be possible if I use an array reference instead of an array, given that the pointer would point to the known function inside of the memory?
Note: This is the basic idea that we want to base the more complex final version on.
You are NOT adding a function pointer to your array. You are adding the return value of calling the add_page() subroutine. You have 3 solutions to this:
A. You will need to store (in @functions) an array of arrayrefs of the form [\&OSA::SQL::add_page, @argument_values], meaning you pass in an actual reference to a subroutine (called statically); and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
sub exec_multi {
my ($class, $funcs)= @_;
foreach my $f (@$funcs) {
my ($func, @args) = @$f;
my $res = &$func(@args);
print "RES:$res\n";
}
}
Just to re-iterate, this will call individual subs in static version (OSA::SQL::add_page), e.g. WITHOUT passing the package name as the first parameter as a class call OSA::SQL->add_page would. If you want the latter, see the next solution.
B. If you want to call your subs in class context (like in your example, in other words with the class name as a first parameter), you can use ysth's suggestion in the comment.
You will need to store (in @functions) an array of arrayrefs of the form [sub { OSA::SQL->add_page(@argument_values) }], meaning you pass in a reference to a subroutine which will in turn call what you need; and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
sub exec_multi {
my ($class, $funcs)= @_;
foreach my $f (@$funcs) {
my ($func) = @$f;
my $res = &$func();
print "RES:$res\n";
}
}
C. You will need to store (in @functions) an array of arrayrefs of the form [ "OSA::SQL", "add_page", @argument_values], meaning you pass in a package and function name; and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
my ($package, $sub, @args) = @{ $functions[$i] };
no strict 'refs';
$package->$sub(@args);
use strict 'refs';
If I understood your question correctly, then you don't need to worry about whether ourAPI uses OSA::SQL, since your main code imports it already.
However, since in solution C you will be passing a list of packages to exec_multi as the first elements of each arrayref, you can do "require $package; $package->import();" in exec_multi. But again, it's completely unnecessary if your handler call already required and loaded each of those packages. And to do it right you need to pass in a list of parameters to import() as well. BUT WHYYYYYY? :)
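Putting solution A together with the eval {} idea from the question gives roughly the following sketch. Everything here is a hypothetical stand-in: add_page and bad_query replace the real OSA::SQL subs, and commit and rollback are recorded as strings in @log rather than issued on a database handle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for database activity.
my @log;

sub exec_multi {
    my (@tasks) = @_;    # each task is [ $code_ref, @args ]
    my $ok = eval {
        for my $task (@tasks) {
            my ($code, @args) = @$task;
            $code->(@args);    # a code ref runs regardless of which package defined it
        }
        1;                     # reached only if no task died
    };
    push @log, $ok ? 'commit' : 'rollback';
    return $ok ? 1 : 0;
}

sub add_page  { push @log, "add_page(@_)"; return 1 }
sub bad_query { die "constraint violated\n" }

exec_multi([ \&add_page, '2024-01-01', 'intro' ]);
exec_multi([ \&add_page, '2024-01-02', 'body' ], [ \&bad_query ]);
print "$_\n" for @log;
```

Since a code reference carries its own package, the package that contains exec_multi does not need to load the package that defines the subs it is handed.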

What does the function declaration "sub function($$)" mean?

I have been using Perl for some time, but today I came across this code:
sub function1($$)
{
# snip
}
What does this mean in Perl?
It is a function with a prototype that takes two scalar arguments.
There are strong arguments for not actually using Perl prototypes in general - as noted in the comments below. The strongest argument is probably:
Far More Than Everything You've Ever Wanted to Know about Prototypes in Perl
There's a discussion on StackOverflow from 2008:
SO 297034
There's a possible replacement in the MooseX::Method::Signatures module.
As the other answer mentions, the $$ declares a prototype. What the other answer doesn't say is what prototypes are for. They are not for input validation, they are hints for the parser.
Imagine you have two functions declared like:
sub foo($) { ... }
sub bar($$) { ... }
Now when you write something ambiguous, like:
foo bar 1, 2
Perl knows where to put the parens; bar takes two args, so it consumes the two closest to it. foo takes one arg, so it takes the result of bar called with those two args:
foo(bar(1,2))
Another example:
bar foo 2, 3
The same applies; foo takes one arg, so it gets the 2. bar takes two args, so it gets foo(2) and 3:
bar(foo(2),3)
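Both parses can be checked with a small sketch in which the hypothetical foo and bar simply return a string describing what they received:

```perl
use strict;
use warnings;

# Prototyped subs must be declared before the calls that rely on them.
sub foo($)  { return "foo($_[0])" }
sub bar($$) { return "bar($_[0],$_[1])" }

my $first  = foo bar 1, 2;    # parsed as foo(bar(1, 2))
my $second = bar foo 2, 3;    # parsed as bar(foo(2), 3)

print "$first\n$second\n";
```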
This is a pretty important part of Perl, so dismissing it as "never use" is doing you a disservice. Nearly every internal function uses prototypes, so by understanding how they work in your own code, you can get a better understanding of how they're used by the builtins. Then you can avoid unnecessary parentheses, which makes for more pleasant-looking code.
Finally, one anti-pattern I will warn you against:
package Class;
sub new ($$) { bless $_[1] }
sub method ($) { $_[0]->{whatever} }
When you are calling code as methods (Class->method or $instance->method), the prototype check is completely meaningless. If your code can only be called as a method, adding a prototype is wrong. I have seen some popular modules that do this (hello, XML::Compile), but it's wrong, so don't do it. If you want to document how many args to pass, how about:
sub foo {
my ($self, $a, $b) = @_; # $a and $b are the bars to fooify
....
or
use MooseX::Method::Signatures;
method foo(Bar $a, Bar $b) { # fooify the bars
....
Unlike foo($$), these are meaningful and readable.
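To see that a prototype has no effect on method calls, consider this sketch with a hypothetical Counter class. Its add method declares a ($) prototype, yet a method call with a different number of arguments goes through without complaint:

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new {
    my ($class) = @_;
    return bless { count => 0 }, $class;
}

# The ($) prototype claims one argument, but method calls never consult it.
sub add($) {
    my ($self, @amounts) = @_;
    $self->{count} += $_ for @amounts;
    return $self->{count};
}

package main;

my $counter = Counter->new;
print $counter->add(1, 2, 3), "\n";    # invocant plus three arguments, no error
```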

In Perl, what is the right way for a subclass to alias a method in the base class?

I simply hate how CGI::Application's accessor for the CGI object is called query.
I would like my instance classes to be able to use an accessor named cgi to get the CGI object associated with the current instance of my CGI::Application subclass.
Here is a self-contained example of what I am doing:
package My::Hello;
sub hello {
my $self =shift;
print "Hello @_\n";
}
package My::Merhaba;
use base 'My::Hello';
sub merhaba {
goto sub { shift->hello(@_) };
}
package main;
My::Merhaba->merhaba('StackOverflow');
This is working as I think it should and I cannot see any problems (say, if I wanted to inherit from My::Merhaba: Subclasses need not know anything about merhaba).
Would it have been better/more correct to write
sub merhaba {
my $self = shift;
return $self->hello(@_);
}
What are the advantages/disadvantages of using goto &NAME for the purpose of aliasing a method name? Is there a better way?
Note: If you have an urge to respond with goto is evil don't do it because this use of Perl's goto is different than what you have in mind.
Your approach with goto is the right one, because it will ensure that caller / wantarray and the like keep working properly.
I would setup the new method like this:
sub merhaba {
if (my $method = eval {$_[0]->can('hello')}) {
goto &$method
} else {
# error code here
}
}
Or if you don't want to use inheritance, you can add the new method to the existing package from your calling code:
*My::Hello::merhaba = \&My::Hello::hello;
# or you can use = My::Hello->can('hello');
then you can call:
My::Hello->merhaba('StackOverflow');
and get the desired result.
Either way would work. The inheritance route is more maintainable, but adding the method to the existing package would result in faster method calls.
Edit:
As pointed out in the comments, there are a few cases where the glob assignment will run afoul of inheritance, so if in doubt, use the first method (creating a new method in a sub package).
Michael Carman suggested combining both techniques into a self redefining function:
sub merhaba {
if (my $method = eval { $_[0]->can('hello') }) {
no warnings 'redefine';
*merhaba = $method;
goto &merhaba;
}
die "Can't make 'merhaba' an alias for 'hello'";
}
You can alias the subroutines by manipulating the symbol table:
*My::Merhaba::merhaba = \&My::Hello::hello;
Some examples can be found here.
I'm not sure what the right way is, but Adam Kennedy uses your second method (i.e. without goto) in Method::Alias.
This is sort of a combination of Quick-n-Dirty with a modicum of indirection using UNIVERSAL::can.
package My::Merhaba;
use base 'My::Hello';
# ...
*merhaba = __PACKAGE__->can( 'hello' );
And you'll have a sub called "merhaba" in this package that aliases My::Hello::hello. You are simply saying that whatever this package would otherwise do under the name hello it can do under the name merhaba.
However, this is insufficient if some code decorator might change the sub that *My::Hello::hello{CODE} points to. In that case, Method::Alias might be the appropriate way to specify a method, as molecules suggests.
However, if it is a rather well-controlled library where you control both the parent and child categories, then the method above is slimmer.
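A self-contained sketch of the can-based alias, using the class names from the question. Here hello returns its greeting instead of printing it, and @ISA is set directly because both classes live in one file; both choices are assumptions made to keep the example short.

```perl
use strict;
use warnings;

package My::Hello;

sub hello {
    my $self = shift;
    return "Hello @_";
}

package My::Merhaba;

our @ISA = ('My::Hello');    # in-file equivalent of "use base 'My::Hello'"

# Resolve hello through inheritance once, then install it under a new name.
*merhaba = __PACKAGE__->can('hello');

package main;

print My::Merhaba->merhaba('StackOverflow'), "\n";

# Both names now refer to the very same code reference.
print "same code\n" if \&My::Merhaba::merhaba == \&My::Hello::hello;
```

The alias is fixed at the moment the assignment runs, which is why a later replacement of My::Hello::hello would not be picked up by merhaba.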