dynamic scope in recursion function - perl

Is it better to pass args to recursive function or let dynamic scope deal with it?
sub rec {
my ($arg1, $arg2 ..) = (#_);
..
rec(..);
}
or rather:
sub main {
our ($arg1, $arg2 ..) = (#_);
sub rec {
my $arg1 = shift;
.. # use $args > 1
rec($arg1);
}
Since I have several rec subs in main I prefer 2nd option which doesn't require passing vars all the time and reduces amount bloated code. Said that it's probably not efficient because it will go thru every stack frame in order to resolve dynamic scope?

Don't place a named subroutine inside another. This causes problems (although use warnings; will find them). If you want to avoid passing a constant argument to every recursion, I recommend the following instead:
sub recurse {
my ($constant, ...) = #_;
local *_recurse = sub {
my (...) = #_;
...
_recurse(...);
...
};
_recurse(...);
}
(No idea why you used our. I switched back to my.)
Or with a sufficiently new version of Perl (5.16+):
sub recurse {
my ($constant, ...) = #_;
my $_recurse = sub {
my (...) = #_;
...
__SUB__->(...);
...
};
$_recurse->(...);
}
Whatever you do, though, don't do the following as it leaks.
sub recurse {
...
my $_recurse;
$_recurse = sub {
...
$_recurse->(...);
...
};
...
}
(The inner sub references $_recurse which holds a reference to the inner sub, forming a reference cycle and thus a memory leak.)

If you're using a recursive function that relies on global variables, it can almost certainly be recoded as an iterative subroutine that uses locally scoped variables instead.
I would strongly recommend that you always pass variables and return values, and that you rethink the design of your algorithms to use iteration (while loops) versus recursion.
If you add details for your methods, we might be able to suggest an actual implementation. But given the amount of information you've shared, all we can do is advise design theory.

Related

Perl:Issue passing self

I'm having an issue with a recent code review. I've been advised to change the following function call:
storeShmcoreservJobsLogs(
$self->{'shmJobDetails'},
$self->{'nhcJobDetails'},
$self->{'cppDetails'},
$self->{'siteId'},
$neTypesIdMap,
$dbh
);
To only use two arguments, being $self and $dbh. So I have coded as follows
storeShmcoreservJobsLogs($self, $dbh);
And the function signature as follows:
sub storeShmcoreservJobsLogs($$) {
my($self, $dbh) = #_;
if ( $#{$self->$shmJobDetails} > -1 ) {
The if statement unfortunately throws an error with the value of $shmJobDetails when I test the change
Global symbol "$shmJobDetails" requires explicit package name at /data/ddp/current/analysis/TOR/elasticsearch/handlers/misc/Shm.pm line 148.
So I must have misinterpreted the instruction. Is anything obvious wrong?
There's no variable $shmJobDetails so you get the compilation error. Do the same thing that you were doing before:
sub storeShmcoreservJobsLogs {
my($self,$dbh)=#_;
if ( $#{ $self->{'shmJobDetails'} } > -1 ) {
Now you're passing the complete object and the subroutine can use any part of the object it needs.
You might want to make some object methods to answer the questions you'll ask it instead of accessing its internals. That method can do all the work to figure out true or false:
sub storeShmcoreservJobsLogs {
my($self,$dbh)=#_;
if ( $self->has_jobs ) {
The use of a lexical variable called $self implies that you're using object-oriented methods, but your code is far from being object-oriented
Are you sure that you understand the points being made in the code review? It's looking like you're writing a method, and the fields should be extracted from the hash within the method
The method definition should be more like this
sub store_shmcoreserv_jobs_logs {
my $self = shift;
my ($id_map, $dbh) = #_;
my #fields = #{$self}{qw/ shmJobDetails nhcJobDetails cppDetails siteId /};
...
while the call should look like this
$self->storeShmcoreservJobsLogs($neTypesIdMap, $dbh)
All of this is essential to Perl object-oriented programming. You should study perlobj together with the rest of the Perl language reference

Access object created in another function

My program creates an object, which, in turn, creates another object
MainScript.pm
use module::FirstModule qw ($hFirstModule);
$hFirstModule->new(parametres);
$hFirstModule->function();
FirstModule.pm
use Exporter ();
#EXPORT = qw($hFirstModule);
use module::SecondModule qw ($hSecondModule);
sub new {
my $className = shift;
my $self = { key => 'val' };
bless $self, $classname;
return $self;
}
sub function{
$hSecondModule->new(parametres);
#some other code here
}
I want to acces $hSecondModule from MainScript.pm.
It depends.
We would have to see the actual code. What you've shown is a bit ambiguous. However, there are two scenarios.
You can't
If your code is not exactly like what you have shown as pseudo-code, then there is no chance to do that. Consider this code in &module1::function.
sub function {
my $obj = Module2->new;
# ... more stuff
return;
}
In this case, you are not returning anything, and the $obj is lexically scoped. A lexical scope means that it only exists inside of the closest {} block (and all blocks inside that). That's the block of the function sub. Once the program returns out of that sub, the variable goes out of scope and the object is destroyed. There is no way to get to it afterwards. It's gone.
Even if it was not destroyed, you cannot reach into a different scope.
You can
If you however return the object from the function, then you'd have to assign it in your script, and then you can access it later. If the code is exactly what you've shown above, this works.
sub function {
my $obj = Module2->new;
# nothing here
}
In Perl, subs always return the last true statement. If you don't have a return and the last statement is the Module2->new call, then the result of that statement, which is the object, is returned. Of course it also works if you actually return explicitly.
sub function {
return Module2->new;
}
So if you assign that to a variable in your script, you can access it in the script.
my $obj = module1->function();
This is similar to the factory pattern.
This is vague, but without more information it's impossible to answer the question more precicely.
Here is a very hacky approach that takes your updated code into consideration. It uses Sub::Override to grab the return value of the constructor call to your SecondModule thingy. This is something that you'd usually maybe do in a unit test, but not in production code. However, it should work. Here's an example.
Foo.pm
package Foo;
use Bar;
sub new {
return bless {}, $_[0];
}
sub frobnicate {
Bar->new;
return;
}
Bar.pm
package Bar;
sub new {
return bless {}, $_[0];
}
sub drink {
return 42; # because.
}
script.pl
package main;
use Foo; # this will load Bar at compile time
use Sub::Override;
my $foo = Foo->new;
my $bar; # this is lexical to the main script, so we can use it inside
my $orig = \&Bar::new; # grab the original function
my $sub = Sub::Override->new(
"Bar::new" => sub {
my $self = shift;
# call the constructor of $hSecondModule, grab the RV and assign
# it to our var from the main script
$bar = $self->$orig(#_);
return $bar;
}
);
$foo->frobnicate;
# restore the original sub
$sub->restore;
# $bar is now assigend
print $bar->drink;
Again, I would not do this in production code.
Let's take a look at the main function. It first creates a new Foo object. Then it grabs a reference to the Bar::new function. We need that as the original, so we can call it to create the object. Then we use Sub::Override to temporarily replace the Bar::new with our sub that calls the original, but takes the return value (which is the object) and assigns it to our variable that's lexical to the main script. Then we return it.
This function will now be called when $foo->frobnicate calls Bar->new. After that call, $bar is populated in our main script. Then we restore Bar::new so we don't accidentally overwrite our $bar in case that gets called again from somewhere else.
Afterwards, we can use $bar.
Note that this is advanced. I'll say again that I would not use this kind of hack in production code. There is probably a better way to do what you want. There might be an x/y problem here and you need to better explain why you need to do this so we can find a less crazy solution.

Perl: "Variable will not stay shared"

I looked up a few answers dealing with this warning, but neither did they help me, nor do I truly understand what Perl is doing here at all. Here's what I WANT it to do:
sub outerSub {
my $dom = someBigDOM;
...
my $otherVar = innerSub();
return $otherVar;
sub innerSub {
my $resultVar = doStuffWith($dom);
return $resultVar;
}
}
So basically, I have a big DOM object stored in $dom that I don't want to pass along on the stack if possible. In outerSub, stuff is happening that needs the results from innerSub. innerSub needs access to $dom. When I do this, I get this warning "Variable $dom will not stay shared".
What I don't understand:
Does this warning concern me here? Will my intended logic work here or will there be strange things happening?
If it doesn't work as intended: is it possible to do that? To make a local var visible to a nested sub? Or is it better to just pass it as a parameter? Or is it better to declare an "our" variable?
If I push it as a parameter, will the whole object with all its data (may have several MB) be pushed on the stack? Or can I just pass something like a reference? Or is Perl handling that parameter as a reference all by itself?
In "Variable $foo will not stay shared" Warning/Error in Perl While Calling Subroutine, someone talks about an anonymous sub that will make this possible. I did not understand how that works, never used anything like that.
I do not understand that explanation at all (maybe cause English is not my first language): "When the inner subroutine is called, it will see the value of the outer subroutine's variable as it was before and during the first call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable.":
What does "the first call to the outer subroutine is complete? mean"
I mean: first I call the outer sub. The outer sub calls the inner sub. The outer sub is of course still running. Once the outer sub is complete, the inner sub will be finished as well. Then how does any of this still apply when the inner sub is already finished? And what about the "first" call? When is the "second" call happening... sorry, this explanation confuses me to no end.
Sorry for the many questions. Maybe someone can at least answer some of them.
In brief, the second and later times outerSub is called will have a different $dom variable than the one used by innerSub. You can fix this by doing this:
{
my $dom;
sub outerSub {
$dom = ...
... innerSub() ...
}
sub innerSub {
...
}
}
or by doing this:
sub outerSub {
my $dom = ...
*innerSub = sub {
...
};
... innerSub() ...
}
or this:
sub outerSub {
my $dom = ...
my $innerSub = sub {
...
};
... $innerSub->() ...
}
All the variables are originally preallocated, and innerSub and outerSub share the same $dom. When you leave a scope, perl goes through the lexical variables that were declared in the scope and reinitializes them. So at the point that the first call to outerSub is completed, it gets a new $dom. Because named subs are global things, though, innerSub isn't affected by this, and keeps referring to the old $dom. So if outerSub is called a second time, its $dom and innerSub's $dom are in fact separate variables.
So either moving the declaration out of outerSub or using an anonymous sub (which gets freshly bound to the lexical environment at runtime) fixed the problem.
You need to have an anonymous subroutine to capture variables:
my $innerSub = sub {
my $resultVar = doStuffWith($dom);
return $resultVar;
};
Example:
sub test {
my $s = shift;
my $f = sub {
return $s x 2;
};
print $f->(), "\n";
$s = "543";
print $f->(), "\n";
}
test("a1b");
Gives:
a1ba1b
543543
If you want to minimize the amount of size passing parameters to subs, use Perl references. The drawback / feature is that the sub could change the referenced param contents.
my $dom = someBigDOM;
my $resultVar = doStuffWith(\$dom);
sub doStuffWith {
my $dom_reference = shift;
my $dom_contents = $$dom_reference;
#...
}
Following http://www.foo.be/docs/perl/cookbook/ch10_17.htm , you should define a local GLOB as follows :
local *innerSub = sub {
...
}
#You can call this sub without ->
innerSub( ... )
Note that even if warning is displayed, the result stay the same as it should be expected : variables that are not defined in the inner sub are modified in the outer sub scope. I cannot see what this warning is about.

Best way to equate 2 subroutines?

I have many modules each of them have a sub insert_info and sub update_info methods. From time to time, the sub update_info and sub insert_info are the same. But I don't want to use only one of these methods when that happens, because in general they are not the same. So How I make the 2 methods equal?
Is this the only way?
sub insert_info {
# code......
}
sub update_info { insert_info(); }
Alias via typeglob
*update_info = \&insert_info;
Adding BEGIN may avoid problems
BEGIN { *update_info = \&insert_info; }
This helps ensure that it runs before other things, which may call it.
Comments on your Example
Also, your sub update_info { insert_info(); } is not a copy because it will always call insert_info with no parameters. If you passed update_info any values (like update_info('someval')), they would not be passed on to insert_info. Furthermore, they are both declared and defined subroutines - both taking memory.
If you wanted to declare it how you did and automatically pass along the arguments to the inner function, you could do sub update_info { insert_info(#_); }, or better is sub update_info { &insert_info }, since the & without any argument list, will automatically pass along #_.
Still these take more memory than using the typeglob assignment, listed at the top.
This is one of the rare opportunities to use goto without being shamed for it.
sub update_info {
goto &insert_info;
}
This has the benefit of passing along any arguments to the inner function, and cleaning up the caller() stack to remove the call to the outer function.
I suggest you use the Sub::Alias module from CPAN, which has the advantages of being explicit and self-documenting as well as neat and clear in use.
Your code becomes
use Sub::Alias 'alias';
sub insert_info {
...
}
alias update_info => 'insert_info';

How do I implement a dispatch table in a Perl OO module?

I want to put some subs that are within an OO package into an array - also within the package - to use as a dispatch table. Something like this
package Blah::Blah;
use fields 'tests';
sub new {
my($class )= #_;
my $self = fields::new($class);
$self->{'tests'} = [
$self->_sub1
,$self->_sub2
];
return $self;
}
_sub1 { ... };
_sub2 { ... };
I'm not entirely sure on the syntax for this?
$self->{'tests'} = [
$self->_sub1
,$self->_sub2
];
or
$self->{'tests'} = [
\&{$self->_sub1}
,\&{$self->_sub2}
];
or
$self->{'tests'} = [
\&{_sub1}
,\&{_sub2}
];
I don't seem to be able to get this to work within an OO package, whereas it's quite straightforward in a procedural fashion, and I haven't found any examples for OO.
Any help is much appreciated,
Iain
Your friend is can. It returns a reference to the subroutine if it exists, null otherwise. It even does it correctly walking up the OO chain.
$self->{tests} = [
$self->can('_sub1'),
$self->can('_sub2'),
];
# later
for $tn (0..$#{$self->{tests}}) {
ok defined $self->{tests}[$tn], "Function $tn is available.";
}
# and later
my $ref = $self->{tests}[0];
$self->$ref(#args1);
$ref = $self->{tests}[1];
$self->$ref(#args2);
Or, thanks to this question (which happens to be a variation of this question), you can call it directly:
$self->${\$self->{tests}[0]}(#args1);
$self->${\$self->{tests}[1]}(#args1);
Note that the \ gives us a reference to a the subref, which then gets dereferenced by the ${} after $self->. Whew!
To solve the timeliness issue brain d foy mentions, an alternative would be to simply make the {test} a subroutine itself, that returns a ref, and then you could get it at exactly the time you need it:
sub tests {
return [
$self->can('_sub1'),
$self->can('_sub2')
];
}
and then use it:
for $tn (0..$#{$self->tests()}) {
...
}
Of course, if you have to iterate over the refs anyway, you might as well just go straight for passing the reference out:
for my $ref (0..$#{$self->tests()}) {
$self->$ref(#args);
}
Although Robert P's answer might work for you, it has the problem of fixing the dispatch very early in the process. I tend to resolve the methods as late as I can, so I would leave the things in the tests array as method names until you want to use them:
$self->{tests} = [
qw( _sub1 _sub2 )
];
The strength of a dynamic language is that you can wait as long as you like to decide what's going to happen.
When you want to run them, you can go through the same process that Robert already noted. I'd add an interface to it though:
foreach my $method_name ( $obj->get_test_methods )
{
$obj->$method_name();
}
That might even be better as not tying the test to an existing method name:
foreach my $method_name ( $obj->get_test_methods )
{
$obj->run_test_named( $method_name );
}
That run_test_named could then be your dispatcher, and it can be very flexible:
sub run_test_named
{
my( $self, $name ) = #_;
# do anything you want, like in Robert's answer
}
Some things you might want to do:
Run a method on an object
Pass the object as an argument to something else
Temporarily override a test
Do nothing
etc, etc
When you separate what you decide to do from its implementation, you have a lot more freedom. Not only that, the next time you call the same test name, you can do something different.
use lib Alpha;
my $foo = Alpha::Foo->new; # indirect object syntax is deprecated
$foo->bar();
my %disp_table = ( bar => sub { $foo->bar() } );
$disp_table{bar}->(); # call it
You need a closure because you want to turn a method call into an ordinary subroutine call, so you have to capture the object you're calling the method on.
There are a few ways to do this. Your third approach is closest. That will store a reference to the two subs in the array. Then when you want to call them, you have to be sure to pass them an object as their first argument.
Is there a reason you are using the use fields construct?
if you want to create self contained test subs, you could do it this way:
$$self{test} = [
map {
my $code = $self->can($_); # retrieve a reference to the method
sub { # construct a closure that will call it
unshift #_, $self; # while passing $self as the first arg
goto &$code; # goto jumps to the method, to keep 'caller' working
}
} qw/_sub1 _sub2/
];
and then to call them
for (#{ $$self{test} }) {
eval {$_->(args for the test); 1} or die $#;
}