Preserve a variable's value across multiple subroutine calls in perl - perl

Just wanted to know what was the best way to reserve the value of a variable across multiple calls to the same subroutine . i.e
$someList = &someRoutine(100, 200);
$someList2 = &someRoutine(100, 200);
sub someRoutine {
$someAddition = 0;
foreach $someSum (#_){
$someAddition += $someSum;
}
return $someAddition
}
print $someList;
print $someList2;
Basically, someList should print 300 and someList2 should print 600. How do i make it so that someList2 prints 600? i want $someAddition to be preserved across multiple subroutine calls.

There are several ways to do it. I'll demonstrate two of them:
First, in modern versions of Perl you can use state:
use feature qw/state/;
print someRoutine(100,200), "\n";
print someRoutine(100,200), "\n";
sub someRoutine {
state $someAddition = 0;
foreach my $someSum ( #_ ) {
$someAddition += $someSum;
}
return $someAddition;
}
In this version, the $someAddition variable will be initialized to zero once, and only once. From that point on, the value will be retained between calls.
Another version is using a lexical closure. Here's an example:
my $summer = makeSummer();
print $summer->(100,200), "\n";
print $summer->(100,200), "\n";
sub makeSummer {
my $someAddition = 0;
return sub {
$someAddition += $_ foreach #_;
return $someAddition;
}
}
The second version is a little more complex, but has two advantages. First, you can start a fresh summation simply by calling the makeSummer routine to manufacture a new closure. Second, it will work on any version of Perl 5, not just versions recent enough to have the newer state keyword.
If you are not concerned with initializing the stateful variable before the sub is declared, you can also do this:
my $someAddition;
sub someRoutine {
$someAddition = 0 unless defined $someAddition;
foreach my $someSum( #_ ) {
$someAddition += $someSum;
}
return $someAddition;
}
A fourth method makes use of package globals. I save this one for last because it's the most prone to abuse and mistakes. But here you go;
our $someAddition = 0;
someRoutine(100,200);
print "$someAddition\n";
someRoutine(100,200);
print "$someAddition\n";
sub someRoutine {
$someAddition += $_ foreach #_;
}
In this last version, $someAddition is a package global, and its global scope makes it available both inside and outside of any subroutines living within the same namespace.

I assume you're at least using a variant of Perl 5? It has been bad practice to use ampersands & on subroutine calls since the first version of Perl 5 twenty-two years ago.
It is also vital that you use strict and use warnings at the top of every Perl program, and declare your variables ay their first point of use with my. It is a measure that will uncover many simple coding errors that you can otherwise easily overlook.
Perl variable names should use only lower-case letters, digits, and underscores. Capital letters are reserved for global identifiers such as package names.
By far the simplest and most common way of creating a static variable is just to declare it outside the subroutine. Like this
use strict;
use warnings;
my $some_list = some_routine(100, 200);
my $some_list2 = some_routine(100, 200);
my $some_addition;
sub some_routine {
$some_addition += $_ for #_;
return $some_addition
}
print $some_list, "\n";
print $some_list2, "\n";
output
300
600
If you want to protect the variable from being accessed by any following code other than the subroutine, then just enclose them in braces, like this
{
my $some_addition;
sub some_routine {
$some_addition += $_ for #_;
return $some_addition
}
}

Take a look at Persistent Private Variables in "man perlsub".

Related

Perl function/sub best practice

I have a really quick question. I have a program with a lot of functions that are run from main. Is it best practice to have the functions first and then the call from main, or the other way around?
For example:
sub myFunction {
#Do something
}
my $stuff = myFunction();
Or:
my $stuff = myFunction();
sub myFunction {
#Do something
}
Sorry for any ignorance, I do not have any formal training and I have seen it done both ways online. Thanks
I recommend placing your code at the bottom.
Issue 1
The latter snippet poor because myFunction is in scope of $stuff, but it shouldn't be. That's easy to fix though.
{
my $stuff = myFunction();
}
sub myFunction {
#Do something
}
Ok, so that's not a big issue since I place all top-level code in a block, even if it comes at the end. It looks cleaner to me that way, and it makes it easier to transform into a sub from which I can return.
sub myFunction {
#Do something
}
sub main {
return 0 if is_nothing_to_do();
my $stuff = myFunction();
...
return 0;
}
exit(main(parse_args));
Issue 2
Many languages require that you declare your subs before you call them. That's rarely needed in Perl, though there are a couple of scenarios where it is required. Subs with prototypes is one of those. If you wanted to place your code at the top, you would need to add declarations even before that.
sub myFunction(&#);
{
my $stuff = myFunction { ... } ...;
}
sub myFunction(&#) {
#Do something
}
You probably never have to do that since all but some rare uses of prototypes is discouraged, and the other scenarios are even rarer.
Issue 3
You might accidentally skip initialization code by placing your top-level code before your subroutines.
Compare:
print my_counter(), "\n"; # Warns, then prints a blank line
...
{
my $counter = 1;
sub my_counter {
return $counter++;
}
}
...
and
...
{
my $counter = 1;
sub my_counter {
return $counter++;
}
}
...
print my_counter(), "\n"; # Prints 1
Issue 4
Many languages require that you declare your subs before you call them, so more people will be more familiar with having the top-level code at the bottom.
It doesn't matter, so long as you're able to find the code that you need to find. I typically like to set up my code like this:
use strict;
use warnings;
exit main();
sub main {
do_this();
dont_do_that();
cant_you_read_the_signs();
return 0;
}
sub do_this {
....
}
...
Putting your main code in an actual function or block called "main" helps keep you from polluting the program with globals.

Warning: 'Use of uninitialized value in numeric eq' in Perl

I am getting a warning as:
Use of uninitialized value in numeric eq (==) at script.pl line 53; line 53 isif statement`
My code snippet:
foreach(#array)
{
push #gene2refseq, $_;
}
foreach(#::ids)
{
if($_ == $gene2refseq[1])
{
push #::prot_gi, $gene2refseq[6];
}
}
Now if I declare $gene2refseq[1] before foreach(#::ids) the same error persists; but if I initialize it before foreach(#::ids) viz. $gene2refseq[1] = 0 it didn't give the error but also no results; as the value is initialized to 0.
I think I am initializing it at a wrong place, but then where have I to initialize it? As I can't initialize it before or in foreach(#array)
Disclaimer: I am not very good with use warnings and use strict
Edit: Solved
Thanks for the help; I just declared the #gene2refseq and initialized $gene2refseq[1] = 0 before the foreach(#array), and it worked fine.
Also thanks for correcting me on foreach usage.
A question:
What if I have to access multiple indexes of an array? Do I have to initialize them all? As here I need to access only a single index, so I initialized it.
If #gene2refeq is empty and #array is empty before the start of that code snippet, then #gene2refeq will be empty by line 53.
To find out, print the contents of #array and #gene2refeq. It's also possible #::ids contains uninitialized values, check that too. Add a separator so you can see what's in them.
print "\#array is ".join(", ", #array)."\n";
print "\#gene2refeq is ".join(", ", #gene2refeq)."\n";
print "\#::ids is ".join(", ", #::ids)."\n";
As to your question about when to initialize things, there's a difference between declaring a variable and initializing it. my and our declare a variable to exist lexically and globally respectively. Then you can populate (initialize) it.
The basic pattern of your program might look something like this.
my #ids;
...code to populate #ids...
my #gene2refeq;
...code to populate #gene2refeq...
my #array;
...code to populate #array...
push #gene2refeq, #array;
die "\#gene2refeq is not long enough, ".join(", ", #gene2refeq)
unless #gene2refeq >= 7;
my #prot_gi;
foreach my $id (#ids) {
if($id == $gene2refseq[1])
{
push #prot_gi, $gene2refseq[6];
}
}
A few other points. That first foreach loop is better written as simply...
push #gene2refseq, #array;
Using #::ids and #::prot_gi is odd. That's shorthand for getting the global variable #ids in the main package. It shouldn't be necessary and smells like cargo culting.

How do I create a perl sub by specifying its parse tree?

Given a CODE ref, is it possible to:
Access the parse tree of that CODE ref
Create a new CODE ref by specifying the parse tree of the CODE ref which can contain elements of the parse tree returned in 1
Normally we create subroutines in perl by specifying source code which is then parsed and converted into a parse tree.
I would like to write a perl function which can create a subroutine by specifying its parse tree, and that parse tree could be derived from another parse tree of some other subroutine.
Possible?
I don't know the full answer to your question, but I know that Data::Dumper can deparse a code reference. Looking at its documentation, I see that it uses B::Deparse to do the heavy lifting (the B:: modules are the ones that interact with the compiler). Unfortunately it seems that this only results in a textual representation of the coderef.
Instead I searched for Op on metacpan and got many more interesting possibilities. Since I am now far out of my depth in the deepest Perl magic, I will leave it to you to look over those results. Perhaps something will be useful.
This has nothing to do with opcodes, but it does enclose the same two variables in three different closures. The variables are enclosed within subroutines reminiscent of class get/set routines, and those closed vars are then shared by other closures via their access routine.
This is a response to the comment: I'm sure it will be necessary to access the underlying nodes in the parse tree so that I can create new closures which are closed over the same variables.
use strict;
use warnings;
use v5.14;
# create closed vars
my $v1 = access_closure(6);
my $v2 = access_closure(42);
# play with them
say "v1 ", &$v1;
say "v2 ", &$v2;
say "v1 ", &$v1(5);
say "v2 ", &$v2(43);
say "v1 ", &$v1;
say "v2 ", &$v2;
# create silly closures that use them
my $test1 = test_closure(2);
my $test2 = test_closure(17);
my $test3 = test_closure(50);
# play with those
&$test1;
&$test2;
&$test3;
# create the get/set routine for a closed var
sub access_closure {
my $val = shift;
return sub {
$val = shift if #_;
return $val;
}
}
# create a silly closure that encloses a control var and uses the two other vars
sub test_closure {
my $val = shift;
return sub {
say "\nval is $val";
printf "v1 is %2d, v2 is %2d\n",
&$v1, &$v2;
if (&$v1 < $val) {
say "Increment v1";
&$v1(&$v1+1);
}
if (&$v2 > $val) {
say "Decrement v2";
&$v2(&$v2-1);
}
printf "v1 is %2d, v2 is %2d\n",
&$v1, &$v2;
}
}

Perl subroutine array and scalar variable parameters

How exactly can I pass both scalar variables and array variables to a subroutine in Perl?
my $currVal = 1;
my $currValTwo = 1;
my #currArray = ('one','two','three');
my #currArrayTwo =('one','two','three');
&mysub($currVal, $currValTwo,\#currArray, \#currArrayTwo);
sub mysub() {
# That doesn't work for the array as I only get the first element of the array
my($inVal, $inValTwo, #inArray, #inArrayTwo) = #_;
}
You need to fetch them as references because you've already passed them as references (by using the \ operator):
my($inVal, $inValTwo, $inArray, $inArrayTwo) = #_;
and then use the references as arrays:
#{$inArray}
You pass the arguments as references, so you need to dereference them to use the values. Be careful about whether you want to change the original array or not.
sub mysub {
my($inVal, $inValTwo, $inArray, $inArrayTwo) = #_;
#{$inArrayTwo} = ('five','six','seven');
}
This will change the original #currArrayTwo, which might not be what you want.
sub mysub {
my($inVal, $inValTwo, $inArray, $inArrayTwo) = #_;
my #ATwo = #{$inArrayTwo};
#ATwo = ('five','six','seven');
}
This will only copy the values and leave the original array intact.
Also, you do not need the ampersand in front of the sub name, from perldoc perlsub:
If a subroutine is called using the &
form, the argument list is optional,
and if omitted, no #_ array is set up
for the subroutine: the #_ array at
the time of the call is visible to
subroutine instead. This is an
efficiency mechanism that new users
may wish to avoid.
You do not need empty parens after your sub declaration. Those are used to set up prototypes, which is something you do not need to do, unless you really want to.
So, for example: This is a using statement to search something in an array:
use List::Util qw(first);
This is the sub declaration:
sub GetIndex($$$);
This is the call to the sub (last parameter is: Default index value to give back if not found)
$searchedIndex = GetIndex(\#theArr, "valuesearched", 1);
This is the routine:
sub GetIndex($$$)
{
my $inArray=shift;
my #theArray= #{$inArray};
my $searchedTag= shift;
my $defaultVal= shift;
my $retVal = first { $theArray[$_] eq $searchedTag} 0 .. $#theArray;
if ((! defined $retVal)|| ($retVal<0)||($retVal>#theArray))
{
$retVal = $defaultVal;
}
return $retVal;
}

How can I detect recursing package calls in Perl?

I have a Perl project were I just had a problem by making a circular package call. The code below demonstrates the problem.
When this is executed, each package will call the other until all of the memory of the computer is consumed and it locks up. I agree that this is a bad design and that circular calls like this should not be made in the design, but my project is sufficiently big that I would like to detect this at run time.
I have read about the weaken function and Data::Structure::Util, but I have not figured out a way to detect if there is a circular package load (I am assume, because a new copy is being made at each iteration and stored in each copy of the $this hash). Any ideas?
use system::one;
my $one = new system::one();
package system::one;
use strict;
use system::two;
sub new {
my ($class) = #_;
my $this = {};
bless($this,$class);
# attributes
$this->{two} = new system::two();
return $this;
}
package system::two;
use strict;
use system::one;
sub new {
my ($class) = #_;
my $this = {};
bless($this,$class);
# attributes
$this->{one} = new system::one();
return $this;
}
Here, have some code too. :)
sub break_recursion(;$) {
my $allowed = #_ ? shift : 1;
my #caller = caller(1);
my $call = $caller[3];
my $count = 1;
for(my $ix = 2; #caller = caller($ix); $ix++) {
croak "found $count levels of recursion into $call"
if $caller[3] eq $call && ++$count > $allowed;
}
}
sub check_recursion(;$) {
my $allowed = #_ ? shift : 1;
my #caller = caller(1);
my $call = $caller[3];
my $count = 1;
for(my $ix = 2; #caller = caller($ix); $ix++) {
return 1
if $caller[3] eq $call && ++$count > $allowed;
}
return 0;
}
These are called like:
break_recursion(); # to die on any recursion
break_recursion(5); # to allow up to 5 levels of recursion
my $recursing = check_recursion(); # to check for any recursion
my $recursing = check_recursion(10); # to check to see if we have more than 10 levels of recursion.
Might CPAN these, I think. If anyone has any thoughts about that, please share.
The fact that these are in separate packages has nothing at all to do with the fact that this runs infinitely, consuming all available resources. You're calling two methods from within one another. This isn't circular reference, it's recursion, which is not the same thing. In particular, weaken won't help you at all. You'd get exactly the same effect from:
sub a {
b();
}
sub b {
a();
}
a();
The best way to avoid this is don't do that. More usefully, if you have to write recursive functions try not to use multiple functions in the recursion chain, but simply the one, so you have an easier time mentally keeping track of where your calls should terminate.
As to how to detect whether something like this is happening, you would have to do something simple like increment a variable with your recursion depth and terminate (or return) if your depth exceeds a certain value. But you really shouldn't have to rely on that, it's similar to writing a while loop and using an increment there to make sure your function doesn't run out of control. Just don't recurse over a set unless you know how and when it terminates.
Another relevant question would be what are you trying to accomplish in the first place?
I suggest making a routine called something like break_constructor_recursion() that uses caller() to examine the call stack like so:
Find out what method in what package just called me.
Look up the rest of the call stack seeing if that same method in that same package is anywhere further up.
If so, die() with something appropriate.
Then you add a call to break_constructor_recursion() in your constructors. If the constructor is being called from inside itself, it'll bomb out.
Now, this can throw false positives; it's not impossible for a constructor to be legitimately called inside itself. If you have issues with that, I'd say just have it look for some N additional occurrences of the constructor before it identifies an error. If there are 20 calls to system::two::new() on the stack, the chances that you aren't recursing are pretty low.
The classic break on double recursion is to use a state variable to determine if you are already inside a function:
{
my $in_a;
sub a {
return if $in_a; #do nothing if b(), or someone b() calls, calls a()
$in_a = 1;
b();
$in_a = 0;
}
}
You can do whatever you want if $in_a is true, but dieing or returning is common. If you are using Perl 5.10 or later you can use the state function instead of nesting the function in its own scope:
sub a {
state $in_a;
return if $in_a; #do nothing if b(), or someone b() calls, calls a()
$in_a = 1;
b();
$in_a = 0;
}
use warnings;
without warnings:
#!/usr/bin/perl
use strict;
sub foo {
foo();
}
foo();
-
$ perl script.pl
^C # after death
with warnings:
#!/usr/bin/perl
use strict;
use warnings;
sub foo {
foo();
}
foo();
-
$ perl script.pl
Deep recursion on subroutine "main::foo" at script.pl line 7.
^C # after death
Always always use warnings.
use warnings FATAL => qw( recursion );
#!/usr/bin/perl
use strict;
use warnings FATAL => qw( recursion );
sub foo {
foo();
}
foo();
-
$ perl script.pl
Deep recursion on subroutine "main::foo" at script.pl line 7.
$