Detecting global destruction in Perl - perl

I'd like to detect if my object is being DESTROY'd as part of global destruction, and print out a warning (as that'd clearly be an error and lead to data loss). The obvious way to do that would seem to be:
sub DESTROY {
my $self = shift;
# ⋮
if (i_am_in_global_destruction()) {
warn "I survived until global destruction";
}
}
but I have been unable to find a good way to detect global destruction (instead of normal refcount hit 0 destruction).
By "good way", I mean not this, which though it works on 5.10.1 and 5.8.8, probably breaks the second someone gives it an odd glance:
sub DESTROY {
$in_gd = 0;
{
local $SIG{__WARN__} = sub { $_[0] =~ /during global destruction\.$/ and $in_gd = 1 };
warn "look, a warning";
}
if ($in_gd) {
warn "I survived until global destruction";
}
}'

There's a module Devel::GlobalDestruction that uses a tiny bit of XS to let you get at the global destruction flag directly.
Update: since perl 5.14.0 there is a global variable ${^GLOBAL_PHASE} that will be set to "DESTRUCT" during global destruction. You should still generally use Devel::GlobalDestruction, since it works with perls back to 5.6. When installing on a perl with ${^GLOBAL_PHASE} it will use the built-in feature and not even require a C compiler to build.

A solution that is good enough for me is to set a flag in an END block.
package Whatever;
our $_IN_GLOBAL_DESTRUCTION = 0;
END {
$_IN_GLOBAL_DESTRUCTION = 1;
}

Related

Preserve a variable's value across multiple subroutine calls in perl

Just wanted to know what was the best way to reserve the value of a variable across multiple calls to the same subroutine . i.e
$someList = &someRoutine(100, 200);
$someList2 = &someRoutine(100, 200);
sub someRoutine {
$someAddition = 0;
foreach $someSum (#_){
$someAddition += $someSum;
}
return $someAddition
}
print $someList;
print $someList2;
Basically, someList should print 300 and someList2 should print 600. How do i make it so that someList2 prints 600? i want $someAddition to be preserved across multiple subroutine calls.
There are several ways to do it. I'll demonstrate two of them:
First, in modern versions of Perl you can use state:
use feature qw/state/;
print someRoutine(100,200), "\n";
print someRoutine(100,200), "\n";
sub someRoutine {
state $someAddition = 0;
foreach my $someSum ( #_ ) {
$someAddition += $someSum;
}
return $someAddition;
}
In this version, the $someAddition variable will be initialized to zero once, and only once. From that point on, the value will be retained between calls.
Another version is using a lexical closure. Here's an example:
my $summer = makeSummer();
print $summer->(100,200), "\n";
print $summer->(100,200), "\n";
sub makeSummer {
my $someAddition = 0;
return sub {
$someAddition += $_ foreach #_;
return $someAddition;
}
}
The second version is a little more complex, but has two advantages. First, you can start a fresh summation simply by calling the makeSummer routine to manufacture a new closure. Second, it will work on any version of Perl 5, not just versions recent enough to have the newer state keyword.
If you are not concerned with initializing the stateful variable before the sub is declared, you can also do this:
my $someAddition;
sub someRoutine {
$someAddition = 0 unless defined $someAddition;
foreach my $someSum( #_ ) {
$someAddition += $someSum;
}
return $someAddition;
}
A fourth method makes use of package globals. I save this one for last because it's the most prone to abuse and mistakes. But here you go;
our $someAddition = 0;
someRoutine(100,200);
print "$someAddition\n";
someRoutine(100,200);
print "$someAddition\n";
sub someRoutine {
$someAddition += $_ foreach #_;
}
In this last version, $someAddition is a package global, and its global scope makes it available both inside and outside of any subroutines living within the same namespace.
I assume you're at least using a variant of Perl 5? It has been bad practice to use ampersands & on subroutine calls since the first version of Perl 5 twenty-two years ago.
It is also vital that you use strict and use warnings at the top of every Perl program, and declare your variables ay their first point of use with my. It is a measure that will uncover many simple coding errors that you can otherwise easily overlook.
Perl variable names should use only lower-case letters, digits, and underscores. Capital letters are reserved for global identifiers such as package names.
By far the simplest and most common way of creating a static variable is just to declare it outside the subroutine. Like this
use strict;
use warnings;
my $some_list = some_routine(100, 200);
my $some_list2 = some_routine(100, 200);
my $some_addition;
sub some_routine {
$some_addition += $_ for #_;
return $some_addition
}
print $some_list, "\n";
print $some_list2, "\n";
output
300
600
If you want to protect the variable from being accessed by any following code other than the subroutine, then just enclose them in braces, like this
{
my $some_addition;
sub some_routine {
$some_addition += $_ for #_;
return $some_addition
}
}
Take a look at Persistent Private Variables in "man perlsub".

How can I break on warnings in the perl debugger?

I have a codepath that sometimes emits warnings. Since this path is used a lot of times I don't want to break on each pass. How can I break only on times it emits warnings?
You can use a "signal handler" for warnings and set a flag, then break when the flag is set.
our $warn_flag = 0;
$SIG{__WARN__} = sub { $warn_flag = 1; CORE::warn(#_) };
...
for $i (1 .. 1_000_000_000) {
do_something_that_might_warn();
$DB::single ||= $warn_flag;
$warn_flag = 0;
}
use warnings qw(FATAL);
This will convert warnings to errors in the lexical scope, which will automatically cause the debugger to break.

perl - universal operator overload

I have an idea for perl, and I'm trying to figure out the best way to implement it.
The idea is to have new versions of every operator which consider the undefined value as the identity of that operation. For example:
$a = undef + 5; # undef treated as 0, so $a = 5
$a = undef . "foo"; # undef treated as '', so $a = foo
$a = undef && 1; # undef treated as false, $a = true
and so forth.
ideally, this would be in the language as a pragma, or something.
use operators::awesome;
However, I would be satisfied if I could implement this special logic myself, and then invoke it where needed:
use My::Operators;
The problem is that if I say "use overload" inside My::Operators only affects objects blessed into My::Operators.
So the question is: is there a way (with "use overoad" or otherwise) to do a "universal operator overload" - which would be called for all operations, not just operations on blessed scalars.
If not - who thinks this would be a great idea !? It would save me a TON of this kind of code
if($object && $object{value} && $object{value} == 15)
replace with
if($object{value} == 15) ## the special "is-equal-to" operator
It is possible. It would take a lot of work, but you could write an "op checker" that replaces the ops for && with custom op that's your reimplementation of &&.
But it would be a very bad idea. For starters,
if ($a && $b) {
...
}
would stop being equivalent to
if ($a) {
if ($b) {
...
}
}
To take your own example,
if ($object && $object{value} && $object{value} == 15) {
...
}
With your requested model, it would have to be written
if ($object{value}) { if ($object{value} == 15) {
...
}}
You actually want the exact opposite of what you asked for. You actually want the current behaviour. Without your module, you can write:
if ($object{value} && $object{value} == 15) {
...
}
or
no warnings 'uninitialized';
if ($object{value} == 15) {
...
}
or
if (($object{value} // 0) == 15) {
...
}
As mob said, your pragma already exists. It's spelled no warnings 'uninitialized';. Perl already treats undef as either 0 or the empty string (depending on context). This just suppresses the warning you usually get (assuming you have warnings turned on, which you should).
If you want to create a package that does this automatically, you can:
package operators::awesome;
use strict;
use warnings;
sub import {
warnings->unimport('uninitialized');
}
Now use operators::awesome; will turn off warnings about uninitialized values.
Here's a fancier version of import that turns on strict and warnings, but turns off warnings about uninitialized values:
sub import {
strict->import;
warnings->import;
warnings->unimport('uninitialized');
}
All of those operations already work the way you expect them to:
In the context of numbers, undef is 0.
In the context of strings, undef is the empty string ''.
In the context of booleans, undef is 0.
If you use warnings, then perl will let you know that the value is uninitialized, but it will still work just fine.

How can I cleanly handle error checking in Perl?

I have a Perl routine that manages error checking. There are about 10 different checks and some are nested, based on prior success. These are typically not exceptional cases where I would need to croak/die. Also, once an error occurs, there's no point in running through the rest of the checks.
However, I can't seem to think of a neat way to solve this issue except by using something analogous to the following horrid hack:
sub lots_of_checks
{
if(failcond)
{
goto failstate:
}
elsif(failcond2)
{
goto failstate;
}
#This continues on and on until...
return 1; #O happy day!
failstate:
return 0; #Dead...
}
What I would prefer to be able to do would be something like so:
do
{
if(failcond)
{
last;
}
#...
};
An empty return statement is a better way of returning false from a Perl sub than returning 0. The latter value will actually be true in list context:
sub lots_of_checks {
return if fail_condition_1;
return if fail_condition_2;
# ...
return 1;
}
Perhaps you want to have a look at the following articles about exception handling in perl5:
perl.com: Object Oriented Exception Handling in Perl
perlfoundation.com: Exception Handling in Perl
You absolutely can do what you prefer.
Check: {
last Check
if failcond1;
last Check
if failcond2;
success();
}
Why would you not use exceptions? Any case where the normal flow of the code should not be followed is an exception. Using "return" or "goto" is really the same thing, just more "not what you want".
(What you really want are continuations, which "return", "goto", "last", and "throw" are all special cases of. While Perl does not have full continuations, we do have escape continuations; see http://metacpan.org/pod/Continuation::Escape)
In your code example, you write:
do
{
if(failcond)
{
last;
}
#...
};
This is probably the same as:
eval {
if(failcond){
die 'failcond';
}
}
If you want to be tricky and ignore other exceptions:
my $magic = [];
eval {
if(failcond){
die $magic;
}
}
if ($# != $magic) {
die; # rethrow
}
Or, you can use the Continuation::Escape module mentioned above. But
there is no reason to ignore exceptions; it is perfectly acceptable
to use them this way.
Given your example, I'd write it this way:
sub lots_of_checks {
local $_ = shift; # You can use 'my' here in 5.10+
return if /condition1/;
return if /condition2/;
# etc.
return 1;
}
Note the bare return instead of return 0. This is usually better because it respects context; the value will be undef in scalar context and () (the empty list) in list context.
If you want to hold to a single-exit point (which is slightly un-Perlish), you can do it without resorting to goto. As the documentation for last states:
... a block by itself is semantically identical to a loop that executes once.
Thus "last" can be used to effect an early exit out of such a block.
sub lots_of_checks {
local $_ = shift;
my $all_clear;
{
last if /condition1/;
last if /condition2/;
# ...
$all_clear = 1; # only set if all checks pass
}
return unless $all_clear;
return 1;
}
If you want to keep your single in/single out structure, you can modify the other suggestions slightly to get:
sub lots_of_checks
{
goto failstate if failcond1;
goto failstate if failcond2;
# This continues on and on until...
return 1; # O happy day!
failstate:
# Any clean up code here.
return; # Dead...
}
IMO, Perl's use of the statement modifier form "return if EXPR" makes guard clauses more readable than they are in C. When you first see the line, you know that you have a guard clause. This feature is often denigrated, but in this case I am quite fond of it.
Using the goto with the statement modifier retains the clarity, and reduces clutter, while it preserves your single exit code style. I've used this form when I had complex clean up to do after failing validation for a routine.

How can I detect recursing package calls in Perl?

I have a Perl project were I just had a problem by making a circular package call. The code below demonstrates the problem.
When this is executed, each package will call the other until all of the memory of the computer is consumed and it locks up. I agree that this is a bad design and that circular calls like this should not be made in the design, but my project is sufficiently big that I would like to detect this at run time.
I have read about the weaken function and Data::Structure::Util, but I have not figured out a way to detect if there is a circular package load (I am assume, because a new copy is being made at each iteration and stored in each copy of the $this hash). Any ideas?
use system::one;
my $one = new system::one();
package system::one;
use strict;
use system::two;
sub new {
my ($class) = #_;
my $this = {};
bless($this,$class);
# attributes
$this->{two} = new system::two();
return $this;
}
package system::two;
use strict;
use system::one;
sub new {
my ($class) = #_;
my $this = {};
bless($this,$class);
# attributes
$this->{one} = new system::one();
return $this;
}
Here, have some code too. :)
sub break_recursion(;$) {
my $allowed = #_ ? shift : 1;
my #caller = caller(1);
my $call = $caller[3];
my $count = 1;
for(my $ix = 2; #caller = caller($ix); $ix++) {
croak "found $count levels of recursion into $call"
if $caller[3] eq $call && ++$count > $allowed;
}
}
sub check_recursion(;$) {
my $allowed = #_ ? shift : 1;
my #caller = caller(1);
my $call = $caller[3];
my $count = 1;
for(my $ix = 2; #caller = caller($ix); $ix++) {
return 1
if $caller[3] eq $call && ++$count > $allowed;
}
return 0;
}
These are called like:
break_recursion(); # to die on any recursion
break_recursion(5); # to allow up to 5 levels of recursion
my $recursing = check_recursion(); # to check for any recursion
my $recursing = check_recursion(10); # to check to see if we have more than 10 levels of recursion.
Might CPAN these, I think. If anyone has any thoughts about that, please share.
The fact that these are in separate packages has nothing at all to do with the fact that this runs infinitely, consuming all available resources. You're calling two methods from within one another. This isn't circular reference, it's recursion, which is not the same thing. In particular, weaken won't help you at all. You'd get exactly the same effect from:
sub a {
b();
}
sub b {
a();
}
a();
The best way to avoid this is don't do that. More usefully, if you have to write recursive functions try not to use multiple functions in the recursion chain, but simply the one, so you have an easier time mentally keeping track of where your calls should terminate.
As to how to detect whether something like this is happening, you would have to do something simple like increment a variable with your recursion depth and terminate (or return) if your depth exceeds a certain value. But you really shouldn't have to rely on that, it's similar to writing a while loop and using an increment there to make sure your function doesn't run out of control. Just don't recurse over a set unless you know how and when it terminates.
Another relevant question would be what are you trying to accomplish in the first place?
I suggest making a routine called something like break_constructor_recursion() that uses caller() to examine the call stack like so:
Find out what method in what package just called me.
Look up the rest of the call stack seeing if that same method in that same package is anywhere further up.
If so, die() with something appropriate.
Then you add a call to break_constructor_recursion() in your constructors. If the constructor is being called from inside itself, it'll bomb out.
Now, this can throw false positives; it's not impossible for a constructor to be legitimately called inside itself. If you have issues with that, I'd say just have it look for some N additional occurrences of the constructor before it identifies an error. If there are 20 calls to system::two::new() on the stack, the chances that you aren't recursing are pretty low.
The classic break on double recursion is to use a state variable to determine if you are already inside a function:
{
my $in_a;
sub a {
return if $in_a; #do nothing if b(), or someone b() calls, calls a()
$in_a = 1;
b();
$in_a = 0;
}
}
You can do whatever you want if $in_a is true, but dieing or returning is common. If you are using Perl 5.10 or later you can use the state function instead of nesting the function in its own scope:
sub a {
state $in_a;
return if $in_a; #do nothing if b(), or someone b() calls, calls a()
$in_a = 1;
b();
$in_a = 0;
}
use warnings;
without warnings:
#!/usr/bin/perl
use strict;
sub foo {
foo();
}
foo();
-
$ perl script.pl
^C # after death
with warnings:
#!/usr/bin/perl
use strict;
use warnings;
sub foo {
foo();
}
foo();
-
$ perl script.pl
Deep recursion on subroutine "main::foo" at script.pl line 7.
^C # after death
Always always use warnings.
use warnings FATAL => qw( recursion );
#!/usr/bin/perl
use strict;
use warnings FATAL => qw( recursion );
sub foo {
foo();
}
foo();
-
$ perl script.pl
Deep recursion on subroutine "main::foo" at script.pl line 7.
$