Reference counting problem with Perl 5.12.3? - perl

It seems that it's cleaning up the pad too early:
sub search {
my ( $self, $test ) = #_;
my $where;
my $found = 0;
my $counter = 0;
$self->descend( pre_each => sub {
my $lvl = shift;
my $ev_return
= $lvl->each_value( sub {
$counter++;
my ( $name, $value ) = #_;
say "\$name=$name";
say "\$value=$value";
return 1 unless $found = $test->( $value );
$where = { key => $lvl, name => $name, value => $value };
# when any intermediate function sees QUIT_FLAG, it
# knows to return control to the method that called it.
return QUIT_FLAG;
});
say "\$found=$found";
say "\$where=$where";
return $ev_return;
});
say "\$counter=$counter";
say "\$found=$found";
say "\$where=$where";
return unless $found;
return $where;
}
And what I get is:
...
$found=1
$where=HASH(...)
$counter=0
$found=0
$where=
Or, if anybody can point to something bone-headed I'm doing, I'd really appreciate it. I even created incremental variables between the first and outer closure, but they got reset too. Even setting references on the innermost closure, gets me nothing in the named sub scope!
The entire code concerned here is 500 lines. It is impractical to include the code.

It would be really good if you could provide a complete, runnable example.
Stab in the dark: does it help to have an extraneous use of $found in the outer anonymous sub (e.g. $found if 0;)?

Do not use my with statement modifiers!
The problem turned out to be in a called scope. Having forgotten the warning against using my with a statement modifier, I had coded the following:
my $each = shift if #_ == 1;
my %params = #_ unless $each;
The first time it went through #_ had one argument. It assigned the first value to $each. The second time through, with more arguments it skipped the my. So there was no declaration in the current scope, so it simply reused the sub that I had assigned the last time, and saved nothing in %params because the $each it referred to had a value.
Weird, but as ysth pointed out perlsyn warns against this behavior. I think I used to know this, but have forgotten it over the years. Switching it to
my ( %params, $each );
if ( #_ == 1 ) {
$each = shift;
}
else {
%params = #_;
}
did the trick. It not only cleaned up the problems I was having with another method, but it cleaned up problems in search.

Related

Storing the return value from a perl subroutine "as is"

I need to wrap a sub so that it can trigger before/after events before returning its original return value. Everything works except if the coderef returns an array it will be cast to an arrayref. Sometimes I do want an arrayref so I can't just check the ref type and recast to an array. I've also tried using wantarray which is still returning an arrayref.
I'm currently trying to go about it like this
$owner->methods->each(sub { # Methods is a hash container where each entry is a mapping of a method name to its implementation (as a coderef)
my ($key, $val) = #_;
return if $key eq 'trigger' || $key eq 'triggerBefore' || $key eq 'before' || $key eq 'on'; # Ignore decorating keys that would cause infinite callbacks
$owner->methods->set($key, sub { # Replace the original sub with a version that calls before and after events
my ($obj, #args) = #_;
$owner->triggerBefore($key, #args);
my $return = $val->(#args); # Return could be anything...
$owner->trigger($key, $return);
return $return;
});
});
I've also tried replacing the return with the following to no avail:
return (wantarray && ref $return eq 'ARRAY') ? #$return : $return;
Everything works fine if I don't store the return value and instead return $val->(#args); (but then I lose the "after" trigger). Is there a way to store the return value "as is" rather than storing it in a scalar?
If I understand correctly, your original subroutine returns an array when called in a list context, and an arrayref when called in a scalar context.
You need to invoke the wrapped sub in the same calling context as the caller provides, and store and later return its returned value as appropriate. This has the added advantage of letting a context-aware sub skip lengthy calculations when, for example, invoked in a void context.
This complicates the wrapping a bit. You probably also want to pass in #_, in case the sub makes any modifications there, too:
sub {
...
my $wa = wantarray;
my #ret;
... trigger_before() ...
unless (defined($wa)) { # void
$original_sub->(#_);
} elsif (not $wa) { # scalar
$ret[0] = $original_sub->(#_);
} else { # list
#ret = $original_sub->(#_);
}
... trigger_after() ...
return unless defined($wa);
return $wa ? #ret : $ret[0];
}

Should a subroutine always return explicitly?

If perlcritic says "having no returns in a sub is wrong", what is the alternative if they really aren't needed?
I've developed two apparently bad habits:
I explicitly assign variables to the '$main::' namespace.
I then play with those variables in subs.
For example, I might do..
#!/usr/bin/perl
use strict;
use warnings;
#main::array = (1,4,2,6,1,8,5,5,2);
&sort_array;
&push_array;
&pop_array;
sub sort_array{
#main::array = sort #main::array;
for (#main::array){
print "$_\n";
}
}
sub push_array{
for ( 1 .. 9 ){
push #main::array, $_;
}
}
sub pop_array {
for ( 1 .. 3 ){
pop #main::array;
}
}
I don't do this all the time. But in the above, it makes sense, because I can segregate the operations, not have to worry about passing values back and forth and it generally looks tidy to me.
But as I said, perl critic says its wrong - because there's no return..
So, is anyone able to interpret what I'm trying to do and suggest a better way of approaching this style of coding in perl? eg. am I sort of doing OOP?
In short - yes, you're basically doing OO, but in a way that's going to confuse everyone.
The danger of doing subs like that is that you're acting at a distance. It's a bad coding style to have to look somewhere else entirely for what might be breaking your code.
This is generally why 'globals' are to be avoided wherever possible.
For a short script, it doesn't matter too much.
Regarding return values - Perl returns the result of the last expression by default. (See: return)
(In the absence of an explicit return, a subroutine, eval, or do FILE automatically returns the value of the last expression evaluated.)
The reason Perl critic flags it is:
Require all subroutines to terminate explicitly with one of the following: return, carp, croak, die, exec, exit, goto, or throw.
Subroutines without explicit return statements at their ends can be confusing. It can be challenging to deduce what the return value will be.
Furthermore, if the programmer did not mean for there to be a significant return value, and omits a return statement, some of the subroutine's inner data can leak to the outside.
Perlcritic isn't always right though - if there's good reason for doing what you're doing, then turn it off. Just as long as you've thought about it and are aware of the risks an consequences.
Personally I think it's better style to explicitly return something, even if it is just return;.
Anyway, redrafting your code in a (crude) OO fashion:
#!/usr/bin/perl
use strict;
use warnings;
package MyArray;
my $default_array = [ 1,4,2,6,1,8,5,5,2 ];
sub new {
my ( $class ) = #_;
my $self = {};
$self -> {myarray} = $default_array;
bless ( $self, $class );
return $self;
}
sub get_array {
my ( $self ) = #_;
return ( $self -> {myarray} );
}
sub sort_array{
my ( $self ) = #_;
#{ $self -> {myarray} } = sort ( #{ $self -> {myarray} } );
for ( #{ $self -> {myarray} } ) {
print $_,"\n";
}
return 1;
}
sub push_array{
my ( $self ) = #_;
for ( 1 .. 9 ){
push #{$self -> {myarray}}, $_;
}
return 1;
}
sub pop_array {
my ( $self ) = #_;
for ( 1 .. 3 ){
pop #{$self -> {myarray}};
}
return 1;
}
1;
And then call it with:
#!/usr/bin/perl
use strict;
use warnings;
use MyArray;
my $array = MyArray -> new();
print "Started:\n";
print join (",", #{ $array -> get_array()} ),"\n";
print "Reshuffling:\n";
$array -> sort_array();
$array -> push_array();
$array -> pop_array();
print "Finished:\n";
print join (",", #{ $array -> get_array()} ),"\n";
It can probably be tidied up a bit, but hopefully this illustrates - within your object, you've got an internal 'array' which you then 'do stuff with' by making your calls.
Result is much the same (I think I've replicated the logic, but don't trust that entirely!) but you have a self contained thing going on.
If the function doesn't mean to return anything, there's no need to use return!
No, you don't use any aspects of OO (encapsulation, polymorphism, etc). What you are doing is called procedural programming. Nothing wrong with that. All my work for nuclear power plants was written in that style.
The problem is using #main::array, and I'm not talking about the fact that you could abbreviate that to #::array. Fully-qualified names escape strict checks, so they are far, far more error-prone. Mistyped var name won't get caught as easily, and it's easy to have two pieces of code collide by using the same variable name.
If you're just using one file, you can use my #array, but I presume you are using #main::array because you are accessing it from multiple files/modules. I suggest placing our #array in a module, and exporting it.
package MyData;
use Exporter qw( import );
our #EXPORT = qw( #array );
our #array;
1;
Having some kind of hint in the variable name (such as a prefix or suffix) indicating this is a variable used across many modules would be nice.
By the way, if you wanted do create an object, it would look like
package MyArray;
sub new {
my $class = shift;
my $self = bless({}, $class);
$self->{array} = [ #_ ];
return $self;
}
sub get_elements {
my ($self) = #_;
return #{ $self->{array} };
}
sub sort {
my ($self) = #_;
#{ $self->{array} } = sort #{ $self->{array} };
}
sub push {
my $self = shift;
push #{ $self->{array} }, #_;
}
sub pop {
my ($self, $n) = #_;
return splice(#{ $self->{array} }, 0, $n//1);
}
my $array = MyArray->new(1,4,2,6,1,8,5,5,2);
$array->sort;
print("$_\n") for $array->get_elements();
$array->push_array(1..9);
$array->pop_array(3);
I improved your interface a bit. (Sorting shouldn't print. Would be nice to push different things and to pop other than three elements.)

Using a variable as a method name in Perl

I have a perl script (simplified) like so:
my $dh = Stats::Datahandler->new(); ### homebrew module
my %url_map = (
'/(article|blog)/' => \$dh->articleDataHandler,
'/video/' => \$dh->nullDataHandler,
);
Essentially, I'm going to loop through %url_map, and if the current URL matches a key, I want to call the function pointed to by the value of that key:
foreach my $key (keys %url_map) {
if ($url =~ m{$key}) {
$url_map{$key}($url, $visits, $idsite);
$mapped = 1;
last;
}
}
But I'm getting the message:
Can't use string ("/article/") as a subroutine ref while "strict refs" in use at ./test.pl line 236.
Line 236 happens to be the line $url_map{$key}($url, $visits, $idsite);.
I've done similar things in the past, but I'm usually doing it without parameters to the function, and without using a module.
Since this is being answered here despite being a dup, I may as well post the right answer:
What you need to do is store a code reference as the values in your hash. To get a code reference to a method, you can use the UNIVERSAL::can method of all objects. However, this is not enough as the method needs to be passed an invocant. So it is clearest to skip ->can and just write it this way:
my %url_map = (
'/(article|blog)/' => sub {$dh->articleDataHandler(#_)},
'/video/' => sub {$dh->nullDataHandler(#_)},
);
This technique will store code references in the hash that when called with arguments, will in turn call the appropriate methods with those arguments.
This answer omits an important consideration, and that is making sure that caller works correctly in the methods. If you need this, please see the question I linked to above:
How to take code reference to constructor?
You're overthinking the problem. Figure out the string between the two forward slashes, then look up the method name (not reference) in a hash. You can use a scalar variable as a method name in Perl; the value becomes the method you actually call:
%url_map = (
'foo' => 'foo_method',
);
my( $type ) = $url =~ m|\A/(.*?)/|;
my $method = $url_map{$type} or die '...';
$dh->$method( #args );
Try to get rid of any loops where most of the iterations are useless to you. :)
my previous answer, which I don't like even though it's closer to the problem
You can get a reference to a method on a particular object with can (unless you've implemented it yourself to do otherwise):
my $dh = Stats::Datahandler->new(); ### homebrew module
my %url_map = (
'/(article|blog)/' => $dh->can( 'articleDataHandler' ),
'/video/' => $dh->can( 'nullDataHandler' ),
);
The way you have calls the method and takes a reference to the result. That's not what you want for deferred action.
Now, once you have that, you call it as a normal subroutine dereference, not a method call. It already knows its object:
BEGIN {
package Foo;
sub new { bless {}, $_[0] }
sub cat { print "cat is $_[0]!\n"; }
sub dog { print "dog is $_[0]!\n"; }
}
my $foo = Foo->new;
my %hash = (
'cat' => $foo->can( 'cat' ),
'dog' => $foo->can( 'dog' ),
);
my #tries = qw( cat dog catbird dogberg dogberry );
foreach my $try ( #tries ) {
print "Trying $try\n";
foreach my $key ( keys %hash ) {
print "\tTrying $key\n";
if ($try =~ m{$key}) {
$hash{$key}->($try);
last;
}
}
}
The best way to handle this is to wrap your method calls in an anonymous subroutine, which you can invoke later. You can also use the qr operator to store proper regexes to avoid the awkwardness of interpolating patterns into things. For example,
my #url_map = (
{ regex => qr{/(article|blog)/},
method => sub { $dh->articleDataHandler }
},
{ regex => qr{/video/},
method => sub { $dh->nullDataHandler }
}
);
Then run through it like this:
foreach my $map( #url_map ) {
if ( $url =~ $map->{regex} ) {
$map->{method}->();
$mapped = 1;
last;
}
}
This approach uses an array of hashes rather than a flat hash, so each regex can be associated with an anonymous sub ref that contains the code to execute. The ->() syntax dereferences the sub ref and invokes it. You can also pass parameters to the sub ref and they'll be visible in #_ within the sub's block. You can use this to invoke the method with parameters if you want.

Perl, check if pair exists in hash of hashes

In Perl, I have a hash of hashes created with a loop similar to the following
my %HoH
for my $i (1..10) {
$HoH{$a}{$b} = $i;
}
$a and $b are variables that do have some value when the HoH gets filled in. After creating the HoH, how can I check if a particular pair ($c, $d) exists in the HoH? The following does not work
if (defined $HoH{$c}{$d}) {...}
because if $c does not exist in HoH already, it will be created as a key without a value.
Writing
if (defined $HoH{$c}{$d}) {...}
will "work" insomuch as it will tell you whether or not $HoH{$c}{$d} has a defined value. The problem is that if $HoH{$c} doesn't already exist it will be created (with an appropriate value) so that $HoH{$c}{$d} can be tested. This process is called "autovivification." It's convenient when setting values, e.g.
my %hoh;
$hoh{a}{b} = 1; # Don't need to set '$hoh{a} = {}' first
but inconvenient when retrieving possibly non-existent values. I wish that Perl was smart enough to only perform autovivification for expressions used as lvalues and short-circuit to return undef for rvalues but, alas, it's not that magical. The autovivification pragma (available on CPAN) adds the functionality to do this.
To avoid autovivification you need to test the intermediate values first:
if (exists $HoH{$c} && defined $HoH{$c}{$d}) {
...
}
use Data::Dumper;
my %HoH;
$HoH{A}{B} = 1;
if(exists $HoH{C} && exists $HoH{C}{D}) {
print "exists\n";
}
print Dumper(\%HoH);
if(exists $HoH{C}{D}) {
print "exists\n";
}
print Dumper(\%HoH);
Output:
$VAR1 = {
'A' => {
'B' => 1
}
};
$VAR1 = {
'A' => {
'B' => 1
},
'C' => {}
};
Autovivification is causing the keys to be created. "exists" in my second example shows this so the first example checks both keys individually.
Several ways:
if ( $HoH{$c} && defined $HoH{$c}{$d} ) {...}
or
if ( defined ${ $HoH{$c} || {} }{$d} ) {...}
or
no autovivification;
if (defined $HoH{$c}{$d}) {...}
or
use Data::Diver;
if ( defined Data::Diver::Dive( \%HoH, $c, $d ) ) {...}
You have to use the exists function
exists EXPR
Given an expression that specifies an
element of a hash, returns true if the
specified element in the hash has ever
been initialized, even if the
corresponding value is undefined.
Note that the EXPR can be arbitrarily
complicated as long as the final
operation is a hash or array key
lookup or subroutine name:
if (exists $ref->{A}->{B}->{$key}) { }
if (exists $hash{A}{B}{$key}) { }
My take:
use List::Util qw<first>;
use Params::Util qw<_HASH>;
sub exists_deep (\[%$]#) {
my $ref = shift;
return unless my $h = _HASH( $ref ) // _HASH( $$ref )
and defined( my $last_key = pop )
;
# Note that this *must* be a hash ref, for anything else to make sense.
return if first { !( $h = _HASH( $h->{ $_ } )) } #_;
return exists $h->{ $last_key };
}
You could also do this recursively. You could also create a descent structure allowing intermediate and even terminal arrayref with just a little additional coding.

Can I overload Perl's =? (And a problem while use Tie)

I choose to use tie and find this:
package Galaxy::IO::INI;
sub new {
my $invocant = shift;
my $class = ref($invocant) || $invocant;
my $self = {']' => []}; # ini section can never be ']'
tie %{$self},'INIHash';
return bless $self, $class;
}
package INIHash;
use Carp;
require Tie::Hash;
#INIHash::ISA = qw(Tie::StdHash);
sub STORE {
#$_[0]->{$_[1]} = $_[2];
push #{$_[0]->{']'}},$_[1] unless exists $_[0]->{$_[1]};
for (keys %{$_[2]}) {
next if $_ eq '=';
push #{$_[0]->{$_[1]}->{'='}},$_ unless exists $_[0]->{$_[1]}->{$_};
$_[0]->{$_[1]}->{$_}=$_[2]->{$_};
}
$_[0]->{$_[1]}->{'='};
}
if I remove the last "$[0]->{$[1]}->{'='};", it does not work correctly.
Why ?
I know a return value is required. But "$[0]->{$[1]};" cannot work correctly either, and $[0]->{$[1]}->{'='} is not the whole thing.
Old post:
I am write a package in Perl for parsing INI files.
Just something based on Config::Tiny.
I want to keep the order of sections & keys, so I use extra array to store the order.
But when I use " $Config->{newsection} = { this => 'that' }; # Add a section ", I need to overload '=' so that "newsection" and "this" can be pushed in the array.
Is this possible to make "$Config->{newsection} = { this => 'that' };" work without influence other parts ?
Part of the code is:
sub new {
my $invocant = shift;
my $class = ref($invocant) || $invocant;
my $self = {']' => []}; # ini section can never be ']'
return bless $self, $class;
}
sub read_string {
if ( /^\s*\[\s*(.+?)\s*\]\s*$/ ) {
$self->{$ns = $1} ||= {'=' => []}; # ini key can never be '='
push #{$$self{']'}},$ns;
next;
}
if ( /^\s*([^=]+?)\s*=\s*(.*?)\s*$/ ) {
push #{$$self{$ns}{'='}},$1 unless defined $$self{$ns}{$1};
$self->{$ns}->{$1} = $2;
next;
}
}
sub write_string {
my $self = shift;
my $contents = '';
foreach my $section (#{$$self{']'}}) {
}}
Special Symbols for Overload
lists the behaviour of Perl overloading for '='.
The value for "=" is a reference to a function with three arguments, i.e., it looks like the other values in use overload. However, it does not overload the Perl assignment operator. This would go against Camel hair.
So you will probably need to rethink your approach.
This is not exactly JUST operator overloading, but if you absolutely need this functionality, you can try a perl tie:
http://perldoc.perl.org/functions/tie.html
Do you know about Config::IniFiles? You might consider that before you go off and reinvent it. With some proper subclassing, you can add ordering to it.
Also, I think you have the wrong interface. You're exposing the internal structure of your object and modifying it through magical assignments. Using methods would make your life much easier.