How can I check if a Perl scalar holds a reference to a certain subroutine? - perl

In other words, how can I check for coderef "equality"?
The smartmatch operator doesn't work for obvious reasons (would treat it as CODE->(ANY)), but I've included it in the example to show what I'm after:
use strict;
use warnings;
use feature 'say';
sub pick_at_random {
my %table = #_;
return ( values %table )[ rand( keys %table ) ];
}
my %lookup = ( A => \&foo,
B => \&bar,
C => \&baz );
my $selected = pick_at_random( %lookup );
say $selected ~~ \&foo ? "Got 'foo'" :
$selected ~~ \&bar ? "Got 'bar'" :
$selected ~~ \&baz ? "Got 'baz'" :
"Got nadda" ;

You can use normal (numeric) equality (==), as is the case with all references:
Perl> $selected == \&foo
Perl> $selected == \&bar
Perl> $selected == \&baz
1
Live in action here
That breaks when the reference is blessed with something that overloads == or 0+ (which is unlikely for coderefs). In that case, you'd compare Scalar::Util::refaddr($selected).
From man perlref:
Using a reference as a number produces an integer representing its storage location in memory. The only useful thing to be done with this is to compare two
references numerically to see whether they refer to the same location.
if ($ref1 == $ref2) { # cheap numeric compare of references
print "refs 1 and 2 refer to the same thing\n";
}

Related

Check Array and Hash is Empty or Not

I have an array and hash. I just want to check whether they both are empty or not.
I found below two methods to check this. Any suggestion which would be more suffice.
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper;
my #a = qw/a b c/;
print Dumper(\#a);
my %b = (1 => "Hi");
print Dumper(\%b);
#a = ();
%b = ();
#Method 1
if(!#a && !%b){
print "Empty\n";
} else {
print "Not empty\n";
}
#Method 2
if(!scalar #a && !scalar keys %b){
print "Empty\n";
} else {
print "Not empty\n";
}
The case here is, either both would be Empty or both would have some values.
For finding whether a hash or array is empty,
Hash empty-ness: (%hash) and (keys %hash), when used in a boolean context, are equally optimised internally, and have been since since perl 5.28.0. They both just examine the hash for non-emptiness and evaluate to a true or false value. Prior to that, it was much more complex, and changed across releases, that is to say (keys %hash) may have been faster, but this is no longer a concern.
Array empty-ness: #array in scalar context has always been efficient, and will tell you whether the array is empty.
Array
Use #a in scalar context.
Examples:
say #a ? "not empty" : "empty";
#a
or die( "At least one value is required" );
my $num_elements = #a;
Hash
Use %h or keys( %h ) in scalar context.
If the code will be run on older versions of Perl, you want keys( %hash ) because %h was inefficient before 5.26.
Examples:
say %h ? "not empty" : "empty"; # Slower before 5.26
say keys( %h ) ? "not empty" : "empty";
%h # Slower before 5.26
or die( "At least one element is required" );
keys( %h )
or die( "At least one element is required" );
my $has_elements = %h; # Slower before 5.26
my $num_elements = %h; # 5.26+
my $num_elements = keys( %h );
Note that !#a and !scalar #a are identical since ! already imposes a scalar context. The same goes for !scalar keys %b and !keys %b.

In perl, can I dynamically create variables within a subroutine?

Background
In the code I'm writing, I'm passing data into methods using a hash-ref (see note [1]).
This, unfortunately, leads to a lot of repetitive code:
sub thing {
my ($self, $params) = #_;
my ($foo, $bar, $baz, $biff,);
if ( exists $params->{foo} && $params->{foo} ) {
$foo = $params->{foo};
}
# repeat for `bar`, `baz`, `biff`
## rest of function ##
}
(and duplicate in every function with parameters)
What I want to do
What would be far easier is to define a list of parameters, and then
iterate of that list, creating both the variables and setting them to a value if needed.
So to test this, I tried:
my $params = { x => 1, y => 2};
my #params = qw(x y z a b c);
gno strict 'refs';
rep( ${$_}, #params );
use strict 'refs';
foreach my $p (#params) {
if ( exists $params->{$p} && $params->{$p} ) {
${$p} = $params->{$p};
}
}
print "x:$x, y:$y, z:$z, a:$a, b:$b, c:$c\n"
which gives me the following error:
Global symbol "$x" requires explicit package name at ./test.pl line 20.
Global symbol "$y" requires explicit package name at ./test.pl line 20.
Global symbol "$z" requires explicit package name at ./test.pl line 20.
Global symbol "$c" requires explicit package name at ./test.pl line 20.
Can I do this dynamic variable creation thing? (and if so, how?)
[1] By using a hash to pass data in, I gain in many ways:
There is a clear indication of What each item of data is
The ORDER of the pieces of data is no longer important
I can miss one or more pieces of data, and I don't need to add in random undef values
I'm passing less data: 1 scalar (a reference) rather than multiple scalars
(I accept the danger of functions being able to change the parent's data, rather that mucking around with a copy of it...)
Yes, you can do this in Perl. But it's a terrible idea for all of the reasons explained by Mark Dominus in these three articles.
It's a far better idea to store these values in a hash.
#!/usr/bin/perl
use strict;
use warnings;
my $params = { x => 1, y => 2};
my #params = qw(x y z a b c);
my %var;
foreach my $p (#params) {
# You need to take care exactly what you want in this
# logical statement. The options are:
# 1/ $p exists in the hash
# exists $params->{$p}
# 2/ $p exists in the hash and has a defined value
# defined $params->{$p}
# 3/ $p exists in the hash and has a true value
# $params->{$p}
# I think the first option is most likely. The last one has
# good chance of introducing subtle bugs.
if ( exists $params->{$p} ) {
$var{$p} = $params->{$p};
}
}
print join ', ', map { "$_: " . ($var{$_} // 'undef') } #params;
print "\n";
It's a really bad idea to use symbolic references like this... hashes pretty well completely eliminate the need for this.
use warnings;
use strict;
my $params = { x => 1, y => 2, foo => 3, };
thing($params);
sub thing {
my $params = shift;
my $foo;
if (defined $params->{foo}){
$foo = $params->{foo};
}
print $foo;
}
You can also pass in a hash itself directly (whether it be pre-created, or passed inline to the sub. If pre-created, the sub will operate on a copy).
thing(foo => 1, x => 2);
sub thing {
my %params = #_;
print $params{foo} if defined $params{foo};
}
With thanks to Dave Cross & others - the following test works:
#!/usr/bin/perl
use strict;
use warnings;
use English qw( -no_match_vars ) ;
use Carp;
use Data::Dumper;
my $params = { x => 1, y => 2, z => 0};
my #params = qw(x y z a b c);
my %var;
foreach my $p (#params) {
if ( exists $params->{$p} ) {
$var{$p} = $params->{$p};
} else {
$var{$p} = undef;
}
}
print Dumper \%var;
This gives me %var with all desired parameters (as listed in #params, with the ones that are not passed in (ie, not in the $params hashref) created with an undef value.
Thus I can confidently test for value and truth, without worrying about existence.
Thank you all.
I did this using soft references:
#!perl
no strict "refs";
my %vars = ( x => 1, y => 2 );
for my $k ( keys %vars ) {
$$k = $vars{$k};
}
print $x, $y;
But there's a reason why the recommended settings (use strict; use warnings;) prevent this kind of pattern. It is easy to shoot yourself in the foot with it.
perl -Mstrict -MData::Dumper -wE'
{package Data::Dumper;our($Indent,$Sortkeys,$Terse,$Useqq)=(1)x4}
my #aok = qw( x y z a b c );
my %dft = ( a => -1 );
say "- - - -";
my $arg = { x => 1, y => 2, foo => 42 };
$arg = { %dft, %$arg };
say "arg: ", Dumper($arg);
my %var;
#var{ #aok } = #$arg{ #aok };
say "var: ", Dumper(\%var);
my %aok = map { $_ => 1 } #aok;
my #huh = grep !$aok{$_}, sort keys %$arg;
#huh and say "huh: ", Dumper(\#huh);
'
- - - -
arg: {
"a" => -1,
"foo" => 42,
"x" => 1,
"y" => 2
}
var: {
"a" => -1,
"b" => undef,
"c" => undef,
"x" => 1,
"y" => 2,
"z" => undef
}
huh: [
"foo"
]

How can I find out if elements of an array match any elements of another array?

if I have a hash
my %foo = ( foo => 1, bar => 1 );
I want to check if any key of %foo is in a comparison array (and obviously keys %foo is just an array ). I keep thinking some weird syntax that does't exist like.
my #cmp0 = qw( foo baz );
my #cmp1 = qw( baz blargh );
if keys %foo in #cmp0 # returns true because key foo is in the array
if keys %foo in #cmp1 # returns false because no key in foo is an element of cmp1
What is the simplest way to do this?
List::MoreUtils has a function called any that uses a syntax similar to grep, but stops its internal loop the first time the criteria are met. The advantage to this behavior is that far fewer iterations will be required (assuming random distribution of intersections).
An additional advantage of any is code clarity: It is named for what it does. Perl Best Practices discourages using grep in Boolean context because the assumed use for grep is to return a list of elements that match. It works in Boolean context, but the intent of the code is less clear to a reader than any, which is designed specifically for Boolean usage.
It is true that any adds a dependency on List::MoreUtils. However, List::MoreUtils is one of those modules that is so ubiquitous, it is highly likely to already be installed.
Here's an example:
use List::MoreUtils qw( any );
my %foo = ( foo => 1, bar => 1 );
my #cmp0 = qw( foo baz );
my #cmp1 = qw( baz blargh );
print "\#cmp0 and %foo have an intersection.\n"
if any { exists $foo{$_} } #cmp0;
print "\#cmp1 and %foo have an intersection.\n"
if any { exists $foo{$_} } #cmp1;
Another option is the ~~ Smart Match Operator, which became available in Perl 5.10.0 and newer. It could be used like this:
my %foo = ( foo => 1, bar => 1 );
my #cmp0 = qw( foo baz );
my #cmp1 = qw( baz blargh );
print "\#cmp0 and %foo have an intersection.\n" if #cmp0 ~~ %foo;
print "\#cmp1 and %foo have an intersection.\n" if #cmp1 ~~ %foo;
With smartmatch, you eliminate the List::MoreUtils dependency in favor of a minimum Perl version dependency. It's up to you to decide whether the code is as clear as any.
The tidiest way to write this is to use grep together with the exists operator.
This code
my %foo = ( foo => 1, bar => 1 );
my #cmp0 = qw( foo baz );
my #cmp1 = qw( baz blargh );
print "YES 0\n" if grep { exists $foo{$_} } #cmp0;
print "YES 1\n" if grep { exists $foo{$_} } #cmp1;
gives this output
YES 0
grep is a good idea, and probably the cleanest. You can however also use the logical OR assignment operator ||=:
my $found;
$found ||= exists $foo{$_} for #cmp1;
These are simple set operations.
use strictures;
use Set::Scalar qw();
⋮
my $foo = Set::Scalar->new(keys %foo);
$foo->intersection(Set::Scalar->new(#cmp0))->size; # 1
$foo->intersection(Set::Scalar->new(#cmp1))->size; # 0
There are - as usual - more ways to solve this. You could do it like this:
#!/usr/bin/perl
use strict ;
use warnings ;
my %hash = ( foo => 1 , bar => 1 ) ;
my %cmp = ( cmp0 => [ qw(foo baz) ] ,
cmp1 => [ qw(baz blargh) ] ) ;
my #hash_keys = keys %hash ;
foreach my $compare ( keys %cmp ) {
my %tmp ;
# Generate a temporary hash from comparison keys via hash slice
#tmp{#{$cmp{$compare}}} = undef ;
INNER:
foreach my $hash_key ( #hash_keys ) {
if( exists $tmp{$hash_key} ) {
printf "Key '%s' is part of '%s'.\n" , $hash_key , $compare ;
last INNER ;
}
}
}
This gives:
Key 'foo' is part of 'cmp0'.

In Perl, how can I test whether a sequence is of the form n, n + 1, n + 2, ..., n + k?

I'm trying to implement a subroutine that takes an array as its argument (or uses multiple arguments — still haven't quite grokked the difference), and returns true or false depending on whether that array is an increasing sequence (each number must be 1 more than the last):
isIncreasingArray(1,2,3,4); # true
isIncreasingArray(1,2,3,1); # false
isIncreasingArray(0,9,1); # false
isIncreasingArray(-2,-1,0); # true
isIncreasingArray(1,1,1,1); # false
This is what I've come up with:
sub isIncreasingArray {
my $last;
foreach $n (#_) {
return 0 if defined($last) && $last != $n - 1;
$last = int($n);
}
return 1;
}
I'm quite new to Perl and am wondering if there's a simpler or more concise way of achieving this? Also, is what I've written in line with best practices?
A couple of points:
For efficiency, especially to minimize memory footprint, you probably want to pass a reference to an array to the subroutine.
In list context, return 0 will return a list consisting of a single element and thus will be true. a bare return suffices when you want to return false and does the job in all contexts.
It is probably possible to cut the number of comparisons in half by comparing the difference between the first and the last, the second and the second last etc. to see differences equal difference in indexes, but I am not thinking that clearly right now.
Here is a slightly different version based on yours. Note that you should use strict and make sure to scope your loop variable using my:
#!/usr/bin/env perl
use strict; use warnings;
use Carp qw(croak);
use Test::More;
ok( isSimplyIncreasingSequence( [ 1298 ] ) ); # true
ok( isSimplyIncreasingSequence( [1,2,3,4] ) ); # true
ok( not isSimplyIncreasingSequence( [1,2,3,1] ) ); # false
ok( not isSimplyIncreasingSequence( [0,9,1] ) ); # false
ok( isSimplyIncreasingSequence( [-2,-1,0] ) ); # true
ok( not isSimplyIncreasingSequence( [1,1,1,1] ) ); # false
done_testing();
sub isSimplyIncreasingSequence {
my ($seq) = #_;
unless (defined($seq)
and ('ARRAY' eq ref $seq)) {
croak 'Expecting a reference to an array as first argument';
}
return 1 if #$seq < 2;
my $first = $seq->[0];
for my $n (1 .. $#$seq) {
return unless $seq->[$n] == $first + $n;
}
return 1;
}
And, of course, some benchmarks:
#!/usr/bin/env perl
use strict; use warnings;
use Benchmark qw( cmpthese );
use Carp qw( croak );
my %cases = (
ordered_large => [1 .. 1_000_000],
ordered_small => [1 .. 10],
unordered_large_beg => [5, 1 .. 999_000],
unordered_large_mid => [1 .. 500_000, 5, 500_002 .. 1_000_000],
unordered_large_end => [1 .. 999_999, 5],
);
for my $case (keys %cases) {
print "=== Case: $case\n";
my $seq = $cases{$case};
cmpthese -3, {
'ref' => sub { isSimplyIncreasingSequence($seq) },
'flat' => sub {isIncreasingArray(#{ $seq } ) },
};
}
sub isSimplyIncreasingSequence {
my ($seq) = #_;
unless (defined($seq)
and ('ARRAY' eq ref $seq)) {
croak 'Expecting a reference to an array as first argument';
}
return 1 if #$seq < 2;
my $first = $seq->[0];
for my $n (1 .. $#$seq) {
return unless $seq->[$n] == $first + $n;
}
return 1;
}
sub isIncreasingArray {
my $last;
foreach my $n (#_) {
return 0 if defined($last) && $last != $n - 1;
$last = int($n);
}
return 1;
}
=== Case: unordered_large_mid
Rate flat ref
flat 4.64/s -- -18%
ref 5.67/s 22% --
=== Case: ordered_small
Rate ref flat
ref 154202/s -- -11%
flat 173063/s 12% --
=== Case: ordered_large
Rate flat ref
flat 2.41/s -- -13%
ref 2.78/s 15% --
=== Case: unordered_large_beg
Rate flat ref
flat 54.2/s -- -83%
ref 315/s 481% --
=== Case: unordered_large_end
Rate flat ref
flat 2.41/s -- -12%
ref 2.74/s 14% --
How come no one's come up with a smart-match solution?
While this one isn't as efficient as some of the other solutions, it has the added benefit of working with strings as well.
EDIT
Sub now returns true for empty and single-element lists because that's what the experts say it should do:
use strict;
use warnings;
use 5.010;
sub is_simply_increasing { #_ < 2 || #_ ~~ [$_[0] .. $_[-1]] }
say ( is_simply_increasing(1,2,3,4) ? 'true' : 'false' ); # true
say ( is_simply_increasing(1,2,3,1) ? 'true' : 'false' ); # false
say ( is_simply_increasing(0,9,1) ? 'true' : 'false' ); # false
say ( is_simply_increasing(-2,-1,0) ? 'true' : 'false' ); # true
say ( is_simply_increasing(1,1,1,1) ? 'true' : 'false' ); # false
say ( is_simply_increasing(1,4,1,-1) ? 'true' : 'false' ); # false
say ( is_simply_increasing('a','c') ? 'true' : 'false' ); # false
say ( is_simply_increasing('love'..'perl') ? 'true' : 'false' ); # true
say ( is_simply_increasing(2) ? 'true' : 'false' ); # true
say ( is_simply_increasing() ? 'true' : 'false' ); # true
I love it when my sub's a single-line!
I ended up with something a little longer than yours. Which means, I suppose, that there's nothing wrong with your solution :)
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Test::More;
sub is_increasing_array {
return unless #_;
return 1 if #_ == 1;
foreach (1 .. $#_) {
return if $_[$_] != $_[$_ - 1] + 1;
}
return 1;
}
ok(is_increasing_array(1,2,3,4)); # true
ok(!is_increasing_array(1,2,3,1)); # false
ok(!is_increasing_array(0,9,1)); # false
ok(is_increasing_array(-2,-1,0)); # true
ok(!is_increasing_array(1,1,1,1)); # false
done_testing;
Using a pre-6 "junction":
sub is_increasing_list {
use List::MoreUtils qw<none>;
my $a = shift;
return none {
( my $v, $a ) = (( $_ - $a != 1 ), $_ );
$v;
} #_;
}
The none expression could also be written (more cryptically) as
return none { [ ( $a, undef ) = ( $_, ( $_ - $a - 1 )) ]->[-1]; } #_;
(If the constraint is that $x[$n+1] - $x[$n] == 1, then subtracting 1 makes a "Perl truth condition" as well.)
Actually come to think of it a 'none' junction operator is kind of backward to the concept, so I'll use the all:
sub is_increasing_list {
use List::MoreUtils qw<all>;
my $a = shift;
return all { [ ( $a, undef ) = ( $_, ( $_ - $a == 1 )) ]->[-1]; } #_;
}
Someone has to throw in the functional-programming solution here, since this sort of mathematical formula just begs for recursion. ;)
sub isIncreasingArray {
return 1 if #_ <= 1;
return (pop(#_) - $_[-1] == 1) && isIncreasingArray(#_);
}
As for a subroutine argument being an array versus multiple arguments, think of it this way: Perl is always sending a list of arguments to your subroutine as the array #_. You can either shift or pop off arguments from that array as individual scalars, or otherwise operate on the whole list as an array. From inside your subroutine, it's still an array, period.
If you get into references, yes you can pass a reference-to-an-array into a subroutine. That reference is still technically being passed to your subroutine as an array (list) containing one scalar value: the reference. First I'd ignore all this and wrap your head around basic operation without references.
Calling the subroutine. This way, Perl is secretly converting your bare list of scalars into an array of scalars:
isIncreasingArray(1,2,3,4);
This way, Perl is passing your array:
#a = (1,2,3,4);
$answer = isIncreasingArray(#a);
Either way, the subroutine gets an array. And it's a copy*, hence the efficiency talk of references here. Don't worry about that for K<10,000, even with my ridiculously inefficient, academic, elegant, recursive solution here, which still takes under 1 second on my laptop:
print isIncreasingArray(1..10000), "\n"; # true
*A copy: sort of, but not really? See comments below, and other resources, e.g. PerlMonks. "One might argue that Perl always does Pass-By-Reference, but protects us from ourselves." Sometimes. In practice I make my own copies inside subroutines into localized "my" variables. Just do that.
This is the shortest form I could come up to, check each element in a map to see if it is equal to the increased self, return a set of 0 and 1, count the 1 and match against the original size of the set.
print isIncreasingArray(1,2,3),"\n";
print isIncreasingArray(1,2,1),"\n";
print isIncreasingArray(1,2),"\n";
print isIncreasingArray(1),"\n";
sub isIncreasingArray {
$i = $_[0];
(scalar grep { 1 == $_ } map { $i++ == $_ } #_) == scalar(#_) || 0;
}
Whatever implementation you use, it wouldn't hurt to make some quick checks beforehand:
sub isSimplyIncreasingSequence {
return 1 if #_ < 2;
return 0 if $_[-1] - $_[0] != $#_;
...
}

Passing a scalar reference in Perl

I know that passing a scalar to a sub is actually passing the reference, but since I am new to perl I still did the following test:
#!/usr/bin/perl
$i = 2;
subr(\$i);
sub subr{
print $_[0]."\n";
print $$_[0]."\n";
}
I thought the first line is going to print an address and the second line is going to give be back the number, but the second one is a blank line. I was pointed by someone one else to do this: ${$_[0]} and it prints the number. But she didn't know the reason why without {} it is not working and why it is working with {}. So what has happened?
It's because your second print statement is equivalent to doing this...
my $x = $$_; print $x[0];
When what you want is
my $x = $_[0]; print $$x;
In other words, the de-referencing occurs before the array subscript is evaluated.
When you add those curl-wurlies, it tells perl how to interpret the expression as you want it; it will evaluate $_[0] first, and then de-reference to get the value.
It's an order-of-evaluation thing.
$$_[0] is evaluated as {$$_}[0]
This is the 0th element of the reference of the scalar variable $_. It's taking the reference first and then trying to find the 0th element of it.
${$_[0]}
This is a reference to the 0th element of the array #_. It's finding the 0th element first then taking a reference of that.
If you set use strict and use warnings at the top of your code you'll see plenty of warnings about undefined values from your first attempt.
$$_[0] is like $foo[0], only with $_ in place of the array name. This means $_ is treated as an array reference, and the expression doesn't involve the scalar reference $_[0] at all. $_->[0] is equivalent, using the alternate -> syntax. Syntax for dereferencing may seem arbitrary and hard to remember, but there is underlying sense and order; a very good presentation of it is at http://perlmonks.org/?node=References+quick+reference.
You don't have to pass a reference to $i. The notation $_[0] is an alias for $i when you invoke it as subr( $i ).
use strict;
use warnings;
use Test::More tests => 2;
sub subr{ $_[0]++ } # messing with exactly what was passed first
my $i=2;
is( $i, 2, q[$i == 2] );
subr($i);
is( $i, 3, q[$i == 3] );
Another example is this:
use strict;
use warnings;
use Test::More tests => 6;
use Test::Exception;
sub subr{ $_[0]++ }
my $i=2;
is( $i, 2, q[$i == 2] );
subr($i);
is( $i, 3, q[$i == 3] );
sub subr2 { $_[0] .= 'x'; }
dies_ok { subr2( 'lit' ); } 'subr2 *dies* trying to modify a literal';
lives_ok {
my $s = 'lit';
subr2( $s );
is( $s, 'litx', q[$s eq 'litx'] );
subr2(( my $s2 = 'lit' ));
is( $s2, 'litx', q[$s2 eq 'litx'] );
} 'subr2 lives with heap variables';
Output:
ok 1 - $i == 2
ok 2 - $i == 3
ok 3 - subr2 *dies* trying to modify a literal
ok 4 - $s eq 'litx'
ok 5 - $s2 eq 'litx'
ok 6 - subr2 lives with heap variables
1..6