Hash lookup with an array of keys? - perl

I'd like to see if a hash contains any of a list of keys. Currently I have:
if (grep {$me_hash{$_}} #me_list){
...
but I figure it's inefficient because it doesn't quit on the first match. Is there a better, more idiomatic way to do it?
kthxbi

You can use List::MoreUtils's any
use List::MoreUtils qw(any);
if (any { $me_hash{$_} } #me_list) {
Which presumably short circuits on the first match. This function is rather simple, looking like this:
sub any (&#) {
my $f = shift;
foreach ( #_ ) {
return YES if $f->();
}
return NO;
}
Where YES and NO are defined as
use constant YES => ! 0;
use constant NO => ! 1;
Meaning you can swing your own version of this with something like
sub is_in {
my ($href, #list) = #_;
for (#list) {
return 1 if $href->{$_};
}
return 0;
}
Note that the statement you are using $me_hash{$_} can return false for values you might not consider false, such as the empty string, or zero 0.

List::Util unlike List::MoreUtils is core module,
use List::Util qw(first);
if (defined first {$me_hash{$_}} #me_list) { .. }
and if you don't want to use any of external modules,
my $any = sub{ $me_hash{$_} and return 1 for #_; 0 }->(#me_list);
if ($any) { .. }

Related

Can Perl's "exists" modify data structure values?

I have a nested hash table that looks like this:
my %myhash = (
"val1" => {
"A/B.c" => {
"funct1" => 1
}
},
"val2" => {
"C/D.c" => {
"funct2" => 1
}
}
)
My objective with this data structure is to produce different values based on whether certain hash tables exist. For example,
sub mysub
{
my $val = shift;
my $file = shift;
my $funct = shift;
if (exists $myhash{$val}{$file}{$funct}) {
return "return1";
}
if (exists $myhash{$val}{$file}) {
return "return2";
}
return "return3";
}
The behavior I'm encountering is as follows. I have an instance in time when
my $val = "val1";
my $file = "C/D.c";
my $funct = "funct3";
At this point in time, the return value I get "return2". These are my observations with the Perl debugger:
Break at first "if" in mysub
Print p $proxToBugs{"val1"}{"C/D.c"} ==> Returns blank line. Okay. Continue and this "if" is skipped.
Continue and break at the second "if" in mysub
Print p $proxToBugs{"val1"}{"C/D.c"} ==> Returns "HASH(0x...)". WTF moment. Function returns "return2".
This tells me that running the first if modified the data structure, which allows the second if to pass when in fact it shouldn't. The function I'm running is identical to the function shown above; this one is just sanitized. Anyone has an explanation for me? :)
Yes. This is because of autovivification. See the bottom of the exists documentation:
Although the mostly deeply nested array or hash will not spring into existence just because its existence was tested, any intervening ones [autovivified arrays or hashes] will [spring into existance]. Thus $ref->{"A"} and $ref->{"A"}->{"B"} will spring into existence due to the existence test for the $key element above. This happens anywhere the arrow operator is used...
Where "...test for the $key element above..." refers to:
if (exists $ref->{A}->{B}->{$key}) { }
if (exists $hash{A}{B}{$key}) { } # same idea, implicit arrow
Happy coding.
As pst rightly points out, this is autovivification. There are at least two ways to avoid it. The first (and most common in my experience) is to test at each level:
if (
exists $h{a} and
exists $h{a}{b} and
exists $h{a}{b}{c}
) {
...
}
The short-circuit nature of and causes the second and third calls to exists to not be executed if the earlier levels don't exist.
A more recent solution is the autovivification pragma (available from CPAN):
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Useqq = 1;
{
my %h;
if (exists $h{a}{b}{c}) {
print "impossible, it is empty\n";
}
print Dumper \%h;
}
{
no autovivification;
my %h;
if (exists $h{a}{b}{c}) {
print "impossible, it is empty\n";
}
print Dumper \%h;
}
A third method that ysth mentions in the comments has the benefits of being in core (like the first example) and of not repeating the exists function call; however, I believe it does so at the expense of readability:
if (exists ${ ${ $h{a} || {} }{b} || {} }{c}) {
...
}
It works by replacing any level that doesn't exist with a hashref to take the autovivification. These hashrefs will be discarded after the if statement is done executing. Again we see the value of short-circuiting logic.
Of course, all three of these methods makes an assumption about the data the hash is expected to hold, a more robust method includes calls to ref or reftype depending on how you want to treat objects (there is a third option that takes into account classes that overload the hash indexing operator, but I can't remember its name):
if (
exists $h{a} and
ref $h{a} eq ref {} and
exists $h{a} and
ref $h{a}{b} eq ref {} and
exists $h{a}{b}{c}
) {
...
}
In the comments, pst asked if something like myExists($ref,"a","b","c") exists. I am certain there is a module in CPAN that does something like that, but I am not aware of it. There are too many edge cases for me to find that useful, but a simple implementation would be:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
sub safe_exists {
my ($ref, #keys) = #_;
for my $k (#keys) {
return 0 unless ref $ref eq ref {} and exists $ref->{$k};
$ref = $ref->{$k};
}
return 1;
}
my %h = (
a => {
b => {
c => 5,
},
},
);
unless (safe_exists \%h, qw/x y z/) {
print "x/y/z doesn't exist\n";
}
unless (safe_exists \%h, qw/a b c d/) {
print "a/b/c/d doesn't exist\n";
}
if (safe_exists \%h, qw/a b c/) {
print "a/b/c does exist\n";
}
print Dumper \%h;
If you want to turn off autovivification, you can do that lexically with the autovivification pragma:
{
no autovivification;
if( exists $hash{A}{B}{$key} ) { ... }
}
I wrote more about this at The Effective Perler as Turn off autovivification when you don’t want it.

How do you create a callback function (dispatch table) in Perl using hashes?

I want to call a main controller function that dispatches other function dynamically, something like this:
package Controller;
my %callback_funcs = ();
sub register_callback{
my ($class,$callback,$options) = _#;
#apppend to %callback_funcs hash ... ?
}
sub main{
%callback_funcs = ( add => 'add_func', rem => 'remove_func', edit => 'edit_func');
while(<STDIN>){
last if ($_ =~ /^\s*$/);
if($_ == 'add' || _$ == 'rem' || _$ == 'edit'){
$result = ${callback_funcs['add']['func']}(callback_funcs['add']['options']);
}
}
}
sub add_func{
...
}
One caveat is that the subs are defined in other Modules, so the callbacks would have to be able to reference them... plus
I'm having a hard time getting the hashes right!
So, it's possible to have a hash that contains anonymous subroutines that you can invoke from stdin.
my %callbacks = (
add => sub {
# do stuff
},
fuzzerbligh => sub {
# other stuff
},
);
And you can insert more hashvalues into the hash:
$callbacks{next} = sub {
...
};
And you would invoke one like this
$callbacks{next}->(#args);
Or
my $coderef = $callbacks{next};
$coderef->(#args);
You can get the hashkey from STDIN, or anywhere else.
You can also define them nonymously and then take a reference to them.
sub delete {
# regular sub definition
}
$callbacks{delete} = \&delete;
I wouldn't call these callbacks, however. Callbacks are subs that get called after another subroutine has returned.
Your code is also rife with syntax errors which may be obscuring the deeper issues here. It's also not clear to me what you're trying to do with the second level of arrays. When are you defining these subs, and who is using them when, and for what?
Perhaps this simplified example will help:
# Very important.
use strict;
use warnings;
# Define some functions.
sub multiply { $_[0] * $_[1] }
sub divide { $_[0] / $_[1] }
sub add { $_[0] + $_[1] }
sub subtract { $_[0] - $_[1] }
# Create a hash of references to those functions (dispatch table).
my %funcs = (
multiply => \&multiply,
divide => \&divide,
add => \&add,
subtract => \&subtract,
);
# Register some more functions.
sub register {
my ($key, $func) = #_;
$funcs{$key} = $func;
}
register('+', \&add); # As above.
register('sum', sub { # Or using an anonymous subroutine.
my $s = 0;
$s += $_ for #_;
return $s;
});
# Invoke them dynamically.
while (<>){
my ($op, #args) = split;
last unless $op and exists $funcs{$op}; # No need for equality tests.
print $funcs{$op}->(#args), "\n";
}
You've already got some good answers on how to build a dispatch table and call functions through it within a single file, but you also keep talking about wanting the functions to be defined in other modules. If that's the case, then wouldn't it be better to build the dispatch table dynamically based on what dispatchable functions each module says it has rather than having to worry about keeping it up to date manually? Of course it would!
Demonstrating this requires multiple files, of course, and I'm using Module::Pluggable from CPAN to find the modules which provide the function definitions.
dispatch_core.pl:
#!/usr/bin/env perl
use strict;
use warnings;
my %dispatch;
use lib '.'; # a demo is easier if I can put modules in the same directory
use Module::Pluggable require => 1, search_path => 'DTable';
for my $plugin (plugins) {
%dispatch = (%dispatch, $plugin->dispatchable);
}
for my $func (sort keys %dispatch) {
print "$func:\n";
$dispatch{$func}->(2, 5);
}
DTable/Add.pm:
package DTable::Add;
use strict;
use warnings;
sub dispatchable {
return (add => \&add);
}
sub add {
my ($num1, $num2) = #_;
print "$num1 + $num2 = ", $num1 + $num2, "\n";
}
1;
DTable/MultDiv.pm:
package DTable::MultDiv;
use strict;
use warnings;
sub dispatchable {
return (multiply => \&multiply, divide => \&divide);
}
sub multiply {
my ($num1, $num2) = #_;
print "$num1 * $num2 = ", $num1 * $num2, "\n";
}
sub divide {
my ($num1, $num2) = #_;
print "$num1 / $num2 = ", $num1 / $num2, "\n";
}
1;
Then, on the command line:
$ ./dispatch_core.pl
add:
2 + 5 = 7
divide:
2 / 5 = 0.4
multiply:
2 * 5 = 10
Adding new functions is now as simple as dropping a new file into the DTable directory with an appropriate dispatchable sub. No need to ever touch dispatch_core.pl just to add a new function again.
Edit: In response to the comment's question about whether this can be done without Module::Pluggable, here's a modified dispatch_core.pl which doesn't use any external modules other than the ones defining the dispatchable functions:
#!/usr/bin/env perl
use strict;
use warnings;
my %dispatch;
my #dtable = qw(
DTable::Add
DTable::MultDiv
);
use lib '.';
for my $plugin (#dtable) {
eval "use $plugin";
%dispatch = (%dispatch, $plugin->dispatchable);
}
for my $func (sort keys %dispatch) {
print "$func:\n";
$dispatch{$func}->(2, 5);
}

"Passing arguments to subroutine"-question?

Is routine2 ok too or shouldn't I do this? (I don't need a copy of #list in the subroutine)
#!/usr/bin/perl
use 5.012;
use warnings;
my #list = 0 .. 9;
sub routine1 {
my $list = shift;
for (#$list) { $_++ };
return $list
}
my $l = routine1( \#list );
say "#$l";
sub routine2 {
for (#list) { $_++ };
}
routine2();
say "#list";
If it works for you, then it's ok too. But the first sub can do the job for any array you pass to it which makes it more general.
P.S. Remember that #_ contains aliases for the parameters passed to the function. So you could also use this:
sub increment { $_++ for #_ }
increment(#list);
If you're worried about making the syntax look nice, try this:
sub routine3 (\#) {
for (#{$_[0]}) { $_++ }
}
my #list = (0 .. 9);
routine3(#list);
say "#list"; # prints 1 .. 10
This declares routine3 with a prototype - it takes an array argument by reference. So $_[0] is a reference to #list, no rather unsightly \ needed by the caller. (Some people discourage prototypes, so take this as you will. I like them.)
But unless this is a simplification for what your actual routine does, what I'd do is this:
my #list = 0 .. 9;
my #new_list = map { $_ + 1 } #list;
say "#new_list";
Unless routine is actually really complicated, and it's vital somehow that you modify the original array, I'd just use map. Especially with map, you can plug in a subroutine:
sub complex_operation { ... }
my #new_list = map { complex_operation($_) } #list;
Of course, you could prototype complex_operation with (_) and then just write map(complex_operation, #list); but I like the bracket-syntax personally.

In Perl, how can I call a method whose name I have in a string?

I'm trying to write some abstract code for searching through a list of similar objects for the first one whose attributes match specific values. In order to do this, I need to call a bunch of accessor methods and check all their values one by one. I'd like to use an abstraction like this:
sub verify_attribute {
my ($object, $attribute_method, $wanted_value) = #_;
if ( call_method($object, $attribute_method) ~~ $wanted_value ) {
return 1;
}
else {
return;
}
}
Then I can loop through a hash whose keys are accessor method names and whose values are the values I'm looking for for those attributes. For example, if that hash is called %wanted, I might use code like this to find the object I want:
my $found_object;
FINDOBJ: foreach my $obj (#list_of_objects) {
foreach my $accessor (keys %wanted) {
next FINDOBJ unless verify_attribute($obj, $accessor, $wanted{$accessor});
}
# All attrs verified
$found_object = $obj;
last FINDOBJ;
}
Of course, the only problem is that call_method does not exsit. Or does it? How can I call a method if I have a string containing its name? Or is there a better solution to this whole problem?
my $found_object;
FINDOBJ: foreach my $obj (#list_of_objects) {
foreach my $accessor (keys %wanted) {
next FINDOBJ unless $obj->$accessor() == $wanted{$accessor};
}
# All attrs verified
$found_object = $obj;
last;
}
Yes, you can call methods this way. No string (or any other) eval involved.
Also, substitute == with eq or =~ depending on the type of the data...
Or, for some extra credits, do it the functional way: (all() should really be part of List::Util!)
use List::Util 'first';
sub all (&#) {
my $code = shift;
$code->($_) || return 0 for #_;
return 1;
}
my $match = first {
my $obj = $_;
all { $obj->$_ == $attrs{$_} }
keys %wanted
} #list_of_objects;
Update: Admittedly, the first solution is the less obfuscated one, so it's preferable. But as somebody answering questions, you have add a little sugar to make it interesting for yourself, too! ;-)
Functional way is cool, but for dummies like me eval rules:
test.pl
#!/usr/bin/perl -l
use F;
my $f = F->new();
my $fun = 'lol'; # method of F
eval '$f->'.$fun.'() '; # call method of F, which name is in $fun var
F.pm
package F;
sub new
{
bless {};
}
sub lol
{
print "LoL";
}
1;
[root#ALT-24 root]# perl test.pl
LoL

Perl: if ( element in list )

I'm looking for presence of an element in a list.
In Python there is an in keyword and I would do something like:
if element in list:
doTask
Is there something equivalent in Perl without having to manually iterate through the entire list?
UPDATE:
The smartmatch family of features are now experimental
Smart match, added in v5.10.0 and significantly revised in v5.10.1, has been a regular point of complaint. Although there are a number of ways in which it is useful, it has also proven problematic and confusing for both users and implementors of Perl. There have been a number of proposals on how to best address the problem. It is clear that smartmatch is almost certainly either going to change or go away in the future. Relying on its current behavior is not recommended.
Warnings will now be issued when the parser sees ~~, given, or when.
If you can get away with requiring Perl v5.10, then you can use any of the following examples.
The smart match ~~ operator.
if( $element ~~ #list ){ ... }
if( $element ~~ [ 1, 2, 3 ] ){ ... }
You could also use the given/when construct. Which uses the smart match functionality internally.
given( $element ){
when( #list ){ ... }
}
You can also use a for loop as a "topicalizer" ( meaning it sets $_ ).
for( #elements ){
when( #list ){ ... }
}
One thing that will come out in Perl 5.12 is the ability to use the post-fix version of when. Which makes it even more like if and unless.
given( $element ){
... when #list;
}
If you have to be able to run on older versions of Perl, there still are several options.
You might think you can get away with using List::Util::first, but there are some edge conditions that make it problematic.
In this example it is fairly obvious that we want to successfully match against 0. Unfortunately this code will print failure every time.
use List::Util qw'first';
my $element = 0;
if( first { $element eq $_ } 0..9 ){
print "success\n";
} else {
print "failure\n";
}
You could check the return value of first for defined-ness, but that will fail if we actually want a match against undef to succeed.
You can safely use grep however.
if( grep { $element eq $_ } 0..9 ){ ... }
This is safe because grep gets called in a scalar context. Arrays return the number of elements when called in scalar context. So this will continue to work even if we try to match against undef.
You could use an enclosing for loop. Just make sure you call last, to exit out of the loop on a successful match. Otherwise you might end up running your code more than once.
for( #array ){
if( $element eq $_ ){
...
last;
}
}
You could put the for loop inside the condition of the if statement ...
if(
do{
my $match = 0;
for( #list ){
if( $element eq $_ ){
$match = 1;
last;
}
}
$match; # the return value of the do block
}
){
...
}
... but it might be more clear to put the for loop before the if statement.
my $match = 0;
for( #list ){
if( $_ eq $element ){
$match = 1;
last;
}
}
if( $match ){ ... }
If you're only matching against strings, you could also use a hash. This can speed up your program if #list is large and, you are going to match against %hash several times. Especially if #array doesn't change, because then you only have to load up %hash once.
my %hash = map { $_, 1 } #array;
if( $hash{ $element } ){ ... }
You could also make your own subroutine. This is one of the cases where it is useful to use prototypes.
sub in(&#){
local $_;
my $code = shift;
for( #_ ){ # sets $_
if( $code->() ){
return 1;
}
}
return 0;
}
if( in { $element eq $_ } #list ){ ... }
if( $element ~~ #list ){
do_task
}
~~ is the "smart match operator", and does more than just list membership detection.
grep is helpful here
if (grep { $_ eq $element } #list) {
....
}
If you plan to do this many times, you can trade-off space for lookup time:
#!/usr/bin/perl
use strict; use warnings;
my #array = qw( one ten twenty one );
my %lookup = map { $_ => undef } #array;
for my $element ( qw( one two three ) ) {
if ( exists $lookup{ $element }) {
print "$element\n";
}
}
assuming that the number of times the element appears in #array is not important and the contents of #array are simple scalars.
List::Util::first
$foo = first { ($_ && $_ eq "value" } #list; # first defined value in #list
Or for hand-rolling types:
my $is_in_list = 0;
foreach my $elem (#list) {
if ($elem && $elem eq $value_to_find) {
$is_in_list = 1;
last;
}
}
if ($is_in_list) {
...
A slightly different version MIGHT be somewhat faster on very long lists:
my $is_in_list = 0;
for (my $i = 0; i < scalar(#list); ++$i) {
if ($list[i] && $list[i] eq $value_to_find) {
$is_in_list = 1;
last;
}
}
if ($is_in_list) {
...
TIMTOWTDI
sub is (&#) {
my $test = shift;
$test->() and return 1 for #_;
0
}
sub in (#) {#_}
if( is {$_ eq "a"} in qw(d c b a) ) {
print "Welcome in perl!\n";
}
List::MoreUtils
On perl >= 5.10 the smart match operator is surely the easiest way, as many others have already said.
On older versions of perl, I would instead suggest List::MoreUtils::any.
List::MoreUtils is not a core module (some say it should be) but it's very popular and it's included in major perl distributions.
It has the following advantages:
it returns true/false (as Python's in does) and not the value of the element, as List::Util::first does (which makes it hard to test, as noted above);
unlike grep, it stops at the first element which passes the test (perl's smart match operator short circuits as well);
it works with any perl version (well, >= 5.00503 at least).
Here is an example which works with any searched (scalar) value, including undef:
use List::MoreUtils qw(any);
my $value = 'test'; # or any other scalar
my #array = (1, 2, undef, 'test', 5, 6);
no warnings 'uninitialized';
if ( any { $_ eq $value } #array ) {
print "$value present\n"
}
P.S.
(In production code it's better to narrow the scope of no warnings 'uninitialized').
Probably Perl6::Junction is the clearest way to do. No XS dependencies, no mess and no new perl version required.
use Perl6::Junction qw/ any /;
if (any(#grant) eq 'su') {
...
}
This blog post discusses the best answers to this question.
As a short summary, if you can install CPAN modules then the best solutions are:
if any(#ingredients) eq 'flour';
or
if #ingredients->contains('flour');
However, a more usual idiom is:
if #any { $_ eq 'flour' } #ingredients
which i find less clear.
But please don't use the first() function! It doesn't express the intent of your code at all. Don't use the "Smart match" operator: it is broken. And don't use grep() nor the solution with a hash: they iterate through the whole list. While any() will stop as soon as it finds your value.
Check out the blog post for more details.
PS: i'm answering for people who will have the same question in the future.
You can accomplish a similar enough syntax in Perl if you do some Autoload hacking.
Create a small package to handle the autoload:
package Autoloader;
use strict;
use warnings;
our $AUTOLOAD;
sub AUTOLOAD {
my $self = shift;
my ($method) = (split(/::/, $AUTOLOAD))[-1];
die "Object does not contain method '$method'" if not ref $self->{$method} eq 'CODE';
goto &{$self->{$method}};
}
1;
Then your other package or main script will contain a subroutine that returns the blessed object which gets handled by Autoload when its method attempts to be called.
sub element {
my $elem = shift;
my $sub = {
in => sub {
return if not $_[0];
# you could also implement this as any of the other suggested grep/first/any solutions already posted.
my %hash; #hash{#_} = ();
return (exists $hash{$elem}) ? 1 : ();
}
};
bless($sub, 'Autoloader');
}
This leaves you with usage looking like:
doTask if element('something')->in(#array);
If you reorganize the closure and its arguments, you can switch the syntax around the other way to make it look like this, which is a bit closer to the autobox style:
doTask if search(#array)->contains('something');
function to do that:
sub search {
my #arr = #_;
my $sub = {
contains => sub {
my $elem = shift or return;
my %hash; #hash{#arr} = ();
return (exists $hash{$elem}) ? 1 : ();
}
};
bless($sub, 'Autoloader');
}