Let's say I have this
#!/usr/bin/perl
%x = ('a' => 1, 'b' => 2, 'c' => 3);
and I would like to know if the value 2 is a hash value in %x.
How is that done?
Fundamentally, a hash is a data structure optimized for solving the converse question, knowing whether the key 2 is present. But it's hard to judge without knowing, so let's assume that won't change.
Possibilities presented here will depend on:
how often you need to do it
how dynamic the hash is
One-time op
grep $_==2, values %x (also spelled grep {$_==1} values %x) will return a list of as many 2s as are present in the hash, or, in scalar context, the number of matches. Evaluated as a boolean in a condition, it yields just what you want.
grep works on versions of Perl as old as I can remember.
use List::Util qw(first); first {$_==2} values %x returns only the first match, undef if none. That makes it faster, as it will short-circuit (stop examining elements) as soon as it succeeds. This isn't a problem for 2, but take care that the returned element doesn't necessarily evaluate to boolean true. Use defined in those cases.
List::Util is a part of the Perl core since 5.8.
use List::MoreUtils qw(any); any {$_==2} values %x returns exactly the information you requested as a boolean, and exhibits the short-circuiting behavior.
List::MoreUtils is available from CPAN.
2 ~~ [values %x] returns exactly the information you requested as a boolean, and exhibits the short-circuiting behavior.
Smart matching is available in Perl since 5.10.
Repeated op, static hash
Construct a hash that maps values to keys, and use that one as a natural hash to test key existence.
my %r = reverse %x;
if ( exists $r{2} ) { ... }
Repeated op, dynamic hash
Use a reverse lookup as above. You'll need to keep it up to date, which is left as an exercise to the reader/editor. (hint: value collisions are tricky)
Shorter answer using smart match (Perl version 5.10 or later):
print 2 ~~ [values %x];
my %reverse = reverse %x;
if( defined( $reverse{2} ) ) {
print "2 is a value in the hash!\n";
}
If you want to find out the keys for which the value is 2:
foreach my $key ( keys %x ) {
print "2 is the value for $key\n" if $x{$key} == 2;
}
Everyone's answer so far was not performance-driven. While the smart-match (~~) solution short circuits (e.g. stops searching when something is found), the grep ones do not.
Therefore, here's a solution which may have better performance for Perl before 5.10 that doesn't have smart match operator:
use List::MoreUtils qw(any);
if (any { $_ == 2 } values %x) {
print "Found!\n";
}
Please note that this is just a specific example of searching in a list (values %x) in this case and as such, if you care about performance, the standard performance analysis of searching in a list apply as discussed in detail in this answer
grep and values
my %x = ('a' => 1, 'b' => 2, 'c' => 3);
if (grep { $_ == 2 } values %x ) {
print "2 is in hash\n";
}
else {
print "2 is not in hash\n";
}
See also: perldoc -q hash
Where $count would be the result:
my $count = grep { $_ == 2 } values %x;
This will not only show you if it's a value in the hash, but how many times it occurs as a value. Alternatively you can do it like this as well:
my $count = grep {/2/} values %x;
Related
I have an array with the following values:
push #fruitArray, "apple|0";
push #fruitArray, "apple|1";
push #fruitArray, "pear|0";
push #fruitArray, "pear|0";
I want to find out if the string "apple" exists in this array (ignoring the "|0" "|1")
I am using:
$fruit = 'apple';
if( $fruit ~~ #fruitArray ){ print "I found apple"; }
Which isn't working.
Don't use smart matching. It never worked properly for a number of reasons and it is now marked as experimental
In this case you can use grep instead, together with an appropriate regex pattern
This program tests every element of #fruitArray to see if it starts with the letters in $fruit followed by a pipe character |. grep returns the number of elements that matched the pattern, which is a true value if at least one matched
my #fruitArray = qw/ apple|0 apple|1 pear|0 pear|0 /;
my $fruit = 'apple';
print "I found $fruit\n" if grep /^$fruit\|/, #fruitArray;
output
I found apple
I - like #Borodin suggests, too - would simply use grep():
$fruit = 'apple';
if (grep(/^\Q$fruit\E\|/, #fruitArray)) { print "I found apple"; }
which outputs:
I found apple
\Q...\E converts your string into a regex pattern.
Looking for the | prevents finding a fruit whose name starts with the name of the fruit for which you are looking.
Simple and effective... :-)
Update: to remove elements from array:
$fruit = 'apple';
#fruitsArrayWithoutApples = grep ! /^\Q$fruit\E|/, #fruitArray;
If your Perl is not ancient, you can use the first subroutine from the List::Util module (which became a core module at Perl 5.8) to do the check efficiently:
use List::Util qw{ first };
my $first_fruit = first { /\Q$fruit\E/ } #fruitArray;
if ( defined $first_fruit ) { print "I found $fruit\n"; }
Don't use grep, that will loop the entire array, even if it finds what you are looking for in the first index, so it is inefficient.
this will return true if it finds the substring 'apple', then return and not finish iterating through the rest of the array
#takes a reference to the array as the first parameter
sub find_apple{
#array_input = #{$_[0]};
foreach $fruit (#array_input){
if (index($fruit, 'apple') != -1){
return 1;
}
}
}
You can get close to the smartmatch sun without melting your wings by using match::simple:
use match::simple;
my #fruits = qw/apple|0 apple|1 pear|0 pear|0/;
$fruit = qr/apple/ ;
say "found $fruit" if $fruit |M| \#fruits ;
There's also a match() function if the infix [M] doesn't read well.
I like the way match::simple does almost everything I expected from ~~ without any surprising complexity. If you're fluent in perl it probably isn't something you'd see as necessary, but - especially with match() - code can be made pleasantly readable ... at the cost of imposing the use of references, etc.
This question already has answers here:
Why do you need $ when accessing array and hash elements in Perl?
(9 answers)
Closed 8 years ago.
Today I start my perl journey, and now I'm exploring the data type.
My code looks like:
#list=(1,2,3,4,5);
%dict=(1,2,3,4,5);
print "$list[0]\n"; # using [ ] to wrap index
print "$dict{1}\n"; # using { } to wrap key
print "#list[2]\n";
print "%dict{2}\n";
it seems $ + var_name works for both array and hash, but # + var_name can be used to call an array, meanwhile % + var_name can't be used to call a hash.
Why?
#list[2] works because it is a slice of a list.
In Perl 5, a sigil indicates--in a non-technical sense--the context of your expression. Except from some of the non-standard behavior that slices have in a scalar context, the basic thought is that the sigil represents what you want to get out of the expression.
If you want a scalar out of a hash, it's $hash{key}.
If you want a scalar out of an array, it's $array[0]. However, Perl allows you to get slices of the aggregates. And that allows you to retrieve more than one value in a compact expression. Slices take a list of indexes. So,
#list = #hash{ qw<key1 key2> };
gives you a list of items from the hash. And,
#list2 = #list[0..3];
gives you the first four items from the array. --> For your case, #list[2] still has a "list" of indexes, it's just that list is the special case of a "list of one".
As scalar and list contexts were rather well defined, and there was no "hash context", it stayed pretty stable at $ for scalar and # for "lists" and until recently, Perl did not support addressing any variable with %. So neither %hash{#keys} nor %hash{key} had meaning. Now, however, you can dump out pairs of indexes with values by putting the % sigil on the front.
my %hash = qw<a 1 b 2>;
my #list = %hash{ qw<a b> }; # yields ( 'a', 1, 'b', 2 )
my #l2 = %list[0..2]; # yields ( 0, 'a', 1, '1', 2, 'b' )
So, I guess, if you have an older version of Perl, you can't, but if you have 5.20, you can.
But for a completist's sake, slices have a non-intuitive way that they work in a scalar context. Because the standard behavior of putting a list into a scalar context is to count the list, if a slice worked with that behavior:
( $item = #hash{ #keys } ) == scalar #keys;
Which would make the expression:
$item = #hash{ #keys };
no more valuable than:
scalar #keys;
So, Perl seems to treat it like the expression:
$s = ( $hash{$keys[0]}, $hash{$keys[1]}, ... , $hash{$keys[$#keys]} );
And when a comma-delimited list is evaluated in a scalar context, it assigns the last expression. So it really ends up that
$item = #hash{ #keys };
is no more valuable than:
$item = $hash{ $keys[-1] };
But it makes writing something like this:
$item = $hash{ source1(), source2(), #array3, $banana, ( map { "$_" } source4()};
slightly easier than writing:
$item = $hash{ [source1(), source2(), #array3, $banana, ( map { "$_" } source4()]->[-1] }
But only slightly.
Arrays are interpolated within double quotes, so you see the actual contents of the array printed.
On the other hand, %dict{1} works, but is not interpolated within double quotes. So, something like my %partial_dict = %dict{1,3} is valid and does what you expect i.e. %partial_dict will now have the value (1,2,3,4). But "%dict{1,3}" (in quotes) will still be printed as %dict{1,3}.
Perl Cookbook has some tips on printing hashes.
In Perl if you have a list with an even number of elements you can straightforwardly convert it to a hash:
my #a = qw(each peach pear plum);
my %h = #a;
However, if there are duplicate keys then they will be silently accepted, with the last occurrence being the one used. I would like to make a hash checking that there are no duplicates:
my #a = qw(a x a y);
my %h = safe_hash_from_list(#a); # prints error: duplicate key 'a'
Clearly I could write that routine myself:
sub safe_hash_from_list {
die 'even sized list needed' if #_ % 2;
my %r;
while (#_) {
my $k = shift;
my $v = shift;
die "duplicate key '$k'" if exists $r{$k};
$r{$k} = $v;
}
return %r;
}
This, however, is quite a bit slower than the simple assignment. Moreover I do not want to use my own private routine if there is a CPAN module that already does the same job.
Is there a suitable routine on CPAN for safely turning lists into hashes? Ideally one that is a bit faster than the pure-Perl implementation above (though probably never quite as fast as the simple assignment).
If I may be allowed a related follow-up question, I'm also wondering about a tied hash class which allows each key to be assigned only once and dies on reassignment. That would be a more general case of the above problem. Again, I can write such a tied hash myself but I do not want to reinvent the wheel and I would prefer an optimized implementation if one already exists.
Quick way to check that no keys were duplicate would be count the keys and make sure they are equal to half the number of items in the list:
my #a = ...;
my %h = #a;
if (keys %h == (#a / 2)) {
print "Success!";
}
These are the ones I'm aware of:
The behaviour of a "my" statement modified with a statement modifier conditional or loop construct (e.g. "my $x if ...").
Modifying a variable twice in the same statement, like $i = $i++;
sort() in scalar context
truncate(), when LENGTH is greater than the length of the file
Using 32-bit integers, "1 << 32" is undefined. Shifting by a negative number of bits is also undefined.
Non-scalar assignment to "state" variables, e.g. state #a = (1..3).
One that is easy to trip over is prematurely breaking out of a loop while iterating through a hash with each.
#!/usr/bin/perl
use strict;
use warnings;
my %name_to_num = ( one => 1, two => 2, three => 3 );
find_name(2); # works the first time
find_name(2); # but fails this time
exit;
sub find_name {
my($target) = #_;
while( my($name, $num) = each %name_to_num ) {
if($num == $target) {
print "The number $target is called '$name'\n";
return;
}
}
print "Unable to find a name for $target\n";
}
Output:
The number 2 is called 'two'
Unable to find a name for 2
This is obviously a silly example, but the point still stands - when iterating through a hash with each you should either never last or return out of the loop; or you should reset the iterator (with keys %hash) before each search.
These are just variations on the theme of modifying a structure that is being iterated over:
map, grep and sort where the code reference modifies the list of items to sort.
Another issue with sort arises where the code reference is not idempotent (in the comp sci sense)--sort_func($a, $b) must always return the same value for any given $a and $b.
I'm wondering if there's an idiomatic one-liner or a standard-distribution package/function that I can use to compare two Perl hashes with only builtin, non-blessed types. The hashes are not identical (they don't have equivalent memory addresses).
I'd like to know the answer for both for shallow hashes and hashes with nested collections, but I understand that shallow hashes may have a much simpler solution.
TIA!
Something like cmp_deeply available in Test::Deep ?
[This was a response to an answer by someone who deleted their answer.]
Uh oh!
%a ~~ %b && [sort values %a] ~~ [sort values %b]
doesn't check whether the values belong to the same keys.
#! perl
use warnings;
use strict;
my %a = (eat => "banana", say => "whu whu"); # monkey
my %b = (eat => "whu whu", say => "banana"); # gorilla
print "Magilla Gorilla is always right\n"
if %a ~~ %b && [sort values %a] ~~ [sort values %b];
I don't know if there's an easy way or a built-in package, and I don't know what happens when you just do %hash1 == %hash2 (but that's probably not it), but it's not terribly hard to roll your own:
sub hash_comp (\%\%) {
my %hash1 = %{ shift };
my %hash2 = %{ shift };
foreach (keys %hash1) {
return 1 unless defined $hash2{$_} and $hash1{$_} == $hash2{$_};
delete $hash1{$_};
delete $hash2{$_};
}
return 1 if keys $hash2;
return 0;
}
Untested, but should return 0 if the hashes have all the same elements and all the same values. This function will have to be modified to account for multidimensional hashes.
If you want something from a standard distribution, you could use Data::Dumper; and just dump the two hashes into two scalar variables, then compare the strings for equality. That might work.
There's also a package on CPAN called FreezeThaw that looks like it does what you want.
Note that to use the smart match (not repeated here because it's already posted), you will have to use feature; and it's only available for Perl 5.10. But who's still using Perl 5.8.8, right?
Thanks for your question.
I used Test::More::eq_hash as result.
hashes can be casted into arrays, where every value follows its key (but you won't know the order of the keys). So:
( join("",sort(%hash1)) eq join("",sort(%hash2)) )
Oh, wait, that won't work because there are some edge cases, like:
%hash1 = { 'aaa' => 'aa' };
%hash2 = { 'aa' => 'aaa' };
So it's best to use a character in the join() that you KNOW will never appear in any key or value. If the values are BLOBs, that will be a big problem, but for anything else you could use the NULL char "\0".
( join("\0",sort(%hash1)) eq join("\0",sort(%hash2)) )
Looks kinda ugly, I know, but it will do for checking if two hashes are equal in a shallow way, which is what most people are looking for.
For shallow hashes:
(grep {exists %hash2{$_}} keys %hash1) > 0
You can use eq_deeply in Test::Deep::NoTest. It just returns a boolean that you can check, without the extra overhead of the testing capabilities of the main module.
convert hashes to xml files and compare, and yes you could use multilevel.
sub isEqualHash
{
my ($self,$hash1, $hash2) = #_;
my $file1 = "c:/neo-file1.txt";
my $file2 = "c:/neo-file2.txt";
my $xmlObj = XML::Simple->new();
my $dummy_file = $xmlObj->XMLout($hash1,OutputFile => $file1);
my $dummy_file = $xmlObj->XMLout($hash2,OutputFile => $file2);
open FILE, "<".$file1;
my $file_contents1 = do { local $/; <FILE> };
close(FILE);
open FILE, "<".$file2;
my $file_contents2 = do { local $/; <FILE> };
close(FILE);
if($file_contents1 eq $file_contents2)
{
return "Passed";
}
else
{
return "Failed";
}
}