How to use references to hashes in Perl

I'm trying to learn how references work, but I can't figure out how to pass two hashes by reference at the same time. I want to write a subroutine that takes two simple hash references as arguments and checks whether the two hashes are equal. My code is:
#!/usr/bin/perl
use strict;
use warnings;
use feature qw(say);
my $hash1_r = {ITALY => "ROME",
FRANCE => "PARIS"};
my $hash2_r = {ITALY => "MILAN",
FRANCE => "PARIS"};
my $hash3_r = {ITALY => "ROME"};
my $hash4_r = {SPAIN => "ROME",
FRANCE => "PARIS"};
sub compareHashes(%$hash1, %$hash2){
my $hash1; my $hash2;
for (my $i =0; $i < keys %$hash1; $i++){
say "The first hash:";
say "keys %$hash1\t, values %$hash1";
}
for (my $i =0; $i < keys %$hash2; $i++){
say "The second hash:";
say "keys %$hash2\t, values %$hash2";
}
for (keys %$hash1) {
if (keys %$hash1 ne keys %$hash2){
say "Two above hashes are not equal";
}elsif (my $key1 (keys %$hash1) ne my $key2 (keys %$hash2)){
say "Two above hashes are not equal";
}elsif (%$hash1->{$_} ne %$hash2->{$_}){
say "Two above hashes are not equal";
}else {
say "Two above hashes are equal";
}
}
}
compareHashes (%$hash1_r, %$hash1_r);
compareHashes (%$hash1_r, %$hash2_r);
compareHashes (%$hash1_r, %$hash3_r);
compareHashes (%$hash1_r, %$hash4_r);
However, I got those errors:
Prototype after '%' for main::compareHashes : %$hash1,%$hash2 at compareHashes2.pl line 16.
Illegal character in prototype for main::compareHashes : %$hash1,%$hash2 at compareHashes2.pl line 16.
syntax error at compareHashes2.pl line 30, near "$key1 ("
syntax error at compareHashes2.pl line 32, near "}elsif"
Global symbol "$hash2" requires explicit package name at compareHashes2.pl line 32.
Any solutions? Any help will be greatly appreciated!

I would recommend reading the following excellent perl documentation for the general idea:
perldoc perlreftut
A slight simplification of your code, getting the references to work:
#!/usr/bin/perl
use strict;
use warnings;
use feature qw(say);
# { ... } creates a hash reference, you can pass this to a function directly
my $hash1_r = { ITALY => "ROME", FRANCE => "PARIS" };
my $hash2_r = { ITALY => "MILAN", FRANCE => "PARIS" };
my $hash3_r = { ITALY => "ROME" };
my $hash4_r = { SPAIN => "ROME", FRANCE => "PARIS" };
sub compareHashes {
my ($hash1, $hash2) = @_; # @_ is the default array
# You can just use hash references directly by prepending with a '%' symbol
# when you need the actual hash, such as when using 'keys', 'values', 'each', etc.
# You can access the elements by using an arrow: $hashref->{'key_name'}
say "-"x40;
say "The first hash:";
while ( my ($key, $value) = each %$hash1 ) {
say "$key => $value";
}
say "The second hash:";
while ( my ($key, $value) = each %$hash2 ) {
say "$key => $value";
}
my (@keys1) = keys %$hash1;
my ($nkey1) = scalar @keys1;
my (@keys2) = keys %$hash2;
my ($nkey2) = scalar @keys2;
if ($nkey1 != $nkey2) {
say "=> unequal number of keys: $nkey1 vs $nkey2";
return 0; # False, the hashes are different, we don't need to test any further
}
# Create a new hash using all of the keys from hash1 and hash2
# The effect is to eliminate duplicates, as repeated keys, i.e.
# common to both hash1 and hash2 will just produce one key in %uniq
# You can use the 'uniq' function from List::MoreUtils to achieve
# the same thing.
# In perl, using a hash to eliminate duplicates, or test for set
# membership is a very common idiom.
# The 'map' function iterates over a list and performs the
# operation inside the curly braces {...}, returning all
# of the results.
# For example: map { 2 * $_ } ( 1,2,3 ) # ( 2,4,6 )
# If you assign a list to a hash, it takes pairs of values
# and turns them into key/value pairs
# The '=>' is equivalent to a ',' but makes the intent easier
# to understand
my %uniq = map { $_ => 1 } ( @keys1, @keys2 );
my $nuniqkey = scalar keys %uniq;
if ($nkey1 != $nuniqkey) {
say "=> unequal set of keys";
return 0; # False, the hashes are different, we don't need to test any further
}
# Now test the values
# If we neglected to check for uniqueness in the above block,
# we would run into the situation where hash1 might have a key
# that hash2 doesn't have (and vice-versa). This would trigger a
# 'use of uninitialized value' warning in the comparison operator
for my $key (@keys1) {
my ($value1) = $hash1->{$key};
my ($value2) = $hash2->{$key};
if ($value1 ne $value2) {
say "=> unequal values for key '$key' : $value1 vs $value2";
return 0; # False, the hashes are different, we don't need to test any further
}
}
say "=> equal, yay!";
return 1; # True, the hashes are equal after all!
}
compareHashes($hash1_r, $hash1_r);
compareHashes($hash1_r, $hash2_r);
compareHashes($hash1_r, $hash3_r);
compareHashes($hash1_r, $hash4_r);
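If you only need a yes/no answer without the printing, the same checks can be condensed. This is a minimal sketch, not part of the code above: the name hashes_equal is made up for illustration, and it assumes flat hashes whose values are all defined strings.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if both hashes have the same keys and
# the same string values, 0 otherwise. Assumes flat hashes with
# defined values.
sub hashes_equal {
    my ($h1, $h2) = @_;

    # Different number of keys: cannot be equal.
    return 0 unless keys %$h1 == keys %$h2;

    for my $key (keys %$h1) {
        # Same count, so a key missing from $h2 means the key sets differ.
        return 0 unless exists $h2->{$key};
        return 0 unless $h1->{$key} eq $h2->{$key};
    }
    return 1;
}

my $hash1_r = { ITALY => "ROME",  FRANCE => "PARIS" };
my $hash2_r = { ITALY => "MILAN", FRANCE => "PARIS" };
my $hash3_r = { ITALY => "ROME" };
my $hash4_r = { SPAIN => "ROME",  FRANCE => "PARIS" };

my @results = map { hashes_equal($hash1_r, $_) }
    ($hash1_r, $hash2_r, $hash3_r, $hash4_r);

print "@results\n";
```

Checking exists before comparing avoids the 'use of uninitialized value' warning without building the extra %uniq hash.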

You have a good answer that you have already accepted. But for people finding this question in the future, I think it's worth explaining some of the errors you have made.
You start by defining some anonymous hashes. That's fine.
my $hash1_r = {
ITALY => "ROME",
FRANCE => "PARIS"
};
my $hash2_r = {
ITALY => "MILAN",
FRANCE => "PARIS"
};
my $hash3_r = {
ITALY => "ROME"
};
my $hash4_r = {
SPAIN => "ROME",
FRANCE => "PARIS"
};
I'm now going to skip to where you call your subroutine (I'll get back to the subroutine itself soon).
compareHashes (%$hash1_r, %$hash1_r);
compareHashes (%$hash1_r, %$hash2_r);
compareHashes (%$hash1_r, %$hash3_r);
compareHashes (%$hash1_r, %$hash4_r);
One of the most important uses for references is to enable you to pass multiple arrays and hashes into a subroutine without them being flattened into a single list. As you have hash references already, it would make sense to pass those references into the subroutine. But you don't do that. You dereference your hashes, which means you send the actual hashes into the subroutine. That means that, for example, your second call passes in a list like ('ITALY', 'ROME', 'FRANCE', 'PARIS', 'ITALY', 'MILAN', 'FRANCE', 'PARIS') (the order of the pairs is not guaranteed). And there is no way for the code inside your subroutine to separate that list into two hashes.
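To see the flattening concretely, here is a minimal sketch. The helper count_args is hypothetical and exists only to report how many arguments arrive in @_.

```perl
use strict;
use warnings;

my $hash1_r = { ITALY => "ROME",  FRANCE => "PARIS" };
my $hash2_r = { ITALY => "MILAN", FRANCE => "PARIS" };

# Hypothetical helper: reports how many arguments arrived in @_.
sub count_args { return scalar @_ }

# Dereferencing at the call site flattens both hashes into one list
# of eight scalars (four keys and four values, in no guaranteed order).
my $flattened = count_args(%$hash1_r, %$hash2_r);

# Passing the references keeps the hashes separate: two scalars arrive.
my $by_ref = count_args($hash1_r, $hash2_r);

print "flattened: $flattened, by reference: $by_ref\n";
```

With the flattened call the subroutine sees eight unrelated scalars; with the references it sees exactly two arguments, one per hash.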
Now, let's look at the subroutine itself. You start by defining a prototype for the subroutine. In most cases, prototypes are unnecessary. In many cases, they change the code behaviour in hard-to-understand ways. No Perl expert would recommend using prototypes in this code. And, as your error message says, you get the prototype wrong.
sub compareHashes(%$hash1, %$hash2){
I'm not sure what you were trying to do with this prototype. Perhaps it's not a prototype at all - perhaps it's a function signature (but if it was, you would need to turn the feature on).
On the next line, you declare two variables. Variables that you never give values to.
my $hash1; my $hash2;
There are then two very confused for loops.
for (my $i =0; $i < keys %$hash1; $i++){
say "The first hash:";
say "keys %$hash1\t, values %$hash1";
}
$hash1 has no value, so keys %$hash1 is zero (the hash has no keys) and the loop body never runs. But we're not missing much, as the loop body would just print the same uninitialised values each time.
And you could simplify your for loop by making it a foreach-style loop.
foreach my $i (0 .. keys %$hash1 - 1) { ... }
Or (given that you don't use $i at all):
foreach (1 .. keys %$hash1) { ... }
After another, equally ineffective, for loop for $hash2, you try to compare your two hashes.
for (keys %$hash1) {
if (keys %$hash1 ne keys %$hash2){
say "Two above hashes are not equal";
}elsif (my $key1 (keys %$hash1) ne my $key2 (keys %$hash2)){
say "Two above hashes are not equal";
}elsif (%$hash1->{$_} ne %$hash2->{$_}){
say "Two above hashes are not equal";
}else {
say "Two above hashes are equal";
}
}
I have no idea why this is all in a for loop, but your comparisons do nothing to actually compare the values in the hashes. All you are comparing is the number of keys in the hashes (which are always going to be equal here, as your hashes are always empty).
All in all, this is the work of someone who is extremely confused about how hashes, subroutines and references work in Perl. I would urge you to stop what you are doing and take the time to work through a good reference book like Learning Perl followed by Intermediate Perl before you continue down your current route and just confuse yourself more.

Related

Accessing a multi-dimensional hash using strings

I have a large multi-dimensional hash which is an import of a JSON structure.
my %bighash;
There is an element in %bighash called:
$bighash{'core'}{'dates'}{'year'} = 2019.
I have a separate string variable called core.dates.year which I would like to use to extract 2019 from %bighash.
I've written this code:
my @keys = split(/\./, 'core.dates.year');
my %hash = ();
my $hash_ref = \%hash;
for my $key ( @keys ){
$hash_ref->{$key} = {};
$hash_ref = $hash_ref->{$key};
}
which when I execute:
say Dumper \%hash;
outputs:
$VAR1 = {
'core' => {
'dates' => {
'year' => {}
}
}
};
All good so far. But what I now want to do is say:
print $bighash{\%hash};
Which I want to return 2019. But nothing is being returned or I'm seeing an error about "Use of uninitialized value within %bighash in concatenation (.) or string at script.pl line 1371, line 17 (#1)...
Can someone point me into what is going on?
My project involves embedding strings in an external file which is then replaced with actual values from %bighash so it's just string interpolation.
Thanks!
Can someone point me into what is going on [when I use $bighash{\%hash}]?
Hash keys are strings, and the stringification of \%hash is something like HASH(0x655178). The only element in %bighash has core —not HASH(0x655178)— for key, so the hash lookup returns undef.
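A minimal sketch of that stringification, using small stand-in hashes rather than the real %bighash (the address printed will differ on every run):

```perl
use strict;
use warnings;

# Stand-ins for the structures in the question.
my %hash    = ( core => {} );
my %bighash = ( core => { dates => { year => 2019 } } );

# A reference used as a hash key is stringified first, giving
# something like "HASH(0x655178)".
my $as_key = "" . \%hash;
print "key used for the lookup: $as_key\n";

# %bighash has no key with that name, so the lookup finds nothing.
my $found = exists $bighash{ \%hash } ? "yes" : "no";
print "found in %bighash: $found\n";
```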
Useful tools:
sub dive_val :lvalue { my $p = \shift; $p = \( $$p->{$_} ) for @_; $$p } # For setting
sub dive { my $r = shift; $r = $r->{$_} for @_; $r } # For getting
dive_val(\%hash, split /\./, 'core.dates.year') = 2019;
say dive(\%hash, split /\./, 'core.dates.year');
Hash::Fold would seem to be helpful here. You can "flatten" your hash and then access everything with a single key.
use Hash::Fold 'flatten';
my $flathash = flatten(\%bighash, delimiter => '.');
print $flathash->{"core.dates.year"};
There are no multi-dimensional hashes in Perl. Hashes are key/value pairs. Your understanding of Perl data structures is incomplete.
Re-imagine your data structure as follows
my %bighash = (
core => {
dates => {
year => 2019,
},
},
);
There is a difference between the round parentheses () and the curly braces {}. The % sigil on the variable name indicates that it's a hash, that is a set of unordered key/value pairs. The round () are a list. Inside that list are two scalar values, i.e. a key and a value. The value is a reference to another, anonymous, hash. That's why it has curly {}.
Each of those levels is a separate, distinct data structure.
This rewrite of your code is similar to what ikegami wrote in his answer, but less efficient and more verbose.
my @keys = split( /\./, 'core.dates.year' );
my $value = \%bighash;
for my $key (@keys) {
$value = $value->{$key};
}
print $value;
It drills down step by step into the structure and eventually gives you the final value.
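If a path might not exist, the plain loop can autovivify intermediate hashes or die on a non-reference. The following is a minimal sketch of a more defensive version; the name dig is hypothetical, and returning undef for a missing path is an assumption about what you want.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the value at the given path, or undef
# as soon as a level is missing or is not a hash reference.
sub dig {
    my ($node, @path) = @_;
    for my $key (@path) {
        return unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return $node;
}

my %bighash = ( core => { dates => { year => 2019 } } );

my $year    = dig(\%bighash, split /\./, 'core.dates.year');
my $missing = dig(\%bighash, split /\./, 'core.dates.month');

print "year: $year\n";
print "month: ", (defined $missing ? $missing : '(not present)'), "\n";
```

Because exists is checked before descending, looking up a missing path does not create empty hashes inside %bighash.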

Find values of nested hash matching a specific key

I've created a hash of hashes in perl, where this is an example of what the hash ends up looking like:
my %grades;
$grades{"Foo Bar"}{Mathematics} = 97;
$grades{"Foo Bar"}{Literature} = 67;
$grades{"Peti Bar"}{Literature} = 88;
$grades{"Peti Bar"}{Mathematics} = 82;
$grades{"Peti Bar"}{Art} = 99;
and to print the entire hash, I'm using:
foreach my $name (sort keys %grades) {
foreach my $subject (keys %{ $grades{$name} }) {
print "$name, $subject: $grades{$name}{$subject}\n";
}
}
I need to print just the inner hash referring to "Peti Bar" and find the highest value, so theoretically, I should just parse through Peti Bar, Literature; Peti Bar, Mathematics; and Peti Bar, Art and end up returning Art, since it has the highest value.
Is there a way to do this or do I need to parse through the entire 2d hash?
You don't need to loop over the first level if you know the key that you're interested in. Just leave out the first loop and access it directly. To get the highest value, you have to look at each subject once.
Keep track of the highest value and the key that goes with it, and then print.
my $max_value = 0;
my $max_key;
foreach my $subject (keys %{ $grades{'Peti Bar'} }) {
if ($grades{'Peti Bar'}{$subject} > $max_value){
$max_value = $grades{'Peti Bar'}{$subject};
$max_key = $subject;
}
}
print $max_key;
This will output
Art
An alternative implementation with sort would look like this:
print +(
sort { $grades{'Peti Bar'}{$b} <=> $grades{'Peti Bar'}{$a} }
keys %{ $grades{'Peti Bar'} }
)[0];
The + in +( ... ) tells Perl that the parentheses () are not meant for the function call to print, but to construct a list. The sort orders the keys by their values, descending, because it has $b first. It returns a list, and we take the first element (index 0).
Note that this is more expensive than the first implementation, and not necessarily more concise. Unless you're building a one-liner or your ; is broken I wouldn't recommend the second solution.
This is trivial using the List::UtilsBy module.
The code is made clearer by extracting a reference to the inner hash that we're interested in. max_by is then called to return the key of that hash that has the maximum value.
use strict;
use warnings 'all';
use feature 'say';
use List::UtilsBy 'max_by';
my %grades = (
'Foo Bar' => { Literature => 67, Mathematics => 97 },
'Peti Bar' => { Literature => 88, Mathematics => 82, Art => 99 },
);
my $pb_grades = $grades{'Peti Bar'};
say max_by { $pb_grades->{$_} } keys %$pb_grades;
output
Art
As a Perl beginner I would use the List::Util core module:
use 5.014;
use List::Util 'reduce';
my $k='Peti Bar';
say reduce { $grades{$k}{$a} > $grades{$k}{$b} ? $a : $b } keys %{$grades{$k}};

How to find the depth of a nested hash of hashes?

I am trying to write a Perl subroutine to process any given hash (passed by reference) but I would like to make it a generic one so that I can use it anywhere.
Assuming the hash has simple key/value pairs and is not an elaborated record (containing arrays of arrays or arrays of hashes), is there any way to find how deep the hash of hashes runs?
For example
my %stat = (
1 => { one => "One is one.", two => "two is two"},
2 => { one => "second val wone", two => "Seconv v"}
);
The hash above has first level key 1 which has two keys one and two. So it is a hash with two levels.
My question is whether is there any way to test and find this information that the hash has two levels?
Update from comment
About the "problem that has led me to believe that I need to know the depth that a Perl hash is nested". Here is my situation.
The program is creating a data structure of three levels, and also publishing it in a text file for other scripts which are processing this published data and doing something else.
The program is also reading and hashing other data structures of five levels, which has data related to the hash in the first point.
The program is also processing a continuously growing log file and collecting data.
Assuming a uniform hash structure:
use strict;
use warnings;
sub depth {
my ($h) = @_;
my $d = 0;
while () {
return $d if ref($h) ne 'HASH';
($h) = values %$h;
$d++;
}
}
my %stat = (
1 => { one => "One is one.", two => "two is two"},
2 => { one => "second val wone", two => "Seconv v"}
);
print depth(\%stat), "\n";
Output:
2
Traversing the hash values and tracking the level can give the answer:
use strict;
use warnings;
use v5.16;
{ my $max;
sub hlevel {
my ($h, $n) = @_;
$max = 0 if !$n;
$max = $n if $max < $n;
__SUB__->($_, $n +1) for grep {ref eq "HASH"} values %$h;
return $max;
}}
my %h;
$h{a}{b}{c}{d}{e} =1;
$h{a}{b}{c}{d}{e1}{f} =1;
$h{a}{b}{c}{d}{e1}{f1}{g} =1;
print hlevel(\%h, 0);
output
6
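For a structure that is not uniform, the depth can also be computed without shared state by letting each call return its own result. This is a minimal sketch, not taken from either answer: max_depth is a hypothetical name, and it counts levels the way the depth sub above does (a non-hash is 0, a hash is 1 plus its deepest value).

```perl
use strict;
use warnings;
use List::Util 'max';

# Hypothetical helper: a non-hash counts as depth 0; a hash counts as
# 1 plus the deepest of its values (an empty hash has depth 1).
sub max_depth {
    my ($node) = @_;
    return 0 unless ref($node) eq 'HASH';
    return 1 + max(0, map { max_depth($_) } values %$node);
}

my %stat = (
    1 => { one => "One is one.", two => "two is two" },
    2 => { one => "second val wone", two => "Seconv v" },
);

# A non-uniform structure: one branch is deeper than the other.
my %uneven;
$uneven{a}{b}{c} = 1;
$uneven{a}{x}    = 1;

print max_depth(\%stat), "\n";
print max_depth(\%uneven), "\n";
```

The leading 0 passed to max keeps the call well defined when a hash is empty.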

How to parse through many hashes using foreach?

foreach my %hash (%myhash1,%myhash2,%myhash3)
{
while (($keys,$$value) = each %hash)
{
#use key and value...
}
}
Why doesn't this work?
It reports a syntax error on the foreach line.
Please tell me why it is wrong.
This is wrong because you seem to think that this allows you to access each hash as a separate construct, whereas what you are in fact doing is, besides a syntax error, accessing the hashes as a mixed-together new list. For example:
my %hash1 = qw(foo 1 bar 1);
my %hash2 = qw(abc 1 def 1);
for (%hash1, %hash2) # this list is now qw(foo 1 bar 1 abc 1 def 1)
When you place a hash (or array) in a list context statement, they are expanded into their elements, and their integrity is not preserved. Some built-in functions do allow this behaviour, but normal Perl code does not.
You also cannot assign a hash as the for iterator variable, that can only ever be a scalar value. What you can do is this:
for my $hash (\%myhash1, \%myhash2, \%myhash3) {
while (my ($key, $value) = each %$hash) {
...
Which is to say, you create a list of hash references and iterate over them. Note that you cannot tell the difference between the hashes with this approach.
Note also that I use my $hash because this variable must be a scalar.
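If the loop body does need to know which hash it is working on, one option is to pair each reference with a label. This is a minimal sketch under that assumption; the labels and the sample data are invented for illustration.

```perl
use strict;
use warnings;

# Sample data, invented for illustration.
my %myhash1 = ( foo => 1 );
my %myhash2 = ( abc => 2 );
my %myhash3 = ( xyz => 3 );

# Pair each reference with a label so the loop body knows its source.
my @labelled = (
    [ myhash1 => \%myhash1 ],
    [ myhash2 => \%myhash2 ],
    [ myhash3 => \%myhash3 ],
);

my @seen;
for my $pair (@labelled) {
    my ($label, $hash) = @$pair;
    for my $key (sort keys %$hash) {
        push @seen, "$label: $key => $hash->{$key}";
    }
}
print "$_\n" for @seen;
```

An array of pairs is used rather than a hash of references so that the hashes are visited in a fixed order.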
The syntax should be like:
my $hash1 = {'a'=>1};
my $hash2 = {'b'=>1};
my @arr2 = ($hash1, $hash2);
foreach my $hash (@arr2)
{
while (my ($key, $value) = each %$hash)
{
print $key, $value;
}
}
You need to take references to the hashes and then dereference them inside the loop.

Is there a simple way to validate a hash of hash element comparison?

Is there a simple way to validate a hash of hash element comparison?
I need to check that a Perl hash of hash element, $Table{$key1}{$key2}{K1}{Value}, is equal to all the corresponding elements in the hash.
The third key runs from K1 to Kn and I want to compare those elements; the other keys stay the same.
if ($Table{$key1}{$key2}{K1}{Value} eq $Table{$key1}{$key2}{K2}{Value}
eq $Table{$key1}{$key2}{K3}{Value} )
{
#do whatever
}
Something like this may work:
use List::MoreUtils 'all';
my @keys = map "K$_", 1..10;
print "All keys equal"
if all { $Table{$key1}{$key2}{$keys[1]}{Value} eq $Table{$key1}{$key2}{$_}{Value} } @keys;
I would use Data::Dumper to help with a task like this, especially for a more general problem (where the third key is more arbitrary than 'K1'...'Kn'). Use Data::Dumper to stringify the data structures and then compare the strings.
use Data::Dumper;
# this line is needed to assure that hashes with the same keys output
# those keys in the same order.
$Data::Dumper::Sortkeys = 1;
my $string1 = Data::Dumper->Dump([ $Table{$key1}{$key2}{k1} ]);
for (my $n = 2; exists $Table{$key1}{$key2}{"k$n"}; $n++) {
my $string_n = Data::Dumper->Dump([ $Table{$key1}{$key2}{"k$n"} ]);
if ($string1 ne $string_n) {
warn "key 'k$n' is different from 'k1'";
}
}
This can be used for the more general case where $Table{$key1}{$key2}{k7}{value} itself contains a complex data structure. When a difference is detected, though, it doesn't give you much help figuring out where that difference is.
A fairly complex structure. You should be looking into using object oriented programming techniques. That would greatly simplify your programming and the handling of these complex structures.
First of all, let's simplify a bit. When you say:
$Table{$key1}{$key2}{k1}{value}
Do you really mean:
my $value = $Table{$key1}->{$key2}->{k1};
or
my $actual_value = $Table{$key1}->{$key2}->{k1}->{Value};
I'm going to assume the first one. If I'm wrong, let me know, and I'll update my answer.
Let's simplify:
my %hash = %{$Table{$key1}->{$key2}};
Now, we're just dealing with a hash. There are two techniques you can use:
Sort the keys of this hash by value, then if two keys have the same value, they will be next to each other in the sorted list, making it easy to detect duplicates. The advantage is that all the duplicate keys would be printed together. The disadvantage is that this is a sort which takes time and resources.
Reverse the hash, so it's keyed by value and the value of that key is the key. If a key already exists, we know the other key has a duplicate value. This is faster than the first technique because no sorting is involved. However, duplicates will be detected, but not printed together.
Here's the first technique:
my %hash = %{$Table{$key1}->{$key2}};
my $previous_value;
my $previous_key;
foreach my $key (sort {$hash{$a} cmp $hash{$b}} keys %hash) {
if (defined $previous_key and $previous_value eq $hash{$key}) {
print "\$hash{$key} is a duplicate of \$hash{$previous_key}\n";
}
$previous_value = $hash{$key};
$previous_key = $key;
}
And the second:
my %hash = %{$Table{$key1}->{$key2}};
my %reverse_hash;
foreach my $key (keys %hash) {
my $value = $hash{$key};
if (exists $reverse_hash{$value}) {
print "\$hash{$reverse_hash{$value}} has the same value as \$hash{$key}\n";
}
else {
$reverse_hash{$value} = $key;
}
}
An alternative approach is to write a utility function that checks whether some function returns the same value for every key:
sub AllSame (&\%) {
my ($c, $h) = @_;
my @k = keys %$h;
my $ref;
$ref = $c->() for $h->{shift @k};
$ref ne $c->() and return for @$h{@k};
return 1
}
print "OK\n" if AllSame {$_->{Value}} %{$Table{$key1}{$key2}};
But once you start thinking this way, you may find the following approach much more generic (this is the recommended way):
sub AllSame (@) {
my $ref = shift;
$ref ne $_ and return for @_;
return 1
}
print "OK\n" if AllSame map {$_->{Value}} values %{$Table{$key1}{$key2}};
If the mapping operation is expensive, you can write a lazy counterpart:
sub AllSameMap (&@) {
my $c = shift;
my $ref;
$ref = $c->() for shift;
$ref ne $c->() and return for @_;
return 1
}
print "OK\n" if AllSameMap {$_->{Value}} values %{$Table{$key1}{$key2}};
If you want only some subset of keys you can use hash slice syntax e.g.:
print "OK\n" if AllSame map {$_->{Value}} @{$Table{$key1}{$key2}}{map "K$_", 1..10};