Slicing a nested hash in Perl - perl

Say I have a hash that I can index as:
$hash{$document}{$word}
From what I read online (although I could not find this in perlreftut, perldsc or perllol), I can slice a hash using a list if I use the @ prefix on my hash to indicate that I want the hash to return a list. However, if I try to slice my hash using a list @list:
@%hash{$document}{@list}
I get several "Scalar values ... better written" errors.
How can I slice a nested hash in Perl?

The sigil for your hash slice must be @, like so:
@{$hash{$document}}{@list}
Assuming @list contains valid keys for the hash that $hash{$document} refers to, it will return the corresponding values, or undef for any key that does not exist.
This is based on the general rule of a hash slice:
%foo = ( a => 1, b => 2, c => 3 );
print @foo{'a','b'}; # prints 12
%bar = ( foo => \%foo ); # foo is now a reference in %bar
print @{ $bar{foo} }{'a','b'}; # prints 12, same data as before

First, when you expect to get a list from a hash slice, use the @ sigil. % is pointless here.
Second, you should understand that the value of $hash{$document} is not itself a hash or an array. It is a reference, to a hash or to an array.
With all this said, you might use something like this:
@{ $hash{$document} }{ @list };
... so you dereference the value of $hash{$document}, then take a hash slice of it. For example:
my %hash = (
'one' => {
'first' => 1,
'second' => 2,
},
'two' => {
'third' => 3,
'fourth' => 4,
}
);
my $key = 'one';
my @list = ('first', 'second');
print $_, "\n" for @{ $hash{$key} }{@list};
# ...gives 1\n2\n
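As a hedged sketch of the same idea: a slice of a nested hash yields undef for any key that is missing, and Perl 5.24 or later also accepts the postfix form shown below. The data and the key name 'missing' are made up for illustration.

```perl
use 5.024;      # postfix dereference slices assume Perl 5.24 or later
use strict;
use warnings;

# Illustrative data only (not taken from the question).
my %hash = (
    one => { first => 1, second => 2 },
);
my $key  = 'one';
my @list = ('first', 'missing', 'second');

# Classic form: dereference the inner hash, then slice it with @.
my @values = @{ $hash{$key} }{@list};

# Postfix form of the same slice.
my @same = $hash{$key}->@{@list};

# Missing keys come back as undef, so map them to a printable marker.
print join(',', map { defined $_ ? $_ : 'undef' } @values), "\n";   # 1,undef,2
```

Both forms return the values in the order of the keys in @list, not in the hash's internal order.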

Related

In Perl when you return a hash from a sub what is the "quick way" to access a specific value?

In Perl, it seems, there is no "simple" way to address a hash directly when it is returned from a sub or am I missing something?
I would like something like this.
print ( (hash()){One} ); # Not a thing!
You can do that with an array since the values returned constitute a list.
print ( (arr())[0] );
But for a hash that does not cut it. I have come up with some ways to sort of get this to work.
print ( ${{hash()}}{One} );
print ( 0+{hash()}->{One} );
But they seem kinda lame. I mean this is hard to read, and reference shenanigans seem out of place for such a simple thing.
Here is more code for context.
use strict;
use warnings;
### ARRAY
sub arr {
my @arr = (
1,
2,
3,
);
return @arr;
}
print ( (arr())[0] ); #Output is 1;
### HASH
sub hash {
my %hash = (
One => 1,
Two => 2,
Three => 3,
);
return %hash;
}
my %hash = hash();
#print ( (hash()){One} ); # Does not work
print ( $hash{One} ); # Output is 1
print ( (hash())[0] ); # Output will be One, Two or Three
print ( (hash())[1] ); # Output will be 1, 2, 3
print ( ${{hash()}}{One} ); # Output is 1
print ( 0+{hash()}->{One} ); # Output is 1;
# 0+ because Perl gives syntax error otherwise. Not sure why (Scalar context?) but one mystery at a time.
Subroutines do not return hashes, but a list of values. This list can be converted to a hash.
Given a sub that returns an even-sized list, I would use the following expression to access a hash entry:
+{hash()}->{One}
The leading + is sometimes necessary to disambiguate a hash reference literal {...} from a statement-level block {...} [1]. The unary plus is a no-op and just forces expression context, whereas 0+ would convert the value to a number.
[1] Though in this particular instance the ambiguity is due to the print(FILEHANDLE LIST) syntax, where the filehandle can be enclosed in curly braces.
If you have control over the function attempting to return a hash, consider returning a hash reference instead. Your example would then look like:
sub hashref {
# alternatively: "return { One => 1, ... };"
my %hash = (
One => 1,
Two => 2,
Three => 3,
);
return \%hash;
# ^-- create a reference to the hash
}
my %hash = hashref()->%*; # or "%{hashref()}". Note that this makes a copy
print( hashref()->{One} ); # can directly access keys
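To put the two approaches side by side, here is a minimal sketch. The sub names hash_list and hash_ref are hypothetical stand-ins for the subs discussed above.

```perl
use strict;
use warnings;

# Hypothetical sub returning a flat key/value list, as in the question.
sub hash_list { return ( One => 1, Two => 2, Three => 3 ) }

# Hypothetical sub returning a hash reference instead.
sub hash_ref { return { One => 1, Two => 2, Three => 3 } }

# Build an anonymous hash from the returned list, then index it.
# The unary plus marks the braces as an expression, not a block.
my $from_list = +{ hash_list() }->{One};

# A returned reference can be indexed directly, with no copy made.
my $from_ref = hash_ref()->{One};

print "$from_list $from_ref\n";   # prints "1 1"
```

The first form builds a whole temporary hash just to read one value, which is part of why returning a reference tends to be the cleaner choice.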
The following code snippet demonstrates the use of an array reference and a hash reference.
The advantage of this approach shows when you operate on big arrays or hashes: returning a reference avoids allocating memory for a copy and saves CPU cycles.
If an algorithm does need a copy, it is easily made by dereferencing the data structure (see the example).
use strict;
use warnings;
use feature 'say';
### ARRAY
sub arr {
my $ref_arr = [
1,
2,
3,
];
return $ref_arr;
}
my $aref = arr();
say "
Array content:
\$aref->[0] $aref->[0]
\$aref->[1] $aref->[1]
\$aref->[2] $aref->[2]
";
my @array_copy = @$aref;
say "
Array copy content:
\$array_copy[0] $array_copy[0]
\$array_copy[1] $array_copy[1]
\$array_copy[2] $array_copy[2]
";
### HASH
sub hash {
my $ref_hash = {
One => 1,
Two => 2,
Three => 3,
};
return $ref_hash;
}
my $href = hash();
say "
Hash content:
\$href->{Three}: $href->{Three}
\$href->{Two}: $href->{Two}
\$href->{One}: $href->{One}
";
my %hash_copy = %$href;
say "
Hash copy content:
\$hash_copy{One}: $hash_copy{One}
\$hash_copy{Two}: $hash_copy{Two}
\$hash_copy{Three}: $hash_copy{Three}
";
Output
Array content:
$aref->[0] 1
$aref->[1] 2
$aref->[2] 3
Array copy content:
$array_copy[0] 1
$array_copy[1] 2
$array_copy[2] 3
Hash content:
$href->{Three}: 3
$href->{Two}: 2
$href->{One}: 1
Hash copy content:
$hash_copy{One}: 1
$hash_copy{Two}: 2
$hash_copy{Three}: 3

Perl Hashes: $hash{key} vs $hash->{key}

Perl newb here, sorry for a silly question, but googling -> for a coding context is tough... Sometimes, I will access a hash like this: $hash{key} and sometimes that doesn't work, so I access it like this $hash->{key}. What's going on here? Why does it work sometimes one way and not the other?
The difference is that in the first case %hash is a hash, and in the second case, $hash is a reference to a hash (= hash reference), thus you need different notations. In the second case -> dereferences $hash.
EXAMPLES:
# %hash is a hash:
my %hash = ( key1 => 'val1', key2 => 'val2');
# Print 'val1' (hash value for key 'key1'):
print $hash{key1};
# $hash_ref is a reference to a hash:
my $hash_ref = \%hash;
# Print 'val1' (hash value for key 'key1', where the hash
# is pointed to by the reference $hash_ref):
print $hash_ref->{key1};
# A copy of %hash, made using dereferencing:
my %hash2 = %{$hash_ref};
# $hash_ref is an anonymous hash (no need for %hash).
# Note the { curly braces } :
my $hash_ref = { key1 => 'val1', key2 => 'val2' };
# Access the value of anonymous hash similarly to the above $hash_ref:
# Print 'val1':
print $hash_ref->{key1};
SEE ALSO:
perlreftut: https://perldoc.perl.org/perlreftut.html
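As a rough illustration of how the notations relate, the sketch below reads the same elements several ways. The nested key names inner and deep are invented for this example.

```perl
use strict;
use warnings;

my %hash     = ( key1 => 'val1', inner => { deep => 'val2' } );
my $hash_ref = \%hash;

# Three equivalent ways to read one element through the reference.
my $arrow  = $hash_ref->{key1};
my $braces = ${$hash_ref}{key1};
my $short  = $$hash_ref{key1};

# Between adjacent subscripts the arrow is optional.
my $deep_a = $hash_ref->{inner}->{deep};
my $deep_b = $hash_ref->{inner}{deep};

# Starting from the hash itself, no arrow is needed before the first subscript.
my $deep_c = $hash{inner}{deep};

print "$arrow $braces $short $deep_a $deep_b $deep_c\n";
# prints "val1 val1 val1 val2 val2 val2"
```

So $hash{key} works when you hold the hash itself, and $hash->{key} works when the scalar $hash holds a reference to one.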

sum hash of hash values using perl

I have a Perl script that parses an Excel file and does the following : It counts for each value in column A, the number of elements it has in column B, the script looks like this :
use strict;
use warnings;
use Spreadsheet::XLSX;
use Data::Dumper;
use List::Util qw( sum );
my $col1 = 0;
my %hash;
my $excel = Spreadsheet::XLSX->new('inout_chartdata_ronald.xlsx');
my $sheet = ${ $excel->{Worksheet} }[0];
$sheet->{MaxRow} ||= $sheet->{MinRow};
my $count = 0;
# Iterate through each row
foreach my $row ( $sheet->{MinRow}+1 .. $sheet->{MaxRow} ) {
# The cell in column 1
my $cell = $sheet->{Cells}[$row][$col1];
if ($cell) {
# The adjacent cell in column 2
my $adjacentCell = $sheet->{Cells}[$row][ $col1 + 1 ];
# Use a hash of hashes
$hash{ $cell->{Val} }{ $adjacentCell->{Val} }++;
}
}
print "\n", Dumper \%hash;
The output looks like this :
$VAR1 = {
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
This works great, my question is : How can I access the elements of this output $VAR1 in order to do : for value 13, klm + hij = 3 and get a final output like this :
$VAR1 = {
'13' => {
'somename' => 3,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
So basically what I want to do is loop through my final hash of hashes and access its specific elements based on a unique key and finally do their sum.
Any help would be appreciated.
Thanks
I used @do_sum to list the keys whose values you want to add up. The new key is hardcoded in the script. Note that the new key is not created if none of those keys exists in the subhash (the $found flag).
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my %hash = (
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
);
my @do_sum = qw(klm hij);
for my $num (keys %hash) {
my $found;
my $sum = 0;
for my $key (@do_sum) {
next unless exists $hash{$num}{$key};
$sum += $hash{$num}{$key};
delete $hash{$num}{$key};
$found = 1;
}
$hash{$num}{somename} = $sum if $found;
}
print Dumper \%hash;
It sounds like you need to learn about Perl References, and maybe Perl Objects which are just a nice way to deal with references.
As you know, Perl has three basic data-structures:
Scalars ($foo)
Arrays (@foo)
Hashes (%foo)
The problem is that these data structures can only contain scalar data. That is, each element in an array can hold a single value or each key in a hash can hold a single value.
In your case %hash is a Hash where each entry in the hash references another hash. For example:
Your %hash has an entry in it with a key of 13. This doesn't contain a plain scalar value, but a reference to another hash with three keys in it: klm, hij, and lkm. You can reference this via this syntax:
${ $hash{13} }{klm} = 1
${ $hash{13} }{hij} = 2
${ $hash{13} }{lkm} = 4
The curly braces may or may not be necessary. However, %{ $hash{13} } dereferences the hash reference stored in $hash{13}, so I can now reference the keys of that hash. You can imagine this getting more complex as you talk about hashes of hashes of arrays of hashes of arrays. Fortunately, Perl includes an easier syntax:
$hash{13}->{klm} = 1
$hash{13}->{hij} = 2
$hash{13}->{lkm} = 4
Read up about hashes and how to manipulate them. After you get comfortable with this, you can start working on learning about Object Oriented Perl which handles references in a safer manner.
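Once the reference syntax is familiar, the sum itself can be written compactly with a hash slice. This is only a sketch of one possible approach, reusing the sample data and the key name somename from the question; it relies on delete returning the removed values when applied to a slice.

```perl
use strict;
use warnings;
use List::Util qw(sum);

my %hash = (
    13 => { klm => 1, hij => 2, lkm => 4 },
    12 => { abc => 2, efg => 2 },
);
my @do_sum = qw(klm hij);

for my $num (keys %hash) {
    my $inner = $hash{$num};                  # reference to the inner hash

    # Only consider the keys that are actually present in this inner hash.
    my @present = grep { exists $inner->{$_} } @do_sum;
    next unless @present;

    # delete on a hash slice removes the keys and returns their values.
    $inner->{somename} = sum( delete @{$inner}{@present} );
}

print join(', ', map { "$_=$hash{13}{$_}" } sort keys %{ $hash{13} }), "\n";
# prints "lkm=4, somename=3"
```

The inner hash for 12 is left untouched because none of the keys in @do_sum exists there.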

What's the point of this kind of code?

{%{$self->param}}
It expands the hash and then creates another hash reference.
But isn't {%{$self->param}} the same as $self->param? Why does the code bother doing the trick?
It copies the hash. Consider the following snippet:
use Data::Dumper;
my $foo = { a => 1, ar => [1] };
my $bar = {%$foo};
$bar->{b} = 2;
push @{$bar->{ar}}, 4;
print Dumper $foo;
print Dumper $bar;
It prints
$VAR1 = {
'a' => 1,
'ar' => [
1,
4
]
};
$VAR1 = {
'a' => 1,
'b' => 2,
'ar' => [
1,
4
]
};
So you can see that the copy is shallow: Scalars are copied over, even if they are references. The referenced objects are the same (in this example the array referenced by ar).
Although both {%{$self->param}} and $self->param are references to a hash, they do not refer to a hash stored in the same location.
The first expression dereferences $self->param to a hash, and returns a reference to an anonymous hash. Within the outer braces, %{$self->param} is actually expanded and copied temporarily, and then a reference to this temporary copy is returned, not to the old hash.
This code actually creates a copy of the hash (a shallow copy of keys and values, not a deep copy) and returns a reference to that copy.
If some sub returns a reference to a hash and you change something in it, you are actually changing values in the original hash. To avoid this we sometimes need to copy the whole hash (or array) before making any changes.
Here's an example:
sub get_hashref {
my $hashref = shift;
return $hashref;
}
my %hash = (foo => 'bar');
my $ref = get_hashref(\%hash);
$ref->{foo} = 'baz'; # Changes 'foo' value in %hash
print "Original 'foo' now is: $hash{foo}\n"; # 'baz'
print "Ref's 'foo' now is: $ref->{foo}\n"; # 'baz'
# But!
$ref = {%{ get_hashref(\%hash) }};
$ref->{foo} = 42; # No changes in %hash
print "Original 'foo' now is: $hash{foo}\n"; # 'baz'
print "Ref's 'foo' now is: $ref->{foo}\n"; # '42'
To be understood better, {%{ $self->param }} may be expanded to:
my $ref = $self->param; # Ref to original hash
my %copy = %{$ref}; # Copies keys and values to new hash
my $ref_to_copy = {%copy}; # get ref to it
You may also omit the last step if you need a hash rather than a reference to it.
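When the nested data must be independent as well, a deep copy is needed. The sketch below uses dclone from the core Storable module as one way to get it; that is a swapped-in technique for comparison, not something the code in the question does.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my $foo = { a => 1, ar => [1] };

my $shallow = { %$foo };       # copies top-level values only
my $deep    = dclone($foo);    # recursively copies nested structures

push @{ $shallow->{ar} }, 4;   # also visible through $foo (shared array)
push @{ $deep->{ar} }, 5;      # $foo is unaffected

print "foo:  @{ $foo->{ar} }\n";    # foo:  1 4
print "deep: @{ $deep->{ar} }\n";   # deep: 1 5
```

The shallow copy is cheaper, so it remains a reasonable default when only top-level keys will be changed.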

How can I get the second-level keys in a Perl hash-of-hashes?

I need to get all of the values for a certain key in a hash. The hash looks like this:
$bean = {
Key1 => {
Key4 => 4,
Key5 => 9,
Key6 => 10,
},
Key2 => {
Key7 => 5,
Key8 => 9,
},
};
I just need the values to Key4, Key5 and Key6 for example. The rest is not the point of interest. How could I get the values?
Update:
So I don't have a %bean I just add the values to the $bean like this:
$bean->{'Key1'}->{'Key4'} = $value;
hope this helps.
foreach my $key (keys %{$bean{Key1}})
{
print $key . " ==> " . $bean{Key1}{$key} . "\n";
}
should print:
Key4 ==> 4
Key5 ==> 9
Key6 ==> 10
If %bean is a hash of hashes, $bean{Key1} is a hash reference. To operate on a hash reference as you would on a simple hash, you need to dereference it, like this:
%key1_hash = %{$bean{Key1}};
And to access elements within a hash of hashes, you use syntax like this:
$element = $bean{Key1}{Key4};
So, here's a loop that prints the keys and values for $bean{Key1}:
print $_, '=>', $bean{Key1}{$_}, "\n" for keys %{$bean{Key1}};
Or if you just want the values, and don't need the keys:
print $_, "\n" for values %{$bean{Key1}};
See the following Perl documentation for more details on working with complex data structures: perlreftut, perldsc, and perllol.
Yet another solution:
for my $sh ( values %Bean ) {
print "$_ => $sh->{$_}\n" for grep exists $sh->{$_}, qw(Key4 Key5 Key6);
}
See the Perl Data Structure Cookbook for lots of examples of, well, working with Perl data structures.
A good way to do this, assuming what you're posting is an example rather than a one-off case, would be recursively. So we have a function which searches a hash looking for the keys we specify, calling itself if it finds one of the values to be a reference to another hash.
sub recurse_hash {
# Arguments are a hash ref and a list of keys to find
my($hash,@findkeys) = @_;
# Loop over the keys in the hash
foreach (sort keys %{$hash}) {
# Get the value for the current key
my $value = $hash->{$_};
# See if the value is a hash reference
if (ref($value) eq 'HASH') {
# If it is call this function for that hash
recurse_hash($value,@findkeys);
}
# Don't use an else in case a hash ref value matches our search pattern
for my $key (@findkeys) {
if ($key eq $_) {
print "$_ = $value\n";
}
}
}
}
# Search for Key4, Key5 and Key6 in %Bean
recurse_hash(\%Bean,"Key4","Key5","Key6");
Gives this output:
Key4 = 4
Key5 = 9
Key6 = 10
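Since the update to the question says the data lives in a hash reference $bean rather than in a hash %bean, here is a hedged sketch of the same lookups written against the reference, using the sample data from the question.

```perl
use strict;
use warnings;

my $bean = {
    Key1 => { Key4 => 4, Key5 => 9,  Key6 => 10 },
    Key2 => { Key7 => 5, Key8 => 9 },
};

# All values under Key1, in sorted key order.
my @all = map { $bean->{Key1}{$_} } sort keys %{ $bean->{Key1} };

# Only the named keys, fetched with a hash slice on the inner hash.
my @wanted = @{ $bean->{Key1} }{qw(Key4 Key5 Key6)};

print "@all\n";      # 4 9 10
print "@wanted\n";   # 4 9 10
```

The only change from the %bean versions above is the extra arrow after $bean.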