How can I get only part of a hash in Perl?

Is there a way to get a sub-hash? Do I need to use a hash slice?
For example:
%hash = ( a => 1, b => 2, c => 3 );
I want only
%hash = ( a => 1, b => 2 );

Hash slices return the values associated with a list of keys. To get a hash slice you change the sigil to @ and provide a list of keys (in this case "a" and "b"):
my @items = @hash{"a", "b"};
Often you can use a quote word operator to produce the list:
my @items = @hash{qw/a b/};
You can also assign to a hash slice, so if you want a new hash that contains a subset of another hash you can say
my %new_hash;
@new_hash{qw/a b/} = @hash{qw/a b/};
Many people will use a map instead of hash slices:
my %new_hash = map { $_ => $hash{$_} } qw/a b/;
Starting with Perl 5.20.0, you can get the keys and the values in one step if you use the % sigil instead of the @ sigil:
my %new_hash = %hash{qw/a b/};
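One detail worth knowing about the slice approach: a slice yields undef for any requested key that is missing from the source hash, so the copy can gain keys the original never had. A minimal sketch, assuming you would rather skip missing keys (the variable names are only illustrative):

```perl
use strict;
use warnings;

my %hash   = ( a => 1, b => 2, c => 3 );
my @wanted = qw( a b z );    # 'z' is deliberately absent from %hash

# Plain slice assignment copies every requested key, so 'z' ends up
# in the new hash with an undef value.
my %with_undef;
@with_undef{@wanted} = @hash{@wanted};

# Filtering with exists first keeps only the keys that are really there.
my @present = grep { exists $hash{$_} } @wanted;
my %subset;
@subset{@present} = @hash{@present};

print join( ", ", map { "$_ => $subset{$_}" } sort keys %subset ), "\n";
```

Whether the undef entries matter depends on how the new hash is used afterwards; code that tests with exists will see them, code that tests with defined will not.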

You'd probably want to assemble a list of keys you want:
my @keys = qw(a b);
And then use a loop to make the hash:
my %hash_slice;
for (@keys) {
    $hash_slice{$_} = $hash{$_};
}
Or:
my %hash_slice = map { $_ => $hash{$_} } @keys;
(My preference is the second one, but whichever one you like is best.)

Yet another way:
my @keys = qw(a b);
my %hash = (a => 1, b => 2, c => 3);
my %hash_copy;
@hash_copy{@keys} = @hash{@keys};

Too much functional programming leads me to think of zip first.
With List::MoreUtils installed,
use List::MoreUtils qw(zip);
%hash = qw(a 1 b 2 c 3);
@keys = qw(a b);
@values = @hash{@keys};
%hash = zip @keys, @values;
Unfortunately, the prototype of List::MoreUtils's zip inhibits
zip @keys, @hash{@keys};
If you really want to avoid the intermediate variable, you could
zip @keys, @{[@hash{@keys}]};
Or just write your own zip without the problematic prototype. (This doesn't need List::MoreUtils at all.)
sub zip {
    my $max = -1;
    $max < $#$_ and $max = $#$_ for @_;
    map { my $ix = $_; map $_->[$ix], @_; } 0..$max;
}
%hash = zip \@keys, [@hash{@keys}];
If you're going to be mutating in-place,
%hash = qw(a 1 b 2 c 3);
%keep = map +($_ => 1), qw(a b);
$keep{$a} or delete $hash{$a} while ($a, $b) = each %hash;
avoids the extra copying that the map and zip solutions incur. (Yes, mutating the hash while you're iterating over it is safe... as long as the mutation is only deleting the most recently iterated pair.)
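The same in-place pruning can also be written as a single delete on a hash slice, which sidesteps the question of modifying a hash during each. A sketch, assuming the same sample data and keep-list as in the lines above:

```perl
use strict;
use warnings;

my %hash = ( a => 1, b => 2, c => 3 );
my %keep = map { $_ => 1 } qw( a b );

# Collect the unwanted keys first, then delete them all at once.
# keys() hands back a list of the keys, so nothing is being iterated
# while the hash is modified.
delete @hash{ grep { !$keep{$_} } keys %hash };

print join( ", ", map { "$_ => $hash{$_}" } sort keys %hash ), "\n";
```

This does build a temporary list of keys, so it trades a little memory for simpler control flow.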

FWIW, I use Moose::Autobox here:
my $hash = { a => 1, b => 2, c => 3, d => 4 };
$hash->hslice([qw/a b/]) # { a => 1, b => 2 };
In real life, I use this to extract "username" and "password" from a form submission, and pass that to Catalyst's $c->authenticate (which expects, in my case, a hashref containing the username and password, but nothing else).

New in perl 5.20 is hash slices returning keys as well as values by using % like on the last line here:
my %population = ('Norway',5000000,'Sweden',9600000,'Denmark',5500000);
my @slice_values = @population{'Norway','Sweden'}; # all perls can do this
my %slice_hash = %population{'Norway','Sweden'}; # perl >= 5.20 can do this!

A hash is an unordered container, but the term slice only really makes sense in terms of an ordered container. Maybe look into using an array. Otherwise, you may just have to remove all of the elements that you don't want to produce your 'sub-hash'.

Related

Perl Hash References - Is it possible to put reference to nested hash into 1 variable?

I have a partially nested hash like the following:
$href = {one=>1, word_counts=>{"the"=>34, "train"=>4} };
and I would like to get the value of $href->{'word_counts'}{'train'}.
Is it possible to put the {'word_counts'}{'train'} into a variable, so I can access it by simply calling $href->$variable?

No, but you can use Data::Diver to get a value given a list of keys:
my @keys = ('word_counts', 'train');
my $value = Data::Diver::Dive($href, \(@keys));

There are various ways to do this. I don't think you need to involve $href once you have a shortcut to the value that you want.
You can take a reference to the value, but then you have to dereference it:
my $value_ref = \ $href->{'word_counts'}{'train'};
say $$value_ref;
There's an experimental refaliasing feature where both sides are a reference. Now you don't need to dereference:
use v5.22;
use feature qw(refaliasing);
no warnings qw(experimental::refaliasing);
\ my $value_ref = \ $href->{'word_counts'}{'train'};
say $value_ref; # 4
$value_ref = 17;
say $href->{'word_counts'}{'train'}; # 17
It's not hard to walk the hash yourself. The trick is to get one level of the hash, store it in a variable, then use that variable to get the next level. Keep going until you are where you want to be:
my $href = {
one => 1,
word_counts => {
"the" => {
"dog" => 45,
"cat" => 24,
},
"train" => {
"car" => 7,
"wreck" => 37,
}
}
};
my @keys = qw( word_counts train car );
my $temp = $href;
foreach my $key ( @keys ) {
    die "Not a hash ref at <$key>" unless ref $temp eq ref {};
    die "<$key> not in the hash" unless exists $temp->{$key};
    $temp = $temp->{$key};
}
print "Value is <$temp>"; # 7
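If the lookup is needed in more than one place, the loop can be wrapped in a small subroutine. The sketch below is only an illustration: dig is a hypothetical helper name, and it returns undef for a missing path instead of dying, which is an assumption about what the caller wants.

```perl
use strict;
use warnings;

# Hypothetical helper: follow a list of keys through nested hash refs.
# Returns undef as soon as a level is missing or is not a hash ref.
sub dig {
    my ( $node, @path ) = @_;
    for my $key (@path) {
        return unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return $node;
}

my $data = { one => 1, word_counts => { the => 34, train => 4 } };

my $hit  = dig( $data, qw( word_counts train ) );
my $miss = dig( $data, qw( word_counts plane ) );

print "hit:  $hit\n";
print "miss: ", ( defined $miss ? $miss : 'undef' ), "\n";
```

Returning undef makes it impossible to tell a missing path from a stored undef value; if that distinction matters, dying as in the loop above is the safer choice.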

In addition to the more general, excellent answers from ysth and brian d foy, consider also a very simple (perhaps too simple) solution:
my @keys = qw( word_counts train );
print $href->{ $keys[0] }{ $keys[1] }; # 4
Note that this solution is repetitive, not elegant (the order of keys is hardcoded), and does not try to walk the hash. But depending on the context and the specific task of the OP, this may be all that is needed.

sum hash of hash values using perl

I have a Perl script that parses an Excel file and counts, for each value in column A, the number of elements it has in column B. The script looks like this:
use strict;
use warnings;
use Spreadsheet::XLSX;
use Data::Dumper;
use List::Util qw( sum );
my $col1 = 0;
my %hash;
my $excel = Spreadsheet::XLSX->new('inout_chartdata_ronald.xlsx');
my $sheet = ${ $excel->{Worksheet} }[0];
$sheet->{MaxRow} ||= $sheet->{MinRow};
my $count = 0;
# Iterate through each row
foreach my $row ( $sheet->{MinRow}+1 .. $sheet->{MaxRow} ) {
    # The cell in column 1
    my $cell = $sheet->{Cells}[$row][$col1];
    if ($cell) {
        # The adjacent cell in column 2
        my $adjacentCell = $sheet->{Cells}[$row][ $col1 + 1 ];
        # Use a hash of hashes
        $hash{ $cell->{Val} }{ $adjacentCell->{Val} }++;
    }
}
print "\n", Dumper \%hash;
The output looks like this :
$VAR1 = {
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
This works great. My question is: how can I access the elements of this output $VAR1 so that, for value 13, klm + hij = 3, and get a final output like this:
$VAR1 = {
'13' => {
'somename' => 3,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
So basically what I want to do is loop through my final hash of hashes and access its specific elements based on a unique key and finally do their sum.
Any help would be appreciated.
Thanks

I used @do_sum to indicate which keys you want to sum. The new key is hardcoded in the script. Note that the new key is not created if none of those keys exists in the subhash (that is what the $found flag is for).
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my %hash = (
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
);
my @do_sum = qw(klm hij);
for my $num (keys %hash) {
    my $found;
    my $sum = 0;
    for my $key (@do_sum) {
        next unless exists $hash{$num}{$key};
        $sum += $hash{$num}{$key};
        delete $hash{$num}{$key};
        $found = 1;
    }
    $hash{$num}{somename} = $sum if $found;
}
print Dumper \%hash;

It sounds like you need to learn about Perl References, and maybe Perl Objects which are just a nice way to deal with references.
As you know, Perl has three basic data-structures:
Scalars ($foo)
Arrays (@foo)
Hashes (%foo)
The problem is that these data structures can only contain scalar data. That is, each element in an array can hold a single value or each key in a hash can hold a single value.
In your case %hash is a Hash where each entry in the hash references another hash. For example:
Your %hash has an entry in it with a key of 13. This doesn't contain a plain value, but a reference to another hash with three keys in it: klm, hij, and lkm. You can reference this via this syntax:
${ $hash{13} }{klm} = 1
${ $hash{13} }{hij} = 2
${ $hash{13} }{lkm} = 4
The curly braces may or may not be necessary. However, %{ $hash{13} } refers to the hash whose reference is stored in $hash{13}, so I can now access the keys of that hash. You can imagine this getting more complex as you talk about hashes of hashes of arrays of hashes of arrays. Fortunately, Perl includes an easier syntax:
$hash{13}->{klm} = 1
$hash{13}->{hij} = 2
$hash{13}->{lkm} = 4
Read up about hashes and how to manipulate them. After you get comfortable with this, you can start working on learning about Object Oriented Perl which handles references in a safer manner.

Multiply all values in hash together in Perl?

I would like to multiply all the values of a hash together without having to call the specific elements. E.g. I DON'T want to do $hash{'kv1'} * $hash{'kv2'} * $hash{'kv3'} because I won't know the number of elements in the hash to do this.
I'm guessing there is a simple and efficient function to do this, maybe using each or something, but I can't find any examples to follow. Any ideas?

Start at 1 (since 1 multiplied by anything is what you multiply it by), then loop over the values of the hash and multiply the current total by each value from the hash.
#!/usr/bin/env perl
use v5.12;
my %hash = ( a => 1, b => 2, c => 3, d => 4 );
my $value = 1;
$value = $value * $_ foreach values %hash;
say $value;

This is what List::Util's reduce is for:
use List::Util 'reduce';
my %hash = (foo => 3, bar => 7, baz => 2);
say reduce { our $a * our $b } values %hash;
Output:
42

I would iterate over keys %hash like this:
my $prod = 1;
$prod *= $hash{$_} for keys %hash;

Will this not do?
my $m = 1;
for (values %my_hash) {
$m *= $_;
}
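Newer releases of List::Util also export a product function, which removes the need for an explicit loop or reduce block. A short sketch, assuming the installed List::Util is recent enough to provide it (the sample data is made up):

```perl
use strict;
use warnings;
use List::Util qw( product );

my %hash = ( a => 1, b => 2, c => 3, d => 4 );

# product multiplies together every element of the list it is given.
my $total = product values %hash;

print "$total\n";
```

On an older List::Util without product, the reduce version shown earlier does the same job.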

Perl hash metainformation

Is it possible to store information about a hash, in it?
And by that I mean, without adding the information to the hash in the ordinary way, which would affect keys, values etc.
Thing is I am reading a twod_array into a hash but would like to store the order within the original array without affecting how one traverses through the hash etc.
so for instance:
my @the_keys = keys %the_hash;
should not return the information about the order of the hash.
Is there a way to store meta data within a hash?

You can store arbitrary metadata with the tie mechanism. Minimal example with a package storage that does not affect the standard hash interface:
package MetadataHash;
use Tie::Hash;
use base 'Tie::StdHash';
use Scalar::Util qw(refaddr);
our %INSERT_ORDER;
sub STORE {
    my ($h, $k, $v) = @_;
    $h->{$k} = $v;
    push @{ $INSERT_ORDER{refaddr $h} }, $k;
}
1;
package main;
tie my %h, 'MetadataHash';
%h = ( I => 1, n => 2, d => 3, e => 4 );
$h{x} = 5;
# %MetadataHash::INSERT_ORDER is (9042936 => ['I', 'n', 'd', 'e', 'x'])
print keys %h;
# 'enIxd'

Well, one can always use Tie::Hash::Indexed, I suppose:
use Tie::Hash::Indexed;
tie my %hash, 'Tie::Hash::Indexed';
%hash = ( I => 1, n => 2, d => 3, e => 4 );
$hash{x} = 5;
print keys %hash, "\n"; # prints 'Index'
print values %hash, "\n"; # prints '12345'

map for hashes in Perl

Is there a hash equivalent for map?
my %new_hash = hash_map { new_key($a) => new_val($b) } %hash;
I know that I could loop through the keys.

List::Pairwise claims to implement exactly that syntax -- see mapp, grepp. I haven't used it though.
Also, you can do it as
%new_hash = map { new_key($_) => new_value($hash{$_}) } keys %hash;
which I admit looks clumsier if %hash is really a $deeply->{buried}->{hash}. I prefer using $temp = ...; map {...} keys %$temp in such cases.

I really can’t see what you are trying to do here. What does “a hash equivalent for map” even mean? You can use map on a hash just fine. If you want the keys, just use keys; for example:
@msglist = map { "value of $_ is $hash{$_}" } keys %hash
although usually
say "value of $_ is $hash{$_}" for keys %hash;
is just fine.
If you want both, then use the whole hash.
For assignment, what’s wrong with %new_hash = %old_hash?
Do you have deep-copy issues? Then use Storable::dclone.
Do you want both key and value available in the closure at the same time? Then make a bunch of pairs with the first map:
@pairlist = map { [ $_ => $hash{$_} ] } keys %hash
I need to see an example of what you would want to do with this, but so far I can see zero cause for using some big old module instead of basic Perl.

You can use map like this:
my $i = 0;
my %new_hash = map { ($i ^= 1) ? new_key($_) : new_val($_) } %hash;

You can use mapn from my module List::Gen to do this:
use List::Gen 'mapn';
my %new_hash = mapn {new_key($_[0]) => new_value($_[1])} 2 => %old_hash;
mapn is like map, except it takes an additional argument, the number of elements to walk the list by. Inside the block, the @_ array is set to the current slice.

$ perl -d /dev/null
DB<2> %p = ( a=>'b', c=> 'd');
DB<5> p Dumper \%p
$VAR1 = {
'c' => 'd',
'a' => 'b'
};
To e.g. reverse the key and the value:
DB<6> %q = map { ($p{$_}, $_ ) } keys %p
DB<7> p Dumper \%q
$VAR1 = {
'b' => 'a',
'd' => 'c'
};

As of perl 5.20, core utility List::Util::pairmap does exactly that:
use List::Util qw(pairmap);
my %new_hash = pairmap { new_key($a) => new_val($b) } %hash;
It's not necessarily optimal (as it involves unrolling the hash to a list and back) but I believe this is the shortest way in vanilla perl.
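A concrete, runnable version may help. In the sketch below, upper-casing the key and doubling the value are stand-ins for the new_key and new_val functions from the question, and the sample data is made up:

```perl
use strict;
use warnings;
use List::Util qw( pairmap );

my %hash = ( apple => 1, pear => 2 );

# Inside the block, $a holds the key and $b the value of each pair.
# The block returns a two-element list that becomes the new pair.
my %new_hash = pairmap { ( uc($a) => $b * 2 ) } %hash;

print join( ", ", map { "$_ => $new_hash{$_}" } sort keys %new_hash ), "\n";
```

If two old keys map to the same new key, the later pair silently overwrites the earlier one, and which pair comes later depends on the hash's unspecified ordering.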