Perl Hash of Hashes and Counting

Perl Hash of Hashes and Counting - perl

I am trying to process a perl data structure that I have outputted using Data::Dumper
$VAR1 = 'GAHD';
$VAR2 = [
{ 'COUNTRY' => 'US',
'NAME' => 'K. Long',
'DATE_OF_BIRTH' => '7/27/1957',
'POSITION' => 'SENIOR OFFICER',
'AGE' => 57,
'GRADE' => 'P5'
},
{ 'COUNTRY' => 'US',
'NAME' => 'J. Buber',
'DATE_OF_BIRTH' => '12/11/1957',
'POSITION' => 'CHIEF',
'GRADE' => 'D1'
},
{ 'COUNTRY' => 'US',
'NAME' => 'M. Amsi',
'DATE_OF_BIRTH' => '1/1/1957',
'POSITION' => 'SENIOR ANIMAL HEALTH OFFICER',
'AGE' => 57,
'GRADE' => 'P5'
},
{ 'COUNTRY' => 'US',
'NAME' => 'E. Xenu',
'DATE_OF_BIRTH' => '8/31/1964',
'POSITION' => 'SENIOR OFFICER',
'AGE' => 50,
'GRADE' => 'P5'
},
];
$VAR3 = 'GAGD';
$VAR4 = [
{ 'COUNTRY' => 'US',
'NAME' => 'P. Cheru',
'DATE_OF_BIRTH' => '6/18/1966',
'POSITION' => 'ANIMAL PRODUCTION OFFICER',
'AGE' => 48,
'GRADE' => 'P4'
},
{ 'COUNTRY' => 'US',
'NAME' => 'B. Burns',
'DATE_OF_BIRTH' => '2/4/1962',
'POSITION' => 'ANIMAL PRODUCTION OFFICER',
'AGE' => 52,
'GRADE' => 'P4'
},
{ 'COUNTRY' => 'US',
'NAME' => 'R. Mung',
'DATE_OF_BIRTH' => '12/13/1968',
'POSITION' => 'ANIMAL PRODUCTION OFFICER',
'AGE' => 45,
'GRADE' => 'P4'
},
{ 'COUNTRY' => 'GERMANY',
'NAME' => 'B. Scherf',
'DATE_OF_BIRTH' => '8/31/1964',
'POSITION' => 'ANIMAL PRODUCTION OFFICER',
'AGE' => 50,
'GRADE' => 'P4'
},
{ 'COUNTRY' => 'GERMANY',
'NAME' => 'I. Hoffmann',
'DATE_OF_BIRTH' => '2/21/1960',
'POSITION' => 'CHIEF',
'AGE' => 54,
'GRADE' => 'P5'
},
];
The following is outputted:
1 ADG JUNIOR OFFICER K. King
1 DG SENIOR DIRECTOR K. King
3 P5 SENIOR OFFICER R. Forest
R.Forest
K. King
1 P3 JUNIOR OFFICER K. King
3 P1 FORESTRY OFFICER P. Smith
T. Turner
K. Turner
1 P1 GENERAL OFFICER K. King
I would like to count the number of GRADES and POSITIONS by Division. Here is the code that I have put together thus far:
#Push data read from a flat file and while loop
push #{ $grades{ $_->{GRADE} }{ $_->{POSITION} } }, $_->{NAME} for #$AG;
for my $key (
sort { substr( $a, 0, 1 ) cmp substr( $b, 0, 1 ) || substr( $b, 0, 2 ) cmp substr( $a, 0, 2 ) }
keys %grades
)
{
for my $pos ( sort { $a cmp $b } keys %{ $grades{$key} } ) {
my $names = $grades{$key}->{$pos};
my $count = scalar #$names;
print $count, ' ', $key, ' ', $pos, ' ', $names->[0], "\n";
print ' ', $names->[$_], "\n" for 1 .. $#$names;
}
}
The code will stop outputting results if duplicate POSITIONS and GRADES data (i.e. P1, Senior Officer) appear in another Division.
I do not know how to access the Hash of Hash by Division (i.e. GAGD, GAGHD,etc.) so that the same GRADEs and POSITIONs will be outputted per division.
Here is what I really need:
**GAGD**
1 ADG JUNIOR OFFICER K. King
1 DG SENIOR DIRECTOR K. King
3 P5 SENIOR OFFICER R. Forest
R.Forest
K. King
1 P3 JUNIOR OFFICER K. King
3 P1 FORESTRY OFFICER P. Smith
T. Turner
K. Turner
1 P1 GENERAL OFFICER K. King
**GAGHD**
1 P3 JUNIOR OFFICER P. Green
3 P1 FORESTRY OFFICER R. Brown
F. Boo
K. Church
1 P1 GENERAL OFFICER D. Peefer
etc.
etc.

It seems you want to hash the information by Division, then count and store names by grade + position. The following seems to work for me:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my %grades = (
GAHD => [ {
NAME => 'K. Long',
POSITION => 'SENIOR OFFICER',
GRADE => 'P5'
},
{
NAME => 'J. Buber',
POSITION => 'CHIEF',
GRADE => 'D1'
},
{
NAME => 'M. Amsi',
POSITION => 'SENIOR ANIMAL HEALTH OFFICER',
GRADE => 'P5'
},
{
NAME => 'E. Xenu',
POSITION => 'SENIOR OFFICER',
GRADE => 'P5'
},
],
GAGD => [
{
NAME => 'P. Cheru',
POSITION => 'ANIMAL PRODUCTION OFFICER',
GRADE => 'P4'
},
{
NAME => 'B. Burns',
POSITION => 'ANIMAL PRODUCTION OFFICER',
GRADE => 'P4'
},
{
NAME => 'R. Mung',
POSITION => 'ANIMAL PRODUCTION OFFICER',
GRADE => 'P4'
},
{
NAME => 'B. Scherf',
POSITION => 'ANIMAL PRODUCTION OFFICER',
GRADE => 'P4'
},
{
NAME => 'I. Hoffmann',
POSITION => 'CHIEF',
GRADE => 'P5'
},
]);
for my $division (keys %grades) {
say "**$division**";
my %group;
for my $person (#{ $grades{$division} }) {
my $position = join ' ', #{ $person }{qw{GRADE POSITION}};
push #{ $group{$position} }, $person->{NAME};
}
for my $position (keys %group) {
say join ' ', scalar #{ $group{$position} },
$position,
$group{$position}[0];
my #remaining_names = #{ $group{$position} };
shift #remaining_names;
say "\t$_" for #remaining_names;
}
say q();
}
Update
If you store more information than a name for a person in an array ref (push push #{ $group{$position} }, [ ... ];), you can then retrieve it by dereferencing each reference, for example in map:
say join ' ', scalar #{ $group{$position} },
$position,
join "\n\t", map "#$_", #{ $group{$position} };

You're almost there with the code that you've got. Assuming that the hash you've printed out is called %grades, I would do the following:
foreach my $g (sort keys %$grades) {
print "**$g**\n";
# put the info to be printed in a temporary hash
my %temp;
foreach (#{$grades->{$g}}) {
push #{$temp{ $_->{GRADE}." ".$_->{POSITION} }}, $_->{NAME};
}
foreach (sort keys %temp) {
# print a count of the number of names, then the grade/position info
print scalar #{$temp{$_}} . " $_ "
# #{$temp{$_}} holds the names, so just sort them and print them out.
. join("\n\t\t\t", sort #{$temp{$_}}) . "\n";
}
}

Related

How can I redact some values in the dump of a Perl hash?

supposed I have these hashes:
my $hash1 = {
firstname => 'john',
lastname => 'doe',
};
my $hash2_nested = {
name => {
firstname => 'jean',
lastname => 'doe',
}
};
Note: hashes can be nested x times deeply.
I want to use Data::Dumper where I can print the copy of those hashes, but with hidden lastname.
means, it should print out:
$VAR1 = {
'firstname' => 'john'
'lastname' => '***',
};
and this:
$VAR1 = {
'name' => {
'firstname' => 'john'
'lastname' => '***',
}
};
is there any Perl library where it search for a hash key recursively and replace its value dynamically? something like:
replace_hash_value($hash1, 'lastname', '***');

There are several things to consider here. Mostly, you don't want to reinvent what is already out there. Also remember that any Personal Identifying Information (PII) in your program has a way to leak out despite your best efforts, but that's not the programming question at hand.
First, you don't want to operate on the original data, and since you have nested structures, you can't simply make a copy because that only copies the top level and still shares references at the lower level:
my %copy = %original; # shallow copy!
But, the core module Storable can make a deep copy that is completely disconnected, new copy that shares no references:
use Storable qw(dclone);
my $deep_copy = dclone $hash1;
Now you can play with $deep_copy without changing $hash1. You want to find all the last_name keys and remove their value. Grinnz suggested the Data::Walk module (an example of the Visitor design pattern). It's like File::Find for data structures. It's going to handle all the business of finding the hashes for you. In your wanted subroutine, skip everything that's not interesting, then change the nodes that are interesting. You don't worry about how you find or are given the nodes:
use Data::Walk;
walk \&wanted, $deep_copy;
sub wanted {
return unless ref $_ eq ref {};
return unless exists $_->{last_name};
$_->{last_name} = '****';
}
Now, put that all together. Here's a mix of nested things, with some odd cases thrown in, including an object that uses a hash:
use v5.10;
use Hash::AsObject;
my $data = {
first_name => 'Amelia',
last_name => 'Camel',
friends => [
q(last_name => 'REDACTED BY POLICY'),
{
first_name => 'Camelia',
last_name => 'Butterfly',
},
{
first_name => 'Larry',
last_name => 'Llama',
associate => {
first_name => 'Vicky',
last_name => 'Vicuna',
}
},
],
name => {
first_name => 'Andy',
last_name => 'Alpaca',
},
object => bless {
first_name => 'Peter',
last_name => 'Python',
}, 'FooBar',
};
use Storable qw(dclone);
my $deep_copy = dclone( $data );
use Data::Walk;
walk \&wanted, $deep_copy;
use Data::Dumper;
say Dumper( $deep_copy );
sub wanted {
return unless ref $_ eq ref {};
return unless exists $_->{last_name};
$_->{last_name} = '****';
}
And, here's the output from Data::Dumper (which you can prettify with some of its settings):
$VAR1 = {
'object' => bless( {
'first_name' => 'Peter',
'last_name' => 'Python'
}, 'Hash::AsObject' ),
'first_name' => 'Amelia',
'last_name' => '****',
'friends' => [
'last_name => \'REDACTED BY POLICY\'',
{
'last_name' => '****',
'first_name' => 'Camelia'
},
{
'last_name' => '****',
'first_name' => 'Larry',
'associate' => {
'first_name' => 'Vicky',
'last_name' => '****'
}
}
],
'name' => {
'first_name' => 'Andy',
'last_name' => '****'
}
};
Notice that it finds the hashes in the array reference, it doesn't touch the object, and it doesn't touch the literal data that has last_name => in it.
If you don't like those behaviors, then you can modify what you do in wanted to account for what you'd like to happen. Suppose you want to look at certain objects too, like that Hash::AsObject object. One (polymorphic) way to do that is look for objects that let you call a last_name method (although this assumes you can give it an argument to change the last name):
sub wanted {
if( ref $_ eq ref {} and exists $_->{last_name} ) {
$_->{last_name} = '****';
}
# merely one way to do this
elsif( eval { $_->can('last_name') } ) {
$_->last_name( '****' );
}
}
Now the last_name member in the object is also redacted:
$VAR1 = {
'first_name' => 'Amelia',
'friends' => [
'last_name => \'REDACTED BY POLICY\'',
{
'last_name' => '****',
'first_name' => 'Camelia'
},
{
'first_name' => 'Larry',
'associate' => {
'first_name' => 'Vicky',
'last_name' => '****'
},
'last_name' => '****'
}
],
'last_name' => '****',
'name' => {
'first_name' => 'Andy',
'last_name' => '****'
},
'object' => bless( {
'first_name' => 'Peter',
'last_name' => '****'
}, 'Hash::AsObject' )
};
That wanted is as flexible as you'd like it to be, and it's pretty simple.

Why not to code such subroutine yourself?
use strict;
use warnings;
use feature 'say';
my $hash1 = {
firstname => 'john',
lastname => 'doe'
};
my $hash2_nested = {
name => {
firstname => 'jean',
lastname => 'doe'
}
};
my $mask = 'lastname';
hash_mask($hash1,$mask);
hash_mask($hash2_nested,$mask);
sub hash_mask {
say "\$VAR = {";
hash_mask_x(shift, shift, 1);
say "};";
}
sub hash_mask_x {
my $hash = shift;
my $mask_k = shift;
my $depth = shift;
my $indent = ' ' x 8;
my $space = $indent x $depth;
while( my($k,$v) = each %{$hash} ) {
if (ref $v eq 'HASH') {
say $space . "$k => {";
hash_mask_x($v,$mask_k,$depth+1);
say $space . "}";
} elsif( $k eq $mask_k ) {
say $space . "'$k' => '*****'";
} else {
say $space . "'$k' => '$v'";
}
}
}
Output
$VAR = {
'lastname' => '*****'
'firstname' => 'john'
};
$VAR = {
name => {
'lastname' => '*****'
'firstname' => 'jean'
}
};

Parse text file into nested data structure

This is my text file:
animal, cola, husband, 36
animal, wilma, wife, 31
animal, pebbles, kid, 4
brutal, george, husband, 41
brutal, jane, wife, 39
brutal, elroy, kid, 9
cosa, homer, husband, 34
cosa, marge, wife, 37
cosa, bart, kid, 11
And this is the data structure I want:
%HASH = (
animal => [
{ name => "cola", role => "husband", age => 36, },
{ name => "wilma", role => "wife", age => 31, },
{ name => "pebbles", role => "kid", age => 4, },
],
brutal => [
{ name => "george", role => "husband", age => 41, },
{ name => "jane", role => "wife", age => 39, },
{ name => "elroy", role => "kid", age => 9, },
],
cosa => [
{ name => "homer", role => "husband", age => 34, },
{ name => "marge", role => "wife", age => 37, },
{ name => "bart", role => "kid", age => 11, },
],
);
I have some pieces of code, but I can't assemble them into a coherent script. I want only for someone to help me to define this structure and to understand it.

Parse each line into a hash.
Remove the key column from the hash.
Push the hash onto an array based on the key column.
Code:
use strict;
use warnings;
use Data::Dumper;
my %hash;
my #columns = qw(category name role age);
while (<DATA>) {
chomp;
my %temp;
#temp{#columns} = split(/\s*,\s*/);
my $key = delete($temp{category});
push(#{$hash{$key}}, \%temp);
}
print Dumper(\%hash);
__DATA__
animal, cola, husband, 36
animal, wilma, wife, 31
animal, pebbles, kid, 4
brutal, george, husband, 41
brutal, jane, wife, 39
brutal, elroy, kid, 9
cosa, homer, husband, 34
cosa, marge, wife, 37
cosa, bart, kid, 11
Output:
$VAR1 = {
'cosa' => [
{
'name' => 'homer',
'age' => '34',
'role' => 'husband'
},
{
'name' => 'marge',
'age' => '37',
'role' => 'wife'
},
{
'name' => 'bart',
'age' => '11',
'role' => 'kid'
}
],
'brutal' => [
{
'name' => 'george',
'age' => '41',
'role' => 'husband'
},
{
'name' => 'jane',
'age' => '39',
'role' => 'wife'
},
{
'name' => 'elroy',
'age' => '9',
'role' => 'kid'
}
],
'animal' => [
{
'name' => 'cola',
'age' => '36',
'role' => 'husband'
},
{
'name' => 'wilma',
'age' => '31',
'role' => 'wife'
},
{
'name' => 'pebbles',
'age' => '4',
'role' => 'kid'
}
]
};

The data structure you show has one critical rule: A value in a hash can only be a scalar.
So to associate a multi-valued variable with a key use the reference to that variable, here arrayref. And if values in that array need be more complex than scalars you again use a reference, here a hashref.† So the value for each key is an arrayref whose elements are hashrefs.
Then you need to learn how to access elements deeper in the structure. That isn't very complex either: dereference them at each level and you can then work with them like you would with an array or hash.
All this is in perldsc, for which one need be clear with perlreftut. A reference is perlref.
When this is put to use in your problem
use warnings;
use strict;
use Data::Dump qw(dd);
my $file = 'data.txt';
open my $fh, '<', $file or die "Can't open $file: $!";
my %result;
while (<$fh>) {
chomp;
my ($key, $name, $role, $age) = split /\s*,\s*/;
push #{$result{$key}},
{ name => $name, role => $role, age => $age };
}
dd \%result;
This prints the correct data structure. I use Data::Dump to see a complex data structure, which need be installed; there is Data::Dumper in the core. There are yet others.
The split above uses a regex /\s*,\s*/ for the delimiter, so to split the line by comma optionally surrounded by spaces. The default for the string to split is $_.
Note that we don't have to "add a key" or make its arrayref-value ahead of using them, as that's done via autovivification. See for example this page and this page and this page.
It is a complex feature that can bite if misused so please read up on it.
† If we attempt to use array or hash variables for array elements we are really trying to fit a list of values into a single "slot" of an array. That can't be done of course, and what happens is that they'll get "flattened" – their elements are merged with all other given scalar elements, and that whole aggregate list populates the array
my #ary = 1..3;
my %hash = (a => 10, b => 20);
# Most likely an error:
my #all = (5, #ary, %hash, 100); #--> (5, 1, 2, 3, a, 10, b, 20, 100)
where key-value pairs may come in any order since hashes are inherently unordered.
Instead, we take references to arrays and hashes and write
my #all = (5, \#ary, \%hash, 100);
Since references are scalars they are legit elements of the array and no flattening happens. So now contents of #ary and %hash keep their individuality and can be recovered as needed.

How to access array of hashes?

Hi i have an array of hashes as below, i want access hash element/elements. Say suppose i want to print doct1's name, i am not getting right result please help me how do i print that?
#doctors = (
'doct1' => {
'name' => 'abcd',
'specialization' => 'xyz',
'city' => 'pqr'
},
'doct2' => {
'name' => 'efgh',
'specialization' => 'mno',
'city' => 'stu'
}
);
print $doctors[0]{'name'};

Arrays don't have keys,
my #doctors = (
{
'name' => 'abcd',
'specialization' => 'xyz',
'city' => 'pqr'
},
{
'name' => 'efgh',
'specialization' => 'mno',
'city' => 'stu'
}
);
print $doctors[0]{'name'};

You don't have an AoH. You have an array containing both strings and references to hashes. This is a very poor data structure. It's messy and inefficient to locate the correct doctor.
my $i = 0;
$i += 2 while $i<#doctors && $doctors[$i] ne 'doct1';
die "Not found" if $i > #doctors;
say $doctors[$i+1]{name};
If you had an AoH as you say, it you look something like this:
my #doctors = (
{
id => 'doct1',
name => 'abcd',
specialization => 'xyz',
city => 'pqr',
},
{
id => 'doct2',
name => 'efgh',
specialization => 'mno',
city => 'stu',
},
);
That would be better.
my ($doctor) = grep { $_->{id} eq 'doct1' } #doctors
or die "Not found";
say $doctor->{name};
It's also possible that doct1 and doct2 are meaningless, and that you'd be happy using 0 and 1 instead. If so,
die "Not found" if #doctors < 0;
say $doctors[0]{name};
If doct1 and doct2 aren't meaningless, then the cleanest and most efficient solution would be to use an HoH.
my %doctors = (
doct1 => {
name => 'abcd',
specialization => 'xyz',
city => 'pqr',
},
doct2 => {
name => 'efgh',
specialization => 'mno',
city => 'stu',
},
);
The code would then be the simple:
my $doctor = $doctors{doct1}
or die "Not found";
say $doctor->{name};

This is a situation where using Data::Dumper is essential, what you actually have is an array of two strings and two hashrefs. If you were to print it out with Data::Dumper you would see this:
use Data::Dumper;
print Dumper \#doctors;
[
'doct1',
{
'city' => 'pqr',
'specialization' => 'xyz',
'name' => 'abcd'
},
'doct2',
{
'city' => 'stu',
'specialization' => 'mno',
'name' => 'efgh'
}
];
Each hashref has all the data that represents a doctor, the additional key at the front doesn't make any sense. Remove those keys and you will have a structure like this:
#doctors = (
{
'name' => 'abcd',
'specialization' => 'xyz',
'city' => 'pqr'
},
{
'name' => 'efgh',
'specialization' => 'mno',
'city' => 'stu'
}
);
and now you can access the hash attributes like you would expect:
print $doctors[0]{name};

The right hand declaration is not very consistent (in intention) with the assignment to an array. You'd probably want to assign it to a hash instead:
%doctors = (
'doct1' => {
'name' => 'abcd',
'specialization' => 'xyz',
'city' => 'pqr'
},
'doct2' => {
'name' => 'efgh',
'specialization' => 'mno',
'city' => 'stu'
}
);
print $doctors{'doct1'}->{'name'};
Either this or, mpapec's answer.

Combining 2+ 'deep' (multi-dimensional) hashes in perl

There is a question that explains exactly what I want here: how to merge 2 deep hashes in perl
However, the answer there does not seem to work for me (suggestions of using the Merge module).
I have two hashes like so:
$VAR1 = {
'57494' => {
'name' => 'John Smith',
'age' => '9',
'height' => '120'
},
'57495' => {
'name' => 'Amy Pond',
'age' => '17',
'height' => '168'
}
}
};
$VAR1 = {
'57494' => {
'name_address' => 'Peter Smith',
'address' => '5 Cambridge Road',
'post_code' => 'CR5 0FS'
}
}
};
If I use Hash::Merge or the %c = {%a,%b} format I get this every time:
$VAR1 = '57494';
$VAR2 = {
'name_address' => 'Peter Smith',
'address' => '5 Cambridge Road',
'post_code' => 'CR5 0FS'
};
(so it basically overwrote the first data with the second and messed up the keys) when I want:
$VAR1 = {
'57494' => {
'name' => 'John Smith',
'age' => '9',
'height' => '120'
'name_address' => 'Peter Smith',
'address' => '5 Cambridge Road',
'post_code' => 'CR5 0FS'
},
'57495' => {
'name' => 'Amy Pond',
'age' => '17',
'height' => '168'
}
}
};
So when the keys are the same, the data merges together, otherwise the new keys are just appended onto the end. I hope this make sense. Maybe I've done something incorrectly using Merge or need to 'manually' add them in loops, but I'm spending too much time thinking about it, regardless!
Edit: how I use Merge to see if I'm doing something silly:
I have:
use Hash::Merge qw( merge );
...hash data above as %hash1 and %hash2...
my %combined_hash = %{ merge( %hash1,%hash2 ) };
print Dumper(%combined_hash);

If I do it with references, it works like a charm.
use strict; use warnings;
use Data::Dumper;
use Hash::Merge qw(merge);
my $h1 = {
'57494' => {
'name' => 'John Smith',
'age' => '9',
'height' => '120'
},
'57495' => {
'name' => 'Amy Pond',
'age' => '17',
'height' => '168'
}
};
my $h2 = {
'57494' => {
'name_address' => 'Peter Smith',
'address' => '5 Cambridge Road',
'post_code' => 'CR5 0FS'
}
};
my $h3 = merge( $h1, $h2 );
print Dumper $h3;
Output:
$VAR1 = {
'57495' => {
'name' => 'Amy Pond',
'age' => '17',
'height' => '168'
},
'57494' => {
'name_address' => 'Peter Smith',
'name' => 'John Smith',
'post_code' => 'CR5 0FS',
'address' => '5 Cambridge Road',
'height' => '120',
'age' => '9'
}
};
If, however, I do it with hashes instead of hash refs, it doesn't:
my %hash1 = (
'57494' => {
'name' => 'John Smith',
'age' => '9',
'height' => '120'
},
'57495' => {
'name' => 'Amy Pond',
'age' => '17',
'height' => '168'
}
);
my %hash2 = (
'57494' => {
'name_address' => 'Peter Smith',
'address' => '5 Cambridge Road',
'post_code' => 'CR5 0FS'
}
);
my %h3 = merge( %hash1, %hash2 );
print Dumper \%h3;
__END__
$VAR1 = {
'57495' => undef
};
That is because the merge from Hash::Merge can only take references, but you are passing it hashes. In addition, you need to call it in scalar context.
Try it like so:
# +--------+--- references
# ,-- SCALAR context | |
my $combined_hash = %{ merge( \%hash1, \%hash2 ) };
print Dumper($combined_hash);

for my $key (keys %fromhash) {
if(not exists $tohash{$key}) {
$tohash{$key} = {};
}
for my $subkey (keys %{$fromhash{$key}}) {
${$tohash{$key}}{$subkey} = ${$fromhash{$key}}{$subkey};
}
}
With more or less braces depending on whether my last coffee was any good.
Python is definitely more comfortable for this kind of thing, because it doesn't make you think about references:
for key in fromdict:
if key not in todict:
todict[key] = {}
todict[key] = dict(fromdict[key].items() + todict[key].items())
Or if todict is a defaultdict (creating keys on read as well as assignment):
for key in fromdict:
todict[key] = dict(dict(fromdict[key]).items() + dict(todict[key]).items())

Build hash of hashes from a single hash

I have the following hash
my %input_hash = (
'test1' => '100',
'test2' => '200',
'test3' => '300',
'test4' => '400',
'test5' => '500'
);
What I need is to build a hash of hash from the above hash. I need to put the first 2 of the above key value pair into a key of the hash of hash. Better explained with this example.
Desired output:
my %expected_hash = (
1 => {
'test1' => '100',
'test2' => '200',
},
2 => {
'test3' => '300',
'test4' => '400',
},
3 => {
'test5' => '500'
},
);
I would like the split to be dynamic. Example,if i need to split by 3, the desired output should be
my %expected_hash = (
1 => {
'test1' => '100',
'test2' => '200',
'test3' => '300',
},
2 => {
'test4' => '400',
'test5' => '500'
},
);

Here's a version that uses splice to get a dynamic number of elements. Note that you have to sort the keys in the hash, because hashes are unordered.
use strict;
use warnings;
use Data::Dumper;
my %input_hash = (
'test1' => '100',
'test2' => '200',
'test3' => '300',
'test4' => '400',
'test5' => '500',
'test6' => '600',
'test7' => '700',
'test8' => '800',
'test9' => '900'
);
my $foo = foo(\%input_hash, 4);
print Dumper $foo;
sub foo {
my ($href, $count) = #_;
my #keys = sort keys %$href;
my %hash;
my $i = 1;
while (#keys) {
$hash{$i++} = { map { $_ => $href->{$_} }
splice #keys, 0, $count };
}
return \%hash;
}
Output:
$VAR1 = {
'1' => {
'test1' => '100',
'test4' => '400',
'test3' => '300',
'test2' => '200'
},
'3' => {
'test9' => '900'
},
'2' => {
'test8' => '800',
'test5' => '500',
'test7' => '700',
'test6' => '600'
}
};

Here's a solution using an index array built from number of keys in %input_hash and desired size set in $chunk_size:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my %input_hash = (
'test1' => '100',
'test2' => '200',
'test3' => '300',
'test4' => '400',
'test5' => '500'
);
my $chunk_size = 2;
my #indexes = map {int($_ / $chunk_size) + 1} 0 .. keys %input_hash;
my %expected_hash;
for my $key (sort keys %input_hash) {
my $index = shift #indexes;
$expected_hash{$index}{$key} = $input_hash{$key};
}
print Dumper \%expected_hash;
Output:
$VAR1 = {
'1' => {
'test1' => '100',
'test2' => '200'
},
'3' => {
'test5' => '500'
},
'2' => {
'test4' => '400',
'test3' => '300'
}
};
Of course, as TLP mentioned, you have to sort %input_hash to achieve this.

#! /usr/bin/perl -w
use strict;
my ($expected_hash_key, $cnt, $L1, $L2) = (1, 1, "", "");
my %expected_hash;
# Change $LIMIT to required value. 2 or 3.
my $LIMIT = 2;
my %input_hash = (
'test1' => '100',
'test2' => '200',
'test3' => '300',
'test4' => '400',
'test5' => '500'
);
for (sort keys %input_hash) {
$cnt++;
$expected_hash{$expected_hash_key}{$_} = $input_hash{$_};
if ($cnt == $LIMIT + 1) {
$cnt = 1;
$expected_hash_key++;
}
}
for $L1 (sort keys %expected_hash) {
print "$L1 ==> \n";
for $L2 (sort keys %{ $expected_hash{$L1} }) {
print "$L2 -> $expected_hash{$L1}{$L2}\n";
}
print "\n";
}