How do I merge hashes from an array of hashes - perl

Input array:
my #input = (
{
Id => 1,
A => "abcd",
B => undef,
C => "rtyt"
},
{
Id => 1,
A => undef,
B => "efgh",
},
{
Id => 2,
A => "ifk",
B => "rjot",
},
);
Desired output:
my #output = (
{
Id => 1,
A => "abcd",
B => "efgh",
C => "rtyt"
},
{
Id => 2,
A => "ifk",
B => "rjot",
},
);
Other solutions I've seen here either include the extra values from first hash or from second hash, but I need both.

Hashes are great for grouping things.
# Keep first value in case of conflict.
my %by_id;
for my $rec (#input) {
my $id = $rec->{Id};
for my $k (keys(%$rec)) {
$by_id{$id}{$k} //= $rec->{$k};
}
}
or
# Keep last value in case of conflict.
my %by_id;
for my $rec (#input) {
my $id = $rec->{Id};
$by_id{$id} //= { %$rec };
for my $k (keys(%$rec)) {
if (defined($rec->{$k})) {
$by_id{$id}{$k} = $rec->{$k};
}
}
}
The above builds
my %by_id = (
"1" => {
Id => 1,
A => "abcd",
B => "efgh",
C => "rtyt",
},
"2" => {
Id => 2,
A => "ifk",
B => "rjot",
},
);
Then, we can simply extract the values.
my #output = #by_id{
sort { $a <=> $b }
keys(%by_id)
};

Related

Perl hash add values

I am trying to push values into hash. I want to add the values under 'par3'.
e.g.
$VAR1 = { 'obj1' => ['par1',
'par2',
'par3' => ['par4','par5','par6',....]]}
I should also be able to add elements into 'par3' in case 'obj1'-'par1'-'par2'-'par3' matches.
So far I have this, but I can't figure out how can I add "the second level" under 'par3':
push #{$a{$obj}},$par1,$par2,$par3
[ ... ] is a reference to an array. Array elements are scalars. So it is not possible to directly have the structure you seem to be requesting (ie. the par3 => [ ... ] pseudocode from your question). See perldsc
It's not obvious what you are trying to do but a couple of possible ideas might be to use a reference to a hash, or to replace the array with a hash:
use Data::Dumper;
$Var2a = {
'obj1' => [
'par1',
'par2',
{ 'par3' => undef, }
],
};
push #{ $Var2a->{obj1}[2]{par3} }, 'par4', 'par5', 'par6';
print Dumper $Var2a;
$Var2b = {
'obj1' => {
'par1' => undef,
'par2' => undef,
'par3' => undef,
},
};
push #{ $Var2b->{obj1}{par3} }, 'par4', 'par5', 'par6';
print Dumper $Var2b;
I think you want to add children to par3. As such, I think you want pars to have children. If so, you need a different data structure.
# Ordered
my $obj1 = [
{ name => 'par1', children => [ ] },
{ name => 'par2', children => [ ] },
{ name => 'par3', children => [
{ name => 'par4', children => [ ] },
{ name => 'par5', children => [ ] },
{ name => 'par6', children => [ ] },
] },
];
or
# Unordered
my $obj1 = {
par1 => { },
par2 => { },
par3 => {
par4 => { },
par5 => { },
par6 => { },
},
};
To append par7 to par3's children, you would use
# Ordered
use List::Util qw( first );
my $par7 = { name => 'par7', children => [ ] };
my $par3 = first { $_->{ name } eq 'par3' } #$obj1
or die( "par3 not found" );
push #{ $par3->{ children } }, $par7;
or
# Unordered
$obj1->{ par3 }{ par7 } = { };

Adding an array into a JSON struct in Perl

I have a hash which looks as following (In Perl):
'multi' => {
'key2' => '123',
'key1' => 'abc',
'key3' => 'xwz'
}
Also I have the following structure:
my $json_struct = {
name => $name,
time => $time,
};
I would like to insert multi into the $json_struct as a array with keys and values.
I use the $json_struct in order to convert it to a JSON file, so the JSON output should look as following:
{
"name" : "some_name",
"time" : "time",
"multi" : [
{
"key" : "key1",
"value" : "abc"
},
{
"key" : "key2",
"value" : "123"
},
{
"key" : "key3",
"value" : "xwz"
}
],
}
How can I do it? What is the cleanest way possible (without importing additional modules).
To add an array, you just add the reference to the array into your data structure:
$json_struct->{multi} = \#multi;
In the case that %multi is a hash, you can use:
my #multi = map { +{ key => $_, value => $multi{ $_ } } } sort keys %multi;
$json_struct->{multi} = \#multi;
See Also
https://perldoc.perl.org/perlreftut.html
#!/usr/bin/env perl
use v5.10.1;
use warnings FATAL => "all";
use autodie;
use Data::Dumper;
my $payload = {
'multi' => {
'key2' => '123',
'key1' => 'abc',
'key3' => 'xwz'
}
};
my $json_struct = {
name => "foo",
time => "2018-12-03",
};
for my $key (keys %$payload) {
my #as_array = map {
{
key => $_,
value => $payload->{$key}{$_}
}
} keys %{$payload->{$key}};
$json_struct->{$key} = \#as_array;
}
say Dumper $json_struct;
# $VAR1 = {
# 'name' => 'foo',
# 'time' => '2018-12-03',
# 'multi' => [
# {
# 'value' => 'abc',
# 'key' => 'key1'
# },
# {
# 'value' => '123',
# 'key' => 'key2'
# },
# {
# 'key' => 'key3',
# 'value' => 'xwz'
# }
# ]
# };

How to sort this custom hashes of hashes in perl on the value

I am facing an issue with the sort functionality on my Application. I need to sort my Hash of hashes on the lname key under the instructors. The legacy
application is written in Perl.
Here is the dump of the Hash which i need to sort.
$VAR1 = {
'instructors' => [
{
'is_placeholder' => 0,
'lname' => 'Lordy',
'name' => 'Daniel Lordy'
},
{
'is_placeholder' => 0,
'lname' => 'Fisher',
'name' => 'Bethy Fisher'
},
{
'is_placeholder' => 0,
'lname' => 'Jaya',
'name' => 'Jennifer Jaya'
},
],
'id' => '1237058',
'XXX' => {
'name' => 'Fall 2015 MFT Master 695',
},
'YYY' => '45'
};
The instructors key in the above structure can be empty as well.
For Example:
$VAR1 = {
'instructors' => [],
'id' => '1237058',
'XXX' => {
'name' => 'Fall 2015 MFT Master 695',
},
'YYY' => '45'
};
In my application, Users have an option to sort the column based on instructor names. So when user sorts by ascending order, the application should show rows which have instructors are empty at the start and then show the rest of the rows in which each row has the instructor names sorted in ascending order. Vice versa for Descending Order.
This is the code which I have tried until now.
if( $sort_order eq 'ASC' ) {
foreach my $elem ( #$course_sections ) {
my #sorted = map { $_->[1] }
sort { $a->[0] cmp $b->[0] }
map { [$_->{'lname'}, $_] } #{$elem->{'instructors'}};
}
if( $sort_order eq 'DESC' ) {
foreach my $elem ( #$course_sections ) {
my #sorted = map { $_->[1] }
sort { $b->[0] cmp $a->[0] }
map { [$_->{'lname'}, $_] } #{$elem->{'instructors'}};
}
How do I get this #sorted hash affect the order of rows in #$course_sections. Let me know if there is any easier way to do it.
Thanks in Advance.
You need to replace each instructors array ref with the sorted version that you created in your foreach loop. That way you get the instructors of each individual row sorted. Then you can sort the whole $course_sections by the name of the first instructor of each row.
# sort the instructors in-place
foreach my $elem (#$course_sections) {
$elem->{'instructors'} = [
map { $_->[1] }
sort { $a->[0] cmp $b->[0] }
map { [ $_->{'lname'}, $_ ] } #{ $elem->{'instructors'} }
];
}
# sort the courses by first instructor
$course_sections = [
map { $_->[1] }
sort { $a->[0] cmp $b->[0] }
map { [ ( $_->{'instructors'}->[0] ? $_->{'instructors'}->[0]->{'lname'} : q{} ), $_ ] }
#$course_sections
];
Make sure to replace undef values with empty strings so the cmp doesn't blow up. We shouldn't do $_->{'instructors'}->[0]->{'lname'} // q{} because autovivification might create a bunch of empty stuff in our data structure.
Here's your example data pulled together:
my $course_sections = [
{
'instructors' => [
{
'is_placeholder' => 0,
'lname' => 'Lordy',
'name' => 'Daniel Lordy'
},
{
'is_placeholder' => 0,
'lname' => 'Fisher',
'name' => 'Bethy Fisher'
},
{
'is_placeholder' => 0,
'lname' => 'Jaya',
'name' => 'Jennifer Jaya'
},
],
'id' => '1237058',
'XXX' => {
'name' => 'Fall 2015 MFT Master 695',
},
'YYY' => '45'
},
{
'instructors' => [],
'id' => '1237058',
'XXX' => {
'name' => 'Fall 2015 MFT Master 695',
},
'YYY' => '45'
}
];
And this is the output, dumped with Data::Printer.
\ [
[0] {
id 1237058,
instructors [],
XXX {
name "Fall 2015 MFT Master 695"
},
YYY 45
},
[1] {
id 1237058,
instructors [
[0] {
is_placeholder 0,
lname "Fisher",
name "Bethy Fisher"
},
[1] {
is_placeholder 0,
lname "Jaya",
name "Jennifer Jaya"
},
[2] {
is_placeholder 0,
lname "Lordy",
name "Daniel Lordy"
}
],
XXX {
name "Fall 2015 MFT Master 695"
},
YYY 45
}
]

Convert a flat datastructure into a tree

I have an array of hashes. Each element in the array is a node in a hierarchical tree and has referential data for who the parent is. I will have thousands and hundreds of thousands of nodes in the tree... essentially an unknown set of nodes has to be converted to JSON (shown below) for use with http://bl.ocks.org/robschmuecker/7880033
UPDATE: position_id is a node in the heretical tree. placement_id is the parent's position_id (adjacency referential tree).
UPDATE: Here's the full AoH Data::Dumper result with Nested Set and Adjacency result from a modified version of DBIx::Tree::NestedSet (custom).
$VAR1 = [
{
'lft' => '673',
'id' => '109',
'date_created' => '2015-08-15',
'level' => '7',
'user_id' => '13',
'placement_id' => '11',
'position_id' => '13',
'status' => '1',
'structure_id' => '1',
'rght' => '684'
},
{
'placement_id' => '13',
'position_id' => '22',
'status' => '1',
'structure_id' => '1',
'rght' => '679',
'lft' => '674',
'date_created' => '2015-08-15',
'id' => '116',
'level' => '8',
'user_id' => '22'
},
{
'user_id' => '101',
'level' => '9',
'id' => '200',
'date_created' => '2015-08-15',
'lft' => '675',
'rght' => '676',
'structure_id' => '1',
'status' => '1',
'position_id' => '101',
'placement_id' => '22'
},
{
'date_created' => '2015-08-15',
'id' => '201',
'level' => '9',
'user_id' => '374',
'lft' => '677',
'structure_id' => '1',
'rght' => '678',
'placement_id' => '22',
'position_id' => '374',
'status' => '1'
},
{
'lft' => '680',
'user_id' => '95',
'level' => '8',
'id' => '117',
'date_created' => '2015-08-15',
'status' => '1',
'position_id' => '95',
'placement_id' => '13',
'rght' => '681',
'structure_id' => '1'
}
];
THIS IS THE GOAL, For this example I need to end up with:
{
"name": "13",
"children": [
{
"name": "22",
"children": [
{
"name": "101"
},
{
"name": "374"
}
]
},
{
"name": "95"
}
]
}
You can also see the format I am trying to arrive at here (minus size):
http://bl.ocks.org/robschmuecker/7880033#flare.json
My failed approach(es) included various attempts at looping through the array of hashes to create a recursive Hash of Hashes that can then be used with the JSON Perl module to create the actual JSON I need.
my $data = [
{ position_id => 123, placement_id => undef },
{ position_id => 456, placement_id => 123 },
{ position_id => 789, placement_id => 123 },
# ...
];
my $roots;
{
my %recs_by_name;
my %children_by_parent_name;
for my $row (#$data) {
my $name = $row->{position_id};
my $parent_name = $row->{placement_id};
my $rec = {
name => $name,
};
push #{ $children_by_parent_name{$parent_name // 'root'} }, $rec;
$recs_by_name{$name} = $rec;
}
$roots = delete($children_by_parent_name{root}) || [];
for my $name (keys(%children_by_parent_name)) {
my $children = $children_by_parent_name{$name};
if ( my $rec = $recs_by_name{$name} ) {
$rec->{children} = $children;
} else {
die("Parent $name doesn't exist.\n");
push #$roots, #$children;
}
}
}
print(Dumper($roots));
Tested.
You appear to have the depth of each node available to you (level). Simpler code could be used if your data was sorted by increasing depths.
While it was #ikegami who ultimately answered the question that led to the solution. I believe the following adaptation adds 4 important elements/clarifications I found helpful, and thought others reading this question and answer would also find useful.
1- Clear addition of all key,value pairs from the originating AoH to the resulting HOH. See while loop.
2- A Child node counter.
3- Inclusion and use of the encode_json function from JSON
4- The result is also an Array with a Hash as the first element. Newbies (like me) might find the explicit #{$roots}[0] passed to encode_json as helpful.
At first I had a similar adapted solution posted as an UPDATE within my question, but was admonished that it was bad etiquette and instructed to post an answer.
#ikegami's deserves the credit for the core of the solution.
sub get_jsonTree {
my ($array_of_hashes_ref) = #_;
my $roots;
my %recs_by_name;
my %children_by_parent_name;
my %count;
for my $row (#$array_of_hashes_ref) {
my $name = $row->{position_id};
my $parent_name = $row->{placement_id};
my $rec = {
name => $name,
};
## Added to loop through all key,value pairs and add them to $rec
while ( my ($key, $value) = each(%$row) ) {
$rec->{$key} = $value;
}
##Added To Count Child Nodes
$count{$parent_name} = 0 if (!$count{$parent_name});
$rec->{'child_count'} = $count{$parent_name};
$count{$parent_name}++;
push #{ $children_by_parent_name{$parent_name // 'root'} }, $rec;
$recs_by_name{$name} = $rec;
}
$roots = delete($children_by_parent_name{root}) || [];
for my $name (keys(%children_by_parent_name)) {
my $children = $children_by_parent_name{$name};
if ( my $rec = $recs_by_name{$name} ) {
$rec->{children} = $children;
} else {
$util{'test'} .= "Parent $name doesn't exist.\n<BR>";
push #$roots, #$children;
}
}
use JSON;
my $json_str = encode_json(#{$roots}[0]);
return $json_str;
}
my $array_of_hashes_ref = [
{ position_id => 123, placement_id => undef },
{ position_id => 456, placement_id => 123 },
{ position_id => 789, placement_id => 123 },
# ...
];
my $json_str = &get_jsonTree($array_of_hashes_ref);

How can I merge several hashes into one hash in Perl?

In Perl, how do I get this:
$VAR1 = { '999' => { '998' => [ '908', '906', '0', '998', '907' ] } };
$VAR1 = { '999' => { '991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ] } };
$VAR1 = { '999' => { '996' => [] } };
$VAR1 = { '999' => { '995' => [] } };
$VAR1 = { '999' => { '994' => [] } };
$VAR1 = { '999' => { '993' => [] } };
$VAR1 = { '999' => { '997' => [ '986', '987', '990', '984', '989', '988' ] } };
$VAR1 = { '995' => { '101' => [] } };
$VAR1 = { '995' => { '102' => [] } };
$VAR1 = { '995' => { '103' => [] } };
$VAR1 = { '995' => { '104' => [] } };
$VAR1 = { '995' => { '105' => [] } };
$VAR1 = { '995' => { '106' => [] } };
$VAR1 = { '995' => { '107' => [] } };
$VAR1 = { '994' => { '910' => [] } };
$VAR1 = { '993' => { '909' => [] } };
$VAR1 = { '993' => { '904' => [] } };
$VAR1 = { '994' => { '985' => [] } };
$VAR1 = { '994' => { '983' => [] } };
$VAR1 = { '993' => { '902' => [] } };
$VAR1 = { '999' => { '992' => [ '905' ] } };
to this:
$VAR1 = { '999:' => [
{ '992' => [ '905' ] },
{ '993' => [
{ '909' => [] },
{ '904' => [] },
{ '902' => [] }
] },
{ '994' => [
{ '910' => [] },
{ '985' => [] },
{ '983' => [] }
] },
{ '995' => [
{ '101' => [] },
{ '102' => [] },
{ '103' => [] },
{ '104' => [] },
{ '105' => [] },
{ '106' => [] },
{ '107' => [] }
] },
{ '996' => [] },
{ '997' => [ '986', '987', '990', '984', '989', '988' ] },
{ '998' => [ '908', '906', '0', '998', '907' ] },
{ '991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ] }
]};
I think this is closer than anybody else has gotten:
This does most of what you want. I did not store things in arrays of singular
hashes, as I don't feel that that is useful.
Your scenario is not a regular one. I've tried to genericize this to some extent,
but was not possible to overcome the singularity of this code.
First of all because it appears you want to collapse everything with the same
id into a merged entity (with exceptions), you have to descend through the structure
pulling the definitions of the entities. Keeping track of levels, because you
want them in the form of a tree.
Next, you assemble the ID table, merging entities as possible. Note that you
had 995 defined as an empty array one place and as a level another. So given
your output, I wanted to overwrite the empty list with the hash.
After that, we need to move the root to the result structure, descending that in order
to assign canonical entities to the identifiers at each level.
Like I said, it's not anything that regular. Of course, if you still want a list
of hashes which are no more than pairs, that's an exercise left to you.
use strict;
use warnings;
# subroutine to identify all elements
sub descend_identify {
my ( $level, $hash_ref ) = #_;
# return an expanding list that gets populated as we desecend
return map {
my $item = $hash_ref->{$_};
$_ => ( $level, $item )
, ( ref( $item ) eq 'HASH' ? descend_identify( $level + 1, $item )
: ()
)
;
} keys %$hash_ref
;
}
# subroutine to refit all nested elements
sub descend_restore {
my ( $hash, $ident_hash ) = #_;
my #keys = keys %$hash;
#$hash{ #keys } = #$ident_hash{ #keys };
foreach my $h ( grep { ref() eq 'HASH' } values %$hash ) {
descend_restore( $h, $ident_hash );
}
return;
}
# merge hashes, descending down the hash structures.
sub merge_hashes {
my ( $dest_hash, $src_hash ) = #_;
foreach my $key ( keys %$src_hash ) {
if ( exists $dest_hash->{$key} ) {
my $ref = $dest_hash->{$key};
my $typ = ref( $ref );
if ( $typ eq 'HASH' ) {
merge_hashes( $ref, $src_hash->{$key} );
}
else {
push #$ref, $src_hash->{$key};
}
}
else {
$dest_hash->{$key} = $src_hash->{$key};
}
}
return;
}
my ( %levels, %ident_map, %result );
#descend through every level of hash in the list
# #hash_list is assumed to be whatever you Dumper-ed.
my #pairs = map { descend_identify( 0, $_ ); } #hash_list;
while ( #pairs ) {
my ( $key, $level, $ref ) = splice( #pairs, 0, 3 );
$levels{$key} |= $level;
# if we already have an identity for this key, merge the two
if ( exists $ident_map{$key} ) {
my $oref = $ident_map{$key};
my $otyp = ref( $oref );
if ( $otyp ne ref( $ref )) {
# empty arrays can be overwritten by hashrefs -- per 995
if ( $otyp eq 'ARRAY' && #$oref == 0 && ref( $ref ) eq 'HASH' ) {
$ident_map{$key} = $ref;
}
else {
die "Uncertain merge for '$key'!";
}
}
elsif ( $otyp eq 'HASH' ) {
merge_hashes( $oref, $ref );
}
else {
#$oref = sort { $a <=> $b || $a cmp $b } keys %{{ #$ref, #$oref }};
}
}
else {
$ident_map{$key} = $ref;
}
}
# Copy only the keys that do not appear at higher levels to the
# result hash
if ( my #keys = grep { !$levels{$_} } keys %ident_map ) {
#result{ #keys } = #ident_map{ #keys } if #keys;
}
# then step through the hash to make sure that the entries at
# all levels are equal to the identity
descend_restore( \%result, \%ident_map );
Use CPAN! Try Hash::Merge
# OO interface.
my $merge = Hash::Merge->new( 'LEFT_PRECEDENT' );
my %c = %{ $merge->merge( \%a, \%b ) };
See CPAN for more info, it pretty much does everything you would want to, and is fully customizable.
Give this recursive solution a try:
# XXX: doesn't handle circular problems...
sub deepmerge {
my (#structs) = #_;
my $new;
# filter out non-existant structs
#structs = grep {defined($_)} #structs;
my $ref = ref($structs[0]);
if (not all(map {ref($_) eq $ref} #structs)) {
warn("deepmerge: all structs are not $ref\n");
}
my #tomerge = grep {ref($_) eq $ref} #structs;
return qr/$tomerge[0]/ if scalar(#tomerge) == 1 and $ref eq 'Regexp';
return $tomerge[0] if scalar(#tomerge) == 1;
if ($ref eq '') {
$new = pop(#tomerge); # prefer farthest right
}
elsif ($ref eq 'Regexp') {
$new = qr/$tomerge[$#tomerge]/;
}
elsif ($ref eq 'ARRAY') {
$new = [];
for my $i (0 .. max(map {scalar(#$_) - 1} #tomerge)) {
$new->[$i] = deepmerge(map {$_->[$i]} #tomerge);
}
}
elsif ($ref eq 'HASH') {
$new = {};
for my $key (uniq(map {keys %$_} #tomerge)) {
$new->{$key} = deepmerge(map {$_->{$key}} #tomerge);
}
}
else {
# ignore all other structures...
$new = '';
}
return $new;
}
Modify it to your hearts content to achieve the desired result.
Upon further investigation, I noticed you're merging them in some different way than the above algorithm. Maybe just use this as an example then. Mine does this:
deepmerge({k => 'v'}, {k2 => 'v2'});
# returns {k => 'v', k2 => 'v2'}
And similar things for arrays.
I indented your wanted output as it was hard to read, for the benefit of other people who want to answer. I'm still thinking of an answer.
$VAR1 = { '999:' => [
{ '992' => [ '905' ] },
{ '993' => [
{ '909' => [] },
{ '904' => [] },
{ '902' => [] }
]
},
{ '994' => [
{ '910' => [] },
{ '985' => [] },
{ '983' => [] }
]
},
{ '995' => [
{ '101' => [] },
{ '102' => [] },
{ '103' => [] },
{ '104' => [] },
{ '105' => [] },
{ '106' => [] },
{ '107' => [] }
]
},
{ '996' => [] },
{ '997' => [ '986', '987', '990', '984', '989', '988' ] },
{ '998' => [ '908', '906', '0', '998', '907' ] },
{ '991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ] }
]
};
I don't see the point of all those single entry hashes though, would not the following be better?
$VAR1 = { '999:' => {
'992' => [ '905' ],
'993' => {
'909' => [],
'904' => [],
'902' => []
},
'994' => {
'910' => [],
'985' => [],
'983' => []
},
'995' => {
'101' => [],
'102' => [],
'103' => [],
'104' => [],
'105' => [],
'106' => [],
'107' => []
},
'996' => [],
'997' => [ '986', '987', '990', '984', '989', '988' ],
'998' => [ '908', '906', '0', '998', '907' ],
'991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ]
}
};
Assuming the above data is in a file dump.txt, you can eval it piece by piece.
Updated code below
use strict;
use File::Slurp;
my $final_data = {};
my #data = map {eval $_} (read_file("dump.txt") =~ /\$VAR1 = ([^;]+);/gs);
foreach my $element (#data) {
my $key = (keys %$element)[0];
$final_data->{$key} ||= [];
push #{$final_data->{$key}}, $element->{$key}
};
use Data::Dumper;
print Data::Dumper->Dump([$final_data]);
If you want to completely deep merge, you can at the end pass $final_data through this (not tested!!!) deep merger:
# Merge an array of hashes as follows:
# IN: [ { 1 => 11 }, { 1 => 12 },{ 2 => 22 } ]
# OUT: { 1 => [ 11, 12 ], 2 => [ 22 ] }
# This is recursive - if array [11,12] was an array of hashrefs, we merge those too
sub merge_hashes {
my $hashes = #_[0];
return $hashes unless ref $hashes eq ref []; # Hat tip to brian d foy
return $hashes unless grep { ref #_ eq ref {} } #$hashes; # Only merge array of hashes
my $final_hashref = {};
foreach my $element (#$hashes) {
foreach my $key (keys %$element) {
$final_hashref->{$key} ||= [];
push #{ $final_hashref->{$key} }, $element->{$key};
}
}
foreach my $key (keys %$final_hashref) {
$final_hashref->{$key} = merge_hashes($final_hashref->{$key});
}
return $final_hashref;
}
Use push and autovivification.
Start with the usual front matter:
#! /usr/bin/perl
use warnings;
use strict;
Read your sample input from the DATA filehandle and create a datastructure similar to the one you dumped:
my #hashes;
while (<DATA>) {
my $VAR1;
$VAR1 = eval $_;
die $# if $#;
push #hashes => $VAR1;
}
Your input has two cases:
A reference to an array that contains data to be merged with its cousins that have the same "key path."
Otherwise, it's a reference to a hash that contains a reference to an array from case 1 at some depth, so we strip off the outermost layer and keep digging.
Note the use of $_[0]. The semantics of Perl subroutines are such that the values in #_ are aliases rather than copies. This lets us call merge directly without having to first create a bunch of scaffolding to hold the merged contents. The code will break if you copy the value instead.
sub merge {
my $data = shift;
if (ref($data) eq "ARRAY") {
push #{ $_[0] } => #$data;
}
else {
foreach my $k (%$data) {
merge($data->{$k} => $_[0]{$k});
}
}
}
Now we walk #hashes and incrementally merge their contents into %merged.
my %merged;
foreach my $h (#hashes) {
foreach my $k (keys %$h) {
merge $h->{$k} => $merged{$k};
}
}
We don't know in what order the values arrived, so run a final cleanup pass to sort the arrays:
sub sort_arrays {
my($root) = #_;
if (ref($root) eq "ARRAY") {
#$root = sort { $a <=> $b } #$root;
}
else {
sort_arrays($root->{$_}) for keys %$root;
}
}
sort_arrays \%merged;
The Data::Dumper module is great for quick debugging!
use Data::Dumper;
$Data::Dumper::Indent = 1;
print Dumper \%merged;
Place a copy of the input from your question into the special DATA filehandle:
__DATA__
$VAR1 = { '999' => { '998' => [ '908', '906', '0', '998', '907' ] } };
$VAR1 = { '999' => { '991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ] } };
$VAR1 = { '999' => { '996' => [] } };
$VAR1 = { '999' => { '995' => [] } };
$VAR1 = { '999' => { '994' => [] } };
$VAR1 = { '999' => { '993' => [] } };
$VAR1 = { '999' => { '997' => [ '986', '987', '990', '984', '989', '988' ] } };
$VAR1 = { '995' => { '101' => [] } };
$VAR1 = { '995' => { '102' => [] } };
$VAR1 = { '995' => { '103' => [] } };
$VAR1 = { '995' => { '104' => [] } };
$VAR1 = { '995' => { '105' => [] } };
$VAR1 = { '995' => { '106' => [] } };
$VAR1 = { '995' => { '107' => [] } };
$VAR1 = { '994' => { '910' => [] } };
$VAR1 = { '993' => { '909' => [] } };
$VAR1 = { '993' => { '904' => [] } };
$VAR1 = { '994' => { '985' => [] } };
$VAR1 = { '994' => { '983' => [] } };
$VAR1 = { '993' => { '902' => [] } };
$VAR1 = { '999' => { '992' => [ '905' ] } };
A sample of the output is below:
'994' => {
'910' => [],
'985' => [],
'983' => []
},
'999' => {
'993' => [],
'992' => [
'905'
],
'997' => [
'984',
'986',
'987',
'988',
'989',
'990'
],
wow. thanks so much everyone (especially Axeman)! sorry for the lack of code or clarification, I was trying to generate a tree, and did try Hash::Merge, but could not for the life of me resolve the coined-995 problem of replacing the empty 995 with the non-empty 995; Axeman's solution works beautifully and I really appreciate the help/collaboration! (also tried the others and it either did the same thing as Hash::Merge, or it actually got rid of some branches).
some background on the input: had a set of hashes, each had keys (all same level) and two of which defined a) a parent to another, and b) itself (the rest were children), and so with a tree, i figured a hash was perfect, came up with a set of new hashes {a}->{b}->[c], and here we are...
again, thanks everyone and Axeman!