Perl: Combine hash with Keys of Keys containing identical Key - perl

I am trying to merge these hashes in Perl and cannot figure out how to do it when the top-level key is identical. The inputs are $VAR1 and $VAR2, and the desired output is shown below.
Input:
my $VAR1 = { 'p'=> { 'a' => 2,
'b' => 3}};
my $VAR2 = { 'p'=> { 'c' => 4,
'd' => 7}};
Desired Output:
{'p'=> {
'a' => 2,
'b' => 3,
'c' => 4,
'd' => 7}};

Say you had
my %h1 = ( a => 2, b => 3 );
my %h2 = ( c => 4, d => 7 );
To combine them into a third hash, you can use
my %h = ( %h1, %h2 );
It would be as if you had done
my %h = ( a => 2, b => 3, c => 4, d => 7 );
Any keys in common will be taken from the hash later in the list.
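As a minimal sketch of that override rule (the overlapping key b and its values are made up for illustration), the value from the later hash wins:

```perl
use strict;
use warnings;

# Hypothetical data: both hashes contain the key 'b'.
my %h1 = ( a => 2,  b => 3 );
my %h2 = ( b => 99, c => 4 );

# The list is flattened left to right, so %h2's value for 'b' wins.
my %h = ( %h1, %h2 );

print join( ', ', map { "$_ => $h{$_}" } sort keys %h ), "\n";
# prints: a => 2, b => 99, c => 4
```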
In your case, you have anonymous hashes. So where we would have used %NAME, we will use %BLOCK, where the block returns a reference to the hash we want to use. This gives us the following:
my %h_inner = (
%{ $VAR1->{p} },
%{ $VAR2->{p} },
);
This can also be written with postfix dereference syntax (available without enabling a feature from Perl 5.24 on):
my %h_inner = (
$VAR1->{p}->%*,
$VAR2->{p}->%*,
);
Finally, you also want a second new hash with a single element keyed with p, whose value is a reference to the first new hash.
my %h_outer = ( p => \%h_inner );
So, all together, you want
my %h_inner = (
%{ $VAR1->{p} },
%{ $VAR2->{p} },
);
my %h_outer = ( p => \%h_inner );
We could also use the anonymous hash constructor ({}) instead.
my %h_outer = (
p => {
%{ $VAR1->{p} },
%{ $VAR2->{p} },
},
);
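If the hashes may hold more than one top-level key, the same idea can be applied per key. The following is only a sketch: merge_two_level is a hypothetical helper name, the extra key q is made up, and it assumes every top-level value is a hash reference.

```perl
use strict;
use warnings;

my $VAR1 = { p => { a => 2, b => 3 } };
my $VAR2 = { p => { c => 4, d => 7 }, q => { e => 1 } };   # 'q' is made up

# Hypothetical helper: merges hashes of hashes one level deep.
# Assumes every top-level value is a hash reference.
sub merge_two_level {
    my @sources = @_;
    my %merged;
    for my $src (@sources) {
        for my $outer ( keys %$src ) {
            # Later sources override earlier ones for identical inner keys.
            $merged{$outer} = {
                %{ $merged{$outer} // {} },
                %{ $src->{$outer} },
            };
        }
    }
    return \%merged;
}

my $merged = merge_two_level( $VAR1, $VAR2 );
print join( ', ', sort keys %{ $merged->{p} } ), "\n";
# prints: a, b, c, d
```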
Docs:
perlreftut
perllol

Related

Do latter keys in an hash assignment from an array always override earlier keys?

Given this:
%h = (a => 3, b => 4, a => 5);
Empirically $h{a} == 5 holds true, but are there cases where $h{a} == 3 because of internal dictionary hashing or some other perl-internal behavior?
Another way to ask: Does perl guarantee to keep key-ordering the same when assigning a list to a hash, even in the event of a key collision?
Duplicate key entries are convenient for things like %settings = (%defaults, %userflags) so I can hard-code defaults but override with user-supplied flags.
Yes, you can rely that the assignment list will be evaluated left to right as surely as you could rely on an assignment to an array occurring in the correct order.
sub DebugHash::TIEHASH { bless {}, shift }
sub DebugHash::CLEAR { %{ $_[0] } = (); } # %{shift} would be parsed as the hash %shift
sub DebugHash::STORE {
my ($tied, $key, $value) = @_;
print STDERR "STORE '$key' => '$value'\n";
$tied->{$key} = $value;
}
tie %hash, 'DebugHash';
%hash = (a => 'first', a => 'second', a => 'third',
a => 'fourth', a => 'next', a => 'last');
Output:
STORE 'a' => 'first'
STORE 'a' => 'second'
STORE 'a' => 'third'
STORE 'a' => 'fourth'
STORE 'a' => 'next'
STORE 'a' => 'last'
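The defaults-and-overrides idiom from the question relies on exactly this left-to-right behavior. A small sketch, with made-up setting names and values:

```perl
use strict;
use warnings;

# Hypothetical settings, for illustration only.
my %defaults  = ( verbose => 0, color => 'auto', retries => 3 );
my %userflags = ( verbose => 1, retries => 5 );

# %userflags comes later in the list, so its values replace the defaults.
my %settings = ( %defaults, %userflags );

print "$_ = $settings{$_}\n" for sort keys %settings;
# prints:
# color = auto
# retries = 5
# verbose = 1
```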
Key ordering in a hash is effectively random. You cannot rely on a hash being dumped or iterated in the order in which you assigned it: (a => 3, b => 4, a => 5) could be displayed as (b => 4, a => 5).
Also, your hash ends up with only two keys; the duplicate key simply overwrites the first value:
use Data::Dumper;
my %h = (a => 3, b => 4, a => 5);
print Dumper(\%h);
$VAR1 = {
'a' => 5,
'b' => 4
};
It took me only three tries to produce:
$VAR1 = {
'b' => 4,
'a' => 5
};
Update: as @mob mentioned, the list is processed left to right, which is what I would assume: the second (or nth) assignment of a value to a key replaces the previous value. In this case you are replacing the whole hash, so left-to-right processing yields a hash with two elements, and the value for any duplicated key is the last one encountered.

Perl eval Data::Dumper inconsistency

I have to serialize and deserialize in Perl. I am aware that Data::Dumper and eval are not the best suited ones for this job but I am not allowed to modify this aspect in the legacy scripts which I am working on.
Below are two ways ( CODE 1 and CODE 2 ) to use eval.
In CODE 1, the hash is available as a string before being deserialized via eval.
In CODE 2, the hash is serialized using Dumper before being deserialized via eval.
In both the code samples, one of two attempted ways to deserialize works. Why does the other way to deserialize not work ?
CODE 1
my $r2 = "(
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
)";
my %z;
eval "\%z = $r2"; ####### Works.
print "\%z = [".Data::Dumper::Dumper (\%z)."] ";
my $answer = eval "$r2"; #### Does NOT work.
print "\n\nEvaled = [".Dumper($answer)."] ";
Output
%z = [$VAR1 = {
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
};
]
Evaled = [$VAR1 = 2;
]
But below code works in reverse manner :
CODE 2
my %a = ( "q" =>2, "w"=>{ "k1"=>"key", "k2"=>5, k3=>["a", "b", 2, "3",], }, ); **# Same hash as above example.**
$Data::Dumper::Terse=1;
$Data::Dumper::Purity = 1;
my $r2 = Dumper(\%a);
my %z;
eval '\%z = $r2';
print "\n\n\%z = [".Dumper(\%z)."] "; #### Does NOT work.
my $answer = eval $r2;
print "\n\nEvaled = [".Dumper($answer)."] "; ####### Works.
Output
%z = [$VAR1 = {};
]
Evaled = [$VAR1 = {
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
};
]
First of all, please don't put comments that result in syntax errors (**).
Notice that the string you provided in the first code block produces a different data structure than the Dumper function. In the first block you are writing out a plain list of key/value pairs, but you don't assign it to any variable. In the case of the Dumper function, an anonymous hash is created and its reference is assigned to the $VAR1 variable.
To make the first code work, you should replace ( with { to create an anonymous hash and then assign it to a variable, for example:
my $r2 = "\$VAR1 = {
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
}";
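A minimal sketch of why the braces matter, using a shortened version of the data: a string holding an anonymous hash constructor evaluates to a hash reference even in scalar context. The leading + is an addition here, to make sure Perl parses the braces as a hash constructor rather than a block.

```perl
use strict;
use warnings;

# Shortened version of the data, written with { } instead of ( ).
my $r2 = "{ 'w' => { 'k2' => 5 }, 'q' => 2 }";

# The leading + makes Perl parse the braces as an anonymous hash, not a block.
my $answer = eval "+$r2";
die "eval failed: $@" if $@;

print ref($answer), "\n";       # prints: HASH
print $answer->{w}{k2}, "\n";   # prints: 5
```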
Don't use Data::Dumper to serialize data. Having said that...
This is a problem of scalar and list context. In the second eval, you have:
my $answer = ...;
Since you are assigning to a scalar, the right side is evaluated in scalar context. That means the eval is in scalar context. That value is:
(
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
)
That looks like a list, but it's really the comma operator in scalar context. That evaluates the left-hand side, discards that result, and returns the right-hand side. So, my $x = ( 'left', 'right' ) assigns right to $x. This is covered in What is the difference between a list and an array? in perlfaq4.
In your question, you see that $answer gets the value 2. That's the rightmost value in the comma chain, so that's the value you get back in scalar context. Change that to another value (perhaps 'duck'), and that's the value that you'll get back:
my $r2 = "(
'w' => {
'k2' => 5,
},
'q' => 'duck'
)";
my $answer = eval "$r2";
use Data::Dumper;
print "Evaled =\n" . Dumper($answer);
It's not a number, which confuses people because they think it's some sort of count:
Evaled =
$VAR1 = 'duck';
Change that to assign in list context (that hash assignment is a list assignment) and you get the right answer:
my $r2 = "(
'w' => {
'k2' => 5,
},
'q' => 'duck'
)";
my @answer = eval "$r2";
use Data::Dumper;
print "Evaled =\n" . Dumper(\@answer);
Now it's the data structure you thought it should be:
Evaled =
$VAR1 = [
'w',
{
'k2' => 5
},
'q',
'duck'
];
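The same list-context rule explains why the hash assignment in CODE 1 works. As a sketch with the shortened data from above, assigning the eval result directly to a hash keeps the key/value pairs:

```perl
use strict;
use warnings;

my $r2 = "( 'w' => { 'k2' => 5 }, 'q' => 'duck' )";

# Assigning to a hash is a list assignment, so eval runs in list context.
my %z = eval $r2;
die "eval failed: $@" if $@;

print "$z{q}\n";       # prints: duck
print "$z{w}{k2}\n";   # prints: 5
```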

What does it mean to have a perl hash{}{}

My professor has some syntax on a slide that I do not understand.
In perl there is:
$hash{$string}{$anotherString}++;
What does this syntax mean? If it were:
$hash{$string}{$int}++;
Would it increment the value?
When I print using
while( my( $key, $value ) = each %hash ){print "$key: $value\n";}
My output is
"key": HASH(0xbe0200)
That is a two-dimensional hash, a hash of hashes. It is easy to keep track of structures in Perl once you realize that any single value is in fact a scalar. In the case of multidimensional structures, the scalar value is a reference. For example:
my %outer = ( "foo" => { "bar" => 1 } );
The inner part { "bar" => 1 } is a hash reference. The use of { } in assignment denotes an anonymous hash. This is similar to:
my %inner = ( "bar" => 1 );
my %outer = ( "foo" => \%inner );
Now when you want to reference a value in %inner, you use the first key to access the hash reference, and the second key to access the value in %inner:
print $outer{"foo"}{"bar"}; # prints 1
And when you use the increment operator ++ on a value, it is incremented:
$outer{"foo"}{"bar"}++; # the value is now 2
$hash{string1}{string2}
is a shorter equivalent of
$hash{string1}->{string2}
i.e. it returns a value from a hash of hashes.
By applying the ++ operator, the value in the inner hash is incremented.
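A common use of this construct is counting pairs. The sketch below uses made-up data; note that the inner hash does not have to exist beforehand, because Perl creates it on first use (autovivification). Walking both levels with nested loops prints the counts instead of HASH(0x...).

```perl
use strict;
use warnings;

# Hypothetical data: (customer, item) pairs.
my @pairs = (
    [ 'ann', 'tea' ], [ 'ann', 'tea' ], [ 'ann', 'milk' ], [ 'bob', 'tea' ],
);

my %count;
for my $pair (@pairs) {
    my ( $string, $anotherString ) = @$pair;
    # The inner hash is created automatically the first time it is needed.
    $count{$string}{$anotherString}++;
}

# Walk both levels to print values rather than hash references.
for my $outer ( sort keys %count ) {
    for my $inner ( sort keys %{ $count{$outer} } ) {
        print "$outer / $inner: $count{$outer}{$inner}\n";
    }
}
# prints:
# ann / milk: 1
# ann / tea: 2
# bob / tea: 1
```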
My output is "key": HASH(0xbe0200)
That strange output means that what you are trying to print is actually a hash reference:
use strict;
use warnings;
use 5.016; #allows you to use say(), which is equivalent to print()
#plus a newline at the end
my $href = {
a => 1,
b => 2,
};
say $href;
--output:--
HASH(0x100826698)
Or,
my %hash = (
a => 1,
b => 2,
);
say \%hash;
--output:--
HASH(0x1008270a0)
The \ operator gets the reference for the thing on its right hand side.
The easiest way to print the actual hash is using Data::Dumper, which is something you can and will use all the time:
use strict;
use warnings;
use 5.016;
use Data::Dumper;
my $href = {
a => 1,
b => 2,
};
say Dumper($href);
$VAR1 = {
'a' => 1,
'b' => 2
};
Like use warnings;, I consider use Data::Dumper; mandatory for every program.
So, when you see strange output, like HASH(0xbe0200), use Data::Dumper on the value:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 3},
);
while( my( $key, $value ) = each %hash ){
say $key;
say Dumper($value);
say '-' x 10;
}
--output:--
a
$VAR1 = 1;
----------
b
$VAR1 = {
'hello' => 2,
'goodbye' => 3
};
----------
Or, alternatively just use Data::Dumper on the whole structure:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 3},
);
say Dumper(\%hash);
--output:--
$VAR1 = {
'a' => 1,
'b' => {
'hello' => 2,
'goodbye' => 3
}
};
Note that Dumper() is used to show the contents of a hash reference (or any other reference), so if your variable is not a reference, e.g. %hash, then you must turn it into a reference using the \ operator, e.g. \%hash.
Now, if you have this hash:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 3},
);
...to retrieve the value corresponding to 'goodbye', you can write:
say $hash{b}{goodbye}; #=>3
$hash{b} returns the hash (reference) { hello => 2, goodbye => 3}, and you can retrieve values from that hash by using the subscripts {hello} or {goodbye}.
Alternatively, you can write this:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 3},
);
my $string = 'b';
my $anotherString = 'goodbye';
say $hash{$string}{$anotherString}; #=>3
And to increment the value 3 in the hash, you can write:
my $result = $hash{$string}{$anotherString}++;
say $result; #=>3
say $hash{$string}{$anotherString}; #=>4
The postfix ++ operator actually increments the value after the current operation, so $result is 3, then the value in the hash is incremented to 4, something like this:
my $temp = $hash{$string}{$anotherString};
$hash{$string}{$anotherString} = $hash{$string}{$anotherString} + 1;
my $result = $temp;
If you want the increment to happen before the current operation, then you can use the prefix ++ operator:
my $result = ++$hash{$string}{$anotherString};
say $result; #=>4
say $hash{$string}{$anotherString}; #=>4
Finally, if the value at $hash{$string}{$anotherString} is not a number, e.g. 'green', you will get something strange:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 'green'},
);
my $string = 'b';
my $anotherString = 'goodbye';
my $result = $hash{$string}{$anotherString}++;
say $hash{$string}{$anotherString};
--output:--
greeo
Perl has a notion that the string that comes after the string 'green' is the string 'greeo', because the letter 'o' comes after the letter 'n' in the alphabet. And if the string you incremented were 'greez', the output would be:
greez original
grefa output
The next letter after 'z' is to start over with 'a', but just like when you increment 9 by 1 and get 10, the increment for 'z' carries over to the column on the left, incrementing that letter by 1, producing the 'f'. Ha!
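A few more cases of this string increment, as a short sketch. The starting strings are the examples given in perlop for the auto-increment operator:

```perl
use strict;
use warnings;

# Starting strings taken from the perlop auto-increment examples.
for my $start (qw( aa Az zz a9 )) {
    my $s = $start;
    $s++;    # string increment with carry, preserving case and digits
    print "$start -> $s\n";
}
# prints:
# aa -> ab
# Az -> Ba
# zz -> aaa
# a9 -> b0
```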

Create a hash in Perl

I have a beginner question:
I have a @key_table and many @values_tables.
I want to create a @table of references to hashes, so that there is one table in which each element points to a hash with keys and values from the two tables presented at the beginning.
Could anyone help me?
For example:
@keys = (Kate, Peter, John);
@value1 = (1, 2, 3);
@value2 = (a, b, c);
and I want a two-element table that points to:
%hash1 = (Kate=>1, Peter=>2, John=>3);
%hash2 = (Kate=>a, Peter=>b, John=>c);
If you just want to create two hashes, it's really easy:
my ( %hash1, %hash2 );
@hash1{ @keys } = @value1;
@hash2{ @keys } = @value2;
This takes advantage of hash slices.
However, it's usually a mistake to make a bunch of new variables with numbers stuck on the end. If you want this information all together in one structure, you can create nested hashes with references.
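One possible shape for such a structure, as a sketch: a single hash whose values are references to the per-set hashes. The names %values, %table, value1, and value2 are chosen here for illustration only.

```perl
use strict;
use warnings;

my @keys = qw( Kate Peter John );

# Hypothetical container for the value lists, keyed by a made-up set name.
my %values = (
    value1 => [ 1, 2, 3 ],
    value2 => [qw( a b c )],
);

my %table;
for my $name ( keys %values ) {
    my %row;
    @row{@keys} = @{ $values{$name} };   # hash slice, as above
    $table{$name} = \%row;               # store a reference to the new hash
}

print $table{value1}{Peter}, "\n";   # prints: 2
print $table{value2}{John},  "\n";   # prints: c
```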
Using a hash slice is the most common way to populate a hash with keys and values,
@hash1{@keys} = @value1;
@hash2{@keys} = @value2;
but it could be done in other (less efficient) ways, e.g. using map,
my %hash1 = map { $keys[$_] => $value1[$_] } 0 .. $#keys;
my %hash2 = map { $keys[$_] => $value2[$_] } 0 .. $#keys;
or even foreach
$hash1{ $keys[$_] } = $value1[$_] for 0 .. $#keys;
$hash2{ $keys[$_] } = $value2[$_] for 0 .. $#keys;
This is an example:
use strict;
use warnings;
use Data::Dump;
#Example data
my @key_table = qw/Kate Peter John/;
my @values_tables = (
[qw/1 2 3/],
[qw/a b c/]
);
my @table;
for my $vt (@values_tables) {
my %temph;
@temph{ @key_table } = @$vt;
push @table, \%temph;
}
dd(@table);
#<--- prints:
#(
# { John => 3, Kate => 1, Peter => 2 },
# { John => "c", Kate => "a", Peter => "b" },
#)
This will do it:
use Data::Dumper;
use strict;
my @keys = ("Kate", "Peter", "John");
my @value1 = (1, 2, 3);
my @value2 = ("a", "b", "c");
my (%hash1,%hash2);
for my $i (0 .. $#keys){
$hash1{$keys[$i]}=$value1[$i];
$hash2{$keys[$i]}=$value2[$i];
}
print Dumper(\%hash1);
print Dumper(\%hash2);
This is the output:
$VAR1 = {
'John' => 3,
'Kate' => 1,
'Peter' => 2
};
$VAR1 = {
'John' => 'c',
'Kate' => 'a',
'Peter' => 'b'
};

How can I create an anonymous hash from an existing hash in Perl?

How can I create an anonymous hash from an existing hash?
For arrays, I use:
@x = (1, 2, 3);
my $y = [@x];
but I can't find how to do the same for a hash:
my %x = ();
my $y = ???;
Thanks
Why do you need an anonymous hash? Although the answers tell you various ways you could make an anonymous hash, we have no idea if any of them are the right solution for whatever you are trying to do.
If you want a distinct copy that you can modify without disturbing the original data, use dclone from Storable, which comes with Perl. It creates a deep copy of your data structure:
use Storable qw(dclone);
my $clone = dclone \%hash;
Consider Dave Webb's answer, but with an additional layer of references. The value for the key of c is another hash reference:
use Data::Dumper;
my %original = ( a => 1, b => 2, c => { d => 1 } );
my $copy = { %original };
print
"Before change:\n\n",
Data::Dumper->Dump( [ \%original], [ qw(*original) ] ),
Data::Dumper->Dump( [ $copy ], [ qw(copy) ] ),
;
$copy->{c}{d} = 'foo';
print
"\n\nAfter change:\n\n",
Data::Dumper->Dump( [ \%original], [ qw(*original) ] ),
Data::Dumper->Dump( [ $copy ], [ qw(copy) ] ),
;
By inspecting the output, you see that even though you have an anonymous hash, it's still linked to the original:
Before change:
%original = (
'c' => {
'd' => 1
},
'a' => 1,
'b' => 2
);
$copy = {
'c' => {
'd' => 1
},
'a' => 1,
'b' => 2
};
After change:
%original = (
'c' => {
'd' => 'foo'
},
'a' => 1,
'b' => 2
);
$copy = {
'c' => {
'd' => 'foo'
},
'a' => 1,
'b' => 2
};
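For comparison, a short sketch of the same data copied both ways. The variable names $shallow and $deep are made up; dclone is the Storable function mentioned above.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my %original = ( a => 1, b => 2, c => { d => 1 } );

my $shallow = { %original };          # copies only the top level
my $deep    = dclone( \%original );   # copies the nested hash as well

# Changing the nested hash through the shallow copy also changes the original.
$shallow->{c}{d} = 'foo';

print $original{c}{d}, "\n";   # prints: foo
print $deep->{c}{d},   "\n";   # prints: 1
```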
my $new_hash = { %existing_hash };
Note that this solution does not make a deep copy.
Read brian's answer for explanation.
I think you need to be careful here. Consider the following hash:
my %hash = (1 => 'one',2 => 'two');
There are two ways you can get a reference from this:
my $ref = \%hash;
my $anon = {%hash};
$ref is a reference to the original hash and can be used similarly to %hash. $anon is a reference to an anonymous copy of the original hash; it will have the same data but changing it won't change the original hash and vice versa.
So, for example, to start with both of these statements will have the same output
print $ref->{1},"\n";
> one
print $anon->{1},"\n";
> one
But if I change the original hash:
$hash{1} = "i";
The two print statements would output different values:
print $ref->{1},"\n";
> i
print $anon->{1},"\n";
> one
If you have
my %hash = ...
then you can do
my $hashref = \%hash;
There seem to be two things going on here, and the answers are split between answering two different possible questions.
1. You want an anonymous hash. Easy. You can do it in one step.
2. You want an anonymous copy of an existing hash. As Dave Webb and brian suggest, this might be if you want a reference, but you don't want to tamper with the original hash. Also easy. (Well, not exactly: see brian's answer for details on deep copies.)
If you want 1, do this:
my $hash_ref = { foo => 1, bar => 2 };
If you want 2, do this:
my %hash = ( foo => 1, bar => 2 );
# Then later
my $anon_copy_hash_ref = { %hash };
(The names are not meant for prime time.) My copy isn't ready for prime time either. See brian's post for a fuller, more precise discussion.
Use:
$hashref = {};
A quick/easy way to achieve a deep copy:
use FreezeThaw qw(freeze thaw);
$new = thaw freeze $old;