Perl hashref printing keys

I have the following hashref :-
my $hashref = {'a'=>(1,2,3,4),
'b'=>(5,6,7,8)};
then I use the following to just print the keys (i.e. 'a' and 'b') :-
foreach (keys %$hashref){
print "\n".$_."\n";
}
This prints the following output:-
4
a
7
2
5
Trying to print the datastructure hashref using Data::Dumper gives the following output:-
$VAR1 = {
'4' => 'b',
'a' => 1,
'7' => 8,
'2' => 3,
'5' => 6
};
My question is :-
1) How to just print the correct keys i.e. 'a' and 'b'.
2) Why does the data structure look like the one in the above output and not like:-
$VAR1 = {
'a' => (1,2,3,4),
'b' => (5,6,7,8)
};

You are defining the hash wrong. It interprets this:
'a'=>(1,2,3,4),
'b'=>(5,6,7,8)
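as a single flat list of 10 elements, because parentheses do not create nested structures in Perl. Consecutive elements are then paired up as key and value, which is why Data::Dumper shows 'a' => 1, '2' => 3, '4' => 'b', '5' => 6 and '7' => 8. (Remember that a hash can also be initialized from a plain list; the => operator is just a comma that quotes the word on its left.) Instead, use square brackets to make your values into arrayref literals: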
'a'=>[1,2,3,4],
'b'=>[5,6,7,8]
which Data::Dumper should display as:
$VAR1 = {
'a' => [1,2,3,4],
'b' => [5,6,7,8]
};
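As a minimal sketch of the corrected structure in use: the hash reference below holds array references, and the loop prints each key with its values. The `sort` call and the `@keys` variable are additions for illustration only, used to make the output order deterministic.

```perl
use strict;
use warnings;

# Values are array references, so the hash really has two keys.
my $hashref = {
    'a' => [1, 2, 3, 4],
    'b' => [5, 6, 7, 8],
};

# sort is added only to make the output order deterministic.
my @keys = sort keys %$hashref;
foreach my $key (@keys) {
    # Dereference the array reference to get at the numbers.
    print "$key: @{ $hashref->{$key} }\n";
}
```

This should print one line for 'a' and one for 'b', each followed by its four numbers.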

Related

Do latter keys in an hash assignment from an array always override earlier keys?

Given this:
%h = (a => 3, b => 4, a => 5);
Empirically $h{a} == 5 holds true, but are there cases where $h{a} == 3 because of internal dictionary hashing or some other perl-internal behavior?
Another way to ask: Does perl guarantee to keep key-ordering the same when assigning an array to a hash, even in the event of a key collision?
Duplicate key entries are convenient for things like %settings = (%defaults, %userflags) so I can hard-code defaults but override them with user-supplied flags.
Yes, you can rely on the assignment list being processed left to right, just as surely as you can rely on an assignment to an array occurring in the correct order.
sub DebugHash::TIEHASH { bless {}, shift }
sub DebugHash::CLEAR { %{ $_[0] } = (); } # %{shift} would mean the global %shift
sub DebugHash::STORE {
my ($tied, $key, $value) = @_;
print STDERR "STORE '$key' => '$value'\n";
$tied->{$key} = $value;
}
tie %hash, 'DebugHash';
%hash = (a => 'first', a => 'second', a => 'third',
a => 'fourth', a => 'next', a => 'last');
Output:
STORE 'a' => 'first'
STORE 'a' => 'second'
STORE 'a' => 'third'
STORE 'a' => 'fourth'
STORE 'a' => 'next'
STORE 'a' => 'last'
Key ordering in a hash appears quite random. That is, you cannot guarantee that the hash, when dumped or inspected, will be in the same order in which you assigned it: (a => 3, b => 4, a => 5) could be displayed as (b => 4, a => 5).
Also, there are only two keys in your hash; the duplicate simply overwrites the first value:
use Data::Dumper;
my %h = (a => 3, b => 4, a => 5);
print Dumper(\%h);
$VAR1 = {
'a' => 5,
'b' => 4
};
It took me only three tries to produce:
$VAR1 = {
'b' => 4,
'a' => 5
};
Update: as @mob mentioned, the list is processed left to right, so the second (or nth) assignment of a value to a key replaces the previous value. In this case you are replacing the whole hash, so the left-to-right ordering results in a hash with two elements, and the value for any duplicated key is the last one encountered.
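The defaults-plus-overrides idiom from the question can be sketched as follows. The setting names and values are hypothetical; only the `(%defaults, %userflags)` pattern comes from the question.

```perl
use strict;
use warnings;

# Hypothetical settings, named after the question's example.
my %defaults  = (color => 'blue', size => 10);
my %userflags = (size => 12);

# The list is stored left to right, so later pairs win.
my %settings = (%defaults, %userflags);

# Sorted only so that the printed order is stable.
print "$_ => $settings{$_}\n" for sort keys %settings;
```

Whatever order the pairs of %defaults come out in, every pair from %userflags is stored after them, so the user's value for size replaces the default.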

Perl eval Data::Dumper inconsistency

I have to serialize and deserialize in Perl. I am aware that Data::Dumper and eval are not the best suited ones for this job but I am not allowed to modify this aspect in the legacy scripts which I am working on.
Below are two ways ( CODE 1 and CODE 2 ) to use eval.
In CODE 1, the hash is available as a string before being deserialized via eval.
In CODE 2, the hash is serialized using Dumper before being deserialized via eval.
In both the code samples, one of two attempted ways to deserialize works. Why does the other way to deserialize not work ?
CODE 1
my $r2 = "(
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
)";
my %z;
eval "\%z = $r2"; ####### Works.
print "\%z = [".Data::Dumper::Dumper (\%z)."] ";
my $answer = eval "$r2"; #### Does NOT work.
print "\n\nEvaled = [".Dumper($answer)."] ";
Output
%z = [$VAR1 = {
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
};
]
Evaled = [$VAR1 = 2;
]
But below code works in reverse manner :
CODE 2
my %a = ( "q" =>2, "w"=>{ "k1"=>"key", "k2"=>5, k3=>["a", "b", 2, "3",], }, ); **# Same hash as above example.**
$Data::Dumper::Terse=1;
$Data::Dumper::Purity = 1;
my $r2 = Dumper(\%a);
my %z;
eval '\%z = $r2';
print "\n\n\%z = [".Dumper(\%z)."] "; #### Does NOT work.
my $answer = eval $r2;
print "\n\nEvaled = [".Dumper($answer)."] "; ####### Works.
Output
%z = [$VAR1 = {};
]
Evaled = [$VAR1 = {
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
};
]
First of all, please don't put comments that result in syntax errors (**).
Notice that the string you provided in the first code block produces a different data structure than the Dumper function. In the first block you are creating a list of key/value pairs, but you don't assign it to any variable. In the case of the Dumper function, an anonymous hash is created and its reference is assigned to the $VAR1 variable.
To make the first code work, you should replace ( with { to create an anonymous hash and then assign it to a variable (the $ must be escaped so that the variable name is not interpolated into the double-quoted string), for example:
my $r2 = "\$VAR = {
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
}";
Don't use Data::Dumper to serialize data. Having said that...
This is a problem of scalar and list context. In the second eval, you have:
my $answer = ...;
Since you are assigning to a scalar, the right side is evaluated in scalar context. That means the code inside the eval is evaluated in scalar context too. That code is:
(
'w' => {
'k2' => 5,
'k1' => 'key',
'k3' => [
'a',
'b',
2,
'3'
]
},
'q' => 2
)
That looks like a list, but it's really the comma operator in scalar context. That evaluates the left-hand side, discards that result, and returns the right-hand side. So, my $x = ( 'left', 'right' ) assigns right to $x. This is covered in What is the difference between a list and an array? in perlfaq4.
In your question, you see that $answer gets the value 2. That's the rightmost value in the comma chain, so that's the value you get back in scalar context. Change that to another value (perhaps 'duck'), and that's the value that you'll get back:
my $r2 = "(
'w' => {
'k2' => 5,
},
'q' => 'duck'
)";
my $answer = eval "$r2";
use Data::Dumper;
print "Evaled =\n" . Dumper($answer);
Now the result is not a number (a number confuses people because they think it's some sort of count):
Evaled =
$VAR1 = 'duck';
Change that to assign in list context (that hash assignment is a list assignment) and you get the right answer:
my $r2 = "(
'w' => {
'k2' => 5,
},
'q' => 'duck'
)";
my @answer = eval "$r2";
use Data::Dumper;
print "Evaled =\n" . Dumper(\@answer);
Now it's the data structure you thought it should be:
Evaled =
$VAR1 = [
'w',
{
'k2' => 5
},
'q',
'duck'
];
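For the Dumper direction, a minimal round-trip sketch is shown below, assuming the legacy constraint that Data::Dumper and eval must be used. With $Data::Dumper::Terse set, the output is a bare "{ ... }" anonymous hash, which is a single scalar value, so scalar context is not a problem there. The variable names %data, $serialized and $copy are made up for this sketch.

```perl
use strict;
use warnings;
use Data::Dumper;

my %data = (q => 2, w => { k1 => 'key', k2 => 5 });

my $copy = do {
    # Terse drops the "$VAR1 = " prefix, so the string is just "{ ... }".
    local $Data::Dumper::Terse = 1;
    my $serialized = Dumper(\%data);
    # A hash reference is a single scalar, so scalar context is fine here.
    eval $serialized;
};
die "eval failed: $@" if $@;

print "q = $copy->{q}, k1 = $copy->{w}{k1}\n";
```

Only eval strings that your own program produced; evaluating data from an untrusted source runs whatever code it contains.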

What does it mean to have a perl hash{}{}

My professor has some syntax on a slide that I do not understand.
In perl there is:
$hash{$string}{$anotherString}++;
What does this syntax mean? If it were:
$hash{$string}{$int}++;
Would it increment the value?
When I print using
while( my( $key, $value ) = each %hash ){print "$key: $value\n";}
My output is
"key": HASH(0xbe0200)
That is a two-dimensional hash, a hash of hashes. It is easy to keep track of structures in Perl once you realize that any single value is in fact a scalar. In the case of multidimensional structures, the scalar value is a reference. For example:
my %outer = ( "foo" => { "bar" => 1 } );
The inner part { "bar" => 1 } is a hash reference. The use of { } in assignment denotes an anonymous hash. This is similar to:
my %inner = ( "bar" => 1 );
my %outer = ( "foo" => \%inner );
Now when you want to reference a value in %inner, you use the first key to access the hash reference, and the second key to access the value in %inner:
print $outer{"foo"}{"bar"}; # prints 1
And when you use the increment operator ++ on a value, it is incremented:
$outer{"foo"}{"bar"}++; # the value is now 2
$hash{string1}{string2}
is a shorter equivalent of
$hash{string1}->{string2}
i.e. it returns a value from a hash of hashes.
By applying the ++ operator, the value in the inner hash is incremented.
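A common use of this syntax is counting pairs. The sketch below uses hypothetical input data (the @pairs list is made up); it relies on autovivification, meaning Perl creates the inner hash and the entry the first time they are used, so no initialization is needed before the ++.

```perl
use strict;
use warnings;

# Hypothetical input: (category, item) pairs to be counted.
my @pairs = (
    [fruit => 'apple'],
    [fruit => 'pear'],
    [fruit => 'apple'],
    [veg   => 'leek'],
);

my %count;
for my $pair (@pairs) {
    my ($string, $anotherString) = @$pair;
    # The inner hash and its entry are created on first use (autovivification).
    $count{$string}{$anotherString}++;
}

print "apple seen $count{fruit}{apple} times\n";
```

The same code works when the second key is a number, because hash keys are always treated as strings.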
My output is "key": HASH(0xbe0200)
That strange output means that what you are trying to print is actually a hash reference:
use strict;
use warnings;
use 5.016; #allows you to use say(), which is equivalent to print()
#plus a newline at the end
my $href = {
a => 1,
b => 2,
};
say $href;
--output:--
HASH(0x100826698)
Or,
my %hash = (
a => 1,
b => 2,
);
say \%hash;
--output:--
HASH(0x1008270a0)
The \ operator gets the reference for the thing on its right hand side.
The easiest way to print the actual hash is using Data::Dumper, which is something you can and will use all the time:
use strict;
use warnings;
use 5.016;
use Data::Dumper;
my $href = {
a => 1,
b => 2,
};
say Dumper($href);
$VAR1 = {
'a' => 1,
'b' => 2
};
Like use warnings;, I consider use Data::Dumper; mandatory for every program.
So, when you see strange output, like HASH(0xbe0200), use Data::Dumper on the value:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 3},
);
while( my( $key, $value ) = each %hash ){
say $key;
say Dumper($value);
say '-' x 10;
}
--output:--
a
$VAR1 = 1;
----------
b
$VAR1 = {
'hello' => 2,
'goodbye' => 3
};
----------
Or, alternatively just use Data::Dumper on the whole structure:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 3},
);
say Dumper(\%hash);
--output:--
$VAR1 = {
'a' => 1,
'b' => {
'hello' => 2,
'goodbye' => 3
}
};
Note that Dumper() is used to show the contents of a hash reference (or any other reference), so if your variable is not a reference, e.g. %hash, then you must turn it into a reference using the \ operator, e.g. \%hash.
Now, if you have this hash:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 3},
);
...to retrieve the value corresponding to 'goodbye', you can write:
say $hash{b}{goodbye}; #=>3
$hash{b} returns the hash (reference) { hello => 2, goodbye => 3}, and you can retrieve values from that hash by using the subscripts {hello} or {goodbye}.
Alternatively, you can write this:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 3},
);
my $string = 'b';
my $anotherString = 'goodbye';
say $hash{$string}{$anotherString}; #=>3
And to increment the value 3 in the hash, you can write:
my $result = $hash{$string}{$anotherString}++;
say $result; #=>3
say $hash{$string}{$anotherString}; #=>4
The postfix ++ operator actually increments the value after the current operation, so $result is 3, then the value in the hash is incremented to 4, something like this:
my $temp = $hash{$string}{$anotherString};
$hash{$string}{$anotherString} = $hash{$string}{$anotherString} + 1;
my $result = $temp;
If you want the increment to happen before the current operation, then you can use the prefix ++ operator:
my $result = ++$hash{$string}{$anotherString};
say $result; #=>4
say $hash{$string}{$anotherString}; #=>4
Finally, if the value at $hash{$string}{$anotherString} is not a number, e.g. 'green', you will get something strange:
my %hash = (
a => 1,
b => { hello => 2, goodbye => 'green'},
);
my $string = 'b';
my $anotherString = 'goodbye';
my $result = $hash{$string}{$anotherString}++;
say $hash{$string}{$anotherString};
--output:--
greeo
Perl has a notion that the string that comes after 'green' is 'greeo', because the letter 'o' comes after the letter 'n' in the alphabet. And if the string you incremented were 'greez', the result would be 'grefa':
greez (original)
grefa (after ++)
After 'z' the letter starts over at 'a', but just like when you increment 9 by 1 and get 10, the increment for 'z' carries over to the column on the left, incrementing that letter by 1 and producing the 'f'. Ha!
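To print such a structure without Data::Dumper, one option is to test each value with ref and loop over the inner hash when it is a hash reference. This is only a sketch for a structure that is at most two levels deep; the @lines array and the "key.subkey" output format are choices made for this example.

```perl
use strict;
use warnings;

my %hash = (
    a => 1,
    b => { hello => 2, goodbye => 3 },
);

my @lines;
for my $key (sort keys %hash) {
    my $value = $hash{$key};
    if (ref($value) eq 'HASH') {
        # The value is a hash reference: loop over the inner hash too.
        push @lines, "$key.$_ = $value->{$_}" for sort keys %$value;
    }
    else {
        # A plain scalar can be printed directly.
        push @lines, "$key = $value";
    }
}
print "$_\n" for @lines;
```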

create an array from a part of a hash in perl

I have a hash which contains a sub-hash. I want to extract that sub-hash separately and create an array from it.
The hash looks like:
'a1' => '1',
'a2' => '2',
'Def' => [
'd' => 'x',
'e' => 'y'
]
I need to make a separate hash for 'Def' and print only 'Def' as an array.
It's hard to tell from your question exactly what you are trying to achieve, but my interpretation is that you want to extract the anonymous hash stored under def and store it in another hash. Then you want to print this hash as an array. I have also included examples that print just the keys or just the values of the hash.
use strict;
use Data::Dumper;
my %first_hash = (
a1 => '1',
a2 => '2',
def => {
d => 'x',
e => 'y'
}
);
my %second_hash = %{$first_hash{'def'}};
my @full_array = %second_hash;
my @keys_array = keys %second_hash;
my @values_array = values %second_hash;
print Dumper (\%first_hash);
print Dumper (\%second_hash);
print "full array: ", join(' ',@full_array), "\n";
print "keys array: ", join(' ',@keys_array), "\n";
print "values array: ", join(' ',@values_array), "\n";
OUTPUT
$VAR1 = {
'a2' => '2',
'def' => {
'e' => 'y',
'd' => 'x'
},
'a1' => '1'
};
$VAR1 = {
'e' => 'y',
'd' => 'x'
};
full array: e y d x
keys array: e d
values array: y x
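If 'Def' really holds an array reference, as the square brackets in the question suggest, and the hash is called %a, you can print it like this:
print "@{$a{'Def'}}";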
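If copying the inner hash is not needed, a reference to it can be used directly. The following sketch assumes 'def' holds a hash reference, as in the longer answer above; the `sort` is added so that the flattened array has a predictable order, which a plain `%second_hash` in list context does not guarantee.

```perl
use strict;
use warnings;

my %first_hash = (
    a1  => '1',
    a2  => '2',
    def => { d => 'x', e => 'y' },
);

# Keep a reference instead of copying the inner hash.
my $def = $first_hash{def};

# Flatten to key/value pairs in sorted key order.
my @full_array = map { ($_, $def->{$_}) } sort keys %$def;

print "full array: @full_array\n";
```

Because $def is a reference and not a copy, changes made through it are visible in %first_hash as well.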

Adding hash keys

I am adding data to a hash using an incrementing numeric key starting at 0. The first key/value is fine. When I add the second one, the first key/value pair points back to the second. Each addition after that replaces the value of the second key, and the new key points back to it. The Dumper output looks something like this:
$VAR1 = { '0' => { ... } };
After the first key/value is added. After the second one is added I get
$VAR1 = { '1' => { ... }, '0' => $VAR1->{'1'} };
After the third key/value is added, it looks like this.
$VAR1 = { '1' => { ... }, '0' => $VAR1->{'1'}, '2' => $VAR1->{'1'} };
My question is why is it doing this? I want each key/value to show up in the hash. When I iterate through the hash I get the same data for every key/value. How do I get rid of the reference pointers to the second added key?
You are setting the value of every element to a reference to the same hash. Data::Dumper is merely reflecting that.
If you're using Data::Dumper as a serializing tool (yuck!), then you should set $Data::Dumper::Purity to 1 to get something eval can process.
use Data::Dumper qw( Dumper );
my %h2 = (a=>5,b=>6,c=>7);
my %h;
$h{0} = \%h2;
$h{1} = \%h2;
$h{2} = \%h2;
print("$h{0}{c} $h{2}{c}\n");
$h{0}{c} = 9;
print("$h{0}{c} $h{2}{c}\n");
{
local $Data::Dumper::Purity = 1;
print(Dumper(\%h));
}
Output:
7 7
9 9
$VAR1 = {
'0' => {
'c' => 9,
'a' => 5,
'b' => 6
},
'1' => {},
'2' => {}
};
$VAR1->{'0'} = $VAR1->{'1'};
$VAR1->{'2'} = $VAR1->{'1'};
If, on the other hand, you meant to store references to different hashes, you could use
# Shallow copies
$h{0} = { %h2 }; # { ... } means do { my %anon = ( ... ); \%anon }
$h{1} = { %h2 };
$h{2} = { %h2 };
or
# Deep copies
use Storable qw( dclone );
$h{0} = dclone(\%h2);
$h{1} = dclone(\%h2);
$h{2} = dclone(\%h2);
Output:
7 7
9 7
$VAR1 = {
'0' => {
'a' => 5,
'b' => 6,
'c' => 9
},
'1' => {
'a' => 5,
'b' => 6,
'c' => 7
},
'2' => {
'a' => 5,
'b' => 6,
'c' => 7
}
};
You haven't posted the actual code you're using to build the hash, but I assume it looks something like this:
foreach my $i (1 .. 3) {
%hash2 = (number => $i, foo => "bar", baz => "whatever");
$hash1{$i} = \%hash2;
}
(Actually, I'll guess that, in your actual code, you're probably reading data from a file in a while (<>) loop and assigning values to %hash2 based on it, but the foreach loop will do for demonstration purposes.)
If you run the code above and dump the resulting %hash1 using Data::Dumper, you'll get the output:
$VAR1 = {
'1' => {
'baz' => 'whatever',
'number' => 3,
'foo' => 'bar'
},
'3' => $VAR1->{'1'},
'2' => $VAR1->{'1'}
};
Why does it look like that? Well, it's because the values in %hash1 are all references pointing to the same hash, namely %hash2. When you assign new values to %hash2 in your loop, those values will overwrite the old values in %hash2, but it will still be the same hash. Data::Dumper is just highlighting that fact.
So, how can you fix it? Well, there are (at least) two ways. One way is to replace \%hash2, which gives a reference to %hash2, with { %hash2 }, which copies the contents of %hash2 into a new anonymous hash and returns a reference to that:
foreach my $i (1 .. 3) {
%hash2 = (number => $i, foo => "bar", baz => "whatever");
$hash1{$i} = { %hash2 };
}
The other (IMO preferable) way is to declare %hash2 as a (lexically scoped) local variable within the loop using my:
foreach my $i (1 .. 3) {
my %hash2 = (number => $i, foo => "bar", baz => "whatever");
$hash1{$i} = \%hash2;
}
This way, each iteration of the loop will create a new, different hash named %hash2, while the hashes created on previous iterations will continue to exist (since they're referenced from %hash1) independently.
By the way, you wouldn't have had this problem in the first place if you'd followed standard Perl best practices, specifically:
Always use strict; (and use warnings;). This would've forced you to declare %hash2 with my (although it wouldn't have forced you to do so inside the loop).
Always declare local variables in the smallest possible scope. In this case, since %hash2 is only used within the loop, you should've declared it inside the loop, like above.
Following these best practices, the example code above would look like this:
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my %hash1;
foreach my $i (1 .. 3) {
my %hash2 = (number => $i, foo => "bar", baz => "whatever");
$hash1{$i} = \%hash2;
}
print Dumper(\%hash1);
which, as expected, will print:
$VAR1 = {
'1' => {
'baz' => 'whatever',
'number' => 1,
'foo' => 'bar'
},
'3' => {
'baz' => 'whatever',
'number' => 3,
'foo' => 'bar'
},
'2' => {
'baz' => 'whatever',
'number' => 2,
'foo' => 'bar'
}
};
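One way to confirm whether two entries share a hash is to compare the references numerically: two references are equal with == when they point to the same thing. The sketch below contrasts a hash declared outside the loop with one declared inside it; the names %shared_refs, %fresh_refs, %outside and %inside are made up for this illustration.

```perl
use strict;
use warnings;

my (%shared_refs, %fresh_refs);

my %outside;                      # declared once, outside the loop
foreach my $i (1 .. 3) {
    %outside = (number => $i);
    $shared_refs{$i} = \%outside; # same hash every time
}

foreach my $i (1 .. 3) {
    my %inside = (number => $i);  # a new hash on every iteration
    $fresh_refs{$i} = \%inside;
}

# References compare equal numerically when they point to the same thing.
print "shared: ", ($shared_refs{1} == $shared_refs{2} ? "same" : "different"), "\n";
print "fresh:  ", ($fresh_refs{1}  == $fresh_refs{2}  ? "same" : "different"), "\n";
```

The first line should report "same" and the second "different", which matches what Data::Dumper shows with its $VAR1->{...} back-references.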
It's hard to see what the problem is when you don't post the code or the actual results of Data::Dumper.
There is one thing you should know about Data::Dumper: When you dump an array or (especially) a hash, you should dump a reference to it. Otherwise, Data::Dumper will treat it like a series of variables. Also notice that hashes do not remain in the order you create them. I've enclosed an example below. Make sure that your issue isn't related to a confusing Data::Dumper output.
Another question: If you're keying your hash by sequential keys, would you be better off with an array?
If you can, please edit your question to post your code and the ACTUAL results.
use strict;
use warnings;
use autodie;
use feature qw(say);
use Data::Dumper;
my @array = qw(one two three four five);
my %hash = (one => 1, two => 2, three => 3, four => 4);
say "Dumped Array: " . Dumper @array;
say "Dumped Hash: " . Dumper %hash;
say "Dumped Array Reference: " . Dumper \@array;
say "Dumped Hash Reference: " . Dumper \%hash;
The output:
Dumped Array: $VAR1 = 'one';
$VAR2 = 'two';
$VAR3 = 'three';
$VAR4 = 'four';
$VAR5 = 'five';
Dumped Hash: $VAR1 = 'three';
$VAR2 = 3;
$VAR3 = 'one';
$VAR4 = 1;
$VAR5 = 'two';
$VAR6 = 2;
$VAR7 = 'four';
$VAR8 = 4;
Dumped Array Reference: $VAR1 = [
'one',
'two',
'three',
'four',
'five'
];
Dumped Hash Reference: $VAR1 = {
'three' => 3,
'one' => 1,
'two' => 2,
'four' => 4
};
The reason it is doing this is you are giving it the same reference to the same hash.
Presumably in a loop construct.
Here is a simple program which has this behaviour.
use strict;
use warnings;
# always use the above two lines until you
# understand completely why they are recommended
use Data::Printer;
my %hash;
my %inner; # <-- wrong place to put it
for my $index (0..5){
$inner{int rand} = $index; # <- doesn't matter
$hash{$index} = \%inner;
}
p %hash;
To fix it just make sure that you are creating a fresh hash reference every time through the loop.
use strict;
use warnings;
use Data::Printer;
my %hash;
for my $index (0..5){
my %inner; # <-- place the declaration here instead
$inner{int rand} = $index; # <- doesn't matter
$hash{$index} = \%inner;
}
p %hash;
If you are only going to use numbers for your indexes, and they are monotonically increasing starting from 0, then I would recommend using an array.
An array would be faster and more memory efficient.
use strict;
use warnings;
use Data::Printer;
my @array; # <--
for my $index (0..5){
my %inner;
$inner{int rand} = $index;
$array[$index] = \%inner; # <--
}
p @array;
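When the indexes are just 0, 1, 2, ... the array can also be built with push and an anonymous hash, which avoids both the index bookkeeping and the shared-reference problem. This is a sketch with made-up field names (index, label) and core Perl only, so it does not need Data::Printer.

```perl
use strict;
use warnings;

my @records;
for my $index (0 .. 2) {
    # { ... } builds a new anonymous hash on every iteration.
    push @records, { index => $index, label => "item $index" };
}

# Each element is its own hash reference.
print "$_->{index}: $_->{label}\n" for @records;
```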