Traversing Array of Hashes in perl - perl

I have a structure as below:
my $var1 = [{a=>"B", c=>"D"}, {E=>"F", G=>"H"}];
Now I want to traverse the first hash and the elements in it.. How can I do it?
When I run Dumper on $var1 it reports an array, and when I run it on @$var1 it reports a hash.

You iterate over the array as you would with any other array, and you'll get hash references. Then iterate over the keys of each hash as you would with a plain hash reference.
Something like:
foreach my $hash (@{$var1}) {
    foreach my $key (keys %{$hash}) {
        print $key, " -> ", $hash->{$key}, "\n";
    }
}
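If only the first hash is needed, as the question asks, a minimal sketch could look like the following. It assumes the same $var1 structure as in the question; sorting the keys is my own addition, only there to make the output order predictable.

```perl
use strict;
use warnings;

my $var1 = [ { a => "B", c => "D" }, { E => "F", G => "H" } ];

# $var1->[0] is a reference to the first hash in the array.
my $first = $var1->[0];

# Build one "key -> value" line per key; keys are sorted for a stable order.
my @lines = map { $_ . " -> " . $first->{$_} } sort keys %{$first};

print "$_\n" for @lines;
```

The same pattern works for any index: $var1->[1] would give the second hash.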

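First off, it's safer to quote your hash keys explicitly. (Perl's strict mode does not actually reject bare identifiers on the left of =>, because the fat comma quotes them automatically, but explicit quotes avoid surprises with keys that are not simple identifiers.)
With that in mind, a complete annotated example is given below.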
use strict;
my $test = [{'a'=>'B','c'=>'D'},{'E'=>'F','G'=>'H'}];
# Note the @{ $test }
# This says "treat this scalar reference as a list".
foreach my $elem ( @{ $test } ){
# At this point $elem is a scalar reference to one of the anonymous
# hashes
#
# Same trick, except this time, we're asking Perl
# to treat the $elem reference as a reference to hash
#
# Hence, we can just call keys on it and iterate
foreach my $key ( keys %{ $elem } ){
# Finally, another bit of useful syntax for scalar references
# The "point to" syntax automatically does the %{ } around $elem
print "Key -> $key = Value " . $elem->{$key} . "\n";
}
}

C:\wamp\bin\perl\bin\PERL_2~1\BASIC_~1\REVISION>type traverse.pl
my $var1=[{a=>"B", c=>"D"},{E=>"F", G=>"H"}];
foreach my $var (@{$var1}) {
foreach my $key (keys(%$var)) {
print $key, "=>", $var->{$key}, "\n";
}
print "\n";
}
C:\wamp\bin\perl\bin\PERL_2~1\BASIC_~1\REVISION>traverse.pl
c=>D
a=>B
G=>H
E=>F
$var1 = [] is a reference to an anonymous array.
Using the @ sigil before it, as in @$var1, gives you access to the array it references. So, analogous to foreach (@arr) {...}, you would write foreach (@{$var1}) {...}.
Now, the elements of the array @{$var1} that you have provided are anonymous (that is, not named) too, but they are anonymous hashes. So, just as with the arrayref, we write %{$hash_reference} to get access to the hash referenced by $hash_reference. Here, $hash_reference is $var.
Once the hash is accessible as %{$var}, it is easy to get its keys using keys(%$var) or keys(%{$var}). Since keys returns a list, we can use it directly in a loop: foreach (keys(%{$var})) {...}.
We access a scalar value inside an anonymous hash by key, as in $hash_reference->{$keyname}. That's all the code did.
In case your array contained anonymous hashes of arrays like :
$var1=[ { akey=>["b", "c"], mkey=>["n", "o"]} ];
then, this is how you will access the array values:
C:\wamp\bin\perl\bin\PERL_2012\BASIC_PERL\REVISION>type traverse.pl
my $var1=[ {akey=>["b", "c"], mkey=>["n", "o"]} ];
foreach my $var (@{$var1}) {
foreach my $key (keys(%$var)) {
foreach my $elem (@{ $var->{$key} }) {
print "$key=>$elem,";
}
print "\n...\n";
}
print "\n";
}
C:\wamp\bin\perl\bin\PERL_2012\BASIC_PERL\REVISION>traverse.pl
mkey=>n,mkey=>o,
...
akey=>b,akey=>c,
...
Practice it more and regularly, it will soon become easy for you to break complex structures into such combinations. This is how I created a large parser for another software, it is full of answers to your questions :)

With one peek at amon's up-voted comment above (thanks, amon!) I was able to write this little ditty:
#!/usr/bin/perl
# Given an array of hashes, print out the keys and values of each hash.
use strict; use warnings;
use Data::Dump qw(dump);
my $var1=[{A=>"B",C=>"D"},{E=>"F",G=>"H"}];
my $count = 0;
# @{$var1} is the array of hash references pointed to by $var1
foreach my $href (@{$var1})
{
    print "\nArray index ", $count++, "\n";
    print "=============\n";
    # %{$href} is the hash pointed to by $href
    foreach my $key (keys %{$href})
    {
        # $href->{$key} ( ALT: $$href{$key} ) is the value
        # corresponding to $key in the hash pointed to by
        # $href
        # print $key, " => ", $href->{$key}, "\n";
        print $key, " => ", $$href{$key}, "\n";
    }
}
print "\nCompare with dump():\n";
dump ($var1);
print "\nJust the first hash (index 0):\n";
# $var1->[0] ( ALT: $$var1[0] ) is the first hash reference (index 0)
# in @{$var1}
# dump ($var1->[0]);
dump ($$var1[0]);
#print "\nJust the value of key A: \"", $var1->[0]->{A}, "\"\n";
#print "\nJust the value of key A: \"", $var1->[0]{A}, "\"\n";
print "\nJust the value of key A: \"", $$var1[0]{A}, "\"\n"

Related

Perl list all keys in hash with identical values

If I have a colon-delimited file name FILE and I do:
cat FILE|perl -F: -lane 'my %hash = (); $hash{@F[0]} = @F[2]'
to assign the first and 3rd tokens as the key => value pairs for the hash..
1) Is that a sane way to assign key value pairs to a hash?
2) What is the simplest way to now find all keys with shared values and list them?
Assume FILE looks like:
Mike:34:Apple:Male
Don:23:Corn:Male
Jared:12:Apple:Male
Beth:56:Maize:Female
Sam:34:Apple:Male
David:34:Apple:Male
Desired Output: Keys with value "Apple": Mike,Jared,David,Sam
Your example won't work as you want because the -n option puts a while loop around your one-line program, so the hash you declare is created and destroyed for every record in the file. You could get around that by not declaring the hash, which makes it a persistent package variable that retains all values stored in it.
You can then write push @{ $hash{$F[2]} }, $F[0]. Note that it should be $F[0] etc. and not @F[0], and that I have used push to build a list of column 1 values for each column 3 value, instead of a one-to-one mapping from each column 1 value to its column 3 value.
To clarify, your method produces a hash looking like this, which has to be searched to produce the display that you want.
(
Beth => "Maize",
David => "Apple",
Don => "Corn",
Jared => "Apple",
Mike => "Apple",
Sam => "Apple",
)
while mine creates this, which as you can see is pretty much already in the form you want.
(
Apple => ["Mike", "Jared", "Sam", "David"],
Corn => ["Don"],
Maize => ["Beth"],
)
But I think this problem is a bit too big to be solved with a one-line Perl program. The solution below expects the path to the input file as a command-line parameter, like this
> perl prog.pl colons.csv
but it will default to myfile.csv if no file is specified.
use strict;
use warnings;
our @ARGV = 'myfile.csv' unless @ARGV;
my %data;
while (<>) {
    my @fields = split /:/;
    push @{ $data{$fields[2]} }, $fields[0];
}
while (my ($k, $v) = each %data) {
    next unless @$v > 1;
    printf qq{Keys with value "%s": %s\n}, $k, join ', ', @$v;
}
output
Keys with value "Apple": Mike, Jared, Sam, David
use strict;
use warnings;
open my $in, '<', 'in.txt' or die "Cannot open in.txt: $!";
my %data;
while(<$in>){
    chomp;
    my @split = split/:/;
    $data{$split[0]} = $split[2];
}
my $query = 'Apple';
print "Keys with value $query = ";
foreach my $name (keys %data){
print "$name " if $data{$name} eq $query;
}
print "\n";
Arrays are used to hold list of values, so use an array.
perl -F: -lane'
push @{ $h{$F[2]} }, $F[0];
END {
for my $fruit (keys %h) {
next if @{ $h{$fruit} } < 2;
print "$fruit: ", join(",", @{ $h{$fruit} });
}
}
' FILE
The END block is executed on exit. In it, we iterate over the keys of the hash. If the value of the current hash element is an array with only one element, it's skipped. Otherwise, we print the key followed by the contents of the array referenced by the hash element.
Here is another way:
perl -F: -lane'
push @{ $h{$F[2]} }, $F[0];
}{
print "$_: ", join(",", @{ $h{$_} }) for grep { @{$h{$_}} > 1 } keys %h;
' file
We read each line and build a hash of arrays, using the third column as the key and collecting the first-column values for each key. The }{ closes the implicit while loop that -n adds, so the code after it runs once after the last line, much like an END block. There we use grep to keep only the keys whose array holds more than one element, and print each key followed by its array elements.
It doesn't have to be a one liner,
Good. It's not going to be...
Is that a sane way to assign key value pairs to a hash?
You're simply assigning the key value pairs as:
$hash{"key"} = "value";
Which is about as simple as it gets. There might be a way of doing it via map. However, the main issue I see is what should happen if you have duplicate keys.
Let's say your file looks like this:
Mike:34:Apple:Male
Don:23:Corn:Male
Jared:12:Apple:Male
Beth:56:Maize:Female
Sam:34:Apple:Male
David:34:Apple:Male # Note this entry is here twice!
David:35:Wheat:Male # Note this entry is here twice!
Let's do a simple assignment loop:
my %hash;
while ( my $line = <$fh> ) {
chomp $line;
my ($name, $age, $category, $sex) = split /:/, $line;
$hash{$name} = $category;
}
When you get to $hash{David}, it will first be set to Apple, but then you change the value to Wheat. There are four ways you can handle this:
Use whatever the last value is. No change in the loop.
Use the first value and ignore subsequent values. Simple enough to do.
If that happens, it's an error. Abort the program and report the error.
Keep all values.
This last one is the most interesting because it involves a reference to an array as the values for your hash:
my %hash;
while ( my $line = <$fh> ) {
chomp $line;
my ($name, $age, $category, $sex) = split /:/, $line;
$hash{$name} = [] if not exists $hash{$name}; # I'm making this an array reference
push @{ $hash{$name} }, $category;
}
Now, each value in my hash is a reference to an array:
my @values = @{ $hash{David} }; # The values of David...
print "David is in categories " . join ( ", ", @values ) . "\n";
This will print out David is in categories Apple, Wheat
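The second and third options from the list of four (keep the first value, or treat a duplicate as an error) can be sketched in the same style. The @records array below is hypothetical sample data in the same name:age:category:sex layout, used in place of a file handle so the sketch is self-contained.

```perl
use strict;
use warnings;

# Hypothetical sample records standing in for the lines of the file.
my @records = (
    "David:34:Apple:Male",
    "David:35:Wheat:Male",
    "Beth:56:Maize:Female",
);

my (%first_wins, @duplicates);
for my $line (@records) {
    my ($name, $age, $category, $sex) = split /:/, $line;

    if (exists $first_wins{$name}) {
        # Option 2: keep the first value and only remember the clash.
        # For option 3, replace the push with: die "Duplicate name: $name\n";
        push @duplicates, $name;
        next;
    }
    $first_wins{$name} = $category;
}

print "$_ => $first_wins{$_}\n" for sort keys %first_wins;
print "Duplicates seen: @duplicates\n";
```

The exists test is what distinguishes "first value wins" from the default "last value wins" behaviour of a plain assignment.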
What is the simplest way to now find all keys with shared values and list them?
The easiest way is to create a second hash that's keyed by your value. In this hash, you will need to use an array reference. Let's assume no duplicate names for now:
my %hash;
my %indexed_hash;
while ( my $line = <$fh> ) {
chomp $line;
my ($name, $age, $category, $sex) = split /:/, $line;
$hash{$name} = $category;
$indexed_hash{$category} = [] if not exists $indexed_hash{$category};
push @{ $indexed_hash{$category} }, $name;
}
Now, if I want to find all the duplicates of Apple:
my @names = @{ $indexed_hash{Apple} };
print "The following are in 'Apple': " . join ( ", ", @names ) . "\n";
Since we're getting into references, we could take things a step further and store all of your values of your file in your hash. Again, for simplicity, I am assuming that you will have one and only one entry per name:
my %hash;
while ( my $line = <$fh> ) {
chomp $line;
my ($name, $age, $category, $sex) = split /:/, $line;
$hash{$name}->{AGE} = $age;
$hash{$name}->{CATEGORY} = $category;
$hash{$name}->{SEX} = $sex;
}
for my $name ( sort keys %hash ) {
print "$name Information:\n";
print " Age: " . $hash{$name}->{AGE} . "\n";
printf "Category: %s\n", $hash{$name}->{CATEGORY};
print " Sex: @{[$hash{$name}->{SEX}]}\n\n";
}
The last two statements show easier ways of interpolating complex data structures into a string. The printf is fairly clear. The second, @{[...]}, is a neat little trick: it builds an anonymous array from the expression and immediately dereferences it inside the string.
What have you tried?
If you reverse the hash into a list of value => key pairs and then use List::Util's pairs() against the list, you can transform the hash into a hash of value => key arrayrefs, i.e. ( foo => [ 'bar', 'baz' ] ). Then grep { @{$hash{$_}} > 1 } keys %hash, and print the results.
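A rough sketch of that inversion follows. Note that it swaps in a plain loop instead of List::Util's pairs(), purely to keep the example dependency-free; the %hash contents are hypothetical sample data matching the question's file, and the keys are sorted only so the output order is predictable.

```perl
use strict;
use warnings;

# Hypothetical name => value hash, as the question's approach would build it.
my %hash = (
    Mike => "Apple", Don => "Corn",  Jared => "Apple",
    Beth => "Maize", Sam => "Apple", David => "Apple",
);

# Invert it: each value becomes a key pointing at an arrayref of names.
my %by_value;
for my $name (sort keys %hash) {
    push @{ $by_value{ $hash{$name} } }, $name;
}

# Report only the values shared by more than one key.
for my $value (grep { @{ $by_value{$_} } > 1 } sort keys %by_value) {
    print qq{Keys with value "$value": }, join(",", @{ $by_value{$value} }), "\n";
}
```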

About the usage of hash reference in Perl

This reports syntax error:
$hash={a=>2};
print %{$hash}{a};
But this works:
print each(%{$hash})
Why??
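To get an element from a hashref, you take the normal code for getting a hash element, $foo{'bar'}, and replace the name of the hash (not including the sigil) with the hashref: $$hash{'bar'}. The % sigil is only used to dereference to the full hash, as in your each case, not to fetch a single element. (On Perl 5.20 and later, %{$hash}{a} no longer fails to compile: it parses as a key/value slice and returns the key together with its value, which is still not a single-element lookup.)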
More helpful hints at http://perlmonks.org/?node=References+quick+reference.
Maybe this will help you understand why it's wrong...
$hash = {a => 2}; #Works: $hash is a reference to the hash
%foo = %{$hash}; #Now, we've dereferenced the hash to %foo
# Wherever we have "$hash", we can now use "foo"...
print %foo{a}; #Whoops! Doesn't work.
print %hash{a}; #And, neither did this!
print $foo{a}; #No problem! Use '$' when talking about a single hash element
print ${$hash}{a}; #Same as above.
print each %foo; #Each takes a hash (with "%" sign)
print each %{$hash}; #Same as above.
print $hash->{a}; #Syntactic Sugar: Same as ${$hash}{a} or $$hash{a}
Yeah, just like print %hash{a} doesn't work even though each(%hash) does.
each(%hash) ==> each(%{ $ref })
print($hash{a}) ==> print(${ $ref }{a})
You were missing the lookup '->', and a single element takes the $ sigil, not %.
print %{$hash}{a};
should be:
print $hash->{a};
You declare it as $ but then try to cast to a hash and retrieve the value, not sure why.
Just retrieve like so:
print $hash->{a};
My personal preference when it comes to hashes:
$hash1->{a} = 1;
print $hash1->{a}, "\n"; # prints '1'
Multi level:
$hash2->{a}{a} = 1;
$hash2->{a}{b} = 2;
print $hash2->{a}{a}, "\n"; # prints '1'
print $hash2->{a}{b}, "\n"; # prints '2'
Looping:
while (my ($key, $value) = each %{$hash1})
{
print $key, "\n"; # prints 'a'
print $value, "\n"; # prints '1'
}
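To put the forms discussed in these answers side by side, here is a small sketch. It assumes only the $hash from the question; the variable names @same and @keys are mine.

```perl
use strict;
use warnings;

my $hash = { a => 2 };

# Three equivalent ways to read one element through a hash reference.
my @same = ( $hash->{a}, ${$hash}{a}, $$hash{a} );
print "@same\n";

# The % sigil dereferences to the whole hash, e.g. for keys, values or each.
my @keys = keys %{$hash};
print "@keys\n";
```

The arrow form is usually the easiest to read once the structure is nested more than one level deep.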

What's the best practise for Perl hashes with array values?

What is the best practise to solve this?
if (... )
{
push (@{$hash{'key'}}, @array ) ;
}
else
{
$hash{'key'} ="";
}
Is it bad practice to store an array under the key in one case and just an empty string in the other?
I'm not sure I understand your question, but I'll answer it literally as asked for now...
my @array = (1, 2, 3, 4);
my $arrayRef = \@array; # alternatively: my $arrayRef = [1, 2, 3, 4];
my %hash;
$hash{'key'} = $arrayRef; # or again: $hash{'key'} = [1, 2, 3, 4]; or $hash{'key'} = \@array;
The crux of the problem is that arrays or hashes take scalar values... so you need to take a reference to your array or hash and use that as the value.
See perlref and perlreftut for more information.
EDIT: Yes, you can store empty strings as values for some keys and references (to arrays, hashes, scalars, or typeglobs/filehandles) for other keys. Either way, they're all still scalars.
You'll want to look at the ref function for figuring out how to disambiguate between the reference types and normal scalars.
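As an illustration of that ref check, here is a minimal sketch. The describe helper and the sample %hash are hypothetical, and the sketch assumes values are either array references or plain scalars.

```perl
use strict;
use warnings;

my %hash = (
    list  => [ 1, 2, 3 ],   # array reference
    empty => "",            # plain scalar (empty string)
);

# Hypothetical helper: ref returns "ARRAY" for an array reference and an
# empty string for a value that is not a reference at all.
sub describe {
    my ($value) = @_;
    return ref($value) eq 'ARRAY'
        ? "array of " . scalar(@{$value}) . " element(s)"
        : "plain scalar '" . $value . "'";
}

print "$_: ", describe($hash{$_}), "\n" for sort keys %hash;
```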
It's probably simpler to use explicit array references:
my $arr_ref = \@array;
$hash{'key'} = $arr_ref;
Actually, doing the above and using push result in the same data structure:
use Data::Dumper;
my @array = qw/ one two three four five /;
my $arr_ref = \@array;
my %hash;
my %hash2;
$hash{'key'} = $arr_ref;
print Dumper \%hash;
push @{$hash2{'key'}}, @array;
print Dumper \%hash2;
This gives:
$VAR1 = {
'key' => [
'one',
'two',
'three',
'four',
'five'
]
};
$VAR1 = {
'key' => [
'one',
'two',
'three',
'four',
'five'
]
};
Using explicit array references uses fewer characters and is easier to read than the push @{$hash{'key'}}, @array construct, IMO.
Edit: For your else{} block, it's probably less than ideal to assign an empty string. It would be a lot easier to just skip the if-else construct and, later on when you're accessing values in the hash, do an if( defined( $hash{'key'} ) ) check. That's a lot closer to standard Perl idiom, and you don't waste memory storing empty strings in your hash.
Otherwise, you'll have to use ref() to find out what kind of data you have in your value, and that is less clear than just doing a defined-ness check.
I'm not sure what your goal is, but there are several things to consider.
First, if you are going to store an array, do you want to store a reference to the original value or a copy of the original values? In either case, I prefer to avoid the dereferencing syntax and take references when I can:
$hash{key} = \@array; # just a reference
use Clone qw(clone); # or a similar module
$hash{key} = clone( \@array );
Next, do you want to add to the values that exist already, even if it's a single value? If you are going to have array values, I'd make all the values arrays even if you have a single element. Then you don't have to decide what to do and you remove a special case:
$hash{key} = [] unless defined $hash{key};
push @{ $hash{key} }, @values;
That might be your "best practice" answer, which is often the technique that removes as many special cases and extra logic as possible. When I do this sort of thing in a module, I typically have an add_value method that encapsulates this magic, so I don't have to see it or type it more than once.
If you already have a non-reference value in the hash key, that's easy to fix too:
if( defined $hash{key} and ! ref $hash{key} ) {
$hash{key} = [ $hash{key} ];
}
If you already have non-array reference values that you want to be in the array, you do something similar. Maybe you want an anonymous hash to be one of the array elements:
if( defined $hash{key} and ref $hash{key} eq ref {} ) {
$hash{key} = [ $hash{key} ];
}
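The add_value idea mentioned in this answer might look roughly like the sketch below. It is written as a plain subroutine rather than a method, its name and argument order are my assumptions, and it folds in the "wrap an existing single value" step.

```perl
use strict;
use warnings;

# Hypothetical helper: append values to $hash->{$key}, always keeping an arrayref.
sub add_value {
    my ($hash, $key, @values) = @_;

    if (!defined $hash->{$key}) {
        $hash->{$key} = [];                  # nothing stored yet
    }
    elsif (ref($hash->{$key}) ne 'ARRAY') {
        $hash->{$key} = [ $hash->{$key} ];   # wrap an existing single value
    }
    push @{ $hash->{$key} }, @values;
    return;
}

my %hash = ( key => "value" );
add_value(\%hash, 'key',   1, 2);
add_value(\%hash, 'other', 3);

print "$_ = @{ $hash{$_} }\n" for sort keys %hash;
```

With every value guaranteed to be an array reference, callers never need to test what kind of value a key holds.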
Dealing with the revised notation:
if (... )
{
push (@{$hash{'key'}}, @array);
}
else
{
$hash{'key'} = "";
}
we can immediately tell that you are not following the standard advice that protects novices (and experts!) from their own mistakes. You're using a symbolic reference, which is not a good idea.
use strict;
use warnings;
my %hash = ( key => "value" );
my @array = ( 1, "abc", 2 );
my @value = ( 22, 23, 24 );
push(@{$hash{'key'}}, @array);
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
This does not run:
Can't use string ("value") as an ARRAY ref while "strict refs" in use at xx.pl line 8.
I'm not sure I can work out what you were trying to achieve. Even if you remove the 'use strict;' pragma, the code shown does not detect a change from the push operation.
use warnings;
my %hash = ( key => "value" );
my @array = ( 1, "abc", 2 );
my @value = ( 22, 23, 24 );
push @{$hash{'key'}}, @array;
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
foreach my $value (@{$hash{'key'}}) { print "h_key $value\n"; }
push @value, @array;
foreach my $key (sort keys %hash) { print "$key = $hash{$key}\n"; }
foreach my $value (@array) { print "array $value\n"; }
foreach my $value (@value) { print "value $value\n"; }
Output:
key = value
array 1
array abc
array 2
value 22
value 23
value 24
h_key 1
h_key abc
h_key 2
key = value
array 1
array abc
array 2
value 22
value 23
value 24
value 1
value abc
value 2
I'm not sure what is going on there.
If your problem is how to replace an empty string value you had stored before with an array onto which you can push your values, this might be the best way to do it:
if ( ... ) {
my $r = \$hash{ $key }; # $hash{ $key } autoviv-ed
$$r = [] unless ref $$r;
push @$$r, @values;
}
else {
$hash{ $key } = "";
}
I avoid multiple hash look-ups by saving a copy of the auto-vivified slot.
Note the code relies on a scalar or an array being the entire universe of things stored in %hash.

Iterate through hash values in perl

I've got a hash with let's say 20 values.
It's initialized this way:
my $line = $_[0]->[0];
foreach my $value ($line) {
print $value;
}
Now when I try to get the value of each hash in $line it says:
Use of uninitialized value in print at file.pl line 89
Is there a way to iterate through each value of a hash?
I also tried it with:
my %line = $_[0]->[0];
foreach my $key (keys %line) {
print %line->{$key};
}
But that is also not working:
Reference found where even-sized list expected at file.pl at line 89
Anybody knows what to do? It shouldn't be that difficult...
To iterate over values in a hash:
for my $value (values %hash) {
print $value;
}
$line in your first example is a scalar, not a hash.
If it's a hash reference, dereference it with %{$line}.
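For the situation in the question, a minimal sketch might look like this. It assumes that the first argument really is an array reference whose first element is a hash reference; the sorted_values name and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the sub in the question: its first argument is
# an array reference whose first element is a hash reference.
sub sorted_values {
    my $line = $_[0]->[0];            # a scalar holding a hash reference
    return sort values %{$line};      # dereference, then take the values
}

my @values = sorted_values( [ { x => "one", y => "two" } ] );
print "$_\n" for @values;
```

If $_[0]->[0] turns out to be undefined, as the warning in the question suggests, the problem lies in what the caller passes rather than in the loop.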
First, you must understand the difference between a hash, and a hash reference.
Your initial assignment $_[0]->[0] means something like: take the first argument of the current function ($_[0]), dereference it (->), treat it as an array, and retrieve its first value ([0]). That value cannot be a list or a hash; it must be a scalar (string, int, float, reference).
Here is an example:
my %hash = ( MyKey => "MyValue");
my $hashref = \%hash;
# The next line prints all keys of %hash
foreach (keys %hash) { print $_ }
# And is equivalent to
foreach (keys %{$hashref}) { print $_ }
$hash{MyKey} eq $hashref->{MyKey}; # is true
Please refer to http://perldoc.perl.org/perlreftut.html for further details.
The warning is telling you that there is nothing at $_[0]->[0]. It's not dying and telling you that you're indexing nothing, so $_[0] is likely an arrayref, but nothing is in the first slot, or perhaps it's pointing to an empty array.
Were it an empty string or a 0, it wouldn't complain.
Were there any reference there, you could print something even if only: BLAH(0x80af74). (Where "BLAH" is one of "ARRAY", "HASH", "SCALAR", "REF", "GLOB", "IO", ... )
My suggestion is that you do this:
use feature 'say';
use Data::Dumper;
say Data::Dumper->Dump( [ $_[0] ] ); # or even say Data::Dumper->Dump( [ \@_ ] )
and then look at the output.
Once you've got a hashref at $_[0]->[0], then if you must loop through the hash, the best way is:
while ( my ( $key, $value ) = each %$hashref ) {
do_stuff_with_key_and_value( $key, $value );
}
see each
Lastly, it seems that you have some sigil confusion. See the last part of this link for a decent attempt to explain that sigils ( '$', '@', '%' ) are not part of the name of a variable, but indicators of what we want retrieved from it. Perl compilation woes

How can I pass a hash to a Perl subroutine?

In one of my main (or primary) routines, I have two or more hashes. I want the subroutine foo() to receive these possibly-multiple hashes as distinct hashes. Right now I have no preference whether they go by value or as references. I have been struggling with this for the last many hours and would appreciate help, so that I don't have to leave Perl for PHP! (I am using mod_perl, or will be.)
Right now I have got some answer to my requirement, shown here
From http://forums.gentoo.org/viewtopic-t-803720-start-0.html
# sub: dump the hash values with the keys '1' and '3'
sub dumpvals
{
foreach $h (@_)
{
print "1: $h->{1} 3: $h->{3}\n";
}
}
# initialize an array of anonymous hash references
@arr = ({1,2,3,4}, {1,7,3,8});
# create a new hash and add the reference to the array
$t{1} = 5;
$t{3} = 6;
push @arr, \%t;
# call the sub
dumpvals(@arr);
I only want to extend it so that in dumpvals I could do something like this:
foreach my %k ( keys @_[0]) {
# use $k and @_[0], and others
}
The syntax is wrong, but I suppose you can tell that I am trying to get the keys of the first hash ( hash1 or h1), and iterate over them.
How to do it in the latter code snippet above?
I believe this is what you're looking for:
sub dumpvals {
foreach my $key (keys %{$_[0]}) {
print "$key: $_[0]{$key}\n";
}
}
An element of the argument array is a scalar, so you access it as $_[0], not @_[0].
keys operates on hashes, not hash refs, so you need to dereference, using %.
And of course, the keys are scalars, not hashes, so you use my $key, not my %key.
To have dumpvals dump the contents of all hashes passed to it, use
sub dumpvals {
foreach my $h (@_) {
foreach my $k (keys %$h) {
print "$k: $h->{$k}\n";
}
}
}
Its output when called as in your question is
1: 2
3: 4
1: 7
3: 8
1: 5
3: 6
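If the hashes are named rather than collected in an array, they can be passed as separate references so that they stay distinct inside the sub. The sketch below is one way to do that; the names first_hash_keys, %h1 and %h2 are hypothetical, and the keys are sorted only to make the output predictable.

```perl
use strict;
use warnings;

# Each hash arrives as its own reference, so the hashes stay distinct.
sub first_hash_keys {
    my ($h1, @others) = @_;
    return sort keys %{$h1};      # keys of the first hash only
}

my %h1 = ( 1 => 2, 3 => 4 );
my %h2 = ( 1 => 7, 3 => 8 );

my @keys = first_hash_keys( \%h1, \%h2 );
print "First hash has keys: @keys\n";
```

Passing %h1 and %h2 without the backslashes would flatten both into one list in @_, and the boundary between them would be lost.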