Perl: Access values of a hash inside a hash

I have just picked up Perl.
I am a little confused about accessing hash values. Below is the code where I am trying to access the values of a hash inside a hash.
Since I am using a simple text editor to code, I am not able to figure out what the problem is. Please help.
my %box = (
Milk => {
A => 5,
B => 10,
C => 20,
},
Chocolate => {
AB => 10,
BC => 25,
CD => 40,
},
);
foreach my $box_key(keys %box) {
foreach my $inside_key (keys %box{box_key})
print "$box_key"."_$inside_key""is for rupees $box{box_key}{inside_key}";
}

If the syntax is
keys %hash
for a hash, it's
keys %{ ... }
for a hash reference. In this case, the reference is stored in $box{$box_key}, so you'd use
keys %{ $box{$box_key} }
Also, you're accessing elements named box_key and inside_key in a couple of places where you actually want the elements named by $box_key and $inside_key.
Finally, you can use curlies around variable names to instruct Perl where the variable name ends.
for my $box_key (keys %box) {
for my $inside_key (keys %{ $box{$box_key} }) {
print "${box_key}_$inside_key is for rupees $box{$box_key}{$inside_key}\n";
}
}
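For reference, here is a complete script along the same lines that can be run as-is. It is only a sketch: the keys are sorted purely so that the output order is stable (plain keys returns them in no particular order), and the @lines array is an addition of mine to collect the output.

```perl
use strict;
use warnings;

my %box = (
    Milk      => { A  => 5,  B  => 10, C  => 20 },
    Chocolate => { AB => 10, BC => 25, CD => 40 },
);

my @lines;
for my $box_key (sort keys %box) {
    my $inner = $box{$box_key};              # each value of %box is a hash reference
    for my $inside_key (sort keys %$inner) { # dereference it to get the inner keys
        push @lines, "${box_key}_$inside_key is for rupees $inner->{$inside_key}";
    }
}
print "$_\n" for @lines;
```

Adding use strict and use warnings at the top would also have flagged the bareword keys and the syntax problems in the original attempt.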

ikegami has explained it very well. If you are still having a problem, something else is probably missing in your code. Try the code below; I hope it helps.
my %box = (
Milk => {
A => 5,
B => 10,
C => 20,
},
Chocolate => {
AB => 10,
BC => 25,
CD => 40,
},
);
foreach my $box_key(keys %box) {
foreach my $inside_key (keys %{ $box{$box_key} }) { # keys on a bare hash reference no longer works in Perl 5.24+
print "${box_key}_$inside_key is for rupees $box{$box_key}{$inside_key}\n";
}
}
Output:
Chocolate_CD is for rupees 40
Chocolate_BC is for rupees 25
Chocolate_AB is for rupees 10
Milk_A is for rupees 5
Milk_C is for rupees 20
Milk_B is for rupees 10

Related

insert anonymous hash into anonymous hash for counting in a loop

I'm trying to count starts and stops of some services I keep track of in logs.
I'm not going to paste the entire code here, but this is how I build the hash:
I'm passing those starts and stops into an anonymous hash.
First I create an anonymous hash filled with keys and values (in my case $knot is the key and zeros are the values). Next I replace the values with another hash.
My code looks like this:
foreach $knot (@knots){
chomp $knot;
$variable = $variable."$knot;0;";
$Services = {split(/;/,$variable)};
}
my $data =
{
Starts=>'0',
Stops=>'0',
};
foreach my $key (keys %$Services) {
$Services->{$key} = $data;
}
print Dumper $Services;
Printing shows:
$VAR1 = {
' knot1' => {
'Stops' => '0',
'Starts' => '0'
},
' knot2' => $VAR1->{' knot1'},
' knot3' => $VAR1->{' knot1'},
' knot4' => $VAR1->{' knot1'},
' knot5' => $VAR1->{' knot1'},
and so on. Is there a better way of doing this? My approach, if I'm correct, is badly written, because changing knot1's starts/stops changes every other knot's values.
Counting is very simple in Perl, thanks to Autovivification. You can just create anonymous data structures on the fly, like so:
use Data::Dumper;
my %hash = ();
$hash{apple}{green}++;
$hash{apple}{red} ++;
$hash{pear}{yellow}++;
$hash{apple}{green}++;
$hash{apple}{red} ++;
$hash{apple}{green}++;
print Dumper(\%hash);
This will produce the desired structure for counting:
$VAR1 = {
'apple' => {
'green' => 3,
'red' => 2
},
'pear' => {
'yellow' => 1
}
};
This also works in loops using variables (here using a reference to a hash):
my $hash_ref = {};
for my $fruit (qw( apple pear apple peach apple pear )) {
$hash_ref->{$fruit}++;
}
print Dumper($hash_ref);
resulting in:
$VAR1 = {
'peach' => 1,
'pear' => 2,
'apple' => 3
};
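As for why every knot in the question ends up pointing at the same counters: the single $data reference is assigned to every key, which is what the $VAR1->{' knot1'} entries in the dump indicate. A minimal sketch of one way around that, assuming the knot names are already in an array (the names below are stand-ins), is to build a fresh anonymous hash per key:

```perl
use strict;
use warnings;

my @knots = ('knot1', 'knot2', 'knot3');   # stand-in for the names read from the logs

my %services;
for my $knot (@knots) {
    # The braces build a new anonymous hash on every iteration,
    # so no two knots share the same counters.
    $services{$knot} = { Starts => 0, Stops => 0 };
}

$services{knot1}{Starts}++;                # only knot1 is affected

print "knot1 starts: $services{knot1}{Starts}\n";
print "knot2 starts: $services{knot2}{Starts}\n";
```

With autovivification, the initialisation loop is optional: incrementing $services{$knot}{Starts} on a knot that has not been seen yet creates the inner hash on demand.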

accessing the highest value in hash of an hash in perl?

I have a hash of hashes that looks something like this:
my %hash = (
'fruits' => {
'apple' => 34,
'orange' => 30,
'pear' => 45,
},
'chocolates' => {
'snickers' => 35,
'lindt' => 20,
'mars' => 15,
},
);
I want to access only the fruit with the highest count and the chocolate with the highest count. The output should look like:
fruits: pear
chocolates : snickers
foreach my $item (keys %hash){
#print "$item:\t"; # this is the object name
foreach my $iteminitem (keys %{$hash{$item}})
{
my $longestvalue = (sort {$a<=>$b} values %{$hash{$item}})[-1]; #this stores the longest value
print "the chocolate/fruit corresponding to the longestvalue" ;
#iteminitem will be chocolate/fruit name
}
print "\n";
}
I know it is not difficult but I am blanking out!
The following sorts the keys of each hashref by descending value, so the max is the first element returned:
my %hash = (
chocolates => { lindt => 20, mars => 15, snickers => 35 },
fruits => { apple => 34, orange => 30, pear => 45 },
);
while (my ($key, $hashref) = each %hash) {
my ($max) = sort {$hashref->{$b} <=> $hashref->{$a}} keys %$hashref;
print "$key: $max\n";
}
Outputs:
fruits: pear
chocolates: snickers
Here is another way:
use strict;
use warnings;
use List::Util qw(max);
my %hash = (
'fruits' => {
'apple' => 34,
'orange' => 30,
'pear' => 45,
},
"chocolates" => {
'snickers' => 35,
'lindt' => 20,
'mars' => 15,
},
);
for (keys %hash) {
my $max = max values %{$hash{$_}}; # Find the max value
my %rev = reverse %{$hash{$_}}; # Reverse the internal hash
print "$_: $rev{$max}\n"; # Print first key and lookup by max value
}
Output:
fruits: pear
chocolates: snickers
For this you likely want List::UtilsBy:
use List::UtilsBy 'max_by';
my %hash = (
chocolates => { lindt => 20, mars => 15, snickers => 35 },
fruits => { apple => 34, orange => 30, pear => 45 },
);
foreach my $key ( keys %hash ) {
my $subhash = $hash{$key};
my $maximal = max_by { $subhash->{$_} } keys %$subhash;
print "$key: $maximal\n";
}
For this small example it probably doesn't matter too much, but for much larger cases, there's a big difference. This will run in O(n) time for the size of the hash, whereas the "sort and take first index" solution will take O(n log n) time, much slower, to sort the list of keys, only to then throw away all but the first result.
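If installing List::UtilsBy is not an option, the same single-pass idea can be sketched with reduce from the core List::Util module. This is an alternative of my own, not what the answer above uses; the %max_key hash exists only to hold the results, and the outer keys are sorted just to make the output order stable.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my %hash = (
    chocolates => { lindt => 20, mars  => 15, snickers => 35 },
    fruits     => { apple => 34, orange => 30, pear    => 45 },
);

my %max_key;
for my $group (sort keys %hash) {
    my $inner = $hash{$group};
    # One pass over the keys: keep whichever of the two candidates has the larger value.
    $max_key{$group} = reduce { $inner->{$a} >= $inner->{$b} ? $a : $b } keys %$inner;
    print "$group: $max_key{$group}\n";
}
```

When two keys tie for the maximum, which one wins depends on the order keys returns them in, so add a tie-break if that matters.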

Problems with sorting a hash of hashes by value in Perl

I'm rather inexperienced with hashes of hashes - so I hope someone can help a newbie...
I have the following multi-level hash:
$OCRsimilar{$ifocus}{$theWord}{"form"} = $theWord;
$OCRsimilar{$ifocus}{$theWord}{"score"} = $OCRscore;
$OCRsimilar{$ifocus}{$theWord}{"distance"} = $distance;
$OCRsimilar{$ifocus}{$theWord}{"similarity"} = $similarity;
$OCRsimilar{$ifocus}{$theWord}{"length"} = $ilength;
$OCRsimilar{$ifocus}{$theWord}{"frequency"} = $OCRHashDict{$ikey}{$theWord};
Later, I need to sort each second-level element ($theWord) according to the score value. I've tried various things, but have failed so far. The problem seems to be that the sorting introduces new empty elements in the hash that mess things up.
What I have done (for example - I'm sure this is far from ideal):
my @flat = ();
foreach my $key1 (keys { $OCRsimilar{$ifocus} }) {
push @flat, [$key1, $OCRsimilar{$ifocus}{$key1}{'score'}];
}
for my $entry (sort { $b->[1] <=> $a->[1] } @flat) {
print STDERR "@$entry[0]\t@$entry[1]\n";
}
If I check things with Data::Dumper, the hash contains for example this:
'uroadcast' => {
'HASH(0x7f9739202b08)' => {},
'broadcast' => {
'frequency' => '44',
'length' => 9,
'score' => '26.4893274374278',
'form' => 'broadcast',
'distance' => 1,
'similarity' => 1
}
}
If I don't do the sorting, the hash is fine. What's going on?
Thanks in advance for any kind of pointers...!
Just tell sort what to sort on. No other tricks are needed.
#!/usr/bin/perl
use warnings;
use strict;
my %OCRsimilar = (
focus => {
word => {
form => 'word',
score => .2,
distance => 1,
similarity => 1,
length => 4,
frequency => 22,
},
another => {
form => 'another',
score => .01,
distance => 1,
similarity => 1,
length => 7,
frequency => 3,
},
});
for my $word (sort { $OCRsimilar{focus}{$a}{score} <=> $OCRsimilar{focus}{$b}{score} }
keys %{ $OCRsimilar{focus} }
) {
print "$word: $OCRsimilar{focus}{$word}{score}\n";
}
Pointers: perlreftut, perlref, sort.
What seems suspicious to me is this construct:
foreach my $key1 (keys { $OCRsimilar{$ifocus} }) {
Try dereferencing the hash, so it becomes:
foreach my $key1 (keys %{ $OCRsimilar{$ifocus} }) {
Otherwise, you seem to be creating an anonymous hash and taking the keys of it, equivalent to this code:
foreach my $key1 (keys { $OCRsimilar{$ifocus} => undef }) {
Thus, I think $key1 would equal $OCRsimilar{$ifocus} inside the loop. Then, I think Perl will do auto-vivification when it encounters $OCRsimilar{$ifocus}{$key1}, adding the hash reference $OCRsimilar{$ifocus} as a new key to itself.
If you use warnings;, the program ought to complain Odd number of elements in anonymous hash.
Still, I don't understand why Perl doesn't autovivify further and add 'score' as the key, showing something like 'HASH(0x7f9739202b08)' => { 'score' => undef }, in the Data::Dumper output.
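On that last point: as far as I know, merely reading a nested element autovivifies the intermediate levels but never creates the final key, which would explain the empty {} in the dump. A small sketch that reproduces the effect without the keys { ... } construct (the data and the $bogus variable are made up for the demonstration):

```perl
use strict;
use warnings;

my %OCRsimilar = ( uroadcast => { broadcast => { score => 26.49 } } );

my $inner = $OCRsimilar{uroadcast};
my $bogus = "$inner";    # a reference used as a hash key is stringified, e.g. "HASH(0x...)"

# Read-only access through the bogus key, as in the loop from the question.
my $score = $OCRsimilar{uroadcast}{$bogus}{score};

my $made_intermediate = exists $inner->{$bogus}        ? 1 : 0;
my $made_score        = exists $inner->{$bogus}{score} ? 1 : 0;

print "intermediate hash created: $made_intermediate\n";
print "'score' key created: $made_score\n";
```

Reading $OCRsimilar{uroadcast}{$bogus}{score} needs a hash to look inside, so Perl creates $OCRsimilar{uroadcast}{$bogus} as an empty hash, but the lookup of score in it is a plain read and leaves it empty.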

Sorting a Hash of Array of Hashes by Internal Hash Value

Having a hard time wrapping my head around this, and I'm sure it's just a case of me being dense today. I have a data structure similar to this:
{
PDGFRA => [
{ "p.N659K" => 22 },
{ "p.D842Y" => 11 },
{ "p.I843_S847>T" => 9 },
{ "p.D842_H845del" => 35 },
{ "p.I843_D846delIMHD" => 24 },
{ "p.I843_D846del" => 21 },
{ "p.D842V" => 457 },
{ "p.N659Y" => 5 },
{ "p.M844_S847del" => 7 },
{ "p.S566_E571>K" => 8 },
{ "p.S566_E571>R" => 50 },
{ "p.V561D" => 54 },
],
}
I would like to print out the results, reverse sorted (greatest to smallest) by the hash values, so ultimately I end up with something like this
PDGFRA p.D842V 457
PDGFRA p.V561D 54
PDGFRA p.S566_E571>R 50
PDGFRA p.D842_H845del 35
.
.
.
etc.
I have no problem printing the data structure as I want, but I can't seem to figure out how to sort the data prior to printing it out. I've tried to sort like this:
for my $gene ( sort keys %matches ) {
for my $var ( sort { $a->{$var} <=> $b->{$var} } @{$matches{$gene}} {
print "$var\n";
}
}
But whether I use $var or $_ it doesn't seem to work, complaining that '$var' is not defined, and '$_' is not initialized. I also tried (really pathetically!) to use a Schwartzian transform, but I don't think I'm even close on this one:
for my $gene ( sort keys %matches ) {
my @sorted =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, $matches{$gene}->{$_} ] } @{$matches{$gene}};
}
Would someone mind pointing me in the right direction for sorting a hash or arrays of hashes by internal hash value?
Here's an alternative. This one works by sorting a list of indices into the array @sorted so that you can then just iterate over an array slice @$list[@sorted].
You're hampered by the data format though, and it's messy just to extract a pair of values given a single-element hash.
I hope this helps.
use strict;
use warnings;
my %matches = (
PDGFRA => [
{ "p.N659K" => 22 },
{ "p.D842Y" => 11 },
{ "p.I843_S847>T" => 9 },
{ "p.D842_H845del" => 35 },
{ "p.I843_D846delIMHD" => 24 },
{ "p.I843_D846del" => 21 },
{ "p.D842V" => 457 },
{ "p.N659Y" => 5 },
{ "p.M844_S847del" => 7 },
{ "p.S566_E571>K" => 8 },
{ "p.S566_E571>R" => 50 },
{ "p.V561D" => 54 },
],
);
while (my ($key, $list) = each %matches) {
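my @sorted = sort {
my (undef, $va) = %{ $list->[$a] }; # value of the single-pair hash at index $a
my (undef, $vb) = %{ $list->[$b] }; # value of the single-pair hash at index $b
$vb <=> $va; # descending by value
} 0 .. $#$list;
for my $item (@$list[@sorted]) {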
printf "%s %s %d\n", $key, %$item;
}
}
output
PDGFRA p.D842V 457
PDGFRA p.V561D 54
PDGFRA p.S566_E571>R 50
PDGFRA p.D842_H845del 35
PDGFRA p.I843_D846delIMHD 24
PDGFRA p.N659K 22
PDGFRA p.I843_D846del 21
PDGFRA p.D842Y 11
PDGFRA p.I843_S847>T 9
PDGFRA p.S566_E571>K 8
PDGFRA p.M844_S847del 7
PDGFRA p.N659Y 5
Update
In my comment on David W's answer I suggest a data structure that was just an array of two-element arrays, instead of an array of single-element hashes.
This is what that would look like. The output is identical to the above code
use strict;
use warnings;
my %matches = (
PDGFRA => [
[ "p.N659K", 22 ],
[ "p.D842Y", 11 ],
[ "p.I843_S847>T", 9 ],
[ "p.D842_H845del", 35 ],
[ "p.I843_D846delIMHD", 24 ],
[ "p.I843_D846del", 21 ],
[ "p.D842V", 457 ],
[ "p.N659Y", 5 ],
[ "p.M844_S847del", 7 ],
[ "p.S566_E571>K", 8 ],
[ "p.S566_E571>R", 50 ],
[ "p.V561D", 54 ],
],
);
while (my ($key, $list) = each %matches) {
my @sorted = sort { $list->[$b][1] <=> $list->[$a][1] } 0 .. $#$list;
for my $item (@$list[@sorted]) {
printf "%s %s %d\n", $key, @$item;
}
}
You've already accepted an answer, but I want to comment on your data structure. You have:
A hash
That hash contains an array.
These arrays contain a hash that only has a single key in it.
Why the array which contains the hashes with only a single key? Why not simply get rid of the array?
$VAR1 = {
'PDGFRA' => {
'p.N659K' => 22,
'p.D842Y' => 11,
'p.I843_S847>T' => 9,
'p.D842_H845del' => 35,
'p.I843_D846delIMHD' => 24,
'p.I843_D846del' => 21,
'p.D842V' => 457,
'p.N659Y' => 5,
'p.M844_S847del' => 7,
'p.S566_E571>K' => 8,
'p.S566_E571>R' => 50,
'p.V561D' => 54,
}
};
This would greatly simplify your structure and make it easier to find the values enclosed in the inner most hash. You can access the keys directly.
If the problem is that some of these hash keys may be duplicates, you can make that hash key point to an array of values:
$VAR1 = {
'PDGFRA' => {
'p.N659K' => 22,
'p.D842Y' => 11,
'p.I843_S847>T' => [
6,
9
],
'p.D842_H845del' => 35,
'p.I843_D846delIMHD' => 24,
'p.I843_D846del' => 21,
...
Note that p.I843_S847>T contains both 6 and 9. You could keep things uniform by making every value of the inner hash a reference to an array (most of which would hold a single value), or you could use the ref function to detect whether a value is a plain scalar or a reference to an array. Either way, you still get faster lookup and easier access to the keys of that hash, so you can sort them.
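To illustrate, here is a rough sketch of sorting the flattened structure by count, largest first, with ref used to cope with the mixed scalar/array values. The top_count helper and the @report array are hypothetical names of mine, the data is a shortened version of the example above, and treating an array entry as its largest element is just one possible choice.

```perl
use strict;
use warnings;
use List::Util qw(max);

my %matches = (
    PDGFRA => {
        'p.N659K'       => 22,
        'p.I843_S847>T' => [ 6, 9 ],   # duplicate key: several counts in one array
        'p.D842V'       => 457,
        'p.V561D'       => 54,
    },
);

# Hypothetical helper: the count to sort on, whether the entry is a
# plain number or a reference to an array of numbers.
sub top_count {
    my ($value) = @_;
    return ref($value) eq 'ARRAY' ? max(@$value) : $value;
}

my @report;
for my $gene (sort keys %matches) {
    my $variants = $matches{$gene};
    my @names = sort { top_count($variants->{$b}) <=> top_count($variants->{$a}) }
                keys %$variants;
    for my $name (@names) {
        push @report, sprintf("%s %s %d", $gene, $name, top_count($variants->{$name}));
    }
}
print "$_\n" for @report;
```

Because the variant names are now hash keys, the sort block can look each count up directly instead of unpacking a one-pair hash on every comparison.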
Since you are familiar with using complex data structures in Perl, you should also learn how object-oriented Perl works. This will make it a lot easier to handle these structures, and will also help you in development because it gives you a clean way of relating to these complex structures.
Assuming $VAR1 above is Data::Dumper::Dumper(\%matches) ...
If you don't want to make your data-structure nicer....
for my $gene ( sort keys %matches ) {
for my $hash ( sort {
my ($akey) = keys %$a;
my ($bkey) = keys %$b;
$a->{$akey} <=> $b->{$bkey}
} @{$matches{$gene}} ) {
my ($key) = keys %$hash;
print "$key => $hash->{$key}\n";
}
}
That sorts by the value (e.g: 12) not the key (e.g: 'p.I843_D846del'). I figured since you used a numeric comparison that you'd want to sort by the numeric value ;-)
edited: fixed body of inner loop.
edit 2:
I see you tried a Schwartzian Transform... if you keep your data-structure as is, that might be a more efficient solution... as follows:
for my $gene ( sort keys %matches ) {
print "$_->[0] => $_->[1]\n" for # print both key and value
sort { $a->[1] <=> $b->[1] } # sort by value (e.g: 35)
map { my ($key) = keys %$_; [$key, $_->{$key}] } # e.g:['p.D842_H845del' ,35]
@{$matches{$gene}};
}
But instead, I'd just fix the data structure.
Probably just make both the 'key' (e.g: 'p.I843_D846del') and the 'value' (e.g: 12) both values and give them consistent key names.
Your data structure appears to have unnecessary depth. But here's a Schwartzian Transform version that will sort it as you want.
for my $gene (sort keys %matches) {
my @sorted = map { +{ @$_ } } # Back to a hash; the + makes Perl parse an anonymous hash, not a block.
sort { $b->[1] <=> $a->[1] } # Sort by value, descending.
map { [each %$_] } # Unpack the 1-key hash into [key, value].
@{$matches{$gene}};
}

Why do I get “uninitialized value” warnings when I use Date::Manip’s UnixDate in a sort block?

Related/possible duplicate: Why do I get "uninitialized value" warnings when I use Date::Manip's sortByLength?
This block of code:
my @sorted_models = sort {
UnixDate($a->{'year'}, "%o") <=>
UnixDate($b->{'year'}, "%o")
} values %{$args{car_models}};
kept generating the following warning:
Use of uninitialized value in length at /.../Date/Manip.pm line 244.
Date::Manip is a CPAN module. And line 244 of Date::Manip is found within the following block of code:
# Get rid of a problem with old versions of perl
no strict "vars";
# This sorts from longest to shortest element
sub sortByLength {
return (length $b <=> length $a);
}
use strict "vars";
But then including this (printing out the actual Unix Date value to the console in the logger) before the block of code to sort the values:
foreach (values %{$args{car_models}}) {
$g_logger->info(UnixDate($_->{'year'},"%o"));
}
removed the warnings entirely. Why? And what is a good fix instead of doing all these logging statements?
NOTE: None of the sorted values are undefined because when I printed them out in the logger, I could see that every one of them had a numerical value.
I am going to one last time try to answer this as clearly as possible.
First, if all timestamps are like 2008-08-07T22:31:06Z, there is no need to map them through UnixDate as standard sort using cmp will sort them correctly.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Date::Manip;
my %args = (
car_models => {
a => { year => '2009-08-07T22:31:06Z' },
b => { year => '2008-08-07T23:31:06Z' },
c => { year => '2008-08-07T21:31:06Z' },
},
);
my @sorted_cmp = sort {
$a->{year} cmp $b->{year}
} values %{ $args{car_models}};
print "Sorted *without* using UnixDate:\n";
print Dumper \@sorted_cmp;
my @sorted_dm = sort {
UnixDate($a->{year}, '%o') <=> UnixDate($b->{year}, '%o')
} values %{ $args{car_models}};
print "Sorted using UnixDate:\n";
print Dumper \@sorted_dm;
Output (after setting TZ in cmd to placate Date::Manip):
C:\Temp> cars
Sorted *without* using UnixDate:
$VAR1 = [
{
'year' => '2008-08-07T21:31:06Z'
},
{
'year' => '2008-08-07T23:31:06Z'
},
{
'year' => '2009-08-07T22:31:06Z'
}
];
Sorted using UnixDate:
$VAR1 = [
{
'year' => '2008-08-07T21:31:06Z'
},
{
'year' => '2008-08-07T23:31:06Z'
},
{
'year' => '2009-08-07T22:31:06Z'
}
];
No warnings, no errors ... Ergo, all you have put on this page is one big mess of a red herring. Besides, this still does not explain where the 1249998666 in your other question came from.
Something is wrong with Date::Manip in the way it localizes special variables.
Try to call Date_Init() before you do sorting. It seems to solve the problem:
use strict;
use warnings;
use Data::Dumper;
use Date::Manip qw(UnixDate Date_Init);
my $cars_ref = {
mazda => {model => 'mazda', year => '2008' },
toyota => {model => 'toyota', year => '2001' },
mitsu => {model => 'mitsu', year => '2005' }
};
Date_Init(); # Initialize Date::Manip first!
my @models =
sort {
UnixDate( $a->{year}, '%o' ) <=> UnixDate( $b->{year}, '%o' );
} values %$cars_ref;
print Dumper \@models;
Output:
$VAR1 = [
{
'model' => 'toyota',
'year' => '2001'
},
{
'model' => 'mitsu',
'year' => '2005'
},
{
'model' => 'mazda',
'year' => '2008'
}
];
So I added that line of code to dump my data:
my @sorted_models = sort { $g_logger->info(Dumper{a=>$a,b=>$b});
UnixDate($a->{'year'}, "%o") <=>
UnixDate($b->{'year'}, "%o"); }
values %{$args{car_models}};
I was able to dump $a and $b ONCE, then I got the error like >50 times, followed by a dump of $a and $b for 20 or so times (I have 10 elements in my array)
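Another thing that may be worth trying is to compute each sort key once, before sorting, so that nothing other than a plain comparison runs inside the sort block. The sketch below shows the shape of that approach. Note the assumptions: year_key is a hypothetical stand-in for UnixDate($model->{year}, '%o') so that the example runs without Date::Manip, the data is borrowed from the answer above, and I have not verified that this makes the warning go away in the original setup.

```perl
use strict;
use warnings;

my %args = (
    car_models => {
        mazda  => { model => 'mazda',  year => '2008' },
        toyota => { model => 'toyota', year => '2001' },
        mitsu  => { model => 'mitsu',  year => '2005' },
    },
);

# Hypothetical stand-in for UnixDate($model->{year}, '%o').
sub year_key {
    my ($model) = @_;
    return $model->{year};
}

my @sorted_models =
    map  { $_->[1] }                 # 3. keep only the original record
    sort { $a->[0] <=> $b->[0] }     # 2. compare the precomputed keys
    map  { [ year_key($_), $_ ] }    # 1. compute each key exactly once
    values %{ $args{car_models} };

print "$_->{model} $_->{year}\n" for @sorted_models;
```

This also means the key function is called once per element rather than twice per comparison, which matters if the date parsing is slow.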