How to extract key name from a hash of hash? - perl

I have the following hash of hashes:
%HoH = (
flintstones => {
husband => "fred",
pal => "barney",
},
jetsons => {
husband => "george",
wife => "jane",
"his boy" => "elroy",
},
simpsons => {
husband => "homer",
wife => "marge",
kid => "bart",
},
);
How do I iterate over each inner hash (say flintstones) and extract the key names (husband, pal) and the corresponding values on each iteration?

for my $k (keys %{ $HoH{flintstones} }) {
my $v = $HoH{flintstones}{$k};
print "key is $k; value is $v\n";
}
Another way would be to use each:
while( my($k, $v) = each %{ $HoH{flintstones} }) { ... }

for my $outer_key ( keys %HoH )
{
print "Values for inner_hash $outer_key:\n";
for my $inner_key ( keys %{$HoH{$outer_key}} )
{
print "\t'$inner_key' points to " . $HoH{$outer_key}{$inner_key} . "\n";
}
}
Because each key in the outer level points to a hash in the inner level we can use all the normal hash functions on that entry.
While it is possible to write this in a more succinct way without the double loop I prefer the nested loops for two reasons.
It is more obvious when someone else has to work on the code (including you in six months as someone else).
It makes it easier to track things such as which outer key leads to this point if needed (as shown in my output).
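For comparison, here is a rough sketch of what the more succinct form might look like, using nested map calls instead of nested for loops. The data is a trimmed copy of the question's hash, the "outer.inner = value" output format is only an illustration, and the keys are sorted purely so the output order is repeatable.

```perl
use strict;
use warnings;

# Sample data taken from the question (trimmed to two families).
my %HoH = (
    flintstones => { husband => "fred",   pal  => "barney" },
    jetsons     => { husband => "george", wife => "jane"   },
);

# Flatten both levels into a list of "outer.inner = value" strings.
my @lines = map {
    my $outer = $_;
    map {
        my $inner = $_;
        "$outer.$inner = $HoH{$outer}{$inner}"
    } sort keys %{ $HoH{$outer} };
} sort keys %HoH;

print "$_\n" for @lines;
```

It is shorter, but the outer key has to be copied into a named variable anyway, which is part of why the explicit nested loops tend to read better.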

Just loop over the hash (by keys or values or each, depending on whether you need the keys and on taste) and then loop over the inner hashes (likewise).
So, to get all of the people described by this hash:
for (values %HoH) {
for (values %$_) {
push @people, $_;
}
}
Or to build a table of all the husbands, all the wives, etc.:
for my $fam (values %HoH) {
push @{$relations{$_}}, $fam->{$_} for keys %$fam;
}
Or to re-key the table off the husbands:
for my $fam (keys %HoH) {
$patriarchs{$HoH{$fam}{husband}}{family} = $fam;
for (keys %{ $HoH{$fam} }) {
next if $_ eq 'husband';
$patriarchs{$HoH{$fam}{husband}}{$_} = $HoH{$fam}{$_};
}
}

Related

How can I find which keys in a Perl multi-level hash correspond to a given value?

I have a data structure which looks like this:
my %hoh = (
'T431567' => {
machin => '01',
bidule => '02',
truc => '03',
},
'T123456' => {
machin => '97',
bidule => '99',
truc => '69',
},
'T444444' => {
machin => '12',
bidule => '64',
truc => '78',
},
);
I want to search the various values of truc for a particular value and find the top-level attribute which corresponds to that entry. For example, looking for a value of 78, I want to find the result 'T444444', because $hoh{T444444}{truc} is 78.
How can I do this, please?
You can do this with grep:
my @keys = grep { $hoh{$_}{truc} == 78 } keys %hoh;
Note that this can return more than one key, if there are duplicate values in the hash. Also note that this is not particularly efficient, since it has to search the entire hash. In most cases it's probably fine, but if the hash can be very large and you may need to run lots of such queries against it, it may be more efficient to build a reverse index as suggested by Sobrique:
my %trucs;
foreach my $part (keys %hoh) {
my $val = $hoh{$part}{truc};
push @{ $trucs{$val} }, $part;
}
my @keys = @{ $trucs{78} };
or, more generally:
my %index;
foreach my $part (keys %hoh) {
my %data = %{ $hoh{$part} };
foreach my $key (keys %data) {
my $val = $data{$key};
push @{ $index{$key}{$val} }, $part;
}
}
my @keys = @{ $index{truc}{78} };
You can't with that data structure as it is: there is no 'backwards' relationship from value to key unless you create one.
You've two options - run a search, or create an 'index'. Practically speaking, these are the same, just one saves the results.
my %index;
foreach my $key ( keys %hoh ) {
my $truc = $hoh{$key}{'truc'};
$index{$truc} = $key;
}
Note - won't do anything clever if the 'truc' numbers are duplicated - it'll overwrite. (Handling this is left as an exercise to the reader).
This solution is similar to those already posted, but it uses the each operator to process the original hash in fewer lines of code, and probably more quickly.
I have added the dump output only so that you can see the form of the structure that is built.
use strict;
use warnings;
my %hoh = (
T123456 => { bidule => '99', machin => '97', truc => '69' },
T431567 => { bidule => '02', machin => '01', truc => '03' },
T444444 => { bidule => '64', machin => '12', truc => '78' },
);
my %trucs;
while ( my ($key, $val) = each %hoh ) {
next unless defined( my $truc = $val->{truc} );
push @{ $trucs{$truc} }, $key ;
}
use Data::Dump;
dd \%trucs;
print "\n";
print "$_\n" for @{ $trucs{78} };
output
{ "03" => ["T431567"], "69" => ["T123456"], "78" => ["T444444"] }
T444444
If you can guarantee that the answer is unique, i.e. that there is never more than one element of the original hash that has a given value for the truc entry, or you are interested only in the last one found, then you can write this still more neatly
my %trucs;
while ( my ($key, $val) = each %hoh ) {
next unless defined( my $truc = $val->{truc} );
$trucs{$truc} = $key ;
}
print $trucs{78}, "\n";
output
T444444
Simplest of all, if there is always a truc entry in each second-level hash, and its value is guaranteed to be unique, then this will do the job
my %trucs = map { $hoh{$_}{truc} => $_ } keys %hoh;
print $trucs{78}, "\n";
with the output as above.

Problems with sorting a hash of hashes by value in Perl

I'm rather inexperienced with hashes of hashes - so I hope someone can help a newbie...
I have the following multi-level hash:
$OCRsimilar{$ifocus}{$theWord}{"form"} = $theWord;
$OCRsimilar{$ifocus}{$theWord}{"score"} = $OCRscore;
$OCRsimilar{$ifocus}{$theWord}{"distance"} = $distance;
$OCRsimilar{$ifocus}{$theWord}{"similarity"} = $similarity;
$OCRsimilar{$ifocus}{$theWord}{"length"} = $ilength;
$OCRsimilar{$ifocus}{$theWord}{"frequency"} = $OCRHashDict{$ikey}{$theWord};
Later, I need to sort each second-level element ($theWord) according to the score value. I've tried various things, but have failed so far. The problem seems to be that the sorting introduces new empty elements in the hash that mess things up.
What I have done (for example - I'm sure this is far from ideal):
my @flat = ();
foreach my $key1 (keys { $OCRsimilar{$ifocus} }) {
push @flat, [$key1, $OCRsimilar{$ifocus}{$key1}{'score'}];
}
for my $entry (sort { $b->[1] <=> $a->[1] } @flat) {
print STDERR "@$entry[0]\t@$entry[1]\n";
}
If I check things with Data::Dumper, the hash contains for example this:
'uroadcast' => {
'HASH(0x7f9739202b08)' => {},
'broadcast' => {
'frequency' => '44',
'length' => 9,
'score' => '26.4893274374278',
'form' => 'broadcast',
'distance' => 1,
'similarity' => 1
}
}
If I don't do the sorting, the hash is fine. What's going on?
Thanks in advance for any kind of pointers...!
Just tell sort what to sort on. No other tricks are needed.
#!/usr/bin/perl
use warnings;
use strict;
my %OCRsimilar = (
focus => {
word => {
form => 'word',
score => .2,
distance => 1,
similarity => 1,
length => 4,
frequency => 22,
},
another => {
form => 'another',
score => .01,
distance => 1,
similarity => 1,
length => 7,
frequency => 3,
},
});
for my $word (sort { $OCRsimilar{focus}{$a}{score} <=> $OCRsimilar{focus}{$b}{score} }
keys %{ $OCRsimilar{focus} }
) {
print "$word: $OCRsimilar{focus}{$word}{score}\n";
}
Pointers: perlreftut, perlref, sort.
What seems suspicious to me is this construct:
foreach my $key1 (keys { $OCRsimilar{$ifocus} }) {
Try dereferencing the hash, so it becomes:
foreach my $key1 (keys %{ $OCRsimilar{$ifocus} }) {
Otherwise, you seem to be creating an anonymous hash and taking the keys of it, equivalent to this code:
foreach my $key1 (keys { $OCRsimilar{$ifocus} => undef }) {
Thus, I think $key1 would equal $OCRsimilar{$ifocus} inside the loop. Then, I think Perl will do auto-vivification when it encounters $OCRsimilar{$ifocus}{$key1}, adding the hash reference $OCRsimilar{$ifocus} as a new key to itself.
If you use warnings;, the program ought to complain Odd number of elements in anonymous hash.
Still, I don't understand why Perl doesn't do further auto-vivification and add 'score' as the key, showing something like 'HASH(0x7f9739202b08)' => { 'score' => undef }, in the Data dump.
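On that last point: merely reading a nested element autovivifies only the intermediate levels, not the final key, which would be consistent with the empty {} in the dump. A minimal sketch of that behaviour follows; the %h data and the $bogus_key name are made up for illustration and are not from the question.

```perl
use strict;
use warnings;

my %h = ( focus => { broadcast => { score => 26.5 } } );

# A reference used as a hash key is stringified, e.g. "HASH(0x...)".
my $bogus_key = "$h{focus}";

# Merely reading a nested element through that key...
my $score = $h{focus}{$bogus_key}{score};

# ...creates the intermediate level but not the final element.
my $intermediate_exists = exists $h{focus}{$bogus_key}        ? 1 : 0;
my $leaf_exists         = exists $h{focus}{$bogus_key}{score} ? 1 : 0;

print "intermediate: $intermediate_exists, leaf: $leaf_exists\n";
```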

Perl accessing the elements in a hash/hash reference data structure

I have a question I'm hoping you could help with as I am new to hashes and hash reference stuff?
I have the following data structure:
$VAR1 = {
'http://www.superuser.com/' => {
'difference' => {
'http://www.superuser.com/questions' => '10735',
'http://www.superuser.com/faq' => '13095'
},
'equal' => {
'http://www.superuser.com/ ' => '20892'
}
},
'http://www.stackoverflow.com/' => {
'difference' => {
'http://www.stackoverflow.com/faq' => '13015',
'http://www.stackoverflow.com/questions' => '10506'
},
'equal' => {
'http://www.stackoverflow.com/ ' => '33362'
}
}
};
If I want to access all the URLs in the key 'difference' so I can then perform some other actions on the URLs, what is the correct or preferred method of accessing those elements?
E.g., I will end up with the following URLs, which I can then process in a foreach loop:
http://www.superuser.com/questions
http://www.superuser.com/faq
http://www.stackoverflow.com/faq
http://www.stackoverflow.com/questions
------EDIT------
Code to access the elements further down the data structure shown above:
my @urls;
foreach my $key1 ( keys( %{$VAR1} ) ) {
print( "$key1\n" );
foreach my $key2 ( keys( %{$VAR1->{$key1}} ) ) {
print( "\t$key2\n" );
foreach my $key3 ( keys( %{$VAR1->{$key1}{$key2}} ) ) {
print( "\t\t$key3\n" );
push @urls, keys %{$VAR1->{$key1}{$key2}{$key3}};
}
}
}
print "@urls\n";
Using the code above why do I get the following error?
Can't use string ("13238") as a HASH ref while "strict refs" in use at ....
It is not difficult, just take the second level of keys off every key in the variable:
my @urls;
for my $key (keys %$VAR1) {
push @urls, keys %{$VAR1->{$key}{'difference'}};
}
If you're struggling with dereferencing, just keep in mind that all values in a hash or array can only be a scalar value. In a multilevel hash or array the levels are just single hashes/arrays stacked on top of each other.
For example, you could do:
for my $value (values %$VAR1) {
push @urls, keys %{$value->{'difference'}};
}
Or
for my $name (keys %$VAR1) {
my $site = $VAR1->{$name};
push @urls, keys %{$site->{'difference'}};
}
..taking the route either directly over the value (a reference to a hash) or over a temporary variable, representing the value via the key. There is more to read in perldoc perldata.
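As for the error in the question: the values at the third level are plain strings, so dereferencing them with %{ ... } fails under strict refs. One way to guard against that is to check ref before dereferencing. The following is only a sketch on a trimmed copy of the question's data; the sort is there just to make the output order repeatable.

```perl
use strict;
use warnings;

# Trimmed copy of the structure from the question.
my $VAR1 = {
    'http://www.superuser.com/' => {
        difference => {
            'http://www.superuser.com/questions' => '10735',
            'http://www.superuser.com/faq'       => '13095',
        },
        equal => { 'http://www.superuser.com/ ' => '20892' },
    },
};

my @urls;
for my $site ( values %$VAR1 ) {
    my $diff = $site->{difference};

    # Only dereference when the value really is a hash reference.
    next unless ref($diff) eq 'HASH';

    push @urls, sort keys %$diff;
}

print "$_\n" for @urls;
```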

Merging hashes in perl

I am trying to merge two hashes. Well, I am able to merge, but the output is not the way I want it to be:
Here is my code:
my %friend_list = (
Raj => "Good friend",
Rohit => "new Friend",
Sumit => "Best Friend",
Rohini => "Fiend",
Allahabad => "UttarPradesh",
);
my %city = (
Bangalore => "Karnataka",
Indore => "MadhyaPradesh",
Pune => "Maharashtra",
Allahabad => "UP",
);
my %friends_place = ();
my ($k, $v);
foreach my $ref (\%friend_list, \%city) {
while (($k,$v) = each (%$ref)) {
if (exists $ref{$k}) {
print"Warning: Key is all ready there\n";
next;
}
$friends_place{$k} = $v;
}
}
while (($k,$v) = each (%friends_place)) {
print "$k = $v \n";
}
The output from this is:
Raj=Good friend
Indore=MadhyaPradesh
Rohit=new Fiend
Bangalore=Karnataka
Allahabad=UttarPradesh
Sumit=Best Friend
Pune=Maharashtra
Rohini =Fiend
But I want to print %friend_list first followed by %city.
Another thing I was trying to do is, if there is any duplicate key, it should give me a warning message. But it is not giving me any message. As we can see here, we have Allahabad in both hashes.
Thanks
Try with:
my %friend_list = (
Raj => "Good friend",
Rohit => "new Fiend",
Sumit => "Best Friend",
Rohini => "Fiend",
Allahabad => "UttarPradesh",
);
my %city = (
Bangalore => "Karnataka",
Indore => "MadhyaPradesh",
Pune => "Maharashtra",
Allahabad => "UP",
);
#merging
my %friends_place = ( %friend_list, %city );
And, for warnings:
foreach my $friend( keys %friend_list ){
print"Warning: Key is all ready there\n" if exists $city{$friend};
}
The line if (exists $ref{$k}) { is wrong, and you can see that if you put use strict; use warnings; at the beginning of the script.
Moreover this line should be if (exists $friends_place{$k}) { to produce the message about duplicate keys.
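Putting those two fixes together, the loop from the question might look something like the sketch below. It uses shortened sample data, and the @key_order and @duplicates arrays are additions of mine (not in the question) to remember insertion order and to collect the clashing keys. Keys are sorted within each source hash only so the output is repeatable.

```perl
use strict;
use warnings;

my %friend_list = ( Raj  => "Good friend", Allahabad => "UttarPradesh" );
my %city        = ( Pune => "Maharashtra", Allahabad => "UP" );

my ( %friends_place, @key_order, @duplicates );
for my $ref ( \%friend_list, \%city ) {
    for my $k ( sort keys %$ref ) {
        # Check the merged hash, not the source hash being read.
        if ( exists $friends_place{$k} ) {
            push @duplicates, $k;    # the first value seen is kept
            next;
        }
        $friends_place{$k} = $ref->{$k};
        push @key_order, $k;
    }
}

print "Warning: key '$_' is already there\n" for @duplicates;
print "$_ = $friends_place{$_}\n" for @key_order;
```

With this version every key from %friend_list is printed before any key that appears only in %city.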
As hashes are unordered, you need to use an array to store the ordering:
my %friends_place = (%friend_list, %city);
my @friends_place_keyorder = ((keys %friend_list), (keys %city));
if ((scalar keys %friends_place) != (scalar @friends_place_keyorder)) {
print "duplicate key found\n";
}
foreach (@friends_place_keyorder) {
print "$_ = $friends_place{$_}\n";
}
EDIT: my original solution in python, left here for historical purpose:
As hashes are unordered, you need to use an array to store the ordering. I don't know perl, so the following code is python (should be fairly straightforward to translate to perl):
friend_list = ...
city = ...
friends_place = dict(friend_list.items() + city.items())
friends_place_keyorder = friend_list.keys() + city.keys()
# detect duplicate keys by checking their lengths
# if there is duplicate then the hash would be smaller than the list
if len(friends_place) != len(friends_place_keyorder):
print "duplicate key found"
# iterate through the list of keys instead of the hashes directly
for k in friends_place_keyorder:
print k, friends_place[k]

Is it possible to iterate through a hash in sorted order using the while(my($key, $value) ... ) {} method?

For a hash of this format:
my $itemHash = {
tag1 => {
name => "Item 1",
order => 1,
enabled => 1,
},
tag2 => {
name => "Item 2",
order => 2,
enabled => 0,
},
tag3 => {
name => "Item 3",
order => 3,
enabled => 1,
},
...
}
I have this code that correctly iterates through the hash:
keys %$itemHash; # Resets the iterator
while(my($tag, $item) = each %$itemHash) {
print "$tag is $item->{'name'}"
}
However, the order that these items are iterated in seems to be pretty random. Is it possible to use the same while format to iterate through them in the order specified by the 'order' key in the hash for each item?
(I know I can sort the keys first and then foreach loop through it. Just looking to see if there is cleaner way to do this.)
You can do something like:
foreach my $key (sort keys %{$itemHash}) {
print "$key : " . $itemHash->{$key}{name} . "\n";
}
The concept of an "ordered hash" is wrong. While an array is an ordered list of elements, and therefore accessible by index, a hash is an (un-ordered) collection of key-value pairs, where the keys are a set.
To accomplish your task, you would have to sort the keys by the order property:
my @sorted = sort {$itemHash->{$a}{order} <=> $itemHash->{$b}{order}} keys %$itemHash;
You can then create the key-value pairs via map:
my @sortedpairs = map {$_ => $itemHash->{$_}} @sorted;
We could wrap this up into a sub:
sub ridiculousEach {
my %hash = @_;
return map
{$_ => $hash{$_}}
sort
{$hash{$a}{order} <=> $hash{$b}{order}}
keys %hash;
}
to get an even-sized list of key-value elements, and
sub giveIterator {
my %hash = @_;
my @sorted = sort {$hash{$a}{order} <=> $hash{$b}{order}} keys %hash;
return sub {
return unless @sorted; # an empty list ends the caller's while loop
my $key = shift @sorted;
return ($key => $hash{$key});
};
}
to create a callback that is a drop-in for the each.
We can then do:
my $iterator = giveIterator(%$itemHash);
while (my ($tag, $item) = $iterator->()) {
...;
}
There is a severe drawback to this approach: each only uses two elements at a time, and thus operates in constant memory. This solution has to read in the whole hash and store an array of all keys. Unnoticeable with small hashes, this can become important with a very large number of elements.
The order in which keys are returned from a hash is undefined, so you'll need to sort the keys. One way would be, as you stated, to pull the keys out and sort them, then loop through the keys.
Another way would be to sort them on the fly. I'm not sure you'd consider it cleaner though. Something like:
for my $key ( sort { $itemHash->{$a}{order} <=> $itemHash->{$b}{order} } keys %$itemHash ) {
print "$key is $itemHash->{$key}{name}";
}
This is somewhat cleaner. It sorts on the tag names themselves rather than on the order field, so we have to use cmp, which compares strings.
my $itemHash = {
tag1 => {
name => "Item 1",
order => 1,
enabled => 1,
},
tag2 => {
name => "Item 2",
order => 2,
enabled => 0,
},
tag3 => {
name => "Item 3",
order => 3,
enabled => 1,
}
};
foreach((sort{$a cmp $b}(keys(%$itemHash)))){
print "$_ is $itemHash->{$_}->{'name'}\n";
}