Perl: Sort values in multidimensional hash - perl

I have a subroutine that looks like this:
...
sub UserLogins {
my %loginData;
my @logins = qx(last) or die;
foreach my $row (@logins) {
if ( $row=~ /^(\w+)\s+/ and (("$1" ne "reboot") and ("$1" ne "wtmp")) ) {
$loginData{$1}{"logins"}++;
}
}
return \%loginData
}
...
Using this subroutine in the main script, I get the following output:
...
$VAR1 = {
'user1' => {
'oldpassword' => 0,
'filesize' => '14360',
'logins' => 1
},
'user2' => {
'oldpassword' => 0,
'filesize' => '1220',
'logins' => 15
},
'user3' => {
'oldpassword' => 1,
'filesize' => '1780',
'logins' => 7
}
}
...
I wonder how I should sort my %loginData hash so that the user with the largest number of logins gets printed first (in this case user2, user3, user1).
I have also tried to sort values in this way:
foreach my $test_sort (sort {$a <=> $b} values %loginData) {
say $test_sort;
}
But this function doesn't work at all.
Another thing I tried and didn't work:
print "$_\n" foreach sort {$loginData{$b}->{logins} <=> $loginData{$a}->{logins}} keys %loginData;
Update
This actually works, but shows error messages:
print "$_\n" foreach sort {$userData{$b}->{'logins'} <=> $userData{$a}->{'logins'}} keys %userData;
Errors:
Use of uninitialized value in numeric comparison (<=>)

Here's a canonical solution for accessing elements of a hash in some order based on an attribute of the value (in this case logins):
for (sort { $loginData{$b}->{logins} <=> $loginData{$a}->{logins} } keys %loginData)
{
...
}
Note the reversal of $b and $a to achieve a reverse sort (most logins first).
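A likely cause of the "uninitialized value" warning quoted in the question's update is that some entries in the hash have no logins key. The following is a minimal sketch, assuming a missing count should be treated as 0 and that Perl 5.10 or later is available for the // operator. The sample data and the helper name users_by_logins are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; 'user4' has no 'logins' key at all.
my %loginData = (
    user1 => { logins => 1 },
    user2 => { logins => 15 },
    user3 => { logins => 7 },
    user4 => { filesize => 100 },
);

# Return user names ordered by login count, most logins first.
# A missing count is treated as 0 (an assumption), which avoids the
# "uninitialized value" warning. Ties are broken by user name.
sub users_by_logins {
    my ($data) = @_;
    return sort {
        ($data->{$b}{logins} // 0) <=> ($data->{$a}{logins} // 0)
            or $a cmp $b
    } keys %$data;
}

print "$_\n" for users_by_logins(\%loginData);
```

With this data the names come out as user2, user3, user1, user4.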

You can't sort a hash. From the Perl docs:
Hash entries are returned in an apparently random order.
You can, however, compute a list of keys that index the hash in the order you like.
Something like this:
use Data::Dumper;
our $hash = {
'user1' => {
'oldpassword' => 0,
'filesize' => '14360',
'logins' => 1
},
'user2' => {
'oldpassword' => 0,
'filesize' => '1220',
'logins' => 15
},
'user3' => {
'oldpassword' => 1,
'filesize' => '1780',
'logins' => 7
}
};
sub KeysByLogins {
my $hash = shift;
map { $_->[1] }
sort { $a->[0] <=> $b->[0] }
map { [ $hash->{$_}->{logins}, $_ ] } keys %$hash;
}
foreach my $key (KeysByLogins($hash)) {
print Data::Dumper->Dump([$hash->{$key}], [$key]) . "\n";
}
Then...
$ perl foo.pl
$user1 = {
'filesize' => '14360',
'oldpassword' => 0,
'logins' => 1
};
$user3 = {
'logins' => 7,
'oldpassword' => 1,
'filesize' => '1780'
};
$user2 = {
'oldpassword' => 0,
'filesize' => '1220',
'logins' => 15
};
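The output above is in ascending order because the comparison is $a->[0] <=> $b->[0]. To get the most-logins-first order the question asks for, the operands can be swapped. This is only a sketch of the same map/sort/map pipeline (often called a Schwartzian transform); the name keys_by_logins_desc and the trimmed-down data are assumptions for illustration.

```perl
use strict;
use warnings;

my $hash = {
    user1 => { logins => 1 },
    user2 => { logins => 15 },
    user3 => { logins => 7 },
};

# Same map/sort/map pipeline as KeysByLogins, with $b and $a swapped
# so that the largest login count comes first.
sub keys_by_logins_desc {
    my ($h) = @_;
    return map  { $_->[1] }
           sort { $b->[0] <=> $a->[0] }
           map  { [ $h->{$_}{logins}, $_ ] } keys %$h;
}

print join(', ', keys_by_logins_desc($hash)), "\n";   # user2, user3, user1
```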

Related

Perl - retrieve values from a hash of hashes

How do we retrieve the values of keys in a Hash of Hashes in Perl?
I tried to use the keys function. I wanted to remove the duplicates and then sort them, which I could do using the uniq and sort functions. Am I missing anything?
#!/usr/bin/perl
use warnings;
use strict;
sub ids {
my ($data) = @_;
my @allID = keys %{$data};
my @unique = uniq @allID;
foreach ( @unique ) {
@allUniqueID = $_;
}
my @result = sort{$a<=>$b}(@allUniqueId);
return @result;
}
my $data = {
'first' => {
'second' => {
'third1' => [
{ id => 44, name => 'a', value => 'aa' },
{ id => 48, name => 'b', value => 'bb' },
{ id => 100, name => 'c', value => 'cc' }
],
id => 19
},
'third2' => [
{ id => 199, data => 'dd' },
{ id => 40, data => 'ee' },
{ id => 100, data => { name => 'f', value => 'ff' } }
],
id => 55
},
id => 1
};
# should print “1, 19, 40, 44, 48, 55, 100, 199”
print join(', ', ids($data)) . "\n";
I know it's incomplete, but I am not sure how to proceed. Any help would be appreciated.
This routine will recursively walk the data structure and pull out all of the values that correspond to a hash key id, without sorting the results or eliminating duplicates:
sub all_keys {
my $obj = shift;
if (ref $obj eq 'HASH') {
return map {
my $value = $obj->{$_};
$_ eq 'id' ? $value : ref $value ? all_keys($value) : ();
} keys %$obj;
} elsif (ref $obj eq 'ARRAY') {
return map all_keys($_), @$obj;
} else {
return;
}
}
To do the sorting/eliminating, just call it like:
my @ids = sort { $a <=> $b } uniq(all_keys($data));
(I assume the uniq routine is defined elsewhere.)
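One place uniq can come from is the core List::Util module, which has exported it since version 1.45. The sketch below assumes such a version is installed and uses a small made-up structure, with one duplicated id, rather than the data from the question.

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # uniq was added to List::Util in 1.45

# The recursive collector from the answer above.
sub all_keys {
    my $obj = shift;
    if (ref $obj eq 'HASH') {
        return map {
            my $value = $obj->{$_};
            $_ eq 'id' ? $value : ref $value ? all_keys($value) : ();
        } keys %$obj;
    }
    elsif (ref $obj eq 'ARRAY') {
        return map { all_keys($_) } @$obj;
    }
    return;
}

# Small hypothetical structure with a duplicated id (100).
my $data = {
    id   => 1,
    list => [ { id => 100 }, { id => 40 } ],
    more => { id => 100, deep => { id => 19 } },
};

my @ids = sort { $a <=> $b } uniq(all_keys($data));
print join(', ', @ids), "\n";   # 1, 19, 40, 100
```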
Here's my version of the recursive approach
use warnings;
use strict;
sub ids {
my ($data) = @_;
my @retval;
if (ref $data eq 'HASH') {
push @retval, $data->{id} if exists $data->{id};
push @retval, ids($_) for values %$data;
}
elsif (ref $data eq 'ARRAY') {
push @retval, ids($_) for @$data;
}
@retval;
}
my $data = {
'first' => {
'second' => {
'third1' => [
{ id => 44, name => 'a', value => 'aa' },
{ id => 48, name => 'b', value => 'bb' },
{ id => 100, name => 'c', value => 'cc' }
],
id => 19
},
'third2' => [
{ id => 199, data => 'dd' },
{ id => 40, data => 'ee' },
{ id => 100, data => { name => 'f', value => 'ff' } }
],
id => 55
},
id => 1
};
my @ids = sort { $a <=> $b } ids($data);
print join(', ', @ids), "\n";
output
1, 19, 40, 44, 48, 55, 100, 100, 199
Update
A large chunk of the code in the solution above is there to work out how to extract the list of values from a data reference. Perl 5.14 introduced an experimental facility that allows the values operator to be applied to references to hashes and arrays as well as to hashes and arrays themselves. The feature was removed again in Perl 5.24, so if you're running a version from 5.14 through 5.22 and are comfortable disabling experimental warnings, you can write ids like this instead:
use warnings;
use strict;
use 5.014;
sub ids {
my ($data) = @_;
return unless my $type = ref $data;
no warnings 'experimental';
if ( $type eq 'HASH' and exists $data->{id} ) {
$data->{id}, map ids($_), values $data;
}
else {
map ids($_), values $data;
}
}
The output is identical to that of the previous solution

getting the sort keys from a hash

This is the data dumper of \%spec_hash.
It is sorted by group which is a national exchange - and symbol.
foohost:~/walt $ vi /tmp/footoo
$VAR1 = {
'ARCX' => {
'IACI' => 1,
'MCHP' => 1,
},
'AMXO' => {
'YUM' => 1,
'SYK' => 1,
},
'XISX' => {
'FCEL' => 1,
'GPS' => 1,
}
};
I was trying to sort these two hashes by their keys but cannot. For debugging purposes I really want to see what is getting pumped out of these hashes:
foreach my $exch (sort keys %spec_hash) {
foreach my $exch (sort keys %{$spec_hash{$exch}}) {
If I comment out the dumper and try a regular sort :
#print Dumper(\%spec_hash) ;
foreach my $exch (sort keys %spec_hash) {
#foreach my $exch (sort keys %{$spec_hash{$exch}}) {
print "key: $exch, value: $spec_hash{$exch}\n"
}
this is what I get:
key: AMXO, value: HASH(0x9cc88a4)
key: ARCX, value: HASH(0x9cd6f1c)
key: XISX, value: HASH(0x9cbd5f0)
and trying to print this prints nothing at all :
foreach my $exch (sort keys %{$spec_hash{$exch}}) {
print "key: $exch, value: $spec_hash{$exch}\n"
}
If I understand correctly,
for my $exch (sort keys %spec_hash) {
for my $sym (sort keys %{ $spec_hash{$exch} }) {
print "Exchange: $exch, Symbol: $sym\n";
}
}
You want to loop over every symbol, but they are grouped by exchange, so you must first loop over the exchanges.
Data::Dumper doesn't sort its output by default.
Try adding $Data::Dumper::Sortkeys = 1; to your script.
use strict;
use warnings;
use Data::Dumper;
my %hash = (
'ARCX' => { 'IACI' => 1, 'MCHP' => 1, },
'AMXO' => { 'YUM' => 1, 'SYK' => 1, },
'XISX' => { 'FCEL' => 1, 'GPS' => 1, },
);
print do {
local $Data::Dumper::Sortkeys = 1;
Dumper \%hash;
};
Outputs:
$VAR1 = {
'AMXO' => {
'SYK' => 1,
'YUM' => 1
},
'ARCX' => {
'IACI' => 1,
'MCHP' => 1
},
'XISX' => {
'FCEL' => 1,
'GPS' => 1
}
};
Note: This can be modified to include a sort subroutine that you define
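As a rough illustration of that note: Data::Dumper also accepts a code reference in $Data::Dumper::Sortkeys. It is called once for each hash being dumped, receives a reference to that hash, and must return an array reference of keys in the order they should appear. The reverse alphabetical order used here is an arbitrary choice for the example.

```perl
use strict;
use warnings;
use Data::Dumper;

my %hash = (
    'ARCX' => { 'IACI' => 1, 'MCHP' => 1 },
    'AMXO' => { 'YUM'  => 1, 'SYK'  => 1 },
    'XISX' => { 'FCEL' => 1, 'GPS'  => 1 },
);

my $dump = do {
    # The callback gets a reference to each hash being dumped and
    # returns an array reference of its keys in the desired order.
    local $Data::Dumper::Sortkeys = sub {
        my ($h) = @_;
        return [ reverse sort keys %$h ];
    };
    Dumper \%hash;
};
print $dump;
```

Because the setting is localized inside the do block, other Dumper calls in the script keep their default behaviour.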

Converting HoA to HoH with counting

Have this code:
use 5.020;
use warnings;
use Data::Dumper;
my %h = (
k1 => [qw(aa1 aa2 aa1)],
k2 => [qw(ab1 ab2 ab3)],
k3 => [qw(ac1 ac1 ac1)],
);
my %h2;
for my $k (keys %h) {
$h2{$k}{$_}++ for (@{$h{$k}});
}
say Dumper \%h2;
produces:
$VAR1 = {
'k1' => {
'aa2' => 1,
'aa1' => 2
},
'k3' => {
'ac1' => 3
},
'k2' => {
'ab1' => 1,
'ab3' => 1,
'ab2' => 1
}
};
Is it possible to write the above code in another way (e.g. simpler or more compact)?
Honestly, I don't like the number of times $h2{$k} is evaluated.
my %h2;
for my $k (keys %h) {
my $src = $h{$k};
my $dst = $h2{$k} = {};
++$dst->{$_} for @$src;
}
A subroutine can help make the intent more obvious. Maybe.
sub counts { my %c; ++$c{$_} for @_; \%c }
$h2{$_} = counts(@{ $h{$_} }) for keys %h;
That can be simplified if you do the change in-place.
sub counts { my %c; ++$c{$_} for @_; \%c }
$_ = counts(@$_) for values %h;
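To make the in-place variant easier to check, here is a small self-contained sketch using the data from the question. It relies on values returning aliases to the hash's values, so assigning to $_ replaces each array reference with its count hash. The printing loop at the end is only there for illustration.

```perl
use strict;
use warnings;

my %h = (
    k1 => [qw(aa1 aa2 aa1)],
    k2 => [qw(ab1 ab2 ab3)],
    k3 => [qw(ac1 ac1 ac1)],
);

# Count how often each item occurs and return a hash reference.
sub counts { my %c; ++$c{$_} for @_; \%c }

# values() yields aliases to the stored values, so this overwrites
# each array reference in %h with the corresponding count hash.
$_ = counts(@$_) for values %h;

for my $k (sort keys %h) {
    my $c = $h{$k};
    print "$k: ", join(', ', map { $_ . '=' . $c->{$_} } sort keys %$c), "\n";
}
# k1: aa1=2, aa2=1
# k2: ab1=1, ab2=1, ab3=1
# k3: ac1=3
```

Note that the original arrays are gone afterwards, so this only fits when %h is no longer needed in its list form.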

How do I get all values of a key in a perl data structure?

I want to write a function that will return a list of all “id” values in the data structure below at any level, sorted numerically. Also if the same value is found in multiple locations in the data structure it should only be included in the returned list once.
sub ids {
my ($data) = @_;
# Define this function
}
my $data = {
'top' => {
'window' => {
'elements' => {
{ id => 44, name => 'link', value => 'www.cnn.com' },
{ id => 48, name => 'title', value => 'CNN Home Page' },
{ id => 100, name => 'author', value => 'Admin' }
},
id => 19
},
'cache' => {
{ id => 199, data => '5' },
{ id => 40, data => '9' },
{ id => 100, data => { name => 'author', value => 'Admin' } }
},
id => 55
},
id => 1
};
# should print "1, 19, 40, 44, 49, 55, 100, 199"
print join(', ', ids($data)) . "\n";
Some parts of the data structure should be arrays, not hashes as in the OP:
use strict;
use warnings;
sub ids_r {
my ($data) = @_;
return map {
my $r = ref($data->{$_});
$r eq "HASH" ? ids_r($data->{$_}) :
$r ? map ids_r($_), @{$data->{$_}} :
$_ eq "id" ? $data->{$_} :
();
} keys %$data;
}
sub ids {
my ($data) = @_;
my %seen;
return
sort { $a <=> $b }
grep !$seen{$_}++, ids_r($data);
}
my $data = {
'top' => {
'window' => {
'elements' => [
{ id => 44, name => 'link', value => 'www.cnn.com' },
{ id => 48, name => 'title', value => 'CNN Home Page' },
{ id => 100, name => 'author', value => 'Admin' }
],
id => 19
},
'cache' => [
{ id => 199, data => '5' },
{ id => 40, data => '9' },
{ id => 100, data => { name => 'author', value => 'Admin' } }
],
id => 55
},
id => 1
};
print join(', ', ids($data));
output
1, 19, 40, 44, 48, 55, 100, 199
Here's a simple recursive solution. It's pretty easy to see what's going on here.
# There is a faster version of `uniq` provided by List::MoreUtils on CPAN.
sub uniq {
my %seen;
grep !$seen{$_}++, @_;
}
sub ids {
my $val = shift;
my $ref = ref $val;
my @r;
if ($ref eq 'HASH')
{
@r = map ids($_), grep ref, values(%$val);
push @r, $val->{id} if exists $val->{id};
}
elsif ($ref eq 'ARRAY')
{
@r = map ids($_), grep ref, @$val;
}
sort { $a <=> $b } uniq(@r);
}
@mpapec provides a similar solution which uses recursion without doing the sorting (the sub called ids_r in his answer), and then calls that from a separate wrapper function (the sub called ids in his answer) which provides the sorting all at the end. This is more efficient, but arguably more complex. (Indeed, because he had two similarly named functions, the first version of the answer included a mistake which negated the benefit of splitting the sorting out.)
Here's yet another technique, using a queue-based approach instead of recursion. If your data structure is very large, you may find that this works significantly faster.
# There is a faster version of `uniq` provided by List::MoreUtils on CPAN.
sub uniq {
my %seen;
grep !$seen{$_}++, @_;
}
sub ids {
my @r;
while (@_) {
my $val = shift;
my $ref = ref($val);
if ($ref eq 'HASH')
{
push @r, $val->{id} if exists $val->{id};
push @_, grep ref, values %$val;
}
elsif ($ref eq 'ARRAY')
{
push @_, grep ref, @$val;
}
}
sort { $a <=> $b } uniq(@r);
}

Sorting by value Hash of Hashes Perl

Let's say I have a hash of hashes data structure constructed as follows:
%HoH = (
flintstones => {
family_members => "fred;wilma;pebbles;dino",
number_of_members => 4,
},
jetsons => {
family_members => "george;jane;elroy",
number_of_members => 3,
},
simpsons => {
family_members => "homer;marge;bart;lisa;maggie",
number_of_members => 5,
},
)
How do I sort the keys, the families in this case, by the value number_of_members from greatest to least? Then I would like to print out the highest two. Here's a general idea but I know it's wrong:
foreach $value (
sort {
$HoH{$a}{$number_of_members} cmp $HoH{$b}{$number_of_members}
} keys %HoH)
my $count = 0;
while ($key, $value) = each %HoH) {
if (count <= 2){
print "${HoH}{$key}\t$key{$value}";
}
}
continue {
$count++;
};
I want the code to print (the spaces are tab delimited):
simpsons homer;marge;bart;lisa;maggie
flintstones fred;wilma;pebbles;dino
You're on the right track. Use the special sort variables $a and $b as keys into the hash and compare the values numerically (<=>, not cmp).
When printing, I find it easiest to store the keys in an array and use an array slice to access them.
use strict;
use warnings;
my %HoH = (
flintstones => {
family_members => "fred;wilma;pebbles;dino",
number_of_members => 4,
},
jetsons => {
family_members => "george;jane;elroy",
number_of_members => 3,
},
simpsons => {
family_members => "homer;marge;bart;lisa;maggie",
number_of_members => 5,
},
);
my @sorted = sort { $HoH{$b}{'number_of_members'} <=>
$HoH{$a}{'number_of_members'} } keys %HoH;
for (@sorted[0,1]) { # print only first two
print join("\t", $_, $HoH{$_}{'family_members'}), "\n";
}
Output:
simpsons homer;marge;bart;lisa;maggie
flintstones fred;wilma;pebbles;dino
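The fixed slice [0,1] would produce undefined elements if the hash held fewer than two families. As a hedged variation, the sorting and slicing could be wrapped in a small helper that takes the number of entries wanted and never reads past the end of the list. The name top_families is hypothetical and not part of the answer above.

```perl
use strict;
use warnings;

my %HoH = (
    flintstones => { family_members => "fred;wilma;pebbles;dino",      number_of_members => 4 },
    jetsons     => { family_members => "george;jane;elroy",            number_of_members => 3 },
    simpsons    => { family_members => "homer;marge;bart;lisa;maggie", number_of_members => 5 },
);

# Return up to $n family names, largest families first.
sub top_families {
    my ($hoh, $n) = @_;
    my @sorted = sort {
        $hoh->{$b}{number_of_members} <=> $hoh->{$a}{number_of_members}
    } keys %$hoh;
    $n = @sorted if $n > @sorted;   # never slice past the end of the list
    return @sorted[ 0 .. $n - 1 ];
}

print join("\t", $_, $HoH{$_}{family_members}), "\n" for top_families(\%HoH, 2);
```

Called with 2 this prints the simpsons and flintstones lines shown above; called with a number larger than the hash, it simply returns every family.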