Sort an array of hashes - perl

I have a reference that has the following data structure when dumped:
$VAR1 = [
{
'0' => 0
},
{
'1' => 1
},
{
'-1' => 2
},
{
'2' => 3
},
];
I am trying to loop over them and eventually sort by key. Here is an example of my code:
use strict;
use warnings;
use Data::Dumper;
my $skew_ref;
push @{$skew_ref}, { 0 => 0, 1 => 1, -1 => 2, 2 => 3, };
my @sorted;
for my $ref ( @{$skew_ref} ) {
while ( my ($k, $v ) = each %{$ref} ) {
print "$k => $v\n";
}
@sorted = sort { %{$b} <=> %{$a} } keys %{$ref};
}
print Dumper(\@sorted);
What am I doing incorrectly? I want the smallest key value and it is giving me the largest.
The output should just be 2 in this case.

use List::Util qw( min );
my $skews = { 0 => 0, 1 => 1, -1 => 2, 2 => 3 };
my $val = $skews->{ min keys %$skews };
Contrary to your implications, there cannot be more than one result since a hash cannot have two elements with the same key.

my @sorted = map $_->[0],
sort { $a->[1] <=> $b->[1] }
map [ $_, keys %$_ ], @arr;

Answering your direct question: you swapped $a and $b in the sort block. The keys are plain strings, not hash references, so compare them directly instead of dereferencing them:
@sorted = sort { $a <=> $b } keys %{$ref};
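As a minimal sketch of the numeric key sort discussed above, assuming the four-key hash from the question's code (the variable names `@sorted_keys`, `$smallest_key` and `$value` are illustrative, not from the original):

```perl
use strict;
use warnings;

# Same data as in the question's code.
my $skews = { 0 => 0, 1 => 1, -1 => 2, 2 => 3 };

# Sort the keys numerically, ascending, so the smallest key comes first.
my @sorted_keys  = sort { $a <=> $b } keys %$skews;
my $smallest_key = $sorted_keys[0];
my $value        = $skews->{$smallest_key};

print "$smallest_key => $value\n";   # prints "-1 => 2"
```

Sorting is more work than needed if only the minimum is wanted, which is why List::Util's min is the shorter route for this particular question.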

Related

custom sort method for hashes which will automatically use the appropriate hash

My hash contains binary numbers as keys:
my %h = ("1010" => 1, "1110" => 0, "0001" => 3, "1100" => 2);
In perl I can use custom function for sorting hash. This is my function for sorting binary numbers from lowest to largest:
sub sort_binary_numbers {
my $a_dec = oct("0b".$a);
my $b_dec = oct("0b".$b);
return $a_dec <=> $b_dec;
}
I can sort hash using this function following way:
print Dumper sort sort_binary_numbers keys %h;
And the result will be:
$VAR1 = '0001';
$VAR2 = '1010';
$VAR3 = '1100';
$VAR4 = '1110';
I want to sort hash using values not keys. I can do following:
print Dumper sort { $h{$b} <=> $h{$a} } keys %h;
As you can see, I have to use the hash name in the sorting block. The problem is how to rewrite this sorting block as a function (as in the examples above) and automatically get the appropriate hash name inside the function. I've tried to access the hash using @_, but nothing was printed, e.g.
sub sort_by_value {
print Dumper @_; # This was not printed
print ref @_; # This was not printed
return $b <=> $a;
}
And call it following way:
print Dumper sort sort_by_value keys %h;
The interesting part is that when I wrap this sorting in to another function and call it in loop from this function I will get the output of data dumper that was previously missing (but I still did not get the output of ref command):
sub calling_from_function {
my %h = %{$_[0]};
foreach my $key (sort sort_by_value keys %h){
}
}
&calling_from_function(\%h);
Then I get this output:
$VAR1 = {
'0001' => 3,
'1010' => 1,
'1110' => 0,
'1100' => 2
};
$VAR1 = {
'0001' => 3,
'1010' => 1,
'1110' => 0,
'1100' => 2
};
$VAR1 = {
'0001' => 3,
'1010' => 1,
'1110' => 0,
'1100' => 2
};
$VAR1 = {
'0001' => 3,
'1010' => 1,
'1110' => 0,
'1100' => 2
};
Questions:
How can I replace the sorting block in this command print Dumper sort { $h{$b} <=> $h{$a} } keys %h; with a function and get the appropriate hash inside the sorting function?
Why does wrapping it in another function work?
Why does ref not work?

The sorting subroutine normally doesn't receive its parameters through @_ (unless prototypes are involved), but through $a and $b. ref @array can never return anything useful, as an array is never a reference.
Wrapping by another function works because calling the wrapper populates @_ with the wrapper's parameters, which the sort subroutine then sees.
Use a wrapper to sort any hash:
sub sort_by_value {
my %h = @_;
return sort { $h{$b} <=> $h{$a} } keys %h
}
print Dumper(sort_by_value(%h));
You can also send the hash reference to the subroutine:
sub sort_by_value {
my ($h) = @_;
return sort { $h->{$b} <=> $h->{$a} } keys %$h
}
print Dumper sort_by_value(\%h);
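The remark about prototypes can be illustrated with a small sketch: when a named sort subroutine is declared with a ($$) prototype, sort passes the two elements being compared in @_ rather than in $a and $b. The subroutine name `by_binary_value` is made up for this example; the keys are the ones from the question.

```perl
use strict;
use warnings;

# With a ($$) prototype, the two elements to compare arrive in @_.
sub by_binary_value($$) {
    my ($left, $right) = @_;
    return oct("0b$left") <=> oct("0b$right");
}

my @sorted = sort by_binary_value qw(1010 1110 0001 1100);
print "@sorted\n";   # prints "0001 1010 1100 1110"
```

This still does not hand the hash to the comparator; it only changes how the two elements are delivered, so a wrapper or closure is needed to sort by value.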

So you want to have a generic sorting function such as
my $sorter = sub { $_[0]{$b} <=> $_[0]{$a} };
When it comes time to sort, just use
my @sorted_keys = sort { $sorter->(\%h) } keys(%h);
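A related variation, sketched here under the assumption that everything lives in the same package (so the closure and sort share the same $a and $b): build the comparator with a small factory that closes over the hash reference, then pass the code reference directly to sort. The name `make_value_sorter` is hypothetical.

```perl
use strict;
use warnings;

# Returns a comparator that orders keys by their value in $hash_ref, descending.
sub make_value_sorter {
    my ($hash_ref) = @_;
    return sub { $hash_ref->{$b} <=> $hash_ref->{$a} };
}

my %h = ("1010" => 1, "1110" => 0, "0001" => 3, "1100" => 2);
my $by_value_desc = make_value_sorter(\%h);

# sort accepts a scalar holding a code reference in place of a sub name.
my @sorted_keys = sort $by_value_desc keys %h;
print "@sorted_keys\n";   # prints "0001 1100 1010 1110"
```

The hash is named once, when the comparator is created, instead of at every sort call.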

You can use hash as a list, convert it to k/v aref pairs, perform sort on values (second element), and pick keys from sorted list (roughly it is Schwartzian transform in disguise).
use strict;
use warnings;
use List::Util 'pairs';
my %h = ("1010" => 1, "1110" => 0, "0001" => 3, "1100" => 2);
my @k = map $_->[0],
sort { $b->[1] <=> $a->[1] }
pairs %h;
without additional modules,
my @k = map $_->[0],
sort { $b->[1] <=> $a->[1] }
map [ $_, $h{$_} ],
keys %h;

List array find double and add value

The original Perl array is sorted and looks like this:
Original ARRAY:
ccc-->2
ccc-->5
abc-->3
abc-->7
cb-->6
and I would like to have the following result:
FINAL ARRAY:
ccc-->7
abc-->10
cb-->6
Question:
Can you please create a subroutine for that?
This was the original subroutine that I used:
sub read_final_dev_file {
$dfcnt=0;
$DEVICE_ANZSUMZW=0;
$DEVICE_ANZSUM=0;
open(DATA,"$log_dir1/ALLDEVSORT.$log_file_ext1") || die ("Cannot Open Logfile: $log_dir1/$log_DEV_name.$log_file_ext1 !!!!");
@lines = <DATA>;
close(DATA);
chomp(@lines); # remove the trailing newline from each line
foreach $logline (@lines) {
if ($logline =~ /(.*)-->(.*)/) {
$DEVICE_CODE[$dfcnt] = $1;
$DEVICE_ANZAHL[$dfcnt] = $2;
print "DEVICE_final = $DEVICE_CODE[$dfcnt], D_ANZAHL_final = $DEVICE_ANZAHL[$dfcnt]\n";
if ($dfcnt > 0 ) {
if ( $DEVICE_CODE[$dfcnt] eq $DEVICE_CODE[$dfcnt-1] ) {
$DEVICE_ANZSUM = $DEVICE_ANZAHL[$dfcnt] + $DEVICE_ANZAHL[$dfcnt-1];
$DEVICE_ANZSUMZW = $DEVICE_ANZSUM++;
#$DEVICE_ANZSUM = $DEVICE_ANZAHL[$dfcnt]++;
#print "DEVICE_ANZAHL = $DEVICE_ANZAHL[$dfcnt],DEVICE_ANZAHL -1 = $DEVICE_ANZAHL[$dfcnt-1]\n";
print "DEVICE_eq = $DEVICE_CODE[$dfcnt], D_ANZAHL_eq = $DEVICE_ANZAHL[$dfcnt],DEVANZSUM = $DEVICE_ANZSUM,COUNT = $dfcnt\n";
}#end if
if ( $DEVICE_CODE[$dfcnt] ne $DEVICE_CODE[$dfcnt-1] ) {
#$DEVICE_ANZSUM=0;
#splice(@data3,$dfcnt+2,1) if ($DEVICE_ANZSUM > 1);
push (@data3,$DEVICE_ANZSUMZW) if ($DEVICE_ANZSUM > 1);
push (@data3,$DEVICE_ANZAHL[$dfcnt]) if ($DEVICE_ANZSUM == 0);
if ( $DEVICE_CODE[$dfcnt] ne $DEVICE_CODE[$dfcnt-1] ) {
$DEVICE_ANZSUM=0;
}
print "DEVICE_ne = $DEVICE_CODE[$dfcnt], D_ANZAHL_ne = $DEVICE_ANZAHL[$dfcnt], DEVANZSUM = $DEVICE_ANZSUM\n";
}#end if
}#end if $dfcnt
$dfcnt++;
}#end if logline
}#end for
print "@labels3\n";
print "@data3\n";
}#end sub read_final_dev_file

Probably not the best way, but this is what came to mind after seeing LeoNerd's answer, since I don't have CPAN access in production and never have modules lying around:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @input = (
[ ccc => 2 ],
[ ccc => 5 ],
[ abc => 3 ],
[ abc => 7 ],
[ cb => 6 ],
);
my %output;
$output{$_->[0]} += $_->[1] for @input;
print Dumper \%output;
my @output = map { [ $_ => $output{$_} ] } keys(%output);
print Dumper \@output;
Output:
$VAR1 = {
'abc' => 10,
'cb' => 6,
'ccc' => 7
};
$VAR1 = [
['abc', 10],
['cb', 6],
['ccc', 7],
];

You could use List::UtilsBy::partition_by to group the original list into partitions, by the first string:
use List::UtilsBy qw( partition_by );
my @input = (
[ ccc => 2 ],
[ ccc => 5 ],
[ abc => 3 ],
[ abc => 7 ],
[ cb => 6 ],
);
my %sets = partition_by { $_->[0] } @input;
Now you have a hash, keyed by the leading strings, whose values are all the ARRAY refs with that key first. You can now sum the values within them, by mapping over $_->[1] which contains the numbers:
use List::Util qw( sum );
my %totals;
foreach my $key ( keys %sets ) {
$totals{$key} = sum map { $_->[1] } @{ $sets{$key} };
}
If you're inclined towards code of a more compact and functional-looking nature, you could instead use the new pairmap here; making the whole thing expressible in one line:
use List::UtilsBy qw( partition_by );
use List::Util qw( pairmap sum );
my %totals = pairmap { $a => sum map { $_->[1] } @$b }
partition_by { $_->[0] } @input;
Edit: I should add that even though you stated in your original question that the array was sorted, this solution doesn't require it sorted. It will happily take the input in any order.

You can simplify your subroutine a lot by using a hash to track the counts instead of an array. The following uses an array @devices to track the order and a hash %device_counts to track the counts:
my @devices;
my %device_counts;
while (<DATA>) { # Read one line at a time from DATA
if (/(.*)-->(.*)/) { # This won't extract newlines so no need to chomp
if (!exists $device_counts{$1}) {
push @devices, $1; # Add to the array the first time we encounter a device
}
$device_counts{$1} += $2; # Add to the count for this device
}
}
for my $device (@devices) {
printf "%s-->%s\n", $device, $device_counts{$device};
}
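Since the question asked for a subroutine, the hash-plus-order-array idea above could be wrapped up roughly like this. It is a sketch, not a drop-in replacement for read_final_dev_file: the name `sum_by_name` is hypothetical, the input is an in-memory list instead of the log file, and counts are assumed to be non-negative integers.

```perl
use strict;
use warnings;

# Hypothetical helper: takes "name-->count" strings and returns one
# "name-->total" string per name, in order of first appearance.
sub sum_by_name {
    my @lines = @_;
    my (@order, %total);
    for my $line (@lines) {
        # Skip lines that do not look like "name-->digits".
        my ($name, $count) = $line =~ /^(.*)-->(\d+)$/ or next;
        push @order, $name unless exists $total{$name};
        $total{$name} += $count;
    }
    return map { sprintf('%s-->%s', $_, $total{$_}) } @order;
}

my @final = sum_by_name('ccc-->2', 'ccc-->5', 'abc-->3', 'abc-->7', 'cb-->6');
print "$_\n" for @final;   # ccc-->7, abc-->10, cb-->6 (one per line)
```

Like the hash-based answers, this does not rely on the input being sorted.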

Sorting and printing a hash of hashes in Perl

Using Perl, I have a HoH similar to this:
%HoH = (
'A' => {
'a' => 4,
'b' => 18,
'c' => 2
},
'B' => {
'a' => 1,
'b' => 2
},
'C' => {
'a' => 1
},
'D' => {
'a' => 1,
'b' => 2,
'c' => 5,
'd' => 9
},
#........ on and on and on .....
);
For each of the capital keys, I want to print the one lower-case key that has the largest value associated with it.
example output:
b,b,a,d...
Any direction at this point would be appreciated, new to the game.

use List::Util qw(reduce);
for my $k1 (sort keys %HoH) {
my $h = $HoH{$k1};
my $k2 = reduce { $h->{$a} > $h->{$b} ?$a :$b } keys %$h;
print "$k1, $k2\n";
}

For example:
for my $k (sort keys %HoH) {
my $h = $HoH{$k};
my $g= (sort {$h->{$b} <=> $h->{$a}} keys %$h)[0];
print "$k: $g \n";
}
(Your original output does not make much sense, because the order of the keys of %HoH is not fixed.)
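To get the comma-separated output from the question, the sort-based approach can be combined with join. This is a sketch using the four sample entries from the question; the tie-break on key name (`or $a cmp $b`) is an added assumption so that equal values give a stable result.

```perl
use strict;
use warnings;

my %HoH = (
    A => { a => 4, b => 18, c => 2 },
    B => { a => 1, b => 2 },
    C => { a => 1 },
    D => { a => 1, b => 2, c => 5, d => 9 },
);

my @winners;
for my $outer (sort keys %HoH) {
    my $inner = $HoH{$outer};
    # Largest value first; ties are broken alphabetically by key.
    my ($best) = sort { $inner->{$b} <=> $inner->{$a} or $a cmp $b } keys %$inner;
    push @winners, $best;
}
print join(',', @winners), "\n";   # prints "b,b,a,d"
```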

Using List::Util's reduce;
use List::Util qw(reduce);
use strict;
use warnings;
my %HoH = ...
for my $k (sort keys %HoH) {
my $h = $HoH{$k};
my $maxKey = reduce {$h->{$a} > $h->{$b} ? $a : $b} keys %$h;
print "$k -> $maxKey\n";
}

Weighted sort in perl?

I have a hash of hashes where the values are all numerical. I can sort fine using the sort command and or to sort the hash values in order first to last, but what if I want to weight the results instead of it just being in order of keys specified? Is there a way to do that?
EDIT: Ok, here's the code...
my @check_order = ("disk_usage","num_dbs","qps_avg");
my %weights = ( disk_usage => .7,
num_dbs => .4,
qps_avg => .2
);
my @dbs=sort {
($stats{$a}->{$check_order[0]}*$weights{$check_order[0]}) <=>
($stats{$b}->{$check_order[0]}*$weights{$check_order[0]}) or
($stats{$a}->{$check_order[1]}*$weights{$check_order[1]}) <=>
($stats{$b}->{$check_order[1]}*$weights{$check_order[1]}) or
($stats{$a}->{$check_order[2]}*$weights{$check_order[2]}) <=>
($stats{$b}->{$check_order[2]}*$weights{$check_order[2]})
} keys(%stats);

You want to sort the list based on a function value of each element. So use a function in your sort statement.
@sorted = sort { sort_function($a) <=> sort_function($b) } @unsorted;
sub sort_function {
my ($input) = @_;
# With fixed weights this could simply be:
# return $input->{disk_usage} * 0.7
# + $input->{num_dbs} * 0.4
# + $input->{qps_avg} * 0.2;
# -or- more generally, using the %weights hash
my $value = 0;
while (my ($key,$weight) = each %weights) {
$value += $input->{$key} * $weight;
}
return $value;
}
When your sort function is expensive and there are many items to be sorted, a Schwartzian transform can improve the performance of your sort:
@sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, sort_function($_) ] }
@unsorted;
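Putting the pieces together, a self-contained sketch of the weighted sort might look like the following. The contents of %stats are invented sample data (the question does not show them), and `weighted_score` is an illustrative name; the weights are the ones from the question.

```perl
use strict;
use warnings;

my %weights = ( disk_usage => 0.7, num_dbs => 0.4, qps_avg => 0.2 );

# Hypothetical sample data; the real %stats is not shown in the question.
my %stats = (
    db1 => { disk_usage => 10, num_dbs => 5, qps_avg => 100 },
    db2 => { disk_usage => 50, num_dbs => 1, qps_avg => 10 },
    db3 => { disk_usage => 5,  num_dbs => 2, qps_avg => 20 },
);

# Weighted sum of all fields listed in %weights.
sub weighted_score {
    my ($row) = @_;
    my $score = 0;
    $score += $row->{$_} * $weights{$_} for keys %weights;
    return $score;
}

# Schwartzian transform: compute each score once, sort on it, keep the names.
my @dbs = map  { $_->[0] }
          sort { $a->[1] <=> $b->[1] }
          map  { [ $_, weighted_score($stats{$_}) ] }
          keys %stats;

print "@dbs\n";   # prints "db3 db1 db2" for this sample data
```

Note that this combines all fields into a single score, which is different from the question's chained `or` comparison, where later fields only act as tie-breakers.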

If your weights are stored in another hash %property
This will sort hash keys based on the product $hash{key} * $property{key}
#!/usr/bin/perl
use strict;
use warnings;
my %hash = (
a => 51,
b => 61,
c => 71,
);
my %property = ( a => 7, b => 6, c => 5 );
foreach (sort { ($hash{$a}*$property{$a}) <=>
($hash{$b}*$property{$b}) } keys %hash)
{
printf("[%d][%d][%d]\n",
$hash{$_},$property{$_},$hash{$_}*$property{$_});
}

Return all hash key/value pairs with maximum value

I have a hash (in Perl) where the values are all numbers. I need to create another hash that contains all key/value pairs from the first hash where the value is the maximum of all values.
For example, given
my %hash = (
key1 => 2,
key2 => 6,
key3 => 6,
);
I would like to create a new hash containing:
%hash_max = (
key2 => 6,
key3 => 6,
);
I'm sure there are many ways to do this, but am looking for an elegant solution (and an opportunity to learn!).

use List::Util 'max';
my $max = max(values %hash);
my %hash_max = map { $hash{$_}==$max ? ($_, $max) : () } keys %hash;
Or a one-pass approach (similar to but slightly different from another answer):
my $max;
my %hash_max;
keys %hash; # reset iterator
while (my ($key, $value) = each %hash) {
if ( !defined $max || $value > $max ) {
%hash_max = ();
$max = $value;
}
$hash_max{$key} = $value if $max == $value;
}

This makes one pass over the data, but wastes a lot of hash writes:
use strict;
use warnings;
my %hash = (
key1 => 2,
key2 => 6,
key3 => 6,
);
my %hash_max = ();
my $max;
foreach my $key (keys %hash) {
if (!defined($max) || $max < $hash{$key} ) {
%hash_max = ();
$max = $hash{$key};
$hash_max{$key} = $hash{$key};
}
elsif ($max == $hash{$key}) {
$hash_max{$key} = $hash{$key};
}
}
foreach my $key (keys %hash_max) {
print "$key\t$hash_max{$key}\n";
}

# sort numerically descending
my @topkey = sort {$hash{$b} <=> $hash{$a}} keys %hash;
Then copy the top values to %hash_max, with a loop terminator after the last max value:
for my $key (@topkey) {
if ($hash{$key} == $hash{$topkey[0]}) {
$hash_max{$key} = $hash{$key}
} else { last }
}
ETA: Note to the unbelievers that last works because the keys in @topkey are sorted by value, so we can break out of the loop as soon as a value differs from the first one, i.e. all the following values are lower.
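Another way to write the two-pass approach, sketched with grep and a hash slice instead of map or an explicit loop (the variable `@max_keys` is illustrative):

```perl
use strict;
use warnings;
use List::Util qw(max);

my %hash = ( key1 => 2, key2 => 6, key3 => 6 );

# First pass: find the maximum value.
my $max = max values %hash;

# Second pass: collect the keys holding that value, then copy them with a slice.
my @max_keys = grep { $hash{$_} == $max } keys %hash;
my %hash_max;
@hash_max{@max_keys} = @hash{@max_keys};

print "$_ => $hash_max{$_}\n" for sort keys %hash_max;   # key2 => 6, key3 => 6
```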
