Sort Perl hash from largest to smallest

I am looking at an example found here: http://perlmeme.org/tutorials/sort_function.html
And it gives this code to sort a hash based on each key's value:
# Using <=> instead of cmp because of the numbers
foreach my $fruit (sort {$data{$a} <=> $data{$b}} keys %data) {
    print $fruit . ": " . $data{$fruit} . "\n";
}
This code I do not fully understand, but when I experiment with it, it sorts from lowest to highest. How can I flip it to sort from highest to lowest?

Just use reverse sort instead of sort, keeping the comparison block so the keys are still ordered by value:
foreach my $fruit (reverse sort { $data{$a} <=> $data{$b} } keys %data) { ...

Swap $a and $b:
foreach my $fruit (sort {$data{$b} <=> $data{$a}} keys %data) {
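For a complete picture, here is a minimal, self-contained sketch of the swapped-operand approach. The contents of %data and the name @by_value_desc are made up for illustration. Inside the block, sort sets $a and $b to the two keys being compared, and <=> returns -1, 0, or 1 depending on how their values compare. Putting $b on the left is what reverses the order.

```perl
use strict;
use warnings;

# Hypothetical sample data: fruit name => count.
my %data = (apples => 3, bananas => 12, cherries => 7);

# $a and $b are the two keys being compared; looking up $data{...}
# makes the comparison use the values. $b on the left gives
# highest-to-lowest order.
my @by_value_desc = sort { $data{$b} <=> $data{$a} } keys %data;

foreach my $fruit (@by_value_desc) {
    print "$fruit: $data{$fruit}\n";
}
```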

Related

Perl inline subroutines and conditional operator for sort

So I have the following code which works:
my $cmp;
if ( $action eq DEL ) {
    $cmp = \&cmpb;
}
else {
    $cmp = \&cmpf;
}
foreach my $x ( sort $cmp keys %y ) {
    # do something
}
And cmpb and cmpf here are:
sub cmpf { $a cmp $b }
sub cmpb { $b cmp $a }
Now my question is I'd rather have something like:
foreach my $x ( sort $action eq DEL ? \&cmpb : \&cmpf keys %y ) {
    # do something
}
Or even better:
foreach my $x ( sort $action eq DEL ? { $a cmp $b } : { $b cmp $a } keys %y ) {
    # do something
}
So two questions. First, what is the correct way to have those functions inline, and second, why don't the above work?
Consider also
foreach my $x ($action eq DEL ? reverse sort keys %y : sort keys %y) {
which is pretty compact and pretty readable. Perl optimizes reverse sort by inverting all of the comparisons; it doesn't sort the list one way and then reverse it.
You could put the ternary operator inside the sort function.
foreach my $x ( sort {$action eq DEL ? $a cmp $b : $b cmp $a ;} keys %y ) {
    # do something
}
See the documentation for sort (perldoc -f sort). You can put any code you want within the { }.
@ysth pointed out that this would be faster, since "basic types of sorting (ascending, descending, numeric ascending, numeric descending) are optimized to not actually call into the perl code for comparisons" (ysth's comment):
foreach my $x ( $action eq DEL ? sort { $b cmp $a } keys %y : sort { $a cmp $b } keys %y )
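As for the first question, one way to write the comparators inline is to use anonymous subs. In place of the block, sort accepts a subroutine name or a simple scalar variable holding a code reference, but not an arbitrary expression, which is presumably why a ternary in that position does not parse as intended. A rough sketch, in which the 'DEL' string and the contents of $action and %y are stand-ins for the asker's values:

```perl
use strict;
use warnings;

# Stand-ins for the asker's data (assumptions for illustration).
my $action = 'DEL';
my %y = (pear => 1, apple => 2, fig => 3);

# Pick an anonymous comparison sub once, outside the sort call.
my $cmp = $action eq 'DEL'
    ? sub { $b cmp $a }    # descending, like cmpb
    : sub { $a cmp $b };   # ascending, like cmpf

my @ordered = sort $cmp keys %y;
print "$_\n" for @ordered;
```

This keeps the choice of direction in one place, at the cost of calling into Perl code for every comparison, so the ternary-around-sort form above is likely faster on large lists.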

help understanding perl hash

Perl newbie here. I had help writing this working Perl script, which uses some hash code, and I need help understanding that code. Could it be written in a way that makes the use of hashes easier to understand or visualize?
In summary, the script uses a regex to filter on the date, and the rest of the regex pulls out the data related to that date.
use strict;
use warnings;
use constant debug => 0;
my $mon = 'Jul';
my $day = 28;
my $year = 2010;
my %items = ();
while (my $line = <>)
{
    chomp $line;
    print "Line: $line\n" if debug;
    if ($line =~ m/(.* $mon $day) \d{2}:\d{2}:\d{2} $year: ([a-zA-Z0-9._]*):.*/)
    {
        print "### Scan\n" if debug;
        my $date = $1;
        my $set = $2;
        print "$date ($set): " if debug;
        $items{$set}->{'a-logdate'} = $date;
        $items{$set}->{'a-dataset'} = $set;
        if ($line =~ m/(ERROR|backup-date|backup-size|backup-time|backup-status)[:=](.+)/)
        {
            my $key = $1;
            my $val = $2;
            $items{$set}->{$key} = $val;
            print "$key=$val\n" if debug;
        }
    }
}
print "### Verify\n";
for my $set (sort keys %items)
{
    print "Set: $set\n";
    my %info = %{$items{$set}};
    for my $key (sort keys %info)
    {
        printf "%s=%s;", $key, $info{$key};
    }
    print "\n";
}
What I am trying to understand is these lines:
$items{$set}->{'a-logdate'} = $date;
$items{$set}->{'a-dataset'} = $set;
And again couple lines down:
$items{$set}->{$key} = $val;
Is this an example of hash reference? hash of hashes?
I guess I'm confused by the use of {$set} :-(
%items is a hash of hash references (conceptually, a hash of hashes). $set is the key into %items and then you get back another hash, which is being added to with keys 'a-logdate' and 'a-dataset'.
(corrected based on comments)
Lou Franco's answer is close, with one minor typographical error—the hash of hash references is %items, not $items. It is referred to as $items{key} when you are retrieving a value from %items because the value you are retrieving is a scalar (in this case, a hash reference), but $items would be a different variable.
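To make the structure more visible, here is a small sketch of the same pattern outside the log-parsing loop. The dataset name 'nightly' and the 'OK' status are hypothetical values chosen for illustration; they do not come from the original script.

```perl
use strict;
use warnings;

my %items;

# 'nightly' is a hypothetical dataset name standing in for $set.
my $set = 'nightly';

# $items{$set} is autovivified as a reference to a new anonymous hash
# the first time it is used this way.
$items{$set}->{'a-dataset'}   = $set;
$items{$set}{'backup-status'} = 'OK';   # the arrow between subscripts is optional

# The value stored under $set really is a hash reference.
my $inner = $items{$set};
print ref($inner), "\n";                # HASH

# Dereference it to walk the inner keys.
for my $key (sort keys %$inner) {
    print "$key=$inner->{$key}\n";
}
```

So {$set} selects one entry of the outer hash %items, and the second pair of braces selects an entry of the inner hash that the entry refers to.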

What's wrong with this statement in Perl?

print "$_", join(',',sort keys %$h),"\n";
It gives me the warning below:
Use of uninitialized value in string at missing_months.pl line 36.
1,10,11,12
This print statement is inside a foreach loop, as below:
foreach my $num ( sort keys %hash )
{
    my $h = $hash{$num};
    print "$_", join(',',sort keys %$h),"\n";
}
No need for the "$_". That line should be:
print join (',' , sort {$a <=> $b} keys %$h),"\n";
While $_ is the default iterator in for and foreach loops (see perlvar), you've already named the iterator variable $num, so $_ is never set inside your loop and is uninitialized when you print it.
Here is how to use the $_ correctly in a single line:
print join(',', sort { $a <=> $b } keys %{$hash{$_}}),"\n" foreach keys %hash;
On a Side Note...
sort uses string comparison by default, meaning that '10' is deemed to come before '2'. It seems that you're dealing with months (perhaps?), which is why I've used the numerical comparison block { $a <=> $b }.
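The difference between the two comparisons can be seen with a short sketch. The %months hash below is a made-up stand-in for the inner hash in the question.

```perl
use strict;
use warnings;

# Hypothetical month numbers stored as hash keys.
my %months = map { $_ => 1 } (1, 2, 10, 11, 12);

my @as_strings = sort keys %months;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } keys %months;   # numeric comparison

print join(',', @as_strings), "\n";   # 1,10,11,12,2
print join(',', @as_numbers), "\n";   # 1,2,10,11,12
```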

How can I get to my anonymous arrays in Perl?

The following code generates a list of the average number of clients connected by subnet. Currently I have to pipe it through sort | uniq | grep -v HASH.
Trying to keep it all in Perl, this doesn't work:
foreach $subnet (keys %{keys %{keys %days}}) {
print "$subnet\n";
}
The source is this:
foreach $file (@ARGV) {
    open(FH, $file) or warn("Can't open file $file\n");
    if ($file =~ /(2009\d{4})/) {
        $dt = $+;
    }
    %hash = {};
    while(<FH>) {
        @fields = split(/~/);
        $subnet = $fields[0];
        $client = $fields[2];
        $hash{$subnet}{$client}++;
    }
    close(FH);
    $file = "$dt.csv";
    open(FH, ">$file") or die("Can't open $file for output");
    foreach $subnet (sort keys %hash) {
        $tot = keys(%{$hash{$subnet}});
        $days{$dt}{$subnet} = $tot;
        print FH "$subnet, $tot\n";
        push @{$subnet}, $tot;
    }
    close(FH);
}
foreach $day (sort keys %days) {
    foreach $subnet (sort keys %{$days{$day}}) {
        $tot = $i = 0;
        foreach $amt (@{$subnet}) {
            $i++;
            $tot += $amt;
        }
        print "$subnet," . int($tot/$i) . "\n";
    }
}
How can I eliminate the need for the sort | uniq process outside of Perl? The last foreach gets me the subnet ids which are the 'anonymous' names for the arrays. It generates these multiple times (one for each day that subnet was used).
"... but this seemed easier than combining spreadsheets in Excel."
Actually, modules like Spreadsheet::ParseExcel make that really easy, in most cases. You still have to deal with rows as if from CSV or the "A1" type addressing, but you don't have to do the export step. And then you can output with Spreadsheet::WriteExcel!
I've used these modules to read a spreadsheet of a few hundred checks, sort and arrange and mung the contents, and write to a new one for delivery to an accountant.
In this part:
foreach $subnet (sort keys %hash) {
    $tot = keys(%{$hash{$subnet}});
    $days{$dt}{$subnet} = $tot;
    print FH "$subnet,$tot\n";
    push @{$subnet}, $tot;
}
$subnet is a string, but you use it in the last statement as an array reference. Since you don't have strictures on, it treats it as a soft reference to a variable with the name the same as the content of $subnet. Which is okay if you really want to, but it's confusing. As for clarifying the last part...
Update: I'm guessing this is what you're looking for, where the subnet value is only saved if it hasn't appeared before, even from another day (?):
use List::Util qw(sum); # List::Util was first released with perl 5.007003 (5.7.3, I think)
my %buckets;
foreach my $day (sort keys %days) {
    foreach my $subnet (sort keys %{$days{$day}}) {
        next if exists $buckets{$subnet}; # only gives you this value once, regardless of what day it came in
        my $total = sum @{$subnet}; # no need to reuse a variable
        $buckets{$subnet} = int($total/@{$subnet}); # array in scalar context is number of elements
    }
}
use Data::Dumper qw(Dumper);
print Dumper \%buckets;
Building on Anonymous's suggestions, I built a hash of the subnet names to access the arrays:
..
        push @{$subnet}, $tot;
        $subnets{$subnet}++;
    }
    close(FH);
}
use List::Util qw(sum); # List::Util was first released with perl 5.007003
foreach my $subnet (sort keys %subnets) {
    my $total = sum @{$subnet}; # no need to reuse a variable
    print "$subnet," . int($total/@{$subnet}) . "\n"; # array in scalar context is number of elements
}
I am not sure if this is the best solution, but I don't have the duplicates any more.
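Another option, sketched here under the assumption that the per-day totals are already in %days, is to avoid the symbolic references altogether and keep one array of totals per subnet in a hash of arrays. The names %totals_by_subnet and %average and the sample numbers are invented for illustration. Because each subnet is a hash key, it can only appear once, so no sort | uniq step is needed.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical per-day client counts: day => { subnet => count }.
my %days = (
    20090101 => { '10.0.1' => 4, '10.0.2' => 10 },
    20090102 => { '10.0.1' => 6 },
);

# Collect the counts for each subnet in a hash of arrays.
my %totals_by_subnet;
for my $day (sort keys %days) {
    for my $subnet (keys %{ $days{$day} }) {
        push @{ $totals_by_subnet{$subnet} }, $days{$day}{$subnet};
    }
}

# One average per subnet.
my %average;
for my $subnet (sort keys %totals_by_subnet) {
    my @counts = @{ $totals_by_subnet{$subnet} };
    $average{$subnet} = int(sum(@counts) / @counts);
    print "$subnet,$average{$subnet}\n";
}
```

This version also runs under use strict, which the symbolic-reference approach does not.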

How can I sort a hash's keys naturally?

I have a Perl hash whose keys start with, or are, numbers.
If I use,
foreach my $key (sort keys %hash) {
    print $hash{$key} . "\n";
}
the list might come out as,
0
0001
1000
203
23
Instead of
0
0001
23
203
1000
foreach my $key (sort { $a <=> $b} keys %hash) {
    print $hash{$key} . "\n";
}
The sort operation takes an optional comparison "subroutine" (either as a block of code, as I've done here, or the name of a subroutine). I've supplied an in-line comparison that treats the keys as numbers using the built-in numeric comparison operator '<=>'.
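As a quick check, here is a small sketch that applies the numeric comparison to the keys from the question. The hash values are placeholders invented for the example.

```perl
use strict;
use warnings;

# Sample keys taken from the question; the values are made up.
my %hash = map { $_ => "value for $_" } qw(0 0001 1000 203 23);

# <=> compares the keys as numbers, so '0001' is treated as 1.
my @numeric = sort { $a <=> $b } keys %hash;
print "$_\n" for @numeric;
```

Note that keys which are numerically equal, such as 1 and 0001, compare as ties, so this comparison alone does not determine their relative order.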
Paul's answer is correct for numbers, but if you want to take it a step further and sort mixed words and numbers like a human would, neither cmp nor <=> will do. For example,
9x
14
foo
fooa
foolio
Foolio
foo12
foo12a
Foo12a
foo12z
foo13a
Sort::Naturally takes care of this problem, providing the nsort and ncmp routines.
Your first problem is the body of the loop (which no other answer here seems to point out).
foreach my $key ( sort keys %hash ) {
    print $hash{$key} . "\n";
}
We don't know what the keys of %hash are. We just know that they are handed to you as $key, in lexical order, inside the loop. You then use the keys to access the contents of the hash, printing each entry.
The values of the hash do not come out in a sorted order, because you sort on the keys.
If you instead want to output the values in sorted order, consider the following loop:
foreach my $value ( sort values(%hash) ) {
    printf( "%s\n", $value );
}
This loop does print the values in the order you observe:
0
0001
1000
203
23
To sort them numerically instead, use
foreach my $value ( sort { $a <=> $b } values(%hash) ) {
    printf( "%s\n", $value );
}
This produces
0
0001
23
203
1000
which is what you wanted.
See the Perl manual for the sort function for further information and many more examples.
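foreach my $key (sort { $a <=> $b } keys %hash)
will do the trick.
Or, for a descending sort:
foreach my $key (sort { $b <=> $a } keys %hash)
Or even, to sort the values instead of the keys:
foreach my $value (sort { $a <=> $b } values %hash)
foreach my $value (sort { $b <=> $a } values %hash)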