How to sort perl hash keys [duplicate] - perl

I have a hash which looks like this
my %hash = (
'124:8' => '',
'4:2' => '',
'17:11' => '',
'17:0' => '',
#and so on
);
I tried to sort the hash keys from the smallest number to the largest and use them in that order:
for my $keys ( sort { $a > $b } keys %hash ) {
#do stuff
}
This gives a result that looks correct at first, but it sometimes fails. I don't know how to compare two keys such as 124:8 and 4:2, since each has a : in the middle. Any suggestions?

You might want to sort on first and second number delimited by :
my @sorted = sort {
my ($aa, $bb) = map [ split /:/ ], $a, $b;
$aa->[0] <=> $bb->[0] || $aa->[1] <=> $bb->[1]
} keys %hash;
for my $key (@sorted) { .. }
Using a Schwartzian transform:
my @sorted = map $_->[0],
sort {
$a->[1] <=> $b->[1] || $a->[2] <=> $b->[2]
}
map [ $_, split /:/ ],
keys %hash;

When you sort numbers, you use the <=> operator:
for my $key (sort { $a <=> $b } keys %hash) {
This operator returns 1, 0 or -1 depending on the comparison. > only returns true or false, which explains it working with some results, but not all.
Because your keys are not numbers, they will only partially convert to numbers, and you will get warnings
Argument "17:11" isn't numeric in sort
So you will need to use something like Sort::Key::Natural, or roll your own, such as:
sort {
my @a = $a =~ /\d+/g;
my @b = $b =~ /\d+/g;
$a[0] <=> $b[0] ||
$a[1] <=> $b[1] # continue as long as needed
} keys %hash
You may also use a Schwartzian transform to cache the numbers and possibly speed up the sort.
Or just sort by string comparison, though this will cause 17:11 to end up after 17:2.
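As a rough sketch of the field-by-field comparison described above, generalized to any number of colon-separated fields. The helper name by_fields and the extra key 17:2 are illustrative additions, not part of the question:

```perl
use strict;
use warnings;

# Sample data; '17:2' is a hypothetical extra key added for illustration.
my %hash = (
    '124:8' => '',
    '4:2'   => '',
    '17:11' => '',
    '17:0'  => '',
    '17:2'  => '',
);

# Compare two "N:M:..." keys numerically, one field at a time.
sub by_fields {
    my ($x, $y) = @_;
    my @x = split /:/, $x;
    my @y = split /:/, $y;
    while (@x && @y) {
        my $cmp = shift(@x) <=> shift(@y);
        return $cmp if $cmp;    # first differing field decides
    }
    return @x <=> @y;           # otherwise the key with fewer fields sorts first
}

my @sorted = sort { by_fields($a, $b) } keys %hash;
print "@sorted\n";    # 4:2 17:0 17:2 17:11 124:8
```

Splitting inside the comparator repeats work for every comparison; for large hashes the Schwartzian transform mentioned above avoids that.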

Not as elegant as the solutions above, but what about converting the : to . and comparing the keys as floating-point numbers? No arithmetic is performed, so there are no rounding errors, and the following could work:
my %tmp = map { (my $x = $_) =~ s/:/./; $_,$x} keys %hash;
my @sortedkeys = sort { $tmp{$a} <=> $tmp{$b} } keys %tmp;
#4:2 17:0 17:11 124:8
Or is this approach wrong?
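One way to check is to try keys whose second number has a different digit count. In the sketch below the key 17:2 is a made-up addition to the question's data: 17:11 becomes 17.11, which is numerically smaller than 17.2, so the float comparison puts 17:11 ahead of 17:2, while a field-by-field comparison does not.

```perl
use strict;
use warnings;

# Sample keys; '17:2' is hypothetical, added to expose the difference.
my @keys = ('17:2', '4:2', '17:11');

# Map each key to its "float" form, as in the answer above.
my %tmp = map { (my $x = $_) =~ s/:/./; ($_ => $x) } @keys;
my @as_float = sort { $tmp{$a} <=> $tmp{$b} } @keys;

# Field-by-field numeric comparison, for contrast.
my @by_field = sort {
    my @x = split /:/, $a;
    my @y = split /:/, $b;
    $x[0] <=> $y[0] || $x[1] <=> $y[1]
} @keys;

print "float:    @as_float\n";    # float:    4:2 17:11 17:2
print "by field: @by_field\n";    # by field: 4:2 17:2 17:11
```

So the float trick seems safe only when every second number has the same number of digits.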

Related

sort hash key in perl with key being tab separated

I have a hash of arrays (perl) whose key is a string joined by tabs.
This is what a key of the hash looks like:
chr"\t"fivep"\t"threep"\t"strand # separated on tab
If the hash is named %output
I want to sort the keys of this hash such that first the sort is done on chr, then on fivep and then on threep.
I tried the below code for sorting:
foreach my $k(sort keys %output){
print join("\t",$k,@{$output{$k}}),"\n";
}
This sorts only on chr, but I want to sort on fivep after that and then on threep.
How can I do that?
I'm going to recommend you do a transform on your keys and then do the custom sort.
for my $k (
map { $_->[0] } # pull out the original key
sort { $a->[1] cmp $b->[1] || $a->[2] cmp $b->[2] || $a->[3] cmp $b->[3] } # do the actual sort
map { [ $_, split /\t/, $_, -1 ] } # split the keys and make the transform
keys %output
) {
print join( "\t", $k, @{ $output{$k} } ), "\n"; # dereference the array stored under the key
}
If the sort block is overly complex, or the same ordering is needed elsewhere in the code, you can factor it out into its own function and pass that function to sort.
I think https://metacpan.org/pod/Sort::Key::Maker does what you want. The following code should work for you.
use Sort::Key::Maker custom_sort => qw(str str str);
my @sorted = custom_sort { (split /\t/, $_, -1)[ 0 .. 2 ] } keys %output;
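Both answers compare every field as a string. If fivep and threep hold numeric coordinates (an assumption; the question does not say), a string comparison would put 200 before 50. A minimal sketch of the same transform with <=> on those two fields, using made-up sample data:

```perl
use strict;
use warnings;

# Hypothetical sample data: keys are chr, fivep, threep, strand joined by tabs.
my %output = (
    "chr1\t200\t300\t+" => [ 'a' ],
    "chr1\t50\t90\t-"   => [ 'b' ],
    "chr2\t10\t20\t+"   => [ 'c' ],
    "chr1\t50\t70\t+"   => [ 'd' ],
);

my @sorted =
    map  { $_->[0] }                          # recover the original key
    sort {
           $a->[1] cmp $b->[1]                # chr: string comparison
        || $a->[2] <=> $b->[2]                # fivep: numeric (assumed)
        || $a->[3] <=> $b->[3]                # threep: numeric (assumed)
    }
    map  { [ $_, split /\t/, $_, -1 ] }       # [ key, chr, fivep, threep, strand ]
    keys %output;

print join( "\t", $_, @{ $output{$_} } ), "\n" for @sorted;
```

Note that the chr field is still compared as a string, so chr10 would sort before chr2.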

how do I sort keys of hash alphanumeric with "=" and "."

I have a hash which has keys as following:
m2-10.10845.857-10.3145.857
m2-10.3145.857-10.42545.857
m2-10.42545.857-10.62845.857
m2-10.62845.857-10.83645.857
m2-11.60745.857-12.11745.857
m2-7.80945.857-8.01645.857
m2-8.01645.857-8.13145.857
m2-8.13145.857-8.24645.857
m2-8.24645.857-8.44345.857
m2-8.44345.857-9.7945.857
m2-9.7945.857-9.90545.857
m2-9.90545.857-10.10845.857
I want to sort them in a way that they appear as following:
m2-11.60745.857-12.11745.857
m2-10.62845.857-10.83645.857
m2-10.42545.857-10.62845.857
m2-10.3145.857-10.42545.857
m2-10.10845.857-10.3145.857
m2-9.90545.857-10.10845.857
m2-9.7945.857-9.90545.857
m2-8.44345.857-9.7945.857
m2-8.24645.857-8.44345.857
m2-8.13145.857-8.24645.857
m2-8.01645.857-8.13145.857
m2-7.80945.857-8.01645.857
I tried with
foreach my $key(sort {$h{$a} cmp $h{$b} } keys %h){
printf FHOUT "$h{$key}\n";
}
But it did not work. How do I do this?
Update:
Just came up with a solution:
my @keys = sort{substr($h{$a},3,8) <=> substr($h{$b},3,8) } keys %h;
print "$_\n" for @keys;
But how can I make it more generic?
Both examples shown in the question sort by comparing values, not keys. I will go by the stated goal, which is to sort by keys. For simplicity, we can think of the problem as sorting an array with the given elements.
It seems that you want to sort by the first number, then by the second, then by the third. The second set of numbers (after the second -) seems to not matter. Also, as it's always m2 I'll disregard it.
For sorting by numbers we use numerical sort (<=>), not lexicographical (cmp).
Then break the first set of numbers into its components and sort
foreach my $key ( sort { by_component($a, $b) } (keys %h) ) {
print "$key\n";
}
sub by_component {
my ($aa, $bb) = @_;
my @num_aa = split /\./, (split /-/, $aa)[1];
my @num_bb = split /\./, (split /-/, $bb)[1];
return
$num_bb[0] <=> $num_aa[0] ||
$num_bb[1] <=> $num_aa[1] ||
$num_bb[2] <=> $num_aa[2];
}
The <=> operator returns 1 if its left operand is greater than the right, 0 if they're equal, or -1 if right is greater. Both 1 or -1 are "true" so the whole expression is true and that value is returned. The 0 is false and then the next condition after || is evaluated -- we sort by the next number if the first pair is equal.
This prints the desired order of keys.
See sort and <=> and cmp operators, and logical operators for their behavior described above.
sort takes any function that returns positional comparators for $a and $b.
I would suggest what you're trying to do is decompose your strings into fields, and compare 'field by field' to sort. I've chosen to 'handle' the string comparison in my example, but as they're all 'm2' prefixed, that's not having any impact on the results. Note - everything is 'descending' order in my example, so 'm1' will be lower down than 'm3', the same as '7' is lower down the order than '11'.
#!/usr/bin/env perl
use strict;
use warnings;
sub custom_sort {
my @a_keys = split /[-\.]/, $a;
my @b_keys = split /[-\.]/, $b;
#check each key until 'array' is empty.
while ( @a_keys and @b_keys ) {
#if both are numeric, compare numerically.
if ( $a_keys[0] =~ /^\d+$/
and $b_keys[0] =~ /^\d+$/ )
{
my $comparison = shift @b_keys <=> shift @a_keys;
#returns a result if it's anything other than '0' (i.e. they differ).
return $comparison if $comparison;
}
else {
#compare stringwise.
my $comparison = shift @b_keys cmp shift @a_keys;
return $comparison if $comparison;
}
}
#fell through all the comparisons, so identical.
return 0;
}
chomp ( my @values = <DATA> );
print join ("\n", sort {custom_sort} @values ), "\n";
__DATA__
m2-10.10845.857-10.3145.857
m2-10.3145.857-10.42545.857
m2-10.42545.857-10.62845.857
m2-10.62845.857-10.83645.857
m2-11.60745.857-12.11745.857
m2-7.80945.857-8.01645.857
m2-8.01645.857-8.13145.857
m2-8.13145.857-8.24645.857
m2-8.24645.857-8.44345.857
m2-8.44345.857-9.7945.857
m2-9.7945.857-9.90545.857
m2-9.90545.857-10.10845.857
This works through the value one 'field' at a time and compares them, moving on to the next field if the 'comparison' is zero (i.e. they're equal).
Which gives as output:
m2-11.60745.857-12.11745.857
m2-10.62845.857-10.83645.857
m2-10.42545.857-10.62845.857
m2-10.10845.857-10.3145.857
m2-10.3145.857-10.42545.857
m2-9.90545.857-10.10845.857
m2-9.7945.857-9.90545.857
m2-8.44345.857-9.7945.857
m2-8.24645.857-8.44345.857
m2-8.13145.857-8.24645.857
m2-8.01645.857-8.13145.857
m2-7.80945.857-8.01645.857

How can I sort a Perl hash on values and order the keys correspondingly (in two arrays maybe)?

In Perl, I want to sort the keys of a hash by value, numerically:
{
five => 5
ten => 10
one => 1
four => 4
}
producing two arrays:
(1,4,5,10) and (one, four, five, ten)
And then I want to normalize the values array such that the numbers are sequential:
(1,2,3,4)
How do I do this?
First sort the keys by the associated value. Then get the values (e.g. by using a hash slice).
my @keys = sort { $h{$a} <=> $h{$b} } keys(%h);
my @vals = @h{@keys};
Or, if you have a hash reference:
my @keys = sort { $h->{$a} <=> $h->{$b} } keys(%$h);
my @vals = @{$h}{@keys};
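The question also asks for the values to be normalized to sequential numbers. One possible reading is that each key simply gets its 1-based position in the sorted order; under that assumption (the array name @ranks is illustrative), a short sketch:

```perl
use strict;
use warnings;

my %h = ( five => 5, ten => 10, one => 1, four => 4 );

my @keys  = sort { $h{$a} <=> $h{$b} } keys %h;   # keys ordered by value
my @vals  = @h{@keys};                            # hash slice: values in the same order
my @ranks = 1 .. scalar @keys;                    # 1-based position of each key

print "@keys\n";     # one four five ten
print "@vals\n";     # 1 4 5 10
print "@ranks\n";    # 1 2 3 4
```

With this reading, keys that share a value still receive distinct ranks.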
How do I sort a hash (optionally by value instead of key)?
To sort a hash, start with the keys. In this example, we give the list of keys to the sort function which then compares them ASCIIbetically (which might be affected by your locale settings). The output list has the keys in ASCIIbetical order. Once we have the keys, we can go through them to create a report which lists the keys in ASCIIbetical order.
my @keys = sort { $a cmp $b } keys %hash;
foreach my $key ( @keys ) {
printf "%-20s %6d\n", $key, $hash{$key};
}
We could get more fancy in the sort() block though. Instead of comparing the keys, we can compute a value with them and use that value as the comparison.
For instance, to make our report order case-insensitive, we use lc to lowercase the keys before comparing them:
my @keys = sort { lc $a cmp lc $b } keys %hash;
Note: if the computation is expensive or the hash has many elements, you may want to look at the Schwartzian Transform to cache the computation results.
If we want to sort by the hash value instead, we use the hash key to look it up. We still get out a list of keys, but this time they are ordered by their value.
my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash;
From there we can get more complex. If the hash values are the same, we can provide a secondary sort on the hash key.
my @keys = sort {
$hash{$a} <=> $hash{$b}
or
"\L$a" cmp "\L$b"
} keys %hash;
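To see the secondary sort in action, here is a small sketch with made-up data in which two keys share a value; the tie is broken by the case-insensitive key comparison:

```perl
use strict;
use warnings;

# Hypothetical data: Bob and alice tie on value 2.
my %hash = ( Bob => 2, alice => 2, Carol => 1 );

my @keys = sort {
    $hash{$a} <=> $hash{$b}     # primary: numeric value
        or
    "\L$a" cmp "\L$b"           # secondary: lowercased key
} keys %hash;

print "@keys\n";    # Carol alice Bob
```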
Please see the Perl FAQ entry titled "How do I sort a hash (optionally by value instead of key)".
You can also use perldoc -q to search the FAQ locally on your machine, as in perldoc -q sort, which is how I found your answer.
my ( @nums, @words );
do { push @nums, shift @$_;
push @words, shift @$_;
}
foreach sort { $a->[0] <=> $b->[0] }
map { [ $h->{ $_ }, $_ ] } keys %$h
;
Sometimes it's best to show rather than tell...
%results = (Paul=>87, Ringo=>93, John=>91, George=>97);
#display the results in ascending key (alphabetical) order
print "key ascending...\n";
foreach $key ( sort { $a cmp $b } keys %results ){
print "$key=>$results{$key}\n";
}
print "\n";
# display the results in descending key (alphabetical) order
print "key descending...\n";
foreach $key ( sort { $b cmp $a } keys %results ){
print "$key=>$results{$key}\n";
}
print "\n";
# display the results in ascending value (numerical) order
print "value ascending...\n";
foreach $key ( sort { $results{$a} <=> $results{$b} } keys %results ){
print "$key=>$results{$key}\n";
}
print "\n";
# display the results in descending value (numerical) order
print "value descending...\n";
foreach $key ( sort { $results{$b} <=> $results{$a} } keys %results ){
print "$key=>$results{$key}\n";
}

perl: shuffle value-sorted hash?

First, sorry for my English; I hope you will understand me.
There is a hash:
$hash{a} = 1;
$hash{b} = 3;
$hash{c} = 3;
$hash{d} = 2;
$hash{e} = 1;
$hash{f} = 1;
I want to sort it by values (not keys) so I have:
for my $key ( sort { $hash{ $a } <=> $hash{ $b } } keys %hash ) { ... }
And at first I get all the keys with value 1, then with value 2, etc... Great.
But if hash is not changing, the order of keys (in this sort-by-value) is always the same.
Question: How can I shuffle sort-results, so every time I run 'for' loop, I get different order of keys with value 1, value 2, etc. ?
I'm not quite sure I understand your needs, but is this OK?
use feature 'say';
use List::Util qw(shuffle);
my %hash;
$hash{a} = 1;
$hash{b} = 3;
$hash{c} = 3;
$hash{d} = 2;
$hash{e} = 1;
$hash{f} = 1;
for my $key (sort { $hash{ $a } <=> $hash{ $b } } shuffle( keys %hash )) {
say "hash{$key} = $hash{$key}"
}
You can simply add another level of sorting, which will be used when the regular sorting method cannot distinguish between two values. E.g.:
sort { METHOD_1 || METHOD_2 || ... METHOD_N } LIST
For example:
sub regular_sort {
my $hash = shift;
for (sort { $hash->{$a} <=> $hash->{$b} } keys %$hash) {
print "$_ ";
};
}
sub random_sort {
my $hash = shift;
my %rand = map { $_ => rand } keys %$hash; # $hash is a reference here
for (sort { $hash->{$a} <=> $hash->{$b} ||
$rand{$a} <=> $rand{$b} } keys %$hash ) {
print "$_ ";
};
}
To sort the keys by value, with random ordering of keys with identical values, I see two solutions:
use List::Util qw( shuffle );
use sort 'stable';
my @keys =
sort { $hash{$a} <=> $hash{$b} }
shuffle keys %hash;
or
my @keys =
map $_->[0],
sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] }
map [ $_, $hash{$_}, rand ],
keys %hash;
The use sort 'stable'; is required to prevent sort from corrupting the randomness of the list returned by shuffle.
The above's use of the Schwartzian Transform is not an attempt at optimisation. I've seen people use rand in the compare function itself to try to achieve the above result, but doing so is buggy for two reasons.
When using "misbehaving" comparisons such as that, the results are documented as being undefined, so sort is allowed to return garbage, repeated elements, missing elements, etc.
Even if sort doesn't return garbage, it won't be a fair shuffle. The result will be weighted.
You can have two functions, one for ascending and one for descending order, and use them accordingly:
sub hashAscending {
$hash{$a} <=> $hash{$b};
}
sub hashDescending {
$hash{$b} <=> $hash{$a};
}
foreach $key (sort hashAscending (keys(%hash))) {
print "\t$hash{$key} \t\t $key\n";
}
foreach $key (sort hashDescending (keys(%hash))) {
print "\t$hash{$key} \t\t $key\n";
}
It seems like you want to randomize looping through the keys.
Perl does not store hash keys in sequential or sorted order, but that doesn't seem to be random enough for you, so you may want to create an array of keys and loop through that instead.
First, populate an array with keys, then use a random number algorithm (1..$#length_of_array) to push the key at that position in the array, to the array_of_keys.
If you're trying to randomize the keys of the sorted-by-value hash, that's a little different.
my %hash = (a=>1, b=>3, c=>3, d=>2, e=>1, f=>1);
my %hash_by_val;
for my $key ( sort { $hash{$a} <=> $hash{$b} } keys %hash ) {
push @{ $hash_by_val{$hash{$key}} }, $key;
}
for my $key (sort keys %hash_by_val){
my @arr = @{$hash_by_val{$key}};
my $arr_ubound = $#arr;
for (0..$arr_ubound){
my $randnum = int(rand($arr_ubound + 1)); # random index in 0..$arr_ubound, last element included
my $val = splice(@arr,$randnum,1);
$arr_ubound--;
print "$key : $val\n"; # notice: output varies b/t runs
}
}

In Perl, how can I print the key corresponding to the maximum value in a hash?

How can I print only the first key and element of my hash?
I have already a sorted hash, but I want to print only the first key and respective value
thanks,
Thanks to all of you. In the end I pushed the keys and the values onto two different arrays and printed element 0 of each array, and it works :)
Hashes have unordered keys. So, there is no such key as a first key in a hash.
However, if you need the key that sorts first (for maximum key value):
my %hash = (
'foo' => 'bar',
'qux' => 'baz',
);
my ($key) = sort { $b cmp $a } keys %hash;
print "$key => $hash{$key}"; # Outputs: qux => baz
Remember to use <=> instead of cmp for numerical sorting.
In Perl hashes there is no ordering of keys. Use the sort function to get the keys in the order you want, or push the keys onto an array as you create the hash, so that your first key is at index zero of the array.
You can use the code below. I am assuming the hash is named my_hash and that the keys and values are numbers. If you have strings, use cmp instead of <=>. Refer to the sort documentation for more details.
Get the max key
foreach (sort {$b <=> $a} keys %my_hash) {
print "Keys is $_\n";
print "Value is $my_hash{$_}\n";
last;
}
Get the key corresponding to the max value
foreach (sort {$my_hash{$b} <=> $my_hash{$a}} keys %my_hash) {
print "Keys is $_\n";
print "Value is $my_hash{$_}\n";
last;
}
foreach my $key (sort keys(%hash)) {
print "$key" . "$hash{$key}" . "\n";
last;
}
For large hashes, if you do not need the sorted keys for any other reason, it might be better to avoid sorting.
#!/usr/bin/env perl
use strict; use warnings;
my %hash = map { $_ => rand(10_000) } 'aa' .. 'zz';
my ($argmax, $max) = each %hash;
keys %hash; # reset iterator
while (my ($k, $v) = each %hash) {
if ($v >= $max) {
$max = $v;
$argmax = $k;
}
}
print "$argmax => $max\n";
If you are intent on sorting, you only need the key with the maximum value, not the entire arrays of keys and values:
#!/usr/bin/env perl
use strict; use warnings;
my %hash = map { $_ => rand(10_000) } 'aa' .. 'zz';
my ($argmax) = sort { $hash{$b} <=> $hash{$a} } keys %hash;
print "$argmax => $hash{$argmax}\n";
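Another way to avoid the sort, sketched here with List::Util's reduce (a core module) on a small made-up hash so the result is predictable. Each step keeps whichever of the two keys has the larger value:

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Hypothetical sample data with a single, unambiguous maximum.
my %hash = ( a => 3, b => 9, c => 5 );

# Walk the keys once, carrying forward the key with the larger value.
my $argmax = reduce { $hash{$a} >= $hash{$b} ? $a : $b } keys %hash;

print "$argmax => $hash{$argmax}\n";    # b => 9
```

If several keys share the maximum value, which one is returned depends on the hash's iteration order.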
Just as Alan wrote - hashes don't have specific order, but you can sort hash keys:
foreach my $key (sort keys(%hash)) {
print $key . ': ' . $hash{$key} . "\n";
}
or, as you wish, get first element from keys array:
my @keys = keys(%hash);
print $keys[0];