Sort Hash Key and Value simultaneously Perl - perl

I have a hash that I want to sort the keys numerically in ascending order and
its values in ascending alphabetically manner.
#!/usr/bin/perl
use warnings;
use strict;
use List::MoreUtils;
use Tie::IxHash;
my %KEY_VALUE;
#tie %KEY_VALUE,'Tie::IxHash';
my %KEY_VALUE= (
0 => [ 'A', 'C', 'B', 'A' ,'D'],
5 => [ 'D', 'F', 'E', ],
2 => [ 'Z', 'X', 'Y' ],
4 => [ 'E', 'R', 'M' ],
3 => [ 'A', 'B', 'B', 'A' ],
1 => [ 'C', 'C', 'F', 'E' ],
);
#while (my ($k, $av) = each %KEY_VALUE)
#{
# print "$k #$av\n ";
#}
#Sort the key numerically
foreach my $key (sort keys %KEY_VALUE)
{
print "$key\n";
}
#To sort the value alphabetically
foreach my $key (sort {$KEY_VALUE{$a} cmp $KEY_VALUE{$b}} keys %KEY_VALUE){
print "$key: $KEY_VALUE{$key}\n";
}
The wanted input is like this, and I want to print out the sorted keys and values.
%KEY_VALUE= (
0 => [ 'A','A','B','C','D'],
1 => [ 'C','C','E','F' ],
2 => [ 'X','Y','Z' ],
3 => [ 'A', 'A', 'B', 'B' ],
4 => [ 'E','M','R' ],
5 => [ 'D','E','F', ],
);
Additional problem, how to print the key and the scalar value of the first different value
Wanted Output:
KEY= 0 VALUE:0 2 3 4 #The scalar value of first A B C D, start with 0
KEY= 1 VALUE:0 2 3 #The scalar value of first C E F
KEY= 2 VALUE:0 1 2 #The scalar value of first X Y Z
KEY= 3 VALUE:0 2 #The scalar value of first A B
KEY= 4 VALUE:0 1 2 #The scalar value of first E M R
KEY= 5 VALUE:0 1 2 #The scalar value of first D E F

Hash keys have no defined order. Generally you sort the keys as you're iterating through the hash.
The values can be sorted as you iterate through the hash.
# Iterate through the keys in numeric order.
for my $key (sort {$a <=> $b } keys %hash) {
# Get the value
my $val = $hash{$key};
# Sort it in place
#$val = sort { $a cmp $b } #$val;
# Display it
say "$key -> #$val";
}
Note that by default sort sorts in ASCII order as strings. That means sort keys %KEY_VALUE is not sorting as numbers but as strings. sort(2,3,10) is (10,2,3). "10" is less than "2" like "ah" is less than "b". Be sure to use sort { $a <=> $b } for numeric sorting and sort { $a cmp $b } for strings.
You could use a different data structure such as Tie::Ixhash though tying has a significant performance penalty. Generally it's better to sort in place unless your hash gets very large.

You can't sort a hash, you can at best print it sorted (or keep the sorted keys in another array). Finding the position of the first value can be done with first_index; we remove duplicates with uniq.
foreach my $key (sort keys %KEY_VALUE) {
my #value = #{$KEY_VALUE{$key}};
my #indices = map { my $e = $_; first_index { $_ eq $e } #value } (uniq (sort #value));
print "$key: " . (join ', ', #indices) . "\n";
}

Related

Perl to sort words by user-defined alphabet sequence

I have an array of "words" (strings), which consist of letters from an "alphabet" with user-defined sequence. E.g my "alphabet" starts with "ʔ ʕ b g d", so a list of "words" (bʔd ʔbg ʕʔb bʕd) after sort by_my_alphabet should be ʔbd ʕʔb bʔd bʕd.
sort by_my_alphabet (bʔd ʔbg ʕʔb bʕd) # gives ʔbd ʕʔb bʔd bʕd
Is there a way to make a simple subroutine by_my_alphabet with $a and $b to solve this problem?
Simple, and very fast because it doesn't use a compare callback, but it needs to scan the entire string:
use utf8;
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
my #sorted =
map { join '', #my_chr[ unpack 'W*', $_ ] } # "\x00\x01\x02\x03\x04" ⇒ "ʔʕbgd"
sort
map { pack 'W*', #my_ord{ split //, $_ } } # "ʔʕbgd" ⇒ "\x00\x01\x02\x03\x04"
#unsorted;
Optimized for long strings since it only scans a string up until a difference is found:
use utf8;
use List::Util qw( min );
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
sub my_cmp($$) {
for ( 0 .. ( min map length($_), #_ ) - 1 ) {
my $cmp = $my_ord{substr($_[0], $_, 1)} <=> $my_ord{substr($_[1], $_, 1)};
return $cmp if $cmp;
}
return length($_[0]) <=> length($_[1]);
}
my #sorted = sort my_cmp #unsorted;
Both should be faster than Sobrique's. Theirs uses a compare callback, and it scans the entire strings being compared.
Yes.
sort can take any function that returns a relative sort position. All you need is a function that correctly looks up the 'sort value' of a string for comparing.
So all you need to do here is define a 'relative weight' of your extra letters, and then compare the two.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my #sort_order = qw ( B C A D );
my #array_to_sort = qw ( A B C D A B C D AB BB CCC ABC );
my $count = 0;
my %position_of;
$position_of{$_} = $count++ for #sort_order;
print Dumper \%position_of;
sub sort_by_pos {
my #a = split //, $a;
my #b = split //, $b;
#iterate one letter at a time, using 'shift' to take it off the front
#of the array.
while ( #a and #b ) {
my $result = $position_of{shift #a} <=> $position_of{shift #b};
#result is 'true' if it's "-1" or "1" which indicates relative position.
# 0 is false, and that'll cause the next loop iteration to test the next
#letter-pair
return $result if $result;
}
#return a value based on remaining length - longest 'string' will sort last;
#That's so "AAA" comparing with "AA" comparison actually work,
return scalar #a <=> scalar #b;
}
my #new = sort { sort_by_pos } #array_to_sort;
print Dumper \#new;
Bit of a simple case, but it sorts our array into:
$VAR1 = [
'B',
'B',
'BB',
'C',
'C',
'CCC',
'A',
'A',
'AB',
'ABC',
'D',
'D'
];

Making the sort stable in Perl

I have an array of refs with me . Something like
$a[0] = [qw( 1 2 3 4 )];
$a[1] = [qw( a b c d )];
The 1,2,3,4 are actually website breadcrumbs which are used for navigation (Home, Profile, Contact-us, Contact-me-specifically).
Now, I have to sort this ladder (And using stable sort in perl 5.8 is not an option sadly)
The sorting criteria is
The depth of the ladder
If two ladders have same depth, then sort them depending on their index.
For example, if the array originally contains
$a[0] = [qw( 1 2 3 4 )];
$a[1] = [qw( 1 2 3 )];
Then after the sort, the array should contain
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( 1 2 3 4 )];
But if the arrays are something like :-
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( a b c )];
Then after the sort,
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( a b c )];
I can't get it to work this way that I tried .
my #sorted_array = sort { #$b <=> #$a || $a <=> $b } #a;
Can someone help me in this?
The description of your data structure (linked list), and the implementation in your sort routine (arrayrefs) do not quite fit together; I will assume the latter.
A non-stable sort can be made stable by sorting by the position as a secondary criterion:
sort { normally or by_index } #stuff
Normally, you seem to want to compare the array length. To be able to test for the index, you have to somehow make the index of the current element available. You can do so by two means:
Do the Schwartzian Transform, and annotate each element with its index. This is silly.
Sort the indices, not the elements.
This would look like:
my #sorted_indices =
sort { #{ $array[$b] } <=> #{ $array[$a] } or $a <=> $b } 0 .. $#array;
my #sorted = #array[#sorted_indices]; # do a slice
What you were previously doing with $a <=> $b was comparing refernces. This is not guaranteed to do anything meaningful.
Test of that sort:
use Test::More;
my #array = (
[qw/1 2 3/],
[qw/a b c/],
[qw/foo bar baz qux/],
);
my #expected = (
[qw/foo bar baz qux/],
[qw/1 2 3/],
[qw/a b c/],
);
...; # above code
is_deeply \#sorted, \#expected;
done_testing;
Your code doesn't work because you expect $a and $b to contain the element's value in one place (#$b <=> #$a) and the element's index in another ($a <=> $b).
You need the indexes in your comparison, so your comparison function is going to need the indexes.
By passing the indexes of the array to sort, you have access to both the indexes and the values at those indexes, so your code is going to include
sort { ... } 0..$#array;
After we're finished sorting, we want to retrieve the elements for those indexes. For that, we can use
my #sorted = map $array[$_], #sorted_indexes;
or
my #sorted = #array[ #sorted_indexes ];
All together, we get:
my #sorted =
map $array[$_],
sort { #{ $array[$a] } <=> #{ $array[$b] } || $a <=> $b }
0..$#array;
or
my #sorted = #array[
sort { #{ $array[$a] } <=> #{ $array[$b] } || $a <=> $b }
0..$#array
];
I think we need to clear up your sorting algorithm. You said:
The depth of the ladder
Sort them depending on their index.
Here's an example:
$array[0] = [ qw(1 a b c d e) ];
$array[2] = [ qw(1 2 b c d e) ];
$array[3] = [ qw(a b c) ];
$array[4] = [ qw(a b c d e) ];
You want them sorted this way:
$array[3] = [ qw(a b c) ];
$array[2] = [ qw(1 2 b c d e) ];
$array[0] = [ qw(1 a b c d e) ];
$array[4] = [ qw(a b c d e) ];
Is that correct?
What about this?
$array[0] = [ qw(100, 21, 15, 32) ];
$array[1] = [ qw(32, 14, 32, 20) ];
Sorting by numeric, $array[1] should be before $array[0], but sorting by string, $array[0] is before $array[1].
Also, you notice that I cannot tell whether $array[0] should be before or after $array[1] until I look at the second element of the array.
This makes sorting very difficult to do on a single line function. Even if you can somehow reduce it, It'll make it very difficult for someone to analyze what you are doing, or for you to debug the statement.
Fortunately, you can use an entire subroutine as a sort routine:
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
my #array;
$array[0] = [ qw(1 2 3 4 5 6) ];
$array[1] = [ qw(1 2 3) ];
$array[2] = [ qw(a b c d e f) ];
$array[3] = [ qw(0 1 2) ];
my #sorted_array = sort sort_array #array;
say Dumper \#sorted_array;
sub sort_array {
#my $a = shift; #Array reference to an element in #array
#my $b = shift; $Array reference to an element in #array
my #a_array = #{ $a };
my #b_array = #{ $b };
#
#First sort on length of arrays
#
if ( scalar #a_array ne scalar #b_array ) {
return scalar #a_array <=> scalar #b_array;
}
#
# Arrays are the same length. Sort on first element in array that differs
#
for my $index (0..$#a_array ) {
if ( $a_array[$index] ne $b_array[$index] ) {
return $a_array[$index] cmp $b_array[$index];
}
}
#
# Both arrays are equal in size and content
#
return 0;
}
This returns:
$VAR1 = [
[
'0',
'1',
'2'
],
[
'1',
'2',
'3'
],
[
'1',
'2',
'3',
'4',
'5',
'6'
],
[
'a',
'b',
'c',
'd',
'e',
'f'
]
];

Perl extract range of elements from a hash

If I have a hash:
%hash = ("Dog",1,"Cat",2,"Mouse",3,"Fly",4);
How can I extract the first X elements of this hash. For example if I want the first 3 elements, %newhash would contain ("Dog",1,"Cat",2,"Mouse",3).
I'm working with large hashes (~ 8000 keys).
"first X elements of this hash" doesn't mean anything. First three elements in order by numeric value?
my %hash = ( 'Dog' => 1, 'Cat' => 2, 'Mouse' => 3, 'Fly' => 4 );
my #hashkeys = sort { $hash{$a} <=> $hash{$b} } keys %hash;
splice(#hashkeys, 3);
my %newhash;
#newhash{#hashkeys} = #hash{#hashkeys};
You might want to use something like this:
my %hash = ("Dog",1,"Cat",2,"Mouse",3,"Fly",4);
for ( (sort keys %hash)[0..2] ) {
say $hash{$_};
}
You should have an array 1st:
my %hash = ("Dog" => 1,"Cat"=>2,"Mouse"=>3,"Fly"=>4);
my #array;
foreach $value (sort {$hash{$a} <=> $hash{$b} }
keys %hash)
{
push(#array,{$value=>$hash{$value}});
}
#get range:
my #part=#array[0..2];
print part of result;
print $part[0]{'Cat'}."\n";

Perl, convert numerically-keyed hash to array

If I have a hash in Perl that contains complete and sequential integer mappings (ie, all keys from from 0 to n are mapped to something, no keys outside of this), is there a means of converting this to an Array?
I know I could iterate over the key/value pairs and place them into a new array, but something tells me there should be a built-in means of doing this.
You can extract all the values from a hash with the values function:
my #vals = values %hash;
If you want them in a particular order, then you can put the keys in the desired order and then take a hash slice from that:
my #sorted_vals = #hash{sort { $a <=> $b } keys %hash};
If your original data source is a hash:
# first find the max key value, if you don't already know it:
use List::Util 'max';
my $maxkey = max keys %hash;
# get all the values, in order
my #array = #hash{0 .. $maxkey};
Or if your original data source is a hashref:
my $maxkey = max keys %$hashref;
my #array = #{$hashref}{0 .. $maxkey};
This is easy to test using this example:
my %hash;
#hash{0 .. 9} = ('a' .. 'j');
# insert code from above, and then print the result...
use Data::Dumper;
print Dumper(\%hash);
print Dumper(\#array);
$VAR1 = {
'6' => 'g',
'3' => 'd',
'7' => 'h',
'9' => 'j',
'2' => 'c',
'8' => 'i',
'1' => 'b',
'4' => 'e',
'0' => 'a',
'5' => 'f'
};
$VAR1 = [
'a',
'b',
'c',
'd',
'e',
'f',
'g',
'h',
'i',
'j'
];
OK, this is not very "built in" but works. It's also IMHO preferrable to any solution involving "sort" as it's faster.
map { $array[$_] = $hash{$_} } keys %hash; # Or use foreach instead of map
Otherwise, less efficient:
my #array = map { $hash{$_} } sort { $a<=>$b } keys %hash;
Perl does not provide a built-in to solve your problem.
If you know that the keys cover a particular range 0..N, you can leverage that fact:
my $n = keys(%hash) - 1;
my #keys_and_values = map { $_ => $hash{$_} } 0 .. $n;
my #just_values = #hash{0 .. $n};
This will leave keys not defined in %hashed_keys as undef:
# if we're being nitpicky about when and how much memory
# is allocated for the array (for run-time optimization):
my #keys_arr = (undef) x scalar %hashed_keys;
#keys_arr[(keys %hashed_keys)] =
#hashed_keys{(keys %hashed_keys)};
And, if you're using references:
#{$keys_arr}[(keys %{$hashed_keys})] =
#{$hashed_keys}{(keys %{$hashed_keys})};
Or, more dangerously, as it assumes what you said is true (it may not always be true … Just sayin'!):
#keys_arr = #hashed_keys{(sort {$a <=> $b} keys %hashed_keys)};
But this is sort of beside the point. If they were integer-indexed to begin with, why are they in a hash now?
As DVK said, there is no built in way, but this will do the trick:
my #array = map {$hash{$_}} sort {$a <=> $b} keys %hash;
or this:
my #array;
keys %hash;
while (my ($k, $v) = each %hash) {
$array[$k] = $v
}
benchmark to see which is faster, my guess would be the second.
#a = #h{sort { $a <=> $b } keys %h};
Combining FM's and Ether's answers allows one to avoid defining an otherwise unnecessary scalar:
my #array = #hash{ 0 .. $#{[ keys %hash ]} };
The neat thing is that unlike with the scalar approach, $# works above even in the unlikely event that the default index of the first element, $[, is non-zero.
Of course, that would mean writing something silly and obfuscated like so:
my #array = #hash{ $[ .. $#{[ keys %hash ]} }; # Not recommended
But then there is always the remote chance that someone needs it somewhere (wince)...
$Hash_value =
{
'54' => 'abc',
'55' => 'def',
'56' => 'test',
};
while (my ($key,$value) = each %{$Hash_value})
{
print "\n $key > $value";
}
We can write a while as below:
$j =0;
while(($a1,$b1)=each(%hash1)){
$arr[$j][0] = $a1;
($arr[$j][1],$arr[$j][2],$arr[$j][3],$arr[$j][4],$arr[$j][5],$arr[$j][6]) = values($b1);
$j++;
}
$a1 contains the key and
$b1 contains the values
In the above example i have Hash of array and the array contains 6 elements.
An easy way is to do #array = %hash
For example,
my %hash = (
"0" => "zero",
"1" => "one",
"2" => "two",
"3" => "three",
"4" => "four",
"5" => "five",
"6" => "six",
"7" => "seven",
"8" => "eight",
"9" => "nine",
"10" => "ten",
);
my #array = %hash;
print "#array"; would produce the following output,
3 three 9 nine 5 five 8 eight 2 two 4 four 1 one 10 ten 7 seven 0 zero
6 six

Perl: How to get all grouped patterns

I have the following code.
Here I am matching the vowels characters words:
if ( /(a)+/ and /(e)+/ and /(i)+/ and /(o)+/ and /(u)+/ )
{
print "$1#$2#$3#$4#$5\n";
$number++;
}
I am trying to get the all matched patterns using grouping, but I am getting only the last expression pattern, which means the fifth expression of the if condition. Here I know that it is giving only one pattern because last pattern matching in if condition. I want to get all matched patterns, however. Can anyone help me out of this problem?
It is not quite clear what you want to do. Here are some thoughts.
Are you trying to count the number of vowels? In which case, tr will do the job:
my $count = tr/aeiou// ;
printf("string:%-20s count:%d\n" , $_ , $count ) ;
output :
string:book count:2
string:stackoverflow count:4
Or extract the vowels
my #array = / ( [aeiou] ) /xg ;
print Dumper \#array ;
Output from "stackoverflow question"
$VAR1 = [
'a',
'o',
'e',
'o',
'u',
'e',
'i',
'o'
];
Or extract sequences of vowels
my #array = / ( [aeiou]+ ) /xg ;
print Dumper \#array ;
Output from "stackoverflow question"
$VAR1 = [
'a',
'o',
'e',
'o',
'ue',
'io'
];
You could use
sub match_all {
my($s,#patterns) = #_;
my #matches = grep #$_ >= 1,
map [$s =~ /$_/g] => #patterns;
wantarray ? #matches : \#matches;
}
to create an array of non-empty matches.
For example:
my $string = "aaa e iiii oo uuuuu aa";
my #matches = match_all $string, map qr/$_+/ => qw/ a e i o u /;
if (#matches == 5) {
print "[", join("][", #$_), "]\n"
for #matches;
}
else {
my $es = #matches == 1 ? "" : "es";
print scalar(#matches), " match$es\n";
}
Output:
[aaa][aa]
[e]
[iiii]
[oo]
[uuuuu]
An input of, say, "aaa iiii oo uuuuu aa" produces
4 matches
You have 5 patterns with one matching group () each. Not 1 pattern with 5 groups.
(a)+ looks for a string containing a, aa, aaa, aaaa etc. The match will be multiple a's, not the word containing the group of a-s.
Your if( ...) is true if $_ contains one or more of 'a','e','i','o','u'.