Sort HOA by key first, then value - perl

I've got a HoA that I'm setting up as follows (test example):
#!/usr/bin/perl -w
use strict;
my #array1 = qw (1 1 1 4 5); # Note '1' appears several times
my #array2 = qw (a b c d e);
my #array3 = qw (8 6 7 9 10);
my #array4 = qw (f g h i j);
my %hash;
push #{$hash{$array1[$_]}}, [ $array2[$_], $array3[$_], $array4[$_] ] for 0 .. $#array1;
for my $key (sort keys %hash) {
for my $array (# { $hash{$key} } ) {
my ($array2, $array3, $array4) = #$array;
print "[$key] $array2\t$array3\t$array4\n";
}
}
Output:
[1] a 8 f
[1] b 6 g
[1] c 7 h
[4] d 9 i
[5] e 10 j
What I want to do is be able to sort by the key first (as above) but in the instances where the key is the same, sort on a different array contained in the hash - e.g. numerically by the values in #array3, to give the desired output:
*
[1] b 6 g
[1] c 7 h
[1] a 8 f
[4] d 9 i
[5] e 10 j

Replace this line
for my $array (# { $hash{$key} } ) {
by
for my $array ( sort { $a->[1] <=> $b->[1] } #{ $hash{$key} } ) {
All the arrayrefs in #{ $hash{$key} } have the element you want to sort on at index 1. This sort block orders them in numerically ascending order by the second field.

Related

Printing groups of key/value pairs in hash

How can I print a hash in Perl, such that 3 key value pairs are printed on each line?
print %hash;
This will print key value pairs each in a line.
To display the hash, so "that 3[n] key value pairs are printed on each line", you can use a counter ($n) and % (modulo op) to determine when to print a "\n". Demo:
use Modern::Perl;
my %h = ();
for (0..7) {
$h{$_} = chr(65 + $_);
}
print %h, "\n";
my $cols = +$ARGV[0] || 5;
my $n = -$cols;
for my $key (keys %h) {
print $key, ' => ', $h{$key}, 0 == ++$n % $cols ? "\n" : "\t\t";
}
print $n % $cols ? "\n------" : "------";
output:
perl -w 31444449.pl 1
6G4E1B3D0A7H2C5F
6 => G
4 => E
1 => B
3 => D
0 => A
7 => H
2 => C
5 => F
------
perl -w 31444449.pl
6G4E1B3D0A7H2C5F
6 => G 4 => E 1 => B 3 => D 0 => A
7 => H 2 => C 5 => F
------
perl -w 31444449.pl 3
6G4E1B3D0A7H2C5F
6 => G 4 => E 1 => B
3 => D 0 => A 7 => H
2 => C 5 => F
------
Borodin's solutions, however, is simpler.
See mpapec answer for a much improved version.
A very simple way to do this is to copy all the keys and values to an array, and then print six (three pairs) of those at a time
use strict;
use warnings;
my %h = map { $_ => 1 } 'A' .. 'H';
my #kv = %h;
while ( my #row = splice #kv, 0, 6 ) {
print "#row\n";
}
output
B 1 C 1 A 1
D 1 E 1 G 1
F 1 H 1
You can use natatime from List::MoreUtils:
use List::MoreUtils qw/natatime/;
my $it = natatime 6, %ENV;
while (my #vals = $it->()) {
print "#vals\n";
}
List::MoreUtils isn't in core modules, you need to install it.
Thanks All. I tried this and it worked.
my #keylist=sort keys %hash;
my $counter=0;
foreach(#keylist){
#printing the key value pairs
printf "%-15s :%3d ",$_,$hash{$_};
$counter++;
if($counter==3){
$counter=0;
print "\n";
}
}
print "\n";
If you really just want to print hash and check the values for debugging or for analysing then use
use Data::Dumper;
print Dumper(\%hash);
This print hash keys and values at any n number of levels

Modify key if it already exists

I'm writing a piece of code that creates a HoAs and loops through for each key. The snippet below shows a basic example of the problem I'm having.
#!/usr/bin/perl
use strict;
use warnings;
my #array1 = qw (1 1 3 4 5); # Note that '1' appears twice
my #array2 = qw (a b c d e);
my #array3 = qw (6 7 8 9 10);
my #array4 = qw (f g h i j);
my %hash;
push #{$hash{$array1[$_]}}, [ $array2[$_], $array3[$_], $array4[$_] ] for 0 .. $#array1;
for my $key (sort keys %hash) {
my ($array2, $array3, $array4) = #{$hash{$key}[-1]};
print "[$key] $array2\t$array3\t$array4\n";
}
Output:
[1] b 7 g
[3] c 8 h
[4] d 9 i
[5] e 10 j
For the data I'm actually using (as opposed to this example) I have been using a key that I've just realised isn't unique, so, as above I end up overriding non-uniqe keys. I'm mainly using these values as keys in order to sort by them later.
My question is either:
A) I can perform the above task for each key unless (exists $hash{$array1}) in which case I can modify it
or
B) Is there a way to sort by those values, in which case I could use another, non-redundant key.
Thanks!
so, as above I end up overriding non-uniqe keys
You aren't. Let's print out the whole contents of that hash:
for my $key (sort { $a <=> $b } keys %hash) { # sort numerically!
for my $array (#{ $hash{$key} }) { # loop over all instead of $hash{$key}[-1] only
say "[$key] " . join "\t", #$array;
}
}
Output:
[1] a 6 f
[1] b 7 g
[3] c 8 h
[4] d 9 i
[5] e 10 j
You would be overriding the values if you were building the hash like
$hash{$array1[$_]} = [ $array2[$_], $array3[$_], $array4[$_] ] for 0 .. $#array1;
(And printing it as)
for my $key ( ... ) {
say "[$key] " . join "\t", #{ $hash{$key} };
}
That is, assigning instead of pushing.
If you want to keep the first value assigned to each key, use the //= operator (assign if undef).

Making the sort stable in Perl

I have an array of refs with me . Something like
$a[0] = [qw( 1 2 3 4 )];
$a[1] = [qw( a b c d )];
The 1,2,3,4 are actually website breadcrumbs which are used for navigation (Home, Profile, Contact-us, Contact-me-specifically).
Now, I have to sort this ladder (And using stable sort in perl 5.8 is not an option sadly)
The sorting criteria is
The depth of the ladder
If two ladders have same depth, then sort them depending on their index.
For example, if the array originally contains
$a[0] = [qw( 1 2 3 4 )];
$a[1] = [qw( 1 2 3 )];
Then after the sort, the array should contain
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( 1 2 3 4 )];
But if the arrays are something like :-
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( a b c )];
Then after the sort,
$a[0] = [qw( 1 2 3 )];
$a[1] = [qw( a b c )];
I can't get it to work this way that I tried .
my #sorted_array = sort { #$b <=> #$a || $a <=> $b } #a;
Can someone help me in this?
The description of your data structure (linked list), and the implementation in your sort routine (arrayrefs) do not quite fit together; I will assume the latter.
A non-stable sort can be made stable by sorting by the position as a secondary criterion:
sort { normally or by_index } #stuff
Normally, you seem to want to compare the array length. To be able to test for the index, you have to somehow make the index of the current element available. You can do so by two means:
Do the Schwartzian Transform, and annotate each element with its index. This is silly.
Sort the indices, not the elements.
This would look like:
my #sorted_indices =
sort { #{ $array[$b] } <=> #{ $array[$a] } or $a <=> $b } 0 .. $#array;
my #sorted = #array[#sorted_indices]; # do a slice
What you were previously doing with $a <=> $b was comparing refernces. This is not guaranteed to do anything meaningful.
Test of that sort:
use Test::More;
my #array = (
[qw/1 2 3/],
[qw/a b c/],
[qw/foo bar baz qux/],
);
my #expected = (
[qw/foo bar baz qux/],
[qw/1 2 3/],
[qw/a b c/],
);
...; # above code
is_deeply \#sorted, \#expected;
done_testing;
Your code doesn't work because you expect $a and $b to contain the element's value in one place (#$b <=> #$a) and the element's index in another ($a <=> $b).
You need the indexes in your comparison, so your comparison function is going to need the indexes.
By passing the indexes of the array to sort, you have access to both the indexes and the values at those indexes, so your code is going to include
sort { ... } 0..$#array;
After we're finished sorting, we want to retrieve the elements for those indexes. For that, we can use
my #sorted = map $array[$_], #sorted_indexes;
or
my #sorted = #array[ #sorted_indexes ];
All together, we get:
my #sorted =
map $array[$_],
sort { #{ $array[$a] } <=> #{ $array[$b] } || $a <=> $b }
0..$#array;
or
my #sorted = #array[
sort { #{ $array[$a] } <=> #{ $array[$b] } || $a <=> $b }
0..$#array
];
I think we need to clear up your sorting algorithm. You said:
The depth of the ladder
Sort them depending on their index.
Here's an example:
$array[0] = [ qw(1 a b c d e) ];
$array[2] = [ qw(1 2 b c d e) ];
$array[3] = [ qw(a b c) ];
$array[4] = [ qw(a b c d e) ];
You want them sorted this way:
$array[3] = [ qw(a b c) ];
$array[2] = [ qw(1 2 b c d e) ];
$array[0] = [ qw(1 a b c d e) ];
$array[4] = [ qw(a b c d e) ];
Is that correct?
What about this?
$array[0] = [ qw(100, 21, 15, 32) ];
$array[1] = [ qw(32, 14, 32, 20) ];
Sorting by numeric, $array[1] should be before $array[0], but sorting by string, $array[0] is before $array[1].
Also, you notice that I cannot tell whether $array[0] should be before or after $array[1] until I look at the second element of the array.
This makes sorting very difficult to do on a single line function. Even if you can somehow reduce it, It'll make it very difficult for someone to analyze what you are doing, or for you to debug the statement.
Fortunately, you can use an entire subroutine as a sort routine:
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
my #array;
$array[0] = [ qw(1 2 3 4 5 6) ];
$array[1] = [ qw(1 2 3) ];
$array[2] = [ qw(a b c d e f) ];
$array[3] = [ qw(0 1 2) ];
my #sorted_array = sort sort_array #array;
say Dumper \#sorted_array;
sub sort_array {
#my $a = shift; #Array reference to an element in #array
#my $b = shift; $Array reference to an element in #array
my #a_array = #{ $a };
my #b_array = #{ $b };
#
#First sort on length of arrays
#
if ( scalar #a_array ne scalar #b_array ) {
return scalar #a_array <=> scalar #b_array;
}
#
# Arrays are the same length. Sort on first element in array that differs
#
for my $index (0..$#a_array ) {
if ( $a_array[$index] ne $b_array[$index] ) {
return $a_array[$index] cmp $b_array[$index];
}
}
#
# Both arrays are equal in size and content
#
return 0;
}
This returns:
$VAR1 = [
[
'0',
'1',
'2'
],
[
'1',
'2',
'3'
],
[
'1',
'2',
'3',
'4',
'5',
'6'
],
[
'a',
'b',
'c',
'd',
'e',
'f'
]
];

if duplicate values in one column than copy value from other column to a line above

I'm working with a table that looks like this
C1 C2 C3
1 a b
2 c d
4 e g
4 f h
5 x y
... ... ...
If the values in C1 are the same (in this example there is two times a 4) than I want the values of C2 and C3 to be pasted on the first line with 4 in C1 and I want to remove then the second line with 4 in C1. So at the end it should look like this
C1 C2 C3
1 a b
2 c d
4 e,f g,h
5 x y
I'm working with a perl script. I'm using while to loop through the file. I've used thing like my %seen or count in other scripts, but I'm not able to figure out how to use them know. It looks really simple to do ...
This is how my while loop looks like for the moment
while (<$DATA>) {
#columns = split
$var1 = $columns[0]
$var2 = $columns[1]
$var3 = $columns[2];
}
Use a hash to control the duplicates. I have used in my example a hash (%info) of hashes, with keys C1 and C2. Each of them contains an array reference to add the duplicated items.
use strict;
use warnings;
my %info = ();
while (<DATA>) {
my #columns = split /\s+/;
if( exists $info{ $columns[0] } ) {
push #{ $info{ $columns[0] }->{C2} }, $columns[1];
push #{ $info{ $columns[0] }->{C3} }, $columns[2];
}
else {
$info{ $columns[0] } = { C2 =>[ $columns[1] ], C3 => [ $columns[2]] }
}
}
foreach my $c1(sort {$a<=>$b} keys %info ) {
print $c1, "\t",
join(',',#{$info{$c1}->{C2}}), "\t",
join(',',#{$info{$c1}->{C3}}), "\n";
}
__DATA__
1 a b
2 c d
4 e g
4 f h
5 x y

Turn an array into a hash, where the keys' values are of unequal length

I like to turn an array into a hash. However, the values are of unequal length for each key.
Lets say I have
my #array = qw( A 0 B 1 2 3 4 c 5 d 6 7);
Now I like to use the letters as keys and for each such letter/key the following number(s) as their values. So #array should be transformed into %hash as follows
my %hash = ( A => [0],
B => [1, 2, 3, 4],
c => [5],
d => [6, 7]
);
The difficulty for me is the unequal length of each keys' value.
Here is a way to do it:
#!/usr/local/bin/perl
use Data::Dump qw(dump);
use strict;
use warnings;
my #array = qw( A 0 B 1 2 3 4 c 5 d 6 7);
my %hash;
my $key;
foreach (#array) {
if (/^\D+$/) {
$key = $_;
$hash{$key} = [];
} else {
push #{$hash{$key}}, $_;
}
}
dump %hash;
Output:
("A", [0], "c", [5], "d", [6, 7], "B", [1 .. 4])
Firs the answer for this specific example then some comments
my $hash = {};
my #array = qw( A 0 B 1 2 3 4 c 5 d 6 7);
my $key;
foreach (#array) {
if (/\D/) {
$key = $_;
next;
} else {
push #{$hash->{$key}}, $_;
}
}
And if you want to play in the debugger:
$ perl -de 0
DB<18> #array = qw( A 0 B 1 2 3 4 c 5 d 6 7);
DB<19> $hash={}
DB<20> foreach(#array){if(/\D/){$key=$_;next}else{push #{$hash->{$key}},$_}}
DB<21> x $hash
0 HASH(0x347e568)
'A' => ARRAY(0x348fee8)
0 0
'B' => ARRAY(0x346f188)
0 1
1 2
2 3
3 4
'c' => ARRAY(0x34cefb0)
0 5
'd' => ARRAY(0x346f1e8)
0 6
1 7
Comments: unless your keys are giving information about if the value is scalar or array ref, is better to have all the values of the same type (in this case arrayref)
You would like to check if the last key has a value and decide if you want to initialize to undef or not.
Or using map:
my #a = qw{a 1 2 3 b 4 5 6 C 7 8 9};
my ($key, %h);
map { /^[a-z]$/i and $key = $_ or push(#{$h{$key}}, $_) } #a;
Isn't Perl fun?
Slightly simpler than previously provided solutions:
my #array = qw( A 0 B 1 2 3 4 c 5 d 6 7);
my %hash;
my $values;
for (#array) {
if (/\D/) {
$values = $hash{$_} = [];
} else {
push #$values, $_;
}
}