How to Sort 2D array Perl? - perl

May be I am missing something. First It seemed too easy to me.I thought I can easily achieve it using map{}sort{}map{} ,but now it became complicated to me.
So, finally the problem is I have an array:
#array=(['b','e','d'],['s','a','f'],['g','i','h']);
and I want sorted array like
#sorted_array=(['a','f','s'],['b','d','e'],['g','h','i']);
I wrote
##sort based on columns########
my #sorted_array= map{my #sorted=sort{$a cmp $b}#$_;[#sorted]}#array;
###sort on rows####
my #sorted_array= map{$_->[0]}sort{$a->[1] cmp $b->[1]} map{[$_,"#$_"]}#array;
But I was not sure how to wrap it into one(for both rows and column). Can I achieve this using Schwartzian transform.

Yes, you can use it,
use strict;
use warnings;
my #array =( [qw(b e d)], [qw(s a f)], [qw(g i h)] );
my #sorted_array =
map { $_->[0] }
sort {
$a->[1] cmp $b->[1]
}
map {
my $r = [ sort #$_ ];
[$r, "#$r"];
}
#array;
use Data::Dumper;
print Dumper \#sorted_array;
output
$VAR1 = [
[
'a',
'f',
's'
],
[
'b',
'd',
'e'
],
[
'g',
'h',
'i'
]
];

This is two separate sorts. First you want to sort the inner arrays individually, then you can sort the outer array by, perhaps, the first element of each of the inner ones.
use List::UtilsBy qw( sort_by );
my #array =( [qw(b e d)], [qw(s a f)], [qw(g i h)] );
# sort the inner ones individually
#$_ = sort #$_ for #array;
# sort the whole by the first element of each
my #sorted_array = sort_by { $_->[0] } #array;
Or if you'd prefer doing it all in one go and avoiding the temporary mutation:
my #sorted_array = sort_by { $_->[0] }
map { [ sort #$_ ] } #array;

Related

Remove duplicates from a 2D array in perl

I have a 2D array in perl whose data is coming as rows in html format from a DB like the data shown below:
<tr><td>Rafa</td><td>Nadal</td><td>Data1</td></tr>,
<tr><td>Goran</td><td>Ivan</td><td>Data2</td></tr>,
<tr><td>Leander</td><td>Paes</td><td>Data2</td></tr>,
<tr><td>Leander</td><td>Paes</td><td>Data2</td></tr>
i want to remove the duplicate rows from the array.
"<tr><td>Leander</td><td>Paes</td><td>Data2</td></tr>" should be removed in above case.
I tried the below piece of code, but it's not working out.
sub unique {
my %seen;
grep ! $seen{ join $;, #$_ }++, #_
}
First: you really should try not to use outdated Perl syntax and side effects.
Second: the answer depends on the data structure you generate from the input. Here are two example implementations:
#!/usr/bin/perl
use strict;
use warnings;
# 2D Array: list of array references
my #data = (
['Rafa', 'Nadal', 'Data1'],
['Goran', 'Ivan', 'Data2'],
['Leander', 'Paes', 'Data2'],
['Leander', 'Paes', 'Data2'],
);
my %seen;
foreach my $unique (
grep {
not $seen{
join('', #{ $_ })
}++
} #data
) {
print join(',', #{ $unique }), "\n";
}
print "\n";
# List of "objects", keys are table column names
#data = (
{ first => 'Rafa', last => 'Nadal', data => 'Data1' },
{ first => 'Goran', last => 'Ivan', data => 'Data2' },
{ first => 'Leander', last => 'Paes', data => 'Data2' },
{ first => 'Leander', last => 'Paes', data => 'Data2' },
);
%seen = ();
my #key_order = qw(first last data);
foreach my $unique (
grep {
not $seen{
join('', #{ $_ }{ #key_order } )
}++
} #data
) {
print join(',', #{ $unique }{ #key_order }), "\n";
}
Output:
$ perl dummy.pl
Rafa,Nadal,Data1
Goran,Ivan,Data2
Leander,Paes,Data2
Rafa,Nadal,Data1
Goran,Ivan,Data2
Leander,Paes,Data2
The shown sub is good for the job, with an array which for elements has array references. That is indeed a basic way to organize 2D data, where your rows are arrayrefs.
There are modules that can be leveraged for this, but this good old method works fine as well
use warnings;
use strict;
use Data::Dump qw(dd);
sub uniq_arys {
my %seen;
grep { not $seen{join $;, #$_}++ } #_;
}
my #data = (
[ qw(one two three) ],
[ qw(ten eleven twelve) ],
[ qw(10 11 12) ],
[ qw(ten eleven twelve) ],
);
my #data_uniq = uniq_arys(#data);
dd \#data_uniq;
Prints as expected (last row is gone), using Data::Dump to show data.
The sub works by joining each array into a string, and those are then checked for duplicates using a hash. The $; is a subscript separator, and an empty string '' is just fine instead.
This approach creates a lot of ancillary data -- in principle doubles the data -- and if performance becomes a problem it may be better to simply compare element-wise (at the cost of complexity). This can be an issue only with rather large data sets.
A module example: use uniq_by from List::UtilsBy
use List::UtilsBy qw(uniq_by);
my #no_dupes = uniq_by { join '', #$_ } #data;
This does, more or less, the same as the sub above.

Perl to sort words by user-defined alphabet sequence

I have an array of "words" (strings), which consist of letters from an "alphabet" with user-defined sequence. E.g my "alphabet" starts with "ʔ ʕ b g d", so a list of "words" (bʔd ʔbg ʕʔb bʕd) after sort by_my_alphabet should be ʔbd ʕʔb bʔd bʕd.
sort by_my_alphabet (bʔd ʔbg ʕʔb bʕd) # gives ʔbd ʕʔb bʔd bʕd
Is there a way to make a simple subroutine by_my_alphabet with $a and $b to solve this problem?
Simple, and very fast because it doesn't use a compare callback, but it needs to scan the entire string:
use utf8;
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
my #sorted =
map { join '', #my_chr[ unpack 'W*', $_ ] } # "\x00\x01\x02\x03\x04" ⇒ "ʔʕbgd"
sort
map { pack 'W*', #my_ord{ split //, $_ } } # "ʔʕbgd" ⇒ "\x00\x01\x02\x03\x04"
#unsorted;
Optimized for long strings since it only scans a string up until a difference is found:
use utf8;
use List::Util qw( min );
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
sub my_cmp($$) {
for ( 0 .. ( min map length($_), #_ ) - 1 ) {
my $cmp = $my_ord{substr($_[0], $_, 1)} <=> $my_ord{substr($_[1], $_, 1)};
return $cmp if $cmp;
}
return length($_[0]) <=> length($_[1]);
}
my #sorted = sort my_cmp #unsorted;
Both should be faster than Sobrique's. Theirs uses a compare callback, and it scans the entire strings being compared.
Yes.
sort can take any function that returns a relative sort position. All you need is a function that correctly looks up the 'sort value' of a string for comparing.
So all you need to do here is define a 'relative weight' of your extra letters, and then compare the two.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my #sort_order = qw ( B C A D );
my #array_to_sort = qw ( A B C D A B C D AB BB CCC ABC );
my $count = 0;
my %position_of;
$position_of{$_} = $count++ for #sort_order;
print Dumper \%position_of;
sub sort_by_pos {
my #a = split //, $a;
my #b = split //, $b;
#iterate one letter at a time, using 'shift' to take it off the front
#of the array.
while ( #a and #b ) {
my $result = $position_of{shift #a} <=> $position_of{shift #b};
#result is 'true' if it's "-1" or "1" which indicates relative position.
# 0 is false, and that'll cause the next loop iteration to test the next
#letter-pair
return $result if $result;
}
#return a value based on remaining length - longest 'string' will sort last;
#That's so "AAA" comparing with "AA" comparison actually work,
return scalar #a <=> scalar #b;
}
my #new = sort { sort_by_pos } #array_to_sort;
print Dumper \#new;
Bit of a simple case, but it sorts our array into:
$VAR1 = [
'B',
'B',
'BB',
'C',
'C',
'CCC',
'A',
'A',
'AB',
'ABC',
'D',
'D'
];

Printing values of an array from an array of array references

How can I print the values of an array. I have tried several ways but I am unable to get the required values out of the arrays:
#array;
Dumper output is as below :
$VAR1 = [
'a',
'b',
'c'
];
$VAR1 = [
'd',
'e',
'f'
];
$VAR1 = [
'g',
'h',
'i'
];
$VAR1 = [
'j',
'k',
'l'
];
for my $value (#array) {
my $ip = $value->[0];
DEBUG("DEBUG '$ip\n'");
}
I am getting output as below, which means foreach instance I am only getting the first value.
a
d
g
j
I have tried several approaches :
First option :
my $size = #array;
for ($n=0; $n < $size; $n++) {
my $value=$array[$n];
DEBUG( "DEBUG: Element is as $value" );
}
Second Option :
for my $value (#array) {
my $ip = $value->[$_];
DEBUG("DEBUG Element is '$ip\n'");
}
What is the best way to do this?
It is obvious that you have list of arrays. You only loop over top list and print first (0th) value in your first example. Barring any automatic dumpers, you need to loop over both levels.
for my $value (#array) {
for my $ip (#$value) {
DEBUG("DEBUG '$ip\n'");
}
}
You want to dereference here so you need to do something like:
my #array_of_arrays = ([qw/a b c/], [qw/d e f/ ], [qw/i j k/])
for my $anon_array (#array_of_arrays) { say for #{$anon_array} }
Or using your variable names:
use strict;
use warnings;
my #array = ([qw/a b c/], [qw/d e f/], [qw/i j k/]);
for my $ip (#array) {
print join "", #{$ip} , "\n"; # or "say"
}
Since there are anonymous arrays involved I have focused on dereferencing (using PPB style!) instead of nested loops, but print for is a loop in disguise really.
Cheers.

Inverting a Hash's Key and Values in Perl

I would like to make the value the key, and the key the value. What is the best way to go about doing this?
Adapted from http://www.dreamincode.net/forums/topic/46400-swap-hash-values/:
Assuming your hash is stored in $hash:
while (($key, $value) = each %hash) {
$hash2{$value}=$key;
}
%hash=%hash2;
Seems like much more elegant solution can be achieved with reverse (http://www.misc-perl-info.com/perl-hashes.html#reverseph):
%nhash = reverse %hash;
Note that with reverse, duplicate values will be overwritten.
Use reverse:
use Data::Dumper;
my %hash = ('month', 'may', 'year', '2011');
print Dumper \%hash;
%hash = reverse %hash;
print Dumper \%hash;
As mentioned, the simplest is
my %inverse = reverse %original;
It "fails" if multiple elements have the same value. You could create an HoA to handle that situation.
my %inverse;
push #{ $inverse{ $original{$_} } }, $_ for keys %original;
So you want reverse keys & vals in a hash? So use reverse... ;)
%hash2 = reverse %hash;
reverting (k1 => v1, k2 => v2) - yield (v2=>k2, v1=>k1) - and that is what you want. ;)
my %orig_hash = (...);
my %new_hash;
%new_hash = map { $orig_hash{$_} => $_ } keys(%orig_hash);
The map-over-keys solution is more flexible. What if your value is not a simple value?
my %forward;
my %reverse;
#forward is built such that each key maps to a value that is a hash ref:
#{ a => 'something', b=> 'something else'}
%reverse = map { join(',', #{$_}{qw(a b)}) => $_ } keys %forward;
Here is a way to do it using Hash::MultiValue.
use experimental qw(postderef);
sub invert {
use Hash::MultiValue;
my $mvh = Hash::MultiValue->from_mixed(shift);
my $inverted;
$mvh->each( sub { push $inverted->{ $_[1] }->#* , $_[0] } ) ;
return $inverted;
}
To test this we can try the following:
my %test_hash = (
q => [qw/1 2 3 4/],
w => [qw/4 6 5 7/],
e => ["8"],
r => ["9"],
t => ["10"],
y => ["11"],
);
my $wow = invert(\%test_hash);
my $wow2 = invert($wow);
use DDP;
print "\n \%test_hash:\n\n" ;
p %test_hash;
print "\n \%test_hash inverted as:\n\n" ;
p $wow ;
# We need to sort the contents of the multi-value array reference
# for the is_deeply() comparison:
map {
$test_hash{$_} = [ sort { $a cmp $b || $a <=> $b } #{ $test_hash{$_} } ]
} keys %test_hash ;
map {
$wow2->{$_} = [ sort { $a cmp $b || $a <=> $b } #{ $wow2->{$_} } ]
} keys %$wow2 ;
use Test::More ;
is_deeply(\%test_hash, $wow2, "double inverted hash == original");
done_testing;
Addendum
Note that in order to pass the gimmicky test here, the invert() function relies on %test_hash having array references as values. To work around this if your hash values are not array references, you can "coerce" the regular/mixed hash into a multi-value hash thatHash::MultiValue can then bless into an object. However, this approach means even single values will appear as array references:
for ( keys %test_hash ) {
if ( ref $test_hash{$_} ne 'ARRAY' ) {
$test_hash{$_} = [ $test_hash{$_} ]
}
}
which is longhand for:
ref($_) or $_ = [ $_ ] for values %test_hash ;
This would only be needed to get the "round trip" test to pass.
Assuming all your values are simple and unique strings, here is one more easy way to do it.
%hash = ( ... );
#newhash{values %hash} = (keys %hash);
This is called a hash slice. Since you're using %newhash to produce a list of keys, you change the % to a #.
Unlike the reverse() method, this will insert the new keys and values in the same order as they were in the original hash. keys and values always return their values in the same order (as does each).
If you need more control over it, like sorting it so that duplicate values get the desired key, use two hash slices.
%hash = ( ... );
#newhash{ #hash{sort keys %hash} } = (sort keys %hash);

Perl, convert numerically-keyed hash to array

If I have a hash in Perl that contains complete and sequential integer mappings (ie, all keys from from 0 to n are mapped to something, no keys outside of this), is there a means of converting this to an Array?
I know I could iterate over the key/value pairs and place them into a new array, but something tells me there should be a built-in means of doing this.
You can extract all the values from a hash with the values function:
my #vals = values %hash;
If you want them in a particular order, then you can put the keys in the desired order and then take a hash slice from that:
my #sorted_vals = #hash{sort { $a <=> $b } keys %hash};
If your original data source is a hash:
# first find the max key value, if you don't already know it:
use List::Util 'max';
my $maxkey = max keys %hash;
# get all the values, in order
my #array = #hash{0 .. $maxkey};
Or if your original data source is a hashref:
my $maxkey = max keys %$hashref;
my #array = #{$hashref}{0 .. $maxkey};
This is easy to test using this example:
my %hash;
#hash{0 .. 9} = ('a' .. 'j');
# insert code from above, and then print the result...
use Data::Dumper;
print Dumper(\%hash);
print Dumper(\#array);
$VAR1 = {
'6' => 'g',
'3' => 'd',
'7' => 'h',
'9' => 'j',
'2' => 'c',
'8' => 'i',
'1' => 'b',
'4' => 'e',
'0' => 'a',
'5' => 'f'
};
$VAR1 = [
'a',
'b',
'c',
'd',
'e',
'f',
'g',
'h',
'i',
'j'
];
OK, this is not very "built in" but works. It's also IMHO preferrable to any solution involving "sort" as it's faster.
map { $array[$_] = $hash{$_} } keys %hash; # Or use foreach instead of map
Otherwise, less efficient:
my #array = map { $hash{$_} } sort { $a<=>$b } keys %hash;
Perl does not provide a built-in to solve your problem.
If you know that the keys cover a particular range 0..N, you can leverage that fact:
my $n = keys(%hash) - 1;
my #keys_and_values = map { $_ => $hash{$_} } 0 .. $n;
my #just_values = #hash{0 .. $n};
This will leave keys not defined in %hashed_keys as undef:
# if we're being nitpicky about when and how much memory
# is allocated for the array (for run-time optimization):
my #keys_arr = (undef) x scalar %hashed_keys;
#keys_arr[(keys %hashed_keys)] =
#hashed_keys{(keys %hashed_keys)};
And, if you're using references:
#{$keys_arr}[(keys %{$hashed_keys})] =
#{$hashed_keys}{(keys %{$hashed_keys})};
Or, more dangerously, as it assumes what you said is true (it may not always be true … Just sayin'!):
#keys_arr = #hashed_keys{(sort {$a <=> $b} keys %hashed_keys)};
But this is sort of beside the point. If they were integer-indexed to begin with, why are they in a hash now?
As DVK said, there is no built in way, but this will do the trick:
my #array = map {$hash{$_}} sort {$a <=> $b} keys %hash;
or this:
my #array;
keys %hash;
while (my ($k, $v) = each %hash) {
$array[$k] = $v
}
benchmark to see which is faster, my guess would be the second.
#a = #h{sort { $a <=> $b } keys %h};
Combining FM's and Ether's answers allows one to avoid defining an otherwise unnecessary scalar:
my #array = #hash{ 0 .. $#{[ keys %hash ]} };
The neat thing is that unlike with the scalar approach, $# works above even in the unlikely event that the default index of the first element, $[, is non-zero.
Of course, that would mean writing something silly and obfuscated like so:
my #array = #hash{ $[ .. $#{[ keys %hash ]} }; # Not recommended
But then there is always the remote chance that someone needs it somewhere (wince)...
$Hash_value =
{
'54' => 'abc',
'55' => 'def',
'56' => 'test',
};
while (my ($key,$value) = each %{$Hash_value})
{
print "\n $key > $value";
}
We can write a while as below:
$j =0;
while(($a1,$b1)=each(%hash1)){
$arr[$j][0] = $a1;
($arr[$j][1],$arr[$j][2],$arr[$j][3],$arr[$j][4],$arr[$j][5],$arr[$j][6]) = values($b1);
$j++;
}
$a1 contains the key and
$b1 contains the values
In the above example i have Hash of array and the array contains 6 elements.
An easy way is to do #array = %hash
For example,
my %hash = (
"0" => "zero",
"1" => "one",
"2" => "two",
"3" => "three",
"4" => "four",
"5" => "five",
"6" => "six",
"7" => "seven",
"8" => "eight",
"9" => "nine",
"10" => "ten",
);
my #array = %hash;
print "#array"; would produce the following output,
3 three 9 nine 5 five 8 eight 2 two 4 four 1 one 10 ten 7 seven 0 zero
6 six