I'm looking for logic (not an additional module) to sort by this format. I have a list of strings that looks like:
asdadasBBBsfasdasdas-0112
asdanfnfnfnfnf222ads-1210
etc.
I can't just sort by the numbers because, for instance, 812 > 113 (812 = August 2012, 113 = January 2013), so a plain numeric sort gives the wrong order.
Is there a good strategy?
thanks,
A Schwartzian transform would be a huge waste here. This similar construct, whose name I can never remember, would be way better.
my @sorted =
    map substr($_, 4),
    sort
    map substr($_, -2) . substr($_, -4, 2) . $_,
    @unsorted;
Using the match operator instead of substr:
my @sorted =
    map substr($_, 4),
    sort
    map { /(..)(..)\z/s; $2.$1.$_ }
    @unsorted;
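The construct used in the two snippets above (prepend a fixed-width key, sort as plain strings, strip the key again) is commonly known as the Guttman-Rosler Transform. A minimal step-by-step sketch, assuming every string ends in a four-digit mmyy suffix; the sample data and the intermediate names @keyed and @ordered are only illustrative:

```perl
use strict;
use warnings;

my @unsorted = (
    'asdadasBBBsfasdasdas-0112',
    'asdanfnfnfnfnf222ads-1210',
    'asdanfnfnfnfnf222ads-1011',
);

# Step 1: prepend a fixed-width yymm key to every string.
my @keyed = map { substr($_, -2) . substr($_, -4, 2) . $_ } @unsorted;

# Step 2: a plain string sort now orders by year, then by month.
my @ordered = sort @keyed;

# Step 3: strip the four-character key again.
my @sorted = map { substr($_, 4) } @ordered;

print "$_\n" for @sorted;
```

Because the key has a fixed width, the default string sort is enough and no comparison block is called at all.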
How about a Schwartzian transform:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dump qw(dump);
my @list = (
'asdadasBBBsfasdasdas-0112',
'asdanfnfnfnfnf222ads-1210',
'asdanfnfnfnfnf222ads-1211',
'asdanfnfnfnfnf222ads-1010',
'asdanfnfnfnfnf222ads-1011',
);
my @sorted =
    map { $_->[0] }
    sort { $a->[1] <=> $b->[1] or $a->[2] <=> $b->[2] }
    map { /-(\d\d)(\d\d)$/; [$_, $2, $1] } @list;
dump @sorted;
output:
(
"asdanfnfnfnfnf222ads-1010",
"asdanfnfnfnfnf222ads-1210",
"asdanfnfnfnfnf222ads-1011",
"asdanfnfnfnfnf222ads-1211",
"asdadasBBBsfasdasdas-0112",
)
Use a sorting function that looks at the year first, and then the month:
sub mmyy_sorter {
my $a_yy = substr($a, -2);
my $b_yy = substr($b, -2);
my $a_mm = substr($a, -4, 2);
my $b_mm = substr($b, -4, 2);
return ($a_yy cmp $b_yy) || ($a_mm cmp $b_mm);
}
my @sorted = sort mmyy_sorter @myarray;
NB: this is technically not as efficient as it could be as it has to re-calculate the month and year subfields for every comparison, not just once for each item in the array.
It would also be possible to take advantage of Perl's automatic type conversion and use the <=> operator in place of cmp, since all of the values actually represent numbers.
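One way to avoid the repeated work mentioned above is to compute each item's key once, before sorting. This is only a sketch, assuming every string ends in a four-digit mmyy suffix; the %key hash and the sample data are illustrative and not part of the answer above:

```perl
use strict;
use warnings;

# Sample data (illustrative); each string ends in mmyy.
my @myarray = (
    'asdadasBBBsfasdasdas-0112',
    'asdanfnfnfnfnf222ads-1210',
    'asdanfnfnfnfnf222ads-1011',
);

# Build a yymm key once per item instead of once per comparison.
my %key = map { $_ => substr($_, -2) . substr($_, -4, 2) } @myarray;

# Every key is four digits, so a numeric comparison with <=> is safe.
my @sorted = sort { $key{$a} <=> $key{$b} } @myarray;

print "$_\n" for @sorted;
```

The comparison block now only does two hash lookups per call, at the cost of holding one extra key per distinct string in memory.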
What about converting the value to months? For example:
812 -> 12 * 12 + 8 = 152
113 -> 13 * 12 + 1 = 157
Once the years are turned into months, the values compare correctly. You can use a regex to extract the numbers.
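The month conversion could be sketched like this, assuming the strings end in a four-digit mmyy suffix; month_count is a hypothetical helper name and the three sample strings are made up for illustration:

```perl
use strict;
use warnings;

# Convert a trailing mmyy suffix into a single month count (yy * 12 + mm).
sub month_count {
    my ($str) = @_;
    my ($mm, $yy) = $str =~ /(\d\d)(\d\d)\z/
        or die "no mmyy suffix in '$str'";
    return $yy * 12 + $mm;
}

my @list   = ('x-0113', 'x-0812', 'x-1210');
my @sorted = sort { month_count($a) <=> month_count($b) } @list;

print "$_\n" for @sorted;
```

With this data, August 2012 becomes 152 and January 2013 becomes 157, so the two dates from the question compare in the right order.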
Thanks to @M42 for the sample data.
use strict;
use warnings;
use feature 'say';
my @list = (
'asdadasBBBsfasdasdas-0112',
'asdanfnfnfnfnf222ads-1210',
'asdanfnfnfnfnf222ads-1211',
'asdanfnfnfnfnf222ads-1010',
'asdanfnfnfnfnf222ads-1011',
);
my @sorted = sort {
    my ($aa, $bb) = map { /(..)(..)\z/ and $2.$1 } $a, $b;
    $aa <=> $bb;
} @list;
say for @sorted;
output
asdanfnfnfnfnf222ads-1010
asdanfnfnfnfnf222ads-1210
asdanfnfnfnfnf222ads-1011
asdanfnfnfnfnf222ads-1211
asdadasBBBsfasdasdas-0112
For example, I have an array like this:
my @arr = (
"blabla\t23\t55",
"jkdcbx\t55\t89",
"jdxjcl\t88\t69",
......)
I need to sort this array by the second column after \t, without outer splits. Is that possible?
There may be a more elegant way, but this will work:
my @arr = ( "blabla\t23\t55", "jkdcbx\t55\t89", "jdxjcl\t88\t69");
for (sort {(split(/\t/,$a))[2] <=> (split(/\t/,$b))[2]} @arr) {
    print "$_\n";
}
Update
I've just realised that your question may mean that you want to sort by the third column instead of the second
That would be done by using
my ($aa, $bb) = map { (split /\t/)[2] } $a, $b;
instead
output
blabla 23 55
jdxjcl 88 69
jkdcbx 55 89
I always prefer to use map to convert the values from the original data into the keys that they should be sorted by.
This program demonstrates.
I assume you want the values sorted numerically? Unfortunately, your example data is already sorted as you describe.
use strict;
use warnings 'all';
use feature 'say';
my @arr = (
"blabla\t23\t55",
"jkdcbx\t55\t89",
"jdxjcl\t88\t69",
);
my @sorted = sort {
    my ($aa, $bb) = map { (split /\t/)[1] } $a, $b;
    $aa <=> $bb;
} @arr;
say for @sorted;
output
blabla 23 55
jkdcbx 55 89
jdxjcl 88 69
Try this
use warnings;
use strict;
no warnings "numeric";
my @arr = (
"blabla\t23\t55",
"jkdcbx\t85\t89",
"jdxjcl\t83\t69",
);
my @result = sort {$a=~s/^[^\t]*\t//r <=> $b=~s/^[^\t]*\t//r } @arr;
$, = "\n";
print @result,"\n";
I used the following techniques with sort:
A negated character class ([^\t]) to match everything up to the first tab
The non-destructive modifier (/r), which performs the substitution on a copy and returns the new value
no warnings "numeric", to turn off the warning for non-numeric values
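To make the individual pieces easier to follow, here is a small sketch that applies them to a single line outside of sort. The variable names are illustrative, and the /r modifier needs Perl 5.14 or later:

```perl
use strict;
use warnings;

my $line = "blabla\t23\t55";

# [^\t]* matches everything up to the first tab; /r returns the modified
# copy and leaves $line untouched.
my $rest = $line =~ s/^[^\t]*\t//r;

print "rest: $rest\n";   # the first column has been removed
print "line: $line\n";   # the original is unchanged

{
    # "23\t55" is not a clean number, so using it numerically would warn.
    no warnings 'numeric';
    my $n = $rest + 0;   # only the leading digits are used
    print "numeric value: $n\n";
}
```

Because only the leading digits count in numeric context, the comparison in the sort block effectively looks at the second column.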
I have a hash which looks like this
my %hash = (
'124:8' => '',
'4:2' => '',
'17:11' => '',
'17:0' => '',
#and so on
);
I tried to sort the hash keys from the smallest number to the largest:
for my $keys ( sort { $a > $b } keys %hash ) {
#do stuff
}
This gives a result that looks correct, but it sometimes fails. I don't know how to compare keys such as 124:8 and 4:2, since they have a : in the middle. Any suggestions?
You might want to sort on the first and second numbers, delimited by :
my @sorted = sort {
    my ($aa, $bb) = map [ split /:/ ], $a, $b;
    $aa->[0] <=> $bb->[0] || $aa->[1] <=> $bb->[1]
} keys %hash;
for my $key (@sorted) { .. }
Using a Schwartzian transform:
my @sorted = map $_->[0],
    sort {
        $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2]
    }
    map [ $_, split /:/ ],
    keys %hash;
When you sort numbers, you use the <=> operator:
for my $key (sort { $a <=> $b } keys %hash) {
This operator returns 1, 0 or -1 depending on the comparison. > only returns true or false, which explains it working with some results, but not all.
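The difference between the two operators can be seen by printing what each one returns. A small sketch with made-up numbers:

```perl
use strict;
use warnings;

# <=> gives sort the three-way answer it needs.
my @three_way = (4 <=> 17, 17 <=> 17, 124 <=> 17);
print "@three_way\n";

# > only answers true or false, so sort is never told that $a belongs
# before $b.
my @two_way = map { $_ ? 'true' : 'false' } (4 > 17, 17 > 17, 124 > 17);
print "@two_way\n";
```

With > the cases "smaller" and "equal" both come back as false, which is why the result only looks right some of the time.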
Because your keys are not numbers, they will only partially convert to numbers, and you will get warnings
Argument "17:11" isn't numeric in sort
Then you will need to use something like Sort::Key::Natural, or roll your own, such as:
sort {
    my @a = $a =~ /\d+/g;
    my @b = $b =~ /\d+/g;
    $a[0] <=> $b[0] ||
    $a[1] <=> $b[1] # continue as long as needed
} keys %hash
You may also use a Schwartzian transform to cache the numbers and possibly speed up the sort.
Or just sort by string comparison, though this will cause 17:11 to end up before 17:2.
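To see what a plain string comparison does with keys like these, here is a small sketch. The key list is the one from the question with 17:2 added for illustration:

```perl
use strict;
use warnings;

my @keys = ('124:8', '4:2', '17:11', '17:0', '17:2');

# The default sort compares the keys character by character.
my @as_strings = sort @keys;

print "@as_strings\n";
```

This prints 124:8 17:0 17:11 17:2 4:2, because "1" sorts before "2" and before "4" regardless of how many digits follow.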
Not as elegant as the solutions above, but what about converting the : to . and comparing the keys as floating-point numbers? Because no arithmetic is performed, there are no rounding errors, and the following could work:
my %tmp = map { (my $x = $_) =~ s/:/./; $_,$x} keys %hash;
my @sortedkeys = sort { $tmp{$a} <=> $tmp{$b} } keys %tmp;
#4:2 17:0 17:11 124:8
Or is this approach wrong?
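The question can be checked with a key whose second part has a different number of digits. A small sketch, assuming the part after the : is meant to be compared as an integer; the three keys are chosen for illustration:

```perl
use strict;
use warnings;

my @keys = ('17:2', '17:11', '4:2');

# The float conversion: "17:11" becomes 17.11 and "17:2" becomes 17.2.
my %tmp = map { (my $x = $_) =~ s/:/./; ($_ => $x) } @keys;
my @as_float = sort { $tmp{$a} <=> $tmp{$b} } @keys;

# Comparing the two fields separately as integers.
my @by_field = sort {
    my @a = split /:/, $a;
    my @b = split /:/, $b;
    $a[0] <=> $b[0] || $a[1] <=> $b[1]
} @keys;

print "float:    @as_float\n";
print "by field: @by_field\n";
```

As floats, 17.11 is smaller than 17.2, so 17:11 lands before 17:2. The float approach therefore only gives the intended order when the second parts all have the same number of digits.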
I have a set of strings:
$str1: 7-10-2013-X1
$str2: 19-04-2010-G2
$str3: 7-10-2013-X2
$str4: 7-12-2013-A
I want to sort the strings according to the date and the letters at the end. So, after sorting, the strings above will be:
$str2: 19-04-2010-G2
$str1: 7-10-2013-X1
$str3: 7-10-2013-X2
$str4: 7-12-2013-A
My idea would be to use regex grouping and then sort according to each group, but I am looking for more efficient ways to implement this in Perl.
Thanks.
Using a Schwartzian transform and the fact that dates in YYYYMMDD format sort lexicographically:
#!/usr/bin/perl
use warnings;
use strict;
my @strings = qw(7-10-2013-X1 19-04-2010-G2 7-10-2013-X2 7-12-2013-A);
print "$_\n" for map $_->[1],
    sort { $a->[0] cmp $b->[0] }
    map {
        my ($d, $m, $y, $str) = split /-/;
        [sprintf('%d%02d%02d%s', $y, $m, $d, $str), $_]
    }
    @strings;
I am reading an ordered file for which I must count by-hour, by-minute or by-second occurrences. If requested, I must print times with 0 occurrences (normalized output) or skip them (non-normalized output). The output must obviously be ordered.
I first thought of using an array. When the output is non-normalized, I am doing roughly the equivalent of:
$array[10] = 100;
$array[10000] = 10000;
And to print the result:
foreach (@array) {
    print if defined;
}
Is there a way to reduce iterations to only elements defined in the array? In the previous example, that would mean doing only two iterations, instead of 10000 as using $#array implies. Then I would also need a way to know the current array index in a loop. Does such a thing exist?
I am leaning more and more toward using a hash instead. A hash solves my problem and also eliminates the need to convert hh:mm:ss times to an index and vice versa.
Or do you have a better solution to suggest for this simple problem?
Yes, use a hash. You can iterate over the ordered array of the keys of the hash if your keys sort correctly.
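A minimal sketch of the hash approach for the non-normalized output, assuming the keys are zero-padded hh:mm:ss strings. The %count contents are made up; in practice they would be accumulated while reading the file:

```perl
use strict;
use warnings;

# Illustrative per-second counts; while reading the file this would be
# something like $count{$time}++.
my %count = (
    '10:15:02' => 3,
    '09:00:00' => 1,
    '10:15:00' => 7,
);

# Zero-padded hh:mm:ss keys sort correctly as plain strings, and only the
# times that actually occurred are visited.
my @lines = map "$_ $count{$_}", sort keys %count;

print "$_\n" for @lines;
```

The loop runs once per stored time, so sparse data costs nothing extra, and the key itself is the time, so no index conversion is needed.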
You can also remember just the pairs of numbers in an array:
#!/usr/bin/perl
use warnings;
use strict;
my @ar = ( [ 10, 100 ],
[ 100, 99 ],
[ 12, 1 ],
[ 13, 2 ],
[ 15, 1 ],
);
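# Non-normalized output (as defined in the question): only the times that occurred.
sub non_normalized {
    my @ar = sort { $a->[0] <=> $b->[0] } @_;
    map "@$_", @ar;
}
# Normalized output (as defined in the question): gaps are filled with zero counts.
sub normalized {
    my @ar = sort { $a->[0] <=> $b->[0] } @_;
    unshift @ar, [0, 0] unless $ar[0][0] == 0;
    my @return;
    for my $i (0 .. $#ar) {
        push @return, "@{ $ar[$i] }";
        next if $i == $#ar;   # nothing to fill after the last entry
        push @return, $_ . $" . 0 for 1 + $ar[$i][0] .. $ar[$i + 1][0] - 1;
    }
    return @return;
}
print join "\n", non_normalized(@ar), q();
print "\n";
print join "\n", normalized(@ar), q();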
Consider:
use warnings;
my @a = (1, 11, 3, 5, 21, 9, 10);
my @b = sort @a;
print "@b";
Output: 1 10 11 21 3 5 9
Codepad link: http://codepad.org/Fvhcf3eP
I guess the sort function is not taking the array's elements as an integer. That is why the output is not:
1 3 5 9 10 11 21
Is it?
How can I get the above result as output?
The default implementation of Perl's sort function is to sort values as strings. To perform numerical sorting:
my @a = sort {$a <=> $b} @b;
The linked page shows other examples of how to sort case-insensitively, in reverse order (descending), and so on.
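For reference, the variations mentioned above could look like this. It is only a sketch with made-up data and is not taken from the linked page:

```perl
use strict;
use warnings;

my @numbers = (1, 11, 3, 5, 21, 9, 10);
my @words   = qw(pear Apple banana);

my @ascending  = sort { $a <=> $b } @numbers;          # numeric
my @descending = sort { $b <=> $a } @numbers;          # numeric, reversed
my @no_case    = sort { lc($a) cmp lc($b) } @words;    # case-insensitive

print "@ascending\n";
print "@descending\n";
print "@no_case\n";
```

Swapping $a and $b reverses the order, and applying lc to both sides makes the string comparison ignore case.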
You can create explicit subroutines to prevent duplication:
sub byord { $a <=> $b };
...
@a = sort byord @b;
This is functionally equivalent to the first example using an anonymous subroutine.
You are correct. So just tell Perl to treat it as an integer like below.
File foop.pl
use warnings;
my @a = (1, 11, 3, 5, 21, 9, 10);
my @b = sort {$a <=> $b} @a;
print "@b";
Run
perl foop.pl
1 3 5 9 10 11 21
Provide a custom comparison function (comparing numerically):
sort {$a <=> $b} @array;
Here is a numerical sort:
@sorted = sort { $a <=> $b } @not_sorted
@b = sort { $a <=> $b } @a;
This sorts numerically.
Use the spaceship operator: sort { $a <=> $b } @a
Guessing is the wrong approach. If you don't understand sort, look it up: perldoc -f sort
my @b = sort{$a <=> $b} @a;