string array sorting issue in perl

I am running the code below to sort strings, but I am not getting the expected results.
Code:
use warnings;
use strict;
my @strArray= ("64.0.71","68.0.71","62.0.1","62.0.2","62.0.11");
my @sortedStrArray = sort { $a cmp $b } @strArray;
foreach my $element (@sortedStrArray ) {
print "\n$element";
}
Result:
62.0.1
62.0.11 <--- these two
62.0.2 <---
64.0.71
68.0.71
Expected Result:
62.0.1
62.0.2 <---
62.0.11 <---
64.0.71
68.0.71

"1" character 0x31. "2" is character 0x32. 0x31 is less than 0x32, so "1" sorts before "2". Your expectations are incorrect.
To obtain the results you desire to obtain, you could use the following:
my @sortedStrArray =
map substr($_, 3),
sort
map pack('CCCa*', split(/\./), $_),
@strArray;
Or for a much wider range of inputs:
use Sort::Key::Natural qw( natsort );
my @sortedStrArray = natsort(@strArray);

cmp is comparing lexicographically (like a dictionary), not numerically. This means it will go through your strings character by character until there is a mismatch. In the case of "62.0.11" vs. "62.0.2", the strings are equal up until "62.0." and then it finds a mismatch at the next character. Since 2 > 1, it sorts "62.0.2" > "62.0.11". I don't know what you are using your strings for or if you have any control over how they're formatted, but if you were to change the formatting to "62.00.02" (every segment has 2 digits) instead of "62.0.2" then they would be sorted as you expect.
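If reformatting the strings is not an option, the same idea can be applied at comparison time. The following is a minimal sketch, assuming every segment is a plain non-negative integer; the helper name by_version is made up for this illustration and is not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two dotted strings segment by segment,
# numerically instead of character by character.
sub by_version {
    my ($x, $y) = @_;
    my @x = split /\./, $x;
    my @y = split /\./, $y;
    while (@x && @y) {
        my $cmp = shift(@x) <=> shift(@y);
        return $cmp if $cmp;
    }
    return @x <=> @y;    # the string with segments left over sorts last
}

my @strArray = ("64.0.71", "68.0.71", "62.0.1", "62.0.2", "62.0.11");
my @sorted   = sort { by_version($a, $b) } @strArray;
print "$_\n" for @sorted;
```

With this comparison "62.0.2" sorts before "62.0.11", because 2 < 11 when the last segments are compared as numbers.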

Schwartzian transform
This is an application of Randal Schwartz's transform.
First, understand what you want:
sorting by the first number, then the second, then the third.
Let's do it like this:
use warnings;
use strict;
use Data::Dumper;
my @strArray= ("64.0.71","68.0.71","62.0.1","62.0.2","62.0.11");
my @transformedArray = map{[$_,(split(/\./,$_))]}@strArray;
=pod
Here @transformedArray has this structure:
$each_element_of_array: [$element_from_original_array, $firstNumber, $secondNumber, $thirdNumber];
for example:
$transformedArray[0] ==== ["64.0.71", 64, 0, 71];
After that we sort it
first by the first number,
then by the second number,
then by the third number.
=cut
my @sortedArray = map{$_->[0]} # keep only the original string
sort{
$a->[1]<=>$b->[1] # first number
|| $a->[2]<=>$b->[2] # on a tie, second number
|| $a->[3]<=>$b->[3] # on a tie, third number
}
@transformedArray;
print Dumper(\@sortedArray);

Try the Perl module Sort::Versions, it is designed to give you what you expect: http://metacpan.org/pod/Sort::Versions
It supports alpha-numeric version ids as well.


truncate string in perl into substring with trailing ellipsis

I'm trying to truncate a string in a select input option using perl if it is longer than a set value, though I can't get it to work correctly.
my $value = defined $option->{value} ? $option->{value} : '';
my $maxValueLength = 50;
if ($value.length > $maxValueLength) {
$value = substr $value, 0, $maxValueLength + '...';
}
Another option is regex
$string =~ s/.{$maxLength}\K.*/.../;
It matches any character (.) a given number of times ({N}, here $maxLength), i.e. the first $maxLength characters of $string; then \K makes the regex "forget" everything matched so far, so those characters are not replaced. The rest of the string that is matched is then replaced by ...
See Lookaround assertions in perlre for \K.
This does start the regex engine for a simple task but it doesn't need any conditionals -- if the string is shorter than the maximum length the regex won't match and nothing happens.
Your code has several syntax errors. Turn on use strict and use warnings if you don't have it, and then read the error messages it tells you about. This is a bit tricky because of Perl's very complex syntax (see also Damian Conway's keynote from the 2020 Perl and Raku Conference), but it boils down to these:
Use of uninitialized value in concatenation (.) or string at line 7
Argument "..." isn't numeric in addition (+) at line 8
I've used the following adaptation of your code to produce these:
use strict;
use warnings;
my $value = '1234567890' x 10;
my $maxValueLength = 50;
if ( $value.length > $maxValueLength ) {
$value = substr $value, 0, $maxValueLength + '...';
}
print $value;
Now let's see what they mean.
The . operator in Perl is a concatenation. You cannot use it to call methods, and length is not a method on a string. Perl thinks you are using the built-in length (a function, not a method) without an argument, which makes it default to $_. Most built-ins do this, to make one-liners shorter. But $_ is not defined. Now the . tries to concatenate the length of undef to $value. And using undef in a string operation leads to this warning.
The correct way of doing this is length $value (or with parentheses if you prefer them, length($value)).
The + operator is not concatenation (we just learned that the . is). It's a numerical addition. Perl is pretty good at converting between strings and numbers as there aren't really any types, so saying 1 + "5" would give you 6 without problems, but it cannot do that for a couple of dots in a string. Hence it complains about a non-number value in an addition.
You want the substring with a given length, and then you want to attach the three dots. Because of operator precedence (or stickiness), you need parentheses () around the arguments of your substr call; otherwise the . '...' would become part of the length argument.
$value = substr($value, 0, $maxValueLength) . '...';
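Putting both fixes together, the truncation could be wrapped in a small helper. This is only a sketch: the name truncate_with_ellipsis is invented for this example, and it assumes the three dots should be added on top of the maximum length rather than counted within it.

```perl
use strict;
use warnings;

# Hypothetical helper: shorten $text to at most $max characters and
# append '...' only when something was actually cut off.
sub truncate_with_ellipsis {
    my ($text, $max) = @_;
    $text = '' unless defined $text;          # same default as in the question
    return $text if length($text) <= $max;    # length() is a function, not a method
    return substr($text, 0, $max) . '...';    # parentheses keep . outside substr
}

print truncate_with_ellipsis('1234567890' x 10, 50), "\n";
print truncate_with_ellipsis('short', 50), "\n";
```

A string that is exactly $max characters long is returned unchanged.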
To find the length of a string, use length(STRING).
Here is a code snippet showing how you can modify the script.
#!/usr/bin/perl
use strict;
use warnings;
use feature qw(say);
my $string = "abcdefghijklmnopqrstuvwxyz abcdefghijklmnopqrstuvwxyz abcdefghijklmnopqrstuvwxyz";
say "length of original string is:".length($string);
my $value = defined $string ? $string : '';
my $maxValueLength = 50;
if (length($value) > $maxValueLength) {
$value = substr $value, 0, $maxValueLength;
say "value:$value";
say "value's length:".length($value);
}
Output:
length of original string is:80
value:abcdefghijklmnopqrstuvwxyz abcdefghijklmnopqrstuvw
value's length:50

generate random binary number in perl

I want to generate 64 iteration of non-repetitive 6 digits that only consist of 0 and 1 (eg. 111111, 101111, 000000) by using perl.
I found code that can generate random hex and try to modify it but I think my code is all wrong. This is my code:
use strict;
use warnings;
my %a;
foreach (1 .. 64) {
my $r;
do {
$r = int(rand(2));
}
until (!exists($a{$r}));
printf "%06d\n", $r;
$a{$r}++;
}
Do you mean that you want 64 six-bit numbers, all distinct from each other? If so, then you should just shuffle the list (0, 1, 2, 3, …, 63), because there are exactly 64 six-bit numbers — you just want them in a random order.
And if you want to print them as base-two string, use the %06b format.
use List::Util;
my @list = List::Util::shuffle 0..63;
printf "%06b\n", $_ for @list;
From the comments:
I actually want to generate all possible 6-bit binary numbers. Since writing all the possible combinations by hand is cumbersome and prone to human error, I think it would be a good idea to just generate them using rand() with no repetition and store them in an array.
This is a horribly inefficient approach to take, thanks to random number collisions.
You get the same result with:
printf ( "%06b\n", $_ ) for 0..63;
If you're after a random order (although, you don't seem to suggest that you do):
use List::Util qw ( shuffle );
printf ( "%06b\n", $_ ) for shuffle (0..63);
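Since the comment mentions storing the values in an array, here is a minimal sketch of that step. The array names @codes and @shuffled are chosen for this illustration only.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Every 6-bit value, formatted as a zero-padded binary string.
my @codes = map { sprintf '%06b', $_ } 0 .. 63;

# Optional: the same 64 strings in random order, still without repeats.
my @shuffled = shuffle @codes;

print "$_\n" for @shuffled;
```

Because the list is generated rather than drawn at random, no collision check is needed.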
If you want 64 x 6-bit integers you can call int(rand(64)); 64 times, there's no need to generate each bit separately.
Your code can be modified to work like this:
#!/usr/bin/perl
# your code goes here
use strict;
use warnings;
my %a;
foreach (1 .. 64) {
my $r;
do
{
$r = int(rand(64));
} until (!exists($a{$r}));
printf "%06b\n", $r;
$a{$r}++;
}
The values already printed are tracked as keys of the hash %a, so no value is repeated. The %06b format specifier prints each one as a 6-bit binary number.

Why does split return an array with every second element empty?

I'm trying to split a string every 5 characters. The array I'm getting back from split isn't how I'm expecting it: all the even indexes are empty, the parts I'm looking for are on odd indexes.
This version doesn't output anything:
use warnings;
use strict;
my @ar = <DATA>;
foreach (@ar){
my @mkh = split (/(.{5})/,$_);
print $mkh[2];
}
__DATA__
aaaaabbbbbcccccdddddfffff
If I replace the print line with this (odd indexes 1 and 3):
print $mkh[1],"\n", $mkh[3];
The output is the first two parts:
aaaaa
bbbbb
I don't understand this, I expected to be able to print the first two parts with this:
print $mkh[0],"\n", $mkh[1];
Can someone explain what is wrong in my code, and help me fix it?
The first argument in split is the pattern to split on, i.e. it describes what separates your fields. If you put capturing groups in there (as you do), those will be added to the output of the split as specified in the split docs (last paragraph).
This isn't what you want - your separator isn't a group of five characters. You're looking to split a string every X characters. For that, better use:
my @mkh = (/...../g);
# or
my @mkh = (/.{5}/g);
or one of the other options you'll find in: How can I split a string into chunks of two characters each in Perl?
Debug using Data::Dump
To observe exactly what your split operation is doing, use a module like Data::Dump:
use warnings;
use strict;
while (<DATA>) {
my @mkh = split /(.{5})/;
use Data::Dump;
dd @mkh;
}
__DATA__
aaaaabbbbbcccccdddddfffff
Outputs:
("", "aaaaa", "", "bbbbb", "", "ccccc", "", "ddddd", "", "fffff", "\n")
As you can see, your code is splitting on groups of 5 characters, and leaving empty strings between them. This is obviously not what you want.
Use Pattern Matching instead
Instead, you simply want to capture groups of 5 characters. Therefore, you just need a pattern match with the /g Modifier:
use warnings;
use strict;
while (<DATA>) {
my @mkh = /(.{5})/g;
use Data::Dump;
dd @mkh;
}
__DATA__
aaaaabbbbbcccccdddddfffff
Outputs:
("aaaaa", "bbbbb", "ccccc", "ddddd", "fffff")
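For fixed-width chunks, unpack is another option that avoids the regex engine altogether. The sketch below assumes the trailing newline should be removed first and that a shorter final chunk is acceptable when the length is not a multiple of 5.

```perl
use strict;
use warnings;

my $line = "aaaaabbbbbcccccdddddfffff\n";
chomp $line;                         # drop the newline so it is not part of a chunk

# '(a5)*' repeats the 5-character template until the string is used up.
my @mkh = unpack '(a5)*', $line;

print join(',', @mkh), "\n";
```

Here $mkh[0] and $mkh[1] hold the first two parts, which is the indexing the question expected.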
You can also use a zero-width delimiter, i.e. split the string at each position that is preceded by 5 characters (using \K, which acts like a positive lookbehind here):
my @mkh = split (/.{5}\K/, $_);

multidimensional array: argument isn't numeric in array element

OS: AIX
Shell: KSH
Following the accepted answer on this question I have created a multidimensional array. However, I get an error when trying to print the content of the array.
Error:
Argument "content of $pvid" isn't numeric in array element at...
The script:
#!/usr/bin/perl
use warnings;
use strict;
use Term::ANSIColor;
my @arrpvid = ();
print colored( sprintf("%-10s %9s %8s %8s %8s", 'PVID', 'AIX', 'VIO', 'VTD', 'VHOST'), 'green' ), "\n";
foreach my $pvid (`lspv | awk '{print \$2'}`) {
foreach my $hdaix (`lspv | awk '{print \$1'}`) {
chomp $pvid;
chomp $hdaix;
push @{ $arrpvid[$pvid] }, $hdaix;
}
}
print $arrpvid[0][0];
Some explanation:
Basically I want to print 5 variables of 5 different arrays next to each other. The code is written only for 2 arrays.
The content of $pvid:
00088da343b00d9b
00088da38100f93c
The content of $hdaix:
hdisk0
hdisk1
Quick Fix
Looks like you want to use a hash rather than an array, making your inner push
push @{ $arrpvid{$pvid} }, $hdaix;
Note the change from square brackets to curly braces immediately surrounding $pvid. This tells the compiler that you want %arrpvid and not @arrpvid, so be sure to tweak your my declaration as well.
At the end to print the contents of %arrpvid, use
foreach my $pvid (sort { hex $a <=> hex $b } keys %arrpvid) {
local $" = "]["; # handy trick due to mjd
print "$pvid: [@{$arrpvid{$pvid}}]\n";
}
The Data::Dumper module is a quick and easy output tool.
use Data::Dumper;
$Data::Dumper::Indent = $Data::Dumper::Terse = 1;
print Dumper \%arrpvid;
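A self-contained sketch of the hash-of-arrays approach follows. The @lspv lines are hypothetical sample data standing in for the real lspv output (disk name in the first column, PVID in the second, as the awk calls in the question suggest), and each line is parsed once instead of running two nested loops.

```perl
use strict;
use warnings;

# Hypothetical sample of `lspv` output: disk name first, then PVID.
my @lspv = (
    "hdisk0 00088da343b00d9b rootvg active",
    "hdisk1 00088da38100f93c datavg active",
);

my %arrpvid;
for my $line (@lspv) {
    my ($hdaix, $pvid) = split ' ', $line;    # first two whitespace-separated fields
    push @{ $arrpvid{$pvid} }, $hdaix;        # PVID string is the hash key
}

for my $pvid (sort keys %arrpvid) {
    print "$pvid: @{ $arrpvid{$pvid} }\n";
}
```

Reading both columns from the same line keeps each disk paired with its own PVID, whereas the nested loops in the question would attach every disk to every PVID.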
More Details
You might be tempted to obtain the numeric value corresponding to each hexadecimal string in $pvid with hex as in
push @{ $arrpvid[hex $pvid] }, ...
but given the large example values in your question, @arrpvid would become enormous. Use a hash to create a sparse array instead.
Be sure that all the values of $pvid have the same padding. Otherwise, like values may not hash together appropriately. If you need to normalize, use code along the lines of
$pvid = sprintf "%016x", hex $pvid;
The problem lies in:
push @{ $arrpvid[$pvid] }, $hdaix;
An array index such as $pvid should be a numeric value like 0 or 5, not a string such as 00088da343b00d9b.

Perl: Greedy nature refuses to work

I am trying to replace a string with another string, but the greedy nature doesn't seem to be working for me. Below is my code where "PERFORM GET-APLCY" is identified and replaced properly, but string "PERFORM GET-APLCY-SOI-CVG-WVR" and many other such strings are being replaced by the replacement string for "PERFORM GET-APLCY".
s/PERFORM $func[$i]\.*/# PERFORM $func[$i]\.\n $hash{$func[$i]}/g;
where the full stop is optional during string match and replacement. I have also tried giving the pattern to be matched as $func[$i]\b
Please help me understand what the issue could be.
Thanks in advance,
Faez
Why should GET-APLCY- not match GET-APLCY. if the dot is optional?
Easy solution: sort your array by length in descending order.
@func = sort { length $b <=> length $a } @func;
Testing script:
#!/usr/bin/perl
use warnings;
use strict;
use feature 'say';
my %hash = ('GET-APLCY' => 'REP1',
'GET-APLCY-SOI-CVG-WVR' => 'REP2',
'GET-APLCY-SOI-MNG-CVRW' => 'REP3',
);
my @func = sort { length $b <=> length $a } keys %hash;
while (<DATA>) {
chomp;
print;
print "\t -> \t";
for my $i (0 .. $#func) {
s/$func[$i]/$hash{$func[$i]}/;
}
say;
}
__DATA__
GET-APLCY param
GET-APLCY- param
GET-APLCY. param
GET-APLCY-SOI. param
GET-APLCY-SOI-CVG-WVR param
GET-APLCY-SOI-MNG-CVRW param
You appear to be looping over function names, and calling s/// for each one. An alternative is to use the e option, and do them all in one go (without a loop):
my %hash = (
'GET-APLCY' => 'replacement 1',
'GET-APLCY-SOI-CVG-WVR' => 'replacement 2',
);
s{
PERFORM \s+ # 'PERFORM' keyword
([A-Z-]+) # the original function name
\.? # an optional period
}{
"# PERFORM $1.\n" . $hash{$1};
}xmsge;
The e causes the replacement part to be evaluated as an expression. Basically, the first part finds all PERFORM calls (I'm assuming that the function names are all upper case with '-' between them – adjust otherwise). The second part replaces that line with the text you want to appear.
I've also used the x, m, and s options, which is what allows the comments in the regular expression, among other things. You can find more about these under perldoc perlop.
A plain version of the s-line should be:
s/PERFORM ([A-Z-]+)\.?/"# PERFORM $1.\n" . $hash{$1}/eg;
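A variation on the single-pass idea is to build the pattern from the known names, so that names missing from %hash are left alone. This is a sketch under assumptions: the sub name rewrite_perform is invented here, the names consist of upper-case letters and '-', and the lookahead is my own addition to stop a short name from matching the start of a longer, unknown one.

```perl
use strict;
use warnings;

my %hash = (
    'GET-APLCY'             => 'REP1',
    'GET-APLCY-SOI-CVG-WVR' => 'REP2',
);

# Longest names first, each escaped, joined into one alternation.
my $names = join '|', map { quotemeta } sort { length($b) <=> length($a) } keys %hash;

# Hypothetical helper: rewrite every known PERFORM call in one line of text.
sub rewrite_perform {
    my ($line) = @_;
    # (?![A-Z-]) rejects a match that stops in the middle of a longer name.
    $line =~ s/PERFORM\s+($names)(?![A-Z-])\.?/# PERFORM $1.\n $hash{$1}/g;
    return $line;
}

print rewrite_perform("PERFORM GET-APLCY-SOI-CVG-WVR."), "\n";
print rewrite_perform("PERFORM GET-APLCY"), "\n";
```

A line such as "PERFORM GET-APLCY-SOI." is returned unchanged, because GET-APLCY-SOI is not a key of %hash.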
I guess that $func[$i] contains "GET-APLCY". If so, this is because the star only applies to the dot, an actual dot, not "any character". Try
s/PERFORM $func[$i].*/# PERFORM $func[$i]\.\n $hash{$func[$i]}/g;
I'm pretty sure you are doing some kind of loop over $i. In that case, most likely
GET-APLCY is located in the @func array before GET-APLCY-SOI-CVG-WVR. So I recommend reverse sorting @func before entering the loop.