I need to pad a string on the right with dashes ('-'), e.g. convert 'M' to 'M-----'.
sprintf "%-6s", "M"; gives me 'M     '. I tried printf "%-6-s", "M"; and printf "%--6s", "M";, but neither of those works.
Can this be done with sprintf, and if so, how?
It can't be done with sprintf alone. (sprintf will only pad with spaces or with zeroes.)
sprintf("%-6s", $s) =~ tr/ /-/r
or
substr($s.("-" x 6), 0, 6)
or
$s . ("-" x (6-length($s)))
sprintf only supports padding with 0 and space, so no. You can pad with one of those and then replace the padding, but then you run the risk of also replacing any such characters in the original string. For example, sprintf('%-6s', ' M') =~ s/ /-/gr produces -M----: the leading space of the original string is replaced as well.
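To make the pitfall concrete, here is a minimal sketch comparing the two approaches on a string with a leading space. The helper names pad_via_tr and pad_via_repeat are hypothetical, chosen only for illustration, and the /r modifier assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

# Pad with spaces via sprintf, then turn EVERY space into a dash
# (including spaces that were part of the original string).
sub pad_via_tr {
    my ( $str, $width ) = @_;
    return sprintf( "%-${width}s", $str ) =~ tr/ /-/r;
}

# Build the padding with the x operator; the original string is untouched.
sub pad_via_repeat {
    my ( $str, $width ) = @_;
    my $missing = $width - length $str;
    $missing = 0 if $missing < 0;    # never truncate, never repeat a negative count
    return $str . ( '-' x $missing );
}

print pad_via_tr( ' M', 6 ), "\n";        # -M----
print pad_via_repeat( ' M', 6 ), "\n";    #  M----
```

The second form is the safer default whenever the input may itself contain the character sprintf pads with.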
From the FAQ:
If you need to pad with a character other than blank or zero you can
use one of the following methods. They all generate a pad string with
the x operator and combine that with $text. These methods do not
truncate $text.
Left and right padding with any character, creating a new string:
my $padded = $pad_char x ( $pad_len - length( $text ) ) . $text;
my $padded = $text . $pad_char x ( $pad_len - length( $text ) );
Left and right padding with any character, modifying $text directly:
substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) );
$text .= $pad_char x ( $pad_len - length( $text ) );
If you do it often, you could wrap it in a subroutine.
sub pad {
    my ($str, $padding, $length) = @_;
    # number of padding characters still needed (never negative)
    my $pad_length = $length - length $str;
    $pad_length = 0 if $pad_length < 0;
    $padding x= $pad_length;
    $str.$padding;
}
say pad('M', '-', 6);
say pad('MMMMMM', '-', 6);
say pad('12345', '-', 6);
say pad('1234567', '-', 6);
say pad(' ', '-', 6);
Output:
M-----
MMMMMM
12345-
1234567
 -----
Related
I have
$data_dec = 7;
$data_bin = sprintf("%08b", $data_dec);
and $data_bin is
00000111
How do I pad with "X" instead of zeros while maintaining 8-bits? Expected data:
XXXXX111
substr( sprintf( "XXXXXXX%b", $n ), -8 )
sprintf( "%8b", $n ) =~ tr/ /X/r
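If the width or the pad character varies, the same idea can be wrapped in a small helper. This is only a sketch; the name to_binary_padded and its parameter order are assumptions made for illustration, not part of either answer above.

```perl
use strict;
use warnings;

# Format $n in binary and left-pad it with $pad_char up to $width characters.
# Values that need more than $width bits are returned unpadded and untruncated.
sub to_binary_padded {
    my ( $n, $width, $pad_char ) = @_;
    my $bits    = sprintf '%b', $n;
    my $missing = $width - length $bits;
    return $missing > 0 ? ( $pad_char x $missing ) . $bits : $bits;
}

print to_binary_padded( 7, 8, 'X' ), "\n";    # XXXXX111
```

Unlike the substr variant, this version does not cut off leading bits when the number is wider than the field.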
I have a random number between 0.001 and 1000 and I need perl to print it with a fixed column width of 5 characters. That is, if it's too long, it should be rounded, and if it's too short, it should be padded with spaces.
Everything I found online suggested using sprintf, but sprintf ignores the field width if the number is too long.
Is there any way to get perl to do this?
What doesn't work:
my $number = 0.001 + rand(1000);
my $formattednumber = sprintf("%5f", $number);
print <<EOF;
$formattednumber
EOF
You need to define your sprintf pattern dynamically. The number of decimals depends on the number of digits on the left hand side of the decimal point.
This function will do that for you.
use strict;
use warnings 'all';
use feature 'say';
sub round_to_col {
    my ( $number, $width ) = @_;
    # digits to the left of the decimal point
    my $width_int = length int $number;
    return sprintf(
        sprintf( '%%%d.%df', $width_int, $width - 1 - $width_int ),
        $number
    );
}
my $number = 0.001 + rand(1000);
say round_to_col( $number, 5);
Output could be:
11.18
430.7
0.842
You could use pack after the sprintf. It may not be a computationally efficient approach, but it is relatively simple to implement and maintain:
my $formattednumber = pack ('A5', sprintf("%5f", $number));
The answer posted by simbabque does not cover all cases, so this is my improved version, just in case anyone also needs something like this:
sub round_to_col {
    my ( $number, $width ) = @_;
    my $width_int = length int $number;
    my $sprintf;
    print "round_to_col error: number longer than width" if $width_int > $width;
    # integer part fills the column exactly: no room for decimals
    $sprintf = "%d" if $width_int == $width;
    # one column left: pad with a single space instead of a lone decimal point
    $sprintf = "% d" if $width_int == $width - 1;
    $sprintf = sprintf( '%%%d.%df', $width_int, $width - 1 - $width_int )
        if $width_int < $width - 1;
    return sprintf( $sprintf, $number );
}
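One case that computing the precision up front can still miss is a value whose rounding adds a digit, e.g. 99.999 formatted with two decimals becomes 100.00, which is six characters wide. A possible alternative, sketched here under the assumption that trying several precisions is acceptable, is to format first and check the length afterwards. The name fit_to_width is hypothetical.

```perl
use strict;
use warnings;

# Try the highest precision that could fit, then drop decimals one at a
# time until the formatted text is exactly $width characters wide.
sub fit_to_width {
    my ( $number, $width ) = @_;
    for ( my $decimals = $width - 2; $decimals >= 0; $decimals-- ) {
        # '*' takes the field width and '.*' the precision from the argument list
        my $text = sprintf '%*.*f', $width, $decimals, $number;
        return $text if length $text == $width;
    }
    return;    # even the rounded integer part is wider than $width
}

print fit_to_width( 0.842,    5 ), "\n";    # 0.842
print fit_to_width( 430.7123, 5 ), "\n";    # 430.7
print fit_to_width( 99.999,   5 ), "\n";    # 100.0
```

With zero decimals the result is space-padded on the left, so 1000.4 comes out as " 1000".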
With this sentence:
my $sent = "Mapping and quantifying mammalian transcriptomes RNA-Seq";
We want to get all possible consecutive pairs of words.
my $var = ['Mapping and',
'and quantifying',
'quantifying mammalian',
'mammalian transcriptomes',
'transcriptomes RNA-Seq'];
Is there a compact way to do it?
Yes.
my $sent = "Mapping and quantifying mammalian transcriptomes RNA-Seq";
my @pairs = $sent =~ /(?=(\S+\s+\S+))\S+/g;
A variation that (perhaps unwisely) relies on operator evaluation order but doesn't rely on fancy regexes or indices:
my @words = split /\s+/, $sent;
my $last = shift @words;
my @var;
push @var, $last . ' ' . ($last = $_) for @words;
This works:
my @sent = split(/\s+/, $sent);
my @var = map { $sent[$_] . ' ' . $sent[$_ + 1] } 0 .. $#sent - 1;
i.e. just split the original string into an array of words, and then use map to iteratively produce the desired pairs.
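The split-and-map idea extends naturally from pairs to windows of any size. The following is a minimal sketch of that generalization; the function name word_ngrams is an assumption made for illustration.

```perl
use strict;
use warnings;

# Return every run of $n consecutive words in $text, joined by single spaces.
sub word_ngrams {
    my ( $text, $n ) = @_;
    my @words = split ' ', $text;    # split ' ' also skips leading whitespace
    # the range is empty when there are fewer than $n words
    return map { join ' ', @words[ $_ .. $_ + $n - 1 ] } 0 .. @words - $n;
}

my $sent = "Mapping and quantifying mammalian transcriptomes RNA-Seq";
print "$_\n" for word_ngrams( $sent, 2 );
```

Calling word_ngrams( $sent, 2 ) should give the five pairs listed in the question.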
I don't have it as a single line, but the following code should give you somewhere to start. Basically it does it with a push and a regex with /g.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Indent = 1;
my $t1 = 'aa bb cc dd ee ff';
my $t2 = 'aa bb cc dd ee';
foreach my $txt ( $t1, $t2 )
{
my @a;
push( @a, $& ) while( $txt =~ /\G\S+(\s+\S+|)\s*/g );
print Dumper( \@a );
}
One liner thanks to the syntax from @ysth
my @a = $txt =~ /\G(\S+(?:\s+\S+|))\s*/g;
My regex is slightly different in that if you have an odd number of words, the last word still gets an entry.
I have an array of numbers. What is the easiest way to calculate the Median, Mode, and Std Dev for the data set?
Statistics::Basic::Mean
Statistics::Basic::Median
Statistics::Basic::Mode
Statistics::Basic::StdDev
#!/usr/bin/perl
#
# stdev - figure N, min, max, median, mode, mean, & std deviation
#
# pull out all the real numbers in the input
# stream and run standard calculations on them.
# they may be intermixed with other text, need
# not be on the same or different lines, and
# can be in scientific notation (avogadro=6.02e23).
# they also admit a leading + or -.
#
# Tom Christiansen
# tchrist@perl.com
use strict;
use warnings;
use List::Util qw< min max >;
sub by_number {
if ($a < $b){ -1 } elsif ($a > $b) { 1 } else { 0 }
}
#
my $number_rx = qr{
# leading sign, positive or negative
(?: [+-] ? )
# mantissa
(?= [0123456789.] )
(?:
# "N" or "N." or "N.N"
(?:
(?: [0123456789] + )
(?:
(?: [.] )
(?: [0123456789] * )
) ?
|
# ".N", no leading digits
(?:
(?: [.] )
(?: [0123456789] + )
)
)
)
# exponent (optional)
(?:
(?: [Ee] )
(?:
(?: [+-] ? )
(?: [0123456789] + )
)
|
)
}x;
my $n = 0;
my $sum = 0;
my @values = ();
my %seen = ();
while (<>) {
while (/($number_rx)/g) {
$n++;
my $num = 0 + $1; # 0+ is so numbers in alternate form count as same
$sum += $num;
push @values, $num;
$seen{$num}++;
}
}
die "no values" if $n == 0;
my $mean = $sum / $n;
my $sqsum = 0;
for (@values) {
$sqsum += ( $_ ** 2 );
}
$sqsum /= $n;
$sqsum -= ( $mean ** 2 );
my $stdev = sqrt($sqsum);
my $max_seen_count = max values %seen;
my @modes = grep { $seen{$_} == $max_seen_count } keys %seen;
my $mode = @modes == 1
? $modes[0]
: "(" . join(", ", @modes) . ")";
$mode .= ' @ ' . $max_seen_count;
my $median;
my $mid = int @values/2;
my @sorted_values = sort by_number @values;
if (@values % 2) {
$median = $sorted_values[ $mid ];
} else {
$median = ($sorted_values[$mid-1] + $sorted_values[$mid])/2;
}
my $min = min @values;
my $max = max @values;
printf "n is %d, min is %g, max is %g\n", $n, $min, $max;
printf "mode is %s, median is %g, mean is %g, stdev is %g\n",
$mode, $median, $mean, $stdev;
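Note that the script above divides the sum of squares by n, which yields the population standard deviation. If the data are a sample, the usual estimate divides by n - 1 instead. A minimal sketch of both variants, using only core modules, follows; the function name std_deviation and its $is_sample flag are hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Standard deviation of the numbers in the array reference $values.
# A true $is_sample divides by n - 1 (sample), otherwise by n (population).
sub std_deviation {
    my ( $values, $is_sample ) = @_;
    my $n = @$values;
    return if $n < ( $is_sample ? 2 : 1 );    # not enough data
    my $mean    = sum(@$values) / $n;
    my $squares = sum( map { ( $_ - $mean )**2 } @$values );
    return sqrt( $squares / ( $is_sample ? $n - 1 : $n ) );
}

my @data = ( 2, 4, 4, 4, 5, 5, 7, 9 );
printf "population: %g, sample: %g\n",
    std_deviation( \@data ), std_deviation( \@data, 1 );
```

Summing squared deviations from the mean, rather than subtracting the squared mean from the mean of squares, also tends to lose less precision on large values.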
Depending on how in depth you need to go, erickb's answer may work. However for numerical functionality in Perl there is PDL. You would create a piddle (the object containing your data) using the pdl function. From there you can use the operations on this page to do the statistics you need.
Edit: Looking around I found two function calls that do EXACTLY what you need. statsover gives statistics on one dimension of a piddle, while stats does the same over the whole piddle.
my $piddle = pdl @data;
my ($mean,$prms,$median,$min,$max,$adev,$rms) = statsover $piddle;
#!/usr/bin/perl
my $str = "abc def yyy ghi";
print substr($str, 0 , index($str,' '));
I want substr to print def yyy
print substr($str, index($str, ' '), rindex($str, ' ')); does not work.
Any idea?
You didn't specify EXACTLY what you want as far as logic but the best guess is you want to print characters between first and last spaces.
Your example code would print too many characters, as it prints the # of characters before the last space (in your example, 11 instead of 7). To fix this, you need to adjust the # of characters printed by subtracting the # of characters before the first space.
Also, you need to start one character to the right of your "index" value to avoid printing the first space, hence the "+1" and "-1" in the example below.
$cat d:\scripts\test1.pl
my $str = "abc def yyy ghi";
my $first_space_index = index ($str, ' ');
my $substring = substr($str, $first_space_index + 1,
rindex($str, ' ') - $first_space_index - 1);
print "|$substring|\n";
test1.pl
|def yyy|
The third argument is length, not offset. But it can be negative to indicate chars from the end of the string, which is easily gotten from rindex and length, like so:
my $str = "abc def yyy ghi";
print substr( $str, 1 + index( $str, ' ' ), rindex( $str, ' ' ) - length($str) );
(Note adding 1 to get the offset after the first space.)
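Both substr variants assume the string contains at least two spaces. A minimal sketch that makes this assumption explicit is shown below; the function name between_first_and_last_space is hypothetical.

```perl
use strict;
use warnings;

# Return the text between the first and the last space of $str,
# or nothing (undef in scalar context) if there are fewer than two spaces.
sub between_first_and_last_space {
    my ($str) = @_;
    my $first = index $str, ' ';
    my $last  = rindex $str, ' ';
    return if $first < 0 || $first == $last;
    # third argument is a length: everything after the first space,
    # up to but not including the last space
    return substr $str, $first + 1, $last - $first - 1;
}

print between_first_and_last_space("abc def yyy ghi"), "\n";    # def yyy
```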
If you want to print text between first and last space, wouldn't it be easier with regex?
print $1 if "abc def yyy ghi" =~ / (.*) /
Frankly, substr/index/rindex are really not the way to go there. You are better off doing something like:
my $str = "abc def yyy ghi";
my @row = split ' ', $str;
pop @row and shift @row;
print "@row";
This is less efficient, but captures the actual intent better.