How can I pad part of a string with spaces, in Perl? - perl

Which version would you prefer?
#!/usr/bin/env perl
use warnings;
use strict;
use 5.010;
my $p = 7; # 33
my $prompt = ' : ';
my $key = 'very important text';
my $value = 'Hello, World!';
my $length = length $key . $prompt;
$p -= $length;
Option 1:
$key = $key . ' ' x $p . $prompt;
Option 2:
if ( $p > 0 ) {
$key = $key . ' ' x $p . $prompt;
}
else {
$key = $key . $prompt;
}
say "$key$value"

I would prefer
sprintf "%-7s : %s", $key, $value;
or
sprintf "%-*s : %s", $p, $key, $value;
instead of all this weird stuff.
From the sprintf documentation:
The flag characters
'-' The converted value is to be left adjusted on the field boundary. (The default is right justification.) The converted value is padded on the right with blanks, rather than on the left with blanks or zeros. A '-' overrides a 0 if both are given.
The field width
An optional decimal digit string (with nonzero first digit) specifying a minimum field width. If the converted value has fewer characters than the field width, it will be padded with spaces on the left (or right, if the left-adjustment flag has been given). Instead of a decimal digit string one may write '*' or '*m$' (for some decimal integer m) to specify that the field width is given in the next argument, or in the m-th argument, respectively, which must be of type int. A negative field width is taken as a '-' flag followed by a positive field width. In no case does a nonexistent or small field width cause truncation of a field; if the result of a conversion is wider than the field width, the field is expanded to contain the conversion result.
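The '-' flag and the '*' width described above can be tried out with a small sketch. The helper name pad_key is an assumption made for illustration and does not appear in the answers; note that it pads the key alone to the given width, before the prompt is appended.

```perl
use strict;
use warnings;

# Hypothetical helper (name not from the thread): left-justify $key in a
# field of at least $width characters, then append $prompt.
sub pad_key {
    my ($key, $prompt, $width) = @_;
    # '-' left-justifies; '*' takes the field width from the argument list.
    return sprintf '%-*s%s', $width, $key, $prompt;
}

print pad_key('name', ' : ', 7), "Hello, World!\n";                 # key shorter than the field
print pad_key('very important text', ' : ', 7), "Hello, World!\n";  # key longer: not truncated
```

Because a too-small width never truncates, no special case is needed for keys that are longer than the field.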

I don't like option 2 as it introduces an unnecessary special case.
I would refactor out the construction of the prompt suffix:
# Possible at top of program
my $suffix = ( ' ' x $p ) . $prompt;
# Later...
$key .= $suffix ;

Call me old-school, but I'd use printf() or sprintf():
printf "%-33s%s%s\n", $key, $prompt, $value;
That left justifies the string $key into 33 spaces, adds $prompt and $value and a newline. If I wanted to calculate the length for the first part dynamically:
printf "%-*s%s%s\n", $len, $key, $prompt, $value;
Since it is one line instead of the question's 4 (option 1) or 6 (option 2), it scores favourably on the succinctness scale.

It looks a little weird, but this works (so far):
#!/usr/bin/env perl
use warnings; use strict;
use 5.010;
use utf8;
use Term::Size;
my $columns = ( Term::Size::chars *STDOUT{IO} )[0];
binmode STDOUT, ':encoding(UTF-8)';
use Text::Wrap;
use Term::ANSIColor;
sub my_print {
my( $key, $value, $prompt, $color, $p ) = @_;
my $length = length $key.$prompt;
$p -= $length;
my $suff = ( ' ' x $p ) . $prompt;
$key .= $suff;
$length = length $key;
my $col = $columns - $length;
$Text::Wrap::columns = $col;
my @array = split /\n/, wrap ( '','', $value ) ;
$array[0] = colored( $key, $color ) . $array[0];
for my $idx ( 1..$#array ) {
$array[$idx] = ( ' ' x $length ) . $array[$idx];
}
say for @array;
}
my $prompt = ' : ';
my $color = 'magenta';
my $p = 30;
my $key = 'very important text';
my $value = 'text ' x 40;
my_print( $key, $value, $prompt, $color, $p );

Related

How to replace a sequence of numbers separated by " _" in a string with a single number

Say I have a string "mg_delay_1_2_it", in which the sequence of numbers separated by "_" can vary; that is, I can also have strings like "mg_delay_1_2_10_it" or "mg_delay_1_2_5_25_30_it". I want to replace the number section with a single number to produce several versions. For example:
If the string is mg_delay_1_2_5_25_30_it,
I want to be able to produce mg_delay_1_it, mg_delay_2_it, mg_delay_5_it, mg_delay_25_it and mg_delay_30_it from the original string.
How do I do this efficiently in Perl?
Try this:
use strict;
use warnings;
use Data::Dumper;
my $str = 'mg_delay_1_2_5_25_30_it';
my $start = 'mg_delay';
my $end = 'it';
if (my ($res) = $str =~ /\Q$start\E_((?:\d+_)+)\Q$end\E/) {
my @items = $res =~ /(\d+)/g;
my @versions = map { $start . '_' . $_ . '_' . $end } @items;
print Dumper(\@versions);
}
Output:
$VAR1 = [
'mg_delay_1_it',
'mg_delay_2_it',
'mg_delay_5_it',
'mg_delay_25_it',
'mg_delay_30_it'
];
Alternatively, if $start and $end are not known:
my $str = 'mg_delay_1_2_5_25_30_it';
if (my ($start, $res, $end ) = $str =~ /^((?:(?!_\d).)+)_((?:\d+_)+)(.+)$/) {
my @items = $res =~ /(\d+)/g;
my @versions = map { $start . '_' . $_ . '_' . $end } @items;
print Dumper(\@versions);
}

Perl CGI output

I have a Perl CGI script for finding the keyword in different types of texts and producing the sorted output.
The present output looks like this
for a keyword "the".
But I would like to highlight the keyword "the" in bold in this output.
How is it possible to do this?
I tried using Term::ANSIColor but it prints the entire sentence in bold and not only the keyword.
My CGI script looks like this:
#!/usr/bin/perl
# require
use warnings;
use diagnostics;
use CGI;
use File::Basename;
my $q = new CGI;
print "Content-type: text/plain\n\n";
#initialize variables
my $target = $q->param( "keyword" );
my $radius = $q->param( "span" );
my $ordinal = $q->param( "ord" );
my $width = 2 * $radius;
#print standard output
print "****************************\n";
print "* SEARCH RESULTS *\n";
print "****************************\n";
print "Your Search word is: $target\n\n";
print "Your Radius is : $radius\n\n";
print "Your Ordinal is : $ordinal\n\n";
print "-----------------------------------------------------------\n\n";
#normal concordance for abstract text
my @files = glob( 'ABS/*.txt' );
for my $file ( @files ) {
my $path = glob( 'ABS/*.txt' );
my $file_name = basename( $path );
# initialize
my $count = 0;
my @lines = ();
$/ = ""; # Paragraph read mode
# open the file, and process each line in it
open( FILE, " < $file" ) or die( "Can not open $file ($!).\n" );
while ( <FILE> ) {
# re-initialize
my $extract = '';
# normalize the data
chomp;
s/\n/ /g; # Replace new lines with spaces
s/\b--\b/ -- /g; # Add spaces around dashes
# process each item if the target is found
while ( $_ =~ /\b$target\b/gi ) {
# find start position
my $match = $1;
my $pos = pos;
my $start = $pos - $radius - length( $match );
# extract the snippets
if ( $start < 0 ) {
$extract = substr( $_, 0, $width + $start + length( $match ) );
$extract = ( " " x -$start ) . $extract;
}
else {
$extract = substr( $_, $start, $width + length( $match ) );
my $deficit = $width + length( $match ) - length( $extract );
if ( $deficit > 0 ) {
$extract .= ( " " x $deficit );
}
}
# add the extracted text to the list of lines, and increment
$lines[$count] = $extract;
++$count;
}
}
sub removePunctuation {
my $string = $_[0];
$string = lc( $string ); # Convert to lowercase
$string =~ s/[^-a-z ]//g; # Remove non-alphabetic characters
$string =~ s/--+/ /g; # Remove 2+ hyphens with a space
$string =~ s/-//g; # Remove hyphens
$string =~ s/\s=/ /g;
return ( $string );
}
sub onLeft {
#USAGE: $word = onLeft($string, $radius, $ordinal);
my $left = substr( $_[0], 0, $_[1] );
$left = removePunctuation( $left );
my @word = split( /\s+/, $left );
return ( $word[ -$_[2] ] );
}
sub byLeftWords {
my $left_a = onLeft( $a, $radius, $ordinal );
my $left_b = onLeft( $b, $radius, $ordinal );
lc( $left_a ) cmp lc( $left_b );
}
# process each line in the list of lines
my $line_number = 0;
print "File name: $file_name \n\n";
foreach my $x ( sort byLeftWords @lines ) {
++$line_number;
printf "%5d", $line_number;
print " $x\n\n";
}
print "------------------------------------------------------------\n\n";
}
# done
exit;
Thanks.
You can't control the style of characters displayed in a text/plain document. Term::ANSIColor certainly won't work, because the browser doesn't recognize ANSI terminal escape sequences; those work only on a console.
If you write your output as HTML then you just need to put <strong>...</strong> around the word to be emboldened
Or, better, you can write a CSS style like this
.bold {
font-weight: bold;
}
and wrap the words in <span class="bold">...</span>
Update
To convert your text document to HTML, the simplest way is to wrap it in a <pre>...</pre> ("preformatted") element and put it into the standard HTML structure. Less than <, greater than > and ampersand & characters must be replaced with their corresponding entities &lt;, &gt; and &amp; respectively, and you can wrap the words you want in bold in <strong> tags.
The minimum HTML5 document which is also friendly with the majority of modern browsers looks like this
<!doctype html>
<html lang=en>
<head>
<meta charset=UTF-8>
<title>Search Results</title>
</head>
<body>
<pre>
Content of my report
with words to be in bold wrapped in <strong> tags
and characters `<`, `>`, and `&` replaced by
`&lt;`, `&gt;` and `&amp;` respectively
</pre>
</body>
</html>
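As a rough sketch of the escaping and highlighting steps described above, the following helper processes one line of report text. The name highlight_html and its interface are assumptions made for illustration, not part of the original script; it also assumes the keyword begins and ends with a word character, so that \b behaves as expected.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): escape a line of plain
# text for HTML and wrap whole-word, case-insensitive matches of $keyword
# in <strong> tags.
sub highlight_html {
    my ($text, $keyword) = @_;
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' );
    my $out = '';
    # split with a capturing group keeps the matched keywords in the result,
    # at the odd indexes of the list.
    my @parts = split /\b(\Q$keyword\E)\b/i, $text;
    for my $i (0 .. $#parts) {
        (my $escaped = $parts[$i]) =~ s/([&<>])/$entity{$1}/g;
        $out .= $i % 2 ? "<strong>$escaped</strong>" : $escaped;
    }
    return $out;
}

print highlight_html('The cat & the <dog>', 'the'), "\n";
```

In the CGI script this would be applied to each extracted line before printing it inside the <pre> element, with the Content-type header changed to text/html.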

perl replace characters in a string but retain special character or space

I would like to create a program that replaces characters and retains the special characters. An example input and output is shown below.
Here's what I did so far:
$sentence = userinput;
@words = split(/ /, $sentence);
for ($i = 0; $i < @words.length; $i ++){
$words[$i] =~ s/\W//g;
@characters = split(//, $words[$i]);
#print $words[$i] . "\n";
$wordlength = length($words[$i]);
for ($j = 0; $j < @characters.length; $j ++){
$char = $characters[$j];
for ($x = 0; $x < $wordlength; $x++){
$char++;
if ($char eq "aa"){
$char = "a";
}
elsif ($char eq "AA"){
$char = "A";
}
}
print $char;
if ($x = 0){
$output[$i] = $char;
}
else {
$output[$i] = join ($char);
}
}
print $output[$i];
}
Input:
Hi! how are you doing?
Output:
Jk! krz duh brx itnsl?
A couple of things in your code don't make sense:
Missing use strict; use warnings;.
All variables are global (you should be using my to create variables)
@foo.length is not the number of elements in the array @foo. It's the number of elements in the array @foo concatenated with the number of characters in $_ (because arrays in scalar context return their length, . concatenates strings, and length works on $_ by default).
join ($char) always returns the empty string: You're joining an empty list (no elements) using $char as a separator.
Here's an attempt to fix all of these issues:
use strict;
use warnings;
my $sentence = readline;
$sentence =~ s{([A-Za-z]+)}{
my $word = $1;
join '', map {
my $base = ord(/^[A-Z]/ ? 'A' : 'a');
chr((ord($_) - $base + length($word)) % 26 + $base)
} split //, $word
}eg;
print $sentence;
I think what you are doing is rot3 encoding, but if so then your example is wrong
my $sentence = 'Hi! how are you doing?';
$sentence =~ tr/A-Za-z/D-ZA-Cd-za-c/;
print $sentence, "\n";
output
Kl! krz duh brx grlqj?
which is similar, but not identical to
Jk! krz duh brx itnsl?

match string between columns using perl

I want to compare a string in column A with that in column B for every row and print a third column that highlights the differences.
Column A Column B
uuaaugcuaauugugauaggggu uuaaugcuaauugugauaggggu
uuaaugcuaauugugauagggguu uuaaugcuaauugugauaggggu
uuaaugcuaauugugauagggguuu uuaaugcuaauugugauaggggu
Desired Result:
Column A Column B Column C
uuaaugcuaauugugauaggggu uuaaugcuaauugugauaggggu ********************
uuaaugcuaauugugauagggguu uuaaugcuaauugugauaggggu ********************u
uuaaugcuaauugugauagggguuu uuaaugcuaauugugauaggggu ********************uu
I have an example script that might work, but how do I do this for every row in the data frame?
use strict;
use warnings;
my $string1 = 'AAABBBBBCCCCCDDDDD';
my $string2 = 'AEABBBBBCCECCDDDDD';
my $result = '';
for(0 .. length($string1)) {
my $char = substr($string2, $_, 1);
if($char ne substr($string1, $_, 1)) {
$result .= "**$char**";
} else {
$result .= $char;
}
}
print $result;
Using bruteforce and substr
use strict;
use warnings;
while (<DATA>) {
my ($str1, $str2) = split;
my $len = length $str1 < length $str2 ? length $str1 : length $str2;
for my $i (0..$len-1) {
my $c1 = substr $str1, $i, 1;
my $c2 = substr $str2, $i, 1;
if ($c1 eq $c2) {
substr $str1, $i, 1, '*';
substr $str2, $i, 1, '*';
}
}
printf "%-30s %s\n", $str1, $str2;
}
__DATA__
Column_A Column_B
uuaaugcuaauugugauaggggu uuaaugcuaauugugauaggggu
uuaaugcuaauugugauagggguu uuaaugcuaauugugauaggggu
uuaaugcuaauugugauagggguuu uuaaugcuaauugugauaggggu
AAABBBBBCCCCCDDDDD AEABBBBBCCECCDDDDD
Outputs:
*******A *******B
*********************** ***********************
***********************u ***********************
***********************uu ***********************
*A********C******* *E********E*******
Alternative using XOR
It's also possible to use ^ to find the intersection between two strings.
The following performs the same as the above:
while (<DATA>) {
my ($str1, $str2) = split;
my $intersection = $str1 ^ $str2;
while ($intersection =~ /(\0+)/g) {
my $len = length $1;
my $pos = pos($intersection) - $len;
substr $str1, $pos, $len, '*' x $len;
substr $str2, $pos, $len, '*' x $len;
}
printf "%-30s %s\n", $str1, $str2;
}
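To see what the string XOR produces, here is a small sketch that only reports where two strings differ. The helper name diff_positions is an assumption for illustration; the sketch also assumes both operands are plain strings and that the 'bitwise' feature is not enabled (it is switched on by "use v5.28" and later, which makes ^ numeric).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): return the 0-based positions at
# which two strings differ, over the length of the longer string.
sub diff_positions {
    my ($s1, $s2) = @_;
    my $xor = $s1 ^ $s2;    # string XOR: identical characters yield "\0"
    my @pos;
    while ($xor =~ /[^\0]/g) {
        push @pos, pos($xor) - 1;
    }
    return @pos;
}

print join(',', diff_positions('AAABBBBBCCCCCDDDDD', 'AEABBBBBCCECCDDDDD')), "\n";
```

Runs of "\0" in the XOR result are the matching stretches, which is what the loop in the answer above replaces with asterisks.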
I could not resist providing a modified version of Miller's solution that uses regular expressions:
use strict;
use warnings;
while (<DATA>) {
my $masked_str1 ="";
my $masked_str2 ="";
my ($str1, $str2) = split;
my $intersection = $str1 ^ $str2;
while ($intersection =~ /(\x00+)/g) {
my $mask = $intersection;
$mask =~ s/\x00/1/g;
$mask =~ s/[^1]/0/g;
while ( $mask =~ /\G(.)/gc ) { # traverse the mask
my $bit = $1;
if ( $str1 =~ /\G(.)/gc ) { # traverse the string1 to be masked
$masked_str1 .= $bit ? '_' : $1;
}
if ( $str2 =~ /\G(.)/gc ) { # traverse the string2 to be masked
$masked_str2 .= $bit ? '_' : $1;
}
}
}
print "=" x 80;
printf "\n%-30s %s\n", $str2, $str1; # Minimum length 30 char, left-justified
printf "%-30s %s\n", $str1, $str2;
printf "%-30s %s\n\n", $masked_str1, $masked_str2;
}

Perl - How to change every $variable occurrence of ";" in a string

Very new here so be gentle. :)
Here is the gist of what I want to do:
I want to take a string that is made up of numbers separated by semi-colons (ex. 6;7;8;9;1;17;4;5;90) and replace every "X" number of semicolons with a "\n" instead. The "X" number will be defined by the user.
So if:
$string = "6;7;8;9;1;17;4;5;90";
$Nth_number_of_semicolons_to_replace = 3;
The output should be:
6;7;8\n9;1;17\n4;5;90
I've found lots on changing the Nth occurrence of something but I haven't been able to find anything on changing every Nth occurrence of something like I am trying to describe above.
Thanks for all your help!
use List::MoreUtils qw(natatime);
my $input_string = "6;7;8;9;1;17;4;5;90";
my $it = natatime 3, split(";", $input_string);
my $output_string;
while (my @vals = $it->()) {
$output_string .= join(";", @vals)."\n";
}
Here is a quick and dirty answer.
my $input_string = "6;7;8;9;1;17;4;5;90";
my $count = 0;
$input_string =~ s/;/++$count % 3 ? ";" : "\n"/eg;
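The counting substitution above can be wrapped in a sub so that N and the separator are parameters. This is only a sketch; the name break_every_nth is an assumption and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative): replace every $n-th occurrence
# of the literal separator $sep in $str with a newline.
sub break_every_nth {
    my ($str, $n, $sep) = @_;
    my $count = 0;
    my $quoted = quotemeta $sep;    # treat the separator literally in the regex
    # /e evaluates the replacement as code: keep the separator unless this is
    # the $n-th, 2*$n-th, ... occurrence.
    $str =~ s/$quoted/ ++$count % $n ? $sep : "\n" /ge;
    return $str;
}

print break_every_nth('6;7;8;9;1;17;4;5;90', 3, ';'), "\n";
```

Unlike the split-and-join approach, this leaves no trailing newline when the number of fields is an exact multiple of N.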
Don't have time for a full answer now, but this should get you started.
my $string = "6;7;8;9;1;17;4;5;90";
my $Nth_number_of_semicolons_to_replace = 3;
# Build a pattern matching N numbers joined by ';' and followed by one more ';'
my $regexp = '(' . ('\d+;' x ($Nth_number_of_semicolons_to_replace - 1)) . '\d+);';
$string =~ s{$regexp}{$1\n}g;
sub split_x{
my($str,$num,$sep) = @_;
return unless defined $str;
$num ||= 1;
$sep = ';' unless defined $sep;
my @return;
my @tmp = split $sep, $str;
while( @tmp >= $num ){
push @return, join $sep, splice @tmp, 0, $num;
}
push @return, join $sep, @tmp if @tmp;
return @return;
}
print "$_\n" for split_x '6;7;8;9;1;17;4;5;90', 3;
print join( ',', split_x( '6;7;8;9;1;17;4;5;90', 3 ) ), "\n";
my $string = "6;7;8;9;1;17;4;5;90";
my $Nth_number_of_semicolons_to_replace = 3;
my $num = $Nth_number_of_semicolons_to_replace - 1;
$string =~ s{ ( (?:[^;]+;){$num} [^;]+ ) ; }{$1\n}gx;
print $string;
prints:
6;7;8
9;1;17
4;5;90
The regex explained:
s{
( # start of capture group 1
(?:[^;]+;){$num} # one or more non-';' characters followed by a ';'
# repeated $num times
[^;]+ # one or more non-';' characters
) # end of capture group
; # the ';' to replace
}{$1\n}gx; # replace with capture group 1 followed by a new line
If you've got 5.10 or higher, this could do the trick:
#!/usr/bin/perl
use strict;
use warnings;
my $string = '1;2;3;4;5;6;7;8;9;0';
my $n = 3;
my $search = ';.*?' x ($n -1);
print "string before: [$string]\n";
$string =~ s/$search\K;/\n/g;
print "print string after: [$string]\n";
HTH,
Paul