Using two search and replace regex commands in Perl

In a file, I have to replace
.L1 to L_xx_1
.L2 to L_xx_2
.L3 to L_xx_3
...
and
.LC1 to LC_xx_1
.LC2 to LC_xx_2
.LC3 to LC_xx_3
...
For these two search-and-replace operations I have two separate Perl scripts containing the following two loops:
for ($i=0; $currentLine=<file>; $i++) {
$currentLine =~ s/.L(\d+)/L_$ARGV[1]_$1/gi;
print $currentLine;
}
and
for ($i=0; $currentLine=<file>; $i++) {
$currentLine =~ s/.LC(\d+)/LC_$ARGV[1]_$1/gi;
print $currentLine;
}
respectively.
Can I merge these two loops into one by combining the two s/// commands into one?

Yes, you can combine the 2 substitutions into one with a different regular expression. You can capture the "L" followed by an optional "C".
use warnings;
use strict;
my $str = 'xx';
while (my $currentLine = <DATA>) {
$currentLine =~ s/\.(LC?)(\d+)/$1_${str}_$2/gi;
print $currentLine;
}
__DATA__
.L1
.L2
.L3
.LC1
.LC2
.LC3
This prints:
L_xx_1
L_xx_2
L_xx_3
LC_xx_1
LC_xx_2
LC_xx_3
Note that I escaped the first character (the period). This will only match a literal period.

Modify your regex to
$currentLine =~ s/\.L(C?)(\d+)/L$1_$ARGV[1]_$2/gi;
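The two suggested regexes agree on the sample input, but because of the /i flag they differ slightly on lower-case labels. A small sketch to compare them, assuming two hypothetical helpers (rename_v1, rename_v2) and a fixed tag 'xx' standing in for $ARGV[1]:

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; $tag stands in for $ARGV[1].

# First answer: capture "L" plus the optional "C" as one group.
sub rename_v1 {
    my ($line, $tag) = @_;
    $line =~ s/\.(LC?)(\d+)/$1_${tag}_$2/gi;
    return $line;
}

# Second answer: literal "L" in the replacement, capture only the optional "C".
sub rename_v2 {
    my ($line, $tag) = @_;
    $line =~ s/\.L(C?)(\d+)/L$1_${tag}_$2/gi;
    return $line;
}

for my $in ('.L1', '.LC2', '.lc3') {
    print "${in}: ", rename_v1($in, 'xx'), " vs ", rename_v2($in, 'xx'), "\n";
}
# .L1: L_xx_1 vs L_xx_1
# .LC2: LC_xx_2 vs LC_xx_2
# .lc3: lc_xx_3 vs Lc_xx_3
```

If the input only ever contains upper-case labels, dropping the /i flag makes both versions behave the same.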

Related

Unmatched ) in regex when using lc function

I am trying to run the following code:
$lines = "Enjoyable )) DAY";
$lines =~ lc $lines;
print $lines;
It fails on the second line where I get the error mentioned in the title. I understand the brackets are causing the trouble. I think I could use "quotemeta", but the thing is that my string contains info that I go on to process later, so I would like to keep the string intact as far as possible and not tamper with it too much.
You have two problems here.
1. =~ is used to execute a specific set of operations
The =~ operator is used to either match with //, m//, qr// or a string; or to substitute with s/// or tr///.
If all you want to do is lowercase the contents of $lines then you should use = not =~.
$lines = "Enjoyable )) DAY";
$lines = lc $lines;
print $lines;
2. Regular expressions have special characters which must be escaped
If you want to match $lines against a lower-case version of $lines, which should return true if $lines was already entirely lower case and false otherwise, then you need to escape the ")" characters.
#!/usr/bin/env perl
use strict;
use warnings;
my $lines = "enjoyable )) day";
if ($lines =~ lc quotemeta $lines) {
print "lines is lower case\n";
}
print $lines;
Note this is a toy example trying to find a reason for doing $lines =~ lc $lines - It would be much better (faster, safer) to solve this with eq as in $lines eq lc $lines.
See perldoc -f quotemeta or http://perldoc.perl.org/functions/quotemeta.html for more details on quotemeta.
=~ is used for regular expressions. "lc" is not part of regex, it's a function like this: $new = lc($old);
I don't recall the regex operator for lowercase, because I use lc() all the time.
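For reference, the lower-casing escape that works inside double-quoted strings and on the replacement side of s/// is \L (terminated by \E). A minimal sketch, next to the eq comparison recommended earlier; both sub names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string has no upper-case characters.
sub is_lower {
    my ($s) = @_;
    return $s eq lc $s;
}

# Hypothetical helper: lower-case a copy of the string via \L...\E.
sub lower_via_escape {
    my ($s) = @_;
    (my $copy = $s) =~ s/(.+)/\L$1\E/s;
    return $copy;
}

my $lines = "Enjoyable )) DAY";
print is_lower($lines) ? "lower\n" : "not lower\n";   # not lower
print lower_via_escape($lines), "\n";                 # enjoyable )) day
```

Neither version treats the string as a pattern, so the ")" characters need no escaping.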

perl find and replace ../ and &nbsp;

I am using Perl to replace all instances of
../../../../../../abc' and &nbsp;
in a string with
/ and a space, respectively.
The method I am using looks like this:
sub encode
{
my $result = $_[0];
$result =~ s/..\/..\/..\/..\/..\/..\//\//g;
$result =~ s/&nbsp;/ /g;
return $result;
}
Is this correct?
Essentially, yes, although the first regex has to be written differently: because . matches any character, we have to escape it as \. or put it in its own character class [.]. The first regex can also be written more cleanly as
...;
$result =~ s{ (?: [.][.]/ ){6} }
{/}gx;
...;
We look for the literal pattern ../ repeated 6 times and then replace it. Because I use curly braces as a delimiter I don't have to escape the slash. Because I use the /x modifier I can have these spaces inside the regex improving readability.
Try this. It will print /foo bar/baz.
#!/usr/bin/perl -w
use strict;
my $result = "../../../../../../foo&nbsp;bar/baz";
#$result =~ s/(\.\.\/)+/\//g; #for any number of ../
$result =~ s/(\.\.\/){6}/\//g; #for 6 exactly
$result =~ s/&nbsp;/ /g;
print $result . "\n";
You forgot the abc, I think:
sub encode
{
my $result = $_[0];
$result =~ s/(?:\.\.\/){6}abc/\//g; # dots escaped so only a literal ../ matches
$result =~ s/&nbsp;/ /g;
return $result;
}
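To see why the dots need escaping, here is a small sketch comparing the two forms on a path that merely looks similar. The helper names and the second sample string are assumptions made for the demonstration:

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.

# Unescaped dots: "." matches any character, so "ab/" counts as "../".
sub strip_unescaped {
    my ($s) = @_;
    $s =~ s{(?:../){6}}{/}g;
    return $s;
}

# Escaped dots: only a literal "../" repeated six times is replaced.
sub strip_escaped {
    my ($s) = @_;
    $s =~ s{(?:\.\./){6}}{/}g;
    return $s;
}

my $real  = '../../../../../../foo/bar';
my $other = 'ab/cd/ef/gh/ij/kl/foo/bar';

print strip_unescaped($real),  "\n";   # /foo/bar
print strip_escaped($real),    "\n";   # /foo/bar
print strip_unescaped($other), "\n";   # /foo/bar (unintended match)
print strip_escaped($other),   "\n";   # ab/cd/ef/gh/ij/kl/foo/bar
```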

Perl LWP::Simple File.txt in Array not spaces

The file contains no spaces, and I need each entry stored in its own array element.
The content (held in a variable) is shown below; the actual file is larger, but this is enough to show the format.
my $file = "http://www.ausa.com.ar/autopista/carteleria/plano/mime.txt";
&VPM4362=008000&VPM4381=FFFFFF&VPM4372=FFFFFF&VPM4391=008000&VPM4382=FFFF00&VPM4392=FF0000&VPM4182=FFFFFF&VPM4181=FFFF00&VPM4402=FFFFFF&VPM4401=FFFF00&VPM4412=008000&VPM4411=FF0000&VPM4422=FFFFFF&VPM4421=FFFFFF&VPM4322=FFFF00&CPMV001_1_Ico=112&CPMV001_1_1=AHORRE 15%&CPMV001_1_2=ADHIERASE AUPASS&CPMV001_1_3=AUPASS.COM.AR&CPMV002_1_Ico=0&CPMV002_1_1=ATENCION&CPMV002_1_2=RADARES&CPMV002_1_3=OPERANDO&CPMV003_1_Ico=0&CPMV003_1_1=ATENCION&CPMV003_1_2=RADARES&CPMV003_1_3=OPERANDO&CPMV004_1_Ico=255&CPMV004_1_1= &CPMV004_1_2=&CPMV004_1_3=&CPMV05 _1_Ico=0&CPMV05 _1_1=ATENCION&CPMV05 _1_2=RADARES&CPMV05 _1_3=OPERANDO&CPMV006_1_Ico=0&CPMV006_1_1=ATENCION&CPMV006_1_2=RADARES&CPMV006_1_3=OPERANDO&CPMV007_1_Ico=0&CPMV007_1_1=ATENCION&CPMV007_1_2=RADARES&CPMV007_1_3=OPERANDO&CPMV08 _1_Ico=0&CPMV08 _1_1=ATENCION&CPMV08
The code:
#!/bash/perl .T
use strict;
use warnings;
use LWP::Simple;
my $file = "http://www.ausa.com.ar/autopista/carteleria/plano/mime.txt";
my $mime = get($file);
my @new;
foreach my $line ($mime) {
$line =~ s/&/ /g;
push(@new, $line);
}
print "$new[0]\n";
I tried it this way, but when I run it the array holds everything together in a single element.
The output I need:
print "$new[1]\n";
VPM4381=FFFFFF
You don't want to substitute on &, you want to split on &.
@new = split /&/, $line;
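A self-contained sketch of the split approach, using a shortened copy of the sample data instead of fetching the URL. The fields helper is hypothetical; it also drops the empty first element that the leading "&" would otherwise produce:

```perl
use strict;
use warnings;

# Sample data shortened from the question; the real text would come from get($file).
my $mime = '&VPM4362=008000&VPM4381=FFFFFF&VPM4372=FFFFFF';

# Hypothetical helper: split on "&" and discard empty fields.
sub fields {
    my ($text) = @_;
    return grep { length } split /&/, $text;
}

my @new = fields($mime);
print "$new[0]\n";   # VPM4362=008000
print "$new[1]\n";   # VPM4381=FFFFFF
```

Without the grep, $new[0] would be an empty string and every entry would sit one index later.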

Check for spaces in perl using regex match in perl

I have a variable. How do I use a regex in Perl to check whether a string has spaces in it or not? For example:
$test = "abc small ThisIsAVeryLongUnbreakableStringWhichIsBiggerThan20Characters";
So for this string, it should check that no word in the string is longer than some x characters.
#!/usr/bin/env perl
use strict;
use warnings;
my $test = "ThisIsAVeryLongUnbreakableStringWhichIsBiggerThan20Characters";
if ( $test !~ /\s/ ) {
print "No spaces found\n";
}
Please make sure to read about regular expressions in Perl.
Perl regular expressions tutorial - perldoc perlretut
You should have a look at the perl regex tutorial. Adapting their very first "Hello World" example to your question would look like this:
if ("ThisIsAVeryLongUnbreakableStringWhichIsBiggerThan20Characters" =~ / /) {
print "It matches\n";
}
else {
print "It doesn't match\n";
}
die "No spaces" if $test !~ /[ ]/; # Match a space
die "No spaces" if $test =~ /^[^ ]*\z/; # Match non-spaces for entire string
die "No whitespace" if $test !~ /\s/; # Match a whitespace character
die "No whitespace" if $test =~ /^\S*\z/; # Match non-whitespace for entire string
To find the length of the longest unbroken sequence of non-space characters, write this
use strict;
use warnings;
use List::Util 'max';
my $string = 'abc small ThisIsAVeryLongUnbreakableStringWhichIsBiggerThan20Characters';
my $max = max map length, $string =~ /\S+/g;
print "Maximum unbroken length is $max\n";
output
Maximum unbroken length is 61
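If only a yes/no answer is needed (is any word longer than x characters?), a quantified \S can do the check directly. A minimal sketch; has_long_word and the limit of 20 are assumptions taken from the wording of the example string:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if any whitespace-delimited word in $text
# is longer than $limit characters, 0 otherwise.
sub has_long_word {
    my ($text, $limit) = @_;
    my $min = $limit + 1;               # "longer than $limit" means at least $limit + 1
    return $text =~ /\S{$min,}/ ? 1 : 0;
}

my $test = 'abc small ThisIsAVeryLongUnbreakableStringWhichIsBiggerThan20Characters';
print has_long_word($test, 20)       ? "too long\n" : "ok\n";   # too long
print has_long_word('abc small', 20) ? "too long\n" : "ok\n";   # ok
```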

Perl search is only showing last result

I have two arrays, one with search terms and another holding multiple lines fetched from a file. I have a nested foreach statement and am searching for all combinations, but only the very last match is showing even though I know for a fact that there are many other matches! I have tried many different versions of the code, but here is my latest one:
open (MYFILE, 'searchTerms.txt');
open (MYFILE2, 'fileToSearchIn.xml');
@searchTerms = <MYFILE>;
@xml = <MYFILE2>;
close(MYFILE2);
close(MYFILE);
$results = "";
foreach $searchIn (@xml)
{
foreach $searchFor (@searchTerms)
{
#print "searching for $searchFor in: $searchIn\n";
if ($searchIn =~ m/$searchFor/)
{
$temp = "found in $searchIn \n while searching for: $searchFor ";
$results = $results.$temp."\n";
$temp = "";
}
}
}
print $results;
You should always use strict and use warnings at the start of your program, and declare all variables at the point of their first use using my. This applies especially when you are asking for help with your code as this measure can quickly reveal many simple mistakes.
As Raze2dust has said, it is important to remember that lines read from a file will have a trailing newline "\n" character. If you were checking for exact matches between a pair of lines then this wouldn't matter, but since it's not working for you I assume the strings in searchTerms.txt can appear anywhere in the lines of fileToSearchIn.xml. That means you need to chomp the strings from searchTerms.txt; lines from the other file can stay as they are.
Things like this are made a lot easier by using the File::Slurp module. It does all the file handling for you and will chomp any newlines from the input text if you ask.
I have changed your program to use this module so that you can see how it works.
use strict;
use warnings;
use File::Slurp;
my @searchTerms = read_file('searchTerms.txt', chomp => 1);
my @xml = read_file('fileToSearchIn.xml');
my @results;
foreach my $searchIn (@xml) {
foreach my $searchFor (@searchTerms) {
if ($searchIn =~ m/$searchFor/) {
push @results, qq/Found in "$searchIn"\n while searching for "$searchFor"/;
}
}
}
print "$_\n" for @results;
Chomp your inputs to remove newline characters:
open (MYFILE, 'searchTerms.txt');
open (MYFILE2, 'fileToSearchIn.xml');
@searchTerms = <MYFILE>;
@xml = <MYFILE2>;
close(MYFILE2);
close(MYFILE);
$results = "";
foreach $searchIn (@xml)
{
chomp($searchIn);
foreach $searchFor (@searchTerms)
{
chomp($searchFor);
#print "searching for $searchFor in: $searchIn\n";
if ($searchIn =~ m/$searchFor/)
{
$temp = "found in $searchIn \n while searching for: $searchFor ";
$results = $results.$temp."\n";
$temp = "";
}
}
}
print $results;
Basically, you think you are searching for 'a', but you are actually searching for 'a\n', because that is how the input is read unless you use chomp. It matches only if 'a' is the last character on the line, because only then is it followed by a newline.
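The effect is easy to reproduce without any files. In this sketch the two arrays are made-up stand-ins for the lines read from searchTerms.txt and fileToSearchIn.xml, and count_matches is a hypothetical helper:

```perl
use strict;
use warnings;

# In-memory stand-ins for lines read from the two files (assumed sample data).
my @terms = ("foo\n", "bar\n");
my @xml   = ("<a>foo</a>\n", "<b>bar\n");

# Hypothetical helper: count the term/line pairs where the term matches.
sub count_matches {
    my ($terms, $lines) = @_;
    my $count = 0;
    for my $line (@$lines) {
        for my $term (@$terms) {
            $count++ if $line =~ /$term/;
        }
    }
    return $count;
}

print count_matches(\@terms, \@xml), "\n";   # 1: only "bar\n" at the end of a line matches

chomp @terms;                                # strip the trailing newlines from the terms
print count_matches(\@terms, \@xml), "\n";   # 2: both terms now match
```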