How can I use 's///' if my string contains a '/'? - perl

I'm using Perl 5.10.6 on Mac 10.6.6. I want to execute a simple search and replace against a file so I tried:
my $searchAndReplaceCmd = "perl -pi -e 's/\\Q${localTestDir}\\E//g' ${testSuiteFile}";
system( $searchAndReplaceCmd );
but the problem above is the variable $localTestDir contains directory separators ("/"), and this screws up the regular expression ...
Bareword found where operator expected at -e line 1, near
"s/\Q/home/selenium"
Backslash found where operator expected at -e
line 1, near "Live\" syntax error at -e line 1, near
"s/\Q/home/selenium"
Search pattern not terminated at -e line 1.
How do I do a search and replace on a file when the variable in question contains regular expression characters? Thanks.

It seems that $localTestDir begins with a /.
Remedy by changing the regex delimiter to something other than /:
my $searchAndReplaceCmd = "perl -pi -e 's!\\Q${localTestDir}\\E!!g' ${testSuiteFile}";
From perldoc perlrequick :
$x = "A 39% hit rate";
$x =~ s!(\d+)%!$1/100!e; # $x contains "A 0.39 hit rate"
The last example shows that s/// can use other delimiters, such as
s!!! and s{}{}, and even s{}//. If single quotes are used s''', then
the regex and replacement are treated as single-quoted strings.
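As a rough illustration of the delimiter advice, here is a minimal sketch that stays inside Perl and uses s{}{} together with \Q...\E. The helper name strip_dir and the sample path are made up for the example; they are not from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every literal occurrence of $dir from $text.
sub strip_dir {
    my ($text, $dir) = @_;
    # s{}{} cannot clash with the slashes in a path, and \Q...\E quotes
    # any regex metacharacters in the interpolated variable.
    (my $copy = $text) =~ s{\Q$dir\E}{}g;
    return $copy;
}

# Sample input (assumed for illustration only).
print strip_dir("open /home/selenium/tests/login.html\n", "/home/selenium/tests");
```

Because the variable is interpolated by the regex engine here rather than pasted into a shell command string, the slashes in the path never get a chance to end the pattern early.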

Question is why you do a search and replace from within perl, through the shell, within perl. Seems like a roundabout way of doing things, and you'll run into problems with shell interpolation.
The \Q ... \E should override the special characters in your string, so / "should" not be an issue. From perlre:
\Q quote (disable) pattern metacharacters till \E
Here's an alternative (untested), all-Perl solution. If you want to be extra certain, swap the / delimiter for something else, such as s### (you can use almost any character as a delimiter).
use strict;
use warnings;
use File::Copy;
open my $fh, '<', $testSuiteFile or die $!;
open my $out, '>', $testSuiteFile . ".bak" or die $!;
while (<$fh>) {
s/\Q${localTestDir}\E//g;
print $out $_;
}
move($testSuiteFile . ".bak", $testSuiteFile) or die $!;
Or use Tie::File
use strict;
use warnings;
use Tie::File;
tie my @file, 'Tie::File', $testSuiteFile or die $!;
for (@file) {
s/\Q${localTestDir}\E//g;
}
untie @file;

Changing the delimiters is useful, but more generally you can put a backslash in front of any regular expression character to make it non-special.
So s/\/abc/\/xyz/ will work, although it is not very readable.

The problem is that the substitution of $localTestDir is happening too soon.
Here is an approach that lets you use / for your re-delimiters:
system('perl', '-pi',
'-e', 'BEGIN { $dir = shift(@ARGV) };',
'-e', 's/\\Q$dir\\E//g',
$localTestDir,
$testSuiteFile
);
Note that this also protects the contents of $testSuiteFile from being interpreted by the shell.

Related

sed "unterminated `s' command" error when running from a script

I have a temp file with contents:
a
b
c
d
e
When I run sed 's#b#batman\nRobin#' temp from command line, I get:
a
batman
Robin
c
d
e
However, when I run the command from a Perl script:
#!/usr/bin/perl
use strict;
use warnings;
`sed 's#b#batman\nRobin#' temp`
It produces error:
sed: -e expression #1, char 10: unterminated `s' command
What am I doing wrong?
Why run another tool like sed once you are inside a Perl program? If anything, now you have far more tools and power so just do it with Perl.
One simple way to do your sed thing
use warnings;
use strict;
die "Usage: $0 file(s)\n" if not @ARGV;
while (<>) {
s/b/batman\nRobin/;
print;
}
Run this program by supplying the file (temp) to it on the command line. The die line is there merely to support/enforce such usage; it is not essential to the script's operation.
This program then is a simple filter
<> operator reads line by line all files submitted on the command line
A line is assigned by it to $_ variable, a default for many things in Perl
The s/// operator by default binds to $_, which gets changed (if pattern matches)
print by default prints the $_ variable
Use nearly anything you want for delimiters in regex, see m// and s/// operators
This can also be done as
while (<>) {
print s/b/batman\nRobin/r
}
With /r modifier s/// returns the changed string (or the original if pattern didn't match)
Finally that's also just
print s/b/batman\nRobin/r while <>;
but I'd expect that with a script you really want to do more, and then this probably isn't it.
On the other side of things you could write it more properly
use warnings;
use strict;
use feature qw(say);
die "Usage: $0 file(s)\n" if not @ARGV;
while (my $line = <>) {
chomp $line;
$line =~ s/b/batman\nRobin/;
say $line;
}
With a line in a lexical variable nicely chomp-ed this is ready for more work.
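As for the original error, a likely cause is that backticks interpolate like a double-quoted string, so Perl turns \n into a real newline before sed ever sees the command (which would fit sed complaining at char 10, right after "batman"). A minimal sketch comparing the two quoting styles; no sed is run, and the variable names are only illustrative:

```perl
use strict;
use warnings;

# Double-quoted context (what backticks use): \n becomes a real newline.
my $interpolated = "s#b#batman\nRobin#";
# Single-quoted context: the backslash and the n stay as two characters.
my $literal = 's#b#batman\nRobin#';

printf "interpolated contains a newline: %s\n", ($interpolated =~ /\n/ ? "yes" : "no");
printf "literal contains a newline:      %s\n", ($literal      =~ /\n/ ? "yes" : "no");
```

If you do want to keep calling sed, doubling the backslash inside the backticks (\\n) should hand sed the two-character sequence it expects.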

system grep inside a perl script gives unbalanced parenthesis

system ("grep -E 'Type|group|slack (' $a > temp.rpt");
The above line is giving me an error Unmatched ( or \(
What is wrong here?
I have tried a backslash before ( too. it shows the same error.
Since you are in a script, why not do that in Perl?
my $infile = '...';
my $outfile = 'temp.rpt';
open my $fh, '<', $infile or die "Can't open $infile: $!";
open my $out_fh, '>', $outfile or die "Can't open $outfile: $!";
while (<$fh>) {
print $out_fh $_ if /Type|group|slack \(/;
}
Adjust your regex as needed. Generally, it is far easier to change and tweak things now.
If the input file isn't too large you can process in one line as well, once you opened files
print $out_fh grep { /Type|group|slack \(/ } <$fh>;
The grep imposes the list context on the <> operator so it reads and at once returns all lines in a list, and the ones that pass the condition are printed.
A comment on the regex. As it stands, it matches either Type or group or slack (. If, by any chance, you intend to match any one of the words followed by space-paren, you need grouping parentheses: /(?:Type|group|slack) \(/. The ?: is there so the group is not needlessly captured.
You need to use three backslashes
system ("grep -E 'Type|group|slack \\\(' $a ");
( is a regex metacharacter in grep -E. That's why you get the error.
Adding a single backslash doesn't fix it because that's processed by perl: "\(" is the same string as "(".
To fix it, you need to either use two backslashes ("\\(", which turns into the two character string \(, which is then interpreted by grep), or remove the -E option because ( isn't special in POSIX "basic" regexes (which is what grep uses by default).
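To see what Perl actually hands to the shell in each case, here is a small sketch that just builds the three strings and prints them (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $one   = "slack \(";    # Perl drops the backslash:           slack (
my $two   = "slack \\(";   # \\ becomes one backslash:           slack \(
my $three = "slack \\\(";  # \\ becomes \, then \( becomes (:    slack \(

print "$_\n" for $one, $two, $three;
```

So the two-backslash and three-backslash versions end up as the same string, which is why both suggestions work with grep -E.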

How to copy the files from one folder to another with a different extension with Perl

#!/usr/bin/perl
use File::Copy;
print "content-type: text/html \n\n"; #The header
$filetobecopied = "C:\Users\avinash\Desktop\mktg/";
$newfile = "C:\Users\avinash\Desktop\elp/";
copy("$.pdf","$.elp") or die "File cannot be copied.";
I used the program above to try to get the output, but I am getting an error. Can anyone help me fix the code?
If you use backslashes, use single quotes for strings, or double the backslashes. In double quotes, many backslashed characters have special meanings:
my $newfile = "C:\Users\avinash\Desktop\elp/";
print $newfile;
Output:
C:SERSVINASHDESKTOPP/
There are some hidden characters, too:
0000000: 433a 5345 5253 0756 494e 4153 4844 4553 C:SERS.VINASHDES
0000010: 4b54 4f50 1b4c 502f KTOP.LP/
There are three big issues with your script.
Always include use strict; and use warnings; in EVERY perl script.
Using these two Pragmas is the number one thing that you can do to become a better programmer. Additionally, you'll always get more help from experts here on SO if they see you're doing this basic due diligence to track down errors yourself.
In this case, you'll actually get two warnings in your code:
Use of uninitialized value $. in concatenation (.) or string at script.pl line 6.
Use of uninitialized value $. in concatenation (.) or string at script.pl line 6.
So your line copy("$.pdf","$.elp") is interpolating the variable $. which is undefined because you haven't read from a file.
Escape your backslashes in double quoted strings
Backslashes have special meaning in literal string definitions. If you want a literal backslash in a double quoted string, then you need to escape it.
In this instance the following are being translated:
\U is the uc function
\a is an alarm code
\D is just a literal D
etc.
To fix this you either need to use a single quoted string or escape the backslashes
my $filetobecopied = "C:\\Users\\avinash\\Desktop\\mktg"; # Backslashes escaped
my $filetobecopied = 'C:\Users\avinash\Desktop\mktg'; # Single quotes safer
Also, I can't figure out why you have a trailing forward slash in both of your strings.
Output the error message: $!
Always include as much information in your error messages as possible. In this case the File::Copy does the following:
All functions return 1 on success, 0 on failure. $! will be set if an error was encountered.
Therefore your or die statement should contain the following:
copy("fromfile","tofile") or die "Can't copy: $!";
For even better debugging information, you could include the parameters that you sent to copy:
copy("fromfile","tofile") or die "Can't copy fromfile -> tofile: $!";
Anyway, these three things will help you debug your script. It's still not possible to completely interpret your intent based off the information you've provided, but the following is a stub of better formatting code:
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy;
print "content-type: text/html \n\n"; #The header
# The following is likely wrong, but the best interpretation of your intent for now:
my $filetobecopied = 'C:\Users\avinash\Desktop\mktg.pdf';
my $newfile = 'C:\Users\avinash\Desktop\elp.elp';
copy($filetobecopied, $newfile)
or die "Can't copy $filetobecopied -> $newfile: $!";
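If the intent really is "copy every .pdf in one folder to another folder with an .elp extension", something along these lines might be a starting point. This is only a sketch under that assumption; the helper name copy_with_new_ext is made up, and the directories in the usage comment are guesses based on the question.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;

# Hypothetical helper: copy every *.pdf in $src_dir to $dst_dir, changing
# the extension to .elp. Returns the number of files copied.
sub copy_with_new_ext {
    my ($src_dir, $dst_dir) = @_;

    opendir my $dh, $src_dir or die "Can't open $src_dir: $!";
    my @pdfs = grep { /\.pdf\z/i } readdir $dh;
    closedir $dh;

    for my $name (@pdfs) {
        (my $new_name = $name) =~ s/\.pdf\z/.elp/i;
        my $from = File::Spec->catfile($src_dir, $name);
        my $to   = File::Spec->catfile($dst_dir, $new_name);
        copy($from, $to) or die "Can't copy $from -> $to: $!";
    }
    return scalar @pdfs;
}

# Usage (paths assumed from the question):
# copy_with_new_ext('C:\Users\avinash\Desktop\mktg', 'C:\Users\avinash\Desktop\elp');
```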

How can I slurp STDIN in Perl?

I'm piping the output of several scripts. One of these scripts outputs an entire HTML page that gets processed by my perl script. I want to be able to pull the whole 58K of text into the perl script (which will contain newlines, of course).
I thought this might work:
open(my $TTY, '<', '/dev/tty');
my $html_string= do { local( @ARGV, $/ ) = $TTY ; <> } ;
But it just isn't doing what I need. Any suggestions?
my @lines = <STDIN>;
or
my $str = do { local $/; <STDIN> };
I can't let this opportunity to say how much I love IO::All pass without saying:
♥ ♥ __ "I really like IO::All ... a lot" __ ♥ ♥
Variation on the POD SYNOPSIS:
use IO::All;
my $contents < io('-') ;
print "\n printing your IO: \n $contents \n with IO::All goodness ..." ;
Warning: IO::All may begin replacing everything else you know about IO in perl with its own insidious goodness.
tl;dr: see at the bottom of the post. Explanation first.
practical example
I’ve just wondered about the same, but I wanted something suitable for a shell one-liner. Turns out this is (Korn shell, whole example, dissected below):
print -nr -- "$x" | perl -C7 -0777 -Mutf8 -MEncode -e "print encode('MIME-Q', 'Subject: ' . <>);"; print
Dissecting:
print -nr -- "$x" echoes the whole of $x without any trailing newline (-n) or backslash escape (-r), POSIX equivalent: printf '%s' "$x"
-C7 sets stdin, stdout, and stderr into UTF-8 mode (you may or may not need it)
-0777 sets $/ so that Perl will slurp the entire file; reference: man perlrun(1)
-Mutf8 -MEncode loads two modules
the remainder is the Perl command itself: print encode('MIME-Q', 'Subject: ' . <>);, let’s look at it from inner to outer, right to left:
<> takes the entire stdin content
which is concatenated with the string "Subject: "
and passed to Encode::encode asking it to convert that to MIME Quoted-Printable
the result of which is printed on stdout (without any trailing newline)
this is followed by ; print, again in Korn shell, which is the same as ; echo in POSIX shell – just echoïng a newline.
tl;dr
Call perl with the -0777 option. Then, inside the script, <> will contain the entire stdin.
complete self-contained example
#!/usr/bin/perl -0777
my $x = <>;
print "Look ma, I got this: '$x'\n";
To get it into a single string you want:
#!/usr/bin/perl -w
use strict;
my $html_string;
while(<>){
$html_string .= $_;
}
print $html_string;
I've always used a bare block.
my $x;
{
local $/; # Set slurp mode; $/ is restored when the block ends
$x = <>; # Read in everything up to EOF
}
# $x should now contain all of STDIN
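The same idea can be wrapped in a small sub so the change to $/ never leaks out. A minimal sketch; the name slurp_fh is made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: read everything left on a filehandle into one string.
sub slurp_fh {
    my ($fh) = @_;
    local $/;             # no record separator inside this sub only
    return scalar <$fh>;  # one read returns the rest of the input
}

# Usage: my $html_string = slurp_fh(\*STDIN);
```

Since $/ is localized, any line-by-line reading elsewhere in the script keeps working as before.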

perl search & replace script for all files in a directory

I have a directory with nearly 1,200 files. I need to successively go through each file in a perl script to search and replace any occurrences of 66 strings. So, for each file I need to run all 66 s&r's. My replace string is in Thai, so I cannot use the shell. It must be a .pl file or similar so that I can use "use utf8". I am just not familiar with how to open all files in a directory one by one to perform actions on them. Here is a sample of my s&r:
s/psa0*(\d+)/เพลงสดุดี\1/g;
Thanks for any help.
use utf8;
use strict;
use warnings;
use File::Glob qw( bsd_glob );
@ARGV = map bsd_glob($_), @ARGV;
while (<>) {
s/psa0*(?=\d)/เพลงสดุดี/g;
print;
}
perl -i.bak script.pl *
I used File::Glob's bsd_glob since glob won't handle spaces "correctly". They are actually the same function, but the function behaves differently based on how it's called.
By the way, using \1 in the replacement expression (i.e. outside a regular expression) makes no sense. \1 is a regex pattern that means "match what the first capture captured". So
s/psa0*(\d+)/เพลงสดุดี\1/g;
should be
s/psa0*(\d+)/เพลงสดุดี$1/g;
The following is a faster alternative:
s/psa0*(?=\d)/เพลงสดุดี/g;
See opendir/readdir/closedir for functions that can iterate through all the filenames in a directory (much like you would use open/readline/close to iterate through all the lines in a file).
Also see the glob function, which returns a list of filenames that match some pattern.
Just in case someone could use it in the future. This is what I actually did.
use warnings;
use strict;
use utf8;
my @files = glob ("*.html");
foreach $a (@files) {
open IN, "$a" or die $!;
open OUT, ">$a-" or die $!;
binmode(IN, ":utf8");
binmode(OUT, ":utf8");
select (OUT);
foreach (<IN>) {
s/gen0*(\d+)/ปฐมกาล $1/;
s/exo0*(\d+)/อพยพ $1/;
s/lev0*(\d+)/เลวีนิติ $1/;
s/num0*(\d+)/กันดารวิถี $1/;
...etc...
print "$_";
}
close IN;
close OUT;
};
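With 66 near-identical substitutions, one possible variation is to keep the abbreviations in a hash and run a single s///. This is only a sketch: the three entries are taken from the code above, the rest of the table is not shown in the post, the sub name replace_refs is made up, and it uses /g (as in the question) rather than replacing only the first match per line.

```perl
use strict;
use warnings;
use utf8;

# Illustrative subset of the mapping; the real table would have all 66 entries.
my %book = (
    gen => 'ปฐมกาล',
    exo => 'อพยพ',
    lev => 'เลวีนิติ',
);

# Build one alternation such as "exo|gen|lev" from the hash keys.
my $alternation = join '|', map { quotemeta } sort keys %book;

# Hypothetical helper: replace every abbreviation+number in one line.
sub replace_refs {
    my ($line) = @_;
    $line =~ s/($alternation)0*(\d+)/$book{$1} $2/g;
    return $line;
}
```

Inside the file loop, print replace_refs($_); would then stand in for the long run of s/// lines.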