How do I escape special characters for a substitution in a Perl one-liner? - perl

Is there some way to replace a string such as # or * or ? or & without needing to put a "\" before it?
Example:
perl -pe 'next if /^#/; s/\#d\&/new_value/ if /param5/' test
In this example I need to replace #d& with new_value, but the old value might contain any character. How do I escape only the characters that need to be escaped?

You have several problems:
You are using \b incorrectly
You are replacing code with shell variables
You need to quote metacharacters
From perldoc perlre
A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it and a "\W" on the other side of it
Neither # nor & is a \w character, so a \b next to either of them matches only when the neighbouring character is a word character. On input such as "param5 #d&" your match fails. You may want to use something like s/(^|\s)\#d\&(\s|$)/${1}new text$2/
(^|\s) says to match either the start of the string (^) or a whitespace character (\s).
(\s|$) says to match either the end of the string ($) or a whitespace character (\s).
To solve the second problem, you should use %ENV.
To solve the third problem, you should use the \Q and \E escape sequences to escape the value in $ENV{a}.
Putting it all together we get:
#!/bin/bash
export a='#d&'
export b='new text'
echo 'param5 #d&' |
perl -pe 'next if /^#/; s/(^|\s)\Q$ENV{a}\E(\s|$)/$1$ENV{b}$2/ if /param5/'
Which prints
param5 new text
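To see the difference between the \b version and the whitespace-anchored version, here is a minimal sketch as a standalone script. The sample line and the variable names ($old, $new, $line) are assumptions for illustration, not part of the question.

```perl
use strict;
use warnings;

my $old  = '#d&';          # value to find (assumed sample)
my $new  = 'new text';     # replacement (assumed sample)
my $line = 'param5 #d&';   # assumed sample input line

# \b needs a \w character on exactly one side. Between " " and "#",
# and between "&" and the end of the string, there is none, so no match.
(my $with_b = $line) =~ s/\b\Q$old\E\b/$new/;

# Anchoring on start/end of string or whitespace works for any characters.
(my $with_ws = $line) =~ s/(^|\s)\Q$old\E(\s|$)/$1$new$2/;

print "with \\b: $with_b\n";
print "with \\s: $with_ws\n";
```

The first line printed is the unchanged input; the second shows the substituted text.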

As discussed at perldoc perlre:
...Today it is more common to use the quotemeta() function or the "\Q" metaquoting
escape sequence to disable all metacharacters' special meanings like this:
/$unquoted\Q$quoted\E$unquoted/
Beware that if you put literal backslashes (those not inside interpolated variables) between "\Q" and "\E", double-quotish backslash interpolation may
lead to confusing results. If you need to use literal backslashes within "\Q...\E", consult "Gory details of parsing quoted constructs" in perlop.
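As a small illustration of the passage above, this sketch shows what quotemeta() produces and that \Q...\E has the same effect inside a pattern. The sample string is an assumption chosen to contain several metacharacters.

```perl
use strict;
use warnings;

my $raw     = 'a.b*c?';           # assumed sample containing metacharacters
my $escaped = quotemeta($raw);    # every non-word character gets a backslash

print "$escaped\n";               # a\.b\*c\?

# \Q...\E inside a pattern does the same escaping at interpolation time.
my $same = ($raw =~ /^\Q$raw\E$/) && ($raw =~ /^$escaped$/);

# Without escaping, ".", "*" and "?" act as regex operators,
# so an unrelated string can match.
my $loose = 'aXbbbc' =~ /^$raw$/;

print "literal match: ", ($same ? 'yes' : 'no'), "\n";
print "loose match:   ", ($loose ? 'yes' : 'no'), "\n";
```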
You can also use ' as the delimiter in the s/// operation to turn off variable interpolation in both the pattern and the replacement (regex metacharacters such as * keep their meaning):
my $text = '#';
$text =~ s'#'1';
print $text;
In your example, you can do (note the single quotes):
perl -pe 's/\b\Q#f&\E\b/new_value/g if m/param5/ and not /^ *#/'
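A minimal sketch of what the single-quote delimiter does and does not do, using made-up sample strings: it suppresses interpolation of variables, but the pattern is still compiled as a regular expression.

```perl
use strict;
use warnings;

my $price = 'cost: $5';
# No interpolation: "$10" in the replacement is inserted literally,
# and \$ in the pattern is an ordinary regex escape for a dollar sign.
$price =~ s'\$5'$10';

my $stars = 'aaab a*b';
# The pattern is still a regex: a*b means "zero or more a, then b",
# so the first match is "aaab", not the literal text "a*b".
$stars =~ s'a*b'X';

print "$price\n";
print "$stars\n";
```

So single-quote delimiters help with $ and @, but characters such as * or ? still need a backslash or \Q...\E.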

The other answers have covered the question. Now here's your meta-problem: Leaning Toothpick Syndrome. It's when the delimiter and escapes start to blur together:
s/\/foo\/bar\\/\/bar\/baz/
The solution is to use a different delimiter. You can use just about anything, but balanced braces work best. Most editors can parse them and you generally don't have to worry about escaping.
s{/foo/bar\\}{/bar/baz}
Here's your regex with braced delimiters.
s{\#d\&}{new_value}
Much easier on the eyeholes.
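A quick sketch confirming that the two spellings above behave identically; the input string is an assumed sample.

```perl
use strict;
use warnings;

my $input = '/foo/bar\\';     # the text /foo/bar followed by one backslash

# Slash delimiters: every "/" in the pattern and replacement needs escaping.
(my $slashes = $input) =~ s/\/foo\/bar\\/\/bar\/baz/;

# Brace delimiters: only the backslash itself still needs escaping.
(my $braces = $input) =~ s{/foo/bar\\}{/bar/baz};

print "$slashes\n$braces\n";
```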

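If you really want to avoid typing the backslashes, put your search string into a variable and then use that in your regex instead. For a string like '#d&' you don't need quotemeta or \Q ... \E, because neither # nor & is a regex metacharacter; if the string may contain metacharacters such as * or ?, wrap the variable in \Q ... \E. For example: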
my $s = '#d&';
s/$s/new_value/g;
If you must use this in a one-liner, bear in mind that you will have to escape each $ if you wrap your Perl code in double quotes, or escape each ' if you wrap it in single quotes.

If you have a string like
my $var1 = 'abc$123';
and you want to replace it with abcd, then you have to use \Q and \E. If you don't, the $ in the interpolated pattern is treated as an end-of-line anchor, so Perl never matches the string.
This is the only thing that worked for me.
$var2 =~ s/\Q$var1\E/abcd/g;
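A minimal sketch of this case, with an assumed sample string in $text, comparing the substitution with and without \Q...\E:

```perl
use strict;
use warnings;

my $var1 = 'abc$123';               # single quotes keep the $ literal
my $text = 'before abc$123 after';  # assumed sample input

# Without \Q the interpolated "$" is an anchor, so nothing matches.
(my $unquoted = $text) =~ s/$var1/abcd/g;

# With \Q...\E the "$" is matched as a literal character.
(my $quoted = $text) =~ s/\Q$var1\E/abcd/g;

print "$unquoted\n$quoted\n";
```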

Related

Perl global substitution of a file path

I am reading a tab-delimited file using Perl and want to apply a global substitution to a file path within this file. I have read that I need to incorporate \Q and \E into my substitution command, but I'm not able to get the substitution to work. I want to replace the partial string psoft/batch/cs with ps/bat/csprd.
$xl[$idx] =~ s/\Qpsoft/batch/cs\E/\Q/psoft/batch/csprd\E/g;
You can't use \Q to escape a delimiter. For example,
s/\Qa*b//
is equivalent to
s/a\*b//
and not
s/a\*b\/\/...
That means
$xl[$idx] =~ s/\Qpsoft/batch/cs\E/\Q/psoft/batch/csprd\E/g;
is equivalent to
$xl[$idx] =~ s/psoft/batch/cs <junk>
Solution:
$xl[$idx] =~ s/psoft\/batch\/cs/\/psoft\/batch\/csprd/g;
Better:
$xl[$idx] =~ s{psoft/batch/cs}{/psoft/batch/csprd}g;
In more detail
There are three steps to parsing an m//, qr// or s/// operator.
The first step is to obtain the trailing flags that affect how the regex pattern is parsed (e.g. x, s, m, i, etc). Since Perl doesn't yet know how to parse the regex pattern and to keep costs down, Perl simply looks for the delimiter marking the end of the pattern and the end of the substitution (usually /), paying attention to no other character other than backslashes (\). \Q is ignored at this point.
The second step is where the double-quoted string escapes (e.g. \Q, \L, etc) and interpolation occurs. Perl won't have a regex pattern until these are processed.
Finally, Perl has a regex pattern and knows how to compile it, so the third step is to compile the regex pattern.
The first problem is that you need to use a different set of delimiters for the substitution operator. Instead of s///, you can use s{}{}. Another problem is that you should not use \Q and \E on the right side of s/// because the right side is not a regular expression. In your case, you don't need \Q and \E at all:
s{psoft/batch/cs}{/psoft/batch/csprd}g;
Refer to s/PATTERN/REPLACEMENT/
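If the search text comes from a variable rather than being typed into the pattern, \Q...\E is still useful on the pattern side only. The following is a sketch under assumptions: the record layout, $from, and $to are made up for illustration, since the question does not show the real file.

```perl
use strict;
use warnings;

# Assumed sample record; the real file layout is not shown in the question.
my $line = join "\t", 'job1', '/psoft/batch/cs/run.sh', 'daily';
my @xl   = split /\t/, $line;

my $from = 'psoft/batch/cs';    # literal text to find
my $to   = 'ps/bat/csprd';      # literal replacement

# \Q...\E only on the pattern side; braces avoid clashing with "/".
s{\Q$from\E}{$to}g for @xl;

print join("\t", @xl), "\n";
```

Because the slashes arrive through interpolation, they can never be mistaken for the delimiter, whichever delimiter is used.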

Different ways of expression quotes in Perl

There are two ways to express quotations:
' apostrophe
’ single quotation
In Perl, I can match ' apostrophe using regular expressions. However, I can't match ’ single quotation in same way.
What's the problem here? Thanks a lot!
What you call "single quotation" is the Unicode character "RIGHT SINGLE QUOTATION MARK" (U+2019). When dealing with Unicode characters in Perl, be sure to properly identify the encoding of the input and of the script. See perlunicode - Unicode support in Perl for details.
$ perl -CO -E 'use utf8; say for "’Hello’" =~ /(’)/g'
’
’
use strict;
use warnings;
my $validq1=qq|' apostrophe|;
my $validq2=qq|’ single quotation|;
my $noquotes=qq| teapot|;
my @listofquotechars=qw(' ` " ’);
my $quotematcher="[".join("",map {quotemeta($_)} @listofquotechars)."]";
print $validq1 if ($validq1 =~ /$quotematcher/);
print $validq2 if ($validq2 =~ /$quotematcher/);
print $noquotes if ($noquotes =~ /$quotematcher/);
Giving a list of characters you wish to match and then making a character class for a regular expression is one way of doing it, as shown above.
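A variant of the same idea, sketched with the code point written as \x{2019} so the script does not depend on the encoding of the source file (a literal ’ in the source would normally call for use utf8). The list of quote characters and the sample strings are assumptions.

```perl
use strict;
use warnings;

# U+2019 is RIGHT SINGLE QUOTATION MARK; the list itself is an assumption.
my @quote_chars = ("'", '`', '"', "\x{2019}");
my $class       = '[' . join('', map { quotemeta($_) } @quote_chars) . ']';
my $quote_re    = qr/$class/;

my @samples = ("' apostrophe", "\x{2019} single quotation", ' teapot');
my @matched = grep { $_ =~ $quote_re } @samples;

print scalar(@matched), " of ", scalar(@samples),
      " samples contain a quote character\n";
```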

What is this replacement doing?

$_ =~ s/\#N\/A//g;
Is it replacing \#N\ with A\? Are these special characters in Perl? Sorry, I have no idea how to even look up this syntax.
It is removing #N/A from the string $_.
\#N matches #N (escaping the #)
\/A matches /A (escaping the /)
You can simplify how confusing this looks by changing the substitution delimiter:
$_ =~ s|#N/A||g;
Like Hunter McMillen said, it is removing #N/A from the default variable.
But you can code something more readable and shorter :
s!#N/A!!g;
As @Hunter McMillen mentioned, it's just normal regex substitution with special characters escaped. It's probably better written as
s|#N/A||g
or
s{#N/A}{}g
In Damian Conway's Perl Best Practices, to make regular expressions more readable, he recommends:
s{ \# N \/ A }{}gmsx;
That is:
Use curly braces around the pattern and replacement.
Use spaces to separate parts of the pattern via the /x switch.
Backslash escape non-alphanumeric characters, even if they aren't meta-characters.
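A short sketch showing that the compact and the extended (/x) spellings remove the same text; the sample value is an assumption.

```perl
use strict;
use warnings;

my $cell = 'total: #N/A';   # assumed sample value

(my $compact = $cell) =~ s|#N/A||g;

# Under /x, whitespace in the pattern is ignored and an unescaped "#"
# would start a comment, so the backslash before "#" is required here.
(my $extended = $cell) =~ s{ \# N \/ A }{}gmsx;

print "[$compact] [$extended]\n";
```

This is one case where the "escape every non-alphanumeric" habit matters: with /x the backslash before # is not optional.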

Perl string sub

I want to replace something with a path like C:\foo, so I:
s/hello/c:\foo
But that is invalid.
Do I need to escape some chars?
Two problems that I can see.
Your first problem is that your s/// replacement is not terminated:
s/hello/c:\foo # fatal syntax error: "Substitution replacement not terminated"
s/hello/c:\foo/ # syntactically okay
s!hello!c:\foo! # also okay, and more readable with backslashes (IMHO)
Your second problem, the one you asked about, is that the \f is taken as a form feed escape sequence (ASCII 0x0C), just as it would be in double quotes, which is not what you want.
You may either escape the backslash, or let variable interpolation "hide" the problem:
s!hello!c:\\foo! # This will do what you want. Note double backslash.
my $replacement = 'c:\foo'; # N.B.: Using single quotes here, not double quotes
s!hello!$replacement!; # This also works
Take a look at the treatment of Quote and Quote-like Operators in perlop for more information.
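The three variants can be compared side by side in a minimal sketch; the check for a form feed character is only there to make the difference visible.

```perl
use strict;
use warnings;

my $replacement = 'c:\foo';   # single quotes: the backslash stays literal

(my $wrong    = 'hello') =~ s!hello!c:\foo!;         # \f becomes a form feed
(my $doubled  = 'hello') =~ s!hello!c:\\foo!;        # literal backslash
(my $from_var = 'hello') =~ s!hello!$replacement!;   # value is not re-parsed

# Report which results contain a form feed (0x0C).
printf "%-8s contains form feed: %s\n", $_->[0], ($_->[1] =~ /\f/ ? 'yes' : 'no')
    for [wrong => $wrong], [doubled => $doubled], [from_var => $from_var];
```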
If I understand what you're asking, then this might be something like what you're after:
$path = "hello/there";
$path =~ s/hello/c:\\foo/;
print "$path\n";
To answer your question, yes you do need to double the backslash because \f is an escape sequence for "form feed" in a Perl string.
The problem is that you are not escaping special characters:
s/hello/c:\\foo/;
would solve your problem. \ is a special character, so you need to escape it. {}[]()^$.|*+?\ are the regex metacharacters (special characters) that you need to escape in a pattern.
Additional reference: http://perldoc.perl.org/perlretut.html

How can I prevent Perl from interpreting \ as an escape character?

How can I print an address string without making Perl take the backslashes as escape characters? I also don't want to alter the string by adding more escape characters.
What you're asking about is called interpolation. See the documentation for "Quote-Like Operators" at perldoc perlop, but more specifically the way to do it is with the syntax called the "here-document" combined with single quotes:
Single quotes indicate the text is to be treated literally with no interpolation of its content. This is similar to single quoted strings except that backslashes have no special meaning, with \\ being treated as two backslashes and not one as they would in every other quoting construct.
This is the only form of quoting in perl where there is no need to worry about escaping content, something that code generators can and do make good use of.
For example:
my $address = <<'EOF';
blah@blah.blah.com\with\backslashes\all\over\theplace
EOF
You may want to read up on the various other quoting operators such as qw and qq (at the same document as I referenced above), as they are very commonly used and make good shorthand for other more long-winded ways of escaping content.
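To make the difference between a single-quoted string and a single-quoted here-document concrete, here is a minimal sketch that compares their lengths; the sample text is an assumption.

```perl
use strict;
use warnings;

# In a single-quoted string, \\ collapses to one backslash.
my $quoted = 'a\\b';

# In a single-quoted here-document nothing is special:
# \\ stays as two backslash characters.
my $heredoc = <<'EOF';
a\\b
EOF
chomp $heredoc;   # drop the trailing newline the here-document adds

print length($quoted), " ", length($heredoc), "\n";   # prints "3 4"
```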
Use single quotes. For example
print 'lots\of\backslashes', "\n";
gives
lots\of\backslashes
If you want to interpolate variables, use the . operator, as in
$var = "pesky";
print 'lots\of\\' . $var . '\backslashes', "\n";
Notice that you have to escape the backslash at the end of the string.
As an alternative, you could use join:
print join("\\" => "lots", "of", $var, "backslashes"), "\n";
We could give much more helpful answers if you'd give us sample code.
It depends what you're escaping, but the Quote-like operators may help.
See the perlop man page.
Use the backslash two times:
print "This is a backslash character \\";