Perl string sub - perl

I want to replace something with a path like C:\foo, so I:
s/hello/c:\foo
But that is invalid.
Do I need to escape some chars?

Two problems that I can see.
Your first problem is that your s/// replacement is not terminated:
s/hello/c:\foo # fatal syntax error: "Substitution replacement not terminated"
s/hello/c:\foo/ # syntactically okay
s!hello!c:\foo! # also okay, and more readable with backslashes (IMHO)
Your second problem, the one you asked about, is that the \f is taken as a form feed escape sequence (ASCII 0x0C), just as it would be in double quotes, which is not what you want.
You may either escape the backslash, or let variable interpolation "hide" the problem:
s!hello!c:\\foo! # This will do what you want. Note double backslash.
my $replacement = 'c:\foo'; # N.B.: Using single quotes here, not double quotes
s!hello!$replacement!; # This also works
Take a look at the treatment of Quote and Quote-like Operators in perlop for more information.
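A minimal sketch of the three cases above, side by side. The input string 'say hello' and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# 'say hello' is a made-up input for illustration.
my $replacement = 'c:\foo';    # single quotes: backslash and f stay two characters

(my $escaped   = 'say hello') =~ s!hello!c:\\foo!;       # doubled backslash
(my $via_var   = 'say hello') =~ s!hello!$replacement!;  # interpolated value is used as-is
(my $unescaped = 'say hello') =~ s!hello!c:\foo!;        # \f turns into a form feed

print "$escaped\n";
print "$via_var\n";
print length($unescaped), "\n";   # one character shorter: the form feed is a single character
```

The first two results are identical; only the unescaped form loses the backslash.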

If I understand what you're asking, then this might be something like what you're after:
$path = "hello/there";
$path =~ s/hello/c:\\foo/;
print "$path\n";
To answer your question: yes, you do need to double the backslash, because the replacement part of s/// is treated like a double-quoted string, where \f is the escape sequence for "form feed".

The problem is that you are not escaping special characters:
s/hello/c:\\foo/;
would solve your problem. \ is a special character, so you need to escape it. In the pattern half of s///, {}[]()^$.|*+?\ are regex metacharacters, which you need to escape when you want to match them literally.
Additional reference: http://perldoc.perl.org/perlretut.html

Related

Perl q function or single quote doesn't return the string literal of UNC path correctly

Perl's q function or single quote is supposed to return the string literal as typed (except \'). But it doesn't work as expected for the following scenario.
I want to print the following UNC path
\\dir1\dir2\dir3
So I have used
my $path = q(\\dir1\dir2\dir3);
OR
my $path = '\\dir1\dir2\dir3';
But this skips one backslash at the front.
So if I print it i.e. print $path; it prints
\dir1\dir2\dir3
I want to know why? I have to type 3 or 4 backslashes at the beginning of the UNC path to make it work as expected. What am I missing?
From perldoc perlop:
q/STRING/
'STRING'
A single-quoted, literal string. A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated.
Change:
my $path = q(\\dir1\dir2\dir3);
to:
my $path = q(\\\dir1\dir2\dir3);
As for why, it's because Perl lets you include the quote delimiter in your string by escaping it with a backslash:
my $single_quote = 'This is a single quote: \'';
But if a backslash before the delimiter always escaped the delimiter, there would be no way to end a string with a backslash:
my $backslash = 'This is a backslash: \'; # nope
Allowing backslashes to be escaped too takes care of that:
my $backslash = 'This is a backslash: \\';
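As a rough illustration of the rule quoted from perlop, the sketch below compares two, three and four leading backslashes inside q(). The variable names are chosen here for the example only.

```perl
use strict;
use warnings;

my $two   = q(\\dir1\dir2\dir3);     # \\ collapses to a single backslash
my $three = q(\\\dir1\dir2\dir3);    # \\ gives one backslash, then \d is kept as typed
my $four  = q(\\\\dir1\dir2\dir3);   # two \\ pairs give two backslashes

print "$two\n";
print "$three\n";
print "$four\n";
print $three eq $four ? "three and four agree\n" : "three and four differ\n";
```

Three and four backslashes give the same UNC path because a backslash that is not followed by another backslash or by the delimiter is kept as typed.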
Interestingly enough, there is only one way to type double backslashes in a Perl string without them being collapsed into a single backslash.
As the other answers showed, q// and single quotes treat a backslash as a literal backslash unless another one follows it directly.
The only way to get the double backslashes exactly as you have typed them is to use a single-quoted here-doc.
my $path = <<'VISTA';
\\dir1\dir2\dir3
VISTA
chomp $path;
print $path."\n";
Would print it exactly as you've typed it in.

Substitution operator in Perl

I am new to perl. I have the following substitution expression:
$tmp =~ s:/x/y/z::;
I have searched a lot for it but couldn't find a similar expression.
What does it mean?
You can use any non-whitespace character as a delimiter; here, instead of the most common / (s/foo/bar/), the delimiter is : (s:foo:bar:), because what you are substituting contains slash characters, and if you used a slash delimiter you'd have to escape them (s/\/x\/y\/z//), which many people consider ugly.
So your expression is simply removing the first /x/y/z from $tmp.
That means: replace /x/y/z with nothing.
For example: if you have a string like /a/b/x/y/z, the result will be /a/b
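A small sketch of the same substitution written with both delimiters. The sample values of $tmp are invented for the example.

```perl
use strict;
use warnings;

my $tmp = '/a/b/x/y/z';                       # made-up sample value

(my $colon = $tmp) =~ s:/x/y/z::;             # colon delimiter, no escaping needed
(my $slash = $tmp) =~ s/\/x\/y\/z//;          # slash delimiter, every / escaped

(my $first_only = '/x/y/z/x/y/z') =~ s:/x/y/z::;   # without /g only the first match is removed

print "$colon\n";
print "$slash\n";
print "$first_only\n";
```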

How do I escape special characters for a substitution in a Perl one-liner?

Is there some way to replace a string such as # or * or ? or & without needing to put a "\" before it?
Example:
perl -pe 'next if /^#/; s/\#d\&/new_value/ if /param5/' test
In this example I need to replace a #d& with new_value but the old value might contain any character, how do I escape only the characters that need to be escaped?
You have several problems:
You are using \b incorrectly
You are replacing code with shell variables
You need to quote metacharacters
From perldoc perlre
A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it
Neither of the characters # or & are \w characters. So your match is guaranteed to fail. You may want to use something like s/(^|\s)\#d\&(\s|$)/${1}new text$2/
(^|\s) says to match either the start of the string (^) or a whitespace character (\s).
(\s|$) says to match either the end of the string ($) or a whitespace character (\s).
To solve the second problem, you should use %ENV.
To solve the third problem, you should use the \Q and \E escape sequences to escape the value in $ENV{a}.
Putting it all together we get:
#!/bin/bash
export a='#d&'
export b='new text'
echo 'param5 #d&' |
perl -pe 'next if /^#/; s/(^|\s)\Q$ENV{a}\E(\s|$)/$1$ENV{b}$2/ if /param5/'
Which prints
param5 new text
As discussed at perldoc perlre:
...Today it is more common to use the quotemeta() function or the "\Q" metaquoting
escape sequence to disable all metacharacters' special meanings like this:
/$unquoted\Q$quoted\E$unquoted/
Beware that if you put literal backslashes (those not inside interpolated variables) between "\Q" and "\E", double-quotish backslash interpolation may
lead to confusing results. If you need to use literal backslashes within "\Q...\E", consult "Gory details of parsing quoted constructs" in perlop.
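To make the perlre passage concrete, here is a short sketch of what \Q ... \E and quotemeta() change. The search text 'a.c' is a hypothetical value picked because it contains a metacharacter.

```perl
use strict;
use warnings;

my $needle = 'a.c';   # hypothetical search text containing a regex metacharacter

my $loose   = 'abc' =~ /\A$needle\z/     ? 1 : 0;   # the dot matches any character
my $literal = 'abc' =~ /\A\Q$needle\E\z/ ? 1 : 0;   # the dot must be a literal dot
my $quoted  = quotemeta('#d&');                     # backslashes every non-word character

print "$loose $literal $quoted\n";
```

Without \Q the interpolated value is still compiled as a pattern, which is why 'abc' matches in the first test and not in the second.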
You can also use a ' as the delimiter in the s/// operation to make everything be parsed literally:
my $text = '#';
$text =~ s'#'1';
print $text;
In your example, you can do (note the single quotes):
perl -pe 's/\b\Q#f&\E\b/new_value/g if m/param5/ and not /^ *#/'
The other answers have covered the question; now here's your meta-problem: Leaning Toothpick Syndrome. It's when the delimiter and escapes start to blur together:
s/\/foo\/bar\\/\/bar\/baz/
The solution is to use a different delimiter. You can use just about anything, but balanced braces work best. Most editors can parse them and you generally don't have to worry about escaping.
s{/foo/bar\\}{/bar/baz}
Here's your regex with braced delimiters.
s{\#d\&}{new_value}
Much easier on the eyeholes.
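If you really want to avoid typing the backslashes, put your search string into a variable and then use that in your regex instead. For this particular string you don't need quotemeta or \Q ... \E, because neither # nor & is a regex metacharacter; a value that may contain characters such as . * + ? $ still needs \Q ... \E. For example: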
my $s = '#d&';
s/$s/new_value/g;
If you must use this in a one-liner, bear in mind that you will have to escape the dollar signs if you wrap your Perl code in double quotes, or escape the single quotes if you wrap it in single quotes.
If you have a string like
my $var1 = 'abc$123';
and you want to replace it with abcd, then you have to use \Q ... \E. If you don't, the $ is treated as an end-of-line anchor, the pattern never matches, and Perl replaces nothing.
This is the only thing that worked for me.
$var2 =~ s/\Q$var1\E/abcd/g;
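A hedged sketch of why \Q ... \E matters for a value containing a dollar sign. The text 'price: abc$123' is an invented example.

```perl
use strict;
use warnings;

my $var1 = 'abc$123';
my $text = 'price: abc$123';    # invented sample text

# Interpolated as-is, the pattern is abc$123, where $ is an anchor, so nothing matches.
(my $plain  = $text) =~ s/$var1/abcd/g;

# With \Q ... \E the dollar sign is matched literally.
(my $quoted = $text) =~ s/\Q$var1\E/abcd/g;

print "$plain\n";
print "$quoted\n";
```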

What is the correct usage of (nested | double | simple) quotes

I'm sure this question may seem foolish to some of you, but I'm here to learn.
Are these assumptions true for most of the languages ?
EDIT : OK, let's assume I'm talking about Perl/Bash scripting.
'Single quotes'
=> No interpretation at all (e.g. '$' or any metacharacter will be considered as a character and will be printed on screen)
"Double quotes"
=> Variable interpretation
To be more precise about my concerns, I'm writing some shell scripts (in which quotes can sometimes be a big hassle), and wrote this line :
CODIR=`pwd | sed -e "s/$MODNAME//"`
If I had used single quotes in my sed, my pattern would have been '$MODNAME', right ? (and not the actual value of $MODNAME, which is `alpha' in this particular case)
Another problem I had, with an awk inside an echo :
USAGE=`echo -ne "\
Usage : ./\`basename $0\` [-hnvV]\n\
\`ls -l ${MODPATH}/reference/ | awk -F " " '$8 ~ /\w+/{print "> ",$8}'\`"`
I spent some time debugging that one. I came to the conclusion that backticks were escaped so that the interpreter doesn't "split" the command (and stop right before «basename»). In the awk command, '$8' is successfully interpreted by awk, thus not by shell. What if I wanted to use a shell variable ? Would I write awk -F "\"$MY_SHELL_VAR\"" ? Because $MY_SHELL_VAR as is, will be interpreted by awk, won't it ?
Don't hesitate to add any information about quoting or backticks !
Thank you ! :)
It varies massively by language. For example, in the C/Java/C++/C# etc family, you can't use single quotes for a string at all - they're only for single characters.
I think it's far better to learn the rules properly for the languages you're actually interested in than to try to generalise.
Are these assumptions true for most of the languages ?
Answer: No
In bash scripting, backticks are deprecated in favor of $() in part because it is non-obvious how nested quotes and escaping are supposed to work. You may also want to take a look at Bash Pitfalls.
It's definitely not the same for all languages. In Python, for example, single and double quotes are interchangeable. The only difference is that you can include single quotes within a double-quoted string without escaping them and vice versa ("How's it going?").
Also, there are triple-quoted strings that can span multiple lines.
In Perl, you also have q() and qq() to help you in nested quoting situations:
my $x = q(a string with 'single quotes');
my $y = qq(an $interpreted string with "double quotes");
These certainly will help you avoid "\"needlessly\"" '\'escaping\'' internal quotes.
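As a quick check of how q() and qq() behave, the sketch below uses a hypothetical $name variable; q() leaves it untouched while qq() interpolates it.

```perl
use strict;
use warnings;

my $name = 'world';   # hypothetical variable for the example

my $x = q(a string with 'single quotes' and $name);     # no interpolation
my $y = qq(hello $name with "double quotes");           # $name is interpolated
my $z = q(balanced (nested) parentheses need no escaping);

print "$x\n";
print "$y\n";
print "$z\n";
```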
Yes, something like awk -F "\"$MY_SHELL_VAR\"" will work, however in this case you wouldn't be able to use variables in awk, since they will be interpreted by shell, so the way to go is something like this (I will use command simpler than yours, if you don't mind :) ):
awk -F " " '$8 ~ /\w+/{print "> ",$8, '$SOME_SHELL_VAR'}'
Note the single quotes terminating and restarting.
The trickiest part, usually, is to pass a quote in the argument to the command. In this case you need to terminate single quote, add escaped quote character, start quote again, like this:
awk '$1 ~ '\''{print}'
Note, that single quote can't be escaped inside single quotes, since the "\" won't be treated as an escape character.
This is probably not related directly to your question, but still useful.
I don't know about perl, but for bash you don't need to backslash the newline.
As for quotes, I have a (very personal) pattern that I call the "five quotes" pattern. It helps to put one quote in a string enclosed by the same kind of quotes
For instance:
doublequoted="some things "'"'"quoted"'"'" and some not"
simplequoted='again '"'"'quote this'"'"' but not that'
Note that you can freely append strings with different kinds of quotes, which is useful when you want the shell to interpret some vars but not others:
awk -F " " '$8 ~ /\w+/{print "> ",$8, '"$SOME_SHELL_VAR"'}'
Also, I don't use the backtick anymore but the $(...) pattern, which is more legible and can be nested.
USAGE=$(echo -ne "
Usage : ./$(basename $0) [-hnvV]\n
$(ls -l ${MODPATH}/reference/ | awk -F " " '$8 ~ /\w+/{print "> ",$8}')")
In perl, double quoted strings will have their variables expanded.
If you write that for instance:
my $email = "foo@bar.com";
Perl will try to expand @bar. If you use strict, you'll see a complaint about the array @bar not existing. If you don't, you'll just see weird behavior.
So it's better to write:
my $email = 'foo@bar.com';
For this kind of reason, my advice is to always use single quotes for strings, unless you know that you need variable expansion.
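A minimal sketch of the array-interpolation point, where an @ inside double quotes introduces an array. The failing form is left commented out so the script still compiles under strict.

```perl
use strict;
use warnings;

my $single  = 'foo@bar.com';     # single quotes: no interpolation
my $escaped = "foo\@bar.com";    # double quotes: the @ has to be escaped
# my $broken = "foo@bar.com";    # commented out: under strict this does not compile, since @bar is undeclared

print $single eq $escaped ? "same\n" : "different\n";
```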

How can I prevent Perl from interpreting \ as an escape character?

How can I print an address string without making Perl take the slashes as escape characters? I also don't want to alter the string by adding more escape characters.
What you're asking about is called interpolation. See the documentation for "Quote-Like Operators" at perldoc perlop, but more specifically the way to do it is with the syntax called the "here-document" combined with single quotes:
Single quotes indicate the text is to be treated literally with no interpolation of its content. This is similar to single quoted strings except that backslashes have no special meaning, with \\ being treated as two backslashes and not one as they would in every other quoting construct.
This is the only form of quoting in perl where there is no need to worry about escaping content, something that code generators can and do make good use of.
For example:
my $address = <<'EOF';
blah@blah.blah.com\with\backslashes\all\over\theplace
EOF
You may want to read up on the various other quoting operators such as qw and qq (at the same document as I referenced above), as they are very commonly used and make good shorthand for other more long-winded ways of escaping content.
Use single quotes. For example
print 'lots\of\backslashes', "\n";
gives
lots\of\backslashes
If you want to interpolate variables, use the . operator, as in
$var = "pesky";
print 'lots\of\\' . $var . '\backslashes', "\n";
Notice that you have to escape the backslash at the end of the string.
As an alternative, you could use join:
print join("\\" => "lots", "of", $var, "backslashes"), "\n";
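The two approaches above can be checked against each other; this sketch builds the same string by concatenation and by join, reusing the $var from the answer.

```perl
use strict;
use warnings;

my $var = "pesky";

# Concatenation: the trailing backslash of the first piece must be doubled.
my $concat = 'lots\of\\' . $var . '\backslashes';

# join: the separator is a single backslash, written as "\\".
my $joined = join("\\", "lots", "of", $var, "backslashes");

print "$concat\n";
print "$joined\n";
```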
We could give much more helpful answers if you'd give us sample code.
It depends what you're escaping, but the Quote-like operators may help.
See the perlop man page.
Use the backslash two times,
print "This is a backslash character \\";