A Perl beginner here. I have been working on some simple one-liners to find and replace text in a file. I read about escaping all special characters with \Q\E or quotemeta(), but found that this only works when interpolating a variable. For example, when I try to replace the part containing special characters directly, it fails. But when I store it in a scalar first, it works. Of course, if I escape all the special characters with backslashes, it also works.
$ echo 'One$~^Three' | perl -pe 's/\Q$~^\E/Two/'
One$~^Three
$ echo 'One$~^Three' | perl -pe '$Sub=q($~^); s/\Q$Sub\E/Two/'
OneTwoThree
$ echo 'One$~^Three' | perl -pe 's/\$\~\^/Two/'
OneTwoThree
Can anyone explain this behavior and also show if any alternative exists that can directly quote special characters without using backslashes?
Interpolation happens first, then \Q, \U, \u, \L and \l.
That means
"abc\Qdef$ghi!jkl\Emno"
is equivalent to
"abc" . quotemeta("def" . $ghi . "!jkl") . "mno"
So,
s/\Q$~^/Two/ # not ok quotemeta($~ . "^")
s/\Q$Sub/Two/ # ok
s/\$\~\^/Two/ # ok
s/\$\Q~^/Two/ # ok
I want to rename files with 'sr' in their names, replacing 'sr' with 'SR'. This one succeeded:
ls | perl -e 'while(<>){chomp;if(/(.*)sr(.*)/){rename $_,$1."SR".$2}}'
But this one failed:
ls | perl -e "while(<>){chomp;if(/sr/){rename $_,$\`.'SR'.($')}}"
with this error message:
Not enough arguments for rename at -e line 1, near "rename ,"
Execution of -e aborted due to compilation errors.
It seems that $_ has become an empty string, but I don't quite understand why. Thanks for any explanations.
Now quotes have been an interesting problem and this is my test:
ls | perl -e "while(<>){chomp;if(/sr/){print $_;print\"\n\";print $\`,$&,($');print \"\n\";print $_,$\`,$&,($');print\"\n\";print $_;print\"\n\"}}"
outputs this:
3sr
3sr
3sr
3sr
sr1
sr1
sr1
sr1
sr2
sr2
sr2
sr2
It seems that when used alone, $_ is not empty, but it becomes empty when used along with $`, $& and $'. Judging by the last line printed for each file, I guess $_ is temporarily changed when it is not used alone?
Besides, following a1111exe's answer, I tested this:
ls | perl -e "while(<>){chomp;if(/sr/){print \$_,$\`,$&,($');print \"\n\"}}"
and got this:
3sr3sr
sr1sr1
sr2sr2
First, on Linux you should use single quotes instead of double quotes.
Instead of the ls command, you can use Perl's built-in glob function.
To capture the pre-match and post-match text, you can use $PREMATCH and $POSTMATCH from the English module.
So your one-liner should be
perl -MEnglish -e 'while(<*>){chomp;if(/sr/){rename $_,$PREMATCH."SR".$POSTMATCH}}'
EDITED
Single quotes versus double quotes is not about Perl; it is about the shell.
Single quote
Enclosing characters in single quotes (') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
Double quote
Enclosing characters in double quotes (‘"’) preserves the literal value of all characters within the quotes, with the exception of ‘$’, ‘`’, ‘\’, and, when history expansion is enabled, ‘!’.
In a shell script, variables are accessed with a $ prefix, so a $ inside double quotes makes the shell look for a shell variable, not a Perl variable. For example, you can run the following line in your terminal:
m=4; perl -e "print $m;"
Here
m=4; perl -e "print $m;"
^                   ^
|                   Accessing shell variable
Assigning shell variable
The output is 4, because m is a shell variable and the shell substitutes its value before Perl ever sees the script.
And on Windows, you need to use double quotes instead of single quotes.
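The effect can be seen without involving Perl at all. The sketch below only builds the strings that a program would receive as its argument; the variable names double, single and escaped are made up for the illustration.

```shell
m=4
# Double quotes: the shell replaces $m before the text reaches any program.
double="print $m;"
# Single quotes: the text is passed through untouched.
single='print $m;'
# Backslash inside double quotes: the $ survives as a literal character.
escaped="print \$m;"
printf '%s\n' "$double" "$single" "$escaped"
# prints:
# print 4;
# print $m;
# print $m;
```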
It seems that the double quotes cause a mix-up between your shell environment and Perl. You can certainly do what @mkHun suggested. One other way:
ls | perl -e 'while(<>){chomp;($new=$_)=~s/sr/SR/g;rename $_,$new}'
Also, if you escape the '$' sigil in '$_', your one-liner will work too:
ls | perl -e "while(<>){chomp;if(/sr/){rename \$_,$\`.'SR'.$'}}"
I still don't get why, though. But it really seems like a bash/Perl interpolation issue.
Why is it that simply changing from enclosing my one-liner with ' instead of " affects the behavior of the code? The first line of code produces what is expected and the second line of code gives (to me!) an unexpected result, printing out an unexpected array reference.
$ echo "puke|1|2|3|puke2" | perl -lne 'chomp;@a=split(/\|/,$_);print $a[4];'
puke2
$ echo "puke|1|2|3|puke2" | perl -lne "chomp;@a=split(/\|/,$_);print $a[4];"
ARRAY(0x1f79b98)
This is the Perl version:
$ perl -v
This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi
With double quotes you are letting the shell interpolate variables first.
As you can check, $_ and $a are unset in the subshell forked for pipe by the parent shell. See a comment on $_ below.
So the double-quoted version is effectively
echo "puke|1|2|3|puke2" | perl -lne 'chomp;@a=split(/\|/);print [4];'
which prints the arrayref [4].
A comment on the effects of having $_ exposed to Bash. Thanks to Borodin for bringing this up.
The $_ is one of a handful of special shell parameters in Bash. It contains the last argument of the previous command, or the pathname of what invoked the shell or commands (via _ environment variable). See the link for a full description.
However, here it is being interpreted in a subshell forked to run the perl command, as the first thing that subshell does. Apparently it is not even set, as seen with
echo hi; echo hi | echo $_
which prints an empty line (after first hi). The reason may be that the _ environment variable just isn't set for a subshell for a pipe, but I don't see why this would be the case. For example,
echo hi; (echo $_)
prints two lines with hi even though ( ) starts a subshell.
In any case, $_ in the given pipeline isn't set.
The split part is then split(/\|/), so via default split(/\|/, $_) -- with nothing to split. With -w added this indeed prints a warning for use of uninitialized $_.
Note that this behavior depends on the shell. The tcsh won't run this with double quotes at all. In ksh and zsh the last part of pipeline runs in the main shell, not a subshell, so $_ is there.
This is actually a shell topic, not a Perl topic.
In shell:
Single quotes preserve the literal value of all of the characters they contain, including the $ and backslash. However, with double quotes, the $, backtick, and backslash characters have special meaning.
For example:
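'\"' evaluates to \"
whereas
"\"" evaluates to just "
because with double quotes, the backslash gets a special meaning as the escape character (but only in front of $, backtick, double quote, backslash, or newline; before any other character it stays literal).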
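A quick way to see the difference is to store each form in a variable and print it. The variable names below are made up for this sketch.

```shell
# Single quotes: the backslash is an ordinary character, so two characters remain.
single='\"'
# Double quotes: \" is an escape sequence that yields one double-quote character.
double="\""
# Double quotes: \$ yields a literal dollar sign instead of starting an expansion.
dollar="\$HOME"
printf '%s\n' "$single" "$double" "$dollar"
# prints:
# \"
# "
# $HOME
```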
I am trying to put a system command like the one below into a Perl script, but the sed expression contains both quotes and backticks, and I am not sure how to escape all of them so that my system command executes exactly as I need.
Here is the example of the command:
mysql -u root -D porta-billing -e "..." | sed "s/'/\'/;s/\t/\",\"/g;s/^/\"/;s/$/\"/;s/\n//g"
The answer to the question you're asking is to use the qx(...) operator. qx(...) is the "choose your own delimiter" version of backticks.
my $output = qx[ ... ];
Or
my $output = qx( ... );
Or
my $output = qx! ... !;
It's easy to find a delimiter that won't clash with the characters in your command string.
But the answer to the question that you should be asking has two parts:
Don't call mysql from your Perl program - use DBI instead.
Don't call sed from your Perl program - use Perl code to manipulate your text.
I feel slightly nervous about the first part of my answer as I'm worried you will just take my hacky workaround and end up with an unmaintainable mess. Please take note of the advice in the second half - even if you ignore it in this case.
I have a variable in a shell script,
var=1234_number
I want to strip everything except the integer part of $var. How can I do it using a Perl one-liner?
You might be looking for something to edit the shell script, in which case, this might be sufficient:
perl -i.bak -pe 's/\b(var=\d+).*/$1/' shellscript.sh
The '-i' overwrites the original file, saving a copy in shellscript.sh.bak; the substitute command finds assignments to 'var' (and not any longer name ending 'var') followed by an equals sign, some digits, and any non-digits, and leaves behind just the assignment of digits.
In the example, it gives:
var=1234
Note that the Perl regex is not foolproof - it will mangle this (dropping the closing brace):
: ${var=1234_number}
Dealing with all such possible variants is extremely tricky:
echo $var=$other
OTOH, you might be looking to eliminate the non-digits from a variable within a shell script, in which case:
var=$(echo $var | perl -pe 's/\D//g')
You could also use 'sed' for the job:
var=$(echo $var | sed 's/[^0-9]//g')
No need to use anything but the shell for this
var=1234_abcd
var=${var%_*}
echo $var # => 1234
See 'Parameter Expansion' in the bash manual.
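A few related suffix-removal forms, sketched with a value that contains more than one underscore so that the difference between % (shortest match) and %% (longest match) shows up. All of them are POSIX parameter expansions; the variable names are illustrative.

```shell
var=12_34_abcd
shortest=${var%_*}        # drop the shortest suffix matching _*      -> 12_34
longest=${var%%_*}        # drop the longest suffix matching _*       -> 12
leading=${var%%[!0-9]*}   # drop everything from the first non-digit  -> 12
printf '%s\n' "$shortest" "$longest" "$leading"
# prints:
# 12_34
# 12
# 12
```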
I'm trying to find where two variables are being concatenated in a directory of scripts, but when I try the following:
grep -lire "$DATA_PATH . $AWARDS_YEAR" *
I get "undefined variable" errors...
I thought I could escape the $s by using:
grep -lire "\$DATA_PATH . \$AWARDS_YEAR" *
But I get the same error - so, how do you grep for strings with $s in?
Tcsh is a little different about variables than the usual shells, and it's the default on FreeBSD.
So, just use single quotes, '$VAR', or escape the $ outside of the quotes: \$"VAR"
Put it in single quotes, with the escaping backslash:
grep -lire '\$DATA_PATH . \$AWARDS_YEAR' *
Also note that the dot (.) is a regex metacharacter. If you don't want it to be, escape it too (or use -F for fixed-string matching, in which case the backslashes before the $s must be dropped).
Here's a nice reference with more general info.
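The single-quote advice can be checked against a couple of throwaway files. This is a sketch only: the file names and contents are invented, and it runs in a POSIX-style shell rather than tcsh.

```shell
tmp=$(mktemp -d)
# Hypothetical scripts: only a.sh contains the concatenation being searched for.
printf '%s\n' 'echo $DATA_PATH . $AWARDS_YEAR' > "$tmp/a.sh"
printf '%s\n' 'echo nothing here' > "$tmp/b.sh"
# Regex search: single quotes protect the $s from the shell, and \. keeps the dot literal.
regex_hits=$(cd "$tmp" && grep -lire '\$DATA_PATH \. \$AWARDS_YEAR' *)
# Fixed-string search: with -F nothing is a metacharacter, so no backslashes are needed.
fixed_hits=$(cd "$tmp" && grep -lirF '$DATA_PATH . $AWARDS_YEAR' *)
rm -rf "$tmp"
printf '%s\n' "$regex_hits" "$fixed_hits"
# prints:
# a.sh
# a.sh
```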