Ignoring escape characters in Perl

my %result = "\\path\tfolder\file.txt";
How can I ignore the \t escape sequence without prepending a '\'. Is there something like:
my %result = r"\\path\tfolder\file.txt";
The above doesn't work.

Single quotes process two escape sequences: \\ and \', so you would have to double the leading double-backslash but not the others:
my $result = '\\\\server\toppath\files';
To get what you want, you could use a here-document at the cost of some syntactic bulk.
chomp(my $result = <<'EOPath');
\\server\toppath\files
EOPath
Note the change of sigil from % to $ because a string is a scalar, and hashes are for associations.
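As a quick check on the quoting rules above, here is a minimal sketch that writes the same UNC path three ways and confirms they produce identical strings. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Single-quoted: only \\ and \' are processed, so the leading pair must be doubled.
my $single = '\\\\server\toppath\files';

# Here-document with a single-quoted terminator: no escape processing at all.
chomp(my $heredoc = <<'EOPath');
\\server\toppath\files
EOPath

# Double-quoted: every backslash has to be doubled.
my $double = "\\\\server\\toppath\\files";

print "$single\n";
print "all three are identical\n" if $single eq $heredoc and $heredoc eq $double;
```

The here-document is the only form in which the path appears exactly as it would be typed elsewhere.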

Related

How to escape all special characters in a string (along with single and double quotes)?

E.g:
$myVar="this###!~`%^&*()[]}{;'".,<>?/\";
I am not able to export this variable and use it as it is in my program.
Use q to store the characters, then use quotemeta to escape all the special characters:
my $myVar=q("this###!~`%^&*()[]}{;'".,<>?/\");
$myVar = quotemeta($myVar);
print $myVar;
Or use a regex substitution to escape every non-word character:
my $myVar=q("this###!~`%^&*()[]}{;'".,<>?/\");
$myVar =~s/(\W)/\\$1/g;
print $myVar;
This is what quotemeta is for, if I understand your question. From its documentation:
Returns the value of EXPR with all non-"word" characters backslashed. (That is, all characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash in the returned string, regardless of any locale settings.) This is the internal function implementing the \Q escape in double-quoted strings.
Its use is very simple
my $myVar = q(this###!~`%^&*()[]}{;'".,<>?/\\);
print "$myVar\n";
my $quoted_var = quotemeta $myVar;
print "$quoted_var\n";
Note that we must manually escape the last backslash, to prevent it from escaping the closing delimiter. Or you can tack on an extra space at the end, and then strip it (by chop).
my $myVar = q(this###!~`%^&*()[]}{;'".,<>?/\ );
chop $myVar;
Now transform $myVar like above, using quotemeta.
I take the outside pair of " to merely indicate what you'd like in the variable. But if they are in fact meant to be in the variable then simply put it all inside q(), since then the last character is ". The only problem is a backslash immediately preceding the closing delimiter.
If you need this in a regex context then you use \Q to start and \E to end escaping.
Giving Thanks to:
What's between \Q and \E is treated as normal characters, not regexp characters. For example,
'.' =~ /./; # match
'a' =~ /./; # match
'.' =~ /\Q.\E/; # match
'a' =~ /\Q.\E/; # no match
It doesn't stop variables from being interpolated.
$search = '.';
'.' =~ /$search/; # match
'a' =~ /$search/; # match
'.' =~ /\Q$search\E/; # match
'a' =~ /\Q$search\E/; # no match
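Tying this back to backslash-heavy strings such as Windows paths, here is a small sketch of the difference \Q makes when the interpolated value contains backslashes. The path and the sample line are made up for the example.

```perl
use strict;
use warnings;

my $needle = 'C:\temp\file.txt';                    # hypothetical path
my $line   = 'copied C:\temp\file.txt to backup';   # hypothetical input

# Without \Q, "\t" in the pattern means a tab, "\f" a form feed,
# and "." matches any character, so the literal path is not found.
my $plain  = ($line =~ /$needle/)     ? 1 : 0;

# With \Q...\E the interpolated value is matched character for character.
my $quoted = ($line =~ /\Q$needle\E/) ? 1 : 0;

print "plain=$plain quoted=$quoted\n";
```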

Perl tr operator is transliterating based on the variable's name not its value

I'm using Perl 5.16.2 to try to count the number of occurrences of a particular delimiter in the $_ string. The delimiter is passed to my Perl program via the @ARGV array. I verify that it is correct within the program. My instruction to count the number of delimiters in the string is:
$dlm_count = tr/$dlm//;
If I hardcode the delimiter, e.g. $dlm_count = tr/,//; the count comes out correctly. But when I use the variable $dlm, the count is wrong. I modified the instruction to say
$dlm_count = tr/$dlm/\t/;
and realized from where the tabs were inserted in the string that the operation was replacing every instance of any of the four characters "$", "d", "l", or "m" with \t, i.e. any of the four characters that made up my variable name $dlm.
Here is a sample program that illustrates the problem:
$_ = "abcdefghij,klm,nopqrstuvwxyz";
my $dlm = ",";
my $dlm_count = tr/$dlm/\t/;
print "The count is $dlm_count\n";
print "The modified string is $_\n";
There are only two commas in the $_ string, but this program prints the following:
The count is 3
The modified string is abc efghij,k ,nopqrstuvwxyz
Why is the $dlm token being treated as a literal string of four characters instead of as a variable name?
You cannot use tr that way; it doesn't interpolate variables. It performs a strictly character-by-character replacement. So this
$string =~ tr/a$v/123/
is going to replace every a with 1, every $ with 2, and every v with 3. It is not a regex but a transliteration. From perlop
Because the transliteration table is built at compile time, neither the SEARCHLIST nor the REPLACEMENTLIST are subjected to double quote interpolation. That means that if you want to use variables, you must use an eval():
eval "tr/$oldlist/$newlist/";
die $@ if $@;
eval "tr/$oldlist/$newlist/, 1" or die $@;
The above example from the docs hints at how to count. To count occurrences of $dlm in $string:
$dlm_count = eval "\$string =~ tr/$dlm//";
The $string is escaped so that it is not interpolated before it gets to eval. In your case
$dlm_count = eval "tr/$dlm//";
You can also use tools other than tr (or regex). For example, with string being in $_
my $dlm_count = grep { /$dlm/ } split //;
When split breaks $_ by the pattern that is empty string (//) it returns the list of all characters in it. Then the grep block tests each against $dlm so returning the list of as many $dlm characters as there were in $_. Since this is assigned to a scalar, $dlm_count is set to the length of that list, which is the count of all $dlm.
In the section of the docs on perlop 'Quote Like Operators', it states:
Because the transliteration table is built at compile time, neither
the SEARCHLIST nor the REPLACEMENTLIST are subjected to double quote
interpolation. That means that if you want to use variables, you must
use an eval():
As documented and as you discovered, tr/// doesn't interpolate. The simple solution is to use s/// instead.
my $dlm = ",";
$_ = "abcdefghij,klm,nopqrstuvwxyz";
my $dlm_count = s/\Q$dlm/\t/g;
If the transliteration is being performed in a loop, the following might speed things up noticeably:
my $dlm = ",";
my $tr = eval "sub { tr/\Q$dlm\E/\\t/ }";
for (...) {
my $dlm_count = $tr->();
...
}
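A runnable sketch of the same idea, restricted to counting, might look like the following. The sample input and the name $count_dlm are assumptions for illustration, and quotemeta is used so that a delimiter such as "-" or "/" does not break the generated tr///.

```perl
use strict;
use warnings;

my $dlm = ",";

# Compile the tr/// once; later calls run the closure with no further eval.
# The generated sub operates on $_, which the loop below aliases to each element.
my $count_dlm = eval sprintf 'sub { tr/%s// }', quotemeta $dlm;
die $@ unless $count_dlm;

my @lines = ("a,b,c", "no delimiters", ",,,,");   # hypothetical sample input
my @counts;
for (@lines) {
    push @counts, $count_dlm->();
}
print "@counts\n";
```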
Although several answers have hinted at the eval() idiom for tr///, none use a form that covers cases where the string contains characters that are special in tr syntax, e.g. - (hyphen):
$_ = "abcdefghij,klm,nopqrstuvwxyz";
my $dlm = ",";
my $dlm_count = eval sprintf "tr/%s/%s/", map quotemeta, $dlm, "\t";
But as others have noted, there are lots of ways to count characters in Perl that avoid eval(), here's another:
my $dlm_count = () = m/$dlm/go;
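To compare the approaches side by side, here is a minimal sketch that counts a delimiter three ways. The delimiter is deliberately a regex metacharacter to show why quoting matters, and the sample string is made up.

```perl
use strict;
use warnings;

my $text = "a|b|c,d|e";   # hypothetical sample string
my $dlm  = "|";           # a regex metacharacter on purpose

# eval-based tr///: quotemeta protects characters that are special to tr.
my $by_tr = eval sprintf '$text =~ tr/%s//', quotemeta $dlm;
die $@ if $@;

# Regex match in list context, counted by assigning through an empty list.
my $by_match = () = $text =~ /\Q$dlm\E/g;

# Split into characters and compare with eq, so no pattern is built from $dlm.
my $by_grep = grep { $_ eq $dlm } split //, $text;

print "$by_tr $by_match $by_grep\n";
```

Without \Q, the pattern /|/ would match the empty string at every position, so the regex count would be wrong for this delimiter.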

How can I manually interpolate string escapes in a Perl string?

In perl suppose I have a string like 'hello\tworld\n', and what I want is:
'hello world
'
That is, "hello", then a literal tab character, then "world", then a literal newline. Or equivalently, "hello\tworld\n" (note the double quotes).
In other words, is there a function for taking a string with escape sequences and returning an equivalent string with all the escape sequences interpolated? I don't want to interpolate variables or anything else, just escape sequences like \x, where x is a letter.
Sounds like a problem that someone else would have solved already. I've never used the module, but it looks useful:
use String::Escape qw(unbackslash);
my $s = unbackslash('hello\tworld\n');
You can do it with 'eval':
my $string = 'hello\tworld\n';
my $decoded_string = eval "\"$string\"";
Note that there are security issues tied to that approach if you don't have 100% control of the input string.
Edit: If you want to ONLY interpolate \x substitutions (and not the general case of 'anything Perl would interpolate in a quoted string') you could do this:
my $string = 'hello\tworld\n';
$string =~ s#([^\\A-Za-z_0-9])#\\$1#gs;
my $decoded_string = eval "\"$string\"";
That does almost the same thing as quotemeta - but exempts '\' characters from being escaped.
Edit2: This still isn't 100% safe because if the last character is a '\' - it will 'leak' past the end of the string though...
Personally, if I wanted to be 100% safe I would make a hash with the subs I specifically wanted and use a regex substitution instead of an eval:
my %sub_strings = (
'\n' => "\n",
'\t' => "\t",
'\r' => "\r",
);
$string =~ s/(\\n|\\t|\\r)/$sub_strings{$1}/gs;
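A self-contained variant of the lookup-table approach is sketched below. The helper name unescape_basic is hypothetical, and the table also handles an escaped backslash, which is an addition to the three escapes listed above.

```perl
use strict;
use warnings;

# Only the escapes listed here are translated; anything else is left untouched.
my %escape_for = (
    'n'  => "\n",
    't'  => "\t",
    'r'  => "\r",
    '\\' => "\\",
);

sub unescape_basic {    # hypothetical helper name
    my ($text) = @_;
    # One left-to-right pass, so an escaped backslash is consumed before
    # the character after it can be mistaken for the start of an escape.
    $text =~ s/\\([ntr\\])/$escape_for{$1}/g;
    return $text;
}

my $decoded = unescape_basic('hello\tworld\n');
print $decoded;
```

Because nothing is passed to eval, variables and code in the input are never interpreted.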

What's the difference between single and double quotes in Perl?

In Perl, what is the difference between ' and " ?
For example, I have 2 variables like below:
$var1 = '\(';
$var2 = "\(";
$res1 = ($matchStr =~ m/$var1/);
$res2 = ($matchStr =~ m/$var2/);
The $res2 statement fails with an "Unmatched ( in regex" error.
Double quotes use variable expansion. Single quotes don't.
In a double-quoted string you need to escape certain characters to stop them being interpreted specially. In a single-quoted string you don't (except for the single quote itself, and a backslash that comes directly before a quote, before another backslash, or at the end of the string).
my $var1 = 'Hello';
my $var2 = "$var1";
my $var3 = '$var1';
print $var2;
print "\n";
print $var3;
print "\n";
This will output
Hello
$var1
Perl Monks has a pretty good explanation of this here
' will not resolve variables and escapes
" will resolve variables, and escape characters.
If you want to store your \ character in the string in $var2, use "\\("
Double quotation marks interpolate; single quotation marks do not.
If you are going to create regex strings you should really be using the qr// quote-like operator:
my $matchStr = "(";
my $var1 = qr/\(/;
my $res1 = ($matchStr =~ m/$var1/);
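It creates a compiled regex, which can be faster than a variable containing a plain string because the pattern does not have to be recompiled each time it is used. It also stringifies when used outside a regex context, so you can say things like
print "$var1\n"; # prints (?^:\() on Perl 5.14 and later, (?-xism:\() on older versions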
Perl takes the single-quoted strings 'as is' and interpolates the double-quoted strings. Interpolate means, that it substitutes variables with variable values, and also understands escaped characters. So, your "\(" is interpreted as '(', and your regexp becomes m/(/, this is why Perl complains.
"" supports variable interpolation and escaping, so inside "\(" the \ escapes the (
Whereas ' ' supports neither, so '\(' is literally \(
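The difference is easy to see by checking the lengths of the two strings. This is a minimal sketch; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $single = '\(';    # two characters: backslash and "("
my $double = "\(";    # one character: the string itself consumed the backslash

print length($single), " ", length($double), "\n";

# As a pattern, '\(' is an escaped literal paren and matches normally.
my $matched = ("f(x)" =~ /$single/) ? 1 : 0;

# A lone "(" is an unclosed group, so compiling it dies at run time.
my $compiles = eval { my $re = qr/$double/; 1 };

print "matched=$matched\n";
print $compiles ? "compiled\n" : "failed to compile\n";
```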

How can I insert text into a string in Perl?

If I had:
$foo= "12."bar bar bar"|three";
how would I insert in the text ".." after the text 12. in the variable?
Perl allows you to choose your own quote delimiters. If you find you need to use a double quote inside of an interpolating string (i.e. "") or single quote inside of a non-interpolating string (i.e. '') you can use a quote operator to specify a different character to act as the delimiter for the string. Delimiters come in two forms: bracketed and unbracketed. Bracketed delimiters have different beginning and ending characters: [], {}, (), and <>. All other characters* are available as unbracketed delimiters.
So your example could be written as
$foo = qq(12."bar bar bar"|three);
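The delimiter choices described above can be sketched as follows. The strings and variable names are made up for the example.

```perl
use strict;
use warnings;

my $name = "three";

# qq behaves like double quotes: it interpolates, and " needs no escaping.
my $interp  = qq{12."bar bar bar"|$name};

# q behaves like single quotes: no interpolation, and ' needs no escaping.
my $literal = q[it's $name];

# Unbracketed delimiter: the same character opens and closes the string.
my $same    = q!12."bar bar bar"|three!;

print "$interp\n$literal\n$same\n";
```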
Inserting text after "12." can be done many ways (TIMTOWTDI). A common solution is to use a substitution to match the text you want to replace.
$foo =~ s/^(12[.])/$1../;
The ^ means match at the start of the string, the () means capture this text to the variable $1, the 12 just matches the string "12", and the [] means match any one of the characters inside the brackets. The brackets are being used because . has special meaning in regexes in general, but not inside a character class (the []). An alternative to the character class is to escape the special meaning of . with \, but many people find that to be uglier than the character class.
$foo =~ s/^(12\.)/$1../;
Another way to insert text into a string is to assign the value to a call to substr. This highlights one of Perl's fairly unique features: some of its built-in functions, substr among them, can act as lvalues. That is, they can be assigned to like variables.
substr($foo, 3, 0) = "..";
If you did not already know where "12." exists in the string you could use index to find where it starts, length to find out how long "12." is, and then use that information with substr.
Here is a fully functional Perl script that contains the code above.
#!/usr/bin/perl
use strict;
use warnings;
my $foo = my $bar = qq(12."bar bar bar"|three);
$foo =~ s/(12[.])/$1../;
my $i = index($bar, "12.") + length "12.";
substr($bar, $i, 0) = "..";
print "foo is $foo\nbar is $bar\n";
* all characters except whitespace characters (space, tab, carriage return, line feed, vertical tab, and formfeed) that is
If you want to use double quotes in a string in Perl you have two main options:
$foo = "12.\"bar bar bar\"|three";
or:
$foo = '12."bar bar bar"|three';
The first option escapes the quotes inside the string with backslash.
The second option uses single quotes. This means the double quotes are treated as part of the string. However, in single quotes everything is literal so $var or @array isn't treated as a variable. For example:
$myvar = 123;
$mystring = '"$myvar"';
print $mystring;
> "$myvar"
But:
$myvar = 123;
$mystring = "\"$myvar\"";
print $mystring;
> "123"
There are also a large number of other Quote-like Operators you could use instead.
$foo = "12.\"bar bar bar\"|three";
$foo =~s/12\./12\.\.\./;
print $foo; # prints 12..."bar bar bar"|three