How can I interpolate literal \t and \n in Perl strings? [duplicate]

This question already has answers here:
How can I manually interpolate string escapes in a Perl string?
(2 answers)
Closed 8 years ago.
Say I have an environment variable myvar:
myvar=\tapple\n
When I print this variable with the following command
perl -e 'print "$ENV{myvar}"'
I literally get \tapple\n. However, I want those escape sequences to be turned into the control characters they stand for, not printed verbatim. How would I achieve that?
In the real code, $ENV{myvar} is used inside a substitution, but I hope the answer will cover that too.

Use eval:
perl -e 'print eval qq{"$ENV{myvar}"}'
Update: You can also use a substitution with the /ee modifier, which is safer because only the matched \n and \t sequences are ever evaluated:
perl -e '(my $s = $ENV{myvar}) =~ s/(\\n|\\t)/"qq{$1}"/gee; print $s'

You should probably be using String::Escape.
use String::Escape qw(unbackslash);
my $var = unbackslash($ENV{'myvar'});
unbackslash unescapes any string escape sequences it finds, turning them into the characters they represent. If you want to explicitly only translate \n and \t, you'll probably have to do it yourself with a substitution as in this answer.

There's nothing particularly special about a sequence of characters that includes a \. If you want to substitute one sequence of characters for another, it's very simple to do in Perl:
my %sequences = (
'\\t' => "\t",
'\\n' => "\n",
'foo' => 'bar',
);
my $string = '\\tstring fool string\\tfoo\\n';
print "Before: [$string]\n";
$string =~ s/\Q$_/$sequences{$_}/g for ( keys %sequences );
print "After: [$string]\n";
The only trick with \ is to keep track of the times when Perl thinks it's an escape character.
Before: [\tstring fool string\tfoo\n]
After: [ string barl string bar
]
However, as darch notes, you might just be able to use String::Escape.
Note that you have to be extremely careful when you're taking values from environment variables. I'd be reluctant to use String::Escape since it might process quite a bit more than you are willing to translate. The safe way is to only expand the particular values you explicitly want to allow. See my "Secure Programming Techniques" chapter in Mastering Perl where I talk about this, along with the taint checking you might want to use in this case.
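The whitelist approach described above can also be written as a single-pass substitution, so that the output of one replacement is never scanned again by the next. This is only a minimal sketch: the helper name expand_whitelisted and the %escape_for table are made up for illustration, and the fallback string stands in for an unset myvar.

```perl
use strict;
use warnings;

# Hypothetical whitelist: only these escape letters are ever expanded.
my %escape_for = ( t => "\t", n => "\n" );

# Replace backslash-t and backslash-n with the real control characters;
# every other backslash sequence is left exactly as it was.
sub expand_whitelisted {
    my ($text) = @_;
    $text =~ s/\\([tn])/$escape_for{$1}/g;
    return $text;
}

# Falls back to a sample value when myvar is not set in the environment.
print expand_whitelisted( $ENV{myvar} // '\tapple\n' );
```

Because the character class and the lookup table are both explicit, adding another escape means touching both, which keeps the set of accepted sequences easy to audit.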

Related

In Perl, can you use a variable for the whole of a match string?

I'm new to Perl, though not to programming, and am working through Learning Perl. The book has exercises to match successive lines of a small text file.
I had the idea of supplying match strings from STDIN, and going through the file for each one:
while(<STDIN>) {
chomp;
$regex = $_;
seek JUNK, 0, 0;
while(<JUNK>) {
chomp();
if(/$regex/) {
say;
}
}
say '';
}
This works fine, but I can't find a way to interpolate an entire match string, e.g.
/fred/i
into the predicate. I tried
if($$matcher) # with $matcher = '/fred/'
but Perl complained.
I imagine this is my ignorance, and should welcome enlightenment.
Regex modifiers (flags), such as /i, are a part of the code telling Perl how to perform the match, not a part of the pattern to be matched. This is why that doesn't work for you.
You have three ways to work around this (well, probably more, since this is Perl we're talking about, but three ways that I can think of straight off):
1) Use extended regex syntax and, when you want a case-insensitive match, enter (?i:fred), as suggested in comments on the question.
2) Use string eval to allow the use of the regular modifiers: if (eval "\$_ =~ $regex") { say } (the \$_ is escaped so that the evaluated code refers to $_ itself instead of having the line's text pasted in as code). Note that this method will require you to also type the surrounding slashes, e.g., you'd have to enter /fred/i; just typing in fred would not work. Note also that it's a huge security hole to do this without validating your input first, since the user's entered text is executed as Perl code, just as if it were part of the original program. (Imagine if the user entered //, system("rm -rf /") - it would test against an empty regex, then delete all the files on your computer.) So this is probably not a recommended approach unless you really know what you're doing and/or you're the only one who will ever run the program.
3) The most complex, but also most correct, solution is to write a parser which inspects the user's entered string to see whether any special flags are present and then responds accordingly. A very simple example which allows the user to append /i for a case-insensitive search:
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;
while(<STDIN>) {
chomp;
my @parts = split '/', $_;
# If the user input starts with a /, the first part will be empty, so throw
# it away.
shift @parts unless $parts[0];
my $re = shift @parts;
my %flags;
for (@parts) {
for (split '') {
$flags{i} = 1 if $_ eq 'i';
}
}
my $f = join '', keys %flags;
say "Matched" if eval qq('foo' =~ /$re/$f);
}
This also uses string eval, so it is potentially vulnerable to the same kind of security issues as #2. $re cannot contain any / characters (the split '/' would have ended $re immediately prior to the first /), so the user cannot close the regex early, and $f can contain only the letter i (or any other flags you might choose to recognize if you expand on this). That narrows the hole but does not close it: the pattern is still compiled as a regex literal inside the eval, so input such as @{[ ... ]} is interpolated and (?{ ... }) code blocks are executed. Treat it as safe only for trusted input. (If anyone can demonstrate further exploits, please tell me about them in comments!)
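One way to sidestep string eval entirely is to compile the user's pattern with qr// and pass the flags through the (?flags) syntax. What follows is a rough sketch, assuming input shaped like /pattern/flags or a bare pattern. compile_user_regex is a hypothetical helper name, and the accepted flag set (i, m, s, x) is an arbitrary choice for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "/fred/i" or "fred" into a compiled regex
# without ever running the input through string eval.
sub compile_user_regex {
    my ($input) = @_;
    my ( $pattern, $raw_flags ) = ( $input, '' );
    if ( $input =~ m{\A/(.*)/([a-z]*)\z}s ) {
        ( $pattern, $raw_flags ) = ( $1, $2 );
    }

    # Keep only whitelisted flags, each at most once.
    my %seen;
    my $flags = join '', grep { /\A[imsx]\z/ && !$seen{$_}++ } split //, $raw_flags;

    my $source = length $flags ? "(?$flags)$pattern" : $pattern;

    # Block eval only traps a pattern that fails to compile; it does not
    # execute the user's text as code.
    my $re = eval { qr/$source/ };
    return $re;    # undef when the pattern is invalid
}

for my $input ( '/fred/i', 'fred' ) {
    my $re = compile_user_regex($input);
    print "$input: ", ( 'Fred said hello' =~ $re ? 'match' : 'no match' ), "\n";
}
```

Since the pattern reaches the regex engine through a variable at run time, Perl refuses (?{ ... }) code blocks unless use re 'eval' is in effect, and the text is not re-interpolated as Perl source.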
Problem
What you are trying to do can be summarized by:
my $regex = '/fred/i';
my @lines = (
'A line containing some words and Fred said Hello.',
'Another line. Here is a regex embedded in the line: /fred/i',
);
for ( @lines ) {
say if /$regex/;
}
Output:
Another line. Here is a regex embedded in the line: /fred/i
We see that the second line matches $regex, whereas we wanted the first line containing Fred to match the string fred with the (case insensitive) i flag added to the regex. The problem is that the characters / and i in $regex are taken as characters to be matched literally, i.e., they are not interpreted as special characters surrounding a Regex (as part of a Perl expression).
Note:
The character / is special as part of a Perl expression for a regular expression, but it is not special inside the Regex pattern. There are however characters that are special inside the pattern, the so-called meta characters:
\ | ( ) [ { ^ $ * + ? .
see perldoc quotemeta for more information.
A solution using extended patterns
Simply change the first line to:
my $regex = '(?i)fred'; # or alternatively: (?i:fred)
Regex flags can be added to a regex pattern using "Extended patterns" described in the manual perldoc perlre :
Extended Patterns
The syntax for most of these is a pair of parentheses with a question
mark as the first thing within the parentheses. The character after
the question mark indicates the extension.
[...]
(?adlupimnsx-imnsx)
(?^alupimnsx)
One or more embedded pattern-match modifiers, to be turned on (or
turned off if preceded by "-" ) for the remainder of the pattern or
the remainder of the enclosing pattern group (if any). This is
particularly useful for dynamically-generated patterns, such as those
read in from a configuration file, taken from an argument, or
specified in a table somewhere.
[...]
These modifiers are restored at the end of the enclosing group.
Alternatively the non-capturing form can be used:
(?:pattern)
(?adluimnsx-imnsx:pattern)
(?^aluimnsx:pattern)
This is for clustering, not capturing; it groups subexpressions like
"()" , but doesn't make backreferences as "()" does.
The question has been answered in the following comment:
Try (?i:fred), see Extended patterns in perldoc perlre for more information
– Håkon Hægland 7 hours ago.
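To make the effect of the embedded modifier concrete, here is a small sketch that keeps the pattern in a plain string, as it would be after reading it from STDIN. The sample lines and variable names are invented for the example.

```perl
use strict;
use warnings;

# Sample data, made up for this illustration.
my @lines = (
    'A line containing some words and Fred said Hello.',
    'Another line that only mentions Barney.',
);

# The pattern arrives as ordinary text; (?i:...) carries the flag with it.
my $regex = '(?i:fred)';

my @hits = grep { /$regex/ } @lines;
print "$_\n" for @hits;
```

Only the first line is printed, because fred matches Fred once the case-insensitive group is in effect.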

Perl generate a file based on a template

I am working on a use case that requires me to generate .hpp files based on a template. So something like
#ifndef changethis_hpp
#define changethis_hpp
#include<fixedheader1>
...
#include<fixedheaderN>
class changethis
{
....
};
#endif
needs to be generated based on the requirement of changethis string.
How can I achieve this in perl?
What I have tried so far:
I wrote a fixed template.txt file, replaced the placeholder text in it with the changethis string, and then wrote the result out as changethis.hpp.
But is there any other way I can achieve this in Perl?
There's a Perl FAQ, How can I expand variables in text strings?. It starts like this:
If you can avoid it, don't, or if you can use a templating system,
such as Text::Template or Template Toolkit, do that instead.
You might even be able to get the job done with sprintf or printf:
my $string = sprintf 'Say hello to %s and %s', $foo, $bar;
However, for the one-off simple case where I don't want to pull out a
full templating system, I'll use a string that has two Perl scalar
variables in it. In this example, I want to expand $foo and $bar to
their variable's values:
my $foo = 'Fred';
my $bar = 'Barney';
$string = 'Say hello to $foo and $bar';
One way I can do this involves the substitution operator and a double /e
flag. The first /e evaluates $1 on the replacement side and turns it
into $foo. The second /e starts with $foo and replaces it with its
value. $foo, then, turns into 'Fred', and that's finally what's left in
the string:
$string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'
The /e will also silently ignore violations of strict, replacing
undefined variable names with the empty string. Since I'm using the /e
flag (twice even!), I have all of the same security problems I have with
eval in its string form. If there's something odd in $foo , perhaps
something like @{[ system "rm -rf /" ]}, then I could get myself in
trouble.
I'd highly recommend you ignore most of this advice and go directly to a templating system (as recommended in the first line).
I use Text::Template for such tasks.
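For the header-generation case from the question, a middle ground between /ee and a full templating module is to look placeholder names up in a hash, so nothing from the template is ever evaluated as code. This is a sketch under assumptions: the {{name}} placeholder syntax, the fill_template helper and the widget class name are all invented here and are not part of Text::Template or Template Toolkit.

```perl
use strict;
use warnings;

# Hypothetical helper: fill {{name}} placeholders from a hash. Unknown
# names are left in place instead of being evaluated or blanked out.
sub fill_template {
    my ( $template, %vars ) = @_;
    $template =~ s/\{\{(\w+)\}\}/exists $vars{$1} ? $vars{$1} : '{{' . $1 . '}}'/ge;
    return $template;
}

# Non-interpolating heredoc, so the template text is taken literally.
my $template = <<'END_TEMPLATE';
#ifndef {{name}}_hpp
#define {{name}}_hpp
class {{name}}
{
};
#endif
END_TEMPLATE

print fill_template( $template, name => 'widget' );
```

The single /e here only evaluates the fixed lookup expression written in the program; the template contents are treated as data.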

Using quote-like-operators or quotes in the perl's printf

Reading perl sources I saw many times the next construction:
printf qq[%s\n], getsomestring( $_ );
But usually it is written as
printf "%s\n", getsomestring( $_ );
The question:
Is there any "good practice" here, i.e. a correct way? If yes,
when is it recommended to use the longer qq[...] instead of "..."?
Or is it purely TIMTOWTDI?
The perlop doesn't mention anything about this.
You can use qq() as an alternative double quote method, such as when you have double quotes in the string. For example:
"\"foo bar\""
Looks better when written
qq("foo bar")
In the Windows cmd shell, where the one-liner itself has to be wrapped in double quotes, I often use qq() when I need interpolation. For example:
perl -lwe "print qq($foo\n)"
The qq() operator -- like many other perl operators such as s///, qx() -- is also handy as you demonstrate because it can use just about any character as its delimiter:
qq[this works]
qq|as does this|
qq#or this#
This is handy for when you have many different delimiters in the string. For example:
qq!This is (not) "hard" to quote!
As for best practice, I would say use whatever is more readable.
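A quick way to convince yourself that the two spellings are interchangeable is to build the same string both ways and compare. The variable names below are just for the demonstration.

```perl
use strict;
use warnings;

my $name = 'world';

# Same string twice: once with escaped double quotes, once with qq[].
my $with_quotes = "He said \"hello, $name\"\n";
my $with_qq     = qq[He said "hello, $name"\n];

print $with_quotes eq $with_qq ? "identical\n" : "different\n";
```

Both forms interpolate $name and process \n in the same way; only the delimiter that needs escaping changes.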
I always use qq[...] when there are quotes in the string, for example:
qq["here you are", he said]
If not, I find "" more readable.

Preserving backslashes in Perl strings

Is there a way in Perl to preserve and print all backslashes in a string variable?
For example:
$str = 'a\\b';
The output is
a\b
but I need
a\\b
The problem is that I can't process the string in any way to escape the backslashes, because I have to read complex regular expressions from a database, don't know in which combination and number the backslashes appear, and have to print them on a web page exactly as they are.
I tried with template toolkit and html and html_entity filters. The only way it works so far is to use a single quoted here document:
print <<'XYZ';
a\\b
XYZ
But then I can't interpolate variables which makes this solution useless.
I tried to write a string to a web page, into file and on the shell, but no luck, always one backslash disappears. Maybe I am totally on the wrong track, but what is the correct way to print complex regular expressions including backslashes in all combinations and numbers without any changes?
In other words:
I have a database containing hundreds of regular expressions as string data. I want to read them with Perl and print them on a web page exactly as they are in the database.
There are all the time changes to these regular expressions by many administrators so I don't know in advance how and what to escape.
A typical example would look like this:
'C:\\test\\file \S+'
but it could change the next day to
'\S+ C:\\test\\file'
Maybe a correct conclusion would be to escape every backslash exactly once, no matter in which combination or how many of them appear? That would mean doubling them up works, and then the problem isn't as big as I feared. I tested it in bash and it works with two and even three backslashes in a row (4 backslashes print 2, and 6 backslashes print 3).
The backslash only has significance to Perl when it occurs in Perl source code, e.g.: your assignment of a literal string to a variable:
my $str = 'a\\b';
However, if you read data from a file (or a database or socket etc) any backslashes in the data you read will be preserved without you needing to take any special steps.
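The point about data read at run time can be checked without a database by reading from an in-memory filehandle. In this sketch the non-interpolating heredoc merely stands in for a stored row, and the variable names are invented.

```perl
use strict;
use warnings;

# Stand-in for a database row: a single-quoted heredoc keeps every
# backslash exactly as typed.
my $stored = <<'END_ROW';
C:\\test\\file \S+
END_ROW

# Read the "row" back through a filehandle, as data rather than source code.
open my $fh, '<', \$stored or die "Cannot open in-memory handle: $!";
chomp( my $regex = <$fh> );
close $fh;

print "$regex\n";    # the line comes out with all five backslashes intact
```

No escaping or unescaping step is involved: backslashes only need doubling when they are typed into a quoted literal in Perl source.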
my $str = 'a\\b';
print $str;
This prints a\b, because in Perl source code \\ inside single quotes denotes one backslash.
Use
my $str = 'a\\\\b';
instead
It's a PITA, but you will just have to double up the backslashes, e.g.
a\\\\b
Otherwise, you could store the backslash in another variable, and interpolate that.
The minimum to get two backslashes is (unfortunately) three backslashes:
use 5.016;
my $a = 'a\\\b';
say $a;
The problem I tried to solve does not exist. I confused initializing a string directly in the code with using HTML forms. Keeping all backslashes of a string written inside the code is only possible either with a here document or by reading a text file containing the string. But if I just use the HTML form on a web page to insert a string and use escapeHTML() from the CGI module, it takes care of everything and you can insert the weirdest combinations of special characters. They all get displayed and preserved exactly as inserted. So I should have started directly with HTML and database operations instead of trying to examine things first by using strings directly in the code. Anyway, thanks for your help.
You can use the following regular expression to form your string correctly:
my $str = 'a\\b';
$str =~ s/\\/\\\\/g;
print "$str\n";
This prints a\\b.
EDIT:
You can use non-interpolating here-document instead:
my $str = <<'EOF';
a\\b
EOF
print "$str\n";
This still prints a\\b.
Grant's answer provided the hint I needed. Some of the other answers did not match Perl's operation on my system so ...
#!/usr/bin/perl
use warnings;
use strict;
my $var = 'content';
print "\'\"\N{U+0050}\\\\\\$var\n";
print <<END;
\'\"\N{U+0050}\\\\\\$var\n
END
print '\'\"\N{U+0050}\\\\\\$var\n'.$/;
my $str = '\'\"\N{U+0050}\\\\\\$var\n';
print $str.$/;
print @ARGV;
print $/;
Called from bash ... using the bash means of escaping in quotes which changes \' to '\''.
jamie@debian:~$ ./ft.pl '\'\''\"\N{U+0050}\\\\\\$var\n'
'"P\\\content
'"P\\\content
'\"\N{U+0050}\\\$var\n
'\"\N{U+0050}\\\$var\n
\'\"\N{U+0050}\\\\\\$var\n
The final line, with six backslashes in the middle, was what I had expected. Reality differed.
So:
"in here \" is interpolated
in HEREDOC \ is interpolated
'in single quotes only \' is interpolated and only for \ and ' (are there more?)
my $str = 'same limited \ interpolation';
perl.pl 'escape using bash rules' with @ARGV is not interpolated

Simple search and replace without regex

I've got a file with various wildcards in it that I want to be able to substitute from a (Bash) shell script. I've got the following which works great until one of the variables contains characters that are special to regexes:
VERSION="1.0"
perl -i -pe "s/VERSION/${VERSION}/g" txtfile.txt # No problems here
APP_NAME="../../path/to/myapp"
perl -i -pe "s/APP_NAME/${APP_NAME}/g" txtfile.txt # Error!
So instead I want something that just performs a literal text replacement rather than a regex. Are there any simple one-line invocations with Perl or another tool that will do this?
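The 'proper' way to do this is to keep the contents of the shell variable from being parsed as Perl code or regex syntax at all. \Q only helps on the pattern side, where it escapes regex metacharacters in the search text; on the replacement side it would put backslashes into the output. It also cannot stop the / characters that the shell expands into the program text from ending the s/// early. Export the variable and let Perl read it from %ENV instead:
export APP_NAME
and then, with the program in single quotes so that the shell leaves it alone,
perl -i -pe 's/APP_NAME/$ENV{APP_NAME}/g' txtfile.txt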
But I suggest that it would be far easier to write the entire script in Perl
Use the following:
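perl -i -pe "s|APP_NAME|${APP_NAME}|g" txtfile.txt
A vertical bar is legal in a Unix path but rare, so with | as the delimiter the slashes no longer end the substitution early. Do not add \Q on the replacement side: it would put a backslash in front of every dot and slash in the output.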
I don't particularly like this answer because there should be a better way to do a literal replace in Perl. \Q is cryptic. Using quotemeta adds extra lines of code.
But... You can use substr to replace a portion of a string.
#!/usr/bin/perl
my $name = "Jess.*";
my $sentence = "Hi, my name is Jess.*, dude.\n";
my $new_name = "Prince//";
my $name_idx = index $sentence, $name;
if ($name_idx >= 0) {
substr($sentence, $name_idx, length($name), $new_name);
}
print $sentence;
Output:
Hi, my name is Prince//, dude.
You don't have to use a regular expression for this (using substr(), index(), and length()):
perl -pe '
foreach $var ("VERSION", "APP_NAME") {
while (($i = index($_, $var)) != -1) {
substr($_, $i, length($var)) = $ENV{$var};
}
}
'
Make sure you export your variables.
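Two caveats are worth spelling out in code. The index()/substr() loop above always searches from the start of the line, so it never terminates if a replacement value contains its own placeholder, and \Q is only meaningful on the pattern side of s///. The sketch below shows one version of each approach; both helper names are hypothetical.

```perl
use strict;
use warnings;

# Literal replacement with index()/substr(), resuming the search after
# each inserted replacement so the loop always terminates.
sub replace_all_literal {
    my ( $text, $needle, $replacement ) = @_;
    return $text if $needle eq '';
    my $pos = 0;
    while ( ( my $i = index( $text, $needle, $pos ) ) != -1 ) {
        substr( $text, $i, length $needle ) = $replacement;
        $pos = $i + length $replacement;
    }
    return $text;
}

# The regex equivalent: \Q...\E quotes the search text, while the
# replacement variable is inserted as-is and not parsed again.
sub replace_all_quoted {
    my ( $text, $needle, $replacement ) = @_;
    $text =~ s/\Q$needle\E/$replacement/g;
    return $text;
}

print replace_all_literal( "run APP_NAME now\n", 'APP_NAME', '../../path/to/myapp' );
print replace_all_quoted( "run APP_NAME now\n", 'APP_NAME', '../../path/to/myapp' );
```

Both calls print the same line; which one reads better is largely a matter of taste.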
You can use a regex but escape any special characters.
Something like this may work.
APP_NAME="../../path/to/myapp"
APP_NAME=$(echo "$APP_NAME" | sed -e 's:/:\\/:g')
perl -i -pe "s/APP_NAME/${APP_NAME}/g" txtfile.txt
Use:
perl -i -pe "\$r = qq/\Q${APP_NAME}\E/; s/APP_NAME/\$r/go"
Rationale: Escape sequences
I managed to get a working solution, partly based on bits and pieces from other people's answers:
app_name='../../path/to/myapp'
perl -pe "\$r = q/${app_name//\//\\/}/; s/APP_NAME/\$r/g" <<<'APP_NAME'
This creates a Perl variable, $r, from the result of the shell parameter expansion:
${app_name//\//\\/}
${ # Open parameter expansion
app_name # Variable name
// # Start global substitution
\/ # Match / (backslash-escaped to avoid being interpreted as delimiter)
/ # Delimiter
\\/ # Replace with \/ (literal backslash needs to be escaped)
} # Close parameter expansion
All that work is needed to prevent forward slashes inside the variable from being treated as Perl syntax, which would otherwise close the q// quotes around the string.
In the replacement part, use the variable $r (the $ is escaped, to prevent it from being treated as a shell variable within double quotes).
Testing it out:
$ app_name='../../path/to/myapp'
$ perl -pe "\$r = q/${app_name//\//\\/}/; s/APP_NAME/\$r/g" <<<'APP_NAME'
../../path/to/myapp