How to interpret replacement? - perl

$test='abc="def"';
$replacement='$1="ghj"';
$test =~ s/(.+)="(.+)"/$replacement/;
print $test;
It prints:
$1="ghj"
How can I treat $replacement to be interpreted?

You add the /e modifier to your regex. You need to modify your replacement string too, so that it is evaluated correctly. Double evaluation is needed to interpolate the variable.
my $test='abc="def"';
my $replacement='"$1=ghj"';
$test =~ s/(.+)="(.+)"/$replacement/ee;
print $test;
Output:
abc=ghj
It should be noted that this is somewhat unsafe, especially if others can affect the value of your replacement. Then they can execute arbitrary code on your system.

Your replacement "string" is actually code to be evaluated at match time to generate the replacement string. That is, it is better represented as a function:
my $test = 'abc="def"';
my $replacement = sub { $1 . '="ghj"' };
$test =~ s/(.+)="(.+)"/$replacement->()/e;
print $test;
If you don't need the full power of arbitrary Perl expressions (or if your replacement string comes from an external source), you can also treat it as a template to be filled in with the match results. There is a module that encapsulates this in the form of a JavaScript-like replace function, Data::Munge:
use Data::Munge qw(replace);
my $test = 'abc="def"';
my $replacement = '$1="ghj"';
$test = replace $test, qr/(.+)="(.+)"/, $replacement;
print $test;
Finally, you can represent Perl code as a string to be eval'd. This is not only inefficient but also fraught with quoting issues (you have to make sure everything in $replacement is syntactically valid Perl) and security holes (if $replacement is generated at runtime, especially if it comes from an external source). My least favorite approach:
my $test = 'abc="def"';
my $replacement = '$1 . "=\\"ghj\\""';
$test =~ s/(.+)="(.+)"/eval $replacement/e;
print $test;
(The s//eval $foo/e part can also be written as s//$foo/ee. I don't like to do that because eval is evil and shouldn't be more hidden than it already is.)
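If the replacement comes from a configuration file, the template idea above can also be sketched by hand without any string eval. The helper name expand_template below is hypothetical (it is not part of Data::Munge or any other module); it only understands $1 and ${1} style placeholders and treats everything else as literal text:

```perl
use strict;
use warnings;

# Hypothetical helper: fill $1, ${1}, $2, ... in a template from a list of
# captured strings. Unknown or undefined groups expand to the empty string.
sub expand_template {
    my ($template, @captures) = @_;    # copy the captures before matching again
    $template =~ s{\$(?:\{(\d+)\}|(\d+))}{
        my $n = defined($1) ? $1 : $2;
        ($n >= 1 && defined $captures[$n - 1]) ? $captures[$n - 1] : ''
    }ge;
    return $template;
}

my $test        = 'abc="def"';
my $replacement = '$1="ghj"';          # plain data, never evaluated as code

$test =~ s{(.+)="(.+)"}{ expand_template($replacement, $1, $2) }e;
print "$test\n";                       # abc="ghj"
```

Only a single /e is used, and the code it evaluates is fixed in the program, so the contents of $replacement cannot run arbitrary Perl.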

Related

Seeing value of Perl variable created at runtime

My code is below:
use strict;
my $store = 'Media Markt';
my $sentence = "I visited [store]";
# Replace characters "[" and "]"
$sentence =~ s/\[/\$/g;
$sentence =~ s/\]//g;
print $sentence;
I see following at screen:
I visited $store
Is it possible to see following? I want to see value of $store:
I visited Media Markt
You seem to be thinking of using a string, 'store', in order to build a variable name, $store. This gets to the subject of symbolic references, and you do not want to go there.
One way to do what you want is to build a hash that relates such strings to corresponding variables. Then capture the bracketed strings in the sentence and replace them with their hash values:
use warnings;
use strict;
my $store = 'Media Markt';
my $time = 'morning';
my %repl = ( store => $store, time => $time );
my $sentence = "I visited [store] in the [time]";
$sentence =~ s/\[ ([^]]+) \]/$repl{$1}/gex;
print "$sentence\n";
This prints the line I visited Media Markt in the morning
The regex captures anything between [ ], by using the negated character class [^]] (any char other than ]), matched one-or-more times (+). Then it replaces that with its value in the hash, using /e to evaluate the replacement side as an expression. Since brackets are matched as well they end up being removed. The /x allows spaces inside, for readability.
For each string found in brackets there must be a key-value pair in the hash or you'll get a warning. To account for this, we can provide an alternative:
$sentence =~ s{\[ ([^]]+) \]}{$repl{$1}//"[$1]"}gex;
The defined-or operator (//) puts back "[$1]" if $repl{$1} returns undef (no key $1 in the hash, or it has undef value). Thus strings which have no hash pairs are unchanged. I changed the delimiters to s{}{} so that // can be used inside.
This does not allow nesting (like [store [name]]), does not handle multiline strings, and has other limitations. But it should work for reasonable cases.
As I told you on the Perl Programmers Facebook group, this is very similar to one of the answers in the Perl FAQ.
How can I expand variables in text strings?
If you can avoid it, don't, or if you can use a templating system, such as Text::Template or Template Toolkit, do that instead. You might even be able to get the job done with sprintf or printf:
my $string = sprintf 'Say hello to %s and %s', $foo, $bar;
However, for the one-off simple case where I don't want to pull out a full templating system, I'll use a string that has two Perl scalar variables in it. In this example, I want to expand $foo and $bar to their variable's values:
my $foo = 'Fred';
my $bar = 'Barney';
$string = 'Say hello to $foo and $bar';
One way I can do this involves the substitution operator and a double /e flag. The first /e evaluates $1 on the replacement side and turns it into $foo. The second /e starts with $foo and replaces it with its value. $foo, then, turns into 'Fred', and that's finally what's left in the string:
$string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'
The /e will also silently ignore violations of strict, replacing undefined variable names with the empty string. Since I'm using the /e flag (twice even!), I have all of the same security problems I have with eval in its string form. If there's something odd in $foo, perhaps something like @{[ system "rm -rf /" ]}, then I could get myself in trouble.
To get around the security problem, I could also pull the values from a hash instead of evaluating variable names. Using a single /e, I can check the hash to ensure the value exists, and if it doesn't, I can replace the missing value with a marker, in this case ??? to signal that I missed something:
my $string = 'This has $foo and $bar';
my %Replacements = (
foo => 'Fred',
);
# $string =~ s/\$(\w+)/$Replacements{$1}/g;
$string =~ s/\$(\w+)/
exists $Replacements{$1} ? $Replacements{$1} : '???'
/eg;
print $string;
And the actual (but really not recommended - for the reasons explained in the FAQ above) answer to your question is:
$sentence =~ s/\[(\w+)]/'$' . $1/ee;

How do you override substitution operations?

I'm playing around with Perl and creating a string object. I know that this is a very bad idea to do in the real world. I'm doing it purely for fun.
I'm using overload to overload standard Perl string operators with the standard operators you would find in most other languages.
use strict;
use warnings;
use feature qw(say);
my $obj_string1 = Object::String->new("foo");
my $obj_string2 = Object::String->new("bar");
my $reg_string1 = "foobar";
my $reg_string2 = "barfu";
# Object::String "stringifies" correctly inside quotes
say "$obj_string1 $obj_string2";
# Use "+" for concatenations
say $obj_string1 + $obj_string2; # Works
say $obj_string1 + $reg_string1 + $reg_string2; # Works
say $reg_string1 + $obj_string1; # Still works!
say $reg_string1 + $obj_string1 + $reg_string2; # Still works!
say $reg_string1 + $reg_string2 + $obj_string1; # Doesn't work, of course.
# Overload math booleans with their string boolean equivalents
my $forty = Object::String->new(40);
my $one_hundred = "100";
if ( $forty > $one_hundred ) { # Valid
say "$forty is bigger than $one_hundred (in strings!)";
}
if ( $one_hundred < $forty ) { # Also Valid
say "$one_hundred is less than $forty (In strings!)";
}
# Some standard "string" methods
say $forty->length; # Prints 5
say $forty->reverse; # Prints "ytrof"
say $forty; # Prints "ytrof"
Now comes the hard part:
my $string = Object::String->new("I am the best programmer around!");
say $string; # Prints "I am the best programmer around"
say $string->get_value; # Prints "I am the best programmer around" with get_value Method
# But, it's time to speak the truth...
$string =~ s/best programmer/biggest liar/;
say $string; # Prints "I am the biggest liar around"
say $string->get_value; # Whoops, no get_value method on scalar strings
As you can see, when I do my substitution, it works correctly, but returns a regular scalar string instead of an Object::String.
I am trying to figure out how to override the substitution operation. I've looked in the Perldoc, and I've gone through various Perl books (Advanced Perl Programming, Intermediate Perl Programming, Perl Cookbook, etc.), but haven't found a way to override the substitution operation so that it returns an Object::String.
How do I override the substitution operation?
Unfortunately, Perl's overload support isn't very universal in the area of strings. There are many operations that overloading isn't party to, and s/// is one of them.
I have started a module to fix this, overload::substr, but as yet it's incomplete. It allows you to overload the substr() function for your object, but so far it doesn't have the power to apply to m// or s///.
You might, however, be able to use lvalue (or 4-argument) substr() on your objects as a way to cheat this; if the objects at least stringify into regular strings that can be matched upon, the substitution can be done using the substr()-rewrite trick.
Turn
$string =~ s/pattern/replacement/;
into
$string =~ m/pattern/ and substr($string, $-[0], $+[0]-$-[0]) = "replacement";
and then you'll have some code which will respect a substr() overload on the $string object, if you use my module above.
At some point of course it would be nice if overload::substr can perform that itself; I just haven't got around to writing it yet.
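Another workaround, independent of overload::substr, is to give the class an ordinary method that performs the substitution on the underlying string and wraps the result in a new object. The class below is a minimal stand-in for the asker's Object::String (its internals are assumed, since the question does not show them), and the method name replace is hypothetical:

```perl
use strict;
use warnings;

package Object::String;

# Stringify to the wrapped value; fallback lets other operators autogenerate.
use overload '""' => sub { $_[0]{value} }, fallback => 1;

sub new {
    my ($class, $value) = @_;
    return bless { value => $value }, $class;
}

sub get_value { return $_[0]{value} }

# Hypothetical method: substitute on a copy and return a new object,
# so the caller keeps working with an Object::String.
sub replace {
    my ($self, $pattern, $replacement) = @_;
    ( my $copy = $self->{value} ) =~ s/$pattern/$replacement/;
    return ref($self)->new($copy);
}

package main;

my $string = Object::String->new("I am the best programmer around!");
$string = $string->replace( qr/best programmer/, 'biggest liar' );

print $string->get_value, "\n";    # I am the biggest liar around!
print ref($string), "\n";          # Object::String
```

The replacement here is a plain string, so capture variables such as $1 written inside it are not expanded; supporting them would need a code reference or a template step.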

Perl using $1 in a substitution replacement with variable interpolation

I am trying to use variable interpolation in a replacement string including $1, $2,...
However, I can't get it to expand $1 into the replacement. I eventually will have the
$pattern and $replacement variables be read from a configuration file, but even
setting them manually doesn't work.
In the example script, you can see that the $1 (which should be 'DEF') is not
expanded in $new_name, but it is in $new_name2 (without variables).
Adding an 'e' flag to the substitution doesn't help.
How do I fix this?
Matt
EXAMPLE CODE:
#!/usr/local/bin/perl
use strict;
my $old_name = 'ABC_DEF_GHI';
my $pattern = 'ABC_(...)_GHI';
my $replacement = 'CBA_${1}_IHG';
# using variables - doesn't work
my $new_name = $old_name;
$new_name =~ s|$pattern|$replacement|;
printf("%s --> %s\n", $old_name, $new_name);
# not using variables - does work
my $new_name2 = $old_name;
$new_name2 =~ s|ABC_(...)_GHI|CBA_${1}_IHG|;
printf("%s --> %s\n", $old_name, $new_name2);
OUTPUT:
ABC_DEF_GHI --> CBA_${1}_IHG
ABC_DEF_GHI --> CBA_DEF_IHG
You need to make these changes in your code:
my $replacement = '"CBA_$1_IHG"'; #note the single and double quotes
# ...
$new_name =~ s|$pattern|$replacement|ee; #note the double "ee", double evaluation
See this SO answer for more information
/e treats $replacement as Perl code. The Perl code $replacement simply returns the value it contains.
If you want to evaluate the contents of $replacement as Perl code, you need
s/$search/ my $s = eval $replacement; die $@ if $@; $s /e
which can be written as
s/$search/$replacement/ee
Note that since $replacement is expected to contain Perl code, it means that this can be used to execute arbitrary Perl code.
A better solution is to realise you are writing your own subpar templating system, and use an existing one instead. String::Interpolate understands the templating syntax you are currently using:
use String::Interpolate qw( interpolate );
s/$search/interpolate $replacement/e

What do these Perl variables mean?

I'm a little new to Perl coding conventions. Could someone help explain:
Why are there / and /< in front of Perl variables?
What do \= and =~ mean, and what is the difference?
Why does the code require an ending / before the ;, e.g. /start=\'([0-9]+)\'/?
The first three sub-questions were more or less answered by the perldocs, but what does the following line in the code mean?
push(@{$Start{$start}},$features);
I understand that we are pushing $features into a @Start array, but what does @$Start{$start} mean? Is it the same as:
@Start = ($start);
Within the code there is something like this:
use FileHandle;
sub open_infile {
my $file = shift;
my $in = FileHandle->new($file,"<:encoding(UTF-8)")
or die "ERROR: cannot open $file: $!\n" if ($Opt_utf8);
$in = new FileHandle("$file")
or die "ERROR: cannot open $file: $!\n" if (!$Opt_utf8);
return $in;
}
$uamf = shift @ARGV;
$uamin = open_infile($uamf);
while (<$uamin>) {
chomp;
if(/<segment /){
/start=\'([0-9]+)\'/;
/end=\'([0-9]+)\'/;
/features=\'([^\']+)\'/;
$features =~ s/annotation;//;
push(@{$Start{$start}},$features);
push(@{$End{$end}},$features);
}
}
EDITED
So after some intensive reading of the Perl docs, here are some things I've figured out:
The /<segment / is a regex check that checks whether the readline
in while (<$uamin>) contains the following string: <segment.
Similarly the /start=\'([0-9]+)\'/ has nothing to do with
instantiating any variable; it's a regex check to see whether the
readline in while (<$uamin>) contains start=\'([0-9]+)\', in which
\'([0-9]+)\' refers to a numeric string.
In $features =~ s/annotation;// the =~ is used because the string
replacement was testing a regular expression match. See
What does =~ do in Perl?
Where did you see this syntax (or more to the point: have you edited stuff out of what you saw)? /foo/ represents the match operator using regular expressions, not variables. In other words, the first line is checking to see if the input string $_ contains the character sequence <segment.
The subsequent three lines essentially do nothing useful, in the sense that they run regular expression matches and then discard the results (there are side-effects, but subsequent regular expressions discard the side-effects, too).
The last line does a substitution, replacing the first occurrence of the characters annotation; with the empty string in the string $features.
Run the command perldoc perlretut to learn about regex in Perl.
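As for push(@{$Start{$start}},$features): it treats $Start{$start} as a reference to an array, creating that array on first use, and appends $features to it, so %Start becomes a hash of arrays keyed by start position. It is not the same as @Start = ($start). The sketch below shows this, and also stores each capture in a variable so the matches are not discarded. The two input lines are made-up sample data, since the real input file is not shown:

```perl
use strict;
use warnings;

my ( %Start, %End );

# Made-up sample lines standing in for the real input file.
my @lines = (
    "<segment start='0' end='5' features='annotation;noun'>",
    "<segment start='0' end='9' features='annotation;verb'>",
);

for my $line (@lines) {
    next unless $line =~ /<segment /;

    # A match in list context returns the captured groups.
    my ($start)    = $line =~ /start='([0-9]+)'/;
    my ($end)      = $line =~ /end='([0-9]+)'/;
    my ($features) = $line =~ /features='([^']+)'/;
    next unless defined $start && defined $end && defined $features;

    $features =~ s/annotation;//;

    # $Start{$start} springs into existence as an array reference
    # the first time it is dereferenced here.
    push @{ $Start{$start} }, $features;
    push @{ $End{$end} },     $features;
}

print join( ',', @{ $Start{0} } ), "\n";    # noun,verb
```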

What's an easy way to print a multi-line string without variable substitution in Perl?

I have a Perl program that reads in a bunch of data, munges it, and then outputs several different file formats. I'd like to make Perl be one of those formats (in the form of a .pm package) and allow people to use the munged data within their own Perl scripts.
Printing out the data is easy using Data::Dump::pp.
I'd also like to print some helper functions to the resulting package.
What's an easy way to print a multi-line string without variable substitution?
I'd like to be able to do:
print <<EOL;
sub xyz {
my $var = shift;
}
EOL
But then I'd have to escape all of the $'s.
Is there a simple way to do this? Perhaps I can create an actual sub and have some magic pretty-printer print the contents? The printed code doesn't have to match the input or even be legible.
Enclose the name of the delimiter in single quotes and interpolation will not occur.
print <<'EOL';
sub xyz {
my $var = shift;
}
EOL
You could use a templating package like Template::Toolkit or Text::Template.
Or, you could roll your own primitive templating system that looks something like this:
my %vars = qw( foo 1 bar 2 );
Write_Code(\%vars);
sub Write_Code {
my $vars = shift;
my $code = <<'END';
sub baz {
my $foo = <%foo%>;
my $bar = <%bar%>;
return $foo + $bar;
}
END
while ( my ($key, $value) = each %$vars ) {
$code =~ s/<%$key%>/$value/g;
}
return $code;
}
This looks nice and simple, but there are various traps and tricks waiting for you if you DIY. Did you notice that I failed to use quotemeta on my key names in the substitution?
I recommend that you use a time-tested templating library, like the ones I mentioned above.
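To illustrate the quotemeta trap mentioned above, here is a small sketch with made-up keys. The key b.r contains a regex metacharacter; without \Q...\E (the inline form of quotemeta) the pattern <%b.r%> would also match <%bar%>:

```perl
use strict;
use warnings;

# Made-up data: one key contains a regex metacharacter (the dot).
my %vars = ( foo => 1, 'b.r' => 2 );
my $code = 'a <%foo%> b <%b.r%> c <%bar%>';

for my $key ( keys %vars ) {
    my $value = $vars{$key};
    # \Q...\E escapes metacharacters so the key is matched literally.
    $code =~ s/<%\Q$key\E%>/$value/g;
}

print "$code\n";    # a 1 b 2 c <%bar%>
```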
You can actually continue a string literal on the next line, like this:
my $mail = "Hello!
Blah blah.";
Personally, I find that more readable than heredocs (the <<EOL thing mentioned elsewhere).
Double quote " interpolates variables, but you can use '. Note you'll need to escape any ' in your string for this to work.
Perl is actually quite rich in convenient things to make things more readable, e.g. other quote-operations. qq and q correspond to " and ' and you can use whatever delimiter makes sense:
my $greeting = qq/Hello there $name!
Nice to meet you/; # Interpolation
my $url = q|http://perlmonks.org/|; # No need to escape /
Read perldoc perlop (find in page: "Quote and Quote-like Operators") for more information.
Use a data section to store the Perl code:
#!/usr/bin/perl
use strict;
use warnings;
print <DATA>;
#print munged data
__DATA__
package MungedData;
use strict;
use warnings;
sub foo {
print "foo\n";
}
Try writing your code as an actual perl subroutine, then using B::Deparse to get the source code at runtime.
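A minimal sketch of the B::Deparse approach, using the module's coderef2text method, which returns the body of a subroutine as source text. The sub xyz is just an example. The regenerated code is equivalent to the original but not identical in formatting, and it may include pragma lines, so the exact output is not shown here:

```perl
use strict;
use warnings;
use B::Deparse;

# An ordinary sub, written and syntax-checked as real code.
sub xyz {
    my $var = shift;
    return $var * 2;
}

my $deparse = B::Deparse->new;

# coderef2text returns the block "{ ... }" without the "sub name" part.
my $body = $deparse->coderef2text( \&xyz );

print "sub xyz $body\n";
```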