In Perl, how do I write the regex-matched string ($1) in a replacement?

I want to keep $1 in a separate variable and use it as the replacement:
my $converting_rules = +{
'(.+?)' => '$1',
};
my $pre = $converting_rule_key;
my $post = $converting_rules->{$converting_rule_key};
#$path_file =~ s/$pre/$post/; # Bad...
$path_file =~ s/$pre/$1/; # Good!
In the "Bad" case, $1 is treated as the literal string '$1'.
But I want it to be treated as the matched string.
I have no idea what to do... please help me!

The trouble is that s/$pre/$post/ interpolates the variables $pre and $post, but will not recursively interpolate anything in them that happens to look like a variable. So you want to add an extra eval to the replacement, with the /ee flag:
$path_file =~ s/$pre/$post/ee;
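A minimal sketch of the /ee approach, using made-up values for $pre, $post and $path_file (none of them come from the question). Note that with /ee the replacement is evaluated as Perl source, so here it is written as a double-quoted string literal, and it should only ever come from a trusted place.

```perl
use strict;
use warnings;

# Hypothetical rule: the pattern and the replacement are both plain strings.
my $pre  = '(\w+)\.txt';
my $post = '"$1.bak"';          # Perl source for a double-quoted string

my $path_file = 'dir/file.txt';

# First /e turns $post into the text "$1.bak" (quotes included);
# second /e evaluates that text as code, so $1 is interpolated.
$path_file =~ s/$pre/$post/ee;

print "$path_file\n";           # dir/file.bak
```

Because the second evaluation runs arbitrary code, this is only reasonable when the rules are written by you and not by a user.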

$x = '$1.00';
print qq/$x/;
prints $1.00, so it's no surprise that
$x = '$1.00';
s/(abc)/$x/;
substitutes with $1.00.
What you have there is a template, yet you did nothing to process this template. String::Interpolate can handle such templates.
use String::Interpolate qw( interpolate );
$rep = '$1';
s/$pat/ interpolate($rep) /e;

Related

How can I convert a string number to a number in Perl? [duplicate]

Is there any way to replace multiple strings in a string?
For example, I have the string hello world what a lovely day and I want to replace what and lovely with something else..
$sentence = "hello world what a lovely day";
@list = ("what", "lovely"); # strings to replace
@replist = ("its", "bad"); # strings to replace with
($val = $sentence) =~ "tr/@list/@replist/d";
print "$val\n"; # should print "hello world its a bad day"..
Any ideas why it's not working?
Thanks.
First of all, tr doesn't work that way; consult perldoc perlop for details, but tr does transliteration, and is very different from substitution.
For this purpose, a more correct way to replace would be
# $val
$val =~ s/what/its/g;
$val =~ s/lovely/bad/g;
Note that "simultaneous" change is rather more difficult, but we could do it, for example,
%replacements = ("what" => "its", "lovely" => "bad");
($val = $sentence) =~ s/(@{[join "|", keys %replacements]})/$replacements{$1}/g;
(Escaping may be necessary to replace strings with metacharacters, of course.)
This is still only simultaneous in a very loose sense of the term, but it does, for most purposes, act as if the substitutions are done in one pass.
Also, it is more correct to replace "what" with "it's", rather than "its".
Well, mainly it's not working as tr///d has nothing to do with your request (tr/abc/12/d replaces a with 1, b with 2, and removes c). Also, by default arrays don't interpolate into regular expressions in a way that's useful for your task. Also, without something like a hash lookup or a subroutine call or other logic, you can't make decisions in the right-hand side of a s/// operation.
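To make the tr///d behaviour described above concrete, here is a small sketch. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $text = 'aabbcc';

# tr works character by character: a -> 1, b -> 2, and because of /d
# the unmatched c (no counterpart in the replacement list) is deleted.
(my $translated = $text) =~ tr/abc/12/d;

print "$translated\n";   # 1122
```

It never looks at whole words, which is why it cannot replace "what" with "its".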
To answer the question in the title, you can perform multiple replaces simultaneously--er, in convenient succession--in this manner:
#! /usr/bin/env perl
use common::sense;
my $sentence = "hello world what a lovely day";
for ($sentence) {
s/what/it's/;
s/lovely/bad/
}
say $sentence;
To do something more like what you attempt here:
#! /usr/bin/env perl
use common::sense;
my $sentence = "hello world what a lovely day";
my %replace = (
what => "it's",
lovely => 'bad'
);
$sentence =~ s/(@{[join '|', map { quotemeta($_) } keys %replace]})/$replace{$1}/g;
say $sentence;
If you'll be doing a lot of such replacements, 'compile' the regex first:
my $matchkey = qr/@{[join '|', map { quotemeta($_) } keys %replace]}/;
...
$sentence =~ s/($matchkey)/$replace{$1}/g;
EDIT:
And to expand on my remark about array interpolation, you can change $":
local $" = '|';
$sentence =~ s/(@{[keys %replace]})/$replace{$1}/g;
# --> $sentence =~ s/(what|lovely)/$replace{$1}/g;
Which doesn't improve things here, really, although it may if you already had the keys in an array:
local $" = '|';
$sentence =~ s/(@keys)/$replace{$1}/g;
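Putting the pieces together, here is a sketch using only core Perl. The extra key "love" and the longest-first sort are my additions, not part of the answers above: an alternation takes the first alternative that matches, so when one key is a prefix of another, sorting longer keys first keeps the shorter one from winning.

```perl
use strict;
use warnings;

my %replace = (
    what   => "it's",
    lovely => 'bad',
    love   => 'like',    # hypothetical extra key, a prefix of "lovely"
);

# Longest keys first, each escaped so metacharacters are matched literally.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %replace;
my $matchkey = qr/($alternation)/;

(my $val = "hello world what a lovely day") =~ s/$matchkey/$replace{$1}/g;

print "$val\n";   # hello world it's a bad day
```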

Perl split() Function Not Handling Pipe Character Saved As A Variable

I'm running into a little trouble with Perl's built-in split function. I'm creating a script that edits the first line of a CSV file which uses a pipe for column delimitation. Below is the first line:
KEY|H1|H2|H3
However, when I run the script, here is the output I receive:
Col1|Col2|Col3|Col4|Col5|Col6|Col7|Col8|Col9|Col10|Col11|Col12|Col13|
I have a feeling that Perl doesn't like the fact that I use a variable to actually do the split, and in this case, the variable is a pipe. When I replace the variable with an actual pipe, it works perfectly as intended. How could I go about splitting the line properly when using pipe delimitation, even when passing in a variable? Also, as a silly caveat, I don't have permissions to install an external module from CPAN, so I have to stick with built-in functions and modules.
For context, here is the necessary part of my script:
our $opt_h;
our $opt_f;
our $opt_d;
# Get user input - filename and delimiter
getopts("f:d:h");
if (defined($opt_h)) {
&print_help;
exit 0;
}
if (!defined($opt_f)) {
$opt_f = &promptUser("Enter the Source file, for example /qa/data/testdata/prod.csv");
}
if (!defined($opt_d)) {
$opt_d = "\|";
}
my $delimiter = "\|";
my $temp_file = $opt_f;
my @temp_file = split(/\./, $temp_file);
$temp_file = $temp_file[0]."_add-headers.".$temp_file[1];
open(source_file, "<", $opt_f) or die "Err opening $opt_f: $!";
open(temp_file, ">", $temp_file) or die "Error opening $temp_file: $!";
my $source_header = <source_file>;
my @source_header_columns = split(/${delimiter}/, $source_header);
chomp(@source_header_columns);
for (my $i=1; $i<=scalar(@source_header_columns); $i++) {
print temp_file "Col$i";
print temp_file "$delimiter";
}
print temp_file "\n";
while (my $line = <source_file>) {
print temp_file "$line";
}
close(source_file);
close(temp_file);
The first argument to split is a compiled regular expression or a regular expression pattern. If you want to split on the literal text |, you'll need to pass a pattern that matches |.
quotemeta creates a pattern from a string that matches that string.
my $delimiter = '|';
my $delimiter_pat = quotemeta($delimiter);
split $delimiter_pat
Alternatively, quotemeta can be accessed as \Q..\E inside double-quoted strings and the like.
my $delimiter = '|';
split /\Q$delimiter\E/
The \E can even be omitted if it's at the end.
my $delimiter = '|';
split /\Q$delimiter/
I mentioned that split also accepts a compiled regular expression.
my $delimiter = '|';
my $delimiter_re = qr/\Q$delimiter/;
split $delimiter_re
If you don't mind hardcoding the regular expression, that's the same as
my $delimiter_re = qr/\|/;
split $delimiter_re
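As a runnable sketch of the difference, using the header line from the question (the variable names @by_char, @columns and $header are mine): unescaped, the pattern is /|/, an alternation of two empty patterns, so it matches between every character and split returns 13 one-character pieces. That is where the 13 columns in the question come from.

```perl
use strict;
use warnings;

my $delimiter     = '|';   # "\|" in double quotes yields this same string
my $source_header = "KEY|H1|H2|H3\n";

# Unescaped: /|/ matches the empty string, so every character is split off.
my @by_char = split /$delimiter/, $source_header;

# Escaped with \Q...\E: the pattern matches a literal pipe.
chomp(my $header = $source_header);
my @columns = split /\Q$delimiter\E/, $header;

print scalar(@by_char), " pieces without escaping\n";                 # 13
print join($delimiter, map { "Col$_" } 1 .. scalar @columns), "\n";   # Col1|Col2|Col3|Col4
```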
First, the | isn't special inside double quotes, so "\|" is just the one-character string "|". Setting $delimiter to just "|" and making sure it is quoted later would work, or setting $delimiter to "\\|" might be OK by itself.
Second, the | is special inside a regex, so you want to quote it there. The safest way to do that is to ask Perl to quote your data for you. Use the \Q...\E construct within the regex to mark the data you want quoted.
my @source_header_columns = split(/\Q${delimiter}\E/, $source_header);
see: http://perldoc.perl.org/perlre.html
It seems as if all you want to do is count the fields in the header and print the header. Might I suggest something a bit simpler than using split?
my $str="KEY|H1|H2|H3";
my $count=0;
$str =~ s/\w+/"Col" . ++$count/eg;
print "$str\n";
Works with almost any delimiter (except alphanumeric characters and underscore). It also saves the number of fields in $count, in case you need it later.
Here's another version. This one uses a negated character class instead, to specify "any character but this", which is just another way of defining a delimiter. You can specify the delimiter on the command line. You can use your getopts as well, but I just used a simple shift.
my $d = shift || '[^|]';
if ( $d !~ /^\[/ ) {
$d = '[^' . $d . ']';
}
my $str="KEY|H1|H2|H3";
my $count=0;
$str =~ s/$d+/"Col" . ++$count/eg;
print "$str\n";
By using the brackets, you do not need to worry about escaping most metacharacters (although ], \, ^ and - are still special inside a character class).
#!/usr/bin/perl
use Data::Dumper;
use strict;
my $delimeter="\\|";
my $string="A|B|C|DD|E";
my @arr=split(/$delimeter/,$string);
print Dumper(@arr)."\n";
output:
$VAR1 = 'A';
$VAR2 = 'B';
$VAR3 = 'C';
$VAR4 = 'DD';
$VAR5 = 'E';
It seems you need to define the delimiter as \\|

How to write a simple sub in perl that takes a string and returns a string?

sub getHeading
{
my $var = $_[0];
my $match;
if ($match = ($var =~ m/$fivetonine/))
{
return "=";
}
if ($match = ($var =~ m/$tentofourteen/))
{
return "==";
}
if ($match = ($var =~ m/$fifteentonineteen/)){
return "===";
}
return "===";
}
my $ref_to_getHeading = \getHeading;
and I am calling it via:
$html =~ s/(.*)<font size="([^"]+)">(.+)<\/font>(.*)/$ref_to_getHeading($2)$1$3$4$ref_to_getHeading($2)/m;
I want to pass a string to this function, check whether it matches one of 3 different patterns, and return the appropriate number of = signs. I am doing this wrong, but I can't figure out how to make it take parameters. I get a run-time error saying $var is uninitialised. I tried using @_, but I don't really understand what the difference is.
Any help is much appreciated. I have never written Perl before and this is my first real program.
Double mistake there.
First, you aren't taking a reference to a function - You need to add the ampersand.
But even if you do that, it won't work. You are missing the /e flag in your substitution: You can't dereference a coderef within a string like you'd normally do with (scalar|hash|array)ref:
my $example = sub { return "hello" };
say "$example->()"; #Will stringify the coderef.
You either need the /e flag,
$html =~ s/etc/$ref_to_getHeading->($2) . "$1$3$4" . $ref_to_getHeading->($2)/em;
Or a little trick:
$html =~ s/etc/@{[$ref_to_getHeading->($2)]}$1$3$4@{[$ref_to_getHeading->($2)]}/m;
EDIT: Gosh, am I a slow typist..
Anyhow, with either way, you should be able to call the sub directly, so no need for the coderef.
The line my $ref_to_getHeading = \getHeading; doesn't do what you think it does. To take a reference to a subroutine:
my $ref_to_getHeading = \&getHeading; # note the &
So you were actually calling getHeading and storing the result. Since you passed no arguments, you got the undefined value warning.
The substitution, however, will never call the coderef. For that to happen, you need to add the e modifier, which evaluates the replacement as Perl code:
$html =~ s/.../join '' => getHeading($2), $1, $3, $4, getHeading($2)/me;
You may run into issues here with getHeading resetting the match variables too early. In that case, try writing it this way:
$html =~ s{...}{
my $body = $1 . $3 . $4;
my $heading = getHeading($2);
$heading . $body . $heading
}me;
The bracket change for s/// was not necessary, I just find it easier to read a multi-line curly block.
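For completeness, a self-contained sketch of the whole thing. The three size patterns are hypothetical, since the question never shows how $fivetonine and friends are defined, and the sample $html is made up too. The captures are copied into lexicals before getHeading runs its own matches.

```perl
use strict;
use warnings;

# Hypothetical patterns; the question does not show the real ones.
my $fivetonine        = qr/^[5-9]$/;
my $tentofourteen     = qr/^1[0-4]$/;
my $fifteentonineteen = qr/^1[5-9]$/;

sub getHeading {
    my ($size) = @_;                       # first argument, same as $_[0]
    return '='   if $size =~ $fivetonine;
    return '=='  if $size =~ $tentofourteen;
    return '===' if $size =~ $fifteentonineteen;
    return '===';                          # fallback, as in the question
}

my $html = 'intro <font size="12">Title</font> rest';

$html =~ s{(.*)<font size="([^"]+)">(.+)</font>(.*)}{
    my ($before, $size, $text, $after) = ($1, $2, $3, $4);
    my $heading = getHeading($size);
    $heading . $before . $text . $after . $heading
}e;

print "$html\n";   # ==intro Title rest==
```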

How can I "store" an operator inside a variable in Perl?

I'm looking for a way to do this in Perl:
$a = "60"; $b = "< 80";
if ( $a $b ) { then .... }
Here, $b "holds" an operator... can I do this? Maybe some other way?
It's nice to see how people discover functional programming. :-)
Luckily, Perl has capabilities to create and store functions on-the-fly. For example, the sample in your question will look like this:
$a = "60"; $b = sub { $_[0] < 80 };
if ( $b->($a) ) { .... }
In this example, a reference to the anonymous subroutine is stored in $b, the sub having the same syntax for argument passing as a usual one. -> is then used to call-by-reference (the same syntax you probably use for references to arrays and hashes).
But, of course, if you want just to construct Perl expressions from arbitrary strings, you might want to use eval:
$a = "60"; $b = " < 80";
if ( eval ("$a $b") ) { .... }
However, doing this via eval is not safe, if the string you're eval-ing contains parts that come as user input. Sinan Ünür explained it perfectly in his answer-comment.
How about defining a function that wraps the needed condition:
my $cond = sub { $_[0] < 80 };
if ( $cond->( $a ) ) {
...
}
This should be a comment but comments are too cramped for something like this so I am making it CW.
For the case you showed, where the contents of the variables passed to string eval are known in advance, the accepted solution is correct.
If, however, the contents of $a and $b come from user input, then take a look at the following script:
#!/usr/bin/perl
use strict; use warnings;
my $x = '80';
my $y = '; warn "evil laugh!\n"; exit';
if ( eval ($x . $y) ) {
print "it worked!!!\n";
}
If the strings are entered by the user, there is nothing preventing the user from passing to your program the string ';system "rm -rf /bin"'.
So, the correct solution to your question would require writing or using an expression parser.
BTW, you should not use $a and $b as variable names, as they are magical package variables used by sort and as such are exempt from strict. You should also always use strict and warnings in your programs.
$a = "60"; $b = "< 80";
if( eval($a. $b)){
print "ok";
}
see perldoc eval for more
I wonder if Number::Compare is of any interest here. From the example:
Number::Compare->new(">1Ki")->test(1025); # is 1025 > 1024
my $c = Number::Compare->new(">1M");
$c->(1_200_000); # slightly terser invocation
Safer form if you trust (or can sufficiently validate) $op and don't trust the safety of the inputs:
my $compare_x = $user_input_x;
my $compare_y = $user_input_y;
my $op = <some safe non-user-input, or otherwise checked against a safe list>;
if ( eval("\$compare_x $op \$compare_y") )
{
...
}
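One way to get the "safe list" without any string eval is a dispatch table: a hash that maps each allowed operator to an anonymous sub. This is only a sketch; the helper name check, the set of operators, and the restriction to plain decimal numbers are all my assumptions.

```perl
use strict;
use warnings;

# Allowed operators, each mapped to a sub that performs the comparison.
my %compare = (
    '<'  => sub { $_[0] <  $_[1] },
    '<=' => sub { $_[0] <= $_[1] },
    '>'  => sub { $_[0] >  $_[1] },
    '>=' => sub { $_[0] >= $_[1] },
    '==' => sub { $_[0] == $_[1] },
    '!=' => sub { $_[0] != $_[1] },
);

# Hypothetical helper: parse a condition such as "< 80" and apply it.
sub check {
    my ($left, $condition) = @_;
    my ($op, $right) =
        $condition =~ /^\s*(<=|>=|==|!=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$/
        or die "Unsupported condition: '$condition'\n";
    return $compare{$op}->($left, $right);
}

print check(60, '< 80') ? "ok\n" : "not ok\n";   # ok
```

Anything that does not parse as a known operator followed by a number is rejected, so input like '; exit' never reaches the interpreter.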

How can I expand a string like "1..15,16" into a list of numbers?

I have a Perl application that takes from command line an input as:
application --fields 1-6,8
I am required to display the fields as requested by the user on command line.
I thought of substituting '-' with '..' so that I can store them in array e.g.
$str = "1..15,16" ;
@arr2 = ( $str ) ;
@arr = ( 1..15,16 ) ;
print "@arr\n" ;
print "@arr2\n" ;
The problem here is that @arr works fine (as it should), but in @arr2 the entire string is not expanded into array elements.
I have tried using escape sequences but no luck.
Can it be done this way?
If this is user input, don't use string eval on it if you have any security concerns at all.
Try using Number::Range instead:
use Number::Range;
$str = "1..15,16" ;
@arr2 = Number::Range->new( $str )->range;
print for @arr2;
To avoid dying on an invalid range, do:
eval { @arr2 = Number::Range->new( $str )->range; 1 } or your_error_handling
There's also Set::IntSpan, which uses - instead of ..:
use Set::IntSpan;
$str = "1-15,16";
@arr2 = Set::IntSpan->new( $str )->elements;
but it requires the ranges to be in order and non-overlapping (it was written for use on .newsrc files, if anyone remembers what those are). It also allows infinite ranges (where the string starts -number or ends number-), which the elements method will croak on.
You're thinking of @arr2 = eval($str);
Since you're taking input and evaluating that, you need to be careful.
You should probably @arr2 = eval($str) if ($str =~ m/^[0-9.,]+$/)
P.S. I didn't know about the Number::Range package, but it's awesome. Number::Range ftw.
I had the same problem in dealing with the output of Bit::Vector::to_Enum. I solved it by doing:
$range_string =~ s/\b(\d+)-(\d+)\b/expand_range($1,$2)/eg;
then also in my file:
sub expand_range
{
return join(",",($_[0] .. $_[1]));
}
So "1,3,5-7,9,12-15" turns into "1,3,5,6,7,9,12,13,14,15".
I tried really hard to put that expansion in the 2nd part of the s/// so I wouldn't need that extra function, but I couldn't get it to work. I like this because while Number::Range would work, this way I don't have to pull in another module for something that should be trivial.
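For what it's worth, the expansion can be written directly in the replacement part when /e is used. A minimal sketch with the same sample string:

```perl
use strict;
use warnings;

my $range_string = "1,3,5-7,9,12-15";

# With /e the replacement is an expression: build the list $1 .. $2
# and join it with commas, no helper sub needed.
$range_string =~ s/\b(\d+)-(\d+)\b/join ',', $1 .. $2/eg;

print "$range_string\n";   # 1,3,5,6,7,9,12,13,14,15
```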
@arr2 = ( eval $str ) ;
Works, though of course you have to be very careful with eval().
You could use eval:
$str = "1..15,16" ;
@arr2 = ( eval $str ) ;
@arr = ( 1..15,16 ) ;
print "@arr\n" ;
print "@arr2\n" ;
Although if this is user input, you'll probably want to do some validation on the input string first, to make sure they haven't input anything dodgy.
Use split:
@parts = split(/\,/, $fields);
print $parts[0];
1-6
print $parts[1];
8
You can't just put a string containing ',' in an array, and expect it to turn to elements (except if you use some Perl black magic, but we won't go into that here)
But Regex and split are your friends.
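A sketch of how split and a regex can do the whole job without eval, for input in the "1-6,8" form from the question. The sub name expand_fields and the choice to reject descending ranges are my assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "1-6,8" into (1, 2, 3, 4, 5, 6, 8).
sub expand_fields {
    my ($spec) = @_;
    my @fields;
    for my $part (split /,/, $spec) {
        if ($part =~ /^(\d+)-(\d+)$/) {
            die "Descending range: '$part'\n" if $1 > $2;
            push @fields, $1 .. $2;        # expand the range
        }
        elsif ($part =~ /^\d+$/) {
            push @fields, $part;           # single field number
        }
        else {
            die "Invalid field specification: '$part'\n";
        }
    }
    return @fields;
}

my @arr = expand_fields('1-6,8');
print "@arr\n";   # 1 2 3 4 5 6 8
```

Since every part must match a digits-only pattern, nothing the user types is ever executed.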