Using the binding operator in Perl

I am working on a program in Perl and I am trying to combine more than one regex with a single binding operator. I have tried the syntax below, but it's not working. I would like to know if there is another way to do this.
$in =~ (s/pattern/replacement/)||(s/pattern/replacement/)||...

You can often get a clue about what Perl makes of some code using B::Deparse.
$ perl -MO=Deparse -E'$in =~ (s/pattern1/replacement1/)||(s/pattern2/replacement2/)'
[ ... snip ... ]
s/pattern2/replacement2/u unless $in =~ s/pattern1/replacement1/u;
-e syntax OK
So it's attempting your first substitution on $in. And if that fails, it is then trying your second substitution. But it's not using $in for the second substitution, it's using $_ instead.
You're running up against precedence issues here. Perl interprets your code as:
($in =~ s/pattern1/replacement1/) or (s/pattern2/replacement2/)
Notice that the opening parenthesis has moved before $in.
As others have pointed out, it's best to use a loop approach here. But I thought it might be useful to explain why your version didn't work.
Update: To be clear, if you wanted to use syntax like this, then you would need:
($in =~ s/pattern1/replacement1/) or
($in =~ s/pattern2/replacement2/);
Note that I've included $in =~ in each expression. At this point, it becomes obvious (I hope) why the looping solution is better.
However, because or is a short-circuiting operator, this statement will stop after the first successful substitution. I assumed that's what you wanted from your use of it in your original code. If that's not what you want, then you need to either switch to using and (which also short-circuits, but stops after the first substitution that fails) or, better in my opinion, break them out into separate statements.
$in =~ s/pattern1/replacement1/;
$in =~ s/pattern2/replacement2/;

The closest you could get with a syntax looking similar to that would be
s/one/ONE/ or
s/two/TWO/ or
...
s/ten/TEN/ for $str;
This will attempt each substitution in turn, once only, stopping after the first successful one.

Use for to "topicalize" (alias $_ to your variable).
for ($in) {
    s/pattern/replacement/;
    s/pattern/replacement/;
}

A simpler way might be to create an array of all such patterns and replacements, then simply iterate through your array applying the substitution one pattern at a time.
my $in = "some string you want to modify";
my @patterns = (
    ['pattern to match', 'replacement string'],
    # ...
);
$in = replace_many($in, \@patterns);
sub replace_many {
    my ($in, $replacements) = @_;
    foreach my $replacement ( @$replacements ) {
        my ($pattern, $replace_string) = @$replacement;
        $in =~ s/$pattern/$replace_string/;
    }
    return $in;
}
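If what you actually want is the short-circuit behaviour of the || chain (stop after the first substitution that succeeds), the same table-driven loop can do that too. This is only a sketch: replace_first is a made-up name, and the sample patterns are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the substitutions in order and stop after the
# first one that succeeds, which is what the `or` chain does.
sub replace_first {
    my ($in, $replacements) = @_;
    for my $pair (@$replacements) {
        my ($pattern, $replace_string) = @$pair;
        # s/// returns true when it changed something, so leave the loop then.
        last if $in =~ s/$pattern/$replace_string/;
    }
    return $in;
}

# Invented sample data, not from the question.
my @patterns = (
    [ qr/cat/, 'dog'  ],
    [ qr/dog/, 'bird' ],
);
print replace_first('a cat', \@patterns), "\n";   # prints "a dog"
```

With replace_many above, the same input would go on to the second pattern and end up as "a bird".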

It's not at all clear what you need, and it's not at all clear that you can accomplish what you appear to want by the means you suggest. The OR operator is a short circuit operator, and you may not want this behavior. Please give an example of the input you expect, and the output you desire, hopefully several examples of each. Meanwhile, here is a test script.
use warnings;
use strict;
my $in1 = 'George Walker Bush';
my $in2 = 'George Walker Bush';
my $in3 = 'George Walker Bush';
my $in4 = 'George Walker Bush';
(my $out1 = $in1) =~ s/e/*/g;
print "out1 = $out1 \n";
(my $out2 = $in2) =~ s/Bush/Obama/;
print "out2 = $out2 \n";
(my $out3 = $in3) =~ s/(George)|(Bush)/Obama/g;
print "out3 = $out3\n";
$in4 =~ /(George)|(Walker)|(Bush)/g;
print "$1 - $2 - $3\n";
exit(0);
You will notice in the last case that only the first alternative matches in the regular expression. If you wanted to replace 'George Walker Bush' with 'Barack Hussein Obama', you could do that easily enough, but you would also replace 'George Washington' with 'Barack Washington' - is this what you want? Here is the output of the script:
out1 = G*org* Walk*r Bush
out2 = George Walker Obama
out3 = Obama Walker Obama
Use of uninitialized value $2 in concatenation (.) or string at pq_151111a.plx line 19.
Use of uninitialized value $3 in concatenation (.) or string at pq_151111a.plx line 19.
George - -

Related

perl regex too greedy

I went through similar questions asked by other members and tried to apply the solutions from them, but they did not work for my issue. My pattern match and grouping is too greedy and does not stop at the first pipe (|). I think it could if I made the pattern more specific, but I'm trying to figure out how to stop the match at the first instance of the pipe.
Here are a couple of lines:
09:30:00.063|IN:|8=FIX.4.2|9=206|35=D|34=5159|49=CLIENT|52=20191024-13:30:00.050|56=SERV|57=DEST|1=05033|11=ABZ5702|15=USD|21=1|38=2000|40=2|44=92.48|47=A|54=5|55=RC|60=20191024-13:30:00.050|111=0|114=N|336=X|5700=AP|9281=SOV|10=202
09:37:21.208|IN:|8=FIX.4.2|9=170|35=D|34=5184|49=CLIENT|52=20191024-13:37:21.206|56=SERV|57=ATXB|1=J5129|11=136404|15=USD|21=1|38=100|40=2|44=1.39|47=A|54=2|55=DIW|59=2|60=20191024-13:30:00.206|10=029
I'm expecting my perl script to return the following output from the above data:
09:30:00.063|13:30:00.050|ABZ5702
09:37:21.208|13:37:21.206|136404
I tried all of these and a few other variations but could not get it to produce the above output:
#$msg =~ s/([^|]*).*|52=([^|]*).*|11=([^|]*).*/$1|$2|$3/;
$msg =~ s/(.+)\|??.*|52=([^|]*).*|11=([^|]*).*/$1|$2|$3/;
#$msg =~ s/^([^|]*).??|52=([^|]*).??|11=([^|]*).*/$1|$2|$3/;
#$msg =~ s/^([^\|??]*).*|52=([^\|??]*).*|11=([^\|??]*).*/$1|$2|$3/;
#$msg =~ s/(.*\|??).*|52=(.+\|??).*|11=(.+\|??).*/one $1|two $2|three $3/;
#$msg =~ s/(.*?|).*|52=(.*?|).*|11=(.*|?).*/$1|$2|$3/;
#$msg =~ /(.*)|??.*|52=(.*)|??.*|11=(.*)|??.*/$1|$2|$3/;
#$msg =~ s/|.*-[0-3][0-9]:/|/;
print "$msg\n";
I realize there is more than one way to skin the cat, but there are cases where I need to use the pattern match approach. How can I get it to produce the expected output using pattern matching, with each group stopping at the first pipe (|)? Can someone tell me what I am doing wrong?
Try this:
s/(.*?)\|.*\|52=([^|]*).*\|11=([^|]*).*/$1 $2 $3/;
There were a couple of pipe delimiters that needed escaping.
You need to look at non-greedy matching https://docstore.mik.ua/orelly/perl/cookbook/ch06_16.htm
The first matching group is (.*?) instead of (.*). The ? means we match as little as possible.
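To get exactly the pipe-separated output the question asks for (time only from tag 52, and | between the fields), the same idea can be pushed a little further. This is a sketch, assuming tag 52 always looks like date-time and that tag 11 comes after tag 52 on every line; the helper name extract_fields is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce one log line to
# "<first field>|<time part of tag 52>|<value of tag 11>".
sub extract_fields {
    my ($msg) = @_;
    # [^|]* can never run past a pipe; .*? expands only as far as needed
    # to reach the next literal "|52=" or "|11=".
    $msg =~ s/^([^|]*)\|.*?\|52=[^|-]*-([^|]*)\|.*?\|11=([^|]*).*/$1|$2|$3/;
    return $msg;
}

my $line = '09:30:00.063|IN:|8=FIX.4.2|9=206|35=D|34=5159|49=CLIENT|52=20191024-13:30:00.050|56=SERV|57=DEST|1=05033|11=ABZ5702|15=USD|21=1|10=202';
print extract_fields($line), "\n";   # prints 09:30:00.063|13:30:00.050|ABZ5702
```

Matching "|11=" with its leading pipe and trailing = is what keeps it from being confused with tags such as 1= or 111=.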
In general, for parsing FIX in perl, as long as there are no repeating groups, I would recommend splitting on | first and then creating a hash of tag-value pairs.
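A minimal sketch of that split-then-hash approach, assuming the first two fields are the log timestamp and a direction marker as in the sample lines, and that no tag repeats. The name parse_fix_line is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split one log line into its timestamp, direction
# marker, and a hash ref of FIX tag => value. Assumes no repeating groups.
sub parse_fix_line {
    my ($line) = @_;
    chomp $line;
    my ($stamp, $direction, @fields) = split /\|/, $line;
    my %fix;
    for my $field (@fields) {
        # Limit of 2 so a value that itself contains "=" stays intact.
        my ($tag, $value) = split /=/, $field, 2;
        $fix{$tag} = $value;
    }
    return ($stamp, $direction, \%fix);
}

my $line = '09:30:00.063|IN:|8=FIX.4.2|9=206|35=D|34=5159|49=CLIENT|52=20191024-13:30:00.050|56=SERV|57=DEST|1=05033|11=ABZ5702|10=202';
my ($stamp, undef, $fix) = parse_fix_line($line);
my ($time) = $fix->{52} =~ /-(.*)/;          # keep only the part after the date
print join('|', $stamp, $time, $fix->{11}), "\n";
```

Looking fields up by tag number means the code keeps working if the order of the tags changes between messages.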
I would do it a little differently: split the line into an array and work on the individual elements of the array.
A regex may be an acceptable solution for one particular case, if the format of the line is fixed and will never change.
use strict;
use warnings;
use Data::Dumper;
my $debug = 0;
while( my $line = <DATA> ) {
    my @array = split /\|/, $line;
    print Dumper(\@array) if $debug;
    $array[7] =~ s/.+?-//;
    $array[11] =~ s/\d+=//;
    printf "%s\n", join '|', @array[0,7,11];
}
__DATA__
09:30:00.063|IN:|8=FIX.4.2|9=206|35=D|34=5159|49=CLIENT|52=20191024-13:30:00.050|56=SERV|57=DEST|1=05033|11=ABZ5702|15=USD|21=1|38=2000|40=2|44=92.48|47=A|54=5|55=RC|60=20191024-13:30:00.050|111=0|114=N|336=X|5700=AP|9281=SOV|10=202
09:37:21.208|IN:|8=FIX.4.2|9=170|35=D|34=5184|49=CLIENT|52=20191024-13:37:21.206|56=SERV|57=ATXB|1=J5129|11=136404|15=USD|21=1|38=100|40=2|44=1.39|47=A|54=2|55=DIW|59=2|60=20191024-13:30:00.206|10=029

Search array of strings for specific words

Firstly, I don't know Perl at all, and need a reasonably quick answer on this. I have the result of running a command stored in an array:
my @result = `$command`;
What I need to do is search the array to see if any element contains the word "Merge" or the word "changed" (both case insensitive).
Can someone advise please?
The tool for the job here is grep - a function that allows you to specify a filter against a list. You can use it much like Unix grep, but it'll also allow for more complex tests (e.g. code to run).
In your case:
my @matches = grep { /merge|changed/i } @result;
if ( @matches ) {
    print "One or more lines matched\n";
}
You can do this with a regular expression. In this example, if the line matches 'merge' or 'changed' (case-insensitively, of course), the line is printed:
#!/usr/bin/perl
use strict;
use warnings;
my @result = `command`;
foreach my $line (@result){
    if ($line =~ /merge|changed/i){
        print $line;
    }
}
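One thing to watch with /merge|changed/i is that it also matches inside longer words such as "unchanged" or "merged". If only the whole words should count, word boundaries help. A small sketch follows; the helper name has_keyword and the sample lines are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the number of lines that contain "merge" or
# "changed" as a whole word, case-insensitively (0 means no match).
sub has_keyword {
    my @lines = @_;
    return scalar( grep { /\b(?:merge|changed)\b/i } @lines );
}

# Invented sample lines standing in for the command output.
my @result = ("Fast-forward\n", " 2 files changed, 3 insertions(+)\n");
print "One or more lines matched\n" if has_keyword(@result);
```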

conditional substitution using hashes

I'm trying to write a substitution in which a condition allows or disallows the replacement.
I have a string
$string = "There is <tag1>you can do for it. that dosen't mean <tag2>you are fool.There <tag3>you got it.";
Here are two hashes which are used to check the condition.
my %tag = ('tag1' => '<you>', 'tag2'=>'<do>', 'tag3'=>'<no>');
my %data = ('you'=>'<you>');
Here is the actual substitution; the replacement should happen only when the tag's hash value does not match.
$string =~ s{(?<=<(.*?)>)(you)}{
if($tag{"$1"} eq $data{"$2"}){next;}
"I"
}sixe;
In this code I want to replace 'you' with something else, on the condition that it is not equal to the hash value given for the tag.
Can I use next in a substitution?
The problem is that I can't use the /g modifier, and after using next I can't move on to the next substitution.
Also, I can't modify the expression while matching, and with next it doesn't go on to the second match; it stops there.
You can't use a variable length look behind assertion. The only one that is allowed is the special \K marker.
With that in mind, one way to perform this test is the following:
use strict;
use warnings;
while (my $string = <DATA>) {
    $string =~ s{<([^>]*)>\K(?!\1)\w+}{I}s;
    print $string;
}
__DATA__
There is <you>you can do for it. that dosen't mean <notyou>you are fool.
There is <you>you can do for it. that dosen't mean <do>you are fool.There <no>you got it.
Output:
There is <you>you can do for it. that dosen't mean <notyou>I are fool.
There is <you>you can do for it. that dosen't mean <do>I are fool.There <no>you got it.
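If the decision really has to come from the two hashes in the question, one option is to keep /g and let the /e replacement hand back the original text whenever the condition says "leave it alone", instead of reaching for next. This is a sketch under that reading of the question; replace_unless_tagged is a made-up name.

```perl
use strict;
use warnings;

my %tag  = ( tag1 => '<you>', tag2 => '<do>', tag3 => '<no>' );
my %data = ( you  => '<you>' );

# Hypothetical helper: replace "you" with "I" after every <tagN>, except
# where the tag's entry in %tag equals the word's entry in %data.
sub replace_unless_tagged {
    my ($string) = @_;
    # \K keeps the <tagN> part out of the replaced text, so only the word
    # itself is rewritten; returning $2 leaves that occurrence unchanged.
    $string =~ s{<([^>]*)>\K(you)}{
        my $want = $tag{$1}     // '';
        my $have = $data{lc $2} // '';
        $want ne '' && $want eq $have ? $2 : 'I';
    }gie;
    return $string;
}

my $string = "There is <tag1>you can do for it. that dosen't mean <tag2>you are fool.There <tag3>you got it.";
print replace_unless_tagged($string), "\n";
```

Here the "you" after <tag1> is kept, because $tag{tag1} and $data{you} are both '<you>', while the ones after <tag2> and <tag3> become "I".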
It was simple, but it took me two days to think of it. I just wrote a second substitution that ignores the tag that the first one skipped with next:
$string = "There is <tag1>you can do for it. that dosen't mean <tag2>you are fool.There <tag3>you got it.";
my %tag = ('tag1' => '<you>', 'tag2'=>'<do>', 'tag3'=>'<no>');
my %data = ('you'=>'<you>');
my $notag;
$string =~ s{(?<=<(.*?)>)(you)}{
$notag = $2;
if($tag{"$1"} eq $data{"$2"}){next;}
"I"
}sie;
$string =~ s{(?<=<(.*?)>)(?<!$notag)(you)}{
"I"
}sie;

Matching of data from output table

We need to match certain data, element by element, in tabular output obtained at the command prompt. The following is the approach currently being followed, where $Var contains the output. Is there a better way of doing this without redirecting the command output to a file?
Please share your thoughts.
$Var = "iSCSI Storage LHN StgMgmt Name IP Name
==============================================================
0 Storage_1 15.178.209.194 admin
1 acct-mgmt 15.178.209.194 storage1
2 acct-mgmt2 15.178.209.194 storage2";
@tab = split("\n",$Var);
foreach (@tab) {
    next if ($_ !~ /^\d/);
    $_ =~ s/\s+//g;
    $first=0 if($_ =~ /Storage/i && /15.178.209.194/);
    push(@Array, $_);
}
$_ =~ /Storage/i && /15.178.209.194/ is silly. That gets broken up like this: ($_ =~ /Storage/i) && (/15.178.209.194/). Either use $_ consistently or don't - the // and s/// operators automatically operate on $_.
Also you should know that in the regex /15.178.209.194/, the .s are being interpreted as any character. Either escape them or use the index() function.
Additionally, I would recommend that you separate each line using split(). This allows you to compare each individual column. You can use split() with a regex like so: @array = split(/\s+/, $string);.
Finally, I'm not really sure what $first is for, but I notice that all three sample lines in that input trigger $first=0 as they all contain that IP and the string "storage".
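Putting those points together, a column-wise version might look something like the sketch below. The column names ($index, $name, $ip, $user) are my guess at what the four fields mean, and parse_rows is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: return the data rows of the table as array refs,
# one per line, with the columns split on whitespace.
sub parse_rows {
    my ($text) = @_;
    my @rows;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^\s*\d/;      # skip the header and ==== lines
        push @rows, [ split ' ', $line ];   # split ' ' ignores leading spaces
    }
    return @rows;
}

my $Var = <<'END';
iSCSI Storage LHN StgMgmt Name IP Name
==============================================================
0 Storage_1 15.178.209.194 admin
1 acct-mgmt 15.178.209.194 storage1
2 acct-mgmt2 15.178.209.194 storage2
END

for my $row (parse_rows($Var)) {
    # Column meanings are assumed, not taken from the question.
    my ($index, $name, $ip, $user) = @$row;
    # eq compares the address literally, so no dots need escaping.
    print "$index: $name\n" if $ip eq '15.178.209.194' && $name =~ /storage/i;
}
```

Because the test looks only at the second column, rows 1 and 2 no longer match just because "storage" appears in their last column.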
If I understand you correctly you want to invoke your script like this:
./some_shell_command | perl perl_script.pl
What you want to use is the Perl diamond operator <>:
#!/usr/bin/perl
use strict;
use warnings;
my $first;
my @Array;
for (<>) {
    next unless /^\d/;
    s/\s+/ /g;
    # dots escaped so they match literally
    $first = 0 if /Storage/i && /15\.178\.209\.194/;
    push(@Array, $_);
}
I've removed the redundant uses of $_ and fixed your substitution, since you probably don't want to remove all spaces.

Can I replace the binding operator with the smartmatch operator in Perl?

How can I write this with the smartmatch operator (~~)?
use 5.010;
my $string = '12 23 34 45 5464 46';
while ( $string =~ /(\d\d)\s/g ) {
say $1;
}
Interesting. perlsyn states:
Any ~~ Regex pattern match $a =~ /$b/
so, at first glance, it seems reasonable to expect
use strict; use warnings;
use 5.010;
my $string = '12 23 34 45 5464 46';
while ( $string ~~ /(\d\d)\s/g ) {
say $1;
}
to print 12, 23, etc but it gets stuck in a loop, matching 12 repeatedly. Using:
$ perl -MO=Deparse y.pl
yields
while ($string ~~ qr/(\d\d)\s/g) {
say $1;
}
looking at perlop, we notice
qr/STRING/msixpo
Note that 'g' is not listed as a modifier (logically, to me).
Interestingly, if you write:
my $re = qr/(\d\d)\s/g;
perl barfs:
Bareword found where operator expected at C:\Temp\y.pl line 5,
near "qr/(\d\d)\s/g"
syntax error at C:\Temp\y.pl line 5, near "qr/(\d\d)\s/g"
and presumably it should also say something if an invalid expression is used in the code above.
If we go and look at what these two variants get transformed into, we can see the reason for this.
First, let's look at the original version.
perl -MO=Deparse -e'while("abc" =~ /(.)/g){print "hi\n"}'
while ('abc' =~ /(.)/g) {
print "hi\n";
}
As you can see there wasn't any changing of the opcodes.
Now if you go and change it to use the smart-match operator, you can see it does actually change.
perl -MO=Deparse -e'while("abc" ~~ /(.)/g){print "hi\n"}'
while ('abc' ~~ qr/(.)/g) {
print "hi\n";
}
It changes it to qr, which doesn't recognize the /g option.
This should probably give you an error, but it doesn't get transformed until after it gets parsed.
The error you should have gotten, and would get if you used qr directly, is:
syntax error at -e line 1, near "qr/(.)/g"
The smart-match feature was never intended to replace the =~ operator. It came out of the process of making given/when work like it does.
Most of the time, when(EXPR) is treated as an implicit smart match of $_.
...
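Is the expected behaviour to output the first match endlessly? Because that's what the ~~ version does in its current form. The while loop is endless because nothing ever moves the match position forward: $string is never modified, and since the /g switch is dropped when the pattern becomes a qr// object, every iteration matches from the start again. (With =~, a scalar-context /g match resumes from pos($string), which is why the original loop terminates.)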
What are you trying to achieve? I'm assuming you want to output the two-digit values, one per line. In which case you might want to consider:
say join("\n", grep { /^\d{2}$/ } split(" ",$string));
To be honest, I'm not sure you can use the smart match operator for this. In my limited testing, it looks like the smart match is returning a boolean instead of a list of matches. The code you posted (using =~) can work without it, however.
The ~~ version doesn't work because of how it interacts with the while loop. The condition of a while loop is evaluated before the start of each iteration, and without a working /g the match starts over from the beginning of $string every time, so it keeps returning the first value. A foreach over a list-context match with =~ works, however:
my $string = '12 23 34 45 5464 46';
foreach my $number ($string =~ /(\d\d)\s/g) {
    print $number."\n";
}
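For comparison, here is a small sketch of why the original while loop with =~ does terminate: in scalar context, each /g match picks up from pos($string), where the previous one stopped. The helper name two_digit_matches is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every two-digit run followed by whitespace,
# together with the value of pos() after each scalar-context /g match.
sub two_digit_matches {
    my ($string) = @_;
    my @found;
    while ( $string =~ /(\d\d)\s/g ) {
        # pos() has moved past this match, so the next iteration starts there.
        push @found, [ $1, pos($string) ];
    }
    return @found;
}

printf "%s (next search starts at offset %d)\n", @$_
    for two_digit_matches('12 23 34 45 5464 46');
```

Note that the trailing 46 is not reported, because nothing follows it for \s to match, and that 5464 contributes 64 rather than 54.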