Nested perl matching groups not working? [duplicate] - perl

I want the following code to print out "bye", and then print out "hello". However, when I run it, it prints out "bye" and then perl tells me that $str2 has not been initialized.
my $item = "hello/bye";
if($item =~ m/.*(bye)/g){
my $str1 = $1;
print "$str1\n";
my $str2 = ($item =~ m/(hello).*/g)[0];
print "$str2\n";
}
I think that there is probably something I do not understand about the m//g part, but I am having trouble finding my answer in the perldoc page for perlre.

When you do
if($item =~ m/.*(bye)/g)
that does not reset the match iterator (we are in scalar context). The "position" remains at the character after the bye substring, so the following m//g picks up from where the previous one left off.
You can verify this yourself:
if ($item =~ /(bye)/g) {
printf "pos \$item = %d\n", pos $item;
...
}
which will print pos $item = 9.
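A minimal sketch of the same effect, assuming a hypothetical helper named match_positions (not part of the question's code): it records pos after a scalar-context m//g and then counts what a second m//g finds when it starts from that offset.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: returns pos() after a scalar-context
# m//g match, and the number of captures found by the m//g that follows it.
sub match_positions {
    my ($item) = @_;

    my $pos;
    if ($item =~ /(bye)/g) {    # scalar context: pos($item) is left after "bye"
        $pos = pos $item;
    }

    # This search starts at pos($item), so text before it is never examined.
    my @hello = $item =~ /(hello)/g;

    return ($pos, scalar @hello);
}

my ($pos, $found) = match_positions("hello/bye");
print "pos = $pos, hello matches = $found\n";
```

For "hello/bye" this should report a position of 9 and zero matches for hello, which is why $str2 ends up uninitialized in the question.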
Incidentally $item =~ /.*(bye)/ is better written as $item =~ /(bye)/ (assuming you don't care if you match the first or the last bye substring, just that $item has bye somewhere). Similarly, $item =~ /(hello).*/ is better written as $item =~ /(hello)/.
#!/usr/bin/env perl
use strict;
use warnings;
my $item = "hello/bye";
if ($item =~ /(bye)/) {
my $str1 = $1;
print "$str1\n";
my $str2 = ($item =~ /(hello)/g)[0];
print "$str2\n";
}

Related

How can I extract the number from the output of a shell command?

The output of the command is ent3, and from that output I want the 3 to be stored in a variable.
Perl code
sub {
if ( $exit == 1 )
{
$cmdStr = "lsdev | grep en | grep VLAN | awk '{ print \$1 }'\r";
$result =_run_cmd($cmdStr);
my @PdAt_val = split("\r?\n", $result);
my $num = $result =~ /([0-9]+)/;
print "The char is $num\n";
$exit = 0;
exp_continue;
Tidied code
sub {
if ( $exit == 1 ) {
$cmdStr = "lsdev | grep en | grep VLAN | awk '{ print \$1 }'\r";
$result = _run_cmd($cmdStr);
my @PdAt_val = split("\r?\n", $result);
my $num = $result =~ /([0-9]+)/;
print "The char is $num\n";
$exit = 0;
exp_continue;
Your code that is doing the work here is:
my $num = $result =~ /([0-9]+)/;
Let's put that into a simple program so we can see what's going on.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my $result = 'ent3';
my $num = $result =~ /([0-9]+)/;
say $num;
And that prints 1. Which isn't what we want. What's going on?
Well, if you read the documentation for the match operator (in the section Regexp Quote-Like Operators in "perlop"), you'll see what the operator returns under different circumstances. It says:
Searches a string for a pattern match, and in scalar context returns true if it succeeds, false if it fails.
So that explains the behaviour we're seeing. That "1" is just a true value saying that the match succeeded. But how do we get the value that we have captured in our parentheses? There are a couple of ways. Firstly, it's written into the $1 variable.
my $num;
if ($result =~ /([0-9]+)/) {
$num = $1;
}
say $num;
But I think the other approach is what you were looking for. If you read on, you'll see what the operator returns in list context:
m// in list context returns a list consisting of the subexpressions matched by the parentheses in the pattern, that is, ($1, $2, $3 ...)
So if we put the match operator in list context, then we'll get the contents of $1 returned. How do we put a match into list context? By making the expression a list assignment - which we can do by putting parentheses around the left-hand side of the assignment.
my ($num) = $result =~ /([0-9]+)/;
say $num;
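As a small sketch, the list-context form can be wrapped in a helper. The name first_number is hypothetical and only used here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first run of digits in a string,
# or undef if the string contains no digits.
sub first_number {
    my ($text) = @_;
    my ($num) = $text =~ /([0-9]+)/;   # parentheses on the left force list context
    return $num;
}

print first_number('ent3'), "\n";
```

A failed match returns an empty list in list context, so $num stays undefined rather than becoming a false "match result".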
Using regex, something like this should work:
if($result =~ /([0-9]+)/) {
$num = $1;
}
print $num;

Perl script - Confusing error

When I run this code, I am purely trying to get all the lines containing the word "that" in them. Sounds easy enough. But when I run it, I get a list of matches that contain the word "that" but only at the end of the line. I don't know why it's coming out like this and I have been going crazy trying to solve it. I am currently getting an output of 268 total matches, and the output I need is only 13. Please advise!
#!/usr/bin/perl -w
#Usage: conc.shift.pl textfile word
open (FH, "$ARGV[0]") || die "cannot open";
@array = (1,2,3,4,5);
$count = 0;
while($line = <FH>) {
    chomp $line;
    shift @array;
    push(@array, $line);
    $count++;
    if ($line =~ /that/)
    {
        $output = join(" ",@array);
        print "$output \n";
    }
}
print "Total matches: $count\n";
Don't you want to increment your $count variable only if the line contains "that", i.e.:
if ($line =~ /that/) {
$count++;
instead of incrementing the counter before checking if $line contains "that", as you have it:
$count++;
if ($line =~ /that/) {
Similarly, I suspect that your push() and join() calls, for stashing a matching line in @array, should also be within the if block, only executed if the line contains "that".
Hope this helps!
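A minimal sketch of counting only the matching lines, assuming the lines are already in memory. The helper name matching_lines and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns only the lines that contain $word.
sub matching_lines {
    my ($word, @lines) = @_;
    my @hits;
    for my $line (@lines) {
        # \Q...\E makes $word match literally, not as a pattern
        push @hits, $line if $line =~ /\Q$word\E/;
    }
    return @hits;
}

my @sample = ('I know that', 'nothing here', 'that is all');
my @hits   = matching_lines('that', @sample);
print "Total matches: ", scalar(@hits), "\n";
```

The count comes from the list of hits, so it cannot drift away from what was actually matched.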

Perl IF statement not matching variables in REGEX

my $pointer = 0;
foreach (@new1)
{
    my $test = $_;
    foreach (@chk)
    {
        my $check = $_;
        chomp $check;
        delete($new1[$pointer]) if ($test =~ /^$check/i);
    }
    $pointer++;
}
The if statement never matches, even though many entries in the @new1 array do contain $check at the start of the array element (88 at least).
I am not sure it is the nested loop that is causing the problem, because if I try this it also fails to match:
foreach (@chk)
{
    @final = (grep /^$_/, @new1);
}
@final is empty, but I know at least 88 entries for $_ are in @new1.
I wrote this code on a machine running Windows ActivePerl 5.14.2 and the top code works. I then (using a copy of @new1) compare the two and remove any duplicates (also works on 5.14.2). I did try to negate the if match but that seemed to wipe out the @new1 array (so that I didn't need to do a hash compare).
When I try to run this code on a Linux RedHat box with Perl 5.8.0 it seems to struggle with the variable matching in the regex. If I hard-code the regex with an example I know is in @new1, the match works and in the first code the entry is deleted (in the second one the value is inserted in @final).
The @chk array is a listing file on the web server and the @new1 array is created by opening two log files on the web server and then pushing one into the other.
I had even gone to the trouble of printing out $test and $check in each loop iteration and manually checking to see if any of the values did match, and some of them do.
It has had me baffled for days now and I have had to throw in the towel and ask for help. Any ideas?
As tested by user1568538, the solution was to replace
chomp $check;
with
$check =~ s/\r\n//g;
to remove Windows-style line endings from the variable.
Since chomp removes the contents of the input record separator $/ from the end of its argument, you could also change its value:
my $pointer = 0;
foreach (@new1)
{
    my $test = $_;
    foreach (@chk)
    {
        local $/ = "\r\n";   # chomp now strips a trailing CRLF
        my $check = $_;
        chomp $check;
        delete($new1[$pointer]) if ($test =~ /^$check/i);
    }
    $pointer++;
}
However, since $/ also affects other operations (such as reading from a file handle), it is probably safest to avoid changing $/ unless you are sure that it is safe. Here I limit the change to the foreach loop where the chomp occurs.
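A sketch of an alternative that leaves $/ alone: strip an optional carriage return together with the newline. The helper name strip_eol and the sample file name are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of $line without a trailing
# LF or CRLF, leaving $/ untouched.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

my $raw = "index.html\r\n";   # a line as read on Linux from a Windows-made file

my $chomped = $raw;
chomp $chomped;               # with the default $/ only "\n" is removed

printf "after chomp: %d chars, after strip_eol: %d chars\n",
    length($chomped), length(strip_eol($raw));
```

With the default $/ the chomped copy should still be one character longer than the stripped one, because the "\r" is left behind. That stray "\r" is what prevents the regex from matching.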
Not knowing what your input data looks like, using \Q might help:
delete($new1[$pointer]) if ($test =~ /^\Q$check\E/i);
See quotemeta.
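To illustrate why \Q can matter, here is a small sketch with a made-up helper, starts_with, and made-up sample strings. It compares a quoted prefix with the same prefix interpolated as a pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) if $text begins with the literal string
# $prefix, compared case-insensitively; false (0) otherwise.
sub starts_with {
    my ($text, $prefix) = @_;
    return $text =~ /^\Q$prefix\E/i ? 1 : 0;
}

print starts_with('a+b.log', 'a+b'), "\n";   # "+" is matched literally
print 'a+b.log' =~ /^a+b/ ? 1 : 0, "\n";     # "+" is treated as a quantifier
```

The first line should print 1 and the second 0, since without \Q the pattern a+b means "one or more a, then b".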
It is not clear what you are trying to do. However, you may be trying to get only those elements for which there is no match, or vice versa. Adapt the code below to your needs:
#!/usr/bin/perl
use strict; use warnings;
my @item = qw(...); # your @new1?
my @check = qw(...); # your @chk?
my @match;
my @nomatch;
ITEM:
foreach my $item (@item) {
    CHECK:
    foreach my $check (@check) {
        # use /^\Q$check\E/ here if $check should not be interpreted
        # as a pattern, but as literal characters
        if ($item =~ /^$check/) {
            push @match, $item;
            next ITEM; # there was a match, so this $item is burnt
                       # we don't need to test against other $checks.
        }
    }
    # there was no match, so let's store it:
    push @nomatch, $item;
}
print "matched $_\n" for @match;
print "didn't match $_\n" for @nomatch;
Your code is somewhat difficult to read. Let me tell you what this
foreach (@chk) {
    @final = (grep /^$_/, @new1);
}
does: It is roughly equivalent to
my @final = ();
foreach my $check (@chk) {
    @final = grep /^$check/, @new1;
}
which is equivalent to
my @final = ();
foreach my $check (@chk) {
    # @final = grep /^$check/, @new1;
    @final = ();
    foreach (@new1) {
        if (/^$check/) {
            push @final, $_;
        }
    }
}
So your @final array is reset on every pass through the outer loop. It ends up holding only the matches for the last element of @chk, which may be none at all.
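One way to keep the matches for every check is to append to the result instead of assigning to it. This is only a sketch: the helper name items_with_prefix and the sample data are hypothetical, and it assumes the checks are meant as literal prefixes.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every item that starts with any of the
# given prefixes. Takes two array references.
sub items_with_prefix {
    my ($items, $prefixes) = @_;
    my @final;
    for my $prefix (@$prefixes) {
        # push appends to @final instead of overwriting it
        push @final, grep { /^\Q$prefix\E/ } @$items;
    }
    return @final;
}

my @final = items_with_prefix([qw(apple apricot banana cherry)], [qw(ap ch)]);
print "@final\n";
```

Note that an item matching more than one prefix would be returned once per matching prefix. Deduplicate afterwards if that matters for your data.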

What does =~ mean in Perl? [duplicate]

In a Perl program I am examining (namely plutil.pl), I see a lot of =~ in the XML parser portion. For example, here is UnfixXMLString (lines 159 to 167 in version 1.7):
sub UnfixXMLString {
    my ($s) = @_;
    $s =~ s/&lt;/</g;
    $s =~ s/&gt;/>/g;
    $s =~ s/&amp;/&/g;
    return $s;
}
From what I can tell, it's taking a string, modifying it with the =~ operator, then returning that modified string, but what exactly is it doing?
=~ is the Perl binding operator. It's generally used to apply a regular expression to a string; for instance, to test if a string matches a pattern:
if ($string =~ m/pattern/) {
Or to extract components from a string:
my ($first, $rest) = $string =~ m{^(\w+):(.*)$};
Or to apply a substitution:
$string =~ s/foo/bar/;
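The extraction and substitution forms can be tried in a short script. The helper name split_pair and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: splits "key:value" into its two parts by binding the
# match to $string with =~. In list context it returns the two captures,
# or an empty list if the string has no "word:" prefix.
sub split_pair {
    my ($string) = @_;
    return $string =~ m{^(\w+):(.*)$};
}

my $string = 'name:plutil';
my ($first, $rest) = split_pair($string);
print "$first / $rest\n";

# s/// bound with =~ edits the variable in place and returns the number
# of substitutions it made.
my $changed = ($string =~ s/plutil/perl/);
print "$changed change(s): $string\n";
```

Without =~ the match and substitution operators would work on $_ instead of the named variable.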
=~ is the Perl binding operator and can be used to determine whether a regular expression match occurred (true or false):
$sentence = "The river flows slowly.";
if ($sentence =~ /river/)
{
print "Matched river.\n";
}
else
{
print "Did not match river.\n";
}

How to display user input on one line in a palindrome assignment?

In Perl, I have to determine whether user input is a palindrome or not, and it must display like this:
Enter in 7 characters: ghghghg #one line here #
Palindrome! #second line answer#
But instead this is what it does:
Enter in 7 characters: g #one line#
h #second line#
g #third line#
h #fourth line#
g #fifth line#
h #sixth line#
g Palindrome! #seventh line#
My problem seems to be on the chomp lines with all the variables, but I just can't figure out what to do and I've been at it for hours. I need a simple solution, but I have not progressed to arrays yet, so I need something simple to fix this. Thanks
And here is what I have so far. The formula seems to work, but it keeps printing a new line for each character:
use strict;
use warnings;
my ($a, $b, $c, $d, $e, $f, $g);
print "Enter in 7 characters:";
chomp ($a = <>); chomp ($b = <>); chomp ($c = <>); chomp ($d = <>); chomp ($e = <>); chomp ($f = <>); chomp ($g = <>);
if (($a eq $g) && ($b eq $f) && ($c eq $e) && ($d eq $d) && ($e eq $c) && ($f eq $b) && ($g eq $a))
{print "Palindrome! \n";}
else
{print "Not Palindrome! \n";}
If you're going to determine if a word is the same backwards, may I suggest using reverse and lc?
chomp(my $word = <>);
my $reverse = reverse $word;
if (lc($word) eq lc($reverse)) {
print "Palindrome!";
} else {
print "Not palindrome!";
}
Perl is famous for its TIMTOWTDI. Here are two more ways of doing it:
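use feature 'say';   # needed for say; the recursive pattern below needs Perl 5.10+

print "Enter 7 characters: ";
chomp(my $i = <STDIN>);
say "reverse: ", pal_reverse($i) ? "yes" : "no";
say "regex: ", pal_regex($i) ? "yes" : "no";
sub pal_reverse {
    my $i = (@_ ? shift : $_);
    return $i eq reverse $i;
}
sub pal_regex {
    # (?1) recurses into group 1; \2 is the character captured by (.)
    return (@_ ? shift() : $_) =~ /^(.?|(.)(?1)\2)$/ + 0;
}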
use strict;
use warnings;
use feature 'say';
print "Please enter 7 characters : ";
my $input = <>; # Read in input
chomp $input; # To remove trailing "\n"
# Season with input validation
warn 'Expected 7 characters, got ', length $input, ' instead'
unless length $input == 7;
# Determine if it's palindromic or not
say $input eq reverse($input)   # parentheses stop reverse from taking the ?: as its argument
    ? 'Palindrome'
    : 'Not palindrome' ;
TIMTOWTDI for the recursion-prone:
sub is_palindrome {
    my ($s) = @_;                  # work on a copy so literals can be passed in
    return 1 if length $s < 2;     # Whole string is palindromic
    # Chop off the first and last characters; not palindromic if they differ
    return unless substr($s, 0, 1, '') eq substr($s, -1, 1, '');
    @_ = ($s);                     # Check next chars
    goto &is_palindrome;
}
say is_palindrome( 'ghghghg' ) ? 'Palindromic' : 'Not palindromic' ;
And perldoc perlretut for those who aren't :)
Recursive patterns
This feature (introduced in Perl 5.10) significantly extends the power
of Perl's pattern matching. By referring to some other capture group
anywhere in the pattern with the construct (?group-ref), the pattern
within the referenced group is used as an independent subpattern in
place of the group reference itself. Because the group reference may
be contained within the group it refers to, it is now possible to
apply pattern matching to tasks that hitherto required a recursive
parser.
To illustrate this feature, we'll design a pattern that matches if a
string contains a palindrome. (This is a word or a sentence that,
while ignoring spaces, interpunctuation and case, reads the same
backwards as forwards.) We begin by observing that the empty string or
a string containing just one word character is a palindrome. Otherwise
it must have a word character up front and the same at its end, with
another palindrome in between.
/(?: (\w) (?...Here be a palindrome...) \g{-1} | \w? )/x
Adding \W* at either end to eliminate what is to be ignored, we
already have the full pattern:
my $pp = qr/^(\W* (?: (\w) (?1) \g{-1} | \w? ) \W*)$/ix;
for $s ( "saippuakauppias", "A man, a plan, a canal: Panama!" ){
print "'$s' is a palindrome\n" if $s =~ /$pp/;
}