The ?PATTERN? operator is not working for Matching Only Once in Perl

I am new to Perl and am practising some programs. I have encountered a syntax error. Please help me.
My Perl program
#!/usr/bin/perl
@list = qw/ food foosball subeo footnote terfoot canic footbridge /;
foreach ( @list ) {
$first = $1 if ?(foo.*)?;
$last = $1 if /(foo.*)/;
}
print "First: $first, Last: $last\n";
Output
syntax error at MatchingOnlyOnce.pl line 9, near "if ?"
Execution of MatchingOnlyOnce.pl aborted due to compilation errors.
Output of perl -v
This is perl 5, version 24, subversion 1 (v5.24.1) built for MSWin32-x64-multi-thread
Copyright 1987-2017, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

Use
$first=$1 if m?(foo.*)?;
?PATTERN? could be used as a shortcut for m?PATTERN?, but you can no longer omit the match operator's leading m when you use ? as the delimiter.
5.14 deprecated the ability to omit the leading m from m?PATTERN?flags.
5.22 removed the ability to omit the leading m from m?PATTERN?flags.
5.22 and 5.24's perlop lists both m?PATTERN?flags and ?PATTERN?flags, but only the former is legal in these versions.
5.26's documentation will be free of all mentions of ?PATTERN? (as opposed to m?PATTERN?).
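To make the "matches only once" behaviour concrete, here is a minimal sketch using the list from the question. The two-pass loop and the @hits array are only there for illustration; reset with no argument re-arms the one-shot match for the current package.

```perl
use strict;
use warnings;

my @list = qw/food foosball subeo footnote terfoot canic footbridge/;
my @hits;

for my $pass (1, 2) {
    for my $item (@list) {
        # m?...? succeeds at most once until reset is called
        push @hits, "$pass:$1" if $item =~ m?(foo.*)?;
    }
    reset;    # re-arm the m?...? match before the next pass
}

print "@hits\n";    # 1:food 2:food
```

Without the reset call, the second pass would record nothing, because the one-shot match has already fired.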

If you are not using the default / pattern delimiters, you must specify the match operation as in $x =~ m{...}, $x =~ m!...! etc.
?...? is different than those other alternative delimiters as ?...? does something different than /.../. perldoc perlreref currently states:
?pattern? is like m/pattern/ but matches only once. No alternate delimiters can be used. Must be reset with reset().
That is misleading as Perl used to recognize the plain ?...?, but support for that was completely removed a few years ago:
[perl #120912] [PATCH] Remove support for ?PATTERN? without explicit 'm' operator
…
This has issued a deprecation warning since Perl v5.14 (commit
725a61d70), and precludes using ? as an operator after a unary operator
that defaults to $_, such as:
ref ? $_ : [$_]
Here is the motivation for the deprecation and eventual removal:
Deprecate ?PATTERN?, recommending the equivalent m?PATTERN? syntax, in
order to eventually allow the question mark to be used in new operators
that would currently be ambiguous.
If you are just beginning to learn Perl, you ought to enable strict and warnings. Declare your variables in the smallest applicable scope.
#!/usr/bin/env perl
use strict;
use warnings;
my @list = qw/food foosball subeo footnote terfoot canic footbridge/;
my ($first, $last);
foreach my $item (@list) {
    $item =~ m?(foo.*)?
        and $first = $1;
    $item =~ /(foo.*)/
        and $last = $1;
}
print "First: $first, Last: $last\n";
Output:
$ perl tt.pl
First: food, Last: footbridge

A lot of things about Perl are unusual for a programming language, and you have happened upon a corner of the language that is rarely used
Perl was designed by the linguist Larry Wall, and there are many similarities between Perl and spoken languages. Perl allows abbreviations of some constructs, for instance, a pattern match like
/abc/
is equivalent to
$_ =~ m/abc/
That is to say, if it looks like a regex pattern it will be treated as one, regardless of whether the leading m is there or not. Also, many Perl operators work on $_ by default, which allows you to write a few lines of code without explicitly mentioning a variable
But m?...? is an old-fashioned construct that has really been edged out by lexical variables. If you declare a variable using my then there is no need for the one-shot match, or the corresponding reset operator. It is your declaration that defines the lifetime of the variable
If you are just starting with Perl, I recommend that you
Always start every program with use strict and use warnings 'all'. This isn't optional as it's the first line of defence against simple mistakes and errors
Always declare every variable using my as close as possible to its first use. Occasionally you may want to declare variables before a loop so that its value is kept across iterations, but generally variables should be temporary and useful only for a few lines of code
Forget about m?...?. I have never seen a program that uses it in twenty years of writing Perl professionally
I hope this helps
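As a rough sketch of the advice above, the first and last matches can be found with ordinary lexical variables and a plain /.../ match, with no m?...? or reset involved. The //= operator needs Perl 5.10 or later; the variable names are taken from the question.

```perl
use strict;
use warnings;

my @list = qw/food foosball subeo footnote terfoot canic footbridge/;
my ($first, $last);

for my $item (@list) {
    next unless $item =~ /(foo.*)/;
    $first //= $1;    # assigned only while $first is still undefined
    $last    = $1;    # overwritten on every match
}

print "First: $first, Last: $last\n";    # First: food, Last: footbridge
```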

Related

Cannot use "my $_" in new version(s) of Perl

In Perl 5.28.1, the following statement is invalid:
>perl
my $_;
Can't use global $_ in "my" at - line 1, near "my $_"
Execution of - aborted due to compilation errors.
This worked at least up to Perl 5.16.3. Was this construct removed from Perl, or is this a bug? If it was removed, I consider that a big problem, as this basic construct has been heavily used in the past and is also demonstrated in the Perl documentation. Nor does the Perl history mention such a big change in the language.

Was this construct removed from Perl, or is this a bug?
From perldoc perlvar:
$_ is a global variable.
However, between perl v5.10.0 and v5.24.0, it could be used lexically
by writing my $_ . Making $_ refer to the global $_ in the same scope
was then possible with our $_ . This experimental feature was removed
and is now a fatal error, but you may encounter it in older code.
If this was removed, I consider that a big problem as ...
I think this is not the right place to discuss this, i.e. discussions would not solve your current problem. As ikegami pointed out in the comments: this feature was marked experimental in 5.18 and thus led to warnings for many years. And you probably just need to replace the my $_ with local $_ in your code.
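A minimal sketch of the local $_ replacement, assuming a small helper that wants to use $_ internally. The trimmed sub and the sample data are hypothetical and not taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper that uses $_ internally without clobbering the caller's $_
sub trimmed {
    my ($text) = @_;
    local $_ = $text;    # the caller's $_ is restored when the sub returns
    s/^\s+//;
    s/\s+$//;
    return $_;
}

my @names = qw/alpha beta/;
my @out;
for (@names) {
    my $t = trimmed("  x  ");
    push @out, "$_:$t";    # $_ still holds the loop element here
}

print "@out\n";    # alpha:x beta:x
```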

Usage of defined with Filehandle and while Loop

While reading a book on advanced Perl programming(1), I came across
this code:
while (defined($s = <>)) {
...
Is there any special reason for using defined here? The documentation for
perlop says:
In these loop constructs, the assigned value (whether assignment is
automatic or explicit) is then tested to see whether it is defined. The
defined test avoids problems where line has a string value that would be
treated as false by Perl, for example a "" or a "0" with no trailing
newline. If you really mean for such values to terminate the loop, they
should be tested for explicitly: [...]
So, would there be a corner case or that's simply because the book is too old
and the automatic defined test was added in a recent Perl version?
(1) Advanced Perl Programming, First Edition, Sriram Srinivasan. O'Reilly
(1997)

Perl has a lot of implicit behaviors, many more than most other languages. Perl's motto is There's More Than One Way To Do It, and because there is so much implicit behavior, there is often More Than One Way To express the exact same thing.
/foo/ instead of $_ =~ m/foo/
$x = shift instead of $x = shift @_
while(<>) instead of while (defined($_=<ARGV>))
etc.
Which expressions to use are largely a matter of your local coding standards and personal preference. The more explicit expressions remind the reader what is really going on under the hood. This may or may not improve the readability of the code -- that depends on how knowledgeable the audience is and whether you are using well-known idioms.
In this case, the implicit behavior is a little more complicated than it seems. Sometimes perl will implicitly perform a defined(...) test on the result of the readline operator:
$ perl -MO=Deparse -e 'while($s=<>) { print $s }'
while (defined($s = <ARGV>)) {
print $s;
}
-e syntax OK
but sometimes it won't:
$ perl -MO=Deparse -e 'if($s=<>) { print $s }'
if ($s = <ARGV>) {
print $s;
}
-e syntax OK
$ perl -MO=Deparse -e 'while(some_condition() && ($s=<>)) { print $s }'
while (some_condition() and $s = <ARGV>) {
print $s;
}
-e syntax OK
Suppose that you are concerned about the corner cases that this implicit behavior is supposed to handle. Have you committed perlop to memory so that you understand when Perl uses this implicit behavior and when it doesn't? Do you understand the differences in this behavior between Perl v5.14 and Perl v5.6? Will the people reading your code understand?
Again, there's no right or wrong answer about when to use the more explicit expressions, but the case for using an explicit expression is stronger when the implicit behavior is more esoteric.

Say you have the following file
4<LF>
3<LF>
2<LF>
1<LF>
0
(<LF> represents a line feed. Note the lack of newline on the last line.)
Say you use the code
while ($s = <>) {
chomp($s);
say $s;
}
If Perl didn't do anything magical, the output would be
4
3
2
1
Note the lack of 0, since the string 0 is false. defined is needed in the unlikely case that
You have a non-standard text file (missing trailing newline).
The last line of the file consists of a single ASCII zero (0x30).
BUT WAIT A MINUTE! If you actually ran the above code with the above data, you would see 0 printed! What many don't know is that Perl automagically translates
while ($s = <>) {
to
while (defined($s = <>)) {
as seen here:
$ perl -MO=Deparse -e'while($s=<DATA>) {}'
while (defined($s = <DATA>)) {
();
}
__DATA__
-e syntax OK
So you technically don't even need to specify defined in this very specific circumstance.
That said, I can't blame someone for being explicit instead of relying on Perl automagically modifying their code. After all, Perl is (necessarily) quite specific as to which code sequences it will change. Note the lack of defined in the following even though it's supposedly equivalent code:
$ perl -MO=Deparse -e'while((), $s=<DATA>) {}'
while ((), $s = <DATA>) {
();
}
__DATA__
-e syntax OK
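The difference can also be observed at run time. The following is a sketch only: it reads the sample data from an in-memory filehandle, and keep_going is a hypothetical stand-in for any extra loop condition that stops Perl from adding defined on its own.

```perl
use strict;
use warnings;

my $data = "4\n3\n2\n1\n0";    # final line is "0" with no trailing newline

sub keep_going { 1 }           # hypothetical extra loop condition

# Explicit defined(): safe however the condition is composed.
open my $fh, '<', \$data or die "open failed: $!";
my (@explicit, $s);
while (keep_going() && defined($s = <$fh>)) {
    chomp $s;
    push @explicit, $s;
}
close $fh;

# No defined(): Perl does not add it inside a compound condition,
# so the false string "0" ends the loop early.
open $fh, '<', \$data or die "open failed: $!";
my @implicit;
while (keep_going() && ($s = <$fh>)) {
    chomp $s;
    push @implicit, $s;
}
close $fh;

print "explicit: @explicit\n";    # explicit: 4 3 2 1 0
print "implicit: @implicit\n";    # implicit: 4 3 2 1
```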

while($line=<DATA>){
chomp($line);
if(defined $line){
print "SEE:$line\n";
}
}
__DATA__
1
0
3
Try the code with defined removed and you will see the different result.

perl encapsulate single variable in double quotes

In Perl, is there any reason to encapsulate a single variable in double quotes (no concatenation) ?
I often find this in the source of the program I am working on (written 10 years ago by people who don't work here anymore):
my $sql_host = "something";
my $sql_user = "somethingelse";
# a few lines down
my $db = sub_for_sql_conection("$sql_host", "$sql_user", "$sql_pass", "$sql_db");
As far as I know there is no reason to do this. When I work on an old script I usually remove the quotes so that my editor colors them as variables, not as strings.
I think they saw this somewhere and copied the style without understanding why it is so. Am I missing something ?
Thank you.

All this does is explicitly stringify the variables. In 99.9% of cases, it is a newbie error of some sort.
There are things that may happen as a side effect of this calling style:
my $foo = "1234";
sub bar { $_[0] =~ s/2/two/ }
print "Foo is $foo\n";
bar( "$foo" );
print "Foo is $foo\n";
bar( $foo );
print "Foo is $foo\n";
Here, stringification created a copy and passed that to the subroutine, circumventing Perl's pass by reference semantics. It's generally considered to be bad manners to munge calling variables, so you are probably okay.
You can also stringify an object or other value here. For example, undef stringifies to the empty string. Objects may specify arbitrary code to run when stringified. It is possible to have dual valued scalars that have distinct numerical and string values. This is a way to specify that you want the string form.
There is also one deep spooky thing that could be going on. If you are working with XS code that looks at the flags that are set on scalar arguments to a function, stringifying the scalar is a straightforward way to say to perl, "Make me a nice clean new string value" with only stringy flags and no numeric flags.
I am sure there are other odd exceptions to the 99.9% rule. These are a few. Before removing the quotes, take a second to check for weird crap like this. If you do happen upon a legit usage, please add a comment that identifies the quotes as a workable kludge, and give their reason for existence.
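To illustrate the point about objects running code when stringified, here is a small sketch using the core overload pragma. The Temperature class is made up for this example.

```perl
use strict;
use warnings;

package Temperature;    # hypothetical class, for illustration only
use overload '""' => sub { sprintf '%.1f C', $_[0]{deg} };

sub new {
    my ($class, $deg) = @_;
    return bless { deg => $deg }, $class;
}

package main;

my $t    = Temperature->new(21.5);
my $copy = "$t";    # the quotes call the overloaded stringification

print ref($t),    "\n";                         # Temperature
print ref($copy) ? "object\n" : "string\n";     # string
print "$copy\n";                                # 21.5 C
```

Here the quotes are doing real work: $copy is a plain string, while $t remains an object.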

In this case the double quotes are unnecessary. Moreover, using them is inefficient as this causes the original strings to be copied.
However, sometimes you may want to use this style to "stringify" an object. For example, URI objects support stringification:
my $uri = URI->new("http://www.perl.com");
my $str = "$uri";

I don't know why, but it's a pattern commonly used by newcomers to Perl. It's usually a waste (as it is in the snippet you posted), but I can think of two uses.
It has the effect of creating a new string with the same value as the original, and that could be useful in very rare circumstances.
In the following example, an explicit copy is done to protect $x from modification by the sub because the sub modifies its argument.
$ perl -E'
sub f { $_[0] =~ tr/a/A/; say $_[0]; }
my $x = "abc";
f($x);
say $x;
'
Abc
Abc
$ perl -E'
sub f { $_[0] =~ tr/a/A/; say $_[0]; }
my $x = "abc";
f("$x");
say $x;
'
Abc
abc
By virtue of creating a copy of the string, it stringifies objects. This could be useful when dealing with code that alters its behaviour based on whether its argument is a reference or not.
In the following example, explicit stringification is done because require handles references in @INC differently than strings.
$ perl -MPath::Class=file -E'
BEGIN { $lib = file($0)->dir; }
use lib $lib;
use DBI;
say "ok";
'
Can't locate object method "INC" via package "Path::Class::Dir" at -e line 4.
BEGIN failed--compilation aborted at -e line 4.
$ perl -MPath::Class=file -E'
BEGIN { $lib = file($0)->dir; }
use lib "$lib";
use DBI;
say "ok";
'
ok

In your case the quotes are completely useless. We can even say that it is wrong because it is not idiomatic, as others wrote.
However, quoting a variable may sometimes be necessary: it explicitly triggers stringification of the value of the variable. Stringification may give a different result for some values if those values are dual vars or if they are blessed values with overloaded stringification.
Here is an example with dual vars:
use 5.010;
use strict;
use Scalar::Util 'dualvar';
my $x = dualvar 1, "2";
say 0+$x;
say 0+"$x";
Output:
1
2

My theory has always been that it's people coming over from other languages with bad habits. It's not that they're thinking "I will use double quotes all the time", but that they're just not thinking!
I'll be honest and say that I used to fall into this trap because I came to Perl from Java, so the muscle memory was there, and just kept firing.
PerlCritic finally got me out of the habit!
It definitely makes your code less efficient, but if you're not thinking about whether or not you want your strings interpolated, you are very likely to make silly mistakes, so I'd go further and say that it's dangerous.

The good, the bad, and the ugly of lexical $_ in Perl 5.10+

Starting in Perl 5.10, it is now possible to lexically scope the context variable $_, either explicitly as my $_; or in a given / when construct.
Has anyone found good uses of the lexical $_? Does it make any constructs simpler / safer / faster?
What about situations that it makes more complicated? Has the lexical $_ introduced any bugs into your code? (since control structures that write to $_ will use the lexical version if it is in scope, this can change the behavior of the code if it contains any subroutine calls (due to loss of dynamic scope))
In the end, I'd like to construct a list that clarifies when to use $_ as a lexical, as a global, or when it doesn't matter at all.
NB: as of perl5-5.24 these experimental features are no longer part of perl.

IMO, one great thing to come out of lexical $_ is the new _ prototype symbol.
This allows you to specify a subroutine so that it will take one scalar or if none is provided it will grab $_.
So instead of writing:
sub foo {
my $arg = @_ ? shift : $_;
# Do stuff with $_
}
I can write:
sub foo(_) {
my $arg = shift;
# Do stuff with $_ or first arg.
}
Not a big change, but it's just that much simpler when I want that behavior. Boilerplate removal is a good thing.
Of course, this has the knock on effect of changing the prototypes of several builtins (eg chr), which may break some code.
Overall, I welcome lexical $_. It gives me a tool I can use to limit accidental data munging and bizarre interactions between functions. If I decide to use $_ in the body of a function, by lexicalizing it, I can be sure that whatever code I call, $_ won't be modified in calling code.
Dynamic scope is interesting, but for the most part I want lexical scoping. Add to this the complications around $_. I've heard dire warnings about the inadvisability of simply doing local $_;--that it is best to use for ( $foo ) { } instead. Lexicalized $_ gives me what I want 99 times out of 100 when I have localized $_ by whatever means. Lexical $_ makes a great convenience and readability feature more robust.
The bulk of my work has had to work with perl 5.8, so I haven't had the joy of playing with lexical $_ in larger projects. However, it feels like this will go a long way to make the use of $_ safer, which is a good thing.
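For reference, a minimal sketch of the _ prototype in use. The shout sub is a made-up example; the prototype itself is available from Perl 5.10 onwards and still works in current releases, even though lexical $_ is gone.

```perl
use strict;
use warnings;

# Hypothetical sub: takes one scalar, or falls back to $_ when called bare
sub shout(_) {
    my $arg = shift;
    return uc $arg;
}

my @out;
push @out, shout("explicit");    # argument given
for (qw/implicit topic/) {
    push @out, shout;            # no argument: the prototype supplies $_
}

print "@out\n";    # EXPLICIT IMPLICIT TOPIC
```

The sub has to be declared before the calls so that the prototype is known when they are compiled.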

I once found an issue (bug would be way too strong of a word) that came up when I was playing around with the Inline module. This simple script:
use strict qw(vars subs);
for ('function') {
$_->();
}
sub function {
require Inline;
Inline->bind(C => <<'__CODE__');
void foo()
{
}
__CODE__
}
fails with a Modification of a read-only value attempted at /usr/lib/perl5/site_perl/5.10/Inline/C.pm line 380. error message. Deep in the internals of the Inline module is a subroutine that wanted to modify $_, leading to the error message above.
Using
for my $_ ('function') { ...
or otherwise declaring my $_ is a viable workaround to this issue.
(The Inline module was patched to fix this particular issue).

[ Rationale: A short additional answer with a quick summary for perl newcomers that may be passing by. When searching for "perl lexical topic" one can end up here.]
By now (2015) I suppose it is common knowledge that the introduction of lexical topic (my $_ and some related features) led to some difficult to detect at the outset unintended behaviors and so was marked as experimental and then entered into a deprecation stage.
Partial summary of #RT119315:
One suggestion was for something like use feature 'lextopic'; to make use of a new
lexical topic variable:
$^_.
Another point made was that an "implicit name for the topicalizing operator ... other than $_" would work best when combined with explicitly lexical functions (e.g. lexical map or lmap). Whether these approaches would somehow make it possible to salvage given/when is not clear. In the afterlife of the experimental and deprecation phases perhaps something may end up living on in the river of CPAN.

Haven't had any problems here, although I tend to follow somewhat of a "Don't ask, don't tell" policy when it comes to Perl's magic. I.e. routines are not usually expected to rely on their peers modifying non-lexical data as a side effect, nor to let them do so.
I've tested code against various 5.8 and 5.10 versions of perl, while using a Camel book that describes 5.6 for occasional reference. Haven't had any problems. Most of my stuff was originally done for perl 5.8.8.

Why can't I set $LIST_SEPARATOR in Perl?

I want to set the LIST_SEPARATOR in perl, but all I get is this warning:
Name "main::LIST_SEPARATOR" used only once: possible typo at ldapflip.pl line 7.
Here is my program:
#!/usr/bin/perl -w
@vals;
push @vals, "a";
push @vals, "b";
$LIST_SEPARATOR='|';
print "@vals\n";
I am sure I am missing something obvious, but I don't see it.
Thanks

Only the mnemonic is available
$" = '|';
unless you
use English;
first.
As described in perlvar. Read the docs, please.
The following names have special meaning to Perl. Most punctuation names have reasonable mnemonics, or analogs in the shells. Nevertheless, if you wish to use long variable names, you need only say
use English;
at the top of your program. This aliases all the short names to the long names in the current package. Some even have medium names, generally borrowed from awk. In general, it's best to use the
use English '-no_match_vars';
invocation if you don't need $PREMATCH, $MATCH, or $POSTMATCH, as it avoids a certain performance hit with the use of regular expressions. See English.

perlvar is your friend:
• $LIST_SEPARATOR
• $"
This is like $, except that it applies to array and slice values interpolated into a double-quoted string (or similar interpreted string). Default is a space. (Mnemonic: obvious, I think.)
$LIST_SEPARATOR is only available if you use English; If you don't want to use English; in all your programs, use $" instead. Same variable, just with a more terse name.
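Putting the pieces together, a short sketch of the long name in use. It assumes the -no_match_vars import recommended in perlvar and limits the change with local so that the rest of the program keeps the default separator.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

my @vals = ('a', 'b');

my $joined;
{
    local $LIST_SEPARATOR = '|';    # same variable as $"
    $joined = "@vals";
}
my $default = "@vals";              # separator is back to a single space

print "$joined\n";     # a|b
print "$default\n";    # a b
```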

Slightly off-topic (the question is already well answered), but I don't get the attraction of English.
Cons:
A lot more typing
Names not more obvious (ie, I still have to look things up)
Pros:
?
I can see the benefit for other readers - especially people who don't know Perl very well at all. But in that case, if it's a question of making code more readable later, I would rather this:
{
local $" = '|'; # Set interpolated list separator to '|'
# fun stuff here...
}

You SHOULD use the strict pragma:
use strict;
You might also want to use the diagnostics pragma to get additional hints about the warnings (which you already have enabled with the -w flag):
use diagnostics;
