Can someone explain this perl snippet? - perl

for my $n (1, 2) {
sub_example();
}
sub sub_example {
my $bar = 1 if 1 == 2;
if ($bar) {
print "hahha, you see\n";
}
else {
$bar = 1;
}
}
So my question is why is $bar defined in the second loop iteration?

You're taking advantage of a strange bug (officially deprecated and slated to become a fatal error in 5.30), which involves declaring a lexical (my) variable in a statement with a statement-modifier conditional which is false.
The reason that this happens is because my $bar = 1; basically has two functions. It has a compile-time function, which is to reserve space in the lexical pad for a variable, and to associate $bar with that space; and it has a run-time function, which is to assign 1 to $bar whenever control flow reaches the my statement (a statement like my $foo; without the assignment also has these two effects, except it assigns undef at runtime).
When you govern the statement with a false conditional like my $bar = 1 if 1 == 2;, the compile-time function stays exactly the same, but the run-time function is prevented from running using the false condition, which means that whatever value was in that storage is re-used, without being assigned fresh each time the code reaches that point. This gives an effect almost, but not exactly, like using a state variable. It's a cool trick, but not recommended for any serious use, and as I've mentioned, it will be fatalized in an upcoming release of perl, which is another reason not to rely on it.

You should never use my with a statement modifier. The behaviour of such constructs is undefined (see perlsyn).
The current implementation doesn't clear the value assigned to the variable in the previous run of the subroutine, but there's no guarantee the behaviour will stay (in fact, it won't: see perldeprecation).

Related

Is the use of an uninitialized variable undefined behavior?

I don't know if "undefined behavior" means something in Perl but I would like to know if using not initialized variables in Perl may provoke unwanted behaviors.
Let's consider the following script:
use strict;
use warnings FATAL => 'all';
use P4;
my $P4;
sub get {
return $P4 if $P4;
# ...connection to Perforce server and initialization of $P4 with a P4 object...
return $P4;
}
sub disconnect {
$P4 = $P4->Disconnect() if $P4;
}
sub getFixes {
my $change = shift;
my $p4 = get();
return $p4->Run( "fixes", "-c", $change );
}
Here, the variable $P4, which is meant to store a P4 object after a connection to a Perforce server, is not initialized at the beginning of the script. However, whatever the function which is called first (get, disconnect or getFixes), the variable will be initialized before being used.
Is there any risk to do that? Should I explicitly initialized the $P4 variable at the beginning of the script?
Just a couple of straight-up answers to basic questions asked.
if "undefined behavior" means something in Perl
Yes, there is such a notion in Perl, and documentation warns of it (way less frequently than in C). See some examples in footnote †. On the other hand, at many places in documentation one finds a discussion ending with
... So don't do that.
It often comes up for things that would confuse the interpreter and could result in strange and possibly unpredictable behavior. These are sometimes typical "undefined behavior" even as they are not directly called as such.
The main question is of how uninitialized variables relate, per the title and
if using not initialized variables in Perl may provoke unwanted behaviors
This does not generally result in "undefined behavior" but it may of course lead to trouble and one mostly gets a warning for it. Unless the variable is legitimately getting initialized in such "use" of course. For example,
my $x;
my $z = $x + 3;
will draw a warning for the use of $x but not for $z (if warnings are on!). Note that this still succeeds as $x gets initialized to 0. (But in what is shown in the question the code will abort at that point, due to the FATAL.)
The code shown in the question seems fine in this sense, since as you say
the variable will be initialized before being used
Testing for truth against an uninitialized variable is fine since once it is declared it is equipped with the value undef, admissible (and false) in such tests.
See the first few paragraphs in Declarations in perlsyn for a summary of sorts on when one does or doesn't need a variable to be defined.
† A list of some behaviors specifically labeled as "undefined" in docs
Calling sort in scalar context
In list context, this sorts the LIST and returns the sorted list value. In scalar context, the behaviour of sort is undefined.
Length too great in truncate
The behavior is undefined if LENGTH is greater than the length of the file.
Using flags for sysopen which are incompatible (nonsensical)
The behavior of O_TRUNC with O_RDONLY is undefined.
Sending signals to a process-list with kill, where one can use negative signal or process number to send to a process group
If both the SIGNAL and the PROCESS are negative, the results are undefined. A warning may be produced in a future version.
From Auto-increment and Auto-decrement (perlop)
... modifying a variable twice in the same statement will lead to undefined behavior.
Iterating with each, tricky as it may be anyway, isn't well behaved if hash is inserted into
If you add or delete a hash's elements while iterating over it, the effect on the iterator is unspecified; for example, entries may be skipped or duplicated--so don't do that. It is always safe to delete the item most recently returned by each, ...
This draws a runtime warning (F), described in perldiag
Use of each() on hash after insertion without resetting hash iterator results in undefined behavior.
Statement modifier (perlsyn) used on my
The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ...) is undefined.
Some of these seem a little underwhelming (predictable), given what UB can mean. Thanks to ikegami for comments. A part of this list is found in this question.
Pried from docs current at the time of this posting (v5.32.1)
A variable declared with my is initialized with undef. There is no undefined behaviour here.
This is documented in perldoc persub:
If no initializer is given for a particular variable, it is created with the undefined value.
However, the curious construct my $x if $condition does have undefined behaviour. Never do that.
my initializes scalars to undef, and arrays and hashes to empty.
Your code is fine, though I would take a different approach to destruction.
Option 1: Provide destructor through wrapping
use Object::Destroyer qw( );
use P4 qw( );
my $P4;
sub get {
return $P4 ||= do {
my $p4 = P4->new();
$p4->SetClient(...);
$p4->SetPort(...);
$p4->SetPassword(...);
$p4->Connect()
or die("Failed to connect to Perforce Server" );
Object::Destroyer->new($p4, 'Disconnect')
};
}
# No disconnect sub
Option 2: Provide destructor through monkey-patching
use P4 qw( );
BEGIN {
my $old_DESTROY = P4->can('DESTROY');
my $new_DESTROY = sub {
my $self = shift;
$self->Disconnect();
$old_DESTROY->($self) if $old_DESTROY;
};
no warnings qw( redefined );
*P4::DESTROY = $new_DESTROY;
}
my $P4;
sub get {
return $P4 ||= do {
my $p4 = P4->new();
$p4->SetClient(...);
$p4->SetPort(...);
$p4->SetPassword(...);
$p4->Connect()
or die("Failed to connect to Perforce Server" );
$p4
};
}
# No disconnect sub

How to avoid global variable declaration when using Perl's dynamic scoping?

I am trying to write a perl script that calls a function written somewhere else (by someone else) which manipulates some of the variables in my script's scope. Let's say the script is main.pl and the function is there in funcs.pm. My main.pl looks like this:
use warnings;
use strict;
package plshelp;
use funcs;
my $var = 3;
print "$var\n"; # <--- prints 3
{ # New scope somehow prevents visibility of $pointer outside
local our $pointer = \$var;
change();
}
print "$var\n"; # <--- Ideally should print whatever funcs.pm wanted
For some reason, using local our $pointer; prevents visibility of $pointer outside the scope. But if I just use our $pointer;, the variable can be seen outside the scope in main.pl using $plshelp::pointer (but not in funcs.pm, so it would be useless anyway). As a side-note, could someone please explain this?
funcs.pm looks something like this:
use warnings;
use strict;
package plshelp;
sub change
{
${$pointer} = 4;
}
I expected this to change the value of $var and print 4 when the main script was run. But I get a compile error saying $pointer wasn't declared. This error can be removed by adding our $pointer; at the top of change in funcs.pm, but that would create an unnecessary global variable that is visible everywhere. We can also remove this error by removing the use strict;, but that seems like a bad idea. We can also get it to work by using $plshelp::pointer in funcs.pm, but the person writing funcs.pm doesn't want to do that.
Is there a good way to achieve this functionality of letting funcs.pm manipulate variables in my scope without declaring global variables? If we were going for global variables anyway, I guess I don't need to use dynamic scoping at all.
Let's just say it's not possible to pass arguments to the function for some reason.
Update
It seems that local our isn't doing any "special" as far as preventing visibility is concerned. From perldoc:
This means that when use strict 'vars' is in effect, our lets you use a package variable without qualifying it with the package name, but only within the lexical scope of the our declaration. This applies immediately--even within the same statement.
and
This works even if the package variable has not been used before, as package variables spring into existence when first used.
So this means that $pointer "exists" even after we leave the curly braces. Just that we have to refer to it using $plshelp::pointer instead of just $pointer. But since we used local before initializing $pointer, it is still undefined outside the scope (although it is still "declared", whatever that means). A clearer way to write this would be (local (our $pointer)) = \$var;. Here, our $pointer "declares" $pointer and returns $pointer as well. We now apply local on this returned value, and this operation returns $pointer again which we are assigning to \$var.
But this still leaves the main question of whether there is a good way of achieving the required functionality unanswered.
Let's be clear about how global variables with our work and why they have to be declared: There's a difference between the storage of a global variable, and visibility of its unqualified name. Under use strict, undefined variable names will not implicitly refer to a global variable.
We can always access the global variable with its fully qualified name, e.g. $Foo::bar.
If a global variable in the current package already exists at compile time and is marked as an imported variable, we can access it with an unqualified name, e.g. $bar. If a Foo package is written appropriately, we could say use Foo qw($bar); say $bar where $bar is now a global variable in our package.
With our $foo, we create a global variable in the current package if that variable doesn't already exist. The name of the variable is also made available in the current lexical scope, just like the variable of a my declaration.
The local operator does not create a variable. Instead, it saves the current value of a global variable and clears that variable. At the end of the current scope, the old value is restored. You can interpret each global variable name as a stack of values. With local you can add (and remove) values on the stack.
So while local can dynamically scope a value, it does not create a dynamically scoped variable name.
By carefully considering which code is compiled when, it becomes clear why your example doesn't currently work:
In your main script, you load the module funcs. The use statement is executed in the BEGIN phase, i.e. during parsing.
use warnings;
use strict;
package plshelp;
use funcs;
The funcs module is compiled:
use warnings;
use strict;
package plshelp;
sub change
{
${$pointer} = 4;
}
At this point, no $pointer variable is in lexical scope and no imported global $pointer variable exists. Therefore you get an error. This compile-time observation is unrelated to the existence of a $pointer variable at runtime.
The canonical way to fix this error is to declare an our $pointer variable name in the scope of the sub change:
sub change {
our $pointer;
${$pointer} = 4;
}
Note that the global variable will exist anyway, this just brings the name into scope for use as an unqualified variable name.
Just because you can use global variables doesn't mean that you should. There are two issues with them:
On a design level, global variables do not declare a clear interface. By using a fully qualified name you can simply access a variable without any checks. They do not provide any encapsulation. This makes for fragile software and weird action-at-a-distance.
On an implementation level, global variables are simply less efficient than lexical variables. I have never actually seen this matter, but think of the cycles!
Also, global variables are global variables: They can only have one value at a time! Scoping the value with local can help to avoid this in some cases, but there can still be conflicts in complex systems where two modules want to set the same global variable to different values and those modules call into each other.
The only good uses for global variables I have seen are to provide additional context to a callback that cannot take extra parameters, roughly similar to your approach. But where possible it is always better to pass the context as a parameter. Subroutine arguments are already effectively dynamically scoped:
sub change {
my ($pointer) = #_;
${$pointer} = 4;
}
...
my $var = 3;
change(\$var);
If there is a lot of context it can be come cumbersome to pass all those references: change(\$foo, \$bar, \$baz, \#something_else, \%even_more, ...). It could then make sense to bundle that context into an object, which can then be manipulated in a more controlled manner. Manipulating local or global variables is not always the best design.
There's too much wrong with your code to just fix it
You've used package plshelp in both the main script and the module, even though the main entry point is in main.pl and your module is in funcs.pm. That's just irresponsible. Did you imagine that the package statement was solely for advertising for help and it didn't matter what you put in there?
Your post doesn't say what is wrong with what you have written, but it's surprising that it doesn't throw an error.
Here's something close that does what you seem to expect. I can't really explain things as your own code is so far from working
Functions.pm
package Functions;
use strict;
use warnings;
use Exporter 'import';
our #EXPORT_OK = 'change';
sub change {
my ($ref) = #_;
$$ref = 4;
}
main.pl
use strict;
use warnings 'all';
use Functions 'change';
my $var = 44;
print "$var\n";
change(\$var);
print "$var\n";
output
44
4

Perl: safe eval?

I'm curious if there is any good information on performing restricted evals.
Looking at the docs, there is a use Safe that has a reval method, but I'm not sure how safe that is.
What I want to do is to be able to pass various conditional statements as a string to a function w/o the source abusing the eval.
For instance:
sub foo {
my $stmt = shift;
my $a = 3;
say eval($stmt)?"correct":"wrong") , "($stmt)";
}
foo( q{1 == $a} );
foo( q{$a =~ /3/ );
foo( q{(sub {return 3})->() eq 3} );
Would use Safe be good for this? All I need to be able to do is comparisons, no disk access, or variable manipulations.
As indicated in the docs, eval($stmt) evaluates $stmt "in the lexical context of the current Perl program, so that any variable settings or subroutine and format definitions remain afterwards." This is useful for delaying execution of $stmt until runtime.
If you reval($stmt) in a Safe compartment, essentially the same thing happens, the statement is eval'd, but it's eval'd in a new lexical context which can only see the Safe compartment's namespace, and in which you can control what sorts of operators are allowed.
So, yes, if you declare a Safe compartment and reval($stmt) in that compartment, then (a) execution of $stmt won't change the functioning of your program without your consent (I guess this is what you mean by "w/o the source abusing the eval"). And, (b) yes, $stmt won't be able to access the disk without your consent if you reval($stmt). In (a) "your consent" requires explicitly playing with the symbol table, and in (b) "your consent" would require specifying a set of op codes that would allow disk access.
I'm not really sure how safe this is either. However, you can see it in action if you set it up and step through it in the debugger.

Why shouldn't I use UNIVERSAL::isa?

According to this
http://perldoc.perl.org/UNIVERSAL.html
I shouldn't use UNIVERSAL::isa() and should instead use $obj->isa() or CLASS->isa().
This means that to find out if something is a reference in the first place and then is reference to this class I have to do
eval { $poss->isa("Class") }
and check $# and all that gumph, or else
use Scalar::Util 'blessed';
blessed $ref && $ref->isa($class);
My question is why? What's wrong with UNIVERSAL::isa called like that? It's much cleaner for things like:
my $self = shift if UNIVERSAL::isa($_[0], __PACKAGE__)
To see whether this function is being called on the object or not. And is there a nice clean alternative that doesn't get cumbersome with ampersands and potentially long lines?
The primary problem is that if you call UNIVERSAL::isa directly, you are bypassing any classes that have overloaded isa. If those classes rely on the overloaded behavior (which they probably do or else they would not have overridden it), then this is a problem. If you invoke isa directly on your blessed object, then the correct isa method will be called in either case (overloaded if it exists, UNIVERSAL:: if not).
The second problem is that UNIVERSAL::isa will only perform the test you want on a blessed reference just like every other use of isa. It has different behavior for non-blessed references and simple scalars. So your example that doesn't check whether $ref is blessed is not doing the right thing, you're ignoring an error condition and using UNIVERSAL's alternate behavior. In certain circumstances this can cause subtle errors (for example, if your variable contains the name of a class).
Consider:
use CGI;
my $a = CGI->new();
my $b = "CGI";
print UNIVERSAL::isa($a,"CGI"); # prints 1, $a is a CGI object.
print UNIVERSAL::isa($b,"CGI"); # Also prints 1!! Uh-oh!!
So, in summary, don't use UNIVERSAL::isa... Do the extra error check and invoke isa on your object directly.
See the docs for UNIVERSAL::isa and UNIVERSAL::can for why you shouldn't do it.
In a nutshell, there are important modules with a genuine need to override 'isa' (such as Test::MockObject), and if you call it as a function, you break this.
I have to say, my $self = shift if UNIVERSAL::isa($_[0], __PACKAGE__) doesn't look terribly clean to me - anti-Perl advocates would be complaining about line noise. :)
To directly answer your question, the answer is at the bottom of the page you linked to, namely that if a package defines an isa method, then calling UNIVERSAL::isa directly will not call the package isa method. This is very unintuitive behaviour from an object-orientation point of view.
The rest of this post is just more questions about why you're doing this in the first place.
In code like the above, in what cases would that specific isa test fail? i.e., if it's a method, in which case would the first argument not be the package class or an instance thereof?
I ask this because I wonder if there is a legitimate reason why you would want to test whether the first argument is an object in the first place. i.e., are you just trying to catch people saying FooBar::method instead of FooBar->method or $foobar->method? I guess Perl isn't designed for that sort of coddling, and if people mistakenly use FooBar::method they'll find out soon enough.
Your mileage may vary.
Everyone else has told you why you don't want to use UNIVERSAL::isa, because it breaks when things overload isa. If they've gone to all the habit of overloading that very special method, you certainly want to respect it. Sure, you could do this by writing:
if (eval { $foo->isa("thing") }) {
# Do thingish things
}
because eval guarantees to return false if it throws an exception, and the last value otherwise. But that looks awful, and you shouldn't need to write your code in funny ways because the language wants you to. What we really want is to write just:
if ( $foo->isa("thing") ) {
# Do thingish things
}
To do that, we'd have to make sure that $foo is always an object. But $foo could be a string, a number, a reference, an undefined value, or all sorts of weird stuff. What a shame Perl can't make everything a first class object.
Oh, wait, it can...
use autobox; # Everything is now a first class object.
use CGI; # Because I know you have it installed.
my $x = 5;
my $y = CGI->new;
print "\$x is a CGI object\n" if $x->isa('CGI'); # This isn't printed.
print "\$y is a CGI object\n" if $y->isa('CGI'); # This is!
You can grab autobox from the CPAN. You can also use it with lexical scope, so everything can be a first class object just for the files or blocks where you want to use ->isa() without all the extra headaches. It also does a lot more than what I've covered in this simple example.
Assuming your example of what you want to be able to do is within an object method, you're being unnecessarily paranoid. The first passed item will always be either a reference to an object of the appropriate class (or a subclass) or it will be the name of the class (or a subclass). It will never be a reference of any other type, unless the method has been deliberately called as a function. You can, therefore, safely just use ref to distinguish between the two cases.
if (ref $_[0]) {
my $self = shift;
# called on instance, so do instancey things
} else {
my $class = shift;
# called as a class/static method, so do classy things
}
Right. It does a wrong thing for classes that overload isa. Just use the following idiom:
if (eval { $obj->isa($class) }) {
It is easily understood and commonly accepted.
Update for 2020: Perl v5.32 has the class infix operator, isa, which handles any sort of thing on the lefthand side. If the $something is not an object, you get back false with no blowup.
use v5.32;
if( $something isa 'Animal' ) { ... }

Is there an andand for Perl?

I'd rather do this:
say $shop->ShopperDueDate->andand->day_name();
vs. this:
say $shop->ShopperDueDate->day_name() if $shop->ShopperDueDate;
Any ideas?
(This idea is inspired by the Ruby andand extension.)
(Actually it is inspired by the Groovy language, but most people don't know that ;-)
update: I think that both maybe() and eval {} are good solutions. This isn't ruby so I can't expect to read all of the methods/functions from left to right anyway, so maybe could certainly be a help. And then of course eval really is the perl way to do it.
You can use Perl's eval statement to catch exceptions, including those from trying to call methods on an undefined argument:
eval {
say $shop->ShopperDueDate->day_name();
};
Since eval returns the last statement evaluated, or undef on failure, you can record the day name in a variable like so:
my $day_name = eval { $shop->ShopperDueDate->day_name(); };
If you actually wish to inspect the exception, you can look in the special variable $#. This will usually be a simple string for Perl's built-in exceptions, but may be a full exception object if the exception originates from autodie or other code that uses object exceptions.
eval {
say $shop->ShopperDueDate->day_name();
};
if ($#) {
say "The error was: $#";
}
It's also possible to string together a sequence of commands using an eval block. The following will only check to see if it's a weekend provided that we haven't had any exceptions thrown when looking up $day_name.
eval {
my $day_name = $shop->ShopperDueDate->day_name();
if ($day_name ~~ [ 'Saturday', 'Sunday' ] ) {
say "I like weekends";
}
};
You can think of eval as being the same as try from other languages; indeed, if you're using the Error module from the CPAN then you can even spell it try. It's also worth noting that the block form of eval (which I've been demonstrating above) doesn't come with performance penalties, and is compiled along with the rest of your code. The string form of eval (which I have not shown) is a different beast entirely, and should be used sparingly, if at all.
eval is technically considered to be a statement in Perl, and hence is one of the few places where you'll see a semi-colon at the end of a block. It's easy to forget these if you don't use eval regularly.
Paul
I assume your second example should better be:
say $shop->ShopperDueDate ? $shop->ShopperDueDate->day_name() : undef;
instead of what it actually says.
Anyway, you can't do it with exactly that syntax without a source filter because undef isn't an object, but an unary operator, so it can't have any methods.
Instead, consider this:
package noop;
our $maybe = bless [], 'noop';
sub AUTOLOAD { undef };
This $noop::maybe is an object; all of its methods return undef though. Elsewhere, you'll have a regular function like this:
sub maybe { $_[0] || $noop::maybe; }
Then you can write this:
say maybe($shop->ShopperDueDate)->day_name()
This works because "maybe" returns returns its argument if true, otherwise it returns our $noop::maybe object which has methods which always return undef.
EDIT: Correction! the ->andand-> syntax can be had without a source filter, by using an XS that mucks about with Perl's internals. Leon Timmermans made an implementation that takes this route. It replaces the undef() function globally, so it's likely to be a lot slower than my method.
I think I've just written it. I've just uploaded it to CPAN, you can find it here.
Something like (untested):
use Object::Generic::False;
sub UNIVERSAL::andand { $_[0] || Object::Generic::False->false }
UNIVERSAL is automatically a subclass of all other classes, so this provides andand for all objects.
(Obviously creating methods in UNIVERSAL has the potential for conflicts or surprising action at a distance,
so it shouldn't be used carelessly.) If the object upon which the andand method is called is true, return it;
otherwise return a generic false object which returns itself for any method call used upon it.