How do I detect the 'strict' pragma within a module? - perl

Normally when I build a data structure in Perl, I end up having to declare it from %leaf to %root, so I have been tinkering with a module that would allow me to build up from
$seed to $seed->trunk->branch->leaf.
It's not difficult with AUTOLOAD and new subroutines. My question for SO is how do I detect if the 'strict' pragma is in use, so that the module runs in a different mode that would require the variables to be "declared" before usage so I don't accidently assign a value to $seed->drunk when I'm using strict -- assume that the module is called branch and this is valid syntax for module usage
$seed->declare('trunk');
$seed->trunk(new branch);
$seed->trunk->declare('leaf');
$seed->trunk->leaf("value");
How do I detect if the strict pragma is in effect in the calling program from the module?
It may be that this is not possible -- in which case I'd have to use a static variable to handle module independent pragmas.
EDITED / POSTSCRIPT:
I coded out the initial version that doesn't check for 'strictness' or implement a 'declare' subroutine and realized that the autoloader would not provide a simple enough user syntax if it operated by reference, so I wrote it to check for the first parameter and assign the value passed to an element in the object's referred hash table, otherwise if there was no parameter it would return the value of the element specified.
So I am posting the code for the branch module to satisfy your curiosity. Mind you, I haven't implemented a check for strictness, yet.
package branch;
sub new
{
my $type = shift;
my $self = { };
bless $self, $type;
return $self;
}
sub DESTROY
{
my $self = shift;
%$self = undef;
}
sub AUTOLOAD
{
my $self = shift;
my $value = shift;
my $sub = $AUTOLOAD;
my ($type, $PROGRAM) = ($sub =~ /(.*)::(.*)/);
if( $value ne undef )
{
$$self{$PROGRAM} = $value;
return $value;
}
return $$self{$PROGRAM};
}
1;

Well the first thing would be, strict what? Strict has three subpragmas, with their own behaviors and bits to check. use strict 'refs' doesn't allow you to dereference strings; use strict 'vars' doesn't allow you to access global variables in an unqualified way, and use strict 'subs' disables barewords outside of a few situations. use strict equates to all three, but none of them really seems close enough to what you're asking for to be worth piggy-backing on.
So to answer your question somewhat directly: element [8] in the list returned by caller($i) returns the compile hint bits in effect for the $ith level caller. If you peek in strict.pm you can see the bits that each subpragma sets and check for them at the caller level that corresponds to the code that's actually calling your method.
But, returning to my original point, you probably shouldn't, because that's not what strict is about. You should either accept an option on your objects' constructor that decides whether they should behave strictly or not, or if you really want a lexical pragma instead of something that follows your objects around, you should write your own using the information in perlpragma as a tutorial. All perls since 5.10 support arbitrary user-defined pragmas using the %^H hints hash which is exposed as element [10] of the caller info.

You seem to be confused about the scoping of the strict pragma.
If a module uses strict, this doesn't enforce anything on the user of the module. Even if you want to extend the package, by subclassing or monkey-patching in additional methods.
use strict only applies to the file it is used within. (Or if it's used within a pair of curly braces, it only applies up until the closing brace.) So if you're extending a package, just do it within a separate file, and none of the pragmas applied in the original module will apply to your code.
That said, it's rarely a good idea to not use strict. There are occasional tasks where it might be useful to disable it in a small scope, but the problem you are describing doesn't seem to be one of them.
In particular, if you're building a deeply nested structure, only don't need to declare every level. Demonstration:
use strict;
use warnings;
use Data::Dumper;
my $root;
$root->{trunk}{branch}{leaf} = 42;
print Dumper($root);

Related

Is the use of an uninitialized variable undefined behavior?

I don't know if "undefined behavior" means something in Perl but I would like to know if using not initialized variables in Perl may provoke unwanted behaviors.
Let's consider the following script:
use strict;
use warnings FATAL => 'all';
use P4;
my $P4;
sub get {
return $P4 if $P4;
# ...connection to Perforce server and initialization of $P4 with a P4 object...
return $P4;
}
sub disconnect {
$P4 = $P4->Disconnect() if $P4;
}
sub getFixes {
my $change = shift;
my $p4 = get();
return $p4->Run( "fixes", "-c", $change );
}
Here, the variable $P4, which is meant to store a P4 object after a connection to a Perforce server, is not initialized at the beginning of the script. However, whatever the function which is called first (get, disconnect or getFixes), the variable will be initialized before being used.
Is there any risk to do that? Should I explicitly initialized the $P4 variable at the beginning of the script?
Just a couple of straight-up answers to basic questions asked.
if "undefined behavior" means something in Perl
Yes, there is such a notion in Perl, and documentation warns of it (way less frequently than in C). See some examples in footnote †. On the other hand, at many places in documentation one finds a discussion ending with
... So don't do that.
It often comes up for things that would confuse the interpreter and could result in strange and possibly unpredictable behavior. These are sometimes typical "undefined behavior" even as they are not directly called as such.
The main question is of how uninitialized variables relate, per the title and
if using not initialized variables in Perl may provoke unwanted behaviors
This does not generally result in "undefined behavior" but it may of course lead to trouble and one mostly gets a warning for it. Unless the variable is legitimately getting initialized in such "use" of course. For example,
my $x;
my $z = $x + 3;
will draw a warning for the use of $x but not for $z (if warnings are on!). Note that this still succeeds as $x gets initialized to 0. (But in what is shown in the question the code will abort at that point, due to the FATAL.)
The code shown in the question seems fine in this sense, since as you say
the variable will be initialized before being used
Testing for truth against an uninitialized variable is fine since once it is declared it is equipped with the value undef, admissible (and false) in such tests.
See the first few paragraphs in Declarations in perlsyn for a summary of sorts on when one does or doesn't need a variable to be defined.
† A list of some behaviors specifically labeled as "undefined" in docs
Calling sort in scalar context
In list context, this sorts the LIST and returns the sorted list value. In scalar context, the behaviour of sort is undefined.
Length too great in truncate
The behavior is undefined if LENGTH is greater than the length of the file.
Using flags for sysopen which are incompatible (nonsensical)
The behavior of O_TRUNC with O_RDONLY is undefined.
Sending signals to a process-list with kill, where one can use negative signal or process number to send to a process group
If both the SIGNAL and the PROCESS are negative, the results are undefined. A warning may be produced in a future version.
From Auto-increment and Auto-decrement (perlop)
... modifying a variable twice in the same statement will lead to undefined behavior.
Iterating with each, tricky as it may be anyway, isn't well behaved if hash is inserted into
If you add or delete a hash's elements while iterating over it, the effect on the iterator is unspecified; for example, entries may be skipped or duplicated--so don't do that. It is always safe to delete the item most recently returned by each, ...
This draws a runtime warning (F), described in perldiag
Use of each() on hash after insertion without resetting hash iterator results in undefined behavior.
Statement modifier (perlsyn) used on my
The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ...) is undefined.
Some of these seem a little underwhelming (predictable), given what UB can mean. Thanks to ikegami for comments. A part of this list is found in this question.
Pried from docs current at the time of this posting (v5.32.1)
A variable declared with my is initialized with undef. There is no undefined behaviour here.
This is documented in perldoc persub:
If no initializer is given for a particular variable, it is created with the undefined value.
However, the curious construct my $x if $condition does have undefined behaviour. Never do that.
my initializes scalars to undef, and arrays and hashes to empty.
Your code is fine, though I would take a different approach to destruction.
Option 1: Provide destructor through wrapping
use Object::Destroyer qw( );
use P4 qw( );
my $P4;
sub get {
return $P4 ||= do {
my $p4 = P4->new();
$p4->SetClient(...);
$p4->SetPort(...);
$p4->SetPassword(...);
$p4->Connect()
or die("Failed to connect to Perforce Server" );
Object::Destroyer->new($p4, 'Disconnect')
};
}
# No disconnect sub
Option 2: Provide destructor through monkey-patching
use P4 qw( );
BEGIN {
my $old_DESTROY = P4->can('DESTROY');
my $new_DESTROY = sub {
my $self = shift;
$self->Disconnect();
$old_DESTROY->($self) if $old_DESTROY;
};
no warnings qw( redefined );
*P4::DESTROY = $new_DESTROY;
}
my $P4;
sub get {
return $P4 ||= do {
my $p4 = P4->new();
$p4->SetClient(...);
$p4->SetPort(...);
$p4->SetPassword(...);
$p4->Connect()
or die("Failed to connect to Perforce Server" );
$p4
};
}
# No disconnect sub

Make perl look ahead for sub prototypes

Perl is a bit too forgiving: If you pass extra arguments to subs they are simply ignored.
To avoid this I would like to use prototypes to make sure each sub is given the correct amount of arguments.
This works OK as long as I declare the prototype before using it:
sub mysub($);
sub mysub2($);
mysub(8);
mysub(8,2); # Complain here
sub mysub($) {
mysub2($#);
}
sub mysub2($) {
if($_[0] == 1) {
mysub(2);
}
print $#;
}
But I really hate splitting this up. I would much rather that Perl read the full file to see if there are declarations further down. So I would like to write something like:
use prototypes_further_down; # This does not work
mysub(8);
mysub(8,2); # Complain here
sub mysub($) {
mysub2($#);
}
sub mysub2($) {
if($_[0] == 1) {
mysub(2);
}
print $#;
}
Can I somehow ask Perl to do that?
To avoid this I would like to use prototypes to make sure each sub is given the correct amount of arguments.
No, you would not. Despite the similarity in name, Perl prototypes are not your father's function prototypes. Quoting The Problem with Prototypes (emphasis mine),
Perl 5's prototypes serve two purposes. First, they're hints to the parser to change the way it parses subroutines and their arguments. Second, they change the way Perl 5 handles arguments to those subroutines when it executes them. A common novice mistake is to assume that they serve the same language purpose as subroutine signatures in other languages. This is not true.
In addition to them not having the same intended purpose, bypassing prototypes is trivial, so they provide no actual protection against someone who deliberately wishes to call your code in (what you believe to be) the "wrong" way. As perldoc perlsub tells us,
The function declaration must be visible at compile time. The prototype affects only interpretation of new-style calls to the function, where new-style is defined as not using the & character. In other words, if you call it like a built-in function, then it behaves like a built-in function. If you call it like an old-fashioned subroutine, then it behaves like an old-fashioned subroutine. It naturally falls out from this rule that prototypes have no influence on subroutine references like \&foo or on indirect subroutine calls like &{$subref} or $subref->().
Method calls are not influenced by prototypes either, because the function to be called is indeterminate at compile time, since the exact code called depends on inheritance.
Even if you could get it to complain about mysub(8,2), &mysub(8,2) or $subref = \&mysub; $subref->(8,2) or (if mysub were an object method inside package MyModule) $o = MyModule->new; $o->mysub(8,2) would work without complaint.
If you want to validate how your subs are called using core Perl (prior to 5.20), then you need to perform the validation yourself within the body of the sub. Perl 5.20 and newer have a ("experimental" at the time of this writing) Signatures extension to sub declarations which may work for your purposes, but I've never used it myself, so I can't speak to its effectiveness or limitations. There are also many CPAN modules available for handling this sort of thing, which you can find by doing searches for things like "signature" or "prototype".
Regardless of your chosen approach, you will not be able to get compile-time errors about incorrect function signatures unless you define those signatures before they are used. In cases such as your example, where two subs mutually call each other, this can be accomplished by using a forward declaration to establish its signature in advance:
sub mysub($foo); # Forward declaration
sub mysub2 { mysub(8) }
sub mysub { mysub2('infinite loops ftw!') } # Complete version of the code

How does this call to a subroutine in a Perl module work?

I recently saw some Perl code that confused me. I took out all of the extra parts to see how it was working, but I still don't understand why it works.
Basically, I created this dummy "module" (TTT.pm):
use strict;
use warnings;
package TTT;
sub new {
my $class = shift;
return bless {'Test' => 'Test'}, $class;
}
sub acquire {
my $tt = new TTT();
return $tt;
}
1;
Then I created this script to use the module (ttt.pl):
#!/usr/bin/perl
use strict;
use warnings;
use TTT;
our $VERSION = 1;
my $tt = acquire TTT;
print $tt->{Test};
The line that confuses me, that I thought would not work, is:
my $tt = acquire TTT;
I thought it would not work since the "acquire" sub was never exported. But it does work.
I was confused by the "TTT" after the call to acquire, so I removed that, leaving the line like this:
my $tt = acquire;
And it complained of a bareword, like I thought it would. I added parens, like this:
my $tt = acquire();
And it complained that there wasn't a main::acquire, like I thought it would.
I'm used to the subroutines being available to the object, or subroutines being exported, but I've never seen a subroutine get called with the package name on the end. I don't even know how to search for this on Google.
So my question is, How does adding the package name after the subroutine call work? I've never seen anything like that before, and it probably isn't good practice, but can someone explain what Perl is doing?
Thanks!
You are using the indirect object syntax that Perl allows (but in modern code is discouraged). Basically, if a name is not predeclared, it can be placed in front of an object (or class name) separated with a space.
So the line acquire TTT actually means TTT->acquire. If you actually had a subroutine named acquire in scope, it would instead be interpreted as aquire(TTT) which is can lead to ambiguity (hence why it is discouraged).
You should also update the new TTT(); line in the method to read TTT->new;
It's the indirect object syntax for method calls, which lets you put the method name before the object name.
As the documentation there shows, it's best avoided because it's unwieldy and it can break in unpredictable ways, for example if there is an imported or defined subroutine named acquire — but it used to be more common than it is today, and so you will find it pretty often in old code and docs.

Can Perl method calls be intercepted?

Can you intercept a method call in Perl, do something with the arguments, and then execute it?
Yes, you can intercept Perl subroutine calls. I have an entire chapter about that sort of thing in Mastering Perl. Check out the Hook::LexWrap module, which lets you do it without going through all of the details. Perl's methods are just subroutines.
You can also create a subclass and override the method you want to catch. That's a slightly better way to do it because that's the way object-oriented programming wants you do to it. However, sometimes people write code that doesn't allow you to do this properly. There's more about that in Mastering Perl too.
To describe briefly, Perl has the aptitude to modify symbol table. You call a subroutine (method) via symbol table of the package, to which the method belongs. If you modify the symbol table (and this is not considered very dirty), you can substitute most method calls with calling the other methods you specify. This demonstrates the approach:
# The subroutine we'll interrupt calls to
sub call_me
{
print shift,"\n";
}
# Intercepting factory
sub aspectate
{
my $callee = shift;
my $value = shift;
return sub { $callee->($value + shift); };
}
my $aspectated_call_me = aspectate \&call_me, 100;
# Rewrite symbol table of main package (lasts to the end of the block).
# Replace "main" with the name of the package (class) you're intercepting
local *main::call_me = $aspectated_call_me;
# Voila! Prints 105!
call_me(5);
This also shows that, once someone takes reference of the subroutine and calls it via the reference, you can no longer influence such calls.
I am pretty sure there are frameworks to do aspectation in perl, but this, I hope, demonstrates the approach.
This looks like a job for Moose! Moose is an object system for Perl that can do that and lots more. The docs will do a much better job at explaining than I can, but what you'll likely want is a Method Modifier, specifically before.
You can, and Pavel describes a good way to do it, but you should probably elaborate as to why you are wanting to do this in the first place.
If you're looking for advanced ways of intercepting calls to arbitrary subroutines, then fiddling with symbol tables will work for you, but if you want to be adding functionality to functions perhaps exported to the namespace you are currently working in, then you might need to know of ways to call functions that exist in other namespaces.
Data::Dumper, for example, normally exports the function 'Dumper' to the calling namespace, but you can override or disable that and provide your own Dumper function which then calls the original by way of the fully qualified name.
e.g.
use Data::Dumper;
sub Dumper {
warn 'Dumping variables';
print Data::Dumper::Dumper(#_);
}
my $foo = {
bar => 'barval',
};
Dumper($foo);
Again, this is an alternate solution that may be more appropriate depending on the original problem. A lot of fun can be had when playing with the symbol table, but it may be overkill and could lead to hard to maintain code if you don't need it.
Yes.
You need three things:
The arguments to a call are in #_ which is just another dynamically scoped variable.
Then, goto supports a reference-sub argument which preserves the current #_ but makes another (tail) function call.
Finally local can be used to create lexically scoped global variables, and the symbol tables are buried in %::.
So you've got:
sub foo {
my($x,$y)=(#_);
print "$x / $y = " . ((0.0+$x)/$y)."\n";
}
sub doit {
foo(3,4);
}
doit();
which of course prints out:
3 / 4 = 0.75
We can replace foo using local and go:
my $oldfoo = \&foo;
local *foo = sub { (#_)=($_[1], $_[0]); goto $oldfoo; };
doit();
And now we get:
4 / 3 = 1.33333333333333
If you wanted to modify *foo without using its name, and you didn't want to use eval, then you could modify it by manipulating %::, for example:
$::{"foo"} = sub { (#_)=($_[0], 1); goto $oldfoo; };
doit();
And now we get:
3 / 1 = 3

Why shouldn't I use UNIVERSAL::isa?

According to this
http://perldoc.perl.org/UNIVERSAL.html
I shouldn't use UNIVERSAL::isa() and should instead use $obj->isa() or CLASS->isa().
This means that to find out if something is a reference in the first place and then is reference to this class I have to do
eval { $poss->isa("Class") }
and check $# and all that gumph, or else
use Scalar::Util 'blessed';
blessed $ref && $ref->isa($class);
My question is why? What's wrong with UNIVERSAL::isa called like that? It's much cleaner for things like:
my $self = shift if UNIVERSAL::isa($_[0], __PACKAGE__)
To see whether this function is being called on the object or not. And is there a nice clean alternative that doesn't get cumbersome with ampersands and potentially long lines?
The primary problem is that if you call UNIVERSAL::isa directly, you are bypassing any classes that have overloaded isa. If those classes rely on the overloaded behavior (which they probably do or else they would not have overridden it), then this is a problem. If you invoke isa directly on your blessed object, then the correct isa method will be called in either case (overloaded if it exists, UNIVERSAL:: if not).
The second problem is that UNIVERSAL::isa will only perform the test you want on a blessed reference just like every other use of isa. It has different behavior for non-blessed references and simple scalars. So your example that doesn't check whether $ref is blessed is not doing the right thing, you're ignoring an error condition and using UNIVERSAL's alternate behavior. In certain circumstances this can cause subtle errors (for example, if your variable contains the name of a class).
Consider:
use CGI;
my $a = CGI->new();
my $b = "CGI";
print UNIVERSAL::isa($a,"CGI"); # prints 1, $a is a CGI object.
print UNIVERSAL::isa($b,"CGI"); # Also prints 1!! Uh-oh!!
So, in summary, don't use UNIVERSAL::isa... Do the extra error check and invoke isa on your object directly.
See the docs for UNIVERSAL::isa and UNIVERSAL::can for why you shouldn't do it.
In a nutshell, there are important modules with a genuine need to override 'isa' (such as Test::MockObject), and if you call it as a function, you break this.
I have to say, my $self = shift if UNIVERSAL::isa($_[0], __PACKAGE__) doesn't look terribly clean to me - anti-Perl advocates would be complaining about line noise. :)
To directly answer your question, the answer is at the bottom of the page you linked to, namely that if a package defines an isa method, then calling UNIVERSAL::isa directly will not call the package isa method. This is very unintuitive behaviour from an object-orientation point of view.
The rest of this post is just more questions about why you're doing this in the first place.
In code like the above, in what cases would that specific isa test fail? i.e., if it's a method, in which case would the first argument not be the package class or an instance thereof?
I ask this because I wonder if there is a legitimate reason why you would want to test whether the first argument is an object in the first place. i.e., are you just trying to catch people saying FooBar::method instead of FooBar->method or $foobar->method? I guess Perl isn't designed for that sort of coddling, and if people mistakenly use FooBar::method they'll find out soon enough.
Your mileage may vary.
Everyone else has told you why you don't want to use UNIVERSAL::isa, because it breaks when things overload isa. If they've gone to all the habit of overloading that very special method, you certainly want to respect it. Sure, you could do this by writing:
if (eval { $foo->isa("thing") }) {
# Do thingish things
}
because eval guarantees to return false if it throws an exception, and the last value otherwise. But that looks awful, and you shouldn't need to write your code in funny ways because the language wants you to. What we really want is to write just:
if ( $foo->isa("thing") ) {
# Do thingish things
}
To do that, we'd have to make sure that $foo is always an object. But $foo could be a string, a number, a reference, an undefined value, or all sorts of weird stuff. What a shame Perl can't make everything a first class object.
Oh, wait, it can...
use autobox; # Everything is now a first class object.
use CGI; # Because I know you have it installed.
my $x = 5;
my $y = CGI->new;
print "\$x is a CGI object\n" if $x->isa('CGI'); # This isn't printed.
print "\$y is a CGI object\n" if $y->isa('CGI'); # This is!
You can grab autobox from the CPAN. You can also use it with lexical scope, so everything can be a first class object just for the files or blocks where you want to use ->isa() without all the extra headaches. It also does a lot more than what I've covered in this simple example.
Assuming your example of what you want to be able to do is within an object method, you're being unnecessarily paranoid. The first passed item will always be either a reference to an object of the appropriate class (or a subclass) or it will be the name of the class (or a subclass). It will never be a reference of any other type, unless the method has been deliberately called as a function. You can, therefore, safely just use ref to distinguish between the two cases.
if (ref $_[0]) {
my $self = shift;
# called on instance, so do instancey things
} else {
my $class = shift;
# called as a class/static method, so do classy things
}
Right. It does a wrong thing for classes that overload isa. Just use the following idiom:
if (eval { $obj->isa($class) }) {
It is easily understood and commonly accepted.
Update for 2020: Perl v5.32 has the class infix operator, isa, which handles any sort of thing on the lefthand side. If the $something is not an object, you get back false with no blowup.
use v5.32;
if( $something isa 'Animal' ) { ... }