Intercept nonexistent method calls in Perl

I am trying to intercept calls to nonexistent methods in a subclass.
Yes, I know about AUTOLOAD,
but (for methods) Perl tries parent::method first, then UNIVERSAL::method, and only then ::AUTOLOAD.
I need something like ::AUTOLOAD to be called first,
because I want to know which methods the subclass tries to call on its parent.
Please give me some advice.

If you just want to know what methods are being used, you can use some profiling module like Devel::NYTProf.
If you want to react to that during your program's execution, you can intercept the entersub opcode directly, just as the profiling modules do. See perlguts or the source of a profiling module for more details.
You could probably also create a 'Monitor' class with FETCH and EXISTS and tie it to the symbol table hash, like: tie %Module::Name::, 'Monitor';
But unless we know exactly what you are trying to do and why, it's hard to guess what would be the right solution for you.

Please heavily consider Jiri Klouda's suggestion that you step back and reconsider what you are trying to accomplish. You almost never want to do what you're trying to do.
But, if you're really sure you want to, here's how to get enough pure Perl rope to hang yourself...
The subs pragma takes a list of sub names to predeclare. As tchrist says above, you can predeclare subs but never actually define them. This will short-circuit method dispatch to superclasses and call your AUTOLOAD immediately.
As for the list of sub names to pass to the pragma, you could use Class::Inspector->methods (thanks to Nic Gibson's answer for teaching me about this module).
According to brian d foy's comment to Nic Gibson's answer, Class::Inspector will not handle methods defined in UNIVERSAL. If you need to do those separately, you can get inspiration from the 'use subs' line in my Class::LazyObject module.
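A minimal sketch of the predeclaration trick described above. The class names My::Parent and My::Child and the @seen array are made up for illustration, and the list passed to use subs is written by hand here rather than generated with Class::Inspector:

```perl
use strict;
use warnings;

package My::Parent;    # hypothetical parent class
sub new { my $class = shift; return bless {}, $class }
sub foo { return 'My::Parent::foo' }

package My::Child;     # hypothetical subclass that wants to see calls to foo
our @ISA = ('My::Parent');
use subs qw(foo);      # predeclare foo in My::Child, but never define it
our $AUTOLOAD;
our @seen;             # names of the methods that were intercepted

sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return if $method eq 'DESTROY';    # do not intercept object destruction
    push @seen, $method;
    # Forward to the parent's implementation once the call is recorded.
    my $code = My::Parent->can($method)
        or die "My::Parent has no method $method";
    return $self->$code(@_);
}

package main;
my $obj    = My::Child->new;    # new is not predeclared, so it is inherited as usual
my $result = $obj->foo;         # goes through My::Child::AUTOLOAD first
print "$result (intercepted: @My::Child::seen)\n";
```

Only the names listed in use subs are intercepted; everything else still resolves through @ISA in the normal way.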

Why not create an AUTOLOAD sub in the subclass package which 1) reports the missing method and then 2) dispatches the call to the parent? For this to work, you must not define @ISA in the subclass.
Something like:
package my_parent;
sub foo { warn "in my_parent::foo" }
package my_subclass;
my $PARENT_CLASS = "my_parent"; # assume only one parent
# Note: no @ISA defined here
our $AUTOLOAD;
sub AUTOLOAD {
my $self = shift;
(my $method = $AUTOLOAD) =~ s{.*::}{};
return if $method eq 'DESTROY'; # do not forward object destruction
warn "non-existent method $AUTOLOAD called\n";
my $super = $PARENT_CLASS . '::' . $method;
$self->$super(@_);
}
package main;
my $x = bless {}, 'my_subclass';
$x->foo;
The syntax $self->$super(@_), where $super has double colons in it, tells perl in which package to start looking for the method, e.g.:
$self->my_parent::foo(...)
will look for the foo method starting in the package my_parent, regardless of what class $self is blessed into.
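As a small illustration of that lookup rule, here is a sketch with two made-up classes, Animal and Dog. Both the literal form and the string form of a package-qualified method name start the search in the named package:

```perl
use strict;
use warnings;

package Animal;    # hypothetical base class
sub speak { return 'generic noise' }

package Dog;       # hypothetical subclass that overrides speak
our @ISA = ('Animal');
sub speak { return 'Woof' }

package main;
my $dog = bless {}, 'Dog';

my $normal = $dog->speak;            # normal dispatch starts in Dog
my $forced = $dog->Animal::speak;    # lookup starts in Animal instead

my $name    = 'Animal::speak';       # same thing with the name held in a string
my $dynamic = $dog->$name;

print "$normal / $forced / $dynamic\n";
```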

Related

Query in Perl Subroutines

I have to use Perl as part of our internship. I came across this piece of code and could not understand what it means.
$val->ReadSim($first_sim, \&DataProcessing);
In the script, the subroutine DataProcessing is defined, but I could not find ReadSim. I tried searching in our infrastructure but could not find it there either. This was given to me to understand a week ago, and I can't ask the guide without losing credits...
Please help...
What you're seeing is not a mere subroutine, but a method on some object called $val.
I take it you might see something on top of the program like this:
use Foo::Bar; # Some Perl module
This Perl module will contain the method ReadSim. Somewhere in your code, you probably see something like this:
my $val = Foo::Bar->new; # If the people who wrote this followed standards...
This defines $val as an object of Foo::Bar. If you look in package Foo::Bar, you'd see something like this:
#! Foo/Bar.pm
package Foo::Bar;
use strict; # Because I'm an optimist
use warnings;
...
sub new {
my $class = shift;
...
my $self = {};
...
bless $self, $class;
...
return $self; # May simply be bless {}, $class;
}
Then further down, you'll see:
sub ReadSim {
my $self = shift;
...
}
The $self = {} is a reference to a Perl hash. This is how most objects are defined. That's pretty much all the constructor does. It defines a reference to something, then blesses it as that object type. Then, the methods are merely subroutines that take the defined object and manipulate that.
$val-> ReadSim(...);
is equivalent to:
Foo::Bar::ReadSim( $val, ... );
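To make that equivalence concrete, here is a small self-contained sketch. The bodies of Foo::Bar::ReadSim and DataProcessing are invented for illustration only; the real ReadSim in your code base will do something else:

```perl
use strict;
use warnings;

package Foo::Bar;    # stand-in for the real module
sub new { my $class = shift; return bless { reads => 0 }, $class }

# Invented body: counts the call and hands the data to the callback.
sub ReadSim {
    my ($self, $sim, $callback) = @_;
    $self->{reads}++;
    return $callback->($sim);
}

package main;
sub DataProcessing { my $sim = shift; return "processed $sim" }

my $val = Foo::Bar->new;

# Method call: $val is passed implicitly as the first argument.
my $via_method = $val->ReadSim('sim1', \&DataProcessing);

# Plain subroutine call: $val has to be passed explicitly.
my $via_sub = Foo::Bar::ReadSim($val, 'sim1', \&DataProcessing);

print "$via_method / $via_sub / reads: $val->{reads}\n";
```

The two calls only behave the same here because ReadSim is defined directly in Foo::Bar; the plain subroutine form does no inheritance lookup.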
So much for your introduction to Object Oriented Perl by Fire. You still have a question about what does ReadSim mean.
If all is right in the world, the developer of that module should have created built in Perl documentation called POD. First, determine the type of object $val is. Look where $val is defined (Something like my $val = Foo::Bar->new(...);). The Foo::Bar is the class that $val is a member of. You can do this from the command line:
$ perldoc Foo::Bar
And, if you're lucky, you'll see the documentation for Foo::Bar printed out. If you're really, really lucky, you will also see what ReadSim does.
And, if you're not so lucky, you'll have to do some digging. You can do this:
$ perldoc -l Foo::Bar
/usr/perl/lib/perl5/5.12/Foo/Bar.pm
This will print out the location of where the Perl Module resides on your system. For example, in this case, the module's code is in /usr/perl/lib/perl5/5.12/Foo/Bar.pm. Now, you can use an editor on this file to read it, and search for sub ReadSim and find out what that subroutine ... I mean method does.
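If you cannot tell from reading the code which class $val belongs to, or the method turns out to be inherited, you can also ask Perl at run time. This is a rough sketch using the core modules Scalar::Util and B; the classes Foo::Base and Foo::Bar are stand-ins for whatever your script really uses:

```perl
use strict;
use warnings;
use B ();
use Scalar::Util qw(blessed);

package Foo::Base;    # hypothetical parent that really defines ReadSim
sub ReadSim { return 'read' }

package Foo::Bar;     # hypothetical class of $val
our @ISA = ('Foo::Base');
sub new { my $class = shift; return bless {}, $class }

package main;
my $val = Foo::Bar->new;

# Which class is $val an instance of?
my $class = blessed($val);

# Which package actually defines the method that would be called?
my $code       = $val->can('ReadSim');
my $defined_in = B::svref_2object($code)->GV->STASH->NAME;

print "\$val is a $class; ReadSim is defined in $defined_in\n";
```

The package name found this way is the one to pass to perldoc or perldoc -l.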
One final thing. If you're new to Perl, you may want to look at a few tutorials that come with Perl. One is the Perl Reference Tutorial. This tutorial will tell you about references. In standard Perl, there are three different types of variables: scalar, hashes, and arrays. To create more complex data structures, you can create hashes of hashes or hashes of arrays, or arrays of arrays, etc. This tutorial will teach you about how to do this.
Once you understand references, you should read the tutorial on Perl Object Oriented Programming. Object Oriented Perl uses references to simulate an object-oriented design. (I say simulated because some people will argue that Object Oriented Perl isn't really object oriented, since you don't have things like private methods and variables. To me, if you can think in terms of objects and methods as you program, it's object oriented.)

Why do '::' and '->' work (sort of) interchangeably when calling methods from Perl modules?

I keep getting :: confused with -> when calling subroutines from modules. I know that :: is more related to paths and where the module/subroutine is and -> is used for objects, but I don't really understand why I can seemingly interchange both and it not come up with immediate errors.
I have perl modules which are part of a larger package, e.g. FullProgram::Part1
I'm just about getting to grips with modules, but am still on shaky ground when it comes to Perl objects, and I've been accidentally doing this:
FullProgram::Part1::subroutine1();
instead of
FullProgram::Part1->subroutine1();
so when I've been passing a hash ref to subroutine1 and been careful about using $class/$self to deal with the object reference and accidentally use :: I end up pulling my hair out wondering why my hash ref seems to disappear. I have learnt my lesson, but would really like an explanation of the difference. I have read the perldocs and various websites on these but I haven't seen any comparisons between the two (quite hard to google...)
All help appreciated - always good to understand what I'm doing!
There's no inherent difference between a vanilla sub and one that's a method. It's all in how you call it.
Class::foo('a');
This will call Class::foo. If Class::foo doesn't exist, the inheritance tree will not be checked. Class::foo will be passed only the provided arguments ('a').
It's roughly the same as: my $sub = \&Class::foo; $sub->('a');
Class->foo('a');
This will call Class::foo, or foo in one of its base classes if Class::foo doesn't exist. The invocant (what's on the left of the ->) will be passed as an argument.
It's roughly the same as: my $sub = Class->can('foo'); $sub->('Class', 'a');
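The difference is easy to see by returning @_ from the sub. In this sketch the classes My::Base and My::Derived are invented; foo is defined only in the base class:

```perl
use strict;
use warnings;

package My::Base;       # hypothetical base class
sub foo { return [@_] } # return a copy of the arguments received

package My::Derived;    # hypothetical subclass with no foo of its own
our @ISA = ('My::Base');

package main;

# Method call: inheritance is searched and the invocant is passed first.
my $as_method = My::Derived->foo('a');

# Function call: no invocant is added.
my $as_function = My::Base::foo('a');

# Function call on the subclass: @ISA is not consulted, so this dies.
my $found = eval { My::Derived::foo('a'); 1 } ? 'found' : 'not found';

print "method:   @$as_method\n";
print "function: @$as_function\n";
print "My::Derived::foo as a function: $found\n";
```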
FullProgram::Part1::subroutine1();
calls the subroutine subroutine1 of the package FullProgram::Part1 with an empty parameter list while
FullProgram::Part1->subroutine1();
calls the same subroutine with the package name as the first argument (note that it gets a little bit more complex when you're subclassing). This syntax is used by constructor methods that need the class name for building objects of subclasses like
sub new {
my ($class, @args) = @_;
...
return bless $thing, $class;
}
FYI: in Perl OO you will see $object->method(@args), which calls Class::method with the object (a blessed reference) as the first argument instead of the package/class name. A method called like this could look as follows:
sub method {
my ($self, $foo, $bar) = @_;
$self->do_something_with($bar);
# ...
}
which will call the subroutine do_something_with with the object as the first argument again, followed by the value of $bar, which was the second list element you originally passed to method in @args. That way the object itself doesn't get lost.
For more information about how the inheritance tree becomes important when calling methods, please see ikegami's answer!
Use both!
use Module::Two;
Module::Two::->class_method();
Note that this works but also protects you against an ambiguity there; the simple
Module::Two->class_method();
will be interpreted as:
Module::Two()->class_method();
(calling the subroutine Two in Module and trying to call class_method on its return value - likely resulting in a runtime error or calling a class or instance method in some completely different class) if there happens to be a sub Two in Module - something that you shouldn't depend on one way or the other, since it's not any of your code's business what is in Module.
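Here is a sketch of the situation described above. The packages Module, Module::Two and Other are all invented for the demonstration; the point is only that the trailing :: (or quoting the class name) removes the ambiguity:

```perl
use strict;
use warnings;

package Module::Two;    # the class we actually mean
sub class_method { return 'Module::Two::class_method' }

package Other;          # some unrelated class
sub class_method { return 'Other::class_method' }

package Module;         # happens to contain a sub named Two
sub Two { return 'Other' }

package main;

# Unambiguous: the trailing :: marks this as a package name.
my $safe = Module::Two::->class_method();

# Also unambiguous: a quoted string is never parsed as a sub call.
my $quoted = 'Module::Two'->class_method();

# The bare form: as explained above, Perl may parse this as
# Module::Two()->class_method() because sub Two exists in Module.
my $bare = Module::Two->class_method();

print "safe:   $safe\n";
print "quoted: $quoted\n";
print "bare:   $bare\n";
```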
Historically, Perl had no OO support, and functions in packages were called with the FullProgram::Part1::subroutine1(); syntax (or, even earlier, with the now-deprecated FullProgram'Part1'subroutine1(); syntax).
Later, OOP was implemented with the -> operator, without changing much underneath. FullProgram::Part1->subroutine1(); calls subroutine1 and passes FullProgram::Part1 as the first parameter. You can see this when you create an object: my $cgi = CGI->new(). When you then call a method on that object, the left-hand side is again passed as the first parameter: $cgi->param(''). That is how param gets the object it was called on (usually named $self). That's it: -> is a hack for OOP. As a result, Perl does not have classes as such (packages act as classes), but it does have objects (which are a hack too: they are blessed references).
Off-topic: you can also write my $cgi = new CGI;, which is the same as CGI->new. The same form shows up in print STDOUT "text\n";, which looks just like calling IO::Handle::print() on STDOUT.

How does this call to a subroutine in a Perl module work?

I recently saw some Perl code that confused me. I took out all of the extra parts to see how it was working, but I still don't understand why it works.
Basically, I created this dummy "module" (TTT.pm):
use strict;
use warnings;
package TTT;
sub new {
my $class = shift;
return bless {'Test' => 'Test'}, $class;
}
sub acquire {
my $tt = new TTT();
return $tt;
}
1;
Then I created this script to use the module (ttt.pl):
#!/usr/bin/perl
use strict;
use warnings;
use TTT;
our $VERSION = 1;
my $tt = acquire TTT;
print $tt->{Test};
The line that confuses me, that I thought would not work, is:
my $tt = acquire TTT;
I thought it would not work since the "acquire" sub was never exported. But it does work.
I was confused by the "TTT" after the call to acquire, so I removed that, leaving the line like this:
my $tt = acquire;
And it complained of a bareword, like I thought it would. I added parens, like this:
my $tt = acquire();
And it complained that there wasn't a main::acquire, like I thought it would.
I'm used to the subroutines being available to the object, or subroutines being exported, but I've never seen a subroutine get called with the package name on the end. I don't even know how to search for this on Google.
So my question is, How does adding the package name after the subroutine call work? I've never seen anything like that before, and it probably isn't good practice, but can someone explain what Perl is doing?
Thanks!
You are using the indirect object syntax that Perl allows (but in modern code is discouraged). Basically, if a name is not predeclared, it can be placed in front of an object (or class name) separated with a space.
So the line acquire TTT actually means TTT->acquire. If you actually had a subroutine named acquire in scope, it would instead be interpreted as acquire(TTT), which can lead to ambiguity (hence why it is discouraged).
You should also update the new TTT(); line in the method to read TTT->new;
It's the indirect object syntax for method calls, which lets you put the method name before the object name.
As the documentation there shows, it's best avoided because it's unwieldy and it can break in unpredictable ways, for example if there is an imported or defined subroutine named acquire — but it used to be more common than it is today, and so you will find it pretty often in old code and docs.
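A short sketch comparing the two spellings, based on the TTT module from the question (with acquire changed to use its invocant). It assumes indirect object syntax is enabled, which is the default; use v5.36 or no feature 'indirect' turns it off:

```perl
use strict;
use warnings;

package TTT;
sub new {
    my $class = shift;
    return bless { Test => 'Test' }, $class;
}
sub acquire {
    my $class = shift;
    return $class->new;    # arrow form instead of "new TTT()"
}

package main;

# Indirect object syntax: method name first, then the class.
my $indirect = acquire TTT;

# The recommended arrow form does exactly the same thing.
my $direct = TTT->acquire;

print ref($indirect), ' / ', ref($direct), "\n";
```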

Can Perl method calls be intercepted?

Can you intercept a method call in Perl, do something with the arguments, and then execute it?
Yes, you can intercept Perl subroutine calls. I have an entire chapter about that sort of thing in Mastering Perl. Check out the Hook::LexWrap module, which lets you do it without going through all of the details. Perl's methods are just subroutines.
You can also create a subclass and override the method you want to catch. That's a slightly better way to do it because that's the way object-oriented programming wants you to do it. However, sometimes people write code that doesn't allow you to do this properly. There's more about that in Mastering Perl too.
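The subclass approach can be sketched like this. Calculator and LoggingCalculator are invented names; the override records the call, changes the arguments, and then hands over to the original method through SUPER::

```perl
use strict;
use warnings;

package Calculator;    # hypothetical class whose method we want to catch
sub new { my $class = shift; return bless {}, $class }
sub add { my ($self, $x, $y) = @_; return $x + $y }

package LoggingCalculator;    # hypothetical intercepting subclass
our @ISA = ('Calculator');
our @log;

sub add {
    my ($self, @args) = @_;
    push @log, "add(@args)";         # do something with the arguments
    my @doubled = map { $_ * 2 } @args;
    return $self->SUPER::add(@doubled);    # then run the real method
}

package main;
my $calc = LoggingCalculator->new;
my $sum  = $calc->add(2, 3);
print "$sum, log: @LoggingCalculator::log\n";
```

This only catches calls made on objects of the subclass, which is why it does not work when other code constructs the parent class directly.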
To describe it briefly, Perl lets you modify the symbol table. A subroutine (method) is called via the symbol table of the package to which it belongs. If you modify the symbol table (and this is not considered very dirty), you can replace most method calls with calls to other subroutines that you specify. This demonstrates the approach:
# The subroutine we'll interrupt calls to
sub call_me
{
print shift,"\n";
}
# Intercepting factory
sub aspectate
{
my $callee = shift;
my $value = shift;
return sub { $callee->($value + shift); };
}
my $aspectated_call_me = aspectate \&call_me, 100;
# Rewrite symbol table of main package (lasts to the end of the block).
# Replace "main" with the name of the package (class) you're intercepting
local *main::call_me = $aspectated_call_me;
# Voila! Prints 105!
call_me(5);
This also shows that, once someone takes reference of the subroutine and calls it via the reference, you can no longer influence such calls.
I am pretty sure there are frameworks to do aspectation in perl, but this, I hope, demonstrates the approach.
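The same symbol table approach works for methods. Below is a rough sketch of a helper, here called wrap_method (an invented name, not a library function), which replaces a method with a wrapper that can inspect and change the arguments before calling the original. Greeter is a made-up class:

```perl
use strict;
use warnings;

package main;

# Replace $class::$name with a wrapper. $before receives the original
# argument list (invocant included) and returns the list to pass on.
sub wrap_method {
    my ($class, $name, $before) = @_;
    my $original = $class->can($name)
        or die "$class has no method $name";
    no strict 'refs';
    no warnings 'redefine';
    *{"${class}::${name}"} = sub {
        my @args = $before->(@_);
        return $original->(@args);
    };
    return $original;    # so the caller can restore it later
}

package Greeter;    # hypothetical class to intercept
sub new   { my $class = shift; return bless {}, $class }
sub greet { my ($self, $name) = @_; return "Hello, $name" }

package main;
my @calls;
wrap_method('Greeter', 'greet', sub {
    my ($self, @rest) = @_;
    push @calls, [@rest];                # record the original arguments
    return ($self, map { uc } @rest);    # pass modified arguments on
});

my $out = Greeter->new->greet('world');
print "$out (recorded: @{ $calls[0] })\n";
```

Unlike the local-based version above, this replacement stays in place until the original code reference is assigned back.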
This looks like a job for Moose! Moose is an object system for Perl that can do that and lots more. The docs will do a much better job at explaining than I can, but what you'll likely want is a Method Modifier, specifically before.
You can, and Pavel describes a good way to do it, but you should probably elaborate as to why you are wanting to do this in the first place.
If you're looking for advanced ways of intercepting calls to arbitrary subroutines, then fiddling with symbol tables will work for you, but if you want to be adding functionality to functions perhaps exported to the namespace you are currently working in, then you might need to know of ways to call functions that exist in other namespaces.
Data::Dumper, for example, normally exports the function 'Dumper' to the calling namespace, but you can override or disable that and provide your own Dumper function which then calls the original by way of the fully qualified name.
e.g.
use Data::Dumper (); # empty import list: do not import Dumper into this namespace
sub Dumper {
warn 'Dumping variables';
print Data::Dumper::Dumper(@_);
}
my $foo = {
bar => 'barval',
};
Dumper($foo);
Again, this is an alternate solution that may be more appropriate depending on the original problem. A lot of fun can be had when playing with the symbol table, but it may be overkill and could lead to hard to maintain code if you don't need it.
Yes.
You need three things:
The arguments to a call are in @_, which is just another dynamically scoped variable.
Then, goto supports a subroutine-reference argument, which preserves the current @_ but makes another (tail) function call.
Finally, local can be used to temporarily replace a global (including a symbol table entry) for the dynamic scope of a block, and the symbol tables are buried in %::.
So you've got:
sub foo {
my($x,$y)=(@_);
print "$x / $y = " . ((0.0+$x)/$y)."\n";
}
sub doit {
foo(3,4);
}
doit();
which of course prints out:
3 / 4 = 0.75
We can replace foo using local and goto:
my $oldfoo = \&foo;
local *foo = sub { (@_)=($_[1], $_[0]); goto $oldfoo; };
doit();
And now we get:
4 / 3 = 1.33333333333333
If you wanted to modify *foo without using its name, and you didn't want to use eval, then you could modify it by manipulating %::, for example:
$::{"foo"} = sub { (@_)=($_[0], 1); goto $oldfoo; };
doit();
And now we get:
3 / 1 = 3
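One reason the wrappers above use goto rather than an ordinary call is that goto &sub replaces the wrapper's stack frame, so the wrapped subroutine cannot tell it was intercepted by looking at caller. A small sketch with invented sub names shows the difference:

```perl
use strict;
use warnings;

# Reports the name of the subroutine that called target.
sub target { my @frame = caller(1); return $frame[3] }

sub via_call { return target(@_) }    # ordinary call: adds a stack frame
sub via_goto { goto &target }         # tail call: replaces this frame

sub outer_call { return via_call() }
sub outer_goto { return via_goto() }

my $plain = outer_call();    # target sees the wrapper, main::via_call
my $tail  = outer_goto();    # target sees main::outer_goto instead

print "ordinary call: $plain\n";
print "goto:          $tail\n";
```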

Is using __PACKAGE__ inside my methods bad for inheritance?

If inside my code I'll have calls like:
__PACKAGE__->method;
will this limit the usability of this module, if this module is inherited?
It depends on what you want to do:
#!/usr/bin/perl
package A;
use strict; use warnings;
sub new { bless {} => $_[0] }
sub method1 {
printf "Hello from: %s\n", __PACKAGE__;
}
sub method2 {
my $self = shift;
printf "Hello from: %s\n", ref($self);
}
package B;
use strict; use warnings;
use parent 'A';
package main;
my $b = B->new;
$b->method1;
$b->method2;
Output:
Hello from: A
Hello from: B
If you intend to inherit that method, call it on the referent and don't rely on the package you find it in. If you intend to call a method internal to the package that no other package should be able to see, then it might be okay. There's a fuller explanation in Intermediate Perl, and probably in perlboot (which is an extract of the book).
In general, I try not to ever use __PACKAGE__ unless I'm writing a modulino.
Why are you trying to use __PACKAGE__?
That depends. Sometimes __PACKAGE__->method() is exactly what you need.
Otherwise it's better to use ref($self)->class_method() or $self->method().
"It depends." is the correct answer. It is relatively uncommon to actually need the package name; usually you will have an instance or a class name to start with. That said, there are times when you really do need the package name -- __PACKAGE__ is clearly the tool for that job, being superior to a literal. Here are some guidelines:
Never call methods off __PACKAGE__ inside methods, as doing so makes it impossible for inheritors to change your implementation by simply overriding the called method. Use $self or $class instead.
Try to avoid __PACKAGE__ inside methods in general. Every use of __PACKAGE__ adds a little bit of inflexibility. Sometimes, inflexibility is what you want (because you need compile-time resolution or badly want to control where information is being stored), but be triply sure that what you want is worth the cost. You'll thank yourself later.
Outside of methods, you don't have access to a $self, and should call methods off __PACKAGE__ rather than a literal. This is mostly important for compile-time declarations like those provided by Class::Accessor.
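The first guideline can be illustrated with a short sketch. Shape and Circle are invented classes; describe_rigid calls a method off __PACKAGE__, while describe_flexible calls it off $self:

```perl
use strict;
use warnings;

package Shape;    # hypothetical base class
sub new  { my $class = shift; return bless {}, $class }
sub name { return 'shape' }

# Resolved against Shape no matter which subclass the object belongs to.
sub describe_rigid { return 'I am a ' . __PACKAGE__->name }

# Resolved against the object's own class, so overrides are honoured.
sub describe_flexible { my $self = shift; return 'I am a ' . $self->name }

package Circle;    # hypothetical subclass overriding name
our @ISA = ('Shape');
sub name { return 'circle' }

package main;
my $circle   = Circle->new;
my $rigid    = $circle->describe_rigid;
my $flexible = $circle->describe_flexible;
print "$rigid\n$flexible\n";
```

The subclass's override of name is ignored by describe_rigid, which is exactly the inflexibility the guidelines warn about.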