Perl getting every object invoked function - perl

I am new to Perl so I don't know whether it is doable or not.
I am interested in creating an module which would catch all calls performed on it.
The usage of it would be as follows :
$object = new Foo;
$object->blah;
the function name (so in this case "blah" would be cough by Foo and returned as string to a screen).
The bit I don't know how to do is catching the called function name as string.

You might want to check AUTOLOADING
If you call a subroutine that is undefined, you would ordinarily get an immediate, fatal error complaining that the subroutine doesn't exist. (Likewise for subroutines being used as methods, when the method doesn't exist in any base class of the class's package.) However, if an AUTOLOAD subroutine is defined in the package or packages used to locate the original subroutine, then that AUTOLOAD subroutine is called with the arguments that would have been passed to the original subroutine
my $object = new Foo;
print $object->blah, "\n";
package Foo;
sub new { return bless {}, shift }
# catch-all function
sub AUTOLOAD {
return $AUTOLOAD;
}
outputs Foo::blah

Related

How to find the Perl code referenced by this line?

I have inherited some Perl code which contains a line that is mysterious to me:
my $binary = A->current->config->settings('arg1', 'arg2')
Basically, I am not sure how to find the related code. "A" is NOT a variable in the local code so I thought this was a class hierarchy. However I checked the directory structure to see if the following path existed, but there was none:
A/current/config/settings.pm
Is A->current->config->settings guaranteed to be a nested class hierarchy, or could it be something else? For example could config actually be a property or method of a different object A->current?
Any assistance you could lend tracking this down would be greatly appreciated!
A is a class name, you should find it in A.pm. current should be a method of the class, defined under a sub current in A.pm. It returns an object whose config method is being called which returns an object again whose settings method is being called with arguments 'arg1' and 'arg2' (well, in fact, the object itself is the first argument).
In fact, any of the methods can return a class instead of an object, too.
Step through the code in the perl debugger and see where it takes you.
foo->bar is a method call, meaning that there is likely a subroutine called bar defined in the package referred to by foo (or a superclass), and gives you no information about whether there is a package bar or foo::bar.
Is A->current->config->settings guaranteed to be a nested class hierarchy
You're thinking of A::current::config::settings.
The following are method calls:
INVOCANT->name
INVOCANT->name(LIST)
That means that A->current->config->settings is a chain of method calls.
The only class named in that code is A.
could config actually be a property or method of a different object A->current?
It's the name of a method of the object or class returned by A->current.
How to find the Perl code referenced by this line?
my $binary = A->current->config->settings('arg1', 'arg2');
is short for
my $obj1 = A->current;
my $obj2 = $obj1->config;
my $binary = $obj2->settings('arg1', 'arg2');
Now that you have the objects available, you can find the class of which they are an instance using
say ref($obj) || "Not a reference";
or
use Scalar::Util qw( blessed );
say blessed($obj) // "Not an object";
As explained, you are dealing with a chain of method calls in the class named A, where at least the first one is a class method since it is invoked on the class (A) itself, not on an object.
An easy way to find that class is by using Class::Inspector
use Class::Inspector;
say "Filename: ", Class::Inspector->resolved_filename( 'A' );
which printed the full path to the class I used in my tests. Also see loaded_filename.
Another interesting way to interrogate a class is to add to it at runtime.
Create an object of A and add to it a method of your choice at runtime
my $objA = A->new();
eval q( sub A::get_info { print "$_\n" for (caller(0)) } );
if ($#) { print "Eval: $#" };
eval q( sub A::boom { croak "Stacktrace: " } );
if ($#) { print "Eval: $#" };
$objA->get_info();
$objA->boom();
These are simple examples but you can acquire practically any information from inside a method.
If A happens to not have a method called new (possible) work with methods in the given chain, starting with my $objA = A->current.
Or, you can directly add a subroutine to the package's symbol table
*{A::new_method} = sub { say "A new method" };
$any_obj_of_A->new_method();
which is now also available on all existing instances, as well as on new ones.

Access object created in another function

My program creates an object, which, in turn, creates another object
MainScript.pm
use module::FirstModule qw ($hFirstModule);
$hFirstModule->new(parametres);
$hFirstModule->function();
FirstModule.pm
use Exporter ();
#EXPORT = qw($hFirstModule);
use module::SecondModule qw ($hSecondModule);
sub new {
my $className = shift;
my $self = { key => 'val' };
bless $self, $classname;
return $self;
}
sub function{
$hSecondModule->new(parametres);
#some other code here
}
I want to acces $hSecondModule from MainScript.pm.
It depends.
We would have to see the actual code. What you've shown is a bit ambiguous. However, there are two scenarios.
You can't
If your code is not exactly like what you have shown as pseudo-code, then there is no chance to do that. Consider this code in &module1::function.
sub function {
my $obj = Module2->new;
# ... more stuff
return;
}
In this case, you are not returning anything, and the $obj is lexically scoped. A lexical scope means that it only exists inside of the closest {} block (and all blocks inside that). That's the block of the function sub. Once the program returns out of that sub, the variable goes out of scope and the object is destroyed. There is no way to get to it afterwards. It's gone.
Even if it was not destroyed, you cannot reach into a different scope.
You can
If you however return the object from the function, then you'd have to assign it in your script, and then you can access it later. If the code is exactly what you've shown above, this works.
sub function {
my $obj = Module2->new;
# nothing here
}
In Perl, subs always return the last true statement. If you don't have a return and the last statement is the Module2->new call, then the result of that statement, which is the object, is returned. Of course it also works if you actually return explicitly.
sub function {
return Module2->new;
}
So if you assign that to a variable in your script, you can access it in the script.
my $obj = module1->function();
This is similar to the factory pattern.
This is vague, but without more information it's impossible to answer the question more precicely.
Here is a very hacky approach that takes your updated code into consideration. It uses Sub::Override to grab the return value of the constructor call to your SecondModule thingy. This is something that you'd usually maybe do in a unit test, but not in production code. However, it should work. Here's an example.
Foo.pm
package Foo;
use Bar;
sub new {
return bless {}, $_[0];
}
sub frobnicate {
Bar->new;
return;
}
Bar.pm
package Bar;
sub new {
return bless {}, $_[0];
}
sub drink {
return 42; # because.
}
script.pl
package main;
use Foo; # this will load Bar at compile time
use Sub::Override;
my $foo = Foo->new;
my $bar; # this is lexical to the main script, so we can use it inside
my $orig = \&Bar::new; # grab the original function
my $sub = Sub::Override->new(
"Bar::new" => sub {
my $self = shift;
# call the constructor of $hSecondModule, grab the RV and assign
# it to our var from the main script
$bar = $self->$orig(#_);
return $bar;
}
);
$foo->frobnicate;
# restore the original sub
$sub->restore;
# $bar is now assigend
print $bar->drink;
Again, I would not do this in production code.
Let's take a look at the main function. It first creates a new Foo object. Then it grabs a reference to the Bar::new function. We need that as the original, so we can call it to create the object. Then we use Sub::Override to temporarily replace the Bar::new with our sub that calls the original, but takes the return value (which is the object) and assigns it to our variable that's lexical to the main script. Then we return it.
This function will now be called when $foo->frobnicate calls Bar->new. After that call, $bar is populated in our main script. Then we restore Bar::new so we don't accidentally overwrite our $bar in case that gets called again from somewhere else.
Afterwards, we can use $bar.
Note that this is advanced. I'll say again that I would not use this kind of hack in production code. There is probably a better way to do what you want. There might be an x/y problem here and you need to better explain why you need to do this so we can find a less crazy solution.

'Goto undefined subroutine &main::1' writing a simple Perl debugger

I'm trying to write a simple Perl debugger and I'm running into the following problem.
I'm running the following code as a debugger:
{
package DB;
sub DB { }
sub sub
{
&$sub;
# this is what produces the problem
$i = 1*1;
}
}
1;
I'm loading this in by setting the PERL5DB environment variable - e.g.:
export PERL5DB="BEGIN { require './debugger/tracer.pl'; }
Given this simple little Perl script:
#!/usr/bin/env perl
use Getopt::Long;
print "hello world";
I'm running the script as:
perl -d test.pl
When run, it generates the following error:
$ perl -d test.pl
Goto undefined subroutine &main::1 at /home/vagrant/perl5/perlbrew/perls/perl-5.16.0/lib/site_perl/5.16.0/Exporter.pm line 25.
BEGIN failed--compilation aborted at test.pl line 6.
I've isolated the problem to anything that is run after the &$sub; call in sub in the debugger. This problem is happening with certain packages being included in the base Perl script - in this case, Getopt::Long, though I've also found the same result with IO::File.
My Perl is pretty rusty, particularly with respect to advanced topics like the debugger.
Can anyone help me understand how I can get code executing after the &$sub; call in sub in the debugger to place nicely with the packages that I'm importing?
Thanks!
When you leave a Perl subroutine without using an explicit return statement, Perl will return the value of the last statement in the subroutine.
In particular, this means that if you have a subroutine that calls another subroutine as its last statement, like this:
package DB {
sub sub {
warn "Hello from DB::sub, about to call $sub\n";
&$sub;
}
}
then the return value of the other subroutine called via &$sub will be passed to the original caller, just as if you'd done an explicit return &$sub.
However, if the &$sub call is not the last thing in your DB::sub subroutine, then Perl will just throw away its return value and instead return the value of you actual last statement — in this case $i = 1*1, which evaluates to the number 1.
Now, when you define a custom debugger like that, Perl will wrap every ordinary subroutine call with a call to your DB::sub subroutine. Thus, your code causes every subroutine call to return the number 1! It's hardly a surprise that this will break a lot of things very badly.
Specifically, based on your error message, it looks like something in the Exporter module (which is used by many other modules to export symbols to the caller's namespace) is calling a subroutine that should return a reference to another subroutine. But since, because of your debugger, it's actually returning 1, the following attempt to call the returned subroutine ends up trying to call a subroutine named 1 (which gets mapped to the main:: package because numeric symbol names are superglobal), which then fails.
But what if you really need to do something in your DB::sub after calling &$sub? Well, the workaround is to save the return value, like this:
package DB {
sub DB { }
sub sub {
warn "Hello from DB::sub, about to call $sub...\n";
# call &sub, save the return value in #rv
my #rv = (wantarray ? &$sub : scalar &$sub);
warn "Hello again from DB::sub, just called $sub and got #rv!\n";
# ...and return the saved return value
return (wantarray ? #rv : $rv[0]);
}
}
1;
(The code is slightly complicated by the fact that our DB::sub might be called in either list or scalar context, and we need to pass the appropriate context on to &$sub. The wantarray should take care of that, though.)
Adding on to the answer from Ilmari Karonen.
DB::sub can also be called in a no value (void) context, therefore the return handling needs to take this into account. Refer to the documentation in wantarray for more details.
The following code handles all three cases.
package DB {
sub DB { }
sub sub {
# call &sub, save the return value in #rv
my #rv;
if(defined(wantarray)) {
#rv = (wantarray ? &$sub : scalar &$sub);
}
else {
# wantarray is undef
&$sub;
}
# after invoking &$sub
# return #rv
if(defined(wantarray)) {
return (wantarray ? #rv : $rv[0]);
}
else {
return undef
}
}
}
1;

Typo when calling subref dies with `Not a GLOB reference`

A coworker's typo when calling a subref raised this strange syntax question. If I call a subref without the dereference arrow, perl dies with Not a GLOB reference. However, if the subref is called as a method on a blessed object, it runs fine.
What does this have to do with globs? And why does the method call work?
use 5.12.0;
use Try::Tiny;
my $f = sub { 'sub ref' };
my $obj = bless({}, 'Blessed');
try {
say $f($obj); # should be $f->();
} catch {
say "ERROR: $_";
};
say $obj->$f();
Output:
C:\code>perl dispatch.pl
ERROR: Not a GLOB reference at dispatch.pl line 8.
sub ref
say, like print, accepts an optional filehandle/typeglob to direct output to, eg:
my $f = \*STDERR;
say $f ("This goes to stderr.");
I didn't realize until now that you could do a method call on a subroutine reference, but sure enough, the perlobj man page states:
If the right side of the arrow is a scalar containing a reference to a
subroutine, then this is equivalent to calling the referenced subroutine
directly with the class name or object on the left side of the arrow as its
first argument. No lookup is done and there is no requirement that the
subroutine be defined in any package related to the class name or object on the
left side of the arrow.

Why does this Perl produce "Not a CODE reference?"

I need to remove a method from the Perl symbol table at runtime. I attempted to do this using undef &Square::area, which does delete the function but leaves some traces behind. Specifically, when $square->area() is called, Perl complains that it is "Not a CODE reference" instead of "Undefined subroutine &Square::area called" which is what I expect.
You might ask, "Why does it matter? You deleted the function, why would you call it?" The answer is that I'm not calling it, Perl is. Square inherits from Rectangle, and I want the inheritance chain to pass $square->area through to &Rectangle::area, but instead of skipping Square where the method doesn't exist and then falling through to Rectangle's area(), the method call dies with "Not a CODE reference."
Oddly, this appears to only happen when &Square::area was defined by typeglob assignment (e.g. *area = sub {...}). If the function is defined using the standard sub area {} approach, the code works as expected.
Also interesting, undefining the whole glob works as expected. Just not undefining the subroutine itself.
Here's a short example that illustrates the symptom, and contrasts with correct behavior:
#!/usr/bin/env perl
use strict;
use warnings;
# This generates "Not a CODE reference". Why?
sub howdy; *howdy = sub { "Howdy!\n" };
undef &howdy;
eval { howdy };
print $#;
# Undefined subroutine &main::hi called (as expected)
sub hi { "Hi!\n" }
undef &hi;
eval { hi };
print $#;
# Undefined subroutine &main::hello called (as expected)
sub hello; *hello = sub { "Hello!\n" };
undef *hello;
eval { hello };
print $#;
Update: I have since solved this problem using Package::Stash (thanks #Ether), but I'm still confused by why it's happening in the first place. perldoc perlmod says:
package main;
sub Some_package::foo { ... } # &foo defined in Some_package
This is just a shorthand for a typeglob assignment at compile time:
BEGIN { *Some_package::foo = sub { ... } }
But it appears that it isn't just shorthand, because the two cause different behavior after undefining the function. I'd appreciate if someone could tell me whether this is a case of (1) incorrect docs, (2) bug in perl, or (3) PEBCAK.
Manipulating symbol table references yourself is bound to get you into trouble, as there are lots of little fiddly things that are hard to get right. Fortunately there is a module that does all the heavy lifting for you, Package::Stash -- so just call its methods add_package_symbol and remove_package_symbol as needed.
Another good method installer that you may want to check out is Sub::Install -- especially nice if you want to generate lots of similar functions.
As to why your approach is not correct, let's take a look at the symbol table after deleting the code reference:
sub foo { "foo!\n"}
sub howdy; *howdy = sub { "Howdy!\n" };
undef &howdy;
eval { howdy };
print $#;
use Data::Dumper;
no strict 'refs';
print Dumper(\%{"main::"});
prints (abridged):
$VAR1 = {
'howdy' => *::howdy,
'foo' => *::foo,
};
As you can see, the 'howdy' slot is still present - undefining &howdy doesn't actually do anything enough. You need to explicitly remove the glob slot, *howdy.
The reason it happens is precisely because you assigned a typeglob.
When you delete the CODE symbol, the rest of typeglob is still lingering, so when you try to execute howdy it will point to the non-CODE piece of typeglob.