How can I override Perl functions, enabling multiple overrides? - perl

some time ago, I asked This question about overriding building perl functions.
How do I do this in a way that allows multiple overrides? The following code yields an infinite recursion.
What's the proper way to do this? If I redefine a function, I don't want to step on someone else's redefinition.
package first;
my $orig_system1;
sub mysystem {
my #args = #_;
print("in first mysystem\n");
return &{$orig_system1}(#args);
}
BEGIN {
if (defined(my $orig = \&CORE::GLOBAL::system)) {
$orig_system1 = $orig;
*CORE::GLOBAL::system = \&first::mysystem;
printf("first defined\n");
} else {
printf("no orig for first\n");
}
}
package main;
system("echo hello world");

The proper way to do it is not to do it. Find some other way to accomplish what you're doing. This technique has all the problems of a global variable, squared. Unless you get your rewrite of the function exactly right, you could break all sorts of code you never even knew existed. And while you might be polite in not blowing over an existing override, somebody else probably will not be.
Overriding system is particularly touchy because it does not have a proper prototype. This is because it does things not expressible in the prototype system. This means your override cannot do some things that system can. Namely...
system {$program} #args;
This is a valid way to call system, though you need to read the exec docs to do it. You might think "oh, well I just won't do that then", but if any module that you use does it, or any module it uses does it, then you're out of luck.
That said, there's little different from overriding any other function politely. You have to trap the existing function and be sure you call it in your new one. Whether you do it before or after is up to you.
The problem in your code is that the proper way to check if a function is defined is defined &function. Taking a code ref, even of an undefined function, will always return a true code ref. I'm not sure why, maybe its like how \undef will return a scalar ref. Why calling this code ref is causing mysystem() to go infinitely recursive is anyone's guess.
There's an additional complexity in that you can't take a reference to a core function. \&CORE::system doesn't do what you mean. Nor can you get at it with a symbolic reference. So if you want to call CORE::system or an existing override depending on which is defined you can't just assign one or the other to a code ref. You have to split your logic.
Here is one way to do it.
package first;
use strict;
use warnings;
sub override_system {
my $after = shift;
my $code;
if( defined &CORE::GLOBAL::system ) {
my $original = \&CORE::GLOBAL::system;
$code = sub {
my $exit = $original->(#_);
return $after->($exit, #_);
};
}
else {
$code = sub {
my $exit = CORE::system(#_);
return $after->($exit, #_);
};
}
no warnings 'redefine';
*CORE::GLOBAL::system = $code;
}
sub mysystem {
my($exit, #args) = #_;
print("in first mysystem, got $exit and #args\n");
}
BEGIN { override_system(\&mysystem) }
package main;
system("echo hello world");
Note that I've changed mysystem() to merely be a hook that runs after the real system. It gets all the arguments and the exit code, and it can change the exit code, but it doesn't change what system() actually does. Adding before/after hooks is the only thing you can do if you want to honor an existing override. Its quite a bit safer anyway. The mess of overriding system is now in a subroutine to keep BEGIN from getting too cluttered.
You should be able to modify this for your needs.

Related

Perl:Issue passing self

I'm having an issue with a recent code review. I've been advised to change the following function call:
storeShmcoreservJobsLogs(
$self->{'shmJobDetails'},
$self->{'nhcJobDetails'},
$self->{'cppDetails'},
$self->{'siteId'},
$neTypesIdMap,
$dbh
);
To only use two arguments, being $self and $dbh. So I have coded as follows
storeShmcoreservJobsLogs($self, $dbh);
And the function signature as follows:
sub storeShmcoreservJobsLogs($$) {
my($self, $dbh) = #_;
if ( $#{$self->$shmJobDetails} > -1 ) {
The if statement unfortunately throws an error with the value of $shmJobDetails when I test the change
Global symbol "$shmJobDetails" requires explicit package name at /data/ddp/current/analysis/TOR/elasticsearch/handlers/misc/Shm.pm line 148.
So I must have misinterpreted the instruction. Is anything obvious wrong?
There's no variable $shmJobDetails so you get the compilation error. Do the same thing that you were doing before:
sub storeShmcoreservJobsLogs {
my($self,$dbh)=#_;
if ( $#{ $self->{'shmJobDetails'} } > -1 ) {
Now you're passing the complete object and the subroutine can use any part of the object it needs.
You might want to make some object methods to answer the questions you'll ask it instead of accessing its internals. That method can do all the work to figure out true or false:
sub storeShmcoreservJobsLogs {
my($self,$dbh)=#_;
if ( $self->has_jobs ) {
The use of a lexical variable called $self implies that you're using object-oriented methods, but your code is far from being object-oriented
Are you sure that you understand the points being made in the code review? It's looking like you're writing a method, and the fields should be extracted from the hash within the method
The method definition should be more like this
sub store_shmcoreserv_jobs_logs {
my $self = shift;
my ($id_map, $dbh) = #_;
my #fields = #{$self}{qw/ shmJobDetails nhcJobDetails cppDetails siteId /};
...
while the call should look like this
$self->storeShmcoreservJobsLogs($neTypesIdMap, $dbh)
All of this is essential to Perl object-oriented programming. You should study perlobj together with the rest of the Perl language reference

Destructors without classes

Suppose I have a function that returns a closure:
sub generator
{
my $resource = get_resource();
my $do_thing = sub
{
$resource->do_something();
}
return $do_thing;
}
# new lexical scope
{
my $make_something_happen = generator();
&$make_something_happen();
}
I would like to be able to ensure that when $make_something_happen is removed from scope, I am able to call some $resource->cleanup();
Of course, if I had a class, I could do this with a destructor, but that seems a bit heavyweight for what I want to do. This isn't really an "object" in the sense of modelling an object, it's just a functiony thing that needs to execute some code on startup and immediately prior to death.
How would I do this in Perl( and, out of curiosity, does any language support this idea)?
I'd just use a class for this. Bless the subroutine reference and still use it like you are. The get_resource then uses this class. Since I don't know what that looks like, I'll leave it up to you to integrate it:
package Gozer {
sub new {
my( $class, $subref );
bless $subref, $class;
}
sub DESTROY {
...; #cleanup
}
}
If every thing can have it's own cleanup, I'd use the class to group two code refs:
package Gozer {
sub new {
my( $class, $subref, $cleanup );
bless { run => $subref, cleanup => $cleanup }, $class;
}
sub DESTROY {
$_[0]{cleanup}();
}
}
In Perl, I don't think this is heavyweight. The object system simply attaches labels to references. Not every object needs to model something, so this is a perfectly fine sort of object.
It would be nice to have some sort of finalizers on ordinary variables, but I think those would end up being the same thing, topologically. You could do it with Perl as a tie, but that's just an object again.
I think I understand your question. In my case I want:
* A global variable that may be set at any point during the script's runtime
* To last right up to the end of the life of the script
* Explicitly clean it up.
It looks like I can do this by defining an END block; It will be run "as late as possible".
You should be able to do your $resource->cleanup(); up in there.
More here:
http://perldoc.perl.org/perlmod.html#BEGIN%2c-UNITCHECK%2c-CHECK%2c-INIT-and-END
The begincheck program on that page has the code.

How can a Perl force its caller to return? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Is it possible for a Perl subroutine to force its caller to return?
I want to write a subroutine which causes the caller to return under certain conditions. This is meant to be used as a shortcut for validating input to a function. What I have so far is:
sub needs($$) {
my ($condition, $message) = #_;
if (not $condition) {
print "$message\n";
# would like to return from the *parent* here
}
return $condition;
}
sub run_find {
my $arg = shift #_;
needs $arg, "arg required" or return;
needs exists $lang{$arg}, "No such language: $arg" or return;
# etc.
}
The advantage of returning from the caller in needs would then be to avoid having to write the repetitive or return inside run_find and similar functions.
I think you're focussing on the wrong thing here. I do this sort of thing with Data::Constraint, Brick, etc. and talk about this in Mastering Perl. With a little cleverness and thought about the structure of your program and the dynamic features that Perl has, you don't need such a regimented, procedural approach.
However, the first thing you need to figure out is what you really want to know in that calling subroutine. If you just want to know yes or no, it's pretty easy.
The problem with your needs is that you're thinking about calling it once for every condition, which forces you to use needs to control program flow. That's the wrong way to go. needs is only there to give you an answer. It's job is not to change program state. It becomes much less useful if you misuse it because some other calling subroutine might want to continue even if needs returns false. Call it once and let it return once. The calling subroutine uses the return value to decide what it should do.
The basic structure involves a table that you pass to needs. This is your validation profile.
sub run_find {
my $arg = shift #_;
return unless needs [
[ sub { $arg }, "arg required" ],
[ sub { exists $lang{$arg} }, "No such language: $arg" ],
];
}
...
}
You construct your table for whatever your requirements are. In needs you just process the table:
sub needs($$) {
my ($table) = #_;
foreach $test ( #$table ) {
my( $sub, $message ) = #$test;
unless( $sub->(...) ) {
print $message;
return
}
}
return 1;
}
Now, the really cool thing with this approach is that you don't have to know the table ahead of time. You can pull that from configuration or some other method. That also means that you can change the table dynamically. Now your code shrinks quite a bit:
sub run_find {
my $arg = shift #_;
return unless needs( $validators{run_find} );
...
}
You cna keep going with this. In Mastering Perl I show a couple of solutions that completely remove that from the code and moves it into a configuration file. That is, you can change the business rules without changing the code.
Remember, almost any time that you are typing the same sequence of characters, you're probably doing it wrong. :)
Sounds like you are re-inventing exception handling.
The needs function should not magically deduce its parent and interrupt the parent's control flow - that's bad manners. What if you add additional functions to the call chain, and you need to go back two or even three functions back? How can you determine this programmatically? Will the caller be expecting his or her function to return early? You should follow the principle of least surprise if you want to avoid bugs - and that means using exceptions to indicate that there is a problem, and having the caller decide how to deal with it:
use Carp;
use Try::Tiny;
sub run_find {
my $arg = shift;
defined $arg or croak "arg required";
exists $lang{$arg} or croak "no such language: $arg";
...
}
sub parent {
try { run_find('foo') }
catch { print $#; }
}
Any code inside of the try block is special: if something dies, the exception is caught and stored in $#. In this case, the catch block is executed, which prints the error to STDOUT and control flow continues as normal.
Disclaimer: exception handling in Perl is a pain. I recommend Try::Tiny, which protects against many common gotchas (and provides familiar try/catch semantics) and Exception::Class to quickly make exception objects so you can distinguish between Perl's errors and your own.
For validation of arguments, you might find it easier to use a CPAN module such as Params::Validate.
You may want to look at a similar recent question by kinopiko:
Is it possible for a Perl subroutine to force its caller to return?
The executive summary for that is: best solution is to use exceptions (die/eval, Try::Tiny, etc...). You van also use GOTO and possibly Continuation::Escape
It doesn't make sense to do things this way; ironically, ya doesn't needs needs.
Here's why.
run_find is poorly written. If your first condition is true, you'll never test the second one since you'll have returned already.
The warn and die functions will provide you printing and/or exiting behavior anyway.
Here's how I would write your run_find sub if you wanted to terminate execution if your argument fails (renamed it to well_defined):
sub well_defined {
my $arg = shift;
$arg or die "arg required";
exists $lang{$arg} or die "no such language: $arg";
return 1;
}
There should be a way to return 0 and warn at the same time, but I'll need to play around with it a little more.
run_find can also be written to return 0 and the appropriate warn message if conditions are not met, and return 1 if they are (renamed to well_defined).
sub well_defined {
my $arg = shift;
$arg or warn "arg required" and return 0;
exists $lang{$arg} or warn "no such language: $arg" and return 0;
return 1;
}
This enables Boolean-esque behavior, as demonstrated below:
perform_calculation $arg if well_defined $arg; # executes only if well-defined

How do I implement a dispatch table in a Perl OO module?

I want to put some subs that are within an OO package into an array - also within the package - to use as a dispatch table. Something like this
package Blah::Blah;
use fields 'tests';
sub new {
my($class )= #_;
my $self = fields::new($class);
$self->{'tests'} = [
$self->_sub1
,$self->_sub2
];
return $self;
}
_sub1 { ... };
_sub2 { ... };
I'm not entirely sure on the syntax for this?
$self->{'tests'} = [
$self->_sub1
,$self->_sub2
];
or
$self->{'tests'} = [
\&{$self->_sub1}
,\&{$self->_sub2}
];
or
$self->{'tests'} = [
\&{_sub1}
,\&{_sub2}
];
I don't seem to be able to get this to work within an OO package, whereas it's quite straightforward in a procedural fashion, and I haven't found any examples for OO.
Any help is much appreciated,
Iain
Your friend is can. It returns a reference to the subroutine if it exists, null otherwise. It even does it correctly walking up the OO chain.
$self->{tests} = [
$self->can('_sub1'),
$self->can('_sub2'),
];
# later
for $tn (0..$#{$self->{tests}}) {
ok defined $self->{tests}[$tn], "Function $tn is available.";
}
# and later
my $ref = $self->{tests}[0];
$self->$ref(#args1);
$ref = $self->{tests}[1];
$self->$ref(#args2);
Or, thanks to this question (which happens to be a variation of this question), you can call it directly:
$self->${\$self->{tests}[0]}(#args1);
$self->${\$self->{tests}[1]}(#args1);
Note that the \ gives us a reference to a the subref, which then gets dereferenced by the ${} after $self->. Whew!
To solve the timeliness issue brain d foy mentions, an alternative would be to simply make the {test} a subroutine itself, that returns a ref, and then you could get it at exactly the time you need it:
sub tests {
return [
$self->can('_sub1'),
$self->can('_sub2')
];
}
and then use it:
for $tn (0..$#{$self->tests()}) {
...
}
Of course, if you have to iterate over the refs anyway, you might as well just go straight for passing the reference out:
for my $ref (0..$#{$self->tests()}) {
$self->$ref(#args);
}
Although Robert P's answer might work for you, it has the problem of fixing the dispatch very early in the process. I tend to resolve the methods as late as I can, so I would leave the things in the tests array as method names until you want to use them:
$self->{tests} = [
qw( _sub1 _sub2 )
];
The strength of a dynamic language is that you can wait as long as you like to decide what's going to happen.
When you want to run them, you can go through the same process that Robert already noted. I'd add an interface to it though:
foreach my $method_name ( $obj->get_test_methods )
{
$obj->$method_name();
}
That might even be better as not tying the test to an existing method name:
foreach my $method_name ( $obj->get_test_methods )
{
$obj->run_test_named( $method_name );
}
That run_test_named could then be your dispatcher, and it can be very flexible:
sub run_test_named
{
my( $self, $name ) = #_;
# do anything you want, like in Robert's answer
}
Some things you might want to do:
Run a method on an object
Pass the object as an argument to something else
Temporarily override a test
Do nothing
etc, etc
When you separate what you decide to do from its implementation, you have a lot more freedom. Not only that, the next time you call the same test name, you can do something different.
use lib Alpha;
my $foo = Alpha::Foo->new; # indirect object syntax is deprecated
$foo->bar();
my %disp_table = ( bar => sub { $foo->bar() } );
$disp_table{bar}->(); # call it
You need a closure because you want to turn a method call into an ordinary subroutine call, so you have to capture the object you're calling the method on.
There are a few ways to do this. Your third approach is closest. That will store a reference to the two subs in the array. Then when you want to call them, you have to be sure to pass them an object as their first argument.
Is there a reason you are using the use fields construct?
if you want to create self contained test subs, you could do it this way:
$$self{test} = [
map {
my $code = $self->can($_); # retrieve a reference to the method
sub { # construct a closure that will call it
unshift #_, $self; # while passing $self as the first arg
goto &$code; # goto jumps to the method, to keep 'caller' working
}
} qw/_sub1 _sub2/
];
and then to call them
for (#{ $$self{test} }) {
eval {$_->(args for the test); 1} or die $#;
}

In Perl, what is the right way for a subclass to alias a method in the base class?

I simply hate how CGI::Application's accessor for the CGI object is called query.
I would like my instance classes to be able to use an accessor named cgi to get the CGI object associated with the current instance of my CGI::Application subclass.
Here is a self-contained example of what I am doing:
package My::Hello;
sub hello {
my $self =shift;
print "Hello #_\n";
}
package My::Merhaba;
use base 'My::Hello';
sub merhaba {
goto sub { shift->hello(#_) };
}
package main;
My::Merhaba->merhaba('StackOverflow');
This is working as I think it should and I cannot see any problems (say, if I wanted to inherit from My::Merhaba: Subclasses need not know anything about merhaba).
Would it have been better/more correct to write
sub merhaba {
my $self = shift;
return $self->hello(#_);
}
What are the advantages/disadvantages of using goto &NAME for the purpose of aliasing a method name? Is there a better way?
Note: If you have an urge to respond with goto is evil don't do it because this use of Perl's goto is different than what you have in mind.
Your approach with goto is the right one, because it will ensure that caller / wantarray and the like keep working properly.
I would setup the new method like this:
sub merhaba {
if (my $method = eval {$_[0]->can('hello')}) {
goto &$method
} else {
# error code here
}
}
Or if you don't want to use inheritance, you can add the new method to the existing package from your calling code:
*My::Hello::merhaba = \&My::Hello::hello;
# or you can use = My::Hello->can('hello');
then you can call:
My::Hello->merhaba('StackOverflow');
and get the desired result.
Either way would work, the inheritance route is more maintainable, but adding the method to the existing package would result in faster method calls.
Edit:
As pointed out in the comments, there are a few cases were the glob assignment will run afoul with inheritance, so if in doubt, use the first method (creating a new method in a sub package).
Michael Carman suggested combining both techniques into a self redefining function:
sub merhaba {
if (my $method = eval { $_[0]->can('hello') }) {
no warnings 'redefine';
*merhaba = $method;
goto &merhaba;
}
die "Can't make 'merhaba' an alias for 'hello'";
}
You can alias the subroutines by manipulating the symbol table:
*My::Merhaba::merhaba = \&My::Hello::hello;
Some examples can be found here.
I'm not sure what the right way is, but Adam Kennedy uses your second method (i.e. without goto) in Method::Alias (click here to go directly to the source code).
This is sort of a combination of Quick-n-Dirty with a modicum of indirection using UNIVERSAL::can.
package My::Merhaba;
use base 'My::Hello';
# ...
*merhaba = __PACKAGE__->can( 'hello' );
And you'll have a sub called "merhaba" in this package that aliases My::Hello::hello. You are simply saying that whatever this package would otherwise do under the name hello it can do under the name merhaba.
However, this is insufficient in the possibility that some code decorator might change the sub that *My::Hello::hello{CODE} points to. In that case, Method::Alias might be the appropriate way to specify a method, as molecules suggests.
However, if it is a rather well-controlled library where you control both the parent and child categories, then the method above is slimmmer.