Is there a point to Perl's object oriented interfaces if they're not creating objects? - perl

I think I read somewhere that some modules only have object oriented interfaces ( though they didn't create objects, they only held utility functions ). Is there a point to that?

First, its important to remember that in Perl, classes are implemented in a weird way, via packages. Packages also serve for general namespace pollution prevention.
package Foo;
sub new {
my ($class) = #_;
my $self = bless {}, $class;
return $self;
}
1;
That is how you make a Foo class in Perl (which can have an objected instantiated by calling Foo->new or new Foo). The use of new is just a convention; it can be anything at all. In fact, that new is what C++ would call a static method call.
You can easily create packages that contain only static method calls, and I suspect this is what you're referring to. The advantage here is that you can still use OO features like inheritance:
package Bar;
sub DoSomething {
my ($class, $arg) = #_;
$class->Compute($arg);
}
sub Compute {
my ($class, $arg) = #_;
$arg * 2;
}
1;
package Baz;
#Baz::ISA = qw(Bar);
sub Compute {
my ($class, $arg) = #_;
$arg * 2 - 1
}
1;
Given that, then
say Bar->DoSomething(3) # 6
say Baz->DoSomething(3) # 5
In fact, you can even use variables for the class name, so these can function very much like singletons:
my $obj = "Baz"; # or Baz->new could just return "Baz"
print $obj->DoSomething(3) # 5
[Code is untested; typos may be present]

I suspect that this is mostly a philosophical choice on the part of authors who prefer OO to imperative programming. Others have mentioned establishing a namespace, but it's the package that does that, not the interface. OO is not required.
Personally, I see little value in creating classes that are never instantiated (i.e. when there's no object in object-oriented). Perl isn't Java; you don't have to write a class for everything. Some modules acknowledge this. For example: File::Spec has an OO interface but also provides a functional interface via File::Spec::Functions.
File::Spec also provides an example of where OO can be useful for uninstantiated "utility" interfaces. Essentially, File::Spec is an abstract base class -- an interface with no implementation. When you load File::Spec it checks which OS you're using and loads the appropriate implementation. As a programmer, you use the interface (e.g. File::Spec->catfile) without having to worry about which version of catfile (Unix, Windows, VMS, etc.) to actually call.

As others have said, inheritance is the big gain if an actual object is not needed. The only thing I have to add here is the advice to name your variables well when writing such interfaces, e.g.:
package Foo;
# just a static method call
sub func
{
my $class = shift;
my (#args) = #_;
# stuff...
}
I named the variable that holds the classname "$class", rather than $this, to make it clear to subsequent maintainers that func() will be called as Foo->func() rather than $foo->func() (with an instantiated Foo object). This helps avoid someone adding this line later to the method:
my $value = $this->{key};
...which will fail, as there is no object to deference to get the "key" key.
If a method might be called either statically or against an instantiated object (for example, when writing a custom AUTOLOAD method), you can write this:
my method
{
my $this = shift;
my $class = ref($this) || $this;
my (#args) = #_;
# stuff...
}

namespacing, mostly. Why not? Everything that improves perl has my full approval.

Related

multi-level inheritance in Perl

I have a question related to multi-level inheritance in Perl.
Here is my code
mod.pm
package first;
sub disp {
print "INSIDE FIRST\n";
}
package second;
#ISA = qw(first);
sub disp {
print "INSIDE SECOND\n";
}
package third;
#ISA = qw(second);
sub new {
$class = shift;
$ref = {};
bless $ref, $class;
return $ref;
}
sub show {
$self = shift;
print "INSIDE THIRD\n";
}
1;
prog.pl
use mod;
$obj = third->new();
$obj->show();
$obj->disp();
I have a .pm file which contains three classes. I want to access the disp method in the first class using an object of third class. I'm not sure how that could work.
I tried to access using two ways:
using class name => first::disp()
using SUPER inside second package disp method => $self->SUPER::disp();
But am not sure how it will be accessed directly using the object of third class.
$obj->first::disp(), but what you are asking to do is something you absolutely shouldn't do. Fix your design.
If you need to do that, then you have defined your classes wrongly.
The third class inherits from the second class. second has it's own definition of disp, so it never tries to inherit that method from its superclass first. That means third gets the implementation defined in second
The simple answer would be to call first::disp something else. That way second won't have a definition of the method and inheritance will be invoked again
If you explain the underlying problem, and why you want to ignore an inherited method, then perhaps we can help you find a better way
Please also note that packages and module files should start with a capital letter, and each class is ordinarily in a file of its own, so you would usually use package First in First.pm etc.

Perl dereferencing a subroutine

I have come across code with the following syntax:
$a -> mysub($b);
And after looking into it I am still struggling to figure out what it means. Any help would be greatly appreciated, thanks!
What you have encountered is object oriented perl.
it's documented in perlobj. The principle is fairly simple though - an object is a sort of super-hash, which as well as data, also includes built in code.
The advantage of this, is that your data structure 'knows what to do' with it's contents. At a basic level, that's just validate data - so you can make a hash that rejects "incorrect" input.
But it allows you to do considerably more complicated things. The real point of it is encapsulation, such that I can write a module, and you can make use of it without really having to care what's going on inside it - only the mechanisms for driving it.
So a really basic example might look like this:
#!/usr/bin/env perl
use strict;
use warnings;
package MyObject;
#define new object
sub new {
my ($class) = #_;
my $self = {};
$self->{count} = 0;
bless( $self, $class );
return $self;
}
#method within the object
sub mysub {
my ( $self, $new_count ) = #_;
$self->{count} += $new_count;
print "Internal counter: ", $self->{count}, "\n";
}
package main;
#create a new instance of `MyObject`.
my $obj = MyObject->new();
#call the method,
$obj->mysub(10);
$obj->mysub(10);
We define "class" which is a description of how the object 'works'. In this, class, we set up a subroutine called mysub - but because it's a class, we refer to it as a "method" - that is, a subroutine that is specifically tied to an object.
We create a new instance of the object (basically the same as my %newhash) and then call the methods within it. If you create multiple objects, they each hold their own internal state, just the same as it would if you created separate hashes.
Also: Don't use $a and $b as variable names. It's dirty. Both because single var names are wrong, but also because these two in particular are used for sort.
That's a method call. $a is the invocant (a class name or an object), mysub is the method name, and $b is an argument. You should proceed to read perlootut which explains all of this.

How to use main:: data in modules?

In script i initialize several handlers and set variables which should be available for functions in separate modules. Which is the best way to use them ($q, $dbh, %conf) in modules?
Example pseudo module:
package My::Module
sub SomeFunction (
#data = $dbh->selectrow_array("SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar} );
return $q->p( "#data" );
)
1;
Example pseudo script:
use CGI;
use DBI;
use My::Module;
our $q = new CGI;
our $dbh = some_connenction_function($dsn);
our %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
I understand that using main:: namespace will work, but there sholud be cleaner way? Or not?
package My::Module
Your modules should be independent of the context. That is, they shouldn't expect that $dbh is the database handler, or that they should return stuff in $q. or that configuration is kept in %conf.
For example, what if you suddenly find yourself with two instances of database handles? What do you do? I hate it when a module requires me to use module specific variables for configuration because it means I can't use two different instances of that module.
So you have two choices:
Either pass in the required data each time.
Create an object instance (that's right Object Oriented Programming) to store the required information.
Let's look at the first instance using your pseudo code:
sub someFunction (
%param = #_;
%conf = %{param{conf};
#data = $param{dbh}->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
)
1;
Example pseudo script:
use CGI;
use DBI;
use My::Module;
my $q = new CGI;
my $dbh = some_connenction_function($dsn);
my %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
someFunction (
q => $q,
dbh => $dbh,
conf => \%conf,
);
Here I'm using parameter calls. Not too bad. Is it? Now if you need another select statement, you can use different variables.
Sure, but what if you don't want to keep passing variables all of the time. Well then, you can use an Object Oriented Techniques. Now, relax and calm down. There are many, many good reasons to use object oriented design:
It can simplify your programming: This confuses people because object oriented programming means thinking about how your program will work, then designing it, then creating the various objects. All you want to do is code and get it out of the way. The truth is that by thinking about your program and designing it makes your program work better, and you code faster. The design aspect keeps the complexity out of your main code and tucks it safely away in small, easily digestible routines.
It's what is cool in Perl: You're going to have to get use to object oriented Perl because that's what everyone else is writing. You'll be seeing lots of my $foo = Foo::Bar->new; type of statements. Newer Perl modules only have object oriented interfaces, and you won't be able to use them.
Chicks dig it: Hey, I'm just grasping at straws here...
Let's see how an object oriented approach might work. First, let's see the main program:
use CGI;
use DBI;
use My::Module;
my $q = new CGI;
my $dbh = some_connenction_function($dsn);
my %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
my $mod_handle = My::Module->new (
q => $q,
dbh => $dbh,
conf => \%conf,
);
$mod_handle->someFunction;
In the above, I now create an object instance that contains these variables. And, magically, I've changed your Functions into Methods. A method is simply a function in your Class (aka module). The trick is that my instance (the variable $mod_handler has all of your required variables stored away nice and neat for you. The $mod_hander-> syntax merely passes this information for my into my functions I mean methods.
So, what does your module now look like? Let's look at the first part where I have the Constructor which is simply the function that creates the storage for my variables I need:
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->{Q} = $q;
$self->{DBH} = $dbh;
$self->{CONF} = $conf;
return $self;
}
Let's look at the first thing that is a bit different: my $class = shift;. Where is this coming from? When I call a function with the syntax Foo->Bar, I am passing Foo as the first parameter in the function Bar. Thus, $class is equal to My::Module. It is the same as if I called your function this way:
my $mod_handle = My::Module::new("My::Module", %params);
instead of:
my $mod_handle = My::Module->new(%params);
The next thing is the my $self = {}; line. This is creating a reference to a hash. If you don't understand references, you should look at Mark's Reference Tutorial that's included in Perldocs. Basically, a reference is the memory location where data is stored. In this case, my hash doesn't have a name, all I have is a reference to the memory where it's stored called $self. In Perl, there's nothing special about the name new or $self, but they're standards that everyone pretty much follows.
The bless command is taking my reference, $self, and declaring it a type My::Module. That way, Perl can track whether $mod_handle is the type of instance that has access to these functions.
As you can see, the $self reference contains all the variables that my functions need. And, I conveniently pass this back to my main program where I store it in $mod_handle.
Now, let's look at my Methods:
sub SomeFunction {
$self = shift;
my $dbh = $self->{DBH};
my $q = $self->{Q};
my %conf = %{self->{CONF}};
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
Again, that $self = shift; line. Remember I'm calling this as:
$mod_handle->SomeFunction;
Which is the same as calling it:
My::Module::SomeFunction($mod_handle);
Thus, the value of $self is the hash reference I stored in $mod_handle. And, that hash reference contains the three values I am always passing to this function.
Conclusion
You should never share variables between your main program and your module. Otherwise, you're stuck not only with the same name in your program each and every time, but you must be careful not to use your module in parallel in another part of your program.
By using object oriented code, you can store variables you need in a single instance and pass them back and forth between functions in the same instance. Now, you can call the variables in your program whatever you want, and you can use your module in parallel instances. It improves your program and your programming skills.
Beside, you might as well get use to object oriented programming because it isn't going away. It works too well. Entire languages are designed to be exclusively object oriented, and if you don't understand how it works, you'll never improve your skills.
And, did I mention chicks dig it?
Adium
Before all the Perl hackers descend upon me. I want to mention that my Perl object oriented design is very bad. It's way better than what you wanted, but there is a serious flaw in it: I've exposed the design of my object to all the methods in my class. That means if I change the way I store my data, I'll have to go through my entire module to search and replace it.
I did it this way to keep it simple, and make it a bit more obvious what I was doing. However, as any good object oriented programmer will tell you (and second rate hacks like me) is that you should use setter/getter functions to set your member values.
A setter function is pretty simple. The template looks like this:
sub My_Method {
my $self = shift;
my $value = shift;
# Something here to verify $value is valid
if (defined $value) {
$self->{VALUE} = $value;
}
return $self->{VALUE};
}
If I call $instance->My_Method("This is my value"); in my program, it will set $self->{VALUE} to This is my value. At the same time, it returns the value of $self->{VALUE}.
Now, let's say I all it this way:
my $value = $instance->My_Method;
My parameter, $value is undefined, so I don't set the value $self->{VALUE}. However, I still return the value anyway.
Thus, I can use that same method to set and get my value.
Let's look at my Constructor (which is a fancy name for that new function):
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->{Q} = $q;
$self->{DBH} = $dbh;
$self->{CONF} = $conf;
return $self;
}
Instead of setting the $self->{} hash reference directly in this program, good design said I should have used getter/setter functions like this:
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->Q = $q; #This line changed
$self->Dbh = $dbh; #This line changed
$self->Conf = $conf; #This line changed
return $self;
}
Now, I'll have to define these three subroutines, Q, Dbh, and Conf, but now my SomeFunction method goes from this:
sub SomeFunction {
$self = shift;
my $dbh = $self->{DBH};
my $q = $self->{Q};
my %conf = %{self->{CONF}};
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
To this:
sub SomeFunction {
$self = shift;
my $dbh = $self->Dbh; #This line changed
my $q = $self->Q; #This line changed
my %conf = %{self->Conf}; #This line changed
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
The changes are subtle, but important. Now my new function and my SomeFunction have no idea how these parameters are stored. The only place that knows how they're stored is the getter/setter function itself. If I change the class data structure, I don't have to modify anything but the getter/setter functions themselves.
Postscript
Here's food for thought...
If all of your SQL calls are in your My::Module function, why not simply initialize the $dbh, and the $q there in the first place. This way, you don't need to include the Use Dbi; module in your program itself. In fact, your program now remains blissfully ignorant exactly how the data is stored. You don't know if it's a SQL database, a Mongo database, or even some flat Berkeley styled db structure.
I'm including a module I did for a job long, long ago where I tried to simplify the way we use our database. This module did several things:
It handled everything about the database. The initialization, and all the handles. Thus, your main program didn't have to use any module but this.
It also moved away from the developer writing select statements. Instead, you defined what fields you wanted, and it figured out how to do the query for you.
It returns an incriminator function that's used to fetch the next row from the database.
Take a look at it. It's not the greatest, but you'll see how I use object oriented design to remove all of the various issues you are having. To see the documentation, simply type in perldoc MFX:Cmdata on the command line.
Using main:: explicitly is perfectly clean
As an alternative, you can pass that data into the module constructor method assuming your module is object based (or copy it into module's own variables from main:: in the constructor). This way the not-quite-perfwectly-pretty main:: is hidden away from the rest of the module.
One cleaner way is to use Exporter to make symbols from other namespaces available in your package. If you are just an occasional Perl programmer, I wouldn't bother with it because there's a bit of a learning curve to it, there are lots of gotchas, and for a toy project it makes the code even more complex. But it is essential for larger projects and for understanding other people's modules.
Usually, there are global variables (or especially global functions) in a package that you want to use in a different module without qualifying it (i.e., prepending package:: to every variable and function call). Here's one way that Exporter might be used to do that:
# MyModule.pm
package MyModule;
use strict; use warnings;
use Exporter;
our #EXPORT = qw($variable function);
our $variable = 42; # we want this var to be used somewhere else
sub function { return 19 }; # and want to call this function
...
1;
# my_script.pl
use MyModule;
$variable = 16; # refers to $MyModule::variable
my $f = function(); # refers to &MyModule::function
Your problem spec is backwards (not that there's necessarily anything wrong with that) -- you want variables in the main script and main package to be visible inside another module. For a short list of variables/functions that are used many times, a cleaner way to do that might be to hack at the symbol table:
package My::Module;
# required if you use strict . You do use strict , don't you?
our ($q, $dbh, %conf);
*q = \$main::q;
*dbh = \$main::dbh;
*conf = \%main::conf;
...
Now $My::Module::q and $main::q refer to the same variable, and you can just use $q in either the main or the My::Module namespace.

In Perl, what is the right way for a subclass to alias a method in the base class?

I simply hate how CGI::Application's accessor for the CGI object is called query.
I would like my instance classes to be able to use an accessor named cgi to get the CGI object associated with the current instance of my CGI::Application subclass.
Here is a self-contained example of what I am doing:
package My::Hello;
sub hello {
my $self =shift;
print "Hello #_\n";
}
package My::Merhaba;
use base 'My::Hello';
sub merhaba {
goto sub { shift->hello(#_) };
}
package main;
My::Merhaba->merhaba('StackOverflow');
This is working as I think it should and I cannot see any problems (say, if I wanted to inherit from My::Merhaba: Subclasses need not know anything about merhaba).
Would it have been better/more correct to write
sub merhaba {
my $self = shift;
return $self->hello(#_);
}
What are the advantages/disadvantages of using goto &NAME for the purpose of aliasing a method name? Is there a better way?
Note: If you have an urge to respond with goto is evil don't do it because this use of Perl's goto is different than what you have in mind.
Your approach with goto is the right one, because it will ensure that caller / wantarray and the like keep working properly.
I would setup the new method like this:
sub merhaba {
if (my $method = eval {$_[0]->can('hello')}) {
goto &$method
} else {
# error code here
}
}
Or if you don't want to use inheritance, you can add the new method to the existing package from your calling code:
*My::Hello::merhaba = \&My::Hello::hello;
# or you can use = My::Hello->can('hello');
then you can call:
My::Hello->merhaba('StackOverflow');
and get the desired result.
Either way would work, the inheritance route is more maintainable, but adding the method to the existing package would result in faster method calls.
Edit:
As pointed out in the comments, there are a few cases were the glob assignment will run afoul with inheritance, so if in doubt, use the first method (creating a new method in a sub package).
Michael Carman suggested combining both techniques into a self redefining function:
sub merhaba {
if (my $method = eval { $_[0]->can('hello') }) {
no warnings 'redefine';
*merhaba = $method;
goto &merhaba;
}
die "Can't make 'merhaba' an alias for 'hello'";
}
You can alias the subroutines by manipulating the symbol table:
*My::Merhaba::merhaba = \&My::Hello::hello;
Some examples can be found here.
I'm not sure what the right way is, but Adam Kennedy uses your second method (i.e. without goto) in Method::Alias (click here to go directly to the source code).
This is sort of a combination of Quick-n-Dirty with a modicum of indirection using UNIVERSAL::can.
package My::Merhaba;
use base 'My::Hello';
# ...
*merhaba = __PACKAGE__->can( 'hello' );
And you'll have a sub called "merhaba" in this package that aliases My::Hello::hello. You are simply saying that whatever this package would otherwise do under the name hello it can do under the name merhaba.
However, this is insufficient in the possibility that some code decorator might change the sub that *My::Hello::hello{CODE} points to. In that case, Method::Alias might be the appropriate way to specify a method, as molecules suggests.
However, if it is a rather well-controlled library where you control both the parent and child categories, then the method above is slimmmer.

Detecting Overridden Methods in Perl

Last week I was bitten twice by accidentally overriding methods in a subclass. While I am not a fan of inheritance, we (ab)use this in our application at work. What I would like to do is provide some declarative syntax for stating that a method is overriding a parent method. Something like this:
use Attribute::Override;
use parent 'Some::Class';
sub foo : override { ... } # fails if it doesn't override
sub bar { ... } # fails if it does override
There are a couple of issues here. First, if method loading is delayed somehow (for example, methods loaded via AUTOLOAD or otherwise later installed in the symbol table), this won't detect those methods.
Walking the inheritance tree could also get similarly expensive. I do this with Class::Sniff, but it's not really suitable for running code. I could walk the inheritance tree and simply match where there's a defined CODE slot in the appropriate symbol table and that would be faster, but if the method cache is invalidated, that would break if I were to cache those results.
So I have two questions: is this a reasonable approach and is there a hook which allows me to check if the method cache has changed? (search for 'cache' in 'perldoc perlobj').
Of course, this shouldn't break production code, I am thinking about only having it fail or warn if the TEST_HARNESS environment variable is active (and have an explicit environment variable to force it to be inactive, if production code were to set the TEST_HARNESS environment variable for some reason).
One way to enforce this:
package Base;
...
sub new {
my $class = shift;
...
check_overrides( $class );
...
}
sub check_overrides {
my $class = shift;
for my $method ( #unoverridable ) {
die "horribly" if __PACKAGE__->can( $method ) != $class->can( $method );
}
}
Memoization of check_overrides may be helpful.
If there are some cases where you want exemptions, have an alternate method name and
have the base class call that:
package Base;
...
my #unoverridable = 'DESTROY';
sub destroy {}
sub DESTROY {
my $self = shift;
# do essential stuff
...
$self->destroy();
}