Procedural and object oriented interface in Perl/XS - perl

I am currently writing my first XS-module (just a wrapper around the C math-library) with ok-ish success. The biggest issue is the documentation that is quite hard to understand and/or is incomplete.
I have successfully written a constructor in XS and implemented some functions out of the library as method calls. That works fine.
Now I want to implement a procedural interface, too. For this reason I need to know if its a method call or not. If its a method call the number to compute with the function is stored in the instance, if its a procedural call to a function its a number given as the first argument.
This is the current code for the cosine-function:
double
cos(...)
CODE:
SV *arg = newSVsv(ST(0));
if (sv_isobject(arg)) {
HV *self_hv = MUTABLE_HV(SvRV(arg));
SV **callback_ptr = hv_fetchs(self_hv, "Number", 0);
SV *zahl = *callback_ptr;
}
else {
SV *zahl = newSVnv(arg);
}
double x = SvNV(zahl);
RETVAL = cos(x);
OUTPUT:
RETVAL

Generally speaking, writing subs which are intended to be called as either methods or functions is a bad idea. There are one or two well known modules that do this (CGI.pm springs to mind), but for the most part this is confusing for end users, and unnecessarily complicates your own code. Nobody will thank you for it. Pick one style and stick with it.
Let's assume you've chosen to stick with OO programming. Then, once your module is working and tested, you can write a second module, offering exportable functions instead of an OO interface. The second module can probably be written in plain old Perl, just acting as a wrapper around your OO module.
As an example, using pure Perl (and not XS) for readability:
use v5.14;
package MyStuff::Maths::OO {
use Class::Tiny qw(number);
sub cosine {
my $self = shift;
my $number = $self->number;
...;
return $cosine;
}
}
package MyStuff::Maths::Functional {
use Exporter::Shiny qw(cosine);
sub cosine {
my $obj = ref($_[0]) ? $_[0] : MyStuff::Maths::OO->new(number => $_[0]);
return $obj->cosine;
}
}
Now end users can choose to use your OO interface like this:
use v5.14;
use MyStuff::Maths::OO;
my $obj = MyStuff::Maths::OO->new(number => 0.5);
say $obj->cosine;
Or use the functional interface:
use v5.14;
use MyStuff::Maths::Functional -all;
say cosine(0.5);

In my case it was a simple problem, and I simply didn't see it. The declaration of the SV *zahl inside the if-else-statement was the problem. Predeclaration before the if was the key to the solution.
But I agree with tobyinks solution for modules that get used by other people or published somewhere.

Related

How to use main:: data in modules?

In script i initialize several handlers and set variables which should be available for functions in separate modules. Which is the best way to use them ($q, $dbh, %conf) in modules?
Example pseudo module:
package My::Module
sub SomeFunction (
#data = $dbh->selectrow_array("SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar} );
return $q->p( "#data" );
)
1;
Example pseudo script:
use CGI;
use DBI;
use My::Module;
our $q = new CGI;
our $dbh = some_connenction_function($dsn);
our %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
I understand that using main:: namespace will work, but there sholud be cleaner way? Or not?
package My::Module
Your modules should be independent of the context. That is, they shouldn't expect that $dbh is the database handler, or that they should return stuff in $q. or that configuration is kept in %conf.
For example, what if you suddenly find yourself with two instances of database handles? What do you do? I hate it when a module requires me to use module specific variables for configuration because it means I can't use two different instances of that module.
So you have two choices:
Either pass in the required data each time.
Create an object instance (that's right Object Oriented Programming) to store the required information.
Let's look at the first instance using your pseudo code:
sub someFunction (
%param = #_;
%conf = %{param{conf};
#data = $param{dbh}->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
)
1;
Example pseudo script:
use CGI;
use DBI;
use My::Module;
my $q = new CGI;
my $dbh = some_connenction_function($dsn);
my %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
someFunction (
q => $q,
dbh => $dbh,
conf => \%conf,
);
Here I'm using parameter calls. Not too bad. Is it? Now if you need another select statement, you can use different variables.
Sure, but what if you don't want to keep passing variables all of the time. Well then, you can use an Object Oriented Techniques. Now, relax and calm down. There are many, many good reasons to use object oriented design:
It can simplify your programming: This confuses people because object oriented programming means thinking about how your program will work, then designing it, then creating the various objects. All you want to do is code and get it out of the way. The truth is that by thinking about your program and designing it makes your program work better, and you code faster. The design aspect keeps the complexity out of your main code and tucks it safely away in small, easily digestible routines.
It's what is cool in Perl: You're going to have to get use to object oriented Perl because that's what everyone else is writing. You'll be seeing lots of my $foo = Foo::Bar->new; type of statements. Newer Perl modules only have object oriented interfaces, and you won't be able to use them.
Chicks dig it: Hey, I'm just grasping at straws here...
Let's see how an object oriented approach might work. First, let's see the main program:
use CGI;
use DBI;
use My::Module;
my $q = new CGI;
my $dbh = some_connenction_function($dsn);
my %conf = ( Foo => 1, Bar => 2, Random => some_dynamic_data() );
my $mod_handle = My::Module->new (
q => $q,
dbh => $dbh,
conf => \%conf,
);
$mod_handle->someFunction;
In the above, I now create an object instance that contains these variables. And, magically, I've changed your Functions into Methods. A method is simply a function in your Class (aka module). The trick is that my instance (the variable $mod_handler has all of your required variables stored away nice and neat for you. The $mod_hander-> syntax merely passes this information for my into my functions I mean methods.
So, what does your module now look like? Let's look at the first part where I have the Constructor which is simply the function that creates the storage for my variables I need:
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->{Q} = $q;
$self->{DBH} = $dbh;
$self->{CONF} = $conf;
return $self;
}
Let's look at the first thing that is a bit different: my $class = shift;. Where is this coming from? When I call a function with the syntax Foo->Bar, I am passing Foo as the first parameter in the function Bar. Thus, $class is equal to My::Module. It is the same as if I called your function this way:
my $mod_handle = My::Module::new("My::Module", %params);
instead of:
my $mod_handle = My::Module->new(%params);
The next thing is the my $self = {}; line. This is creating a reference to a hash. If you don't understand references, you should look at Mark's Reference Tutorial that's included in Perldocs. Basically, a reference is the memory location where data is stored. In this case, my hash doesn't have a name, all I have is a reference to the memory where it's stored called $self. In Perl, there's nothing special about the name new or $self, but they're standards that everyone pretty much follows.
The bless command is taking my reference, $self, and declaring it a type My::Module. That way, Perl can track whether $mod_handle is the type of instance that has access to these functions.
As you can see, the $self reference contains all the variables that my functions need. And, I conveniently pass this back to my main program where I store it in $mod_handle.
Now, let's look at my Methods:
sub SomeFunction {
$self = shift;
my $dbh = $self->{DBH};
my $q = $self->{Q};
my %conf = %{self->{CONF}};
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
Again, that $self = shift; line. Remember I'm calling this as:
$mod_handle->SomeFunction;
Which is the same as calling it:
My::Module::SomeFunction($mod_handle);
Thus, the value of $self is the hash reference I stored in $mod_handle. And, that hash reference contains the three values I am always passing to this function.
Conclusion
You should never share variables between your main program and your module. Otherwise, you're stuck not only with the same name in your program each and every time, but you must be careful not to use your module in parallel in another part of your program.
By using object oriented code, you can store variables you need in a single instance and pass them back and forth between functions in the same instance. Now, you can call the variables in your program whatever you want, and you can use your module in parallel instances. It improves your program and your programming skills.
Beside, you might as well get use to object oriented programming because it isn't going away. It works too well. Entire languages are designed to be exclusively object oriented, and if you don't understand how it works, you'll never improve your skills.
And, did I mention chicks dig it?
Adium
Before all the Perl hackers descend upon me. I want to mention that my Perl object oriented design is very bad. It's way better than what you wanted, but there is a serious flaw in it: I've exposed the design of my object to all the methods in my class. That means if I change the way I store my data, I'll have to go through my entire module to search and replace it.
I did it this way to keep it simple, and make it a bit more obvious what I was doing. However, as any good object oriented programmer will tell you (and second rate hacks like me) is that you should use setter/getter functions to set your member values.
A setter function is pretty simple. The template looks like this:
sub My_Method {
my $self = shift;
my $value = shift;
# Something here to verify $value is valid
if (defined $value) {
$self->{VALUE} = $value;
}
return $self->{VALUE};
}
If I call $instance->My_Method("This is my value"); in my program, it will set $self->{VALUE} to This is my value. At the same time, it returns the value of $self->{VALUE}.
Now, let's say I all it this way:
my $value = $instance->My_Method;
My parameter, $value is undefined, so I don't set the value $self->{VALUE}. However, I still return the value anyway.
Thus, I can use that same method to set and get my value.
Let's look at my Constructor (which is a fancy name for that new function):
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->{Q} = $q;
$self->{DBH} = $dbh;
$self->{CONF} = $conf;
return $self;
}
Instead of setting the $self->{} hash reference directly in this program, good design said I should have used getter/setter functions like this:
sub new {
my $class = shift;
my %param = #_;
my $self = {};
bless $self, $class
$self->Q = $q; #This line changed
$self->Dbh = $dbh; #This line changed
$self->Conf = $conf; #This line changed
return $self;
}
Now, I'll have to define these three subroutines, Q, Dbh, and Conf, but now my SomeFunction method goes from this:
sub SomeFunction {
$self = shift;
my $dbh = $self->{DBH};
my $q = $self->{Q};
my %conf = %{self->{CONF}};
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
To this:
sub SomeFunction {
$self = shift;
my $dbh = $self->Dbh; #This line changed
my $q = $self->Q; #This line changed
my %conf = %{self->Conf}; #This line changed
#data = $dbh->selectrow_array(
"SELECT * FROM Foo WHERE Bar = ?", undef, $conf{Bar}
);
return $param{q}->p( "#data" );
}
The changes are subtle, but important. Now my new function and my SomeFunction have no idea how these parameters are stored. The only place that knows how they're stored is the getter/setter function itself. If I change the class data structure, I don't have to modify anything but the getter/setter functions themselves.
Postscript
Here's food for thought...
If all of your SQL calls are in your My::Module function, why not simply initialize the $dbh, and the $q there in the first place. This way, you don't need to include the Use Dbi; module in your program itself. In fact, your program now remains blissfully ignorant exactly how the data is stored. You don't know if it's a SQL database, a Mongo database, or even some flat Berkeley styled db structure.
I'm including a module I did for a job long, long ago where I tried to simplify the way we use our database. This module did several things:
It handled everything about the database. The initialization, and all the handles. Thus, your main program didn't have to use any module but this.
It also moved away from the developer writing select statements. Instead, you defined what fields you wanted, and it figured out how to do the query for you.
It returns an incriminator function that's used to fetch the next row from the database.
Take a look at it. It's not the greatest, but you'll see how I use object oriented design to remove all of the various issues you are having. To see the documentation, simply type in perldoc MFX:Cmdata on the command line.
Using main:: explicitly is perfectly clean
As an alternative, you can pass that data into the module constructor method assuming your module is object based (or copy it into module's own variables from main:: in the constructor). This way the not-quite-perfwectly-pretty main:: is hidden away from the rest of the module.
One cleaner way is to use Exporter to make symbols from other namespaces available in your package. If you are just an occasional Perl programmer, I wouldn't bother with it because there's a bit of a learning curve to it, there are lots of gotchas, and for a toy project it makes the code even more complex. But it is essential for larger projects and for understanding other people's modules.
Usually, there are global variables (or especially global functions) in a package that you want to use in a different module without qualifying it (i.e., prepending package:: to every variable and function call). Here's one way that Exporter might be used to do that:
# MyModule.pm
package MyModule;
use strict; use warnings;
use Exporter;
our #EXPORT = qw($variable function);
our $variable = 42; # we want this var to be used somewhere else
sub function { return 19 }; # and want to call this function
...
1;
# my_script.pl
use MyModule;
$variable = 16; # refers to $MyModule::variable
my $f = function(); # refers to &MyModule::function
Your problem spec is backwards (not that there's necessarily anything wrong with that) -- you want variables in the main script and main package to be visible inside another module. For a short list of variables/functions that are used many times, a cleaner way to do that might be to hack at the symbol table:
package My::Module;
# required if you use strict . You do use strict , don't you?
our ($q, $dbh, %conf);
*q = \$main::q;
*dbh = \$main::dbh;
*conf = \%main::conf;
...
Now $My::Module::q and $main::q refer to the same variable, and you can just use $q in either the main or the My::Module namespace.

Import MooseX::Method::Signatures into caller's scope

I've made a "bundle" module which does a bunch of things: imports Moose, imports true, namespace::autoclean, makes the caller's class immutable (taken from MooseX::AutoImmute). The one thing I haven't been able to figure out is how to include MooseX::Method::Signatures.
Here's what I've got so far:
package My::OO;
use Moose::Exporter;
use Hook::AfterRuntime;
use Moose ();
use true ();
use namespace::autoclean ();
my ($import, $unimport, $init_meta) = Moose::Exporter->build_import_methods(
also => ['Moose'],
);
sub import {
true->import();
my $caller = scalar caller;
after_runtime { $caller->meta->make_immutable };
namespace::autoclean->import(-cleanee => $caller);
goto &$import;
}
sub unimport {
goto &$unimport;
}
1;
The idea is that in my code, I can now do things like this:
package My::Class; {
use My::OO;
extends 'My::Parent';
method foo() { ... }
}
but right now I still have to include an extra use MooseX::Method::Signatures;. How can I include that into my OO module?
First off, please note that the implementation of Hook::AfterRuntime is quite fragile. While it works for many simple things, it's extremely easy to end up with very hard to debug errors. I'd recommend just writing the ->meta->make_immutable yourself, or using other approaches to get around writing it, like MooseX::Declare, for example.
To answer your actual question:
MooseX::Method::Signatures->setup_for($your_caller);

In Perl, what is the right way for a subclass to alias a method in the base class?

I simply hate how CGI::Application's accessor for the CGI object is called query.
I would like my instance classes to be able to use an accessor named cgi to get the CGI object associated with the current instance of my CGI::Application subclass.
Here is a self-contained example of what I am doing:
package My::Hello;
sub hello {
my $self =shift;
print "Hello #_\n";
}
package My::Merhaba;
use base 'My::Hello';
sub merhaba {
goto sub { shift->hello(#_) };
}
package main;
My::Merhaba->merhaba('StackOverflow');
This is working as I think it should and I cannot see any problems (say, if I wanted to inherit from My::Merhaba: Subclasses need not know anything about merhaba).
Would it have been better/more correct to write
sub merhaba {
my $self = shift;
return $self->hello(#_);
}
What are the advantages/disadvantages of using goto &NAME for the purpose of aliasing a method name? Is there a better way?
Note: If you have an urge to respond with goto is evil don't do it because this use of Perl's goto is different than what you have in mind.
Your approach with goto is the right one, because it will ensure that caller / wantarray and the like keep working properly.
I would setup the new method like this:
sub merhaba {
if (my $method = eval {$_[0]->can('hello')}) {
goto &$method
} else {
# error code here
}
}
Or if you don't want to use inheritance, you can add the new method to the existing package from your calling code:
*My::Hello::merhaba = \&My::Hello::hello;
# or you can use = My::Hello->can('hello');
then you can call:
My::Hello->merhaba('StackOverflow');
and get the desired result.
Either way would work, the inheritance route is more maintainable, but adding the method to the existing package would result in faster method calls.
Edit:
As pointed out in the comments, there are a few cases were the glob assignment will run afoul with inheritance, so if in doubt, use the first method (creating a new method in a sub package).
Michael Carman suggested combining both techniques into a self redefining function:
sub merhaba {
if (my $method = eval { $_[0]->can('hello') }) {
no warnings 'redefine';
*merhaba = $method;
goto &merhaba;
}
die "Can't make 'merhaba' an alias for 'hello'";
}
You can alias the subroutines by manipulating the symbol table:
*My::Merhaba::merhaba = \&My::Hello::hello;
Some examples can be found here.
I'm not sure what the right way is, but Adam Kennedy uses your second method (i.e. without goto) in Method::Alias (click here to go directly to the source code).
This is sort of a combination of Quick-n-Dirty with a modicum of indirection using UNIVERSAL::can.
package My::Merhaba;
use base 'My::Hello';
# ...
*merhaba = __PACKAGE__->can( 'hello' );
And you'll have a sub called "merhaba" in this package that aliases My::Hello::hello. You are simply saying that whatever this package would otherwise do under the name hello it can do under the name merhaba.
However, this is insufficient in the possibility that some code decorator might change the sub that *My::Hello::hello{CODE} points to. In that case, Method::Alias might be the appropriate way to specify a method, as molecules suggests.
However, if it is a rather well-controlled library where you control both the parent and child categories, then the method above is slimmmer.

Is there a point to Perl's object oriented interfaces if they're not creating objects?

I think I read somewhere that some modules only have object oriented interfaces ( though they didn't create objects, they only held utility functions ). Is there a point to that?
First, its important to remember that in Perl, classes are implemented in a weird way, via packages. Packages also serve for general namespace pollution prevention.
package Foo;
sub new {
my ($class) = #_;
my $self = bless {}, $class;
return $self;
}
1;
That is how you make a Foo class in Perl (which can have an objected instantiated by calling Foo->new or new Foo). The use of new is just a convention; it can be anything at all. In fact, that new is what C++ would call a static method call.
You can easily create packages that contain only static method calls, and I suspect this is what you're referring to. The advantage here is that you can still use OO features like inheritance:
package Bar;
sub DoSomething {
my ($class, $arg) = #_;
$class->Compute($arg);
}
sub Compute {
my ($class, $arg) = #_;
$arg * 2;
}
1;
package Baz;
#Baz::ISA = qw(Bar);
sub Compute {
my ($class, $arg) = #_;
$arg * 2 - 1
}
1;
Given that, then
say Bar->DoSomething(3) # 6
say Baz->DoSomething(3) # 5
In fact, you can even use variables for the class name, so these can function very much like singletons:
my $obj = "Baz"; # or Baz->new could just return "Baz"
print $obj->DoSomething(3) # 5
[Code is untested; typos may be present]
I suspect that this is mostly a philosophical choice on the part of authors who prefer OO to imperative programming. Others have mentioned establishing a namespace, but it's the package that does that, not the interface. OO is not required.
Personally, I see little value in creating classes that are never instantiated (i.e. when there's no object in object-oriented). Perl isn't Java; you don't have to write a class for everything. Some modules acknowledge this. For example: File::Spec has an OO interface but also provides a functional interface via File::Spec::Functions.
File::Spec also provides an example of where OO can be useful for uninstantiated "utility" interfaces. Essentially, File::Spec is an abstract base class -- an interface with no implementation. When you load File::Spec it checks which OS you're using and loads the appropriate implementation. As a programmer, you use the interface (e.g. File::Spec->catfile) without having to worry about which version of catfile (Unix, Windows, VMS, etc.) to actually call.
As others have said, inheritance is the big gain if an actual object is not needed. The only thing I have to add here is the advice to name your variables well when writing such interfaces, e.g.:
package Foo;
# just a static method call
sub func
{
my $class = shift;
my (#args) = #_;
# stuff...
}
I named the variable that holds the classname "$class", rather than $this, to make it clear to subsequent maintainers that func() will be called as Foo->func() rather than $foo->func() (with an instantiated Foo object). This helps avoid someone adding this line later to the method:
my $value = $this->{key};
...which will fail, as there is no object to deference to get the "key" key.
If a method might be called either statically or against an instantiated object (for example, when writing a custom AUTOLOAD method), you can write this:
my method
{
my $this = shift;
my $class = ref($this) || $this;
my (#args) = #_;
# stuff...
}
namespacing, mostly. Why not? Everything that improves perl has my full approval.

How can I override Perl functions, enabling multiple overrides?

some time ago, I asked This question about overriding building perl functions.
How do I do this in a way that allows multiple overrides? The following code yields an infinite recursion.
What's the proper way to do this? If I redefine a function, I don't want to step on someone else's redefinition.
package first;
my $orig_system1;
sub mysystem {
my #args = #_;
print("in first mysystem\n");
return &{$orig_system1}(#args);
}
BEGIN {
if (defined(my $orig = \&CORE::GLOBAL::system)) {
$orig_system1 = $orig;
*CORE::GLOBAL::system = \&first::mysystem;
printf("first defined\n");
} else {
printf("no orig for first\n");
}
}
package main;
system("echo hello world");
The proper way to do it is not to do it. Find some other way to accomplish what you're doing. This technique has all the problems of a global variable, squared. Unless you get your rewrite of the function exactly right, you could break all sorts of code you never even knew existed. And while you might be polite in not blowing over an existing override, somebody else probably will not be.
Overriding system is particularly touchy because it does not have a proper prototype. This is because it does things not expressible in the prototype system. This means your override cannot do some things that system can. Namely...
system {$program} #args;
This is a valid way to call system, though you need to read the exec docs to do it. You might think "oh, well I just won't do that then", but if any module that you use does it, or any module it uses does it, then you're out of luck.
That said, there's little different from overriding any other function politely. You have to trap the existing function and be sure you call it in your new one. Whether you do it before or after is up to you.
The problem in your code is that the proper way to check if a function is defined is defined &function. Taking a code ref, even of an undefined function, will always return a true code ref. I'm not sure why, maybe its like how \undef will return a scalar ref. Why calling this code ref is causing mysystem() to go infinitely recursive is anyone's guess.
There's an additional complexity in that you can't take a reference to a core function. \&CORE::system doesn't do what you mean. Nor can you get at it with a symbolic reference. So if you want to call CORE::system or an existing override depending on which is defined you can't just assign one or the other to a code ref. You have to split your logic.
Here is one way to do it.
package first;
use strict;
use warnings;
sub override_system {
my $after = shift;
my $code;
if( defined &CORE::GLOBAL::system ) {
my $original = \&CORE::GLOBAL::system;
$code = sub {
my $exit = $original->(#_);
return $after->($exit, #_);
};
}
else {
$code = sub {
my $exit = CORE::system(#_);
return $after->($exit, #_);
};
}
no warnings 'redefine';
*CORE::GLOBAL::system = $code;
}
sub mysystem {
my($exit, #args) = #_;
print("in first mysystem, got $exit and #args\n");
}
BEGIN { override_system(\&mysystem) }
package main;
system("echo hello world");
Note that I've changed mysystem() to merely be a hook that runs after the real system. It gets all the arguments and the exit code, and it can change the exit code, but it doesn't change what system() actually does. Adding before/after hooks is the only thing you can do if you want to honor an existing override. Its quite a bit safer anyway. The mess of overriding system is now in a subroutine to keep BEGIN from getting too cluttered.
You should be able to modify this for your needs.