What Perl modules are useful for validating subroutine arguments? - perl

I'm looking for a general-purpose module to take the drudgery out of validating subroutine and method arguments. I've scanned through various possibilities on CPAN: Params::Validate, Params::Smart, Getargs::Mixed, Getargs::Long, and a few others.
Any information regarding pros and cons of these or other modules would be appreciated. Thanks.

If you start to use Moose, you'll find MooseX::Types to your liking. Types automatically have an is_$type(), and to_$type(). These are for making sure you input passes type constraints, or making your input has a valid coercion to the type. I like them better even for these types of things because you can ensure the state of your object has the said types at no additional cost.
use Moose;
has 'foo' => ( isa => MyType, is => ro );
sub _check_my_type {
my ( $self, $type ) = #_;
is_MyType( $type );
};
It might be lacking some support for deep/recursive types, but if your using this stuff in modern perl you're probably "doing it wrong." Instead use an object that has its own consistency checks (like mine above with MyType), and just pass the object.

Have a look at MooseX::Method::Signatures which provides a bit more than just validating the arguments.
Example from POD:
package Foo;
use Moose;
use MooseX::Method::Signatures;
method morning (Str $name) {
$self->say("Good morning ${name}!");
}
method hello (Str :$who, Int :$age where { $_ > 0 }) {
$self->say("Hello ${who}, I am ${age} years old!");
}
method greet (Str $name, Bool :$excited = 0) {
if ($excited) {
$self->say("GREETINGS ${name}!");
}
else {
$self->say("Hi ${name}!");
}
}
MooseX::Method::Signatures also comes as standard with MooseX::Declare which brings even more sexy syntax to the Perl plate. The above could be written like so (just showing first method for brevity):
use MooseX::Declare;
class Foo {
method morning (Str $name) {
$self->say("Good morning ${name}!");
}
}
There is also a corollary signatures CPAN module for plain subroutines but unfortunately it isn't as feature rich as above.

I am currently researching the same question as the O.P.
I noticed that Dave Rolsky - hyper-productive programmer of Moose fame - has recently (2009) taken over maintenance of Params::Validate, so I think this a good sign. The module has not been upgraded since 2003. So I guess, it still can be used again for checking subroutine params.

Related

Procedural and object oriented interface in Perl/XS

I am currently writing my first XS-module (just a wrapper around the C math-library) with ok-ish success. The biggest issue is the documentation that is quite hard to understand and/or is incomplete.
I have successfully written a constructor in XS and implemented some functions out of the library as method calls. That works fine.
Now I want to implement a procedural interface, too. For this reason I need to know if its a method call or not. If its a method call the number to compute with the function is stored in the instance, if its a procedural call to a function its a number given as the first argument.
This is the current code for the cosine-function:
double
cos(...)
CODE:
SV *arg = newSVsv(ST(0));
if (sv_isobject(arg)) {
HV *self_hv = MUTABLE_HV(SvRV(arg));
SV **callback_ptr = hv_fetchs(self_hv, "Number", 0);
SV *zahl = *callback_ptr;
}
else {
SV *zahl = newSVnv(arg);
}
double x = SvNV(zahl);
RETVAL = cos(x);
OUTPUT:
RETVAL
Generally speaking, writing subs which are intended to be called as either methods or functions is a bad idea. There are one or two well known modules that do this (CGI.pm springs to mind), but for the most part this is confusing for end users, and unnecessarily complicates your own code. Nobody will thank you for it. Pick one style and stick with it.
Let's assume you've chosen to stick with OO programming. Then, once your module is working and tested, you can write a second module, offering exportable functions instead of an OO interface. The second module can probably be written in plain old Perl, just acting as a wrapper around your OO module.
As an example, using pure Perl (and not XS) for readability:
use v5.14;
package MyStuff::Maths::OO {
use Class::Tiny qw(number);
sub cosine {
my $self = shift;
my $number = $self->number;
...;
return $cosine;
}
}
package MyStuff::Maths::Functional {
use Exporter::Shiny qw(cosine);
sub cosine {
my $obj = ref($_[0]) ? $_[0] : MyStuff::Maths::OO->new(number => $_[0]);
return $obj->cosine;
}
}
Now end users can choose to use your OO interface like this:
use v5.14;
use MyStuff::Maths::OO;
my $obj = MyStuff::Maths::OO->new(number => 0.5);
say $obj->cosine;
Or use the functional interface:
use v5.14;
use MyStuff::Maths::Functional -all;
say cosine(0.5);
In my case it was a simple problem, and I simply didn't see it. The declaration of the SV *zahl inside the if-else-statement was the problem. Predeclaration before the if was the key to the solution.
But I agree with tobyinks solution for modules that get used by other people or published somewhere.

Perl OOP attribute manipulation best practice

Assume the following code:
package Thing;
sub new {
my $this=shift;
bless {#_},$this;
}
sub name {
my $this=shift;
if (#_) {
$this->{_name}=shift;
}
return $this->{_name};
}
Now assume we've instantiated an object thusly:
my $o=Thing->new();
$o->name('Harold');
Good enough. We could also instantiate the same thing more quickly with either of the following:
my $o=Thing->new(_name=>'Harold'); # poor form
my $o=Thing->new()->name('Harold');
To be sure, I allowed attributes to be passed in the constructor to allow "friendly" classes to create objects more completely. It could also allow for a clone-type operator with the following code:
my $o=Thing->new(%$otherthing); # will clone attrs if not deeper than 1 level
This is all well and good. I understand the need for hiding attributes behind methods to allow for validation, etc.
$o->name; # returns 'Harold'
$o->name('Fred'); # sets name to 'Fred' and returns 'Fred'
But what this doesn't allow is easy manipulation of the attribute based on itself, such as:
$o->{_name}=~s/old/ry/; # name is now 'Harry', but this "exposes" the attribute
One alternative is to do the following:
# Cumbersome, not syntactically sweet
my $n=$o->name;
$n=~s/old/ry/;
$o->name($n);
Another potential is the following method:
sub Name :lvalue { # note the capital 'N', not the same as name
my $this=shift;
return $this->{_name};
}
Now I can do the following:
$o->Name=~s/old/ry/;
So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way? I mean, doing that takes away any validation that might be found in the 'name' method. For example, if the 'name' method enforced a capital first letter and lowercase letters thereafter, the 'Name' (capital 'N') bypasses that and forces the user of the class to police herself in the use of it.
So, if the 'Name' lvalue method isn't exactly "kosher" are there any established ways to do such things?
I have considered (but get dizzy considering) things like tied scalars as attributes. To be sure, it may be the way to go.
Also, are there perhaps overloads that may help?
Or should I create replacement methods in the vein of (if it would even work):
sub replace_name {
my $this=shift;
my $repl=shift;
my $new=shift;
$this->{_name}=~s/$repl/$new/;
}
...
$o->replace_name(qr/old/,'ry');
Thanks in advance... and note, I am not very experienced in Perl's brand of OOP, even though I am fairly well-versed in OOP itself.
Additional info:
I guess I could get really creative with my interface... here's an idea I tinkered with, but I guess it shows that there really are no bounds:
sub name {
my $this=shift;
if (#_) {
my $first=shift;
if (ref($first) eq 'Regexp') {
my $second=shift;
$this->{_name}=~s/$first/$second/;
}
else {
$this->{_name}=$first;
}
}
return $this->{_name};
}
Now, I can either set the name attribute with
$o->name('Fred');
or I can manipulate it with
$o->name(qr/old/,'ry'); # name is now Harry
This still doesn't allow stuff like $o->name.=' Jr.'; but that's not too tough to add. Heck, I could allow calllback functions to be passed in, couldn't I?
Your first code example is abolutely fine. This is a standard method to write accessors. Of course this can get ugly when doing a substitution, the best solution might be:
$o->name($o->name =~ s/old/ry/r);
The /r flag returns the result of the substitution. Equivalently:
$o->name(do { (my $t = $o->name) =~ s/old/ry/; $t });
Well yes, this 2nd solution is admittedly ugly. But I am assuming that accessing the fields is a more common operation than setting them.
Depending on your personal style preferences, you could have two different methods for getting and setting, e.g. name and set_name. (I do not think get_ prefixes are a good idea – 4 unneccessary characters).
If substituting parts of the name is a central aspect of your class, then encapsulating this in a special substitute_name method sounds like a good idea. Otherwise this is just unneccessary ballast, and a bad tradeoff for avoiding occasional syntactic pain.
I do not advise you to use lvalue methods, as these are experimental.
I would rather not see (and debug) some “clever” code that returns tied scalars. This would work, but feels a bit too fragile for me to be comfortable with such solutions.
Operator overloading does not help with writing accessors. Especially assignment cannot be overloaded in Perl.
Writing accessors is boring, especially when they do no validation. There are modules that can handle autogeneration for us, e.g. Class::Accessor. This adds generic accessors get and set to your class, plus specific accessors as requested. E.g.
package Thing;
use Class::Accessor 'antlers'; # use the Moose-ish syntax
has name => (is => 'rw'); # declare a read-write attribute
# new is autogenerated. Achtung: this takes a hashref
Then:
Thing->new({ name => 'Harold'});
# or
Thing->new->name('Harold');
# or any of the other permutations.
If you want a modern object system for Perl, there is a row of compatible implementations. The most feature-rich of these is Moose, and allows you to add validation, type constraints, default values, etc. to your attributes. E.g.
package Thing;
use Moose; # this is now a Moose class
has first_name => (
is => 'rw',
isa => 'Str',
required => 1, # must be given in constructor
trigger => \&_update_name, # run this sub after attribute is set
);
has last_name => (
is => 'rw',
isa => 'Str',
required => 1, # must be given in constructor
trigger => \&_update_name,
);
has name => (
is => 'ro', # readonly
writer => '_set_name', # but private setter
);
sub _update_name {
my $self = shift;
$self->_set_name(join ' ', $self->first_name, $self->last_name);
}
# accessors are normal Moose methods, which we can modify
before first_name => sub {
my $self = shift;
if (#_ and $_[0] !~ /^\pU/) {
Carp::croak "First name must begin with uppercase letter";
}
};
The purpose of class interface is to prevent users from directly manipulating your data. What you want to do is cool, but not a good idea.
In fact, I design my classes, so even the class itself doesn't know it's own structure:
package Thingy;
sub new {
my $class = shift;
my $name = shift;
my $self = {};
bless, $self, $class;
$self->name($name);
return $self;
}
sub name {
my $self = shift;
my $name = shift;
my $attribute = "GLUNKENSPEC";
if ( defined $name ) {
$self->{$attribute} = $name;
}
return $self->{$attribute};
}
You can see by my new constructor that I could pass it a name for my Thingy. However, my constructor doesn't know how I store my name. Instead, it merely uses my name method to set the name. As you can see by my name method, it stores the name in an unusual way, but my constructor doesn't need to know or care.
If you want to manipulate the name, you have to work at it (as you showed):
my $name = $thingy->name;
$name =~ s/old/ry/;
$thingy->name( $name );
In fact, a lot of Perl developers use inside out classes just to prevent this direct object manipulation.
What if you want to be able to directly manipulate a class by passing in a regular expression? You have to write a method to do this:
sub mod_name {
my $self = shift;
my $pattern = shift;
my $replacement = shift;
if ( not defined $replacement ) {
croak qq(Some basic error checking: Need pattern and replacement string);
}
my $name = $self->name; # Using my name method for my class
if ( not defined $name ) {
croak qq(Cannot modify name: Name is not yet set.);
}
$name = s/$pattern/$replacement/;
return $self->name($name);
}
Now, the developer can do this:
my $thingy->new( "Harold" );
$thingy->mod_name( "old", "new" );
say $thingy->name; # Says "Harry"
Whatever time or effort you save by allowing for direct object manipulation is offset by the magnitude of extra effort it will take to maintain your program. Most methods don't take more than a few minutes to create. If I suddenly got an hankering to manipulate my object in a new and surprising way, it's easy enough to create a new method to do this.
1. No. I don't actually use random nonsense words to protect my class. This is purely for demo purposes to show that even my constructor doesn't have to know how methods actually store their data.
I understand the need for hiding attributes behind methods to allow for validation, etc.
Validation is not the only reason, although it is the only one you refer to. I mention this because another is that encapsulation like this leaves the implementation open. For example, if you have a class which needs to have a string "name" which can be get and set, you could just expose a member, name. However, if you instead use get()/set() subroutines, how "name" is stored and represented internally doesn't matter.
That can be very significant if you write bunches of code with uses the class and then suddenly realize that although the user may be accessing "name" as a string, it would be much better stored some other way (for whatever reason). If the user was accessing the string directly, as a member field, you now either have to compensate for this by including code that will change name when the real whatever is changed and...but wait, how can you then compensate for the client code that changed name...
You can't. You're stuck. You now have to go back and change all the code that uses the class -- if you can. I'm sure anyone who has done enough OOP has run into this situation in one form or another.
No doubt you've read all this before, but I'm bringing it up again because there are a few points (perhaps I've misunderstood you) where you seem to outline strategies for changing "name" based on your knowledge of the implementation, and not what was intended to be the API. That is very tempting in perl because of course there is no access control -- everything is essential public -- but it is still a very very bad practice for the reason just described.
That doesn't mean, of course, that you can't simply commit to exposing "name" as a string. That's a decision and it won't be the same in all cases. However, in this particular case, if what you are particularly concerned with is a simple way to transform "name", IMO you might as well stick with a get/set method. This:
# Cumbersome, not syntactically sweet
Maybe true (although someone else might say it is simple and straightforward), but your primary concern should not be syntactic sweetness, and neither should speed of execution. They can be concerns, but your primary concern has to be design, because no matter how sweet and fast your stuff is, if it is badly designed, it will all come down around you in time.
Remember, "Premature optimization is the root of all evil" (Knuth).
So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way?
It boils down to: Will this continue to work if the internals change? If the answer is yes, you can do many other things including but not limited to validation.)
The answer is yes. This can be done by having the method return a magical value.
{
package Lvalue;
sub TIESCALAR { my $class = shift; bless({ #_ }, $class) }
sub FETCH { my $self = shift; my $m = $self->{getter}; $self->{obj}->$m(#_) }
sub STORE { my $self = shift; my $m = $self->{setter}; $self->{obj}->$m(#_) }
}
sub new { my $class = shift; bless({}, $class) }
sub get_name {
my ($self) = #_;
return $self->{_name};
}
sub set_name {
my ($self, $val) = #_;
die "Invalid name" if !length($val);
$self->{_name} = $val;
}
sub name :lvalue {
my ($self) = #_;
tie my $rv, 'Lvalue', obj=>$self, getter=>'get_name', setter=>'set_name';
return $rv;
}
my $o = __PACKAGE__->new();
$o->name = 'abc';
print $o->name, "\n"; # abc
$o->name = ''; # Invalid name

unable to find the function definition in a perl codebase

I am rummaging through 7900+ lines of perl code. I needed to change a few things and things were going quite well even though i am just 36 hours into perl. I learned the basic constructs of the language and was able to get small things done. But then suddenly I have found a function call which does not have any definition anywhere. I grep'ed several times to check for all 'sub'. I could not find the functions definition. What am I missing ? Where is the definition of this function. I am quite sure this is a user defined function and not a library function(from its name i guessed this).
Please help me find this function's definition.
Here is a few lines from around the function usage.
(cfg_machine_isActive($ep)) {
staf_var_set($ep, VAR_PHASE, PHASE_PREP);
staf_var_set($ep, VAR_PREP, STATE_RUNNING);
} else {
cfg_machine_set_state($ep, STATE_FAILED);
}
}
$rc = rvt_deploy_library(); #this is the function that is the problem
dump_states() unless ($rc != 0);
Here is the answer:
(i could not post this an answer itself cos i dont have enough reputation)
I found that the fastest way to find the definition of an imported function in perl are the following commands:
>perl.exe -d <filename>.pl
This starts the debugger.
Then; do
b <name of the function/subroutine who's definition you are looking for>
in our case that would mean entering:
b rvt_deploy_library
next press 'c' to jump to the mentioned function/subroutine.
This brings the debugger to the required function. Now, you can see the line no. and location of the function/subroutine on the console.
main::rvt_deploy_library(D:/CAT/rvt/lib/rvt.pm:60):
There are a number of ways to declare a method in Perl. Here is an almost certainly incomplete list:
The standard way, eg. sub NAME { ... }
Using MooseX::Method::Signatures, method NAME (...) {...}
Assignment to a typeglob, eg. *NAME = sub {...};
In addition, if the package declares an AUTOLOAD function, then there may be no explicit definition of the method. See perlsub for more information.
You can inspect any perl value with the B module. In this case:
sub function_to_find {}
sub find_sub (\&) {
my $code = shift;
require B;
my $obj = B::svref_2object($code); # create a B::CV object from $code
print "$code:\n";
print " $$_[0]: $$_[1]\n" for
[file => $obj->FILE],
[line => $obj->GV->LINE],
[name => $obj->GV->NAME],
[package => $obj->STASH->NAME];
}
find_sub &function_to_find;
which prints something like:
CODE(0x80ff50):
file: so.pl
line: 7
name: function_to_find
package: main
B::Xref will show all functions declared in all the files used by your code.

How do you replace a method of a Moose object at runtime?

Is it possible to replace a method of a Moose object at runtime ?
By looking at the source code of Class::MOP::Method (which Moose::Meta::Method inherits from) I concluded that by doing
$method->{body} = sub{ my stuff }
I would be able to replace at runtime a method of an object.
I can get the method using
$object->meta->find_method_by_name(<method_name>);
However, this didn't quite work out.
Is it conceivable to modify methods at run time? And, what is the way to do it with Moose?
Moose or not, that does not sound like a good idea.
Instead, design your object to have an accessor for the method. For example, users of your class can use My::Frobnicator->frobnicator->() to get and invoke the frobnicator method and use My::Frobnicator->frobnicator(sub { } ) to set it.
Sinan's idea is a great start.
But with an little extra tweak, you can make using your method accessor just like using a normal method.
#!/usr/bin/perl
use strict;
use warnings;
use Carp;
my $f = Frob->new;
$f->frob(
sub {
my $self = shift;
print "$self was frobbed\n";
print Carp::longmess('frob')
}
);
print "\nCall frob as normal sub\n";
$f->frobit;
print "\nGoto frob\n";
$f->goto_frob;
BEGIN {
package Frob;
use Moose;
has 'frob' => (
is => 'rw',
isa => 'CodeRef',
);
sub frobit {
&{$_[0]->frob};
}
sub goto_frob {
goto $_[0]->frob;
}
}
The two methods in Frob are very similar.
frobit passes all arguments, including the invocant to the code ref.
goto_frob passes all arguments, including the invocant to the code ref, and replaces goto_frob's stack frame with the code refs.
Which to use depends on what you want in the stack.
Regarding munging the body storage of a Class::MOP::Method object, like so $method->{body} = sub { 'foo' }:
It's never a good idea to violate encapsulation when you are doing OOP. Especially not when you are working with complex object systems like Moose and Class::MOP. It's asking for trouble. Sometimes, there is no other way to get what you want, but even then, violating encapsulation is still a bad idea.
Using the previously mentioned MooseX::SingletonMethod you can replace an objects method.
For example:
{
package Foo;
use MooseX::SingletonMethod;
sub foo { say 'bar' };
}
my $bar = Foo->new;
my $baz = Foo->new;
# replace foo method just in $baz object
$baz->add_singleton_method( foo => sub { say 'baz' } );
$bar->foo; # => bar
$baz->foo; # => baz
Also see this SO answer to What should I do with an object that should no longer be used in Perl?, which shows how this can be achieved using Moose roles.
/I3az/

In Perl, what is the right way for a subclass to alias a method in the base class?

I simply hate how CGI::Application's accessor for the CGI object is called query.
I would like my instance classes to be able to use an accessor named cgi to get the CGI object associated with the current instance of my CGI::Application subclass.
Here is a self-contained example of what I am doing:
package My::Hello;
sub hello {
my $self =shift;
print "Hello #_\n";
}
package My::Merhaba;
use base 'My::Hello';
sub merhaba {
goto sub { shift->hello(#_) };
}
package main;
My::Merhaba->merhaba('StackOverflow');
This is working as I think it should and I cannot see any problems (say, if I wanted to inherit from My::Merhaba: Subclasses need not know anything about merhaba).
Would it have been better/more correct to write
sub merhaba {
my $self = shift;
return $self->hello(#_);
}
What are the advantages/disadvantages of using goto &NAME for the purpose of aliasing a method name? Is there a better way?
Note: If you have an urge to respond with goto is evil don't do it because this use of Perl's goto is different than what you have in mind.
Your approach with goto is the right one, because it will ensure that caller / wantarray and the like keep working properly.
I would setup the new method like this:
sub merhaba {
if (my $method = eval {$_[0]->can('hello')}) {
goto &$method
} else {
# error code here
}
}
Or if you don't want to use inheritance, you can add the new method to the existing package from your calling code:
*My::Hello::merhaba = \&My::Hello::hello;
# or you can use = My::Hello->can('hello');
then you can call:
My::Hello->merhaba('StackOverflow');
and get the desired result.
Either way would work, the inheritance route is more maintainable, but adding the method to the existing package would result in faster method calls.
Edit:
As pointed out in the comments, there are a few cases were the glob assignment will run afoul with inheritance, so if in doubt, use the first method (creating a new method in a sub package).
Michael Carman suggested combining both techniques into a self redefining function:
sub merhaba {
if (my $method = eval { $_[0]->can('hello') }) {
no warnings 'redefine';
*merhaba = $method;
goto &merhaba;
}
die "Can't make 'merhaba' an alias for 'hello'";
}
You can alias the subroutines by manipulating the symbol table:
*My::Merhaba::merhaba = \&My::Hello::hello;
Some examples can be found here.
I'm not sure what the right way is, but Adam Kennedy uses your second method (i.e. without goto) in Method::Alias (click here to go directly to the source code).
This is sort of a combination of Quick-n-Dirty with a modicum of indirection using UNIVERSAL::can.
package My::Merhaba;
use base 'My::Hello';
# ...
*merhaba = __PACKAGE__->can( 'hello' );
And you'll have a sub called "merhaba" in this package that aliases My::Hello::hello. You are simply saying that whatever this package would otherwise do under the name hello it can do under the name merhaba.
However, this is insufficient in the possibility that some code decorator might change the sub that *My::Hello::hello{CODE} points to. In that case, Method::Alias might be the appropriate way to specify a method, as molecules suggests.
However, if it is a rather well-controlled library where you control both the parent and child categories, then the method above is slimmmer.