Perl constructor keyword 'new'

I am new to Perl and am currently learning object-oriented Perl. I came across how to write a constructor.
It looks like when the subroutine is named new, the first parameter is the package name.
Must the constructor use the keyword new? Or is it that when we call the new subroutine through the package name, the first parameter passed in is the package name?
packagename->new;
And when the subroutine has another name, is the first parameter a reference to an object? Or is it that when the subroutine is called via a reference to an object, the first parameter passed in is that reference?
$objRef->subroutine;

NB: All examples below are simplified for instructional purposes.
On Methods
Yes, you are correct. The first argument to your new function, if invoked as a method, will be the thing you invoked it against.
There are two “flavors” of invoking a method, but the result is the same either way. One flavor relies upon an operator, the binary -> operator. The other flavor relies on ordering of arguments, the way bitransitive verbs work in English. Most people use the dative/bitransitive style only with built-ins — and perhaps with constructors, but seldom anything else.
Under most (but not quite all) circumstances, these first two are equivalent:
1. Dative Invocation of Methods
This is the positional one, the one that uses word-order to determine what’s going on.
use Some::Package;
my $obj1 = new Some::Package NAME => "fred";
Notice we use no method arrow there: there is no -> as written. This is what Perl itself uses with many of its own functions, like
printf STDERR "%-20s: %5d\n", $name, $number;
Which just about everyone prefers to the equivalent:
STDERR->printf("%-20s: %5d\n", $name, $number);
However, these days that sort of dative invocation is used almost exclusively for built-ins, because people keep getting things confused.
2. Arrow Invocation of Methods
The arrow invocation is for the most part clearer and cleaner, and less likely to get you tangled up in the weeds of Perl parsing oddities. Note I said less likely; I did not say that it was free of all infelicities. But let’s just pretend so for the purposes of this answer.
use Some::Package;
my $obj2 = Some::Package->new(NAME => "fred");
At run time, barring any fancy oddities or inheritance matters, the actual function call would be
Some::Package::new("Some::Package", "NAME", "fred");
For example, if you were in the Perl debugger and did a stack dump, it would have something like the previous line in its call chain.
Since invoking a method always prefixes the parameter list with the invocant, all functions that will be invoked as methods must account for that “extra” first argument. This is very easily done:
package Some::Package;
sub new {
my($classname, @arguments) = @_;
my $obj = { @arguments };
bless $obj, $classname;
return $obj;
}
This is just an extremely simplified example of the two most frequent ways to call constructors, and what happens on the inside. In actual production code, the constructor would be much more careful.
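To tie this back to the question: nothing in the mechanism depends on the name new. Here is a minimal sketch, assuming a made-up class Counter with a constructor named create and a helper named whoami (all three names are hypothetical). It suggests that the invocant is decided by whatever sits to the left of the arrow, not by the method name:

```perl
use strict;
use warnings;

package Counter;

# A constructor with an arbitrary name; Perl gives "new" no special treatment.
sub create {
    my ($class, %args) = @_;    # invoked as Counter->create, so $class is "Counter"
    return bless { %args }, $class;
}

# Reports the class whether it is called on the class name or on an object.
sub whoami {
    my ($invocant) = @_;
    return ref($invocant) || $invocant;
}

package main;

my $obj = Counter->create(start => 5);

my $from_class  = Counter->whoami;    # invocant is the string "Counter"
my $from_object = $obj->whoami;       # invocant is the blessed reference

print "$from_class\n$from_object\n";
```

Both calls report Counter: the first receives the class name as its first argument, the second receives the object and looks up its class with ref.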
Methods and Indirection
Sometimes you don’t know the class name or the method name at compile time, so you need to use a variable to hold one or the other, or both. Indirection in programming is something different from indirect objects in natural language. Indirection just means you have a variable that contains something else, so you use the variable to get at its contents.
print 3.14; # print a number directly
$var = 3.14; # or indirectly
print $var;
We can use variables to hold other things involved in method invocation than merely the method’s arguments.
3. Arrow Invocation with Indirected Method Name:
If you don’t know the method name, then you can put its name in a variable. Only try this with arrow invocation, not with dative invocation.
use Some::Package;
my $action = (rand(2) < 1) ? "new" : "old";
my $obj = Some::Package->$action(NAME => "fido");
Here the method name itself is unknown until run-time.
4. Arrow Invocation with Indirected Class Name:
Here we use a variable to contain the name of the class we want to use.
my $class = (rand(2) < 1)
? "Fancy::Class"
: "Simple::Class";
my $obj3 = $class->new(NAME => "fred");
Now we randomly pick one class or another.
You can actually use dative invocation this way, too:
my $obj3 = new $class NAME => "fred";
But that isn’t usually done with user methods. It does sometimes happen with built-ins, though.
my $fh = ($error_count == 0) ? *STDOUT : *STDERR;
printf $fh "Error count: %d.\n", $error_count;
That’s because trying to use an expression in the dative slot isn’t going to work in general without a block around it; it can otherwise only be a simple scalar variable, not even a single element from an array or hash.
printf { ($error_count == 0) ? *STDOUT : *STDERR } "Error count: %d.\n", $error_count;
Or more simply:
print { $fh{$filename} } "Some data.\n";
Which is pretty darned ugly.
Let the invoker beware
Note that this doesn’t work perfectly. A literal in the dative object slot works differently than a variable does there. For example, with literal filehandles:
print STDERR;
means
print STDERR $_;
but if you use indirect filehandles, like this:
print $fh;
That actually means
print STDOUT $fh;
which is unlikely to mean what you wanted, which was probably this:
print $fh $_;
aka
$fh->print($_);
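If you want to try the unambiguous forms yourself, here is a small sketch that writes to an in-memory filehandle (the variable names are made up for illustration). It uses the block form and the arrow form side by side, both with an explicit $_:

```perl
use strict;
use warnings;
use IO::Handle;    # makes sure $fh->print is available on older perls

my $buffer = '';
open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";

for ("first line\n") {
    print {$fh} $_;    # block form: the handle is unambiguous, $_ is the data
    $fh->print($_);    # arrow form: same effect
}
close($fh);

print STDOUT $buffer;  # the line was written twice
```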
Advanced Usage: Dual-Nature Methods
The thing about the method invocation arrow -> is that it is agnostic about whether its left-hand operand is a string representing a class name or a blessed reference representing an object instance.
Of course, nothing formally requires that $class contain a package name. It may be either, and if so, it is up to the method itself to do the right thing.
use Some::Class;
my $class = "Some::Class";
my $obj = $class->new(NAME => "Orlando");
my $invocant = (rand(2) < 1) ? $class : $obj;
$invocant->any_debug(1);
That requires a pretty fancy any_debug method, one that does something different depending on whether its invocant was blessed or not:
package Some::Class;
use Scalar::Util qw(blessed);
sub new {
my($classname, @arguments) = @_;
my $obj = { @arguments };
bless $obj, $classname;
return $obj;
}
sub any_debug {
my($invocant, $value) = @_;
if (blessed($invocant)) {
$invocant->obj_debug($value);
} else {
$invocant->class_debug($value);
}
}
sub obj_debug {
my($self, $value) = @_;
$self->{DEBUG} = $value;
}
my $Global_Debug;
sub class_debug {
my($classname, $value) = @_;
$Global_Debug = $value;
}
However, this is a rather advanced and subtle technique, one applicable in only a few uncommon situations. It is not recommended for most situations, as it can be confusing if not handled properly — and perhaps even if it is.

This is not a matter of the first parameter to new, but of indirect object syntax:
perl -MO=Deparse -e 'my $o = new X 1, 2'
which gets parsed as
my $o = 'X'->new(1, 2);
From perldoc,
Perl supports another method invocation syntax called "indirect object" notation. This syntax is called "indirect" because the method comes before the object it is being invoked on.
That being said, new is not a reserved word for constructor invocation. It is simply the name of the method (the constructor) itself, and Perl does not enforce that name; DBI, for example, uses connect as its constructor.

Related

Not enough arguments when redefining a subroutine

When I redefine my own subroutine (and not a Perl built-in function), as below :
perl -ce 'sub a($$$){} sub b {a(@_)}'
I get this error :
Not enough arguments for main::a at -e line 1, near "@_)"
I'm wondering why.
Edit :
The word "redefine" is maybe not well chosen. But in my case (and I probably should have explained what I was trying to do originally), I want to redefine (and here "redefine" makes sense) the Test::More::is function by printing first Date and Time before the test result.
Here's what I've done :
Test::More.pm :
sub is ($$;$) {
my $tb = Test::More->builder;
return $tb->is_eq(@_);
}
MyModule.pm :
sub is ($$;$) {
my $t = gmtime(time);
my $date = $t->ymd('/').' '.$t->hms.' ';
print($date);
Test::More::is(@_);
}
The prototype that you have given your subroutine (copied from Test::More::is) says that your subroutine requires two mandatory parameters and one optional one. Passing in a single array will not satisfy that prototype - it is seen as a single parameter which will be evaluated in scalar context.
The fix is to retrieve the two (or three) parameters passed to your subroutine and to pass them, individually, to Test::More::is.
sub is ($$;$) {
my ($got, $expected, $test_name) = @_;
my $t = gmtime(time);
my $date = $t->ymd('/').' '.$t->hms.' ';
print($date);
Test::More::is($got, $expected, $test_name);
}
The problem has nothing to do with your use of a prototype or the fact that you are redefining a subroutine (which, strictly, you aren't as the two subroutines are in different packages) but it's because Test::More::is() has a prototype.
You are not redefining anything here.
You've set a prototype for your sub a by saying sub a($$$). The dollar signs in the function definition tell Perl that this sub has exactly three scalar parameters. When you call it with a(@_), Perl doesn't know how many elements will be in that list, thus it doesn't know how many arguments the call will have, and fails at compile time.
Don't mess with prototypes. You probably don't need them.
Instead, if you know your sub will need three arguments, explicitly grab them where you call it.
sub a($$$) {
...
}
sub b {
my ($one, $two, $three) = @_;
a($one, $two, $three);
}
Or better, don't use the prototype at all.
Also, a and b are terrible names. Don't use them.
In Perl, prototypes don't validate arguments so much as alter parsing rules. $$;$ means the sub expects the caller to match is(EXPR, EXPR) or is(EXPR, EXPR, EXPR).
In this case, bypassing the prototype is ideal.
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
return &Test::More::is(@_);
}
Since you don't care if Test::More::is modifies your @_, the following is a simple optimization:
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
return &Test::More::is;
}
If Test::More::is uses caller, you'll find the following useful:
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
goto &Test::More::is;
}
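The difference between a prototyped call and an ampersand call is easy to see in isolation. Here is a minimal sketch, assuming a made-up sub add3 that stands in for a and a made-up wrapper that stands in for b:

```perl
use strict;
use warnings;

# Hypothetical three-argument sub with a prototype, standing in for a().
sub add3($$$) {
    my ($x, $y, $z) = @_;
    return $x + $y + $z;
}

# Writing add3(@_) here would fail to compile: the prototype puts @_ in
# scalar context and then sees only one argument. The & sigil skips the
# prototype, so the list is passed through as-is.
sub wrapper {
    return &add3(@_);
}

my $proto = prototype(\&add3);
my $sum   = wrapper(1, 2, 3);

print "prototype: $proto\n";
print "sum: $sum\n";
```

The prototype is still attached to add3 (prototype reports it); the ampersand form only tells the parser not to apply it at this particular call site.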

Perl dereferencing a subroutine

I have come across code with the following syntax:
$a -> mysub($b);
And after looking into it I am still struggling to figure out what it means. Any help would be greatly appreciated, thanks!
What you have encountered is object oriented perl.
It's documented in perlobj. The principle is fairly simple though - an object is a sort of super-hash, which as well as data, also includes built in code.
The advantage of this is that your data structure 'knows what to do' with its contents. At a basic level, that's just validating data - so you can make a hash that rejects "incorrect" input.
But it allows you to do considerably more complicated things. The real point of it is encapsulation, such that I can write a module, and you can make use of it without really having to care what's going on inside it - only the mechanisms for driving it.
So a really basic example might look like this:
#!/usr/bin/env perl
use strict;
use warnings;
package MyObject;
#define new object
sub new {
my ($class) = @_;
my $self = {};
$self->{count} = 0;
bless( $self, $class );
return $self;
}
#method within the object
sub mysub {
my ( $self, $new_count ) = @_;
$self->{count} += $new_count;
print "Internal counter: ", $self->{count}, "\n";
}
package main;
#create a new instance of `MyObject`.
my $obj = MyObject->new();
#call the method,
$obj->mysub(10);
$obj->mysub(10);
We define a "class", which is a description of how the object 'works'. In this class, we set up a subroutine called mysub - but because it's in a class, we refer to it as a "method" - that is, a subroutine that is specifically tied to an object.
We create a new instance of the object (basically the same as my %newhash) and then call the methods within it. If you create multiple objects, they each hold their own internal state, just the same as it would if you created separate hashes.
Also: Don't use $a and $b as variable names. It's dirty, both because single-letter variable names are poor practice and because these two in particular are used by sort.
That's a method call. $a is the invocant (a class name or an object), mysub is the method name, and $b is an argument. You should proceed to read perlootut which explains all of this.
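To make the "invocant becomes the first argument" point concrete, here is a small sketch with a made-up Greeter class (the class, its greet method and the greeting attribute are all hypothetical). With no inheritance involved, the arrow call and the plain subroutine call below do the same thing:

```perl
use strict;
use warnings;

package Greeter;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

sub greet {
    my ($self, $who) = @_;
    return "$self->{greeting}, $who!";
}

package main;

my $g = Greeter->new(greeting => 'Hello');

my $via_arrow = $g->greet('world');             # method call: $g becomes the first argument
my $via_plain = Greeter::greet($g, 'world');    # the equivalent plain subroutine call

print "$via_arrow\n$via_plain\n";
```

The arrow form is the one to use in practice, since it also finds methods inherited from parent classes, which the fully qualified call does not.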

Perl OOP attribute manipulation best practice

Assume the following code:
package Thing;
sub new {
my $this=shift;
bless {@_},$this;
}
sub name {
my $this=shift;
if (@_) {
$this->{_name}=shift;
}
return $this->{_name};
}
Now assume we've instantiated an object thusly:
my $o=Thing->new();
$o->name('Harold');
Good enough. We could also instantiate the same thing more quickly with either of the following:
my $o=Thing->new(_name=>'Harold'); # poor form
my $o=Thing->new()->name('Harold');
To be sure, I allowed attributes to be passed in the constructor to allow "friendly" classes to create objects more completely. It could also allow for a clone-type operator with the following code:
my $o=Thing->new(%$otherthing); # will clone attrs if not deeper than 1 level
This is all well and good. I understand the need for hiding attributes behind methods to allow for validation, etc.
$o->name; # returns 'Harold'
$o->name('Fred'); # sets name to 'Fred' and returns 'Fred'
But what this doesn't allow is easy manipulation of the attribute based on itself, such as:
$o->{_name}=~s/old/ry/; # name is now 'Harry', but this "exposes" the attribute
One alternative is to do the following:
# Cumbersome, not syntactically sweet
my $n=$o->name;
$n=~s/old/ry/;
$o->name($n);
Another potential is the following method:
sub Name :lvalue { # note the capital 'N', not the same as name
my $this=shift;
return $this->{_name};
}
Now I can do the following:
$o->Name=~s/old/ry/;
So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way? I mean, doing that takes away any validation that might be found in the 'name' method. For example, if the 'name' method enforced a capital first letter and lowercase letters thereafter, the 'Name' (capital 'N') bypasses that and forces the user of the class to police herself in the use of it.
So, if the 'Name' lvalue method isn't exactly "kosher" are there any established ways to do such things?
I have considered (but get dizzy considering) things like tied scalars as attributes. To be sure, it may be the way to go.
Also, are there perhaps overloads that may help?
Or should I create replacement methods in the vein of (if it would even work):
sub replace_name {
my $this=shift;
my $repl=shift;
my $new=shift;
$this->{_name}=~s/$repl/$new/;
}
...
$o->replace_name(qr/old/,'ry');
Thanks in advance... and note, I am not very experienced in Perl's brand of OOP, even though I am fairly well-versed in OOP itself.
Additional info:
I guess I could get really creative with my interface... here's an idea I tinkered with, but I guess it shows that there really are no bounds:
sub name {
my $this=shift;
if (@_) {
my $first=shift;
if (ref($first) eq 'Regexp') {
my $second=shift;
$this->{_name}=~s/$first/$second/;
}
else {
$this->{_name}=$first;
}
}
return $this->{_name};
}
Now, I can either set the name attribute with
$o->name('Fred');
or I can manipulate it with
$o->name(qr/old/,'ry'); # name is now Harry
This still doesn't allow stuff like $o->name.=' Jr.'; but that's not too tough to add. Heck, I could allow callback functions to be passed in, couldn't I?
Your first code example is absolutely fine. This is a standard way to write accessors. Of course this can get ugly when doing a substitution; the best solution might be:
$o->name($o->name =~ s/old/ry/r);
The /r flag returns the result of the substitution. Equivalently:
$o->name(do { (my $t = $o->name) =~ s/old/ry/; $t });
Well yes, this 2nd solution is admittedly ugly. But I am assuming that accessing the fields is a more common operation than setting them.
Depending on your personal style preferences, you could have two different methods for getting and setting, e.g. name and set_name. (I do not think get_ prefixes are a good idea: 4 unnecessary characters.)
If substituting parts of the name is a central aspect of your class, then encapsulating this in a special substitute_name method sounds like a good idea. Otherwise this is just unnecessary ballast, and a bad tradeoff for avoiding occasional syntactic pain.
I do not advise you to use lvalue methods, as these are experimental.
I would rather not see (and debug) some “clever” code that returns tied scalars. This would work, but feels a bit too fragile for me to be comfortable with such solutions.
Operator overloading does not help with writing accessors. Especially assignment cannot be overloaded in Perl.
Writing accessors is boring, especially when they do no validation. There are modules that can handle autogeneration for us, e.g. Class::Accessor. This adds generic accessors get and set to your class, plus specific accessors as requested. E.g.
package Thing;
use Class::Accessor 'antlers'; # use the Moose-ish syntax
has name => (is => 'rw'); # declare a read-write attribute
# new is autogenerated. Achtung: this takes a hashref
Then:
Thing->new({ name => 'Harold'});
# or
Thing->new->name('Harold');
# or any of the other permutations.
If you want a modern object system for Perl, there is a range of compatible implementations. The most feature-rich of these is Moose, which allows you to add validation, type constraints, default values, etc. to your attributes. E.g.
package Thing;
use Moose; # this is now a Moose class
has first_name => (
is => 'rw',
isa => 'Str',
required => 1, # must be given in constructor
trigger => \&_update_name, # run this sub after attribute is set
);
has last_name => (
is => 'rw',
isa => 'Str',
required => 1, # must be given in constructor
trigger => \&_update_name,
);
has name => (
is => 'ro', # readonly
writer => '_set_name', # but private setter
);
sub _update_name {
my $self = shift;
$self->_set_name(join ' ', $self->first_name, $self->last_name);
}
# accessors are normal Moose methods, which we can modify
before first_name => sub {
my $self = shift;
if (@_ and $_[0] !~ /^\p{Lu}/) {
Carp::croak "First name must begin with uppercase letter";
}
};
The purpose of a class interface is to prevent users from directly manipulating your data. What you want to do is cool, but not a good idea.
In fact, I design my classes so that even the class itself doesn't know its own structure:
package Thingy;
sub new {
my $class = shift;
my $name = shift;
my $self = {};
bless $self, $class;
$self->name($name);
return $self;
}
sub name {
my $self = shift;
my $name = shift;
my $attribute = "GLUNKENSPEC";
if ( defined $name ) {
$self->{$attribute} = $name;
}
return $self->{$attribute};
}
You can see by my new constructor that I could pass it a name for my Thingy. However, my constructor doesn't know how I store my name. Instead, it merely uses my name method to set the name. As you can see by my name method, it stores the name in an unusual way, but my constructor doesn't need to know or care.
If you want to manipulate the name, you have to work at it (as you showed):
my $name = $thingy->name;
$name =~ s/old/ry/;
$thingy->name( $name );
In fact, a lot of Perl developers use inside out classes just to prevent this direct object manipulation.
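For readers who have not seen one, here is a rough sketch of what an inside-out class can look like. It is an illustration under my own assumptions (a made-up Person class with a single name attribute), not a production-ready implementation; for example, it makes no attempt to handle threads or cloning:

```perl
use strict;
use warnings;

package Person;
use Scalar::Util qw(refaddr);

# Attribute storage lives in a lexical hash keyed by object address,
# so code holding the object has nothing to reach into.
my %name_of;

sub new {
    my ($class, $name) = @_;
    my $self = bless \do { my $anon }, $class;    # an empty scalar ref as the object
    $self->name($name);
    return $self;
}

sub name {
    my $self = shift;
    $name_of{ refaddr($self) } = shift if @_;
    return $name_of{ refaddr($self) };
}

# Clean up the storage when the object goes away.
sub DESTROY {
    my $self = shift;
    delete $name_of{ refaddr($self) };
}

package main;

my $p = Person->new('Harold');
(my $renamed = $p->name) =~ s/old/ry/;    # callers must go through the accessor
$p->name($renamed);

my $final = $p->name;
print "$final\n";
```

Because the object itself is just an empty scalar reference, there is no $p->{_name} to tamper with; the only route to the data is the name method.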
What if you want to be able to directly manipulate a class by passing in a regular expression? You have to write a method to do this:
use Carp; # provides croak
sub mod_name {
my $self = shift;
my $pattern = shift;
my $replacement = shift;
if ( not defined $replacement ) {
croak qq(Some basic error checking: Need pattern and replacement string);
}
my $name = $self->name; # Using my name method for my class
if ( not defined $name ) {
croak qq(Cannot modify name: Name is not yet set.);
}
$name =~ s/$pattern/$replacement/;
return $self->name($name);
}
Now, the developer can do this:
my $thingy = Thingy->new( "Harold" );
$thingy->mod_name( "old", "ry" );
say $thingy->name; # Says "Harry"
Whatever time or effort you save by allowing for direct object manipulation is offset by the magnitude of extra effort it will take to maintain your program. Most methods don't take more than a few minutes to create. If I suddenly got a hankering to manipulate my object in a new and surprising way, it's easy enough to create a new method to do this.
1. No. I don't actually use random nonsense words to protect my class. This is purely for demo purposes to show that even my constructor doesn't have to know how methods actually store their data.
I understand the need for hiding attributes behind methods to allow for validation, etc.
Validation is not the only reason, although it is the only one you refer to. I mention this because another is that encapsulation like this leaves the implementation open. For example, if you have a class which needs to have a string "name" which can be get and set, you could just expose a member, name. However, if you instead use get()/set() subroutines, how "name" is stored and represented internally doesn't matter.
That can be very significant if you write bunches of code which uses the class and then suddenly realize that although the user may be accessing "name" as a string, it would be much better stored some other way (for whatever reason). If the user was accessing the string directly, as a member field, you now either have to compensate for this by including code that will change name when the real whatever is changed and...but wait, how can you then compensate for the client code that changed name...
You can't. You're stuck. You now have to go back and change all the code that uses the class -- if you can. I'm sure anyone who has done enough OOP has run into this situation in one form or another.
No doubt you've read all this before, but I'm bringing it up again because there are a few points (perhaps I've misunderstood you) where you seem to outline strategies for changing "name" based on your knowledge of the implementation, and not what was intended to be the API. That is very tempting in perl because of course there is no access control -- everything is essentially public -- but it is still a very very bad practice for the reason just described.
That doesn't mean, of course, that you can't simply commit to exposing "name" as a string. That's a decision and it won't be the same in all cases. However, in this particular case, if what you are particularly concerned with is a simple way to transform "name", IMO you might as well stick with a get/set method. This:
# Cumbersome, not syntactically sweet
Maybe true (although someone else might say it is simple and straightforward), but your primary concern should not be syntactic sweetness, and neither should speed of execution. They can be concerns, but your primary concern has to be design, because no matter how sweet and fast your stuff is, if it is badly designed, it will all come down around you in time.
Remember, "Premature optimization is the root of all evil" (Knuth).
So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way?
It boils down to: Will this continue to work if the internals change? If the answer is yes, you can do many other things including but not limited to validation.
The answer is yes. This can be done by having the method return a magical value.
{
package Lvalue;
sub TIESCALAR { my $class = shift; bless({ @_ }, $class) }
sub FETCH { my $self = shift; my $m = $self->{getter}; $self->{obj}->$m(@_) }
sub STORE { my $self = shift; my $m = $self->{setter}; $self->{obj}->$m(@_) }
}
sub new { my $class = shift; bless({}, $class) }
sub get_name {
my ($self) = @_;
return $self->{_name};
}
sub set_name {
my ($self, $val) = @_;
die "Invalid name" if !length($val);
$self->{_name} = $val;
}
sub name :lvalue {
my ($self) = @_;
tie my $rv, 'Lvalue', obj=>$self, getter=>'get_name', setter=>'set_name';
return $rv;
}
my $o = __PACKAGE__->new();
$o->name = 'abc';
print $o->name, "\n"; # abc
$o->name = ''; # Invalid name

About using an array of functions in Perl

We are trying to build an API to support commit() and rollback() automatically, so that we don't have to bother with it anymore. By researching, we have found that using eval {} is the way to go.
For eval {} to know what to do, I have thought of giving the API an array of functions, which it can execute with a foreach without the API having to interpret anything. However, this function might be in a different package.
Let me clarify with an example:
sub handler {
use OSA::SQL;
use OSA::ourAPI;
my @functions = ();
push(@functions, OSA::SQL->add_page($date, $stuff, $foo, $bar));
my $API = OSA::ourAPI->connect();
$API->exec_multi(@functions);
}
The question is: Is it possible to execute the functions in @functions inside of OSA::ourAPI, even if ourAPI has no use OSA::SQL? If not, would it be possible if I use an array reference instead of an array, given that the pointer would point to the known function inside of the memory?
Note: This is the basic idea that we want to base the more complex final version on.
You are NOT adding a function pointer to your array. You are adding the return value of calling the add_page() subroutine. You have 3 solutions to this:
A. You will need to store (in @functions) an array of arrayrefs of the form [\&OSA::SQL::add_page, @argument_values], meaning you pass in an actual reference to a subroutine (called statically); and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
sub exec_multi {
my ($class, $funcs)= @_;
foreach my $f (@$funcs) {
my ($func, @args) = @$f;
my $res = &$func(@args);
print "RES:$res\n";
}
}
Just to re-iterate, this will call individual subs in static version (OSA::SQL::add_page), e.g. WITHOUT passing the package name as the first parameter as a class call OSA::SQL->add_page would. If you want the latter, see the next solution.
B. If you want to call your subs in class context (like in your example, in other words with the class name as a first parameter), you can use ysth's suggestion in the comment.
You will need to store (in @functions) an array of arrayrefs of the form [sub { OSA::SQL->add_page(@argument_values) }], meaning you pass in a reference to a subroutine which will in turn call what you need; and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
sub exec_multi {
my ($class, $funcs)= @_;
foreach my $f (@$funcs) {
my ($func) = @$f;
my $res = &$func();
print "RES:$res\n";
}
}
C. You will need to store (in @functions) an array of arrayrefs of the form [ "OSA::SQL", "add_page", @argument_values], meaning you pass in a package and function name; and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
my ($package, $sub, @args) = @{ $functions[$i] };
no strict 'refs';
$package->$sub(#args);
use strict 'refs';
If I understood your question correctly, then you don't need to worry about whether ourAPI uses OSA::SQL, since your main code imports it already.
However, since - in #1B - you will be passing a list of packages to exec_multi as first elements of each arrayref, you can do "require $package; $package->import();" in exec_multi. But again, it's completely un-necessary if your handler call already required and loaded each of those packages. And to do it right you need to pass in a list of parameters to import() as well. BUT WHYYYYYY? :)
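Putting the pieces together with the eval {} idea from the question, here is a self-contained sketch in the style of solution A. It is written as a plain function rather than a method, and everything in it is an assumption for illustration: the steps are dummy closures rather than real OSA::SQL calls, and the actual commit() and rollback() calls are left out because they depend on your database handle.

```perl
use strict;
use warnings;

# Runs each step in order and stops at the first one that dies.
# Returns (1, @results) on success or (0, $error) on failure.
sub exec_multi {
    my (@steps) = @_;
    my @results;
    my $ok = eval {
        for my $step (@steps) {
            my ($code, @args) = @$step;
            push @results, $code->(@args);
        }
        1;    # reached only if no step died
    };
    return $ok ? (1, @results) : (0, $@);
}

# Hypothetical steps standing in for calls such as OSA::SQL::add_page.
my @functions = (
    [ sub { return "added page $_[0]" }, 'home' ],
    [ sub { die "disk full\n" } ],
);

my ($ok, @rest) = exec_multi(@functions);
print $ok ? "commit\n" : "rollback: $rest[0]";
```

Where the sketch prints "commit" or "rollback", the real API would call the corresponding method on its database handle. Note that the code references carry everything they need with them, so exec_multi does not have to load the package the subs came from.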

How do I tell what type of value is in a Perl variable?

How do I tell what type of value is in a Perl variable?
$x might be a scalar, a ref to an array or a ref to a hash (or maybe other things).
ref():
Perl provides the ref() function so that you can check the reference type before dereferencing a reference...
By using the ref() function you can protect program code that dereferences variables from producing errors when the wrong type of reference is used...
$x is always a scalar. The hint is the sigil $: any variable (or dereferencing of some other type) starting with $ is a scalar. (See perldoc perldata for more about data types.)
A reference is just a particular type of scalar.
The built-in function ref will tell you what kind of reference it is. On the other hand, if you have a blessed reference, ref will only tell you the package name the reference was blessed into, not the actual core type of the data (blessed references can be hashrefs, arrayrefs or other things). You can use Scalar::Util's reftype to find out the underlying type of the reference:
use Scalar::Util qw(reftype);
my $x = bless {}, 'My::Foo';
my $y = { };
print "type of x: " . ref($x) . "\n";
print "type of y: " . ref($y) . "\n";
print "base type of x: " . reftype($x) . "\n";
print "base type of y: " . reftype($y) . "\n";
...produces the output:
type of x: My::Foo
type of y: HASH
base type of x: HASH
base type of y: HASH
For more information about the other types of references (e.g. coderef, arrayref etc), see this question: How can I get Perl's ref() function to return REF, IO, and LVALUE? and perldoc perlref.
Note: You should not use ref to implement code branches with a blessed object (e.g. ref($a) eq "My::Foo" ? say "is a Foo object" : say "foo not defined";) -- if you need to make any decisions based on the type of a variable, use isa (i.e. if ($a->isa("My::Foo")) { ... or if ($a->can("foo")) { ...). Also see polymorphism.
A scalar always holds a single element. Whatever is in a scalar variable is always a scalar. A reference is a scalar value.
If you want to know if it is a reference, you can use ref. If you want to know the reference type,
you can use the reftype routine from Scalar::Util.
If you want to know if it is an object, you can use the blessed routine from Scalar::Util. You should never care what the blessed package is, though. UNIVERSAL has some methods to tell you about an object: if you want to check that it has the method you want to call, use can; if you want to see that it inherits from something, use isa; and if you want to see if the object handles a role, use DOES.
If you want to know if that scalar is actually just acting like a scalar but tied to a class, try tied. If you get an object, continue your checks.
If you want to know if it looks like a number, you can use looks_like_number from Scalar::Util. If it doesn't look like a number and it's not a reference, it's a string. However, all simple values can be strings.
If you need to do something more fancy, you can use a module such as Params::Validate.
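As a rough illustration of how those checks can be combined, here is a sketch of a helper I am calling describe (a made-up name; the wording of its return strings is arbitrary too). It only uses ref plus blessed, reftype and looks_like_number from Scalar::Util:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype looks_like_number);

# Returns a short description of what a scalar holds.
sub describe {
    my ($value) = @_;
    return 'undef' unless defined $value;
    if (defined(my $class = blessed($value))) {
        return "object of class $class (" . reftype($value) . " underneath)";
    }
    return 'plain ' . ref($value) . ' reference' if ref($value);
    return looks_like_number($value) ? 'number-like scalar' : 'string';
}

print describe($_), "\n" for ( [], bless({}, 'My::Foo'), '42', 'abc', undef );
```

The order of the checks matters: blessed comes before ref, because ref on an object returns the class name rather than the underlying type.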
I like polymorphism instead of manually checking for something:
use MooseX::Declare;
class Foo {
use MooseX::MultiMethods;
multi method foo (ArrayRef $arg){ say "arg is an array" }
multi method foo (HashRef $arg) { say "arg is a hash" }
multi method foo (Any $arg) { say "arg is something else" }
}
Foo->new->foo([]); # arg is an array
Foo->new->foo(40); # arg is something else
This is much more powerful than manual checking, as you can reuse your "checks" like you would any other type constraint. That means when you want to handle arrays, hashes, and even numbers less than 42, you just write a constraint for "even numbers less than 42" and add a new multimethod for that case. The "calling code" is not affected.
Your type library:
package MyApp::Types;
use MooseX::Types -declare => ['EvenNumberLessThan42'];
use MooseX::Types::Moose qw(Num);
subtype EvenNumberLessThan42, as Num, where { $_ < 42 && $_ % 2 == 0 };
Then make Foo support this (in that class definition):
class Foo {
use MyApp::Types qw(EvenNumberLessThan42);
multi method foo (EvenNumberLessThan42 $arg) { say "arg is an even number less than 42" }
}
Then Foo->new->foo(40) prints arg is an even number less than 42 instead of arg is something else.
Maintainable.
At some point I read a reasonably convincing argument on Perlmonks that testing the type of a scalar with ref or reftype is a bad idea. I don't recall who put the idea forward, or the link. Sorry.
The point was that in Perl there are many mechanisms that make it possible to make a given scalar act like just about anything you want. If you tie a filehandle so that it acts like a hash, then testing with reftype will tell you that you have a filehandle. It won't tell you that you need to use it like a hash.
So, the argument went, it is better to use duck typing to find out what a variable is.
Instead of:
sub foo {
my $var = shift;
my $type = reftype $var;
my $result;
if( $type eq 'HASH' ) {
$result = $var->{foo};
}
elsif( $type eq 'ARRAY' ) {
$result = $var->[3];
}
else {
$result = 'foo';
}
return $result;
}
You should do something like this:
sub foo {
my $var = shift;
my $type = reftype $var;
my $result;
eval {
$result = $var->{foo};
1; # guarantee a true result if code works.
}
or eval {
$result = $var->[3];
1;
}
or do {
$result = 'foo';
};
return $result;
}
For the most part I don't actually do this, but in some cases I have. I'm still making my mind up as to when this approach is appropriate. I thought I'd throw the concept out for further discussion. I'd love to see comments.
Update
I realized I should put forward my thoughts on this approach.
This method has the advantage of handling anything you throw at it.
It has the disadvantage of being cumbersome, and somewhat strange. Stumbling upon this in some code would make me issue a big fat 'WTF'.
I like the idea of testing whether a scalar acts like a hash-ref, rather than whether it is a hash ref.
I don't like this implementation.
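One possible middle ground, offered only as a sketch and not as what the Perlmonks argument proposed: ask whether the value can be dereferenced as a hash, either because it is hash-based or because its class overloads %{}. The helper name acts_like_hash and the HashLike class below are made up for illustration. This does not cover every trick mentioned above (tied handles, for one), so treat it as a partial answer:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);
use overload ();

# True if $var can be dereferenced as a hash: it is hash-based
# (blessed or not), or it is an object whose class overloads %{}.
sub acts_like_hash {
    my ($var) = @_;
    return 0 unless ref $var;
    return 1 if blessed($var) && overload::Method($var, '%{}');
    return reftype($var) eq 'HASH' ? 1 : 0;
}

# Hypothetical array-based class that still behaves like a hash.
package HashLike;
use overload '%{}' => sub { return { foo => 'from overload' } }, fallback => 1;
sub new { return bless [], shift }

package main;

my @cases = ( { foo => 1 }, [ 1, 2 ], HashLike->new, 'plain string' );
print acts_like_hash($_) ? "hash-like\n" : "not hash-like\n" for @cases;
```

Unlike the nested eval version, this never attempts the dereference, so a failed guess has no side effects; the cost is that it only knows about the mechanisms it explicitly checks for.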