How do I tell what type of value is in a Perl variable?

$x might be a scalar, a ref to an array or a ref to a hash (or maybe other things).

ref():
Perl provides the ref() function so that you can check the reference type before dereferencing a reference...
By using the ref() function you can protect program code that dereferences variables from producing errors when the wrong type of reference is used...

$x is always a scalar. The hint is the sigil $: any variable (or dereferencing of some other type) starting with $ is a scalar. (See perldoc perldata for more about data types.)
A reference is just a particular type of scalar.
The built-in function ref will tell you what kind of reference it is. On the other hand, if you have a blessed reference, ref will only tell you the package name the reference was blessed into, not the actual core type of the data (blessed references can be hashrefs, arrayrefs or other things). Scalar::Util's reftype will tell you the underlying type of the reference:
use Scalar::Util qw(reftype);
my $x = bless {}, 'My::Foo';
my $y = { };
print "type of x: " . ref($x) . "\n";
print "type of y: " . ref($y) . "\n";
print "base type of x: " . reftype($x) . "\n";
print "base type of y: " . reftype($y) . "\n";
...produces the output:
type of x: My::Foo
type of y: HASH
base type of x: HASH
base type of y: HASH
For more information about the other types of references (e.g. coderef, arrayref etc), see this question: How can I get Perl's ref() function to return REF, IO, and LVALUE? and perldoc perlref.
Note: You should not use ref to implement code branches with a blessed object (e.g. ref($a) eq "My::Foo" ? say "is a Foo object" : say "foo not defined";) -- if you need to make any decisions based on the type of a variable, use isa (i.e. if ($a->isa("My::Foo")) { ... or if ($a->can("foo")) { ...). Also see polymorphism.
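To illustrate why isa and can are safer than comparing the result of ref, here is a minimal sketch. The classes My::Foo and My::Bar and the method foo are made up for the example; My::Bar is assumed to inherit from My::Foo.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical classes for illustration: My::Bar inherits from My::Foo.
{
    package My::Foo;
    sub new { my ($class) = @_; return bless {}, $class }
    sub foo { return 'foo' }
}
{
    package My::Bar;
    our @ISA = ('My::Foo');
}

my $obj = My::Bar->new;

# Comparing class names misses subclasses: ref($obj) is 'My::Bar'.
my $exact_match = ref($obj) eq 'My::Foo' ? 1 : 0;

# isa and can follow inheritance; blessed() guards against non-objects.
my $is_a_foo = blessed($obj) && $obj->isa('My::Foo') ? 1 : 0;
my $can_foo  = blessed($obj) && $obj->can('foo')     ? 1 : 0;

print "ref(\$obj):   ", ref($obj), "\n";
print "exact match: $exact_match\n";
print "isa My::Foo: $is_a_foo\n";
print "can foo:     $can_foo\n";
```

The string comparison fails for the subclass instance, while isa and can both succeed.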

A scalar always holds a single element. Whatever is in a scalar variable is always a scalar. A reference is a scalar value.
If you want to know if it is a reference, you can use ref. If you want to know the reference type, you can use the reftype routine from Scalar::Util.
If you want to know if it is an object, you can use the blessed routine from Scalar::Util. You should never care what the blessed package is, though. UNIVERSAL has some methods to tell you about an object: if you want to check that it has the method you want to call, use can; if you want to see that it inherits from something, use isa; and if you want to see if the object handles a role, use DOES.
If you want to know if that scalar is actually just acting like a scalar but tied to a class, try tied. If you get an object, continue your checks.
If you want to know if it looks like a number, you can use looks_like_number from Scalar::Util. If it doesn't look like a number and it's not a reference, it's a string. However, all simple values can be strings.
If you need to do something more fancy, you can use a module such as Params::Validate.
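The checks above can be combined into one helper. This is only a sketch: describe_scalar is a hypothetical name, and the order of the checks (undef, object, plain reference, number, string) is one reasonable choice rather than the only one.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype looks_like_number);

# Hypothetical helper: returns a short description of what a scalar holds.
sub describe_scalar {
    my ($value) = @_;
    return 'undef' unless defined $value;

    # Objects first: blessed() returns the class name, reftype() the core type.
    if (my $class = blessed($value)) {
        return "object of class $class (underlying " . reftype($value) . ")";
    }

    # Plain references: ref() returns HASH, ARRAY, CODE, SCALAR, ...
    if (my $type = ref($value)) {
        return "unblessed $type reference";
    }

    # Anything else is a simple value.
    return looks_like_number($value) ? 'number-like scalar' : 'string';
}

print describe_scalar($_), "\n"
    for 42, 'hello', [1, 2], { a => 1 }, sub { 1 }, bless({}, 'My::Foo');
```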

I like polymorphism instead of manually checking for something:
use MooseX::Declare;
class Foo {
use MooseX::MultiMethods;
multi method foo (ArrayRef $arg){ say "arg is an array" }
multi method foo (HashRef $arg) { say "arg is a hash" }
multi method foo (Any $arg) { say "arg is something else" }
}
Foo->new->foo([]); # arg is an array
Foo->new->foo(40); # arg is something else
This is much more powerful than manual checking, as you can reuse your "checks" like you would any other type constraint. That means when you want to handle arrays, hashes, and even numbers less than 42, you just write a constraint for "even numbers less than 42" and add a new multimethod for that case. The "calling code" is not affected.
Your type library:
package MyApp::Types;
use MooseX::Types -declare => ['EvenNumberLessThan42'];
use MooseX::Types::Moose qw(Num);
subtype EvenNumberLessThan42, as Num, where { $_ < 42 && $_ % 2 == 0 };
Then make Foo support this (in that class definition):
class Foo {
use MyApp::Types qw(EvenNumberLessThan42);
multi method foo (EvenNumberLessThan42 $arg) { say "arg is an even number less than 42" }
}
Then Foo->new->foo(40) prints arg is an even number less than 42 instead of arg is something else.
Maintainable.

At some point I read a reasonably convincing argument on Perlmonks that testing the type of a scalar with ref or reftype is a bad idea. I don't recall who put the idea forward, or the link. Sorry.
The point was that in Perl there are many mechanisms that make it possible to make a given scalar act like just about anything you want. If you tie a filehandle so that it acts like a hash, testing with reftype will tell you that you have a filehandle. It won't tell you that you need to use it like a hash.
So, the argument went, it is better to use duck typing to find out what a variable is.
Instead of:
sub foo {
my $var = shift;
my $type = reftype $var;
my $result;
if( $type eq 'HASH' ) {
$result = $var->{foo};
}
elsif( $type eq 'ARRAY' ) {
$result = $var->[3];
}
else {
$result = 'foo';
}
return $result;
}
You should do something like this:
sub foo {
my $var = shift;
my $result;
eval {
$result = $var->{foo};
1; # guarantee a true result if code works.
}
or eval {
$result = $var->[3];
1;
}
or do {
$result = 'foo';
};
return $result;
}
For the most part I don't actually do this, but in some cases I have. I'm still making my mind up as to when this approach is appropriate. I thought I'd throw the concept out for further discussion. I'd love to see comments.
Update
I realized I should put forward my thoughts on this approach.
This method has the advantage of handling anything you throw at it.
It has the disadvantage of being cumbersome, and somewhat strange. Stumbling upon this in some code would make me issue a big fat 'WTF'.
I like the idea of testing whether a scalar acts like a hash-ref, rather than whether it is a hash ref.
I don't like this implementation.
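To make the "acts like a hash" point concrete, here is a small sketch using the core overload pragma instead of a tied filehandle. HashLike is a hypothetical class: its objects are array references, yet they support hash-style access through the %{} overload.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Hypothetical class: stored as an ARRAY ref, but usable like a hash ref.
{
    package HashLike;
    use overload
        '%{}'    => sub { my ($self) = @_; return $self->[0] },
        fallback => 1;

    sub new {
        my ($class, %data) = @_;
        return bless [ {%data} ], $class;
    }
}

my $obj = HashLike->new(foo => 'bar');

# reftype reports the storage type, not how the object can be used.
my $type = reftype($obj);

# Duck typing: just try the hash access and see whether it works.
my $value = eval { $obj->{foo} };

print "reftype: $type\n";
print defined $value ? "hash access worked: $value\n" : "hash access failed: $@";
```

A reftype check would send this object down the array branch, while the eval-based version finds the hash behaviour.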

Related

Not enough arguments when redefining a subroutine

When I redefine my own subroutine (and not a Perl built-in function), as below :
perl -ce 'sub a($$$){} sub b {a(@_)}'
I get this error :
Not enough arguments for main::a at -e line 1, near "@_)"
I'm wondering why.
Edit :
The word "redefine" is maybe not well chosen. But in my case (and I probably should have explained what I was trying to do originally), I want to redefine (and here "redefine" makes sense) the Test::More::is function by printing first Date and Time before the test result.
Here's what I've done :
Test::More.pm :
sub is ($$;$) {
my $tb = Test::More->builder;
return $tb->is_eq(@_);
}
MyModule.pm :
sub is ($$;$) {
my $t = gmtime(time);
my $date = $t->ymd('/').' '.$t->hms.' ';
print($date);
Test::More::is(@_);
}
The prototype that you have given your subroutine (copied from Test::More::is) says that your subroutine requires two mandatory parameters and one optional one. Passing in a single array will not satisfy that prototype - it is seen as a single parameter which will be evaluated in scalar context.
The fix is to retrieve the two (or three) parameters passed to your subroutine and to pass them, individually, to Test::More::is.
sub is ($$;$) {
my ($got, $expected, $test_name) = @_;
my $t = gmtime(time);
my $date = $t->ymd('/').' '.$t->hms.' ';
print($date);
Test::More::is($got, $expected, $test_name);
}
The problem has nothing to do with your use of a prototype or the fact that you are redefining a subroutine (which, strictly, you aren't as the two subroutines are in different packages) but it's because Test::More::is() has a prototype.
You are not redefining anything here.
You've set a prototype for your sub a by saying sub a($$$). The dollar signs in the function definition tell Perl that this sub takes exactly three scalar arguments. When you call it with a(@_), the prototype puts @_ in scalar context, so Perl sees a single argument where three are required, and the call fails at compile time.
Don't mess with prototypes. You probably don't need them.
Instead, if you know your sub will need three arguments, explicitly grab them where you call it.
sub a($$$) {
...
}
sub b {
my ($one, $two, $three) = @_;
a($one, $two, $three);
}
Or better, don't use the prototype at all.
Also, a and b are terrible names. Don't use them.
In Perl, prototypes don't validate arguments so much as alter parsing rules. $$;$ means the sub expects the caller to match is(EXPR, EXPR) or is(EXPR, EXPR, EXPR).
In this case, bypassing the prototype is ideal.
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
return &Test::More::is(@_);
}
Since you don't care if Test::More::is modifies your @_, the following is a simple optimization:
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
return &Test::More::is;
}
If Test::More::is uses caller, you'll find the following useful:
sub is($$;$) {
print gmtime->strftime("%Y/%m/%d %H:%M:%S ");
goto &Test::More::is;
}
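The effect of a prototype on parsing, and of bypassing it with &, can be seen in a minimal sketch. The sub first_arg is hypothetical and exists only for this demonstration.

```perl
use strict;
use warnings;

# Hypothetical sub with a one-scalar prototype; it is defined before its
# callers so the prototype is known when the calls are compiled.
sub first_arg($) { return $_[0] }

my @items = ('x', 'y', 'z');

# The ($) prototype evaluates @items in scalar context: the element count.
my $with_prototype = first_arg(@items);

# Calling with & ignores the prototype, so @items is flattened as usual.
my $without_prototype = &first_arg(@items);

print "with prototype:    $with_prototype\n";
print "without prototype: $without_prototype\n";
```

The first call receives 3 (the number of elements), the second receives 'x' (the first element).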

Perl get parameter datatype

I am trying to make a subroutine that replaces data depending on its datatype. The problem is that I can't get the datatype of the parameter. I used this:
sub replace {
my ($search, $replacement, $subject) = @_;
if (ref($search) eq "HASH") {
print "r is a reference to a HASH.\n";
}
elsif (ref($search) eq "SCALAR") {
print "r is a reference to a SCALAR.\n";
}
elsif (ref($search) eq "ARRAY") {
print "r is a reference to a ARRAY.\n";
}
}
my $str = "Foo";
my @arr = ("Foo");
replace($str);
replace(@arr);
But neither works. I am really new to Perl.
ref() takes a reference to something, not the something itself. Here:
replace($str);
replace(@arr);
...you are sending in the something directly. Send in a reference to the something instead, by putting a \ in front of it (which says, "take a reference to this something"):
replace(\$str);
replace(\@arr);
Output:
r is a reference to a SCALAR.
r is a reference to a ARRAY.
Note also that in your replace() function, in this line:
my ($search, $replacement, $subject) = @_;
Here you are effectively asking for a single scalar as the thing to search for. If you pass in a list (array, hash, etc.) with more than one element, its extra elements will spill into $replacement and $subject. To make sure you are getting the proper parameters and nothing is clobbered unexpectedly, you may want to do something like this:
sub replace {
my ($search, $replacement, $subject) = @_;
die "first arg must be a ref\n" if ! ref $search;
Of course, you can do further argument checking, but this'll ensure that the first parameter can only be a reference to something. Instead of die(), you can also just return so the program doesn't crash, or print or warn and then return.
It is not stated what you want to do with it, but here's what is wrong with what you show.
The ref function shows the datatype of the reference submitted to it, or it returns an empty string if its argument isn't a reference at all.
So to get the expected behavior you should do
replace(\$str);
replace(\@arr);
Also, you need to add the test to your function
elsif (not ref $search)
for when a submitted string is not a reference.
For completeness, I should also point out an issue explained in the answer by stevieb. When you pass an array to a function, the function receives it as a flat list of arguments, so with your function you clearly do not want replace(@arr). The elements are assigned to your list of scalar variables in order, one element to each. (As soon as there is an array variable in that list, all remaining elements go into it.) See, for example, this post.
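Both points from the answers above (flattening of arrays, and ref returning an empty string for non-references) can be checked with a short sketch. The helpers count_args and kind_of are hypothetical names used only here.

```perl
use strict;
use warnings;

# Hypothetical helpers for the demonstration.
sub count_args { return scalar @_ }
sub kind_of    { my ($arg) = @_; return ref($arg) || 'not a reference' }

my $str = "Foo";
my @arr = ("Foo", "Bar", "Baz");

# An array is flattened into separate arguments; a reference is one argument.
my $flat_count = count_args(@arr);
my $ref_count  = count_args(\@arr);

print "count_args(\@arr):  $flat_count\n";
print "count_args(\\\@arr): $ref_count\n";

# ref() only reports a type when it is given a reference.
print "kind_of(\$str):  ", kind_of($str),  "\n";
print "kind_of(\\\$str): ", kind_of(\$str), "\n";
print "kind_of(\\\@arr): ", kind_of(\@arr), "\n";
```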

Perl dereferencing a subroutine

I have come across code with the following syntax:
$a -> mysub($b);
And after looking into it I am still struggling to figure out what it means. Any help would be greatly appreciated, thanks!
What you have encountered is object-oriented Perl.
It's documented in perlobj. The principle is fairly simple though - an object is a sort of super-hash which, as well as data, also includes built-in code.
The advantage of this is that your data structure 'knows what to do' with its contents. At a basic level, that's just validating data - so you can make a hash that rejects "incorrect" input.
But it allows you to do considerably more complicated things. The real point of it is encapsulation, such that I can write a module, and you can make use of it without really having to care what's going on inside it - only the mechanisms for driving it.
So a really basic example might look like this:
#!/usr/bin/env perl
use strict;
use warnings;
package MyObject;
#define new object
sub new {
my ($class) = @_;
my $self = {};
$self->{count} = 0;
bless( $self, $class );
return $self;
}
#method within the object
sub mysub {
my ( $self, $new_count ) = @_;
$self->{count} += $new_count;
print "Internal counter: ", $self->{count}, "\n";
}
package main;
#create a new instance of `MyObject`.
my $obj = MyObject->new();
#call the method,
$obj->mysub(10);
$obj->mysub(10);
We define a "class", which is a description of how the object 'works'. In this class, we set up a subroutine called mysub - but because it's a class, we refer to it as a "method" - that is, a subroutine that is specifically tied to an object.
We create a new instance of the object (basically the same as my %newhash) and then call the methods within it. If you create multiple objects, they each hold their own internal state, just the same as it would if you created separate hashes.
Also: Don't use $a and $b as variable names. It's dirty. Both because single-letter variable names are poor style, but also because these two in particular are used by sort.
That's a method call. $a is the invocant (a class name or an object), mysub is the method name, and $b is an argument. You should proceed to read perlootut which explains all of this.

Perl Using a hash as a reference is deprecated when used with package

I have a module called News (original name, I know) with a method called get_fields, this method returns all the fields that belong to the module like this
sub get_fields {
my $self = shift;
return $self;
}
Now when I call it like this in a different module where I need to do stuff to the fields
my %fields = %{ $news->get_fields };
I discovered doing it like this prevented this issue
Type of argument to keys on reference must be unblessed hashref or
arrayref
when I iterate over the fields like this
foreach my $key ( keys %fields ) {
%pairs->{$key} = %fields->{$key} if %fields->{$key};
}
in order to use the values of the fields, I get this warning
Using a hash as a reference is deprecated
which is pointing back to the foreach loop.
How can I avoid this error message without getting the unbless warning back?
I think you're getting mixed up between objects and hashes. get_fields will return $self - which whilst I can't tell for certain, looks like it'll be returning a blessed object.
Now, blessed objects are quite similar to hashes, but they're not the same. You can test the difference with the ref function.
So the question is more - why are you doing this? Why are you trying to cast an object reference into a hash? Because that's what you're doing with:
my %fields = %{ $news->get_fields };
Because pretty fundamentally - even if that worked, it would be a horrible thing to do. The point, purpose and reason for objects is encapsulation - e.g. things outside the module don't meddle with stuff inside.
So why not instead have get_fields return a list of fields, which you can then iterate on and make method calls? This would really be the 'right' way to do something like this.
sub get_fields {
my ( $self ) = @_;
return keys %$self;
}
Or if you really must, embed a method within your object that returns a hash - rather than an object reference - that you can then manipulate externally.
Generally - you don't refer to hashes with a % prefix, unless you're manipulating the whole hash.
To extract a single element from %pairs you should do:
foreach my $key ( keys %pairs ) {
print $pairs{$key},"\n";
}
If the contents of $pairs{$key} is itself a reference, then you use -> to dereference it, e.g. $pairs{$key}->{...}. Likewise, if $pairs is a hash reference rather than a hash, you write $pairs->{$key}.
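Applied to the loop from the question, the sigil rules look like this. It is a sketch with made-up field names and values; %fields stands in for whatever the object actually returns.

```perl
use strict;
use warnings;

# Made-up data standing in for the fields of the News object.
my %fields = (title => 'Headline', body => '', author => 'someone');
my %pairs;

# Single elements of a hash use the $ sigil: $fields{$key}, not %fields->{$key}.
foreach my $key (keys %fields) {
    $pairs{$key} = $fields{$key} if $fields{$key};
}

# With a hash reference, the arrow dereferences before the element lookup.
my $fields_ref = \%fields;
my $title      = $fields_ref->{title};

my @kept = sort keys %pairs;
print "kept keys: @kept\n";
print "title via reference: $title\n";
```

The empty body field is skipped, so only author and title end up in %pairs, and no deprecation warning is issued.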

perl constructor keyword 'new'

I am new to Perl and currently learning Perl object oriented and came across writing a constructor.
It looks like when using new for the name of the subroutine the first parameter will be the package name.
Must the constructor be using the keyword new? Or is it because when we are calling the new subroutine using the packagename, then the first parameter to be passed in will be package name?
packagename->new;
and when the subroutine have other name it will be the first parameter will be the reference to an object? Or is it because when the subroutine is call via the reference to the object so that the first parameter to be passed in will be the reference to the object?
$objRef->subroutine;
NB: All examples below are simplified for instructional purposes.
On Methods
Yes, you are correct. The first argument to your new function, if invoked as a method, will be the thing you invoked it against.
There are two “flavors” of invoking a method, but the result is the same either way. One flavor relies upon an operator, the binary -> operator. The other flavor relies on ordering of arguments, the way bitransitive verbs work in English. Most people use the dative/bitransitive style only with built-ins — and perhaps with constructors, but seldom anything else.
Under most (but not quite all) circumstances, these first two are equivalent:
1. Dative Invocation of Methods
This is the positional one, the one that uses word-order to determine what’s going on.
use Some::Package;
my $obj1 = new Some::Package NAME => "fred";
Notice we use no method arrow there: there is no -> as written. This is what Perl itself uses with many of its own functions, like
printf STDERR "%-20s: %5d\n", $name, $number;
Which just about everyone prefers to the equivalent:
STDERR->printf("%-20s: %5d\n", $name, $number);
However, these days that sort of dative invocation is used almost exclusively for built-ins, because people keep getting things confused.
2. Arrow Invocation of Methods
The arrow invocation is for the most part clearer and cleaner, and less likely to get you tangled up in the weeds of Perl parsing oddities. Note I said less likely; I did not say that it was free of all infelicities. But let’s just pretend so for the purposes of this answer.
use Some::Package;
my $obj2 = Some::Package->new(NAME => "fred");
At run time, barring any fancy oddities or inheritance matters, the actual function call would be
Some::Package::new("Some::Package", "NAME", "fred");
For example, if you were in the Perl debugger and did a stack dump, it would have something like the previous line in its call chain.
Since invoking a method always prefixes the parameter list with invocant, all functions that will be invoked as methods must account for that “extra” first argument. This is very easily done:
package Some::Package;
sub new {
my($classname, @arguments) = @_;
my $obj = { @arguments };
bless $obj, $classname;
return $obj;
}
This is just an extremely simplified example of the two most frequent ways to call constructors, and what happens on the inside. In actual production code, the constructor would be much more careful.
Methods and Indirection
Sometimes you don’t know the class name or the method name at compile time, so you need to use a variable to hold one or the other, or both. Indirection in programming is something different from indirect objects in natural language. Indirection just means you have a variable that contains something else, so you use the variable to get at its contents.
print 3.14; # print a number directly
$var = 3.14; # or indirectly
print $var;
We can use variables to hold other things involved in method invocation than merely the method’s arguments.
3. Arrow Invocation with Indirected Method Name:
If you don’t know the method name, then you can put its name in a variable. Only try this with arrow invocation, not with dative invocation.
use Some::Package;
my $action = (rand(2) < 1) ? "new" : "old";
my $obj = Some::Package->$action(NAME => "fido");
Here the method name itself is unknown until run-time.
4. Arrow Invocation with Indirected Class Name:
Here we use a variable to contain the name of the class we want to use.
my $class = (rand(2) < 1)
? "Fancy::Class"
: "Simple::Class";
my $obj3 = $class->new(NAME => "fred");
Now we randomly pick one class or another.
You can actually use dative invocation this way, too:
my $obj3 = new $class NAME => "fred";
But that isn’t usually done with user methods. It does sometimes happen with built-ins, though.
my $fh = ($error_count == 0) ? *STDOUT : *STDERR;
printf $fh "Error count: %d.\n", $error_count;
That’s because trying to use an expression in the dative slot isn’t going to work in general without a block around it; it can otherwise only be a simple scalar variable, not even a single element from an array or hash.
printf { ($error_count == 0) ? *STDOUT : *STDERR } "Error count: %d.\n", $error_count;
Or more simply:
print { $fh{$filename} } "Some data.\n";
Which is pretty darned ugly.
Let the invoker beware
Note that this doesn’t work perfectly. A literal in the dative object slot works differently than a variable does there. For example, with literal filehandles:
print STDERR;
means
print STDERR $_;
but if you use indirect filehandles, like this:
print $fh;
That actually means
print STDOUT $fh;
which is unlikely to mean what you wanted, which was probably this:
print $fh $_;
aka
$fh->print($_);
Advanced Usage: Dual-Nature Methods
The thing about the method invocation arrow -> is that it is agnostic about whether its left-hand operand is a string representing a class name or a blessed reference representing an object instance.
Of course, nothing formally requires that $class contain a package name. It may be either, and if so, it is up to the method itself to do the right thing.
use Some::Class;
my $class = "Some::Class";
my $obj = $class->new(NAME => "Orlando");
my $invocant = (rand(2) < 1) ? $class : $obj;
$invocant->any_debug(1);
That requires a pretty fancy any_debug method, one that does something different depending on whether its invocant was blessed or not:
package Some::Class;
use Scalar::Util qw(blessed);
sub new {
my($classname, @arguments) = @_;
my $obj = { @arguments };
bless $obj, $classname;
return $obj;
}
sub any_debug {
my($invocant, $value) = @_;
if (blessed($invocant)) {
$invocant->obj_debug($value);
} else {
$invocant->class_debug($value);
}
}
sub obj_debug {
my($self, $value) = @_;
$self->{DEBUG} = $value;
}
my $Global_Debug;
sub class_debug {
my($classname, $value) = @_;
$Global_Debug = $value;
}
However, this is a rather advanced and subtle technique, one applicable in only a few uncommon situations. It is not recommended for most situations, as it can be confusing if not handled properly — and perhaps even if it is.
It is not about the first parameter to new, but about indirect object syntax:
perl -MO=Deparse -e 'my $o = new X 1, 2'
which gets parsed as
my $o = 'X'->new(1, 2);
From perldoc,
Perl supports another method invocation syntax called "indirect object" notation. This syntax is called "indirect" because the method comes before the object it is being invoked on.
That being said, new is not some kind of reserved word for constructor invocation; it is just the name of the method/constructor itself, and Perl does not enforce it (e.g. DBI's constructor is called connect).
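As a small sketch of that last point, a constructor can have any name, because what matters is that it is invoked as a class method and blesses a reference into the class it receives. The class Counter and its methods create and count are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical class whose constructor is named "create" instead of "new".
{
    package Counter;

    sub create {
        my ($class, %args) = @_;    # invocant (the class name) comes first
        my $self = { count => $args{start} || 0 };
        return bless $self, $class;
    }

    sub count {
        my ($self) = @_;            # here the invocant is the object
        return $self->{count};
    }
}

my $counter = Counter->create(start => 5);

print "class: ", ref($counter), "\n";
print "count: ", $counter->count, "\n";
```

When create is called through the class name, $class holds "Counter"; when count is called through the object, the first argument is the object itself.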