Why does `eq` not work when one argument has overloaded stringification? - perl

I have realised (the hard way) that operator eq gives a fatal runtime error when one of the operand is an object with overloaded stringification.
Here is a minimal example:
my $test = MyTest->new('test');
print 'yes' if $test eq 'test';
package MyTest;
use overload '""' => sub { my $self = shift; return $self->{'str'} };
sub new {
my ( $class, $str ) = #_;
return bless { str => $str }, $class;
}
The result of running this is:
Operation "eq": no method found,
left argument in overloaded package MyTest,
right argument has no overloaded magic at ./test.pl line 7.
My expectation from reading perlop would be that string context is forced on both operands, firing the stringification method in $test, then the resulting strings are compared. Why doesn't it work? What is actually hapenning?
The context in which I had this problem was in a script that uses both autodie and Try::Tiny. In the try block, I die with some specific messages to be caught. But in the catch block, when I test for whether $_ eq "my specific message\n", this gives a runtime if $_ is an autodie::exception.
I know I will have to replace $_ eq "..." with !ref && $_ eq "...", but I would like to know why.

You only overloaded stringification, not string comparison. The overload pragma will however use the overloaded stringification for the string comparison if you specify the fallback => 1 parameter:
my $test = MyTest->new('test');
print 'yes' if $test eq 'test';
package MyTest;
use overload
fallback => 1,
'""' => sub { my $self = shift; return $self->{'str'} };
sub new {
my ( $class, $str ) = #_;
return bless { str => $str }, $class;
}
Details on why this works:
When handed an overloaded object, the eq operator will try to invoke the eq overload. We did not provide an overload, and we didn't provide a cmp overload from which eq could be autogenerated. Therefore, Perl will issue that error.
With fallback => 1 enabled, the error is suppressed and Perl will do what it would do anyway – coerce the arguments to strings (which invokes stringification overloading or other magic), and compare them.

Related

String overloaded variable is considered defined no matter what

I have the following lines in my script:
my $spec = shift;
if (!defined $spec) {
return ("Invalid specification", undef);
}
$spec = "$spec" // '';
I would naturally expect this to, when passed undef, return the warning Invalid specification in the array, with the second item being undef. Instead, the check is passed, and I get a console message warning me about Use of uninitialized value $spec in string on the next line.
$spec is an object with string and number overloading, and is unfortunately written such that attempting to test for truthiness in this particular subroutine (by way of if ($spec) for instance) results in deep recursion and a segfault.
While I am interested in why, exactly, this is happening, I'm more interested in how to make it stop happening. I want to eliminate the console warning, preferable without no warnings qw/uninitialized/. Is this possible, and if so, how do I do it?
You say that $spec is an object with string overloading.
If that's the case then you need to coerce it into String form before checking for it being defined:
if (! defined overload::StrVal($spec)) {
Correction per ysth
As ysth pointed out in the StrVal does not coerce the overloaded stringification:
overload::StrVal(arg)
Gives the string value of arg as in the absence of stringify overloading. If you are using this to get the address of a reference (useful for checking if two references point to the same thing) then you may be better off using Scalar::Util::refaddr() , which is faster.
Therefore to accomplish this, try his other suggestion of:
"$spec" trapping warnings and detecting the uninitialized var warning. Better to add a method to the class to test for whatever case returns undef.
The following demonstrates this approach:
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More tests => 2;
my $obj_str_defined = StringOverloaded->new("has value");
my $obj_str_undef = StringOverloaded->new(undef);
ok( is_overloaded_string_defined($obj_str_defined), qq{\$obj_str_defined is defined} );
ok( !is_overloaded_string_defined($obj_str_undef), qq{\$obj_str_undef is undef} );
sub is_overloaded_string_defined {
my $obj = shift;
my $is_str_defined = 1;
local $SIG{__WARN__} = sub {
$is_str_defined = 0 if $_[0] =~ /Use of uninitialized value \$obj in string/;
};
my $throwaway_var = "$obj";
return $is_str_defined;
}
{
# Object with string overloading
package StringOverloaded;
use strict;
use warnings;
use overload (
'""' => sub {
my $self = shift;
return $$self; # Dereference
},
fallback => 1
);
sub new {
my $pkg = shift;
my $val = shift;
my $self = bless \$val, $pkg;
return $self;
}
}
Output:
1..2
ok 1 - $obj_str_defined is defined
ok 2 - $obj_str_undef is undef

unit test for Perl's sort

I try to use a (class) method in an Object for sorting object instances.
package Something;
use strict;
use warnings;
use Data::Dumper;
sub new {
my ($class, $date) = #_;
my $self = bless{}, $class;
$self->{date} = $date;
return $self;
}
sub _sort($$) {
print STDERR Dumper($_[0], $_[1]);
$_[0]->{date} cmp $_[1]->{date};
}
package SomethingTest;
use base 'Test::Class';
use Test::More;
__PACKAGE__->runtests() unless caller;
sub sorting : Test {
my $jan = Something->new("2016-01-01");
my $feb = Something->new("2016-02-01");
my $mar = Something->new("2016-03-01");
is_deeply(
sort Something::_sort [$feb, $mar, $jan],
[$jan, $feb, $mar]);
}
I've seen this snippet in perldoc -f sort, hence the prototype for _sort.
# using a prototype allows you to use any comparison subroutine
# as a sort subroutine (including other package's subroutines)
package other;
sub backwards ($$) { $_[1] cmp $_[0]; } # $a and $b are
# not set here
package main;
#new = sort other::backwards #old;
However, the dumped arguments look odd:
$VAR1 = [
bless( {
'date' => '2016-02-01'
}, 'Something' ),
bless( {
'date' => '2016-03-01'
}, 'Something' ),
bless( {
'date' => '2016-01-01'
}, 'Something' )
];
$VAR2 = [
$VAR1->[2],
$VAR1->[0],
$VAR1->[1]
];
and the test fails with
# Failed test 'sorting died (Not a HASH reference at sort.t line 16.)'
# at sort.t line 25.
Is this just my test setup or can't I have the same objects in these arrays?
What else am I missing?
Your problem isn't with the subroutine you pass to sort(), but in the arguments you pass to is_deeply(). The way you have written it parses like this, if we add some parentheses:
is_deeply(
sort(Something::_sort [$feb, $mar, $jan], [$jan, $feb, $mar] )
);
That is, you're telling sort() to act on a list consisting of two anonymous array references, and then is_deeply() to run with the single argument returned from sort (except it crashes before is_deeply() can try to run and complain that you gave it too few arguments to work with).
This is probably closer to what you intended:
is_deeply(
[sort(Something::_sort ($feb, $mar, $jan))],
[$jan, $feb, $mar]);
That is, tell is_deeply() to compare two anonymous arrays, the first of which is made from telling sort() to apply your sorting routine to the list ($feb, $mar, $jan).

Confusion with checking Hash of Hash of Arrays

I am trying to compare my hash input to valid allowed options in my data structure, and if it's not one of the options then I set the default value for the key. I seem to be missing something here though.
Example of current data structure..
my $opts = {
file => { require => 1 },
head => {
default => 1,
allowed => [0,1],
},
type => {
default => 'foo',
allowed => [qw(foo bar baz)]
},
};
$args is my hash ref ( file => 'file.txt', type => 'foo', head => 1 )
Snippet of what I've tried..
for my $k ( keys %$opts ) {
croak("Argument '$k' is required in constructor call!")
if $opts->{$k}->{require} and !exists $args->{$k};
if (exists $args->{$k}) {
if (grep {!$args->{$k}} #{$opts->{$k}->{allowed}} ) {
$args->{$k} = $opts->{$k}->{default};
}
...
} else {
..set our defaults
$args->{$k} = $opts->{$k}->{default};
}
}
The checking for allowed values is faulty.
The grep function takes a code block and a list. It sets the $_ variable to each element in the list in turn. If the block returns a true value, the element is kept. In scalar context, grep does not return a list of kept elements, but a count.
Your grep block is {!$args->{$k}}. This returns true when $args->{$k} is false and vice versa. The result does not depend on $_, and therefore doesn't check if the argument is one of the allowed values.
To see if the given value is allowed value, you'll have to test for some form of equivalence, e.g.
if (grep { $args->{$k} eq $_ } #{ $opts->{$k}{allowed} }) {
# this is executed when the arg matches an allowed value
} else {
# the arg is not allowed
}
An Excursion To Smart Matching and List::MoreUtils
If you can use a perl > v10, then smart matching is available. This would express above condition as
use 5.010;
$args->{$k} ~~ $opts->{$k}{allowed}
The lengthy table of possible type combinations states that this is roughly equivalent to the grep if the arg is a scalar (string/number), and the allowed arrayref holds only normal scalars as well.
However, smart matching was re-marked as experimantal in v18, and behaviour will likely change soon.
In the meantime, it might be better to stick to explicit grep etc. But we could implement two improvements:
The grep will test all elements, even when a match was already found. This can be inefficient. The first function from List::Util core module has the same syntax as grep, but stops after the first element. If the block matches a value, this value is returned. If no value matches, it returns undef. This makes things complicated when undef might be a valid value, or even when false values may be allowed. But in your case, the grep could be replaced by
use List::Util 'first';
defined first { $_ eq $args->{$k} } #{ $opts->{$k}{allowed} }
The List::MoreUtils module has even more functionality. It provides for example the any function, which corresponds to the mathematical ∃ (there exists) quantifier:
use List::MoreUtils 'any';
any { $_ eq $args->{$k} } #{ $opts->{$k}{allowed} }
This only returns a boolean value. While it may not be as efficient as a plain grep or first, using any is quite self-documenting, and easier to use.
Until now, I have assumed that we'll only ever do string comparision to the allowed values. This sometimes works, but it would be better to specify an explicit mode. For example
croak qq(Value for "$k": "$args->{$k}" not allowed) unless
$opts->{$k}{mode} eq 'str' and any { $args->{$k} eq $_ } #{ $opts->{$k}{allowed} }
or $opts->{$k}{mode} eq 'like' and any { $args->{$k} =~ $_ } #{ $opts->{$k}{allowed} }
or $opts->{$k}{mode} eq 'num' and any { $args->{$k} == $_ } #{ $opts->{$k}{allowed} }
or $opts->{$k}{mode} eq 'smart' and any { $args->{$k} ~~ $_ } #{ $opts->{$k}{allowed} }
or $opts->{$k}{mode} eq 'code' and any { $args->{$k}->($_) } #{ $opts->{$k}{allowed} };
Preventing unknown options
You may or may not want to forbid unknown options in your $args hash. Especially if you consider composability of classes, you may want to ignore unknown options, as a superclass or subclass may need these.
But if you choose to check for wrong options, you could delete those elements you already handled:
my $self = {};
for my $k (keys %$opts) {
my $v = delete $args->{$k};
...; # use $v in the rest of the loop
$self->{$k} = $v;
}
croak "Unknown arguments (" . (join ", ", keys %$args) . ") are forbidden" if keys %$args;
or grep for unknown args:
my #unknown = grep { not exists $opts->{$_} } keys %$args;
croak "Unknown arguments (" . (join ", ", #unknown) . ") are forbidden" if #unknown;
for my $k (keys %$opts) {
...;
}
or you could loop over the combined keys of $args and $opts:
use List::Util 'uniq';
for my $k (uniq keys(%$opts), keys(%$args)) {
croak "Unknown argument $k" unless exists $opts->{$k};
...;
}
Scalar Context
I have assumed that you correctly initialized $args as a hash reference:
my $args = { file => 'file.txt', type => 'foo', head => 1 };
Using parens instead of curlies is syntactically valid:
my $args = ( file => 'file.txt', type => 'foo', head => 1 );
but this doesn't produce a hash. Instead, the => and , behave like the comma operator in C: the left operand is evaluated and discarded. That is, only the last element is kept:
my $args = 1; # equivalent to above snippet.

Best way to use "isa" method?

What is the "best" way to use "isa()" reliably? In other words, so it works correctly on any value, not just an object.
By "best", I mean lack of un-handled corner cases as well as lack of potential performance issues, so this is not a subjective question.
This question mentions two approaches that seem reliable (please note that the old style UNIVERSAL::isa() should not be used, with reasons well documented in the answers to that Q):
eval { $x->isa("Class") }
#and check $# in case $x was not an object, in case $x was not an object
use Scalar::Util 'blessed';
blessed $x && $x ->isa($class);
The first one uses eval, the second uses B:: (at least for non-XS flavor of Scalar::Util).
The first does not seem to work correctly if $x is a scalar containing a class name, as illustrated below, so I'm leaning towards #2 (using blessed) unless somoene indicates a good reason not to.
$ perl5.8 -e '{use IO::Handle;$x="IO::Handle";
eval {$is = $x->isa("IO::Handle")}; print "$is:$#\n";}'
1:
Are there any objective reasons to pick one of these two approaches (or a 3rd one i'm not aware of) such as performance, not handling some special case, etc...?
The Scalar::Util implementation is categorically better. It avoids the overhead of the eval {} which always results in the setting of an additional variable.
perl -we'$#=q[foo]; eval {}; print $#'
The Scalar::Util implementation is easier to read (it doesn't die for a reason that is unknown to the code). If the eval fails too, I believe what happens is you have walk backwards in the tree to the state prior to the eval -- this is how resetting state is achieved. This comes with additional overhead on failure.
Benchmarks
Not an object at all
Rate eval su
eval 256410/s -- -88%
su 2222222/s 767% --
Object passing isa check
Rate su eval
su 1030928/s -- -16%
eval 1234568/s 20% --
Object failing isa check
Rate su eval
su 826446/s -- -9%
eval 909091/s 10% --
Test code:
use strict;
use warnings;
use Benchmark;
use Scalar::Util;
package Foo;
Benchmark::cmpthese(
1_000_000
, {
eval => sub{ eval{ $a->isa(__PACKAGE__) } }
, su => sub { Scalar::Util::blessed $a && $a->isa(__PACKAGE__) }
}
);
package Bar;
$a = bless {};
Benchmark::cmpthese(
1_000_000
, {
eval => sub{ eval{ $a->isa(__PACKAGE__)} }
, su => sub { Scalar::Util::blessed $a && $a->isa(__PACKAGE__) }
}
);
package Baz;
$a = bless {};
Benchmark::cmpthese(
1_000_000
, {
eval => sub{ eval{ $a->isa('duck')} }
, su => sub { Scalar::Util::blessed $a && $a->isa( 'duck' ) }
}
);
I used This is perl, v5.10.1 (*) built for i486-linux-gnu-thread-multi, and Scalar::Util, 1.21
You can wrap the safety checks in a scalar and then use the scalar as a method to keep things clean:
use Scalar::Util 'blessed';
my $isa = sub {blessed $_[0] and $_[0]->isa($_[1])};
my $obj;
if ($obj->$isa('object')) { ... } # returns false instead of throwing an error
$obj = {};
if ($obj->$isa('object')) { ... } # returns false as well
bless $obj => 'object';
if ($obj->$isa('object')) { say "we got an object" }
Note that $obj->$isa(...) is just a different spelling of $isa->($obj, ...) so no method call actually takes place (which is why it avoids throwing any errors).
And here is some code that will allow you to call isa on anything and then inspect the result (inspired by Axeman's answer):
{package ISA::Helper;
use Scalar::Util;
sub new {
my ($class, $obj, $type) = #_;
my $blessed = Scalar::Util::blessed $obj;
bless {
type => $type,
obj => $obj,
blessed => $blessed,
isa => $blessed && $obj->isa($type)
} => $class
}
sub blessed {$_[0]{blessed}}
sub type {$_[0]{isa}}
sub ref {ref $_[0]{obj}}
sub defined {defined $_[0]{obj}}
use overload fallback => 1,
bool => sub {$_[0]{isa}};
sub explain {
my $self = shift;
$self->type ? "object is a $$self{type}" :
$self->blessed ? "object is a $$self{blessed} not a $$self{type}" :
$self->ref ? "object is a reference, but is not blessed" :
$self->defined ? "object is defined, but not a reference"
: "object is not defined"
}
}
my $isa = sub {ISA::Helper->new(#_)};
By placing the code reference in a scalar, it can be called on anything without error:
my #items = (
undef,
5,
'five',
\'ref',
bless( {} => 'Other::Pkg'),
bless( {} => 'My::Obj'),
);
for (#items) {
if (my $ok = $_->$isa('My::Obj')) {
print 'ok: ', $ok->explain, "\n";
} else {
print 'error: ', $ok->explain, "\n";
}
}
print undef->$isa('anything?')->explain, "\n";
my $obj = bless {} => 'Obj';
print $obj->$isa('Obj'), "\n";
my $ref = {};
if (my $reason = $ref->$isa('Object')) {
say "all is well"
} else {
given ($reason) {
when (not $_->defined) {say "not defined"}
when (not $_->ref) {say "not a reference"}
when (not $_->blessed) {say "not a blessed reference"}
when (not $_->type) {say "not correct type"}
}
}
this prints:
error: object is not defined
error: object is defined, but not a reference
error: object is defined, but not a reference
error: object is a reference, but is not blessed
error: object is a Other::Pkg not a My::Obj
ok: object is a My::Obj
object is not defined
1
not a blessed reference
If anyone thinks this is actually useful, let me know, and I will put it up on CPAN.
Here's an update for 2020. Perl v5.32 has the isa operator, also known as the class infix operator. It handles the case where the left-hand argument is not an object it returns false instead of blowing up:
use v5.32;
if( $something isa 'Animal' ) { ... }
This might sound a little bit harsh to Perl, but neither one of these is ideal. Both cover up the fact that objects are a tack on to Perl. The blessed idiom is wordy and contains more than a couple simple pieces.
blessed( $object ) && object->isa( 'Class' )
I would prefer something more like this:
object_isa( $object, 'Class' )
There is no logical operation to get wrong, and most of the unfit uses will be weeded out by the compiler. (Quotes not closed, no comma, parens not closed, calling object_isa instead...)
It would take undefined scalars, simple scalars (unless they are a classname that is a Class), unblessed references, and blessed references that do not extend 'Class' and tell you that no, they are not Class objects. Unless we want to go the route of autobox-ing everything, we're going to need a function that tells us simply.
Perhaps there might be a third parameter for $how_close, but there could also be something like this:
if ( my $ranking = object_isa( $object, 'Class' )) {
...
}
else {
given ( $ranking ) {
when ( NOT_TYPE ) { ... }
when ( NOT_BLESSED ) { ... }
when ( NOT_REF ) { ... }
when ( NOT_DEFINED ) { ... }
}
}
About the only way I can see that we could return this many unique falses is if $ranking was blessed into a class that overloaded the boolean operator to return false unless the function returned the one value indicating an ISA relationship.
However, it could have a few members: EXACTLY, INHERITS, IMPLEMENTS, AGGREGATES or even MOCKS
I get tired of typing this too:
$object->can( 'DOES' ) && $object->DOES( 'role' )
because I try to implement the future-facing DOES in lesser perls (on the idea that people might frown on my polluting UNIVERSAL) on them.

How can I define pre/post-increment behavior in Perl objects?

Date::Simple objects display this behavior, where $date++ returns the next day's date.
Date::Simple objects are immutable. After assigning $date1 to $date2, no change to $date1 can affect $date2. This means, for example, that there is nothing like a set_year operation, and $date++ assigns a new object to $date.
How can one custom-define the pre/post-incremental behavior of an object, such that ++$object or $object-- performs a particular action?
I've skimmed over perlboot, perltoot, perltooc and perlbot, but I don't see any examples showing how this can be done.
You want overload.
package Number;
use overload
'0+' => \&as_number,
'++' => \&incr,
;
sub new {
my ($class, $num) = #_;
return bless \$num => $class;
}
sub as_number {
my ($self) = #_;
return $$self;
}
sub incr {
my ($self) = #_;
$_[0] = Number->new($self->as_number + 1); # note the modification of $_[0]
return;
}
package main;
my $num = Number->new(5);
print $num . "\n"; # 5
print $num++ . "\n"; # 5
print ++$num . "\n"; # 7
If you look up perlfaq7 you'll find that the answer is to use the overload pragma, though they probably could have given the FAQ question a better name (in my opinion).
package SomeThing;
use overload
'+' => \&myadd,
'-' => \&mysub;
Basically (assuming $a is an object of the SomeThing class and $b isn't), the above would overload $a + $b to be $a->myadd($b, 0) and $b + $a to $a->myadd($b, 1) (that is, the third argument is a boolean meaning "were the arguments to this operator flipped" and the first-argument-is-self syntax is preserved), and the same for - and mysub.
Read the documentation for the full explanation.