String overloaded variable is considered defined no matter what - perl

I have the following lines in my script:
my $spec = shift;
if (!defined $spec) {
return ("Invalid specification", undef);
}
$spec = "$spec" // '';
I would naturally expect this to, when passed undef, return the warning Invalid specification in the array, with the second item being undef. Instead, the check is passed, and I get a console message warning me about Use of uninitialized value $spec in string on the next line.
$spec is an object with string and number overloading, and is unfortunately written such that attempting to test for truthiness in this particular subroutine (by way of if ($spec) for instance) results in deep recursion and a segfault.
While I am interested in why, exactly, this is happening, I'm more interested in how to make it stop happening. I want to eliminate the console warning, preferably without no warnings qw/uninitialized/. Is this possible, and if so, how do I do it?

You say that $spec is an object with string overloading.
If that's the case then you need to coerce it into String form before checking for it being defined:
if (! defined overload::StrVal($spec)) {
Correction per ysth
As ysth pointed out, StrVal does not invoke the overloaded stringification:
overload::StrVal(arg)
Gives the string value of arg as in the absence of stringify overloading. If you are using this to get the address of a reference (useful for checking if two references point to the same thing) then you may be better off using Scalar::Util::refaddr() , which is faster.
Therefore to accomplish this, try his other suggestion of:
"$spec" trapping warnings and detecting the uninitialized var warning. Better to add a method to the class to test for whatever case returns undef.
The following demonstrates this approach:
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More tests => 2;
my $obj_str_defined = StringOverloaded->new("has value");
my $obj_str_undef = StringOverloaded->new(undef);
ok( is_overloaded_string_defined($obj_str_defined), qq{\$obj_str_defined is defined} );
ok( !is_overloaded_string_defined($obj_str_undef), qq{\$obj_str_undef is undef} );
sub is_overloaded_string_defined {
my $obj = shift;
my $is_str_defined = 1;
local $SIG{__WARN__} = sub {
$is_str_defined = 0 if $_[0] =~ /Use of uninitialized value \$obj in string/;
};
my $throwaway_var = "$obj";
return $is_str_defined;
}
{
# Object with string overloading
package StringOverloaded;
use strict;
use warnings;
use overload (
'""' => sub {
my $self = shift;
return $$self; # Dereference
},
fallback => 1
);
sub new {
my $pkg = shift;
my $val = shift;
my $self = bless \$val, $pkg;
return $self;
}
}
Output:
1..2
ok 1 - $obj_str_defined is defined
ok 2 - $obj_str_undef is undef
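The other half of ysth's suggestion, adding a method to the class, avoids trapping warnings altogether. Here is a minimal sketch, assuming you can modify the class; the method name has_value is hypothetical and not part of the original code:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

{
    # Same shape as the StringOverloaded class above
    package StringOverloaded;
    use overload
        '""'     => sub { my $self = shift; return $$self },
        fallback => 1;

    sub new {
        my ($pkg, $val) = @_;
        return bless \$val, $pkg;
    }

    # Hypothetical helper: looks at the wrapped value directly,
    # so no stringification (and no warning) is involved.
    sub has_value {
        my $self = shift;
        return defined $$self ? 1 : 0;
    }
}

my $with    = StringOverloaded->new("has value");
my $without = StringOverloaded->new(undef);

print "with: ",    $with->has_value,    "\n";
print "without: ", $without->has_value, "\n";
```

In the original subroutine the check would then read something like if (!defined $spec or !$spec->has_value), which never evaluates $spec in boolean or string context.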

Related

Check if a subroutine is being used as an lvalue or an rvalue in Perl

I'm writing some code where I am using a subroutine as both an lvalue and an rvalue to read and write database values. The problem is, I want it to react differently based on whether it is being used as an lvalue or an rvalue.
I want the subroutine to write to the database when it is used as an lvalue, and read from the database when it is used as an rvalue.
Example:
# Write some data
$database->record_name($subscript) = $value;
# Read some data
my $value = $database->record_name($subscript);
The only way I can think of to make this work is to find a way for the subroutine to recognize whether it is being used as an lvalue or an rvalue and react differently for each case.
Is there a way to do this?
Deciding how to behave on whether it was called as an lvalue or not is a bad idea since foo(record_name(...)) would call it as an lvalue.
Instead, you should decide how to behave on whether it is used as an lvalue or not.
You can do that by returning a magical value.
use Variable::Magic qw( cast wizard );
my $wiz = wizard(
data => sub { shift; \@_ },
get => sub { my ($ref, $args) = @_; $$ref = get_record_name(@$args); },
set => sub { my ($ref, $args) = @_; set_record_name(@$args, $$ref); },
);
sub record_name :lvalue {
cast(my $rv, $wiz, @_);
return $rv;
}
A little test:
use Data::Dumper;
sub get_record_name { print("get: @_\n"); return "val"; }
sub set_record_name { print("set: @_\n"); }
my $x = record_name("abc", "def"); # Called as rvalue
record_name("abc", "def") = "xyz"; # Called as lvalue. Used as lvalue.
my $y_ref = \record_name("abc", "def"); # Called as lvalue.
my $y = $$y_ref; # Used as rvalue.
$$y_ref = "xyz"; # Used as lvalue.
Output:
get: abc def
set: abc def xyz
get: abc def
set: abc def xyz
After seeing this, you've surely learned that you should abandon the idea of using an lvalue sub. It's possible to hide all that complexity (such as by using sentinel), but the complexity remains. The fanciness is not worth all the complexity. Use separate setters and getters or use an accessor whose role is based on the number of parameters passed to it ($s=acc(); vs acc($s)) instead.
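For comparison, an accessor whose role depends on the number of arguments might look like the sketch below. The Database class and its in-memory records hash are stand-ins invented for illustration; the real code would talk to the database instead.

```perl
use strict;
use warnings;

{
    package Database;   # hypothetical stand-in for the real class

    sub new {
        my $class = shift;
        my $self  = { records => {} };
        return bless $self, $class;
    }

    # One extra argument: getter. Two extra arguments: setter.
    sub record_name {
        my $self      = shift;
        my $subscript = shift;
        if (@_) {
            $self->{records}{$subscript} = shift;   # write
        }
        return $self->{records}{$subscript};        # read
    }
}

my $database = Database->new;
$database->record_name('abc', 'xyz');          # write some data
my $value = $database->record_name('abc');     # read some data
print "$value\n";
```

Testing @_ rather than the defined-ness of the second argument keeps it possible to store undef explicitly.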
For this situation you might like to try my Sentinel module.
It provides a function you can use in the accessor, to turn it into a more get/set style approach. E.g. you could
use Sentinel qw( sentinel );
sub get_record_name { ... }
sub set_record_name { ... }
sub record_name
{
sentinel get => \&get_record_name,
set => \&set_record_name,
obj => shift;
}
At this point, the following pairs of lines of code are equivalent
$name = $record->record_name;
$name = $record->get_record_name;
$record->record_name = $new_name;
$record->set_record_name( $new_name );
Of course, if you're not needing to provide the specific get_ and set_ prefixed versions of the methods as well, you could inline them as closures.
See the module docs also for further ideas.
In my opinion, lvalue subroutines in Perl were a dumb idea. Just support ->record_name($subscript, $value) as a setter and ->record_name($subscript) as a getter.
That said, you can use the Want module, like this
use Want;
sub record_name:lvalue {
if ( want('LVALUE') ) {
...
}
else {
...
}
}
though that will also treat this as an LVALUE:
foo( $database->record_name($subscript) );
If you want only assignment statements to be treated specially, use want('ASSIGN') instead.

Why does `eq` not work when one argument has overloaded stringification?

I have realised (the hard way) that operator eq gives a fatal runtime error when one of the operands is an object with overloaded stringification.
Here is a minimal example:
my $test = MyTest->new('test');
print 'yes' if $test eq 'test';
package MyTest;
use overload '""' => sub { my $self = shift; return $self->{'str'} };
sub new {
my ( $class, $str ) = @_;
return bless { str => $str }, $class;
}
The result of running this is:
Operation "eq": no method found,
left argument in overloaded package MyTest,
right argument has no overloaded magic at ./test.pl line 7.
My expectation from reading perlop would be that string context is forced on both operands, firing the stringification method in $test, then the resulting strings are compared. Why doesn't it work? What is actually happening?
The context in which I had this problem was in a script that uses both autodie and Try::Tiny. In the try block, I die with some specific messages to be caught. But in the catch block, when I test for whether $_ eq "my specific message\n", this gives a runtime error if $_ is an autodie::exception.
I know I will have to replace $_ eq "..." with !ref && $_ eq "...", but I would like to know why.
You only overloaded stringification, not string comparison. The overload pragma will however use the overloaded stringification for the string comparison if you specify the fallback => 1 parameter:
my $test = MyTest->new('test');
print 'yes' if $test eq 'test';
package MyTest;
use overload
fallback => 1,
'""' => sub { my $self = shift; return $self->{'str'} };
sub new {
my ( $class, $str ) = @_;
return bless { str => $str }, $class;
}
Details on why this works:
When handed an overloaded object, the eq operator will try to invoke the eq overload. We did not provide an overload, and we didn't provide a cmp overload from which eq could be autogenerated. Therefore, Perl will issue that error.
With fallback => 1 enabled, the error is suppressed and Perl will do what it would do anyway – coerce the arguments to strings (which invokes stringification overloading or other magic), and compare them.
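To see the two behaviours side by side, here is a small sketch that wraps the comparison in eval. The package names NoFallback and WithFallback are made up for the demonstration; everything else mirrors the code above.

```perl
use strict;
use warnings;

{
    package NoFallback;
    use overload '""' => sub { my $self = shift; return $self->{str} };
    sub new { my ($class, $str) = @_; return bless { str => $str }, $class }
}
{
    package WithFallback;
    use overload
        fallback => 1,
        '""'     => sub { my $self = shift; return $self->{str} };
    sub new { my ($class, $str) = @_; return bless { str => $str }, $class }
}

# Without fallback the comparison dies, so trap the error.
my $survived = eval { my $r = NoFallback->new('test') eq 'test'; 1 };
my $error    = $survived ? '' : $@;

# With fallback => 1 the operands are stringified and compared.
my $equal = WithFallback->new('test') eq 'test';

print "without fallback: ", ($survived ? "ok" : "died"), "\n";
print "with fallback: ", ($equal ? "equal" : "not equal"), "\n";
```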

Is there a convenience for safe dereferencing in Perl?

So perl5porters is discussing adding a safe dereferencing operator, to allow stuff like
$ceo_car_color = $company->ceo->car->color
if defined $company
and defined $company->ceo
and defined $company->ceo->car;
to be shortened to e.g.
$ceo_car_color = $company->>ceo->>car->>color;
where $foo->>bar means defined $foo ? $foo->bar : undef.
The question: Is there some module or unobtrusive hack that gets me this operator, or similar behavior with a visually pleasing syntax?
For your enjoyment, I'll list ideas that I was able to come up with.
A multiple dereferencing method (looks ugly).
sub multicall {
my $instance = shift // return undef;
for my $method (@_) {
$instance = $instance->$method() // return undef;
}
return $instance;
}
$ceo_car_color = multicall($company, qw(ceo car color));
A wrapper that turns undef into a proxy object (looks even uglier) which returns undef from all function calls.
{ package Safe; sub AUTOLOAD { return undef } }
sub safe { (shift) // bless {}, 'Safe' }
$ceo_car_color = safe(safe(safe($company)->ceo)->car)->color;
Since I have access to the implementations of ceo(), car() and color(), I thought about returning the safe proxy directly from these methods, but then existing code might break:
my $ceo = $company->ceo;
my $car = $ceo->car if defined $ceo; # defined() breaks
Unfortunately, I don't see anything in perldoc overload about overloading the meaning of defined and // in my safe proxy.
Maybe this is not the most useful solution, but it's one more WTDI (a variant of nr. 1) and it's a non-trivial use-case for List::Util's reduce, which are very rare. ;)
Code
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use List::Util 'reduce';
my $answer = 42;
sub new { bless \$answer }
sub foo { return shift } # just chaining
sub bar { return undef } # break the chain
sub baz { return ${shift()} } # return the answer
sub multicall { reduce { our ($a, $b); $a and $a = $a->$b } @_ }
my $obj = main->new();
say $obj->multicall(qw(foo foo baz)) // 'undef!';
say $obj->multicall(qw(foo bar baz)) // 'undef!';
Output
42
undef!
Note:
Of course it should be
return unless defined $a;
$a = $a->$b;
instead of the shorter $a and $a = $a->$b from above to work correctly with defined but false values, but my point here is to use reduce.
You can use eval:
$ceo_car_color = eval { $company->ceo->car->color };
But it will of course catch any errors, not just calling a method on an undef.
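If that is a concern, one option is to inspect the error and rethrow anything that is not the undefined-invocant failure. The following is only a sketch: try_chain and the three small classes are hypothetical, and matching on the text of Perl's error message is admittedly fragile.

```perl
use strict;
use warnings;

# Hypothetical classes standing in for the question's objects.
{
    package Company;
    sub new { my ($class, %args) = @_; my $self = {%args}; return bless $self, $class }
    sub ceo { return $_[0]{ceo} }
}
{
    package Person;
    sub new { my ($class, %args) = @_; my $self = {%args}; return bless $self, $class }
    sub car { return $_[0]{car} }
}
{
    package Car;
    sub new   { my ($class, %args) = @_; my $self = {%args}; return bless $self, $class }
    sub color { return $_[0]{color} }
}

# Run a method chain; return undef if it broke on an undefined link,
# rethrow any other error.
sub try_chain {
    my ($chain) = @_;
    my $value;
    my $ok = eval { $value = $chain->(); 1 };
    return $value if $ok;
    return undef if $@ =~ /^Can't call method "\w+" on an undefined value/;
    die $@;
}

my $with_car    = Company->new(ceo => Person->new(car => Car->new(color => 'red')));
my $without_car = Company->new(ceo => Person->new);

my $red  = try_chain(sub { $with_car->ceo->car->color });
my $none = try_chain(sub { $without_car->ceo->car->color });

print "$red\n";
print defined $none ? "$none\n" : "undef\n";
```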

Most efficient way of checking for a return from a function call in Perl

I want to add the return value from the function call to an array iff something is returned (not by default, i.e. if I have a return statement in the subroutine.)
So I'm using unshift @{$errors}, "HashValidator::$vfunction($hashref)"; but this actually adds the string of the function call to the array. I also tried unshift @{$errors}, $temp if defined my $temp = "HashValidator::$vfunction($hashref)"; with the same result. What would a Perl one-liner look like that does this efficiently? (I know I can do the ugly, multi-line check, but I want to learn.)
Thanks,
"iff something is returned (not by default, i.e. if I have a return statement in the subroutine.)"
There could be a gotcha here. Perl always returns something, even if you don't mean to:
my $failures = 0;
sub word_to_number {
local $_ = shift; # lexical "my $_" was removed in perl 5.24
/one/ and return 1;
/two/ and return 2;
++$failures; # whoops, equivalent to return ++$failures
}
The last expression in a sub is used as the return value if there is no explicit return. To return "nothing", use bare return, which returns undef or the empty list, depending on context:
my $failures = 0;
sub word_to_number {
local $_ = shift; # lexical "my $_" was removed in perl 5.24
/one/ and return 1;
/two/ and return 2;
++$failures;
return;
}
This behaviour is actually useful for things like sorting:
my @results = sort { $a->name cmp $b->name } @list;
where we have passed in the anonymous subroutine:
{
$a->name cmp $b->name # equivalent to return $a->name cmp $b->name
}
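Coming back to the bare return: the difference from return undef matters for the original question, because in list context a bare return contributes nothing to the list. A small sketch (the sub names are invented for illustration):

```perl
use strict;
use warnings;

sub bare_return    { return }
sub explicit_undef { return undef }

my @from_bare  = bare_return();      # empty list
my @from_undef = explicit_undef();   # one element, which is undef
my $scalar     = bare_return();      # undef in scalar context

# push evaluates the call in list context, so a bare return adds nothing.
my @errors;
push @errors, bare_return();
push @errors, explicit_undef();

print scalar(@from_bare), "\n";
print scalar(@from_undef), "\n";
print scalar(@errors), "\n";
```

So a validator that ends with a bare return on success can be used directly as push @errors, validator($hashref) with no extra check.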
In this case, there is no need to use the string form of eval (or any form for that matter). Not only is it slow, but it also can silently trap errors and could lead to untrusted code execution if used with a tainted input. To write a virtual function call in Perl, you can either work with the symbol table directly, or use a symbolic reference:
use 5.010;
use warnings;
use strict;
{package HashValidator;
sub test_ok {exists $_[0]{ok}}
sub test_fail {exists $_[0]{fail}}
}
my $hashref = {ok => 1};
my $errors;
for my $vFunction (qw(test_ok test_fail)) {
# to call the function:
say "glob deref: $vFunction: ", $HashValidator::{$vFunction}->($hashref);
{no strict 'refs';
say "symbolic: $vFunction: ", &{"HashValidator::$vFunction"}($hashref)}
# to conditionally use the result (if it is a true boolean value):
if (my $ret = $HashValidator::{$vFunction}->($hashref)) {
push @$errors, $ret;
}
# or to keep the function call in list context:
push @$errors, grep $_, $HashValidator::{$vFunction}->($hashref);
# or to golf it:
push @$errors, $HashValidator::{$vFunction}->($hashref) || ();
}
say @$errors.': ', join ', ' => @$errors;
which prints:
glob deref: test_ok: 1
symbolic: test_ok: 1
glob deref: test_fail:
symbolic: test_fail:
3: 1, 1, 1
If you are working with object oriented code, virtual method calls are even easier, with no symbol table or symbolic references:
$obj->$vMethod(...)
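A related option for plain packages is to look the function up with can, which returns a code reference without touching the symbol table by hand or disabling strict refs. This is a sketch under assumptions: the validators check_name and check_age are invented here, and they return an error message on failure or an empty list on success.

```perl
use strict;
use warnings;

{
    package HashValidator;
    # Hypothetical validators: error message on failure, empty list on success.
    sub check_name { my $h = shift; return exists($h->{name}) ? () : "missing name" }
    sub check_age  { my $h = shift; return exists($h->{age})  ? () : "missing age" }
}

my $hashref = { name => 'Alice' };
my @errors;

for my $vfunction (qw(check_name check_age)) {
    # can() returns a code reference, or a false value if no such sub exists
    my $code = HashValidator->can($vfunction)
        or die "No such validator: $vfunction\n";
    push @errors, $code->($hashref);   # list context: an empty return adds nothing
}

print join(', ', @errors), "\n";
```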
Try using eval:
push @{$errors}, eval "HashValidator::$vfunction($hashref)"
The following works for me with perl 5.12, and skips false return values (including undef):
my $foo = "foo";
my $val = eval "Foo::$foo()";
push @arry, $val if ($val);

How can I code in a functional style in Perl?

How do you either:
have a sub return a sub
or
execute text as code
in Perl?
Also, how do I have an anonymous function store state?
A sub returns a sub as a coderef:
# example 1: return a sub that is defined inline.
sub foo
{
return sub {
my $this = shift;
my @other_params = @_;
do_stuff();
return $some_value;
};
}
# example 2: return a sub that is defined elsewhere.
sub bar
{
return \&foo;
}
Arbitrary text can be executed with the eval function: see the documentation at perldoc -f eval:
eval q{print "hello world!\n"};
Note that this is very dangerous if you are evaluating anything extracted from user input, and is generally a poor practice anyway as you can generally define your code in a coderef as in the earlier examples above.
You can store state with a state variable (new in perl5.10), or with a variable scoped higher than the sub itself, as a closure:
use feature 'state';
sub baz
{
state $x;
return ++$x;
}
# create a new scope so that $y is not visible to other functions in this package
{
my $y;
sub quux
{
return ++$y;
}
}
Return a subroutine reference.
Here's a simple example that creates sub refs closed over a value:
my $add_5_to = add_x_to(5);
print $add_5_to->(7), "\n";
sub add_x_to {
my $x = shift;
return sub { my $value = shift; return $x + $value; };
}
You can also work with named subs like this:
sub op {
my $name = shift;
return $name eq 'add' ? \&add : sub {};
}
sub add {
my $l = shift;
my $r = shift;
return $l + $r;
}
You can use eval with an arbitrary string, but don't do it. The code is hard to read and it restarts compilation, which slows everything down. There are a small number of cases where string eval is the best tool for the job. Any time string eval seems like a good idea, you are almost certainly better off with another approach.
Almost anything you would like to do with string eval can be achieved with closures.
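As one illustration of that claim, a dispatch table of closures covers the common case where string eval is used to pick an operation by name. The table contents and the calculate sub below are made up for the example:

```perl
use strict;
use warnings;

# Map operation names to code references instead of building code strings.
my %op_for = (
    add      => sub { $_[0] + $_[1] },
    subtract => sub { $_[0] - $_[1] },
    multiply => sub { $_[0] * $_[1] },
);

sub calculate {
    my ($name, $l, $r) = @_;
    my $code = $op_for{$name} or die "Unknown operation: $name\n";
    return $code->($l, $r);
}

my $sum     = calculate('add', 2, 3);
my $product = calculate('multiply', 4, 5);
print "$sum\n$product\n";
```

Unknown names fail with a clear error instead of being compiled and run, which is the main safety advantage over eval.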
Returning subs is easy by using the sub keyword. The returned sub closes over the lexical variables it uses:
#!/usr/bin/perl
use strict; use warnings;
sub mk_count_from_to {
my ($from, $to) = @_;
return sub {
return if $from > $to;
return $from ++;
};
}
my $c = mk_count_from_to(-5, 5);
while ( defined( my $n = $c->() ) ) {
print "$n\n";
}
5.10 introduced state variables.
Executing text as Perl is accomplished using eval EXPR:
the return value of EXPR is parsed and executed as if it were a little Perl program. The value of the expression (which is itself determined within scalar context) is first parsed, and if there weren't any errors, executed in the lexical context of the current Perl program, so that any variable settings or subroutine and format definitions remain afterwards. Note that the value is parsed every time the eval executes
Executing arbitrary strings will open up huge gaping security holes.
You can create anonymous subroutines and access them via a reference; this reference can of course be assigned to a scalar:
my $subref = sub { ... code ... }
or returned from another subroutine
return sub { ... code ... }
If you need to store states, you can create closures with lexical variables defined in an outer scope like:
sub create_func {
my $state;
return sub { ... code that can refer to $state ... }
}
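A concrete version of that pattern, as a sketch (make_counter is a hypothetical name): each call creates a fresh lexical, so every returned sub carries its own independent state.

```perl
use strict;
use warnings;

sub make_counter {
    my ($start) = @_;
    my $count = defined $start ? $start : 0;   # private to the closure below
    return sub { return $count++ };
}

my $first  = make_counter();
my $second = make_counter(10);

my @seen = ($first->(), $first->(), $second->(), $first->());
print join(' ', @seen), "\n";
```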
You can run code with eval