Is there a convenience for safe dereferencing in Perl? - perl

So perl5porters is discussing to add a safe dereferencing operator, to allow stuff like
$ceo_car_color = $company->ceo->car->color
if defined $company
and defined $company->ceo
and defined $company->ceo->car;
to be shortened to e.g.
$ceo_car_color = $company->>ceo->>car->>color;
where $foo->>bar means defined $foo ? $foo->bar : undef.
The question: Is there some module or unobstrusive hack that gets me this operator, or similar behavior with a visually pleasing syntax?
For your enjoyment, I'll list ideas that I was able to come up with.
A multiple derefencing method (looks ugly).
sub multicall {
my $instance = shift // return undef;
for my $method (#_) {
$instance = $instance->$method() // return undef;
}
return $instance;
}
$ceo_car_color = multicall($company, qw(ceo car color));
A wrapper that turns undef into a proxy object (looks even uglier) which returns undef from all function calls.
{ package Safe; sub AUTOLOAD { return undef } }
sub safe { (shift) // bless {}, 'Safe' }
$ceo_car_color = safe(safe(safe($company)->ceo)->car)->color;
Since I have access to the implementations of ceo(), car() and color(), I thought about returning the safe proxy directly from these methods, but then existing code might break:
my $ceo = $company->ceo;
my $car = $ceo->car if defined $ceo; # defined() breaks
Unfortunately, I don't see anything in perldoc overload about overloading the meaning of defined and // in my safe proxy.

Maybe this is not the most useful solution, but it's one more WTDI (a variant of nr. 1) and it's a non-trivial use-case for List::Util's reduce, which are very rare. ;)
Code
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use List::Util 'reduce';
my $answer = 42;
sub new { bless \$answer }
sub foo { return shift } # just chaining
sub bar { return undef } # break the chain
sub baz { return ${shift()} } # return the answer
sub multicall { reduce { our ($a, $b); $a and $a = $a->$b } #_ }
my $obj = main->new();
say $obj->multicall(qw(foo foo baz)) // 'undef!';
say $obj->multicall(qw(foo bar baz)) // 'undef!';
Output
42
undef!
Note:
Of course it should be
return unless defined $a;
$a = $a->$b;
instead of the shorter $a and $a = $a->$b from above to work correctly with defined but false values, but my point here is to use reduce.

You can use eval:
$ceo_car_color = eval { $company->ceo->car->color };
But it will of course catch any errors, not just calling a method on an undef.

Related

String overloaded variable is considered defined no matter what

I have the following lines in my script:
my $spec = shift;
if (!defined $spec) {
return ("Invalid specification", undef);
}
$spec = "$spec" // '';
I would naturally expect this to, when passed undef, return the warning Invalid specification in the array, with the second item being undef. Instead, the check is passed, and I get a console message warning me about Use of uninitialized value $spec in string on the next line.
$spec is an object with string and number overloading, and is unfortunately written such that attempting to test for truthiness in this particular subroutine (by way of if ($spec) for instance) results in deep recursion and a segfault.
While I am interested in why, exactly, this is happening, I'm more interested in how to make it stop happening. I want to eliminate the console warning, preferable without no warnings qw/uninitialized/. Is this possible, and if so, how do I do it?
You say that $spec is an object with string overloading.
If that's the case then you need to coerce it into String form before checking for it being defined:
if (! defined overload::StrVal($spec)) {
Correction per ysth
As ysth pointed out in the StrVal does not coerce the overloaded stringification:
overload::StrVal(arg)
Gives the string value of arg as in the absence of stringify overloading. If you are using this to get the address of a reference (useful for checking if two references point to the same thing) then you may be better off using Scalar::Util::refaddr() , which is faster.
Therefore to accomplish this, try his other suggestion of:
"$spec" trapping warnings and detecting the uninitialized var warning. Better to add a method to the class to test for whatever case returns undef.
The following demonstrates this approach:
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More tests => 2;
my $obj_str_defined = StringOverloaded->new("has value");
my $obj_str_undef = StringOverloaded->new(undef);
ok( is_overloaded_string_defined($obj_str_defined), qq{\$obj_str_defined is defined} );
ok( !is_overloaded_string_defined($obj_str_undef), qq{\$obj_str_undef is undef} );
sub is_overloaded_string_defined {
my $obj = shift;
my $is_str_defined = 1;
local $SIG{__WARN__} = sub {
$is_str_defined = 0 if $_[0] =~ /Use of uninitialized value \$obj in string/;
};
my $throwaway_var = "$obj";
return $is_str_defined;
}
{
# Object with string overloading
package StringOverloaded;
use strict;
use warnings;
use overload (
'""' => sub {
my $self = shift;
return $$self; # Dereference
},
fallback => 1
);
sub new {
my $pkg = shift;
my $val = shift;
my $self = bless \$val, $pkg;
return $self;
}
}
Output:
1..2
ok 1 - $obj_str_defined is defined
ok 2 - $obj_str_undef is undef

Check if a subroutine is being used as an lvalue or an rvalue in Perl

I'm writing some code where I am using a subroutine as both an lvalue and an rvalue to read and write database values. The problem is, I want it to react differently based on whether it is being used as an lvalue or an rvalue.
I want the subroutine to write to the database when it is used as an lvalue, and read from the database when it is used as an rvalue.
Example:
# Write some data
$database->record_name($subscript) = $value;
# Read some data
my $value = $database->record_name($subscript);
The only way I can think of the make this work is to find a way for the subroutine to recognize whether it is being used as an lvalue or an rvalue and react differently for each case.
Is there a way to do this?
Deciding how to behave on whether it was called as an lvalue or not is a bad idea since foo(record_name(...)) would call it as an lvalue.
Instead, you should decide how to behave on whether it is used as an lvalue or not.
You can do that by returning a magical value.
use Variable::Magic qw( cast wizard );
my $wiz = wizard(
data => sub { shift; \#_ },
get => sub { my ($ref, $args) = #_; $$ref = get_record_name(#$args); },
set => sub { my ($ref, $args) = #_; set_record_name(#$args, $$ref); },
);
sub record_name :lvalue {
cast(my $rv, $wiz, #_);
return $rv;
}
A little test:
use Data::Dumper;
sub get_record_name { print("get: #_\n"); return "val"; }
sub set_record_name { print("set: #_\n"); }
my $x = record_name("abc", "def"); # Called as rvalue
record_name("abc", "def") = "xyz"; # Called as lvalue. Used as lvalue.
my $y_ref = \record_name("abc", "def"); # Called as lvalue.
my $y = $$y_ref; # Used as rvalue.
$$y_ref = "xyz"; # Used as lvalue.
Output:
get: abc def
set: abc def xyz
get: abc def
set: abc def xyz
After seeing this, you've surely learned that you should abandon the idea of using an lvalue sub. It's possible to hide all that complexity (such as by using sentinel), but the complexity remains. The fanciness is not worth all the complexity. Use separate setters and getters or use an accessor whose role is based on the number of parameters passed to it ($s=acc(); vs acc($s)) instead.
For this situation you might like to try my Sentinel module.
It provides a function you can use in the accessor, to turn it into a more get/set style approach. E.g. you could
use Sentinel qw( sentinel );
sub get_record_name { ... }
sub set_record_name { ... }
sub record_name
{
sentinel get => \&get_record_name,
set => \&set_record_name,
obj => shift;
}
At this point, the following pairs of lines of code are equivalent
$name = $record->record_name;
$name = $record->get_record_name;
$record->record_name = $new_name;
$record->set_record_name( $new_name );
Of course, if you're not needing to provide the specific get_ and set_ prefixed versions of the methods as well, you could inline them as closures.
See the module docs also for further ideas.
In my opinion, lvalue subroutines in Perl were a dumb idea. Just support ->record_name($subscript, $value) as a setter and ->record_name($subscript) as a getter.
That said, you can use the Want module, like this
use Want;
sub record_name:lvalue {
if ( want('LVALUE') ) {
...
}
else {
...
}
}
though that will also treat this as an LVALUE:
foo( $database->record_name($subscript) );
If you want only assignment statements to be treated specially, use want('ASSIGN') instead.

Perl: Syntactical Sugar for Latter Coderef Arguments?

Using sub prototypes, we can define our own subs that look like map or grep. That is, the first coderef argument has shorter syntax than a normal anonymous sub. For example:
sub thunked (&) { $_[0] }
my $val = thunked { 2 * 4 };
Works great here, since the first argument is the coderef. For latter arguments however, it simple won't parse properly.
I made a with sub designed to make writing GTK2 code cleaner. It's meant to look like this (untested since it's hypothetical code):
use 5.012;
use warnings;
use Gtk2 '-init';
sub with ($&) {
local $_ = $_[0];
$_[1]->();
$_;
}
for (Gtk2::Window->new('toplevel')) {
$_->set_title('Test Application');
$_->add(with Gtk2::VBox->new {
my $box = $_;
$box->add(Gtk2::Button->new("Button $_")) for (1..4);
});
$_->show_all;
}
Gtk2->main;
It doesn't work because with needs to take the block as a first argument for the nice syntax to work. Is there any way to pull it off?
The module Devel::Declare contains tools for extending Perl's syntax in a relatively safe way.
Using Devel::Declare you would create a hook on the with token, which will stop the parser when it reaches that word. From there, you have control over the parser and you can read ahead until you reach a { symbol. At that point, you have what you need to work with, so you rewrite it into valid Perl, and pass it back to the parser.
in the file With.pm:
package With;
use warnings;
use strict;
use Devel::Declare;
sub import {
my $caller = caller;
Devel::Declare->setup_for (
$caller => {with => {const => \&parser}}
);
no strict 'refs';
*{$caller.'::with'} = sub ($&) {
$_[1]() for $_[0];
$_[0]
}
}
our $prefix = '';
sub get {substr Devel::Declare::get_linestr, length $prefix}
sub set { Devel::Declare::set_linestr $prefix . $_[0]}
sub parser {
local $prefix = substr get, 0, length($_[0]) + $_[1];
my $with = strip_with();
strip_space();
set "scalar($with), sub " . get;
}
sub strip_space {
my $skip = Devel::Declare::toke_skipspace length $prefix;
set substr get, $skip;
}
sub strip_with {
strip_space;
my $with;
until (get =~ /^\{/) {
(my $line = get) =~ s/^([^{]+)//;
$with .= $1;
set $line;
strip_space;
}
$with =~ s/\s+/ /g;
$with
}
and to use it:
use With;
sub Window::add {say "window add: ", $_[1]->str}
sub Window::new {bless [] => 'Window'}
sub Box::new {bless [] => 'Box'}
sub Box::add {push #{$_[0]}, #_[1..$#_]}
sub Box::str {"Box(#{$_[0]})"}
sub Button::new {"Button($_[1])"}
with Window->new {
$_->add(with Box->new {
for my $num (1 .. 4) {
$_->add(Button->new($num))
}
})
};
Which prints:
window add: Box(Button(1) Button(2) Button(3) Button(4))
A completely different approach would be to skip the with keyword altogether and write a routine to generate constructor subroutines:
BEGIN {
for my $name (qw(VBox)) { # and any others you want
no strict 'refs';
*$name = sub (&#) {
use strict;
my $code = shift;
my $with = "Gtk2::$name"->new(#_);
$code->() for $with;
$with
}
}
}
and then your code could look like
for (Gtk2::Window->new('toplevel')) {
$_->set_title('Test Application');
$_->add(VBox {
my $box = $_;
$box->add(Gtk2::Button->new("Button $_")) for (1..4);
});
$_->show_all;
}
One way that you could deal with it is to add a fairly useless keyword:
sub perform(&) { $_[0] }
with GTK2::VBox->new, perform { ... }
where perform is really just a sugarier alternative to sub.
Another way is to write a Devel::Declare filter or a Syntax::Keyword:: plugin to implement your with, as long as you have some way to tell when you're done parsing the with argument and ready to start parsing the block — balanced parentheses would do (so would an opening curly brace, but then hashes become a problem). Then you could support something like
with (GTK2::VBox->new) { ... }
and let the filter rewrite it to something like
do {
local $_ = GTK2::VBox->new;
do {
...;
};
$_;
}
which, if it works, has the advantage of not actually creating a sub, and thus not interfering with #_, return, and a few other things. The two layers of do-age I think are necessary for being able to install an EndOfScope hook in the proper place.
The obvious disadvantages of this are that it's tricky, it's hairy, and it's a source filter (even if it's a tame one) which means there are problems you have to solve if you want any code using it to be debuggable at all.

How can I code in a functional style in Perl?

How do you either:
have a sub return a sub
or
execute text as code
in Perl?
Also, how do I have an anonymous function store state?
A sub returns a sub as a coderef:
# example 1: return a sub that is defined inline.
sub foo
{
return sub {
my $this = shift;
my #other_params = #_;
do_stuff();
return $some_value;
};
}
# example 2: return a sub that is defined elsewhere.
sub bar
{
return \&foo;
}
Arbitrary text can be executed with the eval function: see the documentation at perldoc -f eval:
eval q{print "hello world!\n"};
Note that this is very dangerous if you are evaluating anything extracted from user input, and is generally a poor practice anyway as you can generally define your code in a coderef as in the earlier examples above.
You can store state with a state variable (new in perl5.10), or with a variable scoped higher than the sub itself, as a closure:
use feature 'state';
sub baz
{
state $x;
return ++$x;
}
# create a new scope so that $y is not visible to other functions in this package
{
my $y;
sub quux
{
return ++$y;
}
}
Return a subroutine reference.
Here's a simple example that creates sub refs closed over a value:
my $add_5_to = add_x_to(5);
print $add_5_to->(7), "\n";
sub add_x_to {
my $x = shift;
return sub { my $value = shift; return $x + $value; };
}
You can also work with named subs like this:
sub op {
my $name = shift;
return $op eq 'add' ? \&add : sub {};
}
sub add {
my $l = shift;
my $r = shift;
return $l + $r;
}
You can use eval with an arbitrary string, but don't do it. The code is hard to read and it restarts compilation, which slows everything down. There are a small number of cases where string eval is the best tool for the job. Any time string eval seems like a good idea, you are almost certainly better off with another approach.
Almost anything you would like to do with string eval can be achieved with closures.
Returning subs is easy by using the sub keyword. The returned sub closes over the lexical variables it uses:
#!/usr/bin/perl
use strict; use warnings;
sub mk_count_from_to {
my ($from, $to) = #_;
return sub {
return if $from > $to;
return $from ++;
};
}
my $c = mk_count_from_to(-5, 5);
while ( defined( my $n = $c->() ) ) {
print "$n\n";
}
5.10 introduced state variables.
Executing text as Perl is accomplished using eval EXPR:
the return value of EXPR is parsed and executed as if it were a little Perl program. The value of the expression (which is itself determined within scalar context) is first parsed, and if there weren't any errors, executed in the lexical context of the current Perl program, so that any variable settings or subroutine and format definitions remain afterwards. Note that the value is parsed every time the eval executes
Executing arbitrary strings will open up huge gaping security holes.
You can create anonymous subroutines and access them via a reference; this reference can of course be assigned to a scalar:
my $subref = sub { ... code ... }
or returned from another subroutine
return sub { ... code ... }
If you need to store states, you can create closures with lexical variables defined in an outer scope like:
sub create_func {
my $state;
return sub { ... code that can refer to $state ... }
}
You can run code with eval

Is there a better way to pass by reference in Perl?

I am doing pass-by-reference like this:
use strict;
use warnings;
sub repl {
local *line = \$_[0]; our $line;
$line = "new value";
}
sub doRepl {
my ($replFunc) = #_;
my $foo = "old value";
$replFunc->($foo);
print $foo; # prints "new value";
}
doRepl(\&repl);
Is there a cleaner way of doing it?
Prototypes don't work because I'm using a function reference (trust me that there's a good reason for using a function reference).
I also don't want to use $_[0] everywhere in repl because it's ugly.
Have you looked at Data::Alias? It lets you create lexically-scoped aliases with a clean syntax.
You can use it to create pass-by-reference semantics like this:
use strict;
use warnings;
use Data::Alias;
sub foo {
alias my ($arg) = #_;
$arg++;
}
my $count = 0;
foo($count);
print "$count\n";
The output is 1, indicating that the call to foo modified its argument.
There are a couple of ways to do this. Explicitly pass a scalar ref to $foo, or take advantage of Perl's built-in pass by reference semantics.
Explicit reference:
my $foo = "old value";
doRepl( \&repl, \$foo );
print $foo; # prints "new value";
sub repl {
my $line = shift;
$$line = "new value";
}
sub doRepl {
my ($replFunc, $foo) = #_;
$replFunc->($foo);
}
Pass by reference:
my $foo = "old value";
doRepl( \&repl, $foo );
print $foo; # prints "new value";
sub repl {
$_[0] = "new value";
}
sub doRepl {
my $replFunc = shift;
$replFunc->(#_);
}
Even fancier pass by reference:
my $foo = "old value";
doRepl( \&repl, $foo );
print $foo; # prints "new value";
sub repl {
$_[0] = "new value";
}
sub doRepl {
my $replFunc = shift;
&$replFunc;
}
The first one use normal perl hard references to do the job.
The first pass by ref method uses the fact that Perl passes arguments to all functions as references. The elements of #_ are actually aliases to the values in the argument list when the subroutine is called. By altering $_[0] in foo(), you actually alter the first argument to foo().
The second pass by ref method use the fact that a sub called with an & sigil and no parens gets the #_ array of its caller. Otherwise it is identical.
Update: I just noticed you desire to avoid $_[0]. You can do this in repl if you want:
sub repl {
for my $line( $_[0] ) {
$line = 'new value';
}
}
sub repl {
my $line = \$_[0]; # or: my $line = \shift
$$line = "new value";
}
I don't think there is anything wrong with using local to create the alias in this case.
Dynamic scope is of course a powerful feature, but so long as you are aware of the side effects (new value is visible in functions called from its scope, if a lexical of the same name is in scope, it can't be localized, ...) then it is a useful addition to the already overflowing Perl toolbox.
The main reason for the warnings in the Perl docs about local are to keep people from inadvertently using it instead of my and to ease the transition from perl4. But there are definitely times when local is useful, and this is one.
Using for to create your alias is also an option, but I find the explicit syntax with local clearer in its intent. It is also a bit faster if performance is a concern.