Perl built-in functions as a subroutine reference

I am trying to define a set of operations to be performed as an array. For this, I have to pass subroutine references. (There may be other ways to do this without using an array, but I feel this is best for now, due to certain other constraints.)
Basic sample code for what I am trying to do:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
sub test()
{
print "Tested\n";
}
my $test;
my @temp = (1, 2, 3);
my $operations = [
[\&test, undef, undef],
[\&shift, \$test, \@temp],
];
foreach(@$operations){
my $func = shift @$_;
my $out = shift @$_;
$$out = $func->(@$_);
}
print Dumper $test;
Output observed is:
Tested
Undefined subroutine &main::shift called at temp2.pl line 22.
Query: Is it possible to pass built-in subroutines as a reference?
There are earlier questions about built-in functions as subroutine references here.
As that question was asked about 3 years earlier, I was wondering if there is any alternative for it now.
I would appreciate it if someone could explain why there is a distinction between built-in functions and user-defined subroutines in this scenario.

shift isn't a sub; it's an operator just like and and +. You'll need to create a sub if you want a reference to a sub.
[sub { shift(@{$_[0]}) }, \$test, \@temp],
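A minimal sketch of the whole operations list with that wrapper in place, reusing the names from the question (test, $test, @temp). The loop copies each entry instead of shifting it, and the `defined $out` guard is an addition for illustration, not part of the original code:

```perl
use strict;
use warnings;

sub test { return "Tested" }

my $test;
my @temp = (1, 2, 3);

# Each entry: [ code ref, where to store the result (or undef), arguments... ]
my $operations = [
    [ \&test,                   undef,  undef  ],
    [ sub { shift @{ $_[0] } }, \$test, \@temp ],   # wrapper around the shift operator
];

for my $op (@$operations) {
    my ($func, $out, @args) = @$op;   # copy, so $operations stays intact
    my $result = $func->(@args);
    $$out = $result if defined $out;  # skip entries with no output slot
}
```

After the loop, $test holds the element that the wrapper removed from @temp.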
Related:
What are Perl built-in operators/functions?
How to get a reference to print?

Related

In Perl, how to assign the print function to a variable?

I need to control the print method using a variable
My code is below
#!/usr/bin/perl
# test_assign_func.pl
use strict;
use warnings;
sub echo {
my ($string) = @_;
print "from echo: $string\n\n";
}
my $myprint = \&echo;
$myprint->("hello");
$myprint = \&print;
$myprint->("world");
When I ran it, I got the following error for the assignment of the print function:
$ test_assign_func.pl
from echo: hello
Undefined subroutine &main::print called at test_assign_func.pl line 17.
Looks like I need to prefix a namespace to print function but I cannot find the name space. Thank you for any advice!
print is an operator, not a sub.
perlfunc:
The functions in this section can serve as terms in an expression. They fall into two major categories: list operators and named unary operators.
Perl provides a sub for named operators that can be duplicated by a sub with a prototype. A reference to these can be obtained using \&CORE::name.
my $f = \&CORE::length;
say $f->("abc"); # 3
But print isn't such an operator (because of the way it accepts a file handle). For these, you'll need to create a sub with a more limited calling convention.
my $f = sub { print @_ };
$f->("abc\n");
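Both cases can sit side by side in one table of code references. This is only a sketch: the %op hash, its keys, and the in-memory filehandle are assumptions made for illustration. The wrapper takes the filehandle as an explicit first argument, which is the "more limited calling convention" mentioned above.

```perl
use v5.16;      # \&CORE::name needs perl 5.16 or later
use warnings;

my %op = (
    length => \&CORE::length,                          # real sub provided by perl
    emit   => sub { my $fh = shift; print {$fh} @_ },  # hand-written wrapper for print
);

my $n = $op{length}->("abc");

# Write to an in-memory handle so the result can be inspected.
open my $fh, '>', \my $buffer or die "Cannot open in-memory handle: $!";
$op{emit}->($fh, "abc\n");
close $fh;
```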
Related:
What are Perl built-in operators/functions?
As mentioned in CORE, some functions can't be called as subroutines, only as barewords. print is one of them.

Can I make a variable optional in a perl sub prototype?

I'd like to understand if it's possible to have a sub prototype and optional parameters in it. With prototypes I can do this:
sub some_sub (\@\@\@) {
...
}
my @foo = qw/a b c/;
my @bar = qw/1 2 3/;
my @baz = qw/X Y Z/;
some_sub(@foo, @bar, @baz);
which is nice and readable, but the minute I try to do
some_sub(@foo, @bar);
or even
some_sub(@foo, @bar, ());
I get errors:
Not enough arguments for main::some_sub at tablify.pl line 72, near "@bar)"
or
Type of arg 3 to main::some_sub must be array (not stub) at tablify.pl line 72, near "))"
Is it possible to have a prototype and a variable number of arguments? or is something similar achievable via signatures?
I know it could be done by always passing arrayrefs; I was wondering if there was another way. After all, TMTOWTDI.
All arguments after a semi-colon are optional:
sub some_sub(\@\@;\@) {
}
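A small sketch of that prototype in use. The sub name count_arrays and its body are made up for illustration; the point is only that the third array may be left out, in which case its slot in @_ is simply absent:

```perl
use strict;
use warnings;

# Two required arrays, then an optional third (everything after ';' is optional).
sub count_arrays(\@\@;\@) {
    my ($first, $second, $third) = @_;   # each is an array ref; $third may be undef
    return scalar grep { defined $_ } $first, $second, $third;
}

my @foo = qw/a b c/;
my @bar = qw/1 2 3/;
my @baz = qw/X Y Z/;

my $two   = count_arrays(@foo, @bar);         # third argument omitted
my $three = count_arrays(@foo, @bar, @baz);
```

The sub is defined before the calls because a prototype only takes effect for calls compiled after perl has seen it.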
Most people are going to expect your argument list to flatten, and you are reaching for an outdated tool to do what people don't expect.
Instead, pass data structures by reference:
some_sub( \@array1, \@array2 );
sub some_sub {
my @args = @_;
say "Array 1 has " . $args[0]->@* . " elements";
}
If you want to use those as named arrays within the sub, you can use ref aliasing
use v5.22;
use experimental qw(ref_aliasing);
sub some_sub {
\my( @array1 ) = $_[0];
...
}
With v5.26, you can move the reference operator inside the parens:
use v5.26;
use experimental qw(declared_refs);
sub some_sub {
my( \@array1 ) = $_[0];
...
}
And, remember that v5.20 introduced the :prototype attribute so you can distinguish between prototypes and signatures:
use v5.20;
sub some_sub :prototype(@@;@) { ... }
I write about these things at The Effective Perler (which you already read, I see), in Perl New Features, and a little bit in Preparing for Perl 7 (which is mostly about what you need to stop doing in Perl 5 to be future-proof).

Directly changing an argument to a subroutine

It would make a lot of things easier in my script if I could use subroutines in the way that shift, push, and other built-in subroutines work: they can all directly change the variable that is passed to them without the need to return the change.
When I try to do this the variable is copied at some point and I appear to be simply changing the copy. I understand that this would be fine with references but it even happens with arrays and hashes, where I feel like I am simply passing the variable I was working on to the sub so that more work can be done on it:
@it = (10,11);
changeThis(@it);
print join(" ", @it),"\n"; #prints 10 11 but not 12
sub changeThis{
$_[2] = 12;
}
Is there a way to do this? I understand that it isn't best practice, but in my case it would be very convenient.
That's what prototypes are for:
#!/usr/bin/perl
use strict;
use warnings;
sub changeThis(\@); # the argument will be seen as an array ref (prototype must come before the call!)
my @it = (10,11);
changeThis @it; # even when called with an array
print join(" ", @it),"\n"; #prints 10 11 12
sub changeThis(\@)
{ my( $ar)= @_; $ar->[2]= 12; }
See http://perldoc.perl.org/perlsub.html#Prototypes for more information.
It's not really a popular method though, passing actual array references is probably a better alternative, with less magic involved.
The problem is that the sub call flattens the array into a list of its elements, and that list is what the subroutine receives. The array itself is not passed, so assigning to $_[2] adds an element to @_, not to @it. Your sub call is equal to:
changeThis(10, 11);
If you wish to change the original array, pass a reference instead:
use strict;
use warnings;
my #it = (10,11);
changeThis(\@it);
print join(" ", @it),"\n";
sub changeThis{
my $array = shift;
$$array[2] = 12;
}
Also, @_[2] will give you the warning:
Scalar value @_[2] better written as $_[2]
if you use warnings, which of course you should. There is no good reason to not turn on warnings and strict, unless you know exactly what you are doing.
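To make the difference concrete, here is a short sketch with three hypothetical subs (the names are invented for this example). Existing elements of @_ are aliases to the caller's elements, so they can be changed in place, but growing @_ does not grow the caller's array; only a reference gives access to the array itself:

```perl
use strict;
use warnings;

sub bump_first       { $_[0]++ }          # changes the caller's element through the alias
sub append_via_index { $_[2] = 12 }       # only extends @_, the caller's array is untouched
sub append_via_ref   { my ($aref) = @_; push @$aref, 12 }   # works on the array itself

my @it = (10, 11);
bump_first(@it);         # @it is now (11, 11)
append_via_index(@it);   # @it is still (11, 11)
append_via_ref(\@it);    # @it is now (11, 11, 12)
```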
As the previous answers suggest, you should use a reference passed to the subroutine.
Additionally, you can also use implicit referencing, if you want to read through the documentation for prototypes:
sub changeThis(\@);
@it = (10,11);
changeThis @it;
print join(" ", @it),"\n"; #prints 10 11 12
sub changeThis(\@){
$_[0][2] = 12;
}
(note that you either have to predeclare your subs before the first call or put the sub definitions on top.)

Is 'shift' evil for processing Perl subroutine parameters?

I'm frequently using shift to unpack function parameters:
sub my_sub {
my $self = shift;
my $params = shift;
....
}
However, many of my colleagues are preaching that shift is actually evil. Could you explain why I should prefer
sub my_sub {
my ($self, $params) = @_;
....
}
to shift?
The use of shift to unpack arguments is not evil. It's a common convention and may be the fastest way to process arguments (depending on how many there are and how they're passed). Here's one example of a somewhat common scenario where that's the case: a simple accessor.
use Benchmark qw(cmpthese);
sub Foo::x_shift { shift->{'a'} }
sub Foo::x_ref { $_[0]->{'a'} }
sub Foo::x_copy { my $s = $_[0]; $s->{'a'} }
our $o = bless {a => 123}, 'Foo';
cmpthese(-2, { x_shift => sub { $o->x_shift },
x_ref => sub { $o->x_ref },
x_copy => sub { $o->x_copy }, });
The results on perl 5.8.8 on my machine:
            Rate x_copy   x_ref x_shift
x_copy  772761/s     --    -12%    -19%
x_ref   877709/s    14%      --     -8%
x_shift 949792/s    23%      8%      --
Not dramatic, but there it is. Always test your scenario on your version of perl on your target hardware to find out for sure.
shift is also useful in cases where you want to shift off the invocant and then call a SUPER:: method, passing the remaining @_ as-is.
sub my_method
{
my $self = shift;
...
return $self->SUPER::my_method(@_);
}
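A self-contained sketch of that delegation pattern, using two hypothetical classes (Animal and Dog are invented for this example and are not from the original answer):

```perl
use strict;
use warnings;

{
    package Animal;   # hypothetical base class
    sub new   { my $class = shift; return bless {}, $class }
    sub speak { my $self = shift; return join ' ', 'Animal hears:', @_ }
}

{
    package Dog;      # hypothetical subclass
    our @ISA = ('Animal');
    sub speak {
        my $self = shift;                                   # remove the invocant...
        return 'Dog relays: ' . $self->SUPER::speak(@_);    # ...and forward the rest untouched
    }
}

my $heard = Dog->new->speak('hi', 'there');
```

Because the invocant has already been shifted off, @_ contains exactly the arguments the parent method expects.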
If I had a very long series of my $foo = shift; operations at the top of a function, however, I might consider using a mass copy from @_ instead. But in general, if you have a function or method that takes more than a handful of arguments, using named parameters (i.e., catching all of @_ in a %args hash or expecting a single hash reference argument) is a much better approach.
It is not evil, it is a taste sort of thing. You will often see the styles used together:
sub new {
my $class = shift;
my %self = @_;
return bless \%self, $class;
}
I tend to use shift when there is one argument or when I want to treat the first few arguments differently than the rest.
This is, as others have said, a matter of taste.
I generally prefer to shift my parameters into lexicals because it gives me a standard place to declare a group of variables that will be used in the subroutine. The extra verbosity gives me a nice place to hang comments. It also makes it easy to provide default values in a tidy fashion.
sub foo {
my $foo = shift; # a thing of some sort.
my $bar = shift; # a horse of a different color.
my $baz = shift || 23; # a pale horse.
# blah
}
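One caveat with the `shift || 23` default is that it also replaces legitimate false values such as 0 or the empty string. A sketch of the difference, assuming perl 5.10 or later for the defined-or operator (the sub names are made up for this example):

```perl
use strict;
use warnings;
use v5.10;   # the // operator needs perl 5.10 or later

sub with_or         { my $baz = shift || 23; return $baz }   # default for any false value
sub with_defined_or { my $baz = shift // 23; return $baz }   # default only for undef

my @results = (with_or(0), with_defined_or(0), with_or(), with_defined_or());
```

Only with_defined_or keeps the caller's 0; both fall back to 23 when no argument is given.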
If you are really concerned about the speed of calling your routines, don't unpack your arguments at all--access them directly using @_. Be careful, those are aliases to the caller's data you are working with. This is a common idiom in POE. POE provides a bunch of constants that you use to get positional parameters by name:
sub some_poe_state_handler {
$_[HEAP]{some_data} = 'chicken';
$_[KERNEL]->yield('next_state');
}
Now the big stupid bug you can get if you habitually unpack params with shift is this one:
sub foo {
my $foo = shift;
my $bar = shift;
my @baz = shift;
# I should really stop coding and go to bed. I am making dumb errors.
}
I think that consistent code style is more important than any particular style. If all my coworkers used the list assignment style, I'd use it too.
If my coworkers said there was a big problem using shift to unpack, I'd ask for a demonstration of why it is bad. If the case is solid, then I'd learn something. If the case is bogus, I could then refute it and help stop the spread of anti-knowledge. Then I'd suggest that we determine a standard method and follow it for future code. I might even try to set up a Perl::Critic policy to check for the decided upon standard.
Assigning @_ to a list can bring some helpful additional features.
It makes it slightly easier to add additional named parameters at a later date, as you modify your code. Some people consider this a feature, similar to how finishing a list or a hash with a trailing ',' makes it slightly easier to append members in the future.
If you're in the habit of using this idiom, then shifting the arguments might seem harmful, because if you edit the code to add an extra argument, you could end up with a subtle bug, if you don't pay attention.
e.g.
sub do_something {
my ($foo) = @_;
}
later edited to
sub do_something {
my ($foo,$bar) = @_; # still works
}
however
sub do_another_thing {
my $foo = shift;
}
If another colleague, who uses the first form dogmatically (perhaps they think shift is evil) edits your file and absent-mindedly updates this to read
sub do_another_thing {
my ($foo, $bar) = shift; # still 'works', but $bar not defined
}
and they may have introduced a subtle bug.
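The effect of that slip is easy to see in isolation. In this sketch (sub and variable names are invented for illustration), shift returns a single value, so the second variable in the list assignment never receives anything:

```perl
use strict;
use warnings;

sub from_shift { my ($foo, $bar) = shift; return ($foo, $bar) }   # $bar is always undef
sub from_args  { my ($foo, $bar) = @_;    return ($foo, $bar) }   # both are filled

my @via_shift = from_shift('x', 'y');
my @via_args  = from_args('x', 'y');
```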
Assigning from @_ can be more compact and efficient with vertical space when you have a small number of parameters to assign at once. It also allows you to supply default arguments if you're using the hash style of named function parameters
e.g.
my (%arguments) = (user=>'defaultuser',password=>'password',@_);
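Wrapped in a sub, the same line behaves as follows. This is a sketch; the sub name connect_info and its return value are assumptions made for the example. Later key/value pairs in a list overwrite earlier ones, so anything the caller passes replaces the default:

```perl
use strict;
use warnings;

sub connect_info {
    # Defaults first, caller's pairs last, so the caller wins on duplicate keys.
    my (%arguments) = (user => 'defaultuser', password => 'password', @_);
    return "$arguments{user}:$arguments{password}";
}

my $all_defaults = connect_info();
my $custom_user  = connect_info(user => 'alice');
```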
I would still consider it a question of style / taste. I think the most important thing is to apply one style or the other with consistency, obeying the principle of least surprise.
I don't think shift is evil. The use of shift shows your willingness to actually name variables - instead of using $_[0].
Personally, I use shift when there's only one parameter to a function. If I have more than one parameter, I'll use the list context.
my $result = decode($theString);
sub decode {
my $string = shift;
...
}
my $otherResult = encode($otherString, $format);
sub encode {
my ($string,$format) = @_;
...
}
There is an optimization for list assignment.
The only reference I could find, is this one.
5.10.0 inadvertently disabled an optimization, which caused a measurable performance drop in list assignment, such as is often used to assign function parameters from @_. The optimisation has been re-instated, and the performance regression fixed.
This is an example of the affected performance regression.
sub example{
my($arg1,$arg2,$arg3) = @_;
}
Perl::Critic is your friend here. It follows the "standards" set up in Damian Conway's book Perl Best Practices. Running it with --verbose 11 gives you an explanation on why things are bad. Not unpacking @_ first in your subs is a severity 4 (out of 5). E.g:
echo 'package foo; use warnings; use strict; sub quux { my foo= shift; my (bar,baz) = @_;};1;' | perlcritic -4 --verbose 11
Always unpack @_ first at line 1, near 'sub quux { my foo= shift; my (bar,baz) = @_;}'.
Subroutines::RequireArgUnpacking (Severity: 4)
Subroutines that use `@_' directly instead of unpacking the arguments to
local variables first have two major problems. First, they are very hard
to read. If you're going to refer to your variables by number instead of
by name, you may as well be writing assembler code! Second, `@_'
contains aliases to the original variables! If you modify the contents
of a `@_' entry, then you are modifying the variable outside of your
subroutine. For example:
sub print_local_var_plus_one {
my ($var) = @_;
print ++$var;
}
sub print_var_plus_one {
print ++$_[0];
}
my $x = 2;
print_local_var_plus_one($x); # prints "3", $x is still 2
print_var_plus_one($x); # prints "3", $x is now 3 !
print $x; # prints "3"
This is spooky action-at-a-distance and is very hard to debug if it's
not intentional and well-documented (like `chop' or `chomp').
An exception is made for the usual delegation idiom
`$object->SUPER::something( @_ )'. Only `SUPER::' and `NEXT::' are
recognized (though this is configurable) and the argument list for the
delegate must consist only of `( @_ )'.
It isn't intrinsically evil, but using it to pull off the arguments of a subroutine one by one is comparatively slow and requires a greater number of lines of code.

How would I do the equivalent of Prototype's Enumerator.detect in Perl with the least amount of code?

Lately I've been thinking a lot about functional programming. Perl offers quite a few tools to go that way, however there's something I haven't been able to find yet.
Prototype has the function detect for enumerators; its description is simply this:
Enumerator.detect(iterator[, context]) -> firstElement | undefined
Finds the first element for which the iterator returns true.
Enumerator in this case is any list while iterator is a reference to a function, which is applied in turn on each element of the list.
I am looking for something like this to apply in situations where performance is important, i.e. when stopping upon encountering a match saves time by disregarding the rest of the list.
I am also looking for a solution that would not involve loading any extra module, so if possible it should be done with builtins only. And if possible, it should be as concise as this for example:
my @result = map function @array;
You say you don't want a module, but this is exactly what the first function in List::Util does. That's a core module, so it should be available everywhere.
use List::Util qw(first);
my $first = first { some condition } @array;
If you insist on not using a module, you could copy the implementation out of List::Util. If somebody knew a faster way to do it, it would be in there. (Note that List::Util includes an XS implementation, so that's probably faster than any pure-Perl approach. It also has a pure-Perl version of first, in List::Util::PP.)
Note that the value being tested is passed to the subroutine in $_ and not as a parameter. This is a convenience when you're using the first { some condition } @values form, but is something you have to remember if you're using a regular subroutine. Some more examples:
use 5.010; # I want to use 'say'; nothing else here is 5.10 specific
use List::Util qw(first);
say first { $_ > 3 } 1 .. 10; # prints 4
sub wanted { $_ > 4 }; # note we're using $_ not $_[0]
say first \&wanted, 1 .. 10; # prints 5
my $want = \&wanted; # Get a subroutine reference
say first \&$want, 1 .. 10; # This is how you pass a reference in a scalar
# someFunc expects a parameter instead of looking at $_
say first { someFunc($_) } 1 .. 10;
Untested since I don't have Perl on this machine, but:
sub first(&@) {
my $pred = shift;
die "First argument to 'first' must be a sub" unless ref $pred eq 'CODE';
for my $val (@_) {
return $val if $pred->($val);
}
return undef;
}
Then use it as:
my $first = first { sub performing test } @list;
Note that this doesn't distinguish between no matches in the list and one of the elements in the list being an undefined value and having that match.
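One possible way around that ambiguity is to return a found flag together with the value. This is only a sketch: find_first and its two-element return list are invented here and are not how List::Util behaves. As in the function above, the predicate receives the element as its first argument:

```perl
use strict;
use warnings;

# Returns (1, $value) for the first match, or (0, undef) when nothing matches.
sub find_first(&@) {
    my $pred = shift;
    for my $val (@_) {
        return (1, $val) if $pred->($val);
    }
    return (0, undef);
}

# An undefined element can now be told apart from "no match".
my ($found, $value) = find_first { !defined $_[0] } 1, undef, 3;
my ($found_none)    = find_first { $_[0] > 10 } 1, 2, 3;
```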
Just since it's not here, a Perl function definition of first that localizes $_ for its block:
sub first (&@) {
my $code = shift;
for (@_) {return $_ if $code->()}
undef
}
my @array = 1 .. 10;
say first {$_ > 5} @array; # prints 6
While it will work fine, I don't advocate using this version, since List::Util is a core module (installed by default), and its implementation of first will usually use the XS version (written in C) which is much faster.