In Perl, is it bad practice to call multiple subroutines with default arguments?

I am learning Perl and understand that it is a common and accepted practice to unpack subroutine arguments using shift. I also understand that it is common and acceptable practice to omit function arguments to use the default @_ array.
Considering these two things, if you call a subroutine without arguments, @_ can (and will, if using shift) be changed. Does this mean that calling another subroutine with default arguments, or, in fact, using the @_ array after this, is considered bad practice? Consider this example:
sub total { # calculate sum of all arguments
    my $running_sum;
    # take arguments one by one and sum them together
    while (@_) {
        $running_sum += shift;
    }
    $running_sum;
}
sub avg { # calculate the mean of given arguments
    if (@_ == 0) { return }
    my $sum = &total; # gets the correct answer, but changes @_
    $sum / @_ # causes division by zero, since @_ is now empty
}
My gut feeling tells me that using shift to unpack arguments would actually be bad practice, unless your subroutine is actually supposed to change the passed arguments, but I have read in multiple places, including Stack Overflow, that this is not a bad practice.
So the question is: if using shift is common practice, should I always assume the passed argument list could get changed, as a side-effect of the subroutine (like the &total subroutine in the quoted example)? Is there maybe a way to pass arguments by value, so I can be sure that the argument list does not get changed, so I could use it again (like in the &avg subroutine in the quoted text)?

In general, shifting from the arguments is ok—using the & sigil to call functions isn't. (Except in some very specific situations you'll probably never encounter.)
Your code could be rewritten so that total doesn't shift from @_. Using a for loop may even be more efficient.
sub total {
my $total = 0;
$total += $_ for @_;
$total;
}
Or you could use the sum function from List::Util:
use List::Util qw(sum);
sub avg { @_ ? sum(@_) / @_ : 0 }
Using shift isn't that common, except for extracting $self in object oriented Perl. But as you always call your functions like foo( ... ), it doesn't matter if foo shifts or doesn't shift the argument array.
(The only thing worth noting about a function is whether it assigns to elements in @_, as these are aliases for the variables you gave as arguments. Assigning to elements in @_ is usually bad.)
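To make the aliasing point concrete, here is a minimal sketch. The sub names bump_alias and bump_copy are invented for this illustration and are not part of the question's code.

```perl
use strict;
use warnings;

# Hypothetical helper: writes through the alias in @_,
# so the caller's variable is modified.
sub bump_alias { $_[0]++; return }

# Hypothetical helper: copies the argument into a lexical first,
# so the caller's variable is left alone.
sub bump_copy {
    my ($n) = @_;
    $n++;
    return $n;
}

my $x = 1;
bump_alias($x);                # $x has been changed by the callee
my $y = 1;
my $bumped = bump_copy($y);    # $y is untouched; only the copy changed
print "x=$x y=$y bumped=$bumped\n";
```

Shifting or list-assigning from @_ both produce copies, which is why either unpacking style avoids this kind of side effect.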
Even if you can't change the implementation of total, calling the sub with an explicit argument list is safe, as the argument list is a copy of the array:
(a) &total: calls total with the identical @_, and overrides prototypes.
(b) total(@_): calls total with a copy of @_.
(c) &total(@_): calls total with a copy of @_, and overrides prototypes.
Form (b) is standard. Form (c) shouldn't be seen, except in very few cases for subs inside the same package where the sub has a prototype (and don't use prototypes), and they have to be overridden for some obscure reason. A testament to poor design.
Form (a) is only sensible for tail calls (@_ = (...); goto &foo) or other forms of optimization (and premature optimization is the root of all evil).
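The difference between forms (a) and (b) can be sketched as follows. The wrapper names count_after_amp and count_after_parens are made up for this example; total has the same shifting shape as in the question.

```perl
use strict;
use warnings;

# Same shape as the question's total(): it consumes @_ with shift.
sub total {
    my $sum = 0;
    while (@_) { $sum += shift @_; }
    return $sum;
}

# Form (a): "&total;" shares the caller's @_, so total() empties it.
sub count_after_amp {
    my $sum = &total;
    return scalar @_;    # number of arguments left in our own @_
}

# Form (b): "total(@_)" builds a fresh argument list for the callee.
sub count_after_parens {
    my $sum = total(@_);
    return scalar @_;    # our own @_ is untouched
}

printf "&total;    leaves %d args\n", count_after_amp(1, 2, 3);
printf "total(\@_) leaves %d args\n", count_after_parens(1, 2, 3);
```

With form (b) the callee can shift as much as it likes; only the elements themselves are still aliases, as noted above.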

You should avoid using the &func; style of calling unless you have a really good reason, and trust that others do the same.
To guard your @_ against modification by a callee, just do &func() or func.

Perl is a little too lax sometimes and having multiple ways of accessing input parameters can make smelly and inconsistent code. For want of a better answer, try to impose your own standard.
Here are a few ways I've used and seen:
Shifty
sub login
{
my $user = shift;
my $passphrase = shift;
# Validate authentication
return 0;
}
Expanding @_
sub login
{
my ($user, $passphrase) = @_;
# Validate authentication
return 0;
}
Explicit indexing
sub login
{
my $user = $_[0];
my $passphrase = $_[1];
# Validate authentication
return 0;
}
Enforce parameters with function prototypes (this is not popular however)
sub login($$)
{
my ($user, $passphrase) = @_;
# Validate authentication
return 0;
}
Sadly you still have to perform your own convoluted input validation/taint checking, ie:
return unless defined $user;
return unless defined $passphrase;
or better still, a little more informative
unless (defined($user) && defined($passphrase)) {
carp "Input error: user or passphrase not defined";
return -1;
}
Perldoc perlsub should really be your first port of call.
Hope this helps!

Here are some examples where the careful use of @_ matters.
1. Hash-y Arguments
Sometimes you want to write a function which can take a list of key-value pairs, but one is the most common use and you want that to be available without needing a key. For example
sub get_temp {
my $location = @_ % 2 ? shift : undef;
my %options = @_;
$location ||= $options{location};
...
}
So now if you call the function with an odd number of arguments, the first is location. This allows get_temp('Chicago') or get_temp('New York', unit => 'C') or even get_temp( unit => 'K', location => 'Nome, Ak'). This may be a more convenient API for your users. By shifting the odd argument, now @_ is an even list and may be assigned to a hash.
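A runnable sketch of the same idea follows. The body is a hypothetical stand-in: instead of looking up a temperature it only reports how the arguments were interpreted, and the default unit of 'F' is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real lookup.
sub get_temp {
    my $location = @_ % 2 ? shift(@_) : undef;   # odd count: leading positional
    my %options  = @_;                           # the rest is key/value pairs
    $location ||= $options{location};
    my $unit = $options{unit} || 'F';            # assumed default
    return "$location in $unit";
}

print get_temp('Chicago'), "\n";
print get_temp('New York', unit => 'C'), "\n";
print get_temp(unit => 'K', location => 'Nome, AK'), "\n";
```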
2. Dispatching
Let's say we have a class whose methods we want to dispatch by name (AUTOLOAD could possibly be useful, but we will hand-roll it). Perhaps this is a command-line script whose arguments are method names. In this case we define two dispatch methods, one "clean" and one "dirty". If we call with the -c flag we get the clean one. These methods find the method by name and call it. The difference is how. The dirty one leaves itself in the stack trace; the clean one has to be more clever, but dispatches without being in the stack trace. We add a death method which gives us that trace.
#!/usr/bin/env perl
use strict;
use warnings;
package Unusual;
use Carp;
sub new {
my $class = shift;
return bless { @_ }, $class;
}
sub dispatch_dirty {
my $self = shift;
my $name = shift;
my $method = $self->can($name) or confess "No method named $name";
$self->$method(@_);
}
sub dispatch_clean {
my $self = shift;
my $name = shift;
my $method = $self->can($name) or confess "No method named $name";
unshift @_, $self;
goto $method;
}
sub death {
my ($self, $message) = @_;
$message ||= 'died';
confess "$self->{name}: $message";
}
package main;
use Getopt::Long;
GetOptions
'clean' => \my $clean,
'name=s' => \(my $name = 'Robot');
my $obj = Unusual->new(name => $name);
if ($clean) {
$obj->dispatch_clean(@ARGV);
} else {
$obj->dispatch_dirty(@ARGV);
}
So now if we call ./test.pl to invoke the death method
$ ./test.pl death Goodbye
Robot: Goodbye at ./test.pl line 32
Unusual::death('Unusual=HASH(0xa0f7188)', 'Goodbye') called at ./test.pl line 19
Unusual::dispatch_dirty('Unusual=HASH(0xa0f7188)', 'death', 'Goodbye') called at ./test.pl line 46
but we see dispatch_dirty in the trace. If instead we call ./test.pl -c we now use the clean dispatcher and get
$ ./test.pl -c death Adios
Robot: Adios at ./test.pl line 33
Unusual::death('Unusual=HASH(0x9427188)', 'Adios') called at ./test.pl line 44
The key here is the goto (not the evil goto) which takes the subroutine reference and immediately switches the execution to that reference, using the current @_. This is why I have to unshift @_, $self so that the invocant is ready for the new method.
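The stack-trace effect can also be seen without Carp, using caller directly. This is a small sketch with invented sub names (report, via_call, via_goto, outer_call, outer_goto); it is not taken from the script above.

```perl
use strict;
use warnings;

# Returns the fully qualified name of the sub that called report().
sub report { return (caller(1))[3] }

# Ordinary call: via_call stays on the stack while report() runs.
sub via_call { return report(@_) }

# goto &sub: via_goto's frame is replaced by report's frame.
sub via_goto { goto &report }

sub outer_call { return via_call() }
sub outer_goto { return via_goto() }

print "ordinary call, report sees: ", outer_call(), "\n";
print "goto call, report sees:     ", outer_goto(), "\n";
```

In the first case report sees via_call as its caller; in the second it sees outer_goto, because via_goto has removed itself from the call stack.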

Refs:
sub refWay{
my ($refToArray,$secondParam,$thirdParam) = @_;
#work here
}
refWay(\@array, 'a','b');
HashWay:
sub hashWay{
my $refToHash = shift; #(if pass ref to hash)
#and i know, that:
return undef unless exists $refToHash->{'user'};
return undef unless exists $refToHash->{'password'};
#or the same in loop:
for (qw(user password etc)){
return undef unless exists $refToHash->{$_};
}
}
hashWay({'user'=>YourName, 'password'=>YourPassword});

I tried a simple example:
#!/usr/bin/perl
use strict;
sub total {
my $sum = 0;
while(@_) {
$sum = $sum + shift;
}
return $sum;
}
sub total1 {
my ($a, $aa, $aaa) = @_;
return ($a + $aa + $aaa);
}
my $s;
$s = total(10, 20, 30);
print $s;
$s = total1(10, 20, 30);
print "\n$s";
Both print statements gave answer as 60.
But personally I feel the arguments should be accepted in this manner:
my ($arguments, @garb) = @_;
in order to avoid any sort of issue later.

I found the following gem in http://perldoc.perl.org/perlsub.html:
"Yes, there are still unresolved issues having to do with visibility of @_ . I'm ignoring that question for the moment. (But note that if we make @_ lexically scoped, those anonymous subroutines can act like closures... (Gee, is this sounding a little Lispish? (Never mind.)))"
You might have run into one of those issues :-(
OTOH amon is probably right -> +1

Related

Can I make a variable optional in a perl sub prototype?

I'd like to understand if it's possible to have a sub prototype and optional parameters in it. With prototypes I can do this:
sub some_sub (\@\@\@) {
...
}
my @foo = qw/a b c/;
my @bar = qw/1 2 3/;
my @baz = qw/X Y Z/;
some_sub(@foo, @bar, @baz);
which is nice and readable, but the minute I try to do
some_sub(@foo, @bar);
or even
some_sub(@foo, @bar, ());
I get errors:
Not enough arguments for main::some_sub at tablify.pl line 72, near "@bar)"
or
Type of arg 3 to main::some_sub must be array (not stub) at tablify.pl line 72, near "))"
Is it possible to have a prototype and a variable number of arguments? or is something similar achievable via signatures?
I know it could be done by always passing arrayrefs; I was wondering if there was another way. After all, TMTOWTDI.
All arguments after a semi-colon are optional:
sub some_sub(\@\@;\@) {
}
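As a small illustration of the optional third array, here is a sketch with a hypothetical sub, count_elems, that uses the same prototype and simply counts the elements it was given.

```perl
use strict;
use warnings;

# Two required arrays and one optional array, all received as references.
sub count_elems (\@\@;\@) {
    my ($first, $second, $third) = @_;
    my $n = @$first + @$second;
    $n += @$third if $third;    # only present when a third array was passed
    return $n;
}

my @foo = qw/a b c/;
my @bar = qw/1 2 3/;
my @baz = qw/X Y Z/;

print count_elems(@foo, @bar), "\n";          # third argument omitted
print count_elems(@foo, @bar, @baz), "\n";    # third argument supplied
```

The prototype has to be visible before the calls are compiled, which is why the sub is defined above its first use.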
Most people are going to expect your argument list to flatten, and you are reaching for an outdated tool to do what people don't expect.
Instead, pass data structures by reference:
some_sub( \@array1, \@array2 );
sub some_sub {
my @args = @_;
say "Array 1 has " . $args[0]->@* . " elements";
}
If you want to use those as named arrays within the sub, you can use ref aliasing
use v5.22;
use experimental qw(ref_aliasing);
sub some_sub {
\my( @array1 ) = $_[0];
...
}
With v5.26, you can move the reference operator inside the parens:
use v5.26;
use experimental qw(declared_refs);
sub some_sub {
my( \@array1 ) = $_[0];
...
}
And, remember that v5.20 introduced the :prototype attribute so you can distinguish between prototypes and signatures:
use v5.20;
sub some_sub :prototype(@@;@) { ... }
I write about these things at The Effective Perler (which you already read, I see), in Perl New Features, a little bit in Preparing for Perl 7 (which is mostly about what you need to stop doing in Perl 5 to be future proof).

Perl using the special character &

I had a small question. I was reading some code and as my school didn't teach me anything useful about perl programming, I am here to ask you people. I see this line being used a lot in some perl programs:
$variable = &something();
I don't know what the & sign means here as I never saw it in Perl. And the something is a subroutine (I am guessing). It usually has a name, and sometimes it takes arguments like a function too. Can someone tell me what & stands for here and what that something is?
The variable takes in some sort of returned value and is then used to check some conditions, which makes me think it is a subroutine. But still why the &?
Thanks
Virtually every time you see & outside of \&foo and EXPR && EXPR, it's an error.
&foo(...) is the same as foo(...) except foo's prototype will be ignored.
sub foo(&@) { ... } # Causes foo to take a BLOCK as its first arg
foo { ... } ...;
&foo(sub { ... }, ...); # Same thing.
Only subroutines (not operators) will be called by &foo(...).
sub print { ... }
print(...); # Calls the print builtin
&print(...); # Calls the print sub.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely someone using & when they shouldn't.
&foo is similar to &foo(@_). The difference is that changes to @_ in foo affect the current sub's @_.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely someone using & when they shouldn't or a foolish attempt at optimization. However, the following is pretty elegant:
sub log_info { unshift @_, 'info'; &log }
sub log_warn { unshift @_, 'warn'; &log }
sub log_error { unshift @_, 'error'; &log }
goto &foo is similar to &foo, except the current subroutine is removed from the call stack first. This will cause it to not show up in stack traces, for example.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely a foolish attempt at optimization.
sub log_info { unshift @_, 'info'; goto &log; } # These are slower than
sub log_warn { unshift @_, 'warn'; goto &log; } # not using goto, but maybe
sub log_error { unshift @_, 'error'; goto &log; } # maybe log uses caller()?
$& contains what the last regex expression match matched. Before 5.20, using this causes every regex in your entire interpreter to become slower (if they have no captures), so don't use this.
print $& if /fo+/; # Bad before 5.20
print $MATCH if /fo+/; # Bad (Same thing. Requires "use English;")
print ${^MATCH} if /fo+/p; # Ok (Requires Perl 5.10)
print $1 if /(fo+)/; # Ok
defined &foo is a perfectly legitimate way of checking if a subroutine exists, but it's not something you'll likely ever need. There's also exists &foo, which is similar but not as useful.
EXPR & EXPR is the bitwise AND operator. This is used when dealing with low-level systems that store multiple pieces of information in a single word.
system($cmd);
die "Can't execute command: $!\n" if $? == -1;
die "Child kill by ".($? & 0x7F)."\n" if $? & 0x7F;
die "Child exited with ".($? >> 8)."\n" if $? >> 8;
&{ EXPR }() (and &$ref()) is a subroutine call via a reference. This is a perfectly acceptable and somewhat common thing to do, though I prefer the $ref->() syntax. Example in next item.
\&foo takes a reference to subroutine foo. This is a perfectly acceptable and somewhat common thing to do.
my %dispatch = (
foo => \&foo,
bar => \&bar,
);
my $handler = $dispatch{$cmd} or die;
$handler->();
# Same: &{ $handler }();
# Same: &$handler();
EXPR && EXPR is the boolean AND operator. I'm sure you're familiar with this extremely common operator.
if (0 <= $x && $x <= 100) { ... }
In older versions of perl & was used to call subroutines. Now this is not necessary and \& is mostly used to take a reference to subroutine,
my $sub_ref = \&subroutine;
or to ignore function prototype (http://perldoc.perl.org/perlsub.html#Prototypes)
Other than for referencing subroutines & is bitwise and operator,
http://perldoc.perl.org/perlop.html#Bitwise-And

Perl function prototypes

Why do we use function prototypes in Perl?
What are the different prototypes available? How to use them?
Example: what do $$, $@, and \@@ mean?
You can find the description in the official documentation: http://perldoc.perl.org/perlsub.html#Prototypes
But more importantly, read why you should not use function prototypes: "Why are Perl 5's function prototypes bad?"
To write some functions, prototypes are absolutely necessary, as they change the way arguments are passed, the sub invocations are parsed, and in what context the arguments are evaluated.
Below are discussions on prototypes with the builtins open and bless, as well as the effect on user-written code like a fold_left subroutine. I come to the conclusion that there are a few scenarios where they are useful, but they are generally not a good mechanism to cope with signatures.
Example: CORE::open
Some builtin functions have prototypes, e.g. open. You can get the prototype of any function like say prototype "CORE::open". We get *;$@. This means:
The * takes a bareword, glob, globref or scalar. E.g. STDOUT or my $fh.
The ; makes the following arguments optional.
The $ evaluates the next item in scalar context. We'll see in a minute why this is good.
The @ allows any number of arguments.
This allows invocations like
open FOO; (very bad style, equivalent to open FOO, our $FOO)
open my $fh, @array;, which parses as open my $fh, scalar(@array). Useless
open my $fh, "<foo.txt"; (bad style, allows shell injection)
open my $fh, "<", "foo.txt"; (good three-arg-open)
open my $fh, "-|", @command; (now @command is evaluated in list context, i.e. is flattened)
So why should the second argument have scalar context? (1) Either you use traditional two-arg open. Then it isn't difficult to access the first element. (2) Or you want three-arg open (rather: multi-arg). Then having an explicit mode in the source code is necessary, which is good style and reduces action at a distance. So this forces you to decide between the outdated flexible two-arg form and the safe multi-arg form.
Further restrictions, like that the < mode can only take one filename, while -| takes at least one string (the command) plus any number of arguments, are implemented on a non-syntactic level.
Example: CORE::bless
Another interesting example is the bless function. Its prototype is $;$. I.e. takes one or two scalars.
This allows bless $self; (blesses into current package), or the better bless $self, $class. However, my @array = ($self, $class); bless @array does not work, as scalar context is imposed on the first arg. So the first argument is not a reference, but the number 2. This reduces action at a distance, and fails rather than providing a probably wrong interpretation: both bless $array[0], $array[1] or bless \@array could have been meant here. So prototypes help and augment input validation, but are no substitute for it.
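The scalar-context effect described for bless can be reproduced with a user-defined sub. The names first_arg and first_arg_plain below are hypothetical; the first one borrows the $;$ prototype purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical sub with the same prototype as bless: one or two scalars.
sub first_arg ($;$) { return $_[0] }

# No prototype: arguments are flattened into a list as usual.
sub first_arg_plain { return $_[0] }

my @pair = ('object', 'Class');

my $with_proto    = first_arg(@pair);        # @pair is evaluated in scalar context
my $without_proto = first_arg_plain(@pair);  # @pair is flattened

print "with prototype:    $with_proto\n";
print "without prototype: $without_proto\n";
```

With the prototype, the sub receives the element count of @pair rather than its first element, which mirrors the "number 2" behaviour described above.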
Example fold_left
Let us define a function fold_left that takes a list and an action as arguments. It performs this action on the first two values of the list, and replaces them with the result. This loops until only one element, the return value is left.
Simple implementation:
sub fold_left {
my $code = shift;
while ($#_) { # loop while more than one element
my ($x, $y) = splice @_, 0, 2;
unshift @_, $code->($x, $y);
}
return $_[0];
}
This can be called like
my $sum = fold_left sub{ $_[0] + $_[1] }, 1 .. 10;
my $str = fold_left sub{ "$_[0] $_[1]" }, 1 .. 10;
my $undef = fold_left;
my $runtime_error = fold_left \"foo", 1..10;
But this is unsatisfactory: we know that the first argument is a sub, so the sub keyword is redundant. Also, We can call it without a sub, which we want to be illegal. With prototypes, we can work around that:
sub fold_left (&@) { ... }
The & states that we'll take a coderef. If this is the first argument, this allows the sub keyword and the comma after the sub block to be omitted. Now we can do
my $sum = fold_left { $_[0] + $_[1] } 1 .. 10; # aka List::Util::sum(1..10);
my $str = fold_left { "$_[0] $_[1]" } 1 .. 10; # aka join " ", 1..10;
my $compile_error1 = fold_left; # ERROR: not enough arguments
my $compile_error2 = fold_left "foo", 1..10; # ERROR: type of arg 1 must be sub{} or block.
which is reminiscent of map {...} @list
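Putting the pieces together, a complete version with the (&@) prototype might look like the sketch below. It follows the simple implementation above, with the loop condition written as @_ > 1 for readability.

```perl
use strict;
use warnings;

# The & prototype lets callers pass a bare block as the first argument.
sub fold_left (&@) {
    my $code = shift;
    while (@_ > 1) {                       # loop while more than one element
        my ($x, $y) = splice @_, 0, 2;     # take the first two values
        unshift @_, $code->($x, $y);       # replace them with the result
    }
    return $_[0];
}

my $sum = fold_left { $_[0] + $_[1] } 1 .. 10;
my $str = fold_left { "$_[0] $_[1]" } 1 .. 4;
print "$sum\n$str\n";
```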
On backslash prototypes
Backslash prototypes allow to capture typed references to arguments without imposing context. This is good when we want to pass an array without flattening it. E.g.
sub mypush (\@@) {
my ($arrayref, @push_these) = @_;
my $len = @$arrayref;
@$arrayref[$len .. $len + $#push_these] = @push_these;
}
my @array;
mypush @array, 1, 2, 3;
You can think of the \ protecting the @ like in regexes, thus requiring a literal @ character on the argument. This is where prototypes are a sad story: requiring literal characters is a bad idea. We can't even pass a reference directly; we have to dereference it first:
my $array = [];
mypush @$array, 1, 2, 3;
even though the called code sees and wants exactly that reference. From v5.14 on, the + can be used instead. It accepts an array, arrayref, hash or hashref (actually, it's like $ on scalar arguments, and \[@%] on hashes and arrays). This prototype does no type validation; it just makes sure you receive a reference unless the argument already is a scalar.
sub mypush (+@) { ... }
my @array;
mypush @array, 1, 2, 3;
my $array_ref = [];
mypush $array_ref, 1, 2, 3; # works as well! yay
my %hash;
mypush %hash, 1, 2, 3; # syntactically legal, but will throw fatal on dereferencing.
mypush "foo", 1, 2, 3; # ditto
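For reference, here is a filled-in sketch of the + variant. It swaps the slice assignment for the builtin push to keep the body short; that substitution is a choice made for this example, not part of the answer's code.

```perl
use 5.014;      # the + prototype needs Perl 5.14 or later
use warnings;

# + accepts either a literal array (passed as a reference) or a scalar
# that already holds a reference.
sub mypush (+@) {
    my ($arrayref, @push_these) = @_;
    push @$arrayref, @push_these;
    return scalar @$arrayref;
}

my @array = (0);
mypush @array, 1, 2, 3;        # receives \@array

my $array_ref = [];
mypush $array_ref, 'a', 'b';   # receives $array_ref unchanged

print "@array | @$array_ref\n";
```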
Conclusion
Prototypes are a great way to bend Perl to your will. Recently I was investigating how pattern matching from functional languages can be implemented in Perl. The match itself has the prototype $% (one scalar thing which is to be matched, and an even number of further arguments. These are pairs of patterns and code).
They are also a great way to shoot yourself in the foot, and can be downright ugly. From List::MoreUtils:
sub each_array (\#;\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#) {
return each_arrayref(#_);
}
This allows you to call it as each_array @a, @b, @c ..., but it isn't much effort to directly do each_arrayref \@a, \@b, \@c, ..., which imposes no limit on the number of parameters, and is more flexible.
Especially parameters like sub foo ($$$$$$;$$) indicate a code smell, and that you should move to named parameters, Method::Signatures, or Params::Validate.
In my experience, good prototypes are
@, % to slurp any (or an even) number of args. Note that @ as sole prototype is equivalent to no prototype at all.
& leading codeblocks for nicer syntax.
$ iff you need to pad a slurpy @ or %, but not on their own.
I actively dislike \@ etc, and have yet to see a good use for _ aside from length (_ can be the last required argument in a prototype. If no explicit value is given, $_ is used.)
Having a good documentation and requiring the user of your subs to include the occasional backslash before your arguments is generally preferable to unexpected action at a distance or having scalar context imposed surprisingly.
Prototypes can be overridden like &foo(@args), and aren't honoured on method calls, so they are already useless here.

Is 'shift' evil for processing Perl subroutine parameters?

I'm frequently using shift to unpack function parameters:
sub my_sub {
my $self = shift;
my $params = shift;
....
}
However, many on my colleagues are preaching that shift is actually evil. Could you explain why I should prefer
sub my_sub {
my ($self, $params) = @_;
....
}
to shift?
The use of shift to unpack arguments is not evil. It's a common convention and may be the fastest way to process arguments (depending on how many there are and how they're passed). Here's one example of a somewhat common scenario where that's the case: a simple accessor.
use Benchmark qw(cmpthese);
sub Foo::x_shift { shift->{'a'} }
sub Foo::x_ref { $_[0]->{'a'} }
sub Foo::x_copy { my $s = $_[0]; $s->{'a'} }
our $o = bless {a => 123}, 'Foo';
cmpthese(-2, { x_shift => sub { $o->x_shift },
x_ref => sub { $o->x_ref },
x_copy => sub { $o->x_copy }, });
The results on perl 5.8.8 on my machine:
Rate x_copy x_ref x_shift
x_copy 772761/s -- -12% -19%
x_ref 877709/s 14% -- -8%
x_shift 949792/s 23% 8% --
Not dramatic, but there it is. Always test your scenario on your version of perl on your target hardware to find out for sure.
shift is also useful in cases where you want to shift off the invocant and then call a SUPER:: method, passing the remaining @_ as-is.
sub my_method
{
my $self = shift;
...
return $self->SUPER::my_method(@_);
}
If I had a very long series of my $foo = shift; operations at the top of a function, however, I might consider using a mass copy from @_ instead. But in general, if you have a function or method that takes more than a handful of arguments, using named parameters (i.e., catching all of @_ in a %args hash or expecting a single hash reference argument) is a much better approach.
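A minimal sketch of the named-parameter approach follows. The sub name connect_to and its host and port defaults are purely illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical sub: parameter names and defaults are made up for the example.
sub connect_to {
    my %args = (
        host => 'localhost',
        port => 5432,
        @_,                  # caller-supplied pairs override the defaults
    );
    return "$args{host}:$args{port}";
}

print connect_to(), "\n";
print connect_to(port => 8080), "\n";
print connect_to(host => 'db.example', port => 1), "\n";
```

Because later pairs win in a hash assignment, putting @_ last lets callers override any subset of the defaults.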
It is not evil, it is a taste sort of thing. You will often see the styles used together:
sub new {
my $class = shift;
my %self = @_;
return bless \%self, $class;
}
I tend to use shift when there is one argument or when I want to treat the first few arguments differently than the rest.
This is, as others have said, a matter of taste.
I generally prefer to shift my parameters into lexicals because it gives me a standard place to declare a group of variables that will be used in the subroutine. The extra verbosity gives me a nice place to hang comments. It also makes it easy to provide default values in a tidy fashion.
sub foo {
my $foo = shift; # a thing of some sort.
my $bar = shift; # a horse of a different color.
my $baz = shift || 23; # a pale horse.
# blah
}
If you are really concerned about the speed of calling your routines, don't unpack your arguments at all--access them directly using @_. Be careful, those are references to the caller's data you are working with. This is a common idiom in POE. POE provides a bunch of constants that you use to get positional parameters by name:
sub some_poe_state_handler {
$_[HEAP]{some_data} = 'chicken';
$_[KERNEL]->yield('next_state');
}
Now the big stupid bug you can get if you habitually unpack params with shift is this one:
sub foo {
my $foo = shift;
my $bar = shift;
my @baz = shift;
# I should really stop coding and go to bed. I am making dumb errors.
}
I think that consistent code style is more important than any particular style. If all my coworkers used the list assignment style, I'd use it too.
If my coworkers said there was a big problem using shift to unpack, I'd ask for a demonstration of why it is bad. If the case is solid, then I'd learn something. If the case is bogus, I could then refute it and help stop the spread of anti-knowledge. Then I'd suggest that we determine a standard method and follow it for future code. I might even try to set up a Perl::Critic policy to check for the decided upon standard.
Assigning @_ to a list can bring some helpful additional features.
It makes it slightly easier to add additional named parameters at a later date, as you modify your code. Some people consider this a feature, similar to how finishing a list or a hash with a trailing ',' makes it slightly easier to append members in the future.
If you're in the habit of using this idiom, then shifting the arguments might seem harmful, because if you edit the code to add an extra argument, you could end up with a subtle bug, if you don't pay attention.
e.g.
sub do_something {
my ($foo) = @_;
}
later edited to
sub do_something {
my ($foo,$bar) = @_; # still works
}
however
sub do_another_thing {
my $foo = shift;
}
If another colleague, who uses the first form dogmatically (perhaps they think shift is evil) edits your file and absent-mindedly updates this to read
sub do_another_thing {
my ($foo, $bar) = shift; # still 'works', but $bar not defined
}
and they may have introduced a subtle bug.
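The bug is easy to reproduce. In this sketch the two subs (with_list and with_shift, names invented here) differ only in the right-hand side of the assignment and report whether the second parameter actually received a value.

```perl
use strict;
use warnings;

# List assignment from @_: both lexicals are filled.
sub with_list {
    my ($foo, $bar) = @_;
    return defined $bar ? 'bar is set' : 'bar is undef';
}

# The absent-minded edit: shift returns a single value,
# so only $foo is filled and $bar silently stays undef.
sub with_shift {
    my ($foo, $bar) = shift;
    return defined $bar ? 'bar is set' : 'bar is undef';
}

print "list assignment: ", with_list('a', 'b'), "\n";
print "shift by mistake: ", with_shift('a', 'b'), "\n";
```

Neither strict nor warnings complains about the second form, which is what makes it subtle.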
Assigning from @_ can be more compact and efficient with vertical space, when you have a small number of parameters to assign at once. It also allows you to supply default arguments if you're using the hash style of named function parameters
e.g.
my (%arguments) = (user=>'defaultuser',password=>'password',@_);
I would still consider it a question of style / taste. I think the most important thing is to apply one style or the other with consistency, obeying the principle of least surprise.
I don't think shift is evil. The use of shift shows your willingness to actually name variables - instead of using $_[0].
Personally, I use shift when there's only one parameter to a function. If I have more than one parameter, I'll use the list context.
my $result = decode($theString);
sub decode {
my $string = shift;
...
}
my $otherResult = encode($otherString, $format);
sub encode {
my ($string,$format) = @_;
...
}
There is an optimization for list assignment.
The only reference I could find, is this one.
5.10.0 inadvertently disabled an optimization, which caused a
measurable performance drop in list
assignment, such as is often used to
assign function parameters from @_ .
The optimisation has been re-instated,
and the performance regression fixed.
This is an example of the affected performance regression.
sub example{
my($arg1,$arg2,$arg3) = @_;
}
Perl::Critic is your friend here. It follows the "standards" set up in Damian Conway's book Perl Best Practices. Running it with --verbose 11 gives you an explanation of why things are bad. Not unpacking @_ first in your subs is a severity 4 (out of 5). E.g.:
echo 'package foo; use warnings; use strict; sub quux { my $foo= shift; my ($bar,$baz) = @_;};1;' | perlcritic -4 --verbose 11
Always unpack @_ first at line 1, near 'sub quux { my $foo= shift; my ($bar,$baz) = @_;}'.
Subroutines::RequireArgUnpacking (Severity: 4)
Subroutines that use `@_' directly instead of unpacking the arguments to
local variables first have two major problems. First, they are very hard
to read. If you're going to refer to your variables by number instead of
by name, you may as well be writing assembler code! Second, `@_'
contains aliases to the original variables! If you modify the contents
of a `@_' entry, then you are modifying the variable outside of your
subroutine. For example:
sub print_local_var_plus_one {
my ($var) = @_;
print ++$var;
}
sub print_var_plus_one {
print ++$_[0];
}
my $x = 2;
print_local_var_plus_one($x); # prints "3", $x is still 2
print_var_plus_one($x); # prints "3", $x is now 3 !
print $x; # prints "3"
This is spooky action-at-a-distance and is very hard to debug if it's
not intentional and well-documented (like `chop' or `chomp').
An exception is made for the usual delegation idiom
`$object->SUPER::something( @_ )'. Only `SUPER::' and `NEXT::' are
recognized (though this is configurable) and the argument list for the
delegate must consist only of `( @_ )'.
It isn't intrinsically evil, but using it to pull off the arguments of a subroutine one by one is comparatively slow and requires a greater number of lines of code.

How would I do the equivalent of Prototype's Enumerator.detect in Perl with the least amount of code?

Lately I've been thinking a lot about functional programming. Perl offers quite a few tools to go that way, however there's something I haven't been able to find yet.
Prototype has the function detect for enumerators, the descriptions is simply this:
Enumerator.detect(iterator[, context]) -> firstElement | undefined
Finds the first element for which the iterator returns true.
Enumerator in this case is any list while iterator is a reference to a function, which is applied in turn on each element of the list.
I am looking for something like this to apply in situations where performance is important, i.e. when stopping upon encountering a match saves time by disregarding the rest of the list.
I am also looking for a solution that would not involve loading any extra module, so if possible it should be done with builtins only. And if possible, it should be as concise as this for example:
my @result = map function @array;
You say you don't want a module, but this is exactly what the first function in List::Util does. That's a core module, so it should be available everywhere.
use List::Util qw(first);
my $first = first { some condition } @array;
If you insist on not using a module, you could copy the implementation out of List::Util. If somebody knew a faster way to do it, it would be in there. (Note that List::Util includes an XS implementation, so that's probably faster than any pure-Perl approach. It also has a pure-Perl version of first, in List::Util::PP.)
Note that the value being tested is passed to the subroutine in $_ and not as a parameter. This is a convenience when you're using the first { some condition } @values form, but is something you have to remember if you're using a regular subroutine. Some more examples:
use 5.010; # I want to use 'say'; nothing else here is 5.10 specific
use List::Util qw(first);
say first { $_ > 3 } 1 .. 10; # prints 4
sub wanted { $_ > 4 }; # note we're using $_ not $_[0]
say first \&wanted, 1 .. 10; # prints 5
my $want = \&wanted; # Get a subroutine reference
say first \&$want, 1 .. 10; # This is how you pass a reference in a scalar
# someFunc expects a parameter instead of looking at $_
say first { someFunc($_) } 1 .. 10;
Untested since I don't have Perl on this machine, but:
sub first(&@) {
my $pred = shift;
die "First argument to 'first' must be a sub" unless ref $pred eq 'CODE';
for my $val (@_) {
return $val if $pred->($val);
}
return undef;
}
Then use it as:
my $first = first { sub performing test } @list;
Note that this doesn't distinguish between no matches in the list and one of the elements in the list being an undefined value and having that match.
Just since it's not here, a Perl function definition of first that localizes $_ for its block:
sub first (&@) {
my $code = shift;
for (@_) {return $_ if $code->()}
undef
}
my @array = 1 .. 10;
say first {$_ > 5} @array; # prints 6
While it will work fine, I don't advocate using this version, since List::Util is a core module (installed by default), and its implementation of first will usually use the XS version (written in C) which is much faster.