Is 'shift' evil for processing Perl subroutine parameters? - perl

I'm frequently using shift to unpack function parameters:
sub my_sub {
my $self = shift;
my $params = shift;
....
}
However, many of my colleagues are preaching that shift is actually evil. Could you explain why I should prefer
sub my_sub {
my ($self, $params) = @_;
....
}
to shift?

The use of shift to unpack arguments is not evil. It's a common convention and may be the fastest way to process arguments (depending on how many there are and how they're passed). Here's one example of a somewhat common scenario where that's the case: a simple accessor.
use Benchmark qw(cmpthese);
sub Foo::x_shift { shift->{'a'} }
sub Foo::x_ref { $_[0]->{'a'} }
sub Foo::x_copy { my $s = $_[0]; $s->{'a'} }
our $o = bless {a => 123}, 'Foo';
cmpthese(-2, { x_shift => sub { $o->x_shift },
x_ref => sub { $o->x_ref },
x_copy => sub { $o->x_copy }, });
The results on perl 5.8.8 on my machine:
Rate x_copy x_ref x_shift
x_copy 772761/s -- -12% -19%
x_ref 877709/s 14% -- -8%
x_shift 949792/s 23% 8% --
Not dramatic, but there it is. Always test your scenario on your version of perl on your target hardware to find out for sure.
shift is also useful in cases where you want to shift off the invocant and then call a SUPER:: method, passing the remaining @_ as-is.
sub my_method
{
my $self = shift;
...
return $self->SUPER::my_method(@_);
}
If I had a very long series of my $foo = shift; operations at the top of a function, however, I might consider using a mass copy from @_ instead. But in general, if you have a function or method that takes more than a handful of arguments, using named parameters (i.e., catching all of @_ in a %args hash or expecting a single hash reference argument) is a much better approach.
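A minimal sketch of the two named-parameter styles mentioned above. The sub names, the host/port keys and the default port are made up for illustration; they are not from any real API.

```perl
use strict;
use warnings;

# Style 1: catch the rest of @_ in a hash. Pairs supplied by the caller
# come later in the list, so they override the default listed first.
sub connect_info {
    my $self = shift;                  # invocant comes off first
    my %args = (port => 80, @_);       # hypothetical default, then caller's pairs
    return "$args{host}:$args{port}";
}

# Style 2: expect a single hash reference.
sub connect_info_ref {
    my ($self, $args) = @_;
    my $port = defined $args->{port} ? $args->{port} : 80;
    return "$args->{host}:$port";
}

print connect_info(undef, host => 'example.org'), "\n";                           # example.org:80
print connect_info_ref(undef, { host => 'example.org', port => 8080 }), "\n";     # example.org:8080
```

Either way the call site documents itself, and adding a new optional parameter does not disturb existing callers.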

It is not evil, it is a taste sort of thing. You will often see the styles used together:
sub new {
my $class = shift;
my %self = @_;
return bless \%self, $class;
}
I tend to use shift when there is one argument or when I want to treat the first few arguments differently than the rest.

This is, as others have said, a matter of taste.
I generally prefer to shift my parameters into lexicals because it gives me a standard place to declare a group of variables that will be used in the subroutine. The extra verbosity gives me a nice place to hang comments. It also makes it easy to provide default values in a tidy fashion.
sub foo {
my $foo = shift; # a thing of some sort.
my $bar = shift; # a horse of a different color.
my $baz = shift || 23; # a pale horse.
# blah
}
If you are really concerned about the speed of calling your routines, don't unpack your arguments at all; access them directly through @_. Be careful: the elements of @_ are aliases to the caller's data, so modifying them changes the caller's variables. This is a common idiom in POE, which provides a set of constants that let you refer to positional parameters by name:
sub some_poe_state_handler {
$_[HEAP]{some_data} = 'chicken';
$_[KERNEL]->yield('next_state');
}
Now the big stupid bug you can get if you habitually unpack params with shift is this one:
sub foo {
my $foo = shift;
my $bar = shift;
my @baz = shift;
# I should really stop coding and go to bed. I am making dumb errors.
}
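To make that bug concrete, here is a small sketch (the sub names are hypothetical). shift returns a single element, so an array assigned from it never holds more than one item, and the remaining arguments are silently ignored.

```perl
use strict;
use warnings;

sub takes_rest_wrong {
    my $foo = shift;
    my @baz = shift;    # bug: only the next single argument lands in @baz
    return scalar @baz;
}

sub takes_rest_right {
    my $foo = shift;
    my @baz = @_;       # everything that is left
    return scalar @baz;
}

print takes_rest_wrong(1, 2, 3, 4), "\n";   # 1
print takes_rest_right(1, 2, 3, 4), "\n";   # 3
```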
I think that consistent code style is more important than any particular style. If all my coworkers used the list assignment style, I'd use it too.
If my coworkers said there was a big problem using shift to unpack, I'd ask for a demonstration of why it is bad. If the case is solid, then I'd learn something. If the case is bogus, I could then refute it and help stop the spread of anti-knowledge. Then I'd suggest that we determine a standard method and follow it for future code. I might even try to set up a Perl::Critic policy to check for the decided upon standard.

Assigning @_ to a list can bring some helpful additional features.
It makes it slightly easier to add additional named parameters at a later date, as you modify your code. Some people consider this a feature, similar to how finishing a list or a hash with a trailing ',' makes it slightly easier to append members in the future.
If you're in the habit of using this idiom, then shifting the arguments might seem harmful, because if you edit the code to add an extra argument, you could end up with a subtle bug if you don't pay attention.
e.g.
sub do_something {
my ($foo) = @_;
}
later edited to
sub do_something {
my ($foo,$bar) = @_; # still works
}
however
sub do_another_thing {
my $foo = shift;
}
Suppose another colleague, who uses the first form dogmatically (perhaps they think shift is evil), edits your file and absent-mindedly updates this to read
sub do_another_thing {
my ($foo, $bar) = shift; # still 'works', but $bar not defined
}
They may have introduced a subtle bug.
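A quick way to see the difference, using two hypothetical subs that only report whether the second parameter arrived:

```perl
use strict;
use warnings;

# List assignment from shift: only one value is on the right-hand side,
# so $bar is left undefined no matter what the caller passed.
sub second_via_shift {
    my ($foo, $bar) = shift;
    return defined $bar ? 'defined' : 'undef';
}

# List assignment from @_: both values are picked up.
sub second_via_list {
    my ($foo, $bar) = @_;
    return defined $bar ? 'defined' : 'undef';
}

print second_via_shift('a', 'b'), "\n";   # undef
print second_via_list('a', 'b'), "\n";    # defined
```

Perl emits no warning for the first form, which is what makes the mistake easy to miss.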
Assigning from @_ can be more compact and efficient with vertical space when you have a small number of parameters to assign at once. It also allows you to supply default arguments if you're using the hash style of named function parameters
e.g.
my (%arguments) = (user=>'defaultuser',password=>'password',@_);
I would still consider it a question of style / taste. I think the most important thing is to apply one style or the other with consistency, obeying the principle of least surprise.

I don't think shift is evil. The use of shift shows your willingness to actually name variables - instead of using $_[0].
Personally, I use shift when there's only one parameter to a function. If I have more than one parameter, I'll use the list context.
my $result = decode($theString);
sub decode {
my $string = shift;
...
}
my $otherResult = encode($otherString, $format);
sub encode {
my ($string,$format) = @_;
...
}

There is an optimization for list assignment.
The only reference I could find is this one.
5.10.0 inadvertently disabled an optimization, which caused a measurable performance drop in list assignment, such as is often used to assign function parameters from @_. The optimisation has been re-instated, and the performance regression fixed.
This is an example of the kind of list assignment affected by the regression.
sub example{
my($arg1,$arg2,$arg3) = @_;
}

Perl::Critic is your friend here. It follows the "standards" set up in Damian Conway's book Perl Best Practices. Running it with --verbose 11 gives you an explanation of why things are bad. Not unpacking @_ first in your subs is a severity 4 (out of 5). E.g.:
echo 'package foo; use warnings; use strict; sub quux { my $foo= shift; my ($bar,$baz) = @_;};1;' | perlcritic -4 --verbose 11
Always unpack @_ first at line 1, near 'sub quux { my $foo= shift; my ($bar,$baz) = @_;}'.
Subroutines::RequireArgUnpacking (Severity: 4)
Subroutines that use `@_' directly instead of unpacking the arguments to
local variables first have two major problems. First, they are very hard
to read. If you're going to refer to your variables by number instead of
by name, you may as well be writing assembler code! Second, `@_'
contains aliases to the original variables! If you modify the contents
of a `@_' entry, then you are modifying the variable outside of your
subroutine. For example:
sub print_local_var_plus_one {
my ($var) = @_;
print ++$var;
}
sub print_var_plus_one {
print ++$_[0];
}
my $x = 2;
print_local_var_plus_one($x); # prints "3", $x is still 2
print_var_plus_one($x); # prints "3", $x is now 3 !
print $x; # prints "3"
This is spooky action-at-a-distance and is very hard to debug if it's
not intentional and well-documented (like `chop' or `chomp').
An exception is made for the usual delegation idiom
`$object->SUPER::something( @_ )'. Only `SUPER::' and `NEXT::' are
recognized (though this is configurable) and the argument list for the
delegate must consist only of `( @_ )'.

It isn't intrinsically evil, but using it to pull off the arguments of a subroutine one by one is comparatively slow and requires a greater number of lines of code.

Related

Can I make a variable optional in a perl sub prototype?

I'd like to understand if it's possible to have a sub prototype and optional parameters in it. With prototypes I can do this:
sub some_sub (\@\@\@) {
...
}
my @foo = qw/a b c/;
my @bar = qw/1 2 3/;
my @baz = qw/X Y Z/;
some_sub(@foo, @bar, @baz);
which is nice and readable, but the minute I try to do
some_sub(@foo, @bar);
or even
some_sub(@foo, @bar, ());
I get errors:
Not enough arguments for main::some_sub at tablify.pl line 72, near "@bar)"
or
Type of arg 3 to main::some_sub must be array (not stub) at tablify.pl line 72, near "))"
Is it possible to have a prototype and a variable number of arguments? Or is something similar achievable via signatures?
I know it could be done by always passing arrayrefs; I was wondering if there was another way. After all, TMTOWTDI.
All arguments after a semi-colon are optional:
sub some_sub(\@\@;\@) {
}
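As a small sketch of how the semicolon behaves, the hypothetical count_arrays below accepts two or three arrays and reports how many references it received. The prototype has to be declared before the calls are compiled for it to take effect.

```perl
use strict;
use warnings;

# Two required arrays, one optional array; each arrives as an array reference.
sub count_arrays (\@\@;\@) {
    my @refs = @_;
    return scalar @refs;
}

my @foo = qw(a b c);
my @bar = qw(1 2 3);
my @baz = qw(X Y Z);

print count_arrays(@foo, @bar), "\n";         # 2
print count_arrays(@foo, @bar, @baz), "\n";   # 3
```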
Most people are going to expect your argument list to flatten, and you are reaching for an outdated tool to do what people don't expect.
Instead, pass data structures by reference:
some_sub( \@array1, \@array2 );
sub some_sub {
my @args = @_;
say "Array 1 has " . $args[0]->@* . " elements";
}
If you want to use those as named arrays within the sub, you can use ref aliasing
use v5.22;
use experimental qw(ref_aliasing);
sub some_sub {
\my( @array1 ) = $_[0];
...
}
With v5.26, you can move the reference operator inside the parens:
use v5.26;
use experimental qw(declared_refs);
sub some_sub {
my( \@array1 ) = $_[0];
...
}
And, remember that v5.20 introduced the :prototype attribute so you can distinguish between prototypes and signatures:
use v5.20;
sub some_sub :prototype(\@\@;\@) { ... }
I write about these things at The Effective Perler (which you already read, I see), in Perl New Features, a little bit in Preparing for Perl 7 (which is mostly about what you need to stop doing in Perl 5 to be future proof).

Perl using the special character &

I had a small question. I was reading some code and as my school didn't teach me anything useful about perl programming, I am here to ask you people. I see this line being used a lot in some perl programs:
$variable = &something();
I don't know what the & sign means here, as I never saw it in perl. And the something is a subroutine (I am guessing). It usually has a name, and sometimes it takes arguments like a function too. Can someone tell me what & stands for here and what that something is?
The variable takes in some sort of returned value and is then used to check some conditions, which makes me think it is a subroutine. But still why the &?
Thanks
Virtually every time you see & outside of \&foo and EXPR && EXPR, it's an error.
&foo(...) is the same as foo(...) except foo's prototype will be ignored.
sub foo(&@) { ... } # Causes foo to take a BLOCK as its first arg
foo { ... } ...;
&foo(sub { ... }, ...); # Same thing.
Only subroutines (not operators) will be called by &foo(...).
sub print { ... }
print(...); # Calls the print builtin
&print(...); # Calls the print sub.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely someone using & when they shouldn't.
&foo is similar to &foo(@_). The difference is that changes to @_ in foo affect the current sub's @_.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely someone using & when they shouldn't or a foolish attempt at optimization. However, the following is pretty elegant:
sub log_info { unshift @_, 'info'; &log }
sub log_warn { unshift @_, 'warn'; &log }
sub log_error { unshift @_, 'error'; &log }
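A runnable sketch of that pattern. The log sub itself isn't shown in the answer, so log_message below is a hypothetical stand-in that just formats its arguments.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real logging sub.
sub log_message {
    my ($level, @parts) = @_;
    return "[$level] @parts";
}

# unshift edits the current @_, and the bare &log_message call
# hands that very same @_ to log_message.
sub log_info { unshift @_, 'info'; &log_message }
sub log_warn { unshift @_, 'warn'; &log_message }

print log_info('disk', 'ok'), "\n";   # [info] disk ok
print log_warn('disk full'), "\n";    # [warn] disk full
```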
goto &foo is similar to &foo, except the current subroutine is removed from the call stack first. This will cause it to not show up in stack traces, for example.
You'll probably never need to use this feature in your entire programming career. If you see it used, it's surely a foolish attempt at optimization.
sub log_info { unshift @_, 'info'; goto &log; } # These are slower than
sub log_warn { unshift @_, 'warn'; goto &log; } # not using goto, but
sub log_error { unshift @_, 'error'; goto &log; } # maybe log uses caller()?
$& contains what the last regex match matched. Before 5.20, using this causes every regex in your entire interpreter to become slower (if they have no captures), so don't use this.
print $& if /fo+/; # Bad before 5.20
print $MATCH if /fo+/; # Bad (Same thing. Requires "use English;")
print ${^MATCH} if /fo+/p; # Ok (Requires Perl 5.10)
print $1 if /(fo+)/; # Ok
defined &foo is a perfectly legitimate way of checking if a subroutine exists, but it's not something you'll likely ever need. There's also exists &foo, which is similar but not as useful.
EXPR & EXPR is the bitwise AND operator. This is used when dealing with low-level systems that store multiple pieces of information in a single word.
system($cmd);
die "Can't execute command: $!\n" if $? == -1;
die "Child killed by signal ".($? & 0x7F)."\n" if $? & 0x7F;
die "Child exited with ".($? >> 8)."\n" if $? >> 8;
&{ EXPR }() (and &$ref()) is a subroutine call via a reference. This is a perfectly acceptable and somewhat common thing to do, though I prefer the $ref->() syntax. Example in next item.
\&foo takes a reference to subroutine foo. This is a perfectly acceptable and somewhat common thing to do.
my %dispatch = (
foo => \&foo,
bar => \&bar,
);
my $handler = $dispatch{$cmd} or die;
$handler->();
# Same: &{ $handler }();
# Same: &$handler();
EXPR && EXPR is the boolean AND operator. I'm sure you're familiar with this extremely common operator.
if (0 <= $x && $x <= 100) { ... }
In older versions of perl, & was used to call subroutines. This is no longer necessary, and \& is mostly used to take a reference to a subroutine,
my $sub_ref = \&subroutine;
or to ignore a function prototype (http://perldoc.perl.org/perlsub.html#Prototypes)
Other than for referencing subroutines, & is the bitwise AND operator,
http://perldoc.perl.org/perlop.html#Bitwise-And

Perl function prototypes

Why do we use function prototypes in Perl?
What are the different prototypes available? How to use them?
Example: $$, $@, \@@: what do they mean?
You can find the description in the official documentation: http://perldoc.perl.org/perlsub.html#Prototypes
But more important: read why you should not use function prototypes: "Why are Perl 5's function prototypes bad?"
To write some functions, prototypes are absolutely necessary, as they change the way arguments are passed, the sub invocations are parsed, and in what context the arguments are evaluated.
Below are discussions on prototypes with the builtins open and bless, as well as the effect on user-written code like a fold_left subroutine. I come to the conclusion that there are a few scenarios where they are useful, but they are generally not a good mechanism to cope with signatures.
Example: CORE::open
Some builtin functions have prototypes, e.g. open. You can get the prototype of any function with code like say prototype "CORE::open". We get *;$@. This means:
The * takes a bareword, glob, globref or scalar. E.g. STDOUT or my $fh.
The ; makes the following arguments optional.
The $ evaluates the next item in scalar context. We'll see in a minute why this is good.
The @ allows any number of arguments.
This allows invocations like
open FOO; (very bad style, equivalent to open FOO, our $FOO)
open my $fh, @array;, which parses as open my $fh, scalar(@array). Useless
open my $fh, "<foo.txt"; (bad style, allows shell injection)
open my $fh, "<", "foo.txt"; (good three-arg-open)
open my $fh, "-|", @command; (now @command is evaluated in list context, i.e. is flattened)
So why should the second argument have scalar context? (1) Either you use traditional two-arg-open. Then it isn't difficult to access the first element. (2) Or you want 3-arg-open (rather: multiarg). Then having an explicit mode in the source code is necessary, which is good style and reduces action at a distance. So this forces you to decide between the outdated flexible 2-arg or the safe multi-arg.
Further restrictions, like that the < mode can only take one filename, while -| takes at least one string (the command) plus any number of arguments, are implemented on a non-syntactic level.
Example: CORE::bless
Another interesting example is the bless function. Its prototype is $;$. I.e. takes one or two scalars.
This allows bless $self; (blesses into current package), or the better bless $self, $class. However, my @array = ($self, $class); bless @array does not work, as scalar context is imposed on the first arg. So the first argument is not a reference, but the number 2. This reduces action at a distance, and fails rather than providing a probably wrong interpretation: both bless $array[0], $array[1] or bless \@array could have been meant here. So prototypes help and augment input validation, but are no substitute for it.
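The prototypes quoted for open and bless can be checked with the builtin prototype function, which also works on your own subs. In this short sketch, takes_block is a made-up sub that exists only so there is something user-defined to inspect.

```perl
use strict;
use warnings;

# prototype() returns the prototype string of a builtin or a sub,
# or undef when there is none.
print prototype("CORE::open"), "\n";    # *;$@
print prototype("CORE::bless"), "\n";   # $;$

# Hypothetical user-defined sub with a block-plus-list prototype.
sub takes_block (&@) { return }
print prototype(\&takes_block), "\n";   # &@
```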
Example: fold_left
Let us define a function fold_left that takes a list and an action as arguments. It performs this action on the first two values of the list, and replaces them with the result. This loops until only one element, the return value, is left.
Simple implementation:
sub fold_left {
my $code = shift;
while ($#_) { # loop while more than one element
my ($x, $y) = splice @_, 0, 2;
unshift @_, $code->($x, $y);
}
return $_[0];
}
This can be called like
my $sum = fold_left sub{ $_[0] + $_[1] }, 1 .. 10;
my $str = fold_left sub{ "$_[0] $_[1]" }, 1 .. 10;
my $undef = fold_left;
my $runtime_error = fold_left \"foo", 1..10;
But this is unsatisfactory: we know that the first argument is a sub, so the sub keyword is redundant. Also, we can call it without a sub, which we want to be illegal. With prototypes, we can work around that:
sub fold_left (&@) { ... }
The & states that we'll take a coderef. If this is the first argument, this allows the sub keyword and the comma after the sub block to be omitted. Now we can do
my $sum = fold_left { $_[0] + $_[1] } 1 .. 10; # aka List::Util::sum(1..10);
my $str = fold_left { "$_[0] $_[1]" } 1 .. 10; # aka join " ", 1..10;
my $compile_error1 = fold_left; # ERROR: not enough arguments
my $compile_error2 = fold_left "foo", 1..10; # ERROR: type of arg 1 must be sub{} or block.
which is reminiscent of map {...} @list
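Putting the pieces together, a complete version might look like the sketch below. It combines the simple implementation with the (&@) prototype; the only change is that the loop condition is written as @_ > 1 rather than $#_, which means the same thing.

```perl
use strict;
use warnings;

# The prototype must be declared before the block-style calls are compiled.
sub fold_left (&@) {
    my $code = shift;
    while (@_ > 1) {                          # loop while more than one element
        my ($x, $y) = splice @_, 0, 2;        # take the first two values
        unshift @_, $code->($x, $y);          # put their combination back in front
    }
    return $_[0];
}

my $sum = fold_left { $_[0] + $_[1] } 1 .. 10;
my $str = fold_left { "$_[0] $_[1]" } 1 .. 4;

print "$sum\n";   # 55
print "$str\n";   # 1 2 3 4
```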
On backslash prototypes
Backslash prototypes let us capture typed references to arguments without imposing context. This is good when we want to pass an array without flattening it. E.g.
sub mypush (\@@) {
my ($arrayref, @push_these) = @_;
my $len = @$arrayref;
@$arrayref[$len .. $len + $#push_these] = @push_these;
}
my @array;
mypush @array, 1, 2, 3;
You can think of the \ protecting the @ like in regexes, thus requiring a literal @ character on the argument. This is where prototypes are a sad story: requiring literal characters is a bad idea. We can't even pass a reference directly; we have to dereference it first:
my $array = [];
mypush @$array, 1, 2, 3;
even though the called code sees and wants exactly that reference. From v5.14 on, the + can be used instead. It accepts an array, arrayref, hash or hashref (actually, it's like $ on scalar arguments, and \[@%] on hashes and arrays). This proto does no type validation; it'll just make sure you receive a reference unless the argument already is scalar.
sub mypush (+@) { ... }
my @array;
mypush @array, 1, 2, 3;
my $array_ref = [];
mypush $array_ref, 1, 2, 3; # works as well! yay
my %hash;
mypush %hash, 1, 2, 3; # syntactically legal, but will throw fatal on dereferencing.
mypush "foo", 1, 2, 3; # ditto
Conclusion
Prototypes are a great way to bend Perl to your will. Recently I was investigating how pattern matching from functional languages can be implemented in Perl. The match itself has the prototype $% (one scalar thing which is to be matched, and an even number of further arguments. These are pairs of patterns and code).
They are also a great way to shoot yourself in the foot, and can be downright ugly. From List::MoreUtils:
sub each_array (\@;\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@) {
return each_arrayref(@_);
}
This allows you to call it as each_array @a, @b, @c ..., but it isn't much effort to directly do each_arrayref \@a, \@b, \@c, ..., which imposes no limit on the number of parameters, and is more flexible.
Especially parameters like sub foo ($$$$$$;$$) indicate a code smell, and that you should move to named parameters, Method::Signatures, or Params::Validate.
In my experience, good prototypes are
@, % to slurp any (or an even) number of args. Note that @ as sole prototype is equivalent to no prototype at all.
& leading codeblocks for nicer syntax.
$ iff you need to pad a slurpy @ or %, but not on their own.
I actively dislike \@ etc, and have yet to see a good use for _ aside from length (_ can be the last required argument in a prototype. If no explicit value is given, $_ is used.)
Having good documentation and requiring the user of your subs to include the occasional backslash before your arguments is generally preferable to unexpected action at a distance or having scalar context imposed surprisingly.
Prototypes can be overridden like &foo(@args), and aren't honoured on method calls, so they are already useless here.
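Both escape hatches are easy to demonstrate. In this sketch, first_arg and count_args are hypothetical subs with a ($) prototype; the same array gives different results depending on how the sub is called.

```perl
use strict;
use warnings;

sub first_arg  ($) { return $_[0] }
sub count_args ($) { return scalar @_ }

my @items = qw(a b c);

# Normal call: the $ prototype puts @items in scalar context, giving its length.
print first_arg(@items), "\n";             # 3

# & call: the prototype is ignored and @items is flattened as usual.
print &first_arg(@items), "\n";            # a

# Method call: the prototype is ignored too; the invocant plus three
# arguments arrive even though the prototype asks for one scalar.
print main->count_args(1, 2, 3), "\n";     # 4
```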

in perl, is it bad practice to call multiple subroutines with default arguments?

I am learning perl and understand that it is a common and accepted practice to unpack subroutine arguments using shift. I also understand that it is common and acceptable practice to omit function arguments to use the default @_ array.
Considering these two things, if you call a subroutine without arguments, the @_ can (and will, if using shift) be changed. Does this mean that calling another subroutine with default arguments, or, in fact, using the @_ array after this, is considered bad practice? Consider this example:
sub total { # calculate sum of all arguments
my $running_sum;
# take arguments one by one and sum them together
while (@_) {
$running_sum += shift;
}
$running_sum;
}
sub avg { # calculate the mean of given arguments
if (@_ == 0) { return }
my $sum = &total; # gets the correct answer, but changes @_
$sum / @_ # causes division by zero, since @_ is now empty
}
My gut feeling tells me that using shift to unpack arguments would actually be bad practice, unless your subroutine is actually supposed to change the passed arguments, but I have read in multiple places, including Stack Overflow, that this is not a bad practice.
So the question is: if using shift is common practice, should I always assume the passed argument list could get changed, as a side-effect of the subroutine (like the &total subroutine in the quoted example)? Is there maybe a way to pass arguments by value, so I can be sure that the argument list does not get changed, so I could use it again (like in the &avg subroutine in the quoted text)?
In general, shifting from the arguments is ok—using the & sigil to call functions isn't. (Except in some very specific situations you'll probably never encounter.)
Your code could be re-written, so that total doesn't shift from @_. Using a for-loop may even be more efficient.
sub total {
my $total = 0;
$total += $_ for @_;
$total;
}
Or you could use the sum function from List::Util:
use List::Util qw(sum);
sub avg { @_ ? sum(@_) / @_ : 0 }
Using shift isn't that common, except for extracting $self in object oriented Perl. But as you always call your functions like foo( ... ), it doesn't matter if foo shifts or doesn't shift the argument array.
(The only thing worth noting about a function is whether it assigns to elements in @_, as these are aliases for the variables you gave as arguments. Assigning to elements in @_ is usually bad.)
Even if you can't change the implementation of total, calling the sub with an explicit argument list is safe, as the argument list is a copy of the array:
(a) &total: calls total with the identical @_, and overrides prototypes.
(b) total(@_): calls total with a copy of @_.
(c) &total(@_): calls total with a copy of @_, and overrides prototypes.
Form (b) is standard. Form (c) shouldn't be seen, except in very few cases for subs inside the same package where the sub has a prototype (and don't use prototypes), and they have to be overridden for some obscure reason. A testament to poor design.
Form (a) is only sensible for tail calls (@_ = (...); goto &foo) or other forms of optimization (and premature optimization is the root of all evil).
You should avoid using the &func; style of calling unless you have a really good reason, and trust that others do the same.
To guard your @_ against modification by a callee, just do &func() or func.
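The difference between forms (a) and (b) can be observed directly. This sketch keeps the shifting total from the question; the two left_after_* helpers are hypothetical and only report how many arguments the caller still has after the call.

```perl
use strict;
use warnings;

# total() as in the question: it consumes whatever @_ it is handed.
sub total {
    my $running_sum = 0;
    $running_sum += shift @_ while @_;
    return $running_sum;
}

sub left_after_bare_call  { my $sum = &total;    return scalar @_ }   # form (a): shared @_
sub left_after_plain_call { my $sum = total(@_); return scalar @_ }   # form (b): copied list

sub avg {
    return unless @_;
    return total(@_) / @_;    # total() emptied its own copy, so @_ is intact here
}

print left_after_bare_call(2, 4, 6), "\n";    # 0
print left_after_plain_call(2, 4, 6), "\n";   # 3
print avg(2, 4, 6), "\n";                     # 4
```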
Perl is a little too lax sometimes and having multiple ways of accessing input parameters can make smelly and inconsistent code. For want of a better answer, try to impose your own standard.
Here's a few ways I've used and seen
Shifty
sub login
{
my $user = shift;
my $passphrase = shift;
# Validate authentication
return 0;
}
Expanding @_
sub login
{
my ($user, $passphrase) = @_;
# Validate authentication
return 0;
}
Explicit indexing
sub login
{
my $user = $_[0];
my $passphrase = $_[1];
# Validate authentication
return 0;
}
Enforce parameters with function prototypes (this is not popular however)
sub login($$)
{
my ($user, $passphrase) = @_;
# Validate authentication
return 0;
}
Sadly you still have to perform your own convoluted input validation/taint checking, ie:
return unless defined $user;
return unless defined $passphrase;
or better still, a little more informative
unless (defined($user) && defined($passphrase)) {
carp "Input error: user or passphrase not defined";
return -1;
}
Perldoc perlsub should really be your first port of call.
Hope this helps!
Here are some examples where the careful use of @_ matters.
1. Hash-y Arguments
Sometimes you want to write a function which can take a list of key-value pairs, but one is the most common use and you want that to be available without needing a key. For example
sub get_temp {
my $location = @_ % 2 ? shift : undef;
my %options = @_;
$location ||= $options{location};
...
}
So now if you call the function with an odd number of arguments, the first is location. This allows get_temp('Chicago') or get_temp('New York', unit => 'C') or even get_temp( unit => 'K', location => 'Nome, Ak'). This may be a more convenient API for your users. By shifting the odd argument, now @_ is an even list and may be assigned to a hash.
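To try the three call styles without a real temperature lookup, here is a sketch that only does the argument handling. parse_temp_args and the default unit of 'F' are assumptions made for the example, not part of the original get_temp.

```perl
use strict;
use warnings;

# Hypothetical helper: same unpacking as get_temp, but it just returns
# what it parsed instead of looking anything up.
sub parse_temp_args {
    my $location = @_ % 2 ? shift @_ : undef;   # odd count: first arg is the location
    my %options  = @_;                          # the rest is now an even key/value list
    $location ||= $options{location};
    my $unit = $options{unit} || 'F';           # assumed default unit
    return ($location, $unit);
}

print join(' / ', parse_temp_args('Chicago')), "\n";                              # Chicago / F
print join(' / ', parse_temp_args('New York', unit => 'C')), "\n";                # New York / C
print join(' / ', parse_temp_args(unit => 'K', location => 'Nome, Ak')), "\n";    # Nome, Ak / K
```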
2. Dispatching
Let's say we have a class in which we want to be able to dispatch methods by name (AUTOLOAD could possibly be useful, but we will hand-roll it). Perhaps this is a command line script where arguments are methods. In this case we define two dispatch methods, one "clean" and one "dirty". If we call with the -c flag we get the clean one. These methods find the method by name and call it. The difference is how. The dirty one leaves itself in the stack trace; the clean one has to be more clever, but dispatches without being in the stack trace. We make a death method which gives us that trace.
#!/usr/bin/env perl
use strict;
use warnings;
package Unusual;
use Carp;
sub new {
my $class = shift;
return bless { @_ }, $class;
}
sub dispatch_dirty {
my $self = shift;
my $name = shift;
my $method = $self->can($name) or confess "No method named $name";
$self->$method(@_);
}
sub dispatch_clean {
my $self = shift;
my $name = shift;
my $method = $self->can($name) or confess "No method named $name";
unshift @_, $self;
goto $method;
}
sub death {
my ($self, $message) = @_;
$message ||= 'died';
confess "$self->{name}: $message";
}
package main;
use Getopt::Long;
GetOptions
'clean' => \my $clean,
'name=s' => \(my $name = 'Robot');
my $obj = Unusual->new(name => $name);
if ($clean) {
$obj->dispatch_clean(@ARGV);
} else {
$obj->dispatch_dirty(@ARGV);
}
So now if we call ./test.pl to invoke the death method
$ ./test.pl death Goodbye
Robot: Goodbye at ./test.pl line 32
Unusual::death('Unusual=HASH(0xa0f7188)', 'Goodbye') called at ./test.pl line 19
Unusual::dispatch_dirty('Unusual=HASH(0xa0f7188)', 'death', 'Goodbye') called at ./test.pl line 46
but we see dispatch_dirty in the trace. If instead we call ./test.pl -c we now use the clean dispatcher and get
$ ./test.pl -c death Adios
Robot: Adios at ./test.pl line 33
Unusual::death('Unusual=HASH(0x9427188)', 'Adios') called at ./test.pl line 44
The key here is the goto (not the evil goto) which takes the subroutine reference and immediately switches the execution to that reference, using the current @_. This is why I have to unshift @_, $self so that the invocant is ready for the new method.
Refs:
sub refWay{
my ($refToArray,$secondParam,$thirdParam) = @_;
#work here
}
refWay(\@array, 'a','b');
HashWay:
sub hashWay{
my $refToHash = shift; #(if pass ref to hash)
#and i know, that:
return undef unless exists $refToHash->{'user'};
return undef unless exists $refToHash->{'password'};
#or the same in loop:
for (qw(user password etc)){
return undef unless exists $refToHash->{$_};
}
}
hashWay({'user'=>'YourName', 'password'=>'YourPassword'});
I tried a simple example:
#!/usr/bin/perl
use strict;
sub total {
my $sum = 0;
while(#_) {
$sum = $sum + shift;
}
return $sum;
}
sub total1 {
my ($a, $aa, $aaa) = @_;
return ($a + $aa + $aaa);
}
my $s;
$s = total(10, 20, 30);
print $s;
$s = total1(10, 20, 30);
print "\n$s";
Both print statements gave answer as 60.
But personally I feel, the arguments should be accepted in this manner:
my ($argument, @garb) = @_;
in order to avoid any sort of issue later.
I found the following gem in http://perldoc.perl.org/perlsub.html:
"Yes, there are still unresolved issues having to do with visibility of @_ . I'm ignoring that question for the moment. (But note that if we make @_ lexically scoped, those anonymous subroutines can act like closures... (Gee, is this sounding a little Lispish? (Never mind.)))"
You might have run into one of those issues :-(
OTOH amon is probably right -> +1

Should I use $_[0] or copy the argument list in Perl?

If I pass a hash to a sub:
parse(\%data);
Should I assign $_[0] to a variable first, or is it okay to keep accessing $_[0] whenever I want to get an element from the hash? Clarification:
sub parse
{ $var1 = $_[0]->{'elem1'};
$var2 = $_[0]->{'elem2'};
$var3 = $_[0]->{'elem3'};
$var4 = $_[0]->{'elem4'};
$var5 = $_[0]->{'elem5'};
}
# Versus
sub parse
{ my $hr = $_[0];
$var1 = $hr->{'elem1'};
$var2 = $hr->{'elem2'};
$var3 = $hr->{'elem3'};
$var4 = $hr->{'elem4'};
$var5 = $hr->{'elem5'};
}
Is the second version more correct since it doesn't have to keep accessing the argument array, or does Perl end up interpreting them the same way anyhow?
In this case there is no difference, because you are passing a reference to a hash. But when passing a scalar there is a difference:
sub rtrim {
## remove trailing spaces from first argument
$_[0] =~ s/\s+$//;
}
rtrim($str); ## value of the variable will be changed
sub rtrim_bugged {
my $str = $_[0]; ## this makes a copy of variable
$str =~ s/\s+$//;
}
rtrim_bugged($str); ## value of the variable will stay the same
If you're passing a hash reference, then only a copy of the reference is created; the hash itself stays the same. So if you care about code readability, I suggest creating a variable for each of your parameters. For example:
sub parse {
## you can easily add new parameters to this function
my ($hr) = @_;
my $var1 = $hr->{'elem1'};
my $var2 = $hr->{'elem2'};
my $var3 = $hr->{'elem3'};
my $var4 = $hr->{'elem4'};
my $var5 = $hr->{'elem5'};
}
Also more descriptive variable names will improve your code too.
For a general discussion of the efficiency of shift vs accessing @_ directly, see:
Is there a difference between Perl's shift versus assignment from @_ for subroutine parameters?
Is 'shift' evil for processing Perl subroutine parameters?
As for your specific code, I'd use shift, but simplify the data extraction with a hash slice:
sub parse
{
my $hr = shift;
my ($var1, $var2, $var3, $var4, $var5) = @{$hr}{qw(elem1 elem2 elem3 elem4 elem5)};
}
I'll assume that this method does something else with these variables that makes it worthwhile to keep them in separate variables (perhaps the hash is read-only, and you need to make some modifications before inserting them into some other data?) -- otherwise why not just leave them in the hashref where they started?
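For reference, a runnable sketch of the hash-slice form. parse_pair is a hypothetical cut-down version that pulls out just two keys and returns them joined, so the result is easy to check.

```perl
use strict;
use warnings;

sub parse_pair {
    my $hr = shift;
    # Hash slice on a reference: fetch several values in one list assignment.
    my ($var1, $var2) = @{$hr}{qw(elem1 elem2)};
    return "$var1,$var2";
}

print parse_pair({ elem1 => 'a', elem2 => 'b', elem3 => 'c' }), "\n";   # a,b
```

The values are copied into the lexicals, so changing $var1 afterwards does not touch the caller's hash.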
You are micro-optimizing; try to avoid that. Go with whatever is most readable/maintainable. Usually this would be the one where you use a lexical variable, since its name indicates its purpose...but if you use a name like $data or $x this obviously doesn't apply.
In terms of the technical details, for most purposes you can estimate the time taken by counting the number of basic ops perl will use. For your $_[0], an element lookup in a non-lexical array variable takes multiple ops: one to get the glob, one to get the array part of the glob, one or more to get the index (just one for a constant), and one to look up the element. $hr, on the other hand, is a single op. To cater to direct users of @_, there's an optimization that reduces the ops for $_[0] to a single combined op (when the index is between 0 and 255 inclusive), but it isn't used in your case because the hash-deref context requires an additional flag on the array element lookup (to support autovivification) and that flag isn't supported by the optimized op.
In summary, using a lexical is going to be both more readable and (if you use it more than once) imperceptibly faster.
My rule is that I try not to use $_[0] in subroutines that are longer than a couple of statements. After that everything gets a user-defined variable.
Why are you copying all of the hash values into variables? Just leave them in the hash where they belong. That's a much better optimization than the one you are thinking about.
It's the same, although the second is clearer.
Since they both work, both are fine; the common practice is to shift off parameters.
sub parse { my $hr = shift; my $var1 = $hr->{'elem1'}; }