Perl: passing two hashes to sub

Below is a small sub to calculate distance between two points by passing two hashes each containing the x and y coordinates. I get a "syntax error near ]{" fatal error at the line calling the sub. Just started on Perl yesterday, not entirely sure what I am doing. How do I pass two hashes to a sub to return a value? Tried this without much success and not sure what I need to work on (hope it is ok to refer to an outside link).
%dot1 = ('x'=>5, 'y'=>6);
%dot2 = ('x'=>7, 'y'=>8);
sub dist {
my (%hash1) = @_[0];
my (%hash2) = @_[1];
$dist = ((@_[0]{'x'}-@_[1]{'x'})**2 + (@_[0]{'y'}-@_[1]{'y'})**2)**0.5;
}
$D = dist(\%dot1,\%dot2);

First and foremost, you should start every file with
use strict;
use warnings;
This lets Perl catch the most obvious errors in code.
This part is mostly fine, but under use strict Perl will complain about %dot1 and %dot2 being undeclared (and without strict they will be implicitly global, which is usually not what you want):
%dot1 = ('x'=>5, 'y'=>6);
%dot2 = ('x'=>7, 'y'=>8);
Change it to
my %dot1 = ('x'=>5, 'y'=>6);
my %dot2 = ('x'=>7, 'y'=>8);
The call
$D = dist(\%dot1,\%dot2);
has the same problem: It should be
my $D = dist(\%dot1,\%dot2);
What it does is pass references to %dot1 and %dot2 to the sub dist.
my (%hash1) = @_[0];
This line doesn't make much sense: @_[0] is an array slice, returning a list of the elements of @_ at the given indices, here just index 0. In other words, it's a one-element slice and is better written as $_[0], which accesses the single element directly.
But in either case it doesn't make sense to assign a single element to a hash. Perl will interpret it as a key and set the corresponding value to undef. Your call passed \%dot1 as the first argument, so $_[0] is a reference to a hash. By using it as a hash key, Perl will convert it to a string, yielding something like "HASH(0x0075ADD40)".
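A minimal sketch of that stringification, assuming a throwaway hash named %copy (it is not part of the original code). The warning is silenced only so the output stays clean:

```perl
use strict;
use warnings;

my %dot1 = (x => 5, y => 6);
my %copy;
{
    # Silence "Reference found where even-sized list expected"
    no warnings 'misc';
    %copy = (\%dot1);    # a single scalar (the reference) assigned to a hash
}

my ($key) = keys %copy;
print "key:   $key\n";   # something like HASH(0x55d0c8a3e2a0); the address varies
print "value: ", (defined $copy{$key} ? $copy{$key} : 'undef'), "\n";
```

The reference became a plain string key, so the original hash can no longer be reached through %copy.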
Your choices at this point are to either dereference the reference right there and make a copy:
my %hash1 = %{ $_[0] }; # effectively performs %hash1 = %dot1
Or keep the reference and dereference it each time you need access to the hash:
my $hashref1 = $_[0]; # $hashref1->{foo} accesses $dot1{foo} directly
$dist = ((@_[0]{'x'}-@_[1]{'x'})**2 + (@_[0]{'y'}-@_[1]{'y'})**2)**0.5;
There are a few issues here. First, you don't need the (implicitly global) variable $dist. You just want to return a value from the sub, which can be done with return. Then, as explained above, @_[0] and @_[1] should be $_[0] and $_[1], respectively. Fixing that we get
return (($_[0]{'x'} - $_[1]{'x'}) ** 2 + ($_[0]{'y'} - $_[1]{'y'}) ** 2) ** 0.5;
This does indeed work ($_[0]{'x'} is syntactic sugar for $_[0]->{'x'}, i.e. this expression dereferences the hash reference stored in $_[0] to reach the 'x' key of %dot1).
But we didn't use the variables we just created at all. Depending on which way you want to go, you should replace $_[0]{foo} by either $hash1{foo} or $hashref1->{foo} (and similar for $_[1] and %hash2/$hashref2).
Finally, instead of ** 0.5 we can just use sqrt.
Here's how I'd write it:
use strict;
use warnings;
sub dist {
my ($p1, $p2) = @_;
return sqrt(($p1->{x} - $p2->{x}) ** 2 + ($p1->{y} - $p2->{y}) ** 2);
}
my %dot1 = (x => 5, y => 6);
my %dot2 = (x => 7, y => 8);
my $D = dist(\%dot1, \%dot2);
print "Result: $D\n";

Many problems here, I'm afraid.
First, you need to add use strict and use warnings to your code. This will point out many errors, mainly places where you use @array[index] but should have used $array[index]. You also don't declare any of your variables.
Fixing all of that gives me this code:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my %dot1 = ('x'=>5, 'y'=>6);
my %dot2 = ('x'=>7, 'y'=>8);
sub dist {
my (%hash1) = $_[0];
my (%hash2) = $_[1];
my $dist = (($_[0]{'x'}-$_[1]{'x'})**2 + ($_[0]{'y'}-$_[1]{'y'})**2)**0.5;
}
my $D = dist(\%dot1,\%dot2);
say $D;
This still doesn't work. I get:
$ perl twohash
Reference found where even-sized list expected at twohash line 11.
Reference found where even-sized list expected at twohash line 12.
2.82842712474619
The errors are where you are assigning the two arguments you pass into dist().
my (%hash1) = $_[0];
my (%hash2) = $_[1];
You are passing references to hashes, not the hashes themselves (And you're right to do that), but this means that you get scalars, not hashes, in the subroutine. So those lines need to be:
my ($hash1) = $_[0];
my ($hash2) = $_[1];
Making those changes, the code now works and gives the result "2.82842712474619".
I'll just point out one further strangeness in your code: you assign the parameters to the function to two lexical variables ($hash1 and $hash2) but you then ignore those variables and instead go directly to @_ for this data. I expect you actually want:
my $dist = (($hash1->{'x'}-$hash2->{'x'})**2 + ($hash1->{'y'}-$hash2->{'y'})**2)**0.5;
Note, I've changed $_[0]{'x'} to $hash1->{'x'} as you have a reference to a hash.
All in all, this code is a bit of a mess and I suggest you go back to the very earliest chapters of whatever book you are learning Perl from.

Could you please try this:
use Data::Dumper;
my %dot1 = ('x'=>5, 'y'=>6);
my %dot2 = ('x'=>7, 'y'=>8);
sub dist {
my %hash1 = @_[0];
my %hash2 = @_[1];
$dist = ((@_[0]->{'x'}-@_[1]->{'x'})**2 + (@_[0]->{'y'}-@_[1]->{'y'})**2)**0.5;
}
$D = dist(\%dot1,\%dot2);
print $D;

Related

Can I make a variable optional in a perl sub prototype?

I'd like to understand if it's possible to have a sub prototype and optional parameters in it. With prototypes I can do this:
sub some_sub (\@\@\@) {
...
}
my @foo = qw/a b c/;
my @bar = qw/1 2 3/;
my @baz = qw/X Y Z/;
some_sub(@foo, @bar, @baz);
which is nice and readable, but the minute I try to do
some_sub(@foo, @bar);
or even
some_sub(@foo, @bar, ());
I get errors:
Not enough arguments for main::some_sub at tablify.pl line 72, near "@bar)"
or
Type of arg 3 to main::some_sub must be array (not stub) at tablify.pl line 72, near "))"
Is it possible to have a prototype and a variable number of arguments? Or is something similar achievable via signatures?
I know it could be done by always passing arrayrefs, but I was wondering if there was another way. After all, TMTOWTDI.
All arguments after a semi-colon are optional:
sub some_sub(\@\@;\@) {
}
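As a rough sketch of how the optional slot behaves (count_elements is a made-up name for illustration, not something from the question): each backslashed prototype entry arrives as an array reference, and the optional one is simply missing from @_ when the caller leaves it out.

```perl
use strict;
use warnings;

# Hypothetical example sub; the prototype must be seen before the calls are compiled.
sub count_elements(\@\@;\@) {
    my ($first, $second, $third) = @_;    # array refs; $third is undef when omitted
    my $count = @$first + @$second;
    $count += @$third if $third;
    return $count;
}

my @foo = qw/a b c/;
my @bar = qw/1 2 3/;
my @baz = qw/X Y/;

print count_elements(@foo, @bar), "\n";          # 6
print count_elements(@foo, @bar, @baz), "\n";    # 8
```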
Most people are going to expect your argument list to flatten, and you are reaching for an outdated tool to do what people don't expect.
Instead, pass data structures by reference:
some_sub( \@array1, \@array2 );
sub some_sub {
my @args = @_;
say "Array 1 has " . $args[0]->@* . " elements";
}
If you want to use those as named arrays within the sub, you can use ref aliasing
use v5.22;
use experimental qw(ref_aliasing);
sub some_sub {
\my( @array1 ) = $_[0];
...
}
With v5.26, you can move the reference operator inside the parens:
use v5.26;
use experimental qw(declared_refs);
sub some_sub {
my( \@array1 ) = $_[0];
...
}
And, remember that v5.20 introduced the :prototype attribute so you can distinguish between prototypes and signatures:
use v5.20;
sub some_sub :prototype(@@;@) { ... }
I write about these things at The Effective Perler (which you already read, I see), in Perl New Features, and a little bit in Preparing for Perl 7 (which is mostly about what you need to stop doing in Perl 5 to be future proof).

Is it possible to match a string $x = "foo" with a variable named $foo and assign a value only if it matches?

What I am trying to ask is easily shown.
Imagine 2 variables named as
my ($foo, $bar) = (0,0);
and
my @a = ("foo","bar","beyond","recognition");
is it possible to string match $a[0] with the variable named $foo, and assign a value to it (say "hi") only if it matches identically?
I am trying to debug some code (not mine), and I ran into a tough spot. Basically, I have a part of the script where I have a bunch of variables.
my ($p1, $p2, $p3, $p4)= (0,0,0,0); # *Edited*
my @ids = ("p1","p2","p3","p4");
I have a case where I need to pass each of those variables as a hash key to call a certain operation inside a loop.
for (0..3){
my $handle = get_my_stuff(@ids);
my $ret = $p1->do_something(); # <- $p1 is used for the first instance of loop.
...
...
...
}
For the first iteration of the loop, I need to use $p1, but for the second iteration of the loop I need to pass (or call)
my $ret = $p2->do_something(); # not $p1
So what I did was:
my $p;
for (1..4){
my $handle = get_my_stuff(@ids);
no strict 'refs';
my $ret = $p{$_}->do_something();
...
...
...
use strict 'refs';
...
}
But the above operation is not allowed, and I am unable to call my key in such a manner :(. As it turns out, $p1 became a blessed hash as soon as get_my_stuff() was called. And to my biggest surprise, somehow the script in the function (too much and too long to paste here) assigns or passes a hash reference to my variables only if they match.
You don't need to try to invent something to deal with variable names. Your idea to use a hash is correct, but your approach is flawed.
It seems your function get_my_stuff takes a list of arguments and transforms them somehow. It then returns a list of objects that correspond to the arguments. Instead of doing that in the loop, do it before you loop through the numbers and build up your hash by assigning each id to an object.
Perl allows you to assign to a hash slice. In that case, the sigil changes to an @. My implementation below uses DateTime with years to show that the objects are different.
use strict;
use warnings;
use feature 'say';
use DateTime;
# for illustration purposes
sub get_my_stuff {
return map { DateTime->new( year => (substr $_, 1) + 2000 ) } @_;
}
my @ids = qw(p1 p2 p3 p4);
my %p;
# create this outside of the loop
@p{@ids} = get_my_stuff(@ids);
foreach my $i ( 1.. 4 ) {
say $p{'p' . $i}->ymd; # "do_something"
}
This will output
2001-01-01
2002-01-01
2003-01-01
2004-01-01

PERL -- hash in argument / return / reuse

I wish to fill a hash table successively by applying a function several times. The function takes a hash reference in argument, fills it, and returns it. The hash is taken again in argument by the function.
It seems that the hash is not filled at all.
Here is my code :
Can someone tell me where might be the error please ?
sub extractMLResult {
my (%h1, %h2, %h3, %h4, $param) = @_;
my $h1= shift;
my $h2= shift;
my $h3= shift;
my $h4=shift;
$params= shift;
# read csv file, split it, fill hashes with values
$h1->{$key1}{$key2}{'p'}=$val1;
# ... do the same for the other hashes ...
return (%$h1, %$h2, %$h3, %$h4);
}
my %myhash = ();
my %h1= ();
my %h2= ();
my %h3= ();
my %h4= ();
$myhash{'a'}{'x'}=1;
$myhash{'b'}{'y'}=1;
if (exists $myhash{'a'}){
(%h1, %h2, %h3, %h4) = extractMLResult(\%h1, \%h2, \%h3, \%h4, 'a');
}
if (exists $myhash{'b'}){
(%h1, %h2, %h3, %h4) = extractMLResult(\%h1, \%h2, \%h3, \%h4, 'b');
}
my declares variables in the lexical scope. So the instant you exit your 'if' clause, %$h1 etc. vanishes again.
Also, you're doing some strange things with the assignment, which I don't think will work the way you think: you're dereferencing your hash references, and as such returning a list of values.
Those will all be going into %$h1 because of the way list assignments work.
But on the flip side - when you're reading in myfunction your assignment probably isn't doing what you think.
Because you're calling myfunction and passing a list of values, but you're doing a list assignment for %h1. That means all your arguments are 'consumed':
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
sub myfunction {
my (%h1, %h2, %h3, %h4, $param) = @_;
print Dumper \%h1;
print Dumper \%h2;
}
# launch function :
my %h1 = ( a => "test" );
my %h2 = ( b => "wibble" );
myfunction ( \%h1, \%h2 );
As you will see - your arguments are all consumed by the assignment to %h1 and none are left for the rest of your assignments.
More importantly - your code doesn't even compile, because if you do this:
my (%$h1, %$h2, %$h3, %$h4) = myfunction (\%h1, \%h2, \%h3, \%h4, "a");
You get:
Can't declare hash dereference in "my"
So perhaps give us some sample code that actually illustrates the problem - and runs, with some sample data?
Edit: With the more code - the problem is right here:
sub extractMLResult {
my (%h1, %h2, %h3, %h4, $param) = @_;
Because that's not doing what you think it's doing. Because %h1 is a hash, and it's assigned in a list context, all the arguments of @_ are inserted into it. So %h2, %h3, %h4, $param will always be empty/undefined.
You don't indicate whether you're actually using %h1 though, which just means it's confusing nonsense - potentially.
But this bit:
my $h1= shift;
my $h2= shift;
my $h3= shift;
my $h4_parents = shift;
Ok, so you're extracting some hash references here, which is perhaps a little more sane. But naming the same as the hashes is confusing - there's NO relationship between $h1 and %h1, and you'll confuse things in your code if you do that. (Because $h1{key} is from %h1 and nothing to do with $h1 in turn).
But the same problem exists in return:
(%h1, %h2, %h3, %h4) = extractMLResult(\%h1, \%h2, \%h3, \%h4, 'a');
Because your return:
return (%$h1, %$h2, %$h3, %$h4);
This return will give you back an unrolled list containing all the elements in the hashes. But given the way you're packing the hashes, they'll probably be a partially unrolled list, containing hash-references.
But then, they'll all be consumed by %h1 again, in the assignment.
You would need to:
return ( $h1, $h2, $h3, $h4);
And then in your function:
( $h1_ret, $h2_ret, $h3_ret, $h4_ret ) = extractMLResult(\%h1, \%h2, \%h3, \%h4, 'a');
And then unpack:
%h1 = %$h1_ret;
Or just stick with working with references all the way through, which is probably clearer for all concerned.
You are passing hash references into your subroutine. This is a good idea. But inside your subroutine, you are treating your parameters as hashes (not hash references). This is a bad idea.
A hash is initialised from a list. It should be a list with an even number of elements. Each pair of elements in the list will become a key/value pair in the hash.
my %french = ('one', 'un', 'two', 'deux', 'three', 'trois');
We often use the "fat comma" to emphasise the link between keys and values.
my %french = (one => 'un', two => 'deux', three => 'trois');
This means that hash initialisation is a greedy operation. It will use up all of any list that it is given. You cannot initialise two hashes in the same statement:
my (%french, %german) = (one => 'un', two => 'deux',
three => 'drei', four => 'vier');
This doesn't work, as all of the pairs will end up in %french, leaving nothing to populate %german.
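A quick way to see the greediness for yourself is to count the keys after the assignment. This is only a sketch reusing the two hashes from the example above:

```perl
use strict;
use warnings;

# The first hash in the list swallows every key/value pair.
my (%french, %german) = (one => 'un',   two  => 'deux',
                         three => 'drei', four => 'vier');

print "french has ", scalar(keys %french), " keys\n";    # french has 4 keys
print "german has ", scalar(keys %german), " keys\n";    # german has 0 keys
```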
This is the same mistake that you are making when extracting the parameters within your subroutine. You have this:
my (%h1, %h2, %h3, %h4, $param) = @_;
Nothing will end up in %h2, %h3, %h4 or $param as the assignment to %h1 is greedy and will take all of the data values from @_, leaving nothing for the other variables.
But, as you are passing hash references, your code shouldn't look like that. A hash reference is a scalar value (that's pretty much the point of them) so it is stored in a scalar variable.
What you want is this:
# Note, scalars ($), not hashes (%)
my ($h1, $h2, $h3, $h4, $param) = @_;
This should get you started. Note also, that you'll now need to deal with hash references ($h1->{key}) rather than hashes ($h1{key}).
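Here is a minimal sketch of the reference-based approach, cut down to two hashes. The sub name fill_results and the keys count and seen are invented stand-ins for the real CSV-reading code, which the question doesn't show. Because the sub writes through the references, the caller's hashes are filled directly and nothing needs to be returned or copied back:

```perl
use strict;
use warnings;

# Hypothetical stand-in for extractMLResult: it modifies the caller's
# hashes through the references it receives.
sub fill_results {
    my ($h1, $h2, $param) = @_;    # scalars holding hash references
    $h1->{$param}{count}++;
    $h2->{$param}{seen} = 1;
    return;
}

my (%h1, %h2);
fill_results(\%h1, \%h2, $_) for qw(a b a);

print "a: $h1{a}{count}, b: $h1{b}{count}\n";         # a: 2, b: 1
print "seen: ", join(',', sort keys %h2), "\n";       # seen: a,b
```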
And, please, always include both use strict and use warnings.

in perl, is it bad practice to call multiple subroutines with default arguments?

I am learning perl and understand that it is a common and accepted practice to unpack subroutine arguments using shift. I also understand that it is common and acceptable practice to omit function arguments to use the default @_ array.
Considering these two things, if you call a subroutine without arguments, @_ can (and will, if using shift) be changed. Does this mean that calling another subroutine with default arguments, or, in fact, using the @_ array after this, is considered bad practice? Consider this example:
sub total { # calculate sum of all arguments
my $running_sum;
# take arguments one by one and sum them together
while (@_) {
$running_sum += shift;
}
$running_sum;
}
sub avg { # calculate the mean of given arguments
if (@_ == 0) { return }
my $sum = &total; # gets the correct answer, but changes @_
$sum / @_ # causes division by zero, since @_ is now empty
}
My gut feeling tells me that using shift to unpack arguments would actually be bad practice, unless your subroutine is actually supposed to change the passed arguments, but I have read in multiple places, including Stack Overflow, that this is not a bad practice.
So the question is: if using shift is common practice, should I always assume the passed argument list could get changed, as a side-effect of the subroutine (like the &total subroutine in the quoted example)? Is there maybe a way to pass arguments by value, so I can be sure that the argument list does not get changed, so I could use it again (like in the &avg subroutine in the quoted text)?
In general, shifting from the arguments is ok—using the & sigil to call functions isn't. (Except in some very specific situations you'll probably never encounter.)
Your code could be re-written, so that total doesn't shift from @_. Using a for-loop may even be more efficient.
sub total {
my $total = 0;
$total += $_ for @_;
$total;
}
Or you could use the sum function from List::Util:
use List::Util qw(sum);
sub avg { @_ ? sum(@_) / @_ : 0 }
Using shift isn't that common, except for extracting $self in object oriented Perl. But as you always call your functions like foo( ... ), it doesn't matter if foo shifts or doesn't shift the argument array.
(The only thing worth noting about a function is whether it assigns to elements in @_, as these are aliases for the variables you gave as arguments. Assigning to elements in @_ is usually bad.)
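To make the aliasing concrete, here is a small sketch. The sub name shout is invented for illustration; the point is only that assigning to $_[0] changes the caller's variable:

```perl
use strict;
use warnings;

# Hypothetical helper: $_[0] is an alias for the caller's first argument,
# so this assignment is visible outside the sub.
sub shout {
    $_[0] = uc $_[0];
    return;
}

my $word = 'hello';
shout($word);
print "$word\n";    # HELLO
```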
Even if you can't change the implementation of total, calling the sub with an explicit argument list is safe, as the argument list is a copy of the array:
(a) &total: calls total with the identical @_, and overrides prototypes.
(b) total(@_): calls total with a copy of @_.
(c) &total(@_): calls total with a copy of @_, and overrides prototypes.
Form (b) is standard. Form (c) shouldn't be seen, except in very few cases for subs inside the same package where the sub has a prototype (and don't use prototypes), and they have to be overridden for some obscure reason. A testament to poor design.
Form (a) is only sensible for tail calls (@_ = (...); goto &foo) or other forms of optimization (and premature optimization is the root of all evil).
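The difference between forms (a) and (b) can be sketched like this, using the shifting total from the question. The two wrapper names are invented; each one reports how many arguments it still has after calling total:

```perl
use strict;
use warnings;

sub total {
    my $sum = 0;
    while (@_) { $sum += shift @_ }    # empties whatever @_ it was given
    return $sum;
}

# Hypothetical wrappers that return how many arguments survive the call.
sub left_after_shared {
    my $sum = &total;       # form (a): total works on our own @_
    return scalar @_;
}

sub left_after_copy {
    my $sum = total(@_);    # form (b): total gets a fresh @_
    return scalar @_;
}

print left_after_shared(1, 2, 3), "\n";    # 0
print left_after_copy(1, 2, 3), "\n";      # 3
```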
You should avoid using the &func; style of calling unless you have a really good reason, and trust that others do the same.
To guard your @_ against modification by a callee, just do &func() or func().
Perl is a little too lax sometimes and having multiple ways of accessing input parameters can make smelly and inconsistent code. For want of a better answer, try to impose your own standard.
Here's a few ways I've used and seen
Shifty
sub login
{
my $user = shift;
my $passphrase = shift;
# Validate authentication
return 0;
}
Expanding @_
sub login
{
my ($user, $passphrase) = @_;
# Validate authentication
return 0;
}
Explicit indexing
sub login
{
my $user = $_[0];
my $passphrase = $_[1];
# Validate authentication
return 0;
}
Enforce parameters with function prototypes (this is not popular however)
sub login($$)
{
my ($user, $passphrase) = @_;
# Validate authentication
return 0;
}
Sadly you still have to perform your own convoluted input validation/taint checking, ie:
return unless defined $user;
return unless defined $passphrase;
or better still, a little more informative
unless (defined($user) && defined($passphrase)) {
carp "Input error: user or passphrase not defined";
return -1;
}
Perldoc perlsub should really be your first port of call.
Hope this helps!
Here are some examples where the careful use of @_ matters.
1. Hash-y Arguments
Sometimes you want to write a function which can take a list of key-value pairs, but one is the most common use and you want that to be available without needing a key. For example
sub get_temp {
my $location = @_ % 2 ? shift : undef;
my %options = @_;
$location ||= $options{location};
...
}
So now if you call the function with an odd number of arguments, the first is location. This allows get_temp('Chicago') or get_temp('New York', unit => 'C') or even get_temp( unit => 'K', location => 'Nome, Ak'). This may be a more convenient API for your users. By shifting the odd argument, now @_ is an even list and may be assigned to a hash.
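A runnable sketch of just the argument handling is below. The sub is renamed get_temp_args, it returns the parsed values instead of looking up a temperature, and the default unit 'F' is my assumption rather than anything from the snippet above:

```perl
use strict;
use warnings;

# Hypothetical variant of get_temp that only parses its arguments.
sub get_temp_args {
    my $location = @_ % 2 ? shift : undef;    # odd count: first arg is the location
    my %options  = @_;                        # the rest is an even key/value list
    $location ||= $options{location};
    my $unit = defined $options{unit} ? $options{unit} : 'F';    # assumed default
    return ($location, $unit);
}

print join(' | ', get_temp_args('Chicago')), "\n";                           # Chicago | F
print join(' | ', get_temp_args('New York', unit => 'C')), "\n";             # New York | C
print join(' | ', get_temp_args(unit => 'K', location => 'Nome, Ak')), "\n"; # Nome, Ak | K
```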
2. Dispatching
Let's say we have a class that we want to be able to dispatch methods by name (possibly AUTOLOAD could be useful, but we will hand roll it). Perhaps this is a command line script where arguments are methods. In this case we define two dispatch methods, one "clean" and one "dirty". If we call with the -c flag we get the clean one. These methods find the method by name and call it. The difference is how. The dirty one leaves itself in the stack trace; the clean one has to be more clever, but dispatches without being in the stack trace. We make a death method which gives us that trace.
#!/usr/bin/env perl
use strict;
use warnings;
package Unusual;
use Carp;
sub new {
my $class = shift;
return bless { @_ }, $class;
}
sub dispatch_dirty {
my $self = shift;
my $name = shift;
my $method = $self->can($name) or confess "No method named $name";
$self->$method(@_);
}
sub dispatch_clean {
my $self = shift;
my $name = shift;
my $method = $self->can($name) or confess "No method named $name";
unshift @_, $self;
goto $method;
}
sub death {
my ($self, $message) = @_;
$message ||= 'died';
confess "$self->{name}: $message";
}
package main;
use Getopt::Long;
GetOptions
'clean' => \my $clean,
'name=s' => \(my $name = 'Robot');
my $obj = Unusual->new(name => $name);
if ($clean) {
$obj->dispatch_clean(@ARGV);
} else {
$obj->dispatch_dirty(@ARGV);
}
So now if we call ./test.pl to invoke the death method
$ ./test.pl death Goodbye
Robot: Goodbye at ./test.pl line 32
Unusual::death('Unusual=HASH(0xa0f7188)', 'Goodbye') called at ./test.pl line 19
Unusual::dispatch_dirty('Unusual=HASH(0xa0f7188)', 'death', 'Goodbye') called at ./test.pl line 46
but we see dispatch_dirty in the trace. If instead we call ./test.pl -c we now use the clean dispatcher and get
$ ./test.pl -c death Adios
Robot: Adios at ./test.pl line 33
Unusual::death('Unusual=HASH(0x9427188)', 'Adios') called at ./test.pl line 44
The key here is the goto (not the evil goto) which takes the subroutine reference and immediately switches the execution to that reference, using the current @_. This is why I have to unshift @_, $self so that the invocant is ready for the new method.
Refs:
sub refWay{
my ($refToArray,$secondParam,$thirdParam) = @_;
#work here
}
refWay(\@array, 'a','b');
HashWay:
sub hashWay{
my $refToHash = shift; #(if pass ref to hash)
#and i know, that:
return undef unless exists $refToHash->{'user'};
return undef unless exists $refToHash->{'password'};
#or the same in loop:
for (qw(user password etc)){
return undef unless exists $refToHash->{$_};
}
}
hashWay({'user'=>'YourName', 'password'=>'YourPassword'});
I tried a simple example:
#!/usr/bin/perl
use strict;
sub total {
my $sum = 0;
while(@_) {
$sum = $sum + shift;
}
return $sum;
}
sub total1 {
my ($a, $aa, $aaa) = @_;
return ($a + $aa + $aaa);
}
my $s;
$s = total(10, 20, 30);
print $s;
$s = total1(10, 20, 30);
print "\n$s";
Both print statements gave answer as 60.
But personally I feel, the arguments should be accepted in this manner:
my ($first, $second, @garb) = @_;
in order to avoid any sort of issue later.
I found the following gem in http://perldoc.perl.org/perlsub.html:
"Yes, there are still unresolved issues having to do with visibility of @_ . I'm ignoring that question for the moment. (But note that if we make @_ lexically scoped, those anonymous subroutines can act like closures... (Gee, is this sounding a little Lispish? (Never mind.)))"
You might have run into one of those issues :-(
OTOH amon is probably right -> +1

Should I use $_[0] or copy the argument list in Perl?

If I pass a hash to a sub:
parse(\%data);
Should I assign $_[0] to a variable first, or is it okay to keep accessing $_[0] whenever I want to get an element from the hash? Clarification:
sub parse
{ $var1 = $_[0]->{'elem1'};
$var2 = $_[0]->{'elem2'};
$var3 = $_[0]->{'elem3'};
$var4 = $_[0]->{'elem4'};
$var5 = $_[0]->{'elem5'};
}
# Versus
sub parse
{ my $hr = $_[0];
$var1 = $hr->{'elem1'};
$var2 = $hr->{'elem2'};
$var3 = $hr->{'elem3'};
$var4 = $hr->{'elem4'};
$var5 = $hr->{'elem5'};
}
Is the second version more correct since it doesn't have to keep accessing the argument array, or does Perl end up interpreting them the same way anyhow?
In this case there is no difference because you are passing a reference to a hash. But when passing a scalar there will be a difference:
sub rtrim {
## remove trailing spaces from first argument
$_[0] =~ s/\s+$//;
}
rtrim($str); ## value of the variable will be changed
sub rtrim_bugged {
my $str = $_[0]; ## this makes a copy of variable
$str =~ s/\s+$//;
}
rtrim_bugged($str); ## value of the variable will stay the same
If you're passing a hash reference, then only a copy of the reference is created, but the hash itself will be the same. So if you care about code readability then I suggest you create a variable for each of your parameters. For example:
sub parse {
## you can easily add new parameters to this function
my ($hr) = @_;
my $var1 = $hr->{'elem1'};
my $var2 = $hr->{'elem2'};
my $var3 = $hr->{'elem3'};
my $var4 = $hr->{'elem4'};
my $var5 = $hr->{'elem5'};
}
More descriptive variable names will improve your code too.
For a general discussion of the efficiency of shift vs accessing @_ directly, see:
Is there a difference between Perl's shift versus assignment from @_ for subroutine parameters?
Is 'shift' evil for processing Perl subroutine parameters?
As for your specific code, I'd use shift, but simplify the data extraction with a hash slice:
sub parse
{
my $hr = shift;
my ($var1, $var2, $var3, $var4, $var5) = @{$hr}{qw(elem1 elem2 elem3 elem4 elem5)};
}
I'll assume that this method does something else with these variables that makes it worthwhile to keep them in separate variables (perhaps the hash is read-only, and you need to make some modifications before inserting them into some other data?) -- otherwise why not just leave them in the hashref where they started?
You are micro-optimizing; try to avoid that. Go with whatever is most readable/maintainable. Usually this would be the one where you use a lexical variable, since its name indicates its purpose...but if you use a name like $data or $x this obviously doesn't apply.
In terms of the technical details, for most purposes you can estimate the time taken by counting the number of basic ops perl will use. For your $_[0], an element lookup in a non-lexical array variable takes multiple ops: one to get the glob, one to get the array part of the glob, one or more to get the index (just one for a constant), and one to look up the element. $hr, on the other hand, is a single op. To cater to direct users of @_, there's an optimization that reduces the ops for $_[0] to a single combined op (when the index is between 0 and 255 inclusive), but it isn't used in your case because the hash-deref context requires an additional flag on the array element lookup (to support autovivification) and that flag isn't supported by the optimized op.
In summary, using a lexical is going to be both more readable and (if you use it more than once) imperceptibly faster.
My rule is that I try not to use $_[0] in subroutines that are longer than a couple of statements. After that everything gets a user-defined variable.
Why are you copying all of the hash values into variables? Just leave them in the hash where they belong. That's a much better optimization than the one you are thinking about.
It's the same, although the second is clearer.
Since they both work, both are fine; the common practice is to shift off parameters.
sub parse { my $hr = shift; my $var1 = $hr->{'elem1'}; }