How does #_ work in Perl subroutines? - perl

I was always sure that if I pass a Perl subroutine a simple scalar, it can never change its value outside the subroutine. That is:
my $x = 100;
foo($x);
# without knowing anything about foo(), I'm sure $x still == 100
So if I want foo() to change x, I must pass it a reference to x.
Then I found out this is not the case:
sub foo {
$_[0] = 'CHANGED!';
}
my $x = 100;
foo($x);
print $x, "\n"; # prints 'CHANGED!'
And the same goes for array elements:
my #arr = (1,2,3);
print $arr[0], "\n"; # prints '1'
foo($arr[0]);
print $arr[0], "\n"; # prints 'CHANGED!'
That kinda surprised me. How does this work? Isn't the subroutine only gets the value of the argument? How does it know its address?

In Perl, the subroutine arguments stored in #_ are always aliases to the values at the call site. This aliasing only persists in #_, if you copy values out, that's what you get, values.
so in this sub:
sub example {
# #_ is an alias to the arguments
my ($x, $y, #rest) = #_; # $x $y and #rest contain copies of the values
my $args = \#_; # $args contains a reference to #_ which maintains aliases
}
Note that this aliasing happens after list expansion, so if you passed an array to example, the array expands in list context, and #_ is set to aliases of each element of the array (but the array itself is not available to example). If you wanted the latter, you would pass a reference to the array.
Aliasing of subroutine arguments is a very useful feature, but must be used with care. To prevent unintended modification of external variables, in Perl 6 you must specify that you want writable aliased arguments with is rw.
One of the lesser known but useful tricks is to use this aliasing feature to create array refs of aliases
my ($x, $y) = (1, 2);
my $alias = sub {\#_}->($x, $y);
$$alias[1]++; # $y is now 3
or aliased slices:
my $slice = sub {\#_}->(#somearray[3 .. 10]);
it also turns out that using sub {\#_}->(LIST) to create an array from a list is actually faster than [ LIST ] since Perl does not need to copy every value. Of course the downside (or upside depending on your perspective) is that the values remain aliased, so you can't change them without changing the originals.
As tchrist mentions in a comment to another answer, when you use any of Perl's aliasing constructs on #_, the $_ that they provide you is also an alias to the original subroutine arguments. Such as:
sub trim {s!^\s+!!, s!\s+$!! for #_} # in place trimming of white space
Lastly all of this behavior is nestable, so when using #_ (or a slice of it) in the argument list of another subroutine, it also gets aliases to the first subroutine's arguments:
sub add_1 {$_[0] += 1}
sub add_2 {
add_1(#_) for 1 .. 2;
}

This is all documented in detail in perldoc perlsub. For example:
Any arguments passed in show up in the array #_. Therefore, if you called a function with two arguments, those would be stored in $_[0] and $_[1]. The
array #_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the
corresponding argument is updated (or an error occurs if it is not updatable). If an argument is an array or hash element which did not exist when the
function was called, that element is created only when (and if) it is modified or a reference to it is taken. (Some earlier versions of Perl created the
element whether or not the element was assigned to.) Assigning to the whole array #_ removes that aliasing, and does not update any arguments.

Perl passes arguments by reference, not by value. See http://www.troubleshooters.com/codecorn/littperl/perlsub.htm

Related

perl assign reference to subroutine

I use #_ in a subroutine to get a parameter which is assigned as a reference of an array, but the result dose not showing as an array reference.
My code is down below.
my #aar = (9,8,7,6,5);
my $ref = \#aar;
AAR($ref);
sub AAR {
my $ref = #_;
print "ref = $ref";
}
This will print 1 , not an array reference , but if I replace #_ with shift , the print result will be a reference.
can anyone explain why I can't get a reference using #_ to me ?
This is about context in Perl. It is a crucial aspect of the language.
An expression like
my $var = #ary;
attempts to assign an array to a scalar.
That doesn't make sense as it stands and what happens is that the right-hand side is evaluated to the number of elements of the array and that is assigned to $var.
In order to change that behavior you need to provide the "list context" to the assignment operator.† In this case you'd do
my ($var) = #ary;
and now we have an assignment of a list (of array elements) to a list (of variables, here only $var), where they are assigned one for one. So here the first element of #ary is assigned to $var. Please note that this statement plays loose with the elusive notion of the "list."
So in your case you want
my ($ref) = #_;
and the first element from #_ is assigned to $ref, as needed.
Alternatively, you can remove and return the first element of #_ using shift, in which case the scalar-context assignment is fine
my $ref = shift #_;
In this case you can also do
my $ref = shift;
since shift by default works on #_.
This is useful when you want to remove the first element of input as it's being assigned so that the remaining #_ is well suited for further processing. It is often done in object-oriented code.
It is well worth pointing out that many operators and builtin facilities in Perl act differently depending on what context they are invoked in.
For some specifics, just a few examples: the regex match operator returns true/false (1/empty string) in scalar context but the actual matches in list context,‡ readdir returns a single entry in scalar context but all of them in list context, while localtime shows a bit more distinct difference. This context-sensitive behavior is in every corner of Perl.
User level subroutines can be made to behave that way via wantarray.
†
See Scalar vs List Assignment Operator
for a detailed discussion
‡
See it in perlretut and in perlop for instance
When you assign an array to a scalar, you're getting the size of the array. You pass one argument (a reference to an array) to AAR, that's why you get 1.
To get the actual parameters, place the local variable in braces:
sub AAR {
my ($ref) = #_;
print "ref = $ref\n";
}
This prints something like ref = ARRAY(0x5566c89a4710).
You can then use the reference to access the array elements like this:
print join(", ", #{$ref});

Is it possible to use hash, array, scalar with the same name in Perl?

I'm converting perl script to python.
I haven't used perl language before so many things in perl confused me.
For example, below, opt was declared as a scalar first but declared again as a hash. %opt = ();
Is it possible to declare a scalar and a hash with the same name in perl?
As I know, foreach $opt (#opts) means that scalar opt gets values of array opts one by one. opt is an array at this time???
In addition, what does $opt{$opt} mean?
opt outside $opt{$opt} is a hash and opt inside $opt{$opt} is a scalar?
I'm so confused, please help me ...
sub ParseOptions
{
local (#optval) = #_;
local ($opt, #opts, %valFollows, #newargs);
while (#optval) {
$opt = shift(#optval);
push(#opts,$opt);
$valFollows{$opt} = shift(#optval);
}
#optArgs = ();
%opt = ();
arg: while (defined($arg = shift(#ARGV))) {
foreach $opt (#opts) {
if ($arg eq $opt) {
push(#optArgs, $arg);
if ($valFollows{$opt}) {
if (#ARGV == 0) {
&Usage();
}
$opt{$opt} = shift(#ARGV);
push(#optArgs, $opt{$opt});
} else {
$opt{$opt} = 1;
}
next arg;
}
}
push(#newargs,$arg);
}
#ARGV = #newargs;
}
In Perl types SCALAR, ARRAY, and HASH are distinguished by the leading sigil in the variable name, $, #, and % respectively, but otherwise they may bear the same name. So $var and #var and %var are, other than belonging to the same *var typeglob, completely distinct variables.
A key for a hash must be a scalar, let's call it $key. A value in a hash must also be a scalar, and the one corresponding to $key in the hash %h is $h{$key}, the leading $ indicating that this is now a scalar. Much like an element of an array is a scalar, thus $ary[$index].
In foreach $var (#ary) the scalar $var does not really "get values" of array elements but rather aliases them. So if you change it in the loop you change the element of the array.
(Note, you have the array #opts, not #opt.)
A few comments on the code, in the light of common practices in modern Perl
One must always have use warnings; on top. The other one that is a must is use strict, as it enforces declaration of variables with my (or our), promoting all kinds of good behavior. This restricts the scope of lexical (my) variables to the nearest block enclosing the declaration
A declaration may be done inside loops as well, allowing while (my $var = EXPR) {} and foreach my $var (LIST) {}, and then $var is scoped to the loop's block (and is also usable in the rest of the condition)
The local is a different beast altogether, and in this code those should be my instead
Loop labels like arg: are commonly typed in block capitals
The & in front of a function name has rather particular meanings† and is rarely needed. It is almost certainly not needed in this code
Assignment of empty list like my #optArgs = () when the variable is declared (with my) is unneeded and has no effect (with a global #name it may be needed to clear it from higher scope)
In this code there is no need to introduce variables first since they are global and thus created the first time they are used, much like in Python – Unless they are used outside of this sub, in which case they may need to be cleared. That's the thing with globals, they radiate throughout all code
Except for the last two these are unrelated to your Python translation task, for this script.
† It suppresses prototypes; informs interpreter (at runtime) that it's a subroutine and not a "bareword", if it hasn't been already defined (can do that with () after the name); ensures that the user-defined sub is called, if there is a builtin of the same name; passes the caller's #_ (when without parens, &name); in goto &name and defined ...
See, for example, this post and this post
First, to make it clear, in perl, as in shell, you could surround variable names in curly brackets {} :
${bimbo} and $bimbo are the same scalar variable.
#bimbo is an array pointer ;
%bimbo is a hash pointer ;
$bimbo is a scalar (unique value).
To address an array or hash value, you will have to use the '$' :
$bimbo{'index'} is the 'index' value of hash %bimbo.
If $i contains an int, such as 1 for instance,
$bimbo[$i] is the second value of the array #bimbo.
So, as you can see, # or % ALWAYS refers to the whole array, as $bimbo{} or $bimbo[] refers to any value of the hash or array.
${bimbo[4]} refers to the 5th value of the array #bimbo ; %{bimbo{'index'}} refers to the 'index' value of array %bimbo.
Yes, all those structures could have the same name. This is one of the obvious syntax of perl.
And, euhm… always think in perl as a C++ edulcored syntax, it is simplified, but it is C.

Assigning shift to a variable

Other thread that I read: What does assigning 'shift' to a variable mean?
I also used perldoc -f shift:
shift ARRAY
shift
Shifts the first value of the array off and returns it, shortening the array by 1 and moving everything down. If there are no elements in the array, returns the undefined value. If ARRAY is omitted, shifts the #_ array within the lexical scope of subroutines and formats, and the #ARGV array outside of a subroutine and also within the lexical scopes established by the eval STRING, BEGIN {}, INIT {}, CHECK {}, UNITCHECK {} and END {} constructs.
See also unshift, push, and pop. shift and unshift do the same thing to the left end of an array that pop and push do to the right end.
I understand outside of subroutines, the array is #ARGV, and inside arguments are passed through #_
I've read countless tutorials on how to use the shift function, but it's always about arrays, and how it removes the first element at the beginning of the array and returns it. But I see sometimes
$eg = shift;
~do more things here~
It seems as if nothing is making sense to me, and I feel like I can't continue reading more until I understand how this works as it's a "basic building block" to the language.
I'm not quite sure if an example of code is needed, as I believe the same principles apply to all programs that use shift. But if wrong I can provide some examples of code.
It depends on your context.
In all cases, shift removes the element at list index 0 and returns it, shifting all remaining elements down one.
Inside a sub, shift without an argument (bare shift) operates on the #_ list of subroutine parameters.
Suppose I call mysub('me', '22')
sub mysub {
my $self = shift; # return $_[0], moves everything else down, $self = 'me', #_ = ( '22' )
my $arg1 = shift; # returns $_[0], moves everything down, $arg1 = '22', #_ = ()
}
Outside a sub, it operates on the #ARGV list of command line parameters.
Given an argument, shift operates on that list.
my #list1 = ( 1,2,3,4,5 );
my $first = shift #list1; # $first = 1, #list1 = (2,3,4,5)
It removes the first element from an array and returns it.
If #_ contains the elements ("foo","bar",123) then the statement
$eg = shift; # same as $eg = shift #_
Assigns the value "foo" to the variable $eg, and leaves #_ containing the elements ("bar",123).
This is very much a basic building block of the language. You will frequently see this construction at the beginning of subroutines as it is one of the ways to copy the arguments to the subroutine (#_) into other variables.
sub func {
my $x = shift; # put first arg into $x
my $y = shift; # put second arg into $y
my #z = #_; # put remaining args into #z
...
}
$r = func(1,2,"foo");

Scope of the default variable $_ in Perl

I have the following method which accepts a variable and then displays info from a database:
sub showResult {
if (#_ == 2) {
my #results = dbGetResults($_[0]);
if (#results) {
foreach (#results) {
print "$count - $_[1] (ID: $_[0])\n";
}
} else {
print "\n\nNo results found";
}
}
}
Everything works fine, except the print line in the foreach loop. This $_ variable still contains the values passed to the method.
Is there anyway to 'force' the new scope of values on $_, or will it always contain the original values?
If there are any good tutorials that explain how the scope of $_ works, that would also be cool!
Thanks
The problem here is that you're using really #_ instead of $_. The foreach loop changes $_, the scalar variable, not #_, which is what you're accessing if you index it by $_[X]. Also, check again the code to see what it is inside #results. If it is an array of arrays or refs, you may need to use the indirect ${$_}[0] or something like that.
In Perl, the _ name can refer to a number of different variables:
The common ones are:
$_ the default scalar (set by foreach, map, grep)
#_ the default array (set by calling a subroutine)
The less common:
%_ the default hash (not used by anything by default)
_ the default file handle (used by file test operators)
&_ an unused subroutine name
*_ the glob containing all of the above names
Each of these variables can be used independently of the others. In fact, the only way that they are related is that they are all contained within the *_ glob.
Since the sigils vary with arrays and hashes, when accessing an element, you use the bracket characters to determine which variable you are accessing:
$_[0] # element of #_
$_{...} # element of %_
$$_[0] # first element of the array reference stored in $_
$_->[0] # same
The for/foreach loop can accept a variable name to use rather than $_, and that might be clearer in your situation:
for my $result (#results) {...}
In general, if your code is longer than a few lines, or nested, you should name the variables rather than relying on the default ones.
Since your question was related more to variable names than scope, I have not discussed the actual scope surrounding the foreach loop, but in general, the following code is equivalent to what you have.
for (my $i = 0; $i < $#results; $i++) {
local *_ = \$results[$i];
...
}
The line local *_ = \$results[$i] installs the $ith element of #results into the scalar slot of the *_ glob, aka $_. At this point $_ contains an alias of the array element. The localization will unwind at the end of the loop. local creates a dynamic scope, so any subroutines called from within the loop will see the new value of $_ unless they also localize it. There is much more detail available about these concepts, but I think they are outside the scope of your question.
As others have pointed out:
You're really using #_ and not $_ in your print statement.
It's not good to keep stuff in these variables since they're used elsewhere.
Officially, $_ and #_ are global variables and aren't members of any package. You can localize the scope with my $_ although that's probably a really, really bad idea. The problem is that Perl could use them without you even knowing it. It's bad practice to depend upon their values for more than a few lines.
Here's a slight rewrite in your program getting rid of the dependency on #_ and $_ as much as possible:
sub showResults {
my $foo = shift; #Or some meaningful name
my $bar = shift; #Or some meaningful name
if (not defined $foo) {
print "didn't pass two parameters\n";
return; #No need to hang around
}
if (my #results = dbGetResults($foo)) {
foreach my $item (#results) {
...
}
}
Some modifications:
I used shift to give your two parameters actual names. foo and bar aren't good names, but I couldn't find out what dbGetResults was from, so I couldn't figure out what parameters you were looking for. The #_ is still being used when the parameters are passed, and my shift is depending upon the value of #_, but after the first two lines, I'm free.
Since your two parameters have actual names, I can use the if (not defined $bar) to see if both parameters were passed. I also changed this to the negative. This way, if they didn't pass both parameters, you can exit early. This way, your code has one less indent, and you don't have a if structure that takes up your entire subroutine. It makes it easier to understand your code.
I used foreach my $item (#results) instead of foreach (#results) and depend upon $_. Again, it's clearer what your program is doing, and you wouldn't have confused $_->[0] with $_[0] (I think that's what you were doing). It would have been obvious you wanted $item->[0].

What does '#_' do in Perl?

I was glancing through some code I had written in my Perl class and I noticed this.
my ($string) = #_;
my #stringarray = split(//, $string);
I am wondering two things:
The first line where the variable is in parenthesis, this is something you do when declaring more than one variable and if I removed them it would still work right?
The second question would be what does the #_ do?
The #_ variable is an array that contains all the parameters passed into a subroutine.
The parentheses around the $string variable are absolutely necessary. They designate that you are assigning variables from an array. Without them, the #_ array is assigned to $string in a scalar context, which means that $string would be equal to the number of parameters passed into the subroutine. For example:
sub foo {
my $bar = #_;
print $bar;
}
foo('bar');
The output here is 1--definitely not what you are expecting in this case.
Alternatively, you could assign the $string variable without using the #_ array and using the shift function instead:
sub foo {
my $bar = shift;
print $bar;
}
Using one method over the other is quite a matter of taste. I asked this very question which you can check out if you are interested.
When you encounter a special (or punctuation) variable in Perl, check out the perlvar documentation. It lists them all, gives you an English equivalent, and tells you what it does.
Perl has two different contexts, scalar context, and list context. An array '#_', if used in scalar context returns the size of the array.
So given these two examples, the first one gives you the size of the #_ array, and the other gives you the first element.
my $string = #_ ;
my ($string) = #_ ;
Perl has three 'Default' variables $_, #_, and depending on who you ask %_. Many operations will use these variables, if you don't give them a variable to work on. The only exception is there is no operation that currently will by default use %_.
For example we have push, pop, shift, and unshift, that all will accept an array as the first parameter.
If you don't give them a parameter, they will use the 'default' variable instead. So 'shift;' is the same as 'shift #_;'
The way that subroutines were designed, you couldn't formally tell the compiler which values you wanted in which variables. Well it made sense to just use the 'default' array variable '#_' to hold the arguments.
So these three subroutines are (nearly) identical.
sub myjoin{
my ( $stringl, $stringr ) = #_;
return "$stringl$stringr";
}
sub myjoin{
my $stringl = shift;
my $stringr = shift;
return "$stringl$stringr";
}
sub myjoin{
my $stringl = shift #_;
my $stringr = shift #_;
return "$stringl$stringr";
}
I think the first one is slightly faster than the other two, because you aren't modifying the #_ variable.
The variable #_ is an array (hence the # prefix) that holds all of the parameters to the current function.