perl assign reference to subroutine - perl

I use @_ in a subroutine to get a parameter that is passed as a reference to an array, but the result does not show up as an array reference.
My code is below.
my @aar = (9,8,7,6,5);
my $ref = \@aar;
AAR($ref);
sub AAR {
my $ref = @_;
print "ref = $ref";
}
This prints 1, not an array reference, but if I replace @_ with shift, the printed result is a reference.
Can anyone explain why I can't get a reference using @_?

This is about context in Perl. It is a crucial aspect of the language.
An expression like
my $var = @ary;
attempts to assign an array to a scalar.
That doesn't make sense as it stands, so the right-hand side is evaluated in scalar context, where an array yields its number of elements, and that number is assigned to $var.
In order to change that behavior you need to provide "list context" to the assignment operator.† In this case you'd do
my ($var) = @ary;
and now we have an assignment of a list (of array elements) to a list (of variables, here only $var), where they are assigned one for one. So here the first element of @ary is assigned to $var. Please note that this statement plays loose with the elusive notion of the "list."
So in your case you want
my ($ref) = @_;
and the first element of @_ is assigned to $ref, as needed.
Alternatively, you can remove and return the first element of @_ using shift, in which case the scalar-context assignment is fine
my $ref = shift @_;
In this case you can also do
my $ref = shift;
since inside a subroutine shift works on @_ by default.
This is useful when you want to remove the first element of input as it's being assigned so that the remaining @_ is well suited for further processing. It is often done in object-oriented code.
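The three forms discussed above can be compared side by side. This is only a minimal sketch; the sub names count_args, first_arg and shift_arg are made up for illustration.

```perl
use strict;
use warnings;

# Scalar assignment: @_ is evaluated in scalar context and yields its element count.
sub count_args { my $n = @_; return $n; }

# List assignment: the first element of @_ is copied into $first.
sub first_arg { my ($first) = @_; return $first; }

# shift removes and returns the first element of @_.
sub shift_arg { my $first = shift; return $first; }

my @aar = (9, 8, 7, 6, 5);
my $ref = \@aar;

print count_args($ref), "\n";       # 1
print ref(first_arg($ref)), "\n";   # ARRAY
print ref(shift_arg($ref)), "\n";   # ARRAY
```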
It is well worth pointing out that many operators and builtin facilities in Perl act differently depending on what context they are invoked in.
For some specifics, just a few examples: the regex match operator returns true/false (1/empty string) in scalar context but the actual matches in list context,‡ readdir returns a single entry in scalar context but all of them in list context, while localtime shows a bit more distinct difference. This context-sensitive behavior is in every corner of Perl.
User level subroutines can be made to behave that way via wantarray.
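As a rough illustration of wantarray, here is a hypothetical sub (words is an invented name) that returns a list in list context and a count in scalar context:

```perl
use strict;
use warnings;

# Hypothetical sub: returns the words in list context, their count in scalar context.
sub words {
    my ($text) = @_;
    my @words = split ' ', $text;
    return wantarray ? @words : scalar @words;
}

my @list  = words('hans georg mustermann');   # called in list context
my $count = words('hans georg mustermann');   # called in scalar context

print "@list\n";    # hans georg mustermann
print "$count\n";   # 3
```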
† See "Scalar vs List Assignment Operator" for a detailed discussion.
‡ See it in perlretut and in perlop, for instance.

When you assign an array to a scalar, you get the size of the array. You pass one argument (a reference to an array) to AAR, which is why you get 1.
To get the actual parameters, put the local variable in parentheses:
sub AAR {
my ($ref) = @_;
print "ref = $ref\n";
}
This prints something like ref = ARRAY(0x5566c89a4710).
You can then use the reference to access the array elements like this:
print join(", ", @{$ref});
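A few common ways to work with such a reference, as a small sketch reusing the array from the question:

```perl
use strict;
use warnings;

my @aar = (9, 8, 7, 6, 5);
my $ref = \@aar;

my $first = $ref->[0];                # a single element via the arrow operator
my $count = scalar @{$ref};           # number of elements in the referenced array
my $all   = join ", ", @{$ref};       # all elements, dereferenced as a list

print "$first $count\n";   # 9 5
print "$all\n";            # 9, 8, 7, 6, 5
```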

Related

Why doesn't Perl recognize that I want to create an array ref with "split"?

I want to split a scalar by whitespaces and save the result in an ArrayReference.
use strict;
use warnings;
use Data::Dumper;
my $name = 'hans georg mustermann';
my $array = split ' ', $name;
print Dumper($array); #$VAR1 = 3;
So it seems $array is now a scalar holding the size of the result of the split operation.
When I change the code to my $array = [split ' ', $name]; the variable $array is an array reference and contains all 3 strings.
I just don't understand this behavior. It would be great if someone could explain it to me or point me to good documentation about these things, as I don't know how to search for this topic.
Thank you in advance
What you see here is called "context". The documentation about this is rather scattered. You also want to take a look at this tutorial about "scalar vs list context" https://perlmaven.com/scalar-and-list-context-in-perl
If you assign the result of split (or any subroutine calls) to an array, it's list context:
my @arr = split ' ', $name;
#=> @arr = ('hans', 'georg', 'mustermann');
Your example code assigns the result to a scalar instead, so split is called in "scalar context".
Since multiple things cannot fit into one slot, some sort of summary has to be returned instead. For split, Perl 5 defines that summary as the number of elements the split produces.
Check the documentation of the split function: https://perldoc.pl/functions/split -- which actually defines the behaviour under scalar context as well as list context.
Also take a glance at the documentation of all builtin functions at https://perldoc.pl/functions -- you'll find the behaviour definition under "list context" and "scalar context" for most of them -- although many of them are not returning "the size of lists" but rather something else.
That's called context.
The expression split ' ', $name returns a list of strings in list context. In $array = split ' ', $name the scalar on the left puts split in scalar context, where it returns the number of fields instead. That is split's own documented behaviour; other functions and operators return different things in scalar context.
You should write @array = split ' ', $name instead, using an array variable rather than a scalar variable, in order to preserve the list values.
If you read the documentation for split(), you'll find the bit that explains what the function returns.
Splits the string EXPR into a list of strings and returns the list in list context, or the size of the list in scalar context.
You're calling the function in scalar context (because you're assigning the result of the call to a scalar variable) so you're getting the size of the list.
If you want to get the list, then you need to store it either in a list of variables:
my ($forename, $middlename, $surname) = split ' ', $name;
Or (more usually) in an array:
my @name_parts = split ' ', $name;
But actually, you say that you want an array reference. You can do that by calling split() inside an anonymous array constructor ([ ... ]) and assigning the result of that call to a scalar variable.
my $name_parts = [ split ' ', $name ];
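Putting the variants next to each other may help. This sketch reuses $name from the question; the other variable names are only illustrative.

```perl
use strict;
use warnings;

my $name = 'hans georg mustermann';

my $count      = split ' ', $name;       # scalar context: number of fields
my ($forename) = split ' ', $name;       # list context: first field only
my @name_parts = split ' ', $name;       # list context: all fields
my $parts_ref  = [ split ' ', $name ];   # array reference holding all fields

print "$count\n";              # 3
print "$forename\n";           # hans
print "$parts_ref->[2]\n";     # mustermann
```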

Is it possible to use hash, array, scalar with the same name in Perl?

I'm converting perl script to python.
I haven't used perl language before so many things in perl confused me.
For example, below, opt was declared as a scalar first but declared again as a hash. %opt = ();
Is it possible to declare a scalar and a hash with the same name in perl?
As far as I know, foreach $opt (@opts) means that the scalar opt gets the values of the array opts one by one. Is opt an array at this point?
In addition, what does $opt{$opt} mean?
opt outside $opt{$opt} is a hash and opt inside $opt{$opt} is a scalar?
I'm so confused, please help me ...
sub ParseOptions
{
    local (@optval) = @_;
    local ($opt, @opts, %valFollows, @newargs);
    while (@optval) {
        $opt = shift(@optval);
        push(@opts,$opt);
        $valFollows{$opt} = shift(@optval);
    }
    @optArgs = ();
    %opt = ();
    arg: while (defined($arg = shift(@ARGV))) {
        foreach $opt (@opts) {
            if ($arg eq $opt) {
                push(@optArgs, $arg);
                if ($valFollows{$opt}) {
                    if (@ARGV == 0) {
                        &Usage();
                    }
                    $opt{$opt} = shift(@ARGV);
                    push(@optArgs, $opt{$opt});
                } else {
                    $opt{$opt} = 1;
                }
                next arg;
            }
        }
        push(@newargs,$arg);
    }
    @ARGV = @newargs;
}
In Perl the types SCALAR, ARRAY, and HASH are distinguished by the leading sigil in the variable name, $, @, and % respectively, but otherwise they may bear the same name. So $var and @var and %var are, other than belonging to the same *var typeglob, completely distinct variables.
A key for a hash must be a scalar, let's call it $key. A value in a hash must also be a scalar, and the one corresponding to $key in the hash %h is $h{$key}, the leading $ indicating that this is now a scalar. Much like an element of an array is a scalar, thus $ary[$index].
In foreach $var (@ary) the scalar $var does not really "get values" of array elements but rather aliases them. So if you change it in the loop you change the element of the array.
(Note, you have the array @opts, not @opt.)
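A small sketch of both points, with variable contents invented for illustration: three unrelated variables sharing the name opt, and foreach aliasing the array elements.

```perl
use strict;
use warnings;

my $opt = '-v';              # scalar
my @opt = ('-a', '-b');      # array, unrelated to $opt
my %opt = ('-v' => 1);       # hash, unrelated to both

# $opt{...} is an element of %opt; $opt[...] is an element of @opt.
print $opt{$opt}, "\n";      # 1
print $opt[1], "\n";         # -b

# foreach aliases each element, so changing the loop variable changes the array.
my @nums = (1, 2, 3);
foreach my $n (@nums) {
    $n *= 10;
}
print "@nums\n";             # 10 20 30
```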
A few comments on the code, in the light of common practices in modern Perl:
One must always have use warnings; on top. The other must is use strict;, as it enforces declaration of variables with my (or our), promoting all kinds of good behavior. This restricts the scope of lexical (my) variables to the nearest block enclosing the declaration.
A declaration may be done inside loops as well, allowing while (my $var = EXPR) {} and foreach my $var (LIST) {}, and then $var is scoped to the loop's block (and is also usable in the rest of the condition).
The local is a different beast altogether, and in this code those should be my instead.
Loop labels like arg: are commonly typed in block capitals.
The & in front of a function name has rather particular meanings† and is rarely needed. It is almost certainly not needed in this code.
Assignment of an empty list like my @optArgs = () when the variable is declared (with my) is unneeded and has no effect (with a global @name it may be needed to clear it from higher scope).
In this code there is no need to introduce variables first since they are global and thus created the first time they are used, much like in Python, unless they are used outside of this sub, in which case they may need to be cleared. That's the thing with globals, they radiate throughout all code.
Except for the last two these are unrelated to your Python translation task, for this script.
† It suppresses prototypes; informs interpreter (at runtime) that it's a subroutine and not a "bareword", if it hasn't been already defined (can do that with () after the name); ensures that the user-defined sub is called, if there is a builtin of the same name; passes the caller's @_ (when without parens, &name); in goto &name and defined ...
See, for example, this post and this post
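To make the comments above concrete, here is one way the option parser might look with strict, lexical variables and an upper-case loop label. It is a sketch under assumptions, not a drop-in replacement: the name parse_options, its calling convention (an array ref of arguments plus a hash ref of option => takes-a-value flags) and the die in place of Usage() are all invented for this example, and it returns its results instead of setting globals.

```perl
use strict;
use warnings;

# Hypothetical rewrite. $args is an array ref of command-line arguments;
# $takes_value maps each known option to true if a value follows it.
# Returns (\%opt, \@remaining_args).
sub parse_options {
    my ($args, $takes_value) = @_;
    my (%opt, @rest);
    my @queue = @$args;    # work on a copy, leave the caller's array alone

    ARG: while (defined(my $arg = shift @queue)) {
        if (exists $takes_value->{$arg}) {
            if ($takes_value->{$arg}) {
                die "Option $arg needs a value\n" unless @queue;
                $opt{$arg} = shift @queue;
            }
            else {
                $opt{$arg} = 1;
            }
            next ARG;
        }
        push @rest, $arg;    # not a known option, keep it as a plain argument
    }
    return (\%opt, \@rest);
}

my ($opt, $rest) = parse_options(
    [ '-v', '-o', 'out.txt', 'input.txt' ],
    { '-v' => 0, '-o' => 1 },
);
print "$opt->{'-o'} @$rest\n";   # out.txt input.txt
```

Returning references keeps the sub free of globals, which is also closer to how the same logic would be written in Python.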
First, to make it clear: in Perl, as in shell, you can surround variable names with curly brackets {}:
${bimbo} and $bimbo are the same scalar variable.
@bimbo is an array;
%bimbo is a hash;
$bimbo is a scalar (a single value).
To address a single array or hash value, you have to use the '$':
$bimbo{'index'} is the 'index' value of hash %bimbo.
If $i contains an int, such as 1 for instance,
$bimbo[$i] is the second value of the array @bimbo.
So, as you can see, @ or % ALWAYS refers to the whole array or hash, while $bimbo{} or $bimbo[] refers to a single value of the hash or array.
${bimbo[4]} refers to the 5th value of the array @bimbo; ${bimbo{'index'}} refers to the 'index' value of hash %bimbo.
Yes, all those structures can have the same name. This is one of the notable features of Perl syntax.
And, um... always think of Perl as a watered-down C++ syntax: it is simplified, but it is C.

= and , operators in Perl

Please explain this apparently inconsistent behaviour:
$a = b, c;
print $a; # this prints: b
$a = (b, c);
print $a; # this prints: c
The = operator has higher precedence than ,.
And the comma operator throws away its left argument and returns the right one.
Note that the comma operator behaves differently depending on context. From perldoc perlop:
Binary "," is the comma operator. In scalar context it evaluates its left argument, throws that value away, then evaluates its right argument and returns that value. This is just like C's comma operator.
In list context, it's just the list argument separator, and inserts both its arguments into the list. These arguments are also evaluated from left to right.
As eugene's answer seems to leave some questions for the OP, I'll try to explain based on it:
$a = "b", "c";
print $a;
Here the left argument is $a = "b": because = has higher precedence than , it is evaluated first. After that $a contains "b".
The right argument is "c", and it is what the comma operator returns, as I'll show shortly.
At that point, when you print $a, it obviously prints b to your screen.
$a = ("b", "c");
print $a;
Here the term ("b","c") is evaluated first because the parentheses group it. It returns "c", which is assigned to $a.
So here you print "c".
$var = ($a = "b","c");
print $var;
print $a;
Here $a contains "b" and $var contains "c".
Once you get the precedence rules, this is perfectly consistent.
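The three cases can be checked in one short script. This is a sketch using lexical variables $x, $y and $z (names chosen here) instead of $a, with the void-context warning switched off so the discarded constants don't produce noise.

```perl
use strict;
use warnings;
no warnings 'void';   # the discarded constants below would otherwise warn

my ($x, $y, $z);

$x = "b", "c";          # parsed as ($x = "b"), "c"
$y = ("b", "c");        # comma operator in scalar context yields "c"
$z = ($x = "b", "c");   # $x gets "b", $z gets "c"

print "$x $y $z\n";     # b c c
```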
Since eugene and mugen have answered this question nicely with good examples already, I am going to set up some concepts and then ask some conceptual questions of the OP to see if it helps to illuminate some Perl concepts.
The first concept is what the sigils $ and @ mean (we won't discuss % here). @ means multiple items (said "these things"). $ means one item (said "this thing"). To get the first element of an array @a you can do $first = $a[0]; to get the last element, $last = $a[-1]. N.B. not @a[0] or @a[-1]. You can slice by doing @shorter = @longer[1,2].
The second concept is the difference between void, scalar and list context. Perl has the concept of the context in which your containers (scalars, arrays etc.) are used. An easy way to see this is that if you store a list (we will get to this) as an array @array = ("cow", "sheep", "llama") and then store the array as a scalar $size = @array, we get the length of the array. We can also force this behavior by using the scalar operator, as in print scalar @array. I will say it one more time for clarity: an array (not a list) in scalar context will return not an element (as a list does) but rather the length of the array.
Remember from before you use the $ sigil when you only expect one item, i.e. $first = $a[0]. In this way you know you are in scalar context. Now when you call $length = @array you can see clearly that you are calling the array in scalar context, and thus you trigger the special property of an array in scalar context: you get its length.
This has another nice feature for testing if there are elements in the array: print '@array contains items' if @array; print '@array is empty' unless @array. The if/unless tests force scalar context on the array, thus the if sees the length of the array, not its elements. Since all numerical values are 'truthy' except zero, if the array has non-zero length, the statement if @array evaluates to true and you get the print statement.
Void context means that the return value of some operation is ignored. A useful operation in void context could be something like incrementing. $n = 1; $n++; print $n; In this example $n++ (increment after returning) was in void context in that its return value "1" wasn't used (stored, printed etc).
The third concept is the difference between a list and an array. A list is an ordered set of values, an array is a container that holds an ordered set of values. You can see the difference for example in the gymnastics one must do to get a particular element after using sort without storing the result first (try pop sort { $a cmp $b } @array for example, which doesn't work because pop does not act on a list, only an array).
Now we can ask, when you attempt your examples, what would you want Perl to do in these cases? As others have said, this depends on precedence.
In your first example, since the = operator has higher precedence than the ,, you haven't actually assigned a list to the variable, you have done something more like ($a = "b"), ("c") which effectively does nothing with the string "c". In fact it was called in void context. With warnings enabled, since this operation does not accomplish anything, Perl attempts to warn you that you probably didn't mean to do that with the message: Useless use of a constant in void context.
Now, what would you want Perl to do when you attempt to store a list to a scalar (or use a list in a scalar context)? It will not store the length of the list, this is only a behavior of an array. Therefore it must store one of the values in the list. While I know it is not canonically true, this example is very close to what happens.
my @animals = ("cow", "sheep", "llama");
my $return;
foreach my $animal (@animals) {
$return = $animal;
}
print $return;
And therefore you get the last element of the list (the canonical difference is that the preceding values were never stored then overwritten, however the logic is similar).
There are ways to store something that looks like a list in a scalar, but this involves references. Read more about that in perldoc perlreftut.
Hopefully this makes things a little more clear. Finally I will say, until you get the hang of Perl's precedence rules, it never hurts to put in explicit parentheses for lists and function arguments.
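The array-versus-list distinction can be seen in a few lines. The sketch below uses the animals from the example above; the list slice (...)[-1] is one common way to pick an element from the result of sort without storing it first.

```perl
use strict;
use warnings;

my @animals = ("cow", "sheep", "llama");

my $size        = @animals;                             # array in scalar context: its length
my ($first)     = @animals;                             # list assignment: first element
my $last_sorted = (sort { $a cmp $b } @animals)[-1];    # list slice on the sorted list

print "$size $first $last_sorted\n";   # 3 cow sheep
```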
There is an easy way to see how Perl handles both of the examples, just run them through with:
perl -MO=Deparse,-p -e'...'
As you can see, the difference is because the order of operations is slightly different than you might suspect.
perl -MO=Deparse,-p -e'$a = a, b;print $a'
(($a = 'a'), '???');
print($a);
perl -MO=Deparse,-p -e'$a = (a, b);print $a'
($a = ('???', 'b'));
print($a);
Note: you see '???', because the original value got optimized away.

How does @_ work in Perl subroutines?

I was always sure that if I pass a Perl subroutine a simple scalar, it can never change its value outside the subroutine. That is:
my $x = 100;
foo($x);
# without knowing anything about foo(), I'm sure $x still == 100
So if I want foo() to change x, I must pass it a reference to x.
Then I found out this is not the case:
sub foo {
$_[0] = 'CHANGED!';
}
my $x = 100;
foo($x);
print $x, "\n"; # prints 'CHANGED!'
And the same goes for array elements:
my @arr = (1,2,3);
print $arr[0], "\n"; # prints '1'
foo($arr[0]);
print $arr[0], "\n"; # prints 'CHANGED!'
That kinda surprised me. How does this work? Doesn't the subroutine only get the value of the argument? How does it know its address?
In Perl, the subroutine arguments stored in @_ are always aliases to the values at the call site. This aliasing only persists in @_; if you copy values out, that's what you get, values.
so in this sub:
sub example {
# @_ is an alias to the arguments
my ($x, $y, @rest) = @_; # $x $y and @rest contain copies of the values
my $args = \@_; # $args contains a reference to @_ which maintains aliases
}
Note that this aliasing happens after list expansion, so if you passed an array to example, the array expands in list context, and @_ is set to aliases of each element of the array (but the array itself is not available to example). If you wanted the latter, you would pass a reference to the array.
Aliasing of subroutine arguments is a very useful feature, but must be used with care. To prevent unintended modification of external variables, in Perl 6 you must specify that you want writable aliased arguments with is rw.
One of the lesser known but useful tricks is to use this aliasing feature to create array refs of aliases
my ($x, $y) = (1, 2);
my $alias = sub {\@_}->($x, $y);
$$alias[1]++; # $y is now 3
or aliased slices:
my $slice = sub {\@_}->(@somearray[3 .. 10]);
It also turns out that using sub {\@_}->(LIST) to create an array from a list is actually faster than [ LIST ] since Perl does not need to copy every value. Of course the downside (or upside depending on your perspective) is that the values remain aliased, so you can't change them without changing the originals.
As tchrist mentions in a comment to another answer, when you use any of Perl's aliasing constructs on @_, the $_ that they provide you is also an alias to the original subroutine arguments. Such as:
sub trim {s!^\s+!!, s!\s+$!! for @_} # in place trimming of white space
Lastly, all of this behavior is nestable, so when using @_ (or a slice of it) in the argument list of another subroutine, it also gets aliases to the first subroutine's arguments:
sub add_1 {$_[0] += 1}
sub add_2 {
add_1(@_) for 1 .. 2;
}
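A runnable sketch of these aliasing effects. The trim sub is written with / delimiters here purely for readability, and no_change is a name made up to show that copying out of @_ breaks the alias.

```perl
use strict;
use warnings;

sub trim  { s/^\s+//, s/\s+$// for @_ }   # in-place trimming through the aliases
sub add_1 { $_[0] += 1 }
sub add_2 { add_1(@_) for 1 .. 2 }        # @_ is passed on, still aliased

# Copying out of @_ gives a plain value, so the caller's variable is untouched.
sub no_change { my ($copy) = @_; $copy = 'CHANGED!'; return }

my $text = "  padded  ";
trim($text);

my $n = 5;
add_2($n);

my $x = 100;
no_change($x);

print "[$text] $n $x\n";   # [padded] 7 100
```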
This is all documented in detail in perldoc perlsub. For example:
Any arguments passed in show up in the array @_. Therefore, if you called a function with two arguments, those would be stored in $_[0] and $_[1]. The array @_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the corresponding argument is updated (or an error occurs if it is not updatable). If an argument is an array or hash element which did not exist when the function was called, that element is created only when (and if) it is modified or a reference to it is taken. (Some earlier versions of Perl created the element whether or not the element was assigned to.) Assigning to the whole array @_ removes that aliasing, and does not update any arguments.
Perl passes arguments by reference, not by value. See http://www.troubleshooters.com/codecorn/littperl/perlsub.htm

What does '@_' do in Perl?

I was glancing through some code I had written in my Perl class and I noticed this.
my ($string) = @_;
my @stringarray = split(//, $string);
I am wondering two things:
The first line, where the variable is in parentheses: is this something you do when declaring more than one variable, and would it still work if I removed them?
The second question is: what does @_ do?
The @_ variable is an array that contains all the parameters passed into a subroutine.
The parentheses around the $string variable are absolutely necessary. They designate that you are assigning variables from an array. Without them, the @_ array is assigned to $string in a scalar context, which means that $string would be equal to the number of parameters passed into the subroutine. For example:
sub foo {
my $bar = @_;
print $bar;
}
foo('bar');
The output here is 1--definitely not what you are expecting in this case.
Alternatively, you could assign the $string variable using the shift function instead of assigning from the @_ array:
sub foo {
my $bar = shift;
print $bar;
}
Using one method over the other is quite a matter of taste. I asked this very question which you can check out if you are interested.
When you encounter a special (or punctuation) variable in Perl, check out the perlvar documentation. It lists them all, gives you an English equivalent, and tells you what it does.
Perl has two main contexts, scalar context and list context. An array such as @_, if used in scalar context, returns the size of the array.
So given these two examples, the first one gives you the size of the @_ array, and the other gives you the first element.
my $string = @_ ;
my ($string) = @_ ;
Perl has three 'default' variables: $_, @_, and, depending on who you ask, %_. Many operations will use these variables if you don't give them a variable to work on. The only exception is that no operation currently uses %_ by default.
For example, pop and shift take an array as their parameter (push and unshift always need an explicit one).
If you don't give them a parameter, they will use the 'default' array instead. So inside a subroutine 'shift;' is the same as 'shift @_;'
The way that subroutines were designed, you couldn't formally tell the compiler which values you wanted in which variables. So it made sense to just use the 'default' array variable '@_' to hold the arguments.
So these three subroutines are (nearly) identical.
sub myjoin{
my ( $stringl, $stringr ) = @_;
return "$stringl$stringr";
}
sub myjoin{
my $stringl = shift;
my $stringr = shift;
return "$stringl$stringr";
}
sub myjoin{
my $stringl = shift @_;
my $stringr = shift @_;
return "$stringl$stringr";
}
I think the first one is slightly faster than the other two, because you aren't modifying the @_ variable.
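The "(nearly)" matters when a sub goes on to use @_ afterwards: list assignment leaves @_ intact, while shift consumes it. A small sketch, with sub names invented for illustration:

```perl
use strict;
use warnings;

# Each sub reads two arguments, then reports how many are still in @_.
sub left_after_copy  { my ($l, $r) = @_;              return scalar @_; }
sub left_after_shift { my $l = shift; my $r = shift;  return scalar @_; }

print left_after_copy('a', 'b', 'c'), "\n";    # 3
print left_after_shift('a', 'b', 'c'), "\n";   # 1
```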
The variable @_ is an array (hence the @ prefix) that holds all of the parameters to the current function.