Assigning shift to a variable - perl

Other thread that I read: What does assigning 'shift' to a variable mean?
I also used perldoc -f shift:
shift ARRAY
shift
Shifts the first value of the array off and returns it, shortening the array by 1 and moving everything down. If there are no elements in the array, returns the undefined value. If ARRAY is omitted, shifts the #_ array within the lexical scope of subroutines and formats, and the #ARGV array outside of a subroutine and also within the lexical scopes established by the eval STRING, BEGIN {}, INIT {}, CHECK {}, UNITCHECK {} and END {} constructs.
See also unshift, push, and pop. shift and unshift do the same thing to the left end of an array that pop and push do to the right end.
I understand outside of subroutines, the array is #ARGV, and inside arguments are passed through #_
I've read countless tutorials on how to use the shift function, but it's always about arrays, and how it removes the first element at the beginning of the array and returns it. But I see sometimes
$eg = shift;
~do more things here~
It seems as if nothing is making sense to me, and I feel like I can't continue reading more until I understand how this works as it's a "basic building block" to the language.
I'm not quite sure if an example of code is needed, as I believe the same principles apply to all programs that use shift. But if wrong I can provide some examples of code.

It depends on your context.
In all cases, shift removes the element at list index 0 and returns it, shifting all remaining elements down one.
Inside a sub, shift without an argument (bare shift) operates on the #_ list of subroutine parameters.
Suppose I call mysub('me', '22')
sub mysub {
my $self = shift; # return $_[0], moves everything else down, $self = 'me', #_ = ( '22' )
my $arg1 = shift; # returns $_[0], moves everything down, $arg1 = '22', #_ = ()
}
Outside a sub, it operates on the #ARGV list of command line parameters.
Given an argument, shift operates on that list.
my #list1 = ( 1,2,3,4,5 );
my $first = shift #list1; # $first = 1, #list1 = (2,3,4,5)

It removes the first element from an array and returns it.
If #_ contains the elements ("foo","bar",123) then the statement
$eg = shift; # same as $eg = shift #_
Assigns the value "foo" to the variable $eg, and leaves #_ containing the elements ("bar",123).
This is very much a basic building block of the language. You will frequently see this construction at the beginning of subroutines as it is one of the ways to copy the arguments to the subroutine (#_) into other variables.
sub func {
my $x = shift; # put first arg into $x
my $y = shift; # put second arg into $y
my #z = #_; # put remaining args into #z
...
}
$r = func(1,2,"foo");

Related

perl assign reference to subroutine

I use #_ in a subroutine to get a parameter which is assigned as a reference of an array, but the result dose not showing as an array reference.
My code is down below.
my #aar = (9,8,7,6,5);
my $ref = \#aar;
AAR($ref);
sub AAR {
my $ref = #_;
print "ref = $ref";
}
This will print 1 , not an array reference , but if I replace #_ with shift , the print result will be a reference.
can anyone explain why I can't get a reference using #_ to me ?
This is about context in Perl. It is a crucial aspect of the language.
An expression like
my $var = #ary;
attempts to assign an array to a scalar.
That doesn't make sense as it stands and what happens is that the right-hand side is evaluated to the number of elements of the array and that is assigned to $var.
In order to change that behavior you need to provide the "list context" to the assignment operator.† In this case you'd do
my ($var) = #ary;
and now we have an assignment of a list (of array elements) to a list (of variables, here only $var), where they are assigned one for one. So here the first element of #ary is assigned to $var. Please note that this statement plays loose with the elusive notion of the "list."
So in your case you want
my ($ref) = #_;
and the first element from #_ is assigned to $ref, as needed.
Alternatively, you can remove and return the first element of #_ using shift, in which case the scalar-context assignment is fine
my $ref = shift #_;
In this case you can also do
my $ref = shift;
since shift by default works on #_.
This is useful when you want to remove the first element of input as it's being assigned so that the remaining #_ is well suited for further processing. It is often done in object-oriented code.
It is well worth pointing out that many operators and builtin facilities in Perl act differently depending on what context they are invoked in.
For some specifics, just a few examples: the regex match operator returns true/false (1/empty string) in scalar context but the actual matches in list context,‡ readdir returns a single entry in scalar context but all of them in list context, while localtime shows a bit more distinct difference. This context-sensitive behavior is in every corner of Perl.
User level subroutines can be made to behave that way via wantarray.
†
See Scalar vs List Assignment Operator
for a detailed discussion
‡
See it in perlretut and in perlop for instance
When you assign an array to a scalar, you're getting the size of the array. You pass one argument (a reference to an array) to AAR, that's why you get 1.
To get the actual parameters, place the local variable in braces:
sub AAR {
my ($ref) = #_;
print "ref = $ref\n";
}
This prints something like ref = ARRAY(0x5566c89a4710).
You can then use the reference to access the array elements like this:
print join(", ", #{$ref});

Dereferencing in case of $_[0], $_[1] ..... so on

please see the below code:
$scalar = 10;
subroutine(\$scalar);
sub subroutine {
my $subroutine_scalar = ${$_[0]}; #note you need the {} brackets, or this doesn't work!
print "$subroutine_scalar\n";
}
In the code above you can see the comment written "note you need the {} brackets, or this doesn't work!" . Please explain the reason that why we cant use the same statement as:
my $subroutine_scalar = $$_[0];
i.e. without using the curly brackets.
Many people have already given correct answers here. I wanted to add an example I found illuminating. You can read the documentation in perldoc perlref for more information.
Your problem is one of ambiguity, you have two operations $$ and [0] working on the same identifier _, and the result depends on which operation is performed first. We can make it less ambiguous by using the support curly braces ${ ... }. $$_[0] could (for a human anyway) possibly mean:
${$$_}[0] -- dereference the scalar $_, then take its first element.
${$_[0]} -- take element 0 of the array #_ and dereference it.
As you can see, these two cases refer to completely different variables, #_ and $_.
Of course, for Perl it is not ambiguous, we simply get the first option, since dereferencing is performed before key lookup. We need the support curly braces to override this dereferencing, and that is why your example does not "work" without support braces.
You might consider a slightly less confusing functionality for your subroutine. Instead of trying to do two things at once (get the argument and dereference it), you can do it in two stages:
sub foo {
my $n = shift;
print $$n;
}
Here, we take the first argument off #_ with shift, and then dereference it. Clean and simple.
Most often, you will not be using references to scalar variables, however. And in those cases, you can make use of the arrow operator ->
my #array = (1,2,3);
foo(\#array);
sub foo {
my $aref = shift;
print $aref->[0];
}
I find using the arrow operator to be preferable to the $$ syntax.
${ $x }[0] grabs the value of element 0 in the array referenced by $x.
${ $x[0] } grabs the value of scalar referenced by the element 0 of the array #x.
>perl -E"$x=['def']; #x=\'abc'; say ${ $x }[0];"
def
>perl -E"$x=['def']; #x=\'abc'; say ${ $x[0] };"
abc
$$x[0] is short for ${ $x }[0].
>perl -E"$x=['def']; #x=\'abc'; say $$x[0];"
def
my $subroutine_scalar = $$_[0];
is same as
my $subroutine_scalar = $_->[0]; # $_ is array reference
On the other hand,
my $subroutine_scalar = ${$_[0]};
dereferences scalar ref for first element of #_ array, and can be written as
my ($sref) = #_;
my $subroutine_scalar = ${$sref}; # or $$sref for short
Because $$_[0] means ${$_}[0].
Consider these two pieces of code which both print 10:
sub subroutine1 {
my $scalar = 10;
my $ref_scalar = \$scalar;
my #array = ($ref_scalar);
my $subroutine_scalar = ${$array[0]};
print "$subroutine_scalar\n";
}
sub subroutine2 {
my #array = (10);
my $ref_array = \#array;
my $subroutine_scalar = $$ref_array[0];
print "$subroutine_scalar\n";
}
In subroutine1, #array is an array containing the reference of $scalar. So the first step is to get the first element by $array[0], and then deference it.
While in subroutine2, #array is an array containing an scalar 10, and $ref_array is its reference. So the first step is to get the array by $ref_array, and then index the array.

Can anybody please explain (my $self = shift) in Perl

I'm having a really hard time understanding the intersection of OO Perl and my $self = shift; The documentation on these individual elements is great, but none of them that I've found touch on how they work together.
I've been using Moose to make modules with attributes, and of course, it's useful to reference a module's attribute within said module. I've been told over and over again to use my $self = shift; within a subroutine to assign the module's attributes to that variable. This makes sense and works, but when I'm also passing arguments to the subroutine, this process clearly takes the first element of the #ARGV array and assigns it to $self as well.
Can someone offer an explanation of how I can use shift to gain internal access to a module's attributes, while also passing in arguments in the #ARGV array?
First off, a subroutine isn't passed the #ARGV array. Rather all the parameters passed to a subroutine are flattened into a single list represented by #_ inside the subroutine. The #ARGV array is available at the top-level of your script, containing the command line arguments passed to you script.
Now, in Perl, when you call a method on an object, the object is implicitly passed as a parameter to the method.
If you ignore inheritance,
$obj->doCoolStuff($a, $b);
is equivalent to
doCoolStuff($obj, $a, $b);
Which means the contents of #_ in the method doCoolStuff would be:
#_ = ($obj, $a, $b);
Now, the shift builtin function, without any parameters, shifts an element out of the default array variable #_. In this case, that would be $obj.
So when you do $self = shift, you are effectively saying $self = $obj.
I also hope this explains how to pass other parameters to a method via the -> notation. Continuing the example I've stated above, this would be like:
sub doCoolStuff {
# Remember #_ = ($obj, $a, $b)
my $self = shift;
my ($a, $b) = #_;
Additionally, while Moose is a great object layer for Perl, it doesn't take away from the requirement that you need to initialize the $self yourself in each method. Always remember this. While language like C++ and Java initialize the object reference this implicitly, in Perl you need to do it explicitly for every method you write.
In top level-code, shift() is short for shift(#ARGV). #ARGV contains the command-line arguments.
In a sub, shift() is short for shift(#_). #_ contains the sub's arguments.
So my $self = shift; is grabbing the sub's first argument. When calling a method, the invocant (what's left of the ->) is passed as the first parameter. In other words,
$o->method(#a)
is similar to
my $sub = $o->can('method');
$sub->($o, #a);
In that example, my $self = shift; will assign $o to $self.
If you call:
$myinstance->myMethod("my_parameter");
is the same that doing:
myMethod($myinstance, "my_parameter");
but if you do:
myMethod("my_parameter");
only "my_parameter" wil be passed.
THEN if inside myMethod always you do :
$self = shift #_;
$self will be the object reference when myMethod id called from an object context
but will be "my_parameter" when called from another method inside on a procedural way.
Be aware of this;

How does #_ work in Perl subroutines?

I was always sure that if I pass a Perl subroutine a simple scalar, it can never change its value outside the subroutine. That is:
my $x = 100;
foo($x);
# without knowing anything about foo(), I'm sure $x still == 100
So if I want foo() to change x, I must pass it a reference to x.
Then I found out this is not the case:
sub foo {
$_[0] = 'CHANGED!';
}
my $x = 100;
foo($x);
print $x, "\n"; # prints 'CHANGED!'
And the same goes for array elements:
my #arr = (1,2,3);
print $arr[0], "\n"; # prints '1'
foo($arr[0]);
print $arr[0], "\n"; # prints 'CHANGED!'
That kinda surprised me. How does this work? Isn't the subroutine only gets the value of the argument? How does it know its address?
In Perl, the subroutine arguments stored in #_ are always aliases to the values at the call site. This aliasing only persists in #_, if you copy values out, that's what you get, values.
so in this sub:
sub example {
# #_ is an alias to the arguments
my ($x, $y, #rest) = #_; # $x $y and #rest contain copies of the values
my $args = \#_; # $args contains a reference to #_ which maintains aliases
}
Note that this aliasing happens after list expansion, so if you passed an array to example, the array expands in list context, and #_ is set to aliases of each element of the array (but the array itself is not available to example). If you wanted the latter, you would pass a reference to the array.
Aliasing of subroutine arguments is a very useful feature, but must be used with care. To prevent unintended modification of external variables, in Perl 6 you must specify that you want writable aliased arguments with is rw.
One of the lesser known but useful tricks is to use this aliasing feature to create array refs of aliases
my ($x, $y) = (1, 2);
my $alias = sub {\#_}->($x, $y);
$$alias[1]++; # $y is now 3
or aliased slices:
my $slice = sub {\#_}->(#somearray[3 .. 10]);
it also turns out that using sub {\#_}->(LIST) to create an array from a list is actually faster than [ LIST ] since Perl does not need to copy every value. Of course the downside (or upside depending on your perspective) is that the values remain aliased, so you can't change them without changing the originals.
As tchrist mentions in a comment to another answer, when you use any of Perl's aliasing constructs on #_, the $_ that they provide you is also an alias to the original subroutine arguments. Such as:
sub trim {s!^\s+!!, s!\s+$!! for #_} # in place trimming of white space
Lastly all of this behavior is nestable, so when using #_ (or a slice of it) in the argument list of another subroutine, it also gets aliases to the first subroutine's arguments:
sub add_1 {$_[0] += 1}
sub add_2 {
add_1(#_) for 1 .. 2;
}
This is all documented in detail in perldoc perlsub. For example:
Any arguments passed in show up in the array #_. Therefore, if you called a function with two arguments, those would be stored in $_[0] and $_[1]. The
array #_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the
corresponding argument is updated (or an error occurs if it is not updatable). If an argument is an array or hash element which did not exist when the
function was called, that element is created only when (and if) it is modified or a reference to it is taken. (Some earlier versions of Perl created the
element whether or not the element was assigned to.) Assigning to the whole array #_ removes that aliasing, and does not update any arguments.
Perl passes arguments by reference, not by value. See http://www.troubleshooters.com/codecorn/littperl/perlsub.htm

What does '#_' do in Perl?

I was glancing through some code I had written in my Perl class and I noticed this.
my ($string) = #_;
my #stringarray = split(//, $string);
I am wondering two things:
The first line where the variable is in parenthesis, this is something you do when declaring more than one variable and if I removed them it would still work right?
The second question would be what does the #_ do?
The #_ variable is an array that contains all the parameters passed into a subroutine.
The parentheses around the $string variable are absolutely necessary. They designate that you are assigning variables from an array. Without them, the #_ array is assigned to $string in a scalar context, which means that $string would be equal to the number of parameters passed into the subroutine. For example:
sub foo {
my $bar = #_;
print $bar;
}
foo('bar');
The output here is 1--definitely not what you are expecting in this case.
Alternatively, you could assign the $string variable without using the #_ array and using the shift function instead:
sub foo {
my $bar = shift;
print $bar;
}
Using one method over the other is quite a matter of taste. I asked this very question which you can check out if you are interested.
When you encounter a special (or punctuation) variable in Perl, check out the perlvar documentation. It lists them all, gives you an English equivalent, and tells you what it does.
Perl has two different contexts, scalar context, and list context. An array '#_', if used in scalar context returns the size of the array.
So given these two examples, the first one gives you the size of the #_ array, and the other gives you the first element.
my $string = #_ ;
my ($string) = #_ ;
Perl has three 'Default' variables $_, #_, and depending on who you ask %_. Many operations will use these variables, if you don't give them a variable to work on. The only exception is there is no operation that currently will by default use %_.
For example we have push, pop, shift, and unshift, that all will accept an array as the first parameter.
If you don't give them a parameter, they will use the 'default' variable instead. So 'shift;' is the same as 'shift #_;'
The way that subroutines were designed, you couldn't formally tell the compiler which values you wanted in which variables. Well it made sense to just use the 'default' array variable '#_' to hold the arguments.
So these three subroutines are (nearly) identical.
sub myjoin{
my ( $stringl, $stringr ) = #_;
return "$stringl$stringr";
}
sub myjoin{
my $stringl = shift;
my $stringr = shift;
return "$stringl$stringr";
}
sub myjoin{
my $stringl = shift #_;
my $stringr = shift #_;
return "$stringl$stringr";
}
I think the first one is slightly faster than the other two, because you aren't modifying the #_ variable.
The variable #_ is an array (hence the # prefix) that holds all of the parameters to the current function.