How does use of the "my" keyword distinguish the results in Perl?

I don't understand how the "my" keyword works here. This is my Perl script.
$line = ' sdfaad(asdvfr)';
code1:
if ($tmp = $line =~ /(\(\s*[^)]+\))/ ) {
print $tmp;
}
Outputs:
1
code2:
if (my ($tmp) = $line =~ /(\(\s*[^)]+\))/ ) {
print $tmp;
}
Outputs:
(asdvfr)
Why are the two outputs different? Does it have to do with the use of my?

It is not my that makes the difference, but scalar versus list context. The parentheses around $tmp impose list context,
if (($tmp) = $line =~ /(\(\s*[^)]+\))/ ) # the parentheses make the difference, not 'my'
while my only declares the variable as lexically scoped.

Perl has two different assignment operators; a list assignment operator and a scalar assignment operator. A list assignment gives its right operand list context, while a scalar assignment gives its right operand scalar context. A match operation returns differently depending on this context.
Which operator = is depends on what is on the left side; if it is an array, a hash, a slice, or a parenthesized expression, it is a list assignment; otherwise it is a scalar assignment.
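A minimal sketch of the two assignment operators applied to the same match, using the string from the question. The helper names match_scalar and match_list are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.

# Scalar assignment: the match runs in scalar context and yields true/false.
sub match_scalar {
    my ($line) = @_;
    my $tmp = $line =~ /(\(\s*[^)]+\))/;
    return $tmp;
}

# List assignment: the match runs in list context and yields the captures.
sub match_list {
    my ($line) = @_;
    my ($tmp) = $line =~ /(\(\s*[^)]+\))/;
    return $tmp;
}

my $line = ' sdfaad(asdvfr)';
print match_scalar($line), "\n";   # prints 1
print match_list($line), "\n";     # prints (asdvfr)
```

The only difference between the two subs is the pair of parentheses after my, which is what selects the list assignment operator.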

Related

Dereferencing in case of $_[0], $_[1] ..... so on

please see the below code:
$scalar = 10;
subroutine(\$scalar);
sub subroutine {
my $subroutine_scalar = ${$_[0]}; #note you need the {} brackets, or this doesn't work!
print "$subroutine_scalar\n";
}
The comment in the code above says "note you need the {} brackets, or this doesn't work!". Please explain why we can't write the same statement as:
my $subroutine_scalar = $$_[0];
i.e. without using the curly brackets.
Many people have already given correct answers here. I wanted to add an example I found illuminating. You can read the documentation in perldoc perlref for more information.
Your problem is one of ambiguity, you have two operations $$ and [0] working on the same identifier _, and the result depends on which operation is performed first. We can make it less ambiguous by using the support curly braces ${ ... }. $$_[0] could (for a human anyway) possibly mean:
${$_}[0] -- dereference the scalar $_, then take its first element.
${$_[0]} -- take element 0 of the array @_ and dereference it.
As you can see, these two cases refer to completely different variables, @_ and $_.
Of course, for Perl it is not ambiguous, we simply get the first option, since dereferencing is performed before key lookup. We need the support curly braces to override this dereferencing, and that is why your example does not "work" without support braces.
You might consider a slightly less confusing functionality for your subroutine. Instead of trying to do two things at once (get the argument and dereference it), you can do it in two stages:
sub foo {
my $n = shift;
print $$n;
}
Here, we take the first argument off @_ with shift, and then dereference it. Clean and simple.
Most often, you will not be using references to scalar variables, however. And in those cases, you can make use of the arrow operator ->
my @array = (1,2,3);
foo(\@array);
sub foo {
my $aref = shift;
print $aref->[0];
}
I find using the arrow operator to be preferable to the $$ syntax.
${ $x }[0] grabs the value of element 0 in the array referenced by $x.
${ $x[0] } grabs the value of the scalar referenced by element 0 of the array @x.
>perl -E"$x=['def']; @x=\'abc'; say ${ $x }[0];"
def
>perl -E"$x=['def']; @x=\'abc'; say ${ $x[0] };"
abc
$$x[0] is short for ${ $x }[0].
>perl -E"$x=['def']; @x=\'abc'; say $$x[0];"
def
my $subroutine_scalar = $$_[0];
is same as
my $subroutine_scalar = $_->[0]; # $_ is array reference
On the other hand,
my $subroutine_scalar = ${$_[0]};
dereferences the scalar reference held in the first element of the @_ array, and can be written as
my ($sref) = @_;
my $subroutine_scalar = ${$sref}; # or $$sref for short
Because $$_[0] means ${$_}[0].
Consider these two pieces of code which both print 10:
sub subroutine1 {
my $scalar = 10;
my $ref_scalar = \$scalar;
my @array = ($ref_scalar);
my $subroutine_scalar = ${$array[0]};
print "$subroutine_scalar\n";
}
sub subroutine2 {
my @array = (10);
my $ref_array = \@array;
my $subroutine_scalar = $$ref_array[0];
print "$subroutine_scalar\n";
}
In subroutine1, @array is an array containing a reference to $scalar. So the first step is to get the first element with $array[0], and then dereference it.
In subroutine2, by contrast, @array is an array containing the scalar 10, and $ref_array is a reference to it. So the first step is to get the array through $ref_array, and then index into it.
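To see the two parses side by side inside a subroutine, here is a small sketch. The sub names deref_first_arg and first_of_topic are invented for illustration; the second one deliberately copies its argument into $_ to show which variable $$_[0] really reads.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration; both receive their data through @_.

# ${ $_[0] }: take element 0 of @_, then dereference it as a scalar ref.
sub deref_first_arg {
    return ${ $_[0] };
}

# $$_[0] parses as ${ $_ }[0]: dereference $_ as an array ref, then index it.
# It never looks at @_ at all, so we have to load $_ ourselves.
sub first_of_topic {
    local $_ = $_[0];   # put the array reference into $_
    return $$_[0];
}

my $scalar = 10;
my @array  = (1, 2, 3);

print deref_first_arg(\$scalar), "\n";   # prints 10
print first_of_topic(\@array), "\n";     # prints 1
```

In practice $_->[0] says the same thing as $$_[0] and is easier to read.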

How to grok (and modify) this Perl statement

I am new to Perl.
How do I interpret this Perl statement?:
my( $foo, $bar ) = split /\s+/, $foobar, 2;
I know that local variables are being simultaneously assigned by the split function, but I don't understand what the integer 2 is for. I'm guessing the function will return an array with two elements?
Can a Perl monger explain the statement above to me (ELI5)?
Also, on occasion, the string being split does not contain the expected tokens, resulting in either foo or bar being uninitialized and thus causing a warning when an attempt is made to use them further on in the code.
How do I initialize $foo and $bar to sensible values (null strings) in case the split "fails" to return two strings?
The split function takes three arguments:
A regex that matches separators, or the special value " " (a string consisting of a single space), which trims the string, then splits at whitespace like /\s+/.
A string that shall be split.
A maximum number of resulting fragments. Sometimes this is an optimization when you aren't interested in all fields, and sometimes you don't want to split at each separator, as is the case here.
So your split expression will return at most two fields, but not necessarily exactly two. To give your variables default values, check whether they are undef after the split and substitute the default:
my ($foo, $bar) = split ...;
$_ //= '' for $foo, $bar;
The //= operator assigns the value on the RHS if the LHS is undef. The for loop is just a way to shorten the code.
Note that assigning the defaults before the split does not help, because a list assignment sets every left-hand variable that has no corresponding value to undef:
my ($foo, $bar) = ('', '');
($foo, $bar) = split ...; # $bar is undef again if split returns only one field
You may also want to carry on with a piece of code only when exactly two fields were produced:
if ( 2 == (my ($foo, $bar) = split ...) ) {
say "foo = $foo";
say "bar = $bar";
} else {
warn "could not split!";
}
List assignment in scalar context evaluates to the number of elements on its right-hand side, here the number of fields that split returned.
The 2 is the maximum number of components returned by split.
Thus, the regexp /\s+/ splits $foobar on clumps of whitespace, but will only split once, to make two components. If there is no whitespace, then $bar will be undefined.
See http://perldoc.perl.org/functions/split.html
In addition to amon's method, Perl has a defined(x) function that returns true or false depending on whether its argument x is defined or undefined, and this can be used in an if statement to correct cases where something is undefined.
See http://perldoc.perl.org/functions/defined.html
As stated here: http://perldoc.perl.org/functions/split.html
"If LIMIT is specified and positive, it represents the maximum number of fields into which the EXPR may be split;"
For example:
#!/opt/local/bin/perl
my $foobar = "A B C D";
my( $foo, $bar ) = split /\s+/, $foobar, 2;
print "\nfoo=$foo";
print "\nbar=$bar";
print "\n";
output:
foo=A
bar=B C D
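Putting the limit and the default handling together, a sketch might look like the following. The helper name split_two is hypothetical, and the //= operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into at most two fields and
# replace any missing field with the empty string.
sub split_two {
    my ($foobar) = @_;
    my ($foo, $bar) = split /\s+/, $foobar, 2;
    $_ //= '' for $foo, $bar;    # //= needs Perl 5.10 or later
    return ($foo, $bar);
}

my ($foo, $bar) = split_two('A B C D');
print "foo=$foo\nbar=$bar\n";      # foo=A and bar=B C D

($foo, $bar) = split_two('single');
print "foo=$foo\nbar=[$bar]\n";    # foo=single and bar=[]
```

Because the defaults are applied after the split, no "uninitialized value" warning is raised when the input has no whitespace.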

How exactly does Perl handle operator chaining?

So I have this bit of code that does not work:
print $userInput."\n" x $userInput2; # $userInput is a string and $userInput2 is an integer
It prints the string once if the number is over 0, but it does not repeat it when the number is greater than 1. I come from a Java background and I assumed that the concatenation happens first and that the x operator then repeats the result. But that is not what happens. It does work when I do the following:
$userInput .= "\n";
print $userInput x $userInput2;
I am new to Perl so I'd like to understand exactly what goes on with chaining, and if I can even do so.
You're asking about operator precedence. ("Chaining" usually refers to chaining of method calls, e.g. $obj->foo->bar->baz.)
The Perl documentation page perlop starts off with a list of all the operators in order of precedence level. x has the same precedence as other multiplication operators, and . has the same precedence as other addition operators, so of course x is evaluated first. (i.e., it "has higher precedence" or "binds more tightly".)
As in Java you can resolve this with parentheses:
print(($userInput . "\n") x $userInput2);
Note that you need two pairs of parentheses here. If you'd only used the inner parentheses, Perl would treat them as indicating the arguments to print, like this:
# THIS DOESN'T WORK
print($userInput . "\n") x $userInput2;
This would print the string once, then duplicate print's return value some number of times. Putting space before the ( doesn't help since whitespace is generally optional and ignored. In a way, this is another form of operator precedence: function calls bind more tightly than anything else.
If you really hate having more parentheses than strictly necessary, you can defeat Perl with the unary + operator:
print +($userInput . "\n") x $userInput2;
This separates the print from the (, so Perl knows the rest of the line is a single expression. Unary + has no effect whatsoever; its primary use is exactly this sort of situation.
This is due to precedence of . (concatenation) operator being less than the x operator. So it ends up with:
use strict;
use warnings;
my $userInput = "line";
my $userInput2 = 2;
print $userInput.("\n" x $userInput2);
And outputs:
line[newline]
[newline]
This is what you want:
print (($userInput."\n") x $userInput2);
This prints out:
line
line
As has already been mentioned, this is a precedence issue, in that the repetition operator x has higher precedence than the concatenation operator .. However, that is not all that's going on here, and also, the issue itself comes from a bad solution.
First off, when you say
print (($foo . "\n") x $count);
What you are doing is changing the context of the repetition operator to list context.
(LIST) x $count
The above statement really means this (if $count == 3):
print ( $foo . "\n", $foo . "\n", $foo . "\n" ); # list with 3 elements
From perldoc perlop:
Binary "x" is the repetition operator. In scalar context or if the left operand is not enclosed in parentheses, it returns a string consisting of the left operand repeated the number of times specified by the right operand. In list context, if the left operand is enclosed in parentheses or is a list formed by qw/STRING/, it repeats the list. If the right operand is zero or negative, it returns an empty string or an empty list, depending on the context.
The solution works as intended because print takes list arguments. However, if you had something else that takes scalar arguments, such as a subroutine:
foo(("text" . "\n") x 3);
sub foo {
# @_ is now the list ("text\n", "text\n", "text\n");
my ($string) = @_; # error enters here
# $string is now "text\n"
}
This is a subtle difference which might not always give the desired result.
A better solution for this particular case is to not use the concatenation operator at all, because it is redundant:
print "$foo\n" x $count;
Or even use more mundane methods:
for (1 .. $count) {
print "$foo\n";
}
Or
use feature 'say';
...
say $foo for 1 .. $count;
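The different parses discussed in this thread can be compared directly if the strings are returned instead of printed. This is only a sketch; the three helper names are invented for the comparison.

```perl
use strict;
use warnings;

# Hypothetical helpers that return the string instead of printing it,
# so the three parses can be compared directly.

# . binds less tightly than x, so only "\n" is repeated.
sub repeat_default {
    my ($s, $n) = @_;
    return $s . "\n" x $n;
}

# Parentheses force the concatenation first; the scalar assignment
# gives x scalar context, so the whole string is repeated.
sub repeat_grouped {
    my ($s, $n) = @_;
    my $out = ($s . "\n") x $n;
    return $out;
}

# Interpolation avoids the concatenation operator altogether.
sub repeat_interpolated {
    my ($s, $n) = @_;
    return "$s\n" x $n;
}

print repeat_default('line', 2);        # line followed by an empty line
print repeat_grouped('line', 2);        # line printed twice
print repeat_interpolated('line', 2);   # line printed twice
```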

How does Perl parse print " " . ($x, $y) ."\n"?

I know from Learning Perl, 6th Ed. (ISBN: 978-1-449-30358-7) p.58 that ($x, $y) = "something", "new"; is a list context. So why does the following code print " bee"? Please explain how the code is parsed.
$dina = bobba;
$ba = bee;
print " " . ($dina, $ba)."\n";
The concatenation operator . imposes scalar context on the list created by the comma operator, so it returns its last member.
The most relevant documentation quote is this paragraph from perlop(1):
Comma Operator
Binary "," is the comma operator. In scalar context it evaluates its
left argument, throws that value away, then evaluates its right
argument and returns that value. This is just like C's comma operator.
"($x, $y) = ("something", "new"); is a list context." makes no sense. (Added the missing parenthesis to avoid going off-topic.)
First, an expression is not a context; it is evaluated in a context.
Second, there's no way to know from what you posted in which context that expression will be evaluated, but chances are it's evaluated in void context.
You are probably referring to the subexpressions ($x, $y) and ("something", "new"). They are indeed evaluated in list context, and that's because the list assignment operator evaluates its operands in list context.
In your code, ($x, $y) is the operand of a concatenation operator (.). The concatenation operator combines two strings, so it expects strings as operands. Strings being scalars, the concatenation operator evaluates its operands in scalar context.
In scalar context,
$x, $y
is about the same as
do { $x; $y }
(without the additional scope). Each item of the list is evaluated in turn in void or scalar context, and the whole evaluates to what the last item in the list returned.
>perl -E"sub f { say 'f'; 3 } sub g { say 'g'; 4 } say ':'.(f,g);"
f
g
:4
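The same experiment can be written as a script that records the call order, which makes it easy to confirm that both operands run but only the last value survives. The names f, g, concat_last and calls_seen are illustrative only.

```perl
use strict;
use warnings;

my @calls;    # records the order in which the subs run

sub f { push @calls, 'f'; return 3 }
sub g { push @calls, 'g'; return 4 }

# The concatenation operator gives (f(), g()) scalar context, so the comma
# operator evaluates f(), discards its value, and yields the value of g().
# Hypothetical wrapper so the result can be inspected.
sub concat_last {
    @calls = ();
    return ':' . (f(), g());
}

# Hypothetical accessor for the recorded call order.
sub calls_seen { return "@calls" }

print concat_last(), "\n";    # prints :4
print calls_seen(), "\n";     # prints f g
```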

Can you explain the context dependent variable assignment in perl

The following is one of the many cool things that Perl can do
my ($tmp) = ($_=~ /^>(.*)/);
It finds the pattern ^>.* in the current line in a loop, and it stores what's in the parentheses in the $tmp variable.
What I am curious about is the concept behind this syntax. How and why (under what premises) does this work?
My understanding is that the snippet $_=~ /^>(.*)/ is in boolean context, but the parentheses render it as list context? But how come only what is in the parentheses of the matched pattern is stored in the variable?!
Is it some kind of special case of variable assignment that I have to "memorize", or can it be fully explained? If so, what is this feature called (a name like "autovivification")?
There are two assignment operators: list assignment and scalar assignment. The choice is determined based on the LHS of the "=". (The two operators are covered in detail in here.)
In this case, a list assignment operator is used. The list assignment operator evaluates both of its operands in list context.
So what does $_=~ /^>(.*)/ do in list context? Quote perlop:
If the /g option is not used, m// in list context returns a list consisting of the subexpressions matched by the parentheses in the pattern, i.e., ($1, $2, $3...) [...] When there are no parentheses in the pattern, the return value is the list (1) for success. With or without parentheses, an empty list is returned upon failure.
In other words,
my ($match) = $_ =~ /^>(.*)/;
is equivalent to
my $match;
if ($_ =~ /^>(.*)/) {
$match = $1;
} else {
$match = undef;
}
Were the parens omitted (my $tmp = ...;), a scalar assignment would be used instead. The scalar assignment operator evaluates both of its operands in scalar context.
So what does $_=~ /^>(.*)/ do in scalar context? Quote perlop:
returns true if it succeeds, false if it fails.
In other words,
my $matched = $_ =~ /^>(.*)/;
is equivalent to
my $matched;
if ($_ =~ /^>(.*)/) {
$matched = 1; # !!1 if you want to be picky.
} else {
$matched = 0; # !!0 if you want to be picky.
}
The parentheses in the search pattern make that a "group". What $_ =~ /regex/ returns in list context is a list of all the matching groups, so my ($tmp) grabs the first group into $tmp.
All operations in Perl have a return value, including assignment. That's why you can do $a=$b=1 and set $a to the result of $b=1.
You can use =~ in a boolean (well, scalar) context, where it returns true if the match succeeds and false if it fails. Calling it in list context returns a list instead, just as your own subs can be context-sensitive by using the wantarray function to determine the calling context.
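As a rough illustration of how a user-defined sub can behave like the match operator, the sketch below uses wantarray to pick its return value. The sub name match_like is hypothetical, and it only imitates m// for this one pattern.

```perl
use strict;
use warnings;

# Hypothetical sub that mimics m// for one fixed pattern: captures in
# list context, a true/false flag in scalar context.
sub match_like {
    my ($str) = @_;
    my @captures = $str =~ /^>(.*)/;
    return wantarray ? @captures : (@captures ? 1 : 0);
}

my ($name)  = match_like('>seq1');   # list assignment: $name is 'seq1'
my $matched = match_like('>seq1');   # scalar assignment: $matched is 1
print "$name $matched\n";            # prints seq1 1
```

The call site alone decides which branch runs: the parentheses around $name select the list assignment operator, which in turn calls the sub in list context.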