Order of `shift` execution in Perl equation - perl

For a demonstration of my question, consider the following Perl code:
use strict;
use warnings;
use Data::Dumper;
my @a = (1, 2);
my %h;
sub test {
    $h{shift @_} = shift;
}
&test(@a);
print Dumper(%h);
The output is the following:
$VAR1 = '2';
$VAR2 = 1;
Why does Perl execute the first shift from the right side of the assignment, and not from the left one?
Why is the output not the following?
$VAR1 = '1';
$VAR2 = 2;

In most languages, operand evaluation order is undefined or at least undocumented for most operators.[1] Perl is no exception.
Does f() + g() call f() or g() first? Well, that's undocumented and presumably undefined.
Now, it turns out that perl is currently very consistent. The binary arithmetic operators will always evaluate their left-hand side operand before their right-hand side operand (including **, which is right-associative), while the scalar assignment operator and list assignment operator evaluate their RHS operand before their LHS operand.
Notable exceptions include the comma operator in scalar context, and short-circuiting operators.
The comma operator in scalar context is documented to evaluate its LHS before its RHS, though no such guarantee is made when it's called in list context.
Short-circuiting operators —namely &&, ||, and, or and the conditional operator— must necessarily evaluate their LHS before any other operand.
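A minimal sketch of the order described above for scalar assignment, followed by a version of test that does not depend on it. The helper subs key_expr and value_expr and the name test_explicit are made up for illustration; the order printed is what current perls do, not a documented guarantee.

```perl
use strict;
use warnings;

my @order;    # records which operand was evaluated first

# Hypothetical helpers that log when they are called.
sub key_expr   { push @order, 'key';   return 'k' }
sub value_expr { push @order, 'value'; return 'v' }

my %seen;
$seen{ key_expr() } = value_expr();
print "evaluated: @order\n";    # current perls evaluate the RHS first

# Rewrite of test(): unpack @_ in a single list assignment, so the
# result no longer depends on operand evaluation order.
my %h;
sub test_explicit {
    my ($key, $value) = @_;
    $h{$key} = $value;
    return;
}
test_explicit(1, 2);
print "$_ => $h{$_}\n" for sort keys %h;
```

Unpacking the arguments by name costs one extra line but makes the intended key and value explicit to the reader.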

Related

Perl dereferencing in non-strict mode

In Perl, if I have:
no strict;
@ARY = (58, 90);
To operate on an element of the array, say the second one, I would write (possibly as part of a larger expression):
$ARY[1] # The most common way found in Perldoc's idioms.
Though, for some reason these also work:
@ARY[1]
@{ARY[1]}
All of them result in the same object:
print (\$ARY[1]);
print (\@ARY[1]);
print (\@{ARY[1]});
Output:
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
What are the syntax rules that enable these constructs? How far could one devise reliable program code with each of these constructs, or with a mix of them? How interchangeable are these expressions? (Always speaking of a non-strict context.)
To justify how I came to this question: I agree that "use strict" is the better practice, but I'm still interested in how non-strict expressions are built up.
In an attempt to find some help myself, I came across:
The notion that "no strict;" does not complain about undeclared variables and quirky syntax.
The prefix dereference having higher precedence than subindex [] (perldsc § "Caveat on precedence").
The clarification on when to use @ instead of $ (perldata § "Slices").
The lack of a description of "[]" (array subscript / slice) among Perl's operators (perlop), which leads me to think it is not an operator... (yet it has to be something else. But what?).
From what I have learned, none of these hints, put together, helps me better understand my issue.
Thanks in advance.
Quotation from perlfaq4:
What is the difference between $array[1] and @array[1]?
The difference is the sigil, that special character in front of the array name. The $ sigil means "exactly one item", while the @ sigil means "zero or more items". The $ gets you a single scalar, while the @ gets you a list.
Please see: What is the difference between $array[1] and @array[1]?
@ARY[1] is indeed a slice, in fact a slice of only one member. The difference is that it creates a list context:
@ar1[0] = qw( a b c ); # List context.
$ar2[0] = qw( a b c ); # Scalar context, the last value is returned.
print "<@ar1> <@ar2>\n";
Output:
<a> <c>
Besides using strict, turn warnings on, too. You'll get the following warning:
Scalar value @ar1[0] better written as $ar1[0]
In perlop, you can read that "Perl's prefix dereferencing operators are typed: $, @, %, and &." The standard syntax is SIGIL { ... }, but in the simple cases, the curly braces can be omitted.
See Can you use string as a HASH ref while "strict refs" in use? for some fun with no strict refs and its emulation under strict.
Extending choroba's answer, to check a particular context, you can use wantarray
sub context { return wantarray ? "LIST" : "SCALAR" }
print $ary1[0] = context(), "\n";
print @ary1[0] = context(), "\n";
Outputs:
SCALAR
LIST
Nothing you did requires no strict; other than to hide your error of doing
@ARY = (58, 90);
when you should have done
my @ARY = (58, 90);
The following returns a single element of the array. Since EXPR is to return a single index, it is evaluated in scalar context.
$array[EXPR]
e.g.
my @array = qw( a b c d );
my $index = 2;
my $ele = $array[$index]; # my $ele = 'c';
The following returns the elements identified by LIST. Since LIST is to return 0 or more elements, it must be evaluated in list context.
@array[LIST]
e.g.
my @array = qw( a b c d );
my @indexes = ( 1, 2 );
my @slice = @array[@indexes]; # my @slice = qw( b c );
\( $ARY[$index] ) # Returns a ref to the element returned by $ARY[$index]
\( @ARY[@indexes] ) # Returns refs to each element returned by @ARY[@indexes]
${foo} # Weird way of writing $foo. Useful in literals, e.g. "${foo}bar"
@{foo} # Weird way of writing @foo. Useful in literals, e.g. "@{foo}bar"
${foo}[...] # Weird way of writing $foo[...].
Most people don't even know you can use these outside of string literals.
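As a sketch of the element, slice, and reference forms discussed in this thread, written under strict and warnings. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @array = qw( a b c d );

my $element = $array[2];        # $ sigil: exactly one element
my @slice   = @array[1, 2];     # @ sigil: a list of elements (a slice)

my $elem_ref   = \$array[2];            # reference to a single element
my @slice_refs = \( @array[1, 2] );     # one reference per sliced element

# The references alias the array's own elements, so assigning through
# one of them changes the array.
${ $slice_refs[0] } = 'B';

print "element: $element\n";
print "slice:   @slice\n";
print "array:   @array\n";
```

The slice was copied before the assignment through the reference, so only the array itself shows the change.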

How does use of "my" keyword distinguish the results in perl?

I don't understand how the "my" keyword works here. This is my perl script.
$line = ' sdfaad(asdvfr)';
code1:
if ($tmp = $line =~ /(\(\s*[^)]+\))/ ) {
print $tmp;
}
Outputs:
1
code2:
if (my ($tmp) = $line =~ /(\(\s*[^)]+\))/ ) {
print $tmp;
}
Outputs:
(asdvfr)
Why are the two outputs different? Does it have to do with the use of my?
It is not my that makes the difference, but scalar versus list context. Parentheses around $tmp impose list context,
if (($tmp) = $line =~ /(\(\s*[^)]+\))/ ) # parentheses make the difference, not 'my'
while my only declares the variable as lexically scoped.
Perl has two different assignment operators; a list assignment operator and a scalar assignment operator. A list assignment gives its right operand list context, while a scalar assignment gives its right operand scalar context. A match operation returns differently depending on this context.
Which operator = is depends on what is on the left side; if it is an array, a hash, a slice, or a parenthesized expression, it is a list assignment; otherwise it is a scalar assignment.
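A short sketch of the two assignment operators applied to the same match, reusing the string and pattern from the question. The variable names $matched, $capture, and $captures are illustrative.

```perl
use strict;
use warnings;

my $line = ' sdfaad(asdvfr)';
my $re   = qr/(\(\s*[^)]+\))/;

# Scalar assignment: the match runs in scalar context and reports success.
my $matched = $line =~ $re;

# List assignment (parenthesized LHS): the match runs in list context
# and returns the captured groups.
my ($capture) = $line =~ $re;

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side, here the number of captures.
my $captures = () = $line =~ $re;

print "scalar context: $matched\n";
print "list context:   $capture\n";
print "capture count:  $captures\n";
```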

Understanding precedence when assigning and testing for definedness in Perl

When trying to assign a variable and test it for definedness in one operation in Perl, as would be useful for instance in an if's condition, it would seem natural to me to write:
if ( defined my $thing = $object->get_thing ) {
$thing->do_something;
}
As far as my understanding goes, defined has the precedence of a rightward list operator, which is lower than that of the assignment, therefore I would expect my code above to be equivalent to:
if ( defined ( my $thing = $object->get_thing ) ) {
$thing->do_something;
}
While the latter, parenthesised code does work, the former yields the following fatal error: "Can't modify defined operator in scalar assignment".
It's not a big deal having to add parentheses, but I would love to understand why the first version doesn't work, e.g. what kind of "thing" defined is and what is its precedence?
Named operators are divided into unary operators (operators that always take exactly one operand) and list operators (everything else)[1].
defined and my[2] are unary operators, which have much higher precedence than other named operators.
The same goes for subs, so I'll use them to demonstrate.
$ perl -MO=Deparse,-p -e'sub f :lvalue {} sub g :lvalue {} f g $x = 123;'
sub f : lvalue { }
sub g : lvalue { }
f(g(($x = 123)));
-e syntax OK
$ perl -MO=Deparse,-p -e'sub f($) :lvalue {} sub g($) :lvalue {} f g $x = 123;'
sub f ($) : lvalue { }
sub g ($) : lvalue { }
(f(g($x)) = 123);
-e syntax OK
But of course, defined is not an lvalue function, so finding it on the LHS of an assignment results in an error.
[1] and, or, not, xor, lt, le, gt, ge, eq, ne and cmp are not considered named operators.
[2] my is very unusual. Aside from having both a compile-time and run-time effect, its syntax varies depending on whether parens are used around its argument(s) or not. Without parens, it's a unary operator. With parens, it's a list operator.
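A minimal sketch of the parenthesized form from the question, which works because the assignment is then the operand of defined. The sub get_thing is a hypothetical stand-in for $object->get_thing, and @log only exists to make the result visible.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $object->get_thing: returns its argument.
sub get_thing {
    my ($value) = @_;
    return $value;
}

my @log;
for my $input ('widget', undef, 0) {
    # The parentheses make the whole assignment the operand of defined.
    if ( defined( my $thing = get_thing($input) ) ) {
        push @log, "defined: $thing";
    }
    else {
        push @log, 'undefined';
    }
}
print "$_\n" for @log;
```

Testing with defined rather than plain truth matters for values such as 0 or the empty string, which are defined but false.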

Change meaning of the operator "+" in perl

Currently "+" in perl means addition, in my project, we do string concatenation a lot. I know we can concatenate with the "." operator, like:
$x = $a . $b; #will concatenate string $a, and string $b
But "+" feels better. Wonder if there is a magic to make the following do concatenation.
$x = $a + $b;
Even better, make it check the operand types: if both variables ($a, $b) are numbers, then do "addition" in the usual sense, otherwise, do concatenation.
I know in C++, one can overload the operator. Hope there is something similar in perl.
Thanks.
Yes, Perl too offers operator overloading.
package UnintuitiveString;
use Scalar::Util qw/looks_like_number/;
use overload '+' => \&concat,
'.' => \&concat,
'""' => \&as_string;
# Additionally, the following operators *have* to be overridden
# I suggest you raise an exception if an implementation does not make sense
# - * / % ** << >> x
# <=> cmp
# & | ^ ~
# atan2 cos sin exp log sqrt int
# 0+ bool
# ~~
sub new {
    my ($class, $val) = @_;
    return bless \$val => $class;
}
sub concat {
    my ($self, $other, $swap) = @_;
    # check for append mode
    if (not defined $swap) {
        $$self .= "$other";
        return $self;
    }
    ($self, $other) = ($other, $self) if $swap;
    return UnintuitiveString->new("$self" . "$other");
}
sub as_string {
    my ($self) = @_;
    return $$self;
}
sub as_number {
    my ($self) = @_;
    return 0+$$self if looks_like_number $$self;
    return undef;
}
Now we can do weird stuff like:
my $foo = UnintuitiveString->new(4);
my $bar = UnintuitiveString->new(2);
print $foo + $bar, "\n"; # "42"
my ($num_x, $num_y) = map { $_->as_number } $foo, $bar;
print $num_x + $num_y, "\n"; # "6"
$foo += 6;
print $foo + "\n"; # "46"
But just because we can do such things does not at all mean that we should:
Perl already has a concatenation operator: the dot (.). It's perfectly fine to use that.
Operator overloading comes at a massive performance cost. What previously was a single opcode in perl's VM is now a series of method calls and intermediate copies.
Changing the meaning of your operators is extremely confusing for people who actually know Perl. I stumbled a few times with the test cases above, when I was surprised that $foo + 6 wouldn't produce 10.
Perl's scalars are not a number or a string, they are both at the same time and are interpreted as one or the other depending on their usage context. This is actually half-true, and the scalars have different representations. They could be a string (PV), an integer (IV), a float (NV). However, once a PV is used in a numerical context like addition, a numerical value is determined and saved alongside the string, and we get a PVIV or PVNV. The reverse is also true: when a number is used in a stringy context, the formatted string is saved alongside the number. The looks_like_number function mentioned above determines whether a given string could represent a valid number like "42" or "NaN". Because just using a scalar in some context can change the representation, checking that a given scalar is a PV does not guarantee that it was intended to be a string, and an IV does not guarantee that it was intended to be an integer.
Perl has two sets of operators for a very good reason: If the “type” of a scalar is fluid, we need another way to explicitly request certain behavior. E.g. Perl has numeric comparison operators < <= == != >= > <=> and stringy comparison operators lt le eq ne ge gt cmp which can behave very differently: 4 XXX 12 will be -1 for <=> (because 4 is numerically smaller than 12), but 1 for cmp (because 4 comes later than 1 in most collation orders).
Other languages suffer a lot from having operators coerce their operands to required types but not offering two sets of operators. E.g. in Java, + is overloaded to concat strings. However, this leads to a loss of commutativity and associativity. Given three values x, y, z which can be either strings or numbers, we get different results for:
x + y and y + x – string concatenation is not commutative, whereas numeric addition is.
(x + y) + z and x + (y + z) – the + is not associative as soon as one string enters the playing field. Consider x = 1, y = 2, z = "4". Then the first evaluation order leads to "34", whereas the second leads to "124".
In Java, this is not a problem, because the language is statically typed, and because there are very few coercions (autoboxing, autounboxing, widening conversions, and stringification in concatenation). However, JavaScript (which is dynamically typed and will perform conversions from strings to numbers for other operators) shows the exact same behavior. Oops.
Stop this madness. Now. Perl's set of operators (barring smartmatch) is one of the best designed parts of the language (and its type system one of the worst parts from a modern viewpoint). If you dislike Perl because its operators make sense, you are free to use PHP instead (which, by the way, also uses . for concatenation to avoid such issues) :P
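To make the point about the two operator sets concrete, here is a small sketch with plain scalars and no overloading. The variable names are illustrative.

```perl
use strict;
use warnings;

my ($x, $y) = ('4', '2');

my $sum    = $x + $y;    # numeric addition
my $joined = $x . $y;    # string concatenation

my $numeric = 4 <=> 12;  # -1: 4 is numerically smaller than 12
my $stringy = 4 cmp 12;  #  1: "4" sorts after "12" as a string

# The same choice decides how sort orders a list.
my @by_number = sort { $a <=> $b } (12, 4, 100);
my @by_string = sort { $a cmp $b } (12, 4, 100);

print "sum=$sum joined=$joined\n";
print "<=> gives $numeric, cmp gives $stringy\n";
print "numeric sort: @by_number\n";
print "string sort:  @by_string\n";
```

The operator, not the operand, picks the interpretation, which is why the same two scalars can be added and concatenated without any type checks.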

Perl5 = (equals) operator precedence

$a,$b,$c = 1,2,3;
print "$a, $b, $c\n";
returns
, , 1
So does = (equals) take higher precedence than the tuple construction - doing this?
$a,$b,($c=1),2,3;
Yes. There's a precedence table in perlop. Assignment operators are level 19, and comma is level 20. In general, Perl's operators have the same precedence as the corresponding C operators (for those operators that have a corresponding C operator).
If you meant ($a,$b,$c) = (1,2,3); you have to use the parens.
The comma operator as you used it (in scalar context) is not for tuple construction, it's for evaluating several expressions and returning the last one.
Perl does things differently depending on context, it decides what to do depending on if it's expecting a scalar value, a list, nothing at all, ... See perldoc perldata's section on Context for an introduction.
So, if you do:
perl -e '$a = (1 and 4,2,0); print"$a\n"'
You get 0, because 4,2,0 is evaluated in scalar context, and behaves like C's comma operator, evaluating expressions between commas and returning the result of the last one.
If you force 4,2,0 to be evaluated in list context:
perl -e '$a = (1 and @a=(4,2,0)); print"$a\n"'
You get 3, because assigning to an array forces list context (the additional parentheses are there to solve the precedence issue cjm mentioned), and the value of a list in scalar context (forced by being the RHS of an and in scalar context) is the number of elements it has (logical and in Perl returns the last expression evaluated, instead of a boolean value as in other programming languages).
So, as cjm said, you need to do:
($a,$b,$c) = (1,2,3);
to deal with precedence and force list context.
Notice the difference between:
$ perl -e '$a,$b,$c = (7,6,8); print "$a $b $c\n"'
8
The comma operator is evaluated in scalar context, and returns 8.
$ perl -e '($a,$b,$c) = (7,6,8); print "$a $b $c\n"'
7 6 8
The comma operator is evaluated in list context, and returns a list.
$ perl -e '$a,$b,$c = () = (7,6,8); print "$a $b $c\n"'
3
The comma operator is evaluated in list context, returning a list, then the assignment to $c forces scalar context, returning the number of elements in the list.
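The cases above can be summarized in a strict-safe sketch. It uses lexical variables with illustrative names instead of the package variables $a, $b, $c from the one-liners.

```perl
use strict;
use warnings;

# Parentheses on the left make this a list assignment, and they are
# needed because = binds more tightly than the comma.
my ($first, $second, $third) = (1, 2, 3);

# Assigning to an empty list and then to a scalar counts the elements
# on the right-hand side.
my $count = () = (7, 6, 8);

# An array evaluated in scalar context also gives its element count.
my @values = (7, 6, 8);
my $size   = @values;

print "$first $second $third\n";
print "count=$count size=$size\n";
```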