Perl increment or decrement, but not both?

$a++; # ok
$a--; # ok
--$a; # ok
++$a; # ok
--$a++; # syntax error
$a++--; # syntax error
($a++)--; # syntax error
--$a--; # syntax error
On some of these I can sort of see why, but in a case like --$a-- there is no ambiguity and no precedence conflict. I'm floored Larry didn't let me do that (and don't even get me started on the lack of a floor operator!).
Not that I would need or want to; I was just trying to understand more about how these operators work and ran into this rather surprising result.

In the Perldoc for auto increment/decrement we find:
"++" and "--" work as in C.
and slightly earlier on the same page
Perl operators have the following associativity and precedence, listed from highest precedence to lowest. Operators borrowed from C keep the same precedence relationship with each other, even where C's precedence is slightly screwy. (This makes learning Perl easier for C folks.)
Since C returns an rvalue in both cases, Perl does the same. Interestingly, C++ returns a reference to an lvalue for pre-increment/decrement thus having different semantics.

Consider the following:
length($x) = 123;
Just like ++(++$a), there is no ambiguity, there is no precedence conflict, and it would require absolutely no code to function. The limitation is completely artificial[1], which means code was added specifically to forbid it!
So why is length($x) = 123; disallowed? Because disallowing it allows us to catch errors with little or no downside.
length($x) = 123; # XXX Did you mean "length($x) == 123"?
How is it disallowed? Using a concept of lvalues. lvalues are values that are allowed to appear on the left of a scalar assignment.
Some operators are deemed to return lvalues.
$x = 123; # $x returns an lvalue
$#a = 123; # $#a returns an lvalue
substr($s,0,0) = "abc"; # substr returns an lvalue
Some arguments are expected to be an lvalue.
length($x) = 123; # XXX LHS of scalar assignment must be an lvalue
++length($x); # XXX Operand of pre/post-inc/dec must be an lvalue.
The pre/post-increment/decrement operators aren't flagged as returning an lvalue. Operators that expect an lvalue will not accept them.
++$a = 123; # XXX Did you mean "++$a == 123"?
This has the side effect of also preventing ++(++$a) which would work fine without the lvalue check.
$ perl -E' ++( ++$a); say $a;'
Can't modify preincrement (++) in preincrement (++) at -e line 1, near ");"
Execution of -e aborted due to compilation errors.
$ perl -E'sub lvalue :lvalue { $_[0] } ++lvalue(++$a); say $a;'
2
Changing ++$a to return an lvalue would allow ++(++$a) to work, but it would also allow ++$a = 123 to work. What's more likely? ++$a = 123 was intentional, or ++$a = 123 is a typo for ++$a == 123?
The following shows that length($x) = 123 would work without the lvalue syntax check.
$ perl -E' say length($x) = 123;'
Can't modify length in scalar assignment at -e line 1, near "123;"
Execution of -e aborted due to compilation errors.
$ perl -E'sub lvalue :lvalue { $_[0] } say lvalue(length($x)) = 123;'
123
The value you see printed is the value of the scalar returned by length after it was changed by the assignment.
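As a minimal sketch of the check described above (the variable name $n is just for illustration), a string eval can be used to observe the compile-time rejection without aborting the whole script. It also shows that assignment operators such as +=, which are flagged as returning an lvalue, can be chained in the way ++ cannot.

```perl
use strict;
use warnings;

my $n = 0;

# String eval defers compilation to run time, so the compile-time
# lvalue check can be observed without killing the script.
my $compiled = eval q{ ++(++$n); 1 };
my $error    = $@;

# Assignment operators do return lvalues, so this form is accepted.
($n += 1) += 1;

print "compiled: ", ($compiled ? "yes" : "no"), "\n";
print "error: $error";
print "n = $n\n";
```

The eval'd code never runs because it fails to compile, so only the += line changes $n.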

For example, what do you expect for:
$a = 1;
$b = --$a++; # imaginary syntax
I think it would be harder to explain that $b equals 0 and $a is 1, wouldn't it? In any case, I don't remember any real example where that syntax would be useful. It's useless and ugly.

Related

(4 + sub) not equals to (sub + 4)?

(edit) TL;DR: my problem was that I thought the Win32 API defines were true integer constants (as in the platform SDK headers), while the Win32 Perl wrapper defines them as subs. That caused me to misunderstand how the one-liner was parsed.
While testing a call to Win32::MsgBox in a one-liner, I was puzzled by the following. Given that the possible arguments for MsgBox are the message, a sum of flags choosing the kind of buttons (value 0..5) and the message box icon "constants" (MB_ICONSTOP, ...), and the title,
calling perl -MWin32 -e"Win32::MsgBox world, 4+MB_ICONQUESTION, hello" gives the expected result
while the similar-looking code perl -MWin32 -e"Win32::MsgBox world, MB_ICONQUESTION+4, hello" is wrong.
I first thought that it came from my lack of parentheses, but adding some, as in perl -MWin32 -e"Win32::MsgBox (world, MB_ICONQUESTION+4, hello)", gives exactly the same wrong result.
I tried with a colleague to dig deeper and display the parameters that are passed to a function call (as the MB_xxx constants are actually subs) with the following code
>perl -Mstrict -w -e"sub T{print $/,'called T(#'.join(',',@_).'#)'; 42 }; print $/,'results:', join ' ,', T(1), T+1, 1+T"
that outputs
called T(#1#)
called T(##)
called T(#1,43#)
results:42 ,42
but I can't understand why in the list passed to join() the args T+1, 1+T are parsed as T(1, 43)...
B::Deparse to the rescue:
C:>perl -MO=Deparse -MWin32 -e"Win32::MsgBox world, MB_ICONQUESTION+4, hello"
use Win32;
Win32::MsgBox('world', MB_ICONQUESTION(4, 'hello'));
-e syntax OK
C:>perl -MO=Deparse -MWin32 -e"Win32::MsgBox world, 4+MB_ICONQUESTION, hello"
use Win32;
Win32::MsgBox('world', 4 + MB_ICONQUESTION(), 'hello');
-e syntax OK
The MB_ICONQUESTION call in the first case is considered a function call with the arguments +4, 'hello'. In the second case, it is considered as a function call with no arguments, and having 4 added to it. It is not a constant, it seems, but a function.
In the source code we get this verified:
sub MB_ICONQUESTION { 0x00000020 }
It is a function that returns 32 (00100000 in binary, indicating a bit being set). Also as Sobrique points out, this is a flag variable, so you should not use addition, but the bitwise logical and/or operators.
In your case, it just accepts any arguments and ignores them. This is a bit confusing if you are expecting a constant.
In your experiment case, the statement
print $/,'results:', join ' ,', T(1), T+1, 1+T
is interpreted as
print $/,'results:', join ' ,', T(1), T(+1, (1+T))
Reading the expression from right to left, the values are
1+T = 43
T +1, 43 = 42
T(1) = 42
This is because binary + has higher precedence than the comma operator, and unary + higher still.
To disambiguate, you need to use parentheses to clarify precedence:
print $/,'results:', join ' ,', T(1), T()+1, 1+T
# ^^-- parentheses
As a general rule, one should always use parentheses with subroutine calls. In perldoc perlsub there are 4 calling notations:
NAME(LIST); # & is optional with parentheses.
NAME LIST; # Parentheses optional if predeclared/imported.
&NAME(LIST); # Circumvent prototypes.
&NAME; # Makes current @_ visible to called subroutine.
Of which in my opinion, only the first one is transparent, and the other ones a bit obscure.
This is all to do with how you're invoking T and how perl is interpreting the results.
If we deparse your example we get:
BEGIN { $^W = 1; }
sub T {
use strict;
print $/, 'called T(#' . join(',', @_) . '#)';
42;
}
use strict;
print $/, 'results:', join(' ,', T(1), T(1, 1 + T()));
This is clearly not what you've got in mind, but does explain why you get the result you do.
I would suggest that in your original example, rather than +, you may wish to consider using |, as it looks very much like MB_ICONQUESTION is intended to be a flag.
So:
use strict;
use warnings;
use Win32 qw( MB_ICONQUESTION );
print MB_ICONQUESTION;
Win32::MsgBox( "world", 4 | MB_ICONQUESTION , "hello" );
Or
use strict;
use warnings;
use Win32 qw( MB_ICONQUESTION );
print MB_ICONQUESTION;
Win32::MsgBox( "world", MB_ICONQUESTION | 4 , "hello" );
Produce the same result.
This is because of precedence when invoking subroutines without parentheses - you can do:
print "one", "two";
And both are treated as arguments to print. Perl assumes that arguments after a sub are to be passed to it.
+4 is enumerated as an argument, and passed to T.
sub test { print @_,"\n";};
test 1;
test +1;
If we deparse this, we see perl treats it as:
test 1;
test 1;
So ultimately - there is a bug in Win32 that you have found, that would be fixable by:
sub MB_ICONQUESTION() {0x00000020}
Win32::MsgBox "world", 4 + MB_ICONQUESTION, "hello";
Win32::MsgBox "world", MB_ICONQUESTION + 4, "hello";
Or perhaps:
use constant MB_ICONQUESTION => 0x00000020;
Or as noted - the workaround in your code - don't use + and instead use |, which is going to have the same result for bit flag operations, but because of operator precedence is never going to be passed into the subroutine. (Or of course, always specify the parentheses for your constants.)
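To make the difference concrete, here is a small sketch using two hypothetical stand-ins for the flag (flag_plain and flag_proto are invented names, not the Win32 module's actual code). It assumes both subs are defined before they are used, so the parser already knows about them.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the Win32 flag; not the module's real code.
sub flag_plain   { 0x20 }   # no prototype: parsed as a list operator
sub flag_proto() { 0x20 }   # empty prototype: takes no arguments

my @plain = (flag_plain+4, 'hello');    # flag_plain(+4, 'hello')
my @proto = (flag_proto+4, 'hello');    # (flag_proto() + 4, 'hello')
my @bitor = (flag_plain | 4, 'hello');  # | cannot begin an argument list

print scalar(@plain), " element(s): @plain\n";
print scalar(@proto), " element(s): @proto\n";
print scalar(@bitor), " element(s): @bitor\n";
```

Without the prototype, the sub swallows both +4 and 'hello' and the list collapses to a single element. With the empty prototype, or with |, the flag is combined with 4 and 'hello' stays a separate element.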

Perl dereferencing in non-strict mode

In Perl, if I have:
no strict;
@ARY = (58, 90);
To operate on an element of the array, say it, the 2nd one, I would write (possibly as part of a larger expression):
$ARY[1] # The most common way found in Perldoc's idioms.
Though, for some reason these also work:
@ARY[1]
@{ARY[1]}
Resulting all in the same object:
print (\$ARY[1]);
print (\@ARY[1]);
print (\@{ARY[1]});
Output:
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
What are the syntax rules that enable this sort of construct? How far could one write reliable program code with each of these constructs, or with a mix of them? How interchangeable are these expressions? (Always speaking of a non-strict context.)
To justify how I came to this question: I agree that "use strict" is the better practice, but I'm still interested in understanding how non-strict expressions are built up.
In an attempt to resolve this myself, I came across:
The notion that "no strict;" does not complain about undeclared variables and quirky syntax.
The prefix dereference having higher precedence than the subscript [] (perldsc § "Caveat on precedence").
The clarification on when to use @ instead of $ (perldata § "Slices").
The lack of a description of "[]" (array subscript / slice) among Perl's operators (perlop), which leads me to think it is not an operator... (yet it has to be something. But what?).
From what I have learned, none of these hints, put together, helps me better understand my issue.
Thanks in advance.
Quotation from perlfaq4:
What is the difference between $array[1] and @array[1]?
The difference is the sigil, that special character in front of the array name. The $ sigil means "exactly one item", while the @ sigil means "zero or more items". The $ gets you a single scalar, while the @ gets you a list.
Please see: What is the difference between $array[1] and @array[1]?
@ARY[1] is indeed a slice, in fact a slice of only one member. The difference is it creates a list context:
@ar1[0] = qw( a b c ); # List context.
$ar2[0] = qw( a b c ); # Scalar context, the last value is returned.
print "<@ar1> <@ar2>\n";
Output:
<a> <c>
Besides using strict, turn warnings on, too. You'll get the following warning:
Scalar value @ar1[0] better written as $ar1[0]
In perlop, you can read that "Perl's prefix dereferencing operators are typed: $, @, %, and &." The standard syntax is SIGIL { ... }, but in the simple cases, the curly braces can be omitted.
See Can you use string as a HASH ref while "strict refs" in use? for some fun with no strict refs and its emulation under strict.
Extending choroba's answer, to check a particular context, you can use wantarray
sub context { return wantarray ? "LIST" : "SCALAR" }
print $ary1[0] = context(), "\n";
print @ary1[0] = context(), "\n";
Outputs:
SCALAR
LIST
Nothing you did requires no strict; other than to hide your error of doing
@ARY = (58, 90);
when you should have done
my @ARY = (58, 90);
The following returns a single element of the array. Since EXPR is to return a single index, it is evaluated in scalar context.
$array[EXPR]
e.g.
my @array = qw( a b c d );
my $index = 2;
my $ele = $array[$index]; # my $ele = 'c';
The following returns the elements identified by LIST. Since LIST is to return 0 or more elements, it must be evaluated in list context.
@array[LIST]
e.g.
my @array = qw( a b c d );
my @indexes = ( 1, 2 );
my @slice = @array[@indexes]; # my @slice = qw( b c );
\( $ARY[$index] ) # Returns a ref to the element returned by $ARY[$index]
\( @ARY[@indexes] ) # Returns refs to each element returned by @ARY[@indexes]
${foo} # Weird way of writing $foo. Useful in literals, e.g. "${foo}bar"
@{foo} # Weird way of writing @foo. Useful in literals, e.g. "@{foo}bar"
${foo}[...] # Weird way of writing $foo[...].
Most people don't even know you can use these outside of string literals.
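Putting the element, slice and reference forms side by side, here is a short sketch under strict and warnings (the variable names are illustrative only):

```perl
use strict;
use warnings;

my @array = qw( a b c d );

my $ele   = $array[2];          # $ sigil: exactly one element
my @slice = @array[1, 2];       # @ sigil: a slice, copied into @slice
my @refs  = \( @array[1, 2] );  # one reference per sliced element

# The references point at the original elements, not at copies.
${ $refs[0] } = 'B';

print "element: $ele\n";
print "slice:   @slice\n";
print "array:   @array\n";
```

The copy in @slice is taken before the write through the reference, so it keeps the old value while @array itself changes.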

Understanding precedence when assigning and testing for definedness in Perl

When trying to assign a variable and test it for definedness in one operation in Perl, as would be useful for instance in an if's condition, it would seem natural to me to write:
if ( defined my $thing = $object->get_thing ) {
$thing->do_something;
}
As far as my understanding goes, defined has the precedence of a rightward list operator, which is lower than that of the assignment, therefore I would expect my code above to be equivalent to:
if ( defined ( my $thing = $object->get_thing ) ) {
$thing->do_something;
}
While the latter, parenthesised code does work, the former yields the following fatal error: "Can't modify defined operator in scalar assignment".
It's not a big deal having to add parentheses, but I would love to understand why the first version doesn't work, e.g. what kind of "thing" defined is and what is its precedence?
Named operators are divided into unary operators (operators that always take exactly one operand) and list operators (everything else)[1].
defined and my[2] are unary operators, which have much higher precedence than other named operators.
The same goes for subs, so I'll use them to demonstrate.
$ perl -MO=Deparse,-p -e'sub f :lvalue {} sub g :lvalue {} f g $x = 123;'
sub f : lvalue { }
sub g : lvalue { }
f(g(($x = 123)));
-e syntax OK
$ perl -MO=Deparse,-p -e'sub f($) :lvalue {} sub g($) :lvalue {} f g $x = 123;'
sub f ($) : lvalue { }
sub g ($) : lvalue { }
(f(g($x)) = 123);
-e syntax OK
But of course, defined is not an lvalue function, so finding it on the LHS of an assignment results in an error.
[1] and, or, not, xor, lt, le, gt, ge, eq, ne and cmp are not considered named operators.
[2] my is very unusual. Aside from having both a compile-time and run-time effect, its syntax varies depending on whether parens are used around its argument(s) or not. Without parens, it's a unary operator. With parens, it's a list operator.
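The same precedence difference can be seen in values rather than in Deparse output. The following is a sketch with two hypothetical subs (double_it and count_args are names made up for this example): one has a ($) prototype and so parses like a named unary operator, the other has no prototype and parses as a list operator.

```perl
use strict;
use warnings;

sub double_it($) { 2 * $_[0] }   # ($) prototype: parsed like a named unary operator
sub count_args   { scalar @_ }   # no prototype: a list operator

my @unary = (double_it 1, 2);      # (double_it(1), 2)
my @list  = (count_args 1, 2);     # (count_args(1, 2))
my $sum   = double_it 2 + 3;       # double_it(2 + 3): + binds tighter
my $equal = (double_it 5 == 10);   # (double_it(5)) == 10: == binds looser

print "unary: @unary\n";
print "list:  @list\n";
print "sum:   $sum\n";
print "equal: ", ($equal ? "true" : "false"), "\n";
```

Assignment binds more loosely still, which is why defined my $thing = ... ends up trying to assign to the result of defined.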

Perl argument parsing into functions when no parentheses are used

I have the following test code:
sub one_argument {
($a) = @_;
print "In one argument: \$a = $a\n";
return "one_argument";
}
sub mul_arguments {
(@a) = @_;
return "mul_argument";
}
print &one_argument &mul_arguments "something", "\n";
My goal is to be able to understand a bit better how perl decides which arguments to go into each function, and to possibly clear up any misunderstandings that I might have. I would've expected the above code to output:
In one argument: mul_argument
one_argument
However, the below is output:
Use of uninitialized value $a in concatenation (.) or string at ./test.pl line 5.
In one argument: $a =
mdd_argument
I don't understand where 'mdd_argument' comes from (Is it a sort of reference to a function?), and why one_argument receives no arguments.
I would appreciate any insight as to how perl parses arguments into functions when they are called in a similar fashion to above.
Please note that this is purely a learning exercise, I don't need the above code to perform as I expected, and in my own code I wouldn't call a function in such a way.
perldoc perlsub:
If a subroutine is called using the & form, the argument list is optional, and if omitted, no @_ array is set up for the subroutine: the @_ array at the time of the call is visible to subroutine instead. This is an efficiency mechanism that new users may wish to avoid.
In other words, in normal usage, if you use the &, you must use parentheses. Otherwise, the subroutine will be passed the caller's @_.
The mysterious "mdd" appears because &one_argument has no argument list, so perl expects an operator to follow it, not an expression. The & of &mul_arguments is therefore interpreted as the bitwise string AND operator:
$ perl -MO=Deparse,-p -e 'sub mul_arguments; print &one_argument &mul_arguments "something", "\n"'
print((&one_argument & mul_arguments('something', "\n")));
and "one_argument" & "mul_argument" (the two return values) produces "mdd_argument".
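Both effects can be reproduced in isolation. This is a sketch with hypothetical helper subs (count_args, with_amp and with_paren are invented for the example), and it assumes the "bitwise" feature is not enabled, so that & on two strings works character by character.

```perl
use strict;
use warnings;

# With two string operands (and the "bitwise" feature not enabled),
# & ANDs the strings character by character.
my $anded = "one_argument" & "mul_argument";

sub count_args { scalar @_ }
sub with_amp   { return &count_args }     # no parens: callee sees our @_
sub with_paren { return &count_args() }   # parens: callee gets an empty @_

my $shared = with_amp(1, 2, 3);
my $fresh  = with_paren(1, 2, 3);

print "anded:  $anded\n";
print "shared: $shared\n";
print "fresh:  $fresh\n";
```

Only the first three characters differ between the two strings, which is why the "_argument" tail survives the AND unchanged.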

What is happening when my() is conditional?

Compare using perl -w -Mstrict:
# case Alpha
print $c;
...
# case Bravo
if (0) {
my $c = 1;
}
print $c;
...
# case Charlie
my $c = 1 if 0;
print $c;
Alpha and Bravo both complain about the global symbol not having an explicit package name, which is to be expected. But Charlie does not give the same warning, only that the value is uninitialized, which smells a lot like:
# case Delta
my $c;
print $c;
What exactly is going on under the hood? (Even though something like this should never be written for production code)
You can think of a my declaration as having an action at compile-time and at run-time. At compile-time, a my declaration tells the compiler to make a note that a symbol exists and will be available until the end of the current lexical scope. An assignment or other use of the symbol in that declaration will take place at run-time.
So your example
my $c = 1 if 0;
is like
my $c; # compile-time declaration, initialized to undef
$c = 1 if 0; # runtime -- as written has no effect
Note that this compile-time/run-time distinction allows you to write code like this.
my $DEBUG; # lexical scope variable declared at compile-time
BEGIN {
$DEBUG = $ENV{MY_DEBUG}; # statement executed at compile-time
};
Now can you guess what the output of this program is?
my $c = 3;
BEGIN {
print "\$c is $c\n";
$c = 4;
}
print "\$c is $c\n";
mob's answer is a great explanation of what currently happens (and why), but don't forget that perldoc perlsyn tells us:
NOTE: The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ... ) is undefined. The value of the my variable may be undef, any previously assigned value, or possibly anything else. Don't rely on it. Future versions of perl might do something different from the version of perl you try it out on. Here be dragons.
Don't count on that result or the explanation for it still being true in future versions of Perl. (Although it probably will be.)
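If the intent is "declare the variable, and give it a value only when the condition holds", two well-defined ways of writing it are sketched below. The helper names pick_split and pick_ternary are hypothetical and exist only to make the example testable.

```perl
use strict;
use warnings;

# Hypothetical helper: declaration and conditional assignment kept apart.
sub pick_split {
    my ($cond) = @_;
    my $c;               # always a fresh, undefined lexical
    $c = 1 if $cond;     # the modifier applies to the assignment only
    return $c;
}

# Hypothetical helper: the condition moved into the initialiser.
sub pick_ternary {
    my ($cond) = @_;
    my $c = $cond ? 1 : undef;
    return $c;
}

for my $cond (0, 1) {
    my $split   = pick_split($cond);
    my $ternary = pick_ternary($cond);
    printf "cond=%d split=%s ternary=%s\n", $cond,
        defined $split   ? $split   : 'undef',
        defined $ternary ? $ternary : 'undef';
}
```

In both forms the my itself is unconditional, so nothing depends on the behaviour that perlsyn leaves undefined.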
The "my $foo = val if cond" construct and its undefined behavior has bitten me many times over the years. I wish the compiler could simply reject it (why keep something in the language that has undefined behavior?!), but presumably this cannot be done for backward compatibility or other reasons. Best solution I've found is to prevent it with perlcritic:
http://search.cpan.org/perldoc?Perl::Critic::Policy::Variables::ProhibitConditionalDeclarations