(4 + sub) not equals to (sub + 4)? - perl

(edit) TL;DR: my problem was that I though the Win32 API defines were true integer constants (as in the platform SDK headers) while the Win32 Perl wrapper defines them as subs. Thus caused the one-liner parsing misunderstood.
While testing in a one-liner a call to Win32::MsgBox, I am puzzled by the following : giving that the possible arguments for MsgBox are the message, a sum of flags to chose the kind of buttons (value 0..5) and message box icon "constants" (MB_ICONSTOP, ...) and the title
calling perl -MWin32 -e"Win32::MsgBox world, 4+MB_ICONQUESTION, hello" gives the expected result
while the looking similar code perl -MWin32 -e"Win32::MsgBox world, MB_ICONQUESTION+4, hello" is wrong
I first though that it comes from my lack of parenthesis, but adding some perl -MWin32 -e"Win32::MsgBox (world, MB_ICONQUESTION+4, hello)" gives exactly the same wrong result.
I tried with a colleague to dig deeper and display the parameters that are passed to a function call (as the MB_xxx constants are actually subs) with the following code
>perl -Mstrict -w -e"sub T{print $/,'called T(#'.join(',',#_).'#)'; 42 }; print $/,'results:', join ' ,', T(1), T+1, 1+T"
that outputs
called T(#1#)
called T(##)
called T(#1,43#)
results:42 ,42
but I can't understand why in the list passed to join() the args T+1, 1+T are parsed as T(1, 43)...

B::Deparse to the rescue:
C:>perl -MO=Deparse -MWin32 -e"Win32::MsgBox world, MB_ICONQUETION+4, hello"
use Win32;
Win32::MsgBox('world', MB_ICONQUESTION(4, 'hello'));
-e syntax OK
C:>perl -MO=Deparse -MWin32 -e"Win32::MsgBox world, 4+MB_ICONQESTION, hello"
use Win32;
Win32::MsgBox('world', 4 + MB_ICONQUESTION(), 'hello');
-e syntax OK
The MB_ICONQUESTION call in the first case is considered a function call with the arguments +4, 'hello'. In the second case, it is considered as a function call with no arguments, and having 4 added to it. It is not a constant, it seems, but a function.
In the source code we get this verified:
sub MB_ICONQUESTION { 0x00000020 }
It is a function that returns 32 (00100000 in binary, indicating a bit being set). Also as Sobrique points out, this is a flag variable, so you should not use addition, but the bitwise logical and/or operators.
In your case, it just accepts any arguments and ignores them. This is a bit confusing if you are expecting a constant.
In your experiment case, the statement
print $/,'results:', join ' ,', T(1), T+1, 1+T
Is interpreted
print $/,'results:', join ' ,', T(1), T(+1, (1+T))
Because execution from right to left goes
1+T = 43
T +1, 43 = 42
T(1) = 42
Because plus + has higher precedence than comma ,, and unary + even higher.
To disambiguate, you need to do use parentheses to clarify precedence:
print $/,'results:', join ' ,', T(1), T()+1, 1+T
# ^^-- parentheses
As a general rule, one should always use parentheses with subroutine calls. In perldoc perlsub there are 4 calling notations:
NAME(LIST); # & is optional with parentheses.
NAME LIST; # Parentheses optional if predeclared/imported.
&NAME(LIST); # Circumvent prototypes.
&NAME; # Makes current #_ visible to called subroutine.
Of which in my opinion, only the first one is transparent, and the other ones a bit obscure.

This is all to do with how you're invoking T and how perl is interpreting the results.
If we deparse your example we get:
BEGIN { $^W = 1; }
sub T {
use strict;
print $/, 'called T(#' . join(',', #_) . '#)';
42;
}
use strict;
print $/, 'results:', join(' ,', T(1), T(1, 1 + T()));
This is clearly not what you've got in mind, but does explain why you get the result you do.
I would suggest in your original example - rather that + you may wish to consider using | as it looks very much like MB_ICONQUESTION is intended to be a flag.
So:
use strict;
use warnings;
use Win32 qw( MB_ICONQUESTION );
print MB_ICONQUESTION;
Win32::MsgBox( "world", 4 | MB_ICONQUESTION , "hello" );
Or
use strict;
use warnings;
use Win32 qw( MB_ICONQUESTION );
print MB_ICONQUESTION;
Win32::MsgBox( "world", MB_ICONQUESTION | 4 , "hello" );
Produce the same result.
This is because of precence when invoking subroutines without brackets - you can do:
print "one", "two";
And both are treated as arguments to print. Perl assumes that arguments after a sub are to be passed to it.
+4 is enumerated as an argument, and passed to T.
sub test { print #_,"\n";};
test 1;
test +1;
If we deparse this, we see perl treats it as:
test 1;
test 1;
So ultimately - there is a bug in Win32 that you have found, that would be fixable by:
sub MB_ICONQUESTION() {0x00000020}
Win32::MsgBox "world", 4 + MB_ICONQUESTION, "hello";
Win32::MsgBox "world", MB_ICONQUESTION + 4, "hello";
Or perhaps:
use constant MB_ICONQUESTION => 0x00000020;
Or as noted - the workaround in your code - don't use + and instead use | which is going to have the same result for bit flag operations, but because of operator precedence is never going to be passed into the subroutine. (Or of course, always specify the parenthesis for your constants)

Related

Dot at the beginning of a "print" statement?

I've come across something odd while using a Perl script. It's about using a dot giving different results.
perlop didn't turn anything up, or perhaps I just blew past it. I was looking at Operator Precedence and Associativity
print "I'd expect to see 11x twice, but I only see it once.\n";
print (1 . 1) . "3";
print "\n";
print "" . (1 . 1) . "3\n";
print "Pluses: I expect to see 14 in both cases, and not 113, because plus works on numbers.\n";
print (1 . 1) + "3";
print "\n";
print "" + (1 . 1) + "3\n";
Putting quotes at the start is an acceptable workaround to get what I want, but what is happening here with the order of operations that I'm missing? What rules are there to be learned?
When you put the first argument to print in parentheses, Perl sees it as function call syntax.
So this:
print (1 . 1) . "3";
is parsed as this:
print(1 . 1) . "3";
or, equivalently:
(print 1 . 1) . "3";
Therefore, Perl prints "11", then takes the return value of that print call (which is 1 if it succeeded), concatenates 3 to it, and - since the whole expression is in void context - does absolutely nothing with the resulting 13.
If you run your code with warnings enabled (via -w on the command line or the use warnings; pragma), you will get these warnings identifying your error:
$ perl -w foo.pl
print (...) interpreted as function at foo.pl line 2.
print (...) interpreted as function at foo.pl line 6.
Useless use of concatenation (.) or string in void context at foo.pl line 2.
Useless use of addition (+) in void context at foo.pl line 6.
As Borodin points out in the comment below, you shouldn't rely on -w (or the in-code equivalent $^W); production code should always make use of the warnings pragma, preferably with use warnings qw(all);. While it wouldn't matter in this particular instance, you should also use strict;, although requesting modern features via useversion; for a Perl version of 5.11 or higher automatically turns on strict as well.
If a named operator (or a sub call) is followed by parens, those parens delimit the operands (or arguments).
print (1 . 1) . "3"; ≡ ( print(1 . 1) ) . "3";
print "" . (1 . 1) . "3"; ≡ print("" . (1 . 1) . "3");
Note that Perl would have alerted you of your problems had you been using (use strict; and) use warnings qw( all ); as you should.
print (...) interpreted as function at a.pl line 2.
print (...) interpreted as function at a.pl line 6.
Useless use of concatenation (.) or string in void context at a.pl line 2.
Useless use of addition (+) in void context at a.pl line 6.
I'd expect to see 11x twice, but I only see it once.
11
113
Pluses: I expect to see 14 in both cases, and not 113, because plus works on numbers.
11
Argument "" isn't numeric in addition (+) at a.pl line 8.
14

Perl dereferencing in non-strict mode

In Perl, if I have:
no strict;
#ARY = (58, 90);
To operate on an element of the array, say it, the 2nd one, I would write (possibly as part of a larger expression):
$ARY[1] # The most common way found in Perldoc's idioms.
Though, for some reason these also work:
#ARY[1]
#{ARY[1]}
Resulting all in the same object:
print (\$ARY[1]);
print (\#ARY[1]);
print (\#{ARY[1]});
Output:
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
What is the syntax rules that enable this sort of constructs? How far could one devise reliable program code with each of these constructs, or with a mix of all of them either? How interchangeable are these expressions? (always speaking in a non-strict context).
On a concern of justifying how I come into this question, I agree "use strict" as a better practice, still I'm interested at some knowledge on build-up non-strict expressions.
In an attemp to find myself some help to this uneasiness, I came to:
The notion on "no strict;" of not complaining about undeclared
variables and quirk syntax.
The prefix dereference having higher precedence than subindex [] (perldsc § "Caveat on precedence").
The clarification on when to use # instead of $ (perldata § "Slices").
The lack of "[]" (array subscript / slice) description among the Perl's operators (perlop), which lead me to think it is not an
operator... (yet it has to be something else. But, what?).
For what I learned, none of these hints, put together, make me better understand my issue.
Thanks in advance.
Quotation from perlfaq4:
What is the difference between $array[1] and #array[1]?
The difference is the sigil, that special character in front of the array name. The $ sigil means "exactly one item", while the # sigil means "zero or more items". The $ gets you a single scalar, while the # gets you a list.
Please see: What is the difference between $array[1] and #array[1]?
#ARY[1] is indeed a slice, in fact a slice of only one member. The difference is it creates a list context:
#ar1[0] = qw( a b c ); # List context.
$ar2[0] = qw( a b c ); # Scalar context, the last value is returned.
print "<#ar1> <#ar2>\n";
Output:
<a> <c>
Besides using strict, turn warnings on, too. You'll get the following warning:
Scalar value #ar1[0] better written as $ar1[0]
In perlop, you can read that "Perl's prefix dereferencing operators are typed: $, #, %, and &." The standard syntax is SIGIL { ... }, but in the simple cases, the curly braces can be omitted.
See Can you use string as a HASH ref while "strict refs" in use? for some fun with no strict refs and its emulation under strict.
Extending choroba's answer, to check a particular context, you can use wantarray
sub context { return wantarray ? "LIST" : "SCALAR" }
print $ary1[0] = context(), "\n";
print #ary1[0] = context(), "\n";
Outputs:
SCALAR
LIST
Nothing you did requires no strict; other than to hide your error of doing
#ARY = (58, 90);
when you should have done
my #ARY = (58, 90);
The following returns a single element of the array. Since EXPR is to return a single index, it is evaluated in scalar context.
$array[EXPR]
e.g.
my #array = qw( a b c d );
my $index = 2;
my $ele = $array[$index]; # my $ele = 'c';
The following returns the elements identified by LIST. Since LIST is to return 0 or more elements, it must be evaluated in list context.
#array[LIST]
e.g.
my #array = qw( a b c d );
my #indexes ( 1, 2 );
my #slice = $array[#indexes]; # my #slice = qw( b c );
\( $ARY[$index] ) # Returns a ref to the element returned by $ARY[$index]
\( #ARY[#indexes] ) # Returns refs to each element returned by #ARY[#indexes]
${foo} # Weird way of writing $foo. Useful in literals, e.g. "${foo}bar"
#{foo} # Weird way of writing #foo. Useful in literals, e.g. "#{foo}bar"
${foo}[...] # Weird way of writing $foo[...].
Most people don't even know you can use these outside of string literals.

2 Sub references as arguments in perl

I have perl function I dont what does it do?
my what does min in perl?
#ARVG what does mean?
sub getArgs
{
my $argCnt=0;
my %argH;
for my $arg (#ARGV)
{
if ($arg =~ /^-/) # insert this entry and the next in the hash table
{
$argH{$ARGV[$argCnt]} = $ARGV[$argCnt+1];
}
$argCnt++;
}
return %argH;}
Code like that makes David sad...
Here's a reformatted version of the code doing the indentations correctly. That makes it so much easier to read. I can easily tell where my if and loops start and end:
sub getArgs {
my $argCnt = 0;
my %argH;
for my $arg ( #ARGV ) {
if ( $arg =~ /^-/ ) { # insert this entry and the next in the hash table
$argH{ $ARGV[$argCnt] } = $ARGV[$argCnt+1];
}
$argCnt++;
}
return %argH;
}
The #ARGV is what is passed to the program. It is an array of all the arguments passed. For example, I have a program foo.pl, and I call it like this:
foo.pl one two three four five
In this case, $ARGV is set to the list of values ("one", "two", "three", "four", "five"). The name comes from a similar variable found in the C programming language.
The author is attempting to parse these arguments. For example:
foo.pl -this that -the other
would result in:
$arg{"-this"} = "that";
$arg{"-the"} = "other";
I don't see min. Do you mean my?
This is a wee bit of a complex discussion which would normally involve package variables vs. lexically scoped variables, and how Perl stores variables. To make things easier, I'm going to give you a sort-of incorrect, but technically wrong answer: If you use the (strict) pragma, and you should, you have to declare your variables with my before they can be used. For example, here's a simple two line program that's wrong. Can you see the error?
$name = "Bob";
print "Hello $Name, how are you?\n";
Note that when I set $name to "Bob", $name is with a lowercase n. But, I used $Name (upper case N) in my print statement. As it stands, now. Perl will print out "Hello, how are you?" without a care that I've used the wrong variable name. If it's hard to spot an error like this in a two line program, imagine what it would be like in a 1000 line program.
By using strict and forcing me to declare variables with my, Perl can catch that error:
use strict;
use warnings; # Another Pragma that should always be used
my $name = "Bob";
print "Hello $Name, how are you doing\n";
Now, when I run the program, I get the following error:
Global symbol "$Name" requires explicit package name at (line # of print statement)
This means that $Name isn't defined, and Perl points to where that error is.
When you define variables like this, they are in scope with in the block where it's defined. A block could be the code contained in a set of curly braces or a while, if, or for statement. If you define a variable with my outside of these, it's defined to the end of the file.
Thus, by using my, the variables are only defined inside this subroutine. And, the $arg variable is only defined in the for loop.
One more thing:
The person who wrote this should have used the Getopt::Long module. There's a major bug in their code:
For example:
foo.pl -this that -one -two
In this case, my hash looks like this:
$args{'-this'} = "that";
$args{'-one'} = "-two";
$args{'-two'} = undef;
If I did this:
if ( defined $args{'-two'} ) {
...
}
I would not execute the if statement.
Also:
foo.pl -this=that -one -two
would also fail.
#ARGV is a special variable (refer to perldoc perlvar):
#ARGV
The array #ARGV contains the command-line arguments intended for the
script. $#ARGV is generally the number of arguments minus one, because
$ARGV[0] is the first argument, not the program's command name itself.
See $0 for the command name.
Perl documentation is also available from your command line:
perldoc -v #ARGV

Perl argument parsing into functions when no parentheses are used

I have the following test code:
sub one_argument {
($a) = #_;
print "In one argument: \$a = $a\n";
return "one_argument";
}
sub mul_arguments {
(#a) = #_;
return "mul_argument";
}
print &one_argument &mul_arguments "something", "\n";
My goal is to be able to understand a bit better how perl decides which arguments to go into each function, and to possibly clear up any misunderstandings that I might have. I would've expected the above code to output:
In one argument: mul_argument
one_argument
However, the below is output:
Use of uninitialized value $a in concatenation (.) or string at ./test.pl line 5.
In one argument: $a =
mdd_argument
I don't understand where 'mdd_argument' comes from (Is it a sort of reference to a function?), and why one_argument receives no arguments.
I would appreciate any insight as to how perl parses arguments into functions when they are called in a similar fashion to above.
Please note that this is purely a learning exercise, I don't need the above code to perform as I expected, and in my own code I wouldn't call a function in such a way.
perldoc perlsub:
If a subroutine is called using the & form, the argument list is optional, and if omitted, no #_ array is set up for the subroutine: the #_ array at the time of the call is visible to subroutine instead. This is an efficiency mechanism that new users may wish to avoid.
In other words, in normal usage, if you use the &, you must use parentheses. Otherwise, the subroutine will be passed the caller's #_.
The mysterious "mdd" is caused because &one_argument doesn't have any arguments and perl is expecting an operator to follow it, not an expression. So the & of &mul_arguments is actually interpreted as the stringwise bit and operator:
$ perl -MO=Deparse,-p -e 'sub mul_arguments; print &one_argument &mul_arguments "something", "\n"'
print((&one_argument & mul_arguments('something', "\n")));
and "one_argument" & "mul_arguments" produces "mdd_argument".

Strange perl code - looking for explanation

I know a little bit perl, but not enough deeply to understand the next.
Reading perldelta 5.18 i found the next piece of code what is already disabled in 5.18. Not counting this, still want understand how it's works.
Here is the code and in the comments are what i understand
%_=(_,"Just another "); #initialize the %_ hash with key=>value _ => 'Just another'
$_="Perl hacker,\n"; #assign to the $_ variable with "Perl..."
s//_}->{_/e; # darkness. the /e - evauates the expression, but...
print
it prints:
Just another Perl hacker,
I tried, the perl -MO=Deparse and get the next
(%_) = ('_', 'Just another '); #initializing the %_ hash
$_ = "Perl hacker,\n"; # as above
s//%{'_';}/e; # substitute to the beginning of the $_ - WHAT?
print $_; # print the result
japh syntax OK
What is strange (at least for me) - running the "deparsed" code doesn't gives the original result and prints:
1/8Perl hacker,
I would be very happy:
if someone can explain the code, especially if someone could write an helper code, (with additional steps) what helps me understand how it is works - what happens.
explain, why the deparsed code not prints the original result.
What means the %{'_';} in the deparsed code?
The code actually executed by the substitution operator is probably actually something like
my $code = "do { $repl_expr }";
So when the replacement expression is _}->{_, the following is executed:
do { _}->{_ }
_ simply returns the string _ (since strict is off), so that's the same as
do { "_" }->{_}
which is the same as
"_"->{_}
What you have there is a hash element dereference, where the reference is a symbolic reference (i.e. a string rather than an actual reference). Normally forbidden by strict, here's an example of a symbolic reference at work:
%h1 = ( id => 123 );
%h2 = ( id => 456 );
print "h$_"->{id}, "\n"
for 1..2;
So that means
"_"->{_} # Run-time symbol lookup
is the same as
$_{_} # Compile-time symbol lookup
A similar trick is often used in one-liners.
perl -nle'$c += $_; END { print $c }'
can be shortened to
perl -nle'$c += $_; }{ print $c'
Because the code actually executed when -n is used is obtained from something equivalent to
my $code = "LINE: while (<>) { $program }";
%{'_';}
is a very weird way to write
%{'_'}
which is a hash dereference. Again, the reference here is a symbolic reference. It's equivalent to
%_
In scalar context, hash current returns a value that reports some information about that hash's internals (or a false value if empty). There's been a suggestion to change it to return the number of keys instead.