I have a question regarding how the left and right sides of the -> operator are evaluated. Consider the following code:
#! /usr/bin/perl
use strict;
use warnings;
use feature ':5.10';
$, = ': ';
$" = ', ';
my $sub = sub { "@_" };
sub u { shift->(@_) }
sub v { my $s = shift; $s->(@_) }
say 'u', u($sub, 'foo', 'bar');
say 'v', v($sub, 'foo', 'bar');
Output:
u: CODE(0x324718), foo, bar
v: foo, bar
I expected u and v to behave identically, but they don't. I always assumed Perl evaluated things left to right in these situations. Code like shift->another_method(@_) and even shift->another_method(shift, 'stuff', @_) is pretty common.
Why does this break if the first argument happens to be a code reference? Am I on undefined / undocumented territory here?
The operand evaluation order of ->() is undocumented. It happens to evaluate the arguments before the LHS (ops 3-4 and op 5 respectively in the output below).
>perl -MO=Concise,u,-exec a.pl
main::u:
1 <;> nextstate(main 51 a.pl:11) v:%,*,&,x*,x&,x$,$,469762048
2 <0> pushmark s
3 <#> gv[*_] s
4 <1> rv2av[t2] lKM/3
5 <0> shift s*
6 <1> entersub[t3] KS/TARG,2
7 <1> leavesub[1 ref] K/REFC,1
a.pl syntax OK
Both using and modifying a variable in the same expression can be dangerous. It's best to avoid it unless you can explain the following:
>perl -E"$i=5; say $i,++$i,$i"
666
You could use
$_[0]->(@_[1..$#_])
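A minimal sketch of the two safe alternatives, reusing $sub from the question (the sub name w is made up for illustration). Neither one reads @_ and shifts it within the same expression, so the result no longer depends on the undocumented evaluation order.

```perl
use strict;
use warnings;

my $sub = sub { "@_" };

# Take the code ref off @_ in its own statement, then call it.
sub v { my $s = shift; $s->(@_) }

# Or leave @_ untouched and pass a slice of the remaining arguments.
sub w { $_[0]->(@_[1 .. $#_]) }

print v($sub, 'foo', 'bar'), "\n";    # foo bar
print w($sub, 'foo', 'bar'), "\n";    # foo bar
```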
Related
It's possible to print a variable's name by *var{NAME}, is it possible to print the argument's name in a subroutine?
Below is what I want to achieve
var_name($myVar); will print myVar
sub var_name{
print *_{NAME}; # Prints `_`, but want `myVar`
}
First, your attempt using print *_{NAME}; does work, but it prints the name associated with the typeglob of _ (the one for things like $_ and @_), which isn't what you want. If you pass a typeglob or a reference to a typeglob to the function, you can extract its name by dereferencing the argument:
#!/usr/bin/env perl
use warnings;
use strict;
use feature qw/say/;
# The prototype isn't strictly necessary but it makes it harder
# to pass a non-typeglob value.
sub var_name :prototype(\*) {
say *{$_[0]}{NAME}; # Note the typeglob deref
}
my $myVar = 1;
say *myVar{NAME}; # myVar
var_name *myVar; # myVar
You get _ because the NAME associated with *_ is _.
So what glob should you use? Well, the glob that contains the variable used as an argument (if any) isn't passed to the sub, so you're out of luck.
A glob-based solution would never work with my variables anyway since these aren't found in globs. This means the very concept of a glob-based varname couldn't possibly work in practice.
Getting the name of the variables would entail examining the opcode tree before the call site. I believe this is how operators achieve this in situations such as the following:
$ perl -we'my $y; my $x = 0 + $y;'
Use of uninitialized value $y in addition (+) at -e line 1.
$ perl -MO=Concise -we'my $y; my $x = 0 + $y;'
a <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter v ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
3 <0> padsv[$y:1,3] vM/LVINTRO ->4
4 <;> nextstate(main 2 -e:1) v:{ ->5
9 <2> sassign vKS/2 ->a
7 <2> add[t3] sK/2 ->8 <-- This is what's issuing the warning.
5 <$> const[IV 0] s ->6
6 <0> padsv[$y:1,3] s ->7 <-- This is the source of the name in the warning.
8 <0> padsv[$x:2,3] sRM*/LVINTRO ->9
-e syntax OK
I've got the following script running on Perl 5.10.1:
#!/usr/bin/perl
use strict;
use warnings;
foreach( my $x =0 ; $x < 1; $x++) { # Line 5
print_line(); # Line 6
}
sub print_line {
print "Function call from line: " . [caller(0)]->[2] . "\n";
}
Despite the call to the subroutine coming from line 6, the script outputs the line number of the start of the C-style for statement:
Function call from line: 5
What's really weird is that if I throw a random statement into one of the blank lines in the C-style for loop, caller returns the correct line number:
#!/usr/bin/perl
use strict;
use warnings;
foreach( my $x =0 ; $x < 1; $x++) {
my $x = 3;
print_line(); # Line 7
}
sub print_line {
print "Function call from line: " . [caller(0)]->[2] . "\n";
}
The above script correctly outputs:
Function call from line: 7
Is this some kind of bug or is there something I can do to get caller to accurately report the line number?
I think potentially it is a bug, because the same behavior doesn't occur if you replace
foreach (my $x = 0 ; $x < 1 ; $x++) {
with
foreach my $x (0 .. 0) {
I don't understand exactly what's happening, but by comparing the optrees of the two different versions, I think that a nextstate op is getting improperly optimized out. My version has
<;> nextstate(main 4 lineno.pl:11) v:*,&,x*,x&,x$,$ ->8
as the left sibling of the entersub op that calls print_line, while yours has
<0> ex-nextstate v ->8
which has been taken out of the flow of execution.
It wouldn't hurt to write this up as a perlbug.
$ perl -MO=Concise a.pl
j <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 6 a.pl:5) v:*,&,{,x*,x&,x$,$ ->3
5 <2> sassign vKS/2 ->6
3 <$> const[IV 0] s ->4
4 <0> padsv[$x:3,5] sRM*/LVINTRO ->5
6 <0> unstack v* ->7
i <2> leaveloop vK/2 ->j
7 <{> enterloop(next->b last->i redo->8) v ->e
- <1> null vK/1 ->i
h <|> and(other->8) vK/1 ->i
g <2> lt sK/2 ->h
e <0> padsv[$x:3,5] s ->f
f <$> const[IV 1] s ->g
- <@> lineseq vK ->-
- <@> scope vK ->b <---
- <0> ex-nextstate v ->8 <---
a <1> entersub[t5] vKS/TARG,2 ->b
- <1> ex-list K ->a
8 <0> pushmark s ->9
- <1> ex-rv2cv sK/2 ->-
9 <#> gv[*print_line] s/EARLYCV ->a
c <1> preinc[t2] vK/1 ->d
b <0> padsv[$x:3,5] sRM ->c
d <0> unstack v ->e
a.pl syntax OK
There's some optimization going on. The scope was deemed unnecessary and optimized away. (Notice the "-" meaning it's never reached.)
But at the same time, that removed the nextstate op, which is what sets the line number for warnings and for caller.
So, it's a bug that results from an improper optimization.
I suspect this may be down to statement separators (semicolons). As you may have spotted, with the code you're running, the line number reported by caller is that of the foreach line.
So I think what is happening is that no semicolon separates the foreach line from the call.
If you were to do a multi-line sub call, caller would report the first line:
print "first call:", __LINE__, "\n";
print "Start of statement\n",
"a bit more on line ", __LINE__, "\n",
print_line(
1,
2,
3,
5,
);
You get the line number of the start of the call, not the end. So I think that's what you're getting - the statement starts when the semicolon statement separator occurs - which is the foreach line in the first example.
So as a workaround - I might suggest making use of __LINE__. Although I'd also perhaps suggest not worrying about it too much, because it's still going to point you to the right place in the code.
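A minimal sketch of that __LINE__ workaround, assuming you are free to change the sub's interface. The name format_line is hypothetical; it stands in for the question's print_line and is handed the line number instead of asking caller for it.

```perl
use strict;
use warnings;

# Hypothetical variant of print_line(): the caller passes its own
# line number, so the result does not depend on caller().
sub format_line {
    my ($line) = @_;
    return "Function call from line: $line\n";
}

for (my $x = 0; $x < 1; $x++) {
    print format_line(__LINE__);
}
```

__LINE__ is replaced at compile time with the line it appears on, so this reports the line of the call itself rather than the line recorded for the enclosing statement.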
You get something similar if you use croak, for presumably the same reason.
As has been pointed out, this really is a bug in Perl. It goes back at least to 5.10, about 11 years, and in reality I think longer.
It has been reported as Perl bug perl #133239, and although it is alleged that it is not that hard to fix, it hasn't been. It may also not be that easy to fix: it has performance ramifications, since adding COPs slows things down, and some administrative work might be needed to adjust tests.
And even if this bug were fixed, it would only be fixed in Perl 5.29 or so and later. That isn't going to help you with 5.10.
So here is another tack that doesn't rely on a change to Perl's core, and therefore puts users more in control. However, I'll say up front that it is a bit experimental, and unless people are willing to spend coding effort on it, it's not likely to go back as far as 5.10. Right now the earliest Perl version I have working is 5.14, released 7 years ago as of this writing.
Using B::DeparseTree you can write a different, and I think better caller() which can show you the location of the caller with more detail. Here is your program modified to do that:
#!/usr/bin/perl
use strict;
use warnings;
use B::DeparseTree::Fragment;
use Devel::Callsite;
sub dt_caller
{
my $level = $_ ? $_ : 0;
# Pick up the right caller's OP address.
my $addr = callsite($level+1);
# Hack alert 'main::main' should be replaced with the function name if not the top level. caller() is a little off-sync here.
my $op_info = deparse_offset('main::main', $addr);
# When Perl is in the middle of call, it has already advanced the PC,
# so we need to go back to the preceding op.
$op_info = get_prev_addr_info($op_info);
my $extract_texts = extract_node_info($op_info);
print join("\n", @$extract_texts), "\n";
}
foreach( my $x =0 ; $x < 1; $x++) {
print_line();
}
sub print_line {
dt_caller();
}
When run it prints:
$ perl bug-caller.pl
print_line()
------------
dt_caller() could and should be wrapped up into a package like Carp so you don't see all of that ugliness. However I'll leave that for someone else. And I'll mention that just to get this working, there were some bug fixes I had to make, so this works only starting with version 3.4.0 of B::DeparseTree.
I ran into a situation where I can't inhibit warnings in an intuitive way because perl is in-lining a call to a built-in function. e.g.
use strict;
use warnings;
{
no warnings 'substr'; # no effect
foo(substr('123', 4, 6)); # out of range but shouldn't emit a warning
}
sub foo {
my $s = shift; # warning reported here
# do something
}
Running this code results in
substr outside of string at c:\temp\foo.pl line 10.
In order to inhibit the warning I have to move the no warnings 'substr' inside the function.
sub foo {
no warnings 'substr'; # works here, but there's no call to substr
my $s = shift; # no warnings here
# do something
}
I can see that the call to substr is being inlined by passing the code through perl -MO=Terse
LISTOP (0x27dcaa8) leave [1]
OP (0x27a402c) enter
COP (0x27dcac8) nextstate
BINOP (0x27dcb00) leaveloop
LOOP (0x27dcb20) enterloop
LISTOP (0x27dcb68) lineseq
COP (0x27dcb88) nextstate
UNOP (0x27dcbc0) entersub [5] # entry point for foo
UNOP (0x27dcbf4) null [148]
OP (0x27dcbdc) pushmark
LISTOP (0x27dcc48) substr [4] # substr gets called here
OP (0x27dcc30) null [3]
SVOP (0x27dcc84) const [6] PV (0x2319944) "123"
SVOP (0x27dcc68) const [7] IV (0x2319904) 4
SVOP (0x27dcc14) const [8] IV (0x231944c) 6
UNOP (0x27dcca0) null [17]
PADOP (0x27dccf4) gv GV (0x2318e5c) *foo
Is this optimizer behavior documented anywhere? perlsub only mentions inlining of constant functions. Given that the warning is being reported on the wrong line and that no warnings isn't working in the lexical scope where the call is being made I'm inclined to report this as a bug, although I can't think of how it could reasonably be fixed while preserving the optimization.
Note: This behavior was observed under Perl 5.16.1.
This is a documented behaviour (in perldiag):
substr outside of string
(W substr),(F) You tried to reference a substr() that pointed
outside of a string. That is, the absolute value of the offset was
larger than the length of the string. See "substr" in perlfunc.
This warning is fatal if substr is used in an lvalue context (as
the left hand side of an assignment or as a subroutine argument for
example).
Emphasis mine.
Changing the call to
foo(my $o = substr('123', 4, 6));
makes the warnings disappear.
Moving the no warnings into the sub doesn't change the behaviour for me. What Perl version do you have? (5.14.4 here).
The code I used for testing:
#!/usr/bin/perl
use strict;
use warnings;
$| = 1;
print 1, foo(my $s1 = substr('abc', 4, 6));
print 2, bar(my $s2 = substr('def', 4, 6));
{
no warnings 'substr';
print 3, foo(my $s3 = substr('ghi', 4, 6));
print 4, bar(my $s4 = substr('jkl', 4, 6));
print 5, bar(substr('mno', 4, 6)); # Stops here, reports line 12.
print 6, foo(substr('pqr', 4, 6));
}
print "ok\n";
sub foo {
my $s = shift;
}
sub bar {
no warnings 'substr';
my $s = shift;
}
Update:
I'm getting the same behaviour in 5.10.1, but in 5.20.1, the behaviour is as you described.
As you saw from B::Terse, the substr is not inlined.
$ perl -MO=Concise,-exec -e'f(substr($_, 3, 4))'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark s
4 <#> gvsv[*_] s
5 <$> const[IV 3] s
6 <$> const[IV 4] s
7 <@> substr[t4] sKM/3 <-- The substr operator is evaluated first.
8 <#> gv[*f] s/EARLYCV
9 <1> entersub[t5] vKS/TARG <-- The sub call second.
a <@> leave[1 ref] vKP/REFC
-e syntax OK
When substr is called in an lvalue context, it returns a magical scalar that contains the operands passed to substr.
$ perl -MDevel::Peek -e'$_ = "abcdef"; Dump(${\ substr($_, 3, 4) })'
SV = PVLV(0x2865d60) at 0x283fbd8
REFCNT = 2
FLAGS = (GMG,SMG) <--- Gets and sets are magical.
IV = 0 GMG: A function that mods the scalar
NV = 0 is called before fetches.
PV = 0 SMG: A function is called after the
MAGIC = 0x2856810 scalar is modified.
MG_VIRTUAL = &PL_vtbl_substr
MG_TYPE = PERL_MAGIC_substr(x)
TYPE = x
TARGOFF = 3 <--- substr's second arg
TARGLEN = 4 <--- substr's third arg
TARG = 0x287bfd0 <--- substr's first arg
FLAGS = 0
SV = PV(0x28407f0) at 0x287bfd0 <--- A dump of substr's first arg
REFCNT = 2
FLAGS = (POK,IsCOW,pPOK)
PV = 0x2865d20 "abcdef"\0
CUR = 6
LEN = 10
COW_REFCNT = 1
Subroutine arguments are evaluated in lvalue context because subroutine arguments are always passed by reference in Perl[1].
$ perl -E'sub f { $_[0] = "def"; } $x = "abc"; f($x); say $x;'
def
The substring operation happens when the magical scalar is accessed.
$ perl -E'$x = "abc"; $r = \substr($x, 0, 1); $x = "def"; say $$r;'
d
This is done to allow substr(...) = "abc";
This is probably documented using language similar to the following: "The elements of @_ are aliased to the subroutine arguments."
Why does perl -we '$c = $c+3' raise
Use of uninitialized value $c in addition (+) at -e line 1.
while perl -we '$c += 3' doesn't complain about an uninitialized value?
UPDATE
Does documentation or some book like 'Perl best practices' mention such behavior?
I think perldoc perlop has a little explanation:
Assignment Operators
"=" is the ordinary assignment operator.
Assignment operators work as in C. That is,
$a += 2;
is equivalent to
$a = $a + 2;
although without duplicating any side effects that dereferencing the
lvalue might trigger, such as from tie()
With B::Concise helper, we can see the trick:
$ perl -MO=Concise,-exec -e '$c += 3'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <#> gvsv[*c] s
4 <$> const[IV 3] s
5 <2> add[t2] vKS/2
6 <@> leave[1 ref] vKP/REFC
-e syntax OK
$ perl -MO=Concise,-exec -e '$c = $c + 3'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <#> gvsv[*c] s
4 <$> const[IV 3] s
5 <2> add[t3] sK/2
6 <#> gvsv[*c] s
7 <2> sassign vKS/2
8 <@> leave[1 ref] vKP/REFC
-e syntax OK
Update
After searching in perldoc, I saw that this problem had been documented in perlsyn:
Declarations
The only things you need to declare in Perl are report formats and subroutines (and sometimes not even subroutines). A variable
holds the undefined value ("undef") until it has been assigned a defined value, which is anything other than "undef". When used as a
number, "undef" is treated as 0; when used as a string, it is treated as the empty string, ""; and when used as a reference that
isn't being assigned to, it is treated as an error. If you enable warnings, you'll be notified of an uninitialized value whenever
you treat "undef" as a string or a number. Well, usually. Boolean contexts, such as:
my $a;
if ($a) {}
are exempt from warnings (because they care about truth rather than definedness). Operators such as "++", "--", "+=", "-=", and
".=", that operate on undefined left values such as:
my $a;
$a++;
are also always exempt from such warnings.
Because it makes sense for addition to warn when adding things other than numbers, but it's very convenient for += not to warn for undefined values.
As Gnouc found, this is documented in perlsyn:
Operators such as ++ , -- , += , -= , and .= , that operate on undefined variables such as:
undef $a;
$a++;
are also always exempt from such warnings.
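The difference can be checked directly. This is a small sketch that collects warnings through $SIG{__WARN__} rather than letting them go to STDERR; the variable names are arbitrary.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $c;
$c += 3;        # exempt: no warning

my $d;
$d = $d + 3;    # warns: Use of uninitialized value $d in addition (+)

print "warnings seen: ", scalar(@warnings), "\n";
print $warnings[0];
```

Only the plain addition adds an entry to @warnings; both variables end up holding 3.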
From Programming Perl, pg 90:
@ary = (1, 3, sort 4, 2);
print @ary;
"the commas on the right of the sort are evaluated before the sort but the commas on the left are evaluated after. ... list operators tend to gobble .. and then act like a simple term"
Does the assignment result in sort being processed, or does that happen when @ary is expanded by print?
What does he mean by all that "comma" stuff? My understanding is that in the assignment statement, comma has a lower priority than a list operator, therefore sort runs first and gobbles up its arguments (4 and 2). How the heck is comma being evaluated at all? So that statement then becomes (1, 3, 2, 4), a list which is assigned. Comma is just acting as a list separator and not an operator! In fact on pg 108 he says: do not confuse the scalar context use of comma with list context use.
What is a leftward and rightward list operator? print @ary is a rightward list operator? So it has very low priority?
print($foo, exit);
here, how is precedence evaluated? print is a list operator that looks like a function so it should run first! it has two arguments $foo and exit.. so why is exit not treated as a string??? After all priority-wise print(the list operator) has higher priority??
print $foo, exit;
here, you have print and , operators but the list operator has higher precedence.. so.. exit should be treated as a string - why not??
print ($foo & 255) + 1, "\n";
here since it's a list operator it prints $foo & 255 Shouldn't something similar happen with the above mentioned exit stuff..
When in doubt about how Perl is parsing a construct, you can run the code through the B::Deparse module, which will generate Perl source code from the compiled internal representation. For your first example:
$ perl -MO=Deparse,-p -e '@ary = (1, 3, sort 4, 2); print @ary;'
(@ary = (1, 3, sort(4, 2)));
print(@ary);
-e syntax OK
So as you can see, sort takes the two arguments to its right.
As far as execution order goes, you can find that out with the B::Concise module (I've added the comments):
$ perl -MO=Concise,-exec -e '@ary = (1, 3, sort 4, 2); print @ary;'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark s # start of list
4 <$> const[IV 1] s # 1 is added to list
5 <$> const[IV 3] s # 3 is added to list
6 <0> pushmark s # start of sort's argument list
7 <$> const[IV 4] s # 4 is added to sort's argument list
8 <$> const[IV 2] s # 2 is added to sort's argument list
9 <@> sort lK # sort is run, and returns its list into the outer list
a <0> pushmark s
b <#> gv[*ary] s
c <1> rv2av[t2] lKRM*/1
d <2> aassign[t3] vKS/COMMON # the list is assigned to the array
e <;> nextstate(main 1 -e:1) v:{
f <0> pushmark s # start of print's argument list
g <#> gv[*ary] s # the array is loaded into print's argument list
h <1> rv2av[t5] lK/1
i <@> print vK # print outputs its argument list
j <@> leave[1 ref] vKP/REFC
-e syntax OK
For your second example:
$ perl -MO=Deparse,-p -e 'print $foo, exit;'
print($foo, exit);
-e syntax OK
$ perl -MO=Concise,-exec -e 'print $foo, exit;'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark s
4 <#> gvsv[*foo] s # add $foo to the argument list
5 <0> exit s # call `exit` and add its return value to the list
6 <@> print vK # print the list, but we never get here
7 <@> leave[1 ref] vKP/REFC
-e syntax OK
So as you can see, the exit builtin is run while trying to assemble the argument list for print. Since exit causes the program to quit, the print command never gets to run.
And the last one:
$ perl -MO=Deparse,-p -e 'print ($foo & 255) + 1, "\n";'
((print(($foo & 255)) + 1), '???'); # '???' means this was optimized away
-e syntax OK
$ perl -MO=Concise,-exec -e 'print ($foo & 255) + 1, "\n";'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <0> pushmark v
4 <0> pushmark s
5 <#> gvsv[*foo] s
6 <$> const[IV 255] s
7 <2> bit_and[t2] sK
8 <@> print sK
9 <$> const[IV 1] s
a <2> add[t3] vK/2
b <@> list vK
c <@> leave[1 ref] vKP/REFC
-e syntax OK
sort is evaluated when it's called, it really doesn't have anything to do with the assignment. sort returns a list. So what you're assigning is:
@ary = (1, 3, (2,4) );
Perl ignores the second parenthesis so you end up with 1,3,2,4 as you would expect.
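A quick check of that claim, as a sketch (the array name @same is made up for the comparison): the list built with sort and the list with explicit nested parentheses come out identical.

```perl
use strict;
use warnings;

# sort takes everything to its right (4, 2) as its argument list.
my @ary = (1, 3, sort 4, 2);

# Explicit parentheses give the same list; nested lists are flattened.
my @same = (1, 3, (2, 4));

print "@ary\n";     # 1 3 2 4
print "@same\n";    # 1 3 2 4
```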
The comma you're referring to no longer exists. It's the second argument to sort. Perl sees your list as a 3 item list not a 4 item list (it expands it to 4 in the assignment)
rightward does something with the parameters (e.g. printing them out or storing them), leftward does something TO the parameters, usually by modifying them in some way.
print acts like any other function in Perl (or any other language I've ever used for that matter). If you call a function as an argument, the return value of that function is given as the argument. So your case of:
print ($foo, exit);
or equivalent (the parens don't matter)
print $foo, exit;
does nothing, because you're asking it to print the return value of exit. Your program exits first so you get nothing back. I don't understand why you'd expect exit to be treated as a string. exit is a function in all contexts unless you quoted it.
print ($foo & 255) + 1,"\n";
From perlop which gives this example:
probably doesn't do what you expect at first glance. The parentheses
enclose the argument list for "print" which is evaluated (printing the
result of "$foo & 255"). Then one is added to the return value of
"print" (usually 1). The result is something like this:
1 + 1, "\n"; # Obviously not what you meant.
To do what you meant properly, you must write:
print(($foo & 255) + 1, "\n");
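To see the corrected form in action, here is a small sketch with an arbitrary sample value for $foo. The second statement uses a leading unary plus, which is another common way to keep print from treating the parentheses as its argument list.

```perl
use strict;
use warnings;

my $foo = 511;    # arbitrary sample value: 511 & 255 is 255

# The outer parentheses belong to print; the inner ones group the expression.
print(($foo & 255) + 1, "\n");    # 256

# A leading unary plus also stops print from claiming the parentheses.
print +($foo & 255) + 1, "\n";    # 256
```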
Not sure if what follows is perfectly accurate (it's a mishmash from IRC, the above mentioned answers, google and my interpretation of the book)
(operator)(operands) this is viewed as a leftward operator because it's to the left of the operands. (operands)(operator) this is viewed as a rightward operator because it's to the right of the operands. So, (1, 2, 3, sort 4, 5, sort 6, 7) Here the second sort acts as both a leftward and a rightward operator! sort 6, 7 is leftward as in to the left of (6, 7), its operands. It's also to the right of sort(4, 5 so here it's rightward and of very low precedence.
2.
@ary = (1, 3, sort 4, 2);
print @ary;
here, sort is a leftward list operator so straight away its precedence is highest and as 'Cfreak' says..
print($foo, exit); print $foo, exit;
Here, print is a leftward list operator, so highest precedence, and so it should execute first, BUT to execute it has to resolve its arguments, including the bareword 'exit'. To resolve it, I guess it runs exit. Ergo, print $foo, ... will gobble up all its arguments, then it has to process them, and at the bareword it runs it.
print ($foo & 255) same as above. print gets highest precedence but it needs to now resolve its various arguments.. so $foo & 255 etc as 'Cfreak' explained.
Many thanks guys!