my $x = do { 3; } if 1; say $x # works
my $x = (do { 3; } if 1); say $x # syntax error
How come? If the do block is an expression, why can't it be parenthesised? If it's not, how does the first one parse?
A compound statement used for flow control (if BLOCK), as well as one with a statement modifier (used here, the postfix if), cannot appear inside parentheses.
This restriction makes sense, since such a statement may or may not return a value:
if executes the statement once if and only if the condition is true.
(original emphasis)
A side note: the first example runs without warnings, but it has undefined behavior, which must be avoided. From the end of the section Statement Modifiers in perlsyn:
NOTE: The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ...) is undefined. The value of the my variable may be undef, any previously assigned value, or possibly anything else. Don't rely on it. Future versions of perl might do something different from the version of perl you try it out on. Here be dragons.
(original emphasis)
Any instances of this should be rewritten. Perl::Critic has a policy for it, which makes them easier to find.
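As a minimal sketch of such a rewrite (the sub names and $cond are hypothetical, not from the question): declare the variable unconditionally and make only the assignment conditional, or use the conditional operator, which is an expression and so may sit inside parentheses.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.

# Declare first, then assign under the condition.
# $x is reliably undef when the condition is false.
sub value_if {
    my ($cond) = @_;
    my $x;
    $x = do { 3 } if $cond;
    return $x;
}

# Same result with the conditional operator, which is an
# expression and therefore allowed inside parentheses.
sub value_ternary {
    my ($cond) = @_;
    my $x = ( $cond ? do { 3 } : undef );
    return $x;
}
```

In both versions the my is always executed, so the undefined behaviour described in perlsyn does not come into play.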
It's not the do that's the problem, it's the postfix if. That postfix can't appear inside the parens:
$ perl -E 'my $x = ( 1 if 1); say $x'
syntax error at -e line 1, near "1 if"
Execution of -e aborted due to compilation errors.
Instead, you can use the conditional operator ?: with a do in one of the branches:
$ perl -E 'my $x = ( time % 2 ? do { 1 } : () ); say $x'
my $x = do { 3; } if 1;
is actually equivalent to
( my $x = do { 3; } ) if 1;
Note that you shouldn't execute my conditionally. (More precisely, you shouldn't use a my variable that hasn't been executed. Your code is technically ok since the my is always executed before $x is used.)
An expression (my $x = do { ... }) modified by a statement modifier (if 1) is a statement.
The inside of parens must be an expression, not a statement.
You can't do
( $x = 3; )
( sub f { } )
( if (f()) { g() } )
( g() if f(); )
( g() if f() )
You get the idea.
if 1 is a statement modifier. my $x = do { 3; } is a statement; do { 3; } is an expression.
I came across the following Perl subroutine get_billable_pages while chasing a bug. It takes 12 arguments.
sub get_billable_pages {
my ($dbc,
$bill_pages, $page_count, $cover_page_count,
$domain_det_page, $bill_cover_page, $virtual_page_billing,
$job, $bsj, $xqn,
$direction, $attempt,
) = @_;
my $billable_pages = 0;
if ($virtual_page_billing) {
my @row;
### Below is testing on the existence of the 11th and 12th parameters ###
if ( length($direction) && length($attempt) ) {
$dbc->xdb_execute("
SELECT convert(int, value)
FROM job_attribute_detail_atmp_tbl
WHERE job = $job
AND billing_sub_job = $bsj
AND xqn = $xqn
AND direction = '$direction'
AND attempt = $attempt
AND attribute = 1
");
}
else {
$dbc->xdb_execute("
SELECT convert(int, value)
FROM job_attribute_detail_tbl
WHERE job = $job
AND billing_sub_job = $bsj
AND xqn = $xqn
AND attribute = 1
");
}
$cnt = 0;
...;
But it is sometimes called with only 10 arguments:
$tmp_det = get_billable_pages(
$dbc2,
$row[6], $row[8], $row[7],
$domain_det_page, $bill_cover_page, $virtual_page_billing,
$job1, $bsj1, $row[3],
);
The function does a check on the 11th and 12th arguments.
What are the 11th and 12th arguments when the function is passed only 10 arguments?
Is it a bug to call the function with only 10 arguments because the 11th and 12th arguments end up being random values?
I am thinking this may be the source of the bug because the 12th argument had a funky value when the program failed.
I did not see another definition of the function which takes only 10 arguments.
The values are copied out of the parameter array @_ to the list of scalar variables.
If the array is shorter than the list, then the excess variables are set to undef. If the array is longer than the list, then excess array elements are ignored.
Note that the original array @_ is unmodified by the assignment. No values are created or lost, so it remains the definitive source of the actual parameters passed when the subroutine is called.
ikegami suggested that I should provide some Perl code to demonstrate the assignment of arrays to lists of scalars. Here is that Perl code, based mostly on his edit
use strict;
use warnings;
use Data::Dumper;
my $x = 44; # Make sure that we
my $y = 55; # know if they change
my @params = (8); # Make a dummy parameter array with only one value
($x, $y) = @params; # Copy as if this were a subroutine
print Dumper $x, $y; # Let's see our parameters
print Dumper \@params; # And how the parameter array looks
output
$VAR1 = 8;
$VAR2 = undef;
$VAR1 = [ 8 ];
So both $x and $y are modified, but if there are insufficient values in the array then undef is used instead. It is as if the source array was extended indefinitely with undef elements.
Now let's look at the logic of the Perl code. undef evaluates as false for the purposes of conditional tests, but you apply the length operator like this
if ( length($direction) && length($attempt) ) { ... }
If you have use warnings in place as you should, Perl would normally produce a Use of uninitialized value warning. However length is unusual in that, if you ask for the length of an undef value (and you are running version 12 or later of Perl 5) it will just return undef instead of warning you.
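Here is a small sketch of how that test behaves when the trailing arguments are missing. The sub names and the 'atmp'/'plain' labels are made up for illustration; only the length(...) && length(...) condition comes from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: reports which branch the length() test would take.
sub branch_for {
    my ($direction, $attempt) = @_;    # missing arguments become undef
    # length(undef) is undef (false) on Perl 5.12+, so this is false
    # whenever either argument is missing or an empty string.
    return length($direction) && length($attempt) ? 'atmp' : 'plain';
}

# Hypothetical helper: how many arguments were actually passed.
sub arg_count {
    return scalar @_;
}
```

Note that an attempt of 0 still selects the 'atmp' branch, because length("0") is 1. If you need to tell "not passed" apart from "passed as undef", checking scalar @_ is the only reliable way.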
Regarding "I did not see another definition of the function which takes only 10 arguments", Perl doesn't have function overloading like C++ and Java - it is up to the code in the subroutine to look at what it has been passed and behave accordingly.
No, it's not a bug. The remaining arguments are "undef" and you can check for this situation
sub foo {
my ($x, $y) = @_;
print " x is undef\n" unless defined $x;
print " y is undef\n" unless defined $y;
}
foo(1);
prints
y is undef
I have the following code:
# List of tests
my $tests = [("system_test_builtins_sin", "system_test_builtins_cos", "system_test_builtins_tan")];
# Provide overrides for certain variables that may be needed because of special cases
# For example, cos must be executed 100 times and sin only 5 times.
my %testOverrides = (
system_test_builtins_sin => {
reps => 5,
},
system_test_builtins_cos => {
reps => 100,
},
);
my %testDefaults = (
system_test_reps => 10,
);
# Execute a system tests
foreach my $testName (@$tests)
{
print "Executing $testName\n";
my $reps;
if (exists $testOverrides{$testName}{reps})
{ $reps = $testOverrides{$testName}{reps}; }
else
{ $reps = $testDefaults{system_test_reps}; }
print "After long if: $reps\n";
exists $testOverrides{$testName}{reps} ? $reps = $testOverrides{$testName}{reps} : $reps = $testDefaults{system_test_reps};
print "After first ternary: $reps\n";
exists $testOverrides{$testName}{reps} ? $reps = $testOverrides{$testName}{reps} : print "Override not found.\n";
print "After second ternary: $reps\n";
}
This gives the following output:
Executing system_test_builtins_sin
After long if: 5
After first ternary: 10
After second ternary: 5
Executing system_test_builtins_cos
After long if: 100
After first ternary: 10
After second ternary: 100
Executing system_test_builtins_tan
After long if: 10
After first ternary: 10
Override not found.
After second ternary: 10
This output is most unexpected! I don't understand why the first ternary seems to always be executing the "if false" clause. It is always assigning a value of 10. I also tried changing the "false" clause to $reps = 6, and I saw that it always got the value of 6. Why does the ternary's logic depend on the content of the third (if false) clause?
Here's a simpler script that illustrates the problem:
#!/usr/bin/perl
use strict;
use warnings;
my $x;
1 ? $x = 1 : $x = 0;
print "Without parentheses, \$x = $x\n";
1 ? ($x = 1) : ($x = 0);
print "With parentheses, \$x = $x\n";
It produces this output:
Without parentheses, $x = 0
With parentheses, $x = 1
I'm not sure that the relationship between assignment and ?: can be completely explained by operator precedence. (For example, I believe C and C++ can behave differently in some cases.)
Run perldoc perlop and search for "Conditional Operator", or look here; it covers this exact issue (more concisely than I did here).
In any case, I think that using an if/else statement would be clearer than using the ?: operator. Or, since both the "true" and "false" branches assign to the same variable, a better use of ?: would be to change this:
exists $testOverrides{$testName}{reps}
? $reps = $testOverrides{$testName}{reps}
: $reps = $testDefaults{system_test_reps};
to this:
$reps = ( exists $testOverrides{$testName}{reps}
? $testOverrides{$testName}{reps}
: $testDefaults{system_test_reps} );
But again, the fact that I had to wrap the line to avoid scrolling is a good indication that an if/else would be clearer.
You might also consider using the // operator, unless you're stuck with an ancient version of Perl that doesn't support it. (It was introduced by Perl 5.10.) It's also known as the "defined-or" operator. This:
$x // $y
is equivalent to
defined($x) ? $x : $y
So you could write:
$reps = $testOverrides{$testName}{reps} // $testDefaults{system_test_reps};
This doesn't have exactly the same semantics, since it tests the expression using defined rather than exists; it will behave differently if $testOverrides{$testName}{reps} exists but has the value undef.
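To make that difference concrete, here is a small sketch. The hash contents, the default, and the sub names are hypothetical; the point is only the exists-but-undef case.

```perl
use strict;
use warnings;

# Hypothetical data: 'reps' exists for test_a but holds undef.
my %overrides = (
    test_a => { reps => undef },
    test_b => { reps => 5 },
);
my $default = 10;

# Tests for the presence of the key, whatever its value.
sub reps_exists {
    my ($name) = @_;
    return exists $overrides{$name}{reps} ? $overrides{$name}{reps} : $default;
}

# Tests whether the value is defined (needs Perl 5.10 or later).
sub reps_defined_or {
    my ($name) = @_;
    return $overrides{$name}{reps} // $default;
}
```

For test_a the first version returns undef while the second falls back to 10; for test_b and for a name with no override at all, both agree.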
The -p option to B::Deparse is illuminative for problems like this:
$ perl -MO=Deparse,-p -e '$condition ? $x = $value1 : $x = $value2'
(($condition ? ($x = $value1) : $x) = $value2);
As Keith Thompson points out, it is all about the precedence. If the condition is false, the ultimate assignment is $x = $value2. If the condition is true, then the assignment is ($x = $value1) = $value2 -- either way the outcome is to assign $value2 to $x.
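A short sketch of the same effect, wrapped in subs so the results are easy to check (the sub and variable names are hypothetical). The last sub relies on the fact, documented in perlop, that ?: can be assigned to when both branches are legal lvalues.

```perl
use strict;
use warnings;

# Without parentheses the trailing "= 0" applies to the whole ?: expression.
sub unparenthesised {
    my ($cond) = @_;
    my $x;
    $cond ? $x = 1 : $x = 0;    # parsed as ($cond ? ($x = 1) : $x) = 0
    return $x;
}

# With parentheses each branch is its own assignment.
sub parenthesised {
    my ($cond) = @_;
    my $x;
    $cond ? ($x = 1) : ($x = 0);
    return $x;
}

# The conditional operator used deliberately as an lvalue.
sub assign_to_chosen {
    my ($cond) = @_;
    my ($first, $second) = (0, 0);
    ( $cond ? $first : $second ) = 5;
    return ($first, $second);
}
```

unparenthesised returns 0 whatever the condition is, which matches the Deparse output above.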
I would do it this way (I don't mind using ternary operators)
$reps = exists($testOverrides{$testName}{reps}) ?
$testOverrides{$testName}{reps} :
$testDefaults{system_test_reps};
HTH
Thanks for giving us a sample of your code. Now we can tear it to pieces.
Don't use the ternary operator. It's an infection left over to originally make C programmers feel comfortable. In C, the ternary operator was used because it was originally more efficient than an if/else statement. However, compilers are pretty good about optimizing code, so that's no longer true and now it's discouraged in C and C++ programming. Programming in C is hard enough as it is without ternary operators mucking about.
The Perl compiler is also extremely efficient at optimizing your code, so you should always write for maximum clarity, so that others who aren't as good at programming and get stuck maintaining your code can muddle through their job.
The problem you're having is one of operator precedence. You're assuming this:
(exists $testOverrides{$testName}{reps})
? ($reps = $testOverrides{$testName}{reps})
: ($reps = $testDefaults{system_test_reps});
I would too. After all, that's what I pretty much mean. However, the assignment operator has lower precedence than the ternary operator. What's really happening is this:
((exists $testOverrides{$testName}{reps})
? ($reps = $testOverrides{$testName}{reps}) : ($reps))
= $testDefaults{system_test_reps};
so, the final assignment is always happening to $reps.
It's much better if you use if/else:
if (exists $testOverrides{$testName}{reps}) {
$reps = $testOverrides{$testName}{reps};
}
else {
$reps = $testDefaults{system_test_reps};
}
No precedence issues, easier to read, and just as efficient.
for(1){
print 1;
}
do {
print 1;
}
Is it true?
Or is there any special case where these two aren't equivalent?
One difference is that for(1) sets $_ to the value of 1, as well:
for(1){
print $_; # prints 1
}
Also, do returns the value of the last command in the sequence:
my $x = do { 1 }; # $x = 1
my $y = for(1){ 1 }; # invalid
You might really be looking for just plain curlies.
{
print 1;
}
It has the following benefits:
Creates a lexical scope (like for (1) and do {}).
You can use next, last and redo in them (like for (1)).
It doesn't mask $_ (like do {}).
But
It can only be used where a statement is expected (like for (1), but unlike do {}).
Therefore, { ... } makes more sense than for (1) { ... }, and do { ... } is useful when you want to return a value.
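A quick sketch of both forms, with hypothetical sub names: a bare block that is left early with last, and a do block whose final value is assigned.

```perl
use strict;
use warnings;

# A bare block counts as a loop that runs once, so last works in it.
sub first_positive {
    my @nums = @_;
    my $found;
    {
        last unless @nums;                  # leave the bare block early
        ($found) = grep { $_ > 0 } @nums;   # first positive element, if any
    }
    return $found;
}

# do BLOCK is an expression, so its last value can be assigned.
sub classify {
    my ($n) = @_;
    my $label = do {
        my $abs = abs $n;                   # $abs is scoped to the do block
        $abs > 9 ? 'big' : 'small';
    };
    return $label;
}
```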
About the same.
You can next, last and redo a for loop, but a do is not a loop--including as part of a do-while "loop". So in a non-trivial block, you couldn't be sure. However, this will work:
do {{
...
}};
Also do will not automatically set $_ to each member of the list, the way a bare for loop will.
No. They have different compilation properties and have different effects. They are similar in only one dimension, that being that the code they introduce will not be looped over -- something they have in common with other constructs, including bare blocks and (sub {...})->().
Here's an obvious difference: for (LIST) BLOCK is a loop, whereas do BLOCK is an expression. This means that
for (1) {
say "Blurgh"
} unless 1;
doesn't compile, whereas
do {
say "Blurgh"
} unless 1;
does.
I like perl the more I am getting into it but I had a question about a line I saw in a subroutine in a module I am looking through.
my $var = 1;
....
....
....
....
$var;
What throws me is just seeing that $var all by itself on a line. Is that just a roundabout way of returning 1 ?
Many thanks!
Jane
In perl the value of a block is the value of the last expression in the block. That is just a shorthand for return $var.
EDIT: Purists point out that blocks in general do not return values (like they do in Scala, for example) so you can't write:
my $x = if (cond) { 7 } else { 8 }; # wrong!
The implicit return value of a subroutine, eval or do FILE is the last expression evaluated. That last expression can be inside a block, though:
sub f {
my $cond = shift;
if ($cond) { 7 } else { 8 } # successfully returns 7 or 8 from f()
}
There is the superficial appearance of the if/else blocks returning a value, even though, strictly speaking, they don't.
Quoting the last line of perldoc -f return:
In the absence of an explicit return, a subroutine, eval, or do FILE automatically returns the value of the last expression evaluated.
I'm dismayed. OK, so this was probably the most fun Perl bug I've ever found. Even today I'm learning new stuff about Perl. Essentially, the flip-flop operator .., which returns false until the left-hand side returns true and then true until the right-hand side returns true, keeps global state (or that is what I assume).
Can I reset it (perhaps this would be a good addition to the Perl 4-esque, hardly ever used reset())? Or is there no way to use this operator safely?
I also don't see this (the global context bit) documented anywhere in perldoc perlop. Is this a mistake?
Code
use feature ':5.10';
use strict;
use warnings;
sub search {
my $arr = shift;
grep { !( /start/ .. /never_exist/ ) } @$arr;
}
my @foo = qw/foo bar start baz end quz quz/;
my @bar = qw/foo bar start baz end quz quz/;
say 'first shot - foo';
say for search \@foo;
say 'second shot - bar';
say for search \@bar;
Spoiler
$ perl test.pl
first shot - foo
foo
bar
second shot - bar
Can someone clarify what the issue with the documentation is? It clearly indicates:
Each ".." operator maintains its own boolean state.
There is some vagueness there about what "Each" means, but I don't think the documentation would be well served by a complex explanation.
Note that Perl's other iterators (each or scalar context glob) can lead to the same problems. Because the state for each is bound to a particular hash, not a particular bit of code, each can be reset by calling (even in void context) keys on the hash. But for glob or .., there is no reset mechanism available except by calling the iterator until it is reset. A sample glob bug:
sub globme {
print "globbing $_[0]:\n";
print "got: ".glob("{$_[0]}")."\n" for 1..2;
}
globme("a,b,c");
globme("d,e,f");
__END__
globbing a,b,c:
got: a
got: b
globbing d,e,f:
got: c
Use of uninitialized value in concatenation (.) or string at - line 3.
got:
For the overly curious, here are some examples where the same .. in the source is a different .. operator:
Separate closures:
sub make_closure {
my $x;
return sub {
$x if 0; # Look, ma, I'm a closure
scalar( $^O..!$^O ); # handy values of true..false that don't trigger ..'s implicit comparison to $.
}
}
print make_closure()->(), make_closure()->();
__END__
11
Comment out the $x if 0 line to see that non-closures have a single .. operation shared by all "copies", with the output being 12.
Threads:
use threads;
sub coderef { sub { scalar( $^O..!$^O ) } }
coderef()->();
print threads->create( coderef() )->join(), threads->create( coderef() )->join();
__END__
22
Threaded code starts with whatever the state of the .. had been before thread creation, but changes to its state in the thread are isolated from affecting anything else.
Recursion:
sub flopme {
my $recurse = $_[0];
flopme($recurse-1) if $recurse;
print " "x$recurse, scalar( $^O..!$^O ), "\n";
flopme($recurse-1) if $recurse;
}
flopme(2)
__END__
1
1
2
1
3
2
4
Each depth of recursion is a separate .. operator.
The trick is not to use the same flip-flop twice, so you have no state to worry about. Just make a generator function to give you a new subroutine with a new flip-flop that you only use once:
sub make_search {
my( $left, $right ) = @_;
sub {
grep { !( /\Q$left\E/ .. /\Q$right\E/ ) } @{$_[0]};
}
}
my $search_sub1 = make_search( 'start', 'never_existed' );
my $search_sub2 = make_search( 'start', 'never_existed' );
my @foo = qw/foo bar start baz end quz quz/;
my $count1 = $search_sub1->( \@foo );
my $count2 = $search_sub2->( \@foo );
print "count1 $count1 and count2 $count2\n";
I also write about this in Make exclusive flip-flop operators.
The "range operator" .. is documented in perlop under "Range Operators". Looking through the documentation, it appears that there isn't any way to reset the state of the .. operator. Each instance of the .. operator keeps its own state, which means there isn't any way to refer to the state of any particular .. operator.
It looks like it's designed for very small scripts such as:
if (101 .. 200) { print; }
The documentation states that this is short for
if ($. == 101 .. $. == 200) { print; }
Somehow the use of $. is implicit there (toolic points out in a comment that that's documented too). The idea seems to be that this loop runs once (until $. == 200) in a given instance of the Perl interpreter, and therefore you don't need to worry about resetting the state of the .. flip-flop.
This operator doesn't seem too useful in a more general reusable context, for the reasons you've identified.
A workaround/hack/cheat for your particular case is to append the end value to your array:
sub search {
my $arr = shift;
grep { !( /start/ .. /never_exist/ ) } @$arr, 'never_exist';
}
This will guarantee that the RHS of range operator will eventually be true.
Of course, this is in no way a general solution.
In my opinion, this behavior is not clearly documented. If you can construct a clear explanation, you could apply a patch to perlop.pod via perlbug.
I found this problem, and as far as I know there's no way to fix it. The upshot is - don't use the .. operator in functions, unless you are sure you are leaving it in the false state when you leave the function, otherwise the function may return different output for the same input (or exhibit different behaviour for the same input).
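If you'd rather sidestep the operator inside a function altogether, one possible alternative (not what the question's code does) is to keep the range state in an ordinary lexical. This is a rough sketch; the sub name and its regex parameters are hypothetical, and it tries to mimic the two-dot form, where the end pattern is also tested on the element that matched the start.

```perl
use strict;
use warnings;

# Returns the elements that fall outside every start..end range.
# The state lives in a lexical, so each call starts from "false".
sub search_outside_range {
    my ($arr, $start_re, $end_re) = @_;
    my $in_range = 0;                        # fresh state on every call
    return grep {
        $in_range = 1 if !$in_range && /$start_re/;
        my $inside = $in_range;              # this element is in the range
        $in_range = 0 if $in_range && /$end_re/;
        !$inside;
    } @$arr;
}
```

Called twice with the question's data, qr/start/ and qr/never_exist/, it gives foo and bar both times, because nothing carries over between calls.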
Each use of the .. operator maintains its own state. Like Alex Brown said, you need to leave it in the false state when you leave the function. Maybe you could do something like:
sub search {
my $arr = shift;
grep { !( /start/ || $_ eq "my magic reset string" ..
/never_exist/ || $_ eq "my magic reset string" ) }
(@$arr, "my magic reset string");
}