How to convince Devel::Trace to print the BEGIN-block statements?

I have a simple script, p.pl:
use strict;
use warnings;
our $x;
BEGIN {
$x = 42;
}
print "$x\n";
When I run it as:
perl -d:Trace p.pl
it prints:
>> p.pl:3: our $x;
>> p.pl:7: print "$x\n";
42
How can I get the BEGIN block statements printed too, e.g. the $x = 42;?
Because my intention wasn't clear, here is a clarification:
I am looking for ANY way to print statements as the Perl script runs (as Devel::Trace does), including the statements in the BEGIN block.

It's very possible. Set $DB::single in an early BEGIN block.
use strict;
use warnings;
our $x;
BEGIN { $DB::single = 1 }
BEGIN {
$x = 42;
}
print "$x\n";
$DB::single is a debugger variable that determines whether the DB::DB function is invoked at each line. It is usually false during the compilation phase, but you can set it there from a BEGIN block.
This trick is also helpful to set a breakpoint inside a BEGIN block when you want to debug compile-time code in the standard debugger.

Disclaimer: This is just an attempt to explain the behaviour.
Devel::Trace hooks up to the Perl debugging API through the DB model. That is just code. It installs a sub DB::DB.
The big question is, when is that executed. According to perlmod, there are five block types that are executed at specific points during execution. One of them is BEGIN, which is the first.
Consider this program.
use strict;
use warnings;
our ($x, $y);
BEGIN { $x = '42' }
UNITCHECK { 'unitcheck' }
CHECK { 'check' }
INIT { 'init' }
END { 'end' }
print "$x\n";
This will output the following:
>> trace.pl:8: INIT { 'init' }
>> trace.pl:3: our ($x, $y);
>> trace.pl:11: print "$x\n";
42
>> trace.pl:9: END { 'end' }
So Devel::Trace sees the INIT block and the END block. But why the INIT block?
Above mentioned perlmod says:
INIT blocks are run just before the Perl runtime begins execution, in "first in, first out" (FIFO) order.
Apparently at that phase, the DB::DB has already been installed. I could not find any documentation that says when a sub definition is run exactly. However, it seems it's after BEGIN and before INIT. Hence, it does not see whatever goes on in the BEGIN.
Adding a BEGIN { $Devel::Trace::TRACE = 1 } to the beginning of the file also does not help.
I rummaged around in the documentation for perldebug and the like, but could not find an explanation of this behaviour. My guess is that the debugger interface doesn't know about BEGIN at all. BEGIN blocks are executed very early, after all (consider that e.g. perl -c -E 'BEGIN{ say "foo" } say "bar"' will print foo).


Perl eval scope

According to perldoc, String Eval should be performed in the current scope. But the following simple test seems to contradict this.
We need the following two simple files to set up the test. Please put them under the same folder.
test_eval_scope.pm
package test_eval_scope;
use strict;
use warnings;
my %h = (a=>'b');
sub f1 {
eval 'print %h, "\n"';
# print %h, "\n"; # this would work
# my $dummy = \%h; # adding this would also work
}
1
test_eval_scope.pl
#!/usr/bin/perl
use File::Basename;
use lib dirname (__FILE__);
use test_eval_scope;
test_eval_scope::f1();
When I run the program, I got the following error
$ test_eval_scope.pl
Variable "%h" is not available at (eval 1) line 1.
My question is why the variable %h is out of scope.
I have done some modification, and found the following:
If I run without eval(), as in the above comment, it will work, meaning that %h should be in scope.
If I just add a seemingly useless mention of %h in the code, as in the above comment, eval() will work too.
If I combine the pl and pm files into one file, eval() will work too.
If I declare %h with 'our' instead of 'my', eval() will work too.
I encountered this question when I was writing a big program which parsed user-provided code at run time. I don't need solutions, as I have plenty of workarounds above. But I cannot explain why the above code doesn't work. This affects my perl pride.
My perl version is v5.26.1 on linux.
Thank you for your help!
Subs only capture variables they use. Since f1 doesn't use %h, it doesn't capture it, and %h becomes inaccessible to f1 after it goes out of scope when the module finishes executing.
Any reference to the var, including one that's optimized away, causes the sub to capture the variable. As such, the following does work:
sub f1 {
%h if 0;
eval 'print %h, "\n"';
}
Demo:
$ perl -M5.010 -we'
{
my $x = "x";
sub f { eval q{$x} }
sub g { $x if 0; eval q{$x} }
}
say "f: ", f();
say "g: ", g();
'
Variable "$x" is not available at (eval 1) line 1.
Use of uninitialized value in say at -e line 8.
f:
g: x

Conditionally including a module in perl

I am new to perl, just encountered one case.
Can someone tell me why this fails with the error
Undefined subroutine &main::color
$condition = 1;
use if ( $condition ), Term::ANSIColor;
print color('bold red');
print "hii";
print color('reset');
and this passes
use if ( 1 ), Term::ANSIColor;
print color('bold red');
print "hii";
print color('reset');
This is because use statements are executed at compile time, while your assignment is performed at run time and hasn't been executed yet.
You can fix this by using a BEGIN block to do the assignment at compile time as well, like this. Note that the variable must be declared outside the block, otherwise it will be local to the block and will disappear before it is needed.
my $condition;
BEGIN {
$condition = 1;
}
use if $condition, 'Term::ANSIColor';
print color('bold red');
print "hii";
print color('reset');
Note also that you should always have use strict and use warnings 'all' at the top of every Perl program. If you had these in place, you would need to quote the module name, as shown above.
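If the condition is genuinely only known at run time, another option is to skip use altogether and call require yourself. The following is only a sketch: $condition, $has_color and the paint helper are made-up names for illustration, and the sub is called by its fully qualified name so nothing has to be imported at compile time.

```perl
use strict;
use warnings;

my $condition = 1;    # hypothetical flag, only known at run time

# require is an ordinary run-time statement, so it can depend on $condition.
my $has_color = $condition && eval { require Term::ANSIColor; 1 };

# Hypothetical helper: add color codes only when the module was loaded.
sub paint {
    my ($style, $text) = @_;
    return $text unless $has_color;
    return Term::ANSIColor::color($style) . $text . Term::ANSIColor::color('reset');
}

print paint('bold red', "hii"), "\n";
```

With $condition set to 0 the same call simply prints the plain text.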

Can someone explain why Perl behaves this way (variable scoping)?

My test goes like this:
use strict;
use warnings;
func();
my $string = 'string';
func();
sub func {
print $string, "\n";
}
And the result is:
Use of uninitialized value $string in print at test.pl line 10.
string
Perl allows us to call a function before it has been defined. However when the function uses a variable declared only after the function call, the variable appears to be undefined. Is this behavior documented somewhere? Thank you!
The behaviour of my is documented in perlsub - it boils down to this - perl knows $string is in scope - because the my tells it so.
The my operator declares the listed variables to be lexically confined to the enclosing block, conditional (if/unless/elsif/else), loop (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd file.
It means it's 'in scope' from the point at which it's first 'seen' until the closing bracket of the current 'block'. (Or in your example - the end of the code)
However - in your example my also assigns a value.
This scoping process happens at compile time - where perl checks whether it's valid to use $string or not. (Thanks to strict). However - it can't know what the value was, because that might change during code execution. (and is non-trivial to analyze)
So if you do this it might be a little clearer what's going on:
#!/usr/bin/env perl
use strict;
use warnings;
my $string; #undefined
func();
$string = 'string';
func();
sub func {
print $string, "\n";
}
$string is in scope in both cases - because the my happened at compile time - before the subroutine has been called - but it doesn't have a value set beyond the default of undef prior to the first invocation.
Note this contrasts with:
#!/usr/bin/env perl
use strict;
use warnings;
sub func {
print $string, "\n";
}
my $string; #undefined
func();
$string = 'string';
func();
Which errors because when the sub is declared, $string isn't in scope.
First of all, I would consider this undefined behaviour since it skips executing my like my $x if $cond; does.
That said, the behaviour is currently consistent and predictable. And in this instance, it behaves exactly as expected if the optimization that warranted the undefined behaviour notice didn't exist.
At compile-time, my has the effect of declaring and allocating the variable[1]. Scalars are initialized to undef when created. Arrays and hashes are created empty.
my $string was encountered by the compiler, so the variable was created. But since you haven't executed the assignment yet, it still has its default value (undefined) during the first call to func.
This model allows variables to be captured by closures.
Example 1:
{
my $x = "abc";
sub foo { $x } # Named subs capture at compile-time.
}
say foo(); # abc, even though $x fell out of scope before foo was called.
Example 2:
sub make_closure {
my ($x) = @_;
return sub { $x }; # Anon subs capture at run-time.
}
my $foo = make_closure("foo");
my $bar = make_closure("bar");
say $foo->(); # foo
say $bar->(); # bar
[1] The allocation is possibly deferred until the variable is actually used.

break out of a subroutine

What is the best way to break out of a subroutine and continue processing the rest of the script?
i.e.
#!/usr/bin/perl
use strict;
use warnings;
&mySub;
print "we executed the sub partway through & continued w/ the rest
of the script...yipee!\n";
sub mySub{
print "entered sub\n";
#### Options
#exit; # will kill the script...we don't want to use exit
#next; # perldoc says not to use this to breakout of a sub
#last; # perldoc says not to use this to breakout of a sub
#any other options????
print "we should NOT see this\n";
}
At the expense of stating the obvious, the best way of returning from a subroutine is ......
return
Unless there is some hidden subtlety in the question that isn't made clear
Edit - maybe I see what you are getting at
If you write a loop, then a valid way of getting out of the loop is to use last
use strict ;
use warnings ;
while (<>) {
last if /getout/ ;
do_something() ;
}
If you refactor this, you might end up using last to get out of the subroutine.
use strict ;
use warnings ;
while (<>) {
process_line() ;
do_something() ;
}
sub process_line {
last if /getout/ ;
print "continuing \n" ;
}
This means you are using last where you should be using return, and if you have warnings in place you get the warning:
Exiting subroutine via last at ..... some file ...
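One way to keep the refactored version honest is to let the sub return a flag and leave the last in the loop itself. This is just a sketch of that idea; the input list and the @processed array are made up for illustration, and the loop reads from a list rather than <> so it can be run as is.

```perl
use strict;
use warnings;

# Returns false when the caller should stop looping.
sub process_line {
    my ($line) = @_;
    return 0 if $line =~ /getout/;    # leave the sub early with return
    return 1;
}

my @processed;
for my $line ("first", "second", "getout", "never reached") {
    last unless process_line($line);  # the loop itself decides to stop
    push @processed, $line;
}

print "@processed\n";    # first second
```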
Don't use exit to abort a subroutine if there's any chance that someone might want to trap whatever error happened. Use die instead, which can be trapped by an eval.
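A small sketch of that pattern, with a made-up sub name (fragile_step) and error message: the sub dies partway through, the caller traps it with a block eval, and the script carries on.

```perl
use strict;
use warnings;

# Hypothetical sub that gives up partway through.
sub fragile_step {
    die "could not finish\n";          # unlike exit, this can be trapped
    return "not reached";
}

my $result = eval { fragile_step() };  # eval traps the die and returns undef
my $error  = $@;                       # copy $@ right away, it is easily clobbered

print "trapped: $error" unless defined $result;
print "the script keeps running\n";
```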

What is the role of the BEGIN block in Perl?

I know that the BEGIN block is compiled and executed before the main body of a Perl program. If you're not sure of that just try running the command perl -cw over this:
#!/ms/dist/perl5/bin/perl5.8
use strict;
use warnings;
BEGIN {
print "Hello from the BEGIN block\n";
}
END {
print "Hello from the END block\n";
}
I have been taught that early compilation and execution of a BEGIN block lets a programmer ensure that any needed resources are available before the main program is executed.
And so I have been using BEGIN blocks to make sure that things like DB connections have been established and are available for use by the main program. Similarly, I use END blocks to ensure that all resources are closed, deleted, terminated, etc. before the program terminates.
After a discussion this morning, I am wondering if this is the wrong way to look at BEGIN and END blocks.
What is the intended role of a BEGIN block in Perl?
Update 1: Just found out why the DBI connect didn't work. After being given this little Perl program:
use strict;
use warnings;
my $x = 12;
BEGIN {
$x = 14;
}
print "$x\n";
when executed it prints 12.
Update 2: Thanks to Eric Strom's comment below this new version makes it clearer:
use strict;
use warnings;
my $x = 12;
my $y;
BEGIN {
$x = 14;
print "x => $x\n";
$y = 16;
print "y => $y\n";
}
print "x => $x\n";
print "y => $y\n";
and the output is
x => 14
y => 16
x => 12
y => 16
Once again, thanks Eric!
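What seems to be going on: the declaration part of my $x = 12 takes effect at compile time, but the "= 12" is an ordinary run-time assignment, so it runs after the BEGIN block and overwrites the 14. The sketch below records both phases in a log to show the order; the @log array is added purely for illustration.

```perl
use strict;
use warnings;

our @log;            # package variable used only to record the order of events

my $x = 12;          # declared at compile time; the "= 12" runs at run time
my $y;               # declared only; nothing is assigned at run time

BEGIN {
    # Runs as soon as this block has been compiled, before any run-time code.
    $x = 14;
    $y = 16;
    push @log, "compile time: x=$x y=$y";
}

push @log, "run time: x=$x y=$y";
print "$_\n" for @log;
```

The first logged line shows x=14 y=16 and the second shows x=12 y=16, matching the output above.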
While BEGIN and END blocks can be used as you describe, the typical usage is to make changes that affect the subsequent compilation.
For example, the use Module qw/a b c/; statement actually means:
BEGIN {
require Module;
Module->import(qw/a b c/);
}
similarly, the subroutine declaration sub name {...} is actually:
BEGIN {
*name = sub {...};
}
Since these blocks are run at compile time, all lines that are compiled after a block has run will use the new definitions that the BEGIN blocks made. This is how you can call subroutines without parentheses, or how various modules "change the way the world works".
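As a small illustration of the point about parentheses, here is the expansion written out by hand for the core List::Util module (chosen only as an example). Because the import has already happened by the time the later lines are compiled, max can be used as a list operator.

```perl
use strict;
use warnings;

# Roughly what "use List::Util qw/sum max/;" does: load and import while compiling.
BEGIN {
    require List::Util;
    List::Util->import(qw/sum max/);
}

my $total   = sum(1, 2, 3);
my $largest = max 4, 9, 2;     # no parentheses needed: max is already known here

print "$total $largest\n";     # 6 9
```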
END blocks can be used to clean up changes that the BEGIN blocks have made but it is more common to use objects with a DESTROY method.
If the state that you are trying to clean up is a DBI connection, doing that in an END block is fine. I would not create the connection in a BEGIN block though for several reasons. Usually there is no need for the connection to be available at compile time. Performing actions like connecting to a database at compile time will drastically slow down any editor you use that has syntax checking (because that runs perl -c).
Have you tried swapping out the BEGIN{} block for an INIT{} block? That's the standard approach for things like mod_perl which use the "compile-once, run-many" model, as you need to initialize things anew on each separate run, not just once during the compile.
But I have to ask why it's all in special block anyway. Why don't you just make some sort of prepare_db_connection() function, and then call it as you need to when the program starts up?
Something that won't work in a BEGIN{} will also have the same problem if it's main-line code in a module file that gets used. That's another possible reason to use an INIT{} block.
I've also seen deadly-embrace problems of mutual recursion that have to be unravelled using something like a require instead of use, or an INIT{} instead of a BEGIN{}. But that's pretty rare.
Consider this program:
% cat sto-INIT-eg
#!/usr/bin/perl -l
print " PRINT: main running";
die " DIE: main dying\n";
die "DIE XXX /* NOTREACHED */";
END { print "1st END: done running" }
CHECK { print "1st CHECK: done compiling" }
INIT { print "1st INIT: started running" }
END { print "2nd END: done running" }
BEGIN { print "1st BEGIN: still compiling" }
INIT { print "2nd INIT: started running" }
BEGIN { print "2nd BEGIN: still compiling" }
CHECK { print "2nd CHECK: done compiling" }
END { print "3rd END: done running" }
When compiled only, it produces:
% perl -c sto-INIT-eg
1st BEGIN: still compiling
2nd BEGIN: still compiling
2nd CHECK: done compiling
1st CHECK: done compiling
sto-INIT-eg syntax OK
While when compiled and executed, it produces this:
% perl sto-INIT-eg
1st BEGIN: still compiling
2nd BEGIN: still compiling
2nd CHECK: done compiling
1st CHECK: done compiling
1st INIT: started running
2nd INIT: started running
PRINT: main running
DIE: main dying
3rd END: done running
2nd END: done running
1st END: done running
And the shell reports an exit of 255, per the die.
You should be able to arrange to have the connection happen when you need it to, even if a BEGIN{} proves too early.
Hm, just remembered. There's no chance you're doing something with DATA in a BEGIN{}, is there? That's not set up till the interpreter runs; it's not open to the compiler.
While the other answers are true, I find it also worth mentioning the use of BEGIN and END blocks when using the -n or -p switches to Perl.
From http://perldoc.perl.org/perlmod.html
When you use the -n and -p switches to Perl, BEGIN and END work just as they do in awk, as a degenerate case.
For those unfamiliar with the -n switch, it tells Perl to wrap the program with:
while (<>) {
... # your program goes here
}
See http://perldoc.perl.org/perlrun.html#Command-Switches if you're interested in more specific information about Perl switches.
As an example to demonstrate the use of BEGIN with the -n switch, this Perl one-liner enumerates the lines of the ls command:
ls | perl -ne 'BEGIN{$i = 1} print "$i: $_"; $i += 1;'
In this case, the BEGIN-block is used to initialize the variable $i by setting it to 1 before processing the lines of ls. This example will output something like:
1: foo.txt
2: bar.txt
3: program.pl
4: config.xml
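To make the division of labour explicit, here is roughly the same logic written as an ordinary script, under the assumption that the input comes from a filehandle instead of a pipe. The number_lines name and the in-memory sample listing are made up for illustration; the initialization plays the role of the BEGIN block and the while loop is what -n supplies.

```perl
use strict;
use warnings;

sub number_lines {
    my ($fh) = @_;
    my $i = 1;                     # the job of BEGIN{$i = 1}: runs once, before the loop
    my @out;
    while (my $line = <$fh>) {     # the loop that -n wraps around the program
        push @out, "$i: $line";
        $i += 1;
    }
    return @out;
}

# Sample input held in memory so the sketch runs without ls.
my $listing = "foo.txt\nbar.txt\n";
open my $fh, '<', \$listing or die "cannot open in-memory file: $!";
print number_lines($fh);
close $fh;
```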