What is the role of the BEGIN block in Perl? - perl

I know that the BEGIN block is compiled and executed before the main body of a Perl program. If you're not sure of that just try running the command perl -cw over this:
#!/ms/dist/perl5/bin/perl5.8
use strict;
use warnings;
BEGIN {
print "Hello from the BEGIN block\n";
}
END {
print "Hello from the END block\n";
}
I have been taught that early compilation and execution of a BEGIN block lets a programmer ensure that any needed resources are available before the main program is executed.
And so I have been using BEGIN blocks to make sure that things like DB connections have been established and are available for use by the main program. Similarly, I use END blocks to ensure that all resources are closed, deleted, terminated, etc. before the program terminates.
After a discussion this morning, I am wondering if this is the wrong way to look at BEGIN and END blocks.
What is the intended role of a BEGIN block in Perl?
Update 1: Just found out why the DBI connect didn't work. After being given this little Perl program:
use strict;
use warnings;
my $x = 12;
BEGIN {
$x = 14;
}
print "$x\n";
when executed it prints 12.
Update 2: Thanks to Eric Strom's comment below this new version makes it clearer:
use strict;
use warnings;
my $x = 12;
my $y;
BEGIN {
$x = 14;
print "x => $x\n";
$y = 16;
print "y => $y\n";
}
print "x => $x\n";
print "y => $y\n";
and the output is
x => 14
y => 16
x => 12
y => 16
Once again, thanks Eric!

While BEGIN and END blocks can be used as you describe, the typical usage is to make changes that affect the subsequent compilation.
For example, the use Module qw/a b c/; statement actually means:
BEGIN {
require Module;
Module->import(qw/a b c/);
}
Similarly, the subroutine declaration sub name {...} is effectively:
BEGIN {
*name = sub {...};
}
Since these blocks are run at compile time, all lines that are compiled after a block has run will use the new definitions that the BEGIN blocks made. This is how you can call subroutines without parentheses, or how various modules "change the way the world works".
END blocks can be used to clean up changes that the BEGIN blocks have made but it is more common to use objects with a DESTROY method.
If the state that you are trying to clean up is a DBI connection, doing that in an END block is fine. I would not create the connection in a BEGIN block though for several reasons. Usually there is no need for the connection to be available at compile time. Performing actions like connecting to a database at compile time will drastically slow down any editor you use that has syntax checking (because that runs perl -c).
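As a minimal sketch of the compile-time effect described above (List::Util and its sum function are just a convenient core module for illustration, not something from the question), the BEGIN block below does by hand what use List::Util qw(sum); would do. Because the import happens while the file is still being compiled, the later call can be written without parentheses even under strict:

```perl
use strict;
use warnings;

# Hand-written equivalent of: use List::Util qw(sum);
BEGIN {
    require List::Util;
    List::Util->import('sum');
}

# sum is already known to the compiler here, so this parses as a call
# to sum with the argument list (1, 2, 3).
my $total = sum 1, 2, 3;
print "$total\n";
```

If the BEGIN wrapper is removed, the require and import only happen at run time, after the sum 1, 2, 3 line has already failed to compile.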

Have you tried swapping out the BEGIN{} block for an INIT{} block? That's the standard approach for things like mod_perl which use the "compile-once, run-many" model, as you need to initialize things anew on each separate run, not just once during the compile.
But I have to ask why it's all in a special block anyway. Why don't you just make some sort of prepare_db_connection() function, and then call it as you need to when the program starts up?
Something that won't work in a BEGIN{} will also have the same problem if it's main-line code in a module file that gets used. That's another possible reason to use an INIT{} block.
I've also seen deadly-embrace problems of mutual recursion that have to be unravelled using something like a require instead of a use, or an INIT{} instead of a BEGIN{}. But that's pretty rare.
Consider this program:
% cat sto-INIT-eg
#!/usr/bin/perl -l
print " PRINT: main running";
die " DIE: main dying\n";
die "DIE XXX /* NOTREACHED */";
END { print "1st END: done running" }
CHECK { print "1st CHECK: done compiling" }
INIT { print "1st INIT: started running" }
END { print "2nd END: done running" }
BEGIN { print "1st BEGIN: still compiling" }
INIT { print "2nd INIT: started running" }
BEGIN { print "2nd BEGIN: still compiling" }
CHECK { print "2nd CHECK: done compiling" }
END { print "3rd END: done running" }
When compiled only, it produces:
% perl -c sto-INIT-eg
1st BEGIN: still compiling
2nd BEGIN: still compiling
2nd CHECK: done compiling
1st CHECK: done compiling
sto-INIT-eg syntax OK
While when compiled and executed, it produces this:
% perl sto-INIT-eg
1st BEGIN: still compiling
2nd BEGIN: still compiling
2nd CHECK: done compiling
1st CHECK: done compiling
1st INIT: started running
2nd INIT: started running
PRINT: main running
DIE: main dying
3rd END: done running
2nd END: done running
1st END: done running
And the shell reports an exit of 255, per the die.
You should be able to arrange to have the connection happen when you need it to, even if a BEGIN{} proves too early.
Hm, just remembered. There's no chance you're doing something with DATA in a BEGIN{}, is there? That's not set up till the interpreter runs; it's not open to the compiler.
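The prepare_db_connection() idea mentioned earlier could look roughly like the sketch below. It is only an illustration: the hash reference stands in for a real database handle (a real program would call something like DBI->connect at the marked spot), and the field names are made up. The point is that nothing connects under perl -c, the connection is made the first time it is needed, and an END block still handles cleanup.

```perl
use strict;
use warnings;

my $dbh;    # stand-in for a real database handle

sub prepare_db_connection {
    # Connect only once; later calls reuse the same handle.
    # Assumption: a real program would call DBI->connect(...) here.
    $dbh //= { connected => 1, opened_at => time };
    return $dbh;
}

END {
    # Runs when the program exits, even after a die in the main code.
    $dbh->{connected} = 0 if $dbh;
}

# Main program: the connection is made at run time, when first needed.
my $handle = prepare_db_connection();
print "connected\n" if $handle->{connected};
```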

While the other answers are true, it is also worth mentioning the use of BEGIN and END blocks with Perl's -n or -p switches.
From http://perldoc.perl.org/perlmod.html
When you use the -n and -p switches to Perl, BEGIN and END work just as they do in awk, as a degenerate case.
For those unfamiliar with the -n switch, it tells Perl to wrap the program with:
while (<>) {
... # your program goes here
}
See http://perldoc.perl.org/perlrun.html#Command-Switches if you want more specific information about Perl switches.
As an example to demonstrate the use of BEGIN with the -n switch, this Perl one-liner enumerates the lines of the ls command:
ls | perl -ne 'BEGIN{$i = 1} print "$i: $_"; $i += 1;'
In this case, the BEGIN block is used to initialize the variable $i by setting it to 1 before processing the lines of ls. This example will output something like:
1: foo.txt
2: bar.txt
3: program.pl
4: config.xml
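To see why the BEGIN block is needed, it can help to write the loop out by hand. The sketch below is roughly what the one-liner becomes once -n has wrapped it, except that it reads from an in-memory string (a made-up stand-in for the output of ls) instead of standard input, and collects the lines in an array so they can be inspected:

```perl
use strict;
use warnings;

our ($i, @numbered);

# Hypothetical stand-in for the output of `ls`.
my $listing = "foo.txt\nbar.txt\nprogram.pl\n";
open my $in, '<', \$listing or die "Cannot open in-memory file: $!";

while (<$in>) {            # the loop that -n would supply
    BEGIN { $i = 1 }       # written inside the loop, but run once at compile time
    push @numbered, "$i: $_";
    $i += 1;
}
close $in;

print @numbered;
```

Without the BEGIN wrapper, $i = 1 would be executed on every iteration and every line would be numbered 1.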

Related

Perl else error

Hello, I am new to programming in Perl. I am trying to make a number adder (math), but it gives me one error. Here's my code:
sub main {
print("First: ");
$num1 = <STDIN>;
print("Second: ");
$num2 = <STDIN>;
$answer = $num1 + $num2;
print("$answer")
} else {
print("You have entered invalid arguments.")
}
main;
Now obviously it's not done, but I get an error ONLY on the else. Here is the error:
C:\Users\aries\Desktop>"Second perl.pl"
syntax error at C:\Users\aries\Desktop\Second perl.pl line 9, near "} else"
Execution of C:\Users\aries\Desktop\Second perl.pl aborted due to compilation errors.
please help (also I tried googling stuff still error)
Since you're new to Perl, I recommend you add strict and warnings at the top of your scripts. This will help identify common problems and potentially dangerous code.
The main problem with your code is that you've appended the else statement to your subroutine. Here is an example of your code as I think you intended it to be:
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
sub main {
print 'First: ';
my $num1 = <STDIN>;
print 'Second: ';
my $num2 = <STDIN>;
if( looks_like_number($num1) && looks_like_number($num2) ) {
my $answer = $num1 + $num2;
print "$answer\n";
}
else {
die 'You have entered invalid arguments.';
}
}
main();
There are a few things I should note here about the differences between Perl and Python that I think will help you understand Perl better.
Unlike Python, Perl doesn't care about white space. It uses curly braces and semi-colons to indicate the end of blocks and statements and will happily ignore any white space that isn't in quotes.
The else statement appended to the subroutine won't work because Perl isn't designed to evaluate code blocks that way. A subroutine is simply a named code block that can be called at any other point in the script. Any error handling will need to be done inside of the subroutine rather than to it.
Perl is very context-sensitive and doesn't make a solid distinction between strings and integers in variables. If you write $var_a = 1, Perl will read it as an integer. $var_b = '1', it will read it as a string. But, you can still add them together: $var_c = ($var_a + $var_b), and Perl will make $var_c = 2.
This is another reason the else statement would not work as you've written it. Python raises an error if you try to add a string to a number, but Perl will just convert the operands and give you the result. If you try to add a letter and a number, Perl still won't fail, but it will warn you if you put "use warnings;" at the top of your script.
In the example, and as Dada mentioned, you can use the looks_like_number() function from the Scalar::Util module to evaluate the variables as you had intended.
Apart from the syntax, if/else statements work the same way in Perl as in Python. The else-if is slightly different as it has an extra s:
if (condition) {
...
}
elsif (other condition) {
...
}
else {
...
}
In Perl, it's good practice to give variables lexical scope by declaring them with the my keyword. Since Perl is so context-sensitive, this can help prevent unexpected behavior when moving between different scopes.
Single- and double-quotes have different uses in Perl. Single-quotes are read literally and double-quotes are interpolated, so if you want to combine variables and text together, you can skip concatenating the strings and just do: print "Got variable: $var\n";
Lastly, note that I added parentheses after the main subroutine call. This is another best practice to make it clearer that you are calling a subroutine as opposed to it being a bare word or a bad variable name.
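Here is a small sketch of the points above about strings, numbers, and looks_like_number(). The input values are made up; they imitate what <STDIN> returns, including the trailing newline, which chomp removes before the check:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my $var_a = 1;      # a number
my $var_b = '1';    # a string that looks like a number
my $var_c = $var_a + $var_b;
print "$var_c\n";

# Hypothetical raw inputs, as <STDIN> would return them.
my %verdict;
for my $raw ("12\n", "3.5\n", "abc\n", "\n") {
    chomp(my $value = $raw);    # strip the trailing newline
    $verdict{$value} = looks_like_number($value) ? 'number' : 'not a number';
    print "'$value' is $verdict{$value}\n";
}
```

Checking before adding is what lets the else branch do something useful; without the check, Perl would treat 'abc' as 0 and only emit a warning.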

How to convince Devel::Trace to print the BEGIN-block statements?

I have a simple script p.pl:
use strict;
use warnings;
our $x;
BEGIN {
$x = 42;
}
print "$x\n";
When I run it as:
perl -d:Trace p.pl
it prints:
>> p.pl:3: our $x;
>> p.pl:7: print "$x\n";
42
How can I get the BEGIN block statements printed too, e.g. the $x = 42;?
Because my intention wasn't clear, here is a clarification:
I am looking for ANY way to print statements as the Perl script runs (like Devel::Trace does), but including the statements in the BEGIN block.
It's very possible. Set $DB::single in an early BEGIN block.
use strict;
use warnings;
our $x;
BEGIN { $DB::single = 1 }
BEGIN {
$x = 42;
}
print "$x\n";
$DB::single is a debugger variable used to determine whether the DB::DB function will be invoked at each line. During the compilation phase it is usually false, but you can set it at compile time in a BEGIN block.
This trick is also helpful to set a breakpoint inside a BEGIN block when you want to debug compile-time code in the standard debugger.
Disclaimer: This is just an attempt to explain the behaviour.
Devel::Trace hooks up to the Perl debugging API through the DB model. That is just code. It installs a sub DB::DB.
The big question is, when is that executed. According to perlmod, there are five block types that are executed at specific points during execution. One of them is BEGIN, which is the first.
Consider this program.
use strict;
use warnings;
our ($x, $y);
BEGIN { $x = '42' }
UNITCHECK { 'unitcheck' }
CHECK { 'check' }
INIT { 'init' }
END { 'end' }
print "$x\n";
This will output the following:
>> trace.pl:8: INIT { 'init' }
>> trace.pl:3: our ($x, $y);
>> trace.pl:11: print "$x\n";
42
>> trace.pl:9: END { 'end' }
So Devel::Trace sees the INIT block and the END block. But why the INIT block?
Above mentioned perlmod says:
INIT blocks are run just before the Perl runtime begins execution, in "first in, first out" (FIFO) order.
Apparently at that phase, the DB::DB has already been installed. I could not find any documentation that says when a sub definition is run exactly. However, it seems it's after BEGIN and before INIT. Hence, it does not see whatever goes on in the BEGIN.
Adding a BEGIN { $Devel::Trace::TRACE = 1 } to the beginning of the file also does not help.
I rummaged around in the documentation for perldebug and the like, but could not find an explanation of this behaviour. My guess is that the debugger interface doesn't know about BEGIN at all. They are executed very early after all (for example, perl -c -E 'BEGIN{ say "foo" } say "bar"' will print foo).
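One way to see how early BEGIN runs, without involving the debugger at all, is to record the interpreter phase from each kind of block. This is only a sketch, and it assumes Perl 5.14 or later, where the ${^GLOBAL_PHASE} variable is available:

```perl
use strict;
use warnings;

our @phases;

BEGIN { push @phases, "BEGIN block ran in phase ${^GLOBAL_PHASE}" }
CHECK { push @phases, "CHECK block ran in phase ${^GLOBAL_PHASE}" }
INIT  { push @phases, "INIT block ran in phase ${^GLOBAL_PHASE}" }

push @phases, "main code ran in phase ${^GLOBAL_PHASE}";

print "$_\n" for @phases;
```

When run as a script, the BEGIN entry reports the START phase, i.e. the file is still being compiled, while the main code reports RUN.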

Perl Print to File error [closed]

I am rather new to Perl and am trying to combine several .pm files into a single script. Most of the modules copy over just fine, but some have an error where the end of file is reached, but the script keeps printing. Here is an example of the code:
$copy_line = 0;
sysopen(FILE, $file_path, O_WRONLY | O_CREAT, 0711);
sysopen(MODULE, $module_path, O_RDONLY | O_EXCL);
while(<MODULE>)
{
my $line = $_;
if(($line ne "# START\n") and ($copy_line eq 0))
{
}
else
{
print FILE "$line";
$copy_line = 1;
}
}
close FILE;
close MODULE;
Each module has start and end tags so that I do not copy any use statements, and so I know when to stop copying. An example of the module is
#!/usr/bin/perl
# START
some code to copy over
some more code to copy
even more code to copy
# END
What happens in some files is I see the end tag, followed by repeated code from the module. The output looks something like
# START
some code to copy over
some more code to copy
even more code to copy
# END
code to copy
even more code to copy
# END
What might be causing this?
Thanks,
-rusty
There are various things wrong with your script:
You didn't show the whole script; constants like O_WRONLY don't exist by default.
Therefore it may be that you didn't use strict; use warnings; at the beginning of your script. This is necessary to get warned about errors or possible mistakes.
The strict mode requires you to declare all your variables. You can do so with the my keyword, e.g. my $copy_line = 0.
Never use sysopen, except when you fully understand how open works and why it wouldn't be the best choice for a given situation. Considering that I don't have that level of knowledge, I think we'll stick to the normal open.
The open takes a variable, a mode, and a filename, like
open my $file, "<", $filename;
I encourage you to use autodie for automatic error handling, otherwise you should do
open my $file, "<", $filename or die "Can't open $filename: $!";
where $! contains the reason for the error. You can specify various modes for open, which are modelled after shell redirection operators. Important are: < read, > write (or create), >> append, |- write pipe to command, -| read pipe from command.
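A short sketch of the three most common modes, using autodie so that a failed open or close dies with a useful message. The temporary file from File::Temp is only there to make the example self-contained; your script would use its own paths:

```perl
use strict;
use warnings;
use autodie;                      # open/close die automatically on failure
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the program exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open my $out, '>', $path;         # write: truncates or creates the file
print {$out} "first\n";
close $out;

open $out, '>>', $path;           # append: keeps the existing contents
print {$out} "second\n";
close $out;

open my $in, '<', $path;          # read
my @lines = <$in>;
close $in;

print @lines;
```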
The eq operator tests for string equality. If you want to test for numeric equality, use the == operator.
if (COND) {} else { STUFF } could rather be written unless (COND) {STUFF}.
You have successfully implemented some twisted logic that starts copying at the START marker. However, you don't stop at the END. For stuff like this, the flip-flop-operator .. can be used: It takes two operands, which are arbitrary expressions. The operator returns false until the first operand is true, and remains true until after the second operand returned true. If one operand is a constant integer, it is interpreted as a line number. Thus, the script
while (<>) {
print if 5 .. 10;
}
prints lines number 5–10 inclusive of the input.
For your problem, you should probably use regexes that match the start and the end marker:
while (<>) {
print if /^ \s* \# \s* START/x .. /^ \s* \# \s* END/x; # under /x an unescaped # would start a comment
}
I'll assume here that you know regexes, but I can add explanations if needed.
If the readline operator <> is used without an operand, it takes the command line arguments of the script, opens them, and reads them in sequence. If no arguments were provided, it uses STDIN.
This allows for flexible little scripts. The code can be summarized in the command-line oneliner
$ perl -ne'print if /^\s*#\s*START/../^\s*#\s*END/' INPUT-FILE1 INPUT-FILE2 >OUTPUT
There are two issues with this:
It prints out the start/end markers as well
If a file doesn't contain an # END, the next files will be printed out in full until the next # END is found.
We can mitigate issue #2 by testing for the end of file in the termination condition:
print if /^\s*#\s*START/ .. (/^\s*#\s*END/ or eof);
Issue #1 is slightly more complex; I'd reintroduce a flag for that:
my $print_this = 0;
while (<>) {
if (/^\s*#\s*END/ or eof) {
$print_this = 0;
} elsif ($print_this) {
print;
} elsif (/^\s*#\s*START/) {
$print_this = 1;
}
}
Partial test case:
$ perl -e'
my $print_this = 0;
while (<>) {
if (/^\s*#\s*END/ or eof) { $print_this = 0 }
elsif ($print_this) { print }
elsif (/^\s*#\s*START/) { $print_this = 1 }
}' <<'__END__'
no a 1
no a 2
# START
yes b 1
yes b 2
yes b 3
#END
no c 1
no c 2
# START
yes d 1
# END
no e 1
__END__
Output:
yes b 1
yes b 2
yes b 3
yes d 1
If you're copying files without modifying their contents, you should look into File::Copy http://perldoc.perl.org/File/Copy.html
File::Copy is a standard module and is installed along with Perl. For a list of standard modules, see perldoc perlmodlib http://perldoc.perl.org/perlmodlib.html#Standard-Modules

Perl, fork, semaphores, processes

I need to create a program that would run 3 processes at the same time in random sequence from a list and lock those processes with semaphore one by one so to avoid duplicates.
For example, you have a list of 3 programs:
@array = (1, 2, 3);
perl script.pl runs 2 at first;
By random tries to run 2 again and receives an error (because 2 is now locked with semaphore).
Runs 1.
Runs 3.
script.pl waits all of 1,2,3 to end work and then exit itself.
Here's my code so far:
#!/usr/bin/perl -w
use IPC::SysV qw(IPC_PRIVATE S_IRUSR S_IWUSR IPC_CREAT);
use IPC::Semaphore;
use Carp ();
print "Program started\n";
sub sem {
#semaphore lock code here
}
sub chooseProgram{
#initialise;
my $program1 = "./program1.pl";
my $program2 = "./program2.pl";
my $program3 = "./program3.pl";
my $ls = "ls";
my @programs = ( $ls, $program1, $program2, $program3 );
my $random = $programs[int rand($#programs+1)];
print $random."\n";
return $random;
}
#parent should fork child;
#child should run random processes;
#avoid process clones with semaphore;
sub main{
my $pid = fork();
if ($pid){
#parent here
}
elsif (defined($pid)){
#child here
print "$$ Child started:\n";
#simple cycle to launch and lock programs
for (my $i = 0; $i<10; $i++){
# semLock(system(chooseProgram()); #run in new terminal window
# so launched programs are locked and cannot be launched again
}
}
else {
die("Cannot fork: $!\n");
}
waitpid($pid, 0);
my $status = $?;
#print $status."\n";
}
main();
exit 0;
Problems:
1. Need to lock file. (I don't know how to work with semaphores. Some attempts to lock files failed, so I excluded that code.)
2. Child waits until the first program ends before the second starts. How can I start three programs at the same time with one child? (Is it possible, or should I create one child per program?)
3. Programs are non-GUI and should run in a terminal. How do I run a program in a new terminal window (tab)?
4. There is no correct check yet of whether all programs in @programs have been launched. -- less important.
Your randomness requirement is very strange, but if I understood your requirements correctly, you don't need any sort of locking to do what you want. (So 1) in your question is gone)
Start by shuffling the program array, then start each command of that shuffled array (this deals with your 4)). Then only waitpid after you've started everything (which deals with your 2)).
The code below does that, starting various sleep instances in new terminals (I use urxvt, adapt depending on what terminal you want to spawn - this deals with your 3)).
#! /usr/bin/perl -w
use strict;
use warnings;
my @progs = ("urxvt -e sleep 5", "urxvt -e sleep 2", "urxvt -e sleep 1");
my @sgrop;
my @pids;
# Shuffle the programs
while (my $cnt = scalar(@progs)) {
push @sgrop, splice @progs, int(rand($cnt)), 1;
}
# Start the progs
foreach my $prog (@sgrop) {
my $pid = fork();
die "Cannot fork: $!\n" unless defined $pid;
if (!$pid) {
# Child: exec only returns if it fails
exec($prog) or die "Cannot exec '$prog': $!\n";
} else {
print "Started '$prog' with pid $pid\n";
push @pids, $pid;
}
}
# Wait for them
foreach my $pid (@pids) {
waitpid($pid, 0);
print "$pid done!\n";
}
Not sure the shuffling is the best out there, but it works. The idea behind it is just to pick one element at random from the initial (sorted) list, remove it from there and add it to the shuffled one. Repeat until the initial list is empty.
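If you'd rather not write the shuffle by hand, the core List::Util module has a shuffle function that does the same job. A minimal sketch, where the command strings are just placeholders:

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Placeholder commands; substitute the programs you actually want to run.
my @progs = ("sleep 5", "sleep 2", "sleep 1");

# shuffle returns the elements in random order and leaves @progs untouched.
my @shuffled = shuffle @progs;

print "$_\n" for @shuffled;
```

Unlike the splice loop, this does not empty the original array, so @progs is still available afterwards.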
If you're trying to lock the programs system wide (i.e. no other process in your system should be able to start them), then I'm sorry but that's not possible unless the programs protect themselves from concurrent execution.
If your question was about semaphores, then I'm sorry I missed your point. The IPC documentation has sample code for that. I don't really think it's necessary to go to that complexity for what you're trying to do though.
Here's how you could go about it using the IPC::Semaphore module for convenience.
At the start of your main, create a semaphore set with as many semaphores as required:
use IPC::SysV qw(S_IRUSR S_IWUSR IPC_CREAT IPC_NOWAIT);
use IPC::Semaphore;
my $numprocs = scalar(@progs);
my $sem = IPC::Semaphore->new(1234, # this random number is the semaphore key. Use something else
$numprocs, # number of semaphores you want under that key
S_IRUSR | S_IWUSR | IPC_CREAT);
Check for errors, then initialize all the semaphores to 1.
$sem->setall( (1) x $numprocs) || die "can't set sems $!";
In the code that starts your processes, before you start (after the fork though), try to grab the semaphore:
if ($sem->op($proc_number, -1, IPC_NOWAIT)) {
# here, you got the semaphore - so nothing else is running this program
# run the code
# and once the code is done:
$sem->op($proc_number, 1, 0); # release the semaphore
exit(0);
} else {
# someone else is running this program already
exit(1); # or something
}
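In the above, $proc_number must be unique for each program (it could be its index in your programs array, for instance). Don't use exec to start the program, because exec replaces the child process and the code that releases the semaphore would never run. Use system instead, for example.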
Note that you will have to deal with the exit code of the child process in this case. If the exit code is zero, you can mark that program as having run. If not, you need to retry. (This is going to get messy, you'll need to track which program was run or not. I'd suggest a hash with the program number ($proc_number) where you'd store whether it already completed or not, and the current pid running (or trying to run) that code. You can use that hash to figure out what program still needs to be executed.)
Finally after all is done and you've waited for all the children, you should clean up after yourself:
$sem->remove;
This code lacks proper error checking and will work strangely (i.e. not well at all) if the cleanup was not done correctly (i.e. semaphores are already lying around when the code starts). But it should get you started.
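The bookkeeping suggested above (a hash that maps each child to its program number and records whether it completed) might be sketched like this. To keep the example self-contained, the "programs" are hypothetical code references that simply return the exit code a real program would produce, and the semaphore handling is left out:

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the programs: each returns an exit code
# (0 = success, anything else = could not run, try again later).
my @jobs = (sub { 0 }, sub { 3 }, sub { 0 });

my %job_for_pid;    # pid => program number
my %result;         # program number => 'done' or 'retry'

for my $n (0 .. $#jobs) {
    my $pid = fork();
    die "Cannot fork: $!\n" unless defined $pid;
    if ($pid == 0) {
        exit( $jobs[$n]->() );    # child: run the job, report via exit code
    }
    $job_for_pid{$pid} = $n;      # parent: remember which job this pid runs
}

# Reap every child and record the outcome from its exit status.
while ((my $pid = wait()) != -1) {
    my $n = delete $job_for_pid{$pid};
    next unless defined $n;
    $result{$n} = ($? >> 8) == 0 ? 'done' : 'retry';
}

print "job $_: $result{$_}\n" for sort { $a <=> $b } keys %result;
```

Any job marked retry would be started again in a later round, until every entry in the hash says done.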

How do I rerun a subroutine without restarting the script in Perl's debugger?

Suppose I have a situation where I'm trying to experiment with some Perl code.
perl -d foo.pl
Foo.pl chugs its merry way around (it's a big script), and I decide I want to rerun a particular subroutine and single step through it, but without restarting the process. How would I do that?
The debugger command b method sets a breakpoint at the beginning of your subroutine.
DB<1> b foo
DB<2> &foo(12)
main::foo(foo.pl:2): my ($x) = @_;
DB<<3>> s
main::foo(foo.pl:3): $x += 3;
DB<<3>> s
main::foo(foo.pl:4): print "x = $x\n";
DB<<3>> _
Sometimes you may have to qualify the subroutine names with a package name.
DB<1> use MyModule
DB<2> b MyModule::MySubroutine
just do: func_name(args)
e.g.
sub foo {
my $arg = shift;
print "hello $arg\n";
}
In perl -d:
DB<1> foo('tom')
hello tom
Responding to the edit regarding wanting to re-step through a subroutine.
This is not entirely the most elegant way of doing this, but I don't have another method off the top of my head and am interested in other people's answers to this question :
my $stop_foo = 0;
while(not $stop_foo) {
foo();
}
sub foo {
my $a = 1 + 1;
}
The debugger will continually execute foo, but you can stop the next loop by executing '$stop_foo++' in the debugger.
Again, I don't really feel like that's the best way but it does get the job done with only minor additions to the debugged code.