Why does Perl run END and CHECK blocks in LIFO order? - perl

I have no deep or interesting question--I'm just curious why it is so.

Each package is assumed to rely on the correct function of EVERYTHING that went before it. END blocks are intended to "clean up and close out" anything the package might need to take care of before the program finishes. But this work might rely on the correct functioning of the packages started earlier, which might no longer be true if they are allowed to run their END blocks.
If you did it any other way, there could be bad bugs.

Here is a simple example which may help:
# perl
BEGIN { print "(" }
END { print ")" }
BEGIN { print "[" }
END { print "]" }
This outputs: ([])
If END had been a FIFO then BEGIN/END wouldn't work well together.
Update - excerpt from Programming Perl 3rd edition, Chapter 18: Compiling - Avant-Garde Compiler, Retro Interpreter, page 483:
If you have several END blocks within a file, they execute in reverse order of their definition. That is, the last END block defined is the first one executed when your program finishes. This reversal enables related BEGIN and END blocks to nest the way you'd expect, if you pair them up
/I3az/

Perl borrows heavily from C, and END follows the lead of C's atexit:
NAME
atexit - register a function to run at process termination
SYNOPSIS
#include <stdlib.h>
int atexit(void (*func)(void));
DESCRIPTION
The atexit() function shall register the function pointed to by func, to be called without arguments at normal program termination. At normal program termination, all functions registered by the atexit() function shall be called, in the reverse order of their registration …

Related

subroutine perl with inline

I have a Perl function, which does not return any value. It does not take any arguments also.
sub test {
#do my logic
}
Can I do as :
sub test() {
#do my logic
}
Will the subroutine test be inlined? Will this work? (meaning will the function call be replaced with the function definition. And will my program execute faster?)
The function test() is being called 5000 times. And my Perl program takes a longer time to execute than expected. So I want to make my program faster.
Thanks in advance!
This is answered in Constant Functions in perlsub
Functions with a prototype of () are potential candidates for inlining. If the result after optimization and constant folding is either a constant or a lexically-scoped scalar which has no other references, then it will be used in place of function calls made without &. Calls made using & are never inlined. (See constant.pm for an easy way to declare most constants.)
So your sub test() should be inlined if it satisfies the above conditions. Nobody can tell without seeing the function, so either show it or try.
This is most easily checked with B::Deparse, see further down the linked perlsub section.
I would urge you to profile the program to ensure that the function call overhead is the problem.

BEGIN,CHECK,INIT& END blocks in Perl

As I understand these special functions inside Perl code, BEGIN and CHECK blocks run during the compilation phase while INIT and END blocks run during actual execution phase.
I can understand using these blocks inside actual Perl code (Perl libraries) but what about using them inside modules? Is that possible?
Since when we use use <Module-name> the module is compiled, so in effect BEGIN and CHECK blocks run. But how will the INIT and END blocks run since module code I don't think is run in the true sense. We only use certain functions from inside the modules.
Short The special code blocks in packages loaded via use are processed and run (or scheduled to run) as encoutered, in the same way and order as in main::, since use itself is a BEGIN block.
Excellent documentation on this can be found in perlmod. From this section
A BEGIN code block is executed as soon as possible, that is, the moment it is completely defined, even before the rest of the containing file (or string) is parsed.
Since the use statements are BEGIN blocks they run as soon as encountered. From use
It is exactly equivalent to
BEGIN { require Module; Module->import( LIST ); }
So the BEGIN blocks in a package run in-line with others, as they are encountered. The END blocks in a package are then also compiled in the same order, as well as the other special blocks. As for the order of (eventual) execution
An END code block is executed as late as possible ...
and
You may have multiple END blocks within a file--they will execute in reverse order of definition; that is: last in, first out (LIFO)
The order of compilation and execution of INIT and CHECK blocks follows suit.
Here is some code to demonstrate these special code blocks used in a package.
File PackageBlocks.pm
package PackageBlocks;
use warnings;
BEGIN { print "BEGIN block in the package\n" }
INIT { print "INIT block in the package\n" }
END { print "END block in the package\n" }
1;
The main script
use warnings;
BEGIN { print "BEGIN in main script.\n" }
print "Running in the main.\n";
INIT { print "INIT in main script.\n" }
use PackageBlocks;
END { print "END in main script.\n" }
BEGIN { print "BEGIN in main script, after package is loaded.\n" }
print "After use PackageBlocks.\n";
Output
BEGIN in main script.
BEGIN block in the package
BEGIN in main script, after package is loaded.
INIT in main script.
INIT block in the package
Running in the main.
After use PackageBlocks.
END in main script.
END block in the package
The BEGIN block in the package runs in order of appearance, in comparison with the ones in main::, and before INIT. The END block runs at end,
and the one in the package runs after the one in main::, since the use comes before it in this example.
This is very easy to test for yourself
use Module (and require EXPR and do EXPR and eval EXPR) compile the Perl code and then immediately run it
That is where the 1; at the end of most modules is picked up. If executing the module's code after compiling it doesn't return a true value then require will fail
Admittedly there usually isn't much use for an INIT or an END block, because the run-time phase is so intimately tied to the compilation, and because modules are generally about defining subroutines, but the option is there if you want it

Goto vs. Local function in Perl

In Perl, is it better to use goto or local function, and why with an example?
For example, I am using a code
sub data {
data;
}
data();
or
goto L:
L: if ( $i == 0 )
print "Hello!!!!";
Programmers don't die. They just GOSUB without RETURN.
That said, do not use goto in Perl.
If you want a statement to be executed once, just write that statement and don't use any flow control
If you want your program to execute the stuff several times, put it in a loop
If you want them to be executed from different places, put them in a sub
If you want the sub to be available in different programs, put them in a module
Don't use goto
There is one place where a goto makes sense in Perl. Matt Trout is talking about that in his blog post No, not that goto, the other goto.
Conventional wisdom is that other constructs are better than goto.
If you want to do something repeatedly, use a loop.
If you want to do something conditionally, use a block with if.
If you want to go somewhere and come back, use a function.
If you want to bail out early and "jump straight to the end", this can often be better written using an exception. (Perl handily manages

identify a procedure and replace it with a different procedure

What I want to achieve:
###############CODE########
old_procedure(arg1, arg2);
#############CODE_END######
I have a huge code which has a old procedure in it. I want that the call to that old_procedure go to a call to a new procedure (new_procedure(arg1, arg2)) with the same arguments.
Now I know, the question seems pretty stupid but the trick is I am not allowed to change the code or the bad_function. So the only thing I can do it create a procedure externally which reads the code flow or something and then whenever it finds the bad_function, it replaces it with the new_function. They have a void type, so don't have to worry about the return values.
I am usng perl. If someone knows how to atleast start in this direction...please comment or answer. It would be nice if the new code can be done in perl or C, but other known languages are good too. C++, java.
EDIT: The code is written in shell script and perl. I cannot edit the code and I don't have location of the old_function, I mean I can find it...but its really tough. So I can use the package thing pointed out but if there is a way around it...so that I could parse the thread with that function and replace function calls. Please don't remove tags as I need suggestions from java, C++ experts also.
EDIT: #mirod
So I tried it out and your answer made a new subroutine and now there is no way of accessing the old one. I had created an variable which checks the value to decide which way to go( old_sub or new_sub)...is there a way to add the variable in the new code...which sends the control back to old_function if it is not set...
like:
use BadPackage; # sub is defined there
BEGIN
{ package BapPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
# check for the variable and send to old_sub if the var is not set
sub bad_sub
{ # good code
}
}
# Thanks #mirod
This is easier to do in Perl than in a lot of other languages, but that doesn't mean it's easy, and I don't know if it's what you want to hear. Here's a proof-of-concept:
Let's take some broken code:
# file name: Some/Package.pm
package Some::Package;
use base 'Exporter';
our #EXPORT = qw(forty_two nineteen);
sub forty_two { 19 }
sub nineteen { 19 }
1;
# file name: main.pl
use Some::Package;
print "forty-two plus nineteen is ", forty_two() + nineteen();
Running the program perl main.pl produces the output:
forty-two plus nineteen is 38
It is given that the files Some/Package.pm and main.pl are broken and immutable. How can we fix their behavior?
One way we can insert arbitrary code to a perl command is with the -M command-line switch. Let's make a repair module:
# file: MyRepairs.pm
CHECK {
no warnings 'redefine';
*forty_two = *Some::Package::forty_two = sub { 42 };
};
1;
Now running the program perl -MMyRepairs main.pl produces:
forty-two plus nineteen is 61
Our repair module uses a CHECK block to execute code in between the compile-time and run-time phase. We want our code to be the last code run at compile-time so it will overwrite some functions that have already been loaded. The -M command-line switch will run our code first, so the CHECK block delays execution of our repairs until all the other compile time code is run. See perlmod for more details.
This solution is fragile. It can't do much about modules loaded at run-time (with require ... or eval "use ..." (these are common) or subroutines defined in other CHECK blocks (these are rare).
If we assume the shell script that runs main.pl is also immutable (i.e., we're not allowed to change perl main.pl to perl -MMyRepairs main.pl), then we move up one level and pass the -MMyRepairs in the PERL5OPT environment variable:
PERL5OPT="-I/path/to/MyRepairs -MMyRepairs" bash the_immutable_script_that_calls_main_pl.sh
These are called automated refactoring tools and are common for other languages. For Perl though you may well be in a really bad way because parsing Perl to find all the references is going to be virtually impossible.
Where is the old procedure defined?
If it is defined in a package, you can switch to the package, after it has been used, and redefine the sub:
use BadPackage; # sub is defined there
BEGIN
{ package BapPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
sub bad_sub
{ # good code
}
}
If the code is in the same package but in a different file (loaded through a require), you can do the same thing without having to switch package.
if all the code is in the same file, then change it.
sed -i 's/old_procedure/new_procedure/g codefile
Is this what you mean?

Why is 'last' called 'last' in Perl?

What is the historical reason to that last is called that in Perl rather than break as it is called in C?
The design of Perl was influenced by C (in addition to awk, sed and sh - see man page below), so there must have been some reasoning behind not going with the familiar C-style naming of break/last.
A bit of history from the Perl 1.000 (released 18 December, 1987) man page:
[Perl] combines (in the author's opinion, anyway) some of the best features of C, sed, awk, and sh, so people familiar with those languages should have little difficulty with it. (Language historians will also note some vestiges of csh, Pascal, and even BASIC|PLUS.)
The semantics of 'break' or 'last' are
defined by the language (in this case
Perl), not by you.
Why not think of 'last' as "this is
the last statement to run for the
loop".
It's always struck me as odd that the
'continue' statement in 'C' starts the
next pass of a loop. This is
definitely a strange use of the
concept of "continue". But it is the
semantics of 'C', so I accept it.
By trying to map particular
programming concepts into single
English words with existing meaning
there is always going to be some sort
of mismatching oddity
Source
Plus, Larry Wall is kinda weird. Have you seen his picture?
(source: wired.com)
I expect that this is because Perl was created by a linguist, not a computer scientist. In normal English usage, the concept of declaring that you have completed your final pass through a loop is more strongly connected to the word "last" ("this is the last pass") than to the word "break" ("break the loop"? "break out of the loop"? - it's not even clear how "break" is intended to relate to exiting the loop).
The term 'last' makes more sense when you remember that you can use it with more than just the immediate looping control. You can apply it to labeled blocks one or more levels above
the block it is in:
LINE: while( <> ) {
WORD: foreach ( split ) {
last LINE if /^__END__\z/;
...
}
}
It reads more naturally to say "last" in english when you read it as "last line if it matches ...".
Theres an additional reason you might want to consider:
Last does more than just loop control.
sub hello {
my ( $arg ) = #_;
scope: {
foo();
bar();
last if $arg > 4;
baz();
quux();
}
}
Last as such is a general flow control mechanism not limited to loops. While of course, you can generalise the above as a loop that runs at most 1 times, the absence of a loop to me indicates "Break? What are we breaking out of?"
Instead, I think of "last" as "Jump to the position of the last brace", which is for this purpose, more semantically sensible.
I was asking the same question to Damian Conway about say. Perl 6 will introduce say, which is nothing more than print that automatically adds a newline. My question was why not simply use echo, because this is what echo does in Bash (and probably elsewhere).
His answer was: echo is 33% longer than say.
He has a point there. :)
Because it goes to the last of the loop. And because Larry Wall was a weird guy.