In the example in perlmod/Perl Modules there is a BEGIN block. I looked at some modules but none of these had a BEGIN block. Should I use such a BEGIN block when writing a module or is it dispensable?
You only need a BEGIN block if you need to execute some code at compile time versus run-time.
An example: Suppose you have a module Foo.pm in a non-standard library directory (like /tmp). You know you can have perl find the module by modifying #INC to include /tmp. However, this will not work:
unshift(#INC, '/tmp');
use Foo; # perl reports Foo.pm not found
The problem is that the use statement is executed at compile time whereas the unshift statement is executed at run time, so when perl looks for Foo.pm, the include path hasn't been modified (yet).
The right way to accomplish this is:
BEGIN { unshift(#INC, '/tmp') };
use Foo;
Now the unshift statement is executed at compile-time and before the use Foo statement.
The vast majority of scripts will not require BEGIN blocks. A lot of what you need in BEGIN blocks can be obtained through use-ing other modules. For instance, in this case we could make sure /tmp is in #INC by using the lib.pm module:
use lib '/tmp';
use Foo;
A BEGIN block in a module is entirely dispensable. You only use it if there is something that must be done by your module when it is loaded, before it is used. There are seldom reasons to do much at that point, so there are seldom reasons to use a BEGIN block.
Related
I use the AnyEvent::DNS module.
I want to disable IPv6, so that the resolver only makes a request for A record.
AnyEvent::DNS, uses the environment variable $ENV{PERL_ANYEVENT_PROTOCOLS}
But setting the variable does not work; the resolver still sends two requests A and AAAA
Code from AnyEvent::DNS:
our %PROTOCOL; # (ipv4|ipv6) => (1|2), higher numbers are preferred
BEGIN {
...;
my $idx;
$PROTOCOL{$_} = ++$idx
for reverse split /\s*,\s*/,
$ENV{PERL_ANYEVENT_PROTOCOLS} || "ipv4,ipv6";
}
How to define an environment variable before loading modules?
Since the code that checks the environment variable is in a BEGIN block, it will be run immediately once the Perl compiler reaches it.
When Perl starts compiling your script, it checks for use statements first. So when you use AnyEvent::DNS, Perl loads that module and parses the file. BEGIN blocks are executed at that stage, while code in methods will only be compiled, not executed.
So if you have something like the following, the code you showed above will be run before you even set that variable.
use strict;
use warnings;
use AnyEvent::DNS;
$ENV{PERL_ANYEVENT_PROTOCOLS} = 'ipv4';
...
There are two ways you can circumvent that.
You can put the assignment in your own BEGIN block before you load AnyEvent::DNS. That way it will be set first.
use strict;
use warnings;
BEGIN {
$ENV{PERL_ANYEVENT_PROTOCOLS} = 'ipv4';
}
use AnyEvent::DNS;
Alternatively, you can just call your program with the environment variable set for it from the shell.
$ PERL_ANYEVENT_PROTOCOLS=ipv4 perl resolver.pl
The second one is more portable, in case you later want it to do IPv6 after all.
Read more about BEGIN in perlmod.
What I want to achieve:
###############CODE########
old_procedure(arg1, arg2);
#############CODE_END######
I have a huge code which has a old procedure in it. I want that the call to that old_procedure go to a call to a new procedure (new_procedure(arg1, arg2)) with the same arguments.
Now I know, the question seems pretty stupid but the trick is I am not allowed to change the code or the bad_function. So the only thing I can do it create a procedure externally which reads the code flow or something and then whenever it finds the bad_function, it replaces it with the new_function. They have a void type, so don't have to worry about the return values.
I am usng perl. If someone knows how to atleast start in this direction...please comment or answer. It would be nice if the new code can be done in perl or C, but other known languages are good too. C++, java.
EDIT: The code is written in shell script and perl. I cannot edit the code and I don't have location of the old_function, I mean I can find it...but its really tough. So I can use the package thing pointed out but if there is a way around it...so that I could parse the thread with that function and replace function calls. Please don't remove tags as I need suggestions from java, C++ experts also.
EDIT: #mirod
So I tried it out and your answer made a new subroutine and now there is no way of accessing the old one. I had created an variable which checks the value to decide which way to go( old_sub or new_sub)...is there a way to add the variable in the new code...which sends the control back to old_function if it is not set...
like:
use BadPackage; # sub is defined there
BEGIN
{ package BapPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
# check for the variable and send to old_sub if the var is not set
sub bad_sub
{ # good code
}
}
# Thanks #mirod
This is easier to do in Perl than in a lot of other languages, but that doesn't mean it's easy, and I don't know if it's what you want to hear. Here's a proof-of-concept:
Let's take some broken code:
# file name: Some/Package.pm
package Some::Package;
use base 'Exporter';
our #EXPORT = qw(forty_two nineteen);
sub forty_two { 19 }
sub nineteen { 19 }
1;
# file name: main.pl
use Some::Package;
print "forty-two plus nineteen is ", forty_two() + nineteen();
Running the program perl main.pl produces the output:
forty-two plus nineteen is 38
It is given that the files Some/Package.pm and main.pl are broken and immutable. How can we fix their behavior?
One way we can insert arbitrary code to a perl command is with the -M command-line switch. Let's make a repair module:
# file: MyRepairs.pm
CHECK {
no warnings 'redefine';
*forty_two = *Some::Package::forty_two = sub { 42 };
};
1;
Now running the program perl -MMyRepairs main.pl produces:
forty-two plus nineteen is 61
Our repair module uses a CHECK block to execute code in between the compile-time and run-time phase. We want our code to be the last code run at compile-time so it will overwrite some functions that have already been loaded. The -M command-line switch will run our code first, so the CHECK block delays execution of our repairs until all the other compile time code is run. See perlmod for more details.
This solution is fragile. It can't do much about modules loaded at run-time (with require ... or eval "use ..." (these are common) or subroutines defined in other CHECK blocks (these are rare).
If we assume the shell script that runs main.pl is also immutable (i.e., we're not allowed to change perl main.pl to perl -MMyRepairs main.pl), then we move up one level and pass the -MMyRepairs in the PERL5OPT environment variable:
PERL5OPT="-I/path/to/MyRepairs -MMyRepairs" bash the_immutable_script_that_calls_main_pl.sh
These are called automated refactoring tools and are common for other languages. For Perl though you may well be in a really bad way because parsing Perl to find all the references is going to be virtually impossible.
Where is the old procedure defined?
If it is defined in a package, you can switch to the package, after it has been used, and redefine the sub:
use BadPackage; # sub is defined there
BEGIN
{ package BapPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
sub bad_sub
{ # good code
}
}
If the code is in the same package but in a different file (loaded through a require), you can do the same thing without having to switch package.
if all the code is in the same file, then change it.
sed -i 's/old_procedure/new_procedure/g codefile
Is this what you mean?
I'm trying to initialize a global variable from a Perl module that has a BEGIN block, but I can't manage to get it work.
This is the Perl module
Package A::B;
our $var;
BEGIN{
$var ||= "/some/default/path";
#create/access files/folders in $var
}
This is my CGI script
use A::B;
$A::B::var = "/correct/path";
But #error returned because $var is not the correct path
The BEGIN block is being executed before the correct path is assigned to $var. Is there a way to work around this without having to remove code from the BEGIN block?
BEGIN { $A::B::var = "/correct/path" }
use A::B;
This answer is unsatisfying to me, but I can't think of any other way, given how your A::B is designed. Ideally a module doesn't depend on who is using it or how often, but this module can really only be used once.
Short version
Is it possible to access variables from a module declared as our using unqualified names within the BEGIN code block, but using qualified names outside? In particular, can this be done without explicitely naming the package in the module file?
Example
Let demomod.pm be
use strict;
use warnings;
package demomod;
our $foo;
BEGIN { $foo = 42; }
1;
and demoscript.pl be
#!/usr/bin/perl -Tw
use strict;
use warnings;
BEGIN { #INC = ('.', #INC); }
use demomod;
print $demomod::foo."\n";
In this case, all names agree, and everything works as it should. Is there a way to omit the line package demomod; from the demomod.pm code and still let this work?
Motivation
The reason why I'm asking is because I encountered something along these lines during a recent upgrade of Foswiki. That software has a module Foswiki.pm which does not have a package line (EDIT: seems the package line only got lost in my local copy, for reasons unknown). It declares and initializes a variable $engine like in my example. There also is a CGI script called view which sets #INC and then does use Foswiki (); followed by $Foswiki::engine->run(). This last line always fails for me due to the variable not being initialized:
Can't call method "run" on an undefined value at …/view
In the BEGIN block of the module, $engine is set correctly but $Foswiki::engine apparently is not. So it looks like there were two variables here, one qualified and a different one unqualified.
All that code apparently works for others, and a previous version used to work for me as well, without a package line either. So while I try to understand how this broke, I also try to understand how this could work before, without that line in place. Is there some mechanism that would make this work?
If you have no package statement in your code then any package variables will be declared into the main package. So no, you cannot do what you describe.
If you look at line 2 of the Foswiki code that you linked, you will see that it does have a package statement.
Is there a way to load entire modules during runtime in Perl? I had thought I found a good solution with autouse but the following bit of code fails compilation:
package tryAutouse2;
use autouse 'tryAutouse';
my $obj = tryAutouse->new();
I imagine this is because autouse is specifically meant to be used with exported functions, am I correct? Since this fails compilation, is it impossible to have a packaged solution? Am I forced to require before each new module invocation if I want dynamic loading?
The reasoning behind this is that my team loads many modules, but we're afraid this is eating up memory.
You want Class::Autouse or ClassLoader.
Due to too much magic, I use ClassLoader only in my REPL for convenience. For serious code, I always load classes explicitely. Jack Maney points out in a comment that Module::Load and Module::Load::Conditional are suitable for delayed loading.
There's nothing wrong with require IMO. Skip the export of the function and just call the fully qualified name:
require Some::Module;
Some::Module::some_function(#some_arguments);
eval 'use tryAutouse; 1;' or die $#;
Will work. But you might want to hide the ugliness.
When you say:
use Foo::Bar;
You're loading module Foo::Bar in at compile time. Thus, if you want to load your module in at run time, you'd use require:
require Foo::Bar;
They are sort of equivalent, but there are differences. See the Perldoc on use to understand the complete difference. For example, require used in this way won't automatically load in imported functions. That might be important to you.
If you want to test whether a module is there or not, wrap up your require statement in an eval and test whether or not eval is successful.
I use a similar technique to see if a particular Perl module is available:
eval { require Mail::Sendmail; };
if ($#) {
$watch->_Send_Email_Net_SMTP($watcher);
return;
}
In the above, I'll attempt to use Mail::Sendmail which is an optional module if it's available. If not, I'll run another routine that uses Net::SMTP:
sub _Send_Email_Net_SMTP {
my $self = shift;
my $watcher = shift;
require Net::SMTP; #Standard module: It should be available
WORD O'WARNING: You need to use curly braces around your eval statement and not parentheses. Otherwise, if the require doesn't work, your program will exit which is probably not what you want to do.
Instruction 'use' is performed at compile time, so check the path to the module also takes place at compile time. This may cause incorrect behavior, which are difficult to understand until you consider the contents of the #INC array.
One solution is to add block 'BEGIN', but the solution shown below is inelegant.
BEGIN { unshift #INC, '/path/to/module/'; }
use My::Module;
You can replace the whole mess a simple directive:
use lib '/path/to/module';
use My::Module;
This works because it is performed at compile time. So everything is ready to execute 'use' instruction.
Instead of the 'BEGIN' block, you can also decide to different instruction executed at compile time ie declaring a constant.
use constant LIB_DIR => '/path/to/module';
use lib LIB_DIR;
use My::Module;