How can I access the Apache server configuration in a BEGIN block in mod_perl? - perl

I've been trying to switch from using PerlSetEnv to using custom configuration directives. I have my configuration module with a copy of set_val from the docs:
sub set_val
{
local our ($key, $self, $parms, $arg) = #_;
$self->{$key} = $arg;
unless ($parms->path)
{
local our $srv_cfg = Apache2::Module::get_config($self, $parms->server);
$srv_cfg->{$key} = $arg;
}
}
...which is called by every custom directive sub. Then I have in my .conf:
PerlLoadModule MyModule::ServerConfig
MyCustomDirective 'hello'
This works fine in that httpd -t okays the file's syntax. The problem is that I can't seem to get at the value from the config file from within a BEGIN block, which I need to do.
I've tried tinkering with all sorts of things:
BEGIN
{
use Apache2::CmdParms ();
# use Apache2::Directive ();
use Apache2::Module ();
# use Apache2::ServerUtil ();
# use Apache2::RequestUtil ();
use Data::Dump;
warn ddx(Apache2::Module::get_config('MyModule::ServerConfig', Apache2::CmdParms->server));
# warn ddx(Apache2::Directive->as_hash);
# warn Apache2::ServerUtil->dir_config('MyCustomDirective);
# warn Apache2::CmdParms->server->server_hostname();
}
...but to no avail. Most of my efforts (trying to access CmdParms->server for instance) result in Parent: child process exited with status 3221225477 -- Restarting and an automatic restart of Apache as it says. If I pass ServerUtil->server to get_config(), the server stays alive but the warning only prints out '1'.
I read somewhere that this is because you can't get at anything request-related within a BEGIN block, because requests vary. It kind of makes sense, except that with PerlOptions +GlobalRequest I have been able to see $ENV within a BEGIN block, so why wouldn't I be able to see my own directives, just as dependent as they are on how the request happens? Especially confusing is that if I try to pass Apache2::RequestUtil->request->per\_dir\_config() to get_config(), it says Global $r object is not available. If that's true in a BEGIN block, how is it I can get at $ENV?

Try add what you want to import function to other module and use this module in code where you usually put BEGIN block. It should work same. May be it helps.

Partly, Dump isn't being used correctly. This works better:
use Data::Dump qw(pp);
warn pp(Apache2::Module::get_config('MyModule::ServerConfig', Apache2::ServerUtil->server));
However, it doesn't show any directives that appear within <Directory> blocks.
In my particular case, though, I don't need that functionality, on second thought; that just happens to be where I had stuck them.

Related

How to globally override method in Perl

I am forced to use old version of Apache2::Cookie class which contains bug in method fetch().
I tried to override method in startup script, but it doesn't work later in other modules than startup script:
local *Apache2::Cookie::fetch = sub { ... }
How to override method globally for all other modules?
As Sobrique pointed out in their comment, the local is definitely an issue. But not the only one.
You need to load the pacakge first before you do this. Perl will take the last definition of the sub, just like the last assigned value will be the value of a variable.
my $foo;
$foo = 1;
$foo = 3;
print $foo; # 3, obviously
The same goes for your overwriting.
*Apache2::Cookie::fetch = sub { ... }; # note ... is valid syntax
require 'foo';
$cookie = Apache2::Cookie->new; # not sure if that is correct
# in foo.pm
use Apache2::Cokie; # this will overwrite your implementation
Loading modules in Perl works with %INC, a super-global hash that keeps track of which files have been loaded. If you use a file twice, it only gets loaded and parsed the first time. The second time, it will only call import on the package.
So the trick is to load Apache2::Cookie first, so it's already parsed when the real code loads it again.
use Apache2::Cookie;
BEGIN {
*Apache2::Cookie::fetch = sub { ... };
}
require 'foo'; # or use, no matter
$cookie = Apache2::Cookie->new; # not sure if that is correct
# in foo.pm
use Apache2::Cokie; # now this won't overwrite your implementation
Now Perl already has the file loaded, and the package installed in its guts. Then you overwrite the sub. Then it gets not loaded again and your fix is still intact when it gets called.

Accessing subs from a require'd perl script

I'm going to import some perl code with the require statement. The code I'd like to import is in mylibA.pl:
#!/usr/bin/perl
package FOO::BAR;
sub routine {
print "A message!\n";
}
and mylibB.pl:
#!/usr/bin/perl
package FOO::BAZ;
sub routine {
print "Another message!\n";
}
Then I'm going to use it like this:
#!/usr/bin/perl
foreach my $lib (qw/ mylibA.pl mylibB.pl /){
require $lib;
print "Make a call to ${lib}'s &routine!\n";
}
Is there a way for my script to figure out the namespace that was pulled in with the require statement?
Wow. I have to say this is the one of the most interesting Perl questions I've seen in a while. On the surface this seems like a very simple request - get an included module's namespace, but there really is no way to do this. You can get it while in the package, but not from outside the package. I tried using EXPORT to send the local package name back to the caller script but that ended up going nowhere given the difference in how "use" and "require" work. A more module type of approach probably would have worked with a "use" statement, but the requirement that the required script be able to run by themselves prevented that approach. The only thing left to do was to directly pollute the caller's namespace and hope for the best (assume that the caller had no package namespace) - something that modules are designed to prevent.
BTW - I can't believe this actually works - in strict mode, no less.
caller.pl
#!/usr/bin/perl
use strict;
#package SomePackageName; #if you enable this then this will fail to work
our $ExportedPackageName;
print "Current package=".__PACKAGE__."\n";
foreach my $lib (qw/ mylibA.pl mylibB.pl /){
require $lib;
print "Make a call to ${lib}'s &routine!\n";
print "Package name exported=".$ExportedPackageName."\n";
$ExportedPackageName->routine;
} #end foreach
print "Normal Exit";
exit;
__END__
mylibA.pl
#!/usr/bin/perl
package FOO::BAR;
use strict;
#better hope the caller does not have a package namespace
$main::ExportedPackageName=__PACKAGE__;
sub routine {
print "A message from ".__PACKAGE__."!\n";
}
1;
mylibB.pl
#!/usr/bin/perl
package FOO::BAZ;
use strict;
#better hope the caller does not have a package namespace
$main::ExportedPackageName=__PACKAGE__;
sub routine {
print "Another message, this time from ".__PACKAGE__."!\n";
}
1;
Result:
c:\Perl>
c:\Perl>perl caller.pl
Current package=main
Make a call to mylibA.pl's &routine!
Package name exported=FOO::BAR
A message from FOO::BAR!
Make a call to mylibB.pl's &routine!
Package name exported=FOO::BAZ
Another message, this time from FOO::BAZ!
Normal Exit
Regarding the mostly academical problem of finding the package(s) in a perl source file:
You can try the CPAN module Module::Extract::Namespaces to get all packages within a perl file. It is using PPI and is thus not 100% perfect, but most of the time good enough:
perl -MModule::Extract::Namespaces -e 'warn join ",", Module::Extract::Namespaces->from_file(shift)' /path/to/foo.pm
But PPI can be slow for large files.
You can try to compare the active packages before and after the require. This is also not perfect, because if your perl library file loads additional modules then you cannot tell which is the package of the prinicipal file and what's loaded later. To get the list of packages you can use for example Devel::Symdump. Here's a sample script:
use Devel::Symdump;
my %before = map { ($_,1) } Devel::Symdump->rnew->packages;
require "/path/to/foo.pm";
my %after = map { ($_,1) } Devel::Symdump->rnew->packages;
delete $after{$_} for keys %before;
print join(",", keys %after), "\n";
You can also just parse the perl file for "package" declarations. Actually, that's what the PAUSE upload daemon is doing, so it's probably "good enough" for most cases. Look at the subroutine packages_per_pmfile in
https://github.com/andk/pause/blob/master/lib/PAUSE/pmfile.pm
There are two problems here:
How do I change the behaviour of a script when executed as a standalone and when used as a module?
How do I discover the package name of a piece of code I just compiled?
The general answer to question 2 is: You don't, as any compilation unit may contain an arbitrary number of packages.
Anyway, here are three possible solutions:
Name your modules so that you already know the name when you load it.
Have each module register itself at a central rendezvous point.
Like #1, but adds autodiscovery of your plugins.
The simplest solution is to put all of the API in an ordinary module, and put the standalone logic in a seperate script:
/the/location/
Module/
A.pm
B.pm
a-standalone.pl
b-standalone.pl
Where each standalone basically looks like
use Module::A;
Module::A->run();
If another script wants to reuse that code, it does
use lib "/the/location";
use Module::A;
...
If the loading happens on runtime, then Module::Runtime helps here:
use Module::Runtime 'use_module';
use lib "/the/location";
my $mod_a = use_module('Module::A');
$mod_a->run();
It isn't strictly necessary to place the contents of a-standalone.pl and Module/A.pm into separate files, although that is clearer. If you want to conditionally run code in a module only if it is used as a script, you can utilize the unless(caller) trick.
Of course all of this is tricksing: Here we determine the file name from the module name, not the other way round – which as I already mentioned we cannot do.
What we can do is have each module register itself at a certain predefined location, e.g. by
Rendezvous::Point->register(__FILE__ => __PACKAGE__);
Of course the standalone version has to shield against the possibility that there is no Rendezvous::Point, therefore:
if (my $register = Rendezvous::Point->can("register")) {
$register->(__FILE__ => __PACKAGE__);
}
Eh, this is silly and violates DRY. So let's create a Rendezvous::Point module that takes care of this:
In /the/location/Rendezvous/Point.pm:
package Rendezvous::Point;
use strict; use warnings;
my %modules_by_filename;
sub get {
my ($class, $name) = #_;
$modules_by_filename{$name};
}
sub register {
my ($file, $package) = #_;
$modules_by_filename{$file} = $package;
}
sub import {
my ($class) = #_;
$class->register(caller());
}
Now, use Rendezvous::Point; registers the calling package, and the module name can be retrived by the absolute path.
The script that wants to use the various modules now does:
use "/the/location";
use Rendezvous::Point (); # avoid registering ourself
my $prefix = "/the/location";
for my $filename (map "$prefix/$_", qw(Module/A.pm Module/B.pm)) {
require $filename;
my $module = Rendezvous::Point->get($filename)
// die "$filename didn't register itself at the Rendezvous::Point";
$module->run();
}
Then there are fully featured plugin systems like Module::Pluggable. This system works by looking at all paths were Perl modules may reside, and loads them if they have a certain prefix. A solution with that would look like:
/the/location/
MyClass.pm
MyClass/
Plugin/
A.pm
B.pm
a-standalone.pl
b-standalone.pl
Everything is just like with the first solution: Standalone scripts look like
use lib "/the/location/";
use MyClass::Plugin::A;
MyClass::Plugin::A->run;
But MyClass.pm looks like:
package MyClass;
use Module::Pluggable require => 1; # we can now query plugins like MyClass->plugins
sub run {
# Woo, magic! Works with inner packages as well!
for my $plugin (MyClass->plugins) {
$plugin->run();
}
}
Of course, this still requires a specific naming scheme, but it auto-discovers possible plugins.
As mentioned before it is not possible to look up the namespace of a 'required' package without extra I/O, guessing or assuming.
Like Rick said before, one have to intrude the namespace of the caller or better 'main'. I prefer to inject specific hooks within a BEGIN block of the 'required' package.
#VENDOR/App/SocketServer/Protocol/NTP.pm
package VENDOR::App::SocketServer::Protocol::NTP;
BEGIN {
no warnings;
*main::HANDLE_REQUEST = \&HANDLE_REQUEST;
}
sub HANDLE_REQUEST {
}
#VENDOR/App/SocketServer.pm
my $userPackage= $ARGV[0];
require $userPackage;
main::HANDLE_REQUEST();
Instead of *main:: you can get more specific with *main::HOOKS::HANDLE_REQUESTS i.e. This enables you to resolve all injected hooks easily within the caller by iterating over the HOOK's namespace portion.
foreach my $hooks( keys %main::HOOKS ) {
}

Perl module loading - Safeguarding against: perhaps you forgot to load "Bla"?

When you run perl -e "Bla->new", you get this well-known error:
Can't locate object method "new" via package "Bla"
(perhaps you forgot to load "Bla"?)
Happened in a Perl server process the other day due to an oversight of mine. There are multiple scripts, and most of them have the proper use statements in place. But there was one script that was doing Bla->new in sub blub at line 123 but missing a use Bla at the top, and when it was hit by a click without any of the other scripts using Bla having been loaded by the server process before, then bang!
Testing the script in isolation would be the obvious way to safeguard against this particular mistake, but alas the code is dependent upon a humungous environment. Do you know of another way to safeguard against this oversight?
Here's one example how PPI (despite its merits) is limited in its view on Perl:
use strict;
use HTTP::Request::Common;
my $req = GET 'http://www.example.com';
$req->headers->push_header( Bla => time );
my $au=Auweia->new;
__END__
PPI::Token::Symbol '$req'
PPI::Token::Operator '->'
PPI::Token::Word 'headers'
PPI::Token::Operator '->'
PPI::Token::Word 'push_header'
PPI::Token::Symbol '$au'
PPI::Token::Operator '='
PPI::Token::Word 'Auweia'
PPI::Token::Operator '->'
PPI::Token::Word 'new'
Setting the header and assigning the Auweia->new parse the same. So I'm not sure how you can build upon such a shaky foundation. I think the problem is that Auweia could also be a subroutine; perl.exe cannot tell until runtime.
Further Update
Okay, from #Schwern's instructive comments below I learnt that PPI is just a tokenizer, and you can build upon it if you accept its limitations.
Testing is the only answer worth the effort. If the code contains mistakes like forgetting to load a class, it probably contains other mistakes. Whatever the obstacles, make it testable. Otherwise you're patching a sieve.
That said, you have two options. You can use Class::Autouse which will try to load a module if it isn't already loaded. It's handy, but because it affects the entire process it can have unintended effects.
Or you can use PPI to scan your code and find all the class method calls. PPI::Dumper is very handy to understand how PPI sees Perl.
use strict;
use warnings;
use PPI;
use PPI::Dumper;
my $file = shift;
my $doc = PPI::Document->new($file);
# How PPI sees a class method call.
# PPI::Token::Word 'Class'
# PPI::Token::Operator '->'
# PPI::Token::Word 'method'
$doc->find( sub {
my($node, $class) = #_;
# First we want a word
return 0 unless $class->isa("PPI::Token::Word");
# It's not a class, it's actually a method call.
return 0 if $class->method_call;
my $class_name = $class->literal;
# Next to it is a -> operator
my $op = $class->snext_sibling or return 0;
return 0 unless $op->isa("PPI::Token::Operator") and $op->content eq '->';
# And then another word which PPI identifies as a method call.
my $method = $op->snext_sibling or return 0;
return 0 unless $method->isa("PPI::Token::Word") and $method->method_call;
my $method_name = $method->literal;
printf "$class->$method_name seen at %s line %d.\n", $file, $class->line_number;
});
You don't say what server enviroment you're running under, but from what you say it sounds like you could do with preloading all your modules in advance before executing any individual pages. Not only would this prevent the problems you're describing (where every script has to remember to load all the modules it uses) but it would also save you memory.
In pre-forking servers (as is commonly used with mod_perl and Apache) you really want to load as much of your code before your server forks for the first time so that the code is stored once in copy-on-write shared memory rather than mulitple times in each child process when it is loaded on demand.
For information on pre-loading in Apache, see the section of Practical mod_perl

Log4perl: How do I dynamically load appenders at runtime?

I'd like to have modules managing their logging at runtime, but without having everything referring to a single monolithic config file. When dealing with processes running under different permissions, I really don't want to deal with each process needing to be able to access every log on the system when they're only writing to a subset of them.
However, I'm not finding much documentation in the Log4perl manual on how to initialize additional appenders from a configuration file at runtime. http://metacpan.org/pod/Log::Log4perl::Appender references an add_appender method, but that works on instantiated appender objects instead of conf files. It also doesn't define the logger objects and the logger->appender relations.
I tried having each package init from its own conf, but that simply clobbers the existing config each time it's initalized. What I'd like to do is something along the lines of:
my $foo = Foo->new() ## Checks Log::Log4perl::initialized(), sees that it
## hasn't been initalized yet, inits Log4perl from foo.conf
my $bar = Bar->new() ## Checks Log::Log4perl::initialized(), sees that it
## has been initalized. Adds appenders and loggers defined
## in bar.conf into the initialized configuration
How can I parse and add the configuration into the current config?
Edit: Probalem with using a package variable is that this is just a Moose role being consumed by various classes, pretty much just a MooseX::Role::Parameterized version of Ether's answer in Making self-logging modules with Log::Log4perl. Thus, my logger is getting composed into the library consuming it, and I don't have a global variable I can work on each time I use it.
Though..
If I declare a global variable outside of the MooseX::Role::Parameterized role block, would each and every class that consumes the role be using that same conf variable?
You can remember what config files was already loaded (%log_configs hash in code below). When new class arrives, you can reread all configs, merge it together and init Log::Log4perl again using string reference parameter to init.
I generally prefer having a single log configuration per application, because of easier maintenance and reload capability.
package Logger;
use Moose::Role;
use Log::Log4perl;
our %log_configs = ();
around BUILDARGS => sub {
my $orig = shift;
my $class = shift;
my $config_name = lc($class) . '.conf';
# if the config is not integrated yet
if(! defined $log_configs{$config_name}) {
$log_configs{$config_name} = 1;
# reload all configs including new one
my $config_text = '';
for my $file (sort keys %log_configs) {
$config_text .= "\n" . do {
local $/; # slurp
unless(open my $fh, "<", $file) {
warn "$file could not be open\n";
'';
}
else {
<$fh>
}
};
}
# refresh config
Log::Log4perl::init(\$config_text);
}
return $class->$orig(#_);
};
package Foo;
use Moose;
with 'Logger';
use Log::Log4perl ':easy';
sub BUILD {
ERROR 'Foo reporting';
}
package Bar;
use Moose;
with 'Logger';
use Log::Log4perl ':easy';
sub BUILD {
INFO 'Bar reporting';
}
package main;
my $foo = Foo->new;
my $bar = Bar->new;
While I was hoping to avoid it, if I parse the config files myself I can then access the configuration in perl via the API documented in http://search.cpan.org/perldoc?Log::Log4perl. Namely,
########################
# Initialization section
########################
use Log::Log4perl;
use Log::Log4perl::Layout;
use Log::Log4perl::Level;
# Define a category logger
my $log = Log::Log4perl->get_logger("Foo::Bar");
# Define a layout
my $layout = Log::Log4perl::Layout::PatternLayout->new("[%r] %F %L %m%n");
# Define a file appender
my $file_appender = Log::Log4perl::Appender->new(
"Log::Log4perl::Appender::File",
name => "filelog",
filename => "/tmp/my.log");
# Define a stdout appender
my $stdout_appender = Log::Log4perl::Appender->new(
"Log::Log4perl::Appender::Screen",
name => "screenlog",
stderr => 0);
# Have both appenders use the same layout (could be different)
$stdout_appender->layout($layout);
$file_appender->layout($layout);
$log->add_appender($stdout_appender);
$log->add_appender($file_appender);
$log->level($INFO);
While the other method works, there's too many caveats for me to be comfortable using it (Gee I used this library, why'd my logging stop?) -- it's just too surprising for my tastes.
Instead, I think I'm going to see if I can't get from config file to Log::Log4perl state by looking through how to use Log::Log4perl::Config::PropertyConfigurator, which is delegated to by ->init when parsing a config file is needed. If I go over the data structure that returns, i can compare changes to the initialization on a logger-by-logger and appender-by-appender basis and modify the initialized state appropriately, handle namespace collisions properly, etc.

Making self-logging modules with Log::Log4perl

Is there a way to use Log::Log4perl to make a smart self-logging module that logs its operations to a file even in the absence of the calling script not initializing Log4perl? As far as I can tell from the documentation, the only way to use Log4perl is to initialize it in the running script from a configuration, then modules implementing Log4perl calls log themselves based on the caller's Log4perl config.
Instead, I'd like the modules to provide a default initialization config for Log4perl. This would provide the default file appender for the module's category. Then, I could override this behavior by initing Log4perl in the caller with a different config if needed, and everything would hopefully just work.
Is this sort of defensive logging behavior possible or am I going to need to rely on initing Log4perl in every .pl script that calls the module I want logged?
I do this in a custom Log role in Moose (irrelevant complicated code removed):
package MyApp::Role::Log;
use Moose::Role;
use Log::Log4perl;
my #methods = qw(
log trace debug info warn error fatal
is_trace is_debug is_info is_warn is_error is_fatal
logexit logwarn error_warn logdie error_die
logcarp logcluck logcroak logconfess
);
has _logger => (
is => 'ro',
isa => 'Log::Log4perl::Logger',
lazy_build => 1,
handles => \#methods,
);
around $_ => sub {
my $orig = shift;
my $this = shift;
# one level for this method itself
# two levels for Class:;MOP::Method::Wrapped (the "around" wrapper)
# one level for Moose::Meta::Method::Delegation (the "handles" wrapper)
local $Log::Log4perl::caller_depth;
$Log::Log4perl::caller_depth += 4;
my $return = $this->$orig(#_);
$Log::Log4perl::caller_depth -= 4;
return $return;
} foreach #methods;
method _build__logger => sub {
my $this = shift;
my $loggerName = ref($this);
Log::Log4perl->easy_init() if not Log::Log4perl::initialized();
return Log::Log4perl->get_logger($loggerName)
};
As you can see, the log object is self-initializing -- if Log::Log4perl->init has not been called, then easy_init is called. You could easily modify this to allow each module to customize its logger -- I do so with optional role parameters, with ref($this) as the default fallback.
PS. You may also want to look at MooseX::Log::Log4perl, which is where I started before I used the logger role above. Someday when I get around to it I will submit some much-needed patches to that MX module to incorporate some features I have added.
The short answer is to call Log::Log4perl::initialized(); at some point and if it is false, set up some default logging.
The tricky part is "some point."
You can't do it in BEGIN {}, because then the main script's will stomp your initialisation even though you created unnessary files. You don't want to do it before each call to get_logger() because that is wasteful. So you should do it during the module's initialisation, say sub new or sub init.