Will Log::Log4perl work in a multi-threaded environment?

I'm logging via Log::Log4perl in my Perl script; I'm wondering if making multiple calls to the same log object from different threads will cause incorrect or erroneous behavior.
I'm using a Log::Log4perl::Appender::File to write out the log in the following manner:
$log->info("Launching commands...");
foreach my $param (#params) {
push #thread_handles, async {
system("$param");
$log->info("$param COMPLETE");
logstatus($?);
};
}
$_->join() foreach #thread_handles;
$log->info("Commands completed...");

Log::Log4perl with the default file-based appender will work, but messages may overlap in a multi-threaded or multi-process environment that writes to the same log file.
One solution is to use Log::Log4perl::Appender::Synchronized as the appender. See "How can I run Log::Log4perl under mod_perl?" in the FAQ for more info.
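For reference, a minimal configuration sketch along the lines of the Synchronized appender documentation might look like this (the log file name is a placeholder; only the Syncer appender is attached to the logger, and it wraps the file appender):
use Log::Log4perl;

Log::Log4perl->init( \<<'CONF' );
log4perl.rootLogger                = INFO, Syncer
log4perl.appender.Logfile          = Log::Log4perl::Appender::File
log4perl.appender.Logfile.filename = commands.log
log4perl.appender.Logfile.layout   = Log::Log4perl::Layout::SimpleLayout
log4perl.appender.Syncer           = Log::Log4perl::Appender::Synchronized
log4perl.appender.Syncer.appender  = Logfile
CONF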

Using the Synchronized appender makes a lot of sense.
My question is:
Will the logger variable be passed to the threads? According to perlthrtut it will not. I have yet to verify this, unless somebody has already done it.
You may be tempted to use threads::shared to share the logger variable, but the threads::shared documentation specifies that you can only share scalars, arrays, or hashes. I tried it anyway with Perl 5.8.8 and, as expected, it does not work.
The other approach would be to create a separate logger for each subroutine that is invoked in a thread. The Log4perl appenders can be configured to avoid locking and interleaving, but I am seriously concerned about the performance impact of generating a separate logger instance for each active thread.
Updated:
It turns out one doesn't have to get complicated. If you initialize Log4perl as if you were writing a single-threaded script and call the logger object's methods without any special tricks, everything works as advertised. The logger object doesn't have to be passed to the thread entry point; the sub invoked there gets access to the logger methods the same way as if it had been called normally. The Synchronized appender keeps everything lined up.
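A minimal sketch of that setup, assuming a config file named log4perl.conf and placeholder commands; note that the logger is fetched inside the sub rather than passed to the thread:
use threads;
use Log::Log4perl qw(get_logger);

Log::Log4perl->init('log4perl.conf');   # once, in the main thread

sub worker {
    my ($param) = @_;
    my $log = get_logger();             # fetched inside the thread, not passed in
    system($param);
    $log->info("$param COMPLETE");
}

my @params = ('cmd1', 'cmd2');          # placeholder commands
my @thread_handles = map { threads->create(\&worker, $_) } @params;
$_->join for @thread_handles;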

Log4perl works in a threaded environment, but you have to choose the appenders carefully.
Unless you use the synchronized appenders, buffered log messages won't appear in order, but this is not a problem.
Either use a different file for each thread or add the PID to the log message.
The synchronized appenders may cause a lot of overhead in your threaded application, so use them carefully.
I opt for a single log file with the PID or some kind of thread identifier in each message. I am using this kind of logging without a problem.
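For the single-file case, one way to tag each line is the %P placeholder of a PatternLayout, which expands to the current process ID (for threads within one process you would need a custom placeholder for the thread ID instead):
log4perl.appender.Logfile.layout = Log::Log4perl::Layout::PatternLayout
log4perl.appender.Logfile.layout.ConversionPattern = %d [%P] %p %m%n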

Related

Perl script running a periodic (main) task and providing a REST interface

I am working on a Perl script which does some periodic processing based on file-system contents.
The overall structure is like this:
# ... initialization ...
while (1) {
    # ... scan filesystem, perform actions depending on changes detected ...
    sleep 5;
}
I would like to add the ability to feed some data into this process by exposing an interface through HTTP. E.g. I would like to add an endpoint to skip the sleep, but also some means to input data that is processed in the next iteration. Additionally, I would like to be able to query some of the program's status through HTTP (so a simple fork() to run the webserver part in a separate process would be insufficient).
So far I have used the Dancer2 framework once, but its start; call blocks and thus does not allow any other tasks (like my loop) to run. I could of course move the code which is currently inside the loop to an endpoint exposed through Dancer2, but then I would need to call that endpoint periodically (through an external program?), which seems to be quite an obscure indirection compared to just having the webserver part running in the background.
Is it possible to unobtrusively (i.e. without blocking the program) add a REST-server capability to a Perl script? If yes: Which modules would be used for the purpose? If no: Should I really implement an external process to periodically invoke a certain endpoint or pursue a different solution altogether?
(I have tried to add a dancer2 tag, but could not do so due to insufficient reputation. Do not be misled by this: I have so far only tried Dancer2, not Dancer (v1).)
You could try to launch your processing loop in a background thread, before you run start;.
See man perlthrtut
You probably want use threads::shared; to declare some variables shared between the REST part and the background thread, or use dedicated queues/event mechanisms.
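A rough sketch of that arrangement under Dancer2; the route, the parameter name and the queue handling are made up for illustration:
use strict;
use warnings;
use threads;
use Thread::Queue;
use Dancer2;

my $queue = Thread::Queue->new;    # items posted by the REST handlers

# background worker: the original periodic loop, plus draining the queue
threads->create(sub {
    while (1) {
        # ... scan filesystem, perform actions depending on changes detected ...
        while (defined(my $item = $queue->dequeue_nb)) {
            # ... fold $item into the next iteration ...
        }
        sleep 5;
    }
})->detach;

post '/enqueue' => sub {
    $queue->enqueue(scalar param('item'));   # 'item' is a made-up parameter name
    return 'queued';
};

start;    # blocks here, but the worker thread keeps running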

threads in Dancer

I'm using Dancer 1.31, in a standard configuration (plackup/Starman).
In a request I wished to call a Perl function asynchronously, so that the request returns immediately. Think of the typical "long running operation" scenario, in which one wants to return a "processing" page with a refresh+redirect.
I (naively?) tried with a thread:
sub myfunc {
    sleep 9; # just for testing a slow operation
}
any '/test1' => sub {
    my $thr = threads->create('myfunc');
    $thr->detach();
    return "done";
};
It does not work; the server seems to freeze, and the error log does not show anything. I guess manual creation of threads is forbidden inside Dancer? Is it an issue with PSGI? What is the recommended way?
I would stay away from Perl threads, especially in a web server environment. They will most likely crash your server when you join or detach them.
I usually create a few threads (a thread pool) BEFORE initializing other modules and keep them around for the entire lifetime of the application. Thread::Queue nicely provides communication between the workers and the main thread.
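A rough sketch of that worker-pool pattern (the pool size and the job payload are arbitrary):
use threads;
use Thread::Queue;

my $jobs = Thread::Queue->new;

# create the workers up front, before the rest of the app is initialized
my @pool = map {
    threads->create(sub {
        while (defined(my $job = $jobs->dequeue)) {
            # ... handle $job ...
        }
    });
} 1 .. 4;

# later, from the main thread:
my @work_items = (1 .. 10);            # placeholder jobs
$jobs->enqueue($_) for @work_items;
$jobs->enqueue(undef) for @pool;       # one undef per worker tells it to exit
$_->join for @pool;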
The best asynchronous solution I have found in Perl is POE. On Linux I prefer using POE::Wheel::Run to run executables and subroutines asynchronously. It uses fork and has a beautiful interface allowing communication with the child process. (On Windows it's not usable due to its thread dependency.)
Setting up Dancer and POE inside the same application/script may cause problems, and POE's event loop may be blocked. A single worker thread dedicated to POE may come in handy, or I would write another server based on POE and just communicate with the Dancer application via sockets.
Threads are definitely iffy with Perl. It might be possible to write some threaded Dancer code, but to be honest I don't think we ever tried it. And considering that Dancer 1's core uses singleton classes, it might also be very tricky.
As Ogla says, there are other ways to implement asynchronous behavior in Dancer. You say that you are using Starman, which is a forking engine. But there is also Twiggy, which is AnyEvent-based. To see how to leverage it to write asynchronous code, have a gander at Dancer::Plugin::Async.

Why doesn't Catalyst die only once in a chained action?

Consider the following actions:
sub get_stuff :Chained('/') :PathPart('stuff') :CaptureArgs(1) {
    my ($self, $c, $stuff_id) = @_;
    die "ARRRRRRGGGG";
}
sub view_stuff :Chained('get_stuff') :PathPart('') :Args(0) {
    die "DO'H";
}
Now if you request '/stuff/314/', you'll get
Error: ARRRRG in get_stuff at ...
Error: DO'H in view_stuff at ...
Is there a reason why not just throw the error at the first failing chain link?
Why is catalyst trying to carry on the chain?
I'm not sure of the answer as to 'why' but I presume it was done that way to give flexibility.
You should probably catch the error with eval (or preferably something like Try::Tiny or TryCatch) and call $c->detach if you want to stop processing actions.
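A hedged sketch of that approach, reusing the get_stuff action from the question (the logging call is illustrative):
use Try::Tiny;

sub get_stuff :Chained('/') :PathPart('stuff') :CaptureArgs(1) {
    my ($self, $c, $stuff_id) = @_;
    try {
        die "ARRRRRRGGGG";        # whatever might fail
    }
    catch {
        $c->log->error($_);       # $_ holds the error inside catch
        $c->detach;               # stop here; view_stuff never runs
    };
}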
Catalyst::Plugin::MortalForward
The chosen answer may be outdated. Catalyst can die early when the application's config key abort_chain_on_error_fix is set.
__PACKAGE__->config(abort_chain_on_error_fix => 1);
See the documentation on Catalyst configuration. It also states that this behaviour may become the default in the future.
Cucabit is right: detach is the way to go. As to the why, normally in a Perl process, 'die' stops the process. In Catalyst you don't want that. If, for instance, you run your Catalyst app under FastCGI, you spawn one or more standalone processes that handle multiple requests. If the first request killed the process itself, the web server would have to respawn the FastCGI process in order to handle the next call. I think that is why Catalyst catches 'die' (it's used a lot as the default 'do_something() or die $!') and turns it into an Exception.
You could also end the process with 'exit' I guess, but you end up with the same problems as above, killing the process.
What you can of course do is create your own 'die' method that logs the error with the default log object and then calls detach or something. It should also be possible to redefine the Catalyst exception handling, as anything is possible with Catalyst :)

How can I replace all 'die's with 'confess' in a Perl application?

I'm working in a large Perl application and would like to get stack traces every time 'die' is called. I'm aware of the Carp module, but I would prefer not to search/replace every instance of 'die' with 'confess'. In addition, I would like full stack traces for errors in Perl modules or the Perl interpreter itself, and obviously I can't change those to use Carp.
So, is there a way for me to modify the 'die' function at runtime so that it behaves like 'confess'? Or, is there a Perl interpreter setting that will throw full stack traces from 'die'?
Use Devel::SimpleTrace or Carp::Always and they'll do what you're asking for without any hard work on your part. They have global effect, which means they can easily be added for just one run on the commandline using e.g. -MDevel::SimpleTrace.
What about setting a __DIE__ signal handler? Something like
$SIG{__DIE__} = sub { Carp::confess @_ };
at the top of your script? See perlvar %SIG for more information.
I usually only want to replace the dies in a bit of code, so I localize the __DIE__ handler:
{
    use Carp;
    local $SIG{__DIE__} = \&Carp::confess;
    ....
}
As a development tool this can work, but some modules play tricks with this to get their features to work. Those features may break in odd ways when you override the handler they were expecting. It's not a good practice, but it happens sometimes.
The Error module will convert all dies to Error::Simple objects, which contain a full stack trace (the constructor parses the "at file... line..." text and creates the trace). You can use an arbitrary object (generally subclassed from Error::Simple) to handle errors via the $Error::ObjectifyCallback preference.
This is especially handy if you commonly throw around exceptions of other types to signal other events, as then you just add a handler for Error::Simple (or whatever other class you are using for errors) and have it dump its stacktrace or perform specialized logging depending on the type of error.
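For illustration only, installing the callback looks roughly like this; MyError stands in for whatever Error::Simple subclass you actually use:
use Error qw(:try);

# turn plain die() strings into objects of our own class
$Error::ObjectifyCallback = sub {
    my ($args) = @_;                   # hashref; the original message is in $args->{text}
    return MyError->new($args->{text});
};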

Is there a way to have managed processes in Perl (i.e. a threads replacement that actually works)?

I have a multithreaded application in Perl for which I have to rely on several non-thread-safe modules, so I have been using fork()ed processes with kill() signals as a message passing interface.
The problem is that the signal handlers are a bit erratic (to say the least) and often end up with processes that get killed in inappropriate states.
Is there a better way to do this?
Depending on exactly what your program needs to do, you might consider using POE, which is a Perl framework for multi-threaded applications with user-space threads. It's complex, but elegant and powerful and can help you avoid non-thread-safe modules by confining activity to a single Perl interpreter thread.
Helpful resources to get started:
Programming POE presentation by Matt Sergeant (start here to understand what it is and does)
POE project page (lots of cookbook examples)
Plus there are hundreds of pre-built POE components you can use to assemble into an application.
You can always have a pipe between parent and child to pass messages back and forth.
pipe my $reader, my $writer;
my $pid = fork();
if ( $pid == 0 ) {
    close $reader;
    ...
}
else {
    close $writer;
    my $msg_from_child = <$reader>;
    ....
}
Not a very comfortable way of programming, but it shouldn't be 'erratic'.
Have a look at forks.pm, a "drop-in replacement for Perl threads using fork()" which makes for much more sensible memory usage (but don't use it on Win32). It will allow you to declare "shared" variables and then it automatically passes changes made to such variables between the processes (similar to how threads.pm does things).
From perl 5.8 onwards you should be looking at the core threads module. Have a look at http://metacpan.org/pod/threads
If you want to use modules which aren't thread safe you can usually load them with a require and import inside the thread entry point.
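A minimal sketch of that trick; Some::Module is a placeholder for whichever non-thread-safe module you need:
use threads;

my $thr = threads->create(sub {
    require Some::Module;      # loaded after the thread starts, so only this
    Some::Module->import;      # thread's interpreter ever compiles it
    # ... use Some::Module here ...
});
$thr->join;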