Perl IPC - FIFO and daemons & CPU Usage

I have a mail parser perl script which is called every time a mail arrives for a user (using .qmail). It extracts a calendar attachment out of the mail and places the "path" of the file in a FIFO queue implemented using the Directory::Queue module.
Another perl script, which reads the path of the calendar attachment and performs certain file operations on the local system as well as on the remote CalDAV server, is run as a daemon, as explained here. So basically this script looks like:
my $declarations;

sub foo {
    # ...
}

sub bar {
    # ...
}

while ($keep_running) {
    for (keep-checking-the-queue-for-new-entries) {
        sub caldav_logic1 {
            # ...
        }
        sub caldav_logic2 {
            # ...
        }
    }
}
I am using Proc::Daemon for running the script as a daemon. Now the problem is that this process has almost 100% CPU usage. What are the suggested ways to implement the daemon in a more standard, safer way? I am using pretty much the same code as in the linked Proc::Daemon example.

I bet it is your for loop and checking for new queue entries.
There are ways to watch a directory for file changes. These ways are OS dependent but there might be a Perl module that wraps them up for you. Use that instead of busy looping. Even with a sleep delay, the looping is inefficient when you can have your program told exactly when to wake up by an OS event.
File::ChangeNotify looks promising.
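For example, here is a hedged sketch (not the poster's code) of replacing the busy loop with File::ChangeNotify; the queue directory, filename filter, and process_calendar() handler are assumptions for illustration only:
use strict;
use warnings;
use File::ChangeNotify;

# Hypothetical stand-in for the poster's CalDAV/file logic.
sub process_calendar { my ($path) = @_; warn "would process $path\n" }

my $watcher = File::ChangeNotify->instantiate_watcher(
    directories => ['/var/spool/calqueue'],   # hypothetical queue directory
    filter      => qr/\.ics$/,                # hypothetical: only calendar files
);

# wait_for_events() blocks in the OS until something changes, so the
# process uses no CPU while the queue is idle.
while ( my @events = $watcher->wait_for_events ) {
    for my $event (@events) {
        next unless $event->type eq 'create';
        process_calendar( $event->path );
    }
}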

Maybe you don't want truly continuous polling. Is keep-checking-the-queue-for-new-entries a CPU-intensive part of the code, even when the queue is empty? That would explain why your processor is always busy.
Try putting a sleep 1 statement at the very top (or very bottom) of the while loop to let the processor rest between queue checks. If that doesn't degrade the program performance too much (i.e., if everyone can tolerate waiting an extra second before the company calendars get updated) and if the CPU usage still seems high, try sleep 2, sleep 5, etc.
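For illustration, a minimal sketch of that loop with a sleep added; next_queue_entry() and handle_entry() are hypothetical stand-ins for the poster's Directory::Queue iteration and CalDAV logic:
while ($keep_running) {
    while ( my $entry = next_queue_entry() ) {   # drain whatever is queued right now
        handle_entry($entry);
    }
    sleep 1;   # try sleep 2, sleep 5, ... if CPU usage is still too high
}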

cpan Linux::Inotify2
The kernel knows when files change and pushes that information to your program, which then runs your callback. This should be better, because your sub runs only when a file actually changes.
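A hedged sketch of how that could look for the queue directory; the path and process_calendar() handler are assumptions, not the poster's code:
use strict;
use warnings;
use Linux::Inotify2;

# Hypothetical stand-in for the poster's CalDAV/file logic.
sub process_calendar { my ($path) = @_; warn "would process $path\n" }

my $inotify = Linux::Inotify2->new
    or die "unable to create inotify object: $!";

# Fire the callback when a file is fully written or moved into the directory.
$inotify->watch( '/var/spool/calqueue', IN_CLOSE_WRITE | IN_MOVED_TO, sub {
    my $event = shift;
    process_calendar( $event->fullname );
} );

1 while $inotify->poll;   # blocks in the kernel until an event arrives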

Related

Tcl/Tk - How to keep other buttons useable while separate function still running?

I'm very new to Tcl/Tk and have been dealing with an issue for the last couple of days. Basically I have a server written in C and a client GUI written in Tcl/Tk. So far it doesn't do a ton. To test it, I start up the server so that it's listening for connections, then run my GUI. When I click one of the buttons, the GUI should open up a separate toplevel window with a text widget embedded in it. (This part works.) Then, my client connects to the server and gives it a couple of settings, and through this the server decides what info to send back. The server's response is what gets printed to that second window's text widget.
What I'm trying to add in now is a Stop button. Right now, my server is set up to wait a couple of seconds, then write the same message to the client. This is set up inside a loop that is waiting to hear a "Stop" command from my client. I have a Stop button in the GUI with a command set up to write that command to the server when clicked. However, all of my buttons get frozen as soon as I hit the begin button and messages are written to the client.
Basically, how can I keep allowing my server to write to my client while still keeping the rest of my GUI usable? I want my client to write a new line to the text widget on my separate window whenever it receives a new message from the server, but I still want the main GUI window that has all my command buttons to behave independently.
In general, it depends on whether what you are doing is CPU-intensive (where reading from a plain file counts as CPU-intensive) or I/O-intensive (where running things in another process counts as I/O-intensive; database calls often end up counting as CPU-intensive here, even though they don't really need to). I'm only going to give summaries of what's going on, as you aren't quite providing enough information.
For I/O-intensive code, you want to structure your code to be event-driven. Tcl has good tools for this, in that fileevent works nicely on sockets, terminals and pipelines on all supported platforms. The coroutine system of Tcl 8.6 can help a lot with preventing the callbacks required from turning your code into a tangled mess!
For CPU-intensive code, the main option is to run in another thread. That thread won't be able to touch the GUI directly (which in turn will be free to be responsive), but will be able to do all the work and send messages back to the main thread with whatever UI updates it wants done. (Technically you can do this with I/O-intensive code too, but it's more irritating than using a coroutine.) Farming things out to a subprocess is just another variation on this where the communications are more expensive (but much isolation is enforced by the OS).
If you're dealing with sockets, you're probably I/O-intensive. Assume that until you show otherwise. Here's a simple example:
proc gets_async {sock} {
    fileevent $sock readable [info coroutine]
    while {[gets $sock data] < 0 && [fblocked $sock]} {
        yield
    }
    fileevent $sock readable {}
    return $data
}
proc handler {socket} {
    set n 0
    while {![eof $socket]} {
        # Write to the server
        puts $socket "this is message [incr n] to the server"
        # Read from the server
        puts [gets_async $socket]
    }
    close $socket
}
proc launchCommunications {host port} {
    set sock [socket $host $port]
    fconfigure $sock -blocking 0 -encoding utf-8
    coroutine comms($host:$port) handler $sock
}
Note that gets_async is much like coroutine::util gets in Tcllib.

Forwarding AnyEvent::Log messages to a callback if certain requirements are met

I am working on a project that uses AnyEvent Log in the main program as well as several dependent modules/packages. I currently have each module writing to its own context, and all contexts are added to the main program's context as slaves. This project is part of a much larger project, and in addition to writing out a local log file, there are certain messages that I would like to send to a remote program which will then be responsible for presenting the messages to users.
The problem is that in order to send to the remote program, I have to have a piece of information that is only available from the main program, so it's not feasible to just implement a method at the package level to send messages. The piece of information I need is more or less a transaction ID, and the log messages are interesting events from a particular transaction.
The main program has 2 contexts (main, secondary). The messages I am interested in will either come from the secondary ctx OR one of the package/module contexts. I am interested in only sending info - crit level messages to users, but ONLY WHEN the txID exists in the main program. I ALWAYS want messages to be written to my local log file regardless of whether or not a deployment is running. I would like this to be something that I set up in the main program rather than in a module, because the modules are tasked to do certain things and shouldn't even be aware of the fact that there is an ID associated with the task at hand.
Here is a quick breakdown of the log configuration specific code in the main program.
# Immediately after Proc::Daemon::Init
my $logger = AnyEvent::Log::ctx "desman";

# configure is done before daemonization to allow for --nodaemon
sub configure {
    my ( $level, $file ) = @_;
    $AnyEvent::Log::FILTER->level($level);
    $AnyEvent::Log::LOG->log_to_file($file);
}

sub log_event {
    # ... logic to send messages as tx event ...
}

sub worker_init {
    threads->create(sub {
        $logger->attach( my $worklog = AnyEvent::Log::ctx "worker" );
        # ... more stuff for worker specifics ...
    });
}
Ideally, I would be able to use one or both of log_cb and fmt_cb to handle the formatting and sending of messages to the remote program using the log_event sub. I have tried a few different things, and so far I'm stuck.
# doesn't seem to do anything
$logger->fmt_cb( sub { ... } );
$logger->log_cb( sub { ... } );

# broke everything
$AnyEvent::Log::COLLECT->attach( my $evtlog = new AnyEvent::Log::Ctx
    fmt_cb => \&event_formatter,
    log_cb => \&log_event
);
$evtlog->levels( 'crit', 'warning', 'notice', 'info' );
I've been searching around for more examples than what's in the docs, but haven't found much yet. Not much of a surprise there since AE::log is pretty much awesome as it is, but anything to help will be greatly appreciated.

Daemon::Generic locks disappearing upon daemonization on Solaris

I'm trying to use Daemon::Generic, and it seems to be exactly what I need, except on Solaris it seems that locking the pid file doesn't work. As a result, status always reports the process is dead, even when it is quite clearly alive.
For example, the following:
use Daemon::Generic;

sub gd_run
{
    sleep(60);
}

sub gd_preconfig
{
    return ();
}

newdaemon(pidfile => "/tmp/myproc.pid");
... always reports the process is dead.
Note that /tmp is mounted locally, so it seems like that should avoid issues with locks over network drives.
Any ideas how to get this to work?
Edit:
On closer investigation, by putting sleep 60 commands at various points in Generic.pm, it seems the lock is lost somewhere in the function gd_daemonize. Do locks not survive forks on Solaris? Is there a way to ensure the lock survives the daemonization process?

Setting up the alarm in threads with a message if timed out

I'm referring to this question, but didn't want to post there as it was asked half a year ago and is already answered.
I think that I need to set the alarm within the thread because it is listening for a connection (sockets) and I don't know what time to set for the alarm until the client sends a command.
Short context: a client sends a command which tells my script to run a self-written Perl module. This module needs to be killed if it runs longer than it should. This "should" is very specific and will be written in the config file for each module.
I tried the alarm within a simple perl script and it worked quite well - even with my own message.
I am able to let the alarm quit the script, but it does not give me a message at all.
I used this example until I noticed that it may work differently with threads.
Then I tried Thread::alarm($time), but as I only started with Perl about 3 weeks ago I wasn't able to implement it correctly (it just does nothing; it does not even end the program).
Do you need any code to help or is there a site with examples that I could use and which I just did not find?
Did you already try AnyEvent?
AnyEvent lets you set up watchers that act like timers:
# one-shot or repeating timers
my $w = AE::timer $seconds, 0, sub { ... }; # executed only once
my $w = AE::timer $seconds, $interval, sub { ... }; # executed every $interval
$seconds could be defined during the config phase, at thread start.
In callbacks you may use the same code that kills the program.
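For example, a hedged sketch of a watchdog built from an AE timer; $timeout would come from the module's config entry, and the callback body stands in for whatever code currently kills the task (assumptions, not the poster's code):
use strict;
use warnings;
use AnyEvent;

my $timeout = 5;                      # would be read from the module's config file
my $done    = AnyEvent->condvar;

my $watchdog = AE::timer $timeout, 0, sub {
    warn "task exceeded ${timeout}s, giving up\n";
    $done->send;                      # or kill the worker thread/process here
};

# ... start the long-running module work here ...

$done->recv;                          # run the event loop until finished or timed out
# Let $watchdog go out of scope (or undef it) to cancel the timer when the task ends.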
AnyEvent also has its own logging framework, AnyEvent::Log, which logs nothing by default, but you can enable some logging to see if it suits your needs for messages.

How do I call a perl process that is already running from another script?

Problem:
scriptA.cgi is sitting in an infinite loop and handling an open socket to a Flash client.
scriptB.cgi is called from the web, does what it needs to do and then needs to inform scriptA to send a message to the client.
Is this possible? I'm stuck on how to have scriptB identify the instance of scriptA that is sitting there with the socket connection, rather than launching one of its own.
all thoughts appreciated.
If the communication needs are simple, this is a good application for signals.
Edited to store process id from scriptA and read it in scriptB -- scripts A and B have to agree on a name.
# script B
do_scriptB_job();
if (open(my $PID_FILE, "<", "scriptA.pid.file")) {
    chomp(my $process_id_for_scriptA = <$PID_FILE>);
    close $PID_FILE;
    kill 'USR1', $process_id_for_scriptA;   # makes scriptA run the SIGUSR1 handler
}

# script A
open(my $PID_FILE, ">", "scriptA.pid.file");
print $PID_FILE $$;
close $PID_FILE;

my $signaled = 0;
$SIG{USR1} = sub { $signaled = 1 };   # simple SIGUSR1 handler: set a flag

while ( in_infinite_loop ) {
    if ($signaled) {
        # this block runs only if SIGUSR1 was received
        # since the last time this block was run
        send_a_message_to_the_client();
        $signaled = 0;
    } else {
        do_something_else();
    }
}
unlink "scriptA.pid.file";   # cleanup
When script A receives a SIGUSR1 signal, the script will be interrupted to run the USR1 signal handler, setting $signaled. The thread of execution will then resume and the script can use the information.
Have scriptA store its pid somewhere (in a db with some kind of id), then scriptB can look up the pid in the db and send a signal to scriptA.
Edit:
Answering question asked in comment
The pid of the process can be found using Perl's built-in variable $$ (also available as $PID or $PROCESS_ID if you use the English module).
See perlvar for details.
I hope this is the ID you were looking for. If not, you'll have to find a way to tell the different scriptA instances apart (perhaps by session ID or socket; I can't help you further there).
Other people have mentioned how to get the PID (if you didn't fork() it yourself, just have the other process write it... somewhere... that both processes know how to get it, or walk the process table, but that's a horrible solution and completely unscalable beyond a singleton).
Since you note that any thoughts are welcome, note that perldoc perlipc explains a variety of mechanisms you might use for the actual communication:
NAME
perlipc - Perl interprocess communication (signals, fifos, pipes, safe
subprocesses, sockets, and semaphores)
DESCRIPTION
The basic IPC facilities of Perl are built out of the good old Unix
signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
IPC calls. Each is used in slightly different situations.
Domain sockets: http://www.perl.com/doc/FMTEYEWTK/IPC/unix.html
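As a hedged illustration of one of those mechanisms, here is a small sketch of a Unix-domain socket that scriptB could use to poke scriptA; the socket path and the "notify" message are assumptions, and both ends are shown in one listing for brevity:
use strict;
use warnings;
use IO::Socket::UNIX;

my $path = '/tmp/scriptA.sock';
unlink $path;   # remove any stale socket from a previous run

# In scriptA: create the listening socket once, then accept/read inside its loop.
my $server = IO::Socket::UNIX->new(
    Type   => SOCK_STREAM,
    Local  => $path,
    Listen => 1,
) or die "cannot listen on $path: $!";

# In scriptB: connect and send a one-line request.
my $client = IO::Socket::UNIX->new(
    Type => SOCK_STREAM,
    Peer => $path,
) or die "cannot connect to $path: $!";
print $client "notify\n";

# Back in scriptA: accept the connection and read the request.
my $conn = $server->accept;
chomp( my $request = <$conn> );
print "scriptA received: $request\n";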
I was tempted to answer 'send signals' or 'use some kind of IPC to talk between apps', but a far easier and more scalable approach is to use an SQLite (or other) database that all the scripts can talk to.
ScriptA.cgi would poll the database by doing something like 'SELECT event FROM events WHERE clientID=?'.
ScriptB.cgi would simply insert a row into the events table with the right clientID.
That avoids all of the 'find the pid' mess and also means that you don't get the blocking I/O problems you would get with named pipes, or if one script crashed.
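A hedged sketch of that approach with DBI and DBD::SQLite; the table layout, database file name, and $client_id are assumptions, not part of the original answer:
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect( 'dbi:SQLite:dbname=events.db', '', '',
                        { RaiseError => 1, AutoCommit => 1 } );

$dbh->do(<<'SQL');
CREATE TABLE IF NOT EXISTS events (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    client_id TEXT    NOT NULL,
    event     TEXT    NOT NULL,
    done      INTEGER NOT NULL DEFAULT 0
)
SQL

my $client_id = 'client-42';   # hypothetical

# scriptB.cgi: queue a message for the client and exit.
$dbh->do( 'INSERT INTO events (client_id, event) VALUES (?, ?)',
          undef, $client_id, 'refresh' );

# scriptA.cgi: inside its socket loop, poll for pending events.
my $rows = $dbh->selectall_arrayref(
    'SELECT id, event FROM events WHERE client_id = ? AND done = 0',
    undef, $client_id );
for my $row (@$rows) {
    my ( $id, $event ) = @$row;
    # ... send $event to the Flash client over the open socket ...
    $dbh->do( 'UPDATE events SET done = 1 WHERE id = ?', undef, $id );
}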