alarm does not seem to fire if I set $SIG{ALRM} - perl

I'm trying to implement an alarm in my Perl backend process so that it will terminate if it gets stuck for too long. I tried the code given on the alarm documentation page on Perldoc (verbatim from the documentation, except that the line calling my program's key subroutine replaces the sample line):
eval {
local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
alarm $timeout;
&FaithTree::Backend::commandLine({ 'skipWidgets' => $skipWidgets, 'commandLineId' => $commandLineId, 'force' => $force });
alarm 0;
};
if ($@) {
die unless $@ eq "alarm\n"; # propagate unexpected errors
# timed out
}
else {
# didn't
}
Given this code, nothing happens when the alarm should have timed out. On the other hand, if I remove the custom definition for $SIG{ALRM} (which, again, came straight from the Perl documentation) the alarm does fire, just without the custom handler.
I'm wondering if the fact that I'm using Thread::Queue is playing a role in the alarm failing, but that doesn't explain why it works so long as I skip redefining $SIG{ALRM}.
Here's a minimal, runnable version with a subroutine that is intentionally an infinite loop for testing:
eval {
$SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
alarm 1;
&FaithTree::Test::Backend::commandLine({ 'skipWidgets' => $skipWidgets, 'commandLineId' => $commandLineId, 'force' => $force });
alarm 0;
};
if ($@) {
die unless $@ eq "alarm\n"; # propagate unexpected errors
# timed out
}
else {
exit;
}
package FaithTree::Test::Backend;
use feature qw( say ); # fetchChild uses say
use File::Tail;
use threads;
use threads::shared;
use Thread::Queue;
sub commandLine {
our $N //= 4;
my $Q = new Thread::Queue;
my @kids = map threads->create( \&FaithTree::Test::Backend::fetchChild, $Q ), 1 .. $N;
my @feeds = ( "1","2","3","4" );
foreach my $feed (@feeds) {
$Q->enqueue( $feed );
}
$Q->enqueue( ( undef ) x $N );
$_->join for @kids;
}
sub fetchChild {
print "Test";
# Access queue.
my $Q = shift;
#What is my thread id?
my $tid = threads->tid();
my ($num, $num2) = (0, 0);
for ( ; ; ){
if ($num2 == 10000) {
say STDERR $tid . ': ' . $num;
$num2 = 0;
}
$num++;
$num2++;
}
return 1;
}
If you comment out the $SIG{ALRM} line, it will terminate when the alarm is set to time out. If you leave it in place, it will never terminate.

Signals and threads don't mix well. You might want to rethink your use of signals. For example, you could move all the thread stuff to a child process.
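For example (a minimal sketch only, reusing the question's variables and subroutine, untested against the real program), the parent can fork a child to do all the thread work and enforce the timeout on the child from a thread-free parent:
use strict;
use warnings;

my $timeout = 10;

my $pid = fork // die "$0: fork: $!";
if (!$pid) {
    # Child: all the thread/queue work happens here.
    FaithTree::Backend::commandLine({
        skipWidgets   => $skipWidgets,
        commandLineId => $commandLineId,
        force         => $force,
    });
    exit 0;
}

# Parent: no threads here, so ALRM is delivered promptly.
eval {
    local $SIG{ALRM} = sub { die "alarm\n" };
    alarm $timeout;
    waitpid $pid, 0;   # interrupted by ALRM if the child is stuck
    alarm 0;
};
if ($@) {
    die $@ unless $@ eq "alarm\n";
    kill 'TERM', $pid; # stop the stuck child
    waitpid $pid, 0;   # and reap it
}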
Signal handlers are only called between Perl ops. The main thread is blocked in a call to the XS sub join, and the signal handler will only be called once join returns.
Most blocking system calls can be interrupted (returning error EINTR), so it might be possible to write a signal-aware version of join. Except I seem to remember pthread functions not being interruptible, so maybe not.
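If blocking in join is the problem, one workaround (a sketch, assuming one-second polling granularity is acceptable) is not to block in join at all: poll is_joinable so the interpreter regularly reaches a point between ops where the ALRM handler can run.
# Sketch: a signal-friendly join that polls instead of blocking in XS.
while (grep { !$_->is_joinable() } threads->list()) {
    sleep 1;   # sleep is interrupted by signals, so $SIG{ALRM} can fire
}
$_->join() for threads->list(threads::joinable);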
In this particular case, you could have the threads signal the main thread when they are done, using a mechanism that allows the main thread to block until a signal occurs or a timeout has occurred. cond_signal/cond_timedwait is such a mechanism.
use Sub::ScopeFinalizer qw( scope_finalizer );
use Time::HiRes qw( time );
my $lock :shared;
my $threads_remaining = 0;
my $Q = new Thread::Queue;
my @threads;
{
lock $lock;
for (1..$N) {
++$threads_remaining;
push @threads, async {
my $guard = scope_finalizer {
lock $lock;
--$threads_remaining;
cond_signal($lock);
};
worker($Q);
}
}
}
my $max_end_time = time + 1;
# ...
{
lock $lock;
while ($threads_remaining) {
if (!cond_timedwait($lock, $max_end_time)) {
# ... Handle timeout ...
}
}
}
$_->join for @threads;

Related

What is best way in Perl to set a timer to stop long-running process?

I've got an application that invokes a potentially long-running process. I want my program, the caller of that process, to cancel it and move on to the next entry whenever a time limit is exceeded. Using Perl's AnyEvent module, I tried something like this:
#!/usr/bin/env perl
use Modern::Perl '2017';
use Path::Tiny;
use EV;
use AnyEvent;
use AnyEvent::Strict;
my $cv = AE::cv;
$cv->begin; ## In case the loop runs zero times...
while ( my $filename = <> ) {
chomp $filename;
$cv->begin;
my $timer = AE::timer( 10, 0, sub {
say "Canceled $filename...";
$cv->end;
next;
});
potentially_long_running_process( $filename );
$cv->end;
}
$cv->end;
$cv->recv;
exit 0;
sub potentially_long_running_process {
my $html = path('foo.html')->slurp;
my @a_pairs = ( $html =~ m|(<a [^>]*>.*?</a>)|gsi );
say join("\n", @a_pairs);
}
The problem is the long-running processes never time out and get canceled, they just keep on going. So my question is "How do I use AnyEvent (and/or related modules) to time out a long-running task?"
You have not mentioned the platform you are running this script on, but if it is running on *nix, you can use the SIGALRM signal, something like this:
my $run_flag = 1;
$SIG{ALRM} = sub {
$run_flag = 0;
};
alarm (300);
while ($run_flag) {
# do your stuff here
# note - you cannot use sleep and alarm at the same time
}
print "This will print after 300 seconds";

Perl how to join all threads

My code is like below:
use threads;
use threads::shared;
use Thread::Queue;
my $q = Thread::Queue->new();
my @threads = ();
my $run :shared = 1;
$SIG{'TERM'} = sub {
$run = 0;
$q->enqueue('exit');
foreach(@threads){
$_->join();
}
};
push #threads, threads->create(proc1);
push #threads, threads->create(proc2);
sub proc1 {
while($p = $q->dequeue()){
if($p eq 'exit'){
last;
}
.....
}
$q->end();
threads->exit();
}
sub proc2 {
while($run){
.....
}
}
On a TERM signal I'm trying to wait until all threads have ended. However, whenever I send a TERM signal, my program terminates with an error:
Segmentation fault
How can I fix this?
Assuming threads->create(proc1) even works (and it only would because you didn't use use strict; as you should), your program exits immediately after creating the threads. You need to have your main thread wait for the child threads to finish.
Fixing that problem (and applying some simplifications) results in the following:
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue 3.01 qw( );
my $q = Thread::Queue->new();
my $run :shared = 1;
$SIG{TERM} = sub {
print("Received SIGTERM. Shutting down...\n");
$run = 0;
$q->end();
};
print("$$\n");
threads->create(\&proc1);
threads->create(\&proc2);
$_->join() for threads->list();
sub proc1 {
while(my $p = $q->dequeue()) {
sleep 1; # Placeholder
}
}
sub proc2 {
while($run){
sleep 1; # Placeholder
}
}
I don't get a seg fault, but the program doesn't exit either. The signal handler is simply not called. It's because Perl is waiting for join to return before calling the signal handler. You can solve that by polling the list of joinable threads. In other words, replace
$_->join() for threads->list();
with
my $running_threads = 2;
while ($running_threads) {
for my $thread (threads->list(threads::joinable)) {
$thread->join();
$running_threads--;
}
sleep 1;
}

How to multithread seeing if a webpage exists in Perl?

I'm writing a Perl script that takes in a list of URLs and checks to see if they exist. (Note that I only care if they exist; I don't care what their contents are.) Here's the important part of the program.
use LWP::Simple qw($ua head);
if (head($url))
{
$numberAlive ++;
}
else
{
$numberDead ++;
}
Right now the program works fine; however, I want it to run faster. Thus I'm considering making it multithreaded. I assume that the slow part of my program is contacting the server for each URL; therefore, I'm looking for a way in which I can send out requests to the URLs of other webpages on my list while I'm waiting for the first response. How can I do this? As far as I can tell, the head routine doesn't have a callback that can get called once the server has responded.
Begin with familiar-looking front matter.
#! /usr/bin/env perl
use strict;
use warnings;
use 5.10.0; # for // (defined-or)
use IO::Handle;
use IO::Select;
use LWP::Simple;
use POSIX qw/ :sys_wait_h /;
use Socket;
Global constants control program execution.
my $DEBUG = 0;
my $EXIT_COMMAND = "<EXIT>";
my $NJOBS = 10;
URLs to check arrive one per line on a worker's end of the socket. For each URL, the worker calls LWP::Simple::head to determine whether the resource is fetchable. The worker then writes back to the socket a line of the form url:␣status, where status is either "YES" or "NO" and ␣ represents the space character.
If the URL is $EXIT_COMMAND, then the worker exits immediately.
sub check_sites {
my($s) = @_;
warn "$0: [$$]: waiting for URL" if $DEBUG;
while (<$s>) {
chomp;
warn "$0: [$$]: got '$_'" if $DEBUG;
exit 0 if $_ eq $EXIT_COMMAND;
print $s "$_: ", (head($_) ? "YES" : "NO"), "\n";
}
die "NOTREACHED";
}
To create a worker, we start by creating a socketpair. The parent process will use one end and each worker (child) will use the other. We disable buffering at both ends and add the parent end to our IO::Select instance. We also note each child’s process ID so we can wait for all workers to finish.
sub create_worker {
my($sel,$kidpid) = @_;
socketpair my $parent, my $kid, AF_UNIX, SOCK_STREAM, PF_UNSPEC
or die "$0: socketpair: $!";
$_->autoflush(1) for $parent, $kid;
my $pid = fork // die "$0: fork: $!";
if ($pid) {
++$kidpid->{$pid};
close $kid or die "$0: close: $!";
$sel->add($parent);
}
else {
close $parent or die "$0: close: $!";
check_sites $kid;
die "NOTREACHED";
}
}
To dispatch URLs, the parent grabs as many readers as are available and hands out the same number of URLs from the job queue. Any workers that remain after the job queue is empty receive the exit command.
Note that print will fail if the underlying worker has already exited. The parent must ignore SIGPIPE to prevent immediate termination.
sub dispatch_jobs {
my($sel,$jobs) = @_;
foreach my $s ($sel->can_write) {
my $url = @$jobs ? shift @$jobs : $EXIT_COMMAND;
warn "$0 [$$]: sending '$url' to fd ", fileno $s if $DEBUG;
print $s $url, "\n" or $sel->remove($s);
}
}
By the time control reaches read_results, the workers have been created and received work. Now the parent uses can_read to wait for results to arrive from one or more workers. A defined result is an answer from the current worker, and an undefined result means the child has exited and closed the other end of the socket.
sub read_results {
my($sel,$results) = @_;
warn "$0 [$$]: waiting for readers" if $DEBUG;
foreach my $s ($sel->can_read) {
warn "$0: [$$]: reading from fd ", fileno $s if $DEBUG;
if (defined(my $result = <$s>)) {
chomp $result;
push @$results, $result;
warn "$0 [$$]: got '$result' from fd ", fileno $s if $DEBUG;
}
else {
warn "$0 [$$]: eof from fd ", fileno $s if $DEBUG;
$sel->remove($s);
}
}
}
The parent must keep track of live workers in order to collect all results.
sub reap_workers {
my($kidpid) = @_;
while ((my $pid = waitpid -1, WNOHANG) > 0) {
warn "$0: [$$]: reaped $pid" if $DEBUG;
delete $kidpid->{$pid};
}
}
Running the pool executes the subs above to dispatch all URLs and return all results.
sub run_pool {
my($n,@jobs) = @_;
my $sel = IO::Select->new;
my %kidpid;
my #results;
create_worker $sel, \%kidpid for 1 .. $n;
local $SIG{PIPE} = "IGNORE"; # writes to dead workers will fail
while (@jobs || keys %kidpid || $sel->handles) {
dispatch_jobs $sel, \@jobs;
read_results $sel, \@results;
reap_workers \%kidpid;
}
warn "$0 [$$]: returning #results" if $DEBUG;
#results;
}
Using an example main program
my @jobs = qw(
bogus
http://stackoverflow.com/
http://www.google.com/
http://www.yahoo.com/
);
my @results = run_pool $NJOBS, @jobs;
print $_, "\n" for @results;
the output is
bogus: NO
http://www.google.com/: YES
http://stackoverflow.com/: YES
http://www.yahoo.com/: YES
Another option is HTTP::Async.
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Async;
my $numberAlive = 0;
my $numberDead = 0;
my @urls = ('http://www.perl.com','http://www.example.xyzzy/foo.html');
my $async = HTTP::Async->new;
# you might want to wrap this in a loop to deal with @urls in batches
foreach my $url (@urls){
$async->add( HTTP::Request->new( HEAD => $url ) );
}
while ( my $response = $async->wait_for_next_response ) {
if ($response->code == 200){$numberAlive ++;}
else{$numberDead ++;}
}
print "$numberAlive Alive, $numberDead Dead\n";
Worker-based parallelisation (using your choice of threads or processes):
use strict;
use warnings;
use feature qw( say );
use threads; # or: use forks;
use LWP::Simple qw( head );
use Thread::Queue::Any qw( );
use constant NUM_WORKERS => 10; # Or whatever.
my $req_q = Thread::Queue::Any->new();
my $resp_q = Thread::Queue::Any->new();
my @workers;
for (1..NUM_WORKERS) {
push @workers, async {
while (my $url = $req_q->dequeue()) {
my $is_alive = head($url) ? 1 : 0;
$resp_q->enqueue($is_alive);
}
};
}
$req_q->enqueue($_) for @urls;
my ($alive, $dead);
for (1..@urls) {
my $is_alive = $resp_q->dequeue();
++( $is_alive ? $alive : $dead );
}
$req_q->enqueue(undef) for @workers;
$_->join for @workers;
say $alive;
say $dead;

What does send/recv/begin/end mean for AnyEvent's condvar?

I'm at a loss as to what it means, though I've read several examples of it:
#!/usr/bin/perl
use strict;
use AnyEvent;
my $cv = AnyEvent->condvar( cb => sub {
warn "done";
});
for my $i (1..10) {
$cv->begin;
my $w; $w = AnyEvent->timer(after => $i, cb => sub {
warn "finished timer $i";
undef $w;
$cv->end;
});
}
$cv->recv;
Can anyone explain in more detail what send/recv/begin/end do?
UPDATE
my $i = 1;
my $s = sub {
print $i;
};
my $i = 10;
$s->(); # 1
In the code you provided, the condvar is there to prevent the program from exiting prematurely. Without the recv, the program would end before any timers would have a chance to fire. With the recv, all ten timers must fire before recv returns.
recv will block if send has never been called. It will unblock when send is called.
begin and end are an alternative to using send. When there have been as many end calls as begin calls, a send occurs.
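For illustration, here is a minimal sketch (not from the question) of the plain send/recv pairing:
use strict;
use warnings;
use AnyEvent;

my $cv = AnyEvent->condvar;

# Arrange for send to be called after two seconds.
my $w = AnyEvent->timer(after => 2, cb => sub { $cv->send("done") });

# recv runs the event loop until send is called, then returns its arguments.
my $result = $cv->recv;
print "$result\n";   # prints "done" after about two seconds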

How can I implement a timeout for a qx(command)?

How could I implement a timeout in this piece of code: if the "hwinfo --usb" command hasn't returned anything after a certain amount of time, (stop the command and) return or die from the sub _usb_device.
#!/usr/bin/env perl
use warnings;
use strict;
sub _usb_device {
my @array;
{
local $/ = "";
@array = qx( hwinfo --usb );
}
...
...
}
Timeouts are usually done with alarms.
sub _usb_device
{
# Scope array
my @array;
# Try shell command
eval
{
local $SIG{ALRM} = sub { die "timeout\n" };
local $/ = "";
alarm 10;
@array = qx( hwinfo --usb );
alarm 0;
};
# Catch and rethrow non-timeout errors
die $@ if $@ && $@ ne "timeout\n";
# Done
return @array;
}
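One caveat: when the alarm fires, the die only unwinds the eval; the external command started by qx may keep running. A sketch (an assumption on my part, not part of the answer above) that swaps qx for a piped open so the handler knows which PID to kill:
sub _usb_device {
    my @array;
    my $pid;
    eval {
        local $SIG{ALRM} = sub {
            kill 'KILL', $pid if $pid;   # stop the stuck command itself
            die "timeout\n";
        };
        alarm 10;
        $pid = open my $fh, '-|', 'hwinfo', '--usb'
            or die "$0: open: $!";
        local $/ = "";
        @array = <$fh>;
        close $fh;
        alarm 0;
    };
    alarm 0;   # make sure no alarm is left pending on the error path
    die $@ if $@ && $@ ne "timeout\n";
    return @array;
}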