True timeout on LWP::UserAgent request method - perl

I am trying to implement a request to an unreliable server. The request is a nice-to-have, but not 100% required for my Perl script to complete successfully. The problem is that the server will occasionally deadlock (we're trying to figure out why) and the request will never succeed. Since the server thinks it is live, it keeps the socket connection open, so LWP::UserAgent's timeout value does us no good whatsoever. What is the best way to enforce an absolute timeout on a request?
FYI, this is not a DNS problem. The deadlock has something to do with a massive number of updates hitting our Postgres database at the same time. For testing purposes, we've essentially put a while(1) {} line in the server's response handler.
Currently, the code looks like so:
my $ua = LWP::UserAgent->new;
$ua->timeout(5);
$ua->cookie_jar({});

my $req = HTTP::Request->new(POST => "http://$host:$port/auth/login");
$req->content_type('application/x-www-form-urlencoded');
$req->content("login[user]=$username&login[password]=$password");

# This line never returns
my $res = $ua->request($req);
I've tried using signals to trigger a timeout, but that does not seem to work.
eval {
    local $SIG{ALRM} = sub { die "alarm\n" };
    alarm(1);
    $res = $ua->request($req);
    alarm(0);
};
# This never runs
print "here\n";
The final answer I'm going to use was proposed by someone offline, but I'll mention it here. For some reason, Sys::SigAction works while $SIG{ALRM} does not. I'm still not sure why, though it is presumably because Perl's default "safe signals" defer delivery until the current opcode completes, so a plain $SIG{ALRM} handler cannot interrupt the blocking socket read, while POSIX::sigaction installs an "unsafe" handler that can. Either way, this has been tested to work. Here are two working versions:
# Takes an LWP::UserAgent and an HTTP::Request, returns an HTTP::Response
sub ua_request_with_timeout {
    my $ua  = $_[0];
    my $req = $_[1];

    # Enforce a hard maximum timeout per request in case of server
    # deadlock. (This has happened.)
    use Sys::SigAction qw( timeout_call );
    our $res = undef;
    if ( timeout_call( 5, sub { $res = $ua->request($req); } ) ) {
        return HTTP::Response->new(408);    # 408 is the HTTP request timeout
    } else {
        return $res;
    }
}
sub ua_request_with_timeout2 {
    print "ua_request_with_timeout\n";
    my $ua  = $_[0];
    my $req = $_[1];

    # Get whatever timeout is set for LWP and use that to
    # enforce a maximum timeout per request in case of server
    # deadlock. (This has happened.)
    my $timeout_for_client = $ua->timeout() - 2;
    our $socket_has_timedout = 0;

    use POSIX;
    sigaction( SIGALRM,
        POSIX::SigAction->new(
            sub {
                $socket_has_timedout = 1;
                die "alarm timeout";
            }
        )
    ) or die "Error setting SIGALRM handler: $!\n";

    my $res = undef;
    eval {
        alarm($timeout_for_client);
        $res = $ua->request($req);
        alarm(0);
    };

    if ($socket_has_timedout) {
        return HTTP::Response->new(408);    # 408 is the HTTP request timeout
    } else {
        return $res;
    }
}
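For completeness, a short usage sketch of the wrapper above; the $host, $port, $username and $password variables are the ones from the original snippet:

my $ua = LWP::UserAgent->new;
$ua->timeout(5);

my $req = HTTP::Request->new(POST => "http://$host:$port/auth/login");
$req->content_type('application/x-www-form-urlencoded');
$req->content("login[user]=$username&login[password]=$password");

my $res = ua_request_with_timeout($ua, $req);
if ($res->code == 408) {
    warn "Auth server did not answer in time; continuing without login\n";
}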

You might try LWPx::ParanoidAgent, a subclass of LWP::UserAgent which is more cautious about how it interacts with remote webservers.
Among other things, it allows you to specify a global timeout. It was developed by Brad Fitzpatrick as part of the LiveJournal project.
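As a rough sketch of how that might look, assuming the drop-in interface (LWPx::ParanoidAgent subclasses LWP::UserAgent, and its timeout() is documented as applying to the whole request rather than to each individual read):

use LWPx::ParanoidAgent;

my $ua = LWPx::ParanoidAgent->new;
$ua->timeout(10);       # overall limit for the entire request
$ua->cookie_jar({});

my $res = $ua->request($req);    # $req built as in the question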

You can make your own timeout like this:
use LWP::UserAgent;
use IO::Pipe;

my $agent    = LWP::UserAgent->new;
my $finished = 0;
my $timeout  = 5;

$SIG{CHLD} = sub { wait, $finished = 1 };

my $pipe = IO::Pipe->new;
my $pid  = fork;
if ($pid == 0) {
    $pipe->writer;
    my $response = $agent->get("http://stackoverflow.com/");
    $pipe->print($response->content);
    exit;
}

$pipe->reader;
sleep($timeout);

if ($finished) {
    print "Finished!\n";
    my $content = join('', $pipe->getlines);
}
else {
    kill(9, $pid);
    print "Timed out.\n";
}

From what I understand, the timeout property doesn't take into account DNS timeouts. It's possible that you could make a DNS lookup separately, then make the request to the server if that works, with the correct timeout value set for the useragent.
Is this a DNS problem with the server, or something else?
EDIT: It could also be a problem with IO::Socket. Try updating your IO::Socket module, and see if that helps. I'm pretty sure there was a bug in there that was preventing LWP::UserAgent timeouts from working.
Alex
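A minimal sketch of that separate-lookup idea, reusing Sys::SigAction from the accepted answer so the resolver call itself can also be bounded ($host, $ua and $req are the variables from the question):

use Socket qw(inet_aton);
use Sys::SigAction qw(timeout_call);

my $addr;
if ( timeout_call( 3, sub { $addr = inet_aton($host) } ) or !defined $addr ) {
    warn "DNS lookup for $host failed or timed out; skipping the request\n";
} else {
    $ua->timeout(5);                  # normal LWP timeout for the request itself
    my $res = $ua->request($req);
}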

The following generalization of one of the original answers also restores the previous SIGALRM handler and adds a second call to alarm(0) in case the code in the eval block throws a non-alarm exception and we want to cancel the alarm. Further $@ inspection and handling can be added:
sub ua_request_with_timeout {
    my $ua      = $_[0];
    my $request = $_[1];

    # Get whatever timeout is set for LWP and use that to
    # enforce a maximum timeout per request in case of server
    # deadlock. (This has happened.)
    my $timeout_for_client_sec = $ua->timeout();
    our $res_has_timedout = 0;

    use POSIX ':signal_h';
    my $newaction = POSIX::SigAction->new(
        sub { $res_has_timedout = 1; die "web request timeout"; },    # the handler code ref
        POSIX::SigSet->new(SIGALRM),
        # not using (perl 5.8.2 and later) 'safe' switch or sa_flags
    );
    my $oldaction = POSIX::SigAction->new();

    # log() is assumed to be the application's own logging helper
    if ( !sigaction( SIGALRM, $newaction, $oldaction ) ) {
        log( 'warn', "Error setting SIGALRM handler: $!" );
        return $ua->request($request);
    }

    my $response = undef;
    eval {
        alarm($timeout_for_client_sec);
        $response = $ua->request($request);
        alarm(0);
    };
    alarm(0);    # cancel the alarm (in case the eval failed for a non-alarm reason)

    if ( !sigaction( SIGALRM, $oldaction ) ) {
        log( 'warn', "Error resetting SIGALRM handler: $!" );
    }

    if ($res_has_timedout) {
        log( 'warn', "Timeout ($timeout_for_client_sec sec) while waiting for a response from cred central" );
        return HTTP::Response->new(408);    # 408 is the HTTP request timeout
    } else {
        return $response;
    }
}
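A short usage sketch, assuming $ua and $request are built as in the question:

my $response = ua_request_with_timeout($ua, $request);
if ($response->code == 408) {
    # the server never answered within $ua->timeout() seconds
}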

Related

Perl WebService increase max connections using HTTP::Server::Simple::CGI

I'm running a little Perl webservice, based on the first example I found on this page: https://www.perlmonks.org/?node_id=1078567
However, when a lot of clients call it at once, the requests suddenly start crashing, and there are a lot of TIME_WAIT TCP connections left on the server running the webservice, as if it were not able to handle that many connections at once.
Is there a parameter in that module (or another) that I could use to extend this, or a way to put some kind of queue in place for the incoming requests?
Some parts of my code, to help:
{
    package TACWebService;

    use HTTP::Server::Simple::CGI;
    use base qw(HTTP::Server::Simple::CGI);
    use Cwd 'abs_path';
    use POSIX;
    use DBI;
    use warnings;
    .........

    my %dispatch = (
        '/insertunix' => \&resp_insertunix,
        '/insertwin'  => \&resp_insertwin,
        '/getpwdate'  => \&resp_getpwdate,
    );

    # ---------------------------------------------------------------------
    # Requests Handling
    # ---------------------------------------------------------------------
    sub handle_request {
        my $self = shift;
        my $cgi  = shift;

        my $path    = $cgi->path_info();
        my $handler = $dispatch{$path};
        if (ref($handler) eq "CODE") {
            print "HTTP/1.0 200 OK\r\n";
            $handler->($cgi);
        } else {
            print "HTTP/1.0 404 Not found\r\n";
            print $cgi->header,
                  $cgi->start_html('Not found'),
                  $cgi->h1('Not found'),
                  $cgi->end_html;
        }
    }

    sub resp_insertwin {
        my $cgi = shift;    # CGI.pm object
        return if !ref $cgi;
        ....
        } else {
            print $cgi->header("text/plain"), "INSERT";
        }
    .....

# ---------------------------------------------------------------------
# WebService Start in background
# ---------------------------------------------------------------------
my $pid = TACWebService->new($TACWebService::conf{tac_ws_port})->background();
print "Use 'kill $pid' to stop TAC WebService.\n";
The clients themselves are using LWP::UserAgent like this:
my $ua = LWP::UserAgent->new();
$ua->timeout($timeout);

my $response = $ua->post($TAC_Url,
    [
        'args' => $here,
    ]
);

if (!$response->is_success) {
    print "Timeout while connecting to $TAC_Url\n";
} else {
    my $content = $response->as_string();
    print $content if (grep(/INSERT_/, $content));
}
Describing the exact issue would be complicated. In short: the clients are Unix servers sending their user database (user accounts), and when lots of clients send this user DB at once, I can see the webservice receiving half of the data and answering "timeout" after a couple of accounts (probably because it is overloaded in some way).
Thanks again.
The problem is that the client waits too long for the server to respond. To solve this you have to run the server in multiple processes. The easiest solution is to add
sub net_server { 'Net::Server::PreFork' }
to your package TACWebService, and HTTP::Server::Simple::CGI will do the rest of the magic.
Or you can use HTTP::Server::Simple::CGI::PreFork instead. See https://metacpan.org/pod/HTTP::Server::Simple::CGI::PreFork
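A minimal sketch of where that override goes (everything else in the package stays exactly as in the question):

{
    package TACWebService;

    use base qw(HTTP::Server::Simple::CGI);

    # Run on top of Net::Server::PreFork, which forks a pool of
    # worker processes to accept connections in parallel.
    sub net_server { 'Net::Server::PreFork' }

    # ... %dispatch, handle_request, resp_* subs as before ...
}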

Perl: An asynchronous http proxy via mojolicious

I made a simple HTTP proxy. It works fine, but it is not fast, because in the function handle_request I use
my $tx = $ua->start( Mojo::Transaction::HTTP->new(req => $request) );
to do the request, and that call is blocking.
I tried to use a callback like:
$ua->start( Mojo::Transaction::HTTP->new(req => $request) => sub { ... } );
to make it non-blocking, and then got an error:
'error' => { 'message' => 'Premature connection close'}
I guess that's because handle_request returns immediately; it does not wait for the callback to finish. If I use a semaphore to wait for the callback, that means it is blocking again.
#!/usr/bin/perl
use strict;
use warnings;
use utf8;

use Mojo::IOLoop::Server;
use Mojo::UserAgent;
use Mojo::Message::Response;
use Mojo::Message::Request;
use Mojo::Transaction::HTTP;
use Data::Dumper;

binmode STDOUT, ":encoding(UTF-8)";

my %buffer;

Mojo::IOLoop->server( {port => 3128} => sub {
    my ($loop, $stream, $client) = @_;
    $stream->on(
        read => sub {
            my ($stream, $chunk) = @_;
            my $buffer = $buffer{$client}{read_buffer} .= $chunk;
            if ($buffer =~ /^GET\s+|POST\s+|HEAD\s+(.*)\r\n\r\n$/i) {
                $buffer{$client}{read_buffer} = '';
                &handle_request($client, $stream, $buffer);
            }
            elsif ($buffer =~ /^CONNECT\s+(.*)\r\n\r\n$/i) {
                $buffer{$client}{read_buffer} = '';
                &handle_connect($stream, $buffer);
            }
            elsif ($buffer{$client}{connection}) {
                $buffer{$client}{read_buffer} = '';
                Mojo::IOLoop->stream($buffer{$client}{connection})->write($chunk);
            }
            if (length($buffer) >= 20 * 1024 * 1024) {
                delete $buffer{$client};
                Mojo::IOLoop->remove($client);
                return;
            }
        });
});

sub handle_request {
    my ($client, $stream, $chunk) = @_;
    my $request = Mojo::Message::Request->new;
    $request = $request->parse($chunk);
    my $ua = Mojo::UserAgent->new;
    my $tx = $ua->start( Mojo::Transaction::HTTP->new(req => $request) );
    $stream->write( $tx->res->to_string );
}

sub handle_connect {
    my ($stream, $chunk) = @_;
    my $request = Mojo::Message::Request->new;
    my $ua = Mojo::UserAgent->new;
    $request = $request->parse($chunk);
    print Dumper($request);
}

Mojo::IOLoop->start;
I hope to get some suggestions.
You have two problems:
You try to call the non-blocking variant of $ua->start while your code is written in a blocking style. The function handle_request must take a callback as a parameter. If you have a chain of callbacks, the best way to implement it is with Mojo::IOLoop::Delay.
When you create the variable $ua in non-blocking style inside sub handle_request, it gets destroyed by the garbage collector: handle_request returns first, $ua goes out of scope because it is a local variable, and only then would the answer arrive. That is why you get "Premature connection close". You need to keep the instance of $ua somewhere else to prevent this error.
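A minimal sketch of that second fix, keeping each Mojo::UserAgent alive in a hash until its callback has run (the %agents hash and the delete call are assumptions added for illustration; everything else mirrors the question's handle_request):

my %agents;    # keep each UA alive until its response callback has fired

sub handle_request {
    my ($client, $stream, $chunk) = @_;

    my $request = Mojo::Message::Request->new;
    $request->parse($chunk);

    my $ua = Mojo::UserAgent->new;
    $agents{$client} = $ua;    # prevent the UA from being garbage-collected

    $ua->start(
        Mojo::Transaction::HTTP->new(req => $request) => sub {
            my ($ua, $tx) = @_;
            $stream->write($tx->res->to_string);
            delete $agents{$client};    # now the UA may go away
        }
    );
}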
Upd.
I wrote a rough variant of an http/https proxy which works only via the CONNECT method and has a bug with incomplete first HTTP messages.
Upd.
I added another example of an http/https proxy which correctly reads the first HTTP message and works with more than just the CONNECT method.
Upd.
Oh, the author of Mojolicious wrote an example of an https proxy.

Skip if a request takes too much time

I have the following code to request a header from a URL:
#!/usr/bin/env perl
use strict;
use warnings;
use LWP;
use Data::Dumper;

my $request  = HTTP::Request->new( HEAD => 'http://www.vliruos.be/media/6352100/nss2015_annex_3_budget.xlsx' );
my $agent    = LWP::UserAgent->new;
my $response = $agent->request($request);
print $response->header('Content-Length');
...
I don't know the reason, but the request seems very slow; it takes more than 10 seconds for me. I just want to implement a rule: if it does not return anything within 10 seconds, it should give up and resume with the commands after the print.
Does anyone know how to implement this?
You could use SIGALRM.
$SIG{ALRM} = sub { die "timeout" };
eval {
    alarm(10);
    # long-time operations here
    alarm(0);
};
if ($@) {
    if ($@ =~ /timeout/) {
        # timed out; do what you will here
    } else {
        alarm(0);    # clear the still-pending alarm
        die;         # propagate unexpected exception
    }
}
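Applied to the HEAD request from the question, a minimal sketch might look like the following (keeping in mind, as discussed in the first question above, that a plain $SIG{ALRM} handler may not interrupt a blocking socket read on every setup; Sys::SigAction or LWP's own timeout() are alternatives):

my $request = HTTP::Request->new( HEAD => 'http://www.vliruos.be/media/6352100/nss2015_annex_3_budget.xlsx' );
my $agent   = LWP::UserAgent->new;

my $response;
eval {
    local $SIG{ALRM} = sub { die "timeout\n" };
    alarm(10);
    $response = $agent->request($request);
    alarm(0);
};
alarm(0);    # make sure no alarm is left pending

if ($@ && $@ =~ /timeout/) {
    warn "HEAD request took longer than 10 seconds; skipping\n";
} else {
    print $response->header('Content-Length');
}
# ... resume the remaining commands here ...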

Flush INET Socket response data with BLOCKING enabled

I am making a program that interfaces with Teamspeak, and I have an issue where the responses received will not match the commands sent. I run the program multiple times and each time, I will get different results when they should be the same, due to responses being out of sync.
my $buf = '';
use IO::Socket;

my $sock = IO::Socket::INET->new(
    PeerAddr  => 'localhost',
    PeerPort  => '10011',
    Proto     => 'tcp',
    Autoflush => 1,
    Blocking  => 1,
    Timeout   => 10,
);

sub ExecuteCommand {
    print $sock $_[0] . "\n";
    $sock->sysread($buf, 1024 * 10);
    return $buf;
}

ExecuteCommand("login ${username} ${password}");
ExecuteCommand("use sid=1");
ExecuteCommand("clientupdate client_nickname=Idle\\sTimer");
my $client_list = ExecuteCommand("clientlist");
Each command is executed properly, however the server likes to return extra lines, so a single sysread will not be enough and I have to execute another. The responses are at most 512 bytes, so they aren't being cut off. If I try to run sysread multiple times in an attempt to flush the data, the program just hangs when there is nothing left to read.
Each response ends with "error id=0 msg=ok".
How would I be able to read all the data that comes out, even if it's multiple lines? Or just be able to flush it all out so I can move onto the next command without having to worry about old data?
So you want to read until you find a line starting with error. In addition to doing that, the following buffers anything extra read since it's part of the next response.
sub read_response {
    my ($conn) = @_;
    my $fh = $conn->{fh};
    our $buf; local *buf = \($conn->{buf});    # alias
    our $eof; local *eof = \($conn->{eof});    # alias

    $buf = '' if !defined($buf);

    return undef if $eof;

    while (1) {
        if ($buf =~ s/\A(.*?^error[^\n]*\n)//ms) {
            return $1;
        }

        my $rv = sysread($fh, $buf, 64*1024, length($buf));
        if (!$rv) {
            if (defined($rv)) {
                $eof = 1;
                return undef;
            } else {
                die "Can't read response: $!\n";
            }
        }
    }
}

my $conn = { fh => $sock };

... send command ...
my $response = read_response($conn);
...

... send command ...
my $response = read_response($conn);
...
I changed my ExecuteCommand subroutine to include a check for "error id=[0-9]{1,}", which is always at the end of a response from Teamspeak 3 servers.
sub ExecuteCommand {
    print $sock $_[0] . "\n";
    my $response = "";
    while (1) {
        $sock->sysread($buf, 1024 * 10);
        last if ($buf =~ /error id=([0-9]{1,})/);
        $response .= $buf;
    }
    return $response;
}

Why does my Perl script using WWW-Mechanize fail intermittently?

I am trying to write a Perl script using WWW-Mechanize.
Here is my code:
use DBI;
use JSON;
use WWW::Mechanize;

sub fetch_companies_list
{
    my $url = shift;
    my $browser = WWW::Mechanize->new( stack_depth => 0 );
    my ($content, $json, $parsed_text, $company_name, $company_url);

    eval
    {
        print "Getting the companies list...\n";
        $browser->get( $url );
        # die "Can't get the companies list.\n" unless( $browser->status );

        $content = $browser->content();
        # die "Can't get companies names.\n" unless( $browser->status );

        $json = new JSON;
        $parsed_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode( $content );

        foreach (@$parsed_text)
        {
            $company_name = $_->{name};
            fetch_company_info( $company_name, $browser );
        }
    };
}

fetch_companies_list( "http://api.crunchbase.com/v/1/companies.js" );
The problem is as follows:
I start the script and it finishes fine.
I restart the script. The script fails in $browser->get().
I have to wait some time (about 5 minutes), then it starts working again.
I am working on Linux and have WWW-Mechanize version 1.66.
Any idea what might be the problem? I don't have a firewall installed on either my computer or my router.
Moreover, uncommenting the "die ..." lines does not help, as the script is stopping inside the get() call. I can try upgrading to the latest version, which is 1.71, but I'd like to know if someone else has experienced this with this Perl module.
5 minutes (300 seconds) is the default timeout. Exactly what timed out will be returned in the response's status line.
my $response = $mech->res;
if (!$response->is_success()) {
    die($response->status_line());
}
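If the default wait is too long, here is a minimal sketch of shortening it; timeout() and autocheck are standard WWW::Mechanize / LWP::UserAgent options, and the 30-second value is just an illustration:

my $browser = WWW::Mechanize->new( stack_depth => 0, autocheck => 0 );
$browser->timeout(30);    # give up after 30 seconds of socket inactivity

$browser->get($url);
my $response = $browser->res;
if (!$response->is_success()) {
    die($response->status_line());
}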
This is a target-site issue. It shows
503 Service Unavailable
No server is available to handle this request.
right now.
Retry with a wait between attempts; try this:
## set maximum no. of tries
my $retries = 10;
## number of secs to sleep
my $sleep = 1;

do {
    eval {
        print "Getting the companies list...\n";
        $browser->get($url);
        # die "Can't get the companies list.\n" unless( $browser->status );

        $content = $browser->content();
        # die "Can't get companies names.\n" unless( $browser->status );

        $json = new JSON;
        $parsed_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode($content);

        foreach (@$parsed_text) {
            $company_name = $_->{name};
            fetch_company_info( $company_name, $browser );
        }
    };
    if ($@) {
        warn $@;
        ## rest for some time
        sleep($sleep);
        ## increase the value of $sleep exponentially
        $sleep *= 2;
    }
} while ( $@ && $retries-- );