Sequentioal requests with Mojo::IOLoop::Delay - perl

I need to get list of URLs in non-blocking mode, but not in parallel. It should be sequential requests one by one. How can I realize that?
I cannot find examples. Documentation and articles highlight only parallel execution.
Now my code looks like the following (simplified):
my $delay = Mojo::IOLoop::Delay->new;
$delay->steps(
sub {
build_report();
say "done";
}
);
sub parse_data {
...;
my $url = shift #urls;
my $end = $delay->begin;
$ua->get( $url => \&parse_data );
$end->();
}
my $end = $delay->begin;
$ua->get( $url => \&parse_data );
$end->();
$delay->wait;
I want to avoid multiple closures by using Mojo::IOLoop::Delay.

I have just started looking at Mojo::IOLoop myself for a project and I'm not au fait with it yet. But I think the simplest way is to build an array of closures and pass it to $delay->steps.
use strict;
use warnings 'all';
use feature 'say';
use Mojo;
use constant URL => 'http://stackoverflow.com/questions/tagged/perl';
STDOUT->autoflush;
my $ua = Mojo::UserAgent->new;
my $delay = Mojo::IOLoop::Delay->new;
my #urls = ( URL ) x 10;
my #steps = map {
my $url = $_;
sub {
my $end = $delay->begin;
$ua->get( $url => sub {
my ( $ua, $txn ) = #_;
$end->();
if ( my $err = $txn->error ) {
say $err->message;
}
else {
my $res = $txn->success;
say "#{$res}{qw/ code message /}";
}
});
}
} #urls;
$delay->steps( #steps, sub { say 'Done' } );
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;
output
200 OK
200 OK
200 OK
200 OK
200 OK
200 OK
200 OK
200 OK
200 OK
Done
200 OK
Note that Done is printed before the status line of the final get, because I have called $end->() immediately the callback arrives, assuming that any handling of the response doesn't need to be synchronised
If you don't want that then just move $end->() to the end of the callback. Then the delay will wait until the output has been generated before sending another request

Sequential requests very easy to implement.
Web server
#!/usr/bin/perl
use Mojo::Base -strict;
use Mojolicious::Lite;
get '/' => sub {
my $c = shift->render_later;
$c->delay(sub {
my $delay = shift;
$c->ua->get('https://google.ru' => $delay->begin);
}, sub {
my ($delay, $tx) = #_;
$tx->res->body; # body of the google.ru
$c->ua->get('https://twitter.com' => $delay->begin);
}, sub {
my ($delay, $tx) = #_;
$tx->res->body; # body of the twitter.com
$c->render(text => 'Done');
});
};
app->start;
Script
#!/usr/bin/perl
use Mojo::Base -strict;
use Mojo::IOLoop;
use Mojo::UserAgent;
my $ua = Mojo::UserAgent->new;
Mojo::IOLoop->delay(sub {
my $delay = shift;
$ua->get('https://www.google.ru' => $delay->begin);
}, sub {
my ($delay, $tx) = #_;
$tx->res->body; # body of the google.ru
$ua->get('https://twitter.com' => $delay->begin);
}, sub {
my ($delay, $tx) = #_;
$tx->res->body; # body of the twitter.com
warn 'DONE';
})->wait;
Script (dynamic requests)
Example here

Related

SOAP::Lite log transport request/response with custom identifier

I would like to log SOAP::Lite transport request/response contents using a custom identifier (e.g. a transaction-id or txn_id in my example below):
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use SOAP::Lite +trace => [ transport => \&log_transport, ];
sub log_transport {
my ($in) = #_;
if (ref($in) eq "HTTP::Request") {
# INSERT INTO logs ( txn_id, request ) VALUES ( $tnx_id, $in->content )
say STDERR Dumper(ref($in), $in->content);
}
elsif (ref($in) eq "HTTP::Response") {
# UPDATE logs SET response = '$in->content' WHERE txn_id = $tnx_id
say STDERR Dumper(ref($in), $in->content);
}
}
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0;
my $soap = SOAP::Lite->proxy('https://www.dataaccess.com/webservicesserver/NumberConversion.wso?op=NumberToWords');
$soap->serializer()->register_ns('http://www.dataaccess.com/webservicesserver/' ,"ns");
sub getWords
{
my ($number, $txn_id) = #_ ;
my $method = SOAP::Data->name("ns:NumberToWords");
my #params = ( SOAP::Data->name("ubiNum" => $number) );
my $response = $soap->call($method => #params);
if (!$response->fault) {
say STDOUT "NumberToWords = " . $response->result;
}
else {
say STDERR "error: " . (defined $response->faultstring? $response->faultstring : $soap->transport->status);
}
}
getWords(444, '123abc');
In my example above, how can I pass the transaction-id 123abc to my logger?
P.S. I do not wish to use:
$soap->outputxml(1)->call($method => #params)
It does not seem like the SOAP::Trace transport callback supports extra argument passing. As a workaround you could use a lexical variable declared in the outer scope like this:
use strict;
use warnings;
use Data::Dumper;
my $TXN_ID;
use SOAP::Lite +trace => [ transport => \&log_transport, ];
sub log_transport {
my ($in) = #_;
say STDERR "Logging transaction id: $TXN_ID:";
if (ref($in) eq "HTTP::Request") {
# INSERT INTO logs ( txn_id, request ) VALUES ( $tnx_id, $in->content )
say STDERR Dumper(ref($in), $in->content);
}
elsif (ref($in) eq "HTTP::Response") {
# UPDATE logs SET response = '$in->content' WHERE txn_id = $tnx_id
say STDERR Dumper(ref($in), $in->content);
}
}
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0;
my $soap = SOAP::Lite->proxy('https://www.dataaccess.com/webservicesserver/NumberConversion.wso?op=NumberToWords');
$soap->serializer()->register_ns('http://www.dataaccess.com/webservicesserver/' ,"ns");
sub getWords
{
my ($number, $txn_id) = #_ ;
$TXN_ID = $txn_id;
my $method = SOAP::Data->name("ns:NumberToWords");
my #params = ( SOAP::Data->name("ubiNum" => $number) );
my $response = $soap->call($method => #params);
if (!$response->fault) {
say STDOUT "NumberToWords = " . $response->result;
}
else {
say STDERR "error: " . (defined $response->faultstring? $response->faultstring : $soap->transport->status);
}
}
getWords(444, '123abc');

Mojo::UserAgent non-blocking vs blocking performance

I have the following code:
my $ua = Mojo::UserAgent->new ();
my #ids = qw(id1 id2 id3);
foreach (#ids) {
my $input = $_;
my $res = $ua->get('http://my_site/rest/id/'.$input.'.json' => sub {
my ($ua, $res) = #_;
print "$input =>" . $res->result->json('/net/id/desc'), "\n";
});
}
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;
Why when I run the above code (non-blocking) does it take about 6 seconds while when running the code as blocking, i.e. inside the loop something like:
my $res = $ua->get('http://my_site/rest/id/'.$input.'.json');
print "$input =>" . $res->result->json('/net/id/desc'), "\n";
without the latest line it takes about 1 second?
Why is the blocking code faster than the non-blocking code?
The first thing to check when things happened. I couldn't get the same delay. Remember to try each way several times to spot outliers where there's a network hiccup. Note that the second argument to the non-blocking sub is a transaction object, normally written as $tx, where the response object is normally written res:
use Mojo::Util qw(steady_time);
say "Begin: " . steady_time();
END { say "End: " . steady_time() }
my $ua = Mojo::UserAgent->new ();
my #ids = qw(id1 id2 id3);
foreach (#ids) {
my $input = $_;
my $res = $ua->get(
$url =>
sub {
my ($ua, $tx) = #_;
print "Fetched\n";
}
);
}
One possibility is that keep-alive is holding an open connection. What happens if you turn that off?
my $res = $ua->get(
$url =>
{ Connection => 'close' }
sub {
my ($ua, $tx) = #_;
print "Fetched\n";
}
);
Here's a version that uses promises, which you'll want to get used to as more Mojo stuff moves to it:
use feature qw(signatures);
no warnings qw(experimental::signatures);
use Mojo::Promise;
use Mojo::Util qw(steady_time);
use Mojo::UserAgent;
my $ua = Mojo::UserAgent->new;
say "Begin: " . steady_time();
END { say "End: " . steady_time() }
my #ids = qw(id1 id2 id3);
my #gets = map {
$ua->get_p( 'http://www.perl.com' )->then(
sub ( $tx ) { say "Fetched: " . steady_time() },
sub { print "Error: #_" }
);
} #ids;
Mojo::Promise->all( #gets )->wait;

How to use AnyEvent::HTTP to get only part of the page?

It is necessary to read only part of the page (n bytes) and close the connection, how to do this on AnyEvent::HTTP ?
on_body is called repeatedly as chunks arrive. Returning false from on_body terminates the download.
sub my_http_request {
my $cb = pop;
my ($method, $url, %args) = #_;
croak("Unsupported: on_body") if $args{on_body};
croak("Unsupported: want_body_handle") if $args{want_body_handle};
my $max_to_read = delete($args{max_to_read});
my $data;
return http_request(
$method => $url,
%args,
on_body => sub {
#my ($chunk, $headers) = #_;
$data .= $_[0];
return !defined($max_to_read) || length($data) < $max_to_read;
},
sub {
my (undef, $headers) = #_;
$cb->($data, $headers);
},
);
}
Use my_http_request just like http_request, except it accepts an optional max_to_read parameter.
For example,
my $cb = AnyEvent->condvar();
my_http_request(
GET => 'http://...',
...
max_to_read => ...,
$cb,
);
my ($data, $headers) = $cb->recv();
For example,
my $done = AnyEvent->condvar();
my_http_request(
GET => 'http://...',
...
max_to_read => ...,
sub {
my ($data, $headers) = #_;
...
$done->send();
},
);
$done->recv();

AnyEvent::STOMP::Client + AnyEvent::ForkManger = Intermittent Error

I'm trying to write a process that listens to ActiveMQ and based on the message, goes out and grabs data from a webservice, does some processing and then puts the process data to another webservice. (REST/JSON)
The module below works fine until one of the wonky webservices I talk to returns an error. I've tried many things to catch the error but to no avail, yet. Once the webservice error happens though I get the following message:
unhandled callback exception on event (MESSAGE,
AnyEvent::STOMP::Client=HASH(0x3ad5e48), HASH(0x3a6bbb0)
{"action":"created","data":{"id":40578737,"type":"alert","who":null},"guid":"ADCCEE0C-73A7-11E6-8084-74B346D1CA67","hostname":"myserver","pid":48632}):
$fork_manager->start() should be called within the manager process
OK, I conceptually understand that child process is trying to start another process and that fork manager is saying that is a no no. But given the module below, what is the proper way to start a new process to handle the long running processing. Or why is an child process dying causing this exception and how can I prevent this
Here's the module (stripped down)
package consumer;
use AnyEvent::ForkManager;
use AnyEvent::STOMP::Client;
use JSON;
use Data::Dumper;
use v5.18;
use Moose;
sub run {
my $self = shift;
my $pm = AnyEvent::ForkManager->new(max_workers => 20);
my $stomp = AnyEvent::STOMP::Client->new();
$stomp->connect();
$stomp->on_connected(sub {
my $stomp = shift;
$stomp->subscribe('/topic/test');
say "Connected to STOMP";
});
$pm->on_start(sub {
my ($pm,$pid,#params) = #_;
say "Starting $pid worker";
});
$pm->on_finish(sub {
my ($pm, $pid,#params) = #_;
say "Finished $pid worker";
});
$pm->on_error(sub {
say Dumper(\#_);
});
$stomp->on_message(sub {
my ($stomp, $header, $body) = #_;
my $href = decode_json $body;
$pm->start(cb => sub {
my ($pm, #params) = #_;
$self->process(#params);
},
args => [ $href->{id}, $href->{data}->{type}, $href->{data}->{who} ],
);
});
my $cv = AnyEvent->condvar;
$cv->recv;
}
sub process {
say "Processing ".Dumper(\#_);
sleep 5;
if ( int(rand(10)) < 5 ) {
die "OOPS"; # this triggers the error message above
}
say "Done Processing $_[1]";
}
1;
Heres the driver for the module above:
#!/usr/bin/env perl
use v5.18;
use lib '.';
use consumer;
my $c = consumer->new();
$c->run;
Finally a traffic generator that you can use to see this in action:
#!/usr/bin/env perl
use lib '../lib';
use lib '../../lib';
use v5.18;
use Data::Dumper;
use JSON;
use Net::STOMP::Client;
$ENV{'scot_mode'} = "testing";
my $stomp = Net::STOMP::Client->new(
host => "127.0.0.1",
port => 61613
);
$stomp->connect();
for (my $i = 1; $i < 1000000; $i++) {
my $href = {
id => $i,
type => "event",
what => "foo",
};
my $json = encode_json $href;
say "Sending ".Dumper($href);
$stomp->send(
destination => "/topic/test",
body => $json,
);
}
$stomp->disconnect();
I was able to solve this by using Try::Catch and wrapping the call to self->process with a try catch like this:
$stomp->on_message(sub {
my ($stomp, $header, $body) = #_;
my $href = decode_json $body;
$pm->start(cb => sub {
my ($pm, #params) = #_;
try {
$self->process(#params);
}
catch {
# error handling stuff
};
},
args => [ $href->{id}, $href->{data}->{type}, $href->{data}->{who} ],
);
}
);

Using HTTP::Server::Simple::CGI, how do I get the headers?

Basically, my question is similar to
How do I access HTTP request headers in HTTP::Server::Simple::CGI?
The answer was to use parse_headers(), but there was no example how to use it properly. I tried to use parse_headers() but I'm not getting any result, it just stops at parse_headers() like the program is stucked. I couldn't add a comment on the question above since I don't have enough rep to do so, so I created this new question.
Below is my sample code, basically the example code from CPAN just added the parse_headers:
#!/usr/bin/perl
{
package MyWebServer;
use HTTP::Server::Simple::CGI;
our #ISA = qw(HTTP::Server::Simple::CGI);
use Data::Dumper;
my %dispatch = (
'/hello.cgi' => \&resp_hello,
# ...
);
sub handle_request {
my $self = shift;
my $cgi = shift;
my $path = $cgi->path_info();
my $handler = $dispatch{$path};
my $header = $self->parse_headers();
open F,qq{>>~/MyWebServer.log};
my $dump = Data::Dumper->Dump([$header], [qw($header)]);
print F $dump;
close F;
if (ref($handler) eq "CODE") {
print "HTTP/1.0 200 OK\r\n";
$handler->($cgi);
} else {
print "HTTP/1.0 404 Not found\r\n";
print $cgi->header,
$cgi->start_html('Not found'),
$cgi->h1('Not found'),
$cgi->end_html;
}
}
sub resp_hello {
my $cgi = shift; # CGI.pm object
return if !ref $cgi;
my $who = $cgi->param('name');
print $cgi->header,
$cgi->start_html("Hello"),
$cgi->h1("Hello $who!"),
$cgi->end_html;
}
} # end of package MyWebServer
# start the server on port 8080
my $pid = MyWebServer->new(8080)->background();
print "Use 'kill $pid' to stop server.\n";
Only added this part:
my $header = $self->parse_headers();
open F,qq{>>~/MyWebServer.log};
my $dump = Data::Dumper->Dump([$header], [qw($header)]);
print F $dump;
close F;
My objective is to get all the headers and dump it into a file.
Add
sub headers {
my( $self, $headers ) = #_;
if( $headers ){
$self->{__last_headers} = { #$headers };
}
return $self->{__last_headers};
}
Then inside handle_request use my $header = $self->headers();
FWIW, i'm curious why you're using HTTP::Server::Simple::CGI instead of Mojolicious or Dancer or even HTTP::Server::Simple::PSGI. https://metacpan.org/pod/PSGI is portability.