Is there a mod_perl2/Perl 5 equivalent to PHP's ignore_user_abort()? - perl

I'm writing an internal service that needs to touch a mod_perl2 instance for a long-running process. The job is fired from an HTTP POST, and the mod_perl handler picks it up and does the work. It could take a long time and is ready to be handled asynchronously, so I was hoping I could terminate the HTTP connection while it is running.
PHP has a function, ignore_user_abort(), which, when combined with the right headers, can close the HTTP connection early while leaving the process running (this technique is mentioned here on SO a few times).
Does Perl have an equivalent? I haven't been able to find one yet.

Ok, I figured it out.
Mod_perl has the 'opposite' problem to PHP here. By default, mod_perl processes are left running even if the connection is aborted, whereas PHP by default terminates the process.
The Practical mod_perl book explains how to deal with aborted connections.
(BTW, for the purposes of this specific problem, a job queue was lower on the list than a 'disconnecting' HTTP process.)
# set up headers
$r->content_type('text/html');
my $s = some_sub_returns_string();
$r->connection->keepalive(Apache2::Const::CONN_CLOSE);
$r->headers_out()->{'Content-Length'} = length($s);
$r->print($s);
$r->rflush();
#
# !!! at this point, the connection will close to the client
#
#do long running stuff
do_long_running_sub();

You may want to look at using a job queue for this. Here is one provided by Zend that will let you start background processing jobs. There should be a number of these to choose from for PHP and Perl.
Here's another thread that talks about this problem, and an article on some PHP options. I'm no Perl monk, so I'll leave suggestions on those tools to others.
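On the Perl side, a minimal sketch of the job-queue idea using Minion (just one of several Perl job queues; TheSchwartz and Gearman are others) might look like the following. The task name, the Pg connection string, and the /start route are assumptions for illustration; do_long_running_sub() is the sub from the answer above.
use Mojolicious::Lite;

# assumed Postgres backend; Minion also ships other backends
plugin Minion => { Pg => 'postgresql://user@/jobs' };

# worker task: wraps the long-running sub from the question
app->minion->add_task(long_job => sub {
    my ($job, @args) = @_;
    do_long_running_sub(@args);
});

# the web request only enqueues the job and responds immediately
post '/start' => sub {
    my $c  = shift;
    my $id = $c->minion->enqueue(long_job => [ $c->param('payload') ]);
    $c->render(json => { job => $id });
};

app->start;
A separate worker process (started with something like ./foo.pl minion worker) then picks the job up, so the HTTP connection is never tied to the long-running work.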

Related

How to serve a PSGI application with many concurrent connections

How can a PSGI application be served with many concurrent connections? I have tried event-based and preforking web servers, but the number of concurrent connections seems to be limited by the number of worker processes. I've heard that Node.js, for instance, scales to several thousand parallel connections; can you achieve something similar in Perl?
Here is a sample application that keeps connections open indefinitely. The point is not to have infinite connections, but to keep connections open long enough to hit connection limits:
my $app = sub {
    my $env = shift;
    return sub {
        my $responder = shift;
        my $writer = $responder->([200, ['Content-Type' => 'text/plain']]);
        my $counter = 0;
        while (1) {
            $writer->write(++$counter . "\n");
            sleep 1; # or a non-blocking sleep such as Coro::AnyEvent::sleep
        }
        $writer->close;
    };
};
I don't think you're supposed to have infinite loops inside apps; I think you're only supposed to set up a recurring timer, and in that timer notify/message/write. See Plack::App::WebSocket - WebSocket server as a PSGI application and Re^4: real-time output from Mojolicious WebSockets?
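A rough sketch of that timer approach, assuming an AnyEvent-based PSGI server such as Twiggy; the %timers hash is hypothetical bookkeeping just to keep the timer guards alive, and real code would also clean up when the client disconnects:
use AnyEvent;

my %timers;    # keep one timer guard alive per open connection

my $app = sub {
    my $env = shift;
    return sub {
        my $responder = shift;
        my $writer  = $responder->([200, ['Content-Type' => 'text/plain']]);
        my $counter = 0;
        $timers{$writer} = AnyEvent->timer(
            after    => 1,
            interval => 1,
            cb       => sub { $writer->write(++$counter . "\n") },
        );
    };
};

$app;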
While I have not yet tried it, I came across this question whilst searching for a solution to a problem I faced when using a socket server to report on the progress of long-running jobs. Initially I was thinking of an approach along the lines of ParallelUserAgent, except as a server rather than a client.
I returned to the problem a few days later after realising that Net::WebSocket::Server blocks new connection requests if a long-running block of code runs within the new-connection handler callback.
My next approach will be to split the long-running functionality out into a newly spawned shell process and use a DB to track its progress, which can then be read as required within the server without lengthy blocking.
Thought I'd throw up my approach in case it helps anyone walking a similar path.
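For what it's worth, here is a bare-bones sketch of that idea; the jobs table, its columns, and the SQLite file are made up for illustration:
use DBI;
use POSIX ();

sub start_job {
    my ($job_id) = @_;

    defined(my $pid = fork()) or die "fork failed: $!";
    return $pid if $pid;             # parent: keep serving connections

    # child: do the slow work and record progress where the server can poll it
    my $dbh = DBI->connect('dbi:SQLite:dbname=jobs.db', '', '', { RaiseError => 1 });
    for my $step (1 .. 100) {
        # ... one chunk of the real long-running work here ...
        $dbh->do('UPDATE jobs SET progress = ? WHERE id = ?', undef, $step, $job_id);
    }
    POSIX::_exit(0);                 # skip the parent's cleanup/END blocks
}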

How can I force Mojolicious to send response to client?

I want a request to a Mojolicious application to be able to trigger a long running job. The client doesn't need to wait for that long job to finish, so I'd like the app to send back a quick response and start the job. Here's what I have in mind:
use Mojolicious::Lite;
get '/foo' => sub {
    my $self = shift;
    $self->render( text => 'Thanks for requesting /foo. I will get started on that.' );
    # ... force Mojolicious to send response now ...
    do_long_running_job();
};
But when I write the code like this, the client doesn't receive the response until after the long running job is finished (which may trigger inactivity timeouts, etc.). Is there any way to send the response more quickly? Is there another way to structure my code/app to achieve this?
Things from the docs that looked promising but didn't work:
$self->rendered(200);
$self->res->finish;
Randal Schwartz's Watching long processes through CGI should help:
The child goes on, but it must first close STDOUT, because otherwise Apache will think there might still be some output coming for the browser, and won't respond to the browser or release the connection until this is all resolved. Next, we have to launch a child process of the child to execute …
We'll do this with a pipe-open which includes an implicit fork, in line 37. The grandchild process merges STDERR to STDOUT, and then executes …
The child (that is, the parent of the traceroute) reads from the filehandle opened from the STDOUT (and STDERR) …
In short, the child process scurries off to execute the command. …
Given that you are only interested in kicking off a process rather than watching it, you should be able to prune most of the code.
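Pruned down to the bare "kick it off and forget it" case, the handler from the question might end up looking something like the sketch below. This just follows the spirit of the article (fork, detach the child from the server's handles, let the parent return so the response goes out); forking inside a persistent web server has its own caveats around inherited sockets and signal handling, so treat it as a sketch rather than a drop-in solution.
use Mojolicious::Lite;
use POSIX ();

get '/foo' => sub {
    my $self = shift;
    $self->render( text => 'Thanks for requesting /foo. I will get started on that.' );

    defined(my $pid = fork()) or die "fork failed: $!";
    if ($pid == 0) {
        # child: detach from the server's handles so the parent can
        # finish the response without waiting on us
        close STDOUT;
        close STDERR;
        do_long_running_job();
        POSIX::_exit(0);   # don't run the web server's END blocks and destructors
    }
    # parent: returns immediately, so the response is sent right away
};

app->start;
Newer Mojolicious versions also ship Mojo::IOLoop::Subprocess, which wraps this fork-and-report pattern if you would rather not manage the fork yourself.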

Dancer: deal with multiple requests simultaneously

Here's my situation: I'm developing a web application using Dancer framework, and I would like to insert some data to the database on the server side from the browser side. The problem is, when the data is too large, the uploading takes so long that I'm considering displaying a progress bar describing the progress.
I implemented this by sending two requests: one for posting the data, and the other for polling the status. But it seems that once the first request is being handled, the second won't be served until the first finishes, so the status returns nothing and then suddenly 100%. To manage this, I create a thread when handling the first request, so the main thread can return to handle the polling request. This works quite well until I have to kill some child process spawned in the child thread (that is another question).
So my question is: are there any other ways of dealing with multiple requests simultaneously, apart from the multithreaded one? How do web programmers normally handle this situation?
You should have no problem handling multiple requests simultaneously.
How do you run your app? If you use the built-in server (perl your_app.pl), then by default it is single-threaded and will only process one request at a time.
You might want to use a multiprocess/multithread deployment option, for example Starman. It is described in https://metacpan.org/module/YANICK/Dancer-1.3113/lib/Dancer/Deployment.pod#Running-on-Perl-webservers-with-plackup
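For example, assuming your Dancer app script is bin/app.pl and Starman is installed, something along these lines starts a pool of worker processes (the worker count and port are arbitrary; check Starman's docs for the exact options):
plackup -s Starman --workers 10 -p 5000 bin/app.pl
Each worker handles one request at a time, so with several workers the data upload and the progress poll can be served concurrently.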
I'd start by gluing Dancer to AnyEvent and using Twiggy to host the app. A Google search turns up the following, which look like a good starting point:
http://blogs.perl.org/users/mstplbg/2010/12/using-anyevent-and-dancer.html
http://blogs.perl.org/users/mstplbg/2010/12/anyevent-and-dancer-condvars.html
http://blogs.perl.org/users/mstplbg/2011/03/long-running-requests-with-progress-bar-in-dancer-anyevent.html
You can use Dancer with plackup and Starman; here is an example:
foo.psgi:
#!/usr/bin/perl
use strict;
use warnings;
use Dancer2;
$| = 1;
get '/foo' => sub {
    `sleep 5`;    # block for 5 seconds to simulate a slow request
    'ok';
};
to_app;
Run the program with plackup:
$ plackup -s Starman foo.psgi
Resolved [*]:5000 to [0.0.0.0]:5000, IPv4
Binding to TCP port 5000 on host 0.0.0.0 with IPv4
Setting gid to "0 0 0"
Starman: Accepting connections at http://*:5000/
Then run the following for loop:
for i in $(seq 1 3)
> do
> time curl http://localhost:5000/foo &
> done
Output:
ok
real 0m5.077s
user 0m0.004s
sys 0m0.010s
ok
real 0m5.079s
user 0m0.001s
sys 0m0.012s
ok
real 0m5.097s
user 0m0.009s
sys 0m0.004s
You can see that Dancer2 can accept multiple requests now.

perlipc - Interactive Client with IO::Socket - why does it fork?

I'm reading the perlipc perldoc and was confused by the section entitled "Interactive Client with IO::Socket". It shows a client program that connects with some server and sends a message, receives a response, sends another message, receives a response, ad infinitum. The author, Tom Christiansen, states that writing the client as a single-process program would be "much harder", and proceeds to show an implementation that forks a child process dedicated to reading STDIN and sending to the server, while the parent process reads from the server and writes to STDOUT.
I understand how this works, but I don't understand why it wouldn't be much simpler (rather than harder) to write it as a single-process program:
while (1) {
    read from STDIN
    write to server
    read from server
    write to STDOUT
}
Maybe I'm missing the point, but it seems to me this is a bad example. Would you ever really design a client/server application protocol where the server might suddenly think of something else to say, interjecting characters onto the terminal while the client is in the middle of typing his next query?
UPDATE 1: I understand that the example permits asynchronicity; what I'm puzzled about is why concurrent I/O between a CLI client and a server would ever be desirable (due to the jumbling of input and output of text on the terminal). I can't think of any CLI app - client/server or not - that does that.
UPDATE 2: Oh!! Duh... my solution only works if there's exactly one line sent from the server for every line sent by the client. If the server can send an unknown number of lines in response, I'd have to sit in a "read from server" loop - which would never end, unless my protocol defined some special "end of response" token. By handling the sending and receiving in separate processes, you leave it up to the user at the terminal to detect "end of response".
(I wonder whether it's the client, or the server, that typically generates a command prompt? I'd always assumed it was the client, but now I'm thinking it makes more sense for it to be the server.)
Because the <STDIN> read request can block, doing the same thing in a single process requires more complicated, asynchronous handling of the input/output functions:
while (1) {
    if there is data in STDIN
        read from STDIN
        write to server
    if there is data from server
        read from server
        write to STDOUT
}
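Concretely, the single-process version ends up needing select()/IO::Select plus sysread bookkeeping, which is exactly what the fork in perlipc avoids. A sketch, with a made-up host/port and minimal error handling:
use strict;
use warnings;
use IO::Select;
use IO::Socket::INET;

my $sock = IO::Socket::INET->new(PeerAddr => 'localhost', PeerPort => 9000)
    or die "connect: $!";
my $sel = IO::Select->new(\*STDIN, $sock);

while (1) {
    for my $fh ($sel->can_read) {        # block until either side has data
        my $n = sysread($fh, my $buf, 4096);
        exit unless $n;                  # EOF (or error) on either side
        if (fileno($fh) == fileno(STDIN)) {
            print {$sock} $buf;          # keyboard input goes to the server
        }
        else {
            print STDOUT $buf;           # server output goes to the terminal
        }
    }
}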

Why does the browser hang when I register a cleanup handler in mod_perl?

I'm using $r->pool->cleanup_register(\&cleanup); to run a subroutine after a page has been processed and printed to the client. My hope was that the client would see the complete page, and Apache could continue doing some processing in the background that takes a few seconds.
But the client browser hangs until the cleanup sub has returned. Is there a way to get Apache to finalise the connection with the client before all my code has returned?
I'm convinced I've done this before, but I can't find it again.
Use a job queue system and do the long operation completely asynchronously -- just schedule the operation during the web request. A job queue also handles peak load situations better than doing something expensive within the web server processes themselves.
You want to flush the buffer. It doesn't finalize the connection, but your client will see the output before the task completes.
sub handler {
    my $r = shift;
    $r->content_type('text/html');
    $r->rflush;   # send the headers out
    $r->print(long_operation());
    return Apache2::Const::OK;
}
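If you need the connection itself to be finished with, rather than just the headers flushed, the Content-Length/CONN_CLOSE trick from the first answer on this page should combine with this handler. A sketch, where the page body and long_operation() are your own:
use Apache2::RequestRec ();
use Apache2::RequestIO ();
use Apache2::Connection ();
use Apache2::Const -compile => qw(OK CONN_CLOSE);

sub handler {
    my $r = shift;
    my $body = "<html><body>Job started.</body></html>";

    $r->content_type('text/html');
    $r->headers_out->{'Content-Length'} = length $body;
    $r->connection->keepalive(Apache2::Const::CONN_CLOSE);
    $r->print($body);
    $r->rflush;       # the client now has a complete response

    long_operation(); # still ties up this Apache child, but not the browser
    return Apache2::Const::OK;
}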