Setting the "finish" event on a Mojolicious/Minion::Job - perl

I am trying to have Minion jobs feed back to the Mojolicious web app upon completion (probably by posting a message to an API). The underlying idea here is then for the web app to feed back to the client that has uploaded/started the job.
I have tried doing this:
get '/' => sub {
    my $c   = shift;
    my $id  = $c->minion->enqueue('thing', [ qw/a b 1/, { foo => 'bar' } ]);
    my $job = $c->minion->job($id);
    $job->on(finish => sub ($job) {
        my $id   = $job->id;
        my $task = $job->task;
        $job->app->log->info(qq{Job "$id" was performed with task "$task"});
    });
    $c->render(template => 'index');
};
which doesn't work - I guess because the event is only emitted in the process performing the job, and the event does not get serialized and queued.
If I do this:
app->minion->add_task
    (thing => sub ($job, $c, $sub, @args) {
        $job->on(finish => sub ($job) {
            my $id   = $job->id;
            my $task = $job->task;
            $job->app->log->info(qq{Job "$id" was performed with task "$task"});
        });
        sleep 2;
    });
it works ok, but it means I have to add the event handling to every task - which adds complexity to the code.
Is there a way to avoid having to do this?
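One obvious workaround would be a small hand-rolled wrapper around add_task that attaches the handler for every task, but that is still extra plumbing to maintain. This is only a sketch of that idea, and add_task_with_feedback is a hypothetical helper, not part of Minion:

# Hypothetical wrapper: every task registered through it gets the same
# "finish" handler attached before the real task code runs.
sub add_task_with_feedback {
    my ($app, $name, $code) = @_;
    $app->minion->add_task($name => sub {
        my ($job, @args) = @_;
        $job->on(finish => sub {
            my $job  = shift;
            my $id   = $job->id;
            my $task = $job->task;
            $job->app->log->info(qq{Job "$id" was performed with task "$task"});
        });
        $code->($job, @args);
    });
}

add_task_with_feedback(app, thing => sub {
    my ($job, @args) = @_;
    sleep 2;
});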
I am thinking of:
being able to set a default class for jobs (so that all jobs are instances of a subclass of Minion::Job - for example Minion::Job::WithFeedback)
better yet, being able to inject roles into task creation (so that you can do $c->minion->enqueue('thing', [ qw/a b 1/, { foo => 'bar' } ], { roles => [qw/+WithFeedback +WithTimeout/] }))
I know I could poll all jobs regularly and see what changed status - this is what the Minion::Admin plugin does - but I would like to see if there is a different way that doesn't require polling the database.
Is this possible? And while we're at it - is this a bad idea in and of itself?

Delayed response to slash command with Mojolicious in Perl

I am trying to create a Slack application in Perl with Mojolicious, and I have the following use case:
Slack sends a request to my API from a slash command and needs a response within a 3-second timeframe. However, Slack also gives me the opportunity to send up to 5 more responses within a 30-minute timeframe, but it still needs the initial response within 3 seconds (it just sends a "late_response_url" in the initial callback so that I can POST something to that URL later on). In my case I would like to send an initial response to Slack to inform the user that the operation is "running", and after a while send the actual outcome of my slow function to Slack.
Currently I can do this by spawning a second process using fork(), using one process to respond immediately as Slack dictates and the second to do the rest of the work and respond later on.
I am trying to do this with Mojolicious' subprocesses to avoid using fork(), but I can't find a way to get it to work.
A sample of what I am already doing with fork() looks like this:
use Mojo::UserAgent;

sub withpath
{
    my $c            = shift;
    my $user         = $c->param('user_name');
    my $response_url = $c->param('response_url');    # Slack's late-response URL
    my $response_body = {
        response_type => "ephemeral",
        text          => "Running for $user:",
        attachments   => [
            { text => 'analyze' },
        ],
    };
    defined( my $pid = fork() ) or die "fork failed: $!";
    if ($pid) {
        # Parent: answer Slack immediately.
        $c->render( json => $response_body );
    }
    else {
        # Child: do the slow work, then POST the real result to Slack.
        my $output = do_time_consuming_things();
        $response_body = {
            response_type => "in-channel",
            text          => "Result for $user:",
            attachments   => [
                { text => $output },
            ],
        };
        my $ua = Mojo::UserAgent->new;
        my $tx = $ua->post(
            $response_url,
            { Accept => '*/*' },
            json => $response_body,
        );
        if ( my $res = $tx->success ) {
            print "\n success \n";
        }
        else {
            my $err = $tx->error;
            print "$err->{code} response: $err->{message}\n" if $err->{code};
            print "Connection error: $err->{message}\n" unless $err->{code};
        }
        exit;    # make sure the child never falls back into the web app
    }
}
So the problem is that no matter what I tried, I couldn't replicate the same behaviour with Mojolicious' subprocesses. Any ideas?
Thanks in advance!
Actually I just found a solution to my problem!
So here is my solution:
my $c = shift;                                  # receive request
my $user         = $c->param('user_name');      # get parameters
my $response_url = $c->param('response_url');
my $text         = $c->param('text');
my $response_body = {    # create the immediate response Slack is waiting for
    response_type => "ephemeral",
    text          => "Running for $user:",
    attachments   => [
        { text => 'analyze' },
    ],
};
my $subprocess = Mojo::IOLoop::Subprocess->new;    # create the subprocess
$subprocess->run(
    # This first callback is the actual subprocess: it runs in the background
    # and contains the POST request from my "fork" code (with the output)
    # that sends the late response to Slack.
    sub { do_time_consuming_things($user, $response_url, $text) },
    # This second callback runs in the parent once the subprocess is done;
    # it does nothing useful here, but Mojo requires it to be present.
    sub {
        my ($subprocess, $err, @results) = @_;
        say $err if $err;
        say "\n\nok\n\n";
    }
);
# And here is the actual immediate response, outside of the subprocess, so the
# server does not wait for the subprocess to finish before responding!
$c->render( json => $response_body );
So I simply had to put my do_time_consuming_things code in the first callback (so that it runs in the subprocess), use the second callback (which runs in the parent process) as a dummy, and keep my immediate response in the main body of the handler instead of putting it inside one of the callbacks. See the code comments for more information!
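For reference, the same pattern can be written a little more compactly through the Mojo::IOLoop->subprocess helper. This is only a sketch, assuming a Mojolicious version that provides that helper; the route path '/slash-command' is made up, and do_time_consuming_things is the question's own function:

use Mojo::IOLoop;

post '/slash-command' => sub {
    my $c            = shift;
    my $user         = $c->param('user_name');
    my $response_url = $c->param('response_url');
    my $text         = $c->param('text');

    # Heavy work runs in a forked child; the second callback runs in the
    # parent once the child is done and receives its return values.
    Mojo::IOLoop->subprocess(
        sub { do_time_consuming_things($user, $response_url, $text) },
        sub {
            my ($subprocess, $err, @results) = @_;
            $c->app->log->error("subprocess failed: $err") if $err;
        },
    );

    # Immediate response so Slack gets its answer within 3 seconds.
    $c->render(json => {
        response_type => 'ephemeral',
        text          => "Running for $user:",
        attachments   => [ { text => 'analyze' } ],
    });
};

The behaviour is the same: the slow work is forked off, and the render call still happens immediately in the parent.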

Why would hot deploy of Hypnotoad rerun old http requests?

The nutshell:
When I do a hot deployment of Hypnotoad sometimes the new server immediately processes a slew of HTTP requests that were already handled by the previous server.
If a response has been rendered but the handler is still doing some processing, does Mojo/Hypnotoad retain the request until the processing has stopped? Do I need to tell the server that the HTTP request is resolved?
The long version:
I have a Mojolicious::Lite app running under Hypnotoad.
The app's function is to accept HTTP requests from another service.
We are processing jobs that progress through a series of states.
At each job state change the app is notified with an HTTP request.
This is a busy little script - receiving more than 1000 req/hour.
The script's job is to manipulate some data: doing DB updates, editing files, sending mail.
In an effort to keep things moving along, when it receives the HTTP request it sanity-checks the data it received. If the data looks good it sends a 200 response to the caller immediately and then continues on to do the more time-consuming tasks. (I'm guessing this is the underlying cause.)
When I hot deploy - by rerunning the start script (which runs 'localperl/bin/hypnotoad $RELDIR/etc/bki/bki.pl') - some requests that were already handled are sent to the new server and reprocessed.
Why are these old transactions still being held by the original server? Many have been long since completed!
Do I need to tell Mojolicious that the request is done before it goes off and messes with the data?
(I considered $c->finish() but that is just for sockets?)
How does Hypnotoad decide which requests should be passed to its replacement server?
Here is some pseudocode showing what I'm doing:
get '/jobStateChange/:jobId/:jobState/:jobCause' => sub {
    my $c = shift;
    my $jobId = $c->stash("jobId");
    return $c->render(text => "invalid jobId: $jobId", status => 400) unless $jobId =~ /^\d+$/;
    my $jobState = $c->stash("jobState");
    return $c->render(text => "invalid jobState: $jobState", status => 400) unless $jobState =~ /^\d+$/;
    my $jobCause = $c->stash("jobCause");
    return $c->render(text => "invalid jobCause: $jobCause", status => 400) unless $jobCause =~ /^\d+$/;
    my $jobLocation = $c->req->param('jobLocation');
    if ($jobLocation) { $jobLocation = $ENV{'DATADIR'} . "/jobs/" . $jobLocation; }
    unless ( $jobLocation && -d $jobLocation ) {
        app->log->debug("determining jobLocation because passed job jobLocation isn't useable");
        $jobLocation = getJobLocation($jobId);
        $c->stash("jobLocation", $jobLocation);
    }
    # TODO - more validation? would BKI lie to us?
    return if $c->tx->res->code && 400 == $c->tx->res->code;    # return if we rendered an error above
    # tell BKI we're all set ASAP
    $c->render(text => 'ok');
    handleJobStatusUpdate($c, $jobId, $jobState, $jobCause, $jobLocation);
};
sub handleJobStatusUpdate {
    my ($c, $jobId, $jobState, $jobCause, $jobLocation) = @_;
    app->log->info("job $jobId, state $jobState, cause $jobCause, loc $jobLocation");
    # set the job states in jobs
    app->work_db->do($sql, undef, @params);
    if ($jobState == $SOME_JOB_STATE) {
        ... do stuff ...
        ... uses $c->stash to hold data used by other functions
    }
    if ($jobState == $OTHER_JOB_STATE) {
        ... do stuff ...
        ... uses $c->stash to hold data used by other functions
    }
}
Your request will not be complete until the request handler returns. This little app, for example, will take 5 seconds to output "test":
# test.pl
use Mojolicious::Lite;
get '/test' => sub { $_[0]->render( text => "test" ); sleep 5 };
app->start;
The workaround for your app would be to run handleJobStatusUpdate in a background process.
get '/jobStateChange/:jobId/:jobState/:jobCause' => sub {
    my $c = shift;
    my $jobId    = $c->stash("jobId");
    my $jobState = $c->stash("jobState");
    my $jobCause = $c->stash("jobCause");
    my $jobLocation = $c->req->param('jobLocation');
    ...
    $c->render(text => 'ok');
    if (fork() == 0) {
        # Child process: do the slow work, then exit so it never returns
        # into the web server's request loop.
        handleJobStatusUpdate($c, $jobId, $jobState, $jobCause, $jobLocation);
        exit;
    }
};
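One caveat with raw fork() under a preforking server like Hypnotoad: finished children still need to be reaped, or they accumulate as zombie processes in the worker. A minimal way to sidestep that, assuming nothing else in the worker needs to collect child exit statuses:

# Auto-reap children so the background workers spawned above do not
# linger as zombies (assumes no other code waits on child processes).
$SIG{CHLD} = 'IGNORE';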

Can I use an AnyEvent->timer() in an AnyEvent::Fork?

Let's say I work with a number N of account objects.
For each of the N accounts, I would like to create several forks and, independently, set up an AnyEvent->timer() event in each of them.
Here is what my code looks like:
for my $num_account (1..2) {
    my $fork_1 = AnyEvent::Fork
        ->new
        ->require("TweetFactory")
        ->fork
        ->run("TweetFactory::worker", sub {
            my ($master_filehandle) = @_;
            my $wait1 = AnyEvent->timer(after => 0, interval => 100, cb => sub {
                my $account = UsersPool::get_account($num_account);
                my $tf = TweetFactory->new({ account => $account, topic => $topic });
                %search_article = $tf->search_articles_from_topic_list($dbh, \$db_access, @rh_website);
                $tf->save_url_to_process($dbh, \$db_access, %search_article);
                @url_to_process = $tf->get_url_to_process(100, $dbh, \$db_access);
                %articles_urls_titles = $tf->get_infos_url($mech, @url_to_process);
                $tf->save_url_processed($dbh, \$db_access, %articles_urls_titles);
            });
        });
    my $fork_2 = AnyEvent::Fork
        ->new
        ->require("TargetUsers")
        ->fork
        ->run("TargetUsers::worker", sub {
            my ($master_filehandle) = @_;
            my $wait2 = AnyEvent->timer(after => 0, interval => 80, cb => sub {
                my $account = UsersPool::get_account($num_account);
                TargetUsers::save_all_messages($account, $dbh, \$db_access);
            });
        });
    my $fork_3 = AnyEvent::Fork
        ->new
        ->require("TargetUsers")
        ->fork
        ->run("TargetUsers::worker", sub {
            my ($master_filehandle) = @_;
            my $wait3 = AnyEvent->timer(after => 0, interval => 80, cb => sub {
                my $account = UsersPool::get_account($num_account);
                TargetUsers::save_followers($dbh, \$db_access, $account);
            });
        });
    AnyEvent::Loop::run;
}
But during execution, the timers do not start.
I have instead tried launching an AnyEvent->timer event inside which I create my fork:
my $wait2 = AnyEvent->timer(after => 0, interval => 10, cb => sub {
    my $fork_2 = AnyEvent::Fork
        ->new
        ->require("TargetUsers")
        ->fork
        ->run("TargetUsers::worker", sub {
            my ($master_filehandle) = @_;
            my $account = UsersPool::get_account($num_account);
            TargetUsers::save_all_messages($account, $dbh, \$db_access);
        });
});
This time the event did fire, but the next fork was only created after the previous event's callback had finished executing.
Do you have any ideas? Thanks.
First some observations: in your example, you do not need to call ->fork. You also don't show the code running in the forked process; you only show how you create timers in the parent process, and those should certainly work. Lastly, you don't seem to do anything with the forked process (you do nothing with the $master_filehandle).
More importantly, your example creates and instantly destroys the fork objects; they never survive the loop body. You also call the loop function inside your for loop, so you probably never get past the first iteration.
Maybe there is some misunderstanding involved: the callback you pass to run is executed in the parent, the same process where you execute AnyEvent::Fork->new. The code that runs in the child would be, for example, the TargetUsers::worker function.
To make timers work in the newly created processes, you would need to run an event loop in them.
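For illustration, a minimal sketch of a worker that does run its own loop in the child (TweetFactory is the question's module name; the periodic body is elided):

package TweetFactory;    # hypothetical layout of the worker module loaded via ->require
use strict;
use warnings;
use AnyEvent;

# Called in the child by AnyEvent::Fork->run("TweetFactory::worker", ...);
# it receives the child's end of the communication socket.
sub worker {
    my ($slave_filehandle) = @_;

    my $timer = AnyEvent->timer(after => 0, interval => 100, cb => sub {
        # ... the periodic work from the question would go here ...
    });

    # Without this the function returns immediately and the timer never fires:
    # the child has to run an event loop of its own.
    AnyEvent->condvar->recv;
}

1;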
Maybe AnyEvent::Fork::RPC with the async backend would be more suited for your case: it runs an event loop for you, it has a simpler request/response usage and it can pass data to and from the newly created process.
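As a sketch of that last suggestion (the names TweetFactory and TweetFactory::worker are taken from the question; the worker itself would need to be adapted to the RPC calling convention, so treat this as an outline rather than a drop-in replacement):

use strict;
use warnings;
use AnyEvent;
use AnyEvent::Fork;
use AnyEvent::Fork::RPC;

# One persistent child process; work requests and results travel over the
# RPC channel instead of being lost in a throwaway fork.
my $rpc = AnyEvent::Fork
    ->new
    ->require("TweetFactory")
    ->AnyEvent::Fork::RPC::run("TweetFactory::worker",
        async    => 1,    # event-based backend: the child runs its own loop
        on_error => sub { warn "worker died: $_[0]" },
    );

# In async mode the worker is called as TweetFactory::worker($done, @args)
# and must call $done->(@results) when it is finished.
my $num_account = 1;    # one of the account numbers from the question's loop
my $timer = AnyEvent->timer(after => 0, interval => 100, cb => sub {
    $rpc->($num_account, sub {
        my (@results) = @_;
        warn "worker finished: @results\n";
    });
});

AnyEvent->condvar->recv;    # keep the parent's event loop running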

Access element across multiple hash of hash of arrays

So I have created a hash of hash of arrays. This data structure (let's call it "amey") is a collection of Python tests whose execution I wish to control.
It looks something like this:
my $amey = {
    'test_type1' => {
        'class_test1' => [
            'test_1'
        ],
        'class_test2' => [
            'test_1',
            'test_2'
        ],
        'class_test3' => [
            'test_1'
        ]
    },
    'test_type2' => {
        'class_test1' => [
            'test_1',
            'test_2'
        ]
    },
    'test_type3' => {
        'class_test1' => [
            'test_1',
            'test_2',
            'test_3',
            'test_4',
            'test_5',
            'test_6',
            'test_7'
        ],
    }
};
The intention is to somehow iterate through this hash and run each test of a type in parallel with a test of another test type.
(Also, knowing which class a test belongs to is important, as there are similarly named tests across classes, e.g. "test_bat".)
So for example:
Start executing the below tests in parallel
run test test_type1/class_test1/test_1 &;
run test test_type2/class_test1/test_1 &;
run test test_type3/class_test1/test_1 &;
Wait for the test of a type to finish running and then start the next of that type.
So for example if test_type1/class_test1/test_1 completes then start with
test_type1/class_test2/test_1.
The number of classes is not the same across types, nor is the number of tests in each class.
I realize this is kind of a complex requirement (or maybe not for the Perl monks :)), but I would love to hear some suggestions on how I should go about doing this.
With Forks::Super, create each test in a separate fork call and set up dependencies between them.
use Forks::Super;
my $amey = { ... };
foreach my $type (keys %$amey) {
    my $last_job;                       # chain the tests of one type so they run sequentially
    foreach my $class (keys %{$amey->{$type}}) {
        foreach my $name (@{$amey->{$type}{$class}}) {
            my $job = fork {
                name => "$type/$class/$name",
                cmd  => "run test $type/$class/$name",
                # only the first test of a type has nothing to wait for
                ($last_job ? (depend_on => $last_job) : ()),
            };
            $last_job = "$type/$class/$name";
        }
    }
}
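If the driving script should block until every queued test has run, Forks::Super's waitall can be called at the end. A one-line sketch, assuming the module's default exports:

waitall;    # block until every Forks::Super background job has completed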

Less verbose debug screen in Catalyst?

On my staging server I would like to activate debug mode so the clients can find errors for themselves before the app goes to the production server.
BUT I only want the first part of the message, not the Request or the Session Data.
For example: Couldn't render template "templates/home.tt2: file error - templates/inc/heater: not found".
The message is enough for me and for my client to see that the "header" call is misspelled.
The Request has a lot of information that is irrelevant to the client, but it also has A LOT of internal development information that should be hidden at all times!
Regards
What you want is to override Catalyst's dump_these method. This returns a list of things to display on Catalyst's error debugging page.
The default implementation looks like:
sub dump_these {
    my $c = shift;
    [ Request => $c->req ],
    [ Response => $c->res ],
    [ Stash => $c->stash ],
    [ Config => $c->config ];
}
but you can make it more restrictive, for example
sub dump_these {
    my $c = shift;
    return [ Apology => "We're sorry that you encountered a problem" ],
           [ Response => substr($c->res->body, 0, 512) ];
}
You would define dump_these in your app's main module -- the one where you use Catalyst.
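For orientation, a minimal sketch of where that override would sit (MyApp is a hypothetical application name; the surrounding boilerplate is the usual Catalyst application class):

package MyApp;    # lib/MyApp.pm, the module where you 'use Catalyst'
use Moose;
use namespace::autoclean;
use Catalyst;
extends 'Catalyst';

__PACKAGE__->setup();

# Restrict what the debug error screen is allowed to dump.
sub dump_these {
    my $c = shift;
    return [ Apology => "We're sorry that you encountered a problem" ],
           [ Response => substr($c->res->body, 0, 512) ];
}

1;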
I had a similar problem that I solved by overriding the Catalyst method log_request_parameters.
Something like this (as @mob said, put it in your main module):
use Clone ();    # for Clone::clone

sub log_request_parameters {
    my $c          = shift;
    my %all_params = @_;
    my $copy = Clone::clone(\%all_params);    # don't change the 'real' request params
    # Then, do anything you want to only print what matters to you,
    # for example, to hide some POST parameters:
    my $body = $copy->{body} || {};
    foreach my $key (keys %$body) {
        $body->{$key} = '****' if $key =~ /password/;
    }
    return $c->SUPER::log_request_parameters( %$copy );
}
But you could also simply return at the beginning, if you don't want any GET/POST parameters displayed.
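A hedged one-line version of that idea (it simply suppresses the parameter dump entirely):

# Override with a no-op so neither GET nor POST parameters are logged.
sub log_request_parameters { return }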
Well, I didn't think of the more obvious solution, in your case: you could simply set your log level to something higher than debug, which would prevent these debug logs from being displayed, but would keep the error logs:
# (or a similar condition to check you are not on the production server)
if ( !__PACKAGE__->config->{dev} ) {
    __PACKAGE__->log->levels( 'warn', 'error', 'fatal' ) if ref __PACKAGE__->log;
}