Get Result From a Celery Task After Soft Time Limit Exceeded - celery

Question
I have a celery task which takes a lot time to finish. Since I don't want my users wait, I set a soft_time_limit for that task so, I can inform them afterwards. There is also a hard time limit to kill processes which takes time.
Is there a way that I can get result of a celery task after sof_time_limit exceeded?
I tried catching SoftTimeLimitExceeded exception but it kills the process and I have to invoke another function with only hard_time_limit in this case and this is wasting resources.
What I tried:
#celery.task('time_limit' = 600)
def foo_hard(bar)
result = do_something(bar)
send_result(result)
#celery.task('soft_time_limit' = 50)
def foo(bar):
try:
result = do_something(bar)
except SoftTimeLimitExceeded:
foo_hard(bar)
finally:
return result
How I call tasks:
res = foo.delay(bar)
while not res.ready():
time.sleep(time_1)
result = res.get(propagate = True, interval = time_2)
What I am looking for:
A way to send information that task is exceeded soft time limit but dont kill the process and send information with send_results which is only a post request.

Related

Gatling load testing and running scenarios

I am looking to create three scenarios:
The first scenario will run a bunch of GET requests for 30s
The second and third scenarios will run in parallel and wait until the first is finished.
I want the requests from the first scenario to be excluded from the report.
I have the basic outline of what I want to achieve but not seeing expected results:
val myFeeder = csv("somefile.csv")
val scenario1 = scenario("Get stuff")
.feed(myFeeder)
.during(30 seconds) {
exec(
http("getStuff(${csv_colName})").get("/someEndpoint/${csv_colName}")
)
}
val scenario2 = ...
val scenario3 = ...
setUp(
scenario1.inject(
constantUsersPerSec(20) during (30 seconds)
).protocols(firstProtocaol),
scenario2.inject(
nothingFor(30 seconds), //wait 30s
...
).protocols(secondProt)
scenario3.inject(
nothingFor(30 seconds), //wait 30s
...
).protocols(thirdProt)
)
I am seeing the first scenario being run throughout the entire test. It doesn't stop after the 30s?
For the first scenario I would like to cycle through the CSV file and perform a request for each line. Perhaps 5-10 requests per second, how do I achieve that?
I would also like it to stop after the 30s and then run the other two in parallel. Hence the nothingFor in last two scenarios above.
Also how do I exclude from report, is it possible?
You are likely not getting the expected results due to the combination of settings between your injection profile and your "Get Stuff" scenario.
constantUsersPerSec(20) during (30 seconds)
will start 20 users on scenario "Get Stuff" every second for 30 seconds. So even during the 30th second, 20 users will START "Get Stuff". The injection pofile only controls when a user starts, not how long they are active for. So when a user executes the "Get Stuff" scenario, they make the 'get' request repeatedly over the course of 30 seconds due to the .during loop.
So at the very least, you will have users executing "Get Stuff" for 60 seconds - well into the execution of your other scenarios. Depending on the execution time for you getStuff call, it may be even longer.
To avoid this, you could work out exactly how long you want the "Get Stuff" scenario to run, set that in the injection profile and have no looping in the scenario. Alternatively, you could just set your 'nothingFor' values to be >60s.
To exclude the Get Stuff calls from reports, you can add silencing to the protocol definition (assuming it's not shared with your other requests). More details at https://gatling.io/docs/3.2/http/http_protocol/#silencing

(Laravel 5) Monitor and optionally cancel an ALREADY RUNNING job on queue

I need to achieve the ability to monitor and be able to cancel an ALREADY RUNNING job on queue.
There's a lot of answers about deleting QUEUED jobs, but not on an already running one.
This is the situation: I have a "job", which consists of HUNDREDS OF THOUSANDS rows on a database, that need to be queried ONE BY ONE against a web service.
Every row needs to be picked up, queried against a web service, stored the response and its status updated.
I had that already working as a Command (launching from / outputting to console), but now I need to implement queues in order to allow piling up more jobs from more users.
So far I've seen Horizon (which doesn't runs on Windows due to missing process control libs). However, in some demos seen around it lacks (I believe) a couple things I need:
Dynamically configurable timeout (the whole job may take more than 12 hours, depending on the number of rows to process on the selected job)
Ability to CANCEL an ALREADY RUNNING job.
I also considered the option to generate EACH REQUEST as a new job instead of seeing a "job" as the whole collection of rows (this would overcome the timeout thing), but that would give me a Horizon "pending jobs" list of hundreds of thousands of records per job, and that would kill the browser (I know Redis can handle this without itching at all). Further, I guess is not possible to cancel "all jobs belonging to X tag".
I've been thinking about hitting an API route, fire the job and decouple it from the app, but I'm seeing that this requires forking processes.
For the ability to cancel, I would implement a database with job_id, and when the user hits an API to cancel a job, I'd mark it as "halted". On every loop I would check its status and if it finds "halted" then kill itself.
If I've missed any aspect just holler and I'll add it or clarify about it.
So I'm asking for an advice here since I'm new to Laravel: how could I achieve this?
So I finally came up with this (a bit clunky) solution:
In Controller:
public function cancelJob()
{
$jobs = DB::table('jobs')->get();
# I could use a specific ID and user owner filter, etc.
foreach ($jobs as $job) {
DB::table('jobs')->delete($job->id);
}
# This is a file that... well, it's self explaining
touch(base_path(config('files.halt_process_signal')));
return "Job cancelled - It will stop soon";
}
In job class (inside model::chunk() function)
# CHECK FOR HALT SIGNAL AND [OPTIONALLY] STOP THE PROCESS
if ($this->service->shouldHaltProcess()) {
# build stats, do some cleanup, log, etc...
$this->halted = true;
$this->service->stopProcess();
# This FALSE is what it makes the chunk() method to stop looping
return false;
}
In service class:
/**
* Checks the existence of the 'Halt Process Signal' file
*
* #return bool
*/
public function shouldHaltProcess() :bool
{
return file_exists($this->config['files.halt_process_signal']);
}
/**
* Stop the batch process
*
* #return void
*/
public function stopProcess() :void
{
logger()->info("=== HALT PROCESS SIGNAL FOUND - STOPPING THE PROCESS ===");
$this->deleteHaltProcessSignalFile();
return ;
}
It doesn't looks quite elegant, but it works.
I've surfed the whole web and many goes for Horizon or other tools that doesn't fit my case.
If anyone has a better way to achieve this, it's welcome to share.
Laravel queue have 3 important config:
1. retry_after
2. timeout
3. tries
See more: https://laravel.com/docs/5.8/queues
Dynamically configurable timeout (the whole job may take more than 12
hours, depending on the number of rows to process on the selected job)
I think you can config timeout + retry_after about 24h.
Ability to CANCEL an ALREADY RUNNING job.
Delete job in jobs table
Delete process by process id in your server
Hope it help you :)

Future map's (waiting) execution context. Stops execution with FixedThreadPool

// 1 fixed thread
implicit val waitingCtx = scala.concurrent.ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1))
// "map" will use waitingCtx
val ss = (1 to 1000).map {n => // if I change it to 10 000 program will be stopped at some point, like locking forever
service1.doServiceStuff(s"service ${n}").map{s =>
service1.doServiceStuff(s"service2 ${n}")
}
}
Each doServiceStuff(name:String) takes 5 seconds. doServiceStuff does not have implicit ex:Execution context as parameter, it uses its own ex context inside and does Future {blocking { .. }} on it.
In the end program prints:
took: 5.775849753 seconds for 1000 x 2 stuffs
If I change 1000 to 10000 in, adding even more tasks : val ss = (1 to 10000) then program stops:
~17 027 lines will be printed (out of 20 000). No "ERROR" message
will be printed. No "took" message will be printed
**And will not be processing any futher.
But if I change exContext to ExecutionContext.fromExecutor(null: Executor) (global one) then in ends in about 10 seconds (but not normally).
~17249 lines printed
ERROR: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
took: 10.646309398 seconds
That's the question
: Why with fixed ex-context pool it stops without messaging, but with global ex-context it terminates but with error and messaging?
and sometimes.. it is not reproducable.
UPDATE: I do see "ERROR" and "took" if I increase pool from 1 to N. Does not matter how hight N is - it sill will be the ERROR.
The code is here: https://github.com/Sergey80/scala-samples/tree/master/src/main/scala/concurrency/apptmpl
and here, doManagerStuff2()
I think I have an idea of what's going on. If you squint enough, you'll see that map duty is extremely lightweight: just fire off a new future (because doServiceStuff is a Future). I bet the behavior will change if you switch to flatMap, which will actually flatten the nested future and thus will wait for second doServiceStuff call to complete.
Since you're not flattening out these futures, all your awaits downstream are awaiting on a wrong thing, and you are not catching it because here you're discarding whatever Service returns.
Update
Ok, I misinterpreted your question, although I still think that that nested Future is a bug.
When I try your code with both executors with 10000 task I do get OutOfMemory when creating threads in ForkJoin execution context (i.e. for service tasks), which I'd expect. Did you use any specific memory settings?
With 1000 tasks they both do complete successfully.

Setting Time Limit on specific task with celery

I have a task in Celery that could potentially run for 10,000 seconds while operating normally. However all the rest of my tasks should be done in less than one second. How can I set a time limit for the intentionally long running task without changing the time limit on the short running tasks?
You can set task time limits (hard and/or soft) either while defining a task or while calling.
from celery.exceptions import SoftTimeLimitExceeded
#celery.task(time_limit=20)
def mytask():
try:
return do_work()
except SoftTimeLimitExceeded:
cleanup_in_a_hurry()
or
mytask.apply_async(args=[], kwargs={}, time_limit=30, soft_time_limit=10)
This is an example with decorator for an specific Task and Celery 3.1.23 using soft_time_limit=10000
#task(bind=True, default_retry_delay=30, max_retries=3, soft_time_limit=10000)
def process_task(self, task_instance):
"""Task processing."""
pass

Python-Multithreading Time Sensitive Task

from random import randrange
from time import sleep
#import thread
from threading import Thread
from Queue import Queue
'''The idea is that there is a Seeker method that would search a location
for task, I have no idea how many task there will be, could be 1 could be 100.
Each task needs to be put into a thread, does its thing and finishes. I have
stripped down a lot of what this is really suppose to do just to focus on the
correct queuing and threading aspect of the program. The locking was just
me experimenting with locking'''
class Runner(Thread):
current_queue_size = 0
def __init__(self, queue):
self.queue = queue
data = queue.get()
self.ID = data[0]
self.timer = data[1]
#self.lock = data[2]
Runner.current_queue_size += 1
Thread.__init__(self)
def run(self):
#self.lock.acquire()
print "running {ID}, will run for: {t} seconds.".format(ID = self.ID,
t = self.timer)
print "Queue size: {s}".format(s = Runner.current_queue_size)
sleep(self.timer)
Runner.current_queue_size -= 1
print "{ID} done, terminating, ran for {t}".format(ID = self.ID,
t = self.timer)
print "Queue size: {s}".format(s = Runner.current_queue_size)
#self.lock.release()
sleep(1)
self.queue.task_done()
def seeker():
'''Gathers data that would need to enter its own thread.
For now it just uses a count and random numbers to assign
both a task ID and a time for each task'''
queue = Queue()
queue_item = {}
count = 1
#lock = thread.allocate_lock()
while (count <= 40):
random_number = randrange(1,350)
queue_item[count] = random_number
print "{count} dict ID {key}: value {val}".format(count = count, key = random_number,
val = random_number)
count += 1
for n in queue_item:
#queue.put((n,queue_item[n],lock))
queue.put((n,queue_item[n]))
'''I assume it is OK to put a tulip in and pull it out later'''
worker = Runner(queue)
worker.setDaemon(True)
worker.start()
worker.join()
'''Which one of these is necessary and why? The queue object
joining or the thread object'''
#queue.join()
if __name__ == '__main__':
seeker()
I have put most of my questions in the code itself, but to go over the main points (Python2.7):
I want to make sure I am not creating some massive memory leak for myself later.
I have noticed that when I run it at a count of 40 in putty or VNC on
my linuxbox that I don't always get all of the output, but when
I use IDLE and Aptana on windows, I do.
Yes I understand that the point of Queue is to stagger out your
Threads so you are not flooding your system's memory, but the task at
hand are time sensitive so they need to be processed as soon as they
are detected regardless of how many or how little there are; I have
found that when I have Queue I can clearly dictate when a task has
finished as oppose to letting the garbage collector guess.
I still don't know why I am able to get away with using either the
.join() on the thread or queue object.
Tips, tricks, general help.
Thanks for reading.
If I understand you correctly you need a thread to monitor something to see if there are tasks that need to be done. If a task is found you want that to run in parallel with the seeker and other currently running tasks.
If this is the case then I think you might be going about this wrong. Take a look at how the GIL works in Python. I think what you might really want here is multiprocessing.
Take a look at this from the pydocs:
CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.