Twisted reactor not running properly inside Celery

System/Dependencies details:
CPU --> 4
requirements --> celery==4.3.0, twisted==19.7.0 , python3.7
Below is the celery setup I have
from threading import Thread
from celery import Celery
from twisted.internet import threads, reactor, defer
from twisted.web.error import Error
from celery import signals
app = Celery('tasks', broker='pyamqp://guest@localhost//')
@signals.worker_process_init.connect
def configure_infrastructure(**kwargs):
    # Start the reactor in a background thread of each worker process.
    # args=(False,) disables signal handlers, which need the main thread.
    Thread(target=reactor.run, name="reactor.run", args=(False,)).start()
    print('started new thread')
@signals.worker_process_shutdown.connect()
def shutdown_reactor(**kwargs):
    """
    This is invoked when the individual workers shut down. It just stops the twisted reactor
    @param kwargs:
    @return:
    """
    reactor.callFromThread(reactor.stop)
    print('REACTOR SHUTDOWN')
def getPage(inp):
    print(inp)
    return inp
def inThread():
    print('inside inthread method')
    try:
        # Run getPage in the reactor thread and block until it returns.
        result = threads.blockingCallFromThread(
            reactor, getPage, "http://twistedmatrix.com/")
    except Exception as exc:
        print(exc)
    else:
        print(result)
@app.task
def add(x, y):
    print('inside add method')
    inThread()
    return x + y
I run the Celery worker like this:
celery -A run worker --loglevel=info
Logs when Celery starts:
(2_env) ubuntu@gpy:~/app/env/src$ celery -A run worker --loglevel=info
[tasks]
. run.add
[2020-04-09 07:25:29,357: WARNING/Worker-1] started new thread
[2020-04-09 07:25:29,362: WARNING/Worker-4] started new thread
[2020-04-09 07:25:29,362: WARNING/Worker-3] started new thread
[2020-04-09 07:25:29,364: WARNING/Worker-2] started new thread
[2020-04-09 07:25:29,367: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
I call the task like this:
>>> run.add.delay(1,2)
<AsyncResult: d41680fd-7cc1-4e75-81be-6496bad0cc16>
>>>
Sometimes it works fine:
[2020-04-09 07:27:17,998: INFO/MainProcess] Received task: run.add[00934769-48c4-48b8-852c-8b746bdd5e03]
[2020-04-09 07:27:17,999: WARNING/Worker-4] inside add method
[2020-04-09 07:27:17,999: WARNING/Worker-4] inside inthread method
[2020-04-09 07:27:18,000: WARNING/Worker-4] http://twistedmatrix.com/
[2020-04-09 07:27:18,000: WARNING/Worker-4] http://twistedmatrix.com/
[2020-04-09 07:27:18,000: INFO/MainProcess] Task run.add[00934769-48c4-48b8-852c-8b746bdd5e03] succeeded in 0.00144551398989s: 3
Sometimes it never calls the getPage method and hangs, as in the logs below:
[2020-04-09 07:27:22,318: INFO/MainProcess] Received task: run.add[d41680fd-7cc1-4e75-81be-6496bad0cc16]
[2020-04-09 07:27:22,319: WARNING/Worker-2] inside add method
[2020-04-09 07:27:22,319: WARNING/Worker-2] inside inthread method
Is there any issue with running reactor.run inside a Thread?
UPDATE
I added print statements to twisted.internet.threads.blockingCallFromThread:
def blockingCallFromThread(reactor, f, *a, **kw):
    queue = Queue.Queue()
    def _callFromThread():
        print('inside _callFromThread')
        result = defer.maybeDeferred(f, *a, **kw)
        result.addBoth(queue.put)
    print('before calling _callFromThread')
    reactor.callFromThread(_callFromThread)
    print('after calling _callFromThread')
    result = queue.get()
    if isinstance(result, failure.Failure):
        result.raiseException()
    return result
The Celery worker hangs only when _callFromThread is never invoked after reactor.callFromThread(_callFromThread) schedules it. When I manually stop the worker with CTRL+C, I can see it finally gets called.
Every time I stop a worker whose job was hung, it starts processing the job.
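To make the hang easier to reason about, here is a minimal stdlib-only sketch of the same handoff pattern. It is not Twisted code: MiniLoop and blocking_call are hypothetical names standing in for the reactor and for blockingCallFromThread. The only difference is a timeout on the result queue, so a loop thread that never picks up the call shows up as an error instead of a silent hang.

```python
import queue
import threading


class MiniLoop:
    """Toy stand-in for a reactor (hypothetical): runs queued callables on one thread."""

    def __init__(self):
        self._calls = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._calls.put(None)  # sentinel that ends the loop
        self._thread.join()

    def call_from_thread(self, fn):
        # Only enqueues; nothing runs unless the loop thread is draining the queue.
        self._calls.put(fn)

    def _run(self):
        while True:
            fn = self._calls.get()
            if fn is None:
                break
            fn()


def blocking_call(loop, f, *a, timeout=2.0, **kw):
    """Run f on the loop thread and wait at most `timeout` seconds for its result."""
    results = queue.Queue()

    def _call():
        try:
            results.put((True, f(*a, **kw)))
        except Exception as exc:
            results.put((False, exc))

    loop.call_from_thread(_call)
    try:
        ok, value = results.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError("the loop thread never ran the call") from None
    if not ok:
        raise value
    return value
```

With a started loop, blocking_call returns the function's result. With a loop that is not draining its queue, the caller gets a TimeoutError rather than blocking forever, which is the situation the prints above point to.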
Update: 27 April 2020
The problem is solved if I use crochet to run the Twisted reactor. I updated the function below:
@signals.worker_process_init.connect
def configure_infrastructure(**kwargs):
    from crochet import setup
    setup()
    print('started new thread')

With some care, which you seem to have taken, you can run the Twisted reactor in one thread. However, you will not be able to run it in more than one thread, which I suppose is what is happening when you use it with Celery. The reactor has both instance and global state, which will get stomped on if it is run in more than one thread.
Instead, try using crochet to coordinate calls onto the reactor running in a single non-main thread from as many other threads as you like.

Related

Sanic and pytest errors: "Socket is not connected" and "Sanic app name X not found"

Update to Sanic 22.12 from 21.x broke all app.test_client tests. Examples from the official documentation do not work.
server.py
from sanic import Sanic
from sanic_testing import TestManager
app = Sanic("app_name")
app.config.RESPONSE_TIMEOUT = 3600
TestManager(app)
# routes defined here
# ...
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
test.py
from server import app
def test_deploy_plan():
    _, response = app.test_client.get('/some/route')
    assert response.status_code == 200
pytest test.py yields:
Sanic app name 'app_name' not found.
App instantiation must occur outside if __name__ == '__main__' block or by using an AppLoader.
See https://sanic.dev/en/guide/deployment/app-loader.html for more details.
Traceback (most recent call last):
File "<redacted>/venv/lib/python3.9/site-packages/sanic/app.py", line 1491, in get_app
return cls._app_registry[name]
KeyError: 'app_name'
During handling of the above exception, another exception occurred:
[...]
App instantiation must occur outside if __name__ == '__main__' block or by using an AppLoader.
See https://sanic.dev/en/guide/deployment/app-loader.html for more details.
[2023-02-08 17:38:18 -0800] [6280] [ERROR] Not all workers acknowledged a successful startup. Shutting down.
One of your worker processes terminated before startup was completed. Please solve any errors experienced during startup. If you do not see an exception traceback in your error logs, try running Sanic in in a single process using --single-process or single_process=True. Once you are confident that the server is able to start without errors you can switch back to multiprocess mode.
------------------------------ Captured log call ------------------------------
INFO sanic.root:motd.py:54 Sanic v22.12.0
INFO sanic.root:motd.py:54 Goin' Fast @ http://127.0.0.1:57940
INFO sanic.root:motd.py:54 mode: production, ASGI
INFO sanic.root:motd.py:54 server: sanic, HTTP/1.1
INFO sanic.root:motd.py:54 python: 3.9.16
INFO sanic.root:motd.py:54 platform: macOS-12.4-arm64-arm-64bit
INFO sanic.root:motd.py:54 packages: sanic-routing==22.8.0, sanic-testing==22.6.0
ERROR sanic.error:manager.py:230 Not all workers acknowledged a successful startup. Shutting down.
One of your worker processes terminated before startup was completed. Please solve any errors experienced during startup. If you do not see an exception traceback in your error logs, try running Sanic in in a single process using --single-process or single_process=True. Once you are confident that the server is able to start without errors you can switch back to multiprocess mode.
This used to work in Sanic 21.x:
from server import app
def test_route_returns_200():
    request, response = app.test_client.get('/some/route')
    assert response.status == 200
Another official example defines the routes in the app inside the fixture, which doesn't help me because all the app functionality is defined in a different module. In addition, passing in the fixture to a test function breaks when you're also using mocks (unless I'm missing something with the order of the mocked functions and fixtures passed into the function as arguments):
import pytest
from sanic import Sanic, response
from sanic_testing import TestManager
@pytest.fixture
def app():
    sanic_app = Sanic(__name__)
    TestManager(sanic_app)
    @sanic_app.get("/")
    def basic(request):
        return response.text("foo")
    return sanic_app
def test_basic_test_client(app):
    request, response = app.test_client.get("/")
    assert response.body == b"foo"
    assert response.status == 200
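On the ordering of mocks and fixtures: a small stdlib-only sketch of how stacked unittest.mock.patch decorators hand over their mocks may help. The function name check_order and the patched targets os.listdir and os.getcwd are only illustrative. The mocks arrive as positional arguments, innermost decorator first, and any remaining parameter (such as a fixture named app) has to come after them. As far as I know pytest then fills those remaining parameters by name, so here the fixture value is simply passed as a keyword.

```python
import os
from unittest.mock import patch


@patch("os.getcwd")    # outermost decorator -> last mock argument
@patch("os.listdir")   # innermost decorator -> first mock argument
def check_order(mock_listdir, mock_getcwd, app):
    # Mocks come first, in bottom-up decorator order; `app` stands in
    # for a fixture and is supplied by name.
    mock_listdir.return_value = ["a.txt"]
    mock_getcwd.return_value = "/tmp/fake"
    return os.listdir("."), os.getcwd(), app


# Simulate a test runner supplying the fixture by keyword.
result = check_order(app="fixture-value")
```

If the fixture parameter is listed before the mock parameters, the positional mocks land in the wrong slots, which may be the breakage described above.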

Periodic Task not running using Celery

I have set up Celery to run a periodic task every 10 seconds that sends a POST request to my Django Rest API Framework.
When I run the Celery worker it picks up the task correctly:
[tasks]
. FutureForex.celery.debug_task
. arbitrager.tasks.arb_data_post_request
When I run beat, nothing more gets logged and the POST request is not executed:
[2021-12-10 16:51:24,696: INFO/MainProcess] beat: Starting...
My tasks.py contains the following:
from celery import Celery
from celery.schedules import crontab
from celery.utils.log import get_task_logger
import requests
app = Celery()
logger = get_task_logger(__name__)
@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    # Calls arb_data_post_request every 10 seconds.
    sender.add_periodic_task(10.0, arb_data_post_request.s(), name='arb_data_post_request')
@app.task
def arb_data_post_request():
    """
    Post request to the Django framework to pull the exchange data and save to the database
    :return:
    """
    request = requests.post('http://127.0.0.1:8000/arbitrager/data/')
    logger.info(request.text)
I believe Celery is installed and set up correctly, as it finds the task. I can provide any settings if required though.
Any ideas as to why it doesn't kick off the task according to the scheduled 10 seconds would be appreciated.
Thanks,
Saul

Celery Group tasks complete but completed_count is zero

I'm using Celery 4.3.0 to create a group of tasks to run. When I do this the tasks themselves all execute successfully but the GroupResult completed count is always 0.
I'm using a RabbitMQ broker and have tried both the Redis and the database result backends; the behaviour is the same.
@shared_task(
    autoretry_for=(Exception,), retry_backoff=
    ignore_result=False, retry_kwargs={'max_retries': 3},
)
def some_task(*args, **kwargs):
    logger.info('some task')
def run_tasks():
    tasks = [some_task.s(), some_task.s()]
    result = group(*tasks).apply_async()
    while True:
        print(result.completed_count())
You can update celery to 4.4.1. I had the same problem before updating.

Using ForkJoinPool in Scala

In code:
val executor = new ForkJoinPool()
executor.execute(new Runnable {
  def run = println("This task is run asynchronously")
})
Thread.sleep(10000)
This code prints: This task is run asynchronously
But if I remove Thread.sleep(10000), the program doesn't print anything.
I then learnt that this is because sleep prevents the daemon threads in ForkJoinPool from being terminated before they call the run method on the Runnable object.
So, a few questions:
Does it mean threads started by ForkJoinPool are all daemon threads? And why is that so?
How does sleep help here?
Answers:
Yes, because you are using the default thread factory and that is how it is configured. You can provide a custom thread factory to it if you wish, and you may configure the threads to be non-daemon.
Sleep helps because it prevents your program from exiting for long enough for the thread pool threads to find your task and execute it.
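The same effect can be reproduced outside the JVM. Below is a rough Python analogue (not ForkJoinPool itself, just the daemon-thread behaviour the answers describe; run_child and CHILD are names made up for this sketch). A child interpreter starts one thread and lets its main thread finish immediately: with a daemon thread the process exits before the task prints, with a non-daemon thread the interpreter waits for it.

```python
import subprocess
import sys
import textwrap

# Script run in a child interpreter so that process exit can be observed.
CHILD = textwrap.dedent('''
    import sys, threading, time

    def task():
        time.sleep(0.5)
        print("This task is run asynchronously", flush=True)

    # The main thread ends right after start(); only a non-daemon
    # thread keeps the interpreter alive long enough to print.
    t = threading.Thread(target=task, daemon=(sys.argv[1] == "daemon"))
    t.start()
''')


def run_child(mode):
    """Run CHILD with mode 'daemon' or 'non-daemon' and return its stdout."""
    done = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True, text=True, timeout=30, check=True,
    )
    return done.stdout.strip()
```

Calling run_child("daemon") returns an empty string, while run_child("non-daemon") returns the printed message. Keeping the main thread alive, as Thread.sleep does in the question, is the other way to give a daemon thread time to run.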

Celery: Accessing the Broker Connection Pool

I'm using Celery with an AMQP broker to call tasks, but the response needs to be passed back with a different queue architecture than Celery uses, so I want to pass the messages back using Kombu only. I've been able to do this, but I'm creating a new connection every time. Does Celery use a broker connection pool, and if so, how do you access it?
It took a lot of searching because Celery's documentation is... wonderful... but I found the answer.
Celery does use a broker connection pool for calling subtasks. The celery application has a pool attribute that you can access through <your_app>.pool or celery.current_app.pool. You can then grab a connection from the pool using pool.acquire().
It is also possible using Bootsteps: https://docs.celeryproject.org/en/stable/userguide/extending.html
Let me copy the code from the documentation here, in case the link returns a 404 in future:
from celery import Celery
from celery import bootsteps
from kombu import Consumer, Exchange, Queue
my_queue = Queue('custom', Exchange('custom'), 'routing_key')
app = Celery(broker='amqp://')
class MyConsumerStep(bootsteps.ConsumerStep):
    def get_consumers(self, channel):
        return [Consumer(channel,
                         queues=[my_queue],
                         callbacks=[self.handle_message],
                         accept=['json'])]
    def handle_message(self, body, message):
        print('Received message: {0!r}'.format(body))
        message.ack()
app.steps['consumer'].add(MyConsumerStep)
def send_me_a_message(who, producer=None):
    with app.producer_or_acquire(producer) as producer:
        producer.publish(
            {'hello': who},
            serializer='json',
            exchange=my_queue.exchange,
            routing_key='routing_key',
            declare=[my_queue],
            retry=True,
        )
if __name__ == '__main__':
    send_me_a_message('world!')