How can I get a result in Celery using AsyncResult when the celery worker is running on a different server - celery

I am sending a task from, let's say, server A to a Celery worker running on a remote server, using the following code:
import os
from celery import Celery
from celery.result import AsyncResult

redis_url = os.getenv("REDIS_ENDPOINT")
app = Celery("tasks", backend=f"redis://{redis_url}", broker=f"redis://{redis_url}")

def send(...):
    result = app.send_task("mytask_name", ...)
    return result.id

def receive(...):
    result = AsyncResult(id=task_id, app=app)
    return result.get() if result.ready() else ""
The send method sends the task to a worker running on a remote server, let's say server B. My issue is with the receive method: AsyncResult doesn't seem to work for me. The reason I am not storing the result object I get after sending the task is that server A is distributed as well, so send and receive might not be called on the same server. Is there a way to get the results using Celery in this type of setup?
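Put differently, any server A node configured against the same Redis result backend should be able to look the task up by id. A minimal sketch of the receive side on its own, based on the snippet above (the backend URL handling is an assumption carried over from the question, not a confirmed fix):

import os
from celery import Celery

# Any node that points at the same Redis backend can look up the result,
# no matter which node originally called send_task().
redis_url = os.getenv("REDIS_ENDPOINT")
app = Celery("tasks", backend=f"redis://{redis_url}", broker=f"redis://{redis_url}")

def receive(task_id):
    result = app.AsyncResult(task_id)  # equivalent to AsyncResult(id=task_id, app=app)
    return result.get() if result.ready() else ""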

Related

JMS Outbound Gateway - Receiving Replies from two jobs instances

We are using the JMSOutboundGateway to send and receive messages using the reply channel within the JMSOutboundGateway. When we run multiple iterations of the same job using the same JMSOutboundGateway, it fails with the error "Message contained wrong job instance id [85] should have been [86]" ( org.springframework.batch.integration.chunk.ChunkMessageChannelItemWriter.getNextResult() ).
This is due to the same JMSOutboundGateway instance being used when I run the second job while the first job is still in progress.
Is there a way I can run parallel executions of the same job type?
This is a known issue, see https://github.com/spring-projects/spring-batch/issues/1372 and https://github.com/spring-projects/spring-batch/issues/1096.
The workaround is to use a separate instance of the writer for each job to prevent sharing the same reply channel.

openshift 3.12 websocket ERR_CONNECTION_ABORTED

I would like to open websocket connections (ws://whatever) in OpenShift, but somehow they always end immediately with ERR_CONNECTION_ABORTED (new WebSocket('ws://whatever')).
At first I thought the problem was in our application, but I created a minimal example and got the same result.
First I created a pod and started this minimal Python websocket server.
import asyncio
import websockets

async def hello(websocket, path):
    name = await websocket.recv()
    print(f"< {name}")
    greeting = f"Hello {name}!"
    await websocket.send(greeting)
    print(f"> {greeting}")

start_server = websockets.serve(hello, "0.0.0.0", 8000)
asyncio.get_event_loop().run_until_complete(start_server)
asyncio.get_event_loop().run_forever()
Then I created a service (TCP 8000) and a route too, and I got the same result.
I also tried using a different port or different targets (e.g. /ws), without success.
This minimal script is able to respond to a simple HTTP request, but it can't handle the websocket connection.
Do you have any idea what the problem could be?
(According to the documentation these connections should work as they are.)
Should I try to play with some routing environment variables, or are there any limitations that are not mentioned in the documentation?
Posting Károly Frendrich's answer as community wiki:
Finally we realized that TLS termination is required to be set.
It can be done using Secured Routes.
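A rough sketch of what that looks like with the oc CLI, assuming the service from the question is named websocket-service (the names here are placeholders, not from the original post):

oc create route edge websocket-route --service=websocket-service

The client then connects through the TLS-terminated route with wss:// instead of ws://, e.g. new WebSocket('wss://<route-host>').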

Celery configuration gets updated when calling a different task

I have multiple tasks in different Django apps using a RabbitMQ broker. This was set up with the standard Django configuration and was working perfectly. I was using groups and chains and calling them from different modules.
As a standard practice, I had:
celery.py:
app = Celery('<proj>')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
And in project/__init__.py:
from __future__ import absolute_import
from .celery import app as celery_app
All tasks were inherited from celery.Task with run() overridden.
Now I got a requirement to call a different task on a different RabbitMQ broker.
So here's what I did where I had to call the different task:
diff_app = Celery('diff')
diff_app.config_from_object({'BROKER_URL':'<DIFF_BROKER_URL>'})
Now to call:
diff_app.send_task('<task_name>', (args1,arg2,))
After I do this, when I call my previous tasks, they get routed to this new broker. The moment I comment out this code, everything is fine again.
When I check the celery_app (described above) conf, the broker URL is correct. But when I check any previous task's app->conf->broker URL, it has been updated to the new broker. How can I fix this?
I removed 'autodiscover_tasks' and associated '_app' with each 'Task' class. This got me past the issue.
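A rough sketch of that workaround, assuming class-based tasks as described in the question (the task name, module paths, and register_task call are illustrative, not taken from the original post):

from celery import Task
from <proj>.celery import app as celery_app

class MyTask(Task):
    name = 'myapp.tasks.my_task'   # hypothetical task name
    _app = celery_app              # pin the task to the original app/broker explicitly

    def run(self, *args, **kwargs):
        ...

celery_app.register_task(MyTask())  # explicit registration instead of autodiscover_tasks()

Binding each Task class to celery_app directly keeps its conf (and broker URL) independent of any other Celery() instance created later.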

Celery Beat - Pyramid Mailer

So, I have some plain Python code which works perfectly in a normal Python shell:
from pyramid_mailer.mailer import Mailer
from pyramid_mailer.message import Message
from pyramid_mailer.message import Attachment
mailer = Mailer(
    host="172.10.10.240",
    port="25")

message = Message(
    subject="Orders with invalid status",
    sender='r@example.com',
    recipients=['luke@example.com'],
    html="<p>Test</p>")

mailer.send_immediately(message)
But if I create a celery beat task such as this:
from pyramid_celery import celery_app as app
from pyramid_mailer.mailer import Mailer
from pyramid_mailer.message import Message
from pyramid_mailer.message import Attachment
mailer = Mailer(
    host="172.10.10.240",
    port="25")

@app.task
def wronglines_celery():
    message = Message(
        subject="Orders with invalid status",
        sender='r@example.com',
        recipients=['luke@example.com'],
        html="<p>Test</p>")
    mailer.send_immediately(message)
This second example does not generate an email; it runs perfectly fine and throws no error at all, even with the log level set to DEBUG.
Running celery beat with:
celery beat -A pyramid_celery.celery_app --ini development.ini
Using the pyramid_celery plug-in as referenced in the official documentation on the celery website. My development.ini file can be seen below (relevant parts):
[celery]
BROKER_URL = amqp://app_rmq:password@localhost:5672/myvhost
CELERY_IMPORTS = intranet.celery_tasks
# Check once a day for orders with wrong line status
[celerybeat:task1]
task = intranet.celery_tasks.wronglines_celery
type = crontab
schedule = {"hour": 16, "minute": 30}
[logger_celery]
level = DEBUG
handlers =
qualname = celery
# Begin logging configuration
[loggers]
keys = root, intranet, sqlalchemy, celery
EDIT:
If I launch celery (without beat) it works perfectly, e.g. if I launch with:
celery worker -A pyramid_celery.celery_app --ini development.ini
All tasks execute (over and over), all emails send and nothing throws an error; it seems to be the introduction of beat that is causing the issues.
Are you sure it's not working? The way you've configured your crontab, it says "Only run once a day at 4:30", so if you ran it until it hit 4:30 I would expect it to execute properly.
Can you change your schedule to {} instead, to have it run every minute as a basic test?
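For reference, that every-minute test would look roughly like this in development.ini (adapted from the config in the question; a sketch, not a verified fix):

[celerybeat:task1]
task = intranet.celery_tasks.wronglines_celery
type = crontab
schedule = {}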
I've added a crontab example to the examples here:
https://github.com/sontek/pyramid_celery/blob/master/examples/scheduler_example/development.ini#L33-L36
If you can provide more code (maybe a sample repo or modification of the examples already in the repo) that shows it not working I can take a look and hopefully fix the bug.
So, after much googling and frustrating debugging, I found an old GitHub issue that claimed celery tasks were working only when launched with a worker, and not with beat. The user states:
Beat does not execute tasks, it just sends the messages. You need both a beat instance and a worker instance!
So, to launch the worker and the beat instance with the same command:
celery worker --beat -A pyramid_celery.celery_app --ini development.ini
I will be sending a pull request today to fix the documentation with regards to the correct way to launch a worker and beat instance.
By default, Celery tasks fail silently on errors; it most likely throws an exception that you never see.
To find out what is failing, put a pdb (ipdb) breakpoint in the task code, start the celery worker in the foreground, and step through the code line by line.
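As a rough illustration of that approach (the task body comes from the question; the --pool=solo flag is an assumption to keep the prompt usable, not something the original answer mentions):

@app.task
def wronglines_celery():
    import pdb; pdb.set_trace()   # execution pauses here when the worker picks the task up
    message = Message(
        subject="Orders with invalid status",
        sender='r@example.com',
        recipients=['luke@example.com'],
        html="<p>Test</p>")
    mailer.send_immediately(message)

Then run the worker in the foreground, e.g. celery worker --beat -A pyramid_celery.celery_app --ini development.ini --pool=solo, so the breakpoint gets an attached terminal.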

Torque pbs_python submit job error (15025 queue already exists)

I am trying to execute this example script (https://oss.trac.surfsara.nl/pbs_python/wiki/TorqueUsage/Scripts/Submit):
#!/usr/bin/env python
import sys
sys.path.append('/usr/local/build_pbs/lib/python2.7/site-packages/pbs/')
import pbs
server_name = pbs.pbs_default()
c = pbs.pbs_connect(server_name)
attropl = pbs.new_attropl(4)
# Set the name of the job
#
attropl[0].name = pbs.ATTR_N
attropl[0].value = "test"
# Job is Rerunable
#
attropl[1].name = pbs.ATTR_r
attropl[1].value = 'y'
# Walltime
#
attropl[2].name = pbs.ATTR_l
attropl[2].resource = 'walltime'
attropl[2].value = '400'
# Nodes
#
attropl[3].name = pbs.ATTR_l
attropl[3].resource = 'nodes'
attropl[3].value = '1:ppn=4'
# A1.tsk is the job script filename
#
job_id = pbs.pbs_submit(c, attropl, "A1.tsk", 'batch', 'NULL')
e, e_txt = pbs.error()
if e:
    print e, e_txt
print job_id
But the shell shows the error "15025 Queue already exists". With qsub the job submits normally. I have one queue, 'batch', on my server. Torque version: 4.2.7. pbs_python version: 4.4.0.
What should I do to start a new job?
There are two things going on here. First, there is an error in pbs_python that maps the 15025 error code to "Queue already exists". Looking at the source of Torque, we see that 15025 actually maps to the error "Bad UID for job execution", which means that the daemon on the Torque server cannot determine whether the user you are submitting as is allowed to run jobs. This could be because of several things:
1. The user you are submitting as doesn't exist on the machine running pbs_server.
2. The host you are submitting from is not in the "submit_hosts" parameter of the pbs_server.
Solution For 1
The remedy for this depends on how you authenticate users across systems. You could use /etc/hosts.equiv to specify users/hosts allowed to submit; this file would need to be distributed to all the Torque nodes as well as the Torque server machine. Using hosts.equiv is pretty insecure, so I haven't actually used it for this. We use a central LDAP server to authenticate all users on the network and do not have this problem. You could also manually add the user to all the Torque nodes and the Torque server, taking care to make sure the UID is the same on all systems.
Solution For 2
If #1 is not your problem (which I doubt it is), you probably need to add the hostname of the machine you're submitting from to the "submit_hosts" parameter on the torque server. This can be accomplished with qmgr:
[root@torque_server]# qmgr -c "set server submit_hosts += hostname.example.com"
The pbs_python library that you are using was written for Torque 2.4.x.
The internal APIs of Torque were largely rewritten in Torque 4.0.x, so the library will most likely need to be rewritten for the new API.
Currently the developers of Torque do not test any external libraries, so it is possible that they could break at any time.