Airflow: Celery task failure - celery

I have Airflow up and running, but I have an issue where my task is failing in Celery:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 52, in execute_command
subprocess.check_call(command, shell=True)
File "/usr/local/lib/python3.6/subprocess.py", line 291, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'airflow run airflow_tutorial_v01 print_hello 2017-06-01T15:00:00 --local -sd /usr/local/airflow/dags/hello_world.py' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 375, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 632, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 55, in execute_command
raise AirflowException('Celery command failed')
airflow.exceptions.AirflowException: Celery command failed
It is a very basic DAG (taken from the hello world tutorial: https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/tutorial.py).
Also, I do not see any logs from my worker; I got this stack trace from the Flower web interface.
If I manually run the airflow run command mentioned in the stack trace on the worker node, it works.
How can I get more information to debug further?
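For reference, the DAG is essentially the following (a slightly simplified sketch of hello_world.py, not the exact file; the DAG id and task id match the stack trace above, the schedule and start date are assumptions):

from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def print_hello():
    return 'Hello world!'

dag = DAG('airflow_tutorial_v01',
          start_date=datetime(2017, 6, 1),
          schedule_interval='@daily')

print_hello_task = PythonOperator(task_id='print_hello',
                                  python_callable=print_hello,
                                  dag=dag)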
The only log I get when starting `airflow worker` is
root@ip-10-0-4-85:~# /usr/local/lib/python3.5/dist-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
""")
[2018-07-25 17:49:43,430] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python3.5/lib2to3/Grammar.txt
[2018-07-25 17:49:43,469] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python3.5/lib2to3/PatternGrammar.txt
[2018-07-25 17:49:43,594] {__init__.py:45} INFO - Using executor CeleryExecutor
Starting flask
[2018-07-25 17:49:43,665] {_internal.py:88} INFO - * Running on http://0.0.0.0:8793/ (Press CTRL+C to quit)
^C
The config I use is the default one, with a PostgreSQL and Redis backend for Celery.
I see the worker online in Flower.
Thanks.
edit: added more information

Related

Overwrite anaconda3 unsuccessful installation

I deleted the anaconda directory under my home directory and the bashrc configuration. Now I need to install it again, but a problem occurs even when I overwrite the unsuccessful installation on Linux.
Should I delete some additional config files? How can I handle this?
sh Downloads/Anaconda3-2022.10-Linux-x86_64.sh -u -p /home/user/anaconda3/
PREFIX=/home/user/anaconda3
Unpacking payload ...
concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
File "concurrent/futures/process.py", line 384, in wait_result_broken_or_wakeup
File "multiprocessing/connection.py", line 256, in recv
TypeError: __init__() missing 1 required positional argument: 'msg'
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "entry_point.py", line 69, in <module>
File "concurrent/futures/process.py", line 559, in _chain_from_iterable_of_lists
File "concurrent/futures/_base.py", line 608, in result_iterator
File "concurrent/futures/_base.py", line 445, in the result
File "concurrent/futures/_base.py", line 390, in __get_result
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
[8382] Failed to execute script entry_point
Make sure you have deleted the .conda directory under your home directory and that you have enough disk space.
There is no need to delete .cache or any bin libraries.
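As a rough sketch of that cleanup (assuming the default install prefix and the installer path from the question; adjust to your own paths):

rm -rf ~/anaconda3 ~/.conda        # remove the broken install and leftover conda metadata
df -h ~                            # check there is enough free disk space for the installer
sh ~/Downloads/Anaconda3-2022.10-Linux-x86_64.sh -u -p ~/anaconda3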

How do I get past "Could not queue the build because there were validation errors or warnings." while automating pipeline creation using az-cli

I am trying to automate rsync pipeline creation using az-cli.
This is the command I am running from a local clone of my repository:
az pipelines create --name my_pipeline --yml-path azure-pipeline.yml --project my_project --repository my_repo --repository-type tfsgit
The pipeline is created but it is not able to queue it. Here are the details from the --debug switch. Am I missing something?
The expected output was to not only create the pipeline but also run it.
WARNING: This command is in preview and under development. Reference and support levels: https://aka.ms/CLI_refstatus
WARNING: cli.azext_devops.dev.pipelines.pipeline_create: Successfully created a pipeline with Name: my_pipeline, Id: 2019.
DEBUG: msrest.exceptions: Could not queue the build because there were validation errors or warnings.
DEBUG: cli.azext_devops.dev.common.exception_handler: handling vsts service error
DEBUG: cli.azure.cli.core.util: azure.cli.core.util.handle_exception is called with an exception:
DEBUG: cli.azure.cli.core.util: Traceback (most recent call last):
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 691, in _run_job
result = cmd_copy(params)
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 328, in call
*return self.handler(*args, *kwargs)
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/command_operation.py", line 121, in handler
*return op(*command_args)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/dev/pipelines/pipeline_create.py", line 155, in pipeline_create
project=project)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/devops_sdk/v5_1/build/build_client.py", line 337, in queue_build
content=content)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/devops_sdk/client.py", line 90, in _send
response = self._send_request(request=request, headers=headers, content=content, media_type=media_type)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/devops_sdk/client.py", line 54, in _send_request
self._handle_error(request, response)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/devops_sdk/client.py", line 233, in _handle_error
raise AzureDevOpsServiceError(wrapped_exception)
azext_devops.devops_sdk.exceptions.AzureDevOpsServiceError: Could not queue the build because there were validation errors or warnings.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/az/lib/python3.6/site-packages/knack/cli.py", line 231, in invoke
cmd_result = self.invocation.execute(args)
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 657, in execute
raise ex
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 720, in _run_jobs_serially
results.append(self._run_job(expanded_arg, cmd_copy))
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 712, in _run_job
return cmd_copy.exception_handler(ex)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/dev/common/exception_handler.py", line 18, in azure_devops_exception_handler
raise CLIError(ex)
knack.util.CLIError: Could not queue the build because there were validation errors or warnings.
ERROR: cli.azure.cli.core.azclierror: Could not queue the build because there were validation errors or warnings.
ERROR: az_command_data_logger: Could not queue the build because there were validation errors or warnings.
DEBUG: cli.knack.cli: Event: Cli.PostExecute [<function AzCliLogging.deinit_cmd_metadata_logging at 0x7fe2e4a682f0>]
INFO: az_command_data_logger: exit code: 1
INFO: cli.main: Command ran in 2.552 seconds (init: 0.200, invoke: 2.352)
INFO: telemetry.save: Save telemetry record of length 3257 in cache
WARNING: telemetry.check: Negative: The /home/user/.azure/telemetry.txt was modified at 2022-04-07 14:29:35.737231, which in less than 600.000000 s
Additional information: I am setting the AZURE_DEVOPS_EXT_PAT env variable to authenticate and use az-cli commands.
The error message says it all: it can't queue the build because there are validation errors or warnings in the YAML.
It created pipeline 2019; you need to review the YAML and correct the validation errors before it will run:
Open a browser and navigate to https://dev.azure.com/<your-organization-name>/<your-project-name>/_build?definitionId=2019
Click on the Edit button
In the ellipsis context menu, select Validate:
The error message about the invalid syntax will be shown in a dialog box.
Alternatively, the Azure DevOps REST API exposes endpoints that do the same:
preview pipeline,
or run pipeline with the previewRun parameter specified in the request body.
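As a rough sketch in Python (the organization, PAT and api-version values are placeholders; check the current Azure DevOps REST API reference for the exact request and response shape):

import requests

organization = "my-org"            # placeholder
project = "my_project"
pipeline_id = 2019                 # the id reported by az pipelines create
pat = "<personal-access-token>"

url = (f"https://dev.azure.com/{organization}/{project}"
       f"/_apis/pipelines/{pipeline_id}/runs?api-version=7.0")

# previewRun=true asks the service to validate and expand the YAML without
# queuing a real run, so validation errors come back in the response body.
response = requests.post(url, json={"previewRun": True}, auth=("", pat))
print(response.status_code)
print(response.json())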

Airflow scheduler failure

I have followed this tutorial in an attempt to build an Airflow cluster on localhost with my own DAGs. When I ran airflow scheduler after having set executor = CeleryExecutor in the config file, I received the following traceback:
Traceback (most recent call last):
File "/home/yurii/Tools/anaconda3/bin/airflow", line 28, in
args.func(args)
File"/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/bin/cli.py", line 839, in scheduler job.run()
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 200, in run
self._execute()
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 1309, in _execute
self._execute_helper(processor_manager)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 1441, in _execute_helper
self.executor.heartbeat()
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/executors/base_executor.py", line 124, in heartbeat
self.execute_async(key, command=command, queue=queue)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 80, in execute_async
args=[command], queue=queue)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/celery/app/task.py", line 573, in apply_async
**dict(self._get_exec_options(), **options)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/celery/app/base.py", line 354, in send_task
reply_to=reply_to or self.oid, **options
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/celery/app/amqp.py", line 310, in publish_task
**kwargs
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/kombu/messaging.py", line 172, in publish
routing_key, mandatory, immediate, exchange, declare)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/kombu/connection.py", line 449, in _ensured
return fun(*args, **kwargs)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/kombu/messaging.py", line 188, in _publish
mandatory=mandatory, immediate=immediate,
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/librabbitmq/init.py", line 122, in basic_publish
mandatory or False, immediate or False,
TypeError: an integer is required (got type NoneType)
Some additional information:
I am using Airflow 1.8.0 along with Celery 3.1.25 and RabbitMQ 3.5.7 as a broker and backend, but also tried Airflow 1.9.0 with Celery 4.2.
Airflow with sequential executor works without any problems.
`airflow test "dag_name" "task_name" "exec_date"` runs successfully.
I am new to Airflow/Celery/RabbitMQ/SQL, so any help would be appreciated!
To add to the previous answer: using py-amqp involves either changing broker_url = amqp://XXXXX to broker_url = pyamqp://XXXXX, OR
running pip uninstall librabbitmq.
Additionally, you may need to change the celery_result_backend variable to result_backend in your airflow.cfg. The celery_ prefix has been removed for variables in the [celery] section of airflow.cfg in recent versions.
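As a rough sketch, the relevant part of airflow.cfg would then look something like this (the broker host, credentials and the result backend value are placeholders for your own setup):

[celery]
broker_url = pyamqp://guest:guest@localhost:5672//
# older configs used celery_result_backend; newer Airflow expects result_backend
result_backend = rpc://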
It seems you are using librabbitmq as the amqp broker library, which is not recommended by the Celery core team. Use py-amqp as the RabbitMQ broker library and you should get rid of this error.

Starting Celery with supervisord: AttributeError: 'module' object has no attribute 'celery'

I used to have all my Flask app code and Celery code in one file, and it worked fine with supervisor. However, it got very hairy, so I split my tasks out into celery_tasks.py, and now this problem occurs.
In my project directory, I can start celery manually with the following command
celery -A celery_tasks worker --loglevel=INFO
However, because this is a server, I need Celery to run as a daemon in the background.
But it shows the following error when I call sudo supervisorctl restart celeryd:
celeryd: ERROR (abnormal termination)
and the log said:
Traceback (most recent call last):
File "/srv/www/learningapi.stanford.edu/peerAPI/peerAPIenv/bin/celery", line 9, in <module>
load_entry_point('celery==3.0.19', 'console_scripts', 'celery')()
File "/srv/www/learningapi.stanford.edu/peerAPI/peerAPIenv/local/lib/python2.7/site-packages/celery/__main__.py", line 14, in main
main()
File "/srv/www/learningapi.stanford.edu/peerAPI/peerAPIenv/local/lib/python2.7/site-packages/celery/bin/celery.py", line 957, in main
cmd.execute_from_commandline(argv)
File "/srv/www/learningapi.stanford.edu/peerAPI/peerAPIenv/local/lib/python2.7/site-packages/celery/bin/celery.py", line 901, in execute_from_commandline
super(CeleryCommand, self).execute_from_commandline(argv)))
File "/srv/www/learningapi.stanford.edu/peerAPI/peerAPIenv/local/lib/python2.7/site-packages/celery/bin/base.py", line 185, in execute_from_commandline
argv = self.setup_app_from_commandline(argv)
File "/srv/www/learningapi.stanford.edu/peerAPI/peerAPIenv/local/lib/python2.7/site-packages/celery/bin/base.py", line 300, in setup_app_from_commandline
self.app = self.find_app(app)
File "/srv/www/learningapi.stanford.edu/peerAPI/peerAPIenv/local/lib/python2.7/site-packages/celery/bin/base.py", line 318, in find_app
return sym.celery
AttributeError: 'module' object has no attribute 'celery'
I used the following config.
[program:celeryd]
command = celery -A celery_tasks worker --loglevel=INFO
user=peerapi
numprocs=4
stdout_logfile = <path to log>
stderr_logfile = <path to log>
autostart = true
autorestart = true
environment=PATH="<path to my project>"
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
My code also initializes Celery properly:
celery = Celery('celery_tasks', broker='amqp://guest:guest@localhost:5672//',
backend='amqp')
celery.config_from_object(celeryconfig)
and my celeryconfig.py is working normally
CELERY_TASK_SERIALIZER='json'
CELERY_RESULT_SERIALIZER='json'
CELERY_TIMEZONE='America/Los_Angeles'
CELERY_ENABLE_UTC=True
Any clue?
It looks like your application can't find your celeryconfig; this happens, for example, when your CWD is not set correctly. Try to use something like:
cd app_path; celeryd ...
You also need to set up the environment:
# local settings
PATH=/home/ubuntu/envs/app/bin:$PATH
PYTHONHOME=/home/ubuntu/envs/app/
PYTHONPATH=/home/ubuntu/projects/app/
Should work.
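In other words, celery -A celery_tasks only works if a module named celery_tasks is importable from the worker's working directory (or from a directory on PYTHONPATH) and exposes the app as a module-level attribute. A minimal sketch mirroring the names in the question (the add task is just a hypothetical example):

# celery_tasks.py -- must be importable from the worker's working directory
from celery import Celery

celery = Celery('celery_tasks',
                broker='amqp://guest:guest@localhost:5672//',
                backend='amqp')
celery.config_from_object('celeryconfig')   # celeryconfig.py must also be importable

@celery.task
def add(x, y):
    return x + y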

"pysolr.SolrError: [Reason: /solr4/update/]" when running mongo_connector.py

As a follow-on from this problem I was having before (How long does mongo_connector.py usually take?):
I was wondering if anyone else has had this problem when running the following:
$ python /usr/local/lib/python2.7/dist-packages/mongo-connector/mongo_connector.py -m localhost:27017 --docManager /usr/local/lib/python2.7/dist-packages/mongo-connector/doc_managers/solr_doc_manager.py -t http://localhost:8080/solr4
This is the error output I get:
2012-08-20 10:24:11,893 - INFO - Beginning Mongo Connector
2012-08-20 10:24:12,971 - INFO - Starting new HTTP connection (1): localhost
2012-08-20 10:24:12,974 - INFO - Finished 'http://localhost:8080/solr4/update/?commit=true' (post) with body 'u'<commit ' in 0.017 seconds.
2012-08-20 10:24:12,983 - ERROR - [Reason: /solr4/update/]
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/mongo-connector/mongo_connector.py", line 441, in <module>
auth_username=options.admin_name)
File "/usr/local/lib/python2.7/dist-packages/mongo-connector/mongo_connector.py", line 100, in __init__
unique_key=u_key)
File "/usr/local/lib/python2.7/dist-packages/mongo-connector/doc_managers/solr_doc_manager.py", line 54, in __init__
self.run_auto_commit()
File "/usr/local/lib/python2.7/dist-packages/mongo-connector/doc_managers/solr_doc_manager.py", line 95, in run_auto_commit
self.solr.commit()
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 802, in commit
return self._update(msg, waitFlush=waitFlush, waitSearcher=waitSearcher)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 359, in _update
return self._send_request('post', path, message, {'Content-type': 'text/xml; charset=utf-8'})
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 293, in _send_request
raise SolrError(error_message)
pysolr.SolrError: [Reason: /solr4/update/]
[Reason: /solr4/update/] is not really an output that I can even start to debug from. Solr is working perfectly fine, and MongoDB is working perfectly fine. What could this problem be caused by?
I have been following the instructions on this page up to now: http://loutilities.wordpress.com/2012/11/26/complementing-mongodb-with-real-time-solr-search/#comment-183. I've also seen on various websites that adding the following to my Solr's solrconfig.xml should make 'update' accessible, but this is already configured on my system:
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
That's about all the information I have. Any hints as to what I might be doing wrong?
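One way to narrow this down is to reproduce the failing commit outside of mongo_connector, using pysolr directly against the same URL that was passed with -t (a rough sketch):

import pysolr

# Same base URL that mongo_connector.py was given with -t
solr = pysolr.Solr('http://localhost:8080/solr4', timeout=10)

# This issues the same POST to /solr4/update/ that the doc manager performs on
# startup; if the core name or the update handler path is wrong, the full HTTP
# error from Solr surfaces here instead of the bare [Reason: /solr4/update/].
solr.commit()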