Worker Lost Error after task accepted with exitcode 2 - celery

I am new to Celery. I have added my backup script to Celery using periodic_task. In the logs I see "Task accepted: main", and immediately afterwards I see the error below:
[2017-09-21 06:01:00,257: ERROR/MainProcess] Process 'PoolWorker-5' pid:XXXX exited with 'exitcode 2'
[2017-09-21 06:01:00,268: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: exitcode 2.',)
Traceback (most recent call last):
File "/usr/lib64/python2.7/site-packages/billiard/pool.py", line 1224, in mark_as_worker_lost
human_status(exitcode)),
WorkerLostError: Worker exited prematurely: exitcode 2.
Thanks in advance.
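For what it's worth, exitcode 2 from a pool worker usually means the worker process itself exited rather than the task merely failing; a common cause is the task (or the script it invokes) calling sys.exit() or exec'ing over the process. Below is a minimal sketch of a periodic backup task that shells out to the script instead. The broker URL, schedule, and script path are assumptions, not taken from the question:

from datetime import timedelta
import subprocess

from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')  # broker URL is an assumption

@app.task(name='main')
def backup():
    # check_output raises CalledProcessError on a non-zero exit, which is reported
    # as an ordinary task failure; calling sys.exit() inside the task instead kills
    # the pool worker itself and surfaces as the WorkerLostError shown above.
    return subprocess.check_output(['/usr/local/bin/backup.sh'])  # script path is an assumption

# Schedule it with celery beat (Celery 4.x style)
app.conf.beat_schedule = {
    'nightly-backup': {'task': 'main', 'schedule': timedelta(hours=24)},
}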

Related

Yocto: Failure expanding variable KERNEL_LOCALVERSION

I am facing the error below while building the kernel from a local workspace (created by devtool modify virtual/kernel). If no workspace has been created, I don't see any error.
ERROR: ExpansionError during parsing /home/aws-fsp-build/rax-workspace/yocto/meta-ti/recipes-kernel/linux/linux-ti-staging-rt_5.10.bb
Traceback (most recent call last):
File "Var <KERNEL_LOCALVERSION>", line 1, in <module>
bb.data_smart.ExpansionError: Failure expanding variable KERNEL_LOCALVERSION, expression was -g${#d.getVar('SRCPV', True).split('+')[1]} which triggered exception IndexError: list index out of range
Can you help me resolve this? I need to keep the workspace since I am working on kernel-related changes. I am using the dunfell branch of meta-ti. Here is the full log:
Loading cache: 100% |#########################################################################################################| Time: 0:00:00
Loaded 4480 entries from dependency cache.
WARNING: /home/aws-fsp-build/rax-workspace/yocto/meta-ti/recipes-kernel/linux/linux-ti-staging-rt_5.10.bb: Exception during build_dependencies for do_configure
WARNING: /home/aws-fsp-build/rax-workspace/yocto/meta-ti/recipes-kernel/linux/linux-ti-staging-rt_5.10.bb: Error during finalise of /home/aws-fsp-build/rax-workspace/yocto/meta-ti/recipes-kernel/linux/linux-ti-staging-rt_5.10.bb
ERROR: ExpansionError during parsing /home/aws-fsp-build/rax-workspace/yocto/meta-ti/recipes-kernel/linux/linux-ti-staging-rt_5.10.bb
Traceback (most recent call last):
File "Var <KERNEL_LOCALVERSION>", line 1, in <module>
bb.data_smart.ExpansionError: Failure expanding variable KERNEL_LOCALVERSION, expression was -g${#d.getVar('SRCPV', True).split('+')[1]} which triggered exception IndexError: list index out of range
WARNING: /home/aws-fsp-build/rax-workspace/yocto/meta-ti/recipes-kernel/linux/linux-ti-staging-rt_5.4.bb: Cooker received SIGTERM, shutting down...
WARNING: /home/aws-fsp-build/rax-workspace/yocto/meta-carrier/recipes-kernel/linux/linux-ti-staging_4.19.bb: Cooker received SIGTERM, shutting down...
WARNING: /home/aws-fsp-build/rax-workspace/yocto/meta-carrier/recipes-kernel/mstp-mod/mstp-mod.bb: Cooker received SIGTERM, shutting down...
Summary: There were 5 WARNING messages shown.
Summary: There was 1 ERROR message shown, returning a non-zero exit code.
It seems that after moving the recipe into the workspace, the SRCPV variable changes format, which leads to the parsing failure. Try adding something like this to the build/workspace/appends/linux-ti-staging-rt_5.4.bbappend file:
KERNEL_LOCALVERSION = "-g999"
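For reference, the IndexError happens because, when building from the devtool workspace, SRCPV no longer contains a '+' for the expression to split on; a quick illustration (the example values below are made up, not taken from the build):

fetched_srcpv = "AUTOINC+9f1b2c3d4e"   # illustrative value for a normally fetched git recipe
workspace_srcpv = "9f1b2c3d4e"         # illustrative workspace-style value with no '+'

print(fetched_srcpv.split('+')[1])     # -> '9f1b2c3d4e'
print(workspace_srcpv.split('+')[1])   # -> IndexError: list index out of range

Pinning KERNEL_LOCALVERSION in the bbappend, as above, simply avoids evaluating the SRCPV-based expression at all.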

How do I get past "Could not queue the build because there were validation errors or warnings." while automating pipeline creation using az-cli

I am trying to automate rsync pipeline creation using az-cli.
This is the command I am running from a local clone of my repository:
az pipelines create --name my_pipeline --yml-path azure-pipeline.yml --project my_project --repository my_repo --repository-type tfsgit
The pipeline is created, but the CLI is not able to queue it. Here are the details from the --debug switch. Am I missing something?
I expected the command to not only create the pipeline but also run it.
WARNING: This command is in preview and under development. Reference and support levels: https://aka.ms/CLI_refstatus
WARNING: cli.azext_devops.dev.pipelines.pipeline_create: Successfully created a pipeline with Name: my_pipeline, Id: 2019.
DEBUG: msrest.exceptions: Could not queue the build because there were validation errors or warnings.
DEBUG: cli.azext_devops.dev.common.exception_handler: handling vsts service error
DEBUG: cli.azure.cli.core.util: azure.cli.core.util.handle_exception is called with an exception:
DEBUG: cli.azure.cli.core.util: Traceback (most recent call last):
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 691, in _run_job
result = cmd_copy(params)
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 328, in call
*return self.handler(*args, *kwargs)
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/command_operation.py", line 121, in handler
*return op(*command_args)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/dev/pipelines/pipeline_create.py", line 155, in pipeline_create
project=project)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/devops_sdk/v5_1/build/build_client.py", line 337, in queue_build
content=content)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/devops_sdk/client.py", line 90, in _send
response = self._send_request(request=request, headers=headers, content=content, media_type=media_type)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/devops_sdk/client.py", line 54, in _send_request
self._handle_error(request, response)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/devops_sdk/client.py", line 233, in _handle_error
raise AzureDevOpsServiceError(wrapped_exception)
azext_devops.devops_sdk.exceptions.AzureDevOpsServiceError: Could not queue the build because there were validation errors or warnings.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/az/lib/python3.6/site-packages/knack/cli.py", line 231, in invoke
cmd_result = self.invocation.execute(args)
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 657, in execute
raise ex
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 720, in _run_jobs_serially
results.append(self._run_job(expanded_arg, cmd_copy))
File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/init.py", line 712, in _run_job
return cmd_copy.exception_handler(ex)
File "/home/user/.azure/cliextensions/azure-devops/azext_devops/dev/common/exception_handler.py", line 18, in azure_devops_exception_handler
raise CLIError(ex)
knack.util.CLIError: Could not queue the build because there were validation errors or warnings.
ERROR: cli.azure.cli.core.azclierror: Could not queue the build because there were validation errors or warnings.
ERROR: az_command_data_logger: Could not queue the build because there were validation errors or warnings.
DEBUG: cli.knack.cli: Event: Cli.PostExecute [<function AzCliLogging.deinit_cmd_metadata_logging at 0x7fe2e4a682f0>]
INFO: az_command_data_logger: exit code: 1
INFO: cli.main: Command ran in 2.552 seconds (init: 0.200, invoke: 2.352)
INFO: telemetry.save: Save telemetry record of length 3257 in cache
WARNING: telemetry.check: Negative: The /home/user/.azure/telemetry.txt was modified at 2022-04-07 14:29:35.737231, which in less than 600.000000 s
Additional information: I am setting the AZURE_DEVOPS_EXT_PAT env variable to authenticate and use az-cli commands.
The error message says it all: the build can't be queued because there are errors in the YAML.
The pipeline (Id 2019) was created; you need to review the YAML and correct the validation errors before it will run:
Open a browser and navigate to https://dev.azure.com/<your-organization-name>/<your-project-name>/_build?definitionId=2019
Click on the Edit button
In the ellipsis context menu, select Validate.
The error message about the invalid syntax will be shown in a dialog box.
Alternatively, the Azure DevOps REST API exposes endpoints to do the same: the pipeline preview endpoint, or the pipeline run endpoint with the previewRun parameter specified in the request body.
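As a minimal sketch of the second option (organization and project names are placeholders; pipeline Id 2019 and the AZURE_DEVOPS_EXT_PAT variable come from the output above), a previewRun request asks the service to validate the YAML without queuing a real run:

import base64
import os

import requests

ORG = "your-organization-name"   # placeholder
PROJECT = "my_project"
PIPELINE_ID = 2019

# Reuse the same PAT the CLI uses; Basic auth with an empty username.
pat = os.environ["AZURE_DEVOPS_EXT_PAT"]
auth = base64.b64encode((":" + pat).encode()).decode()

url = ("https://dev.azure.com/%s/%s/_apis/pipelines/%d/runs?api-version=6.0-preview.1"
       % (ORG, PROJECT, PIPELINE_ID))

# previewRun=True parses and validates the YAML instead of queuing a build;
# the response body contains either the final YAML or the validation messages.
resp = requests.post(url, json={"previewRun": True},
                     headers={"Authorization": "Basic " + auth})
print(resp.status_code)
print(resp.text)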

Airflow: Celery task failure

I have Airflow up and running, but my task is failing in Celery:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 52, in execute_command
subprocess.check_call(command, shell=True)
File "/usr/local/lib/python3.6/subprocess.py", line 291, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'airflow run airflow_tutorial_v01 print_hello 2017-06-01T15:00:00 --local -sd /usr/local/airflow/dags/hello_world.py' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 375, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/celery/app/trace.py", line 632, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 55, in execute_command
raise AirflowException('Celery command failed')
airflow.exceptions.AirflowException: Celery command failed
It is a very basic DAG (taken from the hello world tutorial: https://github.com/apache/incubator-airflow/blob/master/airflow/example_dags/tutorial.py).
Also, I do not see any logs from my worker; I got this stack trace from the Flower web interface.
If I manually run the airflow run command mentioned in the stack trace on the worker node, it works.
How can I get more information to debug further?
The only log I get when starting `airflow worker` is:
root#ip-10-0-4-85:~# /usr/local/lib/python3.5/dist-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
""")
[2018-07-25 17:49:43,430] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python3.5/lib2to3/Grammar.txt
[2018-07-25 17:49:43,469] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python3.5/lib2to3/PatternGrammar.txt
[2018-07-25 17:49:43,594] {__init__.py:45} INFO - Using executor CeleryExecutor
Starting flask
[2018-07-25 17:49:43,665] {_internal.py:88} INFO - * Running on http://0.0.0.0:8793/ (Press CTRL+C to quit)
^C
The config I use is the default one, with PostgreSQL and Redis backends for Celery.
I see the worker online in Flower.
Thanks.
Edit: edited to add more information.
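One way to get more detail is to reproduce by hand what execute_command in the traceback does, but with the command's output captured. This is only a sketch: the command string is copied from the stack trace above, and it assumes it is run in the same environment as the worker:

import subprocess

cmd = ("airflow run airflow_tutorial_v01 print_hello 2017-06-01T15:00:00 "
       "--local -sd /usr/local/airflow/dags/hello_world.py")

# Same shell invocation the CeleryExecutor makes, but with stdout/stderr captured
# so the real reason for the non-zero exit status is visible.
proc = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE,
                      stderr=subprocess.STDOUT, universal_newlines=True)
print("exit status:", proc.returncode)
print(proc.stdout)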

Why does my Celery worker die with signal 6 - SIGIOT?

I'm running a Celery-based application. Every now and then I see the following in the log:
[... ERROR/MainProcess] Task [...] raised unexpected: WorkerLostError('Worker exited prematurely: signal 6 (SIGIOT).',)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/billiard/pool.py", line 1170, in mark_as_worker_lost
human_status(exitcode)),
WorkerLostError: Worker exited prematurely: signal 6 (SIGIOT).
Can anyone come up with an explanation for this?
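For what it's worth, signal 6 is SIGABRT (SIGIOT is a legacy alias for the same signal on Linux), i.e. the worker process called abort(), typically because of a failed assertion or glibc detecting heap corruption in a C extension. A quick Python 3 check of the mapping:

import signal

print(signal.Signals(6).name)            # -> SIGABRT
print(signal.SIGIOT == signal.SIGABRT)   # -> True (SIGIOT is defined on Linux)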

sun gridengine error "shepherd of job 119232.1 exited with exit status = 26"

We use gridengine (exactly Open Grid Scheduler 2011.11.p1) as our batch-queuing system. I just added an execd host named host094, but when jobs are submitted there they fail with status Eqw, and the log in $SGE_ROOT/default/spool/host094/messages says:
shepherd of job 119232.1 exited with exit status = 26
can't open usage file active_jobs/119232.1/usage for job 119232.1: No such file or directory
What does this mean?