Celery failing on dotcloud deployment with IOError

Celery is failing on one of my dotcloud deployments, and I'm not sure how to fix it. The deployment is almost identical to an existing dotcloud deployment (verified by doing a file diff), which seems to be working fine.
The error I get in the djcelery log:
dotcloud@hack-default-www-0:/var/log/supervisor$ more djcelery_error.log
/home/dotcloud/env/lib/python2.6/site-packages/django/conf/__init__.py:75: DeprecationWarning: The ADMIN_MEDIA_PREFIX setting has been removed; use STATIC_URL instead.
  "use STATIC_URL instead.", DeprecationWarning)
/home/dotcloud/env/lib/python2.6/site-packages/djcelery/loaders.py:108: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
  warnings.warn("Using settings.DEBUG leads to a memory leak, never "
[2012-06-04 03:27:32,139: WARNING/MainProcess] -------------- celery@hack-default-www-0 v2.5.3
---- **** -----
--- * ***  * -- [Configuration]
-- * - **** --- . broker:      amqp://root@hack-OQVADQ2K.dotcloud.com:29210//
- ** ---------- . loader:      djcelery.loaders.DjangoLoader
- ** ---------- . logfile:     [stderr]@INFO
- ** ---------- . concurrency: 2
- ** ---------- . events:      ON
- *** --- * --- . beat:        OFF
-- ******* ----
--- ***** ----- [Queues]
 -------------- . celery: exchange:celery (direct) binding:celery

[Tasks]
  . experiments.tasks.pushMessageToIphone
  . experiments.tasks.sendTestMessage
[2012-06-04 03:27:32,172: INFO/PoolWorker-1] child process calling self.run()
[2012-06-04 03:27:32,185: INFO/PoolWorker-2] child process calling self.run()
[2012-06-04 03:27:32,188: WARNING/MainProcess] celery@hack-default-www-0 has started.
[2012-06-04 03:27:35,315: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 2 seconds...
[2012-06-04 03:27:40,374: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 4 seconds...
[2012-06-04 03:27:47,479: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 6 seconds...
[2012-06-04 03:27:56,509: ERROR/MainProcess] Consumer: Connection Error: Socket
Interestingly, the error log of celerycam shows something a bit different. I'm not sure if this is a red herring...
/home/dotcloud/env/lib/python2.6/site-packages/django/conf/__init__.py:75: DeprecationWarning: The ADMIN_MEDIA_PREFIX setting has been removed; use STATIC_URL instead.
  "use STATIC_URL instead.", DeprecationWarning)
[2012-06-04 03:27:31,373: INFO/MainProcess] -> evcam: Taking snapshots with djcelery.snapshot.Camera (every 1.0 secs.)
Traceback (most recent call last):
  File "hack/manage.py", line 14, in <module>
    execute_manager(settings)
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/__init__.py", line 459, in execute_manager
    utility.execute()
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/base.py", line 74, in run_from_argv
    return super(CeleryCommand, self).run_from_argv(argv)
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/base.py", line 67, in execute
    super(CeleryCommand, self).execute(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/commands/celerycam.py", line 26, in handle
    ev.run(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/bin/celeryev.py", line 38, in run
    detach=detach)
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/bin/celeryev.py", line 70, in run_evcam
    return cam()
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/snapshot.py", line 116, in evcam
    recv.capture(limit=None)
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py", line 204, in capture
    list(self.itercapture(limit=limit, timeout=timeout, wakeup=wakeup))
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py", line 193, in itercapture
    with self.consumer(wakeup=wakeup) as consumer:
  File "/usr/lib/python2.6/contextlib.py", line 16, in __enter__
    return self.gen.next()
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py", line 185, in consumer
    queues=[self.queue], no_ack=True)
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/messaging.py", line 279, in __init__
    self.revive(self.channel)
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/messaging.py", line 286, in revive
    channel = channel.default_channel
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", line 581, in default_channel
    self.connection
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", line 574, in connection
    self._connection = self._establish_connection()
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", line 533, in _establish_connection
    conn = self.transport.establish_connection()
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/transport/amqplib.py", line 279, in establish_connection
    connect_timeout=conninfo.connect_timeout)
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/transport/amqplib.py", line 89, in __init__
    super(Connection, self).__init__(*args, **kwargs)
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/connection.py", line 144, in __init__
    (10, 30), # tune
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/abstract_channel.py", line 95, in wait
    self.channel_id, allowed_methods)
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/connection.py", line 202, in _wait_method
    self.method_reader.read_method()
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
    raise m
IOError: Socket closed
My supervisord file:
[program:djcelery]
directory = /home/dotcloud/current/
command = /home/dotcloud/env/bin/python hack/manage.py celeryd -E -l info -c 2
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
[program:celerycam]
directory = /home/dotcloud/current/
command = /home/dotcloud/env/bin/python hack/manage.py celerycam
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
As mentioned, I have nearly identical code deployed under a different dotcloud account that is working fine.
Status of the rabbitmq broker:
$ ./dotcloud info hack.broker
aliases:
  - hackxxxx.dotcloud.com
config:
  password: xxxx
  rabbitmq_management: true
  user: root
created_at: 1338702527.075196
datacenter: Amazon-us-east-1c
image_version: 924a079b622a (latest)
memory: 49M/512M (9%)
ports:
  - name: ssh
    url: ssh://dotcloud@hackxxx.dotcloud.com:29209
  - name: amqp
    url: amqp://root:xxxx@hackxxxx.dotcloud.com:29210
  - name: http
    url: http://root:xxx@hack1-xxxx.dotcloud.com/
state: running
type: rabbitmq

It looks like it is having an issue connecting to your broker. Have you confirmed that you can connect to your broker, and that it is up and running?
What are you using for a broker?
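As a quick check, a minimal sketch like the following (assuming kombu is available in the environment, since it appears in the traceback; the URL is a placeholder copied from the `dotcloud info` output, substitute the real credentials) can confirm whether the broker is reachable at all:

import socket
from kombu import Connection

# Placeholder URL from `dotcloud info hack.broker`; use the real one.
conn = Connection("amqp://root:xxxx@hackxxxx.dotcloud.com:29210//", connect_timeout=5)
try:
    conn.connect()  # raises IOError / socket.error if the broker is unreachable
    print("broker reachable")
except (IOError, socket.error) as exc:
    print("broker unreachable:", exc)
finally:
    conn.release()

If this fails with the same "Socket closed" error, the problem is the broker or the network path to it, not Celery itself.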

Related

kubernetes.client.exceptions.ApiException: (0) Reason: Handshake status 500 Internal Server Error

Getting the above exception while deploying DAGs in the pipeline.
The log is as follows:
************************************************************************************************************
*                                             Deploying 'dags'...                                          *
************************************************************************************************************
[2022-12-16 14:09:48,076] {io} INFO - Current directory: /artifacts/dags
[2022-12-16 14:09:48,076] {copy_deploy_tool} INFO - Deploy 'dags' by copying files...
[2022-12-16 14:09:48,083] {deploy_tool} INFO - saving values.yaml...
[2022-12-16 14:09:48,162] {copy_deploy_tool} INFO - Removing files from 'development:airflow-5db795dd7c-d586h:/root/airflow/dags'
[2022-12-16 14:09:48,264] {deploy} ERROR - Execution failed for project: dags
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/stream/ws_client.py", line 296, in websocket_call
    client = WSClient(configuration, get_websocket_url(url), headers, capture_all)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/stream/ws_client.py", line 94, in __init__
    self.sock.connect(url, header=header)
  File "/usr/local/lib/python3.6/dist-packages/websocket/_core.py", line 253, in connect
    self.handshake_response = handshake(self.sock, *addrs, **options)
  File "/usr/local/lib/python3.6/dist-packages/websocket/_handshake.py", line 57, in handshake
    status, resp = _get_resp_headers(sock)
  File "/usr/local/lib/python3.6/dist-packages/websocket/_handshake.py", line 143, in _get_resp_headers
    raise WebSocketBadStatusException("Handshake status %d %s", status, status_message, resp_headers)
websocket._exceptions.WebSocketBadStatusException: Handshake status 500 Internal Server Error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/commands/deploy/deploy.py", line 65, in __try_execute
    deployment=project[DEPLOYMENT],
  File "/usr/local/lib/python3.6/dist-packages/tools/deploy/copy_deploy_tool.py", line 50, in run
    namespace, container, pod_name, command, api_client=api_client
  File "/usr/local/lib/python3.6/dist-packages/helpers/kubernetes.py", line 133, in run_pod_command
    stdout=True,
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/stream/stream.py", line 35, in stream
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api/core_v1_api.py", line 841, in connect_get_namespaced_pod_exec
    (data) = self.connect_get_namespaced_pod_exec_with_http_info(name, namespace, **kwargs)  # noqa: E501
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api/core_v1_api.py", line 941, in connect_get_namespaced_pod_exec_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 345, in call_api
    _preload_content, _request_timeout)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 176, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/stream/stream.py", line 30, in _intercept_request_call
    return ws_client.websocket_call(config, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/stream/ws_client.py", line 302, in websocket_call
    raise ApiException(status=0, reason=str(e))
kubernetes.client.rest.ApiException: (0)
Reason: Handshake status 500 Internal Server Error
[2022-12-16 14:09:49,631] {shell} INFO - doi: Deployment record uploaded successfully
[2022-12-16 14:09:49,631] {shell} INFO - OK
[2022-12-16 14:09:49,635] {io} INFO - Current directory: /artifacts
[2022-12-16 14:09:49,635] {pretty_info} INFO -
Usually this happens when the pod gets into the Running state but has no running containers on it (0/1). If you then run an exec command against that pod and container, you get a 500 Internal Server Error instead of an error describing the real issue (the container is not running).
Check that all containers are running, for example:
if all(p.status.phase == "Running" for p in my_pods) \
        and all(c.state.running for p in my_pods for c in p.status.container_statuses):
Also refer to this Stack Overflow post and GitHub issue.
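A fuller version of that check might look like the sketch below. It assumes the official kubernetes Python client; the namespace ("development", from the log) and the label selector are placeholders:

# Sketch: verify pods and their containers are actually running before exec'ing.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster
v1 = client.CoreV1Api()

# Placeholder namespace/selector; adjust to your deployment.
my_pods = v1.list_namespaced_pod("development", label_selector="app=airflow").items

ready = all(p.status.phase == "Running" for p in my_pods) and all(
    c.state.running is not None
    for p in my_pods
    for c in (p.status.container_statuses or [])  # container_statuses can be None
)
print("all containers running:", ready)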

Socket conflict while running VAEs

I am currently trying to run VAEs on my personal laptop, following the steps from https://github.com/AntixK/PyTorch-VAE
I am only trying to run the simplest VAE, with configs/vae.yaml
Since my personal device only has one GPU, I changed the gpus parameter in the config to [0], and I also added:
import os
os.environ["PL_TORCH_DISTRIBUTED_BACKEND"] = "gloo"
to run.py to make the file runnable. However, I am now getting this error:
Global seed set to 1265
break point 3!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
break point 4!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
======= Training VanillaVAE =======
Global seed set to 1265
initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
[W ..\torch\csrc\distributed\c10d\socket.cpp:558] [c10d] The client socket has failed to connect to [DESKTOP-NMQ53KV]:50425 (system error: 10049 - The requested address is not valid in its context.).
[W ..\torch\csrc\distributed\c10d\socket.cpp:558] [c10d] The client socket has failed to connect to [DESKTOP-NMQ53KV]:50425 (system error: 10049 - The requested address is not valid in its context.).
----------------------------------------------------------------------------------------------------
distributed_backend=gloo
All distributed processes registered. Starting with 1 processes
----------------------------------------------------------------------------------------------------
C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\core\datamodule.py:469: LightningDeprecationWarning: DataModule.setup has already been called, so it will not be called again. In v1.6 this behavior will change to always call DataModule.setup.
rank_zero_deprecation(
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
-------------------------------------
0 | model | VanillaVAE | 3.9 M
-------------------------------------
3.9 M Trainable params
0 Non-trainable params
3.9 M Total params
15.751 Total estimated model params size (MB)
Validation sanity check: 0%| | 0/2 [00:00<?, ?it/s]break point 1!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
break point 2!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Global seed set to 1265
break point 3!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
break point 4!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
======= Training VanillaVAE =======
Global seed set to 1265
initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
[W ..\torch\csrc\distributed\c10d\socket.cpp:401] [c10d] The server socket has failed to bind to [DESKTOP-NMQ53KV]:50425 (system error: 10048 - Only one usage of each socket address (protocol/network address/port) is normally permitted.).
[W ..\torch\csrc\distributed\c10d\socket.cpp:401] [c10d] The server socket has failed to bind to DESKTOP-NMQ53KV:50425 (system error: 10013 - An attempt was made to access a socket in a way forbidden by its access permissions.).
[E ..\torch\csrc\distributed\c10d\socket.cpp:435] [c10d] The server socket has failed to listen on any local network address.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1520.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1520.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1520.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1520.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1520.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1520.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1520.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\huklab\Desktop\odin\PyTorch-VAE\run.py", line 67, in <module>
    runner.fit(experiment, datamodule=data)
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\trainer\trainer.py", line 737, in fit
    self._call_and_handle_interrupt(
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\trainer\trainer.py", line 682, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\trainer\trainer.py", line 772, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\trainer\trainer.py", line 1132, in _run
    self.accelerator.setup_environment()
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\accelerators\gpu.py", line 39, in setup_environment
    super().setup_environment()
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\accelerators\accelerator.py", line 83, in setup_environment
    self.training_type_plugin.setup_environment()
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\plugins\training_type\ddp.py", line 185, in setup_environment
    self.setup_distributed()
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\plugins\training_type\ddp.py", line 272, in setup_distributed
    init_dist_connection(self.cluster_environment, self.torch_distributed_backend)
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pytorch_lightning\utilities\distributed.py", line 386, in init_dist_connection
    torch.distributed.init_process_group(
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\torch\distributed\distributed_c10d.py", line 595, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\torch\distributed\rendezvous.py", line 257, in _env_rendezvous_handler
    store = _create_c10d_store(master_addr, master_port, rank, world_size, timeout)
  File "C:\Users\huklab\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\torch\distributed\rendezvous.py", line 188, in _create_c10d_store
    return TCPStore(
RuntimeError: The server socket has failed to listen on any local network address. The server socket has failed to bind to [DESKTOP-NMQ53KV]:50425 (system error: 10048 - Only one usage of each socket address (protocol/network address/port) is normally permitted.). The server socket has failed to bind to DESKTOP-NMQ53KV:50425 (system error: 10013 - An attempt was made to access a socket in a way forbidden by its access permissions.).
Does anyone have any idea what is happening and how to fix it?
Thanks a lot for your help!
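For reference, the traceback shows multiprocessing.spawn re-importing run.py, which on Windows usually means the trainer call needs to sit behind an import guard. A minimal sketch of that guard (assuming run.py builds runner, experiment, and data as in the PyTorch-VAE repo; this is one common cause of the port conflict, not a confirmed fix):

# Sketch: guard the entry point so spawn-based workers re-importing run.py
# do not start a second trainer that races for the same rendezvous port.
if __name__ == "__main__":
    runner.fit(experiment, datamodule=data)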

Apache Airflow: After pointing airflow.cfg to postgres, it still tries to run on MySQL

I'm using Apache-airflow2. My DAGs were running smoothly on LocalExecutor up until now. Now I want to scale up and use CeleryExecutor (I'm still doing it on my local Mac). I've configured it to run on CeleryExecutor, and when the server starts, the log shows CeleryExecutor. But whenever I run airflow worker (so the same machine can be used as a worker), or airflow flower, I run into an error where it tries to connect to MySQL and fails because it can't find the MySQLdb module.
I've configured RabbitMQ locally and Airflow in a virtualenv. Find the updated lines in airflow.cfg below:
broker_url = amqp://myuser:mypassword@localhost/myvhost
result_backend = db+postgresql://localhost:5433/celery_space?user=celery_user&password=celery_user
sql_alchemy_conn = postgresql://localhost:5433/postgres?user=postgres&password=root
executor = CeleryExecutor
Please find the worker error below:
 -------------- celery@superadmins-MacBook-Pro.local v4.2.1 (windowlicker)
---- **** -----
--- * ***  * -- Darwin-18.2.0-x86_64-i386-64bit 2018-12-30 09:15:00
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app:         airflow.executors.celery_executor:0x10c682fd0
- ** ---------- .> transport:   sqla+mysql://airflow:airflow@localhost:3306/airflow
- ** ---------- .> results:     mysql://airflow:**@localhost:3306/airflow
- *** --- * --- .> concurrency: 16 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
.> default exchange=default(direct) key=default

[tasks]
  . airflow.executors.celery_executor.execute_command

[2018-12-30 09:15:00,290: INFO/MainProcess] Connected to sqla+mysql://airflow:airflow@localhost:3306/airflow
[2018-12-30 09:15:00,304: CRITICAL/MainProcess] Unrecoverable error: ModuleNotFoundError("No module named 'MySQLdb'")
[2018-12-30 09:15:00,304: CRITICAL/MainProcess] Unrecoverable error: ModuleNotFoundError("No module named 'MySQLdb'")
Traceback (most recent call last):
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/celery/worker/worker.py", line 205, in start
    self.blueprint.start(self)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/celery/bootsteps.py", line 369, in start
    return self.obj.start()
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/celery/worker/consumer/consumer.py", line 317, in start
    blueprint.start(self)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/celery/worker/consumer/tasks.py", line 41, in start
    c.connection, on_decode_error=c.on_decode_error,
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/celery/app/amqp.py", line 297, in TaskConsumer
    **kw
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/messaging.py", line 386, in __init__
    self.revive(self.channel)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/messaging.py", line 408, in revive
    self.declare()
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/messaging.py", line 421, in declare
    queue.declare()
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/entity.py", line 608, in declare
    self._create_queue(nowait=nowait, channel=channel)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/entity.py", line 617, in _create_queue
    self.queue_declare(nowait=nowait, passive=False, channel=channel)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/entity.py", line 652, in queue_declare
    nowait=nowait,
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/transport/virtual/base.py", line 531, in queue_declare
    self._new_queue(queue, **kwargs)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 82, in _new_queue
    self._get_or_create(queue)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 70, in _get_or_create
    obj = self.session.query(self.queue_cls) \
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 65, in session
    _, Session = self._open()
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 56, in _open
    engine = self._engine_from_config()
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 51, in _engine_from_config
    return create_engine(conninfo.hostname, **transport_options)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/sqlalchemy/engine/__init__.py", line 425, in create_engine
    return strategy.create(*args, **kwargs)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 81, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/Users/deepaksaroha/Desktop/apache_2.0/nb-atom-airflow/lib/python3.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 102, in dbapi
    return __import__('MySQLdb')
ModuleNotFoundError: No module named 'MySQLdb'
[2018-12-30 09:15:00,599] {__init__.py:51} INFO - Using executor SequentialExecutor
Starting flask
Please let me know how the worker can be run in sync with the airflow.cfg settings. I'd appreciate any help; let me know if any further logs or configuration files are needed.
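One thing worth verifying here (a diagnostic sketch, not from the original post; it assumes the stock AIRFLOW_HOME resolution and the Airflow section names shown in the question) is which airflow.cfg the worker process actually reads, since falling back to the sqla+mysql defaults suggests the worker isn't picking up the edited file:

# Sketch: print which airflow.cfg this environment resolves and what it contains.
import os
from configparser import ConfigParser

cfg = os.path.join(os.environ.get("AIRFLOW_HOME", os.path.expanduser("~/airflow")),
                   "airflow.cfg")
parser = ConfigParser(interpolation=None)  # airflow.cfg values can contain '%'
parser.read(cfg)

print("config file:", cfg)
print("executor:   ", parser.get("core", "executor", fallback="<unset>"))
print("broker_url: ", parser.get("celery", "broker_url", fallback="<unset>"))

If the worker shell resolves a different AIRFLOW_HOME (or a stale config), the output will show defaults rather than the RabbitMQ/Postgres values above.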

airflow worker not reaching sqs

I am running the airflow worker service. The service is not able to connect to SQS.
The scheduler is able to reach and write to the queue.
Environment:
Amazon Linux
Python 3.5.4
Airflow 1.10.1
Celery 4.1.1
Proxies are fine; I have tried this with both Python 2.7 and 3.5, same issue.
I have set the celery_transport_options for the region.
Airflow is configured with CeleryExecutor, an SQS broker, and a Postgres database for the backend.
Partial log:
[2018-08-30 15:43:58,779: CRITICAL/MainProcess] Unrecoverable error: Exception('Request Empty body HTTP 599 Failed to connect to eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)',)
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/celery/worker/worker.py", line 207, in start
    self.blueprint.start(self)
  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 370, in start
    return self.obj.start()
  File "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", line 316, in start
    blueprint.start(self)
  File "/usr/local/lib/python3.5/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/usr/local/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", line 592, in start
    c.loop(*c.loop_args())
  File "/usr/local/lib/python3.5/site-packages/celery/worker/loops.py", line 91, in asynloop
    next(loop)
  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/hub.py", line 354, in create_loop
    cb(*cbargs)
  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", line 114, in on_writable
    return self._on_event(fd, _pycurl.CSELECT_OUT)
  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", line 124, in _on_event
    self._process_pending_requests()
  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", line 132, in _process_pending_requests
    self._process(curl, errno, reason)
  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/http/curl.py", line 178, in _process
    buffer=buffer, effective_url=effective_url, error=error,
  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 150, in __call__
    svpending(*ca, **ck)
  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in __call__
    return self.throw()
  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in __call__
    retval = fun(*final_args, **final_kwargs)
  File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 100, in _transback
    return callback(ret)
  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 143, in __call__
    return self.throw()
  File "/usr/local/lib/python3.5/site-packages/vine/promises.py", line 140, in __call__
    retval = fun(*final_args, **final_kwargs)
  File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 98, in _transback
    callback.throw()
  File "/usr/local/lib/python3.5/site-packages/vine/funtools.py", line 96, in _transback
    ret = filter_(*args + (ret,), **kwargs)
  File "/usr/local/lib/python3.5/site-packages/kombu/asynchronous/aws/connection.py", line 233, in _on_list_ready
    raise self._for_status(response, response.read())
Exception: Request Empty body HTTP 599 Failed to connect to eu-west-1.queue.amazonaws.com port 443: Connection timed out (None)
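A quick way to separate networking from Celery here is to hit SQS directly from the worker host. A minimal sketch, assuming boto3 is installed and using the eu-west-1 region from the log (credentials and proxy settings are taken from the environment):

# Sketch: verify the worker host can reach SQS at all, outside of Celery/kombu.
import boto3

sqs = boto3.client("sqs", region_name="eu-west-1")
print(sqs.list_queues())  # hangs/times out if port 443 to SQS is blocked

If this also times out, the problem is host-level networking (security groups, proxy, routing), not the Airflow/Celery configuration.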

parallel-python error: RuntimeError("Socket connection is broken")

I am using a simple program to send a function:
import pp

nodes = ('mosura02', 'mosura03', 'mosura04', 'mosura05', 'mosura06',
         'mosura09', 'mosura10', 'mosura11', 'mosura12')
nodes = ('miner:60001',)

def pptester():
    js = pp.Server(ppservers=nodes)
    js.set_ncpus(0)
    tmp = []
    for i in range(200):
        tmp.append(js.submit(ppworktest, (), (), ('os',)))
    return tmp

def ppworktest():
    import os
    return os.system("uname -a")
The result is:
wkerzend@mosura:/home/wkerzend/tmp/ppython_test> ssh miner "source ~/coala_python_setup.sh; ppserver.py -d -p 60001"
2010-04-12 00:50:48,162 - pp - INFO - Creating server instance (pp-1.6.0)
2010-04-12 00:50:52,732 - pp - INFO - pp local server started with 32 workers
2010-04-12 00:50:52,732 - pp - DEBUG - Strarting network server interface=0.0.0.0 port=60001
Exception in thread client_socket:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/wkerzend/python_coala/bin/ppserver.py", line 161, in crun
    ctype = mysocket.receive()
  File "/home/wkerzend/python_coala/lib/python2.6/site-packages/pptransport.py", line 178, in receive
    raise RuntimeError("Socket connection is broken")
RuntimeError: Socket connection is broken