Celery worker exited prematurely on restart using systemd - celery

I'm using celery with systemd. I noticed that most times on restart, I lose the workers mid-task. From the celery multi documentation, it seems like the celery multi stopwait should be waiting for the tasks to finish.
Got the following error on restart:
Process "ForkPoolWorker-10" pid:16902 exited with "signal 15 (SIGTERM)"
celery.conf
[Unit]
Description=Celery background worker
After=network.target
[Service]
Type=forking
User=celery
Group=celery
WorkingDirectory=/src
ExecStart=celery multi start worker -A main.celery -Q celery --logfile=/data/celery.log --loglevel=info --concurrency=10 --pidfile=/var/run/celery/%%n.pid
ExecStop=celery multi stopwait worker --pidfile=/var/run/celery/%%n.pid
[Install]
WantedBy=multi-user.target
I also read in the systemd documentation that it should wait at least 90 seconds for the task to complete before sending SIGTERM. Yet I get this error less than 10 seconds after running the restart command.
What am I doing wrong?
Using celery version: 5.2.2 (dawn-chorus)

Related

Not able to start Kafka-Connect as a service on CentOS 7

I have a Kafka environment (Zookeeper + Kafka Server + Kafka-Connect) which runs perfectly when I use command line to start each individual components on CentOS 7.
Now I am setting up these Kafka components to run as services. For this I have created .service files and placed them in the /etc/systemd/system folder. The files are as follows:
zookeeper.service
#!/bin/bash
# vi /etc/systemd/system/zookeeper.service
[Unit]
Description=This service will start Zookeeper server which will be used by Kafka Server.
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
ExecStart=/opt/interactcrm/kafka_2.11-1.0.1/bin/zookeeper-server-start.sh /opt/interactcrm/kafka_2.11-1.0.1/config/zookeeper.properties
ExecStop=/opt/interactcrm/kafka_2.11-1.0.1/bin/zookeeper-server-stop.sh
TimeoutStartSec=0
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
kafka.service
#!/bin/bash
# vi /etc/systemd/system/kafka.service
[Unit]
Description=This service will start Kafka server.
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
ExecStart=/opt/interactcrm/kafka_2.11-1.0.1/bin/kafka-server-start.sh /opt/interactcrm/kafka_2.11-1.0.1/config/server.properties
ExecStop=/opt/interactcrm/kafka_2.11-1.0.1/bin/kafka-server-stop.sh
TimeoutStartSec=0
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
Kafka-connect.service
#!/bin/bash
# vi /etc/systemd/system/kafkaconnect.service
[Unit]
Description=This service will start Kafka Connect Service.
Requires=network.target remote-fs.target nss-lookup.target kafka.service kafka.service
After=network.target remote-fs.target nss-lookup.target kafka.service
[Service]
Type=forking
Environment="KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=10040 -Dcom.sun.management.jmxremote.local.only=true -Dcom.sun.management.jmxremote.authenticate=false"
Environment="LOG_DIR=/var/log/kafka-logs"
ExecStart=/opt/interactcrm/kafka_2.11-1.0.1/bin/connect-distributed.sh /opt/interactcrm/kafka_2.11-1.0.1/config/connect-distributed.properties
TimeoutStartSec=1000
#Restart=on-abnormal
#SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
The Zookeeper and Kafka services start without any issue. I can create topics and then perform operations on them. The issue is with the Kafka Connect service.
When I try to start the service using the systemctl command, the service does not start. It gets stuck at the following log output:
Oct 19 18:29:20 localhost.localdomain connect-distributed.sh[1071]: [2018-10-19 18:29:20,713] INFO Added plugin 'io.debezium.connector.mysql.MySqlConnector...er:136)
Oct 19 18:29:20 localhost.localdomain connect-distributed.sh[1071]: [2018-10-19 18:29:20,713] INFO Added plugin 'io.debezium.transforms.ByLogicalTableRoute...er:136)
Oct 19 18:29:20 localhost.localdomain connect-distributed.sh[1071]: [2018-10-19 18:29:20,713] INFO Added plugin 'io.debezium.transforms.UnwrapFromEnvelope'...er:136)
Oct 19 18:29:20 localhost.localdomain connect-distributed.sh[1071]: [2018-10-19 18:29:20,761] INFO Loading plugin from: /opt/interactcrm/debezium/debezium ...er:184)
Oct 19 18:29:28 localhost.localdomain connect-distributed.sh[1071]: [2018-10-19 18:29:28,725] INFO Registered loader: PluginClassLoader{pluginLocation=file...er:207)
I cannot find any log entries for this process in the message logs after this line, and there are no errors in any other logs. The process gets stuck on this line:
INFO Registered loader: PluginClassLoader{pluginLocation=file...er:207)
No matter how much I increase the timeout this process never starts. But when I run the same command from command line, the service starts properly.
I have tried removing all connectors from the plugin path to see if the service starts, but it gets stuck on the same line.
The following is my reference point:
Kafka-Connect Service
I faced the same problem on Debian 9. I figured out it was because the service needs a WorkingDirectory; otherwise Kafka Connect never fully loads.
So your service should look like this:
#!/bin/bash
# vi /etc/systemd/system/kafkaconnect.service
[Unit]
Description=This service will start Kafka Connect Service.
Requires=network.target remote-fs.target nss-lookup.target kafka.service kafka.service
After=network.target remote-fs.target nss-lookup.target kafka.service
[Service]
Type=forking
Environment="KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=10040 -Dcom.sun.management.jmxremote.local.only=true -Dcom.sun.management.jmxremote.authenticate=false"
Environment="LOG_DIR=/var/log/kafka-logs"
# Or whatever directory you want to use
WorkingDirectory=/opt/interactcrm/kafka_2.11-1.0.1
ExecStart=/opt/interactcrm/kafka_2.11-1.0.1/bin/connect-distributed.sh /opt/interactcrm/kafka_2.11-1.0.1/config/connect-distributed.properties
TimeoutStartSec=1000
#Restart=on-abnormal
#SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
The configuration below worked for me on Ubuntu:
[Unit]
Requires=kafka.service
After=kafka.service
[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/kafka/bin/connect-distributed.sh /home/kafka/kafka/config/connect-distributed.properties > /home/kafka/kafka/kafka_connect.log 2>&1'
Restart=on-abnormal
[Install]
WantedBy=multi-user.target

Supervisord (exit status 2; not expected) ubuntu

I'm trying to run Celery with Supervisord on Ubuntu, but am getting:
INFO exited: celery (exit status 2; not expected)
INFO spawned: 'celery' with pid 15517
INFO gave up: celery entered FATAL state, too many start retries too quickly
This is the Supervisord script:
cd into the directory and activate the virtual environment
celery -A [APP_NAME].celery worker -E -l info --concurrency=2
If I run this script manually, Celery starts up without any issues. But running sudo supervisorctl start celery fails with the error messages above.
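One thing that may be relevant here, offered as a hedged note rather than a confirmed diagnosis: supervisord executes the command setting directly instead of through a shell, so "cd" and virtualenv activation cannot be separate steps in front of it. The usual substitutes are the directory setting plus an absolute path to the celery binary inside the virtualenv. The Python sketch below just generates such a [program:celery] section; the paths /srv/myproject and /srv/myproject/venv/bin/celery are hypothetical placeholders, not taken from the post, and [APP_NAME] is left as in the original.

```python
import configparser
import io

# Hypothetical locations; substitute the real project and virtualenv paths.
PROJECT_DIR = "/srv/myproject"
VENV_CELERY = "/srv/myproject/venv/bin/celery"


def build_program_section() -> str:
    """Return the text of a supervisord [program:celery] section.

    'directory' stands in for the manual 'cd', and the absolute path to
    the virtualenv's celery binary stands in for activating the virtualenv.
    """
    config = configparser.ConfigParser(interpolation=None)
    config["program:celery"] = {
        "directory": PROJECT_DIR,
        "command": (
            f"{VENV_CELERY} -A [APP_NAME].celery worker "
            "-E -l info --concurrency=2"
        ),
    }
    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_program_section())
```

The printed section would go into the supervisord configuration in place of the two-step script.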

How to enable services in beaglebone black?

[Unit]
Description=Splash screen
DefaultDependencies=no
[Service]
Type=oneshot
ExecStart=/usr/local/bin/psplash
[Install]
WantedBy=basic.target
job for .service failed because the control process exited with an error code
Here is a unit file that runs Python code as a service.
It starts the script at boot:
[Unit]
Description= Python First Service
After=multi-user.target
[Service]
Type=simple
ExecStart=/usr/bin/python /home/debian/serv_demo.py
Restart=on-abort
[Install]
WantedBy=multi-user.target
I followed this example and it worked well for my BBB:
https://gist.github.com/tstellanova/7323116

Error daemonizing celery in ubuntu 16.04

I am trying to daemonize the celery worker for my django application. But I am facing the following error on checking celery status:
Starting Celery Service...
celery.service: Control process exited, code=exited status=127
Failed to start Celery Service.
celery.service: Unit entered failed state.
celery.service: Failed with result 'exit-code'.
The contents of the /etc/default/celeryd file:
CELERYD_NODES="worker1 worker2 worker3"
CELERY_BIN="/usr/local/bin/celery"
CELERY_APP="djangoapp"
CELERYD_CHDIR="/home/djangoapp/"
CELERYD_OPTS="--time-limit=300 --concurrency=4"
CELERYD_LOG_LEVEL="INFO"
CELERYD_LOG_FILE="/var/celery/log/%n%I.log"
CELERYD_PID_FILE="/var/celery/run/%n.pid"
CELERYD_USER="nobody"
CELERYD_GROUP="www-data"
CELERY_CREATE_DIRS=1
The contents of /etc/systemd/system/celery.service:
[Unit]
Description=Celery Service
After=network.target
[Service]
Type=forking
User=nobody
Group=www-data
EnvironmentFile=-/etc/conf.d/celery
WorkingDirectory=/home/celery
ExecStart=/bin/sh -c '${CELERY_BIN} multi start ${CELERYD_NODES} \
-A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} \
--logfile=${CELERYD_LOG_FILE} --loglevel=${CELERYD_LOG_LEVEL} ${CELERYD_OPTS}'
ExecStop=/bin/sh -c '${CELERY_BIN} multi stopwait ${CELERYD_NODES} \
--pidfile=${CELERYD_PID_FILE}'
ExecReload=/bin/sh -c '${CELERY_BIN} multi restart ${CELERYD_NODES} \
-A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} \
--logfile=${CELERYD_LOG_FILE} --loglevel=${CELERYD_LOG_LEVEL} ${CELERYD_OPTS}'
[Install]
WantedBy=multi-user.target
I have been trying to solve this for the last hour without success.
Could someone please explain the reason behind the error and how to fix it?
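A hedged observation on the log above: exit status 127 is what a POSIX shell conventionally returns when it cannot find the command it was asked to run. The unit loads EnvironmentFile=-/etc/conf.d/celery, while the settings shown live in /etc/default/celeryd, and the leading "-" tells systemd to ignore a missing file silently. If /etc/conf.d/celery does not exist, ${CELERY_BIN} would expand to nothing and the shell would try to run "multi" as the command. Whether that is what happens on this machine is an assumption. The Python sketch below only reproduces the 127 status itself, using a made-up command name that is not part of the original post.

```python
import subprocess

# Hypothetical command name, chosen so that it should not exist anywhere.
MISSING_COMMAND = "no-such-celery-binary-for-demo"


def shell_exit_status(command_line: str) -> int:
    """Run command_line through /bin/sh -c, as the unit's ExecStart does."""
    result = subprocess.run(
        ["/bin/sh", "-c", command_line],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # The shell cannot find the command, so it reports status 127.
    status = shell_exit_status(f"{MISSING_COMMAND} multi start worker1")
    print(f"exit status: {status}")
```

If the status matches, checking that the EnvironmentFile path points at the file that actually holds CELERY_BIN would be a reasonable next step.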

Supervisord does not start killed processes

I have supervisord installed on my Ubuntu 10.04 machine. It runs a Java process continuously and is supposed to heal (reload) the process when it dies or crashes.
In htop I send SIGKILL, SIGTERM, SIGHUP, and SIGSEGV signals to that Java process and watch the /etc/logs/supervisord.log file, which says:
08:09:46,182 INFO success: myprogram entered RUNNING state,[...]
08:38:10,043 INFO exited: myprogram (exit status 0; expected)
At 08:38 I killed the process with SIGSEGV. How come it exited with code 0, and why doesn't supervisord restart it at all?
Everything in my supervisord.conf for this specific program is as follows:
[program:play-9000]
command=play run /var/www/myprogram/ --%%prod
stderr_logfile = /var/log/supervisord/myprogram-stderr.log
stdout_logfile = /var/log/supervisord/myprogram-stdout.log
The process works fine when I launch supervisord, but it does not get healed.
By the way, any ideas on how to start supervisord as a service so that it launches automatically when the whole system reboots?
Try setting autorestart=true. By default, autorestart is set to "unexpected" which means it will only restart a process if it exits with an unexpected exit code. By default, exit code 0 is expected.
http://supervisord.org/configuration.html#program-x-section-settings
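To make the three autorestart values concrete, here is a small Python model of the decision described above. It is a simplified sketch, not supervisord's actual code: the function name should_restart and its parameters are made up for illustration, and expected_codes stands in for the exitcodes setting, assumed here to contain only 0.

```python
def should_restart(autorestart: str, exit_code: int, expected_codes=(0,)) -> bool:
    """Simplified model of the autorestart decision for an exited process.

    autorestart    -- "true", "false", or "unexpected"
    exit_code      -- the exit code the process finished with
    expected_codes -- codes treated as expected (stand-in for 'exitcodes')
    """
    if autorestart == "true":
        # Restart unconditionally, whatever the exit code was.
        return True
    if autorestart == "false":
        # Never restart automatically.
        return False
    if autorestart == "unexpected":
        # Restart only when the exit code is not in the expected list.
        return exit_code not in expected_codes
    raise ValueError(f"unknown autorestart value: {autorestart!r}")


if __name__ == "__main__":
    for mode in ("unexpected", "true", "false"):
        print(mode, should_restart(mode, exit_code=0))
```

With the log line from the question (exit status 0; expected), the "unexpected" mode yields no restart, while "true" does restart.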
You can use the chkconfig program to make sure that supervisor starts on reboot.
$ sudo apt-get install chkconfig
$ chkconfig -l supervisor
supervisor 0:off 1:off 2:on 3:on 4:on 5:on 6:off
You can see that it's enabled for runlevels 2-5 by default when I installed it.
For more info on run levels, see:
$ man 7 runlevel