Supervisord kills/reaps my cron jobs. The cron daemon is crond from BusyBox on Alpine.
2019-08-05 11:31:00,816 INFO reaped unknown pid 384
The process shows up in ps as [scraper], where scraper is my bash script.
Crontab looks like this:
*/5 * * * * export DM_TIMEOUT=20 && /opt/scripts/scraper >/dev/null 2>&1
This is my supervisord config.
[supervisord]
nodaemon=true
user=root
[program:darkhttpd]
command=darkhttpd /opt/scraper/ --port 7000 --default-mimetype text/plain
user=root
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes = 0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
[program:scraper]
command=crond -f
user=root
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes = 0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
I would like to: A) keep the cron job from being killed, and B) kill the cron job if it takes more than X seconds.
Thanks
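A hedged note on both goals. As far as I know, "reaped unknown pid" is an informational message: supervisord logs it when it collects an exited process that it did not start itself, so on its own it does not show that the job was killed. For B), one option is to wrap the job in timeout, which GNU coreutils and BusyBox both ship. The sketch below is only an illustration; run_limited is a hypothetical helper name and the 1-second limit is chosen so the demo finishes quickly.

```shell
#!/bin/sh
# run_limited is a hypothetical helper: run a command, but stop it after LIMIT seconds.
run_limited() {
    limit=$1
    shift
    # timeout sends SIGTERM to the command once the limit is reached.
    timeout "$limit" "$@"
}

# Demo: a 5-second sleep is cut off after 1 second.
run_limited 1 sleep 5
status=$?
# GNU timeout reports 124 on expiry; other implementations may use another non-zero value.
echo "exit status after timeout: $status"
```

In the crontab this would look something like `*/5 * * * * DM_TIMEOUT=20 timeout 60 /opt/scripts/scraper >/dev/null 2>&1`, where 60 is an assumed limit. I believe older BusyBox releases expect `timeout -t 60 ...` instead, so check `timeout --help` on your image.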
I have configured Pgbouncer as a sidecar pattern in one of my pods in Azure Kubernetes based on Azure Oss Db Tools Pgbouncer Sidecar documentation. It has the following container lifecycle hook:
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "killall -INT pgbouncer && sleep 120"]
I believe the intended purpose of this command is to wait 120 seconds until any running query finishes.
To understand what it does, I opened two interactive shells inside the Pgbouncer container. In the first shell, I executed the killall command and in the second shell, I executed ps command multiple times.
First shell:
/ $ ps
PID USER TIME COMMAND
1 postgres 0:00 /usr/bin/pgbouncer /etc/pgbouncer/pgbouncer.ini
6 postgres 0:00 /bin/sh
30 postgres 0:00 ps
/ $
/ $
/ $ killall -INT pgbouncer && sleep 120
Second shell:
/ $ ps
PID USER TIME COMMAND
1 postgres 0:00 /usr/bin/pgbouncer /etc/pgbouncer/pgbouncer.ini
6 postgres 0:00 /bin/sh
33 postgres 0:00 /bin/sh
40 postgres 0:00 sleep 120
41 postgres 0:00 ps
/ $
/ $
/ $ ps
PID USER TIME COMMAND
1 postgres 0:00 /usr/bin/pgbouncer /etc/pgbouncer/pgbouncer.ini
6 postgres 0:00 /bin/sh
33 postgres 0:00 /bin/sh
42 postgres 0:00 ps
After 120 seconds, the Pgbouncer main process is still running (see the output from the second shell). I expected this to terminate both of my terminal sessions, since it was supposed to kill the Pgbouncer process (PID 1) and stop the container.
If I try to kill using the below command:
/ $ kill 1
/ $ command terminated with exit code 137
I see that both my terminal sessions are immediately terminated and the container is stopped.
I want to understand whether we really need this lifecycle hook, since it does not seem to work properly. Or did I misunderstand what it does?
Thanks for the help! 🙏
There is a difference here: killall -INT sends the INT signal, while kill sends the TERM signal when no signal is specified. You can try again with kill -INT 1 to see whether you get the same behavior. I think the pgbouncer process also catches INT, but handles it differently.
Here is the relevant part of the Pgbouncer source for reference:
int cf_shutdown; /* 1 - wait for queries to finish, 2 - shutdown immediately */
...
static void handle_sigterm(evutil_socket_t sock, short flags, void *arg)
{
log_info("got SIGTERM, fast exit");
/* pidfile cleanup happens via atexit() */
exit(1);
}
static void handle_sigint(evutil_socket_t sock, short flags, void *arg)
{
log_info("got SIGINT, shutting down");
sd_notify(0, "STOPPING=1");
if (cf_reboot)
die("takeover was in progress, going down immediately");
if (cf_pause_mode == P_SUSPEND)
die("suspend was in progress, going down immediately");
cf_pause_mode = P_PAUSE;
cf_shutdown = 1;
}
When it receives INT, Pgbouncer pauses, waits for running queries to finish, and then shuts down (cf_shutdown = 1), while TERM terminates the process immediately.
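To make the signal difference concrete, here is a small sketch that maps the numbers involved to signal names. It relies only on the shell convention that an exit status above 128 means "ended by signal (status - 128)"; signal_from_status is a hypothetical helper name. Reading the 137 in the question as SIGKILL follows from that convention, but which process received it is not something the output confirms.

```shell
#!/bin/sh
# signal_from_status is a hypothetical helper: decode a shell exit status
# into the name of the signal that ended the process, if any.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        kill -l "$((status - 128))"
    else
        echo "none"
    fi
}

echo "killall -INT sends signal 2:  $(kill -l 2)"
echo "plain kill sends signal 15:   $(kill -l 15)"
echo "exit code 137 corresponds to: $(signal_from_status 137)"
```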
I am trying to create a systemd service that runs a shell script, so that I can later convert all my crontab commands to systemd, but I always get this error. I have tried different tutorials and hit the same problem.
# systemctl status hello-world.service
● hello-world.service - Hello World Service
Loaded: loaded (/usr/lib/systemd/system/hello-world.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since mié 2019-10-09 10:06:59 CEST; 4s ago
Process: 26080 ExecStart=/usr/share/nginx/html/scripts-systemd/hello-world.sh (code=exited, status=203/EXEC)
Main PID: 26080 (code=exited, status=203/EXEC)
oct 09 10:06:59 ns37 systemd[1]: Started Hello World Service.
oct 09 10:06:59 ns37 systemd[1]: hello-world.service: main process exited, code=exited, status=203/EXEC
oct 09 10:06:59 ns37 systemd[1]: Unit hello-world.service entered failed state.
oct 09 10:06:59 ns37 systemd[1]: hello-world.service failed.
hello-world.sh file
#!/bin/bash
while $(sleep 30);
do
echo "hello world"
done
hello-world.service file
[Unit]
Description=Hello World Service
After=systemd-user-sessions.service
[Service]
Type=simple
ExecStart=/usr/share/nginx/html/scripts-systemd/hello-world.sh
[Install]
WantedBy=multi-user.target
I'm using CentOS 7.
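For what it's worth, status=203/EXEC is what systemd reports when it cannot execute the ExecStart= program at all. Typical causes are a wrong path, a missing executable bit, or a shebang line that points at an interpreter that does not exist (Windows line endings cause the same symptom). A rough sketch for checking those by hand follows; check_exec is a hypothetical helper, not a systemd tool, and it does not cover other causes such as SELinux labels.

```shell
#!/bin/sh
# check_exec is a hypothetical helper: report the usual reasons a script cannot be executed.
check_exec() {
    path=$1
    if [ ! -f "$path" ]; then
        echo "missing file: $path"
        return 1
    fi
    if [ ! -x "$path" ]; then
        echo "not executable: $path (chmod +x may help)"
        return 1
    fi
    # The first line must be a shebang such as "#!/bin/bash".
    IFS= read -r first_line < "$path"
    case $first_line in
        '#!'*) ;;
        *) echo "no shebang on the first line: $path"
           return 1 ;;
    esac
    # Drop the "#!" prefix and keep the first word, which is the interpreter path.
    # A trailing carriage return stays attached, so CRLF files fail the next check.
    interp=$(printf '%s\n' "${first_line#??}" | awk '{print $1}')
    if [ ! -x "$interp" ]; then
        echo "interpreter not found or not executable: $interp"
        return 1
    fi
    echo "looks executable: $path"
}
```

Usage would be `check_exec /usr/share/nginx/html/scripts-systemd/hello-world.sh`, run as the same user the service runs as.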
Edit:
What I need to do is execute console commands at certain times every day due to problems with crontab.
I was using this example to check that everything works and once it works change the commands.
Here is an example of a crontab command:
*/10 * * * * cd /usr/share/nginx/html/mywebsite.com; php wp-cron.php >/dev/null 2>&1
0 0 */3 * * date=$(date -I); zip -r /root/copias/copia-archivos-html-webs$date.zip /usr/share/nginx/html
15 15 * * * wget -q -O /dev/null https://mywebsite.com/?run_plugin=key_0_0
Edit2: Done! I've managed to get it working for now. I'm leaving the code here so it can be useful to other people.
hello-world.sh file
#!/usr/bin/env bash
/usr/bin/mysqldump -user -pass db_name >/root/copias/backupname.sql
hello-world.service
[Unit]
Description=CopiaSql
[Service]
Type=oneshot
ExecStart=/bin/bash /usr/share/nginx/html/scripts-systemd/hello-world.sh
[Install]
WantedBy=multi-user.target
hello-world.timer
[Unit]
Description=Runs every 2 minutes test.sh
[Timer]
OnCalendar=*:0/2
Unit=hello-world.service
[Install]
WantedBy=timers.target
Thanks everyone for your help!
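For anyone converting the other crontab entries the same way, here is a sketch that writes a service/timer pair for the */10 wp-cron line from the question. The unit names (wp-cron.service, wp-cron.timer) and the /usr/bin/php path are assumptions for illustration. It writes into a temporary directory so it is safe to try; nothing is installed.

```shell
#!/bin/sh
# Write a service/timer pair roughly equivalent to:
#   */10 * * * * cd /usr/share/nginx/html/mywebsite.com; php wp-cron.php
# Unit names and the php path are assumptions; adjust them for your system.
write_units() {
    dir=$1
    cat > "$dir/wp-cron.service" <<'EOF'
[Unit]
Description=Run wp-cron.php once

[Service]
Type=oneshot
WorkingDirectory=/usr/share/nginx/html/mywebsite.com
ExecStart=/usr/bin/php wp-cron.php
EOF
    cat > "$dir/wp-cron.timer" <<'EOF'
[Unit]
Description=Run wp-cron.php every 10 minutes

[Timer]
OnCalendar=*:0/10
Unit=wp-cron.service

[Install]
WantedBy=timers.target
EOF
}

out_dir=$(mktemp -d)
write_units "$out_dir"
echo "units written to $out_dir"
```

After copying the two files to /etc/systemd/system, run `systemctl daemon-reload`, then `systemctl enable wp-cron.timer` and `systemctl start wp-cron.timer`. It is the timer that gets enabled, not the service. The daily 15:15 job would use `OnCalendar=*-*-* 15:15:00` in the same way.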
I have a first boot install service that displays on the console and it looks like:
[Unit]
After=multi-user.target
# tty getty service login prompts for tty1 & tty6
# will not be seen until this install completes.
Before=getty@tty1.service getty@tty6.service
[Service]
Type=oneshot
ExecStart=/bin/bash -c "export TERM=vt100;/var/ssi/firstboot_install.sh"
StandardOutput=tty
StandardInput=tty
[Install]
WantedBy=multi-user.target
The script it runs also starts with this code:
#---------------------------------------------------------------------
# Switch to tty6 so input is allowed from installation questions
# Change back to tty1 at end of this script to show normal booting
# messages from systemd.
#---------------------------------------------------------------------
exec < /dev/tty6 > /dev/tty6
chvt 6
At the end of the script I change it back:
# Now that the system has been registered and has a few channels added,
# I can have the installation go back to the main Anaconda output screen
# on tty1
chvt 1
exec < /dev/tty1 > /dev/tty1
exit 0
This might not be exactly what you want, but you can adapt it to your needs. The goal here is to display something on the console during the boot sequence. My script asks a number of installation questions, and input is NOT allowed on tty1 (the console), which is why I switch to tty6 so that input is possible during the first-boot installation.
For your script, try:
#!/bin/bash
exec < /dev/tty6 > /dev/tty6
chvt 6
while $(sleep 30);
do
echo "hello world"
done
chvt 1
exec < /dev/tty1 > /dev/tty1
This might be overkill for what you're trying to do, but if you need input from the console, you should do the same with tty6.
Background
I am trying to automate restarting the mongos process used in a MongoDB sharded setup after a crash or reboot.
Case 1 : using direct command, with mongod user
supervisord config
[program:mongos_router]
command=/usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid
user=mongod
autostart=true
autorestart=true
startretries=10
Result
supervisord log
INFO spawned: 'mongos_router' with pid 19535
INFO exited: mongos_router (exit status 0; not expected)
INFO gave up: mongos_router entered FATAL state, too many start retries too quickly
mongodb log
2018-05-01T21:08:23.745+0000 I SHARDING [Balancer] balancer id: ip-address:27017 started
2018-05-01T21:08:23.745+0000 E NETWORK [mongosMain] listen(): bind() failed errno:98 Address already in use for socket: 0.0.0.0:27017
2018-05-01T21:08:23.745+0000 E NETWORK [mongosMain] addr already in use
2018-05-01T21:08:23.745+0000 I - [mongosMain] Invariant failure inShutdown() src/mongo/db/auth/user_cache_invalidator_job.cpp 114
2018-05-01T21:08:23.745+0000 I - [mongosMain]
***aborting after invariant() failure
2018-05-01T21:08:23.748+0000 F - [mongosMain] Got signal: 6 (Aborted).
The process is seen running, but if it is killed it does not restart automatically.
Case 2 : Using init script
Here the slight change in the scenario is that some ulimit commands and the creation of pid files have to be done as root, and then the actual process should be started as the mongod user.
mongos script
start()
{
# Make sure the default pidfile directory exists
if [ ! -d $PID_PATH ]; then
install -d -m 0755 -o $MONGO_USER -g $MONGO_GROUP $PIDDIR
fi
# Make sure the pidfile does not exist
if [ -f $PID_FILE ]; then
echo "Error starting mongos. $PID_FILE exists."
RETVAL=1
return
fi
ulimit -f unlimited
ulimit -t unlimited
ulimit -v unlimited
ulimit -n 64000
ulimit -m unlimited
ulimit -u 64000
ulimit -l unlimited
echo -n $"Starting mongos: "
#daemon --user "$MONGO_USER" --pidfile $PID_FILE $MONGO_BIN $OPTIONS --pidfilepath=$PID_FILE
#su $MONGO_USER -c "$MONGO_BIN -f $CONFIGFILE --pidfilepath=$PID_FILE >> /home/mav/startup_log"
su - mongod -c "/usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid"
RETVAL=$?
echo -n "Return value : "$RETVAL
echo
[ $RETVAL -eq 0 ] && touch $MONGO_LOCK_FILE
}
The commented-out daemon command is from the original script, but daemonizing under supervisord does not make sense, so I am using the su command to run the process in the foreground(?).
supervisord config
[program:mongos_router_script]
command=/etc/init.d/mongos start
user=root
autostart=true
autorestart=true
startretries=10
Result
supervisord log
INFO spawned: 'mongos_router_script' with pid 20367
INFO exited: mongos_router_script (exit status 1; not expected)
INFO gave up: mongos_router_script entered FATAL state, too many start retries too quickly
mongodb log
Nothing indicating error, normal logs
The process is seen running, but if it is killed it does not restart automatically.
Problem
How do I correctly configure either the script or the no-script option for running mongos under supervisord?
EDIT 1
Modified Command
sudo su -c "/usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid" -s /bin/bash mongod
This works when run on its own on the command line as well as from the script, but not with supervisord.
EDIT 2
Added following option to config file for mongos to force it to run in the foreground
processManagement:
  fork: false # false keeps mongos in the foreground instead of forking to the background
Now the command line and the script both run it properly in the foreground, but supervisord fails to launch it. Also, three processes show up when it is run from the command line or the script:
root sudo su -c /usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid -s /bin/bash mongod
root su -c /usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid -s /bin/bash mongod
mongod /usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid
EDIT 3
With the following supervisord config things are working fine. But I would still like to execute the script, if possible, in order to set ulimit.
[program:mongos_router]
command=/usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid
user=mongod
autostart=true
autorestart=true
startretries=10
numprocs=1
For mongos to run in the foreground, set the following option:
# how the process runs
processManagement:
  fork: false # false keeps mongos in the foreground
With that and the supervisord.conf settings above, mongos is launched and stays under supervisord's control.
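On the ulimit part of the question, one possible approach is a small wrapper that sets the limit and then uses exec, so that the process supervisord started becomes mongos itself rather than a shell that has mongos as a child. The sketch below is only an illustration: run_with_nofile is a hypothetical name, and the demo uses a stand-in command with a low limit so it runs anywhere.

```shell
#!/bin/sh
# run_with_nofile is a hypothetical wrapper: set the open-file limit, then replace
# this shell with the target command so the supervisor tracks the real process.
run_with_nofile() {
    nofile=$1
    shift
    ulimit -n "$nofile" || return 1
    exec "$@"
}

# Demo with a stand-in command. The command substitution runs in a subshell,
# so exec replaces only that subshell and not this script.
# For mongos the call would be something like:
#   run_with_nofile 64000 /usr/bin/mongos -f /etc/mongos.conf
seen=$(run_with_nofile 256 sh -c 'ulimit -n')
echo "child sees open-file limit: $seen"
```

One caveat, stated as an assumption about your setup: with user=mongod, supervisord drops privileges before starting the command, so the wrapper may not be allowed to raise a limit above the inherited hard limit. For open files specifically, the minfds setting in the [supervisord] section may be the simpler route, since child processes inherit supervisord's limits.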
I have an issue with my celery deployment: when I restart it, the old subprocesses don't stop and continue to process some of the jobs. I use supervisord to run celery. Here is my config:
$ cat /etc/supervisor/conf.d/celery.conf
[program:celery]
; Full path to use virtualenv, honcho to load .env
command=/home/ubuntu/venv/bin/honcho run celery -A stargeo worker -l info --no-color
directory=/home/ubuntu/app
environment=PATH="/home/ubuntu/venv/bin:%(ENV_PATH)s"
user=ubuntu
numprocs=1
stdout_logfile=/home/ubuntu/logs/celery.log
stderr_logfile=/home/ubuntu/logs/celery.err
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
Here is how celery processes look:
$ ps axwu | grep celery
ubuntu 983 0.0 0.1 47692 10064 ? S 11:47 0:00 /home/ubuntu/venv/bin/python /home/ubuntu/venv/bin/honcho run celery -A stargeo worker -l info --no-color
ubuntu 984 0.0 0.0 4440 652 ? S 11:47 0:00 /bin/sh -c celery -A stargeo worker -l info --no-color
ubuntu 985 0.0 0.5 168720 41356 ? S 11:47 0:01 /home/ubuntu/venv/bin/python /home/ubuntu/venv/bin/celery -A stargeo worker -l info --no-color
ubuntu 990 0.0 0.4 167936 36648 ? S 11:47 0:00 /home/ubuntu/venv/bin/python /home/ubuntu/venv/bin/celery -A stargeo worker -l info --no-color
ubuntu 991 0.0 0.4 167936 36648 ? S 11:47 0:00 /home/ubuntu/venv/bin/python /home/ubuntu/venv/bin/celery -A stargeo worker -l info --no-color
When I run sudo supervisorctl restart celery, it only stops the first process (the python ... honcho one) and all the others continue. If I try to kill them they keep running (only kill -9 works).
This appeared to be a bug in honcho. I ended up with the workaround of starting this script from supervisor:
#!/bin/bash
source /home/ubuntu/venv/bin/activate
exec env $(cat .env | grep -v ^# | xargs) \
celery -A stargeo worker -l info --no-color
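My understanding of why this workaround helps (an interpretation, not something confirmed by honcho's authors): because of exec, the shell that supervisor started is replaced by celery, so the stop signal reaches celery directly instead of a wrapper process such as the /bin/sh -c one in the ps listing. The sketch below shows the difference using stand-in commands; with_exec and without_exec are hypothetical names.

```shell
#!/bin/sh
# Each function prints two PIDs: the wrapper shell's and that of the command it starts.
with_exec() {
    # exec replaces the wrapper, so the inner command keeps the wrapper's PID.
    sh -c 'echo $$; exec sh -c "echo \$\$"'
}
without_exec() {
    # The trailing ":" keeps the shell from turning its last command into an implicit exec.
    sh -c 'echo $$; sh -c "echo \$\$"; :'
}

echo "distinct PIDs with exec:    $(with_exec | sort -u | wc -l)"
echo "distinct PIDs without exec: $(without_exec | sort -u | wc -l)"
```

The first count should be one and the second two. Supervisor also has a stopasgroup=true program option, which sends the stop signal to the whole process group; that may be worth trying alongside killasgroup if a wrapper cannot be avoided.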
I tried to start supervisor but I'm getting an error. Can anyone help? Thanks
/etc/init.d/supervisord file.
SUPERVISORD=/usr/local/bin/supervisord
SUPERVISORCTL=/usr/local/bin/supervisorctl
case $1 in
start)
echo -n "Starting supervisord: "
$SUPERVISORD
echo
;;
stop)
echo -n "Stopping supervisord: "
$SUPERVISORCTL shutdown
echo
;;
restart)
echo -n "Stopping supervisord: "
$SUPERVISORCTL shutdown
echo
echo -n "Starting supervisord: "
$SUPERVISORD
echo
;;
esac
Then I ran these:
sudo chmod +x /etc/init.d/supervisord
sudo update-rc.d supervisord defaults
sudo /etc/init.d/supervisord start
And I got this:
Stopping supervisord: Shut down
Starting supervisord: /usr/local/lib/python2.7/dist-packages/supervisor/options.py:286: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
Error: Another program is already listening on a port that one of our HTTP servers is configured to use. Shut this program down first before starting supervisord.
For help, use /usr/local/bin/supervisord -h
Conf file (located at /etc/supervisord.conf):
[unix_http_server]
file=/tmp/supervisor.sock ; (the path to the socket file)
[supervisord]
logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false ; (start in foreground if true;default false)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
[program:myproject]
command=/home/richard/envs/myproject_stage/bin/python /home/richard/webapps/myproject/manage.py run_gunicorn -b 127.0.0.1:8002 --log-file=/tmp/myproject_stage_gunicorn.log
directory=/home/richard/webapps/myproject/
user=www-data
autostart=true
autorestart=true
stdout_logfile=/tmp/myproject_stage_supervisord.log
redirect_stderr=true
First of all, type this in your console or terminal:
ps -ef | grep supervisord
You will get the pid of supervisord, like this:
root 2641 12938 0 04:52 pts/1 00:00:00 grep --color=auto supervisord
root 29646 1 0 04:45 ? 00:00:00 /usr/bin/python /usr/local/bin/supervisord
If you get output like that, the pid is the second column of the supervisord line (29646 here, not the grep line). Then, if you want to shut down supervisord, you can do this:
kill -s SIGTERM 29646
Hope it helps. Ref: http://supervisord.org/running.html#signals
sudo unlink /tmp/supervisor.sock
This .sock file is defined in /etc/supervisord.conf's [unix_http_server]'s file config value (default is /tmp/supervisor.sock).
$ ps aux | grep supervisor
alexamil 54253 0.0 0.0 2506960 6440 ?? Ss 10:09PM 0:00.26 /usr/bin/python /usr/local/bin/supervisord -c supervisord.conf
so we can use:
$ pkill -f supervisord # kill it
This is what works for me. Run the following in the Terminal (For Linux machines)
To check if the process is running:
sudo systemctl status supervisor
To stop the process:
sudo systemctl stop supervisor
Try running these commands
sudo unlink /run/supervisor.sock
and
sudo /etc/init.d/supervisor start
As of version 3.0a11, you can use this one-liner:
sudo kill -s SIGTERM $(sudo supervisorctl pid)
It piggybacks on the supervisorctl pid command.
There are many answers already available. I shall present a cleaner way to shut down supervisord.
By default, supervisord creates a file named supervisord.pid in its current working directory (the sample config above lists the default as supervisord.pid). That file contains the pid of the supervisord daemon. Read the pid from the file and kill the supervisord process.
However, you can configure where the supervisord.pid file is created. See this link for the pidfile setting: http://supervisord.org/configuration.html
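The pidfile approach can be sketched as follows. stop_from_pidfile is a hypothetical helper name, and the demo uses a background sleep as a stand-in for supervisord so that it can be run safely. With the config from the question, the real call would use /tmp/supervisord.pid.

```shell
#!/bin/sh
# stop_from_pidfile is a hypothetical helper: read a pid from a pidfile and send SIGTERM to it.
stop_from_pidfile() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "cannot read pidfile: $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    # Refuse anything that is not a plain number.
    case $pid in
        ''|*[!0-9]*)
            echo "unexpected pidfile content in $pidfile" >&2
            return 1 ;;
    esac
    kill -s TERM "$pid"
}

# Demo with a stand-in process instead of supervisord.
sleep 300 &
demo_pid=$!
demo_pidfile=$(mktemp)
echo "$demo_pid" > "$demo_pidfile"
stop_from_pidfile "$demo_pidfile"
wait "$demo_pid"
demo_status=$?
echo "stand-in process ended with status $demo_status"
rm -f "$demo_pidfile"
```

SIGTERM is the signal the supervisord documentation lists for shutting down supervisord together with its subprocesses, so this is equivalent to the kill -s SIGTERM answers above, just without having to look up the pid by hand.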