supervisorctl ERROR (abnormal termination) - supervisord

When I run sudo supervisorctl start stage I get ERROR (abnormal termination). Will you please take a look?
Here is my file /etc/supervisord.conf. Am I missing something? Thanks.
[unix_http_server]
file=/tmp/supervisor.sock ; (the path to the socket file)
[supervisord]
logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false ; (start in foreground if true;default false)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
[program:stage]
command=/home/me/envs/project/bin/python /home/me/webapps/project/manage.py run_gunicorn -b 127.0.0.1:8002 --log-file=/tmp/stage_gunicorn.log
directory=/home/me/webapps/project/
user=www-data
autostart=true
autorestart=true
stdout_logfile=/tmp/stage_supervisord.log
redirect_stderr=true

I met the same problem. As Martijn Pieters said, it doesn't mean that something is wrong with supervisorctl itself; it just tells you that the program didn't run. You can find the error details in the log.

It indicates an error, so find it using the command below:
supervisorctl tail <APP_NAME>
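For the config above the program name is stage, and because redirect_stderr=true the stderr output lands in the same log, so (assuming supervisord is running with that config) the concrete commands would be:
supervisorctl tail stage
supervisorctl tail -f stage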

This error occurs because the underlying stage application is not starting properly. To find out why, go to your console and run the command you are passing to supervisord. In your case that is:
/home/me/envs/project/bin/python /home/me/webapps/project/manage.py run_gunicorn -b 127.0.0.1:8002 --log-file=/tmp/stage_gunicorn.log
It will show you the error that needs to be fixed.
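Since the [program:stage] section runs the process as user=www-data, it can also be worth running the same command as that user so that permission problems show up (a sketch, assuming sudo is available on the box):
sudo -u www-data /home/me/envs/project/bin/python /home/me/webapps/project/manage.py run_gunicorn -b 127.0.0.1:8002 --log-file=/tmp/stage_gunicorn.log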

It means that your app itself is failing.
Go and check the [program:stage] section; the path or something else in it is not correct.

Just set the log level to trace, then restart supervisord and see what happened in the supervisor log:
[supervisord]
loglevel=trace
sudo systemctl restart supervisord.service
tail -f /path/to/supervisord.log
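With the config shown at the top of this question, logfile=/tmp/supervisord.log, so here the tail would be:
tail -f /tmp/supervisord.log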
Once the problem has been resolved, set the loglevel back to info.


wait synchronously for rsyslog flush to complete

I am running rsyslogd 8.24.0 with a local logfile.
I have a test which runs a program that does some syslog logging (with entries from my test going to another file via an rsyslog.conf setting) and then exits back to a shell script that checks the log has the expected content. This usually works, but sometimes it fails as though the logging hadn't happened. I've added a flush (using a HUP signal) to the shell script before it does the check. I can see that the HUP has happened and that the correct entry is in the log, but the script's check still fails.
Is there a way for the shell script to wait until the flush has completed? I can add an arbitrary sleep but would prefer to have something more definite.
Here are the relevant bits of the shell script:
# Set syslog to send dump_hook's logging to a local logfile...
sudo echo "user.* `pwd`/dump_hook_log" >> /etc/rsyslog.conf
sudo systemctl restart rsyslog.service
echo "" > ./dump_hook_log
# run the test program which does syslog logging
kill -HUP `cat /var/run/syslogd.pid` # flush syslog
if [ $? -ne 0 ]
then
logFail "failed to HUP `cat /var/run/syslogd.pid`: $?"
fi
echo "sent HUP to `cat /var/run/syslogd.pid`"
grep <the string I want> ./dump_hook_log >/dev/null
The string in question is always in the dump_hook_log by the time the test has reported the failure and I've gone to look at it. I presume it must be that the flush hasn't completed by the time of the grep.
Here is an example:
In /var/log/messages
2019-01-30T12:13:27.216523+00:00 apx-ont-1 apx_dump_hook[28279]: Failed to open raw dump file "core" (Is a directory)
2019-01-30T12:13:27.216754+00:00 apx-ont-1 rsyslogd: [origin software="rsyslogd" swVersion="8.24.0" x-pid="28185" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Mod date of the log file (n.b. this is earlier than the entries it contains!):
rw-rw-rw- 1 nealec appexenv1_group 2205 2019-01-30 12:13:27.215053296 +0000 testdir_OPT/dump_hook_log
Last line of the log file (only apx_dump_hook entries in here):
2019-01-30T12:13:27.216523+00:00 apx-ont-1 apx_dump_hook[28279]: Failed to open raw dump file "core" (Is a directory)
Script reporting error:
Wed 30 Jan 12:13:27 GMT 2019 PSE Test 0.2b FAILED: 'Failed to open raw dump file' not found in ./dump_hook_log
I think I understand this now. The HUP causes rsyslogd to close its open files but it doesn’t reopen a file until it needs to log to it.
Consider the following:
I use inotify to wait for a file to close, like this:
case 9:
{
    // Wait for the file named in argv[2] to be closed.
    // Needs <sys/inotify.h>, <unistd.h>, <cstdio>, <cstdlib>, <cstring>, <cerrno>, <climits>.
    int inotfd = inotify_init();
    if (inotfd < 0) {
        printf("inotify_init failed; errno %d: %s\n",
               errno, strerror(errno));
        exit(99);
    }
    int watch_desc = inotify_add_watch(inotfd, argv[2], IN_CLOSE);
    if (watch_desc < 0) {
        printf("can't watch %s; errno %d: %s\n",
               argv[2], errno, strerror(errno));
        exit(99);
    }
    // One event plus room for the longest possible file name.
    size_t bufsiz = sizeof(struct inotify_event) + PATH_MAX + 1;
    struct inotify_event* event = static_cast<inotify_event*>(malloc(bufsiz));
    if (!event) {
        printf("Failed to malloc event buffer; errno %d: %s\n",
               errno, strerror(errno));
        exit(99);
    }
    /* wait for an event to occur with a blocking read */
    read(inotfd, event, bufsiz);
    break;
}
Then in my shell script I wait for that:
# Start a process that waits for the log file to be closed
${bin}/test_dump_hook.exe 9 "./dump_hook_log" &
wait_pid=$!
# Signal syslogd to cause it to close/reopen its log files
kill -HUP `cat /var/run/syslogd.pid` # flush syslog
if [ $? -ne 0 ]
then
    logFail "failed to HUP `cat /var/run/syslogd.pid`: $?"
fi
wait $wait_pid
I find this never returns. Sending a HUP to rsyslogd from another process doesn't break it out of the wait either, but a cat (which does open/close the file) of the log file does.
That’s because the HUP in the shell script was done before the other process waited for it. So the file was already closed at the start of the wait, and because there is no more logging to that file it is not reopened and doesn’t need to close when any subsequent HUPs are received, so the event never occurs to end the wait.
Having understood this behaviour, how can I be sure that the log has been written before I check it? I've gone with this solution: put a known message into the log and wait until that appears; I know that the entries I'm waiting for must have been written before it. Like this:
function flushSyslog
{
    logger -p user.info -t dump_hook_test "flushSyslog"
    # Signal syslogd to cause it to close its log file
    kill -HUP `cat /var/run/syslogd.pid` # flush syslog
    if [ $? -ne 0 ]
    then
        logFail "failed to HUP `cat /var/run/syslogd.pid`: $?"
    fi
    # wait up to 10 secs for the entry we've just logged to appear
    sleeps=0
    until grep "flushSyslog" ./dump_hook_log > /dev/null
    do
        sleeps=$((sleeps+1))
        if [ $sleeps -gt 100 ]
        then
            logFail "failed to flush syslog dump_hook_log"
        fi
        sleep 0.1
    done
}
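The test script then calls this just before the check, e.g.:
flushSyslog
grep "<the string I want>" ./dump_hook_log >/dev/null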
That seems a bit heavyweight as a solution; instead you can use the system's inotify API to wait for the log file to be closed (the result of the HUP signal). For example,
inotifywait -e close ./dump_hook_log
will hang until rsyslogd (or any process) closes the file, when you will get the message
./dump_hook_log CLOSE_WRITE,CLOSE
and the program will exit with return code 0. You can add a timeout.
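For example, with a 10 second timeout so the wait cannot hang forever (with -t, inotifywait exits with a non-zero status if the event never arrives):
inotifywait -t 10 -e close ./dump_hook_log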

How to start Cloud SQL proxy with supervisor

I tried to start the Cloud SQL proxy under supervisor, but I have no idea what is wrong with it. The documentation doesn't give any clues about this issue. Any ideas would be much appreciated.
I tried the setup on a clean Ubuntu 16, then installed supervisor and downloaded cloud_sql_proxy. I put the files under /root and run everything as root for debugging.
Here is my current setup:
/etc/supervisord.conf
[unix_http_server]
file=/tmp/supervisor.sock ; the path to the socket file
chmod=0766 ; socket file mode (default 0700)
[supervisord]
logfile=/tmp/supervisord.log ; main log file; default $CWD/supervisord.log
logfile_maxbytes=50MB ; max main logfile bytes b4 rotation; default 50MB
logfile_backups=10 ; # of main logfile backups; 0 means none, default 10
loglevel=info ; log level; default info; others: debug,warn,trace
pidfile=/tmp/supervisord.pid ; supervisord pidfile; default supervisord.pid
nodaemon=false ; start in foreground if true; default false
minfds=1024 ; min. avail startup file descriptors; default 1024
minprocs=200 ; min. avail process descriptors;default 200
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
[include]
files = /etc/supervisor/conf.d/*.conf
/etc/supervisor/conf.d/cloud_sql_proxy.conf
[program:cloud_sql_proxy]
command=/root/cloud_sql_proxy -dir=/cloudsql -instances="project_id:us-central1:instance-name" -credential_file="/root/service-account.json"
autostart=true
autorestart=true
startretries=1
startsecs=8
stdout_logfile=/var/log/cloud_sql_proxy-stdout.log
stderr_logfile=/var/log/cloud_sql_proxy-stderr.log
I got the following error after inspecting /tmp/supervisord.log:
2018-10-14 15:49:49,984 INFO spawned: 'cloud_sql_proxy' with pid 3569
2018-10-14 15:49:49,989 INFO exited: cloud_sql_proxy (exit status 0; not expected)
2018-10-14 15:49:50,991 INFO spawned: 'cloud_sql_proxy' with pid 3574
2018-10-14 15:49:50,996 INFO exited: cloud_sql_proxy (exit status 0; not expected)
2018-10-14 15:49:51,998 INFO gave up: cloud_sql_proxy entered FATAL state, too many start retries too quickly
2018-10-14 15:51:46,981 INFO spawned: 'cloud_sql_proxy' with pid 3591
2018-10-14 15:51:46,986 INFO exited: cloud_sql_proxy (exit status 0; not expected)
2018-10-14 15:51:47,989 INFO spawned: 'cloud_sql_proxy' with pid 3596
2018-10-14 15:51:47,998 INFO exited: cloud_sql_proxy (exit status 0; not expected)
2018-10-14 15:51:47,999 INFO gave up: cloud_sql_proxy entered FATAL state, too many start retries too quickly
Finally I managed to figure out a working solution, and here it is:
Create a new file /root/start_cloud_sql_proxy.sh:
#!/bin/bash
/root/cloud_sql_proxy -dir=/cloudsql -instances="project_id:us-central1:instance-name" -credential_file="/root/service-account.json"
Under /etc/supervisor/conf.d/cloud_sql_proxy.conf, change the command to execute a bash file:
command=/root/start_cloud_sql_proxy.sh
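One detail that is easy to miss: supervisord can only spawn the wrapper if it is executable, so you will most likely also need:
chmod +x /root/start_cloud_sql_proxy.sh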

django celery daemon does work: it can't create pid file

I can't init my celeryd and celerybeat services. I used the same code in another environment (configuring everything from the start), but here it doesn't work. I think it is a permissions problem, but I couldn't get it running. Please help me.
This is my celery config in settings.py:
CELERY_RESULT_BACKEND = 'djcelery.backends.database:DatabaseBackend'
CELERY_BROKER_URL = 'amqp://localhost'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
CELERY_ENABLE_UTC = True
CELERY_TIMEZONE = TIME_ZONE # 'America/Lima'
CELERY_BEAT_SCHEDULE = {}
this is my file /etc/init.d/celeryd
https://github.com/celery/celery/blob/master/extra/generic-init.d/celeryd
then I use
sudo chmod 755 /etc/init.d/celeryd
sudo chown admin1:admin1 /etc/init.d/celeryd
and I created /etc/default/celeryd
CELERY_BIN="/home/admin1/Env/tos/bin/celery"
# App instance to use
CELERY_APP="tos"
# Where to chdir at start.
CELERYD_CHDIR="/home/admin1/webapps/tos/"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# %n will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# a user/group combination that already exists (e.g., nobody).
CELERYD_USER="admin1"
CELERYD_GROUP="admin1"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
export SECRET_KEY="foobar"
For celerybeat I created a file at /etc/init.d/celerybeat with:
https://github.com/celery/celery/blob/master/extra/generic-init.d/celerybeat
and I start the services like this:
sudo /etc/init.d/celeryd start
sudo /etc/init.d/celerybeat start
and I get this error:
sudo: unable to resolve host SIO
celery init v10.1.
Using config script: /etc/default/celeryd
celery multi v3.1.25 (Cipater)
> Starting nodes...
> celery#SIO-PRODUCION: OK
ERROR: Pidfile (celery.pid) already exists.
Seems we're already running? (pid: 30198)
/etc/init.d/celeryd: 515: /etc/init.d/celeryd: --pidfile=/var/run/celery/%n.pid: not found
I also got it when checking with:
sudo C_FAKEFORK=1 sh -x /etc/init.d/celeryd start
some data .....
starting nodes...
ERROR: Pidfile (celery.pid) already exists.
Seems we're already running? (pid: 30198)
> celery#SIO-PRODUCION: * Child terminated with errorcode 73
FAILED
+ --pidfile=/var/run/celery/%n.pid
/etc/init.d/celeryd: 515: /etc/init.d/celeryd: --pidfile=/var/run/celery/%n.pid: not found
+ --logfile=/var/log/celery/%n%I.log
/etc/init.d/celeryd: 517: /etc/init.d/celeryd: --logfile=/var/log/celery/%n%I.log: not found
+ --loglevel=INFO
/etc/init.d/celeryd: 519: /etc/init.d/celeryd: --loglevel=INFO: not found
+ --app=tos
/etc/init.d/celeryd: 521: /etc/init.d/celeryd: --app=tos: not found
+ --time-limit=300 --concurrency=8
/etc/init.d/celeryd: 523: /etc/init.d/celeryd: --time-limit=300: not found
+ exit 0
I had the same problem and I resolved it like this:
rm -f /webapps/celery.pid && /etc/init.d/celeryd start
You can try doing this: before running celery, clean up the stale pid file by chaining the two commands together with &&.
Another way is to create a Django management command:
import shlex
import subprocess
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    def handle(self, *args, **options):
        kill_worker_cmd = 'pkill -9 celery'
        subprocess.call(shlex.split(kill_worker_cmd))
Call it before you start, or just
pkill -9 celery
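If the command file is saved as, say, yourapp/management/commands/kill_celery.py (a hypothetical path; the filename determines the command name), you would invoke it with:
python manage.py kill_celery  # hypothetical command name taken from the filename above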

Getting emacs realgud:pry to work

Need some help getting realgud to play nice with pry-remote.
This is a sample ruby file (x.rb)
require 'pry-remote'

class Foo
  def initialize(x, y)
    binding.remote_pry
    puts "hello"
  end
end

Foo.new(1, 3)
In a shell buffer:
bash: ruby x.rb
[pry-remote] Waiting for client on druby://127.0.0.1:9876
# check that I have a listening process
bash: ss -nlt
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 127.0.0.1:9876 *:*
In emacs:
M-x realgud:pry
pry-remote ; enter this at the prompt
This is what I see in emacs:
File Edit Options Buffers Tools Inf-Ruby Debugger Help
32: end
33:
34: return_value
35: end
=>[0G[1] pry(#<PryNav::Tracer>)>
-=--:**--F1 *pry +No filename+ shell* Bot L53 (Comint:r
stop unless RUBY_VERSION == '1.9.2'
return_value = nil
command = catch(:breakout_nav) do # Coordinates w$
return_value = yield
=> {} # Nothing thrown == no navigational command
end
-=--:%%--F1 tracer.rb 24% L21 (Ruby ShortKeys) ---
My question is: why am I seeing the breakpoint in tracer.rb? How do I get the breakpoint
to be in my source file?
Hitting 'n' twice in the source window causes the shell buffer to echo the following, but there
is no change in the source window itself.
=>[0G[1] pry(#<PryNav::Tracer>)> next 1
next 1
Also, the 'u' and 'd' keystrokes yield
Command down is not implemented for this debugger
Command up is not implemented for this debugger

Apache init.d script

I have the following script to start, stop, and restart apache2 on my Debian 7 system:
#!/bin/sh
### BEGIN INIT INFO
# Provides: apache2
# Required-Start: $all
# Required-Stop: $all
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: apache2
# Description: Start apache2
### END INIT INFO
case "$1" in
start)
echo "Starting Apache ..."
# Change the location to your specific location
/usr/local/apache2/bin/apachectl start
;;
stop)
echo "Stopping Apache ..."
# Change the location to your specific location
/usr/local/apache2/bin/apachectl stop
;;
graceful)
echo "Restarting Apache gracefully..."
# Change the location to your specific location
/usr/local/apache2/bin/apachectl graceful
;;
restart)
echo "Restarting Apache ..."
# Change the location to your specific location
/usr/local/apache2/bin/apachectl restart
;;
*)
echo "Usage: '$0' {start|stop|restart|graceful}"
exit 64
;;
esac
exit 0
When I add the script to update-rc.d I see the following warnings:
root@pomelo:/etc/init.d# update-rc.d apache2 defaults
update-rc.d: using dependency based boot sequencing
insserv: Script jexec is broken: incomplete LSB comment.
insserv: missing `Required-Stop:' entry: please add even if empty.
insserv: missing `Default-Stop:' entry: please add even if empty.
insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `jexec'
But I already added Required-Stop and Default-Stop to the script.
Does anybody know how to solve this problem?
The issue is not in your apache2 init script; it is in 'jexec', as the message says: 'Script jexec is broken'.
That one is missing the Required-Stop and Default-Stop entries.
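If you want to silence the warnings, the fix insserv itself suggests is to add the missing entries (even empty) to the LSB header of /etc/init.d/jexec, roughly like this (a sketch; keep whatever that script already declares):
### BEGIN INIT INFO
# Provides:        jexec
# Required-Start:
# Required-Stop:
# Default-Start:
# Default-Stop:
### END INIT INFO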
I had the same issue on my SLES boxen. Don't worry though: even if it shows you these errors, everything still runs just fine!
HTH