Supervisord dies completely if program runs for less than 1 second? - supervisord

A program that runs for less than 1 second, started every second under Supervisord, is never run again. Why could that be?
I run "GET http://someurl.com/some/url" every second, and whenever a run takes less than 1 second (i.e. if I leave out "sleep(1)"), it runs only once and never again. Any idea why?

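I had to set:
startsecs=0
in supervisord.conf. By default startsecs is 1, so a program that exits within its first second counts as a failed start; after startretries failed attempts (3 by default) Supervisord marks it FATAL and stops restarting it.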

Take a look at the Supervisord log file, usually /tmp/supervisord.log. If it does not give enough information about why Supervisord is restarting your program, enable logging for your program in the Supervisord configuration file /etc/supervisord.conf and check why your program is crashing:
[program:your_program]
stdout_logfile = /your_program/path/logs/your_program.log
redirect_stderr = true

Related

Mac Terminal to run a bash script that starts a swift program & restarts every hour

I am looking for some support on creating some way of running a swift command in terminal to run a program and then stop it after 1 hour then restart.
Example of manual process:
1. Open Terminal.
2. cd my app
3. swift run my program --with-parameters
4. ctrl+c (after 1 hour)
5. Restart with step 3
I am sure there must be some way using a bash script maybe to start the program by command, kill it after 60min and restart it with a continuous loop like that.
Thanks :-)
You can set up a cron job to do this. Basically, you'll have a bash script, say it's located at /Users/ben/scripts/run_my_program.sh, that will, every hour:
Terminate the currently running process (kill pid)
Execute swift run my program --with-parameters and print the process ID
Alternatively, within a single script, you can start the swift process in the background, get its PID with echo $!, use sleep 3600 to wait for 1 hour, and then kill the process with kill -9 and the PID you got in the first step.
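The start, wait, kill cycle described above could be sketched roughly as follows. The function name run_for is hypothetical, and the sketch sends the default SIGTERM rather than kill -9 so the program gets a chance to shut down cleanly; it also assumes that killing the launched process is enough, which may not hold if that process spawns children of its own.

```shell
#!/usr/bin/env bash

# run_for DURATION COMMAND [ARGS...]
# Start COMMAND in the background, let it run for DURATION seconds,
# then stop it and reap it. (Hypothetical helper, not from the answer.)
run_for() {
    local duration=$1
    shift
    "$@" &                      # launch the program in the background
    local pid=$!                # PID of the background job
    sleep "$duration"           # let it run for the requested time
    kill "$pid" 2>/dev/null     # ask it to stop (SIGTERM)
    wait "$pid" 2>/dev/null     # collect its exit status
    return 0
}

# Usage (restarts the program every hour until the script is stopped):
#   while true; do
#       run_for 3600 swift run my program --with-parameters
#   done
```

Keeping everything in one long-running script avoids having to pass the PID between separate cron invocations.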

Can Ansible keep a process running even after the playbook has ended?

I have an open-source program I want to run from Ansible. Basically, Ansible will go into the node and run "./Program.Name", which starts the program, but when ansible-playbook is done the program closes. Is there a way I can start the program and keep it running even after Ansible is done? I was told there is the async module, but how can I write the playbook so that it keeps the program running for as long as the node is up? Please try to provide the YAML code, if possible, where I can replace the name of the program and it should do the job.
I have tried to run the process with "./Program.Name &" but it does not stay running.
The following approach may be useful. The async and poll parameters are required; see the Ansible documentation for more information.
- name: run
  shell: "( tail > /dev/null 2>&1 &)"
  async: 10
  poll: 0
Note: If you want to run the program as a daemon, I think supervisord may be more appropriate.
Use async with poll: 0, so that Ansible just never checks back in on the task:
https://docs.ansible.com/ansible/latest/user_guide/playbooks_async.html#concurrent-tasks-poll-0
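Outside Ansible, the same idea of detaching the program from the session that started it can be sketched in plain shell. The function name start_detached is hypothetical, and whether nohup plus redirection is enough depends on how the program handles its terminal and signals, so treat this as a sketch under those assumptions rather than a description of what Ansible does internally.

```shell
#!/usr/bin/env bash

# start_detached COMMAND [ARGS...]
# Launch COMMAND so that it ignores hangups and holds no reference to the
# caller's stdin/stdout/stderr, then print its PID. (Hypothetical helper.)
start_detached() {
    nohup "$@" > /dev/null 2>&1 < /dev/null &
    echo "$!"
}

# Usage:
#   pid=$(start_detached ./Program.Name)
#   echo "started with PID $pid"
```

Redirecting all three standard streams matters here: a background process that still holds the caller's output open can keep the calling session from finishing.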

How to automatically run command line program several times?

I have a C# program integrated with a command line program. I want to run the command line program twice (start, finish, start again, finish again). Right now I use a timer to give every run a fixed time period: for example, the first run gets 10 seconds and, whether it has finished or not, the program starts the second run after 10 seconds.
I want the second run to start automatically once the first run has finished. How can I detect that the first run is finished and then trigger the second run?
Assuming you run the command line program as a process, see this answer on checking whether the process has finished:
Find out if a process finished running
if (process.WaitForExit(timeout))
{
    // process exited within the timeout
}
else
{
    // timeout (perhaps process.Kill();)
}
From the command line, you can run:
for %a in (1 2) do start "" /w "programA.exe"
start "" /w executes the command and waits until it has finished before proceeding. (Inside a batch file, write %%a instead of %a.)

perl based cron job won't write to mounted cifs/windows share ONLY after long inactivity

I'm not sure how to title that more succinctly and still have it be meaningful.
(Note that this works fine when run mid-day, via cron or manually, so I "know" the script itself is sound.)
I have a cron job (ubuntu 13.04.)
It runs as my user (not root.)
The job itself runs at 6:01 in the morning. It's the first 'business level' job that runs each day.
1 6 * * 1-5 /home/me/bin/run_perl_job
run_perl_job is just:
#!/bin/bash
cd /home/me/bin
./script.pl
The script copies a file to "/mnt/shared_drive/outputfile.xls"
The mount point is defined in fstab as:
//fileserver/share /mnt/shared_drive cifs user=domain/me%password,iocharset=utf8,gid=1000,uid=1000,sec=ntlm,file_mode=0777,dir_mode=0777 0 0
Now. Given that:
When I run the script in a normal shell, it works fine.
When I look at the mount point first thing in the morning (via a normal terminal) it shows up (and is writeable) without event.
When I copy the crontab line and set it to run in a couple minutes, to see the symptom, it works fine (creates the file quite happily.)
The ONLY time this fails is if it's running in its normal time slot (6:01). The rest of the script functions ( the file itself has to be pulled down via sftp, etc.) So I know it's not dying.
It's driving me batty because the test cycle is 24 hours.
I just added the following couple lines to the beginning of the 'run_perl_job' script, hoping it exposes something tomorrow:
cd /mnt/shared_drive
ls -lrt >>/home/me/bin/process.log
But I'm stumped. "It's almost as though" the mount point had gotten stale overnight and is waiting for some kind of access attempt before remounting. I'd run "mount -a" at the top of the 'run_perl_job' script if I could reasonably do it. But given that it's got to be sudo'ed, that doesn't seem reasonable to me.
Thoughts? I'm running out of ideas and this test cycle is awful.
How about putting
umount -f -v /mnt/shared_drive
mount -v -a
into a root cron job that runs just before your script? That way you don't need sudo in your script or a password in plain sight. The -v output might give you a hint about what is making the mount go stale.
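To capture more evidence from the 6:01 run itself, the wrapper script could probe whether the share is actually writable before the Perl script starts. This is only a sketch: the function name check_writable is hypothetical, and the log path in the usage comment is the one the question already uses.

```shell
#!/usr/bin/env bash

# check_writable DIR
# Succeed only if a temporary file can be created in DIR.
# (Hypothetical helper; it removes the probe file again.)
check_writable() {
    local dir=$1
    local probe
    probe=$(mktemp "$dir/.probe.XXXXXX" 2>/dev/null) || return 1
    rm -f "$probe"
}

# Usage near the top of run_perl_job:
#   if check_writable /mnt/shared_drive; then
#       echo "$(date): share is writable" >> /home/me/bin/process.log
#   else
#       echo "$(date): share is NOT writable" >> /home/me/bin/process.log
#   fi
```

A write probe is a stricter test than listing the directory, since a stale mount can sometimes still be listed from cache while writes fail.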

how to use a shell script to supervise a program?

I've searched around but haven't quite found what I'm looking for. In a nutshell, I have created a bash script that runs in an infinite while loop, sleeping and checking whether a process is running. The only problem is that even if the process is running, the script says it is not and opens another instance.
I know I should check by process name and not process id, since another process could jump in and take the id. However all perl programs are named Perl5.10.0 on my system, and I intend on having multiple instances of the same perl program open.
The following "if" always returns false, what am I doing wrong here???
while true; do
    if [ ps -p $pid ]; then
        echo "Program running fine"
        sleep 10
    else
        echo "Program being restarted\n"
        perl program_name.pl &
        sleep 5
        read -r pid < "${filename}_pid.txt"
    fi
done
Get rid of the square brackets. It should be:
if ps -p $pid; then
The square brackets are syntactic sugar for the test command, which is an entirely different beast and does not invoke ps at all. The bracketed version is equivalent to:
if test ps -p $pid; then
In fact, the original line yields "-bash: [: -p: binary operator expected" when I run it.
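To illustrate that if branches on the exit status of whatever command follows it, here is a small sketch. The helper name report is hypothetical; the messages are the ones from the question.

```shell
#!/usr/bin/env bash

# report COMMAND [ARGS...]
# Run COMMAND with its output discarded and branch on its exit status,
# exactly as "if ps -p $pid; then" does. (Hypothetical helper.)
report() {
    if "$@" > /dev/null 2>&1; then
        echo "Program running fine"
    else
        echo "Program being restarted"
    fi
}

# Usage with the check from the question (pid must hold a process ID):
#   report ps -p "$pid"
```

Discarding the output keeps the ps process listing out of the log while leaving the exit status untouched.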
Aside from the syntax error already pointed out, this is a lousy way to ensure that a process stays alive.
First, you should find out why your program is dying in the first place; this script doesn't fix a bug, it tries to hide one.
Secondly, if it is so important that a program remain running, why do you expect that your (at least once already) buggy shell script will do the job? Use a system facility that is specifically designed to restart server processes. If you say what platform you are using and the nature of your server process, I can offer more concrete advice.
added in response to comment:
Sure, there are engineering exigencies, but as the OP noted in the question, there is still a bug in this attempt at a solution:
I know I should check by process name and not process id, since another process could jump in and take the id.
So now you are left with a PID tracking script, not a process "nanny". Although the chances are small, the script as it now stands has a ten second window in which:
the "monitored" process fails
I start up my week long emacs process which grabs the same PID
the nanny script continues on blissfully unaware that its dependent has failed
The script isn't merely buggy, it is invalid because it presumes that PIDs are stable identifiers of a process. There are ways that this could be better handled even at the shell script level. The simplest is to never detach the execution of perl from the script since the script is doing nothing other than watching the subprocess. For example:
while true ; do
    if perl program_name.pl ; then
        echo "program_name terminated normally, restarting"
    else
        echo "oops program_name died again, restarting"
    fi
done
Which is not only shorter and simpler, but it actually blocks for the condition that you are really interested in: the run-state of the perl program. The original script repeatedly checks a bad proxy indication of the run state condition (the PID) and so can get it wrong. And, since the whole purpose of this nanny script is to handle faults, it would be bad if it were faulty itself by design.
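A slightly more defensive variant of that blocking loop might pause between restarts and give up after a fixed number of runs, so that a program that fails immediately does not spin the CPU. The function name supervise, the delay, and the run limit are all assumptions added for illustration and are not part of the answer above.

```shell
#!/usr/bin/env bash

# supervise MAX_RUNS DELAY COMMAND [ARGS...]
# Run COMMAND in the foreground, wait for it to exit, and start it again,
# at most MAX_RUNS times with DELAY seconds between runs.
# (Hypothetical helper; the limit and delay are illustrative assumptions.)
supervise() {
    local max_runs=$1 delay=$2
    shift 2
    local runs=0 status
    while [ "$runs" -lt "$max_runs" ]; do
        "$@"                    # blocks until the program exits
        status=$?
        runs=$((runs + 1))
        echo "run $runs exited with status $status" >&2
        sleep "$delay"
    done
}

# Usage:
#   supervise 100 5 perl program_name.pl
```

Because the loop blocks on the program itself, no PID is ever stored or looked up.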
I totally agree that fiddling with the PID is nearly always a bad idea. The while true ; do ... done script is quite good; however, for production systems there are a couple of process supervisors which do exactly this and much more, e.g.
enable you to send signals to the supervised process (without knowing its PID)
check how long a service has been up or down
capture its output and write it to a log file
Examples of such process supervisors are daemontools or runit. For a more elaborate discussion and examples see Init scripts considered harmful. Don't be disturbed by the title: traditional init scripts suffer from exactly the same problem as you do (they start a daemon, keep its PID in a file and then leave the daemon alone).
I agree that you should find out why your program is dying in the first place. However, an ever-running shell script is probably not a good idea. What if this supervising shell script dies? (And yes, get rid of the square brackets around ps -p $pid. You want the exit status of the ps -p $pid command. The square brackets are a replacement for the test command.)
There are two possible solutions:
Use cron to run your "supervising" shell script to see if the process you're supervising is still running, and if it isn't, restart it. The supervised process can write its PID to a file. Your supervising script can then cat this file and get the PID to check.
If the program you're supervising provides a service on a particular port, make it an inetd service. This way, it isn't running at all until there is a request on that port. If you set it up correctly, it will terminate when not needed and restart when needed. It takes fewer resources and the OS will handle everything for you.
That's what kill -0 $pid is for. It returns success if a process with pid $pid exists.
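Combining the PID file idea with kill -0, a check script that cron runs periodically could look roughly like this. The function name ensure_running and the PID file path in the usage comment are hypothetical, and the sketch inherits the weakness discussed earlier: a recycled PID would be mistaken for the supervised program. Note also that kill -0 fails for a process owned by another user, so the check should run as the same user as the program.

```shell
#!/usr/bin/env bash

# ensure_running PIDFILE COMMAND [ARGS...]
# If PIDFILE names a live process, do nothing; otherwise start COMMAND in
# the background and record its PID in PIDFILE. (Hypothetical helper.)
ensure_running() {
    local pidfile=$1
    shift
    local pid=""
    [ -r "$pidfile" ] && read -r pid < "$pidfile"
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return 0                # still running, nothing to do
    fi
    "$@" > /dev/null 2>&1 &     # (re)start the program
    echo "$!" > "$pidfile"      # remember the new PID
}

# Usage from a script that cron runs every minute:
#   ensure_running /tmp/program_name.pid perl program_name.pl
```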