Note that I'm aware that this is probably not the best or most optimal way to do this but I've run into this somewhere before and I'm curious as to the answer.
I have a perl script that is called from an init that runs and occasionally dies. To quickly debug this, I put together a quick wrapper perl script that basically consists of
#$path set from library call.
while(1){
system("$path/command.pl " . join(" ",#ARGV) . " >>/var/log/outlog 2>&1");
sleep 30; #Added this one later. See below...
}
Fire this up from the command line and it runs fine and as expected. command.pl is called and the script basically halts there until the child process dies then goes around again.
However, when called from a start script (actually via start-stop-daemon), the system command returns immediately, leaving command.pl running. Then it goes around for another go. And again and again. (This was not fun without the sleep command.). ps reveals the parent of (the many) command.pl to be 1 rather than the id of the wrapper script (which it is when I run from the command line).
Anyone know what's occurring?
Maybe the command.pl is not being run successfully. Maybe the file doesn't have execute permission (do you need to say perl command.pl?). Maybe you are running the command from a different directory than you thought, and the command.pl file isn't found.
There are at least three things you can check:
standard error output of your command. For now you are swallowing it by saying 2>&1. Remove that part and observe what errors the system command produces.
the return value of system. The command may run and system may still return an exit code, but if system returns 0, you know the command was successful.
Perl's error variable $!. If there was a problem, Perl will set $!, which may or may not be helpful.
To summarize, try:
my $ec = system("command.pl >> /var/log/outlog");
if ($ec != 0) {
warn "exit code was $ec, \$! is $!";
}
Update: if multiple instance of the command keep showing up in your ps output, then it sounds like the program is forking and running itself in the background. If that is indeed what the command is supposed to do, then what you do NOT want to do is run this command in an endless loop.
Perhaps when run from a deamon the "system" command is using a different shell than the one used when you are running as yourself. Maybe the shell used by the daemon does not recognize the >& construct.
Instead of system("..."), try exec("...") function if that works for you.
Related
I'm sure this is an easy fix, but I need to use "script" (and not collect standard in/out/error) for my project. I'm somewhat new to Perl, so please bear with me.
I have a Perl script that works fine. When I run it I generally type script > filename before I run Perl.
$script > file.log
bash-3.2$ perl foobar.pl
This runs fine, and when I'm done I type exit or control D to stop the script and save the file. All I'd like to do is incorporate the script command in Perl and then automatically capture the file when the program stops running (12-16 hours). The problem I have is that is I call in system("script > file.log"); and them call system("perl foobar.pl"); it hangs at the bash-3.2$ prompt. The only way to get Perl to work is control D or exit, stopping the script function.
Anyone have any idea how to fix this? While it's easy to start with script before invoking Perl, if I'm a mole and forget, I have to rerun the program (which takes a long time).
Have you considered using system("script -c 'perl foobar.pl' file.log")?
Using python 3.2, and the following code snippet:
p = subprocess.Popen(['../start_server.sh'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out,err = p.communicate()
if out != None :
out = out.decode('utf-8')
if err != None :
err = err.decode('utf-8')
print('out ',out)
print('err ',err)
on some shell scripts, it works just fine and I get my output. on others it just hangs. but in every case the shell script runs from the command line with no errors. The only commonality i can see is (usually) the ones that hang have zero output. When stuff fails, I check running processes and i see my shell script is not listed and the python script is still running
Whats a reliable way to call a shell script and always return control to my python program?
Edit:
Using pipes Popen and such is not a requirement, the only requirement is that control is returned to my python script when the shell script exits. If the shell script never returns to the command prompt, then my python script will also never return.
So assuming the shell script(s) I am calling always return to the command prompt, how can I get control back to my python program?
If theres a better way that what ive listed above -- please enlighten me
One additional bit ive found is the shell scripts that "hang" seem to end with a call to 'nohup' Ye they return to the command prompt with no issues.
Whats a reliable way to call a shell script and always return control
to my python program?
If you are using pipes, this will depend on your scripts; a more general answer is essentially the halting problem and even the mighty StackOverflow can't help you with that.
I would encourage you to dig deeper and try to create a reproducible case so that we can help you solve the particular problem you're seeing.
Edit
If you don't need pipes, then just omit the stdout and stderr parameters (or set them to something other than PIPE). See python subprocess management.
I have this command that I load (example.sh) works well in the unix command line.
However, if I execute it in Perl using the system or ` syntax, it doesn't work.
I am guessing certain settings like environment variables and other external sh files weren't loaded.
Is there an example coding to ensure it will work?
More Updates on coding execution failure (I have been trying with different codes):
push (#JOBSTORUN, "cd $a/$b/$c/$d; loadproject cats; sleep 60;");
...
my $pm = new Parallel::ForkManager(3);
foreach my $job (#JOBSTORUN) {
$pm->start and next;
print(`$job`);
$pm->finish;
}
print "\n\n[DONE] FINISHED EXECUTING JOBS\n";
Output Messages:
sh: loadproject: command not found
Can you show us what you have tried so far? How are you running this program?
My first suspicion wouldn't be the environment if you are running it from a login shell. any Perl script you start (well, any program, really) inherits the same environment. However, if you are running the program through cron, then that's a different story.
The other mistakes I usually make in these situations is specifying the relative paths incorrectly. The paths are fine from the command line, but my Perl script has some other current working directory.
For general advice, see Interact with the system when Perl isn't enough. There's also a chapter in Learning Perl about this.
That's about the best advice you can hope for given the very limited information you've shared with us.
I've searched around but haven't quite found what I'm looking for. In a nutshell I have created a bash script to run in a infinite while loop, sleeping and checking if a process is running. The only problem is even if the process is running, it says it is not and opens another instance.
I know I should check by process name and not process id, since another process could jump in and take the id. However all perl programs are named Perl5.10.0 on my system, and I intend on having multiple instances of the same perl program open.
The following "if" always returns false, what am I doing wrong here???
while true; do
if [ ps -p $pid ]; then
echo "Program running fine"
sleep 10
else
echo "Program being restarted\n"
perl program_name.pl &
sleep 5
read -r pid < "${filename}_pid.txt"
fi
done
Get rid of the square brackets. It should be:
if ps -p $pid; then
The square brackets are syntactic sugar for the test command. This is an entirely different beast and does not invoke ps at all:
if test ps -p $pid; then
In fact that yields "-bash: [: -p: binary operator expected" when I run it.
Aside from the syntax error already pointed out, this is a lousy way to ensure that a process stays alive.
First, you should find out why your program is dying in the first place; this script doesn't fix a bug, it tries to hide one.
Secondly, if it is so important that a program remain running, why do you expect your (at least once already) buggy shell script will do the job? Use a system facility that is specifically designed to restart server processes. If you say what platform you are using and the nature of your server process. I can offer more concrete advice.
added in response to comment:
Sure, there are engineering exigencies, but as the OP noted in the OP, there is still a bug in this attempt at a solution:
I know I should check by process name
and not process id, since another
process could jump in and take the id.
So now you are left with a PID tracking script, not a process "nanny". Although the chances are small, the script as it now stands has a ten second window in which
the "monitored" process fails
I start up my week long emacs process which grabs the same PID
the nanny script continues on blissfully unaware that its dependent has failed
The script isn't merely buggy, it is invalid because it presumes that PIDs are stable identifiers of a process. There are ways that this could be better handled even at the shell script level. The simplest is to never detach the execution of perl from the script since the script is doing nothing other than watching the subprocess. For example:
while true ; do
if perl program_name.pl ; then
echo "program_name terminated normally, restarting"
else
echo "oops program_name died again, restarting"
fi
done
Which is not only shorter and simpler, but it actually blocks for the condition that you are really interested in: the run-state of the perl program. The original script repeatedly checks a bad proxy indication of the run state condition (the PID) and so can get it wrong. And, since the whole purpose of this nanny script is to handle faults, it would be bad if it were faulty itself by design.
I totally agree that fiddling with the PID is nearly always a bad idea. The while true ; do ... done script is quite good, however for production systems there a couple of process supervisors which do exactly this and much more, e.g.
enable you to send signals to the supervised process (without knowing it's PID)
check how long a service has been up or down
capturing its output and write it to a log file
Examples of such process supervisors are daemontools or runit. For a more elaborate discussion and examples see Init scripts considered harmful. Don't be disturbed by the title: Traditional init scripts suffer from exactly the same problem like you do (they start a daemon, keep it's PID in a file and then leave the daemon alone).
I agree that you should find out why your program is dying in the first place. However, an ever running shell script is probably not a good idea. What if this supervising shell script dies? (And yes, get rid of the square braces around ps -p $pid. You want the exit status of ps -p $pid command. The square brackets are a replacement for the test command.)
There are two possible solutions:
Use cron to run your "supervising" shell script to see if the process you're supervising is still running, and if it isn't, restart it. The supervised process can output it's PID into a file. Your supervising program can then cat this file and get the PID to check.
If the program you're supervising is providing a service upon a particular port, make it an inetd service. This way, it isn't running at all until there is a request upon that port. If you set it up correctly, it will terminate when not needed and restart when needed. Takes less resources and the OS will handle everything for you.
That's what kill -0 $pid is for. It returns success if a process with pid $pid exists.
I have a perl script (part of the XMLTV family of "grabbers", specifically tv_grab_oztivo).
I can successfully run it like this:
/sw/bin/perl /path/to/tv_grab_oztivo --output /path/to/tv.xml
I use the full paths to everything to eliminate issues with the Working Directory. Permissions shouldn't be a problem.
So, if I run it from the Terminal (Mac OSX) it works just fine.
But when I set it to run via a cron job, nothing appears to happen at all. No output is created etc.
There isn't anything wrong with the crontab as far as I can see, because if I substitute a helloworld.pl for the actual script, it runs just fine at the right time.
So, what can I do to debug? I can see from looking at %ENV in the two cases that the environment is very different, but what other approaches can I take to debugging? How can I see the output of the cron job, which might be some kind of perl "die" message or "not found" message from the shell or whatever?
Or should I be trying to somehow give the cron version of the command the same environment as when it's running as me?
It's often because you don't get the full environment when running under cron. Best bet is to capture the ouput by using the command:
( /sw/bin/perl /path/to/tv_grab_oztivo ... ) >/tmp/qq 2>&1
and then have a look at /tmp/qq.
If it does turn out to be a missing environment, then you may need to put:
. ~/.profile
or something similar, into the execution chain of your cron job, such as:
( . ~/.profile ; /sw/bin/perl /path/to/tv_grab_oztivo ... ) >/tmp/qq 2>&1
If you're looking at %ENV in the two cases, I'd suggest that, as a first step in your perl script, set %ENV to what it is in a cron job, and then trying to run it from the command line. You may need to exec yourself once for this to take full control:
BEGIN {
if (exists $ENV{something_in_your_env_not_in_cron}) {
%ENV = (...);
exec $^X, $0, #ARGV;
}
}
Now try running it, and seeing if there's anything you can do to debug it (including running under perl -d if required). Most likely, you'll find that you end up adding items back into %ENV one at a time until it magically starts working (LD_LIBRARY_PATH is a good one for this, but ORACLE_HOME or DB2HOME for Oracle or DB2 apps might be good choices, too). Then you can either set the variable in your script, or in the crontab.
I'd run a simple shell script by absolute path from the cron command.
Inside that script, I'd ensure that I trapped stdout and stderr to a known (or knowable) file. I'd also ensure that enough of your environment is set. On Unix, you get almost no environment set at all when you run a command via cron - I'm not sure about MacOS X. The standard culprit for problems is PATH. I have a separate .cronfile that sets my working environment enough that I usually don't have problems - that's an analogue of .profile.
On occasion if you can't figure out what's going wrong with your command line, the simplest way to fix it is to turn the whole thing into a shell script. Ideally you shouldn't have to do this, but it can be the fastest way to solve the problem.
File: /files/cron1.sh
#!/bin/sh
/sw/bin/perl /path/to/tv_grab_oztivo --output /path/to/tv.xml
And then in cron:
/files/cron1.sh
This allows you to test the script independent of cron. Remember though that your login shell runs with different environment variables than cron does.
cron usually captures the output of stdout and stderr and e-mailes any output to the crontab owner.
Did you double check your crontab entry to make sure it's valid and will execute at the right time?
Make sure that the script does not need any environment variables set. Otherwise wrap it in another (bash) script, where you can set the environment variables that the other script expects.