How to detect an error at the beginning of a pipeline? - sh

In my script I need to work with the exit status of the non-last command of a pipeline:
do_real_work 2>&1 | tee real_work.log
To my surprise, $? contains the exit code of the tee. Indeed, the following command:
false 2>&1 | tee /dev/null ; echo $?
outputs 0. This is surprising, because csh's (almost) equivalent
false |& tee /dev/null ; echo $status
prints 1.
How do I get the exit code of the non-last command of the most recent pipeline?

Bash has set -o pipefail, which uses the exit status of the last (rightmost) command to exit with a non-zero status (if any) as the exit status of the pipeline.
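For example, a quick check in bash:
set -o pipefail
false 2>&1 | tee /dev/null ; echo $?   # now prints 1 instead of 0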
POSIX shell doesn't have such a feature AFAIK. You could work around that with a different approach:
tail -F -n0 real_work.log &
do_real_work > real_work.log 2>&1
kill $!
That is, start following the (not yet existing) log file before running the command, and kill the tail process after the command has finished.
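A minimal sketch of that workaround which also records the exit status (the variable names are just illustrative):
tail -F -n0 real_work.log &          # follow the log, even if it doesn't exist yet
tail_pid=$!
do_real_work > real_work.log 2>&1
status=$?                            # exit status of do_real_work itself, since there is no pipeline
kill "$tail_pid"
echo "do_real_work exited with $status"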

How do I automate killing a job in cron? [duplicate]

Sometimes when I try to start Firefox it says "a Firefox process is already running". So I have to do this:
jeremy#jeremy-desktop:~$ ps aux | grep firefox
jeremy 7451 25.0 27.4 170536 65680 ? Sl 22:39 1:18 /usr/lib/firefox-3.0.1/firefox
jeremy 7578 0.0 0.3 3004 768 pts/0 S+ 22:44 0:00 grep firefox
jeremy#jeremy-desktop:~$ kill 7451
What I'd like is a command that would do all that for me. It would take an input string and grep for it (or whatever) in the list of processes, and would kill all the processes in the output:
jeremy#jeremy-desktop:~$ killbyname firefox
I tried doing it in PHP but exec('ps aux') seems to only show processes that have been executed with exec() in the PHP script itself (so the only process it shows is itself.)
pkill firefox
More information: http://linux.about.com/library/cmd/blcmdl1_pkill.htm
Also possible to use:
pkill -f "Process name"
For me, it worked perfectly; it was exactly what I had been looking for.
Note that without -f, pkill matches only against the process name; when -f is set, the full command line is used for pattern matching.
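For example (the script name here is only a placeholder):
pkill firefox                      # matches against the process name only
pkill -f "myscript.py --serve"     # matches against the full command line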
You can kill processes by name with killall <name>
killall sends a signal to all processes running any of the specified commands. If no signal name is specified, SIGTERM is sent. Signals can be specified either by name (e.g. -HUP or -SIGHUP), by number (e.g. -1), or by option -s.
If the command name is not a regular expression (option -r) and contains a slash (/), processes executing that particular file will be selected for killing, independent of their name.
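For example, using the full path from the ps output in the question (assuming that is the binary actually running):
killall /usr/lib/firefox-3.0.1/firefox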
But if you don't see the process with ps aux, you probably won't have the right to kill it ...
A bit longer alternative:
kill `pidof firefox`
The easiest way is to first check that you are getting the right process IDs with:
pgrep -f [part_of_a_command]
If the result is as expected, go with:
pkill -f [part_of_a_command]
If processes get stuck and are unable to complete the request, you can use kill -9:
kill -9 $(pgrep -f [part_of_a_command])
If you want to be on the safe side and only terminate processes that you initially started, add -u along with your username:
pkill -f [part_of_a_command] -u [username]
Kill all processes that have a given snippet in their startup path. You can kill all apps started from some directory by putting /directory/ as the snippet. This is quite useful when you start several components of the same application from the same app directory.
ps ax | grep <snippet> | grep -v grep | awk '{print $1}' | xargs kill
* I would prefer pgrep if available
Strangely, I haven't seen a solution like this yet:
kill -9 `pidof firefox`
It can also kill multiple processes (multiple PIDs), like:
kill -9 `pgrep firefox`
I prefer pidof since it has single-line output:
> pgrep firefox
6316
6565
> pidof firefox
6565 6316
Using killall command:
killall processname
Use -9 or -KILL to forcefully kill the program (the options are similar to the kill command).
On Mac I could not find pgrep or pkill, and killall wasn't working either, so I wrote a simple one-liner:
export pid=`ps | grep process_name | awk 'NR==1{print $1}' | cut -d' ' -f1`;kill $pid
If there's an easier way of doing this then please share.
To kill by name with pgrep:
kill -9 `pgrep myprocess`
more correct would be:
export pid=`ps aux | grep process_name | awk 'NR==1{print $2}' | cut -d' ' -f1`;kill -9 $pid
I normally use the killall command.
Check this link for details of this command.
I was asking myself the same question, but the problem with the current answers is that they don't safety-check the processes to be killed, so it could lead to terrible mistakes :) especially if several processes match the pattern.
As a disclaimer, I'm not a sh pro and there is certainly room for improvement.
So I wrote a little sh script:
#!/bin/sh
killables=$(ps aux | grep "$1" | grep -v mykill | grep -v grep)
if [ -n "${killables}" ]
then
    echo "You are going to kill some processes:"
    echo "${killables}"
else
    echo "No process with the pattern $1 found."
    exit
fi
printf "Is it ok? (Y/N) "
read input
if [ "$input" = "Y" ]
then
    for pid in $(echo "${killables}" | awk '{print $2}')
    do
        echo "killing $pid ..."
        kill "$pid"
        echo "$pid killed"
    done
fi
kill -9 $(ps aux | grep -e myprocessname | awk '{ print $2 }')
If you run GNOME, you can use the system monitor (System->Administration->System Monitor) to kill processes as you would under Windows. KDE will have something similar.
The default kill command accepts command names as an alternative to a PID; see kill(1). A frequently occurring problem is that bash provides its own kill builtin, which accepts job numbers, like kill %1, but not command names, and this builtin shadows the default command. If the former functionality is more useful to you than the latter, you can disable the bash version by calling
enable -n kill
For more info, see the kill and enable entries in bash(1).
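A hedged example; whether the external kill accepts a name depends on which kill your system provides (the util-linux kill does):
enable -n kill     # disable the bash builtin in the current shell
kill firefox       # now the external kill is used, which may accept a process name
enable kill        # re-enable the builtin when you are done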
ps aux | grep processname | cut -d' ' -f7 | xargs kill -9
An awk one-liner that parses the header of the ps output, so you don't need to care about column numbers (only column names). It supports regex. For example, to kill all processes whose executable name (without path) contains the word "firefox", try
ps -fe | awk 'NR==1{for (i=1; i<=NF; i++) {if ($i=="COMMAND") Ncmd=i; else if ($i=="PID") Npid=i} if (!Ncmd || !Npid) {print "wrong or no header" > "/dev/stderr"; exit} }$Ncmd~"/"name"$"{print "killing "$Ncmd" with PID " $Npid; system("kill "$Npid)}' name=.*firefox.*

Bash script to monitor Mongo db doesn't work

I am writing a bash script to monitor my MongoDB status: once it crashes, restart it. The script is as below:
while true
do
ret = $("mongod --config /etc/mongod.conf")
if $ret == 0
then
echo "I am out with code 0."
break
fi
echo "running again"
done
echo "I am out with code $?"
But it doesn't seem to work. The output from the system is:
running again
./mongo-text: line 3: mongod --config /etc/mongod.conf: No such file or directory
./mongo-text: line 3: ret: command not found
./mongo-text: line 4: ==: command not found
not sure what the problem is. Any help is appreciated.
Your loop can be made much simpler:
while ! mongod --config /etc/mongod.conf; do
echo "running again" >&2
sleep 1
done
if test -n "$VERBOSE"; then echo 'mongod successful'; fi
Note that the if keyword executes a command. So if $ret == 0 attempts to run the command $ret (assuming that variable is non-empty and contains no whitespace) with the arguments == and 0. That is almost certainly not what you intend. It is more typical to write if test "$ret" = 0 or if [ "$ret" = 0 ]. If $ret is empty, then it is attempting to execute the command == with the single argument 0.
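If you do want the status in a variable, a minimal sketch looks like this:
mongod --config /etc/mongod.conf
ret=$?                            # note: no spaces around '=' in an assignment
if [ "$ret" -eq 0 ]; then
    echo "I am out with code 0."
fi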
There are several issues in your code:
ret = $(...) has spaces around the =, so the shell tries to run a command named ret (hence the "ret: command not found" error); an assignment must be written as ret=$(...)
$("mongod --config /etc/mongod.conf") tries to run the whole quoted string mongod --config /etc/mongod.conf, spaces included, as a single command name
the if syntax is wrong
You can rewrite it this way:
while :; do
if mongod --config /etc/mongod.conf; then
echo "I am out with code 0."
break
fi
echo "running again"
# probably sleep for a few seconds here
done
echo "I am out with code $?"
For info about if statements, see:
How to check the exit status using an if statement
How to compare strings in Bash
Compound if statements with multiple expressions in Bash

How to capture error message from prompt - shell or perl

I am trying to capture the output of a command. It works fine if the command succeeds. However, when there is an error, I am unable to capture what gets displayed on the command line.
E.g.:
$ out=`/opt/torque/bin/qsub submitscript`
qsub: Unauthorized Request MSG=group ACL is not satisfied: user abc#xyz.org, queue home
$ echo $out
$
I want $out to contain the message.
Thanks!
Errors are printed on stderr, so you need to redirect them into stdout so that the backticks will capture them:
out=`/opt/torque/bin/qsub submitscript 2>&1`
if [ $? -gt 0 ] ; then
# By convention, this is sent to stderr, but if you need it on
# stdout, just remove the >&2 redirection
echo "Error: $out" >&2
else
echo "Success: $out"
fi
You should test the exit status of the command to figure out what the output represents (one way is shown above). It is similar in Perl, with slightly different syntax of course.
Have you tried doing it like this?
$ out=`/opt/torque/bin/qsub submitscript 2>&1 > /dev/null`
$ echo $out
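If you need stdout and stderr in separate variables, one common sketch uses a temporary file (mktemp here is only one way to get a scratch file):
err_file=$(mktemp)
out=`/opt/torque/bin/qsub submitscript 2>"$err_file"`
status=$?
err=$(cat "$err_file")
rm -f "$err_file"
echo "exit status: $status"
echo "stdout: $out"
echo "stderr: $err"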

How is this bash script resulting in an infinite loop?

From some Googling (I'm no bash expert by any means) I was able to put together a bash script that allows me to run a test suite and output a status bar at the bottom while it runs. It typically takes about 10 hours, and the status bar tells me how many tests passed and how many failed.
It works great sometimes, however occasionally I will run into an infinite loop, which is bad (mmm-kay?). Here's the code I'm using:
#!/bin/bash
WHITE="\033[0m"
GREEN="\033[32m"
RED="\033[31m"
(run_test_suite 2>&1) | tee out.txt |
while IFS=read -r line;
do
printf "%$(tput cols)s\r" " ";
printf "%s\n" "$line";
printf "${WHITE}Passing Tests: ${GREEN}$(grep -c passed out.txt)\t" 2>&1;
printf "${WHITE}Failed Tests: ${RED}$( grep -c FAILED out.txt)${WHITE}\r" 2>&1;
done
What happens when I encounter the bug is I'll have an error message repeat infinitely, causing the log file (out.txt) to become some multi-megabyte monstrosity (I think it got into the GB's once). Here's an example error that repeats (with four lines of whitespace between each set):
warning caused by MY::Custom::Perl::Module::TEST_FUNCTION
print() on closed filehandle GEN3663 at /some/CPAN/Perl/Module.pm line 123.
I've tried taking out the 2>&1 redirect, and I've tried changing while IFS=read -r line; to while read -r line;, but I keep getting the infinite loop. What's stranger is this seems to happen most of the time, but there have been times I finish the long test suite without any problems.
EDIT:
The reason I'm writing this is to upgrade from a black & white test suite to a color-coded test suite (hence the ANSI codes). Previously, I would run the test suite using
run_test_suite > out.txt 2>&1 &
watch 'grep -c FAILED out.txt; grep -c passed out.txt; tail -20 out.txt'
Running it this way gets the same warning from Perl, but prints it to the file and moves on, rather than getting stuck in an infinite loop. Using watch also prints stuff like [32m rather than actually rendering the text as green.
I was able to fix the perl errors and the bash script seems to work well now after a few modifications. However, it seems this would be a safer way to run the test suite in case something like that were to happen in the future:
#!/bin/bash
WHITE="\033[0m"
GREEN="\033[32m"
RED="\033[31m"
run_full_test > out.txt 2>&1 &
tail -f out.txt | while IFS= read line; do
printf "%$(tput cols)s\r" " ";
printf "%s\n" "$line";
printf "${WHITE}Passing Tests: ${GREEN}$(grep -c passed out.txt)\t" 2>&1;
printf "${WHITE}Failed Tests: ${RED}$( grep -c 'FAILED!!' out.txt)${WHITE}\r" 2>&1;
done
There are some downsides to this. Mainly, if I hit Ctrl-C to stop the test, it appears to have stopped, but really run_full_test is still running in the background and I need to remember to kill it manually. Also, when the test is finished, tail -f is still running. In other words, there are two processes running here and they are not in sync.
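One hedged way to address both downsides, assuming GNU tail (which supports --pid) and using a trap to clean up on Ctrl-C:
#!/bin/bash
run_full_test > out.txt 2>&1 &
suite_pid=$!
trap 'kill "$suite_pid" 2>/dev/null' EXIT              # stop the suite if this script exits or is interrupted
tail -f --pid="$suite_pid" out.txt | while IFS= read -r line; do
    printf "%s\n" "$line"                              # the status-bar printf lines from above would go here
done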
Here is the original script, slightly modified, which addresses those problems, but isn't foolproof (i.e. can get stuck in an infinite loop if run_full_test has issues):
#!/bin/bash
WHITE="\033[0m"
GREEN="\033[32m"
RED="\033[31m"
(run_full_test 2>&1) | tee out.txt | while IFS= read line; do
printf "%$(tput cols)s\r" " ";
printf "%s\n" "$line";
printf "${WHITE}Passing Tests: ${GREEN}$(grep -c passed out.txt)\t" 2>&1;
printf "${WHITE}Failed Tests: ${RED}$( grep -c 'FAILED!!' out.txt)${WHITE}\r" 2>&1;
done
The bug is in your script. That's not an IO error; that's an illegal argument error. That error happens when the variable you provide as a handle isn't a handle at all, or is one that you've closed.
Writing to a broken pipe results in the process being killed by SIGPIPE or in print returning false with $! set to EPIPE.

How can I make a shell script indicate that it was successful?

If I have a basic .sh file containing the following script code:
#!/bin/sh
rm -rf "MyFolder"
How do I make this running script file display results to the terminal that will indicate if the directory removal was successful?
You don't really need to make it say it was successful. You could have it say something only on error ✖, and then silence means success ✔.
That's how the Unix philosophy works:
The rule of silence, also referred to as the silence is golden rule, is an important part of the Unix philosophy that states that when a program has nothing surprising, interesting or useful to say, it should say nothing. It means that well-behaved programs should treat their users' attention and concentration as being valuable and thus perform their tasks as unobtrusively as possible. That is, silence in itself is a virtue. http://www.linfo.org/rule_of_silence.html
That's the way rm itself behaves.
If you are asking about the general case, as suggested by your question's title, you can run your script with sh -x scriptname to see what it's doing. It's also quite common to write diagnostic output into the script itself, and control it with an option.
#!/bin/sh
verbose=false
case $1 in -v | --verbose )
verbose=true
shift ;;
esac
say () {
$verbose || return
echo "$0: $#" >&2
}
say "Removing $dir ..."
rm -rf "$dir" || say "Failed."
If you run this script without any options, it will run silently, like a well-behaved Unix utility should. If you run it with the -v option, it will print some diagnostics to standard error.
rm -rf "My Folder" && echo "Done" || echo "Error!"
You can read more about creating a sequence of pipelines in the bash manual.
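One caveat worth noting: with cmd && echo "Done" || echo "Error!", the "Error!" branch would also run if the echo "Done" itself failed. An explicit if/else avoids that ambiguity:
if rm -rf "My Folder"; then
    echo "Done"
else
    echo "Error!" >&2
fi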
In bash (and other similar shells), the special parameter $? gives you the exit status of the last executed command. So you can do:
#!/bin/sh
rm -rf "My Folder"
echo $?
UPDATE
If, once the rm command has been executed, the directory doesn't exist (because it was successfully removed or because it didn't exist in the first place), the script will print 0. If the directory still exists (which means the command was unable to remove it), the script will print an exit code other than 0. If I understand the question properly, this is exactly the requested behavior. If it is not, please correct me.
The previous answers were wrong: with -f, rm doesn't exit with an error code > 0 when the directory isn't present.
Instead, I recommend using:
dir='/path/to/dir'
if [[ -d $dir ]]; then
rm -rf "$dir"
fi
If you want rm to return a status, remove the -f flag.
Example on Linux Mint (the dir doesn't exist):
$ rm -rf /tmp/sdfghjklm
$ echo $?
0
$ rm -r /tmp/sdfghjklm
$ echo $?
1