SIGTERM does not terminate script - sh

I am trying to understand this script:
#!/usr/bin/env sh
trap "pkill -f sleep" term
sleep 1000
After running the script, I would like to stop it by sending a SIGTERM:
# run script in the background
$ ./signals.sh &
[1] 5389
# check if script is running
$ ps aux | grep 'signals.sh\|sleep' | grep -v grep
sergioro 5389 0.0 0.0 216520 3112 pts/0 S 09:07 0:00 sh ./signals.sh
sergioro 5390 0.0 0.0 214984 708 pts/0 S 09:07 0:00 sleep 1000
# send SIGTERM to script
$ pkill -fe signals.sh
sh killed (pid 5389)
# why hasn't the script stopped after receiving SIGTERM?
$ ps aux | grep 'signals.sh\|sleep' | grep -v grep
sergioro 5389 0.0 0.0 216520 3112 pts/0 S 09:07 0:00 sh ./signals.sh
sergioro 5390 0.0 0.0 214984 708 pts/0 S 09:07 0:00 sleep 1000
# send SIGTERM to `sleep` command
$ pkill -fe sleep
sleep killed (pid 5390)
Terminated
# script has stopped
$ ps aux | grep 'signals.sh\|sleep' | grep -v grep
My question is how to stop the script by sending SIGTERM to the script itself and not to the sleep command? And why is the trap in the script not terminating the sleep command?

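The shell runs a trap only after the foreground command it is waiting on (here, sleep) has finished, so the handler is deferred until sleep exits.
Running sleep in the background and waiting for it achieves what you expected: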
#!/usr/bin/env sh
trap "pkill -f sleep" term
sleep 1000 & wait
See the SIGNALS section of the bash man page, especially the last paragraph:
If bash is waiting for a command to complete and receives a signal for
which a trap has been set, the trap will not be executed until the command completes. When bash is waiting for an asynchronous command via
the wait builtin, the reception of a signal for which a trap has been
set will cause the wait builtin to return immediately with an exit status greater than 128, immediately after which the trap is executed.
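As a rough illustration of the wait-based approach, here is a minimal sketch that starts a child script, signals it, and checks that its trap ran. The temporary file and the variable names (out, script, child) are made up for this example, and the trap kills only the script's own sleep via $! instead of matching every sleep by name, which is a deliberate change from pkill -f sleep.

```shell
#!/bin/sh
# The child script blocks in the `wait` builtin, not in a foreground
# command, so its TERM trap can run as soon as the signal arrives.
out=$(mktemp)
sh -c '
    trap "kill \$child 2>/dev/null; echo trapped; exit 0" TERM
    sleep 30 &
    child=$!          # remember our own sleep instead of matching by name
    wait "$child"
' >"$out" &
script=$!

sleep 1                  # give the child time to install its trap
kill -TERM "$script"     # stands in for `pkill -fe signals.sh`

status=0
wait "$script" || status=$?
result=$(cat "$out")
rm -f "$out"
echo "script exited with status $status and printed: $result"
```

Killing the saved PID avoids taking down unrelated sleep processes that happen to be running on the same machine.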

Related

Debugging a specified emacs of multiple, when one is frozen

Two Emacs servers are running on my machine:
$ ps -ef |grep emacs | sed "s/$USER/me/g"
me 4010 1 6 13:52 ? 00:02:58 /snap/emacs/25/usr/bin/emacs --daemon=orging
me 4538 1 3 13:52 ? 00:01:45 /snap/emacs/25/usr/bin/emacs --daemon=coding
me 4622 1 0 13:52 pts/1 00:00:00 /snap/emacs/25/usr/bin/emacsclient /home/me/ORG/os.org -c -s orging
me 4623 1 0 13:52 pts/1 00:00:00 /snap/emacs/25/usr/bin/emacsclient /home/me/ORG/algorithms.org -c -s coding
me 8945 3548 0 14:38 pts/1 00:00:00 grep --color=auto emacs
The 'orging' one is frozen.
After reading "What do I do when Emacs is frozen?" on Emacs Stack Exchange, I found this solution:
pkill -SIGUSR2 emacs
How can I apply this operation only to a specific Emacs instance, say 'orging'?
Use the kill command and specify the pid of the emacs instance you want, in this case 4010:
kill -SIGUSR2 4010
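If you would rather not read the PID off the screen, something like the following sketch could pick it out of the ps -ef output by daemon name. The function name daemon_pid is hypothetical, and the sample lines are copied from the question so the snippet runs without Emacs installed.

```shell
#!/bin/sh
# Print the PID (second field of `ps -ef` output) of the Emacs daemon
# whose --daemon= name is given as $1. Reads ps output on stdin.
daemon_pid() {
    awk -v name="$1" '$0 ~ ("emacs --daemon=" name "$") { print $2 }'
}

# Sample ps output taken from the question.
sample='me 4010 1 6 13:52 ? 00:02:58 /snap/emacs/25/usr/bin/emacs --daemon=orging
me 4538 1 3 13:52 ? 00:01:45 /snap/emacs/25/usr/bin/emacs --daemon=coding
me 4622 1 0 13:52 pts/1 00:00:00 /snap/emacs/25/usr/bin/emacsclient /home/me/ORG/os.org -c -s orging'

pid=$(printf '%s\n' "$sample" | daemon_pid orging)
echo "orging daemon pid: $pid"

# Real use (not run here):
#   ps -ef | daemon_pid orging | xargs kill -USR2
```

Anchoring the pattern on "emacs --daemon=" keeps the emacsclient line, which also mentions orging, from matching.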

Perl fails to kill self pid when running from bash script

Following code behaves as expected when running from terminal:
perl -e 'kill -2, $$; warn HERE, $/'
It sends itself SIGINT and dies before reaching "HERE":
~# perl -e 'kill -2, $$; warn HERE, $/'
~# echo $?
130
~#
The problem: same code fails to kill self PID when running from shell script:
~# cat 1.sh
perl -e 'kill -2, $$; warn HERE, $/'
~#
~# sh 1.sh
HERE
~#
~# echo $?
0
~#
On the other hand, replacing perl's kill by a shell's one works OK:
~# cat 2.sh
perl -e 'qx/kill -2 $$/; warn HERE, $/'
~#
~# sh 2.sh
~#
~# echo $?
130
~#
I don't really understand what is happening here; please help.
First of all,
kill -2, $$
is better written as
kill 2, -$$
An even better alternative is
kill INT => -$$
These send SIGINT to the specified process group.
Your main question appears to be why the two shells behave differently. This section explains that.
The process group represents an application.
When you launch a program from an interactive shell, it's not part of a larger application, so the shell creates a new process group for the program.
However, processes created by a script (i.e. a non-interactive shell) are part of the same application as the script itself, so the shell doesn't create a new process group for them.
You can visualize this using the following:
sh -i <<< 'perl -e '\''system ps => -o => "pid,ppid,pgrp,comm"'\''' outputs the following:
$ perl -e 'system ps => -o => "pid,ppid,pgrp,comm"'
PID PPID PGRP COMMAND
8179 8171 8179 bash
14654 8179 14654 sh
14655 14654 14655 perl
14656 14655 14655 ps
$ exit
In interactive mode, perl is at the head of perl and ps's process group.
sh <<< 'perl -e '\''system ps => -o => "pid,ppid,pgrp,comm"'\''' outputs the following:
PID PPID PGRP COMMAND
8179 8171 8179 bash
14584 8179 14584 sh
14585 14584 14584 perl
14586 14585 14584 ps
In non-interactive mode, sh is at the head of perl and ps's process group.
Your failures are the result of not sending the signal to the head of the process group (i.e. the application). Had you checked, the error kill reported was ESRCH ("No such process").
ESRCH The pid or process group does not exist. [...]
To kill the current process's process group, replace the improper
kill INT => -$$ # XXX
with
kill INT => -getpgrp() # Kill the application
You can make your perl the head of its own process group by simply calling the following:
setpgrp();
Test:
$ sh <<< 'perl -e '\''system ps => ( -o => "pid,ppid,pgrp,comm" )'\'''
PID PPID PGRP COMMAND
8179 8171 8179 bash
16325 8179 16325 sh
16326 16325 16325 perl
16327 16326 16325 ps
$ sh <<< 'perl -e '\''setpgrp(); system ps => ( -o => "pid,ppid,pgrp,comm" )'\'''
PID PPID PGRP COMMAND
8179 8171 8179 bash
16349 8179 16349 sh
16350 16349 16350 perl
16351 16350 16350 ps
That's not something you normally want to do.
Finally, the Perl code
kill INT => -$pgrp
is equivalent to the following call of the kill command-line utility:
kill -s INT -$pgrp
kill -INT -$pgrp
kill -2 -$pgrp
You were missing the - in your qx// program, so it was sending SIGINT to the identified process rather than to the identified process group.
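The same difference can be reproduced from a plain shell script. The sketch below assumes it runs in a non-interactive shell without job control, so the background child stays in the script's process group. It uses SIGTERM rather than SIGINT because such shells start background commands with SIGINT ignored. The variable names are made up for the example.

```shell
#!/bin/sh
sleep 30 &
child=$!

# Without job control the child is not a group leader, so no process
# group has an ID equal to its PID and the group-wide kill fails.
if kill -s TERM -- "-$child" 2>/dev/null; then
    group_kill=ok
else
    group_kill=failed
fi

# Signalling the PID itself reaches the process.
if kill -s TERM "$child" 2>/dev/null; then
    pid_kill=ok
else
    pid_kill=failed
fi

wait "$child" 2>/dev/null || true
echo "group kill: $group_kill, pid kill: $pid_kill"
```

The failed group kill is the shell counterpart of the ESRCH error that Perl's kill returned in 1.sh.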
From your interactive terminal, the perl process kills the process group of which it is a part. (The shell runs perl in its own process group.) The shell reports this unusual termination in $?:
t0 interactive shell (pid=123, pgrp=123)
|
t1 +------> perl -e (pid=456, pgrp=456, parent=123)
| |
t2 (wait) kill(-2, 456) (in perl, same as kill pgrp 456 w/ SIGINT)
| |
t3 (wait) *SIGINT*
|
t4 report $?
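The value the shell reports can be decoded back into a signal name. This sketch assumes a POSIX shell whose kill -l accepts a signal number. It uses SIGTERM for the live demonstration, since SIGINT may be ignored depending on how the script itself was started, and then decodes the 130 from the question the same way.

```shell
#!/bin/sh
# A child that terminates itself with SIGTERM; the shell reports
# 128 + signal number as its exit status.
status=0
sh -c 'kill -s TERM $$' || status=$?
signame=$(kill -l $((status - 128)))
echo "child: status=$status signal=$signame"

# The 130 seen in the question decodes the same way (130 - 128 = 2).
int_name=$(kill -l $((130 - 128)))
echo "130 means SIG$int_name"
```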
From your shell script, the perl process kills a (likely) non-existent process group and then exits successfully. Your interactive shell makes a new process group in which to run your shell script, and that script then runs perl as a child in the same process group.
t0 shell (pid=123, pgrp=123)
|
t1 +-------> shell:1.sh (pid=456, pgrp=456, parent=123)
| |
t2 (wait) +-------------> perl -e (pid=789, pgrp=456, parent=456)
| | |
t3 (wait) (wait) kill pgrp 789 with SIGINT (error: no such pgrp)
| | |
t4 (wait) (wait) exit success
| |
t5 (wait) exit success
|
t6 report $?
In your backticked (qx//) example, your interactive shell starts a shell process with a new process group. (Not that it matters here, but that process runs perl in its same process group.) Perl then runs as its own child the system kill command, the semantics of which differ from that of the perl kill. This grandchild command sends a SIGINT to the perl PID directly, rather than a SIGINT to a process group. Perl terminates, and that exit code is conveyed as the script's exit code, since it was the last command in the script.
This diagram is a little busier than the previous:
t0 shell (pid=123, pgrp=123)
|
t1 +-------> shell:2.sh (pid=456, pgrp=456, parent=123)
| |
t2 (wait) +----------> perl -e (pid=789, pgrp=456, parent=456)
| | |
t3 (wait) (wait) +---------> /bin/kill SIGINT 789
| | | |
t4 (wait) (wait) *SIGINT* exit success
| |
t5 (wait) return $?
|
t6 report $?
It works fine in this way:
perl -e 'kill "INT", $$; warn HERE, $/'
perl -e 'kill 2, $$; warn HERE, $/'
The Perl documentation for kill (perldoc -f kill) says:
A negative signal name is the same as a negative signal number,
killing process groups instead of processes. For example, kill
'-KILL', $pgrp and kill -9, $pgrp will send SIGKILL to the entire
process group specified. That means you usually want to use positive
not negative signals.

How to write a .sh that get process id of application and then stop the application?

For example, I use an app called SomeApp that often needs to be restarted, so I type "ps -ef | grep SomeApp" and then "kill -9 7777",
which first finds the process ID and then stops that process:
XXXX:~ XXXX$ ps -ef | grep SomeApp
333 7777 1 0 1:40PM ?? 0:40.31 /Users/XXXX/SomeApp
333 8888 9999 0 1:58PM abcd000 0:00.00 grep SomeApp
XXXX:~ XXXX$ kill -9 7777
Now I want to put these commands into a .sh script, but there are a few things I don't know how to write:
exclude the result that belongs to my own grep
get the correct result line
get the second field (the process ID) of that line
Can anyone help?
This'll do it.
ps -ef | grep 'SomeApp' | grep -v grep | awk '{print $2}' | xargs kill
Or look at pgrep and pkill depending on the OS.
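A slightly shorter variant drops the grep -v grep stage by wrapping one letter of the pattern in brackets. The sketch below is only an illustration: someapp_pids is a hypothetical helper name, and the sample ps output is adapted from the question (with the grep line changed to show the bracketed pattern) so that nothing is actually killed.

```shell
#!/bin/sh
# Turn `ps -ef` output (stdin) into the PIDs of lines mentioning SomeApp.
# The pattern [S]omeApp matches "SomeApp" but not grep's own command
# line, which contains the literal text "[S]omeApp".
someapp_pids() {
    grep '[S]omeApp' | awk '{ print $2 }'
}

# Sample ps output, adapted from the question for illustration.
sample='333 7777 1 0 1:40PM ?? 0:40.31 /Users/XXXX/SomeApp
333 8888 9999 0 1:58PM ttys000 0:00.00 grep [S]omeApp'

pids=$(printf '%s\n' "$sample" | someapp_pids)
echo "would kill: $pids"

# Real use (not run here):
#   ps -ef | someapp_pids | xargs kill
```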

Can't kill celery workers

Try as I might I cannot kill these celery workers.
I run:
celery --app=my_app._celery:app status
I see I have 3 (I don't understand why 3 workers = 2 nodes, please explain if you know)
celery@ip-x-x-x-x: OK
celery@ip-x-x-x-x: OK
celery@named-worker.%ip-x-x-x-x: OK
2 nodes online.
I run (as root):
ps auxww | grep 'celery@ip-x-x-x-x' | awk '{print $2}' | xargs kill -9
The workers just keep reappearing with a new PID.
Please help me kill them.
A process whose PID keeps changing is called a comet. Even though its PID keeps changing, its process group ID stays constant, so you can look up that ID (the third column of ps axjf output) and send the signal to it:
ps axjf | grep '[c]elery' | awk '{print $3}' | xargs kill -9
Alternatively, you can kill them with pkill:
pkill -f celery
This kills every process whose full command line matches celery.
Reference: killing a process
pkill -f celery
Run from the command line, this will kill all processes related to celery.
In your console, type:
ps -aux | grep celery
I get:
simon 24615 3.8 0.6 344276 219604 pts/3 S+ 22:53 0:56 /usr/bin/python3 /home/simon/.local/bin/celery -A worker_us_task worker -l info -Q us_queue --concurrency=30 -n us_worker@%h
Select what you find after -A and type:
pkill -9 -f 'worker_us_task worker'
I always use:
ps auxww | grep 'celery' | awk '{print $2}' | xargs kill -9
If you're using supervisord to run celery, you need to kill supervisord process also.

How can I kill Perl 'system' calls when the main script is killed?

Previous answers to this question have focused on forks:
kill background process when shell script exit
How to make child process die after parent exits?
Are child processes created with fork() automatically killed when the parent is killed?
For this question, I'm just asking about calls to the 'system' function.
Say I have a script called sleep.pl:
use strict;
use warnings;
sleep(300);
I then have a script called kill.pl
use strict;
use warnings;
system("sleep.pl");
I run kill.pl and using ps I find the process id of kill.pl and kill it (not using kill -9, just normal kill)
sleep.pl is still sleeping.
I imagine the solution to my question involves a SIG handler, but what do I need to put into the handler to kill the child process?
Use setsid to make your process the leader of a new session and process group. You can then send a signal to the group ID and kill all processes that belong to the group. All processes that you spawn from the leader inherit the group ID, so sending a kill to the group kills them all. The one catch is that setsid fails if the calling process is already a process group leader, so it is normally called in a freshly forked child. Calling setpgrp, as in the following script, creates a new process group without starting a new session:
use strict;
use warnings;
setpgrp $$, 0;
system("sleep.pl");
END {kill 15, -$$}
But if you need this approach, you are doing something wrong, and you should not do it. Start and kill your kill.pl process the right way instead:
$ perl -e 'system("sleep 100")' &
[1] 11928
$ ps f
PID TTY STAT TIME COMMAND
4564 pts/1 Ss 0:01 /bin/bash
11928 pts/1 S 0:00 \_ perl -e system("sleep 100")
11929 pts/1 S 0:00 | \_ sleep 100
11936 pts/1 R+ 0:00 \_ ps f
$ kill %1
[1]+ Terminated perl -e 'system("sleep 100")'
$ ps f
PID TTY STAT TIME COMMAND
4564 pts/1 Rs 0:01 /bin/bash
11949 pts/1 R+ 0:00 \_ ps f
How does it work? The shell (bash in my case) makes your process a process group leader when you run it in the background. If you then use the kill %N job syntax, the shell signals the whole group. Compare this, where only the PID is signalled and sleep survives:
$ perl -e 'system("sleep 100")' &
[1] 12109
$ ps f
PID TTY STAT TIME COMMAND
4564 pts/1 Rs 0:01 /bin/bash
12109 pts/1 S 0:00 \_ perl -e system("sleep 100")
12113 pts/1 S 0:00 | \_ sleep 100
12114 pts/1 R+ 0:00 \_ ps f
$ kill 12109
[1]+ Terminated perl -e 'system("sleep 100")'
$ ps f
PID TTY STAT TIME COMMAND
4564 pts/1 Ss 0:01 /bin/bash
12124 pts/1 R+ 0:00 \_ ps f
12113 pts/1 S 0:00 sleep 100
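kill %1, by contrast, signals the job's whole process group, which is equivalent to passing the negative group ID (some shells require this to be written as kill -- -12126):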
$ perl -e 'system("sleep 100")' &
[1] 12126
$ ps f
PID TTY STAT TIME COMMAND
4564 pts/1 Rs 0:01 /bin/bash
12126 pts/1 S 0:00 \_ perl -e system("sleep 100")
12127 pts/1 S 0:00 | \_ sleep 100
12128 pts/1 R+ 0:00 \_ ps f
$ kill -12126
[1]+ Terminated perl -e 'system("sleep 100")'
$ ps f
PID TTY STAT TIME COMMAND
4564 pts/1 Ss 0:01 /bin/bash
12130 pts/1 R+ 0:00 \_ ps f
Your question really is a specific instance of the more general "how to ensure a child dies when the parent dies" question. There's just no getting around that. There's nothing really special about system(); it just forks a child process and waits for it to exit.
system() is roughly this:
sub system
{
    my $kidpid = fork;
    if ( $kidpid )
    {
        # parent process blocks here waiting for the child process to return
        waitpid $kidpid, 0;
    }
    else
    {
        # child process replaces itself with the requested command
        exec( @_ );
    }
}
Killing the process group is about your only option.