This question has been asked here in multiple forms. I am asking it again because all these questions had too many details. Hence the answers all boiled down to how to solve those specific problems without jumping between users.This I why am posting this as a new question (and immediately answering it below) for others that have this problem.
Suppose you have a perl script that you run as root where you first want to run things as root, then things as a regular user and then as root again.
For example:
#!/usr/bin/perl
#Problem 1: Make sure to start as root
system("whoami");
#Problem 2: Become your regular user
system("whoami");
#Problem 3: Become root again
system("whoami);
should be changed to show:
root
your_username
root
This the best solution I can think of.
If you want to start as root, become a regular user and become root again:
#!/usr/bin/perl
use POSIX qw/setuid waitpid/;
exec("sudo", $0, #ARGV) unless($< == 0); #Restart the program as root if you are a regular user
system("whoami");
my $pid = fork; #create a extra copy of the program
if($pid == 0) {
#This block will contain code that should run as a regular user
setuid(1000); #So switch to that user (e.g. the one with UID 1000)
system("whoami");
exit; #make sure the child stops running once the task for the regular user are done
}
#Everything after this will run in the parent where we are still root
waitpid($pid, 0); #wait until the code of the child has finished
system("whoami");
When starting as a regular user it's best to make sure that the parent stays a regular user and the child becomes root. You can do this like this:
#!/usr/bin/perl
use POSIX qw/setuid waitpid/;
unless($< == 0) {
#regular user code, this is the first code that will run
system("whoami");
#now fork, let the child become root and let the parent wait for the child
my $pid = fork;
exec("sudo", $0, #ARGV) if($pid == 0);
waitpid($pid, 0);
#continue with regular user code, this is the 3th part that will run
system("whoami");
exit; #the end of the program has been reached, exit or we would continue with code meant for root
}
#code for root, this is the 2nd part that will run
system("whoami");
Related
In perl on RHEL6, I need to detect when an xterm running in a forked process has been exited by the user of that xterm.
Here's some code that demonstrates the problem...
#!/usr/bin/env perl
use strict;
my $fork_stat = do_fork_sub();
print "fork_stat: ${fork_stat}\n";
exit;
sub do_fork_sub {
my $pid = fork();
if($pid == 0) {
my $cmd = "xterm; tcsh &";
system($cmd);
# When system command above finishes, the xterm is done, exit
# this child process...
exit;
}
for(;;) {
# If something happened in the xterm that
# created file "child_complete_flag", return "SUCCESS".
if(-e "child_complete.flag") {return("SUCCESS");}
# If the user of the xterm gave up on the work by exiting,
# return "FAILED".
my $pid_srch_cmd = "ps uax | awk '{print \$2}' | grep ${pid}";
print "-I- searching for pid using... ${pid_srch_cmd}\n";
my $fork_pid_alive = `${pid_srch_cmd}`;
if(!($fork_pid_alive)) {
print "Detected end of xterm proc\n";
return("FAILED");
}
sleep(3);
}
}
This doesn't work. If I exit the xterm, the forked process persists anyway. I tried sticking a system call to a "kill -9" of $$ after the system call that creates the xterm. No luck.
Any suggestions welcome.
In response to one of the questions I was asked below....
"What else does the parent need to do? Also, the way you have it the do_fork_sub blocks for as long as that work in xterm takes. Is that intended? What is the purpose of the whole program?"
The example above is a greatly simplified version of what's really happening. In that world, the xterm does a -e and runs a tool that can take a long time to run, days sometimes. If it finishes successfully, I want the parent to detect this (parses this info from a log file), return from the subroutine where the next step will start (another 2-3 days). What I don't want to do is wait around for the user to exit the xterm in that case. If the tool passed, declare success and move on. Thus the purpose of the parent sitting around looking at the child.
The tool that runs in the xterm may also fail. In that case, email is sent to the user a and they eventually go to the xterm and try to debug the problem. WHen ready to try again, they start the tool again and let it run in the xterm (2-3 more days). If it passes... see above. But, it may fail again. If they eventually just give up, they will "exit" out of the xterm. In that case, I want to detect this and return from the subroutine with a "FAILED" status.
I am playing with $SIG(CHLD) = 'IGNORE' and = 'DEFAULT', seeing if I can control the fate of forked process once it exits. Rather than let it become a zombie, I would rather it just die so that the parent can detect it's death and act accordingly. It seems sensitive to me setting the $SIG{CHLD} = 'IGNORE' if I place it before the fork(). If I exit the xterm, the child process dies, the parent detects it, and everything acts accordingly. But unfortunately, the child process seems to die on it's own. It looks like the parent's call to "system("xterm")" doesn't wait for the xterm to complete and the whole thing just dies prematurely.
Well, it's not elegant, but the answer I came up with was to pass an "echo ..." statement to the xterm invocation which uniquely identifies it in the output of "ps aux". For example...
$cmd = "xterm -e \"echo 'This string is unique'; tcsh\" &"
Then system($cmd);
Then just grep the output of "ps aux" for the unique string && maybe "xterm" && perhaps the user... whatever it takes.
I have to start a pdf viewer from a Perl script. The viewer should
become detached from the parent process and the terminal that the parent process was run from. If I close the parent or the terminal the
viewer should still be kept running. I considered three approaches (using evince as the pdf viewer command):
Using system and sh:
system 'evince test.pdf &';
Using fork():
$SIG{CHLD} = "IGNORE"; #reap children as they complete
my $pid = fork();
if ( $pid == 0 ) {
exec 'evince', 'test.pdf';
}
Using Proc::Daemon:
use Proc::Daemon;
my $daemon = Proc::Daemon->new(
work_dir => '/tmp/evince',
child_STDOUT => '>>stdout.txt',
child_STDERR => '>>stderr.txt',
);
my $pid = $daemon->Init();
if ( $pid == 0 ) {
exec 'evince', 'test.pdf';
}
What would be the difference between these approaches? Which approach would you recommend?
system 'evince test.pdf &';
In my experience, this is likely to really be:
system 'evince $pdf_file &';
If $pdf_file is user input, then we get shell-injection bugs, such as passing in a pdf name of $(rm -rf /) or even just ;rm -rf /. And what if the name has a space in it? Well, you can avoid all that if you quote it, right?
system 'evince "$pdf_file" &';
Well, no, now all I have to do is give you a filename of ";rm -rf "/. And what if my pdf has a double quote in its name? You could use single quotes, but the same problem comes up if the filename has single quotes in it, and the shell injection isn't really any harder. You could come up with an elaborate shellify function that properly quotes a string all so that the shell can unquote it and get back to the original entry ... but that seems like so much more work than your other options, neither of which suffers from these problems.
$SIG{CHLD} = "IGNORE"; #reap children as they complete
my $pid = fork();
if ( $pid == 0 ) {
exec 'evince', 'test.pdf';
}
Setting a global $SIG{CHLD} is nice and easy ... unless you need to handle other children as they die. So only you can tell whether that's acceptable or not. And, again in my experience, not even always then. I've been bitten by this one - though rarely. I had this mixed in with an application that, elsewhere, used AnyEvent, and managed to break AE's subprocess handling. (The same would likely hold true if you mixed this with any event system, I just happened to be using AE.)
Also, this is missing the stdout and stderr redirects - and stdin redirect. That's easy enough to add - inside your if, before the exec, just close and reopen the filehandles as you need, e.g.:
close STDOUT; open STDOUT, '>', '/dev/null';
close STDERR; open STDERR, '>', '/dev/null';
close STDIN; open STDIN, '<', '/dev/null';
No big deal. However, Proc::Daemon does set up a few more things for you to ensure signals don't reach from one to the other process, in either direction. This depends on how severe you need to get.
For most of my purposes, I've found #2 to be sufficient. I've only reached for Proc::Daemon on a few projects, but that's where a) I have full control over the module installation, and b) it really matters. Starting a pdf viewer wouldn't normally be such a case.
I avoid #1 at all costs - I have had some fairly significant bites with shell injection, and now try to avoid the shell at all times.
I am running a program in Perl that at one point evaluates data in an if statement called from within a subroutine, e.g.
sub check_good {
if (!good) {
# exit this subroutine
# restart program
}
else {
# keep going
}
} # end sub
The problem I have is with exiting and restarting. I know that I can just use exit 0; to exit straight out, but obviously this is not correct if I want to go back to the beginning. I tried calling the subroutine which essentially starts the program, but of course once it has run it will go back to this point again.
I thought about putting it in a while loop, but this would mean putting the whole file in the loop and it would be very impractical.
I don't actually know whether this is possible, so any input would be great.
If you have not changed #ARGV, or you keep a copy of it, you could possibly do something like exec($^X, $0, #ARGV).
$^X and $0 (or $EXECUTABLE_NAME and $PROGRAM_NAME, see Brian's comment below) are the current perl interpreter and current perl script, respectively.
An alternative would be to always have two processes: A supervisor and a worker.
Refactor all your logic into a subroutine called run(or main or whatever). Whn your real logic detect that it needs to restart it should exit with a predefined non-zero exit code (like 1 for example).
Then your main script and supervisor would look like this:
if (my $worker = fork) {
# child process
run(#ARGV);
exit 0;
}
# supervisor process
waitpid $worker;
my $status = ($? >> 8);
if ($status == 1) { ... restart .. }
exit $status; # propagate exit code...
In the simple scenario where you just want to restart once, this might be a bit overkill. But if you at any point need to be able to handle other error scenarios this method might be preferable.
For example if the exit code is 255, this indicates that the main script called die(). In this case you might want to implement some decision procedure wether to restart the script, ignore the error, or escalate the issue.
There are quite a few modules on CPAN implementing such supervisors. Proc::Launcher is one of them and the manual page includes a extensive discussion of related works. (I have never used Proc::Launcher, it is mainly due to this discussion I'm linking to it)
There's nothing to stop you calling system on yourself. Something like this (clearly in need of a tidy), where I pass in a command-line argument to prevent the code calling itself forever.
#!/usr/bin/perl
use strict;
use warnings;
print "Starting...\n";
sleep 5;
if (! #ARGV) {
print "Start myself again...\n";
system("./sleep.pl secondgo");
print "...and die now\n";
exit;
} elsif ((#ARGV) && $ARGV[0] eq "secondgo") {
print "Just going to die straightaway this time\n";
exit;
}
I have written a Perl Script using WWW::Mechanize which reads URLs from a text file and connects to them one by one. In each operation, it parses the content of the webpage looking for some specific keywords and if found, it will be written to the output file.
To speed up the process, I used Parallel::ForkManager with MAX_CHILDREN set to 3. Though I have observed an increase in the speed, the problem is that, after a while the script crashes. Perl.exe process gets killed and it does not display any specific error message.
I have run the script multiple times to see if it always fails at the same point, however the point of failure seems to be intermittent.
Please note that I have already taken care of any memory leaks in WWW::Mechanize and HTML::TreeBuilder::XPath as follows:
For WWW::Mechanize, I set stack_depth(0) so that it does not cache the history of visited pages.
HTML::TreeBuilder::XPath, I delete the root node once I am done with it. This approach helped me in resolving a memory leak issue in another similar script which does not use fork.
Here is the structure of the script, I have mentioned only the relevant parts here, please let me know if more details are required to troubleshoot:
#! /usr/bin/perl
use HTML::TreeBuilder::XPath;
use WWW::Mechanize;
use warnings;
use diagnostics;
use constant MAX_CHILDREN => 3;
open(INPUT,"<",$input) || die("Couldn't read from the file, $input with error: $!\n");
open(OUTPUT, ">>", $output) || die("Couldn't open the file, $output with error: $!\n");
$pm = Parallel::ForkManager->new(MAX_CHILDREN);
$mech=WWW::Mechanize->new();
$mech->stack_depth(0);
while(<INPUT>)
{
chomp $_;
$url=$_;
$pm->start() and next;
$mech->get($url);
if($mech->success)
{
$tree=HTML::TreeBuilder::XPath->new();
$tree->parse($mech->content);
# do some processing here on the content and print the results to OUTPUT file
# once done then delete the root node
$tree->delete();
}
$pm->finish();
print "Child Processing finished\n"; # it never reaches this point!
}
$pm->wait_all_children;
I would like to know, why does this Perl script keep failing after a while?
For understanding purpose, I added a print statement right after the finish method of fork manager, however it does not print that.
I have also used, wait_all_children method, since as per the document of the module on CPAN, it will wait for the processing to get over for all the children of the parent process.
I have not understood why, wait_all_children method is place outside the while or the for loop though (as observed in the documentation as well), since all the processing is taking place inside the loop.
Thanks.
As for why this code is written with a main job loop with the start and finish calls and then followed by a wait_all_children outside the loop. It works like this:
The parent process gets the next job from <INPUT> at the start of each loop.
The parent runs start, which causes the child process to fork. At this point, you have 2 processes, each of which is running the exact same code at the exact same point.
3a. The parent process hits that or next and jumps back to the top to read the next <INPUT> and start the process over.
3b. The child process does not hit the or next and continues running the code you give until it hits finish, where the child exits.
Meanwhile the parent process is busy going through the loop and creating a child each time through. After forking 3 children (or whatever you set the limit to) it blocks until one of the children exit. At which point, it immediately spawns a new child (resulting in step 3b for each child each time).
When the parent runs out of jobs, it jumps out the while loop (never having run anything in it itself) and then waits for all the remaining children to exit.
As you can see, any code in the loop after finish is called is never going to run in either the parent (because it doesn't do anything after or next within the loop) or the children (because they exit at finish).
I've never used Parallel::ForkManager, but it looks like you can put a run_on_finished hook to run some code at the finish if you want to put a print statement there at the end.
To find the problem, though, I'd suggest wrapping all the code between start and finish in an eval or use Try::Tiny and warn out the error to see if there's an exception happening in there that's breaking it. I'd expect such things to show up in STDERR when the child dies, though, so I'm not really sure that will help.
However, it's worth shot. Here's my suggestion in code, just showing the part I'd catch exceptions from:
# At the top add
use Try::Tiny;
# Later in your main loop
$pm->start() and next;
try {
$mech->get($url);
if($mech->success)
{
$tree=HTML::TreeBuilder::XPath->new();
$tree->parse($mech->content);
# do some processing here on the content and print the results to OUTPUT file
# once done then delete the root node
$tree->delete();
}
}
catch {
warn "Bad Stuff: ", $_;
};
$pm->finish();
That might help show you what's gone wrong.
If it does not help, you might try moving the try block to include more of the program (like nearly all of it after the use Try::Tiny line) and see if that elucidates anything.
The $pm->wait_all_children; function call waits for "ALL" the child processes to end and places a Blocking lock. I am not sure what kind of error handling you have done for $mech inside the if() statement, but you may want to re-visit that.
The following snippet of code is used to find the PID of a user's terminal, by using ptree and grabbing the third PID from the results it returns. All terminal PID's are stored in a hash with the user's login as the key.
## If process is a TEMINAL.
## The command ptree is used to get the terminal's process ID.
## The user can then use this ID to peek the user's terminal.
if ($PID =~ /(\w+)\s+(\d+) .+basic/) {
$user = $1;
if (open(PTREE, "ptree $2 |")) {
while ($PTREE = <PTREE>) {
if ($PTREE =~ /(\d+)\s+-pksh-ksh/) {
$terminals{$user} = $terminals{$user} . " $1";
last;
}
next;
}
close(PTREE);
}
next;
}
Below is a sample ptree execution:
ares./home_atenas/lmcgra> ptree 29064
485 /usr/lib/inet/inetd start
23054 /usr/sbin/in.telnetd
23131 -pksh-ksh
26107 -ksh
29058 -ksh
29064 /usr/ob/bin/basic s=61440 pgm=/usr/local/etc/logon -q -nr trans
412 sybsrvr
I'd like to know if there is a better way to code this. This is the part of the script that takes longest to run.
Note: this code, along with other snippets, are inside a loop and are executed a couple of times.
I think the main problem is that this code is in a loop. You don't need to run ptree and parse the results more than once! You need to figure out a way to run ptree once and put it into a data structure that you can use later. Probably be some kind of simple hash will suffice. You may even be able to just keep around your %terminals hash and keep reusing it.
Some nitpicks...
Both of your "next" statements seem
unnecessary to me... you should be
able to just remove them.
Replace
$terminals{$user} = $terminals{$user} . " $1";
with:
$terminals{$user} .= " $1";
Replace the bareword PTREE which you
are using as a filehandle with
$ptreeF or some such... using
barewords became unnecessary for
filehandles about 10 years ago :)
I don't know why your $PID variable
is all caps... it could be confusing
to readers of your code because it
looks like there is something
special about that variable, and
there isn't.
I think you'll get the best performance improvement by avoiding the overhead of repeatedly executing an external command (ptree, in this case). I'd look for a CPAN module that provides a direct interface to the data structures that ptree is reading. Check the Linux:: namespace, maybe? (I'm not sure if ptree is setuid; that may complicate things.)
The above advice aside, some additional style and robustness notes based on the posted snippet only (forgive me if the larger code invalidates them):
I'd start by using strict, at the very least. Lexical filehandles would also be a good idea.
You appear to be silently ignoring the case when you cannot open() the ptree command. That could happen for many reasons, some of which I can't imagine you wanting to ignore, such as…
You're not using the full path to the ptree command, but rather assuming it's in your path—and that the one in your path is the right one.
How many users are on the system? Can you invert this? List all -pksh-ksh processes in the system along with their EUIDs, and build the map from that - that might be only one execution of ps/ptree.
I was thinking of using ps to get the parents pid, but I would need to loop this to get the great-grandparent's pid. That's the one I need. Thanks. – lamcro
Sorry, there are many users and each can have up to three terminals open. The whole script is used to find those terminals that are using a file. I use fuser to find the processes that use a file. Then use ptree to find the terminal's pid. – lamcro
If you have (or can get) a list of PIDs using a file, and just need all of the grand-parents of that PID, there's an easier way, for sure.
#!perl
use warnings;
use strict;
#***** these PIDs are gotten with fuser or some other method *****
my($fpids) = [27538, 31812, 27541];
#***** get all processes, assuming linux PS *****
my($cmd) = "ps -ef";
open(PS, "$cmd |") || die qq([ERROR] Cannot open pipe from "$cmd" - $!\n);
my($processlist) = {};
while (<PS>) {
chomp;
my($user, $pid, $ppid, $rest) = split(/ +/, $_, 4);
$processlist->{$pid} = $ppid;
}
close PS;
#***** lookup grandparent *****
foreach my $fpid (#$fpids) {
my($parent) = $processlist->{$fpid} || 0;
my($grandparent) = $processlist->{$parent} || 0;
if ($grandparent) {
#----- do something here with grandparent's pid -----
print "PID:GRANDPID - $fpid:$grandparent\n";
}
else {
#----- some error condition -----
print "ERROR - Cannot determine GrandPID: $fpid ($parent)\n";
}
}
Which for me produces:
ERROR - Cannot determine GrandPID: 27538 (1)
PID:GRANDPID - 31812:2804
PID:GRANDPID - 27541:27538
Have you considered using 'who -u' to tell you which process is the login shell for a given tty instead of using ptree? This would simplify your search - irrespective of the other changes you should also make.
I just did some trivial timings here based on your script (calling "cat ptree.txt" instead of ptree itself) and confirmed my thoughts that all of your time is spent creating new sub-processes and running ptree itself. Unless you can factor away the need to call ptree (maybe there's a way to open up the connection once and reuse it, like with nslookup), you won't see any real gains.