I have a perl script which calls fork() a few times to create 4 child processes. The parent process then uses waitpid() to wait for all of the children to finish.
The problem occurs when I try to call system() from within the child processes (I'm using it to create directories). Even something as simple as system("dir") fails (yes, I'm on Windows).
By "fails", I mean that one of the child threads goes past it no problem, but the other child processes so far as I can tell simply cease to exist.
trace INFO, "Thread $thread_id still alive at 2.62";
system("dir");
trace INFO, "Thread $thread_id still alive at 2.65";
I get messages such as "Thread 3 still alive at 2.62", but only 1 of the child threads ever gets to 2.65.
At the bottom of the log, I can see "Command exited with non-zero status 127", which I think may have something to do with it.
I've considered using some sort of mutex lock to make sure that only one process at a time goes through the system calls, but how can I do that with fork()? Also, this problem doesn't really make sense in the first place: why would several independent processes have trouble doing system("dir") at the same time?
The problem here is that fork() is emulated under Windows using threads, so no real processes are created.
If you are only using system() to create folders, you would be better off using Perl's built-in mkdir or File::Path's make_path instead.
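For example, a minimal sketch (the directory names are just placeholders):

use strict;
use warnings;
use File::Path qw(make_path);

# Create a single directory; mkdir returns false and sets $! on failure.
mkdir "output" or warn "Could not create output: $!";

# Create a nested path in one call, roughly equivalent to `mkdir -p`.
make_path("output/2024/logs");

Neither call goes through the shell, so the fork-emulation problem with system() never comes into play.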
Related
I'm running in circles. I have a webpage that creates a huge file. The file takes forever to create, and the work is done in a subroutine.
What is the best way for my page to run this subroutine without waiting for it to finish? Are there any issues with Apache processes, since I'm doing this from a webpage?
The simplest way to do this is to use fork() and have the long-running subroutine run in the child process while the parent returns to Apache. You indicate that you've already tried this, but without more information on exactly what the code looks like and what is failing, it's hard to help you move forward on this path.
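A rough sketch of that approach, where generate_huge_file() is a stand-in for your actual subroutine:

use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: do the slow work, then exit without returning to the CGI code path.
    generate_huge_file();    # placeholder for your real subroutine
    exit 0;
}

# Parent: send a response and return to Apache right away; the child keeps running.
print "Content-type: text/plain\n\nYour file is being generated.\n";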
Another option is to run a separate process that is responsible for managing the long-running task. Have the webpage send a unit of work to the long-running process using a local socket (or by creating a file with the necessary input data), and then your web script can return immediately while the separate process takes care of completing the long-running task.
This method of decoupling the execution is fairly common and is often called a "task queue" (if there is some mechanism in place for queuing requests as they come in). There are a number of tools out there that will help you design this sort of solution (but for simple cases with filesystem-based communication you may be fine without them).
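For the filesystem-based variant, a rough sketch of the web-script side (the spool directory and job contents are assumptions):

use strict;
use warnings;
use File::Temp qw(tempfile);

# Drop a unit of work into a spool directory, then return immediately.
my $spool = "/var/spool/myapp";    # assumed location, agreed with the worker
my ($fh, $jobfile) = tempfile("job-XXXXXX", DIR => $spool, SUFFIX => ".todo");
print {$fh} "report_id=42\n";      # whatever input the long-running task needs
close $fh or die "close failed: $!";

# A separate, long-running worker process scans $spool, renames each .todo file
# while it works on it, and deletes it once the task is finished.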
I think you want to create a worker grandchild of Apache -- that is:
Apache -> child -> grandchild
where the child dies right after forking the grandchild, and the grandchild closes STDIN, STDOUT, and STDERR. (The grandchild then creates the file.) These are the basic steps in daemonizing a worker (a parentless process no longer tied to the webserver).
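A rough sketch of that structure in Perl, where generate_huge_file() again stands in for the code that actually creates the file:

use strict;
use warnings;
use POSIX qw(setsid);

my $child = fork();
die "fork failed: $!" unless defined $child;

if ($child == 0) {
    # Child: detach from Apache's session, fork the grandchild, then exit at once.
    setsid() or die "setsid failed: $!";
    my $grandchild = fork();
    die "fork failed: $!" unless defined $grandchild;
    exit 0 if $grandchild;    # child exits; grandchild is inherited by init

    # Grandchild: drop the handles tied to the HTTP connection, then do the work.
    close STDIN;
    close STDOUT;
    close STDERR;
    generate_huge_file();     # placeholder for the real file-creating code
    exit 0;
}

# Parent (the CGI script): reap the short-lived child and return to Apache.
waitpid($child, 0);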
I am confused about the functionality of vfork(). I read that with vfork(), the parent and the child process share pages between them; it doesn't support any copy-on-write functionality. This means that if the child process makes changes during its timeslice, all of those changes will be visible to the parent process when it resumes. It was also mentioned that the vfork() syscall is only useful when the child process just executes an exec system call right after its creation.
Let us say the child process executes the exec system call with ls. Now, according to the exec semantics, the ls program will be loaded into the address space of the child process. When the parent process's timeslice starts, it might have a different instruction to execute at its PC, which might cause the process to behave differently.
Can somebody please clarify this scenario for me and explain how the vfork() call is helpful in such situations?
The point of vfork() is to avoid allocating a new address space for a child that is going to throw it away again immediately. Accordingly, vfork() omits the part of fork() that creates a new address space (page tables and allocations) for the child, and instead sets a flag that execve() interprets as meaning it should allocate a new page table and stack for the process before populating them with the new executable and its requested initial heap (BSS).
execve() releases the current process' memory mappings and allocates new ones. Exiting a process also ends up releasing that process' memory mappings.
Traditionally, vfork() suspends the parent process until the child stops using the parent's memory mappings. The only way to do that safely is via execve() or _exit().
Let us say the child process executes the exec system call with ls. Now, according to the exec semantics, the ls program will be loaded into the address space of the child process.
Actually, ls will be loaded into a new address space; the child releases the parent's mappings when it calls execve().
I have a Perl script from which I'd like to spawn a process. The process can take a while, and most of the time the parent script will exit first. How do I spawn this process so that, when the parent is gone, it won't turn into a zombie or defunct process when it's done?
Edit: I think I've found two methods. Hopefully someone can tell me which one is more appropriate:
setting $SIG{CHLD} = 'IGNORE';
use POSIX 'setsid';
Edit: The spawned process is also going to be another Perl script.
A process becomes a zombie when it exits, but before its parent process picks up its status with wait(). When one process forks another and then exits, the child becomes a child of pid 1 (classically "init"), which immediately reaps its status. So usually the problem is the reverse of what you describe: the child becomes a zombie (because the parent was not written to handle SIGCHLD and call wait()), but when the parent exits the zombie is inherited by init and immediately reaped. In fact, the usual way to decouple a child process fully from its parent ("daemonizing") involves intentionally forking and having the intermediate process exit, so that the daemon is immediately a child of init.
Edit: If you're in shell and want to achieve this effect, try (subprocess &). The parentheses create a subshell which executes subprocess in the background and then immediately exits.
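In Perl, a rough sketch combining the two methods you mention (worker.pl is a placeholder for the other script):

use strict;
use warnings;
use POSIX qw(setsid);

$SIG{CHLD} = 'IGNORE';    # let the kernel reap the child automatically

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: start a new session so it is detached from the parent, then run the script.
    setsid() or die "setsid failed: $!";
    exec 'perl', 'worker.pl' or die "exec failed: $!";
}

# Parent: free to exit whenever it likes; the child won't be left as a zombie.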
I found the answer in Managing Signal Handling for daemons that fork() very helpful for what I'm doing, but I'm unsure how to solve this part:
"You will therefore need to install any signal handling in the execed process when it starts up"
I don't have control over the process that starts up. Is there any way for me to force some signal handlers on the exec'd process from the parent of the fork?
Edit:{
I'm writing a Perl module that monitors long-running processes. Instead of
system(<long-running cmd>);
you'd use
my_system(<ID>, <long-running cmd>);
I create a lock file for the <ID> and don't let another my_system(<ID>...) call through if there is one currently running with a matching ID.
The parent fork/execs <long-running cmd> and is in charge of cleaning up the lock file when it terminates. I'd like the child to be self-sufficient so the parent can exit (or so the child can take care of itself if the parent gets a kill -9).
}
On Unix systems, you can make an exec'd process ignore signals (unless the process chooses to override what you say), but you can't force it to install a handler for them. The most you can do is leave the relevant signal handled by its default handler.
If you think about it, you'll realize why. To install a signal handler, you have to provide a function pointer, but the process that does the exec() can't specify one of its own functions because they won't exist in the exec'd process, and it can't specify one of the exec'd process's functions because they don't exist in the exec'ing process. Similarly, you can't register atexit() handlers in the exec'ing process that will be honoured by the exec'd process.
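A small Perl sketch of what does survive an exec (the command name is a placeholder):

use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Ignored and default dispositions survive exec; a Perl code handler does not.
    $SIG{INT}  = 'IGNORE';     # the exec'd program starts with SIGINT ignored
    $SIG{TERM} = 'DEFAULT';    # the exec'd program gets the default SIGTERM action
    exec 'long-running-cmd' or die "exec failed: $!";    # placeholder command
}

waitpid($pid, 0);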
As to your programming problem, there's a good reason that the lock file normally contains the process ID (pid) of the process that holds the lock; it allows you to check whether that process is still around, even if it isn't your child. You can read the pid from the lock file, and then use kill(pid, 0) which will tell you if the process exists and you can signal it without actually sending any signal.
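A minimal sketch of that check (the lock file path is just an example):

use strict;
use warnings;

my $lockfile = "/tmp/my_system.42.lock";    # example: one lock file per <ID>

if (open my $fh, '<', $lockfile) {
    chomp(my $pid = <$fh>);
    close $fh;
    # Signal 0 checks for existence without actually sending a signal.
    if ($pid && kill 0, $pid) {
        die "Another my_system() instance (pid $pid) still holds the lock\n";
    }
    unlink $lockfile;    # the holder is gone, so the lock file is stale
}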
One approach would be to use two forks.
The first fork would create a child process responsible for cleaning up the lock file if the parent dies. This process also forks a grandchild, which execs the long-running command.
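A rough sketch of that arrangement (the lock file path and command are placeholders):

use strict;
use warnings;

my $lockfile = "/tmp/my_system.42.lock";    # created earlier, one per <ID>
my @long_running_cmd = ('sleep', '60');     # placeholder for <long-running cmd>

my $child = fork();
die "fork failed: $!" unless defined $child;

if ($child == 0) {
    # Child: its only jobs are to run the command and remove the lock file.
    my $grandchild = fork();
    die "fork failed: $!" unless defined $grandchild;
    if ($grandchild == 0) {
        exec @long_running_cmd or die "exec failed: $!";
    }
    waitpid($grandchild, 0);    # works even if the original parent has exited
    unlink $lockfile;
    exit 0;
}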
Various Perl scripts (Server Side Includes) are calling a Perl module with many functions on a website.
EDIT:
The scripts are using use lib to reference the libraries from a folder.
During busy periods the scripts (not the libraries) become zombies and overload the server.
The server lists:
319 ? Z 0:00 [scriptname1.pl] <defunct>
320 ? Z 0:00 [scriptname2.pl] <defunct>
321 ? Z 0:00 [scriptname3.pl] <defunct>
I have hundreds of instances of each.
EDIT:
We are not using fork, system or exec, apart from the SSI directive
<!--#exec cgi="/cgi-bin/scriptname.pl"-->
As far as I know, in this case httpd itself will be the owner of the process.
MaxRequestsPerChild is set to 0, which should not let the parents die before the child process is finished.
So far we have found that temporarily suspending some of the scripts helps the server cope with the defunct processes and keeps it from falling over; however, zombie processes are still forming without a doubt.
gbacon seems to be the closest to the truth with his theory that the server is not able to cope with the load.
What could lead to httpd abandoning these processes?
Is there any best practice to prevent these from happening?
Thanks
Answer:
The point goes to Rob.
As he says, CGI scripts that generate SSI's will not have those SSI's handled. The evaluation of SSI's happens before the running of CGI's in the Apache 1.3 request cycle. This was fixed with Apache 2.0 and later so that CGI's can generate SSI commands.
Since we were running on Apache 1.3, the SSI's turned into defunct processes on every page view. Although the server was trying to clear them, it was too busy with the running tasks to succeed. As a result, the server fell over and became unresponsive.
As a short-term solution, we reviewed all SSI's and moved some of the processing to the client side to free up server resources and give the server time to clean up.
Later we upgraded to Apache 2.2.
More Band-Aid than best practice, but sometimes you can get away with a simple
$SIG{CHLD} = "IGNORE";
According to the perlipc documentation
On most Unix platforms, the CHLD (sometimes also known as CLD) signal has special behavior with respect to a value of 'IGNORE'. Setting $SIG{CHLD} to 'IGNORE' on such a platform has the effect of not creating zombie processes when the parent process fails to wait() on its child processes (i.e., child processes are automatically reaped). Calling wait() with $SIG{CHLD} set to 'IGNORE' usually returns -1 on such platforms.
If you care about the exit statuses of child processes, you need to collect them (commonly referred to as "reaping") by calling wait or waitpid. Despite the creepy name, a zombie is merely a child process that has exited but whose status has not yet been reaped.
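For example, a minimal sketch that reaps children as they exit, using a non-blocking waitpid() loop in a CHLD handler:

use strict;
use warnings;
use POSIX qw(WNOHANG);

my %status;    # exit statuses, keyed by child pid

$SIG{CHLD} = sub {
    # Reap every child that has exited; several may be pending at once.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $status{$pid} = $? >> 8;
    }
};

# ... fork children as usual; each one is reaped here instead of lingering as a zombie.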
If your Perl programs themselves are the child processes becoming zombies, that means their parents (the ones that are forking-and-forgetting your code) need to clean up after themselves. A process cannot stop itself from becoming a zombie.
I just saw your comment that you are running Apache 1.3 and that may be associated with your problem.
SSI's can run CGI's. But CGI scripts that generate SSI's will not have those SSI's handled. The evaluation of SSI's happens before the running of CGI's in the Apache 1.3 request cycle. This was fixed with Apache 2.0 and later so that CGI's can generate SSI commands.
As I'd suggested above, try running your scripts on their own and have a look at the output. Are they generating SSI's?
Edit: Have you tried launching a trivial Perl CGI script that simply prints out a Hello World-type HTTP response?
Then, if this works, add a trivial SSI directive such as
<!--#printenv -->
and see what happens.
Edit 2: Just realised what is probably happening. Zombies occur when a child process exits and isn't reaped. These processes are hanging around and slowly using up resources within the process table. A process without a parent is an orphaned process.
Are you forking off processes within your Perl script? If so, have you added a waitpid() call to the parent?
Have you also got the correct exit within the script?
CORE::exit(0);
As you have all the bits yourself, I'd suggest running the individual scripts one at a time from the command line to see if you can spot the ones that are hanging.
Does a ps listing show an inordinate number of instances of one particular script running?
Are you running the CGI's using mod_perl?
Edit: Just saw your comments regarding SSI's. Don't forget that SSI directives can run Perl scripts themselves. Have a look to see what the CGI's are trying to run?
Are they dependent on yet another server or service?