Cannot Run Fork, unwanted number of processes are forked - perl

I am having an issue with fork in Perl. I want to fork 10 processes at once from a single script; all 10 child processes will do the same thing (copy files from one place to another).
When I execute this code, my OS hangs, and when I check, a huge number of forked processes are running at the same time.
Here is my code:
while ($callCount <= $totalCalls) {
    for (1..$TotalProcessToFork) {
        print "Call -> $callCount";
        if($pid = fork) {
            #in Parent Process
            print " :: PID -> $pid\n";
            push(@list_of_pid, $pid);
        } else {
            #in Child Process
            `touch $callCount`;
        }
        $callCount++;
    }
}
Now when I execute this code, around 1000 child processes are created.
Can anyone tell me what I am doing wrong here?

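The children fork, too. You need to leave the loop one way or another in the child case. A common pattern is to fork and exec, or to call exit once the child has finished its work. You could also say last, but here that only leaves the inner for loop, so the child would re-enter it from the enclosing while loop.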

This happens because after a fork there are two processes running the same code. Let's call them a1 (the parent) and a2 (the child). Both carry on with the loop, so on the next iteration a1 and a2 each fork again, and the processes they create fork in turn. Every process keeps forking, so the number of processes doubles on each iteration.
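The doubling can be seen at the system-call level, where fork() behaves the same way as in Perl. The following is a minimal C sketch, not the asker's program: count_processes and its pipe-based head count are made up for the illustration. Each process that takes part writes one byte to a shared pipe, and the original process counts the bytes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `iterations` rounds of fork() and returns how many
   processes took part in total, the caller included (-1 if pipe() fails). */
int count_processes(int iterations, int child_leaves_loop)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t original = getpid();

    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid == 0 && child_leaves_loop)
            break;      /* the child stops here and never forks itself */
        /* otherwise the parent AND the child both go on to the next round */
    }

    /* every process announces itself with a single byte */
    char mark = 'x';
    ssize_t written = write(fds[1], &mark, 1);
    (void)written;
    close(fds[1]);

    if (getpid() != original) {
        close(fds[0]);
        while (wait(NULL) > 0) { }  /* reap our own children, if any */
        _exit(0);
    }

    /* the original reads until every descendant has closed its write end */
    int total = 0;
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += (int)n;
    close(fds[0]);
    while (wait(NULL) > 0) { }
    return total;
}
```

With 3 rounds, count_processes(3, 0) should report 8 processes (2^3), while count_processes(3, 1) should report 4 (the original plus one child per round).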

You may want to take a look at Parallel::ForkManager, which will probably make your life easier.
Also, don't shell out to the external touch command; it's better to use File::Touch.

Related

Perl: Value of global variable not getting updated when changed in child

Code Snippet:
my $kill=0;
my @array1 = ("abc", "def","ghi");
&runSmokesAndMonitor;
sub runSmokesAndMonitor {
    foreach my $smokeTestVarDirName (@array1) {
        if ($pid =fork()) {
            print "parent\n"; ### Have some other action items as well here in parent
        }
        else {
            $kill++;
            print "Value of kill is $kill\n";
            exit 0;
        }
    }
}
Here, I am getting output:
parent
Value of kill is 1
parent
Value of kill is 1
parent
Value of kill is 1
Required/Expected (as $kill is a global variable, its value should be updated everywhere once a new value has been assigned):
parent
Value of kill is 1
parent
Value of kill is 2
parent
Value of kill is 3
parent
Why is the output not as expected, and how can I achieve it?
A child created by fork is a new process with its own address space. Global variables are per process only, not global per user or per system or even global between all instances of a software running in the world. That's why changes to a global variable are only reflected in the current process.
If you need to share information between processes you need IPC (inter-process communication), e.g. sockets, pipes or shared memory; see perlipc for more. There are ways to make sharing variables across processes easier, like IPC::Shareable.
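The same effect, and one way around it, can be sketched in C, where fork() has the same semantics. This is only an illustration under assumptions: kill_count and child_increment_via_pipe are hypothetical names, and a pipe is just one of the IPC options mentioned above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int kill_count = 0;     /* "global", but every process has its own copy */

/* Hypothetical helper: forks one child that increments ITS copy of
   kill_count and reports the new value through a pipe.
   Returns the value the child reported, or -1 on error. */
int child_increment_via_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* child: this change is invisible to the parent */
        close(fds[0]);
        kill_count++;
        ssize_t w = write(fds[1], &kill_count, sizeof kill_count);
        _exit(w == (ssize_t)sizeof kill_count ? 0 : 1);
    }

    /* parent: the only way to learn the child's value is to be told */
    close(fds[1]);
    int reported = -1;
    if (read(fds[0], &reported, sizeof reported) != (ssize_t)sizeof reported)
        reported = -1;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return reported;
}
```

After the call, the parent's own kill_count is still 0 even though the child reported 1. To get a running total, the parent has to update its own variable from what the children report.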

Is there an abnormal way to terminate a child process to get certain outputs in this code?

I'm new to operating systems. I found this code, and I don't understand why we can't get certain outputs, such as abc.
Suppose we have this code in C:
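#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main()
{
    if (fork() == 0)
        printf("a");
    else
    {
        printf("b");
        waitpid(-1, NULL, 0); /* wait for any child to terminate */
    }
    printf("c");
    return 0;
}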
waitpid() waits for a child process to terminate.
Can the child process be terminated in an abnormal way, so that we get outputs such as abc or bc?
According to the Linux man page for fork, at least:
RETURN VALUE
On success, the PID of the child process is returned in the parent, and
0 is returned in the child. On failure, -1 is returned in the parent,
no child process is created, and errno is set appropriately.
So if the child process is never created, fork() returns -1 in the parent, the else branch runs, and the entire output is bc from the parent and nothing from the child, because it never came to be.
It is also possible that the parent process is killed before it can output b or c; then you'll only get the child's output, ac. Or the child is killed before its output appears (printf output without a newline usually sits in a buffer until the process exits normally), and you only get the parent's bc. Or maybe the parent is killed before it can even fork! There are lots of possibilities, and with good timing (and some calls to the sleep function in between) you could probably reproduce them.
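Whether a child ended abnormally is visible to the parent through the status that waitpid() fills in. A minimal sketch, assuming the child kills itself with SIGKILL to stand in for any abnormal termination; child_fate is a hypothetical name.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that either exits normally or is killed
   by SIGKILL. Returns the number of the signal that killed the child,
   0 if it exited normally, or -1 on error. */
int child_fate(int kill_child)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (kill_child)
            raise(SIGKILL);     /* abnormal end: buffered stdio output is lost */
        _exit(0);               /* normal end */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}
```

WIFEXITED and WEXITSTATUS can be used in the same way to read the exit code of a child that ended normally.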

child fork process return values

I am using Parallel::ForkManager and have a few doubts. If I start child processes in a for loop, who executes the next statement: the parent or the child? Code:
my $pm = Parallel::ForkManager->new($forks);
foreach my $q (@numbers) {
    my $pid = $pm->start and next;
    my $res = calc($q);
    if($res == error )
        {return};
    if (@res == some_no)
        {do something and next;
        }
    $pm->finish(0, { result => $res, input => $q });
}....
I want to know what fork returns, and I want the parent process to execute the first next and the second next.
I also want to know: if a child process ends midway, will the parent be able to know about it, and how?
The two major sources of parallelism in Perl are threading (use threads;) and forking. For the latter, Parallel::ForkManager is probably the best bet out there.
However, for copying? This may not help nearly as much as you think. Your limiting factor isn't going to be CPU, it'll be disk I/O.
Parallelising I/O can even be counter-productive in many cases: making the disk thrash by writing to two locations at once lowers overall throughput.

Perl: fork(), avoiding zombie processes, and "No child processes" error

I have a Perl app that's been running largely untroubled on a RH system for a few years. In one place, I have to run a system command that can take many minutes to complete, so I do this in a child process. The overall structure is like this:
$SIG{CHLD} = 'IGNORE'; # Ignore dead children, to avoid zombie processes
my $child = fork();
if ($child) { # Parent; return OK
    $self->status_ok();
} else { # Child; run system command
    # do a bunch of data retrieval, etc.
    my $output;
    my @command = # generate system command here
    use IPC::System::Simple 'capture';
    eval { $output = capture(@command); };
    $self->log->error("Error running @command: $@") if $@;
    # success: log $output, carry on
}
We recently changed some of our infrastructure, although not in ways that I expected would have any influence on this. (Still running on RH, still using nginx, etc.) However, now we find that almost every instance of running this code fails, logging 'Error running {command}: failed to start: "No child processes" at /path/to/code.pl'.
I've looked around and can't figure out what the right solution is for this. There was a suggestion to change $SIG{CHLD} from 'IGNORE' to 'DEFAULT', but then I have to worry about zombie processes.
What is causing the "No child processes" error, and how do we fix this?
"There was a suggestion to change $SIG{CHLD} from 'IGNORE' to 'DEFAULT', but then I have to worry about zombie processes."
This isn't true.
A zombie process is a process that has ended, but hasn't been reaped by its parent yet. A parent reaps its children using wait(2), waitpid(2) or similar. capture waits for its child to end, so it doesn't leave any zombie behind.
In fact, the error you are getting is from waitpid. capture is waiting for the child to end so that it can reap it and collect its exit code, but you instructed the OS to clean up the child as soon as it completes, leaving waitpid with no child to reap and no exit code to collect.
To fix this problem, simply place local $SIG{CHLD} = 'DEFAULT'; before the call to capture.
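The underlying behaviour can be reproduced without Perl. The following C sketch (wait_result_with is a hypothetical helper, and the described result assumes a Linux-like system) sets the SIGCHLD disposition, forks a child that exits at once, and reports what waitpid() makes of it. Restoring the previous disposition at the end plays the role of Perl's local.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: `disposition` is SIG_IGN or SIG_DFL.
   Returns 0 if waitpid() could reap the child, the errno value it failed
   with otherwise, or -1 if the disposition could not be set. */
int wait_result_with(void (*disposition)(int))
{
    void (*previous)(int) = signal(SIGCHLD, disposition);
    if (previous == SIG_ERR)
        return -1;

    int result = 0;
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);                       /* child: finish immediately */

    if (pid < 0)
        result = errno;                 /* fork itself failed */
    else if (waitpid(pid, NULL, 0) < 0)
        result = errno;                 /* nothing left to reap */

    signal(SIGCHLD, previous);          /* put the old setting back */
    return result;
}
```

With SIG_IGN the call should return ECHILD, which is the "No child processes" message from the question; with SIG_DFL it should return 0 because the child is still there to be reaped.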

How many total number of processes are there?

Yesterday I had an interview, and I was asked a question about this code snippet, which uses fork():
void main()
{............
    for (int k=1;k<=10;k++)
    {
        pid[k]=fork();
        if(!pid[k])
            execvp(.....);
    }
}
According to my understanding, I said there would be 1024 processes in total including the parent: 2^n - 1 = 1023 children plus 1 parent, where n is the total number of forks.
But the interviewer replied that my answer was wrong.
What is wrong with my understanding?
Given this code
pid[k]=fork();
if(!pid[k])
execvp(.....);
and reading the man page of fork which states that
On success, the PID of the child process is returned in the parent,
and 0 is returned in the child.
we know that the child process will perform the exec call (and go on executing a different program), whereas the parent will loop and create another child.
This means that a child will be created for each iteration of the loop, in this case 10 times. So, the answer is 10 children + 1 parent = 11.
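A minimal sketch of that count, assuming the children exec the standard true utility as a stand-in for the elided execvp arguments; spawn_and_count is a hypothetical name. The parent reaps its children afterwards and counts them.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks n times, each child replacing itself with
   another program, and returns how many children the parent reaped. */
int spawn_and_count(int n)
{
    for (int k = 0; k < n; k++) {
        pid_t pid = fork();
        if (pid == 0) {
            /* child: becomes a different program, so it never loops */
            execlp("true", "true", (char *)NULL);
            _exit(127);     /* only reached if exec failed; still leave the loop */
        }
        /* parent (or failed fork): go on to the next iteration */
    }

    int reaped = 0;
    while (wait(NULL) > 0)
        reaped++;
    return reaped;
}
```

spawn_and_count(10) should return 10, because only the original process ever reaches the next iteration of the loop.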
Now, if the program that gets started by exec is the same program, the fun will stop only when the computer's memory is exhausted: on every iteration 10 programs will each create 10 children, each of which will create 10 children, and so on. A peculiarity of fork() is that parent and child get a copy of the same variables, including the loop counter, so plain forking would lead to a predictable number of children (some figure related to a power of 2). Obviously this isn't true when a program gets exec'd, because it starts again from scratch, which means that available memory will be the only limit.