simple parallel processing in perl - perl

I have a few blocks of code, inside a function of some object, that can run in parallel and speed things up for me.
I tried using subs::parallel in the following way (all of this is in a body of a function):
my $is_a_done = parallelize {
# block a, do some work
return 1;
};
my $is_b_done = parallelize {
# block b, do some work
return 1;
};
my $is_c_done = parallelize {
# block c depends on a so let's wait (block)
if ($is_a_done) {
# do some work
};
return 1;
};
my $is_d_done = parallelize {
# block d, do some work
return 1;
};
if ($is_a_done && $is_b_done && $is_c_done && $is_d_done) {
# just wait for all to finish before the function returns
}
First, notice that I use if to make a block wait for an earlier thread to finish when that is needed (is there a better idea? The if is quite ugly...).
Second, I get an error:
Thread already joined at /usr/local/share/perl/5.10.1/subs/parallel.pm line 259.
Perl exited with active threads:
1 running and unjoined
-1 finished and unjoined
3 running and detached

I haven't seen subs::parallel before, but given that it's doing all of the thread handling for you, and it seems to be doing it wrong, based on the error message, I think it's a bit suspect.
Normally I wouldn't just suggest throwing it out like that, but what you're doing really isn't any harder with the plain threads interface, so why not give that a shot, and simplify the problem a bit? At the same time, I'll give you an answer to the other part of your question.
use threads;
my @jobs;
push @jobs, threads->create(sub {
# do some work
});
push @jobs, threads->create(sub {
# do some other work
});
# Repeat as necessary :)
$_->join for @jobs; # Wait for everything to finish.
You need something a little bit more intricate if you're using the return values from those subs (simply switching to a hash would help a good deal) but in the code sample you provided, you're ignoring them, which makes things easy.
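The hash-based variant mentioned above could look roughly like the sketch below. It assumes a perl built with thread support; the run_jobs helper, the job names, and the placeholder return values are all made up for illustration. Because block c depends on a, the main thread joins a first and passes its result to c.

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: runs blocks a, b and d in parallel, starts c once a
# has finished, and returns a hash ref of job name => return value.
sub run_jobs {
    my %jobs = (
        a => threads->create(sub { return 'a done' }),
        b => threads->create(sub { return 'b done' }),
        d => threads->create(sub { return 'd done' }),
    );
    my %result;

    # c depends on a: wait for a, then hand its result to c as an argument.
    ($result{a}) = delete($jobs{a})->join;
    $jobs{c} = threads->create(sub { return "c saw: $_[0]" }, $result{a});

    # Wait for everything else; join hands back each thread's return value.
    for my $name (sort keys %jobs) {
        ($result{$name}) = $jobs{$name}->join;
    }
    return \%result;
}

my $result = run_jobs();
print "$_ => $result->{$_}\n" for sort keys %$result;
```

Every thread is joined exactly once here, which avoids the "running and unjoined" complaint at exit.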

Win32::Process::KillProcess not returning proper exitcode

I am writing a function in perl which will kill a process given its PID.
sub ShutPidForWindows()
{
require Win32::Process;
$PID = 1234;
$count = 0;
$ReturnStatus = 0;
$ExitCode = 0 ;
if ($PID == 0)
{
return ($ReturnStatus);
}
Win32::Process::KillProcess($PID, $ExitCode);
print "PID = ".$PID."\n";
print "Return Code = ".$ExitCode."\n";
if ($ExitCode)
{
$ReturnStatus = 1;
}
else
{
$ReturnStatus = 2;
}
return ($ReturnStatus);
}
When this function is executed it always returns 2, even though process 1234 does not exist.
The output I get is:
PID = 1234
Return Code = 0
Perl Doc says that ExitCode will be populated by the exit code returned by the process. Then ExitCode should be 1.
Am I doing anything wrong?
The problem is that you are using require instead of use to load the module. Sometimes this is OK, but you should always follow the examples in the module's documentation.
You must also always use strict and use warnings at the top of every Perl program you write. This will make it necessary to declare all of your variables, which should be done as close as possible to their first point of use. These measures will reveal many errors that you may otherwise overlook, and are especially important when you are asking others for help with your code.
If you examine $^E after the call to Win32::Process::KillProcess, you might see a value like
The parameter is incorrect
which should tell you that you did something wrong.

perl code structure for post-processing

The question I have is a bit abstract, but I'll attempt to be clear in my statement. (This is something of a "rubber duck effect" post, so I'll be thankful if just typing it out gets me somewhere. Replies, however, would be brilliant!)
I have old fortran code that I can't change (at least not yet) so I'm stuck with its awkward output.
I'm using perl to post-process poorly annotated ascii output files, which, as you might imagine, are a very specialized mixture of text and numbers. "Ah, the perfect perl objective," you say. Yes, it is. But what I've come up with is pretty terrible coding, I've recently concluded.
My question is about the generic structure that folks prefer to use to achieve such an objective. As I've said, I'm not happy with the one I've chosen.
Here's some pseudocode of the structure that I've arrived at:
flag1 = 0;
flag2 = 0;
while (<INPUT>) {
if (cond1) {
do something [like parse and set a header];
flag1 = 1;
} else {
next;
}
if (flag1 == 1 && cond2) {
do something else [like process a block of data];
} else {
next;
}
}
The objective of the above code is to be able to break the processing into blocks corresponding to the poorly partitioned ascii file -- there isn't much by way of "tags" in the file, so the conditions (cond1, cond2, etc.) are involved. The purpose of having the flags set is, among other reasons, to track the progress of the code through the file.
It occurs to me now that a better structure might be
while (<INPUT>) {
do stuff;
}
while (<INPUT>) {
do other stuff;
}
In any event, if my rambling inspires any thoughts I'd appreciate hearing them.
Thank you
Your original structure is perfectly fine. You're building a state machine, and doing it in a completely reasonable way that can't really be made any more idiomatic.
The only thing you can possibly do if you wish is to modularize the code a wee bit:
our %state = (last => 0, current => 0, next => 0);
our %extra_flags = ();
# Each condition sub takes ($line, $current_state) and returns the next
# state, or 0 if the condition is false.
sub cond1 { my ($line, $current) = @_; ... }
sub cond2 { my ($line, $current) = @_; ... }
our %conditions = (
0 => \&cond1,
1 => \&cond2, # flag1 is set
);
our %do_stuff; # $do_stuff{$from}{$to} is a code ref that handles the line
while (my $line = <INPUT>) {
my $current = $state{current};
if ($state{next} = $conditions{$current}->($line, $current)) {
$do_stuff{$current}{ $state{next} }->($line);
$state{last} = $current;
$state{current} = $state{next};
}
}
If the file does indeed lend itself to being processed in multiple loops, that would be a much clearer way to do it than emulating that with conditionals, IMO.
If not, even if there are just a few exceptions to code around, it's probably better to stick with the original approach you describe.
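For a concrete picture of the single-loop, flag-driven approach, here is a minimal sketch. The input format is an assumption invented for the example (a "BLOCK <name>" header line followed by numeric rows), as is the parse_blocks name; the real conditions would be whatever the Fortran output requires.

```perl
use strict;
use warnings;

# Hypothetical format: a line "BLOCK <name>" starts a block, and the numeric
# lines after it belong to that block. Everything else is annotation.
sub parse_blocks {
    my (@lines) = @_;
    my %blocks;
    my $current;    # name of the open block; doubles as the "header seen" flag

    for my $line (@lines) {
        if ($line =~ /^BLOCK\s+(\S+)/) {                      # cond1: header
            $current = $1;
            $blocks{$current} = [];
        }
        elsif (defined $current && $line =~ /^\s*[-+]?\d/) {  # cond2: data row
            push @{ $blocks{$current} }, [ split ' ', $line ];
        }
        # anything else is annotation or a blank line and is skipped
    }
    return \%blocks;
}

my $blocks = parse_blocks(
    "some preamble",
    "BLOCK pressure",
    "1.0 2.5",
    "3.0 4.5",
    "-- end of table --",
    "BLOCK temperature",
    "300 301",
);
printf "%s: %d row(s)\n", $_, scalar @{ $blocks->{$_} } for sort keys %$blocks;
```

Keeping the state in one variable ($current) rather than several numeric flags makes each branch read as "which block am I in, and does this line belong to it".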

Alternative to "last" in do loops

According to the perl manual for last (http://perldoc.perl.org/functions/last.html), last can't be used to break out of do {} loops, but it doesn't mention an alternative. The script I'm maintaining has this structure:
do {
...
if (...)
{
...
last;
}
} while (...);
and I'm pretty sure the original author wants to go to the end of the loop, but it's actually exiting the current subroutine, so I need to either change the last or refactor the whole loop, if there is a better way that someone can recommend.
Wrap the do "loop" in a bare block (which is a loop):
{
do {
...
if (...)
{
...
last;
}
} while (...);
}
This works for last and redo, but not next; for that place the bare block inside the do block:
do {{
...
if (...)
{
...
next;
}
...
}} while (...);
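Both wrappings can be tried out with a small runnable sketch. The sub names and the arithmetic are hypothetical and exist only to give last and next something to do.

```perl
use strict;
use warnings;

# Bare block outside the do: "last" leaves the whole do/while.
sub first_multiple {
    my ($step, $limit) = @_;
    my $n = 0;
    my $found;
    {
        do {
            $n++;
            if ($n % $step == 0) {
                $found = $n;
                last;           # exits the enclosing bare block
            }
        } while ($n < $limit);
    }
    return $found;              # undef if nothing matched
}

# Bare block inside the do: "next" skips to the while condition.
sub odd_numbers_up_to {
    my ($limit) = @_;
    my @odds;
    my $n = 0;
    do {{
        $n++;
        next if $n % 2 == 0;    # leaves the inner block, condition still runs
        push @odds, $n;
    }} while ($n < $limit);
    return @odds;
}

print first_multiple(3, 10), "\n";
print join(' ', odd_numbers_up_to(5)), "\n";
```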
do BLOCK while (EXPR) is funny in that do is not really a loop structure. So, last, next, and redo are not supposed to be used there. Get rid of the last and adjust the EXPR to evaluate false when that situation is found.
Also, turn on strict and warnings; warnings should at least flag this with "Exiting subroutine via last".
Never a fan of do/while loops in Perl. The do isn't really a loop, which is why last won't break out of it. In our old Pascal daze you couldn't exit a loop in the middle because that would be wrong according to the sage Niklaus "One entrance/one exit" Wirth. Therefore, we had to create an exit flag. In Perl it'd look something like this:
my $endFlag = 0;
do {
...
if (...)
{
...
$endFlag = 1;
}
} while ((...) and (not $endFlag));
Now, you can see why Pascal never caught on.
Why not just use a while loop?
while (...) {
...
if (...) {
last;
}
}
You might have to change your logic slightly to accommodate the fact that your test is at the beginning instead of end of your loop, but that should be trivial.
By the way, you actually CAN break out of a Pascal loop if you're using Delphi, and Delphi DID catch on for a little while until Microsoft wised up and came out with the .net languages.
From http://perldoc.perl.org/functions/last.html:
last cannot be used to exit a block that returns a value such as eval {} , sub {} or do {} , and should not be used to exit a grep() or map() operation.
So, use a boolean in the 'while()' and set it where you have 'last'...
Late to the party - I've been messing with for(;;) recently. In my rudimentary testing, for conditional expressions A and B, what you want to do with:
do {
last if A;
} while(B);
can be accomplished as:
for(;; B || last) {
last if A;
}
A bit ugly, but perhaps not more so than the other workarounds :) . An example:
my $i=1;
for(;; $i<=3 || last) {
print "$i ";
++$i;
}
Outputs 1 2 3. And you can combine the increment if you want:
my $i=1;
for(;; ++$i, $i<=3 || last) {
print "$i ";
}
(using || because it has higher precedence than ,)

What does for (;;) mean in Perl?

I was looking through a fellow developer's code when I saw this:
for (;;){
....
....
....
}
I have never seen ";;" used in a loop. What does this do exactly?
It loops forever. ';;' equates to no start value, no stop condition and no increment condition.
It is equivalent to
while (true)
{
...
}
There would usually be a conditional exit somewhere in the body of the loop (last in Perl, break in C-like languages), unless it is something like a system idle loop or message pump.
All 3 parts are optional. An empty loop initialization and update is a noop. An empty terminating condition is an implicit true. It's essentially the same as
while (true) {
//...
}
Note that it doesn't have to be all-or-nothing; you can have some parts and not others.
for (init; cond; ) {
//...
}
for (; cond; update) {
//...
}
for (init; ; update) {
//...
}
Just like in C, the for loop has three sections:
a pre-loop section, which executes before the loop starts.
a continuing condition section which, while true, will keep the loop going.
a post-iteration section which is executed after each iteration of the loop body.
For example:
for (i = 1, acc = 0; i <= 10; i++)
acc += i;
will add up the numbers from 1 to 10 inclusive (in C and, assuming you use Perl syntax like $i and braces, in Perl as well).
However, nothing requires that the sections actually contain anything and, if the condition is missing, it's assumed to be true.
So the for(;;) loop basically just means: don't do any loop setup, loop forever (breaks notwithstanding) and don't do any iteration-specific processing. In other words, it's an infinite loop.
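In Perl syntax, the two cases described above might be written as follows. This is only an illustrative sketch; the sub names sum_to and halvings are invented for the example.

```perl
use strict;
use warnings;

# The C-style sum from the text, with all three sections filled in.
sub sum_to {
    my ($n) = @_;
    my $acc = 0;
    for (my $i = 1; $i <= $n; $i++) {
        $acc += $i;
    }
    return $acc;
}

# for (;;) has no condition, so it only stops when the body says so.
# Perl spells the exit "last" rather than "break".
sub halvings {
    my ($n) = @_;
    my $count = 0;
    for (;;) {
        last if $n <= 1;
        $n = int($n / 2);
        $count++;
    }
    return $count;
}

print sum_to(10), "\n";     # 55
print halvings(8), "\n";    # 3
```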
Infinite loop. A lot of the time it will be used in a thread to do some work.
boolean exit = false;
for(;;) {
if(exit) {
break;
}
// do some work
}
Infinite Loop (until you break out of it).
It's often used in place of:
while(true) { // Do something }
It's the same as
while(true) {
...
}
It loops forever.
You don't need to specify all of the parts of a for loop. For example the following loop (which contains no body, or update token) will perform a linear search of myArray
for($index = -1; $myArray[++$index] != $target;);

How can I cleanly handle error checking in Perl?

I have a Perl routine that manages error checking. There are about 10 different checks and some are nested, based on prior success. These are typically not exceptional cases where I would need to croak/die. Also, once an error occurs, there's no point in running through the rest of the checks.
However, I can't seem to think of a neat way to solve this issue except by using something analogous to the following horrid hack:
sub lots_of_checks
{
if(failcond)
{
goto failstate;
}
elsif(failcond2)
{
goto failstate;
}
#This continues on and on until...
return 1; #O happy day!
failstate:
return 0; #Dead...
}
What I would prefer to be able to do would be something like so:
do
{
if(failcond)
{
last;
}
#...
};
An empty return statement is a better way of returning false from a Perl sub than returning 0. In list context, return 0 produces the one-element list (0), and a non-empty list counts as true:
sub lots_of_checks {
return if fail_condition_1;
return if fail_condition_2;
# ...
return 1;
}
Perhaps you want to have a look at the following articles about exception handling in perl5:
perl.com: Object Oriented Exception Handling in Perl
perlfoundation.com: Exception Handling in Perl
You absolutely can do what you prefer.
Check: {
last Check
if failcond1;
last Check
if failcond2;
success();
}
Why would you not use exceptions? Any case where the normal flow of the code should not be followed is an exception. Using "return" or "goto" is really the same thing, just more "not what you want".
(What you really want are continuations, which "return", "goto", "last", and "throw" are all special cases of. While Perl does not have full continuations, we do have escape continuations; see http://metacpan.org/pod/Continuation::Escape)
In your code example, you write:
do
{
if(failcond)
{
last;
}
#...
};
This is probably the same as:
eval {
if(failcond){
die 'failcond';
}
};
If you want to be tricky and ignore other exceptions:
my $magic = [];
eval {
if(failcond){
die $magic;
}
};
if ($@ && !(ref $@ && $@ == $magic)) {
die $@; # rethrow anything that is not our own marker
}
Or, you can use the Continuation::Escape module mentioned above. But
there is no reason to ignore exceptions; it is perfectly acceptable
to use them this way.
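Put together as a runnable sketch, the exception-based version could look like this. The run_checks helper and the $CHECK_FAILED marker are hypothetical names; each check is assumed to be a code ref that returns true on success.

```perl
use strict;
use warnings;

# Unique marker thrown when a check fails (hypothetical name).
my $CHECK_FAILED = [];

# Runs each check in order and stops at the first failure.
# Returns 1 if all pass, an empty return if one fails, and rethrows
# any exception that is not our own marker.
sub run_checks {
    my (@checks) = @_;
    my $ok = eval {
        for my $check (@checks) {
            die $CHECK_FAILED unless $check->();
        }
        1;
    };
    return 1 if $ok;

    my $err = $@;
    return if ref $err && $err == $CHECK_FAILED;   # expected failure
    die $err;                                      # someone else's exception
}

print run_checks(sub { 1 }, sub { 1 }) ? "all clear\n" : "failed\n";
print run_checks(sub { 1 }, sub { 0 }) ? "all clear\n" : "failed\n";
```

Comparing references with == compares their addresses, so only the exact marker created above is treated as a normal check failure.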
Given your example, I'd write it this way:
sub lots_of_checks {
local $_ = shift; # Lexical 'my $_' worked from 5.10 but was removed in 5.24
return if /condition1/;
return if /condition2/;
# etc.
return 1;
}
Note the bare return instead of return 0. This is usually better because it respects context; the value will be undef in scalar context and () (the empty list) in list context.
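The difference is easy to see in a short sketch. The two subs are hypothetical stand-ins for a check that fails.

```perl
use strict;
use warnings;

sub fails_with_zero { return 0 }    # hypothetical failing check
sub fails_with_bare { return }      # same, with a bare return

# In list context, return 0 gives the one-element list (0),
# while a bare return gives the empty list.
my @zero = fails_with_zero();
my @bare = fails_with_bare();

print "return 0: ", scalar(@zero), " element(s), ", (@zero ? "true" : "false"), "\n";
print "return:   ", scalar(@bare), " element(s), ", (@bare ? "true" : "false"), "\n";

# In scalar context both are false.
print "scalar, return 0: ", (fails_with_zero() ? "true" : "false"), "\n";
print "scalar, return:   ", (fails_with_bare() ? "true" : "false"), "\n";
```

So a caller who writes my @errors = lots_of_checks() and then tests @errors gets the right answer only with the bare return.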
If you want to hold to a single-exit point (which is slightly un-Perlish), you can do it without resorting to goto. As the documentation for last states:
... a block by itself is semantically identical to a loop that executes once.
Thus "last" can be used to effect an early exit out of such a block.
sub lots_of_checks {
local $_ = shift;
my $all_clear;
{
last if /condition1/;
last if /condition2/;
# ...
$all_clear = 1; # only set if all checks pass
}
return unless $all_clear;
return 1;
}
If you want to keep your single in/single out structure, you can modify the other suggestions slightly to get:
sub lots_of_checks
{
goto failstate if failcond1;
goto failstate if failcond2;
# This continues on and on until...
return 1; # O happy day!
failstate:
# Any clean up code here.
return; # Dead...
}
IMO, Perl's use of the statement modifier form "return if EXPR" makes guard clauses more readable than they are in C. When you first see the line, you know that you have a guard clause. This feature is often denigrated, but in this case I am quite fond of it.
Using the goto with the statement modifier retains the clarity, and reduces clutter, while it preserves your single exit code style. I've used this form when I had complex clean up to do after failing validation for a routine.