Are there any good timer implementations in Perl?

I'm looking for a good timer implementation in Perl. The situation I have is this: I need to keep track of I/O activity on many files, and any file that stays untouched for long enough should have a remove action taken on it, so an efficient timer implementation is really vital for the app I'm working on right now. To avoid reinventing the wheel, I'm asking you for help first.

Time::HiRes comes with Perl.
Furthermore, your application sounds like it could benefit from Linux::Inotify (note the Linux:: in front). When setting the timer for a file that you want to remove after a certain period of inactivity, remember the last access time. In an inotify event hook, update this time to the current time. Then you can periodically check whether a file's lifetime has expired without doing a stat on all of the files you track. On expiration, you could add a final check just to make sure nothing went wrong, of course.
If you have huge numbers of files in flight, you may want to keep the list of files sorted by expiration time. That makes the periodic check for expiration trivial.
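For illustration, here's a rough sketch of what that trivial check could look like once the list is kept sorted by expiration time (remove_file is a placeholder for whatever removal action you actually take):

# @tracked holds [ $filename, $expires_at ] pairs, sorted ascending by $expires_at
sub expire_front {
    my ($tracked) = @_;
    my $now = time;
    # only the head of a sorted list can be expired; stop at the first live entry
    while (@$tracked and $tracked->[0][1] <= $now) {
        my ($filename) = @{ shift @$tracked };
        remove_file($filename);   # placeholder for your actual removal action
    }
}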
Update: I just did a little experimentation with Linux::Inotify. Things aren't as easy with that approach as I thought. First, here's the partially working code that I didn't have time to finish.
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw/min max/;
use Time::HiRes qw/time sleep/;
use Data::Dumper;
use Linux::Inotify;

# [s], but handles subsecond granularity, too
use constant CLEANUP_INTERVAL    => 1.;
use constant FILE_ACCESS_TIMEOUT => 5.;

# for fast and readable struct access
use constant FILENAME   => 0;
use constant ACCESSTIME => 1;
use constant WATCHER    => 2;

my $notifier = Linux::Inotify->new;
my @tracked_files = populate_tracked_files(\@ARGV, $notifier);
warn Dumper \@tracked_files;

while (1) {
    # update the tracked files according to inotify events
    my @events = $notifier->read;
    my %files_seen_this_round;
    foreach my $event (@events) {
        $event->print();
        my $ev_filename = $event->{name}; # part of the API, apparently

        # we may have multiple events per file
        next if $files_seen_this_round{$ev_filename}++;

        # find and update the right tracked file
        # TODO: this could be optimized to O(1) with a hash at
        #       the cost of more bookkeeping
        foreach my $tfile (@tracked_files) {
            if ($tfile->[FILENAME] eq $ev_filename) {
                # -A is script start time minus file access time, in days
                my $atime = $^T - 60*60*24 * -A $ev_filename; # update access time
                $tfile->[ACCESSTIME] = $atime;
                # a partial bubble sort would be hugely more efficient here!
                # => O(n) from O(n*log(n))
                @tracked_files = sort { $a->[ACCESSTIME] <=> $b->[ACCESSTIME] }
                                 @tracked_files;
                last;
            }
        } # end foreach tracked file
    } # end foreach event

    cleanup_files(\@tracked_files);
    sleep(CLEANUP_INTERVAL);
    last if not @tracked_files;
} # end while(1)

$notifier->close;

sub cleanup_files {
    my $files = shift;
    my $now   = time();
    for (my $fileno = 0; $fileno < @{$files}; ++$fileno) {
        my $file = $files->[$fileno];
        if ($now - $file->[ACCESSTIME] > FILE_ACCESS_TIMEOUT) {
            warn "File '" . $file->[FILENAME] . "' timed out";
            # remove this file from the watch list
            # (and delete in your scenario)
            $file->[WATCHER]->remove;
            splice @$files, $fileno, 1;
            $fileno--;
        }
    }
}

sub populate_tracked_files {
    my $files    = shift;
    my $notifier = shift;
    my @tracked_files;
    foreach my $file (@$files) {
        die "Not a file: '$file'" if not -f $file;
        my $watch = $notifier->add_watch($file, Linux::Inotify::ALL_EVENTS);
        push @tracked_files, [$file, $^T - 60*60*24 * -A $file, $watch];
    }
    @tracked_files = sort { $a->[ACCESSTIME] <=> $b->[ACCESSTIME] }
                     @tracked_files;
    return @tracked_files;
}
There may still be a bug or two in the time-checking logic, but the main problem is that $notifier->read() blocks until a new event arrives, whereas we really just want to check whether there is a new event and then proceed to the cleanup. A non-blocking read of the file descriptor would have to be added to Linux::Inotify. Anybody can take over maintenance of the module, since the author is no longer interested.
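If switching modules is an option, Linux::Inotify2 (a separate module) exposes a non-blocking mode and a poll method, which sidesteps the blocking read entirely. A rough sketch, assuming its documented watch/poll interface:

use strict;
use warnings;
use Linux::Inotify2;

my $inotify = Linux::Inotify2->new or die "inotify: $!";
$inotify->blocking(0);                       # poll() returns immediately when idle

my %last_access;
for my $file (@ARGV) {
    $inotify->watch($file, IN_ACCESS | IN_MODIFY, sub {
        my $event = shift;
        $last_access{ $event->fullname } = time;   # refresh on any activity
    }) or die "watch $file: $!";
}

while (1) {
    $inotify->poll;                          # dispatch any pending events
    # ... expire files whose $last_access is older than the timeout ...
    sleep 1;
}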

As your program seems clearly event-driven, you would benefit from implementing it with an event-driven framework such as POE or AnyEvent. Those have all the pieces to handle both I/O events and timer events.
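For instance, AnyEvent's timer watchers cover the periodic-expiration part directly; here is a minimal sketch, where expire_idle_files is a stand-in for your own cleanup logic:

use strict;
use warnings;
use AnyEvent;

my $done = AnyEvent->condvar;

# fire once a second and sweep for idle files
my $cleanup_timer = AnyEvent->timer(
    after    => 1,
    interval => 1,
    cb       => sub { expire_idle_files() },
);

sub expire_idle_files {
    # walk your tracked-file list here and remove anything idle for too long
}

$done->recv;   # enter the event loop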

Related

Why is my Tie::IxHash program taking a long time?

Basically, I have a script to create a hash for COGs with corresponding gene IDs:
use Storable qw(store retrieve);   # provides store/retrieve
use Tie::IxHash;                   # needed for the tie below

# Open directory and get all the files in it
opendir(DIR, "/my/path/to/COG/");
my @infiles = grep(/OG-.*\.fasta/, readdir(DIR));
closedir(DIR);

# Create hash for COGs and their corresponding gene IDs
tie my %ids_for, 'Tie::IxHash';
if (! -e '/my/path/to/COG/COG_hash.ref') {
    for my $infile (@infiles) {
        ## $infile
        %ids_for = (%ids_for, read_COG_fasta($infile));
    }
    ## %ids_for
    store \%ids_for, '/my/path/to/COG/COG_hash.ref';
}

my $id_ref = retrieve('/my/path/to/COG/COG_hash.ref');
%ids_for = %$id_ref;
## %ids_for
The problem isn't that it doesn't work (at least I think), but that it is extremely slow for some reason. When I tried a test run, it would have taken weeks to produce an actual result. Somehow the hash creation is really slow, and I'm sure there is some way to optimize it so it works much faster.
Ideally, the paths should be inputs to the script, so there would be no need to change the script whenever a path changes.
It would also be great if there were a way to see the progress of the hash creation, like maybe have it show that it is 25% done, 50% done, 75% done and ultimately 100% done. Regarding this last point, I have seen things like Term::ProgressBar, but I am not sure whether it would be appropriate in this case.
Do you really need Tie::IxHash?
That aside, I suspect your culprit is this set of lines:
for my $infile (@infiles) {
    ## $infile
    %ids_for = (%ids_for, read_COG_fasta($infile));
}
To add keys to the hash, you are creating a list of all the current key-value pairs, adding the new pairs, and then assigning it all back to the hash, so every iteration copies the entire hash built so far.
What happens if you take the results from read_COG_fasta and add the keys one at a time?
for my $infile (@infiles) {
    my %new_hash = read_COG_fasta($infile);
    foreach my $key ( keys %new_hash ) {
        $ids_for{$key} = $new_hash{$key};
    }
}
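If it turns out you don't need Tie::IxHash (that is, insertion order doesn't matter), a hash slice does the same merge in one statement; this sketch assumes a plain hash:

for my $infile (@infiles) {
    my %new_hash = read_COG_fasta($infile);
    @ids_for{ keys %new_hash } = values %new_hash;   # bulk insert via hash slice
}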
As for progress, I usually have something like this when I'm trying to figure out something:
use v5.26;

my $file_count = @files;
foreach my $n ( 0 .. $#files ) {
    say "[$n/$file_count] Processing $files[$n]";
    my %result = ...;   # whatever you compute for this file
    printf "\tGot %d results", scalar %result; # counting keys this way is a v5.26 feature!
}
You could do the same sort of thing with the keys that you get back so you can track the size.
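For the percentage-style progress you asked about, Term::ProgressBar should work fine here; a small sketch, assuming the file list is in @infiles:

use Term::ProgressBar;

my $progress = Term::ProgressBar->new({
    name  => 'COG files',
    count => scalar @infiles,
});

for my $n (0 .. $#infiles) {
    my %new_hash = read_COG_fasta($infiles[$n]);
    $ids_for{$_} = $new_hash{$_} for keys %new_hash;
    $progress->update($n + 1);    # draws the bar as it fills toward 100%
}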

Efficient way to make thousands of curl requests

I am using curl to make thousands of requests. In my code I set the cookie to a specific value and then read in the value on the page. Here is my Perl code:
#!/usr/bin/perl
my $site = "http://SITENAME/?id=";
my $cookie_name = "cookienum123";

open my $fh, '>', 'results.txt' or die $!;   # $fh is used below but was never opened
print $fh "#\t\tValue\n";
for my $i ('1'..'10000') {
    my $output = `curl -s -H "Cookie: $cookie_name=$i" -L $site$i | grep -Eo "[0-9]+"`;
    print "$i\t\t$output\n";
}
So from 1 to 10000, I set cookienum123 to that value and read in the whole response from the page. Then I use grep to extract just the number. The code I have now works fine, but I am wondering if there is a faster or more efficient way to do this.
Please note this does not have to be done as a Perl script (I can also use Windows batch file, Unix shell script, etc).
Edit Jan 18: Added bounty with the note "The desired answer should include a way in Perl to run through several thousand curl requests simultaneously but it needs to be run faster than the rate it is currently running at. It has to write the output to a single file in the end but the order does not matter." Some of the comments below mention fork, but I am not sure how to apply it to my code. I am very new to Perl, as this is my first program in it.
What you have here is an embarrassingly parallel problem. These are great for parallelising, because there's no inter-thread dependency or communication needed.
There are two key ways of doing this in Perl: threading or forking. I would generally suggest thread-based parallel processing for the kind of thing you're doing. This is a matter of choice, but I think it's better suited for collating information.
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $numthreads = 20;

my $site        = "http://SITENAME/?id=";
my $cookie_name = "cookienum123";

my $fetch_q   = Thread::Queue->new();
my $collate_q = Thread::Queue->new();

#fetch sub sits in a loop, takes items off 'fetch_q' and runs curl.
sub fetch {
    while ( my $target = $fetch_q->dequeue() ) {
        my $output =
            `curl -s -H "Cookie: $cookie_name=$target" -L $site$target | grep -Eo "[0-9]+"`;
        $collate_q->enqueue($output);
    }
}

#one instance of collate, which exists to serialise the output from fetch.
#writing files concurrently can get very messy and build in race conditions.
sub collate {
    open( my $output_fh, ">", "results.txt" ) or die $!;
    print {$output_fh} "#\t\tValue\n";
    while ( my $result = $collate_q->dequeue() ) {
        print {$output_fh} $result;
    }
    close($output_fh);
}

## main bit:

#start worker threads
my @workers = map { threads->create( \&fetch ) } 1 .. $numthreads;

#collates results.
my $collater = threads->create( \&collate );

$fetch_q->enqueue( '1' .. '10000' );
$fetch_q->end();

foreach my $thr (@workers) {
    $thr->join();
}

#end collate_q here, because we know all the fetchers are
#joined - so no more results will be generated.
#queue will then generate 'undef' when it's empty, and the thread will exit.
$collate_q->end;

#join will block until thread has exited, e.g. all results in the queue
#have been 'processed'.
$collater->join;
This will spawn 20 worker threads that run in parallel and collect their results into a file as they finish. As an alternative, you could do something similar with Parallel::ForkManager, but for data-oriented tasks I personally prefer threading.
You can use the 'collate' sub to postprocess any data, such as sorting it, counting it, whatever.
I would also point out that using curl and grep as system calls isn't ideal. I've left them as they are, but I would suggest looking at LWP and letting Perl handle the text processing, because it's pretty good at it.
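For example, here's a rough sketch of what the fetch sub could look like with LWP::UserAgent instead of shelling out (the regex just grabs the first run of digits, standing in for whatever grep -Eo was matching):

use LWP::UserAgent;

my $ua = LWP::UserAgent->new( timeout => 30 );

sub fetch_with_lwp {
    while ( my $target = $fetch_q->dequeue() ) {
        my $response = $ua->get( "$site$target",
                                 'Cookie' => "$cookie_name=$target" );
        next unless $response->is_success;
        # extract the first run of digits from the body
        my ($value) = $response->decoded_content =~ /([0-9]+)/;
        $collate_q->enqueue("$value\n") if defined $value;
    }
}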
I'm pretty sure the following will do what you want; however, slamming a server with 10000 simultaneous requests is not very polite. In fact, harvesting a site's data by walking the IDs of a given URL doesn't sound very friendly either. I have NOT tested the following, but it should get you 99% of the way there (there might be a syntax/usage error somewhere).
See the following for more info:
https://metacpan.org/pod/distribution/Mojolicious/lib/Mojolicious/Guides/Cookbook.pod#Non-blocking
https://metacpan.org/pod/Mojo::UserAgent#build_tx
https://metacpan.org/pod/Mojo::DOM
Good luck!
#!/usr/bin/perl
use warnings;
use strict;
use Mojo::UserAgent;
use Mojo::IOLoop;

my $site        = 'http://SITENAME/?id=';
my $cookie_name = 'cookienum123';

#open filehandle and write file header
open my $output_fh, q{>}, 'results.txt'
    or die $!;
print {$output_fh} "#\t\tValue\n";

# Use Mojo::UserAgent for concurrent non-blocking requests
my $ua = Mojo::UserAgent->new;

#create your requests
for my $i (1..10000) {
    #build transaction
    my $tx = $ua->build_tx(GET => "$site$i");

    #add cookie header
    $tx->req->cookies({name => $cookie_name, value => $i});

    #start "GET" with callback to write to file
    $tx = $ua->start( $tx => sub {
        my ($ua, $mojo) = @_;
        print {$output_fh} $i . "\t\t" . $mojo->res->dom->to_string;
    });
}

# Start event loop if necessary
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;

#close filehandle
close $output_fh;

Trying to use fork to do a seemingly simple task, but failing miserably

So, basically I have a very large array that I need to read data from. I want to be able to do this in parallel; however, when I tried, I failed miserably. For the sake of simplicity, let's say I have an array with 100 elements in it. My idea was to partition the array into 10 equal parts and try to read them in parallel (10 is arbitrary, but I don't know how many processes I could run at once and 10 seemed low enough). I need to return a computation (a new data structure) based on my readings from each partition, but I am NOT modifying anything in the original array.
Instead of trying the above exactly, I tried something simpler, but I did it incorrectly, because it didn't work in any capacity. So then I tried to simply use child processes to push to an array. The code below uses Time::HiRes to see how much faster I can get this to run using forking as opposed to not, but I'm not at that point yet (I'm going to test that when I have closer to a few million entries in my array):
use strict;
use warnings;
use Time::HiRes;

print "Starting main program\n";

my %child;
my @array=();
my $counter=0;
my $start = Time::HiRes::time();

for (my $count = 1; $count <= 10; $count++)
{
    my $pid = fork();
    if ($pid)
    {
        $child{$pid}++;
    }
    elsif ($pid == 0)
    {
        addToArray(\$counter,\@array);
        exit 0;
    }
    else
    {
        die "couldnt fork: $!\n";
    }
}

while (keys %child)
{
    my $pid = waitpid(-1,0);
    delete $child{$pid};
}

my $stop = Time::HiRes::time();
my $duration = $stop-$start;

print "Time spent: $duration\n";
print "Size of array: ".scalar(@array)."\n";
print "End of main program\n";

sub addToArray
{
    my $start=shift;
    my $count=${$start};
    ${$start}+=10;
    my $array=shift;

    for (my $i=$count; $i<$count +10; $i++)
    {
        push @{$array}, $i;
    }
    print scalar(@{$array})."\n";
}
NB: I used push in lieu of ${$array}[$i]=$i, because I realized that my $counter wasn't actually updating, so that would never work with this code.
I assume this doesn't work because the children are all copies of the original program, so I'm never actually adding anything to the array in the original process. On that note, I'm very stuck. The actual problem I'm trying to solve is how to partition my array (with data in it), read the partitions in parallel, and return a computation based on those readings (NOTE: I'm not going to modify the original array), but I'll never be able to do that if I can't figure out how to get my $counter to update. I'd also like to know how to get the code above to do what I want it to do, but that's a secondary goal.
Once I can get my counter to update correctly, is there any chance that another process would start before it updates and I wouldn't actually be reading in the entire array? If so, how do I account for this?
Please, any help would be much appreciated. I'm very frustrated/stuck. I hope there is an easy fix. Thanks in advance.
EDIT: I attempted to use Parallel::ForkManager, but to no avail:
#!/usr/local/roadm/bin/perl
use strict;
use warnings;
use Time::HiRes;
use Parallel::ForkManager;
my $pm = Parallel::ForkManager->new(10);

for (my $count = 1; $count <= 10; $count++)
{
    my $pid = $pm->start and next;
    sub1(\$counter,\@array);
    $pm->finish; # Terminates the child process
}
$pm->wait_all_children;
I didn't include the other extraneous stuff; see above for the missing code/sub. Again, help would be much appreciated. I'm very new to this and kind of need someone to hold my hand. I also tried to do something with run_on_start and run_on_finish, but they didn't work either.
Your code has two issues: your child processes share no data, and you would have a race condition if the forked processes did share data. The solution is to use threads. Any possibility of race conditions can be eliminated by partitioning the data in the parent thread and, of course, by not using shared data.
Threads
Threads in Perl behave similarly to forked processes: by default, there is no shared memory. This makes using threads quite easy. However, each thread runs its own perl interpreter, which makes threads quite costly. Use them sparingly.
First, we have to activate threading support via use threads. To start a thread, we do threads->create(\&code, @args), which returns a thread object. The code will then run in a separate thread and will be invoked with the given arguments. After the thread has finished execution, we can collect the return value by calling $thread->join. Note: the context of the threaded code is determined by the create method, not by join.
We could mark variables with the :shared attribute. Your $counter and @array would be examples of this, but it is generally better to pass explicit copies of data around than to use shared state (disclaimer: from a theoretical standpoint, that is). To avoid race conditions with shared data, you'd actually have to protect your $counter with a semaphore, but again, there is no need for shared state.
Here is a toy program showing how you could use threads to parallelize a calculation:
use strict;
use warnings;
use threads;
use 5.010; # for `say`, and sane threads
use Test::More;

# This program calculates differences between elements of an array
my @threads;
my @array = (1, 4, 3, 5, 5, 10, 7, 8);
my @delta = (   3, -1, 2, 0, 5, -3, 1 );

my $number_of_threads = 3;
my @partitions = partition( $#array, $number_of_threads );
say "partitions: @partitions";

for (my $lower_bound = 0; @partitions; $lower_bound += shift @partitions) {
    my $upper_bound = $lower_bound + $partitions[0];
    say "spawning thread with [@array[$lower_bound .. $upper_bound]]";
    # pass copies of the values in the array slice to new thread:
    push @threads, threads->create(\&differences, @array[$lower_bound .. $upper_bound]);
    # note that threads->create was called in list context
}

my @received;
push @received, $_->join for @threads; # will block until all are finished

is_deeply \@received, \@delta;
done_testing;

# calculates the differences. This doesn't need shared memory.
# note that @array could have been safely accessed, as it is never written to
# If I had written to a (unshared) variable, these changes would have been thread-local
sub differences {
    say "Hi from a worker thread, I have ", 0+@_, " elements to work on";
    return map $_[$_] - $_[$_-1], 1 .. $#_;
    # or more readable:
    # my @d;
    # for my $i (1 .. $#_) {
    #     push @d, $_[$i] - $_[$i-1];
    # }
    # return @d;
}

# divide workload into somewhat fair parts, giving earlier threads more work
sub partition {
    my ($total, $parts) = @_;
    my $base_size = int($total / $parts);
    my @partitions = ($base_size) x $parts;
    $partitions[$_-1]++ for 1 .. $total - $base_size*$parts;
    return @partitions;
}
A note on the number of threads: This should depend on the number of processors of your system. If you have four cores, more than four threads don't make much sense.
If you're going to use child processes after forking, each child process is autonomous and has its own copy of the data in the program as of the time it was forked from the main program. The changes made by the child in its own memory have no effect on the parent's memory. If you need that, either you need a Perl built with threading support so you can use threads, or you need to think again; you could use shared memory, but getting Perl data into shared memory can be tricky.
So, one option is to read all the data into memory before forking off and having the children work on their own copies of the data.
Depending on the structure of the problem, another possibility might be to have each child read and work on a portion of the data. This won't work if each child must have access to all the data.
It isn't clear how much speed up you'll get through threading or forking if the threads or processes are all tied up reading the same file. Getting the data into memory may be best treated as a single-threaded (single-tasking) operation; the parallelism can spring into effect — and yield benefits — once the data is in memory.
There are some CPAN modules that make your life easier. One of them is Parallel::ForkManager, which is a simple parallel-processing fork manager.
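Since the sticking point above is getting results back from the children, note that Parallel::ForkManager can ship a data structure from each child back to the parent via finish and run_on_finish. A rough sketch of that pattern (the chunking and the doubling are just for illustration):

use strict;
use warnings;
use Parallel::ForkManager;

my @array = (1 .. 100);            # the data, already loaded in the parent
my @results;                       # collected in the parent, not in the children

my $pm = Parallel::ForkManager->new(10);

# runs in the parent whenever a child finishes; the last argument is
# whatever reference the child handed to finish()
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data_ref) = @_;
    push @results, @$data_ref if defined $data_ref;
});

for my $chunk_start (map { $_ * 10 } 0 .. 9) {
    $pm->start and next;           # parent keeps looping; the child falls through

    # child: work on its own copy of one slice of the array
    my @partial = map { $_ * 2 } @array[$chunk_start .. $chunk_start + 9];

    $pm->finish(0, \@partial);     # exit code 0, plus a reference serialised back
}
$pm->wait_all_children;

print "collected ", scalar @results, " values\n";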
So, after my struggle, here's the fix:
EDIT: THIS DOES NOT ACCOMPLISH WHAT I WANTED TO DO
#!/usr/local/roadm/bin/perl
use strict;
use warnings;
use Time::HiRes;
use Parallel::ForkManager;

print "Starting main program\n";

my @array=();
my $counter=0;
my $start = Time::HiRes::time();
my $max_processes=20;
my $partition=10;
my $max_elements=100;

my $pm = Parallel::ForkManager->new($max_processes);

$pm->run_on_start( sub {
    my ($pid, $exit_code, $ident) = @_;
    sub1(\$counter,\@array);
});

while ($counter < $max_elements)
{
    my $pid = $pm->start and next;
    $pm->finish; # Terminates the child process
}
$pm->wait_all_children;

my $stop = Time::HiRes::time();
my $duration = $stop-$start;

print "Time spent: $duration\n";
print "Size of array: ".scalar(@array)."\n";
print "\nEnd of main program\n";

sub sub1 {
    my $start=shift;
    my $count=${$start};
    ${$start}+=$partition;
    my $array=shift;

    for (my $i=$count; $i<$count + $partition; $i++)
    {
        push @{$array}, $i;
    }
    return @{$array};
}

Perl: Best way to use counterbalancing across conditions

I have a script with two conditions (C1 and C2). Every time a user logs in, I want that user to be assigned to one of the two conditions using strict counterbalancing (not random assignment), such that User 1 (U1) is assigned to C1, U2 to C2, U3 to C1, U4 to C2, and so on.
What is the simplest way to do this?
Right now, I was thinking of doing this:
my $cond;
my $out_cfile = "cfile.txt"; # initial value printed in the file is 1
open(CFILE, "+<", $out_cfile) or die "cannot open $out_cfile";
flock(CFILE, 2);
my $cdata = <CFILE>;
my $last = (substr $cdata,-1,1); #get the latest printed value
if ($last == 1) {$cond = 1; print CFILE "2";}
if ($last == 2) {$cond = 2; print CFILE "1";}
close(CFILE);
print "my condition is: $cond";
Is there a way to do this without having to open and print to an output file?
I really don't understand the pain of creating a file in the /tmp directory or wherever if it doesn't already exist, but you could use memcached instead.
use Cache::Memcached qw( );

my $memd = Cache::Memcached->new({
    servers   => [ '127.0.0.1:11211' ],
    namespace => 'app_name:',
});

$memd->add('counter', 0);   # incr needs an existing key; add is a no-op if it already exists
my $val = $memd->incr('counter');
if ($val % 2) {
    ...
} else {
    ...
}
incr increments and fetches atomically to avoid race conditions (so it's safe like your code).
The simplest way would be to use a lock (Warning: Untested, but something like this should work):
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $num_users_logged_in :shared = 1;
my $num_watcher_threads = 1; # it may make sense to make this larger, maybe not.

my @users = (); # populate array of users that are logging in.
my $queue = Thread::Queue->new;
$queue->enqueue(@users);

my @threads = ();
foreach(1..$num_watcher_threads)
{
    push @threads, threads->create(\&log_in);
}

# NOTE: If you ever want this program to stop,
# then you'll have to enqueue some undefs (one for each thread).

# Waiting for each of the threads to finish.
# If you don't want to do this, then detach() instead
$_->join() foreach(@threads);

# Do anything else that you want after all of the threads are done.

sub log_in
{
    # Threads will block until it grabs an undef in the queue
    while(my $user = $queue->dequeue)
    {
        # Do whatever you need to do before locking the variable.
        {
            lock $num_users_logged_in;
            # Other threads will block until we're done with this block of code.

            # Assign conditions based upon the parity of $num_users_logged_in
            # And do whatever else you need for that particular user

            $num_users_logged_in++;
        }
    }
}
I'd use a tied hash, but if you want it to be thread-safe you'll need to implement some form of mutex, maybe a semaphore.
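To make that concrete, here is a rough sketch of the tied-hash idea, using DB_File for the hash and flock on a separate lock file as the mutex (the file names are just examples, and it still touches the filesystem, which you were hoping to avoid):

use strict;
use warnings;
use DB_File;
use Fcntl qw(:flock O_RDWR O_CREAT);

# crude mutex: hold an exclusive lock while we read and bump the counter
open my $lock, '>', '/tmp/conditions.lock' or die $!;
flock $lock, LOCK_EX or die $!;

tie my %state, 'DB_File', '/tmp/conditions.db', O_RDWR | O_CREAT, 0644, $DB_HASH
    or die "cannot tie: $!";
my $n = ++$state{logins};          # persists across runs of the script
untie %state;

close $lock;                       # releases the lock

my $cond = ($n % 2) ? 1 : 2;       # U1 -> C1, U2 -> C2, U3 -> C1, ...
print "my condition is: $cond";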

Problems with joining threads

I've got an issue with a part of my Perl script that has been bothering me for days now. To summarize, the purpose is to read in a large file in chunks and do some operation on the input stream (not relevant to my question). When I first implemented it, I just looped over the file and did some stuff on each chunk, like this:
while (read FILE, $buffer, $chunksize){
    callSomeOperation($buffer);
    # Do some other stuff
}
Unfortunately the file is really big and the operation is fairly complex with many function calls, so memory use grew steadily until perl couldn't allocate memory anymore and the script failed. So I did some investigation and tried several things to minimize the memory overhead (defining variables outside the loop, setting them to undef, and so on), which made the allocated memory grow more slowly, but in the end it still failed. (And if I understand correctly, perl giving memory back to the OS is something that won't happen in practice.)
So I decided to wrap the function call and all its definitions in a thread, wait for it to finish, join it, and then start a new thread with the next chunk:
while (read FILE, $buffer, $chunksize){
    my $thr = threads->create(\&thrWorker, $buffer);
    $thr->join();
}

sub thrWorker{
    # Do the stuff here!
}
This might have been a solution, if the thread would join! But it actually does not. If I run it with $thr->detach(); everything works fine, except that I get hundreds of threads at the same time, which is not a good idea, and in this case I need to run them consecutively.
So I investigated this join issue and heard that there might be a problem with perl 5.16.1, so I updated to 5.16.2, but it still never joins. Somewhere in a mailing list I can't remember, I read that somebody managed to get threads to join with the CPAN module Thread::Queue, but that didn't work for me either.
So I gave up on threads and tried to fork this thing instead. But with fork it seems like the total number of "forks" is limited? Anyway, it went fine until somewhere between the 13th and 20th iteration and then gave up with the message that it couldn't fork anymore.
my $pid = fork();
if( $pid == 0 ){
    thrWorker($buffer);
    exit 0;
}
I also tried it with CPAN modules Parallel::ForkManager and Proc::Fork but that didn't help.
So now I'm stuck and can't figure this out on my own. Maybe somebody else can! Any suggestions are greatly appreciated!
How can I get this thing to work with threads or child processes?
Or at least, how can I force perl to free memory so I can do this in the same process?
Some additional information on my system:
OS: Windows 7 64bit / Ubuntu Server 12.10
Perl on Windows: Strawberry Perl 5.16.2 64bit
One of my first posts on Stackoverflow. Hope I did it right :-)
I recommend reading: this
I usually use Thread::Queue to manage the input to the threads.
Sample code:
use threads;
use Thread::Queue;
use constant NUM_THREADS => 4;   # assumed elsewhere in the original program; defined here so the snippet stands alone

my @threads;
my $Q = Thread::Queue->new;

# Start the threads
for (my $i=0; $i<NUM_THREADS; $i++) {
    $threads[$i] = threads->new(\&insert_1_thread, $Q);
}

# Get the list of sites and put in the work queue
foreach my $row ( @{$ref} ) {
    $Q->enqueue( $row->[0] );
    #sleep 1 while $Q->pending > 100;
} # foreach $row

# Signal we are done
for (my $i=0; $i<NUM_THREADS; $i++) {
    $Q->enqueue( undef );
}

my $count = 0;
# Now wait for the threads to complete before going on to the next step
for (my $i=0; $i<NUM_THREADS; $i++) {
    $count += $threads[$i]->join();
}
And for the worker thread:
sub insert_1_thread {
    my ( $Q )  = @_;
    my $tid    = threads->tid;
    my $count  = 0;

    Log("Started thread #$tid");
    while( my $row = $Q->dequeue ) {
        # PROCESS ME...
        $count++;
    } # while
    Log("Thread#$tid, done");
    return $count;
} # sub insert_1_thread
I don't know if it is a solution for you, but you could create an array of chunk objects and process them in parallel like this:
#!/usr/bin/perl

package Object; {
    use threads;
    use threads::shared;

    sub new(){
        my $class=shift;
        share(my %this);
        return(bless(\%this,$class));
    }

    sub set {
        my ($this,$value)=@_;
        lock($this);
        # $this->{"data"}=shared_clone($value);
        $this->{"data"}=$value;
    }

    sub get {
        my $this=shift;
        return $this->{"data"};
    }
}

package main; {
    use strict;
    use warnings;
    use threads;
    use threads::shared;

    my @objs;
    foreach (0..2){
        my $o = Object->new();
        $o->set($_);
        push @objs, $o;
    }

    threads->create(\&run,(\@objs))->join();

    sub run {
        my ($obj) = @_;
        $$obj[$_]->get() foreach(0..2);
    }
}