How to extract all Ping Parameters in Perl - perl

I am trying to ping around 100 hosts listed in a column of a SQL database. I am using the Net::Ping and Array::Average modules. Is there an alternative such that I can extract the packet loss and RTT (min, max, avg) parameters directly from a Perl script?
Here is my pseudocode:
use Net::Ping;
use Array::Average;

my $p = Net::Ping->new('icmp');
$p->hires();
my $host = "www.xyz.com";
print "$host\n";

my @rtt;
my $j = 0;
for (my $i = 0; $i < 5; $i++) {
    my ($ret, $duration, $ip) = $p->ping($host, 5);   # wait time 5
    if ($ret) {
        printf("$host [ip: $ip] $duration ms\n");
        $rtt[$i] = $duration;
    }
    else {
        $j++;
        # $p->nack( $failed_ack_host );
    }
}
print "@rtt\n";
my $rtt = average(@rtt);
print "The average rtt is $rtt\n";
my $Packet_Loss = ((5 - $j) / 5) * 100;
print "$Packet_Loss\%\n";

Net::Ping does not seem to return aggregate RTT statistics, packet loss, etc. You might try issuing the system ping command and processing the results yourself:
$result = `ping -c 10 $host`;
...then parse $result for the fields you need.
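For instance, here is a rough sketch of parsing the output of a Linux (iputils) ping; the summary-line format differs between platforms, so treat the regexes as assumptions you may need to adapt:
my $host   = "www.xyz.com";
my $result = `ping -c 10 $host`;

# e.g. "10 packets transmitted, 10 received, 0% packet loss, time 9012ms"
my ($loss) = $result =~ /([\d.]+)% packet loss/;

# e.g. "rtt min/avg/max/mdev = 0.040/0.046/0.052/0.005 ms"
my ($min, $avg, $max) =
    $result =~ m{min/avg/max/\w+ = ([\d.]+)/([\d.]+)/([\d.]+)/};

print "loss: $loss%  rtt min/avg/max: $min/$avg/$max ms\n" if defined $max;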

Looking at the documentation and source code, the Net::Ping module carries out just one "ping" each time you call the ping() method. It does not keep any internal "state" based on the success or otherwise of previous calls. The advantage of this approach is that it's simple and no-nonsense. However, if you want aggregated results, you're going to have to do them yourself, as you've discovered.
While it's tempting to call an external ping command, you need to be sure that's what you want: you gain not having to do some maths, but you then become more system-specific (not all systems have a ping command), you are dependent on some assumptions about what options to pass to the command and what format the output will be in, you've added the overhead of creating a new process, and you're running the risk that you may never get control back (some ping commands simply run until you interrupt them). Personally, I'd stick with the approach you're taking.
Also, in your particular code above, have a look at how you're storing your results. If the first three and last ping succeed, say, but the last-but-one doesn't, you're passing an array to average() with an undefined value, which is probably not what you want. I'd suggest something like this instead:
my @rtt;
my $j = 0;
my $attempts = 5;
foreach (1 .. $attempts) {
    my ($ret, $duration, $ip) = $p->ping($host, 5);   # timeout after 5 seconds
    if ($ret) {
        print "$host [ip: $ip] $duration ms\n";
        push @rtt, $duration;
    } else {
        $j++;
    }
}
⋮
if (@rtt) {
    my $rtt = average(@rtt);
    print "The average rtt is $rtt\n";
} else {
    print "No responses to ping\n";
}
This fixes one or two other corner cases as well.
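If you also want the packet loss and min/max figures the question mentions, one way (a sketch that reuses $attempts, @rtt and Array::Average's average() from above) would be:
use List::Util qw(min max);

my $packet_loss = 100 * ($attempts - @rtt) / $attempts;   # @rtt counts the successes
printf "packet loss: %.0f%%\n", $packet_loss;

if (@rtt) {
    # min/avg/max in the same units as $duration above
    printf "rtt min/avg/max = %.3f/%.3f/%.3f\n", min(@rtt), average(@rtt), max(@rtt);
}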


Efficient way to make thousands of curl requests

I am using CURL to make thousands of requests. In my code I set the cookie to a specific value and then read in the value on the page. Here is my Perl code:
#!/usr/bin/perl
my $site = "http://SITENAME/?id=";
my $cookie_name = "cookienum123";
print $fh "#\t\tValue\n";
for my $i ('1'..'10000') {
    my $output = `curl -s -H "Cookie: $cookie_name=$i" -L $site$i | grep -Eo "[0-9]+"`;
    print "$i\t\t$output\n";
}
So from 1 to 10000, I am setting cookienum123 to that value and reading in the whole response from the page. Then I use grep to just extract the #. The code I have now works fine but I am wondering if there is a faster or more efficient way I can do this.
Please note this does not have to be done as a Perl script (I can also use Windows batch file, Unix shell script, etc).
Edit Jan 18: Added bounty with the note "The desired answer should include a way in Perl to run through several thousand curl requests simultaneously but it needs to be run faster than the rate it is currently running at. It has to write the output to a single file in the end but the order does not matter." Some of the below comments mention fork but I am not sure how to apply it to my code. I am very new to Perl as this is my first program in it.
What you have here is an embarrassingly parallel problem. These are great for parallelising, because there's no inter-thread dependency or communication needed.
There are two key ways of doing this in Perl: threading or forking. I would generally suggest thread-based parallel processing for the kind of thing you're doing. This is a matter of choice, but I think it's better suited for collating information.
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $numthreads = 20;

my $site        = "http://SITENAME/?id=";
my $cookie_name = "cookienum123";

my $fetch_q   = Thread::Queue->new();
my $collate_q = Thread::Queue->new();

# fetch sub sits in a loop, takes items off 'fetch_q' and runs curl.
sub fetch {
    while ( my $target = $fetch_q->dequeue() ) {
        my $output =
`curl -s -H "Cookie: $cookie_name=$target" -L $site$target | grep -Eo "[0-9]+"`;
        $collate_q->enqueue($output);
    }
}

# one instance of collate, which exists to serialise the output from fetch.
# writing files concurrently can get very messy and build in race conditions.
sub collate {
    open( my $output_fh, ">", "results.txt" ) or die $!;
    print {$output_fh} "#\t\tValue\n";
    while ( my $result = $collate_q->dequeue() ) {
        print {$output_fh} $result;
    }
    close($output_fh);
}

## main bit:

# start worker threads
my @workers = map { threads->create( \&fetch ) } 1 .. $numthreads;

# collates results.
my $collater = threads->create( \&collate );

$fetch_q->enqueue( '1' .. '10000' );
$fetch_q->end();

foreach my $thr (@workers) {
    $thr->join();
}

# end collate_q here, because we know all the fetchers are
# joined - so no more results will be generated.
# queue will then generate 'undef' when it's empty, and the thread will exit.
$collate_q->end;

# join will block until thread has exited, e.g. all results in the queue
# have been 'processed'.
$collater->join;
This will spawn 20 worker threads that run in parallel and collect the results into a file as they come in. As an alternative, you could do something similar with Parallel::ForkManager, but for data-oriented tasks, I personally prefer threading.
You can use the 'collate' sub to postprocess any data, such as sorting it, counting it, whatever.
I would also point out that using curl and grep as system calls isn't ideal. I've left them as they are, but I would suggest looking at LWP and letting Perl handle the text processing, because it's pretty good at it; a rough sketch of that follows.
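For instance, here is one way the fetch worker above could look with LWP::UserAgent instead of shelling out (untested; the 30-second timeout and the digit-extraction regex mirroring grep -Eo "[0-9]+" are assumptions, and it reuses $site, $cookie_name, $fetch_q and $collate_q from the code above):
use LWP::UserAgent;

sub fetch {
    my $ua = LWP::UserAgent->new( timeout => 30 );
    while ( defined( my $target = $fetch_q->dequeue() ) ) {
        my $resp    = $ua->get( "$site$target", 'Cookie' => "$cookie_name=$target" );
        my ($value) = $resp->decoded_content =~ /([0-9]+)/;
        $collate_q->enqueue( "$target\t\t" . ( $value // '' ) . "\n" );
    }
}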
I'm pretty sure the following will do what you want; however, slamming a server with 10000 simultaneous requests is not very polite. In fact, harvesting a site's data by walking the ids of a given URL doesn't sound very friendly either. I have NOT tested the following, but it should get you 99% of the way there (there might be a syntax/usage error somewhere).
See for more info:
https://metacpan.org/pod/distribution/Mojolicious/lib/Mojolicious/Guides/Cookbook.pod#Non-blocking
https://metacpan.org/pod/Mojo::UserAgent#build_tx
https://metacpan.org/pod/Mojo::DOM
Good luck!
#!/usr/bin/perl
use warnings;
use strict;

use Mojo::UserAgent;
use Mojo::IOLoop;

my $site        = 'http://SITENAME/?id=';
my $cookie_name = 'cookienum123';

# open filehandle and write file header
open my $output_fh, q{>}, 'results.txt'
    or die $!;
print {$output_fh} "#\t\tValue\n";

# Use Mojo::UserAgent for concurrent non-blocking requests
my $ua = Mojo::UserAgent->new;

# create your requests
for my $i (1 .. 10000) {

    # build transaction
    my $tx = $ua->build_tx(GET => "$site$i");

    # add cookie header
    $tx->req->cookies({name => $cookie_name, value => $i});

    # start "GET" with callback to write to file
    $tx = $ua->start(
        $tx => sub {
            my ($ua, $mojo) = @_;
            print {$output_fh} $i . "\t\t" . $mojo->res->dom->to_string;
        }
    );
}

# Start event loop if necessary
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;

# close filehandle
close $output_fh;

Problems with joining threads

I've got some issue with a part of my perl script, bothering me for days now. To summarize the purpose is to read in a large file in chunks and do some operation on the input stream (not relevant for my question). When I first implemented it, I just looped over the file and then did some stuff on it, like this:
while (read FILE, $buffer, $chunksize){
    callSomeOperation($buffer);
    # Do some other stuff
}
Unfortunately the file is really big and the operation rather complex, with many function calls, so this led to steadily increasing memory usage until perl could not allocate any more memory and the script failed. So I did some investigation and tried several things to minimize the memory overhead (defining variables outside the loop, setting them to undef, and so on), which made the allocated memory grow more slowly, but in the end it still failed. (And if I understand correctly, perl giving memory back to the OS is something that won't happen in practice.)
So I decided to nest the function call and all its definition in a subthread, wait for its finish, join and then call the thread again with the next chunk:
while (read FILE, $buffer, $chunksize){
    my $thr = threads->create(\&thrWorker, $buffer);
    $thr->join();
}
sub thrWorker{
    # Do the stuff here!
}
This might have been a solution, if the thread actually joined, but it does not. If I run it with $thr->detach(); everything works fine, except that I end up with hundreds of threads running at the same time, which is not a good idea, and in this case I need them to run consecutively.
So I investigated this join issue and found some reports that there might be a problem with perl 5.16.1, so I updated to 5.16.2, but it still never joins. Somewhere in a mailing list I can no longer find, I read that somebody managed to get threads to join with the CPAN module Thread::Queue, but that didn't work for me either.
So I gave up on threads and tried to fork this thing instead. But with fork it seems like the total number of "forks" is limited? Anyway, it went fine until somewhere between the 13th and 20th iteration and then gave up with the message that it couldn't fork anymore.
my $pid = fork();
if( $pid == 0 ){
    thrWorker($buffer);
    exit 0;
}
I also tried it with CPAN modules Parallel::ForkManager and Proc::Fork but that didn't help.
So now I'm somewhat stuck and can't figure this out on my own. Maybe somebody else can! Any suggestions greatly appreciated!
How can I get this thing to work with threads or child processes?
Or at least how can I force perl freeing memory so I can do this in the same process?
Some additional information on my system:
OS: Windows 7 64bit / Ubuntu Server 12.10
Perl on Windows: Strawberry Perl 5.16.2 64bit
One of my first posts on Stackoverflow. Hope I did it right :-)
I recommend reading: this
I usually use Thread::Queue to manage the input to the threads.
Sample code:
use threads;
use Thread::Queue;

use constant NUM_THREADS => 4;   # example pool size

my @threads;
my $Q = Thread::Queue->new;

# Start the threads
for (my $i = 0; $i < NUM_THREADS; $i++) {
    $threads[$i] = threads->new(\&insert_1_thread, $Q);
}

# Get the list of sites and put them in the work queue
# ($ref is an arrayref of rows from elsewhere in the program)
foreach my $row ( @{$ref} ) {
    $Q->enqueue( $row->[0] );
    #sleep 1 while $Q->pending > 100;
} # foreach $row

# Signal we are done
for (my $i = 0; $i < NUM_THREADS; $i++) {
    $Q->enqueue( undef );
}

my $count = 0;
# Now wait for the threads to complete before going on to the next step
for (my $i = 0; $i < NUM_THREADS; $i++) {
    $count += $threads[$i]->join();
}
And for the worker thread:
sub insert_1_thread {
    my ( $Q )  = @_;
    my $tid   = threads->tid;
    my $count = 0;

    Log("Started thread #$tid");
    while( my $row = $Q->dequeue ) {
        # PROCESS ME...
        $count++;
    } # while
    Log("Thread#$tid, done");

    return $count;
} # sub insert_1_thread
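Applied to the original chunked-file problem, a minimal sketch along the same lines might look like this (untested; callSomeOperation, the file name, and the pool and chunk sizes are placeholders):
#!/usr/bin/env perl
use strict;
use warnings;
use threads;
use Thread::Queue;

use constant NUM_THREADS => 4;            # placeholder pool size
use constant CHUNK_SIZE  => 1024 * 1024;  # placeholder chunk size

my $Q = Thread::Queue->new;

# start the worker pool before producing work
my @workers = map { threads->create(\&worker, $Q) } 1 .. NUM_THREADS;

open my $fh, '<:raw', 'big.file' or die $!;   # placeholder file name
while (read $fh, my $buffer, CHUNK_SIZE) {
    # keep the queue (and therefore memory) bounded
    sleep 1 while $Q->pending > 2 * NUM_THREADS;
    $Q->enqueue($buffer);
}
close $fh;

$Q->enqueue(undef) for 1 .. NUM_THREADS;   # one terminator per worker
$_->join for @workers;

sub worker {
    my ($q) = @_;
    while (defined(my $chunk = $q->dequeue)) {
        # callSomeOperation($chunk);   # the expensive per-chunk work goes here
    }
    return 0;
}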
I don't know if it is a solution for you, but you could create an array of chunk objects and process them in parallel like this:
#!/usr/bin/perl

package Object; {
    use threads;
    use threads::shared;

    sub new {
        my $class = shift;
        share(my %this);
        return(bless(\%this, $class));
    }

    sub set {
        my ($this, $value) = @_;
        lock($this);
        # $this->{"data"} = shared_clone($value);
        $this->{"data"} = $value;
    }

    sub get {
        my $this = shift;
        return $this->{"data"};
    }
}

package main; {
    use strict;
    use warnings;
    use threads;
    use threads::shared;

    my @objs;

    foreach (0..2){
        my $o = Object->new();
        $o->set($_);
        push @objs, $o;
    }

    threads->create(\&run, (\@objs))->join();

    sub run {
        my ($obj) = @_;
        $$obj[$_]->get() foreach (0..2);
    }
}

Is it ever safe to combine select(2) and buffered IO for file handles?

I am using IO::Select to keep track of a variable number of file handles for reading. Documentation I've come across strongly suggests not to combine the select statement with <> (readline) for reading from the file handles.
My situation:
I will only ever use each file handle once, i.e. when the select offers me the file handle, it will be completely used and then removed from the select. I will be receiving a hash and a variable number of files. I do not mind if this blocks for a time.
For more context, I am a client sending information to be processed by my servers. Each file handle is a different server I'm talking to. Once the server is finished, a hash result will be sent back to me from each one. Inside that hash is a number indicating the number of files to follow.
I wish to use readline in order to integrate with existing project code for transferring Perl objects and files.
Sample code:
my $read_set = IO::Select->new;
my $count    = @agents_to_run;    # array comes as an argument

for my $agent ( @agents_to_run ) {
    my ( $sock, $peerhost, $peerport )
        = server( $config_settings{ $agent }->{ 'Host' },
                  $config_settings{ $agent }->{ 'Port' } );
    $read_set->add( $sock );
}

while ( $count > 0 ) {
    my @rh_set = $read_set->can_read();
    for my $rh ( @rh_set ) {
        my %results = <$rh>;    # receive the result hash (pseudocode)
        my $num_files = $results{'numFiles'};
        my @files = ();
        for (my $i = 0; $i < $num_files; $i++) {
            $files[$i] = <$rh>;
        }
        # process results, close fh, decrement count, etc.
    }
}
Using readline (aka <>) is quite wrong for two reasons: It's buffered, and it's blocking.
Buffering is bad
More precisely, buffering using buffers that cannot be inspected is bad.
The system can do all the buffering it wants, since you can peek into its buffers using select.
Perl's IO system cannot be allowed to do any buffering because you cannot peek into its buffers.
Let's look at an example of what can happen using readline in a select loop.
"abc\ndef\n" arrives on the handle.
select notifies you that there is data to read.
readline will try to read a chunk from the handle.
"abc\ndef\n" will be placed in Perl's buffer for the handle.
readline will return "abc\n".
At this point, you call select again, and you want it to let you know that there is more to read ("def\n"). However, select will report there is nothing to read since select is a system call, and the data has already been read from the system. That means you will have to wait for more to come in before being able to read "def\n".
The following program illustrates this:
use IO::Select qw( );
use IO::Handle qw( );

sub producer {
    my ($fh) = @_;
    for (;;) {
        # two messages are sent back to back, then nothing for 3 seconds
        print($fh time(), "\n") or die;
        print($fh time(), "\n") or die;
        sleep(3);
    }
}

sub consumer {
    my ($fh) = @_;
    my $sel = IO::Select->new($fh);
    while ($sel->can_read()) {
        my $got = <$fh>;
        last if !defined($got);
        chomp $got;
        print("It took ", (time()-$got), " seconds to get the msg\n");
    }
}

pipe(my $rfh, my $wfh) or die;
$wfh->autoflush(1);
fork() ? producer($wfh) : consumer($rfh);
Output:
It took 0 seconds to get the msg
It took 3 seconds to get the msg
It took 0 seconds to get the msg
It took 3 seconds to get the msg
It took 0 seconds to get the msg
...
This can be fixed using non-buffered IO:
sub consumer {
    my ($fh) = @_;
    my $sel = IO::Select->new($fh);
    my $buf = '';
    while ($sel->can_read()) {
        sysread($fh, $buf, 64*1024, length($buf)) or last;
        while ($buf =~ s/^(.*)\n//) {
            my $got = $1;
            print("It took ", (time()-$got), " seconds to get the msg\n");
        }
    }
}
Output:
It took 0 seconds to get the msg
It took 0 seconds to get the msg
It took 0 seconds to get the msg
It took 0 seconds to get the msg
It took 0 seconds to get the msg
It took 0 seconds to get the msg
...
Blocking is bad
Let's look at an example of what can happen using readline in a select loop.
"abcdef" arrives on the handle.
select notifies you that there is data to read.
readline will try to read a chunk from the socket.
"abcdef" will be placed in Perl's buffer for the handle.
readline hasn't received a newline, so it tries to read another chunk from the socket.
There is no more data currently available, so it blocks.
This defeats the purpose of using select.
[ Demo code forthcoming ]
Solution
You have to implement a version of readline that doesn't block, and only uses buffers you can inspect. The second part is easy because you can inspect the buffers you create.
Create a buffer for each handle.
When data arrives from a handle, read it but no more. When data is waiting (as we know from select), sysread will return what's available without waiting for more to arrive. That makes sysread perfect for this task.
Append the data read to the appropriate buffer.
For each complete message in the buffer, extract it and process it.
Adding a handle:
$select->add($fh);
$clients{fileno($fh)} = {
buf => '',
...
};
select loop:
use experimental qw( refaliasing declared_refs );

while (my @ready = $select->can_read) {
    for my $fh (@ready) {
        my $client = $clients{fileno($fh)};
        my \$buf = \($client->{buf});   # Make $buf an alias for $client->{buf}

        my $rv = sysread($fh, $buf, 64*1024, length($buf));
        if (!$rv) {
            delete $clients{fileno($fh)};
            $select->remove($fh);
            if (!defined($rv)) {
                ... # Handle error
            }
            elsif (length($buf)) {
                ... # Handle eof with partial message
            }
            else {
                ... # Handle eof
            }
            next;
        }

        while ($buf =~ s/^(.*)\n//) {
            my $msg = $1;
            ... # Process message.
        }
    }
}
By the way, this is much easier to do using threads, and this doesn't even handle writers!
Note that IPC::Run can do all the hard work for you if you're communicating with a child process, and that asynchronous IO can be used as an alternative to select.
After much discussion with @ikegami, we determined that in my extremely specific case the readline is actually not an issue. I'm still leaving ikegami's as the accepted answer because it is far and away the best way to handle the general situation, and a wonderful writeup.
Readline (aka <>) is acceptable in my situation due to the following facts:
The handle is only returned once from the select statement, and then it is closed/removed
I only send one message through the file handle
I do not care if read handles block
I am accounting for timeouts and closed handle returns from select (error checking not included in the sample code above)

Getting all hostnames from IP address in Perl

I'm trying to find a way to get all hostnames that resolve to an IP address.
The gethostbyaddr function appears to only retrieve the first record from DNS (no matter if it's in scalar or list context).
Example:
my $hostname = gethostbyaddr(inet_aton($ip_to_check), AF_INET);
print($hostname);                              # output: joe.example.com

my @hostnames = gethostbyaddr(inet_aton($ip_to_check), AF_INET);
foreach my $hostname (@hostnames){
    print "(", join(',', @hostnames), ")";     # output: (joe.example.com,,2,4,?)
}
From the terminal:
$ host 192.168.1.5
5.1.168.192.in-addr.arpa domain name pointer joe.example.com.
5.1.168.192.in-addr.arpa domain name pointer john.example.com.
I've heard that Net::DNS is a little more robust, but I haven't had any luck getting that to pull all entries either.
I used a combination of answers given here and elsewhere on stack overflow to find the answer I was looking for.
use Net::DNS;

# create new Resolver object
my $res = Net::DNS::Resolver->new;

# change IP from 192.168.1.15 to 15.1.168.192.in-addr.arpa for searching
my $target_IP = join('.', reverse split(/\./, $ip_to_check)) . ".in-addr.arpa";

# query DNS
my $query = $res->query("$target_IP", "PTR");

# if a result is found
if ($query) {
    print("Resolves to:\n");
    # for every answer record, print the hostname
    foreach my $rr ($query->answer) {
        # skip anything that is not a PTR record (pointer to a canonical name)
        next unless $rr->type eq "PTR";
        # remove the trailing period and print
        print(substr($rr->rdatastr, 0, -1), "\n");
    }
}
The gethostby... interface is quite old and clunky, being defined back in primeval times before Perl got references and pretensions to OO. And it doesn't work the way you're trying to use it. When used in list context, it returns the primary name as the first element and a space-separated(!) list of aliases as the second:
my ($hostname, $aliases) = gethostbyaddr($addr, AF_INET);
my @hostname = ($hostname, split ' ', $aliases);
say join ' ', @hostname;
Now that's the theory; I didn't locate any IP addresses with multiple PTR records offhand, so I can't test if gethostbyaddr will actually return them -- it probably depends on your underlying C runtime as well -- but it does work if you use gethostbyname with a CNAMEd name, for instance.
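If you want to see what the list-context return actually contains, here is a quick check (the host name is just an example; the fields are as documented in perldoc -f gethostbyname):
use strict;
use warnings;
use feature 'say';

# name, aliases (space-separated), address type, length, packed addresses
my ($name, $aliases, $addrtype, $length, @addrs) = gethostbyname('www.example.com')
    or die "lookup failed\n";

say "canonical name: $name";
say "aliases:        ", (join(', ', split ' ', $aliases) || '(none)');
say "addresses:      ", join(', ', map { join '.', unpack 'C4', $_ } @addrs);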
Here's a small program I use to look up all PTR records for a netmask (for example 192.0.2.0/28) when doing abuse tracking tasks. It sends up to 15 queries a second, and when they are all sent it starts reading the responses (so it would need a little work to function properly for bigger net blocks).
#!/usr/bin/env perl
use strict;
use warnings;

use Net::Netmask;
use Net::DNS;

@ARGV or die "$0 ip/cidr\n";

my $block = Net::Netmask->new(shift);
my $res   = Net::DNS::Resolver->new;

my %sockets;
my $i = 0;
for my $i (1 .. $block->size - 1) {
    my $ip = $block->nth($i);
    my $reverse_ip = join ".", reverse split m/\./, $ip;
    $reverse_ip .= ".in-addr.arpa";
    #print "$ip\n";
    my $bgsock = $res->bgsend($reverse_ip, 'PTR');
    $sockets{$ip} = $bgsock;
    sleep 1 unless $i % 15;
}

$i = 0;
for my $i (1 .. $block->size - 1) {
    my $ip     = $block->nth($i);
    my $socket = $sockets{$ip};
    my $wait   = 0;
    until ($res->bgisready($socket)) {
        print "waiting for $ip\n" if $wait > 0;
        sleep 1 + $wait;
        $wait++;
    }
    my $packet = $res->bgread($socket);
    my @rr = $packet->answer;
    printf "%-15s %s\n", $ip, $res->errorstring
        unless @rr;
    for my $rr (@rr) {
        printf "%-15s %s\n", $ip, $rr->string;
    }
}
I don't think this is a well-formed problem statement. In the general case, there's a nearly infinite number of DNS names that could resolve to any IP address, even unknown to the party that holds the address. Reverse-lookups are fundamentally unreliable, and are not capable of answering the question the poster would like, since all names for an IP do not need to be in the visible reverse map.
The first answer, which enumerates the reverse map, is the best one can do, but it will miss any names that have not been entered in the map.
This is what I have used:
sub getauthoritivename
{
    my ($printerdns) = @_;
    my $res = Net::DNS::Resolver->new(searchlist => $config->{searchlist});
    my $query = $res->search($printerdns);

    if ($query)
    {
        foreach my $rr ($query->answer)
        {
            next unless $rr->type eq "A";
            print $rr->name;
        }
    }
    else
    {
        warn "query failed: ", $res->errorstring, "\n";
        return 0;
    }
}
As long as $rr->name finds names, it keeps printing them.

Are there any good timer implementations in Perl?

I'm looking for a good timer implementation in Perl. The situation is this: I need to keep track of I/O activity on many files, and for those files that stay untouched long enough, a remove action will be taken on them, so an efficient timer implementation is really vital for the app I'm working on right now. To avoid reinventing the wheel, I'm asking you guys for help first.
Time::HiRes comes with perl.
Furthermore, your application sounds like it could benefit from Linux::Inotify (note the Linux:: in front). When setting the timer for a file that you want to remove after a certain time of inactivity, remember the last access. In an inotify event hook, update this time to the current time. Then, you can periodically check whether the file's lifetime expired without doing a stat on all of the files you track. On expiration, you could add a final check just to make sure nothing went wrong, of course.
If you have huge numbers of files in flight, you may want to keep the list of files sorted by expiration time. That makes the periodic check for expiration trivial.
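A tiny sketch of that idea (assuming a hash of last-access times that is kept up to date elsewhere, e.g. from the inotify hook; the 5-second timeout is just an example):
use strict;
use warnings;

use constant TIMEOUT => 5;   # seconds of inactivity (example value)

# $last_access: hashref of file name => epoch seconds of last activity,
# maintained elsewhere (e.g. from the inotify event hook)
sub remove_expired {
    my ($last_access) = @_;

    my @by_expiry = sort { $a->[0] <=> $b->[0] }              # earliest expiry first
                    map  { [ $last_access->{$_} + TIMEOUT, $_ ] }
                    keys %$last_access;

    my $now = time();
    while (@by_expiry and $by_expiry[0][0] <= $now) {          # only look at the front
        my (undef, $file) = @{ shift @by_expiry };
        unlink $file or warn "unlink $file: $!";               # the "remove action"
        delete $last_access->{$file};
    }
}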
Update: I just did a little experimentation with Linux::Inotify. Things aren't as easy with that approach as I thought. First, here's the partially working code that I didn't have time to finish.
#!/usr/bin/env perl

use strict;
use warnings;

use List::Util qw/min max/;
use Time::HiRes qw/time sleep/;
use Data::Dumper;
use Linux::Inotify;

# [s], but handles subsecond granularity, too
use constant CLEANUP_INTERVAL    => 1.;
use constant FILE_ACCESS_TIMEOUT => 5.;

# for fast and readable struct access
use constant FILENAME   => 0;
use constant ACCESSTIME => 1;
use constant WATCHER    => 2;

my $notifier = Linux::Inotify->new;

my @tracked_files = populate_tracked_files(\@ARGV, $notifier);
warn Dumper \@tracked_files;

while (1) {
    # update the tracked files according to inotify events
    my @events = $notifier->read;

    my %files_seen_this_round;
    foreach my $event (@events) {
        $event->print();
        my $ev_filename = $event->{name};   # part of the API, apparently

        # we may have multiple events per file.
        next if $files_seen_this_round{$ev_filename}++;

        # find and update the right tracked file
        # TODO: this could be optimized to O(1) with a hash at
        #       the cost of more bookkeeping
        foreach my $tfile (@tracked_files) {
            if ($tfile->[FILENAME] eq $ev_filename) {
                my $atime = $^T + 60*60*24 * -A $ev_filename;   # update access time
                $tfile->[ACCESSTIME] = $atime;

                # a partial bubble sort would be hugely more efficient here!
                # => O(n) from O(n*log(n))
                @tracked_files = sort { $a->[ACCESSTIME] <=> $b->[ACCESSTIME] }
                                 @tracked_files;
                last;
            }
        } # end foreach tracked file
    } # end foreach event

    cleanup_files(\@tracked_files);
    sleep(CLEANUP_INTERVAL);
    last if not @tracked_files;
} # end while(1)

$notifier->close;

sub cleanup_files {
    my $files = shift;
    my $now   = time();
    for (my $fileno = 0; $fileno < $#{$files}; ++$fileno) {
        my $file = $files->[$fileno];
        if ($now - $file->[ACCESSTIME] > FILE_ACCESS_TIMEOUT) {
            warn "File '" . $file->[FILENAME] . "' timed out";
            # remove this file from the watch list
            # (and delete in your scenario)
            $file->[WATCHER]->remove;
            splice @$files, $fileno, 1;
            $fileno--;
        }
    }
}

sub populate_tracked_files {
    my $files    = shift;
    my $notifier = shift;
    my @tracked_files;

    foreach my $file (@$files) {
        die "Not a file: '$file'" if not -f $file;
        my $watch = $notifier->add_watch($file, Linux::Inotify::ALL_EVENTS);
        push @tracked_files, [$file, $^T + 60*60*24 * -A $file, $watch];
    }

    @tracked_files = sort { $a->[ACCESSTIME] <=> $b->[ACCESSTIME] }
                     @tracked_files;
    return @tracked_files;
}
There's still some bug in the time-checking logic. But the main problem is that $notifier->read() will block until a new event arrives, whereas we really just want to see whether there is a new event and then proceed to the cleanup. This would have to be added to Linux::Inotify as a non-blocking read of the file descriptor. Anybody could take over maintenance of the module, since the author is no longer interested.
As your program seems clearly event-driven, you would benefit from implementing it using an event-driven framework such as POE or AnyEvent. Those have all the pieces to handle both I/O events and timer events.
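For instance, a minimal AnyEvent sketch of the periodic-expiry check (the timeout values and the %last_access hash are placeholders; whatever watches the I/O activity would update that hash):
#!/usr/bin/env perl
use strict;
use warnings;
use AnyEvent;

use constant CHECK_EVERY => 1;   # seconds between expiry checks (example)
use constant TIMEOUT     => 5;   # seconds of inactivity before removal (example)

# file name => epoch seconds of last activity; updated by whatever
# watches the I/O (e.g. an inotify-based watcher)
my %last_access;

my $cv = AnyEvent->condvar;

my $check_timer = AnyEvent->timer(
    after    => CHECK_EVERY,
    interval => CHECK_EVERY,
    cb       => sub {
        my $now = AnyEvent->now;
        for my $file (keys %last_access) {
            next if $now - $last_access{$file} < TIMEOUT;
            unlink $file or warn "unlink $file: $!";
            delete $last_access{$file};
        }
    },
);

$cv->recv;   # run the event loop (never returns in this sketch)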