File watching in a directory using Perl

I need to watch for files that arrive in a directory. I have written the script below in Perl, but it is not doing what I want:
whenever a file (or files) arrives, the script has to move it,
and then it has to go back to watching for new files.
The script should also run in the background.
#!/usr/bin/perl
use warnings;
use File::Copy qw(move);
$src_dir = '/root/prasanna/dir';
$tgt_dir = '/root/prasanna/dir/dir1';

while (true) {
    opendir( DIR, "/root/prasanna/dir" )
        or die "Cannot open /root/prasanna/dir: $!\n";
    my @Dircontent = readdir DIR;
    close DIR;
    my $items = @Dircontent;
    if ( $items > 2 ) {
        print "files available";
        while ($items) {
            print $items;
            move $src_dir . '/' . $items, $tgt_dir . '/' . $items;
            unlink $items;
        }
    }
    else { sleep 50; }
}
The problems with the above code are:
1. The if statement keeps on printing 'files available' in an infinite loop; it never watches for new files. Even after I operate on the files, I don't know how to make it look for files again.
2. The script doesn't run in the background.
Any help is highly appreciated. Thanks in advance!

Presuming you are running under Linux, use Linux::Inotify2...
use Linux::Inotify2;

# create an Inotify object
my $Inotify = Linux::Inotify2->new() or die "Fail: $!";

# choose which operations you wish to be notified about
my $watchme = IN_CLOSE_WRITE | IN_CREATE | IN_MOVED_TO;    # constants defined and exported by the module

$Inotify->watch('/root/prasanna/dir', $watchme, \&watcher) or die "Fail: $!";

while (1) {
    $Inotify->poll;
}

sub watcher {
    my $event = shift;    # a Linux::Inotify2::Event object describing what happened
    # do something here
}
Note that inotify can only monitor local filesystems (i.e. no NFS mounts).
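To connect this back to the question, here is a minimal sketch of a complete watcher that performs the move. It reuses the paths and File::Copy from the original script; the IN_ISDIR check is an assumption, added so that dir1 itself is never moved:

#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(move);
use Linux::Inotify2;

my $src_dir = '/root/prasanna/dir';
my $tgt_dir = '/root/prasanna/dir/dir1';

my $inotify = Linux::Inotify2->new() or die "Fail: $!";

# fire when a file finishes being written, or is moved into the directory
$inotify->watch($src_dir, IN_CLOSE_WRITE | IN_MOVED_TO, \&watcher)
    or die "Fail: $!";

$inotify->poll while 1;    # block until events arrive, forever

sub watcher {
    my $e = shift;              # a Linux::Inotify2::Event object
    return if $e->IN_ISDIR;     # skip directories such as dir1 itself
    move( $e->fullname, "$tgt_dir/" . $e->name )
        or warn "Could not move ", $e->name, ": $!";
}

As for running in the background (the second problem in the question), the simplest approach is to start the script with nohup ./watcher.pl & (or run it from a process supervisor) rather than daemonizing inside Perl.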

Related

Perl: How to read subfolder output

I am writing a script to read the contents of multiple sub-folders in a directory.
Recently I also needed to read the contents of a folder inside each of those sub-folders.
How can I write code to read the folders inside multiple sub-folders?
This is the new layout:
Multiple sub-folders -> Local folder -> fileAAA.csv
How do I read fileAAA in the Local folder of each sub-folder?
The code I currently have handles the layout below, and it works well; it is able to read fileAAA from each sub-folder:
Multiple sub-folders -> fileAAA.csv
Below is the code I use to read it:
my ( $par_dir, $sub_dir );
opendir( $par_dir, "$parent" );
while ( my $sub_folders = readdir($par_dir) ) {
    next if ( $sub_folders =~ /^\.\.?$/ );    # skip . and ..
    my $path = $parent . '/' . $sub_folders;
    next unless ( -d $path );                 # skip anything that isn't a directory
    opendir( $sub_dir, $path );
    while ( my $file = readdir($sub_dir) ) {
        next unless $file =~ /\.csv$/i;
        my $full_path = $path . '/' . $file;
        print_file_names($full_path);
    }
    closedir($sub_dir);
    $flag = 0;
}
closedir($par_dir);
......
Updated
You should look at the File::Find module, which has everything already in place to do searches like this and has taken account of all the corner cases for you.
I wrote that on my tablet, and at the time I couldn't offer sample code to support it. I believe this will do what you're asking for, which is simply to find all CSV files at any level beneath a parent directory.
use strict;
use warnings;

use File::Find qw/ find /;

STDOUT->autoflush;

my $parent = '/Multiple Sub-folder';

find(sub {
    return unless -f and /\.csv$/i;
    print_file_names($File::Find::name);
}, $parent);

sub print_file_names {
    my ($fn) = @_;
    print $fn, "\n";
}
Without using a module, try this:
Instead of opendir, you can use glob for the subdirectory search.
In the script below I make a subroutine for a recursive search: when the elsif condition is satisfied, the directory path is passed back into the find subroutine, which searches it in turn, and so on.
my $v = "/Multiple Sub-folder";
find($v);

sub find {
    my ($s) = @_;
    foreach my $ma ( glob "$s/*" ) {
        if ( -f $ma ) {
            if ( $ma =~ m/\.csv$/ ) {    # here search for csv files
                print "$ma\n";
            }
        }
        elsif ( -d $ma ) {
            find("$ma");
        }
    }
}
But you could also use the File::Find module to search for the files in the directory, as in Borodin's answer, which is the best approach.

Perl SFTP downloads with Net::SFTP::Foreign

I'm a beginner. I have written a Perl script which does the following:
- Creates a directory under "/x01/abcd/abc_logs/abcd_Logs" named after the current date, in the format "YYYYMMDD", if it has not already been created. I.e. if the script is run on 01 Jan 2013, the directory "20130101" will be created under the said path, so whenever there is a need to inspect the logs one can always look for a directory by the current date.
- Checks if the log file(s) have already been downloaded earlier within the same day; if not, the log(s) will be downloaded to TODAY's directory.
I'm having a hard time coming up with a solution to print a message when there are no files in the share. This is of course when the user specifies 2 or more files that are not in the share. I know that this happens because there is a "die" statement in "sub get_LOGS". I just cannot seem to understand how to return a message when all the files I specify are missing from the share.
Usage of this script is as follows:
./abc_logs <file1> <file2> .... <file(n)>
Following is the script.
my $LOGS_LOCAL_PATH = "/x02/abc/abcba2/";
chomp $LOGS_LOCAL_PATH;
my $LOGS_REM_PATH = "/x01/INT/abc/vabc2/";
chomp $LOGS_REM_PATH;
my $TODAY = `date +%Y%m%d`;
chomp $TODAY;

my @GETLOOP = @ARGV;

unless ($#ARGV >= 0) {
    print "\nUsage: gtp_logs.pl <file1> <file2> <file3>.....<file(n)>\n\n";
    exit;
}

system("clear");

unless ( -d "$LOGS_LOCAL_PATH"."$TODAY") {
    print "Directory \"$TODAY\" doesn't exist. So creating the directory..!\n";
    print "OK..Done.....!\n\n";
    system("mkdir $LOGS_LOCAL_PATH/$TODAY");
}
else {
    print "Directory already exists. Logs will be downloaded to ==> \"$LOGS_LOCAL_PATH$TODAY\".....!\n\n";
}

# if_DOWNLOADED($LOGS_LOCAL_PATH,$TODAY,@GETLOOP);

chdir("$LOGS_LOCAL_PATH"."$TODAY") || die "cannot cd to ($!)";

foreach my $GETL (@GETLOOP) {
    my $is_downloaded = if_DOWNLOADED($LOGS_LOCAL_PATH,$TODAY,$GETL);
    if (!$is_downloaded) {
        get_LOGS("172.25.70.221","abc","abc2","/x01/INT/abc",$GETL);
        print "File \"$GETL\" downloaded to ==> \"$LOGS_LOCAL_PATH$TODAY\"\n\n";
    }
    else {
        print "File \"$GETL\" has already been Downloaded to ==> \"$LOGS_LOCAL_PATH$TODAY\"\n\n";
    }
}

sub get_LOGS {
    my $LOG_HOST  = shift;
    my $REM_USER  = shift;
    my $REM_PASSW = shift;
    my $REM_PATH  = shift;
    my $REM_FILE  = shift;

    print "Connecting to the sftp share! Please wait....!\n";
    my $sftp = Net::SFTP::Foreign->new($LOG_HOST, user => $REM_USER, password => $REM_PASSW);
    $sftp->setcwd($REM_PATH) or die "unable to change cwd: " . $sftp->error;
    print "OK. On the share! Downloading the file \"$REM_FILE\"...................!\n\n\n\n";
    $sftp->error and die "Problem connecting to the share...!!!! " . $sftp->error;
    $sftp->get($REM_FILE) or die "File does not seem to be present on the remote share. Please re-request..!!!" . $sftp->error;
    return $REM_FILE;
}

sub if_DOWNLOADED {
    my $DWD_FILE_PATH = shift;
    my $DWD_DIR       = shift;
    my $DWD_FILE      = shift;

    if (-e "$DWD_FILE_PATH/$DWD_DIR/$DWD_FILE") {
        return 1;
    }
    else {
        return 0;
    }
}
Please can someone help me find a solution to this? Please try to use the same script and modify it.
/V
Some comments on your code:
- Use strict and warnings in order to catch lots of errors early.
- Read some book on style (e.g. Damian Conway's Perl Best Practices). In any case, try to be consistent when naming variables, subroutines, and everything else, and also with their case.
- When you have to use some calculated value in several places, calculate it once and save it in a variable.
- Don't use subroutines for trivial things.
- You don't need to call chomp on variables you have defined yourself that don't have a "\n" character at the end.
- Opening a new SFTP connection for every file transfer is very inefficient. You can open just one at the beginning and use it for all the transfers.
And now, a simplified version of your script:
#!/usr/bin/perl
use strict;
use warnings;

use Net::SFTP::Foreign;    # needed for the SFTP connection below

my $host = "172.25.70.221";
my $user = "abc";
my $password = "abc1234321";
my $LOGS_LOCAL_PATH = "/x02/ABC/abc2";
my $LOGS_REM_PATH = "/x01/INT/abc/vim";
my $TODAY = `date +%Y%m%d`;
chomp $TODAY;
my $TODAY_LOCAL_PATH = "$LOGS_LOCAL_PATH/$TODAY";

my @files = @ARGV;
@files or die "\nUsage: gtp_logs.pl <file1> <file2> <file3>.....<file(n)>\n\n";

system("clear");

if ( -d $TODAY_LOCAL_PATH ) {
    print "Directory already exists. Logs will be downloaded to ==> \"$TODAY_LOCAL_PATH\".....!\n\n";
}
else {
    print "Directory \"$TODAY\" doesn't exist. So creating the directory..!\n";
    mkdir "$TODAY_LOCAL_PATH" or die "unable to create directory: $!\n";
    print "OK..Done.....!\n\n";
}

chdir $TODAY_LOCAL_PATH or die "cannot cd to ($!)\n";

my $sftp = Net::SFTP::Foreign->new($host, user => $user, password => $password);
$sftp->error
    and die "Problem connecting to the share...!!!! " . $sftp->error;

my $ok = 0;
my $failed = 0;
foreach my $file (@files) {
    if (-e "$TODAY_LOCAL_PATH/$file") {
        print "File \"$file\" has already been Downloaded to ==> \"$TODAY_LOCAL_PATH\"\n";
    }
    else {
        if ($sftp->get("$LOGS_REM_PATH/$file")) {
            print "File \"$file\" downloaded to ==> \"$TODAY_LOCAL_PATH\"\n";
            $ok++;
        }
        else {
            print "Unable to download file \"$file\" : " . $sftp->error . "\n";
            $failed++;
        }
    }
}
print "$ok files have been downloaded, $failed files failed!\n\n";

What's the best strategy to delete a very large folder using Perl?

I need to delete all content (files and folders) under a given folder. The problem is that the folder has millions of files and folders inside it, so I don't want to load all the file names in one go.
The logic should be like this:
- iterate over the folder without loading everything
- get a file or folder
- delete it (and report that the file or folder "X" was deleted)
- go to the next one
I'm trying something like this:
sub main {
    my ($rc, $help, $debug, $root) = ();
    $rc = GetOptions( "HELP"   => \$help,
                      "DEBUG"  => \$debug,
                      "ROOT=s" => \$root );
    die "Bad command line options\n$usage\n" unless ($rc);
    if ($help) { print $usage; exit(0); }
    if ($debug) {
        warn "\nProceeding to execution with following parameters: \n";
        warn "===============================================================\n";
        warn "ROOT = $root\n";
    }    # write debug information to STDERR
    print "\n Starting to delete...\n";
    die "usage: $0 dir ..\n" unless $root;
    *name = *File::Find::name;
    find \&verbose, @ARGV;
}

sub verbose {
    if (!-l && -d _) {
        print "rmdir $name\n";
    }
    else {
        print "unlink $name\n";
    }
}

main();
It's working fine, but whenever find reads the huge folder, the application gets stuck and I can see the system memory for Perl increasing until it times out. Why? Is it trying to load all the file names in one go?
Thanks for your help.
The remove_tree function from File::Path can portably and verbosely remove a directory hierarchy, keeping the top directory, if desired.
use strict;
use warnings;
use File::Path qw(remove_tree);
my $dir = '/tmp/dir';
remove_tree($dir, {verbose => 1, keep_root => 1});
Pre-5.10, use the rmtree function from File::Path. If you still want the top directory, you could just mkdir it again.
use File::Path;
my $dir = '/tmp/dir';
rmtree($dir, 1); # 1 means verbose
mkdir $dir;
The perlfaq points out that File::Find does the hard work of traversing a directory, but the work isn't that hard (assuming your directory tree is free of named pipes, block devices, etc.):
sub traverse_directory {
    my $dir = shift;
    opendir my $dh, $dir or do { warn "opendir $dir failed: $!\n"; return };
    while (my $file = readdir($dh)) {
        next if $file eq "." || $file eq "..";
        if (-d "$dir/$file") {
            traverse_directory("$dir/$file");
        }
        elsif (-f "$dir/$file") {
            # $dir/$file is a regular file.
            # Do something with it, for example:
            print "Removing $dir/$file\n";
            unlink "$dir/$file" or warn "unlink $dir/$file failed: $!\n";
        }
        else {
            warn "$dir/$file is not a directory or regular file. Ignoring ...\n";
        }
    }
    closedir $dh;
    # $dir might be empty at this point. If you want to delete it:
    if (rmdir $dir) {
        print "Removed $dir/\n";
    }
    else {
        warn "rmdir $dir failed: $!\n";
    }
}
Substitute your own code for doing something with a file or (possibly) empty directory, and call this function once on the root of the tree that you want to process. Look up the meanings of opendir/closedir, readdir, -d, and -f if you haven't encountered them before.
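For instance, a small driver along these lines (the argument handling is hypothetical) would delete everything under a directory given on the command line:

my $root = shift @ARGV or die "usage: $0 dir\n";
traverse_directory($root);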
What's wrong with:
`rm -rf $folder`;    # ??
You can use File::Find to systematically traverse the directory and delete the files and directories under it.
OK, I gave in and used Perl builtins, but you should use File::Path::rmtree, which I had totally forgotten about:
#!/usr/bin/perl
use strict;
use warnings;

use Cwd;
use File::Find;

my ($clean) = @ARGV;
die "specify directory to clean\n" unless defined $clean;

my $current_dir = getcwd;

chdir $clean
    or die "Cannot chdir to '$clean': $!\n";

finddepth(\&wanted => '.');

chdir $current_dir
    or die "Cannot chdir back to '$current_dir': $!\n";

sub wanted {
    return if /^[.][.]?\z/;
    warn "$File::Find::name\n";
    if ( -f ) {
        unlink or die "Cannot delete '$File::Find::name': $!\n";
    }
    elsif ( -d _ ) {
        rmdir or die "Cannot remove directory '$File::Find::name': $!\n";
    }
    return;
}
Download the Unix tools for Windows and then you can do rm -rv or whatever.
Perl is a great tool for a lot of purposes, but this one seems better done by a specialised tool.
Here's a cheap "cross-platform" method:
use Carp qw<carp croak>;
use English qw<$OS_NAME>;
use File::Spec;

my %deltree_op = ( nix => 'rm -rf %s', win => 'rmdir /S %s' );

my %group_for
    = ( ( map { $_ => 'nix' } qw<linux UNIX SunOS> )
      , ( map { $_ => 'win' } qw<MSWin32 WinNT> )
      );

my $group_name = $group_for{$OS_NAME};

sub chop_tree {
    my $full_path = shift;
    carp( "No directory $full_path exists! We're done." ) unless -e $full_path;
    croak( "No implementation for $OS_NAME!" ) unless $group_name;
    my $format = $deltree_op{$group_name};
    croak( "Could not find command format for group $group_name" ) unless $format;
    my $command = sprintf( $format, File::Spec->canonpath( $full_path ));
    qx{$command};
}
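Usage would then be a single call; the path here is just an example, and the external command's exit status can be checked via $? afterwards:

chop_tree('/tmp/very/big/tree');
print "delete command exited with status ", $? >> 8, "\n";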

What's the best way to tell if a file exists in a directory?

I'm trying to move a file, but I want to ensure that it exists before I do so. What's the simplest way to do this in Perl?
My code is below. I looked up the open command, but I am not sure whether it is the simplest way.
if # Parser.exe exists in the Debug directory
{
    move ("bin/Debug/Parser.exe", "Parser.exe");
}
elsif # Parser.exe exists in the Release directory
{
    move ("bin/Release/Parser.exe", "Parser.exe");
}
else
{
    die "Can't find the Parser.exe.";
}
Thank you.
What you need is a file test operator to check whether the file exists; specifically, the -e operator.
if (-e "bin/Debug/Parser.exe")
{
    move ("bin/Debug/Parser.exe", "Parser.exe");
}
elsif (-e "bin/Release/Parser.exe")
{
    move ("bin/Release/Parser.exe", "Parser.exe");
}
else
{
    die "Can't find the Parser.exe.";
}
You can make use of the -e file test to check for file existence:
use File::Copy;

if (-e "bin/Debug/parser.exe") {
    copy("bin/Debug/parser.exe", "Parser.exe") or die "Copy failed: $!";
}
elsif (-e "bin/Release/Parser.exe") {
    copy("bin/Release/parser.exe", "Parser.exe") or die "Copy failed: $!";
}
else {
    die "Can't find the Parser.exe.";
}
Personally, I don't like the duplication of the file/path name in these solutions; speaking for myself, I suspect I might accidentally change it to
if(-e "pathone....")... { copy("pathtwo...","Parser.exe")
I would do something like

copy("bin/Debug/parser.exe", "Parser.exe")
    or copy("bin/Release/parser.exe", "Parser.exe")
    or die "Can't find the Parser.exe.";

Or, if that is a bit risqué:

copy_parser("bin/Debug")
    or copy_parser("bin/Release")
    or die "Can't find the Parser.exe.";
sub copy_parser {
    my $path = shift;
    my $source = File::Spec->catfile( $path, 'Parser.exe' );
    if ( -e $source ) {
        copy( $source, "Parser.exe" ) or die "Copy of $source failed: $!";
        return 1;
    }
    return 0;
}
justintime is on the right track when he notes the repetition and seeks to eliminate it. I took the minimization a step farther than he did.
Rather than encapsulate only the copy/move portion of the code, it makes sense to remove all the repetition by encapsulating the list iteration.
I put the subroutine in a module so it can be reused later as needed. This also reduces repeated code.
use SearchMove;

my $found = search_and_move(
    src  => 'Parser.exe',
    dest => 'Parser.exe',
    dirs => [
        "bin/Debug",
        "bin/Release",
    ],
);

die "Can't find the Parser.exe\n"
    unless defined $found;

print "Found Parser.exe in $found";
In SearchMove.pm
package SearchMove;

use strict;
use warnings;

use Carp;                    # croak() is used below but was not imported
use File::Copy qw(move);     # move() is used below but was not imported

use Exporter 'import';
our @EXPORT_OK = qw( search_and_move );
our @EXPORT    = @EXPORT_OK;

sub search_and_move {
    my %arg = @_;
    croak "No source file"  unless exists $arg{src};
    croak "No dest file"    unless exists $arg{dest};
    croak "No search paths" unless exists $arg{dirs};

    my $got_file;
    for my $dir ( @{ $arg{dirs} } ) {
        my $source = "$dir/$arg{src}";
        if ( -e $source ) {
            move( $source, $arg{dest} );
            $got_file = $dir;
            last;
        }
    }
    return $got_file;
}

1;
Now you can use search_and_move in many different projects.

How can I download over FTP all the XML files in a directory?

How do I download all *.xml files from a folder on an FTP server using Net::FTP?
I saw that glob() would be the best way, but I cannot wrap my head around the logic.
I need to check if there are XML files in the folder. If not, wait 5 seconds and check again. Once the files show up, I need to download them and run them through a Java application which I already have working.
How can I monitor a folder for a specific file type, and automatically ftp->get those files when they appear?
When I need to get a filtered listing of files on an FTP site, I use grep with the ls method of Net::FTP.
Warning: untested code:
#!/usr/bin/perl

use strict;
use warnings;

use Net::FTP;

# give END blocks a chance to run if we are killed
# or control-c'ed
$SIG{INT} = $SIG{TERM} = sub { exit };

my $host = shift;
my $wait = 5;

dbmopen my %seen, "files_seen.db", 0600
    or die "could not open database: $!";

while (1) {
    my $ftp = Net::FTP->new($host, Debug => 0)
        or die "Cannot connect to $host: $@";

    END { $ftp->quit if $ftp }    # close ftp connection when exiting

    $ftp->login("ftp", 'ftp')     # anonymous ftp
        or die "Cannot login: ", $ftp->message;

    for my $file (grep { /[.]xml$/ and not $seen{$_} } $ftp->ls) {
        $ftp->get($file)
            or die "could not get $file: ", $ftp->message;
        # system("/path/to/javaapp", $file) == 0
        #     or die "java app blew up";
        $seen{$file} = 1;
    }

    sleep $wait;
}
What about something like this? This would of course be called every X seconds by your code.
my %downloaded;

sub check_for_new {
    # Get all files
    my @files = $ftp->ls;
    foreach my $f (@files) {
        # Check if it is an XML file
        if ($f =~ /\.xml$/) {
            # Check if you already fetched it
            if (!$downloaded{$f}) {
                if ($ftp->get($f)) {
                    $downloaded{$f} = 1;
                }
                else {
                    # Get failed
                }
            }
        }
    }
}
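The periodic caller mentioned above is not shown; a minimal sketch (the interval is an assumption, and $ftp is assumed to be already connected and logged in) could be:

while (1) {
    check_for_new();
    sleep 30;    # "every X seconds" - pick an interval that suits your server
}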
If you need to re-download XML files that might have changed, then you also need to do a file compare to make sure that your local copy is in sync with the remote copy on the FTP server.
use Cwd;
use Net::FTP;
use File::Compare qw(compare);
use File::Copy qw(copy);    # copy() is used below but was not imported

my %localf;
my $cdir = cwd;

sub get_xml {
    for my $file ($ftp->ls) {    # assumes an already-connected Net::FTP object in $ftp
        ## Skip non-xml files
        next if $file !~ m/\.xml$/;
        ## Simply download if we do not have a local copy
        if (!exists $localf{$file}) {
            $ftp->get($file);
            $localf{$file} = 1;
        }
        ## else compare the server version with the local copy
        else {
            $ftp->get($file, "/tmp/$file");
            if (compare("$cdir/$file", "/tmp/$file") == 1) {
                copy("/tmp/$file", "$cdir/$file");
            }
            unlink "/tmp/$file";
        }
    }
}
I typed this out straight into the reply box, so it might need a few touch-ups and error checking thrown in before being implemented. For the outer logic you could write a loop which establishes the FTP connection, calls this subroutine, closes the connection, and sleeps for n seconds.
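For illustration, that outer loop might look something like the sketch below. The host, credentials, remote directory, and interval are placeholders, and get_xml is assumed to be adapted to take the connection as a parameter instead of using a global $ftp:

use strict;
use warnings;
use Net::FTP;

while (1) {
    my $ftp = Net::FTP->new('ftp.example.com')
        or die "Cannot connect: $@";
    $ftp->login('user', 'password') or die "Cannot login: ", $ftp->message;
    $ftp->cwd('/xml_drop')          or die "Cannot cwd: ",   $ftp->message;

    get_xml($ftp);    # the subroutine above, taking the handle as an argument

    $ftp->quit;
    sleep 5;          # wait before polling the server again
}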