This is a small script that runs in a continuous loop, checks a directory, and moves every file it finds there. The code works, and I am running it as a background process. But for some reason I am getting the following error: '/home/srvc_ibdcoe_pcdev/Niall_Test/new_dir/..' and '/home/srvc_ibdcoe_pcdev/Niall_Test/perl_files/..' are identical (not copied) at move2.pl line 27
Any idea why it says the two are identical even though the paths are different?
Many thanks
Script below:
#!/usr/bin/perl
use diagnostics;
use strict;
use warnings;
use File::Copy;
my $poll_cycle = 10;
my $dest_dir = "/home/srvc_ibdcoe_pcdev/Niall_Test/perl_files";
while (1) {
sleep $poll_cycle;
my $dirname = '/home/srvc_ibdcoe_pcdev/Niall_Test/new_dir';
opendir my $dh, $dirname
or die "Can't open directory '$dirname' for reading: $!";
my @files = readdir $dh;
closedir $dh;
if ( grep( !/^[.][.]?$/, @files ) > 0 ) {
print "Dir is not empty\n";
foreach my $target (@files) {
# Move file
move("$dirname/$target", "$dest_dir/$target");
}
}
}
You need to filter out the special .. and . entries from @files.
#!/usr/bin/perl
use diagnostics;
use strict;
use warnings;
use File::Copy;
my $poll_cycle = 10;
my $dest_dir = "/home/srvc_ibdcoe_pcdev/Niall_Test/perl_files";
while (1) {
sleep $poll_cycle;
my $dirname = '/home/srvc_ibdcoe_pcdev/Niall_Test/new_dir';
opendir my $dh, $dirname
or die "Can't open directory '$dirname' for reading: $!";
my @files = grep !/^[.][.]?$/, readdir $dh;
closedir $dh;
if (@files) {
print "Dir is not empty\n";
foreach my $target (@files) {
# Move file
move("$dirname/$target", "$dest_dir/$target")
or warn "Can't move '$dirname/$target': $!";
}
}
}
The message you see is correct. Both paths resolve to the same directory because of the ..; both resolve to /home/srvc_ibdcoe_pcdev/Niall_Test
.. refers to the directory's parent directory.
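To see the effect in isolation, here is a minimal sketch. It assumes two sibling directories created under a temporary parent as hypothetical stand-ins for new_dir and perl_files, and uses Cwd::abs_path to resolve the .. entry of each.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Temp qw(tempdir);

# Hypothetical stand-ins for new_dir and perl_files: two sibling
# directories created under one temporary parent.
my $parent = tempdir(CLEANUP => 1);
my ($src, $dst) = ("$parent/new_dir", "$parent/perl_files");
for my $d ($src, $dst) {
    mkdir $d or die "Can't create '$d': $!";
}

# The '..' entry of each directory resolves to the shared parent.
my $src_up = abs_path("$src/..");
my $dst_up = abs_path("$dst/..");

print "same directory\n" if $src_up eq $dst_up;
```

Both resolved paths are the parent directory, which is why File::Copy reports the source and destination as identical when .. is left in the list.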
Related
Hi, I want to read directories and sub-directories without knowing the directory names. The current directory is "D:/Temp". 'Temp' has sub-directories like 'A1', 'A2'. 'A1' in turn has sub-directories like 'B1', 'B2', and 'B1' has sub-directories like 'C1', 'C2'. The Perl script doesn't know these directories, so it first has to find each directory and then read one file at a time in dir 'C1'; once all files in 'C1' are read, it should change to dir 'C2'. I tried the code below, but I don't want to read all files into an array (@files); I need one file at a time. The elements of the array @dir should be as follows.
$dir[0] = "D:/Temp/A1/B1/C1"
$dir[1] = "D:/Temp/A1/B1/C2"
$dir[2] = "D:/Temp/A1/B2/C1"
Below is the code I tried.
use strict;
use File::Find::Rule;
use Data::Dumper;
my $dir = "D:/Temp";
my @dir = File::Find::Rule->directory->in($dir);
print Dumper (\@dir);
my $readDir = $dir[3];
opendir ( DIR, $readDir ) || die "Error in opening dir $readDir\n";
my @files = grep { !/^\.\.?$/ } readdir DIR;
print STDERR "files: @files \n\n";
for my $fil (@files) {
open (F, "<$fil");
read (F, my $data);
close (F);
print "$data";
}
use File::Find;
use strict;
use warnings;
my @dirs;
my %has_children;
find(sub {
if (-d) {
push @dirs, $File::Find::name;
$has_children{$File::Find::dir} = 1;
}
}, 'D:/Temp');
my @ends = grep {! $has_children{$_}} @dirs;
print "$_\n" for (@ends);
Your Goal: Find the absolute paths to those directories that do not themselves have child directories.
I'll call those directories of interest terminal directories. Here's the prototype for a function that I believe provides the convenience you are looking for. The function returns its result as a list.
my @list = find_terminal_directories($full_or_partial_path);
And here's an implementation of find_terminal_directories(). Note that this implementation does not require the use of any global variables. Also note the use of a private helper function that is called recursively.
On my Windows 7 system, for the input directory C:/Perl/lib/Test, I get the output:
== List of Terminal Folders ==
c:/Perl/lib/Test/Builder/IO
c:/Perl/lib/Test/Builder/Tester
c:/Perl/lib/Test/Perl/Critic
== List of Files in each Terminal Folder: ==
c:/Perl/lib/Test/Builder/IO/Scalar.pm
c:/Perl/lib/Test/Builder/Tester/Color.pm
c:/Perl/lib/Test/Perl/Critic/Policy.pm
Implementation
#!/usr/bin/env perl
use strict;
use warnings;
use Cwd qw(abs_path getcwd);
my @dir_list = find_terminal_directories("C:/Perl/lib/Test");
print "== List of Terminal Directories ==\n";
print join("\n", @dir_list), "\n";
print "\n== List of Files in each Terminal Directory: ==\n";
for my $dir (@dir_list) {
for my $file (<"$dir/*">) {
print "$file\n";
open my $fh, '<', $file or die $!;
my $data = do { local $/; <$fh> }; # slurp entire file contents into $data
close $fh;
# Now, do something with $data !
}
}
sub find_terminal_directories {
my $rootdir = shift;
my @wanted;
my $cwd = getcwd();
chdir $rootdir;
find_terminal_directories_helper(".", \@wanted);
chdir $cwd;
return @wanted;
}
sub find_terminal_directories_helper {
my ($dir, $wanted) = @_;
return if ! -d $dir;
opendir(my $dh, $dir) or die "open directory error!";
my $count = 0;
foreach my $child (readdir($dh)) {
my $abs_child = abs_path($child);
next if (! -d $child || $child eq "." || $child eq "..");
++$count;
chdir $child;
find_terminal_directories_helper($abs_child, $wanted); # recursion!
chdir "..";
}
push @$wanted, abs_path($dir) if ! $count; # no sub-directories found!
}
Perhaps the following will be helpful:
use strict;
use warnings;
use File::Find::Rule;
my $dir = "D:/Temp";
local $/;
my @dirs =
sort File::Find::Rule->exec( sub { File::Find::Rule->directory->in($_) == 1 }
)->directory->in($dir);
for my $dir (@dirs) {
for my $file (<"$dir/*">) {
open my $fh, '<', $file or die $!;
my $data = <$fh>;
close $fh;
print $data;
}
}
local $/; lets us slurp the file's contents into a variable. Delete it if you only want to read the first line.
The sub in the exec() is used to pass only those dirs which don't contain a dir
sort is used to arrange those dirs in your wanted order
A file glob <"$dir/*"> is used to get the files in each dir
Edit: Have modified the code to find only 'terminal directories.' Thanks to DavidRR for this spec clarification.
I would use File::Find
Sample script:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
my $dir = "/home/chris";
find(\&wanted, $dir);
sub wanted {
print "dir: $File::Find::dir\n";
print "file in dir: $_\n";
print "complete path to file: $File::Find::name\n";
}
OUTPUTS:
$ test.pl
dir: /home/chris/test_dir
file in dir: test_dir2
complete path to file: /home/chris/test_dir/test_dir2
dir: /home/chris/test_dir/test_dir2
file in dir: foo.txt
complete path to file: /home/chris/test_dir/test_dir2/foo.txt
...
Using backticks, write subdirs and files to a file called filelist:
`ls -R $dir > filelist`
I have the following code for listing all files in a directory, but I have trouble with the path. My directory is */tmp/*; basically, I want the files that are in a directory inside a tmp directory, but I am not allowed to use * in the path. Do you have any idea?
my $directory="*/tmp/*/";
opendir(DIR, $directory) or die "couldn't open $directory: $!\n";
my @files = readdir DIR;
foreach $files (@files){
#...
} ;
closedir DIR;
opendir can't work with wildcards.
For your task there is a slightly ugly but working solution:
my @files = grep {-f} <*/tmp/*>; # this is equivalent of ls */tmp/*
# grep {-f} will stat each entry and filter out folders
# So @files will contain only file names with relative paths
foreach my $file (@files) {
# do with $file whatever you want
}
Without globbing and * wildcard:
use 5.010;
use Path::Class::Rule qw();
for my $tmp_dir (Path::Class::Rule->new->dir->and(sub { return 'tmp' eq (shift->dir_list(1,1) // q{}) })->all) {
say $_ for $tmp_dir->children;
}
As suggested by Chris, a user on this site: in the first Perl script, the values are stored in a hash. The first script is fine. It runs only once and stores the values, and it is working.
In the 2nd script:
my $processed = retrieve('processed_dirs.dat'); # $processed is a hashref
Here it reads "processed_dirs.dat", which was written by the first script. So I am just wondering: how does the second script know the location of processed_dirs.dat?
#!/usr/bin/perl
use strict;
use warnings;
use Storable;
# This script to be run 1 time only. Sets up 'processed' directories hash.
# After this script is run, ready to run the daily script.
my $dir = '.'; # or what ever directory the date-directories are stored in
opendir my $dh, $dir or die "Opening failed for directory $dir $!";
my @dir = grep {-d && /^\d\d-\d\d-\d\d$/ && $_ le '11-04-21'} readdir $dh;
closedir $dh or die "Unable to close $dir $!";
my %processed = map {$_ => 1} @dir;
store \%processed, 'processed_dirs.dat';
2nd Script:
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy;
use Storable;
my $dir = shift or die "Provide path on command line. $!";
my $processed = retrieve('processed_dirs.dat'); # $processed is a hashref
opendir my $dh, $dir or die "Opening failed for directory $dir $!";
# Test -d against "$dir/$_" so the check also works when $dir is not the current directory
my @dir = grep {-d "$dir/$_" && /^\d\d-\d\d-\d\d$/ && !$processed->{$_} } readdir $dh;
closedir $dh or die "Unable to close $dir $!";
@dir or die "Found no unprocessed date directories";
my $fdir = '/some/example/path';
for my $date (@dir) {
my $dday = "$dir/$date";
my @gzfiles = glob("$dday/*tar.gz");
foreach my $zf (@gzfiles) {
next if $zf =~ /BMP/ || $zf =~ /LG/ || $zf =~ /MAP/ || $zf =~ /STR/;
print "$zf\n";
copy($zf, $fdir) or die "Unable to copy $zf to $fdir. $!";
}
$processed->{ $date } = 1;
}
store $processed, 'processed_dirs.dat';
Unless I'm missing something, the answer is: Both scripts use a file called "processed_dirs.dat", in whatever directory they are run from. So as long as both scripts are run from the same directory, they will both use the same file.
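If the two scripts might be started from different directories, one option is to anchor the file name to the directory that contains the script. The following is only a sketch under that assumption: it uses the core FindBin module, and $state_file is a name introduced here for illustration, not something from the original scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use FindBin qw($Bin);
use File::Spec;

# $Bin is the directory containing the running script, so this path
# does not depend on the directory the script was started from.
my $state_file = File::Spec->catfile($Bin, 'processed_dirs.dat');

print "state file: $state_file\n";
```

The resulting $state_file could then be passed to store and retrieve in place of the bare 'processed_dirs.dat', provided both scripts live in the same directory.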
I usually use something like
my $dir="/path/to/dir";
opendir(DIR, $dir) or die "can't open $dir: $!";
my @files = readdir DIR;
closedir DIR;
or sometimes I use glob, but anyway, I always need to add a line or two to filter out . and .. which is quite annoying.
How do you usually go about this common task?
my @files = grep {!/^\./} readdir DIR;
This will exclude all the dotfiles as well, but that's usually What You Want.
I often use File::Slurp. Benefits include: (1) Dies automatically if the directory does not exist. (2) Excludes . and .. by default. Its behavior is like readdir in that it does not return the full paths.
use File::Slurp qw(read_dir);
my $dir = '/path/to/dir';
my @contents = read_dir($dir);
Another useful module is File::Util, which provides many options when reading a directory. For example:
use File::Util;
my $dir = '/path/to/dir';
my $fu = File::Util->new;
my @contents = $fu->list_dir( $dir, '--with-paths', '--no-fsdots' );
I will normally use the glob method:
for my $file (glob "$dir/*") {
#do stuff with $file
}
This works fine unless the directory has lots of files in it. In those cases you have to switch back to readdir in a while loop (putting readdir in list context is just as bad as the glob):
opendir my $dh, $dir
or die "could not open $dir: $!";
while (my $file = readdir $dh) {
next if $file =~ /^[.]/;
#do stuff with $file
}
Often though, if I am reading a bunch of files in a directory, I want to read them in a recursive manner. In those cases I use File::Find:
use File::Find;
find sub {
return if /^[.]/;
#do stuff with $_ or $File::Find::name
}, $dir;
If some of the dotfiles are important,
my @files = grep !/^\.\.?$/, readdir DIR;
will only exclude . and ..
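The difference between the two filters is easiest to see on a fixed list. This sketch uses a made-up set of entry names (an assumption for illustration, not a real directory listing):

```perl
use strict;
use warnings;

# Hypothetical directory listing, as readdir might return it.
my @entries = ('.', '..', '.bashrc', 'notes.txt', '..data');

# Drops every name that starts with a dot, including real dotfiles.
my @no_dotfiles = grep { !/^\./ } @entries;

# Drops only the special entries '.' and '..'.
my @no_special = grep { !/^\.\.?$/ } @entries;

print "no dotfiles: @no_dotfiles\n";
print "no special:  @no_special\n";
```

The first filter keeps only notes.txt, while the second keeps .bashrc, notes.txt and ..data.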
When I just want the files (as opposed to directories), I use grep with a -f test:
my @files = grep { -f "$dir/$_" } readdir $dh;
Thanks Chris and Ether for your recommendations. I used the following to read a listing of all files (excluded directories), from a directory handle referencing a directory other than my current directory, into an array. The array was always missing one file when not using the absolute path in the grep statement
use File::Slurp;
print "\nWhich folder do you want to replace text? " ;
chomp (my $input = <>);
if ($input eq "") {
print "\nNo folder entered exiting program!!!\n";
exit 0;
}
opendir(my $dh, $input) or die "\nUnable to access directory $input!!!\n";
my @dir = grep { -f "$input\\$_" } readdir $dh;
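A likely explanation for the missing file is that readdir returns bare names, and a file test such as -f on a bare name is resolved against the current working directory rather than the directory that was read. The sketch below illustrates this with a temporary directory and a made-up file name; it uses File::Spec instead of a hard-coded backslash so it is not tied to Windows.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical setup: a temporary directory with one file in it.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'readdir_demo_example.txt');
open my $fh, '>', $path or die "Can't create '$path': $!";
close $fh;

opendir my $dh, $dir or die "Can't open '$dir': $!";
my @names = readdir $dh;
closedir $dh;

# Tested against the current directory: matches only if a file of the
# same name happens to exist there.
my @bare = grep { -f } @names;

# Tested against the directory that was actually read.
my @joined = grep { -f File::Spec->catfile($dir, $_) } @names;

print scalar(@bare), " found by bare name, ", scalar(@joined), " found with path\n";
```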
Are both of the examples below OK, or is the second one bad style?
Case 1: Stay in top directory and use catdir to access subdirectories
#!/usr/bin/env perl
use warnings; use strict;
my $dir = 'my_dir_with_subdir';
my ( $count, $dh );
use File::Spec::Functions;
$count = 0;
opendir $dh, $dir or die $!;
while ( defined( my $file = readdir $dh ) ) {
next if $file =~ /^\.{1,2}$/;
my $sub_dir = catdir $dir, $file;
if ( -d $sub_dir ) {
opendir my $dh, $sub_dir or die $!;
while ( defined( my $file = readdir $dh ) ) {
next if $file =~ /^\.{1,2}$/;
$count++;
}
closedir $dh or die $!;
}
else {
$count++;
}
}
closedir $dh or die $!;
print "$count\n";
Case 2: Change to subdirectories and restore top directory before exit
use Cwd;
my $old = cwd;
$count = 0;
opendir $dh, $dir or die $!;
chdir $dir or die $!;
while ( defined( my $file = readdir $dh ) ) {
next if $file =~ /^\.{1,2}$/;
if ( -d $file ) {
opendir my $dh, $file or die $!;
chdir $file or die $!;
while ( defined( my $file = readdir $dh ) ) {
next if $file =~ /^\.{1,2}$/;
$count++;
}
closedir $dh or die $!;
chdir '..' or die $!; # go back up; a relative $dir cannot be resolved from inside the subdirectory
}
else {
$count++;
}
}
closedir $dh or die $!;
chdir $old or die $!;
print "$count\n";
Your question is whether you should change to the directories you are going through or stay in the top level directory.
The answer is: It depends.
For example, consider File::Find. The default behavior is to indeed change directories. However, the module also provides a no_chdir option in case that is not desirable.
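As a rough illustration of the no_chdir option, the sketch below walks a small temporary tree (a hypothetical setup) and counts how often the working directory differs from where the script started. With no_chdir set, File::Find documents that $_ is the same as $File::Find::name.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical tree: one temporary directory containing one subdirectory.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "Can't create '$top/sub': $!";

my $start = getcwd();
my $moved = 0;

find({
    no_chdir => 1,
    wanted   => sub {
        # With no_chdir, $_ holds the same full path as $File::Find::name
        # and the working directory stays where it was.
        $moved++ if getcwd() ne $start;
        print "$File::Find::name\n";
    },
}, $top);

print "working directory changed $moved time(s)\n";
```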
In the case of your examples, File::Find is probably not appropriate because you do not want to recurse through all subdirectories but only one. Here is a File::Slurp::read_dir based variation on your script.
#!/usr/bin/perl
use strict; use warnings;
use File::Slurp;
use File::Spec::Functions qw( catfile );
my ($dir) = @ARGV;
my $contents = read_dir $dir;
my $count = 0;
for my $entry ( @$contents ) {
my $path = catfile $dir, $entry;
-f $path and ++ $count and next;
-d _ and $count += () = read_dir $path;
}
print "$count\n";
For your example, it's best to change to subdirectories, and not to bother changing back to the original directory at the end. That's because each process has its own "current directory", so the fact that your Perl script changes its own current directory does not mean that the shell's current directory is changed; that stays unaltered.
If this was part of a larger script it would be different; my general preference then would be not to change directory, just to reduce confusion over what the current directory is at any point in the script.
Use File::Find, as you already proposed :)
It's almost always better to use a module for solved problems like this than to roll your own, unless you really want to learn about walking dirs...