Perl script is not recognizing all directories using -d flag

I am having issues getting my Perl script to recognize some subdirectories when traversing through my file system.
Please note that this is part of a homework assignment, and I am unable to use modules of any kind. I have attempted to troubleshoot this on my own for quite some time, and am now at a bit of a roadblock.
I am attempting to traverse a file structure, capture the names of all of the files, directories, and subdirectories into their respective arrays, and print them out in the illustrated format below:
Directory: ./
Files: file1.text file2.pl file3.text
Directories: subdir1 subdir2 subdir3
Directory: subdir1
Files: file3.text file4.pl
Directories: subdir42
...and so on.
I have attempted to use recursion as a solution to this, but my teacher indicated that recursion was not the appropriate way to handle this problem.
I have managed to print, in the appropriate format, the current working directory and the subdirectories within it.
For some reason, when I change the current code block to

if (-d $entry) {
    next if $entry =~ /^\./;
    push(@subdirs, "$entry");
    push(@dirs, "$currdir/$entry");
}
elsif (-f $entry) {
    push(@files, "$entry");
}

it omits some of the subdirectories.
Please see the entire script below.
#!/usr/bin/perl
use strict;
use warnings;

sub traverse {
    my @dirs = ('.');
    my @subdirs;
    my @files;
    while (my $currdir = shift @dirs) {
        opendir(my $dirhandle, $currdir) or die $!;
        while (my $entry = readdir $dirhandle) {
            if (-d -r $entry) {
                next if $entry =~ /^\./;
                push(@subdirs, "$entry");
                push(@dirs, "$entry");
            }
            else {
                push(@files, "$entry");
            }
        }
        print "Directory: $currdir\n";
        print "Directories: ";
        print "@subdirs";
        print "\n";
        print "Files: ";
        foreach my $curfile (@files) {
            next if $curfile eq '.' or $curfile eq '..';
            if ($curfile =~ /(\S*\.delete)/) {
                unlink "$currdir/$curfile";
            }
            $curfile =~ s/txt$/text/;
            print "$curfile ";
        }
        print "\n";
        close $dirhandle;
        undef @files;
        undef @subdirs;
    }
    return;
}

traverse();
And the current output:
Directory: .
Directories: test dir_3 test2
Files: recursion.text count_files.pl testing.text thelastone.pl testing.pl prog5_test.pl program51.pl program5.pl recursion.pl recurse.text prog52.pl
dirs.pl
Directory: test
Directories:
Files: testfile1.text prog5_test.pl stilltesting program5.pl testfile2.text dirs.pl
Directory: dir_3
Directories:
Files:
Directory: test2
Directories:
Files: file2.text moretesting file3.text
stilltesting and moretesting should be recognized as directories.

if (-d $entry)

should be

if (-d "$currdir/$entry")

$entry is just a name inside the directory being read; readdir returns bare names, and the -d test resolves a bare name against the current working directory, so it needs an actual path.
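For illustration, the corrected branch might look like this (a sketch reusing the variable names from the script above; note that the entries queued onto @dirs must also carry $currdir so they remain resolvable later):

if (-d "$currdir/$entry") {
    next if $entry =~ /^\./;
    push(@subdirs, $entry);
    push(@dirs, "$currdir/$entry");    # queue the full path, not the bare name
}
elsif (-f "$currdir/$entry") {
    push(@files, $entry);
}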

Related

Perl check if filename matching a pattern exists recursively

I'm looping through folders in a directory and need to check if a file that matches a pattern exists in each directory. I've used glob, but it seems to work for the first folder only: I get "file not found" for the second folder even though I know the file is there.
Here is my code:
my @dirs = grep { -d } glob '/data/test_all_runs/*';
for my $dir ( @dirs ) {
    print "the directory is $dir\n";
    my $run_folder = (split '/', $dir)[3];
    print "the folder is $run_folder\n";
    my $matrix_excel = $dir."/*bcmatrix.xls";
    my $summary_excel = $dir."/*bc_summary.xls";
    unless (-e $summary_excel) {
        if (glob($summary_excel)) {
            # At least one file matches "*.file"
        }
        else {
            print "File Doesn't Exist!";
            print STDERR "|=============================================|\n";
            print STDERR "|                                             |\n";
            print STDERR "|      Can't find Summary .xls File!!!        |\n";
            print STDERR "|                                             |\n";
            print STDERR "|   Upload the file and rerun the program.   |\n";
            print STDERR "|                                             |\n";
            print STDERR "|=============================================|\n";
            die;
        }
    }
}
Is there another method to check whether a *bcmatrix.xls file exists in each folder of /data/test_all_runs/*?
This may be a bit overkill, but it seems to do what you need. I use File::Find::Rule to fetch all of the directories in the directory structure, then use glob to get the list of file names that match the pattern:
Given this directory structure:
orig
|- a
|  |- a.txt
|- b
|  |- ba.txt
|- c
With this code:
use warnings;
use strict;
use File::Basename;
use File::Find::Rule;

my $dir = 'orig';
my $file = 'a.txt';

my @dirs = File::Find::Rule->directory
                           ->in($dir);

for (@dirs){
    next if /(?:\.|\.\.)/;
    if (my @files = glob "$_/*$file"){
        for my $path (@files){
            my $name = basename $path;
            print "file $name exists in $_\n";
        }
    }
    else {
        print "file not found in directory $_\n";
    }
}
I get the following output:
file not found in directory orig
file ba.txt exists in orig/b
file not found in directory orig/c
file a.txt exists in orig/a
I suggest that you use something like this. It will build a hash of arrays that lists all the files in each subdirectory of /data/test_all_runs that look like either *bcmatrix.xls or *bc_summary.xls. You should be able to do what you want with the result.
use strict;
use warnings 'all';

use File::Spec::Functions 'splitdir';

my %files;

for my $path ( glob '/data/test_all_runs/*/*{bcmatrix,bc_summary}.xls' ) {
    my ($subdir, $file) = (splitdir $path)[-2, -1];
    push @{ $files{$subdir} }, $file;
}

use Data::Dumper;
print Dumper \%files;
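For instance, here is a sketch (not part of the original answer) of how the %files hash might be used to flag run folders that are missing a summary file:

for my $subdir ( sort keys %files ) {
    my @summaries = grep { /bc_summary\.xls$/ } @{ $files{$subdir} };
    warn "No summary .xls file in $subdir\n" unless @summaries;
}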

Print files and subdirectories of given directory

I am trying to get all files and directories from a given directory, but I can't determine the type (file/directory). Nothing is being printed. What am I doing wrong, and how can I solve it? Here is the code:
sub DoSearch {
    my $currNode = shift;
    my $currentDir = opendir (my $dirHandler, $currNode->rootDirectory) or die $!;
    while (my $node = readdir($dirHandler)) {
        if ($node eq '.' or $node eq '..') {
            next;
        }
        print "File: " . $node . "\n" if -f $node;
        print "Directory " . $node . "\n" if -d $node;
    }
    closedir($dirHandler);
}
readdir returns only the node name without any path information. The file test operators will look in the current working directory if no path is specified, and because the current directory isn't $currNode->rootDirectory, the nodes won't be found.
I suggest you use rel2abs from the File::Spec::Functions core module to combine the node name with the path. You could use string concatenation, but the library function takes care of corner cases such as whether the directory ends with a slash.
It's also worth pointing out that Perl identifiers are most often in snake_case, and people familiar with the language would thank you for not using capital letters. They should especially be avoided for the first character of an identifier, as names like that are reserved for globals such as package names.
I think your subroutine should look like this
use File::Spec::Functions 'rel2abs';

sub do_search {
    my ($curr_node) = @_;
    my $dir = $curr_node->rootDirectory;
    opendir my $dh, $dir or die qq{Unable to open directory "$dir": $!};
    while ( my $node = readdir $dh ) {
        next if $node eq '.' or $node eq '..';
        my $fullname = rel2abs($node, $dir);
        print "File: $node\n" if -f $fullname;
        print "Directory $node\n" if -d $fullname;
    }
}
An alternative method is to set the current working directory to the directory being read. That way there is no need to manipulate file paths, but you do need to save the original working directory and restore it once you have finished.
The Cwd core module provides getcwd, and your code would look like this:
use Cwd 'getcwd';

sub do_search {
    my ($curr_node) = @_;
    my $cwd = getcwd;
    chdir $curr_node->rootDirectory or die $!;
    opendir my $dh, '.' or die $!;
    while ( my $node = readdir $dh ) {
        next if $node eq '.' or $node eq '..';
        print "File: $node\n" if -f $node;
        print "Directory $node\n" if -d $node;
    }
    chdir $cwd or die $!;
}
Use this CPAN module to get all files and subdirectories recursively.

use File::Find;

find(\&getFile, $dir);
my @fileList;

sub getFile {
    print $File::Find::name . "\n";
    # Below lines will print only the file name.
    # if ($File::Find::name =~ /.*\/(.*)/ && $1 =~ /\./) {
    #     push @fileList, $File::Find::name . "\n";
    # }
}
Already answered, but sometimes it is handy not to care about the implementation details, and you can use some CPAN modules to hide them.
One of them is the wonderful Path::Tiny module.
Your code could be as:
use 5.014;    # strict + feature 'say' + ...
use warnings;
use Path::Tiny;

do_search($_) for @ARGV;

sub do_search {
    my $curr_node = path(shift);
    for my $node ($curr_node->children) {
        say "Directory : $node" if -d $node;
        say "Plain File : $node" if -f $node;
    }
}
The children method excludes the . and the .. automatically.
You also need to understand that the -f test is true only for real files. So the above code excludes, for example, symlinks (which point to real files), FIFO files, and so on. Such "files" can usually be opened and read as plain files, so sometimes, instead of -f, it is handy to use the -e && ! -d test (i.e. exists, but is not a directory).
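For illustration, that check might look like this (a sketch, not part of the original answer):

for my $node ($curr_node->children) {
    say "Openable file : $node" if -e $node && ! -d $node;
}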
Path::Tiny has methods for this, e.g. you could write:
for my $node ($curr_node->children) {
    print "Directory : $node\n" if $node->is_dir;
    print "File : $node\n" if $node->is_file;
}
The is_file method is usually DWIM, i.e. it does the -e && ! -d check.
Using Path::Tiny you could also easily extend your function to walk the whole tree using the iterator method:
use 5.014;
use warnings;
use Path::Tiny;

do_search($_) for @ARGV;

sub do_search {
    # maybe you need some error checking here for the existence of the argument or the like...
    my $iterator = path(shift)->iterator({recurse => 1});
    while ( my $node = $iterator->() ) {
        say "Directory : ", $node->absolute if $node->is_dir;
        say "File : ", $node->absolute if $node->is_file;
    }
}
The above prints the type of every file and directory recursively down from the given argument...
And so on... Path::Tiny is really worth having installed.

Perl Directory Walking issue - Can't go back up more than 1 directory properly

First off, I don't have the ability to use File::Find.
So I have my script to walk through directories and find a certain type of file. But if I go more than one sub-directory deep, my script doesn't properly exit all the way back up to the starting directory. I think I need to have a $previousDir variable that keeps track of the last directory so I can say to go back out to that one when I'm done in the sub-directory. But I've tried putting it in multiple places without success...
File structure (AAA, Dir1, Dir2, and zzz are directories; everything else is a file):
startingdirectory/Logs - AAA, Dir1, zzz, adstatlog.299, adstatlog.tgz, file
/AAA - filefile
/Dir1 - /Dir2, config.tar.gz
/Dir2 - EMPTY
/zzz - withinzzz
Here is my current script:
# specify the directory where you want to start the search
my $startingDir = $ARGV[0];
my $directoryCount = 0;
my $directory = shift;
my $previousDir;
my @directories;
my $tarOutput;

# Calling the subroutine, which searches the directory
readDirectory($startingDir);

sub readDirectory
{
    # Open and close the startingDir
    opendir(DIR, @_[0]) or die("ERROR: Couldn't open specified directory $!");
    my @files = grep { $_ !~ /^\.{1,2}$/ } readdir DIR;
    closedir DIR;
    print "------------------------------------------------------------------------\n\n";
    foreach my $currentFile (@files)
    {
        print "Current File: ", $currentFile, "\n\n";
        # Directory currently searching through
        print "Searching in $directory\n\n";
        my $fullPath = "$directory/$currentFile";
        print "FULL PATH: $fullPath\n\n";
        if ( -d $fullPath )
        {
            print "Found New Directory: ", $currentFile, "\n\n";
            push (@directories, $currentFile);
            $directoryCount++;
            print "Current number = $directoryCount\n\n";
            print "Directories: @directories \n\n";
            $previousDir = $directory;
            $directory = $fullPath;
            # The subroutine calls itself with the new parameters
            readDirectory($directory);
        }
        elsif ( $currentFile =~ /\.tar.gz$/i || $currentFile =~ /\.tar$/i || $currentFile =~ /\.tgz$/i )
        {
            print "File: ", $currentFile, "\n\n";
            my $tarOutput = `tar -tvzf $currentFile`;
            print $tarOutput, "\n";
            $previousDir = $directory;
        }
        print "PREVIOUSDIR: $previousDir\n\n";
        print "-----------------------------------------------------------------------\n\n";
        $directory = $previousDir;
    }
}
And the output (scroll down to see where the issue begins):
------------------------------------------------------------------------
Current File: AAA
Searching in /home/gackerma/Logs
FULL PATH: /home/gackerma/Logs/AAA
Found New Directory: AAA
Current number = 1
Directories: AAA
------------------------------------------------------------------------
Current File: filefile
Searching in /home/gackerma/Logs/AAA
FULL PATH: /home/gackerma/Logs/AAA/filefile
PREVIOUSDIR: /home/gackerma/Logs
------------------------------------------------------------------
PREVIOUSDIR: /home/gackerma/Logs
------------------------------------------------------------------
Current File: Dir1
Searching in /home/gackerma/Logs
FULL PATH: /home/gackerma/Logs/Dir1
Found New Directory: Dir1
Current number = 2
Directories: AAA Dir1
------------------------------------------------------------------------
Current File: DIR2
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/DIR2
Found New Directory: DIR2
Current number = 3
Directories: AAA Dir1 DIR2
------------------------------------------------------------------------
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
Current File: configs.tar.gz
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/configs.tar.gz
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
PREVIOUSDIR: /home/gackerma/Logs/Dir1 ***THIS IS WHERE THE ISSUE STARTS -
PREVIOUSDIR SHOULD BE /Logs!!***
------------------------------------------------------------------
Current File: file
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/file
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
Current File: adstatlog.299
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/adstatlog.299
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
Current File: zzz
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/zzz
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
Current File: adstatlog.tgz
Searching in /home/gackerma/Logs/Dir1
FULL PATH: /home/gackerma/Logs/Dir1/adstatlog.tgz
PREVIOUSDIR: /home/gackerma/Logs/Dir1
------------------------------------------------------------------
I would really use File::Find if you can.
Here's a working, simplified version of your recursive try:
use warnings;
use strict;

die "Usage: $0 (abs path to dir) " if @ARGV != 1;
my $dir = shift @ARGV;

file_find($dir);

sub file_find {
    my $dir = shift;
    opendir my $dh, $dir or warn "$dir: $!";
    my @files = grep { $_ !~ /^\.{1,2}$/ } readdir $dh;
    closedir $dh;
    for my $file ( @files ) {
        my $path = "$dir/$file";
        if ( $path =~ /(\.tar\.gz|\.tar|\.tgz)$/ ) {
            print "do tar for $path\n";
        }
        file_find($path) if -d $path;
    }
}
The File::Find module has been a standard module since Perl 5.000. In fact, it's been a standard module since Perl 3.x, maybe even before. I have Perl 5.12 installed on my Mac, and I still see the old find.pl file sitting in one of the @INC directories.
Back before Perl 5 (or maybe even before Perl 4), you'd do this:
require "find.pl";
instead of
use File::Find;
to get the find command on your system (find.pl is there for backwards compatibility). This is why I find it so hard to believe you don't have File::Find on your system. It'd be like saying you don't have the dir command on your Windows PC.
Run the command perl -V (that's a capital V). This will print out the @INC directory list. See if you can find a File directory in any of the directories in that list. Under that directory should be a Find.pm Perl module.
Here's what it looks like on my PC running Strawberry Perl:
@INC:
  C:/perl/perl/site/lib
  C:/perl/perl/vendor/lib
  C:/perl/perl/lib
  .
On my Mac, 10 directories are listed in that @INC list.
Also check which version of Perl you have on your system, and make sure the directories listed in @INC are readable by you.
There is something definitely wrong with your Perl installation if you don't have File::Find on your system. I'd be more worried about that than File::Find itself.
One more thing: see if you have the perldoc command installed. If you do, type:
$ perldoc File::Find
and see if that gives you any documentation on File::Find. If it does, it means that File::Find is on your system. Then run:
$ perldoc -l File::Find
which will give you the location of the File::Find module.
Before doing anything else, verify that File::Find really, really doesn't exist on your system, or that you don't have read access to it. As I said before, if this module doesn't exist on your system, you have major problems with your Perl installation, and I'd be worried about whether it can be trusted. This needs to be resolved.
If everything is okay, then we need to see your program to figure out why you can't use File::Find. It might be something minor, like using quotes around the program's name.
There are a number of problems with your program. The main error is that you are using too many global variables and trying to manually keep them in synch with the directory you are currently processing.
Here is a list
Always use strict and use warnings for every program you write
warnings would have told you that you should write opendir(DIR, $_[0]) instead of opendir(DIR, @_[0])
You are setting $directory to $previousDir after every entry in a directory. But $previousDir is being set only when the current entry is another directory, so after ordinary files the value is restored even though it hasn't been saved.
You are getting confused about whether you should be reading the directory specified by global variable $directory or by the parameter passed to the subroutine.
By far the easiest way to do this is to use only the subroutine parameter to specify the current directory and forget about the global variable. Here is a program that does what yours is intended to do:
use strict;
use warnings;

process_dir($ARGV[0]);

sub process_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or die $!;
    my @entries = grep { not /^\.\.?$/ } readdir $dh;
    closedir $dh;
    for my $entry (@entries) {
        my $fullname = "$dir/$entry";
        if (-d $fullname) {
            process_dir($fullname);
        }
        elsif ($entry =~ /(?:\.tar|\.tgz|\.tar\.gz)$/i) {
            print "File: ", $fullname, "\n\n";
            print `tar -tvzf $fullname`;
        }
    }
}

Determining why the -f flag is saying only .pl files are files

The program I'm writing is supposed to open two directories and read through the files in them to compare the contents of those files. Then the functions that have changed in the files should be printed to a file. This program will mainly be checking .cpp files and .h files.
Currently I am trying to go through the directory and open the current file I am at to print the functions that have changed. However, I keep getting an error stating that the file is not a file and can't be opened.
Here is part of the code I am using:

use strict;
use warnings;
use diagnostics -verbose;
use File::Compare;
use Text::Diff;

my $newDir = 'C:\Users\kkahla\Documents\Perl\TestFiles2';
my $oldDir = 'C:\Users\kkahla\Documents\Perl\TestFiles';

chomp $newDir;
$newDir =~ s#/$##;
chomp $oldDir;
$oldDir =~ s#/$##;

# Checks to make sure they are directories
unless (-d $newDir or -d $oldDir) {
    print STDERR "Invalid directory path for one of the directories";
    exit(0);
}

# Makes a directory for the outputs to go to unless one already exists
mkdir "Outputs", 0777 unless -d "Outputs";

# opens output file
open (OUTPUTFILE, ">Outputs\\diffDirectoriesOutput.txt");
print OUTPUTFILE "Output statistics for comparing two directories\n\n";

# opens both directories
opendir newDir, $newDir;
my @allNewFiles = grep { $_ ne '.' and $_ ne '..' } readdir newDir;
closedir newDir;

opendir oldDir, $oldDir;
my @allOldFiles = grep { $_ ne '.' and $_ ne '..' } readdir oldDir;
closedir oldDir;
Here is where I want to open the files to read through them:
elsif ((File::Compare::compare("$newDir/$_", "$oldDir/$_") == 1)) {
    print OUTPUTFILE "File: $_ has been updated. Please check marked functions for differences\n\n";
    diff "$newDir/$_", "$oldDir/$_", { STYLE => "Table", OUTPUT => \*OUTPUTFILE };

    # Here is where I want to open the file, but when I try it throws an error.
    # Here are the two opens I have tried:
    open (FILE, "<$newDir/$_") or die "Can't open file";    # first attempt
    open (FILE, "<$_") or die "Can't open file";            # second attempt to see if it worked
}
I tried adding the flags

my @allNewFiles = grep { $_ ne '.' and $_ ne '..' && -e $_ } readdir newDir;
my @allNewFiles = grep { $_ ne '.' and $_ ne '..' && -f $_ } readdir newDir;

but that would simply remove all files that didn't have the .pl file extension. I tested that on some simple directories that have two copies each of .txt, .cpp, .h, .c, .py, and .pl files, and it would report that only the .pl files were files.
I am new to perl and any help would be appreciated.
-f is returning undef with $! set to "No such file or directory" because you are passing a file name to -f instead of a path to the file.
Change
-f $_
to
-f "$newDir/$_"

How can I add a prefix to all filenames under a directory?

I am trying to prefix a string (reference_) to the names of all the *.bmp files in all the directories as well as sub-directories. The first time we run the Silk script, it creates directories and subdirectories, and under each subdirectory it stores each mobile application's screenshot with a .bmp extension.
When I run the automated Silk script a second time, it will again create the *.bmp files in all the subdirectories. Before running the script the second time, I want to prefix all the *.bmp files with the string reference_.
For example, first_screen.bmp becomes reference_first_screen.bmp.
I have the directory structure as below:
C:\Image_Repository\BG_Images\second
...
C:\Image_Repository\BG_Images\sixth
having first_screen.bmp and first_screen.bmp files, etc.
Could anyone help me out? How can I prefix all the image file names with the reference_ string?
When I run the script the second time, the Perl script in Silk will take both images from the sub-directory and compare them pixel by pixel. I am trying with the code below.
Could you please guide me on how to complete this task?
#!/usr/bin/perl -w
&one;
&two;

sub one {
    use Cwd;
    my $dir = "C:\\Image_Repository";
    #print "$dir\n";
    opendir(DIR, "+<$dir") or "die $!\n";
    my @dir = readdir DIR;
    #$lines = @dir;
    delete $dir[-1];
    print "$lines\n";
    foreach my $item (@dir)
    {
        print "$item\n";
    }
    closedir DIR;
}

sub two {
    use Cwd;
    my $dir1 = "C:\\Image_Repository\\BG_Images";
    #print "$dir1\n";
    opendir(D, "+<$dir1") or "die $!\n";
    my @dire = readdir D;
    #$lines = @dire;
    delete $dire[-1];
    #print "$lines\n";
    foreach my $item (@dire)
    {
        #print "$item\n";
        $dir2 = "C:\\Image_Repository\\BG_Images\\$item";
        print $dir2;
        opendir(D1, "+<$dir2") or die " $!\n";
        my @files = readdir D1;
        #print "@files\n";
        foreach $one (@files)
        {
            $one = "reference_" . $one;
            print "$one\n";
            #rename $one, Reference_ . $one;
        }
    }
    closedir DIR;
}
I tried the open call with '+<' mode, but I am getting a compilation error for the read and write mode.
When I run this code, it shows the files in the BG_Images folder with the prefixed string, but it doesn't actually update the files in the sub-directories.
You don't open a directory for writing. Just use opendir without the mode parts of the string:
opendir my($dir), $dirname or die "Could not open $dirname: $!";
However, you don't need that. You can use File::Find to make the list of files you need.
#!/usr/bin/perl
use strict;
use warnings;

use File::Basename;
use File::Find;
use File::Find::Closures qw(find_regular_files);
use File::Spec::Functions qw(catfile);

my( $wanted, $reporter ) = find_regular_files;
find( $wanted, $ARGV[0] );

my $prefix = 'recursive_';

foreach my $file ( $reporter->() )
{
    my $basename = basename( $file );
    if( index( $basename, $prefix ) == 0 )
    {
        print STDERR "$file already has '$prefix'! Skipping.\n";
        next;
    }

    my $new_path = catfile(
        dirname( $file ),
        "recursive_$basename"
    );

    unless( rename $file, $new_path )
    {
        print STDERR "Could not rename $file: $!\n";
        next;
    }

    print $file, "\n";
}
You should probably check out the File::Find module for this - it will make recursing up and down the directory tree simpler.
You should probably be scanning the file names and modifying those that don't start with reference_ so that they do. That may require splitting the file name up into a directory name and a file name and then prefixing the file name part with reference_. That's done with the File::Basename module.
At some point, you need to decide what happens when you run the script the third time. Do the files that already start with reference_ get overwritten, or do the unprefixed files get overwritten, or what?
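A sketch of that File::Find approach, assuming the BG_Images root from the question and skipping files that are already prefixed; by default, find() chdirs into each directory it visits, so $_ holds a bare file name that rename can use directly:

use strict;
use warnings;
use File::Find;

my $start  = 'C:\\Image_Repository\\BG_Images';   # root taken from the question
my $prefix = 'reference_';

find(sub {
    return unless -f $_ && /\.bmp$/i;        # only plain .bmp files
    return if index($_, $prefix) == 0;       # already prefixed (the "third run" case above)
    rename $_, "$prefix$_"
        or warn "Could not rename $File::Find::name: $!\n";
}, $start);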
The reason the files are not being renamed is that the rename operation is commented out. Remember to add use strict; at the top of your script (as well as the -w option which you did use).
If you get a list of files in an array @files (and the names are base names, so you don't have to fiddle with File::Basename), then the loop might look like:
foreach my $one (@files)
{
    my $new = "reference_$one";
    print "$one --> $new\n";
    rename $one, $new or die "failed to rename $one to $new ($!)";
}
With the aid of the find utility from coreutils for Windows:
$ find -iname "*.bmp" | perl -wlne"chomp; ($prefix, $basename) = split(m~\/([^/]+)$~, $_); rename($_, join(q(/), ($prefix, q(reference_).$basename))) or warn qq(failed to rename '$_': $!)"