Perl search for specific subdirectory then process - perl

So for the program I am writing, what I would like it to do is search through all of the subdirectories within a directory. If a subdirectory name contains a word, let's say "foo", then the program will open that subdirectory and perform a function on the files within it. Can anybody give me some help on how to go about this? It also needs to be recursive. Thanks in advance.

This can be done using the File::Find module, but I believe Path::Class is superior even though it isn't a core module and will likely need installing.
This program finds the files wanted and calls process to process them. At present the process subroutine simply prints the name of the file for testing.
use strict;
use warnings;

use Path::Class;

my $dir = dir '/path/to/root/directory';

$dir->recurse(callback => sub {
    my $node = shift;
    return if $node->is_dir;
    my $parent = $node->parent;
    if ($parent->basename =~ /foo/) {
        process($node);
    }
});

sub process {
    my $file = shift;
    print $file, "\n";
}
Update
If you prefer, this program performs the same task using File::Find.
use strict;
use warnings;

use File::Find;
use File::Basename qw/ basename /;

my $dir = '/path/to/root/directory';

find(sub {
    return unless -f;
    if (basename($File::Find::dir) =~ /foo/) {
        process($File::Find::name);
    }
}, $dir);

sub process {
    my $file = shift;
    print $file, "\n";
}
Update
As requested, here is a further solution using Path::Class::Rule for comparison. As daxim suggested, the code is a little shorter.
use strict;
use warnings;

use Path::Class::Rule;

my $rule = Path::Class::Rule->new;
$rule->file->and(sub { $_->parent->basename =~ /foo/ });

my $next = $rule->iter('/path/to/root/directory');
while ( my $file = $next->() ) {
    process($file);
}

sub process {
    my $file = shift;
    print $file, "\n";
}

Related

Perl: check existence of file with wildcard

I am trying to use -e to check for the existence of a file. $name is input specified by the user, "_file_" is something fixed, and * could be anything. Currently it is not able to detect the file.
if (-e $name . "_file_*.txt")
{
    do something;
}
Why not use glob() for that?
if (my @files = glob("\Q$name\E_file_*.txt")) {
    # do something
}
This is one of the ways I could find the existing files with a particular name:
use strict;
use warnings;
use Cwd;

my $name   = "Test";
my $curdir = getcwd();

my @txtfiles = glob "$curdir/*.txt";

foreach my $txtfile (@txtfiles)
{
    if ($txtfile =~ m/$name\_file\_(.*?)\.txt/)
    {
        print "Ok...\n";
    }
    else { next; }
}
I would suggest you use the File::Find module.
use strict;
use warnings;
use File::Find;

# find() takes a reference to the wanted function, which is executed
# for each file found under the directory.
find( { wanted => \&process, follow => 1 }, '/dir/to/search' );

sub process {
    my $filename = $_;
    my $filepath = $File::Find::name;

    if ( $filename =~ m/$name\_file\_(.*?)\.txt/ ) {
        # file exists; do further processing
    }
    else {
        # file does not exist
    }
}
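Note that $name in the snippet above has to be visible inside the process subroutine; under use strict it must be declared before find() runs. A minimal sketch of that, assuming $name comes from the command line:
use strict;
use warnings;
use File::Find;

my $name = shift @ARGV;    # user-supplied prefix, assumed for this sketch

find( { wanted => \&process, follow => 1 }, '/dir/to/search' );

sub process {
    # $name is a file-scoped lexical, so it is visible here
    if ( $_ =~ /\Q$name\E_file_(.*?)\.txt\z/ ) {
        print "Found $File::Find::name\n";
    }
}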

Unable to get absolute path of a file from $File::Find::name - perl

I am unable to get the absolute path of a file from $File::Find::name. It is showing an undef value as output. Not able to figure out why :( Can anyone please help me out with this?
The error displayed is: Use of uninitialized value $file_name in concatenation
My code:
use strict;
use warnings;

use File::Find;
use File::Path qw(make_path);
use File::Copy;
use Cwd;
use Data::Printer;

my $rootPATH = $ARGV[0];
my $id       = $ARGV[1];
my @Arraypath;
my $file_name;
our $anr_name;

opendir( my $DIR, $rootPATH );
while ( my $entry = readdir $DIR ) {
    next unless -d $rootPATH . '/' . $entry;
    next if $entry eq '.' or $entry eq '..';

    #print "Found directory $entry\n";
    push( @Arraypath, ( split( "\n", $entry ) ) );
}
closedir $DIR;

my $count = 0;
foreach my $line (@Arraypath) {
    my $fulllogpath = $rootPATH . "\\" . $line;

    #print "$fulllogpath\n";
    $count++;
    start($fulllogpath);
}

sub start {
    my $fulllogpath = shift;
    our @content;

    #print "$fulllogpath\n\n";
    find( \&wanted, $fulllogpath );

    sub wanted {
        push @content, $_;
        return;
    }

    foreach my $file (@content) {

        # print "$file\n\n";
        if ( $file =~ /traces[_d]*/ ) {
            print "$file\n\n";
            $file_name = $File::Find::name;
            p $file_name;
            print "$file_name\n";
        }
    }
}
Your program is very poorly laid out. It will be much simpler to write and debug code if you indent it properly and use carefully-chosen identifiers: a name like start for a subroutine is useless.
You also have unnecessary subroutine declarations which break up the program flow and make it awkward to follow.
Why do you have a couple of package variables (declared with our)? There is generally no need for them, and it is best to use lexical variables throughout, declared at an appropriate place so that all code has access to them if it needs it.
It is also preferable to use File::Spec to work with file paths, rather than manipulate them using string operators, with which it is easy to make a mistake.
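As a small illustration (not tied to this particular program), building a path with File::Spec rather than concatenating strings looks like this:
use strict;
use warnings;
use File::Spec;

my ( $root_path, $subdir ) = ( '/path/to/root', 'logs' );    # example values

# catfile chooses the correct separator for the current platform
my $path = File::Spec->catfile( $root_path, $subdir, 'trace.log' );
print "$path\n";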
The best way to manage the results of find is to work with absolute paths all the way through. It looks like you want to do more than just print the results returned by find since you load modules like Cwd and File::Copy, but without knowing what that further purpose is I cannot help you to write it.
This code removes all the subroutines and makes everything much more concise.
use strict;
use warnings;
use autodie;

use File::Find 'find';
use File::Spec;
use Data::Printer;

my ($root_path, $id) = @ARGV;

opendir my ($dh), $root_path;
my @dir_list =
    grep -d,
    map File::Spec->catfile($root_path, $_),
    grep { not /\A\.\.?\z/ } readdir $dh;
closedir $dh;

my $count;

for my $dir (@dir_list) {
    ++$count;
    find(sub {
        return unless /traces[_d]*/;
        my $file = $_;
        print "$file\n\n";
        my $file_name = $File::Find::name;
        p $file_name;
        print "$file_name\n";
    }, $dir);
}
As has already been stated, $File::Find::name is valid only within the wanted function, not outside of it.
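For instance, a minimal sketch that captures the absolute names while still inside wanted, so they remain usable afterwards:
use strict;
use warnings;
use File::Find;

my @names;
find( sub { push @names, $File::Find::name if /traces[_d]*/ }, '/path/to/root/directory' );

print "$_\n" for @names;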
However, I would recommend making the shift to using Path::Class and Path::Class::Rule for some simpler processing of your files in a cross platform compatible way:
use strict;
use warnings;

use Data::Printer;
use Path::Class;
use Path::Class::Rule;

my ( $root_path, $id ) = @ARGV;

my $dir = dir($root_path);

my $next = Path::Class::Rule->new->file->name(qr{traces[_d]*})->iter(
    grep { $_->is_dir() } $dir->children
);

while ( my $file = $next->() ) {
    # Accomplishes the same as your script. I suspect these print
    # statements are mostly for debugging, though.
    print $file->basename(), "\n\n";
    p "$file";
    print "$file\n";
}

Perl Recurse through Directory

So essentially what I'm trying to do is go through a directory and perform an action on all of the files, in this case, the sub searchForErrors. This sub works. What I have so far is:
sub proccessFiles {
    my $path = $ARGV[2];
    opendir(DIR, $path) or die "Unable to open $path: $!";
    my @files = readdir(DIR);
    @files = map { $path . '/' . $_ } @files;
    closedir(DIR);

    for (@files) {
        if (-d $_) {
            proccessFiles($_);
        }
        else {
            searchForErrors;
        }
    }
}
proccessFiles($path);
Any help/suggestions would be great. And again, I'm new to Perl, so the more explanation the better. Thank you!
You should use the File::Find module instead of trying to reinvent the wheel:
use strict;
use warnings;
use File::Find;

my @files;
my $start_dir = "somedir";    # top level dir to search

find(
    sub { push @files, $File::Find::name unless -d; },
    $start_dir
);

for my $file (@files) {
    searchForErrors($file);
}
A problem with your current code is that you are including . and .. directories in your recursive search, which will no doubt cause deep recursion errors.
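If you would rather keep the hand-rolled recursion, the fix is to filter out those dot entries when you read the directory. A sketch along those lines, keeping your sub name and assuming your existing searchForErrors takes a file name:
sub proccessFiles {
    my $path = shift;

    opendir my $dh, $path or die "Unable to open $path: $!";
    # Skip '.' and '..' so the recursion cannot loop back on itself
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    for my $entry (@entries) {
        my $full = "$path/$entry";
        if ( -d $full ) {
            proccessFiles($full);
        }
        else {
            searchForErrors($full);
        }
    }
}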
I thought it would be useful to show a Path::Class solution in addition to TLP's File::Find one.
I think this is pretty much self-explanatory.
use strict;
use warnings;

use Path::Class 'dir';

my $root = dir 'C:\path\to\root';

$root->recurse(callback => sub {
    my $file = shift;
    searchForErrors($file) unless $file->is_dir;
});
A better way of using File::Find without using a lot of memory would be:
use strict;
use warnings;

use File::Find;

find(
    sub {
        searchForErrors($File::Find::name) unless -d;
    },
    "/tmp/"
);
This is more of an iterator style: each file is handled as it is found, rather than being collected into a list first.

change the directory and grab the xml file to parse certain data in perl

I am trying to parse a specific XML file which is located in the subdirectories of one directory. For some reason I am getting an error saying the file does not exist. If the file does not exist it should move on to the next subdirectory.
Here is my code:
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;

my @xmlsearch = map { chomp; $_ } `ls`;

foreach my $directory (@xmlsearch) {
    print "$directory \n";
    chdir($directory) or die "Couldn't change to [$directory]: $!";

    my @findResults = `find -name education.xml`;
    foreach my $educationresults (@findResults) {
        print $educationresults;
        my $parser = new XML::Simple;
        my $data   = $parser->XMLin($educationresults);
        print Dumper($data);
        chdir('..');
    }
}
ERROR
music/gitar/education.xml
File does not exist: ./music/gitar/education.xml
Using chdir the way you did makes the code IMO less readable. You can use File::Find for that:
use autodie;
use feature 'say';

use File::Find;
use XML::Simple;
use Data::Dumper;

sub findxml {
    my @found;

    opendir(DIR, '.');
    my @where = grep { -d && m#^[^.]+$# } readdir(DIR);
    closedir(DIR);

    File::Find::find({ wanted => sub {
        push @found, $File::Find::name if m#^education\.xml$#s && -f _;
    } }, @where);

    return @found;
}

foreach my $xml (findxml()) {
    say $xml;
    print Dumper XMLin($xml);
}
Whenever you find yourself relying on backticks to execute shell commands, you should consider whether there is a proper perl way to do it. In this case, there is.
ls can be replaced with <*>, which is a simple glob. The line:
my @array = map { chomp; $_ } `ls`;
is just a roundabout way of saying
chomp(my @array = `ls`); # chomp takes list arguments as well
But of course the proper way is
my @array = <*>; # no chomp required
Now, the simplest solution to all of this is to do
for my $xml (<*/education.xml>) { # find the xml files one directory level down
Which will cover one level of directories, with no recursion. For full recursion, use File::Find:
use strict;
use warnings;
use File::Find;

my @list;
find( sub { push @list, $File::Find::name if /^education\.xml$/i; }, "." );

for (@list) {
    # do stuff
    # @list contains full path names of education.xml files found in subdirs
    # e.g. ./music/gitar/education.xml
}
You should note that changing directories is not required, and in my experience, not worth the trouble. Instead of doing:
chdir($somedir);
my $data = XMLin($somefile);
chdir("..");
Simply do:
my $data = XMLin("$somedir/$somefile");
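Putting those pieces together, here is a sketch of the whole task with no chdir and no backticks: File::Find locates every education.xml and the full path is handed straight to XML::Simple (XMLin is the same call you already use):
use strict;
use warnings;
use File::Find;
use XML::Simple;
use Data::Dumper;

my @xml_files;
find( sub { push @xml_files, $File::Find::name if $_ eq 'education.xml' }, '.' );

for my $xml (@xml_files) {
    # Full path, so no directory changes are needed
    my $data = XMLin($xml);
    print Dumper($data);
}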

Perl Get Parent Folder Name

What is the solution to get the name of the parent directory using File::Find? I know how to get only the filename or only the directory path, but I don't know how to do this for the last containing directory.
For example, if the directory is /dir_1/dir_2/dir_3/.../dir_n/*.txt I need to get the 'dir_n' name.
use strict;
use warnings;
use File::Find;

my $dir = "some_path";

find(\&file_handle, $dir);

sub file_handle {
    /\.txt$/ or return;
    my $fd = $File::Find::dir;
    my $fn = $File::Find::name;
    # ...
}
Given the directory path, you then apply File::Basename (another core module) to the path to obtain the last portion of the directory.
use strict;
use warnings;
use File::Find;
use File::Basename;

my $dir = "some_path";

find(\&file_handle, $dir);

sub file_handle {
    /\.txt$/ or return;
    my $fd  = $File::Find::dir;
    my $fn  = $File::Find::name;
    my $dir = basename($fd);
    # ....
}
#!/usr/local/bin/perl -w
use strict;
use feature 'say';

use File::Basename;
use Cwd 'abs_path';

my $f = "../some/path/to/this_directory/and_filename";
my $d = basename(dirname(abs_path($f)));
say $d;
returns "this_directory"
You can just split and grab the second-to-last element in the array:
my $fname = "/folder/sub-folder/filename.bin";
my @parts = split('/', $fname);

if ( @parts > 1 ) {
    return $parts[@parts - 2];
} else {
    return '/';
}
If you are willing to install non-core modules, Path::Class can come in handy:
use Path::Class;

dir("some_dir")->recurse(callback => sub {
    my $file = shift;
    return if $file->is_dir;
    return unless $file =~ /\.txt$/i;

    my $fn  = $file->basename;    # file name
    my $fd  = $file->parent;      # containing directory as an object
    my $dir = $fd->basename;      # name of the last containing directory
});
It gives you handy objects instead of plain strings and, in my opinion, nice operations on them.
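For instance, a quick demonstration of those objects (the path here is made up, just to show the methods):
use strict;
use warnings;
use Path::Class qw( file );

my $file = file('/dir_1/dir_2/dir_3/report.txt');    # hypothetical path

print $file->basename, "\n";            # report.txt
print $file->parent, "\n";              # /dir_1/dir_2/dir_3
print $file->parent->basename, "\n";    # dir_3 -- the last containing directory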